Image Feature Extraction
In this tutorial, we go through the step-by-step process of generating image features with FasterRCNN feature extractors. MMF provides utility scripts for running image feature extraction with different models (FasterRCNN with X-101 and X-152 backbones), for example tools/scripts/features/extract_features_vmb.py. The script optionally lets you parallelize feature extraction.
Here are the steps in a nutshell:
- Prerequisites: setup
- Install vqa-maskrcnn-benchmark
- Download the dataset (not covered in this tutorial)
- Identify which vision feature extractor you'd like to use
- Extract image features
- Extract image features with Slurm
Create a new conda environment for feature extraction repo installation:
A separate conda environment is used so that this installation does not interfere with the existing mmf conda environment.
Follow the MMF installation instructions to install mmf in this new conda environment, maskrcnn_benchmark.
The maskrcnn-benchmark repo is installed next. It has the following requirements:
- PyTorch >1.0 from a nightly release. Installation instructions can be found on the PyTorch website
- torchvision from master
- GCC >= 4.9
- (optional) OpenCV for the webcam demo
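Before building maskrcnn-benchmark, it can help to confirm that the requirements listed above are met. The following is a minimal sketch and not part of MMF: the helper names (parse_version, meets_minimum, module_available) are hypothetical, and the check covers only the GCC minimum and whether the Python packages can be imported. It does not verify that PyTorch is a nightly build or that torchvision was built from master.

```python
import importlib.util
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted numeric version found in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text, minimum):
    """True if the version in `version_text` is at least `minimum` (a tuple of ints)."""
    return parse_version(version_text) >= tuple(minimum)


def module_available(name):
    """True if a top-level module called `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


def gcc_version_text():
    """Return the output of `gcc -dumpversion`, or None if gcc is not on PATH."""
    if shutil.which("gcc") is None:
        return None
    result = subprocess.run(
        ["gcc", "-dumpversion"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


def main():
    gcc = gcc_version_text()
    if gcc is None:
        print("gcc: not found")
    else:
        status = "ok" if meets_minimum(gcc, (4, 9)) else "too old, need >= 4.9"
        print(f"gcc {gcc}: {status}")
    # cv2 is the import name of the OpenCV Python bindings (optional requirement).
    for module in ("torch", "torchvision", "cv2"):
        print(f"{module}: {'found' if module_available(module) else 'missing'}")


if __name__ == "__main__":
    main()
```

Run it inside the new conda environment so that it inspects the packages that maskrcnn-benchmark will actually be built against.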
We provide the model weights of two feature extractors based on FasterRCNN: ResNet101 and ResNet152. Both are pretrained on Visual Genome. In this tutorial, we use the FasterRCNN-ResNet101 feature extractor.
To use a different feature extractor, you can override the model_file param to point at the feature extractor model file.
#Extract Image Features
Images in <FOLDER_PATH_TO_DATASET> that have PNG/JPG/JPEG extensions will have their features extracted with the following invocation.
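As a minimal sketch of how such a run could be driven from Python, the snippet below first lists the files that would qualify and then assembles a command line for the script. Several things here are assumptions rather than documented behavior: the flag names --image_dir, --output_folder and --model_file (confirm them with the script's --help output), the case-insensitive extension match, and the choice to look only at the top level of the folder. The helper names list_images and build_command are hypothetical.

```python
from pathlib import Path

# Extensions named in the text; matching is case-insensitive here (an assumption).
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_images(folder):
    """Return sorted paths of files directly under `folder` with an image extension."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def build_command(image_dir, output_folder, model_file=None):
    """Build an argv list for the extraction script (flag names are assumptions)."""
    command = [
        "python",
        "tools/scripts/features/extract_features_vmb.py",
        "--image_dir", str(image_dir),
        "--output_folder", str(output_folder),
    ]
    if model_file is not None:
        # Point the script at a different feature extractor model file.
        command += ["--model_file", str(model_file)]
    return command
```

Passing the result of build_command to subprocess.run(..., check=True) would launch the extraction. Checking len(list_images(folder)) first is a cheap way to confirm that the folder contains the images you expect.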
#Extract Image Features with cluster workload manager (e.g., Slurm)
We can use a Slurm-based cluster workload manager to run image feature extraction in parallel on multiple machines. This can greatly reduce processing time when a large number of images need their features extracted. Please refer to mmf/mmf/tools/scripts/features/extract_features_vmb.py to see how you can adapt it for your purpose. As an example, we show how to run image feature extraction on the Flickr test set on 2 machines.
#Separate the images into 2 sets
Create an <IMAGE_LISTS_FOLDER> folder that contains 2 files. Each file contains a list of full image paths, one per line.
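One way to produce these list files is sketched below. This is an illustration, not an MMF utility: the function name write_image_lists and the image_list_<index>.txt file names are hypothetical, and the round-robin split is just one simple way to keep the two lists balanced.

```python
import os
from pathlib import Path


def write_image_lists(image_paths, lists_folder, num_lists=2):
    """Split `image_paths` into `num_lists` files of newline-delimited full paths."""
    lists_folder = Path(lists_folder)
    lists_folder.mkdir(parents=True, exist_ok=True)
    # Store absolute paths so each machine can read the list from any working directory.
    paths = [os.path.abspath(str(path)) for path in image_paths]
    written = []
    for index in range(num_lists):
        # Round-robin assignment: list sizes differ by at most one image.
        chunk = paths[index::num_lists]
        list_file = lists_folder / f"image_list_{index}.txt"
        list_file.write_text("\n".join(chunk) + ("\n" if chunk else ""))
        written.append(list_file)
    return written
```

Each machine can then be given one of the returned files as its share of the work. Increasing num_lists spreads the same images over more machines.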
In flickr_test_extract_image_feature.sh, write the following: