VQA Challenge

VQA Challenge is available at this link.

MMF provides starter code for several baseline models for this challenge. The VQA2.0 dataset is downloaded automatically the first time you run training.

This tutorial walks through training and evaluating the VisualBERT model on the VQA2.0 dataset and generating a submission file for the challenge. The same steps apply to other models.


Before starting, follow the prerequisites for installing MMF here.

Training and Evaluation


To train on the train set, run the following command:

mmf_run config=projects/visual_bert/configs/vqa2/defaults.yaml \
    model=visual_bert \
    dataset=vqa2 \
    run_type=train

This trains the visual_bert model on the dataset. By default, the checkpoints and the best trained model (visual_bert_final.pth) are stored in an experiment folder under the ./save directory.
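
To locate the final model after training, a small helper can search the save directory. This is a minimal sketch rather than part of MMF: the function name `find_final_checkpoints` is hypothetical, and it assumes only what is stated above, namely that `visual_bert_final.pth` ends up somewhere under `./save`. The exact sub-folder layout is not assumed.

```python
from pathlib import Path


def find_final_checkpoints(save_dir="./save", filename="visual_bert_final.pth"):
    """Return every file named `filename` under `save_dir`, newest first."""
    root = Path(save_dir)
    if not root.is_dir():
        # Nothing has been trained yet, or a different save directory was used.
        return []
    matches = [p for p in root.rglob(filename) if p.is_file()]
    # Sort by modification time so the most recent run comes first.
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling `find_final_checkpoints()` with no arguments searches `./save`; an empty list means no final model was found there.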


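Next, run evaluation on the validation set, pointing checkpoint.resume_file at the trained model:

mmf_run config=projects/visual_bert/configs/vqa2/defaults.yaml \
    model=visual_bert \
    dataset=vqa2 \
    run_type=val \
    checkpoint.resume_file=<path_to_trained_pth_file>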

This will give you the performance of your model on the validation set. The metric will be VQA Accuracy.
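
For intuition about the metric, VQA Accuracy is commonly summarized as min(n / 3, 1), where n is the number of human annotators who gave the predicted answer. The sketch below implements only that simplified form. It is not MMF's implementation, the function name `vqa_accuracy` is hypothetical, and the official evaluation additionally normalizes answer strings and averages over subsets of annotators, which is omitted here.

```python
def vqa_accuracy(predicted, human_answers):
    """Simplified VQA accuracy: min(matching annotators / 3, 1)."""
    matches = sum(1 for answer in human_answers if answer == predicted)
    return min(matches / 3.0, 1.0)


# Ten made-up annotator answers for a single question.
answers = ["yes"] * 2 + ["no"] * 8
print(vqa_accuracy("yes", answers))  # 2 of 10 annotators agree: partial credit
print(vqa_accuracy("no", answers))   # 8 of 10 annotators agree: full credit
```

An answer given by at least three annotators receives full credit, which is why a model can score well on questions where humans disagree.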

Predictions for Challenge

After training the model and evaluating it on the validation set, generate predictions on the test-dev and test-std sets. The prediction file should contain the following for each sample:

  • Question ID, question_id
  • Answer, answer

[
    {
        "question_id": "INT",
        "answer": "STRING"
    },
    {
        "question_id": "...",
        "answer": "..."
    }
]

With MMF you can generate the predictions directly in the required submission format with the following command, where checkpoint.resume_file points to your trained model:

mmf_predict config=projects/visual_bert/configs/vqa2/defaults.yaml \
    model=visual_bert \
    dataset=vqa2 \
    run_type=test \
    checkpoint.resume_file=<path_to_trained_pth_file>

The command prints the location of the generated predictions JSON file.
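
Before uploading, it can be worth checking that the generated file matches the format described above. The following is a minimal sketch and not part of MMF: the function name `validate_predictions` and the file name `predictions.json` are hypothetical, and it assumes the file is a JSON list of objects, each with an integer `question_id` and a string `answer`.

```python
import json


def validate_predictions(path):
    """Return a list of problems found in a predictions file (empty if none)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        return ["top-level JSON value must be a list"]
    problems = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        qid = entry.get("question_id")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(qid, int) or isinstance(qid, bool):
            problems.append(f"entry {i}: question_id must be an integer")
        if not isinstance(entry.get("answer"), str):
            problems.append(f"entry {i}: answer must be a string")
    return problems
```

Calling `validate_predictions("predictions.json")` returns an empty list when every entry is well formed; otherwise each string describes one offending entry.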

Submission for Challenge

Next, upload the generated JSON file to the EvalAI page for VQA here. To check your results, go to the 'My submissions' section and select the phase to which you submitted your results file.

Last updated by Vedanuj Goswami