In this tutorial, we will go through the step-by-step process of creating a new model using MMF. In this case, we will create a fusion model and train it on the Hateful Memes dataset.
The fusion model that we will create concatenates embeddings from a text encoder and an image encoder and passes them through a two-layer classifier. MMF provides standard image and text encoders out of the box. For the image encoder, we will use the ResNet152 image encoder, and for the text encoder, we will use the BERT-Base encoder.
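As a rough illustration of the fusion step, the sketch below concatenates a text embedding and an image embedding and passes the result through a two-layer classifier, using only the Python standard library. It is not MMF code: the helper names (make_layer, linear, relu, concat_fusion_logits) and the toy dimensions are assumptions made for illustration. With the real encoders, a pooled ResNet152 feature has 2048 dimensions and a BERT-Base pooled output has 768, so the concatenated vector would have 2816 dimensions.

```python
import random


def make_layer(in_dim, out_dim, rng):
    """Create a randomly initialised (weight, bias) pair for a linear layer."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def linear(x, layer):
    """Apply y = W x + b to a single input vector."""
    weight, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def concat_fusion_logits(text_emb, image_emb, hidden_layer, output_layer):
    """Concatenate both embeddings, then apply a two-layer classifier."""
    fused = list(text_emb) + list(image_emb)
    hidden = relu(linear(fused, hidden_layer))
    return linear(hidden, output_layer)


# Toy sizes so the pure-Python example runs instantly (hypothetical values).
TEXT_DIM, IMAGE_DIM, HIDDEN_DIM, NUM_CLASSES = 4, 8, 6, 2

rng = random.Random(0)
hidden_layer = make_layer(TEXT_DIM + IMAGE_DIM, HIDDEN_DIM, rng)
output_layer = make_layer(HIDDEN_DIM, NUM_CLASSES, rng)

# Stand-ins for the encoder outputs of one sample.
text_emb = [rng.random() for _ in range(TEXT_DIM)]
image_emb = [rng.random() for _ in range(IMAGE_DIM)]

logits = concat_fusion_logits(text_emb, image_emb, hidden_layer, output_layer)
print(len(logits))  # one logit per class
```

The only constraint the fusion imposes is that the first classifier layer's input size equals the sum of the two embedding sizes.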
Follow the prerequisites for installation and dataset here.
We will start building our model, ConcatBERTTutorial, using the various building blocks available in MMF. Helper builder methods such as build_image_encoder for image encoders, build_text_encoder for text encoders, and build_classifier_layer for classifier layers take configurable parameters, which are defined in the config we will create in the next section. Follow the code and read through the comments to understand how the model is implemented.
The model’s forward method takes a SampleList and outputs a dict containing the logit scores predicted by the model. Different losses and metrics can then be calculated on these scores.
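To make the last point concrete, here is a minimal sketch of computing one loss and one metric from a dict of logit scores. It uses plain Python lists rather than tensors, and it assumes the logits are stored under a "scores" key. Cross-entropy and accuracy are picked only as familiar examples, and the helper names are hypothetical rather than MMF functions.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick."""
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum_exp - logits[target]


def accuracy(batch_logits, targets):
    """Fraction of samples whose highest-scoring class matches the target."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in batch_logits]
    correct = sum(int(p == t) for p, t in zip(predictions, targets))
    return correct / len(targets)


# Stand-in for a model output on a batch of three samples with two classes.
output = {"scores": [[2.0, -1.0], [0.5, 1.5], [0.2, 0.1]]}
targets = [0, 1, 1]

losses = [cross_entropy(row, t) for row, t in zip(output["scores"], targets)]
mean_loss = sum(losses) / len(losses)
batch_accuracy = accuracy(output["scores"], targets)

print(mean_loss, batch_accuracy)
```

Because the model returns raw logits instead of probabilities or predictions, the same output can feed several losses and metrics without rerunning the forward pass.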
We will define two configs needed for our experiments: (i) a model config for the model's default configurations, and (ii) an experiment config for our particular experiment. The model config provides the model’s default hyperparameters and the experiment config defines and overrides the defaults needed for our particular experiment such as optimizer type, learning rate, maximum updates and batch size.
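The relationship between defaults and overrides can be sketched as a recursive dictionary merge. This is a plain-Python stand-in and not MMF's configuration code. The merge_configs helper, the key names, and all values below are hypothetical.

```python
import copy


def merge_configs(defaults, overrides):
    """Return a new config where values in `overrides` replace those in `defaults`.

    Nested dicts are merged key by key. Any other value is replaced wholesale.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults: model hyperparameters plus generic training settings.
defaults = {
    "model": {"classifier": {"num_layers": 2, "out_dim": 2}},
    "optimizer": {"type": "adam_w", "params": {"lr": 1e-4}},
    "training": {"batch_size": 64, "max_updates": 10000},
}

# Hypothetical experiment config: only the settings that differ are listed.
experiment = {
    "optimizer": {"params": {"lr": 5e-5}},
    "training": {"batch_size": 32, "max_updates": 3000},
}

config = merge_configs(defaults, experiment)
print(config["optimizer"])
print(config["training"])
```

Settings the experiment config does not mention, such as the classifier depth in this sketch, keep their default values, which is why the experiment config can stay short.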
We will now create the model config file with the params we used while creating the model and store the config in
In the next step, we will create the experiment config, which will tell MMF which dataset, optimizer, scheduler, and evaluation metrics to use. We will save this config in
We include the bert.yaml config in this experiment config because we want to use the BERT tokenizer to preprocess our language data. With both configs ready, we can launch training and evaluation of our model on the Hateful Memes dataset. You can read more about MMF’s configuration system here.
Now we are ready to train and evaluate our model with the experiment config we created in the previous step.
When training ends, it will save a final model, concat_bert_tutorial_final.pth, in the experiment folder under the ./save directory. More details about checkpoints can be found here. The command will also generate validation scores once training is complete.