datasets.base_dataset
- class mmf.datasets.base_dataset.BaseDataset(dataset_name, config, dataset_type='train', *args, **kwargs)
Base class for implementing a dataset. Inherits from PyTorch’s Dataset class but adds some custom functionality on top. Processors mentioned in the configuration are automatically initialized for the end user.
- Parameters
dataset_name (str) – Name of your dataset, used to represent it in text strings
dataset_type (str) – Type of your dataset. Normally, train|val|test
config (DictConfig) – Configuration for the current dataset
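The constructor arguments above can be illustrated with a small standalone sketch. It deliberately does not import MMF: `StubBaseDataset` is a hypothetical stand-in that mirrors only the documented constructor signature, `ToyDataset` is a made-up child class, and the `"samples"` config key is an assumption chosen for the example. A plain `dict` stands in for the `DictConfig`.

```python
class StubBaseDataset:
    """Hypothetical stand-in: mirrors only the documented constructor signature."""

    def __init__(self, dataset_name, config, dataset_type="train", *args, **kwargs):
        self.dataset_name = dataset_name
        self.config = config
        self.dataset_type = dataset_type


class ToyDataset(StubBaseDataset):
    """Hypothetical child dataset that fixes its own name."""

    def __init__(self, config, dataset_type="train", *args, **kwargs):
        super().__init__("toy", config, dataset_type, *args, **kwargs)
        # Assumption: raw samples live under a "samples" key in the config.
        self.samples = list(config.get("samples", []))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Return one sample, tagged with the dataset name.
        return {"text": self.samples[idx], "dataset_name": self.dataset_name}


dataset = ToyDataset({"samples": ["a", "b", "c"]}, dataset_type="val")
```

A real subclass would inherit from `mmf.datasets.base_dataset.BaseDataset` instead of the stub and would typically return MMF sample objects rather than plain dictionaries.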
- load_item(idx)
Implement if you need to separately load the item and cache it.
- Parameters
idx (int) – Index of the sample to be loaded.
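One way to read "separately load the item and cache it" is sketched below. This is not MMF's implementation: `CachedToyDataset`, its `_cache` dictionary, and the `load_calls` counter are hypothetical names used only to show `load_item` doing the expensive work while `__getitem__` serves repeated requests from a cache.

```python
class CachedToyDataset:
    """Hypothetical dataset: load_item does the work, __getitem__ caches it."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._cache = {}      # idx -> loaded item
        self.load_calls = 0   # counts how often load_item actually runs

    def load_item(self, idx):
        # Stand-in for an expensive read (disk, network, feature extraction).
        self.load_calls += 1
        path = self.paths[idx]
        return {"path": path, "length": len(path)}

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Load on first access only; later accesses reuse the cached item.
        if idx not in self._cache:
            self._cache[idx] = self.load_item(idx)
        return self._cache[idx]


cached = CachedToyDataset(["a.jpg", "bb.jpg"])
first = cached[1]
second = cached[1]
```

Here the second access returns the same object as the first, and `load_item` runs only once for index 1.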
- prepare_batch(batch)
Can be overridden in your child class. Not supported with the Lightning trainer.
Prepares the batch for passing to the model. Whatever is returned from here is passed directly to the model’s forward function. Currently moves the batch to the proper device.
- Parameters
batch (SampleList) – sample list containing the currently loaded batch
- Returns
A sample list representing the currently loaded batch
- Return type
sample_list (SampleList)
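The override pattern for `prepare_batch` can be sketched without MMF or PyTorch. Everything here is a stand-in: `ToyBatch` is a hypothetical substitute for `SampleList` that only records which device it was "moved" to, `StubBaseDataset.prepare_batch` imitates the documented default of moving the batch to the device, and `NormalizingDataset` is a made-up child class that adds one extra step.

```python
class ToyBatch(dict):
    """Hypothetical stand-in for SampleList: a dict that remembers its device."""

    device = "cpu"

    def to(self, device):
        # Return a shallow copy tagged with the target device.
        moved = ToyBatch(self)
        moved.device = device
        return moved


class StubBaseDataset:
    """Hypothetical stand-in for the documented default behaviour."""

    def __init__(self, device="cpu"):
        self.device = device

    def prepare_batch(self, batch):
        # Default step: move the batch to the configured device.
        return batch.to(self.device)


class NormalizingDataset(StubBaseDataset):
    def prepare_batch(self, batch):
        # Keep the default device move, then apply a custom step.
        batch = super().prepare_batch(batch)
        batch["text"] = [t.lower() for t in batch["text"]]
        return batch


original = ToyBatch(text=["Hello", "WORLD"])
prepared = NormalizingDataset(device="cuda:0").prepare_batch(original)
```

Calling the parent implementation first keeps the device handling in one place, so the child class only adds what is specific to it. The device string is just a label in this sketch; no GPU is involved.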