xmodaler.datasets
- class xmodaler.datasets.DatasetFromList(lst: list, copy: bool = True, serialize: bool = True)
Bases: Dataset
Wrap a list in a torch Dataset. Indexing the dataset returns the corresponding element of the list.
- __init__(lst: list, copy: bool = True, serialize: bool = True)
- Parameters:
lst (list) – the list whose elements the dataset produces.
copy (bool) – whether to deepcopy each element when producing it, so that the result can be modified in place without affecting the source in the list.
serialize (bool) – whether to store the elements as serialized objects. When enabled, data loader workers can share RAM with the master process instead of each making a copy.
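The effect of the copy and serialize options can be illustrated with a small stand-in class. This is a minimal sketch using only the standard library, not xmodaler's implementation: `ListDatasetSketch`, its `copy_items` argument, and the sample records are hypothetical names chosen for illustration, and the use of `pickle` for serialization is an assumption.

```python
import copy
import pickle


class ListDatasetSketch:
    """Hypothetical stand-in that mimics the documented copy/serialize options."""

    def __init__(self, lst, copy_items=True, serialize=True):
        self._copy = copy_items
        self._serialize = serialize
        # Assumption: with serialize=True each element is kept as pickled bytes.
        self._items = [pickle.dumps(x) for x in lst] if serialize else lst

    def __len__(self):
        return len(self._items)

    def __getitem__(self, idx):
        if self._serialize:
            # Unpickling always builds a fresh object, so no extra copy is needed.
            return pickle.loads(self._items[idx])
        if self._copy:
            return copy.deepcopy(self._items[idx])
        return self._items[idx]


source = [{"image_id": 1, "caption": "a cat"}]

# With copying enabled, editing the produced element leaves the source untouched.
safe = ListDatasetSketch(source, copy_items=True, serialize=False)
item = safe[0]
item["caption"] = "changed"
caption_after_safe_edit = source[0]["caption"]  # still "a cat"

# With copying disabled, the produced element is the source object itself.
shared = ListDatasetSketch(source, copy_items=False, serialize=False)
shared[0]["caption"] = "changed"
caption_after_shared_edit = source[0]["caption"]  # now "changed"
```

Disabling the copy saves time per access, at the cost that any in-place change made downstream leaks back into the list.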
- class xmodaler.datasets.MapDataset(dataset, map_func)
Bases: Dataset
Map a function over the elements in a dataset.
- Parameters:
dataset – the dataset to which the map function is applied.
map_func – a callable applied to each element of the dataset. map_func is responsible for its own error handling: when an error occurs, it must return None, and MapDataset then substitutes a randomly chosen other element from the dataset.
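The None-return contract for map_func can be sketched as follows. This is an illustrative stand-in, not xmodaler's implementation: `MapDatasetSketch`, `parse_caption`, the seeded random generator, and the bounded retry count are all assumptions made so the example is self-contained and terminates.

```python
import random


class MapDatasetSketch:
    """Hypothetical stand-in: applies map_func and retries a random index on None."""

    def __init__(self, dataset, map_func, seed=42):
        self._dataset = dataset
        self._map_func = map_func
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._dataset)

    def __getitem__(self, idx):
        cur = idx
        # Assumption: the retry count is bounded so the sketch cannot loop forever.
        for _ in range(10 * len(self._dataset)):
            out = self._map_func(self._dataset[cur])
            if out is not None:
                return out
            # map_func signalled a failure, so try another randomly chosen element.
            cur = self._rng.randrange(len(self._dataset))
        raise RuntimeError("map_func returned None for every sampled element")


def parse_caption(record):
    """Example map_func: handles its own errors and returns None for bad records."""
    try:
        return record["caption"].upper()
    except (KeyError, AttributeError):
        return None


records = [{"caption": "a cat"}, {"caption": None}, {"caption": "a dog"}]
mapped = MapDatasetSketch(records, parse_caption)

first = mapped[0]      # "A CAT"
fallback = mapped[1]   # the bad record is replaced by another element's result
```

Because a failed element is replaced rather than skipped, the dataset length stays constant, which keeps it compatible with index-based samplers.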
- class xmodaler.datasets.MSCoCoDataset(stage: str, anno_file: str, seq_per_img: int, max_feat_num: int, max_seq_len: int, feats_folder: str, relation_file: str, gv_feat_file: str, attribute_file: str)
Bases: object
- class xmodaler.datasets.MSCoCoSampleByTxtDataset(stage: str, anno_file: str, seq_per_img: int, max_feat_num: int, max_seq_len: int, feats_folder: str, relation_file: str, gv_feat_file: str, attribute_file: str)
Bases: MSCoCoDataset
- __call__(dataset_dict)
Call self as a function.
- __init__(stage: str, anno_file: str, seq_per_img: int, max_feat_num: int, max_seq_len: int, feats_folder: str, relation_file: str, gv_feat_file: str, attribute_file: str)
- classmethod from_config(cfg, stage: str = 'train')
- load_data(cfg)
- class xmodaler.datasets.MSCoCoBertDataset(stage: str, anno_file: str, seq_per_img: int, max_seq_length: int, max_feat_num: int, feats_folder: str, images_ids_file: str, tokenizer)
Bases: object
- class xmodaler.datasets.ConceptualCaptionsDataset(stage: str, anno_file: str, max_seq_length: int, max_feat_num: int, feats_folder: str, images_ids_file: str, tokenizer)
Bases: object
- class xmodaler.datasets.ConceptualCaptionsDatasetForSingleStream(stage: str, anno_file: str, max_seq_length: int, max_feat_num: int, feats_folder: str, images_ids_file: str, tokenizer, itm_neg_prob: float)
Bases: ConceptualCaptionsDataset
- __init__(stage: str, anno_file: str, max_seq_length: int, max_feat_num: int, feats_folder: str, images_ids_file: str, tokenizer, itm_neg_prob: float)
- load_data(cfg)
- class xmodaler.datasets.VQADataset(stage: str, anno_folder: str, ans2label_path: str, label2ans_path: str, feats_folder: str, max_feat_num: int, max_seq_len: int, use_global_v: bool, tokenizer)
Bases: object
- class xmodaler.datasets.VCRDataset(stage: str, task_name: str, anno_folder: str, feats_folder: str, max_feat_num: int, max_seq_len: int, seq_per_img: int, use_global_v: bool, tokenizer)
Bases: object
- class xmodaler.datasets.Flickr30kDataset(stage: str, anno_folder: str, anno_file: str, feats_folder: str, max_feat_num: int, max_seq_len: int, use_global_v: bool, tokenizer)
Bases: object
- class xmodaler.datasets.Flickr30kDatasetForSingleStream(stage: str, anno_folder: str, anno_file: str, feats_folder: str, max_feat_num: int, max_seq_len: int, use_global_v: bool, negative_size: int, tokenizer, cfg)
Bases: Flickr30kDataset
- __init__(stage: str, anno_folder: str, anno_file: str, feats_folder: str, max_feat_num: int, max_seq_len: int, use_global_v: bool, negative_size: int, tokenizer, cfg)
- load_data(cfg)
- class xmodaler.datasets.Flickr30kDatasetForSingleStreamVal(stage: str, anno_folder: str, anno_file: str, feats_folder: str, max_feat_num: int, max_seq_len: int, use_global_v: bool, inf_batch_size: int, tokenizer, cfg)
Bases: Flickr30kDataset
- class xmodaler.datasets.MSVDDataset(stage: str, anno_file: str, seq_per_img: int, max_feat_num: int, max_seq_len: int, feats_folder: str)
Bases: object
- class xmodaler.datasets.MSRVTTDataset(stage: str, anno_file: str, seq_per_img: int, max_feat_num: int, max_seq_len: int, feats_folder: str)
Bases: object
- xmodaler.datasets.build_xmodaler_train_loader(datalist, *, dataset_mapper, batch_size, num_workers)