Data Processor
-
class piepline.data_processor.data_processor.DataProcessor(model: torch.nn.modules.module.Module, device: torch.device = None)
DataProcessor manages the model, data processing, and device selection.
- Args:
- model (Module): the model used to process data
- device (torch.device): the device to which data is passed for processing
-
predict(data: torch.Tensor) → object
Make a prediction from the given data.
Parameters: data – data as a torch.Tensor or a dict with the key 'data'
Returns: processed output
Return type: the model output type
-
set_pick_model_input(pick_model_input: callable) → piepline.data_processor.data_processor.DataProcessor
Set a callback that receives the output of the DataLoader and returns the model input.
Default mode: lambda data: data['data']
- Args:
- pick_model_input (callable): callable that picks the model input. It must accept one parameter: the dataset output
- Returns:
- self object
Examples:
data_processor.set_pick_model_input(lambda data: data['data'])
data_processor.set_pick_model_input(lambda data: data[0])
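As a minimal sketch of what the two example callbacks select, the snippet below applies them to a dict-style and a tuple-style dataset item. The item values are made up for illustration, plain lists stand in for tensors, and nothing here calls piepline itself.

```python
# Two hypothetical dataset outputs; plain lists stand in for tensors.
dict_item = {'data': [0.1, 0.2, 0.3], 'target': 1}
tuple_item = ([0.1, 0.2, 0.3], 1)

# The same callbacks as in the examples above.
pick_from_dict = lambda data: data['data']
pick_from_tuple = lambda data: data[0]

# Each callback returns only the part that would be fed to the model.
dict_input = pick_from_dict(dict_item)
tuple_input = pick_from_tuple(tuple_item)
print(dict_input)
print(tuple_input)
```

Which callback fits depends on whether the dataset yields dicts or tuples.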
-
class piepline.data_processor.data_processor.TrainDataProcessor(train_config: piepline.train_config.train_config.BaseTrainConfig, device: torch.device = None)
TrainDataProcessor does everything DataProcessor does and additionally runs the training process.
Parameters: train_config – train config
-
get_state() → {}
Get the model and optimizer state dicts.
Returns: dict with keys [weights, optimizer]
-
predict(data, is_train=False) → torch.Tensor
Make a prediction from the given data. If is_train is True, this operation computes gradients. If is_train is False, it runs with model.eval() and torch.no_grad.
Parameters:
- data – data in a dict
- is_train – whether the data processor should train on the data or only predict
Returns: processed output
Return type: model return type
-
process_batch(batch: {}, is_train: bool) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
Process one batch of data.
- Args:
- batch (dict): contains the 'data' and 'target' keys. The value for each key must be an instance of torch.Tensor or a dict
- is_train (bool): whether the batch is processed for training
- Returns:
- tuple of torch.Tensor: losses, predictions and targets, each with shape (N, …) where N is the batch size
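The return contract (three results that share the batch dimension N) can be illustrated with a rough sketch. This is not the library's implementation. `process_batch_sketch`, the toy model and the toy loss are hypothetical, plain Python lists stand in for tensors, and gradient computation and the optimizer step are left out entirely.

```python
def process_batch_sketch(batch, model, loss_fn):
    """Hypothetical outline of one batch step; lists stand in for tensors."""
    inputs = batch['data']      # what the default pick_model_input would select
    targets = batch['target']   # what the default pick_target would select
    predicts = [model(x) for x in inputs]
    # One loss value per sample, so all three results have length N.
    losses = [loss_fn(p, t) for p, t in zip(predicts, targets)]
    return losses, predicts, targets


# Toy batch with N = 3 samples.
batch = {'data': [1.0, 2.0, 3.0], 'target': [2.0, 4.0, 7.0]}
losses, predicts, targets = process_batch_sketch(
    batch,
    model=lambda x: 2 * x,                 # toy "model"
    loss_fn=lambda p, t: (p - t) ** 2,     # per-sample squared error
)
print(len(losses), len(predicts), len(targets))
```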
-
set_data_preprocess(data_preprocess: callable) → piepline.data_processor.data_processor.DataProcessor
Set a callback that receives the output of the DataLoader and returns the preprocessed data. For example, it may be used to pass data to a device.
Default mode: _pass_data_to_device()
- Args:
- data_preprocess (callable): preprocessing callable. It must accept one parameter: the dataset output
- Returns:
- self object
Examples:
from piepline.utils import dict_recursive_bypass
data_processor.set_data_preprocess(lambda data: dict_recursive_bypass(data, lambda v: v.cuda()))
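The example relies on dict_recursive_bypass, whose behaviour is not described on this page. Assuming it applies a function to every leaf of a nested dict (an assumption drawn from its name and usage), the idea can be sketched with a hypothetical stand-in called `recursive_apply`. The "device move" below is simulated by tagging each leaf, so no GPU or torch install is needed.

```python
def recursive_apply(data, func):
    """Hypothetical stand-in: apply func to every non-dict leaf of a nested dict."""
    if isinstance(data, dict):
        return {key: recursive_apply(value, func) for key, value in data.items()}
    return func(data)


# Nested dataset output; plain lists stand in for tensors.
batch = {'data': {'image': [1, 2], 'mask': [3, 4]}, 'target': [5, 6]}

# Simulate v.cuda() by tagging each leaf with a device name.
moved = recursive_apply(batch, lambda v: ('cuda', v))
print(moved)
```

The nesting of the dict is preserved. Only the leaf values are transformed.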
-
set_pick_target(pick_target: callable) → piepline.data_processor.data_processor.DataProcessor
Set a callback that receives the output of the DataLoader and returns the target.
Default mode: lambda data: data['target']
- Args:
- pick_target (callable): callable that picks the target. It must accept one parameter: the dataset output
- Returns:
- self object
Examples:
data_processor.set_pick_target(lambda data: data['target'])
data_processor.set_pick_target(lambda data: data[1])
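Because each setter returns the self object, calls can presumably be chained. The sketch below mimics only that setter interface with a hypothetical stand-in class, `SketchProcessor`. Its `split` helper is invented for illustration and is not part of the documented API.

```python
class SketchProcessor:
    """Hypothetical stand-in that mimics only the setter interface."""

    def __init__(self):
        # Defaults mirror the documented default modes.
        self._pick_model_input = lambda data: data['data']
        self._pick_target = lambda data: data['target']

    def set_pick_model_input(self, pick_model_input):
        self._pick_model_input = pick_model_input
        return self  # returning self is what allows chaining

    def set_pick_target(self, pick_target):
        self._pick_target = pick_target
        return self

    def split(self, item):
        # Hypothetical helper: apply both callbacks to one dataset output.
        return self._pick_model_input(item), self._pick_target(item)


# Configure both callbacks in one chained expression for tuple-style items.
processor = (SketchProcessor()
             .set_pick_model_input(lambda data: data[0])
             .set_pick_target(lambda data: data[1]))
model_input, target = processor.split(([0.5, 0.7], 1))
print(model_input, target)
```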
-