System

class pytorch_wrapper.system.System(model, last_activation=None, device=torch.device('cpu'))

Bases: object

A System contains the usual methods needed for a deep learning model (train, evaluate, predict, save, load, etc.).

Parameters:
  • model – An nn.Module object that represents the whole model. The module’s forward method must return a Tensor or a Dict of Tensors.
  • last_activation – Callable applied to the model’s output outside of training (i.e. at evaluation and prediction time). Some losses work with logits, so the last activation might not be performed inside the model’s forward method. If the last activation is performed inside the model, pass None.
  • device – Device on which the model should reside.
device
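
For orientation, a minimal construction sketch follows. The toy classifier and the use of nn.Softmax as last_activation are illustrative assumptions, not part of the documented API.

```python
import torch
import torch.nn as nn
from pytorch_wrapper import System

# Toy 3-class classifier whose forward returns logits (illustrative assumption).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))

# The loss is computed on logits during training, so the softmax is applied
# only at evaluation / prediction time through `last_activation`.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
system = System(model, last_activation=nn.Softmax(dim=-1), device=device)

print(system.device)  # the `device` attribute exposes where the model resides
```
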
evaluate(data_loader, evaluators, batch_input_key='input', verbose=True)

Evaluates the model on a dataset.

Parameters:
  • data_loader – DataLoader object that generates batches of the evaluation dataset. Each batch must be a Dict that contains the input of the model (key=`batch_input_key`) as well as the information needed by the evaluators.
  • evaluators – Dictionary containing objects derived from AbstractEvaluator. The keys are the evaluators’ names.
  • batch_input_key – The key of the batches returned by the data_loader that contains the input of the model.
  • verbose – Whether to print progress info.
Returns:

Dict containing an object derived from AbstractEvaluatorResults for each evaluator.
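
A hedged usage sketch, reusing the `system` from the construction sketch above. The dict-returning Dataset is illustrative, and MultiClassAccuracyEvaluator is an assumed class name from pytorch_wrapper.evaluators; any AbstractEvaluator subclass that matches the fields in your batches would work.

```python
import torch
from torch.utils.data import DataLoader, Dataset
from pytorch_wrapper import evaluators  # evaluator class name below is an assumption

class DictDataset(Dataset):
    """Yields the Dict format expected by the data_loader (input + target)."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __len__(self):
        return len(self.x)
    def __getitem__(self, i):
        return {'input': self.x[i], 'target': self.y[i]}

eval_loader = DataLoader(
    DictDataset(torch.randn(64, 16), torch.randint(0, 3, (64,))),
    batch_size=8
)

results = system.evaluate(
    eval_loader,
    evaluators={'acc': evaluators.MultiClassAccuracyEvaluator()},  # assumed evaluator
    batch_input_key='input',
)
print(results['acc'])  # an AbstractEvaluatorResults object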

evaluate_on_multi_gpus(data_loader, evaluators, batch_input_key='input', verbose=True, multi_gpu_device_ids=None, multi_gpu_output_device=None, multi_gpu_dim=0)

Evaluates the model on a dataset using multiple GPUs. At the end of evaluation the model is moved back to the device it was on at the beginning.

Parameters:
  • data_loader – DataLoader object that generates batches of the evaluation dataset. Each batch must be a Dict that contains the input of the model (key=`batch_input_key`) as well as the information needed by the evaluators.
  • evaluators – Dictionary containing objects derived from AbstractEvaluator. The keys are the evaluators’ names.
  • batch_input_key – The key of the batches returned by the data_loader that contains the input of the model.
  • verbose – Whether to print progress info.
  • multi_gpu_device_ids – CUDA devices used during evaluation (default: all devices).
  • multi_gpu_output_device – Device location of output (default: device_ids[0]).
  • multi_gpu_dim – Int dimension on which to split each batch.
Returns:

Dict containing an object derived from AbstractEvaluatorResults for each evaluator.
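
Continuing the sketch above, the multi-GPU variant takes the same arguments plus the DataParallel-style settings; the device ids below are illustrative.

```python
# Batches are split along dim 0 across GPUs 0 and 1 and the outputs gathered
# on GPU 0 before the evaluators are updated (device ids are illustrative).
results = system.evaluate_on_multi_gpus(
    eval_loader,
    evaluators={'acc': evaluators.MultiClassAccuracyEvaluator()},  # assumed evaluator
    multi_gpu_device_ids=[0, 1],
    multi_gpu_output_device=0,
    multi_gpu_dim=0,
)
```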

static load(f)

Loads a System from a file. The model will reside on the CPU initially.

Parameters:f – a file-like object (has to implement read, readline, tell, and seek) or a string containing a file name.
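
A short sketch of restoring a saved System (the file name is illustrative); the loaded model starts on the CPU and can then be moved with `to`.

```python
import torch
from pytorch_wrapper import System

system = System.load('system.pkl')  # file name is illustrative
system.to(torch.device('cuda:0'))   # move it off the CPU if desired
```
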
load_model_state(f, strict=True)

Loads the model’s state from a file.

Parameters:
  • f – a file-like object (has to implement read, readline, tell, and seek) or a string containing a file name.
  • strict – Whether the file must contain exactly the same weight keys as the model.
Returns:

NamedTuple with two lists (missing_keys and unexpected_keys).
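
A sketch of loading weights previously written with save_model_state (the file name is illustrative); with strict=False the returned NamedTuple reports mismatched keys.

```python
# Load weights saved earlier with `save_model_state`.
result = system.load_model_state('weights.pt', strict=False)  # file name is illustrative
print(result.missing_keys, result.unexpected_keys)
```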

predict(data_loader, perform_last_activation=True, batch_id_key=None, batch_input_key='input', model_output_key=None, verbose=True)

Computes the outputs of the model on a dataset.

Parameters:
  • data_loader – DataLoader object that generates batches of data. Each batch must be a Dict that contains at least a Tensor or a list/tuple of Tensors containing the input(s) of the model (key=`batch_input_key`).
  • perform_last_activation – Whether to perform the last_activation.
  • batch_id_key – Key under which the dict returned by the dataloader contains the ids of the examples. Leave None if there are no ids.
  • batch_input_key – Key under which the dict returned by the dataloader contains the input of the model.
  • model_output_key – Key under which the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • verbose – Whether to print progress info.
Returns:

Dict containing a list of predictions (key=`outputs`) and, if provided by the dataloader, a list of ids (key=`batch_id_key`).
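
A hedged sketch with an unlabelled dataset, reusing the `system` from the sketches above; the `id` field and the Dataset class are illustrative.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class PredictionDataset(Dataset):
    """Yields only the model input plus an example id (id field is illustrative)."""
    def __init__(self, x):
        self.x = x
    def __len__(self):
        return len(self.x)
    def __getitem__(self, i):
        return {'id': i, 'input': self.x[i]}

pred_loader = DataLoader(PredictionDataset(torch.randn(32, 16)), batch_size=8)

preds = system.predict(pred_loader, batch_id_key='id')
print(len(preds['outputs']))  # ids are also returned, keyed as described above
```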

predict_batch(single_batch_input)

Computes the output of the model for a single batch.

Parameters:single_batch_input – Tensor or list of Tensors [tensor_1, tensor_2, …] that corresponds to the input of the model.
Returns:The output of the model.
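
A minimal single-batch sketch; moving the batch to system.device first is a conservative assumption rather than a documented requirement.

```python
import torch

# One batch of 4 examples; moving it to the System's device first is a
# conservative assumption (the documented input is a Tensor or list of Tensors).
batch = torch.randn(4, 16).to(system.device)
output = system.predict_batch(batch)
```
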
predict_on_multi_gpus(data_loader, perform_last_activation=True, batch_id_key=None, batch_input_key='input', model_output_key=None, verbose=True, multi_gpu_device_ids=None, multi_gpu_output_device=None, multi_gpu_dim=0)

Computes the outputs of the model on a dataset using multiple GPUs. At the end of prediction the model is moved back to the device it was on at the beginning.

Parameters:
  • data_loader – DataLoader object that generates batches of data. Each batch must be a Dict that contains at least a Tensor or a list/tuple of Tensors containing the input(s) of the model (key=`batch_input_key`).
  • perform_last_activation – Whether to perform the last_activation.
  • batch_id_key – Key under which the dict returned by the dataloader contains the ids of the examples. Leave None if there are no ids.
  • batch_input_key – Key under which the dict returned by the dataloader contains the input of the model.
  • model_output_key – Key under which the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • verbose – Whether to print progress info.
  • multi_gpu_device_ids – CUDA devices used during prediction (default: all devices).
  • multi_gpu_output_device – Device location of output (default: device_ids[0]).
  • multi_gpu_dim – Int dimension on which to split each batch.
Returns:

Dict containing a list of predictions (key=`outputs`) and, if provided by the dataloader, a list of ids (key=`batch_id_key`).

pure_predict(data_loader, batch_input_key='input', keep_batches=True, verbose=True)

Computes the output of the model on a dataset.

Parameters:
  • data_loader – DataLoader object that generates batches of data. Each batch must be a Dict that contains at least a Tensor or a list/tuple of Tensors containing the input(s) of the model (key=`batch_input_key`).
  • batch_input_key – The key of the batches returned by the data_loader that contains the input of the model.
  • keep_batches – If set to True then the method also returns a list of the batches returned by the dataloader.
  • verbose – Whether to print progress info.
Returns:

Dict containing a list of batched model outputs (key=`output_list`) and a list of batches as returned by the dataloader (key=`batch_list`) if keep_batches is set to True.
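
Continuing the earlier sketch (reusing `eval_loader`), pure_predict keeps the outputs batched and can return the original batches alongside them.

```python
raw = system.pure_predict(eval_loader, keep_batches=True)
first_output_batch = raw['output_list'][0]  # model output for the first batch
first_input_batch = raw['batch_list'][0]    # the dict yielded by the dataloader
```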

pure_predict_on_multi_gpus(data_loader, batch_input_key='input', keep_batches=True, verbose=True, multi_gpu_device_ids=None, multi_gpu_output_device=None, multi_gpu_dim=0)

Computes the output of the model on a dataset using multiple GPUs. At the end of prediction the model is moved back to the device it was on at the beginning.

Parameters:
  • data_loader – DataLoader object that generates batches of data. Each batch must be a Dict that contains at least a Tensor or a list/tuple of Tensors containing the input(s) of the model (key=`batch_input_key`).
  • batch_input_key – The key of the batches returned by the data_loader that contains the input of the model.
  • keep_batches – If set to True then the method also returns a list of the batches returned by the dataloader.
  • verbose – Whether to print progress info.
  • multi_gpu_device_ids – CUDA devices used during prediction (default: all devices).
  • multi_gpu_output_device – Device location of output (default: device_ids[0]).
  • multi_gpu_dim – Int dimension on which to split each batch.
Returns:

Dict containing a list of batched model outputs (key=`output_list`) and a list of batches as returned by the dataloader (key=`batch_list`) if keep_batches is set to True.

save(f)

Saves the System to a file.

Parameters:f – a file-like object (has to implement write and flush) or a string containing a file name.
save_model_state(f)

Saves the model’s state to a file.

Parameters:f – a file-like object (has to implement write and flush) or a string containing a file name.
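
A sketch of the two save flavours (file names are illustrative): the whole System versus just the model weights.

```python
system.save('system.pkl')              # persist the whole System
system.save_model_state('weights.pt')  # persist only the model's state
```
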
to(device)

Transfers the model to the specified device.

Parameters:device – Device to be transferred to.
Returns:The model after moving it to the device (in-place).
train(loss_wrapper, optimizer, train_data_loader, evaluation_data_loaders=None, batch_input_key='input', evaluators=None, callbacks=None, gradient_accumulation_steps=1, verbose=True)

Trains the model on a dataset.

Parameters:
  • loss_wrapper – Object derived from AbstractLossWrapper that wraps the calculation of the loss.
  • optimizer – Optimizer object.
  • train_data_loader – DataLoader object that generates batches of the train dataset. Each batch must be a Dict that contains at least a Tensor or a list/tuple of Tensors containing the input(s) of the model (key=`batch_input_key`) as well as all the information needed by the loss_wrapper.
  • evaluation_data_loaders – Dictionary containing the evaluation data-loaders. The keys are the datasets’ names. Each batch generated by the dataloaders must be a Dict that contains the input of the model (key=`batch_input_key`) as well as the information needed by the evaluators.
  • batch_input_key – Key of the Dicts returned by the Dataloader objects that corresponds to the input of the model.
  • evaluators – Dictionary containing objects derived from AbstractEvaluator. The keys are the evaluators’ names.
  • callbacks – List containing TrainingCallback objects. They are used in order to inject functionality at several points of the training process. Default is NumberOfEpochsStoppingCriterionCallback(10) that stops training after the 10th epoch (counting from 0).
  • gradient_accumulation_steps – Number of backward calls before an optimization step. Used in order to simulate a larger batch size.
  • verbose – Whether to print progress info.
Returns:

List containing the results for each epoch.
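
A hedged end-to-end training sketch, reusing `model`, `system`, `DictDataset` and `eval_loader` from the sketches above. GenericPointWiseLossWrapper, the evaluator class, and the module paths pytorch_wrapper.loss_wrappers / pytorch_wrapper.training_callbacks are assumptions about the library layout; NumberOfEpochsStoppingCriterionCallback itself is named in the parameter description above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from pytorch_wrapper import loss_wrappers, evaluators, training_callbacks  # module paths are assumptions

train_loader = DataLoader(
    DictDataset(torch.randn(256, 16), torch.randint(0, 3, (256,))),
    batch_size=16,
    shuffle=True,
)

loss_wrapper = loss_wrappers.GenericPointWiseLossWrapper(nn.CrossEntropyLoss())  # assumed wrapper
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

history = system.train(
    loss_wrapper,
    optimizer,
    train_data_loader=train_loader,
    evaluation_data_loaders={'val': eval_loader},
    evaluators={'acc': evaluators.MultiClassAccuracyEvaluator()},  # assumed evaluator
    callbacks=[training_callbacks.NumberOfEpochsStoppingCriterionCallback(5)],
    gradient_accumulation_steps=2,  # optimizer step every 2 batches
)
# `history` is the per-epoch results list described above.
```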

train_on_multi_gpus(loss_wrapper, optimizer, train_data_loader, evaluation_data_loaders=None, batch_input_key='input', evaluators=None, callbacks=None, gradient_accumulation_steps=1, verbose=True, multi_gpu_device_ids=None, multi_gpu_output_device=None, multi_gpu_dim=0)

Trains the model on a dataset using multiple GPUs. At the end of training the model is moved back to the device it was on at the beginning.

Parameters:
  • loss_wrapper – Object derived from AbstractLossWrapper that wraps the calculation of the loss.
  • optimizer – Optimizer object.
  • train_data_loader – DataLoader object that generates batches of the train dataset. Each batch must be a Dict that contains at least a Tensor or a list/tuple of Tensors containing the input(s) of the model (key=`batch_input_key`) as well as all the information needed by the loss_wrapper.
  • evaluation_data_loaders – Dictionary containing the evaluation data-loaders. The keys are the datasets’ names. Each batch generated by the dataloaders must be a Dict that contains the input of the model (key=`batch_input_key`) as well as the information needed by the evaluators.
  • batch_input_key – Key of the Dicts returned by the Dataloader objects that corresponds to the input of the model.
  • evaluators – Dictionary containing objects derived from AbstractEvaluator. The keys are the evaluators’ names.
  • callbacks – List containing TrainingCallback objects. They are used in order to inject functionality at several points of the training process. Default is NumberOfEpochsStoppingCriterionCallback(10) that stops training after the 10th epoch (counting from 0).
  • gradient_accumulation_steps – Number of backward calls before an optimization step. Used in order to simulate a larger batch size.
  • verbose – Whether to print progress info.
  • multi_gpu_device_ids – CUDA devices used during training (default: all devices).
  • multi_gpu_output_device – Device location of output (default: device_ids[0]).
  • multi_gpu_dim – Int dimension on which to split each batch.
Returns:

List containing the results for each epoch.
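
The multi-GPU variant takes the same arguments as train plus the DataParallel-style settings; a short sketch continuing the example above, with the documented defaults shown explicitly.

```python
history = system.train_on_multi_gpus(
    loss_wrapper,
    optimizer,
    train_data_loader=train_loader,
    evaluation_data_loaders={'val': eval_loader},
    evaluators={'acc': evaluators.MultiClassAccuracyEvaluator()},  # assumed evaluator
    multi_gpu_device_ids=None,     # use all visible CUDA devices
    multi_gpu_output_device=None,  # gather outputs on device_ids[0]
    multi_gpu_dim=0,               # split each batch along its first dimension
)
```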