Evaluators¶

class pytorch_wrapper.evaluators.AUROCEvaluator(model_output_key=None, batch_target_key='target', average='macro', target_threshold=0.5)¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

AUROC evaluator.

Parameters:

model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘macro’ or ‘micro’] of averaging performed on the results in the case of a multi-label task.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
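
AUROC is a ranking metric, so it cannot be recovered by averaging per-batch values. This illustrates why it is useful to gather predictions in step() and compute the metric once in calculate(). The sketch below shows the effect in plain Python for the binary case. It is not the library's implementation: pairwise_auroc is a hypothetical helper, the data is made up, and lists of floats stand in for tensors.

```python
def pairwise_auroc(scores, targets):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of examples,
    which is acceptable for an illustration.
    """
    positives = [s for s, t in zip(scores, targets) if t == 1]
    negatives = [s for s, t in zip(scores, targets) if t == 0]
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Two batches of (scores, targets); values are invented for illustration.
batches = [
    ([0.9, 0.8], [1, 0]),
    ([0.3, 0.1], [1, 0]),
]

# Gather everything first (the role of step()), then compute once
# (the role of calculate()).
all_scores = [s for scores, _ in batches for s in scores]
all_targets = [t for _, targets in batches for t in targets]
auroc_over_dataset = pairwise_auroc(all_scores, all_targets)

# Averaging per-batch AUROC values gives a different, misleading number.
mean_of_batch_aurocs = sum(pairwise_auroc(s, t) for s, t in batches) / len(batches)

print(auroc_over_dataset)    # 0.75
print(mean_of_batch_aurocs)  # 1.0
```

Each batch on its own is ranked perfectly, but across the whole dataset one negative (0.8) outranks one positive (0.3), which only the dataset-level computation detects.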

class pytorch_wrapper.evaluators.AbstractEvaluator¶

Bases: abc.ABC

Objects of derived classes are used to evaluate a model on a dataset using a specific metric.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

calculate_at_once(output, dataset, last_activation=None)¶

Calculates the metric at once for the whole dataset.

Parameters:

output – Output of the model.
dataset – Dict that contains all information needed for a dataset by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
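
The reset / step / calculate life cycle described above can be sketched without the library. The following is a minimal, self-contained illustration in plain Python. ToyEvaluator, ToyMeanAbsoluteErrorEvaluator and run_evaluation are hypothetical names, lists of floats stand in for tensors, and calculate() returns a plain float here rather than an AbstractEvaluatorResults object.

```python
from abc import ABC, abstractmethod


class ToyEvaluator(ABC):
    """Hypothetical stand-in that mirrors the three-method protocol."""

    @abstractmethod
    def reset(self):
        """(Re)initialize the accumulated state."""

    @abstractmethod
    def step(self, output, batch, last_activation=None):
        """Accumulate information about one batch."""

    @abstractmethod
    def calculate(self):
        """Compute the metric from the accumulated state."""


class ToyMeanAbsoluteErrorEvaluator(ToyEvaluator):
    def __init__(self, batch_target_key='target'):
        self._batch_target_key = batch_target_key
        self.reset()

    def reset(self):
        self._abs_error_sum = 0.0
        self._count = 0

    def step(self, output, batch, last_activation=None):
        # Apply the final activation here if the model did not apply it.
        if last_activation is not None:
            output = [last_activation(o) for o in output]
        targets = batch[self._batch_target_key]
        for o, t in zip(output, targets):
            self._abs_error_sum += abs(o - t)
            self._count += 1

    def calculate(self):
        return self._abs_error_sum / self._count


def run_evaluation(evaluator, batches, model):
    """Hypothetical evaluation loop: reset once, step per batch, calculate once."""
    evaluator.reset()
    for batch in batches:
        evaluator.step(model(batch['input']), batch)
    return evaluator.calculate()


batches = [
    {'input': [1.0, 2.0], 'target': [1.0, 3.0]},
    {'input': [4.0], 'target': [2.0]},
]
identity_model = lambda xs: list(xs)  # toy "model" that echoes its input
mae = run_evaluation(ToyMeanAbsoluteErrorEvaluator(), batches, identity_model)
print(mae)  # 1.0
```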

class pytorch_wrapper.evaluators.AbstractEvaluatorResults¶

Bases: abc.ABC

Objects of derived classes encapsulate the results of an evaluation metric.

compare_to(other_results_object)¶

Compares these results with the results of another object.

Parameters:	other_results_object – Object of the same class.

is_better_than(other_results_object)¶

Checks whether these results are better than the results of another object.

Parameters:	other_results_object – Object of the same class.

class pytorch_wrapper.evaluators.AccuracyEvaluator(threshold=0.5, model_output_key=None, batch_target_key='target')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Accuracy evaluator.

Parameters:

threshold – Threshold above which an example is considered positive.
model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
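
The interplay of threshold and last_activation can be illustrated with a small plain-Python sketch. It assumes the model returns raw logits, that a sigmoid is supplied as the last activation, and that "above" means strictly greater than the threshold. thresholded_accuracy and sigmoid are hypothetical helpers, not library functions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def thresholded_accuracy(output, targets, threshold=0.5, last_activation=None):
    # Turn logits into probabilities if the model did not do so itself.
    if last_activation is not None:
        output = [last_activation(o) for o in output]
    # Assumption: strictly above the threshold counts as positive.
    predictions = [1 if o > threshold else 0 for o in output]
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


logits = [-2.0, 1.5, 0.0, 3.0]  # invented values
targets = [0, 1, 1, 0]

acc_default = thresholded_accuracy(logits, targets, last_activation=sigmoid)
acc_lower = thresholded_accuracy(logits, targets, threshold=0.4, last_activation=sigmoid)

print(acc_default)  # 0.5
print(acc_lower)    # 0.75
```

The third example has probability exactly 0.5, so it is predicted negative at the default threshold and positive at 0.4, which changes the accuracy.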

class pytorch_wrapper.evaluators.F1Evaluator(threshold=0.5, model_output_key=None, batch_target_key='target', average='binary')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

F1 evaluator.

Parameters:

threshold – Threshold above which an example is considered positive.
model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘binary’, ‘macro’ or ‘micro’] of averaging performed on the results.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
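
The difference between the ‘binary’, ‘macro’ and ‘micro’ settings of average can be made concrete with a plain-Python sketch. It follows the usual definitions of these averaging modes and assumes already-thresholded 0/1 predictions arranged per label. The helper names are hypothetical, and the library's own computation may differ in details such as the handling of empty classes.

```python
def confusion_counts(predictions, targets):
    tp = sum(1 for p, t in zip(predictions, targets) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predictions, targets) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predictions, targets) if p == 0 and t == 1)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    denominator = 2 * tp + fp + fn
    # Assumption: F1 is defined as 0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


def f1(predictions_per_label, targets_per_label, average='binary'):
    counts = [confusion_counts(p, t)
              for p, t in zip(predictions_per_label, targets_per_label)]
    if average == 'binary':
        if len(counts) != 1:
            raise ValueError("'binary' expects exactly one label")
        return f1_from_counts(*counts[0])
    if average == 'macro':
        # Unweighted mean of the per-label F1 scores.
        return sum(f1_from_counts(*c) for c in counts) / len(counts)
    if average == 'micro':
        # Pool the counts over all labels, then compute a single F1.
        tp = sum(c[0] for c in counts)
        fp = sum(c[1] for c in counts)
        fn = sum(c[2] for c in counts)
        return f1_from_counts(tp, fp, fn)
    raise ValueError('unknown average: %r' % (average,))


# Two labels, four examples each; values are invented for illustration.
predictions = [[1, 1, 0, 0], [1, 1, 1, 0]]
targets = [[1, 0, 1, 0], [1, 1, 1, 0]]

binary_f1 = f1(predictions[:1], targets[:1], average='binary')
macro_f1 = f1(predictions, targets, average='macro')
micro_f1 = f1(predictions, targets, average='micro')

print(binary_f1)  # 0.5
print(macro_f1)   # 0.75
print(micro_f1)   # 0.8
```

Macro averaging treats both labels equally, while micro averaging is dominated by the label with more positive examples, hence the different values.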

class pytorch_wrapper.evaluators.GenericEvaluatorResults(score, label='score', score_format='%f', is_max_better=True)¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluatorResults

Generic evaluator results.

Parameters:

score – Numeric value that represents the score.
label – String used in the str representation.
score_format – Format string used in the str representation.
is_max_better – Flag that signifies if larger means better.

compare_to(other_results_object)¶

Compares these results with the results of another object.

Parameters:	other_results_object – Object of the same class.

is_better_than(other_results_object)¶

Checks whether these results are better than the results of another object.

Parameters:	other_results_object – Object of the same class.

is_max_better¶

score¶
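
To show how score, label, score_format and is_max_better might fit together, here is a hypothetical stand-in written in plain Python. ToyResults is not the library class. In particular, returning a signed score difference from compare_to and the exact string layout are assumptions made for this sketch only.

```python
class ToyResults:
    """Hypothetical results object; not the library's GenericEvaluatorResults."""

    def __init__(self, score, label='score', score_format='%f', is_max_better=True):
        self.score = score
        self.label = label
        self.score_format = score_format
        self.is_max_better = is_max_better

    def compare_to(self, other_results_object):
        # Assumption: a signed difference of the two scores.
        return self.score - other_results_object.score

    def is_better_than(self, other_results_object):
        difference = self.compare_to(other_results_object)
        # Larger is better for accuracy-like metrics, smaller for losses.
        return difference > 0 if self.is_max_better else difference < 0

    def __str__(self):
        # Assumption: "label: formatted score".
        return self.label + ': ' + self.score_format % self.score


accuracy_a = ToyResults(0.9, label='accuracy', score_format='%.2f')
accuracy_b = ToyResults(0.8, label='accuracy', score_format='%.2f')
loss_a = ToyResults(0.3, label='loss', is_max_better=False)
loss_b = ToyResults(0.5, label='loss', is_max_better=False)

print(accuracy_a)                             # accuracy: 0.90
print(accuracy_a.is_better_than(accuracy_b))  # True
print(loss_a.is_better_than(loss_b))          # True
```

A comparison of this kind is what lets the same model-selection logic work for metrics that should be maximized and for losses that should be minimized.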

class pytorch_wrapper.evaluators.GenericPointWiseLossEvaluator(loss_wrapper, label='loss', score_format='%f', batch_target_key='target')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Adapter that uses an object of a class derived from AbstractLossWrapper to calculate the loss during evaluation.

Parameters:

loss_wrapper – AbstractLossWrapper object that calculates the loss.
label – Str used as label during printing of the loss.
score_format – Format used for str representation of the loss.
batch_target_key – Key where the dict (batch) contains the target values.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
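
For a point-wise loss, a dataset-level value can be obtained by weighting each batch's mean loss by its batch size. The plain-Python sketch below shows this accumulation under the assumption that the loss function returns the mean over a batch. ToyLossEvaluator and mean_squared_error are hypothetical, and the library's adapter may accumulate differently.

```python
class ToyLossEvaluator:
    """Hypothetical evaluator that averages a point-wise loss over a dataset."""

    def __init__(self, loss_fn, batch_target_key='target'):
        self._loss_fn = loss_fn
        self._batch_target_key = batch_target_key
        self.reset()

    def reset(self):
        self._loss_sum = 0.0
        self._examples = 0

    def step(self, output, batch, last_activation=None):
        # last_activation is ignored: this toy loss works on raw outputs.
        targets = batch[self._batch_target_key]
        batch_size = len(targets)
        # Undo the per-batch mean so every example carries equal weight.
        self._loss_sum += self._loss_fn(output, targets) * batch_size
        self._examples += batch_size

    def calculate(self):
        return self._loss_sum / self._examples


def mean_squared_error(output, targets):
    return sum((o - t) ** 2 for o, t in zip(output, targets)) / len(targets)


evaluator = ToyLossEvaluator(mean_squared_error)
evaluator.reset()
evaluator.step([1.0, 2.0, 3.0], {'target': [1.0, 2.0, 3.0]})  # batch mean 0.0
evaluator.step([0.0], {'target': [2.0]})                      # batch mean 4.0
dataset_loss = evaluator.calculate()

# Averaging the two batch means ignores the unequal batch sizes.
mean_of_batch_means = (0.0 + 4.0) / 2

print(dataset_loss)         # 1.0
print(mean_of_batch_means)  # 2.0
```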

class pytorch_wrapper.evaluators.MultiClassAccuracyEvaluator(model_output_key=None, batch_target_key='target')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class Accuracy evaluator.

Parameters:

model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
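
Unlike the thresholded AccuracyEvaluator, a multi-class accuracy has no threshold parameter. A common way to compute it, assumed here, is to take the highest-scoring class as the prediction and compare it with a target given as a class index. The sketch below is plain Python with hypothetical helper names and invented data.

```python
def argmax(values):
    # Index of the largest value; the first index wins on ties.
    return max(range(len(values)), key=lambda i: values[i])


def multi_class_accuracy(class_scores, targets):
    predictions = [argmax(row) for row in class_scores]
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


# One row of class scores per example; targets are class indices.
class_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
targets = [1, 0, 1]

accuracy = multi_class_accuracy(class_scores, targets)
print(accuracy)  # two of three predictions are correct
```

Because argmax is unchanged by a monotonic activation such as softmax, this sketch gives the same result for logits and for probabilities.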

class pytorch_wrapper.evaluators.MultiClassF1Evaluator(model_output_key=None, batch_target_key='target', average='macro')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class F1 evaluator.

Parameters:

model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘macro’ or ‘micro’] of averaging performed on the results.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.

class pytorch_wrapper.evaluators.MultiClassPrecisionEvaluator(model_output_key=None, batch_target_key='target', average='macro')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class Precision evaluator.

Parameters:

model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘macro’ or ‘micro’] of averaging performed on the results.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.

class pytorch_wrapper.evaluators.MultiClassRecallEvaluator(model_output_key=None, batch_target_key='target', average='macro')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class Recall evaluator.

Parameters:

model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘macro’ or ‘micro’] of averaging performed on the results.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.

class pytorch_wrapper.evaluators.PrecisionEvaluator(threshold=0.5, model_output_key=None, batch_target_key='target', average='binary')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Precision evaluator.

Parameters:

threshold – Threshold above which an example is considered positive.
model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘binary’, ‘macro’ or ‘micro’] of averaging performed on the results.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.

class pytorch_wrapper.evaluators.RecallEvaluator(threshold=0.5, model_output_key=None, batch_target_key='target', average='binary')¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Recall evaluator.

Parameters:

threshold – Threshold above which an example is considered positive.
model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
average – Type [‘binary’, ‘macro’ or ‘micro’] of averaging performed on the results.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.

class pytorch_wrapper.evaluators.TokenLabelingEvaluatorWrapper(evaluator, batch_input_sequence_length_idx, batch_input_key='input', model_output_key=None, batch_target_key='target', end_padded=True)¶

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Adapter that wraps an evaluator. It is used in token labeling tasks to flatten the output and target while discarding values that are invalid due to padding.

Parameters:

evaluator – The evaluator.
batch_input_sequence_length_idx – The index of the input list where the lengths of the sequences can be found.
batch_input_key – Key of the Dicts returned by the Dataloader objects that corresponds to the input of the model.
model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
batch_target_key – Key where the dict (batch) contains the target values.
end_padded – Whether the sequences are end-padded.

calculate()¶

Called after all batches have been processed. Calculates the metric.

Returns:	AbstractEvaluatorResults object.

reset()¶

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)¶

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:

output – Output of the model.
batch – Dict that contains all information needed for a single batch by the evaluator.
last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
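
The flattening step performed by such a wrapper can be sketched in plain Python. The example assumes the batch input is a list whose element at batch_input_sequence_length_idx holds the sequence lengths, as the parameter description suggests. flatten_valid_tokens, the PAD value and the data are hypothetical, and nested lists stand in for tensors.

```python
def flatten_valid_tokens(padded_sequences, lengths, end_padded=True):
    """Concatenate the non-padding positions of every sequence."""
    flat = []
    for sequence, length in zip(padded_sequences, lengths):
        if end_padded:
            # Valid tokens come first, padding follows.
            flat.extend(sequence[:length])
        else:
            # Padding comes first, valid tokens follow.
            flat.extend(sequence[len(sequence) - length:])
    return flat


PAD = -1  # hypothetical padding marker for targets
batch = {
    'input': [
        [[11, 12, 13], [21, 0, 0]],  # index 0: padded token ids
        [3, 1],                      # index 1: sequence lengths
    ],
    'target': [[1, 0, 1], [0, PAD, PAD]],
}
model_output = [[0.9, 0.2, 0.7], [0.1, 0.5, 0.5]]  # one score per token

batch_input_sequence_length_idx = 1
lengths = batch['input'][batch_input_sequence_length_idx]

flat_output = flatten_valid_tokens(model_output, lengths)
flat_target = flatten_valid_tokens(batch['target'], lengths)

print(flat_output)  # [0.9, 0.2, 0.7, 0.1]
print(flat_target)  # [1, 0, 1, 0]
```

After flattening, the output and target are ordinary one-dimensional sequences of equal length, so they can be handed to the wrapped evaluator as if they came from a plain classification task.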