Evaluators

class pytorch_wrapper.evaluators.AUROCEvaluator(model_output_key=None, batch_target_key='target', average='macro', target_threshold=0.5)

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

AUROC evaluator.

Parameters:
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘macro’ or ‘micro’] of averaging performed on the results in the case of a multi-label task.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
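
As an illustration of the metric itself, AUROC can be read as the fraction of (positive, negative) example pairs that the scores rank correctly, and macro averaging takes the unweighted mean of the per-label values. The sketch below is plain Python and does not use pytorch_wrapper; the helper names `auroc` and `macro_auroc` are hypothetical, and the library's own computation may differ.

```python
def auroc(scores, targets):
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of examples,
    which is fine for a small illustration.
    """
    positives = [s for s, t in zip(scores, targets) if t == 1]
    negatives = [s for s, t in zip(scores, targets) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def macro_auroc(score_rows, target_rows):
    """Unweighted mean of the per-label AUROC values (multi-label case)."""
    n_labels = len(score_rows[0])
    per_label = [
        auroc([row[j] for row in score_rows], [row[j] for row in target_rows])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Single label: 3 of the 4 (positive, negative) pairs are ranked correctly.
single = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75

# Two labels: label 0 scores 0.75, label 1 scores 1.0, so the macro mean is 0.875.
score_rows = [[0.1, 0.9], [0.4, 0.2], [0.35, 0.8], [0.8, 0.3]]
target_rows = [[0, 1], [0, 0], [1, 1], [1, 0]]
macro = macro_auroc(score_rows, target_rows)
print(single, macro)
```
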
class pytorch_wrapper.evaluators.AbstractEvaluator

Bases: abc.ABC

Objects of derived classes are used to evaluate a model on a dataset using a specific metric.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

calculate_at_once(output, dataset, last_activation=None)

Calculates the metric at once for the whole dataset.

Parameters:
  • output – Output of the model.
  • dataset – Dict that contains all information needed for a dataset by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
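
The reset / step / calculate life cycle described above can be sketched as follows. This is a stand-in written in plain Python, not the library's code: `SketchEvaluator`, `MeanAbsoluteErrorSketch` and `evaluate` are hypothetical names, outputs are plain lists rather than tensors, and `calculate()` returns a float here instead of an AbstractEvaluatorResults object.

```python
from abc import ABC, abstractmethod


class SketchEvaluator(ABC):
    """Hypothetical stand-in that mirrors the documented method names."""

    @abstractmethod
    def reset(self):
        ...

    @abstractmethod
    def step(self, output, batch, last_activation=None):
        ...

    @abstractmethod
    def calculate(self):
        ...


class MeanAbsoluteErrorSketch(SketchEvaluator):
    def __init__(self, batch_target_key="target"):
        self._batch_target_key = batch_target_key
        self.reset()

    def reset(self):
        # (Re)initialize the running state before an evaluation pass.
        self._abs_error_sum = 0.0
        self._count = 0

    def step(self, output, batch, last_activation=None):
        # Apply the activation here if the model left it out of forward().
        if last_activation is not None:
            output = [last_activation(o) for o in output]
        targets = batch[self._batch_target_key]
        self._abs_error_sum += sum(abs(o - t) for o, t in zip(output, targets))
        self._count += len(targets)

    def calculate(self):
        # Called once, after every batch has been passed to step().
        return self._abs_error_sum / self._count


def evaluate(evaluator, batches, model):
    """One evaluation pass: reset, one step per batch, then calculate."""
    evaluator.reset()
    for batch in batches:
        evaluator.step(model(batch["input"]), batch)
    return evaluator.calculate()


batches = [
    {"input": [1.0, 2.0], "target": [1.0, 3.0]},
    {"input": [4.0], "target": [2.0]},
]
# The "model" is the identity function, so the errors are 0, 1 and 2.
mae = evaluate(MeanAbsoluteErrorSketch(), batches, model=lambda xs: xs)
print(mae)  # 1.0
```

Keeping only running sums in `step` means the evaluator never has to hold the whole dataset in memory; metrics that need every prediction at once (ranking metrics, for example) would instead accumulate the outputs and defer the work to `calculate`.
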
class pytorch_wrapper.evaluators.AbstractEvaluatorResults

Bases: abc.ABC

Objects of derived classes encapsulate the results of an evaluation metric.

compare_to(other_results_object)

Compares these results with the results of another object.

Parameters: other_results_object – Object of the same class.

is_better_than(other_results_object)

Checks whether these results are better than the results of another object.

Parameters: other_results_object – Object of the same class.
class pytorch_wrapper.evaluators.AccuracyEvaluator(threshold=0.5, model_output_key=None, batch_target_key='target')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Accuracy evaluator.

Parameters:
  • threshold – Threshold above which an example is considered positive.
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
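
The interaction between threshold and last_activation can be illustrated with a small plain-Python sketch. It assumes a binary task and a model that returns raw logits; `thresholded_accuracy` and `sigmoid` are hypothetical helpers, not part of pytorch_wrapper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def thresholded_accuracy(outputs, targets, threshold=0.5, last_activation=None):
    """Accuracy after mapping each output to 1 when it lies above the threshold."""
    if last_activation is not None:
        outputs = [last_activation(o) for o in outputs]
    predictions = [1 if o > threshold else 0 for o in outputs]
    correct = sum(int(p == t) for p, t in zip(predictions, targets))
    return correct / len(targets)


logits = [2.0, -1.0, 0.3, -0.2]
targets = [1, 0, 0, 1]

# With the sigmoid applied, 0.3 maps to about 0.57 and is counted as positive.
with_activation = thresholded_accuracy(logits, targets, last_activation=sigmoid)

# Without it, the raw logit 0.3 is compared against 0.5 and counted as negative.
without_activation = thresholded_accuracy(logits, targets)

print(with_activation, without_activation)  # 0.5 0.75
```

The two calls disagree on the third example, which is why an evaluator that thresholds probabilities needs the activation supplied whenever the model's forward method stops at the logits.
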
class pytorch_wrapper.evaluators.F1Evaluator(threshold=0.5, model_output_key=None, batch_target_key='target', average='binary')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

F1 evaluator.

Parameters:
  • threshold – Threshold above which an example is considered positive.
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘binary’, ‘macro’ or ‘micro’] of averaging performed on the results.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
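
To make the difference between the averaging types concrete, the following plain-Python sketch thresholds multi-label scores and computes F1 per label ('binary' corresponds to the single-label case), as an unweighted mean over labels ('macro'), and from counts pooled over all labels ('micro'). The function names are hypothetical and the code does not call pytorch_wrapper.

```python
def f1_from_counts(tp, fp, fn):
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def label_counts(predictions, targets):
    """True positives, false positives and false negatives for one label."""
    pairs = list(zip(predictions, targets))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return tp, fp, fn


def multilabel_f1(score_rows, target_rows, average, threshold=0.5):
    pred_rows = [[1 if s > threshold else 0 for s in row] for row in score_rows]
    n_labels = len(pred_rows[0])
    counts = [
        label_counts([row[j] for row in pred_rows], [row[j] for row in target_rows])
        for j in range(n_labels)
    ]
    if average == "macro":
        # Every label contributes equally, however rare it is.
        return sum(f1_from_counts(*c) for c in counts) / n_labels
    if average == "micro":
        # Counts are pooled first, so frequent labels dominate.
        tp, fp, fn = (sum(c[i] for c in counts) for i in range(3))
        return f1_from_counts(tp, fp, fn)
    raise ValueError("average must be 'macro' or 'micro' in this sketch")


score_rows = [[0.9, 0.2], [0.8, 0.1], [0.7, 0.4], [0.3, 0.6]]
target_rows = [[1, 0], [1, 0], [1, 1], [0, 0]]

# Label 0 is predicted perfectly (F1 = 1.0); label 1 is always wrong (F1 = 0.0).
macro = multilabel_f1(score_rows, target_rows, "macro")
micro = multilabel_f1(score_rows, target_rows, "micro")
print(macro, micro)  # 0.5 0.75
```
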
class pytorch_wrapper.evaluators.GenericEvaluatorResults(score, label='score', score_format='%f', is_max_better=True)

Bases: pytorch_wrapper.evaluators.AbstractEvaluatorResults

Generic evaluator results.

Parameters:
  • score – Numeric value that represents the score.
  • label – String used in the str representation.
  • score_format – Format String used in the str representation.
  • is_max_better – Flag that signifies if larger means better.

compare_to(other_results_object)

Compares these results with the results of another object.

Parameters: other_results_object – Object of the same class.

is_better_than(other_results_object)

Checks whether these results are better than the results of another object.

Parameters: other_results_object – Object of the same class.

is_max_better
score
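
The role of is_max_better can be sketched with a stand-in class. This is not the library's code: `SketchResults` is a hypothetical name, and the sketch assumes that compare_to returns the signed score difference and that the string representation is the label followed by the formatted score.

```python
class SketchResults:
    """Hypothetical stand-in for a results object with a single numeric score."""

    def __init__(self, score, label="score", score_format="%f", is_max_better=True):
        self.score = score
        self.label = label
        self.score_format = score_format
        self.is_max_better = is_max_better

    def compare_to(self, other_results_object):
        # Assumption: a signed difference; positive means this score is larger.
        return self.score - other_results_object.score

    def is_better_than(self, other_results_object):
        difference = self.compare_to(other_results_object)
        # For metrics such as a loss, smaller is better, so the sign flips.
        return difference > 0 if self.is_max_better else difference < 0

    def __str__(self):
        return (self.label + ": " + self.score_format) % self.score


f1_new = SketchResults(0.81, label="f1", score_format="%.2f")
f1_old = SketchResults(0.78, label="f1", score_format="%.2f")
loss_new = SketchResults(0.30, label="loss", is_max_better=False)
loss_old = SketchResults(0.45, label="loss", is_max_better=False)

print(f1_new, f1_new.is_better_than(f1_old))      # f1: 0.81 True
print(loss_new.is_better_than(loss_old))          # True
```

A flag of this kind lets code that selects the best checkpoint call is_better_than without knowing whether the underlying metric is an accuracy or a loss.
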
class pytorch_wrapper.evaluators.GenericPointWiseLossEvaluator(loss_wrapper, label='loss', score_format='%f', batch_target_key='target')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Adapter that uses an object of a class derived from AbstractLossWrapper to calculate the loss during evaluation.

Parameters:
  • loss_wrapper – AbstractLossWrapper object that calculates the loss.
  • label – Str used as label during printing of the loss.
  • score_format – Format used for str representation of the loss.
  • batch_target_key – Key where the dict (batch) contains the target values.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
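
One way such an adapter can turn per-batch losses into a dataset-level value is to weight each batch mean by its number of examples, so that a smaller final batch does not skew the result. The sketch below assumes that approach; `PointWiseLossSketch` and `mean_squared_error` are hypothetical plain-Python stand-ins, and the library's own accumulation may differ.

```python
def mean_squared_error(outputs, targets):
    """Mean point-wise loss over one batch."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


class PointWiseLossSketch:
    def __init__(self, batch_loss_fn, batch_target_key="target"):
        self._batch_loss_fn = batch_loss_fn
        self._batch_target_key = batch_target_key
        self.reset()

    def reset(self):
        self._loss_sum = 0.0
        self._examples = 0

    def step(self, output, batch, last_activation=None):
        # last_activation is accepted for signature parity but unused here.
        targets = batch[self._batch_target_key]
        batch_mean = self._batch_loss_fn(output, targets)
        # Undo the per-batch mean so every example carries the same weight.
        self._loss_sum += batch_mean * len(targets)
        self._examples += len(targets)

    def calculate(self):
        return self._loss_sum / self._examples


evaluator = PointWiseLossSketch(mean_squared_error)
evaluator.step([1.0, 3.0], {"target": [0.0, 1.0]})  # batch mean 2.5 over 2 examples
evaluator.step([2.0], {"target": [2.0]})            # batch mean 0.0 over 1 example
dataset_loss = evaluator.calculate()
print(dataset_loss)  # 5 / 3, whereas the plain mean of batch means is 1.25
```
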
class pytorch_wrapper.evaluators.MultiClassAccuracyEvaluator(model_output_key=None, batch_target_key='target')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class Accuracy evaluator.

Parameters:
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
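
Unlike the thresholded evaluators, a multi-class metric needs one predicted class per example. The sketch below assumes that the prediction is the index of the highest score in each output row and that targets are class indices; `argmax` and `multiclass_accuracy` are hypothetical plain-Python helpers.

```python
def argmax(row):
    """Index of the largest value in a list (first one wins on ties)."""
    return max(range(len(row)), key=lambda i: row[i])


def multiclass_accuracy(score_rows, targets):
    predictions = [argmax(row) for row in score_rows]
    correct = sum(int(p == t) for p, t in zip(predictions, targets))
    return correct / len(targets)


score_rows = [
    [2.0, 0.5, 0.1],  # predicted class 0
    [0.2, 0.3, 1.5],  # predicted class 2
    [0.9, 1.1, 0.4],  # predicted class 1
    [0.1, 0.2, 0.3],  # predicted class 2
]
targets = [0, 2, 0, 2]

accuracy = multiclass_accuracy(score_rows, targets)
print(accuracy)  # 0.75
```

Because argmax is unchanged by any strictly increasing activation such as softmax, the result in this sketch is the same whether the rows hold logits or probabilities.
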
class pytorch_wrapper.evaluators.MultiClassF1Evaluator(model_output_key=None, batch_target_key='target', average='macro')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class F1 evaluator.

Parameters:
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘macro’ or ‘micro’] of averaging performed on the results.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
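
For single-label multi-class data the choice of average matters more than it may seem: micro-averaged F1 reduces to plain accuracy, while macro averaging exposes classes that are never predicted correctly. The plain-Python sketch below works on predicted and true class indices; the function names are hypothetical and nothing here calls pytorch_wrapper.

```python
def per_class_counts(predictions, targets, n_classes):
    """(tp, fp, fn) for every class, treating each class one-vs-rest."""
    pairs = list(zip(predictions, targets))
    counts = []
    for c in range(n_classes):
        tp = sum(1 for p, t in pairs if p == c and t == c)
        fp = sum(1 for p, t in pairs if p == c and t != c)
        fn = sum(1 for p, t in pairs if p != c and t == c)
        counts.append((tp, fp, fn))
    return counts


def f1_from_counts(tp, fp, fn):
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def multiclass_f1(predictions, targets, n_classes, average="macro"):
    counts = per_class_counts(predictions, targets, n_classes)
    if average == "macro":
        return sum(f1_from_counts(*c) for c in counts) / n_classes
    if average == "micro":
        tp, fp, fn = (sum(c[i] for c in counts) for i in range(3))
        return f1_from_counts(tp, fp, fn)
    raise ValueError("average must be 'macro' or 'micro'")


predictions = [0, 2, 1, 2, 0, 0]
targets = [0, 2, 0, 2, 0, 1]

# Per-class F1: class 0 -> 2/3, class 1 -> 0, class 2 -> 1.
macro = multiclass_f1(predictions, targets, n_classes=3, average="macro")
micro = multiclass_f1(predictions, targets, n_classes=3, average="micro")
accuracy = sum(int(p == t) for p, t in zip(predictions, targets)) / len(targets)
print(macro, micro, accuracy)
```
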
class pytorch_wrapper.evaluators.MultiClassPrecisionEvaluator(model_output_key=None, batch_target_key='target', average='macro')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class Precision evaluator.

Parameters:
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘macro’ or ‘micro’] of averaging performed on the results.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
class pytorch_wrapper.evaluators.MultiClassRecallEvaluator(model_output_key=None, batch_target_key='target', average='macro')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Multi-Class Recall evaluator.

Parameters:
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘macro’ or ‘micro’] of averaging performed on the results.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
class pytorch_wrapper.evaluators.PrecisionEvaluator(threshold=0.5, model_output_key=None, batch_target_key='target', average='binary')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Precision evaluator.

Parameters:
  • threshold – Threshold above which an example is considered positive.
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘binary’, ‘macro’ or ‘micro’] of averaging performed on the results.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
class pytorch_wrapper.evaluators.RecallEvaluator(threshold=0.5, model_output_key=None, batch_target_key='target', average='binary')

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Recall evaluator.

Parameters:
  • threshold – Threshold above which an example is considered positive.
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • average – Type [‘binary’, ‘macro’ or ‘micro’] of averaging performed on the results.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
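
The threshold parameter shared by the precision and recall evaluators trades one metric against the other. The plain-Python sketch below shows this on a handful of binary scores; `precision_recall` is a hypothetical helper and the data is made up for illustration.

```python
def precision_recall(scores, targets, threshold=0.5):
    """Binary precision and recall after thresholding the scores."""
    predictions = [1 if s > threshold else 0 for s in scores]
    pairs = list(zip(predictions, targets))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


scores = [0.95, 0.7, 0.6, 0.4, 0.2]
targets = [1, 1, 0, 1, 0]

low = precision_recall(scores, targets, threshold=0.3)   # (0.75, 1.0)
high = precision_recall(scores, targets, threshold=0.8)  # precision 1.0, recall 1/3
print(low, high)
```

Lowering the threshold marks more examples as positive, which can only keep or raise recall, while raising it keeps just the most confident predictions and tends to favor precision.
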
class pytorch_wrapper.evaluators.TokenLabelingEvaluatorWrapper(evaluator, batch_input_sequence_length_idx, batch_input_key='input', model_output_key=None, batch_target_key='target', end_padded=True)

Bases: pytorch_wrapper.evaluators.AbstractEvaluator

Adapter that wraps an evaluator. It is used in token labeling tasks to flatten the output and target while discarding values that are invalid due to padding.

Parameters:
  • evaluator – The evaluator.
  • batch_input_sequence_length_idx – The index of the input list where the lengths of the sequences can be found.
  • batch_input_key – Key of the Dicts returned by the Dataloader objects that corresponds to the input of the model.
  • model_output_key – Key where the dict returned by the model contains the actual predictions. Leave None if the model returns only the predictions.
  • batch_target_key – Key where the dict (batch) contains the target values.
  • end_padded – Whether the sequences are end-padded.

calculate()

Called after all batches have been processed. Calculates the metric.

Returns: AbstractEvaluatorResults object.

reset()

(Re)initializes the object. Called at the beginning of the evaluation step.

step(output, batch, last_activation=None)

Gathers information needed for performance measurement about a single batch. Called after each batch in the evaluation step.

Parameters:
  • output – Output of the model.
  • batch – Dict that contains all information needed for a single batch by the evaluator.
  • last_activation – The last activation of the model. Some losses work with logits and as such the last activation might not be performed inside the model’s forward method.
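
The flattening step can be sketched in plain Python as follows. The sketch assumes that the batch input is a list whose element at batch_input_sequence_length_idx holds the true sequence lengths, and that end_padded=False means the padding sits at the start of each sequence; `flatten_valid_tokens` is a hypothetical helper, and nested lists stand in for tensors.

```python
def flatten_valid_tokens(output_rows, target_rows, lengths, end_padded=True):
    """Concatenate per-token outputs and targets, skipping padded positions."""
    flat_outputs, flat_targets = [], []
    for out_row, tgt_row, length in zip(output_rows, target_rows, lengths):
        if end_padded:
            valid = slice(0, length)                       # real tokens come first
        else:
            valid = slice(len(out_row) - length, len(out_row))  # real tokens come last
        flat_outputs.extend(out_row[valid])
        flat_targets.extend(tgt_row[valid])
    return flat_outputs, flat_targets


batch_input_sequence_length_idx = 1  # position of the lengths in batch["input"]
batch = {
    "input": [[[5, 8, 2], [7, 3, 0]], [3, 2]],  # token ids, then sequence lengths
    "target": [[1, 0, 1], [0, 1, 0]],
}
output = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.0]]  # one score per token, padding included

lengths = batch["input"][batch_input_sequence_length_idx]
flat_outputs, flat_targets = flatten_valid_tokens(output, batch["target"], lengths)
print(flat_outputs)  # [0.9, 0.2, 0.7, 0.1, 0.8]
print(flat_targets)  # [1, 0, 1, 0, 1]
```

The flattened lists can then be passed to any example-level evaluator, which never sees the padded position of the second sequence.
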