Modules

Dynamic Self Attention Encoder

class pytorch_wrapper.modules.dynamic_self_attention_encoder.DynamicSelfAttentionEncoder(time_step_size, att_scores_nb=1, att_iterations=2, projection_size=100, projection_activation=<sphinx.ext.autodoc.importer._MockObject object>, attended_representation_activation=<sphinx.ext.autodoc.importer._MockObject object>, is_end_padded=True)

Bases: torch.nn.Module

Dynamic Self Attention Encoder (https://arxiv.org/abs/1808.07383).

Parameters:
  • time_step_size – Size of each time step (the last dimension of the input).
  • att_scores_nb – Number of attended representations.
  • att_iterations – Number of iterations of the dynamic self-attention algorithm.
  • projection_size – Size of the projection layer.
  • projection_activation – Callable that creates the activation of the projection layer.
  • attended_representation_activation – Callable that creates the activation used on the attended representations after each iteration.
  • is_end_padded – Whether to mask at the end.
forward(batch_sequences, batch_sequence_lengths)
Parameters:
  • batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size).
  • batch_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the sequences.
Returns:

2D Tensor (batch_size, projection_size * att_scores_nb) containing the encodings.
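
A minimal usage sketch (the example sizes are arbitrary; the shapes follow the signature above):

    import torch
    from pytorch_wrapper.modules.dynamic_self_attention_encoder import DynamicSelfAttentionEncoder

    encoder = DynamicSelfAttentionEncoder(time_step_size=64, att_scores_nb=2)
    batch_sequences = torch.randn(8, 20, 64)       # (batch_size, sequence_length, time_step_size)
    lengths = torch.randint(1, 21, (8,))           # real length of each sequence
    encodings = encoder(batch_sequences, lengths)  # (8, projection_size * att_scores_nb) = (8, 200)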

Embedding Layer

class pytorch_wrapper.modules.embedding_layer.EmbeddingLayer(vocab_size, emb_size, trainable, padding_idx=None)

Bases: torch.nn.Module

Embedding Layer.

Parameters:
  • vocab_size – Size of the vocabulary.
  • emb_size – Size of the embeddings.
  • trainable – Whether the embeddings should be altered during training.
  • padding_idx – Index of the vector to be initialized with zeros.
forward(x)
load_embeddings(embeddings)

Loads pre-trained embeddings.

Parameters: embeddings – NumPy array of shape (vocab_size, emb_size) containing the pre-trained embeddings.
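
A minimal sketch of loading pre-trained vectors; the random array is a stand-in for real embeddings, and forward is assumed to take a tensor of token indices (the usual contract for an embedding layer, though not stated above):

    import numpy as np
    import torch
    from pytorch_wrapper.modules.embedding_layer import EmbeddingLayer

    emb_layer = EmbeddingLayer(vocab_size=10000, emb_size=50, trainable=False, padding_idx=0)
    pretrained = np.random.rand(10000, 50).astype('float32')  # stand-in for real pre-trained vectors
    emb_layer.load_embeddings(pretrained)
    token_ids = torch.tensor([[4, 17, 2, 0]])  # (batch_size, sequence_length) token indices (assumed input)
    embeddings = emb_layer(token_ids)          # (1, 4, 50)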

Layer Norm

class pytorch_wrapper.modules.layer_norm.LayerNorm(last_dim_size, eps=1e-06)

Bases: torch.nn.Module

Layer Normalization (https://arxiv.org/pdf/1607.06450.pdf).

Parameters:
  • last_dim_size – Size of last dimension.
  • eps – Small number for numerical stability (avoid division by zero).
forward(x)
Parameters: x – Tensor to be layer-normalized.
Returns: Layer-normalized Tensor.
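
A minimal usage sketch:

    import torch
    from pytorch_wrapper.modules.layer_norm import LayerNorm

    ln = LayerNorm(last_dim_size=128)
    x = torch.randn(8, 20, 128)
    y = ln(x)  # same shape as x, normalized over the last dimension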

MLP

class pytorch_wrapper.modules.mlp.MLP(input_size, input_activation=None, input_dp=None, input_pre_activation_bn=False, input_post_activation_bn=False, input_pre_activation_ln=False, input_post_activation_ln=False, num_hidden_layers=1, hidden_layer_size=128, hidden_layer_bias=True, hidden_layer_init=None, hidden_layer_bias_init=None, hidden_activation=<sphinx.ext.autodoc.importer._MockObject object>, hidden_dp=None, hidden_layer_pre_activation_bn=False, hidden_layer_post_activation_bn=False, hidden_layer_pre_activation_ln=False, hidden_layer_post_activation_ln=False, output_layer_init=None, output_layer_bias_init=None, output_size=1, output_layer_bias=True, output_activation=None, output_dp=None, output_layer_pre_activation_bn=False, output_layer_post_activation_bn=False, output_layer_pre_activation_ln=False, output_layer_post_activation_ln=False)

Bases: torch.nn.Module

Multi Layer Perceptron.

Parameters:
  • input_size – Size of the last dimension of the input.
  • input_activation – Callable that creates the activation used on the input.
  • input_dp – Dropout probability for the input.
  • input_pre_activation_bn – Whether to use batch normalization before the activation of the input layer.
  • input_post_activation_bn – Whether to use batch normalization after the activation of the input layer.
  • input_pre_activation_ln – Whether to use layer normalization before the activation of the input layer.
  • input_post_activation_ln – Whether to use layer normalization after the activation of the input layer.
  • num_hidden_layers – Number of hidden layers.
  • hidden_layer_size – Size of hidden layers. It is also possible to provide a list containing a different size for each hidden layer.
  • hidden_layer_bias – Whether to use bias. It is also possible to provide a list containing a different option for each hidden layer.
  • hidden_layer_init – Callable that initializes inplace the weights of the hidden layers.
  • hidden_layer_bias_init – Callable that initializes inplace the bias of the hidden layers.
  • hidden_activation – Callable that creates the activation used after each hidden layer. It is also possible to provide a list containing num_hidden_layers callables.
  • hidden_dp – Dropout probability for the hidden layers. It is also possible to provide a list containing num_hidden_layers probabilities.
  • hidden_layer_pre_activation_bn – Whether to use batch normalization before the activation of each hidden layer.
  • hidden_layer_post_activation_bn – Whether to use batch normalization after the activation of each hidden layer.
  • hidden_layer_pre_activation_ln – Whether to use layer normalization before the activation of each hidden layer.
  • hidden_layer_post_activation_ln – Whether to use layer normalization after the activation of each hidden layer.
  • output_layer_init – Callable that initializes inplace the weights of the output layer.
  • output_layer_bias_init – Callable that initializes inplace the bias of the output layer.
  • output_size – Output size.
  • output_layer_bias – Whether to use bias.
  • output_activation – Callable that creates the activation used after the output layer.
  • output_dp – Dropout probability for the output layer.
  • output_layer_pre_activation_bn – Whether to use batch normalization before the activation of the output layer.
  • output_layer_post_activation_bn – Whether to use batch normalization after the activation of the output layer.
  • output_layer_pre_activation_ln – Whether to use layer normalization before the activation of the output layer.
  • output_layer_post_activation_ln – Whether to use layer normalization after the activation of the output layer.
forward(x)
Parameters: x – Tensor whose last dimension is of size input_size.
Returns: Tensor with the same shape as x, except that the last dimension is of size output_size.
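
A minimal usage sketch; torch.nn.ReLU stands in for any callable that creates an activation, and the list arguments illustrate the per-hidden-layer options described above:

    import torch
    from pytorch_wrapper.modules.mlp import MLP

    mlp = MLP(
        input_size=64,
        num_hidden_layers=2,
        hidden_layer_size=[128, 64],      # a different size for each hidden layer
        hidden_activation=torch.nn.ReLU,  # callable that creates the activation
        hidden_dp=0.2,                    # dropout probability for the hidden layers
        output_size=10,
    )
    x = torch.randn(32, 64)
    logits = mlp(x)  # (32, 10)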

Multi-Head Attention

class pytorch_wrapper.modules.multi_head_attention.MultiHeadAttention(q_time_step_size, k_time_step_size, v_time_step_size, heads, attention_type='dot', dp=0, is_end_padded=True)

Bases: torch.nn.Module

Multi Head Attention (https://arxiv.org/pdf/1706.03762.pdf).

Parameters:
  • q_time_step_size – Query time step size.
  • k_time_step_size – Key time step size.
  • v_time_step_size – Value time step size.
  • heads – Number of attention heads.
  • attention_type – Attention type [‘dot’, ‘multiplicative’, ‘additive’].
  • dp – Dropout probability.
  • is_end_padded – Whether to mask at the end.
forward(q, k, v, q_sequence_lengths, k_sequence_lengths)
Parameters:
  • q – 3D Tensor (batch_size, q_sequence_length, q_time_step_size) containing the queries.
  • k – 3D Tensor (batch_size, k_sequence_length, k_time_step_size) containing the keys.
  • v – 3D Tensor (batch_size, k_sequence_length, v_time_step_size) containing the values.
  • q_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the query sequences.
  • k_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the key sequences.
Returns:

3D Tensor (batch_size, q_sequence_length, time_step_size).
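
A minimal usage sketch with equal query/key/value time step sizes (an arbitrary choice; they may differ):

    import torch
    from pytorch_wrapper.modules.multi_head_attention import MultiHeadAttention

    mha = MultiHeadAttention(q_time_step_size=64, k_time_step_size=64, v_time_step_size=64, heads=8)
    q = torch.randn(4, 10, 64)  # queries
    k = torch.randn(4, 15, 64)  # keys
    v = torch.randn(4, 15, 64)  # values (same sequence length as the keys)
    q_lengths = torch.full((4,), 10, dtype=torch.long)
    k_lengths = torch.full((4,), 15, dtype=torch.long)
    out = mha(q, k, v, q_lengths, k_lengths)  # (4, 10, 64)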

Residual

class pytorch_wrapper.modules.residual.Residual(module, residual_index=None, model_output_key=None)

Bases: torch.nn.Module

Adds the input of a module to its output.

Parameters:
  • module – The module to wrap.
  • residual_index – The index of the input to be added. Leave None if it is not a multi-input module.
  • model_output_key – The key of the wrapped module's output to be added. Leave None if it is not a multi-output module.
forward(*x)
Parameters: x – The input of the wrapped module.
Returns: The output of the wrapped module added to its input.
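
A minimal sketch wrapping an MLP; the inner module's output size is chosen to match its input size so that the addition is well defined:

    import torch
    from pytorch_wrapper.modules.mlp import MLP
    from pytorch_wrapper.modules.residual import Residual

    inner = MLP(input_size=64, output_size=64)  # output size matches input size
    block = Residual(inner)
    x = torch.randn(8, 64)
    y = block(x)  # inner(x) + x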

Sequence Basic CNN Block

class pytorch_wrapper.modules.sequence_basic_cnn_block.SequenceBasicCNNBlock(time_step_size, kernel_height=3, out_channels=300, activation=<sphinx.ext.autodoc.importer._MockObject object>, dp=0)

Bases: torch.nn.Module

Sequence Basic CNN Block.

Parameters:
  • time_step_size – Time step size.
  • kernel_height – Filter height.
  • out_channels – Number of filters.
  • activation – Callable that creates the activation function.
  • dp – Dropout probability.
forward(batch_sequences)
Parameters: batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size) containing the sequences.
Returns: 3D Tensor (batch_size, sequence_length, out_channels) containing the encodings.
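
A minimal usage sketch:

    import torch
    from pytorch_wrapper.modules.sequence_basic_cnn_block import SequenceBasicCNNBlock

    block = SequenceBasicCNNBlock(time_step_size=50, kernel_height=3, out_channels=100)
    batch_sequences = torch.randn(8, 30, 50)  # (batch_size, sequence_length, time_step_size)
    out = block(batch_sequences)              # (8, 30, 100)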

Sequence Basic CNN Encoder

class pytorch_wrapper.modules.sequence_basic_cnn_encoder.SequenceBasicCNNEncoder(time_step_size, input_activation=None, kernel_heights=(1, 2, 3, 4, 5), out_channels=300, pre_pooling_activation=<sphinx.ext.autodoc.importer._MockObject object>, pooling_function=<sphinx.ext.autodoc.importer._MockObject object>, post_pooling_activation=None, post_pooling_dp=0)

Bases: torch.nn.Module

Basic CNN Encoder for sequences (https://arxiv.org/abs/1408.5882).

Parameters:
  • time_step_size – Time step size.
  • input_activation – Callable that creates the activation used on the input.
  • kernel_heights – Tuple containing filter heights.
  • out_channels – Number of filters for each filter height.
  • pre_pooling_activation – Callable that creates the activation used before pooling.
  • pooling_function – Callable that performs pooling over the sequence dimension, applied after the pre-pooling activation.
  • post_pooling_activation – Callable that creates the activation used after pooling.
  • post_pooling_dp – Dropout probability after pooling.
forward(batch_sequences)
Parameters: batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size) containing the sequences.
Returns: 2D Tensor (batch_size, len(kernel_heights) * out_channels) containing the encodings.
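
A minimal usage sketch:

    import torch
    from pytorch_wrapper.modules.sequence_basic_cnn_encoder import SequenceBasicCNNEncoder

    encoder = SequenceBasicCNNEncoder(time_step_size=50, kernel_heights=(2, 3), out_channels=100)
    batch_sequences = torch.randn(8, 30, 50)
    encodings = encoder(batch_sequences)  # (8, len(kernel_heights) * out_channels) = (8, 200)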

Sequence Dense CNN

class pytorch_wrapper.modules.sequence_dense_cnn.SequenceDenseCNN(input_size, projection_layer_size=150, kernel_heights=(3, 5), feature_map_increase=75, cnn_depth=3, output_projection_layer_size=300, activation=<sphinx.ext.autodoc.importer._MockObject object>, dp=0, normalize_output=True)

Bases: torch.nn.Module

Dense CNN for sequences (https://arxiv.org/abs/1808.07383).

Parameters:
  • input_size – Size of each time step of the input sequences.
  • projection_layer_size – Size of the projection layer.
  • kernel_heights – Tuple containing the kernel heights of the filters.
  • feature_map_increase – Number of filters in each convolutional layer (with dense connections, the number of feature maps grows by this amount per layer).
  • cnn_depth – Number of convolutional layers per kernel height.
  • output_projection_layer_size – Size of each output time step (the last dimension of the output).
  • activation – Callable that creates the activation used after each layer.
  • dp – Dropout probability.
  • normalize_output – Whether to perform l2 normalization on the output.
forward(batch_sequences)
Parameters: batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size).
Returns: 3D Tensor (batch_size, sequence_length, output_projection_layer_size).
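
A minimal usage sketch:

    import torch
    from pytorch_wrapper.modules.sequence_dense_cnn import SequenceDenseCNN

    encoder = SequenceDenseCNN(input_size=50, output_projection_layer_size=128)
    batch_sequences = torch.randn(8, 30, 50)
    out = encoder(batch_sequences)  # (8, 30, 128)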

Sinusoidal Positional Embedding Layer

class pytorch_wrapper.modules.sinusoidal_positional_embedding_layer.SinusoidalPositionalEmbeddingLayer(emb_size, pad_at_end=True, init_max_sentence_length=1024)

Bases: torch.nn.Module

Sinusoidal Positional Embeddings (https://arxiv.org/pdf/1706.03762.pdf).

Parameters:
  • emb_size – Size of the positional embeddings.
  • pad_at_end – Whether to pad at the end.
  • init_max_sentence_length – Initial maximum sentence length.
create_embeddings(num_embeddings)
forward(length_tensor, max_sequence_length)
Parameters:
  • length_tensor – ND Tensor containing the real lengths.
  • max_sequence_length – Int corresponding to the size of the (N+1)th dimension (the maximum sequence length).
Returns:

(N+2)D Tensor with the positional embeddings.
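
A minimal usage sketch with a 1D lengths tensor (N = 1), so the result is 3D:

    import torch
    from pytorch_wrapper.modules.sinusoidal_positional_embedding_layer import SinusoidalPositionalEmbeddingLayer

    pos = SinusoidalPositionalEmbeddingLayer(emb_size=64)
    lengths = torch.tensor([10, 7, 12])  # 1D tensor of real lengths
    pos_embs = pos(lengths, 12)          # (3, 12, 64) positional embeddings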

Softmax Attention Encoder

class pytorch_wrapper.modules.softmax_attention_encoder.SoftmaxAttentionEncoder(attention_mlp, is_end_padded=True)

Bases: torch.nn.Module

Encodes a sequence using context-based soft-max attention.

Parameters:
  • attention_mlp – MLP object used to generate unnormalized attention score(s). If the last dimension of the tensor returned by the MLP is larger than 1 then multi-attention is applied.
  • is_end_padded – Whether to mask at the end.
forward(batch_sequences, batch_context_vector, batch_sequence_lengths)
Parameters:
  • batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size).
  • batch_context_vector – 2D Tensor (batch_size, context_vector_size).
  • batch_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the sequences.
Returns:

Dict with two entries:
  • `output` – 2D Tensor (batch_size, time_step_size) containing the encodings, or a 3D Tensor (batch_size, nb_attentions, time_step_size) in the case of multi-attention.
  • `att_scores` – 2D Tensor (batch_size, sequence_length) containing the attention scores, or a 3D Tensor (batch_size, sequence_length, nb_attentions) in the case of multi-attention.
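
A minimal usage sketch. It assumes the attention MLP receives each time step concatenated with the context vector (hence input_size = time_step_size + context_vector_size); check the implementation for the exact contract:

    import torch
    from pytorch_wrapper.modules.mlp import MLP
    from pytorch_wrapper.modules.softmax_attention_encoder import SoftmaxAttentionEncoder

    attention_mlp = MLP(input_size=64 + 32, output_size=1)  # assumed: scores [time step; context]
    encoder = SoftmaxAttentionEncoder(attention_mlp)
    batch_sequences = torch.randn(8, 20, 64)
    batch_context_vector = torch.randn(8, 32)
    lengths = torch.randint(1, 21, (8,))
    res = encoder(batch_sequences, batch_context_vector, lengths)
    encodings = res['output']       # (8, 64)
    att_scores = res['att_scores']  # (8, 20)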

Softmax Self Attention Encoder

class pytorch_wrapper.modules.softmax_self_attention_encoder.SoftmaxSelfAttentionEncoder(attention_mlp, is_end_padded=True)

Bases: torch.nn.Module

Encodes a sequence using soft-max self-attention.

Parameters:
  • attention_mlp – MLP object used to generate unnormalized attention score(s). If the last dimension of the tensor returned by the MLP is larger than 1 then multi-attention is applied.
  • is_end_padded – Whether to mask at the end.
forward(batch_sequences, batch_sequence_lengths)
Parameters:
  • batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size).
  • batch_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the sequences.
Returns:

Dict with two entries:
  • `output` – 2D Tensor (batch_size, time_step_size) containing the encodings, or a 3D Tensor (batch_size, nb_attentions, time_step_size) in the case of multi-attention.
  • `att_scores` – 2D Tensor (batch_size, sequence_length) containing the attention scores, or a 3D Tensor (batch_size, sequence_length, nb_attentions) in the case of multi-attention.
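
A minimal usage sketch with multi-attention (the MLP's output size is larger than 1):

    import torch
    from pytorch_wrapper.modules.mlp import MLP
    from pytorch_wrapper.modules.softmax_self_attention_encoder import SoftmaxSelfAttentionEncoder

    attention_mlp = MLP(input_size=64, output_size=2)  # 2 unnormalized scores per time step
    encoder = SoftmaxSelfAttentionEncoder(attention_mlp)
    batch_sequences = torch.randn(8, 20, 64)
    lengths = torch.randint(1, 21, (8,))
    res = encoder(batch_sequences, lengths)
    encodings = res['output']       # (8, 2, 64), nb_attentions = 2
    att_scores = res['att_scores']  # (8, 20, 2)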

Transformer Encoder

class pytorch_wrapper.modules.transformer_encoder.TransformerEncoder(time_step_size, heads, depth, dp=0, use_positional_embeddings=True, is_end_padded=True)

Bases: torch.nn.Module

Transformer Encoder (https://arxiv.org/pdf/1706.03762.pdf).

Parameters:
  • time_step_size – Time step size.
  • heads – Number of attention heads.
  • depth – Number of transformer blocks.
  • dp – Dropout probability.
  • use_positional_embeddings – Whether to use positional embeddings.
  • is_end_padded – Whether to mask at the end.
forward(batch_sequences, batch_sequence_lengths)
Parameters:
  • batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size).
  • batch_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the sequences.
Returns:

3D Tensor (batch_size, sequence_length, time_step_size).
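
A minimal usage sketch (time_step_size is chosen divisible by heads, as is standard for multi-head attention):

    import torch
    from pytorch_wrapper.modules.transformer_encoder import TransformerEncoder

    encoder = TransformerEncoder(time_step_size=64, heads=8, depth=2, dp=0.1)
    batch_sequences = torch.randn(8, 20, 64)
    lengths = torch.randint(1, 21, (8,))
    out = encoder(batch_sequences, lengths)  # (8, 20, 64)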

Transformer Encoder Block

class pytorch_wrapper.modules.transformer_encoder_block.TransformerEncoderBlock(time_step_size, heads, out_mlp, dp=0, is_end_padded=True)

Bases: torch.nn.Module

Transformer Encoder Block (https://arxiv.org/pdf/1706.03762.pdf).

Parameters:
  • time_step_size – Time step size.
  • heads – Number of attention heads.
  • out_mlp – MLP applied after the attended sequence is generated.
  • dp – Dropout probability.
  • is_end_padded – Whether to mask at the end.
forward(batch_sequences, batch_sequence_lengths)
Parameters:
  • batch_sequences – 3D Tensor (batch_size, sequence_length, time_step_size).
  • batch_sequence_lengths – 1D Tensor (batch_size) containing the lengths of the sequences.
Returns:

3D Tensor (batch_size, sequence_length, time_step_size).
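
A minimal sketch; the out_mlp is assumed to map each time step back to time_step_size so that the block's output keeps the documented shape:

    import torch
    from pytorch_wrapper.modules.mlp import MLP
    from pytorch_wrapper.modules.transformer_encoder_block import TransformerEncoderBlock

    out_mlp = MLP(input_size=64, hidden_layer_size=256, output_size=64)  # position-wise MLP (assumed sizes)
    block = TransformerEncoderBlock(time_step_size=64, heads=8, out_mlp=out_mlp)
    batch_sequences = torch.randn(8, 20, 64)
    lengths = torch.randint(1, 21, (8,))
    out = block(batch_sequences, lengths)  # (8, 20, 64)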