class atten_lstm.SelfAttention(*args, **kwargs)[source]

SelfAttention was originally proposed by Cheng et al., 2016 [1]. This class uses the implementation of Philipperemy from [2], modified so that the units and activation attributes can be changed. The default values of these attributes are the same as those used by the author. Note that there is another implementation of self-attention at [3], but its author cites a different paper, Zheng et al., 2018 [4], and calls it additive attention. A useful discussion of the implementation used in this class can be found at [5].
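For orientation, the computation in [2] is roughly the following: the hidden states are scored against the last hidden state, the scores are softmax-normalized into attention weights, and the resulting context vector is combined with the last state. A minimal sketch with illustrative names (not the exact source code, which additionally projects the states with a trainable Dense before scoring):

>>> import tensorflow as tf
>>> def self_attention_sketch(hidden_states, units=128, activation="tanh"):
...     # hidden_states: (batch, time_steps, lstm_units)
...     h_t = hidden_states[:, -1, :]                        # last state: (batch, lstm_units)
...     score = tf.einsum("bti,bi->bt", hidden_states, h_t)  # (batch, time_steps)
...     weights = tf.nn.softmax(score, axis=-1)              # attention weights
...     context = tf.einsum("bti,bt->bi", hidden_states, weights)
...     concat = tf.concat([context, h_t], axis=-1)
...     attn_vec = tf.keras.layers.Dense(
...         units, activation=activation, use_bias=False)(concat)  # (batch, units)
...     return attn_vec, weights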

Examples

>>> from atten_lstm import SelfAttention
>>> from tensorflow.keras.layers import Input, LSTM, Dense
>>> from tensorflow.keras.models import Model
>>> import numpy as np
>>> inp = Input(shape=(10, 1))
>>> lstm = LSTM(2, return_sequences=True)(inp)
>>> sa, _ = SelfAttention()(lstm)
>>> out = Dense(1)(sa)
...
>>> model = Model(inputs=inp, outputs=out)
>>> model.compile(loss="mse")
...
>>> model.summary()
...
>>> x = np.random.random((100, 10, 1))
>>> y = np.random.random((100, 1))
>>> h = model.fit(x=x, y=y)
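The attention weights discarded above (the second output) can also be inspected by routing them into their own model; per the parameter description below, their shape is (batch_size, time_steps):

>>> vec, weights = SelfAttention()(lstm)
>>> probe = Model(inputs=inp, outputs=weights)
>>> w = probe.predict(x)
>>> w.shape   # (100, 10)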

References

[1] https://arxiv.org/pdf/1601.06733.pdf

[2] https://github.com/philipperemy/keras-attention-mechanism/blob/master/attention/attention.py

[3] https://github.com/CyberZHG/keras-self-attention/blob/master/keras_self_attention/seq_self_attention.py

[4] https://arxiv.org/pdf/1806.01264.pdf

[5] https://github.com/philipperemy/keras-attention-mechanism/issues/14

__init__(units: int = 128, activation: str = 'tanh', return_attention_weights: bool = True, **kwargs)[source]
Parameters
  • units (int, optional (default=128)) – number of units for attention mechanism

  • activation (str, optional (default="tanh")) – activation function to use in attention mechanism

  • return_attention_weights (bool, optional (default=True)) – if True, the layer returns two outputs: the attention vector of shape (batch_size, units) and the attention weights of shape (batch_size, time_steps). If False, only the attention vector is returned (see the snippet after this list).

  • **kwargs – any additional keyword arguments for keras Layer.
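For example, with return_attention_weights=False the layer returns a single tensor, so no tuple unpacking is needed (continuing the example above; units=64 is arbitrary):

>>> sa = SelfAttention(units=64, return_attention_weights=False)(lstm)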

class atten_lstm.AttentionLSTM(*args, **kwargs)[source]

This layer combines the self-attention [7] mechanism with LSTM [8]. It uses one separate LSTM+SelfAttention block for each input feature. The outputs of all LSTM+SelfAttention blocks are concatenated and returned. The layer expects the same input shape as LSTM, i.e. (batch_size, time_steps, input_features). For a complete usage example see [9].
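Conceptually, the wiring looks like the following (a minimal sketch of the description above; the internals are illustrative, not the actual source):

>>> import tensorflow as tf
>>> from tensorflow.keras.layers import LSTM
>>> from atten_lstm import SelfAttention
>>> def attention_lstm_sketch(inputs, num_inputs, lstm_units,
...                           attn_units=128, attn_activation="tanh"):
...     outs = []
...     for i in range(num_inputs):
...         feat = inputs[:, :, i:i+1]        # one feature: (batch, time_steps, 1)
...         h = LSTM(lstm_units, return_sequences=True)(feat)
...         attn_vec, _ = SelfAttention(units=attn_units,
...                                     activation=attn_activation)(h)
...         outs.append(attn_vec)
...     # concatenated per-feature attention vectors: (batch, num_inputs * attn_units)
...     return tf.concat(outs, axis=-1)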

References

[7] https://ai4water.readthedocs.io/en/dev/models/layers.html#selfattention

[8] https://www.tensorflow.org/api_docs/python/tf/keras/layers/LSTM

[9] https://ai4water.readthedocs.io/en/dev/auto_examples/attention_lstm.html#

__init__(num_inputs: int, lstm_units: int, attn_units: int = 128, attn_activation: str = 'tanh', lstm_kwargs: Optional[dict] = None, **kwargs)[source]
Parameters
  • num_inputs (int) – number of inputs

  • lstm_units (int) – number of units in LSTM layers

  • attn_units (int, optional (default=128)) – number of units in SelfAttention layers

  • attn_activation (str, optional (default="tanh")) – activation function in SelfAttention layers

  • lstm_kwargs (dict, optional (default=None)) – any keyword arguments for the LSTM layer.

  • **kwargs – any additional keyword arguments for keras Layer.

Example

>>> import numpy as np
>>> from tensorflow.keras.models import Model
>>> from tensorflow.keras.layers import Input, Dense
>>> from atten_lstm import AttentionLSTM
>>> seq_len = 20
>>> num_inputs = 2
>>> inp = Input(shape=(seq_len, num_inputs))
>>> outs = AttentionLSTM(num_inputs, 16)(inp)
>>> outs = Dense(1)(outs)
...
>>> model = Model(inputs=inp, outputs=outs)
>>> model.compile(loss="mse")
...
>>> model.summary()
... # define input
>>> x = np.random.random((100, seq_len, num_inputs))
>>> y = np.random.random((100, 1))
>>> h = model.fit(x=x, y=y)
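lstm_kwargs is forwarded to the internal LSTM layers, so standard LSTM keyword arguments should be usable there; a hedged example, reusing inp from above (dropout is assumed to be a valid key because it is a standard LSTM argument):

>>> outs2 = AttentionLSTM(num_inputs, 16, attn_units=64,
...                       lstm_kwargs={"dropout": 0.2})(inp)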
build(input_shape)

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
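For subclass implementers, a typical build() creates the weights whose shapes depend on the input; a generic Keras pattern (not specific to this package):

>>> import tensorflow as tf
>>> class MyDense(tf.keras.layers.Layer):
...     def __init__(self, units, **kwargs):
...         super().__init__(**kwargs)
...         self.units = units
...     def build(self, input_shape):
...         # weight shapes depend on the input, so they are created here
...         self.w = self.add_weight(shape=(input_shape[-1], self.units),
...                                  initializer="glorot_uniform")
...         self.b = self.add_weight(shape=(self.units,), initializer="zeros")
...     def call(self, inputs):
...         return tf.matmul(inputs, self.w) + self.b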

call(inputs, *args, **kwargs)

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.
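For example, a layer that uses the reserved training argument to behave differently during training and inference (a generic Keras pattern, not specific to this package):

>>> import tensorflow as tf
>>> class NoisyLayer(tf.keras.layers.Layer):
...     def call(self, inputs, training=None):
...         if training:
...             # add noise only during training; identity at inference
...             return inputs + tf.random.normal(tf.shape(inputs))
...         return inputs
>>> y = NoisyLayer()(tf.ones((2, 3)), training=True)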
