This post creates a Bidirectional LSTM that learns a simple pattern in a sequence.

Toy Dataset

The simulated dataset contains input sequences of length 10, with values drawn uniformly from [0, 1). Each output sequence contains 0's and 1's: an element is 1 once the cumulative sum of the inputs up to that timestep exceeds a fixed threshold (here 2.5, one quarter of the sequence length).

Data Preparation

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Bidirectional, TimeDistributed
import matplotlib.pyplot as plt

np.random.seed(1234)

# 1000 sequences of length 10, with values drawn uniformly from [0, 1)
X = np.random.random(10000).reshape(-1, 10)

# Label a timestep 1 once the running sum exceeds lim = 10/4 = 2.5
lim = 10 / 4.
Y = np.apply_along_axis(lambda x: np.where(np.cumsum(x) > lim, 1, 0), 1, X)

# Reshape to (samples, timesteps, features), as the LSTM expects
X = X.reshape(1000, 10, 1)
Y = Y.reshape(1000, 10, 1)
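As a quick sanity check (an illustrative addition, not in the original code), we can inspect the first sequence and its labels; the label should flip to 1 at the timestep where the running sum first exceeds lim = 2.5.

print(X[0, :, 0])             # the 10 input values
print(np.cumsum(X[0, :, 0]))  # their running sum
print(Y[0, :, 0])             # 0 until the sum crosses 2.5, then 1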

Building the Model

n_timesteps = 10
model = Sequential()
# The Bidirectional wrapper runs two LSTMs, one forward and one reversed,
# and by default concatenates their outputs: 2 x 50 = 100 units per timestep
model.add(Bidirectional(LSTM(50, return_sequences=True), input_shape=(n_timesteps, 1)))
# TimeDistributed applies the same Dense layer at every timestep
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
model.summary()
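Because merge_mode defaults to concat, the Bidirectional wrapper doubles the LSTM's output width from 50 to 100 units per timestep, and the TimeDistributed Dense then maps each timestep to a single sigmoid output. A quick shape check (added here for illustration):

print(model.output_shape)  # (None, 10, 1): one prediction per timestep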

Training the Model

n_epochs = 50
history = model.fit(X,Y,epochs=n_epochs, validation_split=0.2)
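Since matplotlib is already imported, a minimal sketch of plotting the training curves stored in history.history (the 'loss' and 'val_loss' keys come from the compile and fit calls above, with validation_split providing the validation numbers):

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('binary cross-entropy')
plt.legend()
plt.show()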

Testing the Model

# Fresh held-out data, generated the same way as the training set
X = np.random.random(1000).reshape(-1, 10)
Y = np.apply_along_axis(lambda x: np.where(np.cumsum(x) > lim, 1, 0), 1, X)
X = X.reshape(100, 10, 1)
Y = Y.reshape(100, 10, 1)
print(model.evaluate(X, Y))
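Beyond the aggregate metrics, it helps to look at per-timestep predictions for a single held-out sequence; thresholding the sigmoid outputs at 0.5 recovers 0/1 labels (an illustrative addition):

probs = model.predict(X[:1])           # shape (1, 10, 1)
preds = (probs > 0.5).astype(int)
print(preds[0, :, 0])                  # predicted labels per timestep
print(Y[0, :, 0])                      # true labels for comparison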

Takeaways

  • A Bidirectional LSTM layer is easy to add. Wrapping an LSTM in the Bidirectional wrapper gives rise to two layers: one that processes the sequence in its original order and one that processes it reversed.
  • The merge mode, which controls how the two layers' outputs are combined, can be specified in Keras via the merge_mode argument (the default is concat; a sketch follows after this list). The options are:
    • sum
    • mul
    • concat
    • ave
  • Although Bidirectional LSTMs were developed for speech recognition applications, they have been found to be very useful in other areas, including finance.
  • Needing the TimeDistributed wrapper for the cumsum prediction task has given me a good understanding of it: the wrapper applies the same Dense layer independently at every timestep, so the model emits one prediction per step (also illustrated below).
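As promised above, a minimal sketch of choosing a non-default merge mode; only the merge_mode argument changes. Note that sum, mul, and ave keep the per-timestep width at 50, whereas the default concat doubles it to 100.

model2 = Sequential()
model2.add(Bidirectional(LSTM(50, return_sequences=True),
                         merge_mode='sum', input_shape=(n_timesteps, 1)))
print(model2.output_shape)  # (None, 10, 50); with 'concat' it would be (None, 10, 100)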
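And to make the last takeaway concrete, a small standalone check using the functional API, with shapes chosen to match the model above: TimeDistributed(Dense(1)) applies the same Dense weights at each of the 10 timesteps, mapping the 100-unit Bidirectional output at every step to a single prediction.

from keras.layers import Input
from keras.models import Model

inp = Input(shape=(10, 100))                        # e.g. the Bidirectional output
out = TimeDistributed(Dense(1, activation='sigmoid'))(inp)
print(Model(inp, out).output_shape)                 # (None, 10, 1)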