This post creates a Bidirectional LSTM that learns a simple pattern in a sequence.

Toy Dataset

The simulated dataset contains input sequences of length 10, with values drawn uniformly from [0, 1). Each output sequence contains 0's and 1's: an element is 1 once the cumulative sum of the inputs up to that timestep exceeds a fixed threshold (here 2.5, one quarter of the sequence length).

Data Preparation

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Bidirectional, TimeDistributed
import matplotlib.pyplot as plt

np.random.seed(1234)

# 1000 sequences of length 10, with values drawn uniformly from [0, 1)
X = np.random.random(10000).reshape(-1, 10)

# Label a timestep 1 once the running sum exceeds lim = 10/4 = 2.5
lim = 10 / 4.
Y = np.apply_along_axis(lambda x: np.where(np.cumsum(x) > lim, 1, 0), 1, X)

# Reshape to (samples, timesteps, features), as the LSTM expects
X = X.reshape(1000, 10, 1)
Y = Y.reshape(1000, 10, 1)
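As a quick sanity check (an illustrative addition, not in the original code), we can inspect the first sequence and its labels; the label should flip to 1 at the timestep where the running sum first exceeds lim = 2.5.

print(X[0, :, 0])             # the 10 input values
print(np.cumsum(X[0, :, 0]))  # their running sum
print(Y[0, :, 0])             # 0 until the sum crosses 2.5, then 1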

Building the Model

n_timesteps = 10
model = Sequential()
# The Bidirectional wrapper runs two LSTMs, one forward and one reversed,
# and by default concatenates their outputs: 2 x 50 = 100 units per timestep
model.add(Bidirectional(LSTM(50, return_sequences=True), input_shape=(n_timesteps, 1)))
# TimeDistributed applies the same Dense layer at every timestep
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
model.summary()
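Because merge_mode defaults to concat, the Bidirectional wrapper doubles the LSTM's output width from 50 to 100 units per timestep, and the TimeDistributed Dense then maps each timestep to a single sigmoid output. A quick shape check (added here for illustration):

print(model.output_shape)  # (None, 10, 1): one prediction per timestep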

Training the Model

n_epochs = 50
history = model.fit(X,Y,epochs=n_epochs, validation_split=0.2)
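Since matplotlib is already imported, a minimal sketch of plotting the training curves stored in history.history (the 'loss' and 'val_loss' keys come from the compile and fit calls above, with validation_split providing the validation numbers):

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('binary cross-entropy')
plt.legend()
plt.show()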

Testing the Model

# Fresh held-out data, generated the same way as the training set
X = np.random.random(1000).reshape(-1, 10)
Y = np.apply_along_axis(lambda x: np.where(np.cumsum(x) > lim, 1, 0), 1, X)
X = X.reshape(100, 10, 1)
Y = Y.reshape(100, 10, 1)
print(model.evaluate(X, Y))
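Beyond the aggregate metrics, it helps to look at per-timestep predictions for a single held-out sequence; thresholding the sigmoid outputs at 0.5 recovers 0/1 labels (an illustrative addition):

probs = model.predict(X[:1])           # shape (1, 10, 1)
preds = (probs > 0.5).astype(int)
print(preds[0, :, 0])                  # predicted labels per timestep
print(Y[0, :, 0])                      # true labels for comparison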

Takeaways

  • A Bidirectional LSTM layer is easy to add. Wrapping an LSTM in the Bidirectional wrapper gives rise to two layers: one that processes the sequence in its original order and one that processes it reversed.
  • The merge mode, which controls how the two layers' outputs are combined, can be specified in Keras via the merge_mode argument (the default is concat; a sketch follows after this list). The options are:
    • sum
    • mul
    • concat
    • ave
  • Although Bidirectional LSTMs were developed for speech recognition applications, they have been found to be very useful in other areas, including finance.
  • Needing the TimeDistributed wrapper for the cumsum prediction task has given me a good understanding of it: the wrapper applies the same Dense layer independently at every timestep, so the model emits one prediction per step (also illustrated below).
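As promised above, a minimal sketch of choosing a non-default merge mode; only the merge_mode argument changes. Note that sum, mul, and ave keep the per-timestep width at 50, whereas the default concat doubles it to 100.

model2 = Sequential()
model2.add(Bidirectional(LSTM(50, return_sequences=True),
                         merge_mode='sum', input_shape=(n_timesteps, 1)))
print(model2.output_shape)  # (None, 10, 50); with 'concat' it would be (None, 10, 100)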
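And to make the last takeaway concrete, a small standalone check using the functional API, with shapes chosen to match the model above: TimeDistributed(Dense(1)) applies the same Dense weights at each of the 10 timesteps, mapping the 100-unit Bidirectional output at every step to a single prediction.

from keras.layers import Input
from keras.models import Model

inp = Input(shape=(10, 100))                        # e.g. the Bidirectional output
out = TimeDistributed(Dense(1, activation='sigmoid'))(inp)
print(Model(inp, out).output_shape)                 # (None, 10, 1)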