In this article, I will explain how to code a simple RNN that learns a simple shift in a pattern, i.e. an offset drawn from a normal distribution.
Compared to previous implementations, where we used OutputProjectionWrapper, this code does away with that component and handles the output projection more efficiently, by reshaping the RNN outputs and applying a single fully connected layer.
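For comparison, here is a minimal sketch of what the OutputProjectionWrapper version would look like (assuming the same tf_X, time_steps and hidden_units defined in the model section below); the wrapper applies the size-1 projection inside the cell at every time step:

rnn_cell = tf.contrib.rnn.OutputProjectionWrapper(
    tf.contrib.rnn.BasicRNNCell(num_units=hidden_units, activation=tf.nn.relu),
    output_size=1)
outputs, states = tf.nn.dynamic_rnn(rnn_cell, inputs=tf_X, dtype=tf.float32)
# outputs already has shape (batch, time_steps, 1), so no reshape or dense layer is needed,
# but the per-step projection is exactly what the code below avoids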
Create Training and Validation Data
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf

input_seed = 1234
time_steps = 24
n_samples = 100000

# Each input sequence is 24 random integers in [1, 30)
X = np.random.randint(1, 30, n_samples * time_steps).reshape(n_samples, time_steps)
# Each target sequence is the input shifted by a single offset drawn from N(3, 1)
Y = np.apply_along_axis(lambda x: x + np.random.normal(3, 1, 1), 1, X)

# Add the feature dimension expected by dynamic_rnn: (samples, time_steps, 1)
X = X.reshape(X.shape[0], X.shape[1], 1)
Y = Y.reshape(Y.shape[0], Y.shape[1], 1)

# Shuffle and split into training and validation sets
np.random.seed(input_seed)
idx = np.arange(len(X))
np.random.shuffle(idx)
X, Y = X[idx, :, :], Y[idx, :, :]
X_train, X_valid, Y_train, Y_valid = train_test_split(
    X, Y, test_size=0.25, random_state=input_seed)
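As a quick sanity check (not part of the original listing), you can confirm the shapes before moving on; with test_size=0.25 the split should give 75,000 training and 25,000 validation sequences:

print(X_train.shape, Y_train.shape)   # expected: (75000, 24, 1) (75000, 24, 1)
print(X_valid.shape, Y_valid.shape)   # expected: (25000, 24, 1) (25000, 24, 1)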
Set up the RNN Model in TensorFlow
tf.reset_default_graph()
hidden_units = 32

# Placeholders for the input and target sequences: (batch, time_steps, 1)
tf_X = tf.placeholder(tf.float32, shape=[None, time_steps, 1])
tf_Y = tf.placeholder(tf.float32, shape=[None, time_steps, 1])

# A basic RNN cell with 32 hidden units, unrolled over the sequence by dynamic_rnn
rnn_cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_units, activation=tf.nn.relu)
outputs, states = tf.nn.dynamic_rnn(rnn_cell, inputs=tf_X, dtype=tf.float32)

# Instead of OutputProjectionWrapper: stack all time steps, apply one dense layer
# mapping hidden_units -> 1, then restore the (batch, time_steps, 1) shape
stacked_outputs = tf.reshape(outputs, [-1, hidden_units])
stacked_outputs = tf.contrib.layers.fully_connected(stacked_outputs, 1, activation_fn=None)
outputs = tf.reshape(stacked_outputs, [-1, time_steps, 1])

# Mean squared error loss with the Adam optimizer
loss = tf.square(outputs - tf_Y)
total_loss = tf.reduce_mean(loss)
learning_rate = 0.001
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss=total_loss)

batch_size = 1000
n_batches = int(X_train.shape[0] / batch_size)
epochs = 20
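To make the reshape trick explicit, this is how the tensor shapes flow through the projection (a sketch assuming a batch of 1,000 sequences):

# outputs from dynamic_rnn:  (1000, 24, 32)
# after tf.reshape:          (24000, 32)
# after fully_connected:     (24000, 1)
# after the final reshape:   (1000, 24, 1), i.e. one prediction per time step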
Train the Model
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for e in range(epochs):
        # Shuffle the training set at the start of every epoch
        idx = np.arange(len(X_train))
        np.random.shuffle(idx)
        X_train, Y_train = X_train[idx, :, :], Y_train[idx, :, :]
        for i in range(n_batches):
            x = X_train[(i * batch_size):((i + 1) * batch_size), :, :]
            y = Y_train[(i * batch_size):((i + 1) * batch_size), :, :]
            _, curr_loss = sess.run([optimizer, total_loss],
                                    feed_dict={tf_X: x, tf_Y: y})
        # Evaluate on the validation set after each epoch
        loss_val, output_val = sess.run([total_loss, outputs],
                                        feed_dict={tf_X: X_valid, tf_Y: Y_valid})
        print("Epoch:", str(e), " Loss:", loss_val)
The validation loss per epoch is:
Epoch: 0 Loss: 35.66197
Epoch: 1 Loss: 6.2947197
Epoch: 2 Loss: 3.493777
Epoch: 3 Loss: 2.589133
Epoch: 4 Loss: 1.9215883
Epoch: 5 Loss: 1.4201827
Epoch: 6 Loss: 1.0227635
Epoch: 7 Loss: 0.642627
Epoch: 8 Loss: 0.4225436
Epoch: 9 Loss: 0.2940228
Epoch: 10 Loss: 0.20289473
Epoch: 11 Loss: 0.14451711
Epoch: 12 Loss: 0.11506326
Epoch: 13 Loss: 0.103697464
Epoch: 14 Loss: 0.06420918
Epoch: 15 Loss: 0.052835397
Epoch: 16 Loss: 0.07495382
Epoch: 17 Loss: 0.038493495
Epoch: 18 Loss: 0.034656726
Epoch: 19 Loss: 0.033339214
Hence one can see that within 20 epochs the network has learned the pattern.
You can also plot the residuals on the validation set to check for any remaining structure:
import matplotlib.pyplot as plt
# 25,000 validation sequences x 24 time steps = 600,000 residuals
plt.scatter(np.arange(600000), output_val.flatten() - Y_valid.flatten())
plt.show()
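If you also want a numeric summary (not part of the original listing), the mean and standard deviation of the residuals give a quick check:

residuals = output_val.flatten() - Y_valid.flatten()
print("mean:", residuals.mean(), "std:", residuals.std())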