Updated On : Jul-05,2022 Tags keras, lstm, time-series…

Keras: RNNs (LSTM) for Time Series Data (Regression Tasks)

Time Series is a type of data where we capture one or more data variables at specified time intervals. The data is indexed by date, time, or date-time. The most common example of time-series datasets are stock prices every second/minute/hour/day, temperature captured every minute/hour/day, etc. This extra time component is referred to as a temporal component. For time-series data, the order of data is very important and we can not shuffle time-series dataset examples as we do with other tasks. The time-series dataset can be univariate or multi-variate. When working with time-series datasets to make predictions at a particular time, the best available data features will be a single example measured at a previous timestamp or multiple examples measured earlier. It is generally the hyperparameter of the network that we need to find out how many previous examples should be used by the network to make predictions of the current example. Time-series datasets have an inner sequence that we need to capture and research has proven that Recurrent Neural Networks are quite good at capturing sequences. They can easily capture univariate or multi-variate sequences.

As a part of this tutorial, we have explained how to create Recurrent Neural Networks (RNNs) consisting of LSTM Layers using Python Deep Learning library Keras that can be used to solve time-series regression tasks involving multi-variate dataset. The tutorial uses Tetouan City Power Consumption dataset available from UCI ML Datasets Repository. The dataset has variables like temperature, humidity, wind speed, diffuse flows, power consumption, etc measured every 10 minutes. We'll train the network to predict power consumption for the future. It'll be a regression task as we are predicting continuous variable. We have also explained how we should organize data for RNNs. After completion of network training and performance evaluation, we have also given a few suggestions on what can be tried further to improve network performance.

Below, we have listed important sections of the Tutorial to give an overview of the material covered.

Important Sections Of Tutorial

  1. Prepare Data
    • 1.1 Download Data
    • 1.2 Load Data
    • 1.3 Organize Data
    • 1.4 Scale Target Values
  2. Define Network
  3. Compile and Train Network
  4. Evaluate Network Performance
  5. Visualize Predictions
  6. Further Suggestions

Below, we have imported the necessary Python libraries and printed the versions of them that we have used in our tutorial.

from tensorflow import keras

print("Keras Version : {}".format(keras.__version__))
Keras Version : 2.6.0

1. Prepare Data

In this section, we have followed step by step process to prepare data for our time series regression task. We are following the below steps to prepare data for our LSTM network.

  1. Download Data
  2. Load Data into memory.
  3. Organize data by moving the window of size 30. We'll be looking back the 30 previous examples' data features (X) in order to make a prediction of the current target value (Y).
  4. Scale target values to bring them in the same range as data features for faster convergence.

The steps will become clear as we implement them one by one next.

1.1 Download Data

In this section, we have downloaded the [Tetouan City Power Consumption] dataset (available from UCI ML Repository) that we are going to use for our regression task. The dataset has power consumption information for three distribution networks of Tetouan city of Morocco. Next, we'll load it into memory to take a look at the data.

!wget https://archive.ics.uci.edu/ml/machine-learning-databases/00616/Tetuan%20City%20power%20consumption.csv
--2022-06-03 04:39:37--  https://archive.ics.uci.edu/ml/machine-learning-databases/00616/Tetuan%20City%20power%20consumption.csv
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4222390 (4.0M) [application/x-httpd-php]
Saving to: ‘Tetuan City power consumption.csv’

Tetuan City power c 100%[===================>]   4.03M  3.76MB/s    in 1.1s

2022-06-03 04:39:39 (3.76 MB/s) - ‘Tetuan City power consumption.csv’ saved [4222390/4222390]

%ls
'Tetuan City power consumption.csv'   __notebook__.ipynb

1.2 Load Data

In this section, we have loaded data into the main memory as Pandas dataframe using read_csv() method. The dataset has below data variables measured every 10 minutes.

  1. Date-time
  2. Temperature
  3. Humidity
  4. Wind Speed
  5. General diffuse flows
  6. Diffuse flows
  7. Zone 1 Power Consumption (units)
  8. Zone 2 Power Consumption
  9. Zone 3 Power Consumption

After loading data, we have set the DateTime column as an index of our dataframe. We have also displayed the first few rows of the dataset to give an idea about the contents.

For our purpose, we'll be using columns [Temperature, Humidity, Wind Speed, general diffuse flows, diffuse flows] as data features (X) and column Zone 1 Power Consumption as target column (Y).

In the next cell after the below cell, we have plotted the power consumption of three zones for December 2017. We can notice from the chart that there is clear seasonality in the data.

We have also plotted the power consumption for a single day (1st Dec 2017) to give an idea of how power demand varies throughout the day. We can notice from the chart that power requirement is less at the beginning of the day and keeps increasing till around 9 PM and then starts to drop again.

import pandas as pd

data_df = pd.read_csv("Tetuan City power consumption.csv")
data_df["DateTime"] = pd.to_datetime(data_df["DateTime"])
data_df = data_df.set_index('DateTime')
data_df.columns = [col.strip() for col in data_df.columns]

print("Columns : {}".format(data_df.columns.values.tolist()))
print("Dataset Shape : {}".format(data_df.shape))

data_df.head()
Columns : ['Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows', 'Zone 1 Power Consumption', 'Zone 2  Power Consumption', 'Zone 3  Power Consumption']
Dataset Shape : (52416, 8)
Temperature Humidity Wind Speed general diffuse flows diffuse flows Zone 1 Power Consumption Zone 2 Power Consumption Zone 3 Power Consumption
DateTime
2017-01-01 00:00:00 6.559 73.8 0.083 0.051 0.119 34055.69620 16128.87538 20240.96386
2017-01-01 00:10:00 6.414 74.5 0.083 0.070 0.085 29814.68354 19375.07599 20131.08434
2017-01-01 00:20:00 6.313 74.5 0.080 0.062 0.100 29128.10127 19006.68693 19668.43373
2017-01-01 00:30:00 6.121 75.0 0.083 0.091 0.096 28228.86076 18361.09422 18899.27711
2017-01-01 00:40:00 5.921 75.7 0.081 0.048 0.085 27335.69620 17872.34043 18442.40964
import matplotlib.pyplot as plt

data_df.loc["2017-12"].plot(y=["Zone 1 Power Consumption", "Zone 2  Power Consumption", "Zone 3  Power Consumption"], figsize=(18, 7), grid=True);
plt.grid(which='minor', linestyle=':', linewidth='0.5', color='black');

Keras: RNNs (LSTM) for Time Series Data (Regression Tasks)

import matplotlib.pyplot as plt

data_df.loc["2017-12-1"].plot(y="Zone 1 Power Consumption", figsize=(18, 7), color="tomato", grid=True);
plt.grid(which='minor', linestyle=':', linewidth='0.5', color='black');

Keras: RNNs (LSTM) for Time Series Data (Regression Tasks)

1.3 Organize Data

In this section, we have organized data for our network. As mentioned earlier, to make a prediction of any target value, we'll be using data features from the previous 30 examples. We have set the variable lookback to this value. As data is measured every 10 minutes, we'll be looking at the previous 5 hours of data to make a prediction of the current value.

We'll be using columns ['Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows'] as data features (X) and column Zone 1 Power Consumption as target value (Y). In order to prepare data, we are looping through the data moving window of size 30 through it. For each iteration, we add data features of examples that fall into a window in X_organized array and the value of the target column from the next example after the window into Y_organized. A window will move by a step size of 1 each time.

After organizing the dataset by moving the window of lookback size, we have divided data into the train (first 50k examples) and test sets (remaining examples till the end). We have also printed the shape of datasets for reference.

import numpy as np

feature_cols = ['Temperature', 'Humidity', 'Wind Speed', 'general diffuse flows', 'diffuse flows']
target_col = 'Zone 1 Power Consumption'

X = data_df[feature_cols].values
Y = data_df[target_col].values

n_features = X.shape[1]
lookback = 30 ## 5 hours lookback to make prediction

X_organized, Y_organized = [], []
for i in range(0, X.shape[0]-lookback, 1):
    X_organized.append(X[i:i+lookback])
    Y_organized.append(Y[i+lookback])

X_organized, Y_organized = np.array(X_organized), np.array(Y_organized)
X_train, Y_train, X_test, Y_test = X_organized[:50000], Y_organized[:50000], X_organized[50000:], Y_organized[50000:]

X_organized.shape, Y_organized.shape, X_train.shape, Y_train.shape, X_test.shape, Y_test.shape
((52386, 30, 5), (52386,), (50000, 30, 5), (50000,), (2386, 30, 5), (2386,))

1.4 Scale Target Values

If you have looked at the data values of various columns earlier, you can notice that values present in all three consumption columns are quite high compared to our data features columns. This can create problems for our network as it might need to move weights by big amounts to make such large predictions. This can prevent it from converging or can take a lot of time to converge.

To make it converge faster, we have normalized target values. We have recorded the mean and standard deviation of target values. Then, we subtracted the mean from train/test target values and divided subtracted values by standard deviation. This process has brought target values of train and test datasets into the lower range. When making predictions using our algorithm, we'll reverse this process to make an actual prediction of consumption units.

mean, std = Y_train.mean(), Y_train.std()

print("Mean : {:.2f}, Standard Deviation : {:.2f}".format(mean, std))
Y_train_scaled, Y_test_scaled = (Y_train - mean)/std , (Y_test-mean)/std

Y_train_scaled.min(), Y_train_scaled.max(), Y_test_scaled.min(), Y_test_scaled.max()
Mean : 32492.51, Standard Deviation : 7137.51
(-2.605504215104729, 2.761730965773633, -2.002983020068412, 1.6084197046321758)
import gc

del X, Y

gc.collect()
5870

2. Define Network

Here, we have defined a neural network that we'll use for our time-series regression task using Keras. The network is simple and consists of 3 layers (two LSTM layers and one dense layer). Both LSTM layers have 256 output units. The first LSTM layer transforms input data shape from (batch_size, 30, 5) to (batch_size, 30, 256) after processing. The output of the first LSTM layer will be given to the second LSTM layer which will apply its processing. The shape of output data from the second LSTM layer is the same as the first because both have the same output units. The output of the second LSTM layer is given to a dense layer that has one output unit. The dense layer will process data and output processed data of shape (batch_size, 1). The output of the dense layer is a prediction of our network.

After defining a network, we have also printed a summary of the network showing the output shape and parameters (weights/biases) count of each layer.

Please make a note that we have assumed that the reader has background knowledge on RNNs and LSTM hence we have not covered the inner workings of them in detail here. Please feel free to check the below link if you are looking for a little background.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Embedding

lstm_out = 256

model = Sequential([
                    LSTM(lstm_out, return_sequences=True, input_shape=(lookback, n_features)),
                    LSTM(lstm_out),
                    Dense(1)
                ])

model.summary()
2022-06-03 04:39:47.361115: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm (LSTM)                  (None, 30, 256)           268288
_________________________________________________________________
lstm_1 (LSTM)                (None, 256)               525312
_________________________________________________________________
dense (Dense)                (None, 1)                 257
=================================================================
Total params: 793,857
Trainable params: 793,857
Non-trainable params: 0
_________________________________________________________________

3. Compile and Train Network

Here, we have first compiled our network to use Adam optimizer, mean squared error (MSE) loss function, and mean absolute error (MAE) metric. After compiling the network, we performed the training process by calling fit() method. We have trained the network for 15 epochs. To train the network, we have provided train data for the method. The network will be trained by looping through train data in a batch size of 32. We can notice from the MSE loss getting printed after each epoch that our network is improving because loss is decreasing constantly. Next, we'll check the performance of our network by making predictions on the test dataset.

model.compile(optimizer="adam", loss="mse", metrics=["mae"])
history = model.fit(X_train, Y_train_scaled, batch_size=32, epochs=15, validation_data=(X_test, Y_test_scaled))
2022-06-03 04:39:48.299769: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/15
1563/1563 [==============================] - 241s 151ms/step - loss: 0.1907 - mae: 0.3299 - val_loss: 0.1159 - val_mae: 0.2668
Epoch 2/15
1563/1563 [==============================] - 231s 148ms/step - loss: 0.1577 - mae: 0.2987 - val_loss: 0.1052 - val_mae: 0.2588
Epoch 3/15
1563/1563 [==============================] - 233s 149ms/step - loss: 0.1499 - mae: 0.2898 - val_loss: 0.1028 - val_mae: 0.2542
Epoch 4/15
1563/1563 [==============================] - 233s 149ms/step - loss: 0.1412 - mae: 0.2799 - val_loss: 0.1055 - val_mae: 0.2623
Epoch 5/15
1563/1563 [==============================] - 232s 149ms/step - loss: 0.1335 - mae: 0.2717 - val_loss: 0.0971 - val_mae: 0.2454
Epoch 6/15
1563/1563 [==============================] - 237s 152ms/step - loss: 0.1286 - mae: 0.2663 - val_loss: 0.0968 - val_mae: 0.2452
Epoch 7/15
1563/1563 [==============================] - 235s 151ms/step - loss: 0.1224 - mae: 0.2590 - val_loss: 0.1084 - val_mae: 0.2687
Epoch 8/15
1563/1563 [==============================] - 236s 151ms/step - loss: 0.1178 - mae: 0.2540 - val_loss: 0.0988 - val_mae: 0.2462
Epoch 9/15
1563/1563 [==============================] - 236s 151ms/step - loss: 0.1123 - mae: 0.2470 - val_loss: 0.0845 - val_mae: 0.2338
Epoch 10/15
1563/1563 [==============================] - 235s 150ms/step - loss: 0.1078 - mae: 0.2407 - val_loss: 0.1054 - val_mae: 0.2532
Epoch 11/15
1563/1563 [==============================] - 234s 150ms/step - loss: 0.1051 - mae: 0.2372 - val_loss: 0.0865 - val_mae: 0.2277
Epoch 12/15
1563/1563 [==============================] - 236s 151ms/step - loss: 0.0990 - mae: 0.2296 - val_loss: 0.0965 - val_mae: 0.2436
Epoch 13/15
1563/1563 [==============================] - 237s 152ms/step - loss: 0.0945 - mae: 0.2230 - val_loss: 0.0927 - val_mae: 0.2333
Epoch 14/15
1563/1563 [==============================] - 235s 150ms/step - loss: 0.0926 - mae: 0.2198 - val_loss: 0.0977 - val_mae: 0.2436
Epoch 15/15
1563/1563 [==============================] - 237s 151ms/step - loss: 0.0891 - mae: 0.2152 - val_loss: 0.0836 - val_mae: 0.2202

4. Evaluate Network Performance

In this section, we have evaluated the performance of our network on the test dataset to see how it has been done.

Below, we have first made predictions on the test dataset using our trained network. After making a prediction, we have rescaled the prediction by multiplying it with the standard deviation and adding the mean of the target values calculated earlier. This will bring our predictions into an actual range of consumption units.

Then, we calculated R2 score on test predictions. The R2 score is a very common metric used to measure the performance of the network on regression tasks. We have calculated score using r2_score() function available from scikit-learn library. The score generally has values in the range 0-1 and values near 1 are considered signs of a good generalized model. We can notice from the r2 score in our case that our model has done a good job at the regression task.

Please feel free to check the below link if you want to understand how r2 score works internally. Sklearn provides many ML metrics. The majority of them are covered in detail.

test_preds = model.predict(X_test) ## Make Predictions on test dataset
test_preds  = (test_preds*std) + mean

test_preds[:5]
array([[30938.98 ],
       [31556.996],
       [31947.947],
       [32407.133],
       [32803.54 ]], dtype=float32)
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

print("Test  MSE : {:.2f}".format(mean_squared_error(test_preds.squeeze(), Y_test)))
print("Test  R^2 Score : {:.2f}".format(r2_score(test_preds.squeeze(), Y_test)))
print("Test  MAE : {:.2f}".format(mean_absolute_error(test_preds.squeeze(), Y_test)))
Test  MSE : 4258571.87
Test  R^2 Score : 0.90
Test  MAE : 1571.64

5. Visualize Predictions

After making predictions, we have added them to the dataframe and also plotted them. We have plotted actual and predicted power consumption next to each other for verification purposes. We can notice from the chart that our model has done quite a good job at capturing the seasonality. It is properly able to capture upward movement. It is making little mistakes in downward movements. Next, we have suggested a few points which can be tried to further improve network performance.

data_df_final = data_df[50000:].copy()

data_df_final["Zone 1 Power Consumption Prediction"] = [None]*lookback + test_preds.squeeze().tolist()

data_df_final.tail()
Temperature Humidity Wind Speed general diffuse flows diffuse flows Zone 1 Power Consumption Zone 2 Power Consumption Zone 3 Power Consumption Zone 1 Power Consumption Prediction
DateTime
2017-12-30 23:10:00 7.010 72.4 0.080 0.040 0.096 31160.45627 26857.31820 14780.31212 31590.791016
2017-12-30 23:20:00 6.947 72.6 0.082 0.051 0.093 30430.41825 26124.57809 14428.81152 30474.587891
2017-12-30 23:30:00 6.900 72.8 0.086 0.084 0.074 29590.87452 25277.69254 13806.48259 29905.843750
2017-12-30 23:40:00 6.758 73.0 0.080 0.066 0.089 28958.17490 24692.23688 13512.60504 22921.175781
2017-12-30 23:50:00 6.580 74.1 0.081 0.062 0.111 28349.80989 24055.23167 13345.49820 22784.238281
data_df_final.plot(y=["Zone 1 Power Consumption", "Zone 1 Power Consumption Prediction"],figsize=(18,7));
plt.grid(which='minor', linestyle=':', linewidth='0.5', color='black');

Keras: RNNs (LSTM) for Time Series Data (Regression Tasks)

6. Further Suggestions

  • Train network for more epochs.
  • Make the network predict more than one future target value. In our case, we made the network predict only one future value. We can modify the network to simply predict the next 5 or more values as well. We'll need to organize data in that way as well.
  • Try different output units for LSTM layers.
  • Try stacking more LSTM layers though it can increase training time.
  • Try adding more dense layers after LSTM layers.
  • Try adding date-related features to data like a month, year, day, day of the week, month-end, month-start, quarter-end, quarter-start, year-end, year-start, etc.
  • Try learning rate schedulers
Sunny Solanki  Sunny Solanki

Share Views Want to Share Your Views? Have Any Suggestions?

If you want to

  • provide some suggestions on topic
  • share your views
  • include some details in tutorial
  • suggest some new topics on which we should create tutorials/blogs
Please feel free to contact us at coderzcolumn07@gmail.com. We appreciate and value your feedbacks. You can also support us with a small contribution by clicking DONATE.