Artificial Intelligence
1 Nov 2013

Recurrent Product Unit Neural Networks

Time Series Forecasting (TSF) consists of estimating models to predict future values of a time series from previously observed values, and it can be applied to many real-world problems. TSF has traditionally been tackled with AutoRegressive Neural Networks (ARNNs) or Recurrent Neural Networks (RNNs), where hidden nodes are usually configured with additive activation functions, such as sigmoidal functions. ARNNs rely on a short-term memory of the time series in the form of lagged values used as inputs, while RNNs include a long-term memory structure. The objective of this project is twofold. First, it explores the potential of multiplicative nodes for ARNNs by considering Product Unit (PU) activation functions, motivated by the fact that PUs are especially useful for modelling highly correlated features, such as the lagged time series values used as ARNN inputs. Second, it proposes a new hybrid RNN model based on PUs, estimating the PU outputs from the combination of a long-term reservoir and the short-term lagged time series values. A complete set of experiments with 29 datasets shows competitive performance for both proposed models, and a set of statistical tests confirms that they achieve state-of-the-art results in TSF, with especially promising results for the proposed hybrid RNN.

Artificial Neural Networks (ANNs) are a very popular machine learning tool for TSF, and several ANN architectures have been applied in this field. Feedforward Neural Networks (FFNNs) are the most common and simplest type of ANN, where information moves only in the forward direction. For example, the Time Delay Neural Network (TDNN) is an FFNN whose inputs are delayed values of the time series. Recurrent Neural Networks (RNNs), in contrast, are based on an architecture where information flows through the system along a directed cycle. This cycle allows the network to store information from previous data in its internal memory, which can be useful for certain kinds of applications. One example of an RNN is the Long Short Term Memory Neural Network (LSTMNN), whose main characteristic is the capability of its nodes to remember a time series value for an arbitrary length of time. Robot control and real-time recognition are examples of real applications of LSTMNNs. Echo State Networks (ESNs) are RNNs whose architecture includes a random number of neurons whose interconnections are also randomly chosen. This provides the network with a long-term memory and competitive generalisation performance. From this analysis of ANNs in the context of TSF, it follows that one of the main differences between FFNNs and RNNs lies in their storage capacity: RNNs have a long-term memory built into the architecture of the model, whereas the memory of FFNNs is provided by the lagged terms at the input of the network.
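To make the role of the lagged inputs concrete, the following minimal Python sketch (the helper and variable names are illustrative, not the project's code) shows how a univariate time series can be arranged into lagged input windows for a TDNN-style feedforward network:

    import numpy as np

    def make_lagged_dataset(series, n_lags):
        """Arrange a univariate time series into (input, target) pairs:
        each input is a window of n_lags past values (the short-term
        memory) and the target is the next value to forecast."""
        X, y = [], []
        for t in range(n_lags, len(series)):
            X.append(series[t - n_lags:t])
            y.append(series[t])
        return np.array(X), np.array(y)

    # Illustrative example: a sine wave with 5 lagged inputs per pattern
    series = np.sin(np.linspace(0, 20, 200))
    X, y = make_lagged_dataset(series, n_lags=5)
    print(X.shape, y.shape)  # (195, 5) (195,)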


the project

This project is focused on Product Unit Neural Networks (PUNNs) and their application to TSF. The basis function of the hidden neurons of PUNNs is the Product Unit (PU) function, where the output of the neuron is the product of its inputs raised to real-valued weights. PUNNs are an alternative to sigmoidal neural networks, based on multiplicative nodes instead of additive ones. This model can express strong interactions between input variables, producing large variations at the output from small variations at the inputs. However, PUs result in a highly convoluted error function with many local minima. This drawback makes it convenient to use global search algorithms, such as genetic algorithms or swarm optimisation algorithms, to find the parameters minimising the error function. PUNNs have been widely used in classification and regression problems, but scarcely applied to TSF, with the exception of some attempts in hydrological TSF. It is important to point out that, in TSF, there is an autocorrelation between the lagged values of the series. Theoretically, PUNNs should therefore constitute an appropriate model for TSF, because they can easily model the interactions (correlations) between the lagged values of the time series.
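As an illustration of the PU basis function, the short Python sketch below computes the outputs of a PU hidden layer. The names and the log-exp implementation are assumptions made for illustration, and inputs are assumed strictly positive (e.g. lagged values rescaled to a positive interval):

    import numpy as np

    def product_unit_layer(X, W):
        """Product Unit hidden layer: node j outputs prod_i x_i ** W[i, j],
        i.e. the inputs raised to real-valued weights. Computed here as
        exp(log(X) @ W); inputs are assumed strictly positive (e.g. a
        time series rescaled to [0.1, 1.1])."""
        return np.exp(np.log(X) @ W)

    # With exponents (1, 1) a node computes x1 * x2, directly modelling
    # the interaction between two lagged values.
    X = np.array([[0.5, 0.8],
                  [0.9, 0.2]])
    W = np.array([[1.0, 2.0],
                  [1.0, -1.0]])
    print(product_unit_layer(X, W))  # first column: x1 * x2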

Summarising, the main contributions of this project are the following:

  • The use of PU basis functions in the field of TSF. PU basis functions have already been investigated for regression and classification; however, their mathematical expression makes them especially interesting for addressing TSF problems.
  • A new hybrid RNN combining Reservoir Computing (RC) models (specifically, ESNs) and FFNNs with PU basis functions, with the goal of providing the model with a long-term memory.
  • The use of a hybrid algorithm combining the CMA-ES method and the Moore-Penrose (MP) generalised inverse for parameter estimation, given that PUNNs tend to generate complex error functions with multiple local minima (see the sketch after this list).
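The following Python sketch illustrates this hybrid estimation scheme under simplifying assumptions (synthetic data, the open-source cma package, and illustrative network sizes; it is not the project's actual implementation). For each candidate set of PU exponents proposed by CMA-ES, the linear output weights are solved in closed form with the MP pseudoinverse, and the resulting training error is fed back as the fitness:

    import numpy as np
    import cma  # open-source CMA-ES implementation (pip install cma)

    def pu_hidden(X, W):
        """Product Unit hidden layer (strictly positive inputs assumed)."""
        return np.exp(np.log(X) @ W)

    def fitness(w_flat, X, y, n_inputs, n_hidden):
        """For one candidate set of PU exponents, solve the linear output
        weights in closed form with the Moore-Penrose pseudoinverse and
        return the training MSE as the CMA-ES fitness."""
        H = pu_hidden(X, w_flat.reshape(n_inputs, n_hidden))
        beta = np.linalg.pinv(H) @ y          # MP generalised inverse step
        return float(np.mean((H @ beta - y) ** 2))

    # Synthetic data standing in for rescaled lagged time series values
    rng = np.random.default_rng(0)
    X = rng.uniform(0.1, 1.1, size=(200, 5))
    y = X[:, 0] * X[:, 1] + 0.01 * rng.standard_normal(200)

    n_inputs, n_hidden = 5, 4
    es = cma.CMAEvolutionStrategy(np.zeros(n_inputs * n_hidden), 0.5,
                                  {'maxiter': 50, 'verbose': -9})
    while not es.stop():
        candidates = es.ask()
        es.tell(candidates, [fitness(np.asarray(c), X, y, n_inputs, n_hidden)
                             for c in candidates])
    print("best training MSE:", es.result.fbest)

Solving the output layer in closed form at every evaluation keeps the CMA-ES search space restricted to the nonlinear PU exponents, which is what makes this hybrid scheme attractive for such a multimodal error function.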