
1.3 Reservoir computing

1.3.1 General concepts

Reservoir computing [6, 7, 8, 9, 10, 11, 12] is a recently introduced, bio-inspired paradigm in machine learning. With this approach, state-of-the-art performance has been achieved in processing empirical data. Even for tasks that are deemed computationally hard, such as chaotic time series prediction [9] or speech recognition [13, 14], good results are obtained with a computationally efficient procedure. The main inspiration underlying reservoir computing is the insight that the brain processes information by generating patterns of transient neuronal activity excited by input sensory signals [15]. The electrical discharges of billions of neurons are organized in such a way that our brain can deliver the correct response to an external stimulus in a very short time. An analogy often brought up in the machine learning community is that of waves emerging in a bucket of water when small pebbles are thrown into it. With the naked eye it might be tricky to estimate the weight of the pebbles. The key idea is to transform the original question into another one that is much easier to solve. When the pebbles are thrown into a bucket of water, wave patterns emerge. The waves are a transient phenomenon: if no further perturbations are introduced, they eventually fade out. By studying the wave pattern, one could deduce where a pebble hit the water surface or when it happened. The magnitude of the waves could even give an indication of the size and weight of the pebble, or of the velocity with which it was thrown. The water serves as a reservoir that does not solve the original problem, but translates it into another form, allowing other methods to be used to interpret the information. Although just an analogy, the bucket of water provides insight into some of the crucial elements of a potentially successful reservoir.

The objective of reservoir computing is to implement a specific nonlinear transformation of the input signal or to classify the inputs. Classification involves discriminating between sets of input data, e.g., identifying features of images, voices, time series, etc. In order to perform such a task, a neural network requires a training procedure. Since recurrent networks are notoriously difficult to train, they were not widely used until the advent of reservoir computing. In reservoir computing, an extra layer is added, and the only part of the system that is trained is the set of connections from the reservoir to this extra layer. Thus, the training does not affect the dynamics of the reservoir itself. The situation is depicted in Fig. 1.4.


Fig. 1.4: Network topology of reservoir computing. The reservoir is a recurrent network, explicitly separated from the output layer; only the connections from the reservoir to the output layer are trained.
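To make this training scheme concrete, the sketch below implements a minimal echo-state-style reservoir in Python. All parameter choices (reservoir size, input scaling, the 0.9 spectral-radius factor, the toy target) are illustrative assumptions, not prescriptions from the text; the point is only that the random input and recurrent weights stay fixed, while training reduces to a linear (ridge) regression from the collected reservoir states to the desired outputs.

```python
# Minimal echo-state-style sketch; all parameter values are illustrative.
# Only W_out is trained; the reservoir weights W_in and W stay fixed.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200                               # input and reservoir sizes

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))       # fixed input weights
W = rng.normal(0.0, 1.0, (n_res, n_res))           # fixed recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))          # scale spectral radius < 1

def run_reservoir(u):
    """Drive the reservoir with the input sequence u; collect the states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ np.atleast_1d(u_t))  # nonlinear node update
        states.append(x.copy())
    return np.array(states)

# Toy task (assumed for illustration): a nonlinear transform of past input.
u = rng.uniform(-1, 1, 1000)
y_target = np.sin(3 * np.roll(u, 1)) * u           # depends on input history
X = run_reservoir(u)

# Training = ridge regression from reservoir states to targets;
# the reservoir dynamics themselves are never modified.
reg = 1e-6
W_out = np.linalg.solve(X.T @ X + reg * np.eye(n_res), X.T @ y_target)
y_pred = X @ W_out
print("train MSE:", np.mean((y_pred - y_target) ** 2))
```

Because the readout is linear, training amounts to solving a single least-squares problem, which is what makes the approach computationally efficient compared with training the recurrent weights themselves.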

To efficiently solve its tasks, a reservoir should satisfy several key properties. Firstly, it should nonlinearly transform the input signal into a high-dimensional state space in which the signal is represented. In machine learning this is achieved through the use of a large number of reservoir nodes, connected to each other through the recurrent connections of the reservoir. In practice, traditional reservoir computing architectures employ several hundred to several thousand nonlinear reservoir nodes to obtain good performance. In Fig. 1.5, we illustrate how such a nonlinear mapping to a high-dimensional state space facilitates the separation (classification) of states, using the XOR task as an example. Consider the situation depicted in Fig. 1.5(a).

Two binary input variables, x and y, lead to a target that corresponds to the XOR logical function. If x and y have the same value, the result is a 0, represented by a star. If x and y have different values, the result is a 1, represented by a sphere. The goal is to separate the red stars from the yellow spheres, but this cannot be achieved with a single straight line. If it could, the problem would be linearly separable. Linearly separable problems are regarded as easy, since they can be solved with a linear training algorithm. By smartly mapping this problem from a two-dimensional space onto a three-dimensional one, the nature of the separability changes. In Fig. 1.5(b) both variables keep their initial x- and y-positions, but the yellow spheres are given a different position along the z-axis than the red stars. It then suffices to introduce a single plane to separate the two classes: a 2D plane in 3D space is the equivalent of a straight line in 2D space. The nonlinear transformation to a high-dimensional space does not construct the hyperplane itself, but it allows its existence by reshaping the nonlinear separation problem into a linear one.



Fig. 1.5: Illustration of linear separability. (a) The XOR problem in a two-dimensional input space: a 0 corresponds to a star and a 1 to a sphere. The yellow spheres and the red stars cannot be separated by a single straight line. (b) With a nonlinear mapping into a three-dimensional space, the spheres and stars can be separated by a single 2D plane. Figure taken from Appeltant et al. [17].

Reservoir computing implements this idea: the input signal is nonlinearly mapped into the high-dimensional reservoir state represented by a large number of nodes. It can be shown that the higher the dimension of the space, the more likely it is that the data become linearly separable; see, e.g., [16].
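The following short Python sketch makes the XOR example of Fig. 1.5 concrete. The particular nonlinear feature z = x·y and the separating plane are illustrative choices of ours, not the mapping used in the figure: in the original two dimensions no straight line separates the classes, while after appending one nonlinear coordinate a single plane suffices.

```python
# Sketch of the XOR example from Fig. 1.5 (feature choice is illustrative):
# XOR is not linearly separable in (x, y), but adding one nonlinear
# coordinate, here z = x * y, makes a single plane suffice.
import numpy as np

X2 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 1, 1, 0], dtype=float)            # XOR targets

# Nonlinear map into 3D: keep x and y, append z = x * y.
X3 = np.column_stack([X2, X2[:, 0] * X2[:, 1]])

# In 3D the plane x + y - 2z = 0.5 separates the spheres from the stars:
w, b = np.array([1.0, 1.0, -2.0]), -0.5
pred = (X3 @ w + b > 0).astype(float)
print(pred)                                        # [0. 1. 1. 0.] matches t
```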

Secondly, the dynamics of the reservoir should be such that it exhibits a fading memory (i.e., a short-term memory): the reservoir state is influenced by inputs from the recent past, but it is independent of inputs from the remote past. This property is essential for processing temporal sequences (such as speech) for which the history of the signal is important. Additionally, the results of reservoir computing must be reproducible and robust against noise. To this end, the reservoir should exhibit sufficiently different dynamical responses to inputs belonging to different classes. At the same time, the reservoir should not be too sensitive: similar inputs should not be associated with different classes. These competing requirements define when a reservoir performs well. Typically, reservoirs depend on a few parameters, which must be adjusted to satisfy the above constraints. Experience shows that these requirements are satisfied when the reservoir operates (in the absence of input) in a steady regime. However, many aspects of the dynamics leading to good performance are not yet understood. For the reader interested in a more in-depth presentation of reservoir computing, we refer to the recent review articles [18, 19, 20, 21].
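The fading-memory requirement can be illustrated numerically. In the sketch below (a minimal experiment under assumed, typical parameter choices such as the 0.9 spectral-radius scaling), two copies of the same reservoir start from different random initial states but are driven by identical inputs; the distance between their states typically shrinks toward zero, showing that the influence of the remote past dies out while recent inputs still shape the state.

```python
# Illustrative check of fading memory: two copies of the same reservoir,
# started from different states but driven by the same input, converge.
# Reservoir size and the 0.9 spectral-radius scaling are assumed choices.
import numpy as np

rng = np.random.default_rng(1)
n_res = 200
W_in = rng.uniform(-0.5, 0.5, (n_res, 1))
W = rng.normal(0.0, 1.0, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # steady regime without input

x_a = rng.normal(size=n_res)                # two arbitrary initial states
x_b = rng.normal(size=n_res)
u = rng.uniform(-1, 1, 200)                 # one shared input sequence

for step, u_t in enumerate(u):
    x_a = np.tanh(W @ x_a + (W_in * u_t).ravel())
    x_b = np.tanh(W @ x_b + (W_in * u_t).ravel())
    if step % 50 == 0:
        # The distance typically decays toward zero: the initial
        # condition (the remote past) is progressively forgotten.
        print(step, np.linalg.norm(x_a - x_b))
```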


Fig. 1.6: Reservoir computing applications. (a) Modeling the movements of a robot arm based on sensory inputs, picture from [40]. (b) Predicting and explaining traffic-jam situations, picture from [41]. (c) Speech recognition, picture from [42]. (d) Handwriting recognition, picture from [43].