

3.3 Results

3.3.2 Santa Fe time series prediction

3.3.2.1 Numerically obtained performance

Here, we show numerical results for the Santa Fe laser task, which was already introduced in Chapter 1, section 1.4.2.3, and in Chapter 2, section 2.2.3.

In the literature, performance on this task is usually expressed as the normalized mean square error (NMSE) of the predicted versus the actual value. In Fig. 3.7 the NMSE values are shown in color coding while scanning the input scaling (γ) and the feedback strength (η). The exponent is chosen as p = 1.
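Since both NMSE and NRMSE values appear in this section, a short sketch may help make the relation between the two explicit: the NMSE is the mean square error normalized by the variance of the target, and the NRMSE is simply its square root. The toy signal below is illustrative only, not the Santa Fe data.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized mean square error: MSE divided by the variance of the target."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

def nrmse(y_true, y_pred):
    """Normalized root mean square error: the square root of the NMSE."""
    return np.sqrt(nmse(y_true, y_pred))

# Illustrative example: a noisy prediction of a sine wave
rng = np.random.default_rng(0)
target = np.sin(np.linspace(0, 10, 1000))
prediction = target + 0.1 * rng.standard_normal(1000)

# NMSE = NRMSE^2, so NMSE = 0.019 corresponds to NRMSE = sqrt(0.019) ≈ 0.137
assert np.isclose(nrmse(target, prediction) ** 2, nmse(target, prediction))
```

This normalization is why omitting the square root makes the reference error values quoted below look much lower than NRMSE values for the same prediction quality.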

Because the square root is now omitted in the calculation of the error values, the reference error values are much lower. In the plot of Fig. 3.7 the lowest value is NMSE = 0.019 (corresponding to an NRMSE = 0.137), and the region of good performance lies at low values of γ and η. When η increases, the valley of good performance becomes narrower. Note that, although the values of γ are smaller than the ones considered in the case of NARMA10, the part of the nonlinearity that is explored is still reasonably large: while a typical NARMA10 input belongs to the interval [0, 0.5], a Santa Fe data point corresponds to a normalized intensity between 0 and 250. The resulting value injected in the Santa Fe case is therefore still larger than in the case of our NARMA10 scan. For the virtual node separation we choose θ = 0.2, which was previously found to be optimal for the NARMA10 task; this also seems to be a suitable value for the Santa Fe time series prediction.

Fig. 3.7: Performance for the Santa Fe time series prediction task. The two scanned parameters are γ (input scaling) and η (feedback strength) and the error is plotted in color code as an NMSE. The nonlinearity is of the Mackey-Glass type and the exponent in Eq. (3.3) is set to p = 1. Other characteristics of the reservoir are τ = 80, N = 400, θ = 0.2.
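To make the scanned parameters concrete, the following is a minimal sketch of a single-node delayed-feedback reservoir with a Mackey-Glass-type nonlinearity, in the spirit of Eq. (3.3): the node obeys x'(t) = −x(t) + η·s/(1 + |s|^p) with s = x(t − τ) + γJ(t), integrated by an Euler step of size θ per virtual node. The exact discretization, the binary input mask, and the use of |s| in the denominator are simplifying assumptions of this sketch, not the thesis' exact model.

```python
import numpy as np

def delayed_feedback_reservoir(u, eta=0.8, gamma=0.05, p=1.0, N=400, theta=0.2, seed=1):
    """Euler sketch of x'(t) = -x + eta*s/(1 + |s|**p), s = x(t - tau) + gamma*J(t),
    with tau = N*theta, sampled at the N virtual nodes (time multiplexing)."""
    rng = np.random.default_rng(seed)
    mask = rng.choice([-1.0, 1.0], size=N)   # random binary mask defining virtual nodes
    states = np.zeros((len(u), N))           # one row of N node states per input sample
    x_delayed = np.zeros(N)                  # node states one delay interval (tau) ago
    x = 0.0                                  # instantaneous oscillator value
    for k, u_k in enumerate(u):
        J = gamma * u_k * mask               # masked, scaled input held for one delay
        for i in range(N):
            s = x_delayed[i] + J[i]          # delayed feedback plus injected input
            x = x + theta * (-x + eta * s / (1.0 + np.abs(s) ** p))
            states[k, i] = x
        x_delayed = states[k].copy()
    return states
```

With γ small and Santa Fe intensities up to 250, the injected term γ·u can still exceed the [0, 0.5] NARMA10 inputs, which is the point made in the text above.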

3.3.2.2 Comparison with state of the art

In Table 3.2 a comparison with the literature is made, comparing the delayed feedback approach with a cycle reservoir as described in [79]. Rodan et al. achieve slightly better results, but of the same order of magnitude.

3.3.3 Isolated spoken digit recognition

3.3.3.1 Performance: numerical simulations and experiments

For spoken digit recognition, memory is not as crucial as for the NARMA10 task, allowing us to use a higher exponent in the Mackey-Glass equation. This is beneficial, since a higher exponent is easier to implement in an experimental setup.

We have verified through numerical simulations that a broad range of values of p yields similar results. The virtual node separation is set at θ = 0.2, while the total number of virtual nodes is N = 400. Larger values of θ also yield good results for this task, but the shortest one is chosen for reasons of speed. The specifics of the training procedure are detailed in section 2.2.2. Fig. 3.8 depicts the numerically obtained classification performance on unknown samples as a function of η for γ = 0.5, which has been chosen such that input and feedback signals are of the same order of magnitude.

56 3 Modeling an electronic implementation

Table 3.2: Santa Fe performance literature review. For several reservoir sizes the performance is given as an NMSE. The first performance column gives the results found by Rodan et al. [79]. The last column shows the results found with a delayed feedback reservoir. Both are results coming from numerical simulations.

Res. size   NMSE [79]   NMSE our system
 50         0.0184      0.0228
100         0.0125      0.0214
150         0.00945     0.0212
200         0.00819     0.0210
400         -           0.0190
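The training referred to in section 2.2.2 amounts to fitting a linear readout on the collected virtual-node states. A minimal sketch using ridge regression is given below; the closed-form solution and the regularization value are assumptions of this sketch, chosen as a common default.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Fit output weights W so that [states, 1] @ W ≈ targets (ridge regression).
    states: (T, N) matrix of virtual-node states; targets: (T, C) desired outputs."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])  # append a bias column
    # Regularized least squares: W = (X^T X + ridge*I)^(-1) X^T Y
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def readout(states, W):
    """Apply the trained linear readout to a matrix of reservoir states."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return X @ W
```

Only these output weights are trained; the reservoir itself (the delayed-feedback oscillator) is left untouched, which is the defining property of reservoir computing.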

The classification performance is expressed in two ways: the word error rate (WER), i.e. the percentage of words that have been wrongly classified, and the margin, i.e. the distance between the reservoir's best guess of the target and the closest competitor. It can be seen that an increase in margin corresponds to a decrease in WER. Our results show that there is a broad parameter range in η with good performance, with an optimum for both margin and WER around η = 0.8. Note that the performance breaks down when η approaches 1. This is expected, as it corresponds to the instability threshold of the Mackey-Glass oscillator in the absence of input (γ = 0). At the optimum value of η, we obtain a WER as low as 0.14%, which corresponds to less than one misclassification in 500 words. These performance levels are comparable to or even better than those obtained with traditional reservoir computing: a WER of 4.3% was reported for a reservoir of more than 1200 nodes [13], a WER of 0.2% was more recently obtained with a reservoir of 308 nodes [14], and alternative approaches based on Hidden Markov Models achieved a WER of 0.55% [96].
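The two figures of merit just defined can be sketched in a few lines; the example scores below are invented for illustration only.

```python
import numpy as np

def word_error_rate(scores, labels):
    """Fraction of samples whose highest classifier output is not the true class.
    scores: (n_samples, n_classes) readout outputs; labels: true class indices."""
    return np.mean(np.argmax(scores, axis=1) != labels)

def mean_margin(scores):
    """Average distance between the best guess and its closest competitor."""
    top2 = np.sort(scores, axis=1)[:, -2:]   # two largest outputs per sample
    return np.mean(top2[:, 1] - top2[:, 0])

# Illustrative scores for three samples and three classes
scores = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.7, 0.1],
                   [0.4, 0.5, 0.1]])
labels = np.array([0, 1, 0])                 # the last sample is misclassified
```

A confident classifier separates the winner clearly from the runner-up, so a larger mean margin generally goes hand in hand with a lower WER, as observed in Fig. 3.8.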

3.3.3.2 Speaker identification: numerical results

A side task that we implemented is the identification of the speaker uttering the digit. The processing of the signal in the reservoir is exactly the same. In the considered data set there are five female speakers, hence identifying them only requires five extra classifiers. This way the system processes the data once, while the readout layer solves two different tasks on the same reservoir states. When training purely on the cochleagram (before feeding it into the reservoir), the WER in terms of speaker identification is 2%. Although this is already a relatively good result, we can still improve significantly by letting the reservoir process the cochleagram: the obtained error then becomes 0.4% for the speaker task.

Fig. 3.8: Numerical and experimental results for spoken digit recognition. The y-axis on the left-hand side denotes the margin (a.u.), whereas the y-axis on the right-hand side denotes the word error rate in %. The abscissa represents the parameter η. γ has been kept fixed at 0.5 and the exponent is set to p = 7. The delay time is set at τ = 80, with N = 400 nodes of θ = 0.2 separation. The red line represents the numerically obtained margin and the black line represents the numerically obtained word error rate. The red and black crosses denote the corresponding experimental results. Figure taken from Appeltant et al. [17].

Table 3.3: Isolated spoken digit recognition performance literature review. For several reservoir sizes the performance is given as a WER in %. The first performance column gives the results found by Verstraeten et al. [13, 14] and the second column the ones obtained with a delayed feedback reservoir. Both are results coming from numerical simulations.

Res. size   WER (%) [13, 14]   WER (%) our system
 50         7.32               3.0
100         2.96               1.6
150         1.82               1.2
200         1.38               0.8
308         0.2                0.3
400         -                  0.14
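The idea of solving two tasks on the same reservoir states can be sketched as two independently trained linear readouts sharing one state matrix. The state matrix, label counts, and ridge readout below are illustrative assumptions, not the actual simulation data.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot target rows."""
    return np.eye(n_classes)[labels]

def fit_readout(states, targets, ridge=1e-6):
    """Ridge-regression readout (sketch): states (T, N) -> targets (T, C)."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ targets)

# Stand-in reservoir states: one row per utterance (placeholder random data)
rng = np.random.default_rng(0)
states = rng.standard_normal((300, 40))
digit_labels = rng.integers(0, 10, size=300)     # ten spoken digits
speaker_labels = rng.integers(0, 5, size=300)    # five female speakers

# Same states, two readouts: ten digit classifiers plus five speaker classifiers
W_digit = fit_readout(states, one_hot(digit_labels, 10))
W_speaker = fit_readout(states, one_hot(speaker_labels, 5))
```

The reservoir runs once per utterance; only the cheap linear readouts are duplicated, which is why adding the speaker task costs just five extra classifiers.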

3.3.3.3 Comparison with state of the art

As an indication of the quality of the obtained results, we give some performance data for this test from the literature. In Table 3.3 the first performance column lists results by Verstraeten et al. [13, 14], while the last column shows our results using delayed feedback systems. Again, one can conclude that the single node with delayed feedback performs as well as or even better than traditional reservoir simulations.