System Identification of a Non-Uniformly Sampled Multi-Rate System in Aluminium Electrolysis Cells

(1)

System Identification of a Non-Uniformly Sampled Multi-Rate System in Aluminium Electrolysis Cells

H. Viumdal

¹

S. Mylvaganam

²

D. Di Ruscio

²

1Tel-Tek, N-3918 Porsgrunn, Norway and Telemark University College, Faculty of Technology, N-3918 Porsgrunn, Norway. E-mail: [email protected]

2Telemark University College, Faculty of Technology, N-3918 Porsgrunn, Norway. E-mail: [email protected] [email protected]

Abstract

Standard system identification algorithms are usually designed to generate mathematical models with equidistant sampling instants, that are equal for both input variables and output variables. Unfortunately, real industrial data sets are often disrupted by missing samples, variations of sampling rates in the different variables (also known as multi-rate systems), and intermittent measurements. In industries with varying events based maintenance or manual operational measures, intermittent measurements are performed leading to uneven sampling rates. Such is the case with aluminium smelters, where in addition the materials fed into the cell create even more irregularity in sampling. Both measurements and feeding are mostly manually controlled. A simplified simulation of the metal level in an aluminium electrolysis cell is performed based on mass balance considerations. System identification methods based on Prediction Error Methods (PEM) such as Ordinary Least Squares (OLS), and the sub-space method combined Deterministic and Stochastic system identification and Realization (DSR), and its variants are applied to the model of a single electrolysis cell as found in the aluminium smelters. Aliasing phenomena due to large sampling intervals can be crucial in avoiding unsuitable models, but with knowledge about the system dynamics, it is easier to optimize the sampling performance, and hence achieve successful models. The results based on the simulation studies of molten aluminium height in the cells using the various algorithms give results which tally well with the synthetic data sets used. System identification on a smaller data set from a real plant is also implemented in this work. Finally, some concrete suggestions are made for using these models in the smelters.

Keywords: Height measurements, aluminium electrolysis, system identification

1 Introduction

Many industrial processes involve systems where two or more physical processes have strongly differing temporal characteristics, i.e large differences in time constants. Simulations of such systems may be solved without excessive use of small time steps, by using two different models, one for the rapid variations, assuming the slow varying process to be constant, and another model for the slow variations, where the rapid

varying process is neglected. This is a possible way of attacking the system identification problem of such systems, also known as stiff systems. In some cases a model should be determined based upon already exist- ing data from industrial processes. In such cases the re- searcher may be confronted with lacking data samples or outliers that should be deleted, hence making the system identification problem some more complicated.

In addition multi-rate system widely exists in chemical process industries, typically with a slower sampling

(2)

Table 1: Nomenclature Symbol Quantity

A Cross section of aluminium CE Current Efficiency

F Faraday constant

I Line current

K Kalman gain matrix

M Sampling difference factor betweenT_f andT_s

MAl Molecular mass of aluminium N Number of data samples P Covariance matrix

Q Charge

Tf Fast sampling time Ts Slow sampling time V Volume of aluminium Vt Criterion function

h Height of molten aluminium b₁ andb₂ Parameters in mechanistic model f_s Sampling frequency

f_m System frequency

∆bh Predicted change of Aluminium height

¯h Aluminium height prediction h Aluminium height measurement

k Discrete time

m Mass of molten aluminium ni Discrete sampling instant of the

i-th output measurement

t Continuous time

w1 Weight put on the model

w2 Weight put on the measurements z Charge number of an ion

α Forgetting factor

∆t Sampling time of input measurements

Prediction error

θ Parameter vector of OLS models λ Parameter weight in OLS model

ρ Density

Φ Measurement vector for fast sampling ψ Measurement vector for slow sampling ACD Anode Cathode Distance

CCA Canonical Correlation Analysis DSR Deterministic and Stochastic

system identification and Realization MDSR Multiple time series DSR

MSE Mean Square Error

NMSE Normalized Mean Square Error NRMSE Normalized Root Mean Square Error NUSM Non-Uniformly Sampled Multi-rate syst.

OLS Ordinary Least Squares PEM Prediction Error Method

ROLS Recursive Ordinary Least Squares SNR Signal to Noise Ratio

rate for the outputs, compared to the inputs of the system. Some of the measurements may also be manually performed, causing rare and intermittent sampling instants. All these factors complicate the system identification process. As the standard methods of system identifications are based upon equidistant sampling instants, these methods cannot be directly applied for such data sets.

During periods without measurement data, gathered data will consist of several data segments. Simply con- catenating the data segments may lead to false tran- sients in their connection points. A more consistent option is to calculate model parameters for each of the segments, using the identical model structure. Finally, the segment estimates should somehow be merged into the resulting model of the system. InLjung(1999) the parameter estimates of the segments are given weight according to their estimated inverse covariance matri- ces. In 1957 Kranc introduced a method of replacing multi-rate sampled systems with single-rate models, using z transform methods (Kranc, 1957). In Kranc’s approach the sampling and updating instants are syn- chronized but operate with different sampling periods.

The sampling periods could all be expressed as unit fractions of an overall sampling rateT, later known as the frame period (Sheng et al., 2002). T is the least common multiple of the periods of the sampling and updating pattern of the system. Within each frame period T, there is one or more “sub”-sampling periods. The most common case is the system in which the inputs of the system are sampled on a higher rate than the output. The extracted single-rate model of the multi-rate system is designed with the sampling rate of the frame period T. This method is therefore known as the lifting technique (Sheng et al.,2002), as sampling periods are lifted to a higher and mutual level for all the variables. A dual-rate modeling case study on continuous catalytic reforming in the oil industry, using lifting technique, is presented inLi et al.(2003).

A least squares method was used by Lu and Fisher to estimate the inter-sample outputs, for applications where the outputs are sampled at a slower rate, compared to the inputs (Lu and Grant Fisher, 1989). In Li et al.(1999) system identification of multi-rate systems, using subspace methods are discussed. In particular the systems considered are systems where the sampling rate of the input variables aren times faster than the sampling rate of the output variables, where n is an integer.

The lifting method has been used to generate a minimum variance predictor with the fast sampling rate of the input variables. This predictor shows enhanced performance compared to a similar predictor using the slow output sampling rate. The controllability and ob-

(3)

servability of lifted systems are discussed inWang et al.

(2004) andDing et al. (2009), in addition to a recursive auxiliary least square model based upon a dual- rate system. A lifted state space model with an implemented Kalman filter for Non-Uniformly Sampled Multi-rate (NUSM) systems is presented in Li et al.

(2008).

A simplified simulated model of the aluminium metal height in a pre-baked aluminium electrolysis process is used as a representative example throughout this paper. Aluminium electrolysis processes applying pre- baked anodes are the main procedure to manufacture primary aluminium to day. This is an old method heav- ily based upon frequent operator interventions involving a plethora of manual operations involving materials feeds, handling of measurands, removing crusts and excessive alumina of cryolite. Although there has been a considerable improvement by the introductions of the automated control system and some few automatically sampled measurements, the majority of the measurements are still manually performed. Regular sensor applications are normally not suitable for this electrolysis process due to the high temperature, the invasive cor- rosive environment of the bath and the generally harsh environment of the plants. In this paper the focus will be upon estimating the molten aluminium height, also called the metal height. By improving the monitoring of this vital variable, the process can be stabilized at more optimal conditions, which are beneficial to the en- vironmental and economical performance of the whole plant. In other words undesired variations of the process should be reduced or eliminated.

The main focus in this paper is on using system identification methods on non-uniformly sampled multi- rate systems, and in particular on a simplified model of the aluminium electrolysis cell. The main contribu- tions in this paper are as itemized in the following:

• Deriving a simplified model of the metal height in an aluminium electrolysis cell.

• Deriving and testing a Prediction Error Method (PEM) (Ljung, 1999) model for a non-uniformly sampled multi-rate system.

• Reducing the impact of the computer precision on updating the covariance matrix of the Multi- ple Input Single Output (MISO) system, in the Recursive Ordinary Least Square (ROLS) method (Di Ruscio, 2001), by rearranging the updating equation.

• Testing the combined Deterministic and Stochas- tic Realization (DSR) algorithm (Di Ruscio,1996) on a data set with multi-rate sampling.

• Testing the Multiple time series Deterministic and Stochastic Realization (MDSR) algorithm (Di Ruscio, 1997) on a data set with multi-rate sampling and multiple time series.

• Comparing different system identification methods on the simplified discrete model of the metal height, showing saw-tooth-shaped behavior.

The rest of the paper is organized as follows. In Sec. 2 we derive a simplified discrete model of the metal height in an aluminium electrolysis cell, which represents the synthetic system that is to be identified throughout the rest of the paper. In Sec. 3 the black, white and gray model approaches in system identification are emphasized. Data sampling for system identification methods are discussed in Sec. 4. In Sec. 5 prediction error methods are described for use on the given synthetic system, assuming non-uniformly multi- rate sampling. In Sec.6the sub space method DSR and a corresponding version for multiple series are tested, assuming synchronous but multi-rate sampling. Sam- pled data from a real plant is utilized in Sec. 7. The results from the different methods are compared and analyzed in Sec.8, whereas this paper is summarized and concluded in Sec.9.

2 Simplified model of the metal height

There are several model approaches describing the level and shape of the metal/bath interface in aluminium electrolysis cells. The interface is not totally flat, but consists of gravitational standing waves, and unstable propagating waves (Kurenkov et al.,2004). The waves are generated by the high electrical and magnetic fields (Lorentz forces), and the Kelvin-Helmholtz instability of the mean flow. Under stable conditions, the interface waves are estimated to have an amplitude of about 0.5 mm, at fixed positions (for a 500 kA cell), but these waves can increase by variations in stability affecting parameters, like the Anode Cathode Distance (ACD) and the height of the liquid metal (Bojarevics and Peri- cleous,2006,2009). Larger deviations are present along the horizontal axis, due to the Lorentz forces. In this work we will consider a fixed horizontal position of the metal/bath interface, and are focusing on its mean value.

Consider the sketch of an aluminium electrolysis cell of Figure 1. A simplified model of the metal height, i.e. h, will be derived in this section, based upon the aluminium mass balance.

Due to the spatial mass balance of molten aluminium the time gradient of mass equals

(4)

Molten Aluminium Carbon Lining Carbon Anode

Carbon Anode

Current Collector Bar Current Collector Bar

Electrolyte Anode

rod Removable

cover

Crust Side ledge

Carbon Lining

Insulation

Steel Shell Crust Breaker

Point Feeder

Sludge Alumina

supply

h

ACD

Figure 1: Schematic drawing of an aluminium electrolysis cell with the different structures in it forming the system under consideration. Our model focuses on temporal variations of molten metal heighth.

dm

dt =m^·_gen(t) +m^·_in(t)−m^·_out(t). (1) wherem^·_gen(t) is the instant mass rate of molten aluminium generated inside the cell, whereas m^·_in(t) and m^·_out(t) represents the mass flow of molten aluminium into and out of the electrolysis cell, respectively. For small values of ∆t, a good approximation of the mass balance is the discrete model

∆m(k) = ∆t(m^·_gen(k) +m^·_in(k)−m^·_out(k)), (2) where k = {1,2,3,· · · } represents the discrete time, whereas ∆t is the sampling time. Normally, there are no inflow of molten aluminium to the system, as the aluminium is generated within the electrolysis cell, which is regarded as the system of interest.

The generated mass of aluminium per time unit is given by Faraday’s laws of electrolysis (Grjotheim, 1993);

m^·_gen(k) = ∆Q(k)·CE·MAl

F·z·∆t (3) where CEis the current efficiency of the cell which in the following simulation is assumed to be constant for all t. MAl = 26.98g/mol is the molecular mass of aluminium, F the Faraday constant, whereas z is

the number of electrons involved in the electrode re- action generating one single aluminium atom from a aluminium-ion. The charge transferred from timek−1 tok, i.e. ∆Q(k) is given by the time integral of the current

∆Q(k) = Z k

k−1

I(τ)dτ (4)

Assuming constant value of the current within each

∆t, i.e. zero-order hold, the generated mass of aluminium per time unit equals

m^·_gen(k) =CE·MAl

F·z I(k), (5) There is usually no inflow of molten aluminium to the process, hence

m^·_in(k) = 0 (6)

The outflow of molten aluminium per time unit re- lates to the tapping proceeding. As the tapped mass of aluminium, and not its mass flow is measured in the tapping procedure, it is more convenient to put

mout(k) = ∆t·m^·_out(k), (7) in Eq. (2), hence assuming a constant flow rate within each time step.

(5)

Assuming a crisp interface between the electrolyte and the molten aluminium, and that all the generated molten aluminium will be located at the bottom of the cell, i.e. assuming no time delay due to transportation of aluminium in the electrolyte, and no precipitations of other substances within this area, and assuming a homogeneous pure aluminium, ∆m(k) equals

∆m(k) =ρ·∆V(k). (8)

Due to side ledge formation of frozen electrolyte at the side walls of the cell, the area of the horizontal section of molten aluminium varies with both height and time. Here the side ledge profile (Figure1) is assumed to be straight vertical. Hence, the change in volume of the molten aluminium in one time step is, according to Figure2, given by

∆V(k) = ∆h(k)·A(k) +h(k−1)·∆A(k) (9)

Figure 2: Schematic figure of the volume of the metal pad, showing change in volume of the molten metal described by the variablesAandh.

Consequently, in one time step, the net mass aluminium accumulated is

∆m(k) =ρ(∆h(k)·A(k) +h(k−1)·∆A(k)). (10) Based upon Eq. 10, the change in metal height in one time step is given by

∆h(k) = ∆m(k)

ρ·A(k)−h(k−1)·∆A(k)

A(k) . (11)

Inserting Eq. (2), (5), (7) into Eq. (11), leads to this simplified model of the height of aluminium in the cell:

∆h(k) = 1

ρ·A(k)[∆tCE·M_Al

F·z I(k)−m_out(k)]

−h(k−1)·∆A(k) A(k) , (12)

The model in Eq. (12) is used for generating the synthetic data set simulating the aluminium height throughout this paper.

An even simpler model of the metal height is derived, by assuming that the area A is constant. Hence, the last term in Eq. (12) will disappear. By further col- lecting the constants into two constants, a very simple finite impulse response (FIR) model is formed

∆h(k) =b₁·∆t·I(k)−b₂·m_out(k), (13) where

b₁= CE·M_Al

ρ·A(k)·F·z, (14) and

b2= 1

ρ·A(k). (15) This model is the basis of the PEM models used for the system identification problem in Sec.5.

The assumptions used in this model are:

1. Homogeneous layers of molten cryolite (electrolyte) and molten aluminium.

2. Perfect crisp interface between the layers. No mix- ing of the materials in the different layers.

3. No transportation delay for any of the materials in the system.

4. The process variables are fixed for each sampling interval ∆t.

5. A straight vertical side ledge profile.

6. Constant current efficiency.

7. No precipitates or impurities in the aluminium volume, only pure aluminium.

8. Constant density of the molten aluminium.

The aluminium electrolysis process requires a very stable energy balance, as the operation temperature of the electrolyte within the cell is close to its freezing point, so that the top surface of the bath is covered with frozen electrolyte, as well as the side ledges. The frozen electrolyte has a desired heat insulating effect, as well as it protects the side shell of the cell from cor- rosion. Simultaneously it is important that the active electrolyte is in molten condition. To maintain these steady conditions of the heat transfer, and hence the side ledge profiles, the variation of the metal height varies with only about some centimeters between maximum and minimum position of the interface. However, the total height of the aluminium layer is normally close to 20 cm.

(6)

2.1 Simulation of the molten metal height

A simulation of the aluminium height was performed using the simplified model derived in Eq. (12). Normal distributed noise with mean value µ = 0 and standard deviation σ = 0.75 cm was added to the height measurement, and rounded to nearest half centimeter.

This is a very high, but a realistic measurement uncertainty of the manually performed height measurements in aluminium electrolysis plants. The time of the daily performed tapping procedure is randomly varied from day to day, as seen in in Figure 3. In the simulation typical variations of the variables I and mout are included. To make the simulation more dynamic, a variation of the horizontal cross sectional areaAhas been implemented as a sine-curve. The simplified model is simulated with ∆t= 5min. This simulation approach will be used throughout this paper, examining different system identification methods.

3 Black, white and gray box models

There are several ways of categorizing different system identification methods. In this paper we use the box system to categorize the utilized algorithms, to have a better overview of the benefits and drawbacks of each method.

3.1 Introduction to model types

There are two main approaches to generate mathematical models of industrial processes. Mechanistic models, also called white box models, are merging several well-know physical relations between the different variables of concern to achieve a reasonable model of the process. With set of mechanistic model variables that are assumed to influence the total system, the outputs may be calculated with these models. Although the basic models are well established and may work ex- cellent for small processes, the many assumptions and uncertainties induced by the often huge number of relations integrated in the model, the final simulation of the variables may not be satisfying. The deviations of the simulated variables from the real values, regarded as random noise is normally not white noise, and some of the noise might be eliminated by improving the models.

The optional approach of creating mechanistic models is a conglomerate of different empirical methods that all depend on sampled data from the particular system of interest. In that sense such systems are more individually calibrated to the actual system, whereas the mechanistic approach induces more general models. System identification, neural networks, multivari- ate data analysis are all groups of algorithms that

0 2 4 6 8 10 12 14 16 18 20

0 100 200 300 400

time [Days]

Line Current [kA]

0 2 4 6 8 10 12 14 16 18 20

0 500 1000 1500 2000 2500

time [Days]

Tapped Al [kg]

0 2 4 6 8 10 12 14 16 18 20

38 39 40 41 42

time [Days]

Cross sec. area [m2]

0 2 4 6 8 10 12 14 16 18 20

17 18 19 20 21 22

time [Days]

Metal height [cm]

Figure 3: Simulations of the three input variables (Line current, Tapped Aluminium, and The horizontal Cross sectional area of the metal pad) and the output variable (Metal height) considered in the simplified model of an aluminium electrolysis cell.

are belonging to algorithms based on empirical rea- soning/analysis. Models where the model structure is not influenced by physical relations is often called black box models (Ljung,1999), because parameters are adjusted to fit the input and output data-sets, without reflecting physical considerations of the system. The model structure inside is not reflecting the structure of the real system, whereas the mechanistic models at- tempt to have identical structure to the real systems, and hence called white models.

It is also possible to take advantage of the empirical methods in conjunction with the mechanistic models, by utilizing the well known physical relations and calibrate the physical parameters by empirical methods, called parameter estimation. With this approach, all the variables are still easily available as the model structure is identical or similar to the mechanistic

(7)

model, but simultaneously more individually fitted, due to the empirical methods. There is a large range of such gray models,varying from models with almost any physically based model structures to mechanistic models where some parameters are estimated. Hence, gray model is a collective term of such mixed models.

The main focus in this approach will be parameter estimation using least square methods, which can be considered as a gray box, as the structure is designed based on physical insight, and the parameters calibrated according to empirical data.

Many testing techniques of different processes have been developed over the years based on functional, process and system modeling (among a plethora of other possibilities), aiming at testing, verifying and predicting system behavior robustly and as accurately as possible. The different modeling approaches display di- verse characteristics as shown in Table 2. Naturally, some solutions may involve a series of white box, black box, and gray box in tandem depending on the system architecture and the needed set of outputs.

3.2 White Box Models

White box models are based on knowledge of the internal behavior of the system. The available form of knowledge can be in terms of equations, coherent data with the related temporal and spatial associations etc.

This approach may increase effectiveness, reveal internal structures, but is essentially a general approach, that is not calibrated to the actual process. White box modeling is widely performed in simulation purposes, as no measurement of the process has to be performed, to generate the model. The process does not need to exist at all, which makes this modeling approach ideal for designing and modifying systems. The simplified model of the metal height, derived in Sec.2 is an example of a white box model.

3.3 Black Box Models

Black box models are based on the insurmountable fact of not having any knowledge about the internal behavior of the system under scrutiny. Black box model does not have any information of the system architecture or any underlying equations describing the internal behavior of the variables involved. A typical black box model is based on the study of a set of inputs provided by the user to the system and outputs from the system oblivious to where, when and how these inputs were operated inside the system to deliver the observed outputs. In other words, how these outputs are generated or what is inside the black box representing the system are not important or unknown to the user of the black box model. The main advantages may be in the

ease of the usage and implementation of the model, and the process specific approach. On the other side, generating the model requires consistent measurement of the system, measurements that can both be cum- bersome and expensive. The DSR and MDSR model of the metal height, that will be presented in Sec.6are examples of black box models.

3.4 Gray Box Models

Gray box models address systems with limited knowledge of the internal behavior of the system under scrutiny. As the name implies, the model will have the strategies of white box and black box models in analyzing a given system. This method may have the advantage of harvesting from both white box and black box analysis of a given system. For complicated systems like the one we have in the case of aluminium electrolysis cell, this method may be an unavoidable option, as models addressing all the phenomena in the cells are not available or incomplete. The OLS and ROLS model of the metal height, that will be presented in Sec.5 are examples of gray box models.

4 Data sampling

Table 3 provides a set of the most common variables that are measured in the aluminium electrolysis process.

In the aluminium electrolysis process, there is a large variety in sampling intervals for the different variables.

The online measurements; line current and cell voltage are the most regular performed measurements often given as mean values every 5 minutes. The tapping procedure is intermittently performed typically once a day. The height of the molten aluminium in industrial plants is only measured just before the tapping proceeding, and may influence the amount of aluminium to be tapped. In case of generating a model of this process, if just implementing these few measurements into a black box model, without adding any additional information of the system, a complete unfeasible model of the aluminium height will be the result. A black box model will “see” a folded version of the system as the dynamic of the saw-tooth shaped periodic oscillation is not sampled. By further including a realistic normal distributed measurement error with meanµ = 0, and standard deviationσ= 0.75 cm, the identification process will become even worse, as can be seen in Figure4.

The uncertainty of the metal height measurements will add random dynamics to the already poor folded measurements.

Tests with real data from the industry show that the simple prediction model;

(8)

Table 2: Different “Box”Systems in conjunction with modeling and their overall characteristics

Model Type Physical insight Model structure Validity

White Box Crucial Rigid General

Black Box Less important Flexible Specific

Gray Box Important Adjustable Fairly Specific

0 1 2 3 4 5 6 7 8 9 10

17 18 19 20 21 22 23

Time [Days]

Metal height [cm]

Metal height Metal height measurements

Figure 4: Synthetic data set showing metal height steadily increasing with sudden drops in its values, due to tapping, over a period of time, in this case about 10 days. It is common to measure the metal height just before the metal tapping, once a day.

¯h(k+ 1) =h(k), (16) provides smaller error predictions than many ad- vanced system identification models, because of the lack of information of the dynamic system the few sampling instants gives. To overcome these challenges both more physical insight of the system and more measurements have to be considered.

In cases where the output variables make stepwise jumps, like the saw-tooth jumps of the aluminium height, making the measurements at the right time instants is even more important than just increasing the number of measurements in general. The Nyquist the- orem states that the sampling frequency f_s should at least be twice the frequency of the variable to be measured (Shannon,1949,1998);

fs>2fm (17) In many reel applications it is more common to sample with a sample frequency about 10 times the system frequency. In the aluminium electrolysis example fs = fm, which according to both the Nyquist theo- rem and simulations provides a too poor sampling rate for system identification purposes. At the other side, increased number of manually performed metal height measurements might be to time demanding, and hence expensive for the plant, as the sampling range might

span some weeks. Especially in cases where recursive models are considered, as few manual measurements as possible should be included.

A more cost efficient solution to the high sampling rate is to use knowledge of the system to decide crucial measurement instants, i.e. including more information on the physical and chemical connections between the variables to establish a more realistic model with improved predictability. Based on sensible rea- soning of the system, or if a mechanistic model of the system is available, it is possible to set up decisive intermittent measurement instants without performing a lot of measurements with a high sampling frequency.

One measurement just before, and one just after the tapping procedure would have been a reasonable minimum requirement for detecting the model of the aluminium height. If the mechanistic model structure is assumed to be of high accuracy, a prediction-error- method (PEM) would be the first choice, to estimate its parameters (Ljung,1999).

When deciding the crucial measurement instants, variables with rapid short term variations are isolated from variables with long term slow variations. After this segregation of variables, it is possible to utilize different system identification methods for each of the two variable groups. This will be considered in the following example from the aluminium electrolysis cell.

(9)

5 Prediction Error Methods (PEM)

In this section different PEMs are used to predict the metal height, based on the synthetic data set described in Sec. 2. To make it realistic, the input variables are assumed sampled with a “fast” sampling rateT_f, while the output is sampled with a “slow” variable sampling rateTs(i). Ts(i) will vary with time, due to manual intervention in performing the measurements. For convenience Ts(i) is adjusted to fit a multiple of Tf, as shown in Figure 5. ni is the discrete time of output measurement number i, while the inputs are sampled at eachk.

The Ordinary Least Square (OLS) algorithm is designed for both these sampling rates, entailing two OLS-models. The model with sampling rate T_s(t) is used for determining the parameter vectorθ, while the model with sampling rateT_f is used for predicting the inter-sample outputs. It is utilized that the identified parameter vectorθ is identical for both models.

5.1 Ordinary Least Square (OLS)

Based upon the simple discrete model in Eq. (13), a linear regression model is constructed;

∆bh(k|θ) =

Φ^T(k)

z }| { ∆t·I(k) mout(k)

θ

z }| { θ1

θ2

(18) where ∆bh(k|θ) is the predicted change of the metal height from time stepk−1 tok, given the parameter vector θ. The parameters are estimated by the least square method, where the optimalθ, given the defined model structure and a specific data set with N input and output samples, is the parameter vectorθb^LS_N that minimizes the least square criterion for the linear regression

VN(θ) = 1 N

N

X

k=1

^T(k)λ(k), (19) where(k) = ∆h(k)−∆bh(k|θ) describes the prediction error at time stepk.

Table 3: Common measured variables in aluminium electrolysis plants. The statistics are based upon process data from one single aluminium electrolysis cell generated over one year. The measurements can be categorized into two main categories; on-line measurements (⊕) and measurements performed with manual interactions ( ).

Index Variable Unit Category

Average sample

time

Average Standard deviation

Minimum value

Maximum value

1 Metal height cm 24 h 20.1 1.2 17 28

2 Cell Voltage V ⊕ 5 min 4.49 0.29 0.44 28.3

3 Line current A ⊕ 5 min 293.4 3.7 0 315.6

4 Bath height cm 24 h 20.7 1.5 14 32

5 Anode position 1 ^rotations_day ⊕ 78 min 0.0 1679 150 2569

6 Anode position 2 ^rotations_day ⊕ 78 min 0.0 1677 112 2634

7 Alumina feeding _day^kg ⊕ 15 min 4137 323 3044 5121

8 Tapped aluminium _day^kg 24 h 2182 301 0 3125

9 Acidity % 48 h 10.7 1.7 6.3 15.2

10 Added bath _day^kg 10 days 590 264 350 1750

11 Tapped bath _day^kg 7 days 665 354 150 2100

12 Bath temperature C 7.5 h 962 9.0 939 998

13 Added CaF2 _day^kg 24 days 35 12.7 25 50

14 CaF2

concentration % 56 h 5.1 0.2 4.3 5.7

15 Anode effects _day¹ 24 hours 0.19 0.6 0 5

16 Feeder defects _day¹ 13 days 0.080 0.31 0 2

17 Superheat C 4 days 7.58 5.19 1.9 36

(10)

1 2 3 …… n1-1

Discrete time (k) Tf

n1 n1+1 …… n2-1 n2 n2+1 ……

Ts(1) Tf

Ts(2)

n³-1 n³ n³+1

……

Tf

Tf Tf Tf Tf Tf

……

Ts(3)

……

Figure 5: Time line showing the fast sampling rateT_f used for the input variables, and the slow sampling rate T_s(i) used for the output measurement (the aluminium height). As indicated T_s(i) will vary with time, due to manual intervention, but it is adjusted to fit a variable multiple of T_f. i indicates the i-th element of the output measurements sequence, while the inputs are sampled at eachk.

The analytical solution of the optimal θb^LS_N is given by

θb^LS_N =

N

X

k=1

Φ(k)Φ^T(k)

!⁻¹ _N X

k=1

Φ(k)∆h(k), (20) if the inverse of the indicated matrix exists. As this is a MISO (multiple input, single output) system, the weightλis a scalar, henceλ⁻¹λ= 1, andλis therefore not included in Eq. (20);

Consider a data set of input and output variables describing a period of 20 days. Assuming a regular consistent measurement regime, the number of metal height measurements will beN_h= 20, whereas the number of line current measurements are NI = 5760. Although the tapping is performed 20 times, the number of samples in the “tapping” vector is for convenience set to Nt= 5760 to match Nh, where the sample values between the tapping incidents are zero. Due to the manual intervention of the height measurements and tapping procedure, variations in both the number and time instants are common. The LS-model of Eq. (18) has to incorporate these properties, hence become more flexible in utilizing intermittent measurements from real plants.

The elements of the regression vector Φ(k) are defined asφ₁(k) = ∆t·I(k) and φ₂(k) =m_out(k). The predicted change of the metal height between two metal height measurements, i.e. from time stepn_iton_i+1are due to the LS-model in Eq. (18), given by

ni+1

X

k=ni+1

∆bh(k|bθ^LS_N ) =

θ1[φ1(ni+ 1) +φ1(ni+ 2) +· · · +φ₁(n_i+1−1) +φ₁(n_i+1)]

+θ2[φ2(ni+ 1) +φ2(ni+ 2) +· · ·

+φ₂(n_i+1−1) +φ₂(n_i+1)] (21)

To simplify the notation, we write

∆bh(n_i+1|bθ^LS_N ) =

n_i+1

X

k=n_i+1

∆bh(k|bθ_N^LS) (22)

By summing up the right-hand side of Eq.( 21), a prediction model with linear regression structure is achieved;

∆bh(n_i+1|bθ_N^LS) = (23)

Pn_i+1

k=n_i+1φ1(k) Pn_i+1

k=n_i+1φ2(k)

| {z }

Ψ^T_ni+1

θ₁ θ₂

Although the linear regression model of Eq. (23) is used for estimating the least square parameter vector θ, this parameter vector is mutual for both Eq. (23) and Eq. (18), hence it can be used to predict the metal height by Eq. (18).

5.1.1 Simulation of the metal height

As the tapping proceeding is a rapid process compared to the slow electrolysis process where molten aluminium is generated, height measurements should ideally be made both just before and immediately after the metal tapping. Hence, the top and at the bottom level of the metal height would be measured in each cy- cle. In that way, the rapid changes of the metal height, due to metal tapping, is isolated into a small temporal regression vector Ψn_i. The next regression vector Ψn_i+1 will isolate the influence of the slower electrolysis process. However, the second measurement instant should be carefully chosen due to an eventually time lag. With this approach every second Ψ-vector in the considered simulation, the last element of this vector, representing the amount of tapped metal, is zero.

(11)

5.1.2 The prediction algorithm

With the predictor model of Eq. (18), the measurement based update of the predicted metal height is very rare.

Hence, the predictions of ∆h(k) will be updated only by the predictor model for all the time instants between two metal height measurements. The estimated height of the metal height is given by;

h(k+ 1) =h(k) + ∆bh(k+ 1) (24) Although the parameter estimation may be good, noise and uncertainties in the model may cause the predicted output of the aluminium height to drift away from the real value of the aluminium height. On the other side the manually performed metal height measurements are likely to be corrupted by decisive noise.

Therefore, a weighting algorithm has been incorpo- rated in the predictor at each time step a new metal height measurement is performed;

h(k+ 1) =w1h(k) +w2˜h(k) + ∆bh(k+ 1), (25) where ˜h(k) is the metal height measurement at time step k. The weights w₁ and w₂ are related so that w1 +w2 = 1. The weights are adjusted by trial and error. Generally w1 should be close to 1, if the model is assumed to be of proper quality and simultaneously the measurements having large uncertainties.

w1should be reduced if the model seems unreliable or the measurements more reliable. In the following ex- amplew1= 0.75.

The data sets given in Figure3, were used to generate the θ-parameters of the OLS-algorithm. The predictions are compared to the results of simulated model and measurements in Figure 6. The predicted metal height is following the real metal height quite well, in spite of large uncertainties in the metal height measurements.

5.2 Recursive Ordinary Least Square (ROLS)

Variations in the cell performance may cause a need for updating the parameters of the model. In this case where the measurements are both rare and displaying low accuracy, updating the model has to be very slow, in order to avoid rapid variations of the model parameters, due to measurement uncertainties. Hence, short term variations of the plant will have minor influence on the model, but prospective seasonal and aging variations will have impact on the parameter variations of the model. In an OLS or a recursive ordinary least square (ROLS) method, by increasing the number of

samples, each sample will have less influence on the estimated parameters. In systems where it is likely that there will be long term variations of the model parameters, a forgetting factor αcould be included (Ljung, 1999). The criterion function at timetwill then obtain the following form;

Vt(θ) = 1 t

t

X

k=1

α^t−k^T(k)λ(k), (26) Hence, the newest measurements are weighted more than the older ones. The choice of α is a trade off between reducing the sensitivity of the model regarding measurement noise, and simultaneously be able to adopt to time variations of the system parameters. Val- ues between 0.95 and 0.99 are common choices. For a given forgetting factor, e.g. α = 0.95, the weight as- signed to a sample-value 30 samples before the current sample, is reduced to approximately 21% (0.95³⁰ ≈ 0.21) of the weight put on the current sample-value, whereas definingα= 0.99, the representative weight is reduced to approximately 74% (0.99³⁰≈0.74).

According to Eq. (20), an optimal parameter vector given a data sample with discrete time [1,2,· · · , t−2, t−1] is given by

θbt= (

t

X

k=1

Φ(k)λΦ^T(k)

| {z }

P_t

)⁻¹

t

X

k=1

Φ(k)λ∆h(k), (27)

if the inverse ofPtexists.

A recursive ordinary least square (ROLS) method with a forgetting factor is possible to include in the parameter estimation in the following way (Di Ruscio, 2001):

1. Initial values of the covariance matrixPt and the parameter vectorθt are defined. It is common to let

P₀ = δ 1 0

0 1

=δI θ₀ =

0 0

,

whereδis a “large” number, e.g. 10 000.

2. The regression vector Ψ is updated at each metal height measurement:

Ψ_n_i+1= Pⁿi+1

k=ni+1φ1(k) Pⁿi+1

k=ni+1φ2(k) ^T (28) 3. Updating the inverse of the covariance matrix

P_n⁻¹_i+1=αP_n⁻¹_i + Ψn_i+1λΨ^T_n_i+1 (29)

(12)

0 1 2 3 4 5 6 7 8 9 10 17

18 19 20 21 22 23

Time [Days]

Metal height [cm]

Real metal height Metal height measurements OLS-model of metal height

Figure 6: Prediction of metal height for the earlier presented synthetic data set showing fairly well correlation between the simulated and predicted values. The prediction model is based on the OLS algorithm and the noisy metal height measurements, just before and just after metal tapping, shown with the green circles in the figure.

and the “Kalman” gain matrix

Kn_i+1 =Pn_i+1Ψn_i+1λ (30) after each metal height measurement.

4. Updating the present parameter vector at each metal height measurement

θn_i+1 =θn_i+Kn_i+1 ∆hni+1−Ψn_i+1θn_i

(31) where ∆hn_i+1=h(n_i+1)−h(n_i)

5. Computing the covariance matrix after each metal height measurement:

P_n_i+1= αP_n⁻¹

i + Ψ_n_i+1λΨ^T_n

i+1

−1

(32) 6. At each time instant k the predicted change of

metal height is given by

∆h(k) = Φ(k)θ^T (33)

5.2.1 Matrix inversion lemma

An equivalent form of Eq. (29) and Eq. (32) that is more suited for rapid computation is given by the Sherman-Morrison-Woodbury formula (Golub and Loan,1996), also called the matrix inversion lemma;

(A+U V^T)⁻¹=A⁻¹−A⁻¹U(I+V^TA⁻¹U)⁻¹V^TA⁻¹ (34) Applying Eq. (34) to Eq. (32), where

A = αP_n⁻¹_i U = Ψ_n_i+1 V^T = Ψ^T_n_i+1

gives

P_n_i+1= (35)

1 αPn_i−

1

αPn_iΨn_i+1

·

I+ Ψ^T_n

i+1

1

αP_n_iΨ_n_i+1₎ −1

Ψ^T_n

i+1

1 αP_n_i

!

The matrix inversion in Eq. (32) is modified using Eq. (34), hence the inversion of a matrix is now re- placed by a inversion of a scalar, i.e. the denominator in the following deducted equation:

P_n_i+1= (36)

1

αPn_i I− Ψ_n_i+1Ψ^T_n

i+1P_n_i α+ Ψ^T_n_i+1Pn_iΨn_i+1

!

Note that this way of calculatingP_n_i+1 is more sen- sitive regarding selection of initial values of P₀ when working with regression vectors where one element is zero, and the other is “large”. In the simulated system that is considered, only proper parameter estimation is achieved for values ofδ≤10⁻⁴. For larger values ofδ, the first element of the P-matrix will become zero, and henceθ1will also remain at zero throughout the whole estimation period. For values ofδ≤10⁻⁸,P and hence θwill converge very slowly towards its proper values.

If the normal routine of letting the initial value ofP11

be “large” is followed, and in additionψ1is “large”, the impact ofα∈<0,1], will easily be neglected due to the limitations of the computer precision. Assuming that α << Ψ^T₁P0Ψ1, the first estimate of P will become a lower triangular matrix due to the computer precision;

(13)

P1 = 1 αP0

I− Ψ1Ψ^T₁P α+ Ψ^T₁P0Ψ1

= 1

αP₀ 1 0

0 1

− 1 αP₀

1 0 0 0

= 1

αP0

0 0 0 1

(37) However, by rearranging Eq. (36) in the following way, proper values ofθwill be estimated also for larger values ofδ:

P1 = 1 αP0

α α+ Ψ^T₁P0Ψ1

I+Ψ^T₁P0Ψ1I−Ψ1Ψ^T₁P α+ Ψ^T₁P0Ψ1

! (38)

= 1

αP0

" α

Ψ^T₁P₀Ψ₁ 0

0 ^α

Ψ^T₁P0Ψ1

# +

0 0

0 1 !

= 1

αP0

" _α

Ψ^T₁P₀Ψ₁ 0

0 1

#

(39)

P1will in this case remain a diagonal matrix, which is crucial. The reason why, the matrix inversion lemma had to be rearranged is that displaying _α+Ψ^Ψ¹^Ψ_T^T¹^P

1P₀Ψ₁

requires higher computer precision than displaying

α

α+Ψ^T₁P₀Ψ₁, since the former has a very small perturbation from 1, whereas the other has a small perturbation from 0.

Given a realistic example from the system considered in this paper, where Ψ^T₁ =

2.5·10¹⁰ 0 and P0 = 10⁴I. Computing P1 will then involve computation with numbers close to the precision limits of the computer. By using the ordinary matrix inversion lemma, only values where 10⁻⁷ < δ < 10⁻⁴ caused reasonable results. Model predictions outside this area where unstable as one of the parameters maintained zero throughout the regression. For larger values P₁₁ and θ₁ became zero for the ordinary matrix inversion lemma, whereas the modified inversion lemma performed proper estimations for all larger values ofP11. 5.2.2 Prediction of the metal height

The data sets given in Figure3, were used to generate theθ-parameters of the ROLS-algorithm. The prediction is compared to the simulated model and measurements in Figure 7. The predicted metal height is following the real metal height quite well, in spite of large uncertainties in the metal height measurements.

In Figure8 the temporal variation of the regression parameters is shown. For this simulation the recursive variant of the OLS is superfluous, as these parameters are not following the constant system parameters very well.

0 1 2 3 4 5 6 7 8 9 10

0 1 2

3x 10^-12 Parameter estimation with ROLS

Time [Days]

0 1 2 3 4 5 6 7 8 9 10

-2 -1 0 1x 10^-5

Time [Days]

θ₁ b₁

θ₂ b₂

Figure 8: Simulating the values of the estimated parameters, calculated in the ROLS model, using the earlier presented synthetic data set.

The ”‘rapid”’ oscillation of the parameters is not detected by the ROLS-model. The estimated parameters are close to the mean value of the parameters, and will only follow long term variations of the parameters.

6 System Identification using DSR

Assume that there are dynamics within the system that are not detected by the strict mechanistic model used with PEM, as how the temperature is influencing the areaA. An alternative is to identify black box models, like the DSR (Deterministic and Stochastic Realiza- tion) model. The DSR method is based upon a linear discrete time invariant State Space Model (SSM)

x_k+1=Ax_k+Bu_k+Ce_k (40) yk=Dxk+Euk+F ek, (41) and is explained inDi Ruscio(1996).

6.1 Regular DSR

The sample scenario used with the OLS and ROLS algorithms in the former section is not consistent with the SSM in Eq. (40). Further the number of measurements has to be increased to extract information on the short term system dynamics between the tapping instants. Hence, a new measurement regime has been set up for the identification of a DSR model, as described in the following.

Assume that the system of interest might be described by the linear SSM, and that the inputs are sampled with one fast sampling rate, Tf, whereas the output is known at a slow sampling rate,Ts. Contrary to Sec. 5, Ts is constant in this section. The input sampling rate is M times faster than the output sampling rate, hence Ts = M Tf. If it is not possible to increaseTs, themodus operandiis to consider only the

(14)

0 1 2 3 4 5 6 7 8 9 10 16

17 18 19 20 21 22 23

Time [Days]

Metal height [cm]

Real metal height Metal height measurements ROLS-model of metal height

Figure 7: Simulating the predicted metal height using ROLS. Prediction of metal height for the earlier presented synthetic data set showing good correlation between the simulated and predicted values. The prediction model is based on the ROLS algorithm and the noisy metal height measurements shown with the green circles in the figure. Although the initial value is set to high, the predicted values

”‘soon”’ reaches the desirable level. ROLS needs some time to adjust its model parameters and hence the variations in its initial phase.

inputs at a sampling rate of Ts, in order to achieve a common sampling rate, which is required to use the standard DSR-method (Di Ruscio, 2000). Then, to get a discrete SSM with the fast sampling rate, Tf, the generated discrete SSM is first transformed into a continuous SSM, utilizing e.g. the MATLAB function [Ac, Bc] = d2c(A, B, Ts). Next the continuous model is transformed back to a discrete model, but now with another sampling rate by the MATLAB function [A_d, B_d] =c2d(A_c, B_c, T_f). A periodical variation in the cell temperature introducing a variation in the width of the metal bath has been introduced in the following simulations. Hence, in addition to the sawtooth variation, already explained, the metal height will fluc- tuate with a period of about 10 hours. Assuming that the three input variables I, mout and T (bath temperature) are sampled every 5 minutes, and the metal height measured every 3rd hour. To be able to utilize the DSR algorithm the mean values every third hour are used as inputs, to generate the state space model.

The model is then transformed from a sampling time of 3 hours to 5 minutes as already described. Figure9 shows the result of a simulation of the metal height in the aluminium electrolysis cell, using this method. The simulation shows a fairly good estimation, in spite of the poor metal height measurements also shown in the figure. By running several additional simulations with random generated measurement noise, some of the con- comitant predictions had a tendency of drifting away from the real metal height. This could be compensated by including measurement in the prediction algorithm, e.g. including a Kalman filter, when running the prediction model online. This could be a useful estimation technique if implementing automatic metal height

measurements as discussed in (Viumdal et al., 2010;

Viumdal and Mylvaganam, 2010). To be able to utilize the DSR algorithm the mean values every 3rd hour are used as inputs, to generate the state space model.

The model is then transformed from a sampling time of 3 hours to 5 minutes as already described. This transformation of the model unfortunately more often involves less robust models.

6.2 DSR in case of multiple time series

Another approach to identify a model of the metal height is to use the MDSR, as addressed in (Di Rus- cio,1997). MDSR is designed to handle multiple time series, e.g. an industrial process where the time series are interrupted by shutdowns in the production line, or where each time series represents a batch process. By just placing measurements from different time series successively in a large data matrix, the initial states of each time series is not computed when using ordinary DSR algorithm. By using MDSR, initial state values are calculated for each time series, in addition to the overall state space model. In the aluminium electrolysis example, the increase in metal height is “interrupted” by the metal taps. Instead of measuring the metal height every 3rd hour, as assumed with the DSR- algorithm, another measurement scenario is applied for the MDSR approach. These measurement series are di- vided into several sub time series, each starting immediately after the metal tapping. As the tapping has an intermittent sampling interval, there will be a shift in the sampling incidents between each of the time series.

Within each sub time series, new metal height measurements are sampled each 3rd hour. The mean value

(15)

0 1 2 3 4 5 6 7 8 9 10 17

18 19 20 21 22 23

Time [Days]

Metal height [cm]

Real metal height Metal height measurements DSR-model of metal height

Figure 9: Prediction of the metal height, using DSR on the earlier synthetic data set. Unlike the assumed measurement regime for the OLS and ROLS models, the metal height is here assumed to be measured every 3rd hour. The noisy metal height measurements are shown with the green circles in the figure.

In general, the simulated values are closer to the predicted than the measured values.

of the input variables are taken for each time step as done with the DSR model. The result is a new model, with as many initial states as there are sub time series.

Figure10shows an example where MDSR is used, still with a sampling time between each metal height measurement of 3 hour, but here with sampling occurrence according to the latest metal tap instance. Attempts of transforming the MDSR model with a sampling rate of 3 hours to a sampling rate of 5 minutes failed, therefore the graph in Figure10is marked with dots, at the locations where model has calculated the predictions.

However, in a real aluminium electrolysis process, the metal height is still measured manually in an intermittent manner, and normally only just before the tapping proceeding. Hence, the PEM-algorithms seem to be the most realistic approach at the moment. Using MDSR online is not straightforward, one alternative is to use any PEM-algorithm to predict the drop in metal height due to the metal tapping, and using the MDSR as the model describing the metal height between the tapping proceeding.

7 System identification using real measurements

A measurement sequence from an Aluminium reduction cell in Norsk Hydro, ˚Ardal, was performed with a time span of 76 hours. The metal height was measured just before, and just after the tapping instants, and then approximately every 4th hour. The input variables “Line current” and “Tapped bath” were sampled each 5 minute. Two different OLS models of the process was generated for this data set. The first model was only based on the metal height measurements performed in conjunction with the metal tapping, while

the inter sampled measurements are included in the second OLS model. Figure11shows the predictions of the two OLS models, the measurements, and a simulation of the mechanistic model. The mechanistic model is based upon geometrical parameters of the aluminium reduction cell, and the initial value is adjusted to fit the measurements. The element values of the parameter vector θ are about 36% to 50% less than the corresponding parameter values in the mechanistic model. The parameter vector of the two OLS models are similar, but as the first model is based on fewer measurements than the second model, the former is expected to be less reliable. However, the main difference between these two models, is that the second OLS model is updated more often by new measurements. It is difficult to verify which model is closest to the real system based of these few measurements, particularly as the measurements are corrupted with so much noise. More reliable models can be achieved by extending the time span of the sampling, and by introducing improved measurement systems, which are beyond the scope of this paper.

8 Comparing performance of the models in metal height

predictions

In addition to the methods already described, the problem was also organized as a lifted model approach, using the same data set, as used for the DSR approach, i.e. the sampling time for the three input variables were every 5 minute, whereas the sampling time of the output was every 3rd hour. Hence, the inputs were sampled 36 times as often as the output, resulting in an input matrix with 72 (2x36) variables. Due to the