Dynamic system multivariate calibration

Academic year: 2022
Fig. 1. Basic principle for estimation of primary system outputs y_1 from known inputs u and measured secondary outputs y_2 in presence of process noise v.

the covariance of the y_1 estimate is also given. In Section 4 it is shown that least squares estimation (LSE), principal component regression (PCR) and partial least squares regression (PLSR) can be seen as special static cases of this dynamical solution. The relations between these data based estimators and theoretical estimators based on known or assumed static models and noise properties are also presented, and the relation between two different PLSR algorithms falls out as a neat result. Extensions of the PCR and PLSR methods to cover also dynamic systems with collinear measurements are presented in Section 4. Section 5 gives some numerical examples and Monte Carlo simulations, and concluding remarks are given in Section 6.

2. Background and preliminaries

Linear regression and static calibration methods have roots in the classical least squares technique used by Gauss around 1800. When the number of estimator variables is large and the number of observations is limited, the ordinary solution to the least squares problem may have very large variance due to overfitting. This situation requires some form of regularization, e.g., PCR or PLSR [1,2]. In many cases of great practical interest, the estimator variables far outnumber the observations at hand. An example is product quality characterization by use of near infrared spectroscopy, with several thousand estimator variables (frequencies) and often less than 100 observations. In such cases, the estimator variables are often strongly collinear, and most of the information is then contained within a subspace of the variable space. Basic tools for this data compression are singular value decomposition (SVD) and principal component analysis (PCA), and the regression method directly based on this is PCR, while PLSR combines data compression and regression in an iterative approach. Such tools for multivariate data analysis are used in many scientific fields like biometrics, chemometrics, econometrics and psychometrics.

Linear regression can also be used to identify the parameters in dynamic system finite impulse response (FIR) models or autoregressive models with external inputs (ARX) [3]. Due to lack of noise modeling, this will normally result in biased parameter estimates, and the FIR truncation error comes in addition. Identification of FIR and ARX models by PCR and PLSR has also been investigated, see, e.g., Refs. [4,5].

In parallel with the development of the PCR and PLSR methods, the field of general dynamic system identification (SI) has been developed into a sophisticated set of methods and practical tools. Classical SI methods are summarized in comprehensive books, e.g., Refs. [3,6]. At present, subspace identification methods attract a great deal of interest, see, e.g., Ref. [7] with further references. In all forms of SI, one finds that LSE is used as a basic tool. It is, however, refined and in some cases replaced by, e.g., prediction optimization methods in order to account for the noise influence in a proper way.

SI is also closely linked to the Kalman filtering theory [8]. This is done by use of innovation models, where the different process and measurement noise sources are replaced by the white noise innovations in an underlying Kalman filter.

From an SI and Kalman filtering point of view, it is intuitively evident that the classical linear regression and the modern multivariate calibration methods may be seen as special static cases of the more general parametric SI methods for dynamic systems. An early attempt to look into these similarities was made in Ref. [9], and the present paper includes a further and more detailed attempt to do so. When these similarities are to be investigated, three basic facts have to be acknowledged.

(1) Methods of multivariate calibration are used to find models for estimation of unknown output variables y from both independent and dependent known variables x. In SI terminology, this means methods for estimation of unknown system outputs y_1 from both independent system inputs u and dependent system outputs y_2. The basic observation here is that also the dependent outputs y_2 have to be used as inputs in the SI procedure.

(2) When the multivariate calibration models are used for estimation, the y_1 outputs are not known, and this will also be the case for the corresponding dynamical models found by SI. We are, therefore, led to consider output error (OE) models and not the qualitatively different ARMAX (autoregressive moving average with external inputs) type of models used for, e.g., control design based on known y_1 outputs.

(3) In order to find the optimal y_1 estimate, the underlying Kalman filter must be of the predictor-corrector form, which is normally not the case when innovation models are used in SI.

These basic facts must be reflected in the theoretical analysis of the relations between SI and LSE, PCR and PLSR, and this is quite independent of the specific SI methods considered. The use of both independent inputs u and dependent y_2 measurements as inputs in an SI procedure raises questions about identifiability and applications on deterministic and perfect measurement systems. A preliminary discussion of this is given in Ref. [10]. A detailed comparison of ARMAX and OE models for prediction of y_1 based on u and y_2 is given in Ref. [11].

3. Secondary measurements as inputs in system identification

3.1. Statement of problem

Consider the discrete system model

$$x_{k+1} = A x_k + B u_k + G v_k$$
$$y_{1,k} = C_1 x_k + D_1 u_k + w_{1,k} \qquad (1)$$
$$y_{2,k} = C_2 x_k + D_2 u_k + w_{2,k},$$

where x is the state vector, while v and w = [w_1^T  w_2^T]^T are white and independent process and measurement noise vectors. Also assume a stable system with (A, G\sqrt{R_v}) reachable, where R_v is given by the expectation R_v = E v_k v_k^T. Note that some or all of the secondary y_2 measurements may be collinear with some or all of the primary y_1 measurements.

Further assume that input-output data are available from an informative experiment [12], i.e., that data records for u_k, y_{1,k} and y_{2,k} for k = 1, 2, ..., N are at hand, with u_k persistently exciting of appropriate order and N sufficiently high. The problem is now to identify the optimal one-step-ahead y_{1,k} predictor based on past and present u_k and past y_{2,k} values, and the optimal y_{1,k} current estimator based also on present y_{2,k} values.

Note that it is a part of the problem that y_{1,k} is not available as a basis for the prediction estimate \bar{y}_{1,k|k-1} or the current estimate \hat{y}_{1,k|k}. This is a common situation in industrial applications, e.g., in polymer extruding, where product quality measurements involve costly laboratory analyses. Product samples are then collected at a rather low sampling rate, and product quality estimates at a higher rate may thus be valuable.

3.2. Optimal one-step-ahead predictor when y_1 is available

The model (Eq. (1)) can be expressed in the ordinary innovation form [6] given by the following equations, where A K = A [K_1  K_2] is the gain in a predictor type Kalman filter formulation with white innovations e_1 and e_2:

$$x_{k+1} = A x_k + B u_k + A [K_1 \; K_2] \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}_k$$
$$y_{1,k} = C_1 x_k + D_1 u_k + e_{1,k} \qquad (2)$$
$$y_{2,k} = C_2 x_k + D_2 u_k + e_{2,k}.$$

The optimal one-step-ahead y_1 predictor with all measurements available and a known u_k will then be

$$x_{k+1} = A (I - K_1 C_1 - K_2 C_2) x_k + (B - A K_1 D_1 - A K_2 D_2) u_k + A K_1 y_{1,k} + A K_2 y_{2,k}$$
$$\bar{y}_{1,k} = C_1 x_k + D_1 u_k. \qquad (3)$$

This will be the best linear one-step-ahead predictor if x_0, v_k and w_k have arbitrary statistics, and the optimal predictor assuming that x_0, v_k and w_k are normally distributed [8]. This is also the predictor normally used in prediction error identification methods [3,6].

3.3. Optimal one-step-ahead predictor when y_1 is not available

When the y_1 measurements are not available as a basis for prediction, the ARMAX predictor (Eq. (3)) is no longer optimal [11]. The obvious reason for this is that Eq. (3) is based on an underlying Kalman filter driven by y_1 in addition to u and y_2, and the information in the y_2 measurements will then not be utilized in an optimal way when y_1 is not available.

In a prediction error identification method, we must instead base the prediction on an underlying Kalman filter driven by u and only the y_2 measurements. With the assumption that (C_2, A) is detectable, the following innovation form can then be derived from Eq. (1):

$$x^{OEP}_{k+1} = A x^{OEP}_k + B u_k + A K_2^{OE} e_{2,k}$$
$$y_{2,k} = C_2 x^{OEP}_k + D_2 u_k + e_{2,k}. \qquad (4)$$

The y_1 output is then given by

$$y_{1,k} = C_1 x^{OEP}_k + D_1 u_k + q_k^{OEP}, \qquad (5)$$

where

$$q_k^{OEP} = C_1 (x_k - x^{OEP}_k) + w_{1,k} \qquad (6)$$

is colored noise.

The underlying Kalman filter is governed by the well known Kalman filter equations [8]. The Kalman gain is determined by

$$K_2^{OE} = P^{OEP} C_2^T (C_2 P^{OEP} C_2^T + R_{22})^{-1}, \qquad (7)$$

where the prediction state estimation covariance P^{OEP} = E(x_k - x^{OEP}_k)(x_k - x^{OEP}_k)^T is given by the Riccati equation

$$P^{OEP} = A P^{OEP} A^T + G R_v G^T - A K_2^{OE} C_2 P^{OEP} A^T, \qquad (8)$$

and where R_v = E v_k v_k^T and R_{22} = E w_{2,k} w_{2,k}^T.
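As an illustration of Eqs. (7) and (8), the filtering-form Riccati equation can be solved numerically through its duality with the control-form discrete algebraic Riccati equation. The following Python sketch uses a small hypothetical system (all matrices are illustrative assumptions, not taken from the paper):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical second-order system (illustrative values only)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
G = np.eye(2)
C2 = np.array([[1.0, 0.0]])
Rv = 0.1 * np.eye(2)        # process noise covariance R_v = E v_k v_k^T
R22 = np.array([[0.01]])    # secondary measurement noise covariance

# Eq. (8) is the filtering-form DARE; it is solved here through
# duality with the control-form DARE (transpose A and C2).
P_oep = solve_discrete_are(A.T, C2.T, G @ Rv @ G.T, R22)

# Eq. (7): Kalman gain related to the y2 measurements
K2 = P_oep @ C2.T @ np.linalg.inv(C2 @ P_oep @ C2.T + R22)

# Verify eq. (8): P = A P A^T + G R_v G^T - A K2 C2 P A^T
residual = A @ P_oep @ A.T + G @ Rv @ G.T - A @ K2 @ C2 @ P_oep @ A.T - P_oep
```

The residual check confirms that the solution returned by solve_discrete_are satisfies Eq. (8) term by term.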

Theoretically, it is possible to identify the system determined by Eqs. (4) and (5) using y_1 and y_2 as outputs, i.e., to identify Eq. (2) with a simplified noise model employing K_1 = 0. With many secondary y_2 measurements it is, however, a simpler task to use y_2 as an input signal, and identify the OE prediction model (OEP model)

$$x^{OEP}_{k+1} = A (I - K_2^{OE} C_2) x^{OEP}_k + (B - A K_2^{OE} D_2) u_k + A K_2^{OE} y_{2,k}$$
$$y_{1,k} = C_1 x^{OEP}_k + D_1 u_k + q_k^{OEP}. \qquad (9)$$

The corresponding input-output model is then

$$y_{1,k} = C_1 (qI - A + A K_2^{OE} C_2)^{-1} \left[ (B - A K_2^{OE} D_2) u_k + A K_2^{OE} y_{2,k} \right] + D_1 u_k + q_k^{OEP} = \bar{y}_{1,k|k-1} + q_k^{OEP}, \qquad (10)$$

where \bar{y}_{1,k|k-1} is the optimal prediction estimate.

This model can also be expressed as

$$y_{1,k} = G_1(q^{-1}) u_k + G_2(q^{-1}) y_{2,k} + q_k^{OEP}, \qquad (11)$$

where q^{-1} is the unit delay operator. The transfer functions are here

$$G_1(q^{-1}) = C_1 (qI - \tilde{A})^{-1} (B - A K_2^{OE} D_2) + D_1 \qquad (12)$$

and

$$G_2(q^{-1}) = C_1 (qI - \tilde{A})^{-1} A K_2^{OE}, \qquad (13)$$

with \tilde{A} = A - A K_2^{OE} C_2.

In order to identify the deterministic part of the system (10), i.e., G_1 and G_2, we model q_k^{OEP} by some unknown white noise sequence and use the prediction

$$\hat{y}_{1,k} = \hat{G}_1(q^{-1}; \theta) u_k + \hat{G}_2(q^{-1}; \theta) y_{2,k}. \qquad (14)$$

The prediction error is then

$$\varepsilon_{1,k} = y_{1,k} - \hat{y}_{1,k} = \left[ G_1(q^{-1}) - \hat{G}_1(q^{-1}; \theta) \right] u_k + \left[ G_2(q^{-1}) - \hat{G}_2(q^{-1}; \theta) \right] y_{2,k} + q_k^{OEP}. \qquad (15)$$

When evaluating the result of minimizing a criterion function $V_N(\theta) = \mathrm{tr}\left( (1/N) \sum_{k=1}^{N} \varepsilon_{1,k} \varepsilon_{1,k}^T \right)$, we must now consider the fact that y_{2,k} and q_k^{OEP} are not independent. We then note that the predictor (Eq. (14)) has the form of an observer driven by u and the y_2 measurements, and that the criterion function determined by the prediction error (Eq. (15)) under the assumption of Gaussian noise therefore is minimized when and only when both

- the deterministic model is correct, and
- the observer gain is a Kalman gain, i.e., \hat{K}_2 \equiv K_2^{OE}.

Minimization will therefore asymptotically (N \to \infty) result in \hat{G}_1 \equiv G_1 and \hat{G}_2 \equiv G_2, with G_1 and G_2 given by Eqs. (12) and (13). The prediction estimate \bar{y}_{1,k|k-1} will thus be asymptotically unbiased.

3.4. Optimal current estimator when y_1 is not available

Utilizing also current y_2 values, the optimal estimator considering that y_1 is not available will be found by identifying the following OE model based on an underlying predictor-corrector Kalman filter [8] utilizing also current data (OEC model):

$$y_{1,k} = C_1 (I - K_2^{OE} C_2)(qI - A + A K_2^{OE} C_2)^{-1} \left[ (B - A K_2^{OE} D_2) u_k + A K_2^{OE} y_{2,k} \right] + C_1 K_2^{OE} (y_{2,k} - D_2 u_k) + D_1 u_k + q_k^{OEC}. \qquad (16)$$

Here we introduce the colored noise

$$q_k^{OEC} = C_1 (x_k - x^{OEC}_k) + w_{1,k}, \qquad (17)$$

based on

$$x^{OEC}_k = (I - K_2^{OE} C_2) x^{OEP}_k + K_2^{OE} (y_{2,k} - D_2 u_k). \qquad (18)$$

From Eq. (16), we find the asymptotically unbiased and optimal y_1 current estimator

$$\hat{y}_{1,k|k} = C_1 (I - K_2^{OE} C_2)(qI - A + A K_2^{OE} C_2)^{-1} \left[ (B - A K_2^{OE} D_2) u_k + A K_2^{OE} y_{2,k} \right] + C_1 K_2^{OE} (y_{2,k} - D_2 u_k) + D_1 u_k. \qquad (19)$$

This is the central relation in the paper, showing how past and present u and y_2 values can be utilized in an optimal way to find the current estimate \hat{y}_{1,k|k}. It is straightforward to show, however, that identification of Eq. (19) by use of a prediction error method will result in a correct result only when w_{1,k} and w_{2,k} are uncorrelated [11].

The optimal estimator (Eq. (19)) is also the basis for Section 4, where LSE, PCR and PLSR are found as special static cases, and for the dynamic PCR and PLSR solutions presented in Section 5.
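To make the predictor-corrector structure behind Eqs. (18) and (19) concrete, the following Python sketch simulates a hypothetical two-state system (illustrative matrices and noise levels, with D_1 = D_2 = 0, not one of the paper's examples) and runs the stationary corrector/predictor recursion; the sample RMSE of the current estimate should then approach the theoretical value from Eqs. (20)-(22):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(0)

# Hypothetical discrete system on the form of eq. (1), D1 = D2 = 0
A = np.array([[0.9, 0.1],
              [0.2, 0.7]])
B = np.array([[0.0], [1.0]])
G = np.eye(2)
C1 = np.array([[1.0, 0.0]])
C2 = np.array([[0.0, 1.0]])
Rv, r11, r22 = 0.05 * np.eye(2), 1e-4, 0.01

# Stationary Kalman quantities, eqs. (7)-(8)
P_oep = solve_discrete_are(A.T, C2.T, G @ Rv @ G.T, np.array([[r22]]))
K2 = P_oep @ C2.T / (C2 @ P_oep @ C2.T + r22)

N = 20_000
u = rng.standard_normal(N)
x = np.zeros((2, 1))          # true state
xp = np.zeros((2, 1))         # predicted state x^OEP
err = np.empty(N)
for k in range(N):
    y1 = (C1 @ x).item() + np.sqrt(r11) * rng.standard_normal()
    y2 = (C2 @ x).item() + np.sqrt(r22) * rng.standard_normal()
    xc = xp + K2 * (y2 - (C2 @ xp).item())    # corrector, eq. (18)
    err[k] = y1 - (C1 @ xc).item()            # current estimation error
    xp = A @ xc + B * u[k]                    # predictor (time update)
    x = A @ x + B * u[k] + G @ (np.sqrt(0.05) * rng.standard_normal((2, 1)))

# Theoretical current covariance and RMSE, eqs. (20)-(22)
I2 = np.eye(2)
P_oec = (I2 - K2 @ C2) @ P_oep @ (I2 - K2 @ C2).T + K2 @ K2.T * r22
rmse_theory = np.sqrt((C1 @ P_oec @ C1.T).item() + r11)
rmse_sim = np.sqrt(np.mean(err ** 2))
```

Here the filter uses the exact model, so the sketch only illustrates the optimal case; an identified OEC model would reach the same RMSE asymptotically.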

3.5. Theoretical y_1 current estimation covariance

When the OEC model (Eq. (16)) is identified using a large data set, i.e., N \to \infty, the estimate \hat{y}_{1,k|k} will be asymptotically unbiased when we use either only u or both u and y_2 as input signals. The asymptotic covariance will, however, depend on the model and the quality of the data. In the following we assume perfect model and noise information, and derive theoretical asymptotic expressions for the y_1 current estimation covariance.

The underlying Kalman filter driven by u and the y_2 measurements is governed by Eqs. (7) and (8). The current state estimate is given by Eq. (18), and the current state estimation covariance P^{OEC} = E(x_k - x^{OEC}_k)(x_k - x^{OEC}_k)^T is thus

$$P^{OEC} = (I - K_2^{OE} C_2) P^{OEP} (I - K_2^{OE} C_2)^T + K_2^{OE} R_{22} (K_2^{OE})^T. \qquad (20)$$

As the current estimate \hat{y}_{1,k|k} is directly based on x^{OEC}_k, the theoretical asymptotic y_1 current estimation covariance becomes

$$\mathrm{Cov}(\hat{y}_{1,k|k}) = E(y_{1,k} - \hat{y}_{1,k|k})(y_{1,k} - \hat{y}_{1,k|k})^T = C_1 P^{OEC} C_1^T + R_{11}, \qquad (21)$$

with P^{OEC} given by Eq. (20) and R_{11} = E w_{1,k} w_{1,k}^T.

Assume now for convenience a scalar y_1 measurement. When the model is identified and validated by use of independent data sets with N \to \infty, we will then find the theoretical root mean square error (RMSE)

$$\mathrm{RMSE}_{|u,y_2} = \sqrt{ \frac{1}{N} \sum_{k=1}^{N} (y_{1,k} - \hat{y}_{1,k|k})^2 } \to \sqrt{ C_1 P^{OEC} C_1^T + R_{11} }. \qquad (22)$$


4. Multivariate calibration as special cases

4.1. Assumptions regarding experimental setup and data

Consider again the system (Eq. (1)) with the optimal y_1 current estimator (Eq. (19)), and expand the input u with a vector d of unknown offsets or disturbances, i.e., use u = [d^T  u_m^T]^T, where u_m is the known vector of manipulated or measured inputs. Let the input u_k be piecewise constant over periods that are much longer than both the time constants in the underlying continuous system and the discretization sampling time, and assume possibly collinear observations y_{1,j} and y_{2,j} at the end of each such period. Also assume that d_j is a white noise sequence, i.e., that the unknown offsets and disturbances are independent from one observation to the next. With a piecewise static input vector u_k and enough time for settlement, it follows from Eq. (1) that the observations will be given by

$$y_{1,j} = \left[ C_1 (I - A)^{-1} B + D_1 \right] \begin{bmatrix} d \\ u_m \end{bmatrix}_j + \sum_{k=-\infty}^{j} v_k g_{1,j-k} + w_{1,j} \qquad (23a)$$

$$y_{2,j} = \left[ C_2 (I - A)^{-1} B + D_2 \right] \begin{bmatrix} d \\ u_m \end{bmatrix}_j + \sum_{k=-\infty}^{j} v_k g_{2,j-k} + w_{2,j}, \qquad (23b)$$

where g_1 and g_2 stand for the impulse responses from v to y_1 and y_2. All measurements are thus linear combinations of d and u_m plus noise, and since we assume a stable system with piecewise constant inputs and a settling time shorter than the data sampling time, this noise will be approximately white.

Note, however, that since the noise terms in Eqs. (23a) and (23b) are partly determined by the common process noise v_k, they will not be independent, as required for the optimal current estimator (Eq. (19)). For calibration purposes it is also a normal procedure to use mean values of the measurements over a certain period of time in order to reduce the noise, but this does not affect the theoretical analysis.

4.2. Least squares estimation

If both d and u_m are completely known, there is no need to utilize the information in the y_2 measurements; we can simply solve Eq. (23a) as an ordinary least squares problem. In our case, however, we consider d as unknown, and the y_2 measurements may then give valuable information about d and indirectly also about y_1. In the following analysis we assume that u_{m,j} is a persistently exciting stochastic signal, and that all data are centralized, i.e., that d_j, u_{m,j}, y_{1,j} and y_{2,j} are stochastic variables with zero mean. For details about centralization and the subsequent modification of the estimator, see, e.g., Ref. [1]. We also assume observations of u_{m,j}, y_{1,j} and y_{2,j} from an informative experiment with samples for j = 1, 2, ..., J.

In order to use the Kalman filter formalism, we model d_j as generated by a white noise sequence e_{1,j} through a pure delay system. In the same way we model the common noise part h_{c,j} in y_{1,j} and y_{2,j} as generated by a white noise sequence e_{2,j}. Expressing y_1 and y_2 as linear combinations of z = [d^T  h_c^T]^T and u_m, we then arrive at the following dynamic system

$$z_{j+1} = \begin{bmatrix} d \\ h_c \end{bmatrix}_{j+1} = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}_j = e_j$$
$$y_{1,j} = L_{11} z_j + L_{12} u_{m,j} + h_{1,j} \qquad (24)$$
$$y_{2,j} = L_{21} z_j + L_{22} u_{m,j} + h_{2,j},$$

where the detailed expressions for the L matrices follow from Eqs. (23a) and (23b), and where h_{1,j} and h_{2,j} are white and independent noise sequences. This is a dynamic system as given in Eq. (1) with A = 0, B = 0 and G = I, and the algebraic Riccati equation (8) then results in

$$P = P_z = R_e = E e_j e_j^T. \qquad (25)$$

From Eq. (7) it follows that the Kalman gain related to the y_2 measurements is

$$K_2^{OE} = R_e L_{21}^T (L_{21} R_e L_{21}^T + R_{22})^{-1}, \qquad (26)$$

where R_{22} = E h_{2,j} h_{2,j}^T. With A = B = 0 and an appropriate change of notation according to Eq. (26), the optimal current estimator (19) now gives the theoretical static estimator

$$\hat{y}_{1,j|j} = L_{11} K_2^{OE} (y_{2,j} - L_{22} u_{m,j}) + L_{12} u_{m,j} = (L_{12} - L_{11} K_2^{OE} L_{22}) u_{m,j} + L_{11} K_2^{OE} y_{2,j}, \qquad (27)$$

or, with \hat{y}_{1,j|j}^T = [u_{m,j}^T \; y_{2,j}^T] \begin{bmatrix} B_1 \\ B_2 \end{bmatrix},

$$B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} = \begin{bmatrix} (L_{12} - L_{11} K_2^{OE} L_{22})^T \\ (L_{11} K_2^{OE})^T \end{bmatrix}. \qquad (28)$$

Without known manipulated inputs, i.e., with u_{m,j} = 0, we have the simplified model

$$z_{j+1} = e_j$$
$$y_{1,j} = L_1 z_j + h_{1,j} \qquad (29)$$
$$y_{2,j} = L_2 z_j + h_{2,j},$$

resulting in the simplified theoretical estimate

$$\hat{y}_{1,j|j} = L_1 R_e L_2^T (L_2 R_e L_2^T + R_{22})^{-1} y_{2,j}, \qquad (30)$$

or, with \hat{y}_{1,j|j}^T = y_{2,j}^T B,

$$B = (L_2 R_e L_2^T + R_{22})^{-1} L_2 R_e L_1^T. \qquad (31)$$

In the same way as with the parameters in Eq. (19), we can find B_1 and B_2 in Eq. (28) or B in Eq. (31) by identification of an OE model. In this special static case, however, we can also find the parameters directly as the solution to a least squares problem. Let us start by comparing the theoretical estimator Eq. (31) with a data based least squares solution. By stacking the collected data y_{1,1}^T, y_{1,2}^T, ..., y_{1,N}^T in a data matrix Y_1 and y_{2,1}^T, y_{2,2}^T, ..., y_{2,N}^T in a data matrix Y_2, we can express the relation between Y_2 and Y_1 as

$$Y_1 = Y_2 B + E, \qquad (32)$$

resulting in the least squares estimator

$$\hat{B} = (Y_2^T Y_2)^{-1} Y_2^T Y_1 = \left( \frac{1}{N} Y_2^T Y_2 \right)^{-1} \frac{1}{N} Y_2^T Y_1. \qquad (33)$$

For a theoretical analysis we also stack (z_1^T, z_2^T, ..., z_N^T), (h_{1,1}^T, h_{1,2}^T, ..., h_{1,N}^T) and (h_{2,1}^T, h_{2,2}^T, ..., h_{2,N}^T) in data matrices Z, E_1 and E_2, and by use of Eq. (29) and for N \to \infty we will then find

$$\frac{1}{N} Y_2^T Y_2 = \frac{1}{N} (Z L_2^T + E_2)^T (Z L_2^T + E_2) \to L_2 R_e L_2^T + R_{22} \qquad (34)$$

and

$$\frac{1}{N} Y_2^T Y_1 = \frac{1}{N} (Z L_2^T + E_2)^T (Z L_1^T + E_1) \to L_2 R_e L_1^T. \qquad (35)$$

Here, we make use of the fact that (1/N) Z^T Z \to R_e and that h_1 and h_2 are independent white noise sequences, which means that (1/N) L_2 Z^T E_2 \to 0, (1/N) L_2 Z^T E_1 \to 0, (1/N) E_2^T Z L_1^T \to 0 and (1/N) E_2^T E_1 \to 0 when N \to \infty. By inserting Eqs. (34) and (35) into Eq. (33), we now find the estimator (31), showing that the estimators (31) and (33) are asymptotically equivalent. This connection between Kalman filtering without dynamics and ordinary LSE was found also in Ref. [9], but then without the general dynamic estimator (19) as a basis, and also limited to the case where L_1 = I (or at least invertible) and h_1 = 0, i.e., the case where y_1 are noise-free measurements of all states in the system (possibly after a similarity transformation).

In a similar although somewhat more involved way we can also show that the estimator

$$\hat{B} = \left( \frac{1}{N} [U_m \; Y_2]^T [U_m \; Y_2] \right)^{-1} \frac{1}{N} [U_m \; Y_2]^T Y_1 \qquad (36)$$

is asymptotically equivalent with Eq. (28).

4.3. Principal component regression

With a large number of y_2 variables and a limited number of observations, the estimators (33) and (36) may have very large variance. In the common case with collinear y_2 variables, we can then make use of the fact that the information can be compressed into a smaller number of latent variables, determined by the total number of independent variables in u_m and z. We then collect all input data in either X = [U_m \; Y_2] as in Eq. (36) or X = Y_2 as in Eq. (33), depending on the problem formulation. By use of an appropriate number of principal components [1,2], the data is then expressed as

$$X \approx T P^T, \qquad (37)$$

where T is the score matrix and P is the loading matrix.

For convenience and due to space limitations we now limit the treatment to the case where u_{m,j} = 0, i.e., to the case where X = [y_{2,1}, y_{2,2}, ..., y_{2,N}]^T. We then replace the measured variables y_{2,j} with latent variables t_j = P^T y_{2,j}, and make use of the fact that P^T P = I. The system (29) is thus replaced by

$$z_{j+1} = e_j$$
$$y_{1,j} = L_1 z_j + h_{1,j} \qquad (38)$$
$$t_j \approx P^T L_2 z_j + P^T h_{2,j}.$$

The theoretical estimator (31) is then replaced by

$$B = P (P^T L_2 R_e L_2^T P + P^T R_{22} P)^{-1} P^T L_2 R_e L_1^T. \qquad (39)$$

By replacing y_2 with T and inserting Eqs. (34) and (35) into Eq. (39), we find the corresponding data based PCR estimator

$$\hat{B} = P (T^T T)^{-1} T^T Y_1 = P (P^T X^T X P)^{-1} P^T X^T Y_1. \qquad (40)$$
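A minimal numerical sketch of the PCR estimator (40), with hypothetical collinear data generated from a low-dimensional latent structure (all dimensions and noise levels are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, a = 500, 50, 3      # observations, collinear y2 variables, components

# Hypothetical data: p collinear variables driven by a latent variables
Z = rng.standard_normal((N, a))
L2 = rng.standard_normal((p, a))
X = Z @ L2.T + 0.05 * rng.standard_normal((N, p))             # X = Y2
Y1 = Z @ np.array([[1.0], [0.5], [-0.3]]) + 0.05 * rng.standard_normal((N, 1))

# Loading matrix P from an SVD of X, keeping a components (eq. (37))
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:a].T              # p x a, with P^T P = I
T = X @ P                 # score matrix

# Data based PCR estimator, eq. (40)
B_pcr = P @ np.linalg.solve(T.T @ T, T.T @ Y1)
```

Because P^T P = I, the two forms in Eq. (40) coincide; the regression is an ordinary least squares fit in the a-dimensional score space.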

4.4. Partial least squares regression

The aim of PLSR is to improve PCR by finding t variables that explain both the X and the Y_1 data, and there exist at least two slightly different PLSR algorithms [1]. Also here we limit the treatment to the case where u_{m,j} = 0, and it is convenient to start with the PLSR method of Martens, which makes use of linear combinations t_M = W^T y_2 (where the weight matrix W is found iteratively). The result of this is that Eq. (29) is replaced by the PLSR_M model

$$z_{j+1} = e_j$$
$$y_{1,j} = L_1 z_j + h_{1,j} \qquad (41)$$
$$t_{M,j} \approx W^T L_2 z_j + W^T h_{2,j}.$$

The theoretical PCR estimator (39) is then replaced by the theoretical PLSR estimator

$$B = W (W^T L_2 R_e L_2^T W + W^T R_{22} W)^{-1} W^T L_2 R_e L_1^T, \qquad (42)$$

while Eq. (40) is replaced by the data based PLSR estimator

$$\hat{B} = W (T_M^T T_M)^{-1} T_M^T Y_1 = W (W^T X^T X W)^{-1} W^T X^T Y_1. \qquad (43)$$

The original PLSR method of Wold uses linear combinations t_W = (W^T P_W)^{-1} W^T y_2, with the same W matrix as Martens, and with a special loading matrix P_W. The model (29) is then replaced by the PLSR_W model

$$z_{j+1} = e_j$$
$$y_{1,j} = L_1 z_j + h_{1,j} \qquad (44)$$
$$t_{W,j} \approx (W^T P_W)^{-1} W^T L_2 z_j + (W^T P_W)^{-1} W^T h_{2,j}.$$

With W (W^T P_W)^{-T} instead of W, the theoretical PLSR estimator (42) becomes

$$B = W (W^T P_W)^{-T} \left[ (W^T P_W)^{-1} (W^T L_2 R_e L_2^T W + W^T R_{22} W) (W^T P_W)^{-T} \right]^{-1} (W^T P_W)^{-1} W^T L_2 R_e L_1^T = W (W^T L_2 R_e L_2^T W + W^T R_{22} W)^{-1} W^T L_2 R_e L_1^T, \qquad (45)$$

while the data based PLSR estimator (43) becomes

$$\hat{B} = W (W^T P_W)^{-T} \left[ (W^T P_W)^{-1} W^T X^T X W (W^T P_W)^{-T} \right]^{-1} (W^T P_W)^{-1} W^T X^T Y_1 = W (W^T X^T X W)^{-1} W^T X^T Y_1. \qquad (46)$$

We see from this that P_W disappears from the estimator expressions, and that the final theoretical as well as the data based estimators are the same for the Wold and Martens algorithms. This is, of course, well known [1], although the relation to the underlying Kalman filter is new.
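A minimal sketch of the iterative weight computation and the data based PLSR estimator (43); the deflation loop below is a NIPALS-style illustration on hypothetical data, and the direct coefficient formula assumes, as in Eq. (43), that the columns of W span the relevant latent subspace:

```python
import numpy as np

def pls_weights(X, y, a):
    """Build the weight matrix W column by column (NIPALS-style),
    deflating X after each component."""
    Xd = X.copy()
    W = np.zeros((X.shape[1], a))
    for i in range(a):
        w = Xd.T @ y                      # covariance-based weight vector
        w = w / np.linalg.norm(w)
        t = Xd @ w                        # score vector
        p = Xd.T @ t / (t.T @ t).item()   # loading vector
        Xd = Xd - t @ p.T                 # deflation
        W[:, i] = w.ravel()
    return W

rng = np.random.default_rng(0)
N, p, a = 400, 30, 2
Z = rng.standard_normal((N, a))
X = Z @ rng.standard_normal((a, p)) + 0.05 * rng.standard_normal((N, p))
Y1 = Z @ np.array([[1.0], [-0.5]]) + 0.05 * rng.standard_normal((N, 1))

W = pls_weights(X, Y1, a)
# Data based PLSR estimator, eq. (43)
M = W.T @ X.T @ X @ W
B_pls = W @ np.linalg.solve(M, W.T @ X.T @ Y1)
```

As shown above for Eqs. (43) and (46), the Martens and Wold score definitions lead to the same regression coefficients, so only W is needed here.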


4.5. Dynamic system PCR and PLSR solutions

The optimal y_1 estimator for dynamic systems given in Eq. (19) may also form a basis for dynamic system solutions using PCR or PLSR. It is then natural to split the secondary measurements into y_{2,k} = [y_{21,k}^T  y_{22,k}^T]^T, where y_{21,k} are the secondary measurements that are linked to y_1 only through a static system. The y_{21,k} measurements are then internally collinear, and they can thus be replaced by latent t variables as in Eqs. (38), (41) and (44), i.e., both PCR and PLSR may be used. Using, e.g., the score definition in the PLSR method of Martens, i.e., t = W^T y_2, the OEC estimator (19) will be replaced by

$$\hat{y}_{1,k|k} = C_1 (I - K_t^{OE} W^T C_{21} - K_{22}^{OE} C_{22}) (qI - A_{est})^{-1} \left( B_{est} u_k + A K_t^{OE} t_k + A K_{22}^{OE} y_{22,k} \right) + C_1 K_t^{OE} (t_k - W^T D_{21} u_k) + C_1 K_{22}^{OE} (y_{22,k} - D_{22} u_k) + D_1 u_k, \qquad (47)$$

where t_k = W^T y_{21,k}, A_{est} = A (I - K_t^{OE} W^T C_{21} - K_{22}^{OE} C_{22}) and B_{est} = B - A K_t^{OE} W^T D_{21} - A K_{22}^{OE} D_{22}. The Kalman gains are here determined as the solution to the Kalman filter equations (7) and (8) with

$$C_2 = \begin{bmatrix} (W^T C_{21})^T & C_{22}^T \end{bmatrix}^T$$

and

$$R_{22} = \begin{bmatrix} E W^T w_{21} w_{21}^T W & E W^T w_{21} w_{22}^T \\ E w_{22} w_{21}^T W & E w_{22} w_{22}^T \end{bmatrix}.$$

If we find the t variables by use of the PLSR method of Wold, we have to replace W^T with (W^T P_W)^{-1} W^T, while the PCR method uses P^T instead of W^T.

When the current estimator (47) is identified by use of, e.g., a prediction error method, also past t_k values will be used as a basis for determining \hat{y}_{1,k|k}, with reduced variance as the expected result, and we can in fact look at and treat the latent variables as ordinary measurement signals. An essential assumption here is that the linear relations between y_{21,k} and t_k given by P^T, W^T or (W^T P_W)^{-1} W^T are time invariant and determined as in the static case, either by PCA or by the iterative PLSR algorithms. Note, however, that time invariance is an essential assumption also in the general estimator (19).

If all or some of the y_{22} measurements follow the same dynamic response except for noise, and thus are internally collinear, such measurements may also be replaced by latent variables in order to reduce the variance in the solution. However, since y_{22} is linked to y_1 through a dynamic system, the iterative PLSR method cannot be expected to work, and we must be content with using SVD or PCA to find these latent variables. They may also be combined with known inputs or other measurements, and with other latent variables found by PCA or PLSR.

With u_k = 0 and y_{22,k} = 0, Eq. (47) is simplified to

$$\hat{y}_{1,k|k} = C_1 (I - K_t^{OE} W^T C_{21}) \left( qI - A (I - K_t^{OE} W^T C_{21}) \right)^{-1} A K_t^{OE} t_k + C_1 K_t^{OE} t_k, \qquad (48)$$

showing the dynamic relation between the collinear time series y_{21} represented by t and the time series y_1.

We end this subsection with a general discussion of dynamic system multivariate calibration methods. The proposed estimator (47) is based on the asymptotically optimal estimator (19). This is in contrast with PCR and PLSR methods for identification of FIR models [4], where also the asymptotical least squares solution is biased, due to truncation as well as lack of noise modeling [3]. It is also in contrast with PLSR methods for identification of ARX models [5], where again the asymptotical least squares solution is biased when the observation error is colored [3,6].

In addition we must consider the fact that an ARX estimator would make use of past y_1 values that are not available as the present problem is formulated in Section 3.1. As for the optimal ARMAX estimator (3), an ARX estimator would then not utilize secondary y_2 information in an optimal way. One obvious effect of this would be that noisy y_{21} measurements collinear with y_1 would be effectively ignored when the identification experiment gives low noise y_1 information.

The fact that FIR and ARX least squares solutions are not asymptotically optimal does not mean, of course, that the PCR and PLSR solutions presented in, e.g., Refs. [4,5] may not give good results in some realistic cases with a limited number of observations. An in-depth comparison between such known solutions and the proposed estimator (47) is, however, beyond the scope of the present paper.

5. Simulation examples

Simulation studies are undertaken by use of MATLAB, primarily the dlsim.m function in the control system toolbox [13], and the prediction error method implemented in the pem.m function in the SI toolbox [14]. With an appropriate OE model specified, the pem.m function identifies the optimal current estimator (19), where the secondary measurements y_2 are also used as input signals. For validation comparisons, the RMSE criterion in Eq. (22) was used, with \hat{y}_{1,k|k} replaced by the appropriate estimate.

5.1. Example 1: a second-order system with a first-order process noise model

As a starting point, the following continuous second-order process model with an additional first-order process noise model was used (e.g., interacting mixing tanks or thermal processes):

$$\dot{x} = \begin{bmatrix} -1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 0 & -1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} u + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} v$$
$$y_1 = [1 \; 0 \; 0] x + w_1 \qquad (49)$$
$$y_2 = [0 \; 1 \; 0] x + w_2.$$

The system was discretized assuming zero-order hold elements on the u and v inputs and a sampling interval T = 0.1, and then simulated with u_k as a filtered PRBS signal with autocovariance r_{uu}(p) = 0.8^{|p|} (as in Ref. [3], Example 5.11, with a = 0.8), i.e., an input that was persistently exciting of sufficient order. The noise sources v_k, w_{1,k} and w_{2,k} were independent and normally distributed white noise sequences with zero mean and variances given below.

The simulated system was identified using the OEP and OEC models (10) and (16) with u_k and y_{2,k} as input signals and y_{1,k} as output signal, using N = 10 000 samples.

The OEP model (10) was specified as

$$nn_{OEP} = [0, [3,3], 0, 0, [3,3], [1,1]], \qquad (50)$$

i.e., a model

$$y_{1,k} = \frac{B_1(q^{-1})}{F_1(q^{-1})} u_k + \frac{B_2(q^{-1})}{F_2(q^{-1})} y_{2,k} + e_k^{OEP} \qquad (51)$$

with

$$B_1(q^{-1}) = b_{11} q^{-1} + b_{12} q^{-2} + b_{13} q^{-3} \qquad (52)$$
$$B_2(q^{-1}) = b_{21} q^{-1} + b_{22} q^{-2} + b_{23} q^{-3} \qquad (53)$$
$$F_1(q^{-1}) = 1 + f_{11} q^{-1} + f_{12} q^{-2} + f_{13} q^{-3} \qquad (54)$$
$$F_2(q^{-1}) = 1 + f_{21} q^{-1} + f_{22} q^{-2} + f_{23} q^{-3}. \qquad (55)$$

The OEC model (Eq. (16)) was specified as

$$nn_{OEC} = [0, [3,4], 0, 0, [3,3], [1,0]], \qquad (56)$$

i.e., the same model as Eq. (51), but with B_2(q^{-1}) altered to

$$B_2(q^{-1}) = b_{20} + b_{21} q^{-1} + b_{22} q^{-2} + b_{23} q^{-3}. \qquad (57)$$

As the main purpose of the simulations was to verify the theory, no attempt was made to find the model order and model structure from the data. The model order can, however, be found by ordinary use of one of the several available subspace identification methods [7], and a systematic method for finding the structure is presented in Ref. [10]. For the OEP and OEC models, no attempt was made to force F_1(q^{-1}) and F_2(q^{-1}) to be identical, which they theoretically should be.

F q1 and F q2 to be identical, which they the- oretically should be.

As a basis for comparisons given a specific experimental condition, each model was identified and validated in M = 100 Monte Carlo runs using independent data sets. In order to limit the influence of local minima problems, each identification and validation given a specific data set was repeated J = 5 times with randomized initial B parameters (b_{il,j+1} = b_{il,j} (1 + 0.5 e), with e as a normal random variable with zero mean and variance 1).

The mean RMSE values and RMSE standard de- viations for Ns10 000, rvs0.1, r22s0.01 and varyingr11values are given in Table 1, showing an obvious agreement between results based on simula- tion and theory. Table 1 also includes theoretical

OEP T

RMSE values Var

( Ž

y1,k<ky1

.

s

(

C P1 C1qr11

Table 1

Validation RMSE mean values and standard deviations and theo-

Ž .

retical mean values for OE models multiplied with 10000 r11 OEP OEPth. OEC OECth.

y8

10 177"5 177 173"6 173

y6

10 177"5 177 173"5 173

y4

10 204"6 203 200"5 200

(11)

Fig. 2. Segment of validation responses for the OEP model Eq.Ž Ž51 using both.. uandy2as inputs dashed, RMSEŽ s0.0273 and.

Ž w x

an OE model using onlyuas input nnOEUs0,3,0,0,3,1 , dotted, RMSEs0.0906 . The experimental conditions are given by. rvs 1,r11s0.0001,r22s0.01 and Ns200, and the ideal validation response is shown by solid line.

and

RMSE^{OEC} = √(Var(ỹ_{1,k|k})) = √(C₁P^{OEC}C₁ᵀ + r₁₁),

with P^{OEP} and P^{OEC} computed according to Eqs. (8) and (20). The results in Table 1 were obtained from N = 10 000 samples. To indicate expected results for a more realistic number of samples, and at the same time to visualize the degree of model misfit behind the RMSE values in Table 1, specific validation responses for models based on N = 200 samples are shown in Fig. 2. Fig. 2 also gives a representative picture of the improvement achieved by including y₂ as an input signal in addition to u.
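The theoretical RMSE expressions above are straightforward to evaluate once the error covariance is available. A minimal numerical sketch, using hypothetical values of C₁, P and r₁₁ (not those behind Table 1):

```python
import numpy as np

def theoretical_rmse(C1, P, r11):
    """Theoretical output RMSE sqrt(C1 P C1' + r11), given an estimation
    error covariance P and the y1 measurement noise variance r11."""
    C1 = np.asarray(C1, dtype=float).ravel()
    return float(np.sqrt(C1 @ P @ C1 + r11))

# Hypothetical numbers for illustration only:
C1 = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
P = 1e-4 * np.eye(6)
rmse = theoretical_rmse(C1, P, r11=1e-4)   # sqrt(3e-4 + 1e-4)
```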

5.2. Example 2: dynamic system PCR and PLSR solutions

For simulations of the dynamical DPCR and DPLSR solutions in Section 4, three independent filtered white noise sequences were generated. The following continuous system of three independent second-order systems was used as a starting point (with a = −1):

    ⎡a 0 0 1 0 0⎤     ⎡0 ⎤
    ⎢0 a 0 0 1 0⎥     ⎢0 ⎥
    ⎢0 0 a 0 0 1⎥     ⎢0 ⎥
ẋ = ⎢0 0 0 a 0 0⎥ x + ⎢v₁⎥
    ⎢0 0 0 0 a 0⎥     ⎢v₂⎥
    ⎣0 0 0 0 0 a⎦     ⎣v₃⎦

y₁ = [1 1 1 0 0 0]x + w₁
y₂ = [L₂₁ 0]x + w₂.   (58)

Here, L₂₁ was a 200 × 3 matrix with uniformly distributed random parameters in the interval (0, 1). The system was discretized assuming zero-order-hold elements on the v inputs and a sampling interval T = 0.1. The system was then simulated with v, w₁ and w₂ as independent and normally distributed white noise sequences with zero mean. The R_v and R₂₂ covariance matrices were diagonal, with uniformly distributed random parameters in the intervals (0, 1) and (0, r₂₂), respectively, while the y₁ variance was r₁₁ = 0.0001. Different values of r₂₂ were used as described below.
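The setup and simulation of this example system can be sketched in numpy along the following lines. This is a sketch under the stated assumptions, not the authors' code: the zero-order-hold discretization uses the standard augmented-matrix-exponential construction, with a simple Taylor-series matrix exponential that is adequate here because the sampled matrix has small norm.

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential by truncated Taylor series (adequate here,
    since ||M T|| is small for T = 0.1)."""
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

a, T = -1.0, 0.1
# Continuous-time model (58): three independent second-order systems,
# dx/dt = A x + G v, with the noise v entering the last three states.
A = np.block([[a * np.eye(3), np.eye(3)],
              [np.zeros((3, 3)), a * np.eye(3)]])
G = np.vstack([np.zeros((3, 3)), np.eye(3)])

# Zero-order-hold discretization via the augmented matrix exponential:
# expm([[A, G], [0, 0]] T) = [[Ad, Gd], [0, I]].
M = np.zeros((9, 9))
M[:6, :6], M[:6, 6:] = A * T, G * T
Md = expm_taylor(M)
Ad, Gd = Md[:6, :6], Md[:6, 6:]

# Outputs: y1 sums the three first states; y2 uses a random 200x3
# loading matrix L21 with parameters uniform in (0, 1).
rng = np.random.default_rng(1)
C1 = np.hstack([np.ones(3), np.zeros(3)])
C2 = np.hstack([rng.uniform(0.0, 1.0, (200, 3)), np.zeros((200, 3))])

# Simulate N samples with white, zero-mean Gaussian noise sequences.
N, r11, r22 = 200, 1e-4, 0.01
rv_std = np.sqrt(rng.uniform(0.0, 1.0, 3))      # diagonal Rv
r22_std = np.sqrt(rng.uniform(0.0, r22, 200))   # diagonal R22
x = np.zeros(6)
Y1, Y2 = np.zeros(N), np.zeros((N, 200))
for k in range(N):
    Y1[k] = C1 @ x + np.sqrt(r11) * rng.standard_normal()
    Y2[k] = C2 @ x + r22_std * rng.standard_normal(200)
    x = Ad @ x + Gd @ (rv_std * rng.standard_normal(3))
```

Since A = aI + N with N nilpotent, the exact discretized transition matrix is e^{aT}(I + TN), which gives a convenient check on the discretization.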

The simulations started with r₂₂ = 0.01 and N = 200. In order to find the appropriate number of components, the static PCR and PLSR estimators (40) and (46) were first determined for different numbers of components A. In addition, the dynamical DPCR and DPLSR estimates according to Eq. (48) were identified using the OE model (see Ref. [14] for the definition of nn)

nn = [0, [2, ..., 2], 0, 0, [2, ..., 2], [0, ..., 0]].   (59)

Each model was identified in M = 10 Monte Carlo runs, with validation against independent data sets with N = 200 samples. The resulting mean RMSE values are plotted in Fig. 3, where we find the optimal number of components A = 3. This is not surprising, since the system has three independent noise sources. Fig. 3 also indicates that PLSR is slightly better than PCR, and that the dynamic solutions are better than the static ones.
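For reference, a static PCR estimator truncated to A components can be sketched as follows. This is a generic SVD-based illustration with synthetic collinear data, not the paper's estimator (40); all names and dimensions are hypothetical.

```python
import numpy as np

def pcr_estimator(Y2, y1, A):
    """Principal component regression: regress y1 on the first A principal
    components of the centered secondary measurements Y2, returning b so
    that y1_hat = (Y2 - mean(Y2)) @ b + mean(y1)."""
    Y2c = Y2 - Y2.mean(axis=0)
    y1c = y1 - y1.mean()
    U, s, Vt = np.linalg.svd(Y2c, full_matrices=False)
    # Least squares fit restricted to the A-dimensional score subspace.
    return Vt[:A].T @ ((U[:, :A].T @ y1c) / s[:A])

# Toy data in the spirit of the example: collinear Y2 driven by three
# independent latent sources, so A = 3 components suffice.
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 3))                  # three noise sources
Y2 = X @ rng.uniform(0.0, 1.0, (3, 50)) + 0.01 * rng.standard_normal((200, 50))
y1 = X.sum(axis=1) + 0.01 * rng.standard_normal(200)

b = pcr_estimator(Y2, y1, A=3)
resid = (Y2 - Y2.mean(axis=0)) @ b + y1.mean() - y1
rmse = float(np.sqrt(np.mean(resid ** 2)))
```

Because the 50 secondary variables are driven by only three sources, truncating at A = 3 removes the collinearity-induced variance inflation that a full least squares fit would suffer from.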

Fig. 3. RMSE mean values as a function of the number of components used in the PCR, PLSR, DPCR and DPLSR models for r₂₂ = 0.01, based on 10 Monte Carlo runs with N = 200 samples.
