Parameter estimation for externally simulated thermal network models


Contents lists available at ScienceDirect

Energy & Buildings

journal homepage: www.elsevier.com/locate/enbuild


O.M. Brastein*, B. Lie, R. Sharma, N.-O. Skeie

Department of Electrical Engineering, Information Technology and Cybernetics, University of South-Eastern Norway, Porsgrunn N-3918, Norway

Article info

Article history:
Received 17 November 2018
Revised 14 February 2019
Accepted 11 March 2019
Available online 11 March 2019

Keywords:
Grey-box models
Stochastic differential equations
Parameter estimation
Profile likelihood
Thermal network models
Unscented Kalman filter
Ensemble Kalman filter

Abstract

Obtaining accurate dynamic models of building thermal behaviour requires a statistically solid foundation for estimating unknown parameters. This is especially important for thermal network grey-box models, since all their parameters normally need to be estimated from data. One attractive solution is to maximise the likelihood function, under the assumption of Gaussian distributed residuals. This technique was developed previously and implemented in the Continuous Time Stochastic Modelling framework, where an Extended Kalman Filter is used to compute residuals and their covariances. The main result of this paper is a similar method applied to a thermal network grey-box model of a building, simulated as an electric circuit in an external tool. The model is described as a list of interconnected components without deriving explicit equations. Since this model implementation is not differentiable, an alternative Kalman filter formulation is needed. The Unscented and Ensemble Kalman Filters are designed to handle non-linear models without using Jacobians, and can therefore also be used with models in a non-differentiable form.

Both Kalman filter implementations are tested and compared with respect to estimation accuracy and computation time. The Profile Likelihood method is used to analyse structural and practical parameter identifiability. This method is extended to compute two-dimensional profiles, which can also be used to analyse parameter interdependence by providing insight into the parameter space topology.

This is an open access article under the CC BY-NC-ND license. (http://creativecommons.org/licenses/by-nc-nd/4.0/)

1. Introduction

1.1. Background

The heating and cooling of buildings consumes a significant part of the world's total energy production. While new building materials and techniques may reduce the energy consumption of buildings, the renewal rate of buildings is low [1]. Hence, it is important to study methods that can also reduce energy consumption in existing buildings.

Building Energy Management Systems (BEMS) that use advanced model-based control methods [2] to forecast the temperature variations of a building, and thereby predict an optimal sequence of control inputs, are a promising means of reducing energy consumption. Since the model's prediction accuracy directly influences the efficiency of such methods, it is important to develop accurate models of building thermal behaviour. In addition to describing the time evolution of the system states and outputs, a good model must accommodate descriptions of both measurement noise and process noise [3,4]. This requires a statistically solid framework for estimating unknown parameters [5].

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

* Corresponding author.

E-mail address: [email protected] (O.M. Brastein).

Thermal network models are often used to model the thermal behaviour of buildings [1,6–8]. Implemented as Resistor–Capacitor equivalent circuits, these models offer an intuitive model design based on a cognitive understanding of the thermal physics involved. Since, typically, all parameters of such models must be identified from data, it is important to investigate parameter identifiability prior to assuming physical interpretation of the estimated parameter values [8].

1.2. Previous work

1.2.1. Modelling of dynamic systems

Models are sometimes classified based on the level of physical insight used in their derivation. If the model is mechanistic, i.e., based purely on physical equations, it is classified as white-box. Such models excel at describing non-linear state transitions and measurements. They also tend to generalise well between similar systems [5,9]. An alternative approach is the use of system identification (SID) methods [3,4,10–12], where a predetermined model structure with unknown coefficients is calibrated using measurements of the system inputs and outputs. This results in a black-box model in which no prior physical insight is used, except in the choice of input and output measurements, sample time, and the approximate model complexity. These models tend to have better prediction accuracy, but less capability to generalise [5,9]. SID methods tend to provide better statistics on the model uncertainty, which are typically computed during the calibration process [3–5]. A third, intermediate, possibility is the grey-box model, which is based on a simplified model structure constructed using naive physical knowledge of the system. Model parameters are calibrated from measurements of the system, similarly to black-box models. Grey-box models are often treated in a stochastic framework [5]. It could be argued that most white-box models include some approximations and/or need calibration of certain parameters. Hence, they can benefit from the application of stochastic grey-box calibration methods. This approach has indeed been claimed as a natural framework for modelling dynamic systems in general [13].

https://doi.org/10.1016/j.enbuild.2019.03.018

1.2.2. The CTSM framework

Estimation of parameters is essentially an optimisation problem, which requires a well-defined objective function. Several alternatives are used in the literature, such as the deterministic simulation error approach [1]. A statistically solid alternative for stochastic grey-box models is found in [5,14], which is based on maximising the likelihood function evaluated by computing residuals in a Kalman Filter. This method has been previously developed in a number of publications [5,14–16] and implemented in the Continuous Time Stochastic Modelling (CTSM) framework [15]. In CTSM, the residuals needed to evaluate the likelihood function are computed using an Extended Kalman Filter (EKF) with subsampling of the state transition equations to improve response to non-linear models [5,15]. The EKF is based on linearising the state transition and/or measurement equations, which requires that the model equations are differentiable [17–19].

1.2.3. Identifiability

Since thermal network building models are partially based on physical knowledge, it is often suggested that the parameters can be assigned a physical interpretation [1,5,6]. This assumption should, however, be verified in the context of parameter identifiability [3,20]. It is well known that models can contain parameters that are structurally non-identifiable [3,20]. Further, lack of proper excitation of the system during data acquisition may lead to practical non-identifiability [3,8,20–22]. While the model structure may be designed such that the parameters are intended to have a specific physical meaning, it is not certain that the estimated parameters support this assumption. A good tool for identifiability analysis is the profile likelihood method [8,21,22].

1.3. Overview of paper

In this paper, a resistor-capacitor equivalent thermal network model of a building is expressed as a list of interconnected electrical components. The model is simulated in an external tool without deriving explicit model equations, hence the model cannot be differentiated. This is motivated by the need to simplify experimentation with different model structures in a way that could potentially be automated. The parameter estimation method from the CTSM framework is adapted to non-differentiable models, which requires an alternative to the EKF for computing residuals. Both the Unscented Kalman Filter (UKF) [18] and Ensemble Kalman Filter (EnKF) [23] are compared and considered for the estimation of residuals. The explicit model equations are also derived on standard linear form, and used with a standard Kalman Filter as a baseline for comparison. Observe that while the model used here is linear, the method is not restricted to linear models; the externally simulated state transitions could well be non-linear.

A profile likelihood approach is used [22] to analyse parameter identifiability. The method is extended to create two-dimensional profiles in the form of topological heat maps. These 2D plots are computed for all combinations of parameters. In addition to diagnosing the identifiability of the parameters, these plots allow detection of parameter interdependence.

The paper is organised as follows. The theoretical basis is discussed in Section 2. The model, external simulator and experimental set-up are presented in Section 3, and the results are presented and discussed in Section 4.

2. Theoretical basis

2.1. Stochastic model parameter estimation

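Estimation of parameters for a known model structure [17] can be defined as solving the optimisation problem:

$$\hat{\theta} = \arg\min_{\theta} \, g(\theta; \mathcal{M}, \mathcal{K}, \mathcal{A}) \quad \text{s.t.} \quad \theta \in \Theta \tag{1}$$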

Here, $\mathcal{M}$ is a predetermined model structure, which is parametrised by $\theta \in \Theta$, where $\Theta \subseteq \mathbb{R}^{n_\theta}$ is a set of feasible values for the model parameters that form inequality constraints for the optimisation problem in Eq. (1). $\mathcal{K}$ represents the experimental conditions, including a set of measurements of system inputs and outputs. These measurements are used to evaluate the objective function $g$ when $\theta$ is varied over the feasible set $\Theta$ by a numerical optimisation algorithm $\mathcal{A}$. In the sequel, the algorithm Constrained Optimisation By Linear Approximation (COBYLA) [24] is used. This algorithm is gradient free, and hence well suited to solving Eq. (1). COBYLA also supports inequality constraints, which can be used to impose the limits of the feasible region $\Theta$ on the parameter estimates.

Since the model structure $\mathcal{M}$ is a representation of a system $\mathcal{S}$, it is often assumed that $\mathcal{S} \in \mathcal{M}(\Theta)$ and that consequently there exists a true parameter vector $\theta^*$ such that $\mathcal{M}(\theta^*) = \mathcal{S}$. However, this is rarely the case, especially for simplified grey-box models based on a naive physical understanding of the system $\mathcal{S}$. Typically, the estimate $\hat{\theta}$ depends on the amount of dynamic information in $\mathcal{K}$, the choice of objective function $g$, and to some extent on the optimisation algorithm $\mathcal{A}$. Hence, it is necessary to analyse the identifiability of the estimated parameters. This topic is further discussed in Section 2.4.

Next, define the continuous time input $u_t \in \mathbb{R}^{n_u}$ and output $y_t \in \mathbb{R}^{n_y}$, and the corresponding ordered sequences of discrete time measurements $u_k$ and $y_k$ taken from the system $\mathcal{S}$:

$$y_{[N]} = [y_0, y_1, \ldots, y_N] \tag{2}$$

$$u_{[N]} = [u_0, u_1, \ldots, u_N] \tag{3}$$

Here, the integer subscripts $k = 0, 1, \ldots, N$ denote the discrete time sampling instants, and the subscript enclosed in $[\cdot]$ is used to indicate an ordered sequence.

A grey-box model can be expressed as a continuous time stochastic differential equation (SDE) with a discrete time measurement equation; adopting the notation of [5]:

$$dx_t = f(x_t, u_t, t, \theta)\,dt + \sigma(u_t, t, \theta)\,d\omega_t \tag{4}$$

$$y_k = h(x_k, u_k, t_k, \theta) + e_k \tag{5}$$

where $t \in \mathbb{R}$ is the time variable and $x_t \in \mathbb{R}^{n_x}$ is the continuous time state vector. The first and second terms in the state transition equation, given in Eq. (4), are commonly called the drift and diffusion term, respectively [5,25]. The diffusion term expresses the process noise as the function $\sigma$ multiplied with the differential of a standard Wiener process $\omega_t$. The discrete time measurement equation is given in Eq. (5).
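To make the roles of the drift and diffusion terms in Eq. (4) concrete, the following minimal sketch simulates a scalar SDE with the Euler–Maruyama scheme. The scheme, the function name `euler_maruyama`, and the example drift and diffusion functions are illustrative assumptions and are not taken from [5].

```python
import math
import random


def euler_maruyama(f, sigma, x0, u, dt, n_steps, rng):
    """Simulate dx = f(x, u, t) dt + sigma(u, t) dw for a scalar state.

    f and sigma are hypothetical callables standing in for the drift and
    diffusion terms; the input u is held constant for simplicity.
    """
    x, t = x0, 0.0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment with variance dt
        x = x + f(x, u, t) * dt + sigma(u, t) * dw
        t += dt
        path.append(x)
    return path


# Example: first-order decay towards the input, with additive process noise.
def drift(x, u, t):
    return -(x - u)


def diffusion(u, t):
    return 0.2


path = euler_maruyama(drift, diffusion, x0=1.0, u=0.0, dt=0.1,
                      n_steps=10, rng=random.Random(0))
```

Setting the diffusion to zero reduces the scheme to a plain forward Euler simulation of the drift term, which is a convenient sanity check.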

2.2. Maximum likelihood

This section gives a summary of the theoretical basis adopted from the CTSM framework [5,14,15]. The objective function $g$ in Eq. (1) can be derived from the likelihood function, which is defined as the probability of observing the measurement sequence $y_{[N]}$ when $\theta$ and $\mathcal{M}$ are known, i.e.:

$$L(\theta; y_{[N]}, \mathcal{M}) = p(y_{[N]} \mid \theta, \mathcal{M}) \tag{6}$$

In the sequel, the model structure $\mathcal{M}$ is implicitly assumed known and omitted from the condition. By application of the rule $P(A \cap B) = P(A \mid B)P(B)$ [25], Eq. (6) can be expanded such that:

$$L(\theta; y_{[N]}) = \left( \prod_{k=1}^{N} p(y_k \mid y_{[k-1]}, \theta) \right) p(y_0 \mid \theta) \tag{7}$$

The diffusion term in Eq. (4), which is assumed to be additive and independent of the state $x$, is driven by a Wiener process whose differential is Gaussian distributed [5]. Hence, it is reasonable to assume that the conditional probabilities in Eq. (7) can be approximated by Gaussian distributions [5,15]. This assumption can be checked during model validation by testing the residuals for normality [3,5]. The likelihood can then be expressed as a multivariate Gaussian distribution [5],

$$L(\theta; y_{[N]}) = \left( \prod_{k=1}^{N} \frac{\exp\left( -\frac{1}{2} \epsilon_k^T E_{k|k-1}^{-1} \epsilon_k \right)}{\sqrt{\det\left( E_{k|k-1} \right)} \left( \sqrt{2\pi} \right)^{n_y}} \right) p(y_0 \mid \theta) \tag{8}$$

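A Kalman Filter may be used to estimate the quantities

$$\hat{y}_{k|k-1} = E\left[ y_k \mid y_{[k-1]}, \theta \right] \tag{9}$$

$$\epsilon_k = y_k - \hat{y}_{k|k-1} \tag{10}$$

$$E_{k|k-1} = E\left[ \epsilon_k \epsilon_k^T \right] \tag{11}$$

In the CTSM framework, an EKF is used. In Section 2.3 the alternative use of UKF and EnKF is discussed.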

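Eq. (8) can further be simplified by taking the negative of the logarithm, defining the log likelihood function $\ell(\theta; y_{[N]})$:

$$\ell(\theta; y_{[N]}) = -\ln\left( L(\theta; y_{[N]}) \right) \tag{12}$$

The solution to the optimisation problem is not affected since

$$\arg\max_{\theta \in \Theta} L(\theta; y_{[N]}) = \arg\min_{\theta \in \Theta} \ell(\theta; y_{[N]}) \tag{13}$$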

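Table 1
Comparing equations for UKF and EnKF.

Definitions and initialisation

UKF:
$\zeta_m^{(0)} = \frac{\lambda}{\lambda + n_x}$
$\zeta_c^{(0)} = \frac{\lambda}{\lambda + n_x} + (1 - \alpha^2 + \beta)$
$\zeta_m^{(i)} = \zeta_c^{(i)} = \frac{1}{2(\lambda + n_x)}, \quad i \in \{1, \ldots, 2n_x\}$
$\lambda = \alpha^2 (n_x + \kappa) - n_x$
$\hat{x}_{0|0} = E[x_0] = \bar{x}_0$
$X_{0|0} = \mathrm{Var}\left[ x_0 - \hat{x}_{0|0} \right] = X_0$

EnKF:
$w_k^{(i)} \sim N(\bar{w}_k, W_k), \quad i \in \{1, \ldots, n_p\}$
$v_k^{(i)} \sim N(\bar{v}_k, V_k), \quad i \in \{1, \ldots, n_p\}$
$x_{0|0}^{(i)} \sim N(\bar{x}_0, X_0), \quad i \in \{1, \ldots, n_p\}$
$\hat{x}_{0|0} = \frac{1}{n_p} \sum_{i=1}^{n_p} x_{0|0}^{(i)}$
$X_{0|0} = \frac{1}{n_p - 1} \sum_{i=1}^{n_p} \left( x_{0|0}^{(i)} - \hat{x}_{0|0} \right) (\ldots)^T$

State propagation

UKF:
$x_{k-1|k-1}^{\{2n_x+1\}} = \varsigma\left( \hat{x}_{k-1|k-1}, X_{k-1|k-1} \right)$
$x_{k|k-1}^{(i)} = f\left( x_{k-1|k-1}^{(i)}, u_{k-1}, \bar{w}_k \right), \quad i \in \{0, \ldots, 2n_x\}$
$\hat{x}_{k|k-1} = \sum_{i=0}^{2n_x} \zeta_m^{(i)} x_{k|k-1}^{(i)}$
(a) $X_{k|k-1} = \sum_{i=0}^{2n_x} \zeta_c^{(i)} \left( x_{k|k-1}^{(i)} - \hat{x}_{k|k-1} \right) (\ldots)^T + W_k$

EnKF:
$x_{k|k-1}^{(i)} = f\left( x_{k-1|k-1}^{(i)}, u_{k-1}, w_{k-1}^{(i)} \right), \quad i \in \{1, \ldots, n_p\}$
$\hat{x}_{k|k-1} = \frac{1}{n_p} \sum_{i=1}^{n_p} x_{k|k-1}^{(i)}$
(b) $X_{k|k-1} = \frac{1}{n_p - 1} \sum_{i=1}^{n_p} \left( x_{k|k-1}^{(i)} - \hat{x}_{k|k-1} \right) (\ldots)^T$

Measurement estimate

UKF:
$x_{k|k-1}^{\{2n_x+1\}} = \varsigma\left( \hat{x}_{k|k-1}, X_{k|k-1} \right)$
$y_{k|k-1}^{(i)} = h\left( x_{k|k-1}^{(i)}, u_{k-1}, \bar{v}_k \right), \quad i \in \{0, \ldots, 2n_x\}$
$\hat{y}_{k|k-1} = \sum_{i=0}^{2n_x} \zeta_m^{(i)} y_{k|k-1}^{(i)}$

EnKF:
$y_{k|k-1}^{(i)} = h\left( x_{k|k-1}^{(i)}, u_{k-1}, v_{k-1}^{(i)} \right), \quad i \in \{1, \ldots, n_p\}$
$\hat{y}_{k|k-1} = \frac{1}{n_p} \sum_{i=1}^{n_p} y_{k|k-1}^{(i)}$

Innovation and cross covariance

UKF:
$Z_{k|k-1} = \sum_{i=0}^{2n_x} \zeta_c^{(i)} \left( x_{k|k-1}^{(i)} - \hat{x}_{k|k-1} \right) \left( y_{k|k-1}^{(i)} - \hat{y}_{k|k-1} \right)^T$
$E_{k|k-1} = \sum_{i=0}^{2n_x} \zeta_c^{(i)} \left( y_{k|k-1}^{(i)} - \hat{y}_{k|k-1} \right) (\ldots)^T + V_k$
$K_k = Z_{k|k-1} E_{k|k-1}^{-1}$

EnKF:
$Z_{k|k-1} = \frac{1}{n_p - 1} \sum_{i=1}^{n_p} \left( x_{k|k-1}^{(i)} - \hat{x}_{k|k-1} \right) \left( y_{k|k-1}^{(i)} - \hat{y}_{k|k-1} \right)^T$
$E_{k|k-1} = \frac{1}{n_p - 1} \sum_{i=1}^{n_p} \left( y_{k|k-1}^{(i)} - \hat{y}_{k|k-1} \right) (\ldots)^T$
$K_k = Z_{k|k-1} E_{k|k-1}^{-1}$

A posteriori update (c)

UKF:
$\epsilon_{k|k-1} = y_k - \hat{y}_{k|k-1}$
$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \epsilon_{k|k-1}$
$X_{k|k} = X_{k|k-1} - K_k E_{k|k-1} K_k^T$

EnKF:
$x_{k|k}^{(i)} = x_{k|k-1}^{(i)} + K_k \left( y_k - y_{k|k-1}^{(i)} \right), \quad i \in \{1, \ldots, n_p\}$
(b) $\hat{x}_{k|k} = \frac{1}{n_p} \sum_{i=1}^{n_p} x_{k|k}^{(i)}$
(b) $X_{k|k} = \frac{1}{n_p - 1} \sum_{i=1}^{n_p} \left( x_{k|k}^{(i)} - \hat{x}_{k|k} \right) (\ldots)^T$

(a) Assuming affine noise (see Remark 3).
(b) Can be omitted (see Remark 5).
(c) Mathematically equivalent but not interchangeable (see Remark 6).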


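Finally, by conditioning on knowing $y_0$, and eliminating the scaling constants $\frac{1}{2}$ from $\ell(\theta; y_{[N]})$, the objective function from Eq. (1) is given as:

$$g(\theta; \mathcal{M}, \mathcal{K}) = \sum_{k=1}^{N} \epsilon_k^T E_{k|k-1}^{-1} \epsilon_k + \ln\left( \det\left( E_{k|k-1} \right) \right) \tag{14}$$

where the constant term $c = N \cdot n_y \cdot \ln(2\pi)$ is dropped.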
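As an illustration of how Eq. (14) might be evaluated once a Kalman filter has produced the residuals and their covariances, the sketch below uses a Cholesky factorisation to obtain both the quadratic form and the log determinant. The function names `cholesky` and `objective`, and the numbers in the example, are assumptions made for this sketch; it is not the implementation used in CTSM or in this paper.

```python
import math


def cholesky(M):
    """Lower-triangular L with L L^T = M, for a symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(M[i][i] - s)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L


def objective(residuals, covariances):
    """Evaluate g = sum_k [eps_k^T E_k^{-1} eps_k + ln det E_k], cf. Eq. (14)."""
    g = 0.0
    for eps, E in zip(residuals, covariances):
        L = cholesky(E)
        # Forward substitution: solve L z = eps, so eps^T E^{-1} eps = z^T z.
        z = []
        for i in range(len(eps)):
            s = sum(L[i][k] * z[k] for k in range(i))
            z.append((eps[i] - s) / L[i][i])
        quad = sum(v * v for v in z)
        log_det = 2.0 * sum(math.log(L[i][i]) for i in range(len(eps)))
        g += quad + log_det
    return g


# Two time steps with n_y = 2 outputs (made-up numbers for illustration).
eps_seq = [[2.0, 0.0], [0.0, 1.0]]
E_seq = [[[4.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
g_value = objective(eps_seq, E_seq)
```

Working with the Cholesky factor avoids forming an explicit matrix inverse, which is a common numerical choice rather than a requirement of the method.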

2.3. Alternative KF formulations

The popularity of the Kalman Filter has led to a number of adaptions. The Extended Kalman Filter (EKF) is perhaps the most common such adaption and is used in [5]. In the sequel, two other well known KF variations are outlined: the Unscented Kalman Filter (UKF) [18] and the Ensemble Kalman Filter (EnKF) [23]. In addition to better approximations for non-linear models, UKF and EnKF dispense with the computation of Jacobians and therefore do not require the model to be differentiable [18]. Both filters are listed and compared in Table 1.

Given the SDE for the state transition as in Eq. (4), the time evolution of the probability density function (pdf) of the state, $p(x, t)$, is described by the Fokker–Planck equation [23], also known as the Kolmogorov forward equation [5]. The multi-dimensional Fokker–Planck equation [25] can be expressed as

$$\frac{\partial p(x, t)}{\partial t} + \sum_{i} \frac{\partial}{\partial x_i} \left( f_i(x_t, u_t, t, \theta) \, p(x, t) \right) = \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial x_i \partial x_j} \left( p(x, t) \left( \sigma W \sigma^T \right)_{ij} \right) \tag{15}$$

where $f_i$ is the $i$th component of the state transition model.

In the EKF, the linearised model is used to approximate the first moments of this pdf [23] by a Taylor series expansion truncated after the first term [17,19]. In both UKF and EnKF, the Fokker–Planck equation is instead solved by approximating the solution to Eq. (15) using a set of state realisations. The key difference between the UKF and EnKF is in how that set is constructed. The UKF draws its state realisation set, called sigma points, using the unscented transform (UT). The UT of an expected state $\bar{x}$ with covariance $X$ deterministically computes a set of sigma points $x^{\{N\}} = \{ x^{(i)} : i = 0, 1, \ldots, N \}$, where the shorthand $\{\cdot\}$ superscript indicates a set and a superscript $(\cdot)$ denotes a member. For convenience of notation, a UT operator $\varsigma(\bar{x}, X)$ that returns a set of $N = 2n_x + 1$ sigma points is defined as

$$x^{(0)} = \hat{x} \tag{16}$$

$$x^{(i)} = \hat{x} + \left( \sqrt{(n_x + \lambda) X} \right)_i, \quad i \in \{1, \ldots, n_x\} \tag{17}$$

$$x^{(n_x + i)} = \hat{x} - \left( \sqrt{(n_x + \lambda) X} \right)_i, \quad i \in \{1, \ldots, n_x\} \tag{18}$$

The square root is often implemented using a Cholesky decomposition, and the subscript $i$ denotes the $i$-th column [17,18]. Note that there are different versions of the UT [3,19], where the one presented in Eqs. (16)–(18) is used in the sequel. For a Gaussian random variable (GRV), the UT is known to approximate the pdf $p(x, t)$ to third order accuracy, and to the second order for non-Gaussian random variables [17]. The introduction of $\lambda = \alpha^2 (n_x + \kappa) - n_x$ in Eqs. (16)–(18) gives a set of tuning parameters that can improve approximations of higher order moments [17–19].

In contrast to the deterministic UT, the EnKF represents the state pdf using a Monte Carlo (MC) sampling method [17,18,23]. The pdf is approximated as $p(x, t) = \frac{dN}{n_p}$, where $dN$ is the number of state realisations in some small unit volume and $n_p$ is the total number of realisations [23]. The set of realisations, i.e., the ensemble, is initially drawn at random using the mean and covariance of the initial state. Subsequently, each realisation is propagated as a distinct trajectory, thus making the EnKF equivalent to using a Markov Chain Monte Carlo (MCMC) method to solve the Fokker–Planck equation [23].
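A single EnKF predict and update step, following the EnKF column of Table 1 for the scalar case, might look like the sketch below. The function `enkf_step`, the toy transition and measurement functions, and all numerical values are assumptions for illustration only. The point is that `f` and `h` are called as black boxes, so they could equally wrap an external simulator.

```python
import random


def enkf_step(ensemble, f, h, u, y_meas, w_std, v_std, rng):
    """One scalar EnKF step; f and h are black-box callables, no derivatives needed."""
    n_p = len(ensemble)
    # Propagate every member with its own process noise realisation.
    x_prior = [f(x, u, rng.gauss(0.0, w_std)) for x in ensemble]
    # Predicted measurements with individual measurement noise realisations.
    y_prior = [h(x, u, rng.gauss(0.0, v_std)) for x in x_prior]
    x_mean = sum(x_prior) / n_p
    y_mean = sum(y_prior) / n_p
    # Sample cross covariance Z and innovation covariance E.
    Z = sum((x - x_mean) * (y - y_mean)
            for x, y in zip(x_prior, y_prior)) / (n_p - 1)
    E = sum((y - y_mean) ** 2 for y in y_prior) / (n_p - 1)
    K = Z / E
    # Each member is updated individually from a priori to a posteriori.
    x_post = [x + K * (y_meas - y) for x, y in zip(x_prior, y_prior)]
    residual = y_meas - y_mean
    return x_post, residual, E, K


# Toy linear example (hypothetical model and numbers).
def f(x, u, w):
    return 0.9 * x + 0.1 * u + w


def h(x, u, v):
    return x + v


rng = random.Random(42)
ensemble = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ensemble, residual, E, K = enkf_step(ensemble, f, h, u=0.0, y_meas=1.0,
                                     w_std=0.1, v_std=0.3, rng=rng)
```

The returned residual and innovation covariance are the quantities that enter the objective function in Eq. (14).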

2.3.1. Remarks to Table 1

Remark 1. Initialisation for both filters is equivalent if $n_p$ is "large", since the computed ensemble values based on MC sampling converge to the expectation values $\bar{x}_0$ and $X_0$.

Remark 2. In the UKF, the unscented transform is applied twice to compute the sigma points for both a priori and a posteriori state and covariance estimates. In the EnKF, the realisations are drawn only in the initialisation, and subsequently propagated independently.

Remark 3. The process noise $w_k \sim N(\bar{w}_k, W_k)$ and measurement noise $v_k \sim N(\bar{v}_k, V_k)$ enter the UKF and EnKF in different ways. The model in Eqs. (4) and (5) assumes affine noise, hence the noise covariances are added to the respective propagation equations in the UKF. For non-affine noise, there are other adaptions of the UKF, e.g., estimating noise by augmenting the state vector, that can be used [18]. In the EnKF, a random number generator (RNG) is used to draw instances of the noise, which are subsequently used in the state transition and measurement equations for propagation of the ensemble.

Remark 4. If $\zeta_m^{(i)} = \frac{1}{n_p}$ and $\zeta_c^{(i)} = \frac{1}{n_p - 1}$ in the UKF formulation, the corresponding equations for estimating mean and covariance from the realisation set would be identical to EnKF (except for the iteration index) when $n_p$ is large and $\lambda = 0 \leftrightarrow \alpha = 1, \kappa = 0$.

Remark 5. In order to show the similarity of UKF and EnKF, both filters are formulated with expressions for computing a priori and a posteriori covariance for the state estimate. Observe that for the UKF these are needed in order to compute new sets of sigma points, while in the EnKF this computation can be omitted. Indeed, a fundamental advantage of the EnKF is that it does not require explicit computation of the a priori and a posteriori state estimate covariance matrices, but rather propagates them as approximations in the ensemble. This is an advantage of the EnKF for models with a high number of states.

Remark 6. The EnKF a posteriori update of state realisations and covariance can be shown to be equivalent to the corresponding a posteriori update in the UKF. However, since EnKF treats the set of realisations as independent state trajectories, the ensemble must be updated from a priori to a posteriori state estimates. Hence, the two formulations are not interchangeable, despite being mathematically equivalent.

Remark 7. UKF has three hyperparameters, $\alpha$, $\kappa$ and $\beta$; default tunings are suggested for standard noise models in the UKF literature. The EnKF has only one hyperparameter: the number of realisations $n_p$.

2.4. Profile likelihood

Parameter estimates are often reported as a point in the parameter space $\Theta$, or as a confidence interval [26] with some stated confidence $\alpha$. An alternative solution is to present the distribution of the parameters over the feasible range $\Theta$. Since the estimation of parameters is based on the likelihood function in Eq. (6), one attractive choice for creating parameter distributions is the profile likelihood (PL) method presented in [8,21,22]. This approach was also suggested by the authors of CTSM [27,28]. The PL method explores the parameter space by optimising the parameters in two steps, rather than simultaneously as in Eq. (1). For simplicity of notation, the dependence on $y_{[N]}$ is omitted from the log likelihood function $\ell(\theta; y_{[N]})$ in the sequel. The profile likelihood $PL(\theta_i)$ is defined as the minimum log likelihood for $\theta_i$ when the remaining parameters are freely optimised [22,29]:

$$PL(\theta_i) = \min_{\theta_{j \neq i}} g\left( \theta_{j \neq i}; \mathcal{M}, \mathcal{K}, \theta_i \right) \tag{19}$$

Values of $\theta_i$ must be chosen prior to optimising the remaining $\theta_{j \neq i}$ [22]. A straightforward solution, if the objective function $g$ is well behaved within the constraints of $\Theta$, is to use a brute force approach with an even sampling of $\theta_i$. Alternatively, a two-sided gradient descent algorithm, using a freely optimised parameter vector as a starting point, can be applied [22,30]. The resulting likelihood distribution can be plotted as a function of $\theta_i$ and subsequently analysed according to the definitions of structural and practical identifiability for likelihood-based confidence intervals [8]. Unlike the asymptotic confidence interval, which is based on the curvature of the likelihood function by computation of the Hessian [8,22], the likelihood-based confidence interval is computed by applying a threshold to the likelihood function to compute a confidence region [22,29]. Let

$$\left\{ \theta : \ell(\theta) - \ell(\hat{\theta}) < \Delta_\alpha \right\}, \quad \Delta_\alpha = \chi^2(\alpha, n_{df}) \tag{20}$$

where $\hat{\theta}$ is a freely estimated, presumed optimal, parameter vector, and the threshold $\Delta_\alpha$ is the $\alpha$ percentile of the $\chi^2$-distribution with $n_{df}$ degrees of freedom. It follows from Wilks' theorem [31] that the logarithm of the likelihood ratio test statistic $\Lambda$

$$2\ln(\Lambda) = 2\ln\left( \frac{L(\theta)}{L(\hat{\theta})} \right) = \ell(\theta) - \ell(\hat{\theta}) \tag{21}$$

can be used to compare two models. The difference in log likelihood $\ell(\theta) - \ell(\hat{\theta})$ is asymptotically $\chi^2$-distributed [22,32], with $n_{df}$ equal to the difference in the number of free parameters between $\theta$ and $\hat{\theta}$. Hence, the PL method uses a $\chi^2$ threshold with $n_{df} = 1$. This form of confidence interval allows interpretation of structural and practical identifiability by inspection of the upper and lower confidence boundaries [22]. If $\ell(\theta)$ is lower than the threshold in both directions, i.e., the interval at the stated confidence level is unbounded ($\pm\infty$), the parameter is classified as structurally non-identifiable [22]. If $\ell(\theta)$ is bounded in one direction, this indicates practical non-identifiability [22,29]. Profile likelihood plots are interpreted similarly. If the plot is lower than the confidence threshold in both directions or only one, this indicates structural or practical non-identifiability, respectively.
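The brute force variant of the PL method and the threshold test of Eq. (20) can be sketched as follows. This is a toy illustration under stated assumptions: the objective `g` is hypothetical, the inner optimisation over the remaining parameter is replaced by a grid search instead of a numerical optimiser such as COBYLA, and the threshold is the tabulated 95 % quantile of the $\chi^2$-distribution with one degree of freedom.

```python
CHI2_95_1DF = 3.841  # 95 % quantile of the chi-squared distribution, 1 dof


def profile_likelihood(g, grid_i, grid_j):
    """Brute-force profile: for each fixed theta_i, minimise g over theta_j."""
    return [min(g(ti, tj) for tj in grid_j) for ti in grid_i]


def confidence_interval(grid_i, profile, threshold=CHI2_95_1DF):
    """Return (lower, upper, bounded_below, bounded_above) on the sampled grid."""
    best = min(profile)
    inside = [t for t, p in zip(grid_i, profile) if p - best < threshold]
    lower, upper = min(inside), max(inside)
    # If the region touches the edge of the sampled range, the interval is
    # not closed within the feasible set in that direction.
    return lower, upper, lower > grid_i[0], upper < grid_i[-1]


# Hypothetical two-parameter objective with its optimum at (1, 2).
def g(theta_i, theta_j):
    return ((theta_i - 1.0) / 0.5) ** 2 + (theta_j - 2.0) ** 2


grid_i = [-1.0 + 0.1 * k for k in range(41)]  # theta_i sampled from -1 to 3
grid_j = [0.1 * k for k in range(41)]         # theta_j sampled from 0 to 4
profile = profile_likelihood(g, grid_i, grid_j)
lower, upper, bounded_below, bounded_above = confidence_interval(grid_i, profile)
```

In this toy case the interval is closed on both sides. An objective that does not depend on `theta_i` at all would give a flat profile, and both `bounded_below` and `bounded_above` would be false.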

2.4.1. Two-dimensional profile likelihood

The PL method essentially projects the $n_\theta$ dimensional space $\Theta$ onto the single parameter $\theta_i$, by freely estimating the remaining parameters. Hence, if parameters are not independent, the PL method tends to overestimate the width of the likelihood-based confidence interval. A step towards remedying this issue is to modify the PL method to hold out two parameters rather than one, i.e.,

$$PL_2(\theta_i, \theta_j) = \min_{\theta_{k \neq i,j}} g\left( \theta_{k \neq i,j}; \mathcal{M}, \mathcal{K}, \theta_i, \theta_j \right) \tag{22}$$

This results in a two-dimensional distribution which can be analysed in a similar way to the one-dimensional PL [22], using the definition in Eq. (20). The PL2 results are plotted as topological surfaces [22]. This projects the parameter space $\Theta$ onto the plane of $\theta_i$ and $\theta_j$. In addition to diagnosing identifiability issues, these plots can be used to diagnose parameter interdependence. Observe that since $\hat{\theta}$ has $n_\theta$ free parameters while the PL2 estimate has $n_\theta - 2$, this gives $n_{df} = 2$ for the computation of $\Delta_\alpha$ from the $\chi^2$-distribution in Eq. (20).

Applying a confidence threshold to the PL2 method produces confidence regions in the $(\theta_i, \theta_j)$ plane, rather than intervals in a single parameter. Based on confidence thresholds computed from the $\chi^2$ distribution, a similar interpretation of these two-dimensional topologies can be applied to diagnose identifiability by requiring that the region is bounded in all directions. If there is an unbounded equipotential valley with a log likelihood below the $\Delta_\alpha$ threshold, the parameter is structurally non-identifiable. If the interval or region is unbounded only in one direction, this indicates a practically non-identifiable parameter. Examples of two-dimensional PL plots are given in Section 4. If parameter interdependence is observed, re-parametrisation of the model such that the interdependency is resolved may be advisable in order to obtain a model with tighter confidence bounds on the estimated parameters.
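A small sketch of the PL2 computation in Eq. (22) on a grid is given below. The three-parameter objective is hypothetical and constructed so that only the product of the two held-out parameters is determined, which produces the kind of valley that indicates parameter interdependence. The inner minimisation is again a grid search standing in for a numerical optimiser, and the threshold is the tabulated 95 % quantile of the $\chi^2$-distribution with two degrees of freedom.

```python
CHI2_95_2DF = 5.991  # 95 % quantile of the chi-squared distribution, 2 dof


def profile_2d(g, grid_i, grid_j, grid_k):
    """PL2 on a grid: fix (theta_i, theta_j), minimise g over theta_k."""
    return [[min(g(ti, tj, tk) for tk in grid_k) for tj in grid_j]
            for ti in grid_i]


def confidence_region(surface, threshold=CHI2_95_2DF):
    """Boolean mask of grid cells whose profile is within the threshold."""
    best = min(min(row) for row in surface)
    return [[value - best < threshold for value in row] for row in surface]


# Hypothetical objective where only the product theta_i * theta_j matters.
def g(theta_i, theta_j, theta_k):
    return ((theta_i * theta_j - 1.0) / 0.1) ** 2 + (theta_k - 2.0) ** 2


grid = [0.25 * k for k in range(1, 17)]  # 0.25, 0.5, ..., 4.0
surface = profile_2d(g, grid, grid, grid)
region = confidence_region(surface)
```

Here the accepted region follows the curve where the product equals one and reaches the edges of the sampled range, so neither parameter is bounded on its own even though their product is well determined.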

2.4.2. Interpretation of wide confidence regions

It can be argued that a wide confidence region is indicative of an identifiability issue even if the region is bounded. If the range of acceptable parameter values is large, the interpretation of the estimated parameters as being determined by the physical properties of the system, i.e., the step from $\mathcal{S} \in \mathcal{M}(\Theta)$ to $\mathcal{M}(\hat{\theta})$ representing $\mathcal{S}$, is questionable.

One possible cause of wide confidence bounds on the estimated parameters is the presence of nuisance parameters, i.e., parameters whose value is insignificant for the model estimates.

2.4.3. Effect of constrained parameters

Observe that solving the two-step optimisation problem in Eq. (19) subjected to the constraint $\theta \in \Theta$ imposes a restriction on the identified profile $PL(\theta_i)$. This constraint may skew the results, since the remaining parameters $\theta_{j \neq i}$ are only considered within the region $\Theta$. If parameters are not independent, the profile of one parameter may be influenced by the constraints of another. In the PL2 method, the effect of constrained optimisation of parameters is easier to diagnose, since dependent parameters can be identified from the topology plots.

2.5. Model validation

The CTSM method requires evaluation of the residuals to verify that the assumption of Gaussian distributed residuals is justified [5,15]. In the CTSM literature, the autocorrelation function (ACF) is used to test for normality of residuals in the time-domain, while a cumulative periodogram (CP) is used in the frequency domain [5,8,15]. There are also a number of alternative tests for normality that can be applied, such as the zero-crossings test or the Kolmogorov–Smirnov test [3].
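One common way to use the ACF in residual analysis is to check that the sample autocorrelations at non-zero lags stay within an approximate 95 % band of $\pm 1.96/\sqrt{N}$ expected for white noise. The sketch below illustrates this check for a scalar residual sequence. The function names and the artificial residual sequence are assumptions for illustration, and this is not the validation code used in the paper.

```python
import math


def autocorrelation(residuals, max_lag):
    """Sample autocorrelation of a scalar residual sequence for lags 0..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    c0 = sum(c * c for c in centred)
    return [sum(centred[t] * centred[t + lag] for t in range(n - lag)) / c0
            for lag in range(max_lag + 1)]


def lags_outside_band(acf, n):
    """Lags (excluding 0) whose autocorrelation exceeds the ~95 % white-noise band."""
    bound = 1.96 / math.sqrt(n)
    return [lag for lag, rho in enumerate(acf) if lag > 0 and abs(rho) > bound]


# A strongly correlated (alternating) sequence fails the check at every lag.
residuals = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
acf = autocorrelation(residuals, max_lag=5)
suspicious = lags_outside_band(acf, len(residuals))
```

For residuals from a well calibrated model, the list of suspicious lags would be expected to be empty or nearly so.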

3. Case study model and simulation

3.1. Model

A thermal network model of a building can be expressed as a resistor-capacitor (RC) circuit. These models are based on a naive physical understanding of temperature variations in the building structure, which entails simplifications that necessarily introduce modelling errors. The result is a simplified, lumped parameter model, which should be treated in the framework of grey-box modelling, and hence formulated as stochastic differential equations (SDE) as in Eq. (4) [5].

Fig. 1 shows an example of a candidate RC model which was developed to approximate the thermal behaviour of the experimental building discussed in Section 3.2, partially based on the R4C2 model presented in [1]. The model has two outputs: the room temperature $T_b$ and the wall surface temperature $T_w$, and two inputs: the consumed power by an electric heating element $\dot{Q}$ and the outside temperature $T_\infty$. Five components form the model structure: the thermal resistance between room air and wall $R_b$, the building envelope $R_w$, and the thermal resistance of windows and doors $R_g$. The two capacitances $C_b$ and $C_w$ represent the thermal capacitance of the building interior and envelope, respectively.

Fig. 1. The R3C2 thermal network model of an experimental building can be expressed as a resistor–capacitor equivalent circuit containing three resistors and two capacitors.

A nominal parameter vector $\theta_0$, listed in Table 2, is used as the initial value for parameter estimation. Additionally, the feasible values region $\Theta$ is limited by $\theta_{min}$ and $\theta_{max}$, which are chosen as $0.3 \times \theta_0$ and $1.7 \times \theta_0$, respectively.

Table 2
Nominal parameter values and min/max limits for resistances [K/W] and capacitances [J/K].

         R_b     R_w     R_g     C_b      C_w
θ_0      0.100   0.100   0.250   1200 k   1200 k
θ_min    0.030   0.030   0.075   360 k    360 k
θ_max    0.170   0.170   0.425   2040 k   2040 k

Fig. 2. Calibration data for the R3C2 model. The model outputs $T_b$ (red) and $T_w$ (blue) are plotted together with the outdoor temperature input $T_\infty$ (green). The input power $\dot{Q}$ is plotted separately. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3.2. Calibration data

The calibration data used for parameter estimation was obtained from an experimental building located at Campus Porsgrunn of the University of South-Eastern Norway (USN). The data was collected by multiple data acquisition systems, each producing a separate data subset, and combined into a consistent dataset in the preprocessing step. The data was first filtered to remove noise and subsequently resampled into a uniform temporal scale. In order to maintain measurement uncertainty after preprocessing, a random noise component of covariance 0.1 was added to the temperature measurements. The resulting data is presented in Fig. 2.

Fig. 3. Illustration of Kalman Filter (KF) with externally simulated (SIM) state propagation.

3.3. RC Simulator

The choice of model structure for a thermal network model, i.e., the RC circuit, usually involves significant experimentation [1,7,16]. To simplify, and possibly automate, the process of finding appropriate model structures, it is useful to simulate such models without requiring explicit model equations. Since the thermal networks are modelled as RC circuits, it is natural to look to the electronics field where circuits are often simulated using tools such as SPICE [33]. A circuit simulator can be used to propagate the state, hence replacing the drift term of Eq. (4), as illustrated in Fig. 3. Using this set-up with the parameter estimation method in Section 2.2 requires a KF implementation that can handle non-differentiable models, such as UKF and EnKF.

A simple circuit simulator is constructed, named RCSimulator for reference in the sequel. Circuit simulators typically define the circuit model as a list of interconnected components, which can be taken directly from the schematic in Fig. 1. By convention, all components have two terminals named in and out. Each node is assigned an integer index which is used to configure the connections of the components as a circuit. For example, letting node $T_b$ have index 1 and $T_w$ index 2, the component $R_b$ would have input/output assignment (1,2). For each node in the circuit, Kirchhoff's node current law is used to balance the flow in and out of the node [34]. The system of node equations can be written in difference form:

$$A x_k + A_m x_{k-1} + B u_k = 0 \tag{23}$$

The contributions from all components are summed together, such that rows $i$ in $A$, $A_m$, and $B$ constitute the balance equation for node $i$. Eq. (23) is solved for $x_k$ at each time-step in order to propagate the state. The only dynamic element is the capacitor, which is implemented using an implicit Euler discretisation, $\left. \frac{dx}{dt} \right|_{t_k} \approx \frac{x_k - x_{k-1}}{\Delta t}$, by contributing to both the $A$ and $A_m$ matrices. Voltage sources are implemented as constraints on the difference between the states of the two connected nodes. The measurement Eq. (5) can be implemented as measuring the potential between selected nodes in the RC circuit.
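The assembly of Eq. (23) from a component list can be sketched as follows. This is not the RCSimulator described above but a simplified illustration under several assumptions: the component tuple format and function names are hypothetical, the outdoor temperature is treated as a known boundary input rather than as a voltage source constraint, and the circuit topology (with $R_b$ between $T_b$ and $T_w$, $R_w$ and $R_g$ connecting $T_w$ and $T_b$ to $T_\infty$, and $\dot{Q}$ injected at $T_b$) is inferred from the description in Section 3.1. Parameter values are the nominal values of Table 2.

```python
def assemble(components, n_nodes, n_inputs, dt):
    """Sum component contributions into A, Am, B of A x_k + Am x_{k-1} + B u_k = 0."""
    A = [[0.0] * n_nodes for _ in range(n_nodes)]
    Am = [[0.0] * n_nodes for _ in range(n_nodes)]
    B = [[0.0] * n_inputs for _ in range(n_nodes)]
    for comp in components:
        kind = comp[0]
        if kind == "R":        # resistor between two state nodes i and j
            _, i, j, r = comp
            A[i][i] -= 1.0 / r
            A[i][j] += 1.0 / r
            A[j][j] -= 1.0 / r
            A[j][i] += 1.0 / r
        elif kind == "R_in":   # resistor from state node i to boundary input m
            _, i, m, r = comp
            A[i][i] -= 1.0 / r
            B[i][m] += 1.0 / r
        elif kind == "C":      # capacitor at node i, implicit Euler
            _, i, c = comp
            A[i][i] -= c / dt
            Am[i][i] += c / dt
        elif kind == "Q":      # heat flow input m injected at node i
            _, i, m = comp
            B[i][m] += 1.0
        else:
            raise ValueError("unknown component kind: " + str(kind))
    return A, Am, B


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[r]] for r, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def step(A, Am, B, x_prev, u):
    """Propagate the state one time step by solving Eq. (23) for x_k."""
    n = len(x_prev)
    rhs = [-(sum(Am[r][c] * x_prev[c] for c in range(n))
             + sum(B[r][m] * u[m] for m in range(len(u)))) for r in range(n)]
    return solve(A, rhs)


# States: node 0 = T_b, node 1 = T_w. Inputs: u[0] = Q (W), u[1] = T_inf.
components = [("R", 0, 1, 0.1), ("R_in", 1, 1, 0.1), ("R_in", 0, 1, 0.25),
              ("C", 0, 1.2e6), ("C", 1, 1.2e6), ("Q", 0, 0)]
A, Am, B = assemble(components, n_nodes=2, n_inputs=2, dt=60.0)
x = [10.0, 10.0]
for _ in range(10):
    x = step(A, Am, B, x, u=[1000.0, 10.0])
```

Because the filter only needs to call `step`, the same interface could be served by any external simulator, which is the property that makes derivative-free filters attractive here.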

The simulation scheme, and in particular the discretisation of the capacitive elements, could be extended with more accurate approximations such as the Runge–Kutta 4th order (RK4) scheme [35]. It is also possible to introduce non-linear components, such as variable resistors. Observe that while the test case model used here is linear, the method of estimating residuals with UKF or EnKF for externally simulated models has no such restriction.

3.4. Discrete time linear model

For comparison, the model is also expressed in a standard linear state space form

$$\frac{dx}{dt} = A x_t + B u_t + G w_t \tag{24}$$
