Research Department Oslo, December 22, 1991
1991/9
Inference in small cointegrated systems Some Monte Carlo results
by
Øyvind Eitrheim
ISSN 0801-2504
ISBN 82-7553-031-8
Copies of this Working Paper are obtainable from Norges Bank, Library,
P.B. 1179, Sentrum, 0107 Oslo 1, Norway.
Norges Bank's Working papers present research projects and reports (not usually in their final form), and are intended inter alia to enable the author to benefit from the comments of colleagues and other interested parties.
Views and conclusions expressed are the responsibility of the author alone.
Inference in small cointegrated systems Some Monte Carlo results
by
Øyvind Eitrheim¹ 22 December 1991
¹ The research for this paper was carried out while the author was visiting the Department of Economics, University of California, San Diego in 1990/1991. Thanks to Tony Hall, Eugen Nowak and Timo Teräsvirta for many stimulating discussions and to Clive W.J. Granger and Rob Engle for providing such a stimulating research environment.
The paper was presented at the Nordic workshop on cointegration analysis in Helsingør, 28 October to 1 November 1991, and I am indebted to Søren Johansen for his stimulating comments and suggestions. Finally I thank my colleagues Kjell G. Bleivik, Eilev S. Jansen and Ragnar Nymoen for their comments on a previous version. Author's address: Research Department, C51, Bank of Norway, P.B. 1179 Sentrum, 0107 Oslo 1, Norway. Phone: +47 2 316161, Fax: +47 2 424062, Bitnet: EXT90004 at NOBIVM.
from a practitioner's perspective. We address the robustness of the cointegration tests in small samples and with respect to particular types of misspecification of the model.
A small cointegrated system is parameterized and forms the basis for the Monte Carlo simulations. Non-parametric estimates of the distributions of the Trace and λ-Max tests are reported, as well as of some of the estimators for the long-run and short-run parameters in the model. Power properties and finite sample performance of the cointegration tests and estimators are discussed, and the results are interpreted in the light of available asymptotics.
The types of model misspecification considered include the case with wrong dynamic specification (i.e. wrong order k in the VAR model) and the case when we ignore non- normality in the DGP residuals (i.e. when the DGP residuals are subject to ARCH (Auto Regressive Conditional Heteroscedasticity) or are serially correlated). We also discuss how data properties like temporal aggregation or systematic sampling may affect the inference on cointegration, and how the Johansen procedure performs under those conditions. Finally, we consider the case with cointegration between non-stationary latent variables which are observed with measurement errors.
Inference on cointegration 1
Contents
1 Introduction
1.1 Background and motivation
1.2 Non-stationarity and persistence in economic data
1.3 Some remarks about the methodology
2 The model
2.1 A small cointegrated system
2.2 A simulated example
2.3 The dynamic properties
2.4 A note on renormalization
2.5 The Trace test for the cointegrating rank
2.6 The asymptotic distribution of the Trace test
2.7 The estimators for $\alpha$, $\Gamma_1$ and $\beta$
2.8 A case study: Estimation results based on simulated data
3 Monte Carlo results for a correctly specified model
3.1 Baseline results for the prototype model
3.2 Power properties in the case of "near cointegration"
3.3 Finite sample performance
4 Monte Carlo results for some misspecified models
4.1 Misspecified dynamics, the VAR(k) has wrong order
4.2 Data are filtered through systematic sampling or temporal aggregation
4.3 DGP residuals are heterogeneous with weak to medium strong ARCH
4.4 DGP residuals are serially correlated
4.5 Data are observed with measurement errors
5 Concluding remarks
References
Issued in the series ARBEIDSNOTATER from Bank of Norway
1 Introduction
1.1 Background and motivation
Empirical analysis of small systems of cointegrated time series has become a popular approach to modelling simultaneous relationships between I(1) non-stationary economic variables. The flourishing literature on cointegration has provided insights about postulated long run relationships between economic variables and has stimulated research on how these should be modelled in a dynamic context. The pioneering work on cointegration was presented in Granger(1981,1983) and Granger and Weiss(1983) and came as a natural extension of the univariate analysis of time series with a unit root in Fuller(1976), Dickey and Fuller(1979), Phillips(1987) and others. Engle and Granger(1987) proposed a simple two step approach to cointegration based on OLS estimation which immediately caught widespread attention. Stock(1987) compared OLS and NLS estimators for the parameters in a single equation error correction model (ECM) and analysed the asymptotic distribution of the long run and short run parameter estimates. The single equation ECM has later been further analysed in Phillips and Hansen(1990), Stock and Watson(1990) and Phillips and Loretan(1991).
An alternative approach to cointegration was proposed in Johansen(1988). He suggested using the theory of reduced rank regression, inspired by the seminal work by Anderson(1951), cf. also e.g. Velu, Reinsel and Wichern(1986). This approach has been further developed in Johansen(1989,1992) and in Johansen and Juselius(1990). Gonzalo(1989) compared five different approaches to modelling bivariate cointegration, and his Monte Carlo results showed very favourable properties for Johansen's approach, which stimulated us to analyse this approach further. See also Phillips and Loretan(1991) for additional Monte Carlo evidence on the relative merits of different approaches to cointegration.
The Johansen method is reasonably simple to implement on a PC, and it is also fairly general in the sense that it can be used in a multivariate context to estimate a small cointegrated system as well as in the usual single equation error correction model (ECM) framework. A VAR(k) model with Gaussian errors forms the backbone of Johansen's approach, while e.g. Phillips(1991) analyses cointegration under more general assumptions for the error distribution (i.e. assuming weakly dependent and heterogeneously distributed errors).
1.2 Non-stationarity and persistence in economic data
There is a controversy in the literature between those who believe in I(1) non-stationary representations of economic time series and those who favour trend stationary models (where the time series are I(0) around a deterministic polynomial trend, possibly including break points for shifts in the trend, so that the non-stationarity stems from the trend alone). At the centre of this debate is how literally one should interpret the unit root assumption. While most econometricians probably agree that economic variables seem to be characterized by substantial persistence in memory (shocks die out very slowly), and frequently also by some degree of time heterogeneity like heteroscedasticity (e.g. with ARCH characteristics), the claim that the variables contain a unit root seems to be more controversial.
Phillips(1991) argues that prior knowledge about unit roots and cointegration represents important information about the data generating mechanism (DGP), and that the restrictions which follow should be imposed during estimation. Phillips and Loretan(1989), Johansen(1988) and Johansen and Juselius(1990) follow this view, while e.g. Sims, Stock and Watson(1990) are in favour of using unrestricted models. Campbell and Perron(1991) take a more pragmatic standpoint in a recent survey paper and suggest that the strategy in each case should be motivated by the particular purpose of the analysis.
1.3 Some remarks about the methodology
A small cointegrated system is parameterized and used as the basis for the Monte Carlo simulations in this paper. We have specified a four dimensional stochastic process, and a system approach is preferred in order to determine the rank of the cointegrating space and estimate the parameters in the model. The approach to cointegration in Johansen(1988) satisfies this requirement and has been used throughout the paper. A reduced rank (canonical regression) procedure, similar to the one developed in Anderson(1951), forms the basis for the tests, and the asymptotics have been worked out for I(1) and cointegrated variables in Johansen(1988)¹. Two test statistics, denoted Trace and λ-Max respectively, were derived in order to determine the rank of the cointegration space. These methods have later been further developed by Johansen and Juselius(1990). Their approach has attracted particular interest among proponents of structural econometric modelling (SEM), since it allows quite general classes of (linear) parameter restrictions to be tested within this framework². The relevant test statistics turn out to be surprisingly simple, with a limiting $\chi^2$ distribution, cf. e.g. Johansen(1992) for details. Phillips(1991) proves that this result holds more generally, and in particular for the class of LAMN (Locally Asymptotically Mixed Normal) models.
The two rank tests for cointegration and the estimators for the long run parameters in the model turn out to be more complicated and have non-standard asymptotic distributions. It is the performance of these tests and estimators that we focus on in the following chapters.
A prototype DGP is constructed such that it contains a small cointegrated system with three relationships (part 2). Some properties of the cointegration tests are discussed in part 3 along with a discussion of the small sample performance. Part 4 presents some results from experiments where the model is misspecified in some direction. Part 5 concludes the paper.
¹ Anderson(1991) gives interesting insights into the early development of econometric techniques at the Cowles Commission in the 1940s and 1950s and claims that many of the recent contributions in econometrics have aspects of "rediscovery" to them.
² Cf. e.g. Hendry and Mizon(1991) and Clements and Mizon(1991), who have applied the VAR cointegration model in this context.
2 The model
A popular definition of cointegration is the following:
Definition 2.1 A vector of time series, $x_t$, with elements $x_{it}$, $i = 1, \ldots, p$, each of which is assumed to be I(d), is said to be cointegrated, CI(d, k), if a linear combination $z_t = \beta' x_t$ is integrated I(d - k) with $0 < k \le d$.
2.1 A small cointegrated system
Let the p-dimensional vector $x_t$ be generated from a linear VAR(k) model with Gaussian errors³. All variables are assumed to be I(1), and we apply the compact notation $x_t = (x_{1t}, x_{2t}, \ldots, x_{pt})'$. The VAR(k) model is given by

(2.1)  $x_t = \sum_{i=1}^{k} \Pi_i x_{t-i} + \mu_0 + \epsilon_t$

and we assume $\epsilon_t \sim \mathrm{Niid}(0, \Omega)$. The long-run multipliers in this model may be expressed by the matrix $\Pi(1) = I_p - \sum_{i=1}^{k} \Pi_i$, where $I_p$ is the identity matrix. It is convenient to rewrite the model for $x_t$ using the interim multiplier reparameterization⁴:

(2.2)  $\Delta x_t = \sum_{i=1}^{k-1} \Gamma_i \Delta x_{t-i} + \alpha\beta' x_{t-k} + \epsilon_t$
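The parameters in $\Gamma_j = -I_p + \sum_{i=1}^{j} \Pi_i$, $j = 1, \ldots, (k-1)$, and $\alpha$ may be thought of as short run parameters, while $\beta$ contains the long run parameters in the model. The prototype model is given by:

(2.3)  $\begin{pmatrix} \Delta x_{1,t} \\ \Delta x_{2,t} \\ \Delta x_{3,t} \\ \Delta x_{4,t} \end{pmatrix} = \begin{pmatrix} 0.1 & 0.4 & 0 & 0 \\ 0.3 & 0.1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \Delta x_{1,t-1} \\ \Delta x_{2,t-1} \\ \Delta x_{3,t-1} \\ \Delta x_{4,t-1} \end{pmatrix} + \begin{pmatrix} -0.4 & 0.2 & 0 \\ 0.2 & -0.2 & 0 \\ 0 & 0 & -0.25 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_{1,t-2} - x_{2,t-2} \\ x_{2,t-2} - x_{3,t-2} \\ x_{3,t-2} - x_{4,t-2} \end{pmatrix} + \begin{pmatrix} \epsilon_{1t} \\ \epsilon_{2t} \\ \epsilon_{3t} \\ \epsilon_{4t} \end{pmatrix}$

where $\epsilon_t \mid \mathcal{I}_{t-1} \sim \mathrm{Niid}(0, I_4)$.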
³ This model has been extensively analysed by Johansen(1988) and Johansen and Juselius(1990).
⁴ Cf. e.g. Johansen(1988), Johansen and Juselius(1990) or Hylleberg and Mizon(1989).
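As a check on the reparameterization, the step from (2.1) to (2.2) can be written out for k = 2. This is only a worked sketch: the constant term is left out for brevity, and nothing beyond the definitions of $\Pi(1)$ and $\Gamma_1$ given above is used.

```latex
\begin{aligned}
x_t        &= \Pi_1 x_{t-1} + \Pi_2 x_{t-2} + \epsilon_t \\
\Delta x_t &= (\Pi_1 - I_p)\,x_{t-1} + \Pi_2 x_{t-2} + \epsilon_t \\
           &= (\Pi_1 - I_p)\,\Delta x_{t-1} + (\Pi_1 + \Pi_2 - I_p)\,x_{t-2} + \epsilon_t \\
           &= \Gamma_1 \Delta x_{t-1} - \Pi(1)\,x_{t-2} + \epsilon_t
\end{aligned}
```

Comparing the last line with (2.2) gives $\Gamma_1 = -I_p + \Pi_1$ and $\alpha\beta' = -\Pi(1)$, so a reduced rank of $\Pi(1)$ is what allows the factorization into $\alpha$ and $\beta$.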
Figure 2.1: A simulated trajectory for the model (2.3) (horizontal axis: observations 0 to 400)
2.2 A simulated example
The intuition behind cointegration can be expressed as follows. Consider a group of variables whose development over time is determined by (at least) one unit root, which creates a non-stationary, typically I(1), type of time trajectory. Cointegration means that the non-stationarity is a common characteristic (or feature) of the variables, and we expect the variables to stay together and not drift too far apart from each other over time. A trajectory for cointegrated variables is illustrated in figure 2.1, which shows a particular realization of the four-dimensional stochastic process (2.3).
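A trajectory of this kind can be generated with a few lines of NumPy. The sketch below is not the author's simulation code: the function name `simulate_dgp`, the burn-in length and the seed are illustrative choices, and the element $\alpha_{33}$ is assumed to be -0.25, the value that reproduces the double root 0.5 reported for the prototype in table 2.1.

```python
import numpy as np

# Prototype DGP (2.3), written as
#   dx_t = GAMMA1 @ dx_{t-1} + ALPHA @ BETA.T @ x_{t-2} + e_t,  e_t ~ N(0, I_4).
GAMMA1 = np.array([[0.1, 0.4, 0.0, 0.0],
                   [0.3, 0.1, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])
ALPHA = np.array([[-0.4, 0.2, 0.0],
                  [0.2, -0.2, 0.0],
                  [0.0, 0.0, -0.25],   # assumed sign, see the lead-in
                  [0.0, 0.0, 0.0]])
BETA = np.array([[1.0, 0.0, 0.0],
                 [-1.0, 1.0, 0.0],
                 [0.0, -1.0, 1.0],
                 [0.0, 0.0, -1.0]])


def simulate_dgp(n_obs=400, burn_in=100, seed=0):
    """Return (x, eps): n_obs rows of simulated levels and the shocks used."""
    rng = np.random.default_rng(seed)
    total = n_obs + burn_in + 2          # two start-up values for the VAR(2)
    eps = rng.standard_normal((total, 4))
    x = np.zeros((total, 4))
    long_run = ALPHA @ BETA.T            # equals -Pi(1)
    for t in range(2, total):
        dx_lag = x[t - 1] - x[t - 2]
        x[t] = x[t - 1] + GAMMA1 @ dx_lag + long_run @ x[t - 2] + eps[t]
    return x[-n_obs:], eps[-n_obs:]


x, eps = simulate_dgp()
spreads = x @ BETA                       # x1 - x2, x2 - x3, x3 - x4
print("std of levels :", x.std(axis=0).round(2))
print("std of spreads:", spreads.std(axis=0).round(2))
```

The levels share the stochastic trend carried by the fourth series, while the three spreads are the stationary combinations that keep the series together.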
2.3 The dynamic properties
Some further remarks about the DGP are useful. (2.3) is parameterized with p = 4, r = 3, k = 2 and with a particular normalization of the long run system imposed such that the differences between the series are stationary. In addition, the DGP contains some short run dynamics, but such that only one root has modulus one. The unit root corresponds to the variable $x_{4t}$, which is specified as a random walk. The other roots have moduli less than one, and there is also one pair of complex conjugate roots which contributes stable cycles to the trajectory. The error correction part of the model will prevent the four variables from drifting too far apart, and we will typically see long swing (unit root) behaviour for the cluster of non-stationary variables, cf. figure 2.1 above.
How do the dynamic properties change when we change one of the columns in $\alpha$? In order to describe the dynamics in the DGP it is useful to look at the characteristic roots of the following determinant⁵, written here for the VAR(2) case of (2.3):

$\det\left[\lambda^2 I_p - \lambda (I_p + \Gamma_1) - (\alpha\beta' - \Gamma_1)\right] = 0$

⁵ Cf. Hendry and Mizon(1991), who suggest a similar procedure based on the companion form representation of the VAR(2) model.

We have considered multiplicative changes in either the second or the third column of $\alpha$ using a scalar $s \in \{4, 2, 1, 1/5, 1/100\}$. Table 2.1 reports the roots of the system determinant, decomposed into real part, imaginary part and modulus.
The roots for the prototype model (2.3) are shown in the middle part of table 2.1. The two upper sets belong to models where we have increased the absolute values of either $\alpha_{j2}$ or $\alpha_{j3}$, and the two bottom sets belong to models where we let either $\alpha_{j2}$ or $\alpha_{j3}$ approach 0. Figure 2.2 shows simulated trajectories for the five cases in the left part of table 2.1 (different $\alpha_{j2}$), while figure 2.3 shows the five right hand cases (different $\alpha_{j3}$).
Since $x_{4t}$ is a simple random walk process and enters into the system only through the parameter $\alpha_{33}$, we expect to see at least one unit root in table 2.1. In the limit when $\alpha_{j2} \to 0$ or $\alpha_{j3} \to 0$, we see that there will be two unit roots and the rank of the cointegration matrix will be reduced from 3 to 2. The model has one or two pairs of complex conjugate roots which give rise to cyclical movements in the series. When the absolute values of either $\alpha_{j2}$ or $\alpha_{j3}$ are sufficiently increased, the modulus becomes larger than 1 for one of the pairs, and we obtain the explosive cyclical pattern which is evident in the upper left part of figures 2.2 and 2.3. The block diagonal structure in $\alpha$ will generate some interesting differences in the two limiting cases when either $\alpha_{j2} \to 0$ or $\alpha_{j3} \to 0$. When $\alpha_{j3} \to 0$, the system will be driven by two stochastic trends, one which drives the three series $x_{1t}$, $x_{2t}$ and $x_{3t}$ and another which drives $x_{4t}$. When $\alpha_{j2} \to 0$, the pairs $(x_{1t}, x_{2t})$ and $(x_{3t}, x_{4t})$ will be
Table 2.1: Dynamic characteristics of the DGP (2.3). Multiplicative changes in the columns $\alpha_{j2}$ or $\alpha_{j3}$.

            Changes in alpha_j2              Changes in alpha_j3
Scale     Real   Imaginary   Modulus      Real   Imaginary   Modulus

4 x       0.75     0.24       0.79        0.35     0.61       0.70
          0.75    -0.24       0.79        0.35    -0.61       0.70
          0.35     0.95       1.01        0.85      -         0.85
          0.35    -0.95       1.01        0.64      -         0.64
          0.50      -         0.50        0.50     0.87       1.00
          0.50      -         0.50        0.50    -0.87       1.00
          0.00      -         0.00        0.00      -         0.00
          1.00      -         1.00        1.00      -         1.00

2 x       0.35     0.73       0.81        0.35     0.61       0.70
          0.35    -0.73       0.81        0.35    -0.61       0.70
          0.76     0.15       0.77        0.85      -         0.85
          0.76    -0.15       0.77        0.64      -         0.64
          0.50      -         0.50        0.50     0.50       0.71
          0.50      -         0.50        0.50    -0.50       0.71
          0.00      -         0.00        0.00      -         0.00
          1.00      -         1.00        1.00      -         1.00

1 x       0.35     0.61       0.70        0.35     0.61       0.70
          0.35    -0.61       0.70        0.35    -0.61       0.70
          0.86     0          0.86        0.85     0          0.85
          0.64     0          0.64        0.64     0          0.64
          0.50     0          0.50        0.50     0          0.50
          0.50     0          0.50        0.50     0          0.50
          0.00     0          0.00        0.00     0          0.00
          1.00     0          1.00        1.00     0          1.00

1/5 x     0.38     0.48       0.61        0.35     0.61       0.70
          0.38    -0.48       0.61        0.35    -0.61       0.70
          0.98     0          0.98        0.85     0          0.85
          0.45     0          0.45        0.64     0          0.64
          0.50     0          0.50        0.01     0          0.01
          0.50     0          0.50        0.99     0          0.99
          0.00     0          0.00        0.00     0          0.00
          1.00     0          1.00        1.00     0          1.00

1/100 x   0.39     0.46       0.61        0.35     0.61       0.70
          0.39    -0.46       0.61        0.35    -0.61       0.70
          1.00     0          1.00        0.85     0          0.85
          0.42     0          0.42        0.64     0          0.64
          0.50     0          0.50        0.00     0          0.00
          0.50     0          0.50        1.00     0          1.00
          0.00     0          0.00        0.00     0          0.00
          1.00     0          1.00        1.00     0          1.00
Figure 2.2: Simulated trajectories for (2.3) when we change $\alpha_{j2}$ to $c \times \alpha_{j2}$, $c \in \{4, 2, 1, 1/5, 1/100\}$, $j = 1, \ldots, 4$. Panels include Alpha(j,2) x 4.0, Alpha(j,2) x 2.0, the prototype model (2.3) and Alpha(j,2) x 0.20.
Figure 2.3: Simulated trajectories for (2.3) when we change $\alpha_{j3}$ to $c \times \alpha_{j3}$, $c \in \{4, 2, 1, 1/5, 1/100\}$, $j = 1, \ldots, 4$. Panels include Alpha(j,3) x 2.0, the prototype model (2.3), Alpha(j,3) x 0.2 and Alpha(j,3) x 0.01.
driven by separate stochastic trends. In both cases a typical long swing behaviour is recognized for each of the two sub-clusters.
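The roots discussed in this section can be recomputed from the companion form of the VAR(2). The following is a minimal sketch rather than the procedure actually used for table 2.1: the helper names `characteristic_roots` and `scaled_alpha` are hypothetical, and $\alpha_{33} = -0.25$ is assumed, since that value gives the double root 0.5 listed for the prototype.

```python
import numpy as np

GAMMA1 = np.array([[0.1, 0.4, 0.0, 0.0],
                   [0.3, 0.1, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])
ALPHA = np.array([[-0.4, 0.2, 0.0],
                  [0.2, -0.2, 0.0],
                  [0.0, 0.0, -0.25],   # assumed sign, see the lead-in
                  [0.0, 0.0, 0.0]])
BETA = np.array([[1.0, 0.0, 0.0],
                 [-1.0, 1.0, 0.0],
                 [0.0, -1.0, 1.0],
                 [0.0, 0.0, -1.0]])


def characteristic_roots(gamma1, alpha, beta):
    """Eigenvalues of the companion matrix of the VAR(2) implied by (2.3)."""
    p = gamma1.shape[0]
    a1 = np.eye(p) + gamma1               # coefficient matrix on x_{t-1}
    a2 = alpha @ beta.T - gamma1          # coefficient matrix on x_{t-2}
    companion = np.block([[a1, a2],
                          [np.eye(p), np.zeros((p, p))]])
    return np.linalg.eigvals(companion)


def scaled_alpha(column, scale):
    """Copy of ALPHA with one column (1-based index) multiplied by `scale`."""
    out = ALPHA.copy()
    out[:, column - 1] *= scale
    return out


for column in (2, 3):
    for scale in (4, 2, 1, 1 / 5, 1 / 100):
        roots = characteristic_roots(GAMMA1, scaled_alpha(column, scale), BETA)
        moduli = np.sort(np.abs(roots))[::-1]
        print(f"alpha_j{column} x {scale:<5}: moduli {moduli.round(2)}")
```

Eigenvalues with modulus below one correspond to stable dynamics, a modulus of exactly one to a unit root, and a modulus above one to the explosive cycles described in the text.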
2.4 A note on renormalization
The model (2.3) is deliberately overparameterized (cf. Johansen(1989)), and the 2pr (= 24) parameters in $\alpha$ and $\beta$ are identified only up to an arbitrary linear transformation. Any non-singular r x r matrix $D$ can be used to obtain 'new' (but observationally equivalent) matrices $\alpha^{\dagger}$ and $\beta^{\dagger}$ such that $\beta^{\dagger} = \beta D$, i.e.

$-\Pi(1) = \alpha\beta' = \alpha D'^{-1} D' \beta' = \alpha^{\dagger} \beta^{\dagger\prime}$

In order to facilitate the interpretation of the estimated results, it is often necessary to renormalize the original parameter estimates $\hat\beta$ and $\hat\alpha$. In applied work we often impose a normalization of $\hat\beta$ by dividing each column by a column specific element (the $k_j$th, say). More generally we can renormalize by defining the matrix $D = (\beta' R)^{-1}$ for an appropriate choice of the p x r matrix $R$. For instance, to obtain the normalization rule above we set $R = [\iota_1, \ldots, \iota_r]$ where the $\iota_j$ are p x 1 vectors with (p - 1) 0's and a 1 in the $k_j$th element.
For given rank r, we suggest the following procedure to renormalize the estimated parameters. We let $\hat\beta^{\dagger}$ denote the renormalized estimates when we use a particular transformation matrix $D$, i.e. $\hat\beta^{\dagger} = \hat\beta D$. $D$ has to be designed carefully, and we use whatever information is available about the long run system and its natural representation in order to construct it. First we assign prior values to r parameters in each column of a tentative parameter matrix $\tilde\beta$ on the basis of this information. The number of restricted parameters is $r^2$ (= 9), which leaves (p - r)r (= 3) free parameters to be calculated to obtain $\hat\beta^{\dagger}$. The $r^2$ elements in the transformation matrix $D$ can be obtained from the expression

$S\,\mathrm{vec}(\tilde\beta) = S (I_r \otimes \hat\beta)\,\mathrm{vec}(D)$

We have premultiplied by an $r^2$ x pr (= 9 x 12) matrix $S$ which picks out the $r^2$ restricted parameters in $\tilde\beta$. This expression is solved for vec(D):

(2.4)  $\mathrm{vec}(D) = [S (I_r \otimes \hat\beta)]^{-1} S\,\mathrm{vec}(\tilde\beta)$

Finally, we calculate the free parameters and obtain $\hat\beta^{\dagger} = \hat\beta D$.
In our case it is natural to select the a priori values in $\tilde\beta$ on the basis of the DGP given by (2.3). The preassigned values are represented by the 0's and 1's in the matrix below, leaving $\beta_{21}$, $\beta_{32}$, $\beta_{43}$ as free parameters. Note that this transformation is performed in order to facilitate the interpretation and that the procedure does not restrict the estimated long run matrix $-\hat\Pi(1)$ in any way.

Tentative long run parameters $\tilde\beta$ and renormalized estimator $\hat\beta^{\dagger}$:

$\tilde\beta = \begin{pmatrix} 1 & 0 & 0 \\ \cdot & 1 & 0 \\ 0 & \cdot & 1 \\ 0 & 0 & \cdot \end{pmatrix}, \qquad \hat\beta^{\dagger} = \begin{pmatrix} 1 & 0 & 0 \\ \hat\beta^{\dagger}_{21} & 1 & 0 \\ 0 & \hat\beta^{\dagger}_{32} & 1 \\ 0 & 0 & \hat\beta^{\dagger}_{43} \end{pmatrix}$

Appropriate selection matrix S (acting on the column-stacked vec of $\tilde\beta$):

$S = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}$
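The renormalization step can be sketched numerically as follows. The function `renormalize`, its dictionary argument and the mixing matrix `Q` are hypothetical names introduced only for this illustration; the code uses the identity vec(βD) = (I_r ⊗ β) vec(D) with column-stacked vec, and rescales α by the inverse transpose of D so that the product αβ' is left unchanged.

```python
import numpy as np


def renormalize(beta_hat, alpha_hat, restrictions):
    """Solve for D as in (2.4); return (beta_hat @ D, alpha_hat @ inv(D).T, D).

    `restrictions` maps 0-based (row, column) positions of beta to their
    preassigned values; exactly r*r positions are expected.
    """
    p, r = beta_hat.shape
    positions = list(restrictions)
    if len(positions) != r * r:
        raise ValueError("need exactly r*r restricted positions")
    # Selection matrix S: one row per restricted element of vec(beta),
    # where vec stacks the columns of beta.
    S = np.zeros((r * r, p * r))
    for k, (row, col) in enumerate(positions):
        S[k, col * p + row] = 1.0
    target = np.array([restrictions[pos] for pos in positions])
    kron = np.kron(np.eye(r), beta_hat)       # (I_r kron beta) vec(D) = vec(beta D)
    vec_d = np.linalg.solve(S @ kron, target)
    D = vec_d.reshape((r, r), order="F")      # undo the column stacking
    return beta_hat @ D, alpha_hat @ np.linalg.inv(D).T, D


# Long run structure of (2.3); alpha_33 = -0.25 is an assumed value.
BETA = np.array([[1.0, 0.0, 0.0], [-1.0, 1.0, 0.0],
                 [0.0, -1.0, 1.0], [0.0, 0.0, -1.0]])
ALPHA = np.array([[-0.4, 0.2, 0.0], [0.2, -0.2, 0.0],
                  [0.0, 0.0, -0.25], [0.0, 0.0, 0.0]])

# Hide the structure behind an arbitrary non-singular mixing matrix Q.
Q = np.array([[2.0, 1.0, 0.0], [0.5, 1.0, 1.0], [1.0, 0.0, 3.0]])
beta_raw = BETA @ Q
alpha_raw = ALPHA @ np.linalg.inv(Q).T        # keeps alpha_raw @ beta_raw.T fixed

restrictions = {(0, 0): 1.0, (2, 0): 0.0, (3, 0): 0.0,
                (0, 1): 0.0, (1, 1): 1.0, (3, 1): 0.0,
                (0, 2): 0.0, (1, 2): 0.0, (2, 2): 1.0}
beta_new, alpha_new, D = renormalize(beta_raw, alpha_raw, restrictions)
print(beta_new.round(3))
print(alpha_new.round(3))
```

In this noise-free example the three free elements come out as -1, which is the pattern of the DGP; with estimated eigenvectors the same call returns the renormalized estimates.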
2.5 The Trace test for the cointegrating rank
Let $z_0 = \Delta x_t$, $z_1 = \Delta x_{t-1}$ and $z_2 = x_{t-2}$. The sample moments between the z's are denoted $M_{ij}$ for i, j = 0, 1, 2. Now, denote by $R_{0t}$ and $R_{kt}$ the residuals from two sets of auxiliary regressions, i.e. those of $z_0$ and $z_2$ on $z_1$ respectively, and let $S_{ij} = T^{-1} \sum_{t=1}^{T} R_{it} R_{jt}'$, for i, j = 0, k, denote their sample covariances. To determine the rank of the long run parameter matrix $\Pi(1)$, we solve for the eigenvalues in the following equation:

(2.5)  $|\lambda S_{kk} - S_{k0} S_{00}^{-1} S_{0k}| = 0$

We test the null hypothesis $H_0: r \le \bar r$ using two different test statistics, which are constructed in the following way. We first sort the elements in $\hat\lambda$ in descending order. If there are $\bar r$ cointegrating equations, we would expect the $\bar r$ largest eigenvalues to be greater than zero and to represent the $\bar r$ linearly independent columns in $\Pi(1)$ which determine its rank. The last $(p - \bar r)$ eigenvalues are expected to be zero. The Trace test for cointegration is based on the sum of these $(p - \bar r)$ smallest $\hat\lambda_i$'s, which are zero under $H_0$; hence we test $H_0: r \le \bar r$ against $H_1: \bar r + 1 \le r \le p$.

(2.6)  $\mathrm{Trace} = -2 \ln Q = -T \sum_{i=\bar r + 1}^{p} \ln(1 - \hat\lambda_i)$

The alternative test statistic is called λ-Max and tests $H_0: r \le \bar r$, but now against the alternative $H_1: r = \bar r + 1$. In this case the test is based on the largest of the $(p - \bar r)$ $\hat\lambda_i$'s which are zero under $H_0$.

(2.7)  $\lambda\text{-Max} = -2 \ln Q = -T \ln(1 - \hat\lambda_{\bar r + 1})$

2.6 The asymptotic distribution of the Trace test
Johansen(1988) derived the asymptotic distribution of the Trace test statistic under the null hypothesis above, and it can be shown that the limiting distribution is a function of an m-dimensional Brownian motion B ($m = p - \bar r$). Apart from m, the limiting distribution of Trace is independent of other "nuisance parameters", cf. theorem 4.1 in Johansen(1989). More precisely, it can be shown that

(2.8)  $\mathrm{Trace} \simeq T \sum_{i=\bar r + 1}^{p} \hat\lambda_i \xrightarrow{w} \sum_{i=1}^{m} \rho_i = \mathrm{tr}\left\{ \int dB\,B' \left[ \int BB'\,du \right]^{-1} \int B\,dB' \right\}$

where B is the m-dimensional Brownian motion, $\rho_i$ for i = 1, ..., m are the eigenvalues associated with the stochastic matrix above and all integrals are defined on the unit interval. Critical values have been simulated for the case with no linear trend in Johansen(1988). Johansen and Juselius(1989) have extended the simulations to situations where there may be a deterministic trend in the DGP, and they show how the asymptotic distribution of Trace changes in this case⁶.
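The limit functional in (2.8) is straightforward to approximate by replacing the Brownian motion with a scaled random walk. The sketch below only illustrates that idea for the case without deterministic terms; the function name `trace_limit_draws`, the number of steps and the number of replications are arbitrary choices, not the settings behind any published table.

```python
import numpy as np


def trace_limit_draws(m, n_steps=400, n_rep=2000, seed=0):
    """Draws from a discretized version of the limit functional in (2.8)."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_rep)
    for rep in range(n_rep):
        shocks = rng.standard_normal((n_steps, m))
        walk = np.cumsum(shocks, axis=0)
        # Value of the walk just before each increment (starts at zero).
        lagged = np.vstack([np.zeros((1, m)), walk[:-1]])
        b_db = lagged.T @ shocks       # approximates T   * int B dB'
        b_b = lagged.T @ lagged        # approximates T^2 * int BB' du
        # The powers of T cancel in tr{(int dB B') [int BB' du]^{-1} (int B dB')}.
        draws[rep] = np.trace(b_db.T @ np.linalg.solve(b_b, b_db))
    return draws


for m in (1, 2, 3):
    stats = trace_limit_draws(m)
    print(f"m = {m}: simulated 95% quantile {np.quantile(stats, 0.95):.2f}")
```

The simulated quantiles can be set against tabulated critical values for the corresponding m; they depend on m only, which is the nuisance-parameter-free property noted above.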
2.7 The estimators for $\alpha$, $\Gamma_1$ and $\beta$
When the variables are cointegrated, we can write the residuals $\epsilon_t$ in (2.2) as $\epsilon_t = R_{0t} - \alpha[\beta' R_{kt}]$. In the hypothetical case when $\beta$ is known, $\alpha$ can be estimated by OLS since the model is linear in $\alpha$. In practice, $\beta$ is of course unknown and has to be estimated. If we have a ML estimator for $\beta$, say $\hat\beta$, we can still estimate $\alpha$ by regressing $R_{0t}$ on $\hat\beta' R_{kt}$ and obtain

(2.9)  $\hat\alpha(\hat\beta) = S_{0k} \hat\beta (\hat\beta' S_{kk} \hat\beta)^{-1}$

A similar estimator for $\Gamma_1$ is obtained from the expression

(2.10)  $\hat\Gamma_1(\hat\beta) = M_{01} M_{11}^{-1} - \hat\alpha\hat\beta' M_{k1} M_{11}^{-1}$

where $\hat\alpha\hat\beta' = -\hat\Pi(1)$. The estimators are consistent and converge to their true values at the normal rate $T^{1/2}$, cf. Johansen(1989,1992) for details.
Johansen(1988) showed that maximum likelihood estimators for the long run parameters, $\beta$, can be found in terms of the estimated eigenvectors associated with the $\bar r$ greatest eigenvalues in $\hat\lambda$. We set $\hat\beta = (\hat v_1, \ldots, \hat v_{\bar r})$ where $\hat v$ is the corresponding matrix of eigenvectors. It can be shown that these estimators are superconsistent (in the sense of Stock(1987)) and converge at the rate T, cf. Johansen(1989,1992) for details.
⁶ Extended tables which cover higher dimensions of the VAR have been simulated by Osterwald-Lenum(1990).
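The eigenvalue problem (2.5), the two rank statistics and the estimators for $\beta$ and $\alpha$ can be put together in a short NumPy routine. This is a minimal sketch under stated assumptions, not the program used in the paper: it is written for a VAR(2) without a constant term, it takes the levels term with a positive sign as in (2.2) (so the OLS step for $\alpha$ carries no leading minus; with the levels term written with a negative sign the estimate would change sign), it assumes $\alpha_{33} = -0.25$ in the simulated data, and the function names are hypothetical.

```python
import numpy as np


def simulate_dgp(n_obs, seed):
    """Prototype DGP (2.3) with alpha_33 = -0.25 assumed; 100 burn-in draws."""
    gamma1 = np.zeros((4, 4))
    gamma1[:2, :2] = [[0.1, 0.4], [0.3, 0.1]]
    alpha = np.array([[-0.4, 0.2, 0.0], [0.2, -0.2, 0.0],
                      [0.0, 0.0, -0.25], [0.0, 0.0, 0.0]])
    beta = np.array([[1.0, 0.0, 0.0], [-1.0, 1.0, 0.0],
                     [0.0, -1.0, 1.0], [0.0, 0.0, -1.0]])
    rng = np.random.default_rng(seed)
    x = np.zeros((n_obs + 102, 4))
    for t in range(2, len(x)):
        x[t] = (x[t - 1] + gamma1 @ (x[t - 1] - x[t - 2])
                + alpha @ beta.T @ x[t - 2] + rng.standard_normal(4))
    return x[-n_obs:], alpha, beta


def johansen_var2(x, rank):
    """Reduced rank regression for a VAR(2) in levels, no deterministic terms."""
    dx = np.diff(x, axis=0)
    z0, z1, zk = dx[1:], dx[:-1], x[:-2]      # dx_t, dx_{t-1}, x_{t-2}
    n = z0.shape[0]
    # Residuals from the auxiliary regressions of z0 and zk on z1.
    r0 = z0 - z1 @ np.linalg.lstsq(z1, z0, rcond=None)[0]
    rk = zk - z1 @ np.linalg.lstsq(z1, zk, rcond=None)[0]
    s00, s0k, skk = r0.T @ r0 / n, r0.T @ rk / n, rk.T @ rk / n
    # Solve |lambda*Skk - Sk0 S00^{-1} S0k| = 0 via a Cholesky transform of Skk.
    chol_inv = np.linalg.inv(np.linalg.cholesky(skk))
    sym = chol_inv @ s0k.T @ np.linalg.solve(s00, s0k) @ chol_inv.T
    lam, vec = np.linalg.eigh(sym)
    order = np.argsort(lam)[::-1]             # descending eigenvalues
    lam, v = lam[order], chol_inv.T @ vec[:, order]
    # trace[i] and lmax[i] both test H0: r <= i.
    trace = np.array([-n * np.log(1.0 - lam[i:]).sum() for i in range(lam.size)])
    lmax = -n * np.log(1.0 - lam)
    beta_hat = v[:, :rank]
    alpha_hat = s0k @ beta_hat @ np.linalg.inv(beta_hat.T @ skk @ beta_hat)
    return lam, trace, lmax, alpha_hat, beta_hat


x, alpha, beta = simulate_dgp(400, seed=1)
lam, trace, lmax, alpha_hat, beta_hat = johansen_var2(x, rank=3)
print("eigenvalues:", lam.round(3))
print("Trace      :", trace.round(2))
print("lambda-Max :", lmax.round(2))
print("alpha*beta':\n", (alpha_hat @ beta_hat.T).round(2))
```

The individual columns of `beta_hat` and `alpha_hat` are identified only up to a non-singular transformation, so the product `alpha_hat @ beta_hat.T` is the natural quantity to compare with the DGP.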
2.8 A case study: Estimation results based on simulated data
In this part we will briefly discuss some experiences from the estimation of (2.2) based on a simulated sample with T = 400 observations. A correctly specified model is estimated "recursively" using the abovementioned procedures and we report the tests for cointegration and parameter estimates. The initial sample size is 25 observations and subsequent observations are added until the entire sample is used, which yields a time sequence of 376 estimates of test statistics, parameters and so on.
Recursive plots of the four eigenvalues are shown in figure 2.4. As we would expect, one of the eigenvalues turns out to be close to zero, while the other three approach non-zero limiting values. The convergence towards stable values is reasonably fast, but we note that all eigenvalues seem to be overstated at very small sample sizes (in particular for fewer than 40 observations). The scaled Trace and λ-Max tests in figure 2.5 (scaled by the appropriate 5% fractiles reported in Osterwald-Lenum(1990)) indicate that r = 3, which is correct since the prototype model (2.3) is constructed such that the true rank $\bar r = 3$. Interestingly, there is a striking similarity between the two sequences of rank tests reported in figure 2.5. In order to understand this similarity, and consequently the advocated use of the cointegration tests $H_0: r \le i$ for i = 0, 1, ..., p - 1, it is useful to apply some results from Johansen(1991) which are inspired by the work of Pantula(1989).
Let $C_\eta(\epsilon)$ denote the ($\epsilon$) critical region of the test sequence $\{T_0 > c_0(\epsilon), \ldots, T_\eta > c_\eta(\epsilon)\}$, which can be constructed for $\eta = 0, 1, \ldots, p - 1$, where $T_i$ denotes the test statistic used to test $H_0: r \le i$ and $c_i(\epsilon)$ the $\epsilon$-quantile in the corresponding limiting distribution. It can be shown that $\lim \mathrm{Prob}(r \in C_\eta) = 1$ if the true rank $\bar r$ is not contained in the range defined by $\eta$ (i.e. $\eta < \bar r$), such that the test will reject a false null hypothesis with certainty (at least asymptotically). Similarly, we will have that $\lim \mathrm{Prob}(r \in C_\eta) = \epsilon$ when $\eta = \bar r$ and $\lim \mathrm{Prob}(r \in C_\eta) < \epsilon$ when $\eta > \bar r$. For further details about the advocated sequence of tests, cf. Johansen(1991). In simple terms this procedure may be formulated as follows. Start with $\eta = 0$, then $\eta = 1$, and so on, and continue until the first non-rejection. In figure 2.5 this corresponds to simply counting the number of curves above 1. The first non-rejection occurred for
Figure 2.4: Estimated eigenvalues from simulated data ($\hat\lambda_1, \ldots, \hat\lambda_4$). Recursive plots for observations 25 to 400. Correct VAR(2) specification.
$\eta = 3$ in our case, so we conclude that r = 3.
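The sequential rule is simple enough to state as code. In the sketch below the function name `select_rank` is hypothetical and the statistics and critical values are made-up illustrative numbers, not results from the paper or from any table.

```python
def select_rank(statistics, critical_values):
    """Return the first eta for which H0: r <= eta is not rejected.

    statistics[eta] is the test statistic for H0: r <= eta and
    critical_values[eta] the matching critical value.
    """
    for eta, (stat, crit) in enumerate(zip(statistics, critical_values)):
        if stat <= crit:
            return eta
    return len(statistics)          # every H0 rejected: full rank p


# Illustrative (invented) numbers for a four-dimensional system.
stats = [310.0, 150.0, 60.0, 3.1]
crits = [50.0, 32.0, 20.0, 12.5]

rank = select_rank(stats, crits)
scaled = [s / c for s, c in zip(stats, crits)]
print("selected rank:", rank)
print("scaled statistics:", [round(v, 2) for v in scaled])
```

Dividing each statistic by its critical value, as in the scaled plots, turns the rule into counting how many leading scaled values exceed 1.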
The estimated values $\hat\beta$ and $\hat\alpha$ show a more problematic pattern. Figure 2.6 shows recursive plots of the estimated parameters for the four elements in the third column (i.e. $\hat\beta_{j3}$ and $\hat\alpha_{j3}$, j = 1, ..., 4, respectively). The estimates are unstable and seem to jump around as we increase the sample size. This picture is however turned around when we renormalize the parameters by the procedure suggested above, and we obtain new and remarkably stable estimates given by $\hat\beta^{\dagger}_{j3}$ and $\hat\alpha^{\dagger}_{j3}$, cf. figure 2.7.
It is easily verified that the renormalized parameters lie close to their true values in the DGP (2.3), and the former instability is a pure artifact which (in this case) is easily removed. More evidence about these properties will be discussed later in this paper, in connection with the Monte Carlo experiments.
Figure 2.5: Scaled Trace tests (left) and λ-Max tests (right) for cointegration, scaled with 5 % critical values. Recursive plots for observations 25 to 400. Correct VAR(2) specification.
Figure 2.6: Estimates of the parameters $\hat\beta_{j3}$ and $\hat\alpha_{j3}$, $\forall j$ (eigenvectors): unnormalized beta-coefficients (left) and unnormalized alpha-coefficients (right). Recursive plots for observations 25 to 400. Correct VAR(2) specification.
Figure 2.7: Renormalized estimates $\hat\beta^{\dagger}_{j3}$ and $\hat\alpha^{\dagger}_{j3}$, $\forall j$: renormalized beta-coefficients (left) and renormalized alpha-coefficients (right). The plot annotations mark $\beta_{33} = 1$ (normalisation), $\beta_{43}$ (true value -1) and a true value of -0.25 among the alpha-coefficients. Recursive plots for observations 25 to 400. Correct VAR(2) specification.
3 Monte Carlo results for a correctly specified model
The small cointegrated system (2.3) is simulated n times and we have analysed the distribution of the Trace and λ-Max statistics. The objectives have been to learn more about the small sample behaviour of these tests, and to see how the different estimators perform in finite samples. To simplify the presentation, we have focused on the distribution of three particular estimators (for one element in each of the three matrices $\beta$, $\alpha$ and $\Gamma_1$), namely $\hat\beta_{21}$, $\hat\alpha_{11}$ and $\hat\gamma_{11}$. The corresponding "true" values in the DGP (2.3) are $\beta_{21} = -1$, $\alpha_{11} = -0.4$ and $\gamma_{11} = 0.1$.
The number of replications n varies between the experiments, between ca. 1500 and 5000, and we have presented the results as simple non-parametric estimates of the Monte Carlo distributions⁷.
For the Trace and λ-Max test statistics we have also reported the rejection frequencies, based on the 5 % fractiles in the limiting distribution reported in Osterwald-Lenum(1990).
⁷ A standard kernel estimator is used with a Gaussian kernel, and the densities are estimated from the expression $\hat f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h_n}\right)$ where $K(y) = \frac{1}{\sqrt{2\pi}} \exp(-y^2/2)$. Cf. Hendry(1989) or Silverman(1986) for details.
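The kernel estimator in the footnote translates directly into NumPy. The sketch below is illustrative only: the function name `kernel_density` is hypothetical, the default bandwidth is the normal reference rule $1.06\,\hat\sigma\,n^{-1/5}$ (an assumption, since the bandwidth $h_n$ actually used is not stated), and the chi-square draws merely stand in for a set of simulated test statistics.

```python
import numpy as np


def kernel_density(sample, grid, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated on `grid`."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = sample.size
    if bandwidth is None:
        # Normal reference rule; an assumed default, not taken from the paper.
        bandwidth = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)
    u = (grid[:, None] - sample[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (n * bandwidth)


rng = np.random.default_rng(0)
draws = rng.chisquare(df=3, size=2000)    # stand-in for Monte Carlo statistics
grid = np.linspace(-5.0, 40.0, 901)
density = kernel_density(draws, grid)
step = grid[1] - grid[0]
print("integral of the estimated density:", round(float(density.sum() * step), 4))
```

A rejection frequency at a given critical value is obtained separately as the share of draws above that value; the kernel estimate is only used to display the shape of the distribution.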
Table 3.1: Rejection frequencies. Correct VAR(2) specification, T = 400, n = 5000 (per cent)

                             Trace    λ-Max
Pr[Reject H0: r <= 3]         5.02     5.02
Pr[Reject H0: r <= 2]          100      100
Pr[Reject H0: r <= 1]          100      100
Pr[Reject H0: r = 0]           100      100
3.1 Baseline results for the prototype model
The first Monte Carlo results for the Trace and A-max tests and the estimators are based on a correct model specification (i.e. a VAR(2) model with Niid residu- als) and we approximate the large sample properties using T = 400 observations.
The relevant asymptotics for appropriately normalized estimators of the long run parameters, ,;j, has been derived in Johansen(1989). The limiting distribution is non-standard and can be expressed as a function of Brownian Motions. The distri- bution of the short run parameters å;; and % can be shown to be asymptotically normal. For details on the limiting distributions, see Johansen(1989).
The Trace and
AMax
testsThe Trace test rejected the hypothesis Ho : r < 3Ir = 3 (i.e. that r is less than or equal to the true rank 3) in 5.02 % of the replications for a sample size T = 400, which indicates a correctly sized tests. The Trace test has substantial power according to these results (at least when evaluated at this particular parameter point and we note e.g. that the hypothesis Ho : r < 2Ir = 3 was rejected in 100 % of the cases.
Rejection frequencies are reported in table 3.1. The estimated distribution of Trace is shown in figure 3.1. The 5% critical value is 12.54, cf. Osterwald-Lenum(1990), and is marked with a vertical line in the figure.
In figure 3.2 the estimated distribution of the λ-max test statistic shows similar results as for Trace. The 5% critical value is 11.44.
⁸ The size evaluated at T = 400 observations corresponds to the number of observations at which the critical values were originally simulated. In simulations of the asymptotic distribution of Trace for T = 1000 using n = 10000 replications, we have obtained ca. 1 percent point larger critical values at the 5 % level.
[Figure 3.1 plot: Trace test for cointegration, H0: r <= 2 | true r = 3 (power); the 5% critical value (12.54) is marked with a vertical line.]
Figure 3.1: Estimated density function for the Trace test statistic. Correct VAR(2) specification, T = 400, n = 5000
[Figure 3.2 plot: λ-max test for cointegration; the 5% critical value (11.44) is marked.]
Figure 3.2: Estimated density function for the λ-max test statistic. Correct VAR(2) specification, T = 400, n = 5000
[Figure 3.3 plot: estimated density with the true value (-1) marked.]
Figure 3.3: Estimated density functions for $\hat\beta_{21}$. Correct VAR(2) specification, T = 400, n = 5000.
The estimators
The Monte Carlo results for the estimators seem to support the asymptotic results in Johansen(1989,1992). Figure 3.3 shows the estimated distribution of the estimator of the free long run coefficient in the first cointegrating equation, $\hat\beta_{21}$.
The estimator is median unbiased, symmetrically distributed around its true value -1, and seems to be highly concentrated around this value. The asymptotic results in Johansen(1989,1992) (to which we refer the reader for technical details and proofs) tell us to expect the limiting distributions of the estimators of the long run and the short run parameters to be very different. The essential difference can be interpreted in terms of the super consistency property, which holds for the estimators of the long run parameters, $\beta$. The Monte Carlo distributions for the short run parameter estimators $\hat\alpha_{11}$ and $\hat\gamma_{11}$ are shown in figure 3.4. Both seem to be median unbiased and symmetrically distributed around the true values, but both distributions are less concentrated than the one we observed for the long run estimator. The predicted difference between the two sets of parameters in this model, in terms of conducting inference, seems to be confirmed by the Monte Carlo simulations reported here.
The parameter restriction y = 0 has been imposed during the estimation of this model. We have also estimated models including a constant term, in which case the Monte Carlo distribution for $\hat\beta_{21}$ came out with heavier tails, indicating a loss of efficiency.
[Figure 3.4 plot: two panels, Alpha(1,1) and Gamma(1,1), with the true values (-0.40 and 0.10) marked.]
Figure 3.4: Estimated density functions for $\hat\alpha_{11}$ and $\hat\gamma_{11}$. Correct VAR(2) specification, T = 400, n = 5000.
3.2 Power properties in the case of "near cointegration"
The parameters in (2.3), for simplicity abbreviated $\theta_0$, belong to a finite dimensional parameter space $\Theta$ ($\theta_0 \in \Theta$). We will now demonstrate how the distributions of the cointegration tests are affected when we change $\theta_0$ in certain directions.⁹ The particular changes we consider here have already been introduced in part 2.
We compare five DGPs where the values in $\alpha_{j2}$ differ and gradually approach zero.
Some of the consequences of this were briefly discussed above, and we saw that in the limit (when $\alpha_{j2} = 0$) the cointegrating rank was reduced from three to two. For very small values in a particular column of $\alpha$, the feed-back from the corresponding cointegrating relationship to the rest of the system will be weak, and we denote this as "near cointegration". In (2.3) there are originally three cointegrating vectors, but when $\alpha_{j2} \to 0$ only two will be left, and the difference $x_{2t} - x_{3t}$ will become I(1) non-stationary in the limit. This property is evident from the bottom right trajectory in figure 2.2, where $x_{2t}$ and $x_{3t}$ seem to be driven by separate stochastic trends.
⁹ Ideally, we would of course prefer to analyse a more complete power surface for the cointegration tests, but this would have blown up the computational costs considerably. We have therefore only considered a few points in the parameter space. All calculations are based on GAUSS-386 version 2.1.
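The rank reduction can be illustrated numerically. The sketch below uses hypothetical loadings and cointegrating vectors for a four-variable system (they are not the parameter values of (2.3)), scales the second column of the loading matrix by c, and reports the rank and the third singular value of the long run matrix, i.e. the product of the loadings and the transposed cointegrating vectors.

```python
import numpy as np

# Hypothetical loadings (alpha) and cointegrating vectors (beta);
# these are NOT the parameter values used in (2.3).
alpha = np.array([[-0.4,  0.1,  0.0],
                  [ 0.1, -0.3,  0.1],
                  [ 0.0,  0.2, -0.2],
                  [ 0.1,  0.0,  0.1]])
beta = np.array([[ 1.0,  0.0,  0.0],
                 [-1.0,  1.0,  0.0],
                 [ 0.0, -1.0,  1.0],
                 [ 0.0,  0.0, -1.0]])

def long_run_matrix(c):
    """Return alpha_c @ beta', where the second column of alpha is scaled by c."""
    scaled = alpha.copy()
    scaled[:, 1] *= c
    return scaled @ beta.T

for c in [4, 2, 1, 1 / np.sqrt(250), 1 / 250, 0]:
    pi = long_run_matrix(c)
    third_sv = np.linalg.svd(pi, compute_uv=False)[2]
    print(f"c = {c:.4f}  rank = {np.linalg.matrix_rank(pi)}  third singular value = {third_sv:.5f}")
```

For every c > 0 the matrix keeps rank three, but its third singular value shrinks towards zero with c, which is one way to picture why the third cointegrating relationship becomes hard to detect in finite samples.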
The Trace test
The power of the cointegration test can be defined by the probability $\Pr[\text{Reject } H_0: r \le 2 \mid r = 3]$. When $\alpha_{j2} \to 0$, this probability will approach the nominal size of a different test, namely $\Pr[\text{Reject } H_0: r \le 2 \mid r = 2]$, since the limiting true rank will be reduced by 1. It follows that the numerical value of $\alpha_{j2}$, or equivalently the particular parameter point $\theta_0 \in \Theta$ at which we simulate the model, may have considerable effects on the power of the Trace test. Hence, the ability to determine r (and to conduct correct inference about the long run parameters in the model) will depend on the true parameters in the DGP. Johansen(1989) has shown that in the case of "near cointegration" (e.g. with asymptotically vanishing values of, say, $\alpha_{j2}$), the limiting distributions of the Trace and λ-max tests for cointegration have to be modified such that terms involving Brownian motions (cf. (2.8)) are replaced by similar terms from an Ornstein-Uhlenbeck type stochastic process. Some simulation results are reported in Johansen(1989) for the Trace test in the case when r is reduced by 1 asymptotically.
A different approach is used here. Instead of simulating the limiting distribution for Trace under the Ornstein-Uhlenbeck assumptions, we have investigated the power properties on the basis of simulations of the small cointegrated system (2.3).
Five experiments are compared where we gradually reduce the weights in $\alpha_{j2}$ towards zero. The results lend strong support to the power results in Johansen(1989), cf. figure 3.5 and table 3.2. It becomes increasingly difficult to detect cointegrating vectors when they have a sufficiently small weight in the error correction representation. Similar results hold for both the Trace and the λ-max test for cointegration.
[Figure 3.5 plot: Trace test densities for c = 4, c = 2, c = 1, c = 1/sqrt(250) and c = 1/250; the 5% critical value (12.54) is marked.]
Figure 3.5: Estimated density function for the Trace test statistic for $\alpha_{j2} \times c$, $c \in [4, 2, 1, 1/\sqrt{250}, 1/250]$, j = 1, ..., 4. Correct VAR(2) specification, T = 250, n = 2000

Table 3.2 Size and power of the Trace test for cointegration.
Different parameterizations $\alpha_{j2} \times c$, $c \in [4, 2, 1, 1/\sqrt{250}, 1/250]$, j = 1, ..., 4. T = 250, n = 2000, (per cent).

                Size                     Power                    Power
                Pr[Reject H0 : r ≤ 3]    Pr[Reject H0 : r ≤ 2]    Pr[Reject H0 : r ≤ 1]
c = 4             5.90                   100.00                   100.00
c = 2             6.00                   100.00                   100.00
c = 1             5.90                    99.15                   100.00
c = 1/√250        1.60                     7.05                   100.00
c = 1/250         1.05                     5.05                   100.00
[Figure 3.6 plot: estimated densities with the true value (-1) marked.]
Figure 3.6: Estimated density functions for $\hat\beta_{21}$ for $\alpha_{j2} \times c$, $c \in [4, 2, 1, 1/\sqrt{250}, 1/250]$, j = 1, ..., 4. Correct VAR(2) specification, T = 400, n = 2000.
The Estimators
We saw above that a reduction in $\alpha_{j2}$ caused gradual shifts to the left in the distribution of Trace. In contrast, there seems to be almost no effect on the distribution of the estimator $\hat\beta_{21}$ from changing $\alpha_{j2}$ in this way, cf. figure 3.6. Hence, the inference about long run parameters in the model appears to be robust with respect to the size of $\alpha_{j2}$, provided that we conduct correct inference about r and use the relevant information to renormalize the system. On the other hand, we see from figure 3.7 that the effects on the distributions of $\hat\alpha_{11}$ and $\hat\gamma_{11}$ are significant. When we decrease $\alpha_{j2}$ the shape of the distributions of the short run estimators changes, and the variance seems to increase substantially (although the estimators seem to remain median unbiased).
[Figure 3.7 plot: two panels with the true values (-0.40 and 0.10) marked; curves labelled c = 4, c = 2, c = 1, and c = 1/sqrt(250) or c = 1/250.]
Figure 3.7: Estimated density functions for $\hat\alpha_{11}$ and $\hat\gamma_{11}$ for $\alpha_{j2} \times c$, $c \in [4, 2, 1, 1/\sqrt{250}, 1/250]$, j = 1, ..., 4. Correct VAR(2) specification, T = 400, n = 2000.
3.3 Finite sample performance
In applied work, the available data consist of a finite (and often relatively small) number of observations. In many cases we will try to determine the cointegrating rank on the basis of e.g. 50 or 100 observations using annual or quarterly time series.
It would therefore be of great advantage to know the small sample properties of the different tests and estimators. We have conducted a number of experiments where we gradually increase the sample size by 50 or 100 observations and compare results for $T \in [50, 100, 150, 200, 250]$ and $T \in [100, 200, 300, 400, 500]$. Figures from the latter set of experiments are shown below.
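The design of such a sample size experiment can be sketched as follows. This is a simplified illustration and not the paper's setup: it assumes a hypothetical bivariate system with one cointegrating vector, a VAR(1) in levels (so no lagged differences) without deterministic terms, and only 100 replications per sample size. The Trace statistic is computed from the eigenvalues of the usual reduced rank regression problem; all function and variable names are made up for the example.

```python
import numpy as np

# Hypothetical bivariate system with one cointegrating vector (true rank 1).
ALPHA = np.array([[-0.4], [0.1]])
BETA = np.array([[1.0], [-1.0]])

def simulate_levels(T, rng, burn=50):
    """Simulate dx_t = alpha beta' x_{t-1} + e_t with NIID(0, I) errors."""
    pi = ALPHA @ BETA.T
    x = np.zeros((T + burn + 1, 2))
    eps = rng.standard_normal((T + burn + 1, 2))
    for t in range(1, T + burn + 1):
        x[t] = x[t - 1] + pi @ x[t - 1] + eps[t]
    return x[burn:]  # T + 1 levels, giving T first differences

def trace_statistics(x):
    """Trace statistics for H0: r <= 0, 1, ..., p-1 (no lags, no constant)."""
    dx, lag = np.diff(x, axis=0), x[:-1]
    T = dx.shape[0]
    s00, s11, s01 = dx.T @ dx / T, lag.T @ lag / T, dx.T @ lag / T
    # Eigenvalues of S11^{-1} S10 S00^{-1} S01 (squared canonical correlations)
    m = np.linalg.solve(s11, s01.T) @ np.linalg.solve(s00, s01)
    lam = np.sort(np.linalg.eigvals(m).real)[::-1]
    lam = np.clip(lam, 0.0, 1.0 - 1e-12)
    return np.array([-T * np.log(1.0 - lam[r:]).sum() for r in range(lam.size)])

rng = np.random.default_rng(1991)
n_rep = 100  # far fewer replications than in the paper, to keep the sketch fast
results = {}
for T in [100, 200, 300, 400, 500]:
    stats = np.array([trace_statistics(simulate_levels(T, rng)) for _ in range(n_rep)])
    results[T] = stats.mean(axis=0)  # mean Trace for H0: r = 0 and H0: r <= 1
    print(T, results[T])
```

In this toy system the mean Trace statistic for the false hypothesis r = 0 grows roughly in proportion to T, while the statistic for the true hypothesis r ≤ 1 stays bounded, which is the pattern one would expect from a consistent test sequence.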
The Trace and λ-max tests
Figure 3.8 shows the Monte Carlo distributions for Trace and λ-max for the sequence of hypotheses described in part 2.8. It is evident that the distributions of Trace and λ-max go off to infinity with T in the cases when the true rank (F = 3) is not contained in the range 77 of r-values satisfying the null hypothesis. We see that the distributions shift to the right when we increase the sample size. This reflects the consistency property of the test sequence in Johansen(1991), i.e. that lim Prob(r E CO) = 1 for 77 < F. Similarly, note that for 71 = F = 3, the distribution