The sensitivity of structural equation modeling with ordinal data to underlying non-normality and observed distributional forms

(1)

This file was downloaded from BI Open, the institutional repository (open access) at BI Norwegian Business School http://brage.bibsys.no/bi.

It contains the accepted and peer reviewed manuscript to the article cited below. It may contain minor differences from the journal's pdf version.

Foldnes N, Grønneberg S. The sensitivity of structural equation modeling with ordinal data to underlying non-normality and observed distributional forms. Psychol Methods. 2021 Apr 1.

doi: 10.1037/met0000385. Epub ahead of print. PMID: 33793270.

©American Psychological Association, [2021]. This paper is not the copy of record and may not exactly replicate the authoritative document published in the APA journal. Please do not copy or cite without author's permission. The final article is

available, upon publication, at:

doi: 10.1037/met0000385.

Copyright policy of American Psychological Association, the publisher of this journal:

“Authors of articles published in APA journals — the authoritative document, i.e., peer reviewed publication of record — may post a prepublication copy of the final manuscript, as accepted for publication as a word processing file, on their personal website, their employer's server, in their institution's repository, reference managers (e.g., Mendeley) and author social

networks (e.g., Academia.edu and ResearchGate) after it is accepted for publication.”

http://www.apa.org/pubs/authors/posting.aspx

(2)

The sensitivity of structural equation modeling with ordinal data to underlying non-normality and observed distributional forms

Njål Foldnes and Ste ff en Grønneberg

BI Norwegian Business School

Structural equation modeling (SEM) of ordinal data is often performed using normal theory maximum likelihood estimation based on the Pearson correlation (cont-ML) or using least squares principles based on the polychoric correlation matrix (cat-LS). While cont-ML ignores the categorical nature of the data, cat-LS assumes underlying multivariate normality. The- oretical results are provided on the validity of treating ordinal data as continuous when the number of categories increases, leading to an adjustment to cont-ML (cont-ML-adj). Previ- ous simulation studies have concluded that cat-LS outperforms cont-ML, and that it is quite robust to violations of underlying normality. However, this conclusion was based on a data simulation methodology equivalent to discretizing exactly normal data. The present study employs a new simulation method for ordinal data to re-investigate whether ordinal SEM is robust to underlying non-normality. In contrast to previous studies, we include a large set of ordinal distributions, and our results indicate that ordinal SEM estimation and inference is highly sensitive to the interaction between underlying non-normality and the ordinal observed distributions. Our results show that cont-ML-adj consistently outperforms cont-ML, and that cat-LS is less biased than cont-ML-adj. The sensitivity of cat-LS to violation of underlying normality necessitates the need for a test of underlying normality. A bootstrap test is found to reliably detect underlying non-normality.

Keywords:Ordinal data, latent variable models, underlying normality, simulation

Variables measured on an ordered categorical scale abound in the psychological and educational sciences. Such ordinal variables often stem from Likert-type scales, and are frequently used as indicators for latent variables that are re- lated according to substantive theory in a structural equation model (SEM). In the following we use the term ordinal SEM to refer to SEM applied to ordinal data. Initially, SEM estimators were developed under the assumption of multivariate normal continuous data (Jöreskog,1967), with later de- velopments for non-normal continuous data (Browne,1984;

Satorra & Bentler,1988). In principle, one should not apply estimators based on continuous data to ordinal indicators, since data at the ordinal level of measurement do not support calculation of means and covariances. However, many researchers still model data with ordinal indicators as continuous (e.g., Brennan et al., 2017; Gaspard et al., 2018;

Marsh, Abduljabbar, et al.,2013; Marsh, Vallerand, et al., 2013;Neff, Whittaker, & Karl,2017), using the normal theory maximum likelihood (ML) estimator, with robustified standard errors and model fit statistics. We refer to this con-

Correspondence concerning this article should be addressed to Njål Foldnes, BI Norwegian Business School. E-mail:

[email protected]

tinuous methodology ascont-ML. One reason for using cont- ML with ordinal data is the relative ease of handling missing data and large complex models.

A popular approach of modeling the ordinal nature of the data is to add a discretization process based on thresholds to the standard SEM. This approach was first proposed by Christoffersson(1977) for binary data, and subsequently developed to handle observed variables measured with more than two ordered categories (Jöreskog,1994;Muthén,1984).

First, for any pair of ordinal variables, the so-called polychoric correlation is estimated. This is the correlation between two underlying unobserved variables that are posited to have created the observed categorical data through discretization. Since the time ofPearson(1900) it has been a standard assumption that these underlying variables follow a normal distribution. In the second stage, the model parameters are estimated by fitting the model to the matrix of polychoric correlations using least-square principles, and we refer to this methodology ascat-LS. In cat-LS formulas for standard errors and model test statistics (Muthén,1993) are similar to those proposed for non-normal data in the continuous case (Satorra & Bentler, 1988)¹. Simulation stud-

1In Mplus (Muthén & Muthén, 2012) cont-ML and cat-LS are available as MLR and WLSMV estimation, respectively. In lavaan (Rosseel, 2012) the cont-ML is obtained with “estima-

(3)

ies (e.g., Beauducel & Herzberg,2006; Li, 2016b; Rhem- tulla, Brosseau-Liard, & Savalei,2012) indicate that cat-LS in general outperforms cont-ML. However, the discrepancy between these two methodologies has been reported to de- crease with five or more categories that are symmetrically distributed (e.g.,Rhemtulla et al.,2012).

Both cont-ML and cat-LS are frequently used to estimate SEM models with ordinal data, and their properties have been extensively studied in many simulation studies.

The consequences of underlying non-normality on estimation bias have been thoroughly investigated (e.g., Flora &

Curran, 2004; Li, 2016a; Quiroga, 1994; Rhemtulla et al., 2012), resulting in a general consensus that ordinal SEM is quite robust to moderate violation of underlying normality. However, these studies were conducted using simulation methodology which is equivalent to simulating exactly normal data (Grønneberg & Foldnes,2019). In the present study, we reassess the robustness issue using a newly developed simulation methodology (Foldnes & Grønneberg, 2019) which ensures proper violation of underlying normality. We also propose and evaluate an alternative to cont-ML which takes advantage of the discretization model employed by cat-LS. We refer to this new method ascont-ML-adj. In- ference for cont-ML-adj utilizes the same robustification formulas for standard errors and model fit statistics as cont-ML and cat-LS.

Our research questions are as follows. First, how sensitive are cont-ML, cont-ML-adj, and cat-LS to variation in the observed ordinal distributions? That is, we will study the stability, or lack thereof, of correlation and ordinal SEM estimates under a large number of discretizations of the same underlying continuous distribution. Previous studies (e.g., Beauducel & Herzberg,2006;Li,2016a,2016b;Muthén &

Kaplan, 1985; Rhemtulla et al., 2012) have included only a small set of threshold configurations to generate ordinal data, typically consisting of a symmetrical distribution and one or two asymmetrical observed ordinal distributions. The focus in these studies has been to compare the performance of different estimators, but using only a handful of threshold configurations. In the present study we utilize 150 threshold configurations in each design condition to assess the extent to which cont-ML, cont-ML-adj, and cat-LS estimation depend on the specific marginal ordinal distribution. To the best of our knowledge, this is a novel perspective leading to insights into the sensitivity of the three estimators to the distributional shapes of the ordinal univariate marginals.

The second research question is: How sensitive are cont- ML, cont-ML-adj, and cat-LS to departures from normality in the underlying dataset? In the case where the observed ordinal data are produced by the discretization of an underlying continuous vector that does not have the multivariate normal distribution, the normal theory polychoric estimator will typically be biased (Foldnes & Grønneberg,2019b). That is,

the true correlation matrix among the underlying continuous variables will differ from the estimated polychoric correlations, even at the population level. This could have a pro- found impact on the second stage of cat-LS, with the SEM model estimated from a biased polychoric correlation matrix. Hence, in the case of underlying non-normality, a well- fitting SEM may be estimated with substantial bias in model parameters and fit indices, which invalidates inference. We apply newly developed simulation methodology for ordinal data in the context of a realistic SEM model to investigate the extent to which underlying non-normality produces bias in parameter estimation, confidence interval (CI) coverage rates, and in model fit assessment.

Our third research question relates to the proposed cont- ML-adj estimator. What are its theoretical properties, in relation to both cont-ML and cat-LS? And how does it empiri- cally perform in the setting of a realistic SEM model?

Our fourth and final question is: How well can we detect underlying non-normality in an ordinal dataset, in the context of a medium-sized SEM model? In conditions where underlying non-normality is detrimental to SEM inference, we aim at detecting the underlying non-normality using a statistical test. A bootstrap test for this purpose has recently been proposed (Foldnes & Grønneberg,2019b), and found to outperform the test ofMaydeu-Olivares(2006) in a limited simulation study with small- and medium-sized factor models. In the present article we evaluate the performance of this bootstrap test in a medium-sized SEM setting.

The remainder of the article is structured as follows: We first provide a concise description of cont-ML, cont-ML-adj and cat-LS, and then describe a new simulation method for ordinal data that we employ in the present study. Next, a literature review provides the context of our research questions. This is followed by a discussion of the complex- ity of interpreting ordinal SEM simulation studies, which provides justification for employing the new simulation approach. Theoretical results on the consistency of our three estimators as the number of categories increase are provided.

We report results from three empirical studies on the performance of cont-ML, cont-ML-adj and cat-LS in the context of a medium-sized SEM. We investigate how the shape of ordinal data distributions and underlying non-normality af- fect parameter bias, confidence interval coverage and model fit testing. The performance of the bootstrap test for underlying non-normality is also evaluated. We conclude with a discussion of the results with recommendations for applied researchers.

Estimation methods

There are two popular limited-information approaches to ordinal SEM. The first approach is cont-ML, which calcu- tor=MLM/MLMV” and cat-LS by specifying the "ordered" option.

(4)

lates the correlation or covariance matrix directly on the ordinal observations, which is then provided as input to the standard normal theory maximum likelihood estimator. The second approach is cat-LS, where we first estimate the correlation matrix of an underlying continuous random vector X^∗and then uses least squares estimation methods to fit the proposed SEM to the estimated correlation matrix (Muthén, 1984). This approach is implemented in current software packages such asEQS (Bentler,2006),Mplus (Muthén &

Muthén, 2012), LISREL (Jöreskog & Sörbom, 2015), and lavaan (Rosseel,2012), and is frequently employed by researchers.

cat-LS is based on the following discretization model. We havenindependent and identically distributed observations of a random ordinal vectorX=(X1,X2, . . . ,Xp)⁰. We assume that each Xk may take on discrete values xk,1, . . . ,xk,K for some numberK, and that the data stem from discretization of an unobserved continuous p-dimensional random vector X^∗=(X₁^∗,X₂^∗, . . . ,X^∗_p)⁰as follows:

Xk=











xk,1, if X^∗_k≤τk,1

xk,2, if τk,1<X^∗_k≤τk,2

...

xk,K, if τ_k,K−1<X_k^∗

(1)

for k = 1,2, . . . ,p. We refer to theτ as thresholds which fulfillsτk,j−1 ≤τk,jand defineτk,0 =F⁻¹_k (0), τk,K =F⁻¹_k (1), whereFk(x) = P(X_k^∗ ≤ x) is thek-th marginal distribution ofX^∗. For all distributions with infinite support, such as the normal distribution, we haveτk,0 =−∞, τk,K =∞. To iden- tify the thresholds, the marginals of X^∗ have to be known.

Usual practice is to assume thatX^∗has standardized normal marginals:X_k^∗∼N(0,1) fork=1,2, . . . ,p.

The SEM specified by the researcher relates to the continuous X^∗ vector. That is, model estimation requires the correlation matrix of X^∗. This matrix is not identified, and the traditional assumption since Pearson (1900) has therefore been to assume that X^∗ is multivariate normal. Under this assumption the correlation matrix ofX^∗may be consistently estimated by the polychoric correlation matrix, which is a limited information maximum likelihood estimator (Ols- son,1979a). The polychoric correlation matrix then forms the basis for estimating the parameters of the SEM, typically using diagonally weighted least squares estimation (DWLS).

This means that the discrepancy between model-implied and estimated polychoric correlations is minimized, where the weights are obtained from the diagonal of the asymptotic covariance matrix of the estimated polychoric correlations. Ro- bust standard errors and goodness-of-fit testing are also obtained based on the asymptotic covariance matrix (seeMon- roe(2018) for further details).

We next propose a third limited-information approach to ordinal SEM, with complete technical descriptions given in a later section. Usually, the values of the coordinates ofX

are taken as a sequence of consecutive integers 1,2, . . . ,K, which is not in general the scale used by X^∗. This is the approach of cont-ML. We propose to adjust cont-ML as follows. Instead of integer values, we choose the values of X in such a way that the covariance matrix of X approxi- mates that of X^∗. To estimate the parameters of the SEM, the usual normal theory ML procedure is then applied to the appropriately encoded observations. The resulting estimator is called cont-ML-adj, and it borrows from cat-LS the discretization model in Equation (1) in order to adjust the cont- ML correlations. While polychoric correlations assume that X^∗ is normal, cont-ML-adj requires only that the marginal distributions of X^∗ are known, i.e., we know the functions P(X_k^∗ ≤ x) for k = 1, . . . ,p. The marginals do not have to be normal. Since the marginals are known, we will show later in the article that the thresholds (τk,j) can be estimated from data by estimators, say, (ˆτk,j). Consider the first co- ordinate X1. From Equation (1),X1 takes the value of x1,1

when X₁^∗ ≤ τ_1,1. We therefore choose x1,1 somewhere in the interval<−∞, τ_1,1]. In general, we adjust xk,j to a new value ˆxk,j which is placed in the interval [τ_k,j−1, τ_k,j]. One way to achieve this is to use ˆx_k,j = m_k(ˆτ_k,j−1,τˆ_k,j) where mk(x,y) = E

X_k^∗

x≤X^∗_k≤y

is the conditional expectation ofX_k^∗when the value ofX^∗_kis compatible with the observed value ofXin the original encoding. WhenX^∗_kis assumed to be standard normal so thatFk = Φwith densityφ= Φ⁰, we have thatmk(x,y)=[φ(x)−φ(y)]/[Φ(y)−Φ(x)], so that

ˆ

x_k,_j=[φ(ˆτ_k,j−1)−φ(ˆτ_k,j)]/[Φ(ˆτ_k,_j)−Φ(ˆτ_k,j−1)].

Figure1shows the probability plot for the original and adjusted values, for a K = 4 distribution with floor effects.

While the original values are equidistant, the adjusted values are unevenly spaced. The original values range fromX =1 to X = 4, with probabilities 0.54,0.36,0.07,and 0.03, resulting in adjusted values ˆX = −0.74,0.60,1.49,and 2.21, respectively.

0.0 0.2 0.4

−1 0 1 2 3 4

x

prob

Original integer coded Adjusted values

Figure 1. Cont-ML-adj: Adjustment of original valuesX = 1, . . . ,4.

The followingR-code implements the adjustment on the ordinal dataframeordinal.df:

(5)

adjust <- function(x){

taus<-c(-Inf,qnorm(cumsum(prop.table(table(x)) ,→ )))

newvalues <- apply(embed(taus, 2)[, 2:1], 1, function(x) (dnorm(x[2])-dnorm(x[1]))/(pnorm(x

,→ [1])-pnorm(x[2])))

plyr::mapvalues(x, sort(unique(x)), newvalues) ,→ }

adjusted.df <- sapply(ordinal.df, adjust)

A new simulation technique for non-normal ordinal covariance models

The work of Grønneberg and Foldnes (2019) provided a major impetus for the present study in pointing out that, surprisingly, a large portion of the literature on robustness of ordinal SEM against underlying non-normality has to be re-interpreted. These studies generated ordinal data by discretizing a non-normal continuous vector and the results showed that the effect of non-normality on ordinal SEM was moderate or minor. However, careful examination of the properties of the non-normal continuous vector reveals that the ordinal data produced from it are indistinguishable from ordinal data produced by discretizing a fully normal continuous vector. In brief, thresholds, and marginal distributions are not jointly identified, and the effect of marginals and threshold choice is confounded. These observations have complex implications for the interpretation of simulation studies for ordinal data methods, which we discuss in a separate and upcoming section, see p.7, as well as in the online supplementary material (p.1).

As a response to the shortcoming of the existing simulation methodology, Foldnes and Grønneberg(2019) proposed a new simulation technique for ordinal data in the context of covariance structure analysis, where the continuous marginals are fixed to the standard normal distribution. That is, multivariate ordinal data for covariance structure models are generated by randomly drawing observations from the vectorX^∗with standard normal marginals, and the ordinal data are obtained through discretizing these observations with user-specified thresholds. Importantly, since we are dealing with factor or structural equation models, the vector X^∗must have a user-specified correlation matrix. There are many possible non-normal multivariate distributions for X^∗ where the univariate marginals are standard normal, and with a given correlation matrix. Foldnes and Grønneberg(2019) proposed to investigate robustness to non-normality by vary- ing the type and degree of non-normality embedded inX^∗. Since the marginals are always standard normal, keeping the thresholds fixed is equivalent to keeping the marginal distribution of the observed ordinal variables fixed. That is, we may vary the distribution ofX^∗from the multivariate normal case to different types of non-normal distributions, without

changing either the univariate observed ordinal variables or the correlation matrix ofX^∗. This approach therefore allows us to disentangle effects of the observed ordinal distribution (level of symmetry, ceiling effects, etc) from effects of the underlying degree and type of non-normality.

To construct non-normal distributions X^∗ with standard normal marginals and a given correlation matrix, Foldnes and Grønneberg(2019) suggested to use the VITA (VIne-To- Anything) method (Grønneberg & Foldnes,2017). We here informally present VITA. A more technical introduction to copulas, regular vines and VITA is available in the online supplementary material (p. 6). VITA constructs a multivariate non-normal distribution, with given marginal distributions and correlation matrix, called a regular vine (Bed- ford & Cooke, 2002). Regular vines is a flexible class of distributions, constructed by assembling bivariate copulas in a hierarchy of tree structures. There are many copulas to choose from, and many possible tree structures. We empha- size that the VITA vectors used in this study all have standard normal marginals, and all match the same fixed target population correlation matrix. To illustrate bivariate copulas, which serve as basic building blocks for VITA distributions, we paired three different bivariate copulas with standard normal marginals and plotted the contours in Figure2. All three distributions have correlationρ=0.56 and standard normal marginals. However, the distributions are constructed from different copulas. In Figure 2(a) the copula is the normal copula, which when combined with normal marginals yields the bivariate normal distribution. In Figure 2(b) the copula belongs to the class of Clayton copulas, and the figure presents contours for the distribution obtained by combining this copula with standard normal marginals. Similarly, Fig- ure2(c)depicts a Joe-type copula paired with standard normal marginals. The two latter distributions differ from the bivariate normal distribution by allowing tail dependencies.

As can be seen in Figure2, relative to the normal copula, the Clayton copula has stronger lower tail dependency, while the Joe copula has stronger upper tail dependency. Also, to illustrate the discretization process, in Figure2 we have drawn lines representing fixed thresholds ofτ1,1 =−1, τ1,2 =0 for X^∗₁, and τ2,1 = −1, τ2,2 = 1 for X^∗₂. The ordinal variables X1andX2 will then have three levels: 1, 2 and 3, for example. The marginal distributions ofX1andX2will be identical across the three distributions in Figure2. However, the bivariate distributions of the ordinal vector (X1,X2)⁰will not be identical. For instance, the lower left corner of each figure corresponds to the eventX1=1∩X2 =1, and inspection of the contour lines suggests that this event is most likely in the Clayton case in Figure2(b), and least likely in the Joe case in Figure2(c). In fact, we may calculate these probabilities using the copula package (Hofert, Kojadinovic, Maechler, &

Yan,2013) for R (R Core Team,2020). The probabilities for X1 = 1∩X2 = 1 are 0.068, 0.095 and 0.046 for the

(6)

distributions in Figures2(a),2(b)and2(c), respectively. The normality-based maximum likelihood for the polychoric correlation is maximized atρ=0.56 only in the normal case in Figure2(a). For the Clayton distribution in Figure2(b)the value that maximizes the likelihood is 0.613, while for the Joe distribution in Figure2(c)the likelihood is maximized at 0.496. This simple illustration suggests that the polychoric estimator is sensitive to the underlying distribution. For the distributions in Figures2(b)and2(c), the true correlation is 0.56, but the polychoric estimates are biased, with a relative bias of+9.5% and−11.4%, respectively.

The full VITA distribution is constructed by specifying bivariate copulas for every pair of variables, where the construction order is dictated by the sequence of trees. Each sequence of trees results in a different distribution, as does each choice of bivariate copulas for the variable pairs, lend- ing the VITA class great flexibility. VITA construction in- volves numerically solving for a bivariate copula parameter for each pair of variables, rendering VITA calibration more time consuming than other, less flexible methods (e.g., Foldnes & Olsson,2016;Vale & Maurelli,1983). VITA calibration is implemented in theRpackagecovsim(Foldnes &

Grønneberg,2020). After VITA calibration, fast simulation is available using thervinecopulibpackage (Nagler & Vatter, 2019).

Literature review and the present study

In previous simulation studies only a small number of threshold configurations have been used, i.e., the simulated ordinal data exhibited only a small number of distributional forms. For instance,Rhemtulla et al.(2012) andLi(2016b) employed only three specific threshold configurations (symmetrical, moderately skewed, and severely skewed). That is, for each generated continuous dataset only three marginal distributions of the ordinal datasets were produced. In the present article, our first empirical study reverses this perspective, by generating a large number of ordinal datasets from a limited number of underlying continuous large-sample datasets. We generate a dataset from each of three underlying distributions (multivariate normal and two non-normal distributions) and discretize it using several different thresholds, chosen from a large set of threshold configurations.

Then, variation in parameter bias may be studied as a function of variation in threshold sets. Also, in previous studies, with the exception ofRhemtulla et al.(2012), all ordinal variables in the observed ordinal vector shared the same distribution. In other words, ordinal data were generated by applying the same set of thresholds to each continuous variable. Such distributions are rarely encountered in a real- world dataset, where one variable is typically distributed dif- ferently than another variable. In the present study no two ordinal variables have the exact same distribution, and thereby better reflecting a real-world dataset. Foldnes and Grøn-

x1

x2

−3 −2 −1 0 1 2 3

−3−2−10123

(a) Normal copula

x1

x2

−3 −2 −1 0 1 2 3

−3−2−10123

(b) Clayton copula

x1

x2

−3 −2 −1 0 1 2 3

−3−2−10123

(c) Joe copula

Figure 2. Three bivariate distributions with correlation 0.56 and standard normal marginals.

(7)

neberg(2019b) demonstrated, under bivariate non-normality, that the specific choice of thresholds, i.e., of observed ordinal distributions, substantively impacts the bias of the estimated polychoric correlation. In the present study we extend this line of research from the bivariate case of a polychoric correlation to the multivariate case of model parameters, CI coverage, and model fit assessment in a medium-sized SEM.

We expect that the variation in bias observed byFoldnes and Grønneberg(2019b) for the polychoric case will propagate into ordinal SEM estimation.

Many simulation studies (e.g., Flora & Curran, 2004;

Jin, Luo, & Yang-Wallentin,2016;Li,2016a;Moshagen &

Musch,2014;Natesan,2015;Nestler,2013;Quiroga,1994;

Rhemtulla et al.,2012;Yang & Green,2015) have investigated the robustness of ordinal SEM to non-normality. The general consensus formed from these studies is that “estimation of polychoric correlations is robust to modest violations of underlying normality” (Flora & Curran,2004). For instance,Liu et al.(2017) refer to these studies and suggest that cat-LS factor loading bias may tend to be trivial when the underlying distribution is non-normal (p.501). Recent research (Grønneberg & Foldnes,2019) identified a problem in the way data were generated in these studies, that weakens and severely complicates claims of robustness of the polychoric estimator. These studies relied on the popular data generation technique ofVale and Maurelli(1983), which is readily available in software packages. Grønneberg and Foldnes(2019) pointed out that, surprisingly, discretizing a non-normal vector obtained using the approach of Vale and Maurelli (VM) is in most cases equivalent to discretizing a normal vector.

This is the result of a fundamental lack of identifiability of the underlying continuous vectorX^∗, combined with the fact that the VM approach in most cases results in data that have a normal copula (Foldnes & Grønneberg,2015). A careful and detailed discussion of this surprising equivalence is given in the next section.

There are few studies focusing on polychoric correlation estimation bias based on underlying bivariate distributions that are not VM transforms.Monroe(2018) used the bivari- atet-distribution, a skew-normal distribution, and a normal mixture and found substantial bias, especially in the two latter distributions. Jin and Yang-Wallentin(2017) used skew- normal and skewedt- distributions and found moderate bias.

Likewise, to our knowledge, there is a scarcity of studies on the effect of underlying non-normality on ordinal factor analysis/structural equation modeling (CFA/SEM) based on underlying continuous distributions other than the VM trans- form. Monroe (2018) employed outliers and the inclusion of a quadratic term to generate multivariate non-normality for a two-factor analytical model using 10 and 20 variables, and reported substantial bias in estimated factor loadings and test statistics when data were generated by using a quadratic term. Maydeu-Olivares(2006) presented a small simulation

study using a CFA with 12 variables, and a fixed symmetrical ordinal distribution with 3 levels, where the underlying distribution followed the multivariatet-distribution. Bias in parameter estimates and standard errors was low, with me- dian absolute relative bias at 1% and 4%, respectively, for a sample size ofn = 1000. Given the scarcity of studies not using the VM method that investigate how ordinal CFA/SEM estimation is affected by underlying non-normality, and the conflicting results among such studies, we deem it important to further investigate this issue.

A majority of simulation studies have employed confirma- tory factor models (Hoogland & Boomsma,1998), and it has therefore been recommended to extend such studies to full SEM models (Beauducel & Herzberg,2006;Flora & Curran, 2004;Rhemtulla et al.,2012). Hence, in the present study we adopt the full SEM model used byLi (2016b) to assess the performance of various estimators for ordinal SEM under the assumption of underlying normality. This model was constructed from a review of empirical studies using structural equation modeling to be representative from a practical standpoint.

In non-normal conditions where cont-ML, cont-ML-adj, and cat-LS estimation may perform poorly due to estimation bias, there is a need for a test for underlying normality. Be- fore conducting a SEM with ordinal data, a researcher could use such a test to assess whether there is evidence of discretized non-normality in the data. Should the test indicate that the ordinal dataset is unlikely to stem from discretizing a normal vector, cat-LS must be used with caution. Unfortu- nately, due to the lack of a statistical test for underlying non- normality in popular software packages, the current practice in empirical studies utilizing ordinal SEM is to take the normality ofX^∗for granted. A true multivariate test for underlying normality was proposed byMaydeu-Olivares(2006), but this test has been largely ignored in the literature and in software packages. Foldnes and Grønneberg(2019b) conducted the first empirical investigation of this test, and also proposed a new parametric bootstrap test. The bootstrap test operates by estimating the polychoric correlation and the thresholds from the original sample. Then, many continuous samples are drawn from a multivariate normal vector whose correlation matrix equals the polychoric correlation matrix. Each continuous sample is further discretized using the estimated thresholds to produce an ordinal data sample. Then, the test statistic ofMaydeu-Olivares(2006) is calculated in each ordinal sample. The proportion of times the bootstrapped test statistics exceed the test statistic from the original sample serves as the p-value of the bootstrap test for underlying normality. The test ofMaydeu-Olivares(2006) and the bootstrap test were found to maintain Type I error well for dimensionalities less than ten, but only the bootstrap test maintained an adequate Type I error control for larger dimensionalities.

The bootstrap test has hitherto been studied only byFoldnes

(8)

and Grønneberg(2019b) in a CFA context with at most 15 variables. An aim in the present study was therefore to evaluate the bootstrap test in the context of a medium-sized SEM with 20 variables. It is hoped that such a test may reliably detect conditions where underlying non-normality and the observed ordinal distributional forms render ordinal SEM inference severely biased.

The complications of interpreting ordinal SEM simulation studies

The problem of identification of aspects of the underlying distribution of X^∗ based on observations from the ordinal- categorical X has recently received attention (Almeida &

Mouchart,2014;Foldnes & Grønneberg,2019,2019b;Grøn- neberg & Foldnes,2019). Here, we summarize some of the main conclusions fromGrønneberg and Foldnes(2019) and Foldnes and Grønneberg (2019), which show that utmost caution must be taken when interpreting simulation studies using the VM simulation method in the context of ordinal variables. We split our discussion into two subsections. First, we outline how the effect of the marginal distribution ofX^∗ and the threshold placement are confounded in simulation designs. Second, we discuss the interpretation of simulation results for cat-LS where the simulation method is to discretize a VM vector. We review the arguments ofGrønneberg and Foldnes(2019) which note that in most circumstances, simulatingXby discretizing a VM vectorX^∗is numerically identical to simulatingXfrom an exactly multivariate normal X^∗. The only difference is that the thresholds are changed.

Since the ordinal vector X is a censored version of the underlying vector X^∗, there will in general be many other vectors, say ˜X^∗, which when discretized produce exactly the same vector X, or at least a vector with the same distribution. Therefore, one cannot useXto pin-point the exact distribution ofX^∗. To simulate by discretization ofX^∗ intoX therefore also implies that we simulate by discretization of X˜^∗intoX. This complicates the interpretation of all simulation studies where ordinal data stem from discretizing continuous variables. The concept of partial identification analysis is briefly introduced as a theoretical solution to this problem in the online supplementary material (p.1).

Marginal distributions and thresholds are confounded

The marginal distributions ofX^∗and the thresholds (τ_k,_j) cannot be separately identified from data. That is, when only observingX, we cannot simultaneously estimate the thresholds and marginal distributions if they are both unknown.

Since the values ofX^∗ are not used in generatingXexcept through the thresholds, the marginal scale ofX^∗is not identified when observing onlyX. Let Fk(x) = P(X_k^∗ ≤ x) be thek-th marginal distribution of X^∗. We assume thatFkis

invertible. Then

P(Xk≤xk,j)=P(X_k^∗≤τk,j)=Fk(τk,j) (2) SinceP(X_k≤xk,j) is a feature of the distribution ofX, it can be deduced from data. However, onlyFk(τ_k,j) is identified, and the marginalF_kand the thresholdsτ_k,jare not separately identified. Once the marginal distributions ofX^∗ are specified, we can use this to solve Equation (2) forτ_k,j, producing the equation

τ_k,j=F_k⁻¹

P(Xk≤xk,j)

. (3)

This equation shows that in an empirical setting the thresholds and the marginal distribution are inseparable, unless we know one or the other. Let us further consider the interaction between thresholds and marginal distributions and how this affects simulation when X^∗ is non-normal. From Equation (1), we have the equivalence

Xk=xk,j ⇐⇒ τ_k,j−1 <X^∗_k≤τ_k,j. (4) However, as we will see in examples in the upcoming sub- section, there are many random vectors ˜X^∗which generateX via Equation (1) using a modified set of thresholds. Indeed, for any sequenceH1, . . . ,Hp of strictly increasing transformations, we have from Equation (4) that

Xk=xk,j ⇐⇒ Hk(τk,j−1)<Hk(X_k^∗)≤Hk(τk,j).

Let ˜τk,j=Hk(τk,j−1) and ˜X_k^∗=Hk(X^∗_k). Then this equation is Xk=xk,j ⇐⇒ τ˜k,j−1 <X˜^∗_k≤τ˜k,j. (5) The distribution of ˜X^∗ =( ˜X1, . . . ,X˜k)⁰will be different than the distribution ofX^∗. Mathematically, the marginals of ˜X^∗ have changed compared to X^∗, but the rest of the distribution, what is formerly known as the copula, has stayed the same. By an appropriate choice of H_k functions, one can show that the marginal distributions ofX^∗can be transformed into whatever distribution one would like. For example, it is always possible to choose Hk in such a way that ˜X^∗ is marginally standard normal (Foldnes & Grønneberg,2019, Proposition 1). Hence, in a simulated sample, there is no sin- gle and correct way to define the population thresholds and marginal distributions ofX^∗. We have seen that simulatingX through discretizing X^∗yields ordinal observations that are numerically equal to simulating X through discretizing ˜X^∗. By numerically equal, we mean that the simulated samples ofXfromX^∗and ˜X^∗are identical. The simulation ofXcan therefore be seen as having been generated simultaneously fromX^∗as well as ˜X^∗, though with different thresholds.

Let us now return to Equation (3). In our simulation design, outlined in the method section, we vary the marginal distribution ofXacross other factors, such as the type of non- normality ofX^∗. Since the marginal distributions ofX^∗and the threshold values jointly produce the marginal distribution

(9)

ofXthrough Equation (3), we decided to fix the marginal distributions ofX^∗to standard normal, and let the threshold values vary to produce different marginalXdistributions. Then the thresholds uniquely define the marginal distribution ofX, and vice versa, by

P(X_k≤x_k,j)= Φ(τ_k,j), τ_k,j= Φ⁻¹h

P(X_k≤x_k,j)i .

Fixing the marginal distribution ofX^∗ allows us to study the effect of the shape of the ordinal observed marginal distribution ofXon ordinal SEM. The choice of standard normal marginals forX^∗ is natural because this is the scale of the standard methodology: The covariance of a random vector depends on the marginal distributions, yet the marginal distribution ofX^∗is not identified based on observations fromX.

Therefore, consistent statistical estimators for Cov(X^∗) must make assumptions on what the marginal distribution of X^∗ is. The common assumption is that they are standard normal, hence dictating the scale at which the covariance matrix is to be estimated.

The surprising equivalence between ordinal VM simulation and ordinal normal simulation

Here, we consider the consequences of the non- identifiability of the distribution of X^∗ from the distribution of X in the case of the well-known VM simulation method. Briefly, VM consists of applying third order polynomial transformations to the coordinates of a random vec- torZwhich is multivariate normal with a covariance matrix ΣZ. In practice, the polynomial transformations are usually either strictly increasing or strictly decreasing. We will limit attention to the strictly increasing case, and refer the reader toGrønneberg and Foldnes(2019) for a full discussion. For eachk=1, . . . ,p, VM generates

X_k^∗=Hk(Zk), whereHKis strictly increasing.

While the functionH_kused in the original VM approach is a third degree polynomial, the upcoming argument rests only on the assumption thatHkis a strictly increasing transformation. The ordinal observations are thereafter generated following Equation (1):

Xk=xk,j, whenτk,j−1<X_k^∗≤τk,j (6) SinceX^∗_k=H_k(Z_k) we have thatτ_k,j−1<X^∗_k ≤τ_k,jis equivalent to

τ_k,j−1<Hk(Zk)≤τ_k,j. (7) We next apply the strictly increasing inverse of Hk to the above inequalities, and get

Xk=xk,j, when ˜τk,j−1<Zk≤τ˜k,j (8)

where ˜τk,j = H⁻¹_k (τk,j). This argument can be reversed by starting with discretizing a multivariate normal random vec- torZintoX, and applyingHkto the inequalities in Equation (8) and arrive at Equation (6).

We conclude that simulatingXby discretizing a VM generated X^∗ is equivalent to simulating X by discretizing an exactly multivariate normal vectorZ. The polynomial transformation that the VM method applies toZis essentially lost under discretization. This has consequences when applying statistical methods to samples fromXwhich assume thatX^∗ is multivariate normal. One such prominent method is the polychoric correlation of Pearson and Pearson (1922) and Olsson(1979a), which is also central to cat-LS. Above we observed that when X^∗ = Z is exactly multivariate normal, any strictly increasing transformation Hkcan be applied to the coordinates ofX^∗, and the resulting vector, say ˜X^∗, could have generated Xwhen discretized. The only change is the numerical values of the thresholds. When the transformations are strictly increasing third-order polynomials, the resulting vector ˜X^∗ has the same distribution as a VM simulated vector. Yet, the polychoric estimator will estimate the correlation matrix of X^∗ and not ˜X^∗ which here could have been generated from the VM method. Why is this? The answer lies in the consistency of the polychoric correlation whenX^∗is normal, which is the case for the fully normalX^∗ and not for the VM vector ˜X^∗. The polychoric correlation estimator will estimate the covariance matrix of the version of the underlying vector which is fully normal. ShouldXbe a discretization of a VM vector ˜X^∗, the polychoric correlation estimator will still implicitly work with the transformed vectorZ, because it is exactly normal. Therefore, the polychoric correlation will not estimate the correlation matrix of the simulated X^∗, but will instead estimate the correlation matrix of the underlying Z. A more mathematical expla- nation of this correspondence is given in Grønneberg and Foldnes(2019). Several numerical illustrations of this correspondence, where it is shown that the above observations may be used to exactly predict the results reported in simulation studies on the effect of non-normality on polychoric estimation (e.g.,Flora & Curran,2004), is given in the online appendix ofGrønneberg and Foldnes(2019). A particularly unfortunate feature of the VM method is that the covariance matrix ofZandX^∗are often very similar, therefore making it more difficult to detect this problem in numerical investi- gations.

We illustrate the above discussion by considering a typi- cal simulation condition used in ordinal SEM simulation articles.Flora and Curran(2004) included a condition with moderate skewness and moderate kurtosis, where skewness=1.25 and excess kurtosis=3.75 in all marginal distributions. For the K = 5 condition they adopted thresholds fromMuthén and Kaplan(1985): τ1 =−1.645, τ2 =−0.643, τ3 =0.643, andτ4=1.645. The VM method first determines the polyno-