
Decomposing differences in wage distributions

In document Essays on Human Capital Accumulation (pages 87-93)

In previous empirical work, differences in the distribution of skills have been shown to explain only a modest part of the differences in wage inequality between countries.

7 Low education is defined as having lower secondary education or less (ISCED 1, 2, 3C short, or less), intermediate education as upper secondary education (ISCED 3A-B or 3C long) or post-secondary non-tertiary education (ISCED 4A-B-C), and high education as any tertiary education (ISCED 5A, 5B or 6).

8 Hours worked are not recorded in the Australian data, and therefore the Australian sample includes both part-time and full-time workers.

distribution, and therefore changing the skill distribution in the US may affect some parts of the income distribution more than others. We therefore use a more general framework that allows us to study the impact of skills across the entire distribution of wages.

We apply a method introduced by Firpo, Fortin and Lemieux (2009) that makes it possible to perform detailed decompositions of the difference in any distributional statistic between two groups by estimating recentered influence function (RIF) regressions.9 In practice, this decomposition method is similar to an Oaxaca-Blinder decomposition, but the outcome variable in the regressions is replaced by the RIF for the statistic of interest. We decompose the Finland-US difference in wages at 19 percentiles to study the impact of skills across the entire wage distribution. We also present decomposition results for several summary measures of wage dispersion, such as the 90/10, 90/50 and 50/10 percentile ratios, the variance and the Gini coefficient.

2.4.1 Method

To understand the method, let $\nu(F_Y)$ denote a distributional statistic (for example, a quantile) of the cumulative distribution of wages $F_Y$. When performing a decomposition of differences in $\nu(F_Y)$ between two groups, we divide this difference into a composition effect, which is related to the difference in observed characteristics, $X$, and a wage effect, which is related to the difference in the conditional distribution of wages, $F(Y|X)$. In the case of the mean, the wage effect only depends on the conditional mean of wages, but when decomposing other distributional measures, it depends on the entire wage distribution, which makes estimation more challenging.

9 This method is also described in detail in Firpo, Fortin and Lemieux (2007) and Fortin, Firpo and Lemieux (2011). The goal of this section is only to provide a short summary of the method, and readers are advised to turn to the studies mentioned above for additional detail. The structure of this section follows Fortin, Firpo and Lemieux (2011), who provide an instructive description of the method.

Let $F_{Y_0}$ and $F_{Y_1}$ be the cumulative wage distributions observed in country 0 and 1, respectively, and let $G$ be a country indicator referring to the country where the worker characteristics are observed, so that $F_{Y_1|G=1}$ refers to the actual cumulative distribution of wages observed in country 1. Then we can decompose the overall difference in the distributional statistic $\nu(F_Y)$ between the two countries, so that

$$
\begin{aligned}
\Delta_O^{\nu} &= \nu(F_{Y_1|G=1}) - \nu(F_{Y_0|G=0}) \\
&= \left[\nu(F_{Y_1|G=1}) - \nu(F_{Y_0|G=1})\right] + \left[\nu(F_{Y_0|G=1}) - \nu(F_{Y_0|G=0})\right] \\
&= \Delta_W^{\nu} + \Delta_C^{\nu}
\end{aligned}
$$

where $\Delta_W^{\nu}$ is the wage effect and $\Delta_C^{\nu}$ is the composition effect. $\nu(F_{Y_1|G=1})$ and $\nu(F_{Y_0|G=0})$ refer to the distributional statistics calculated using the actual data for each country. The challenge lies in estimating $\nu(F_{Y_0|G=1})$, which is the distributional statistic of the wage distribution in a counterfactual country, where the characteristics are those observed in country 1 while the wage structure is that in country 0.

Over the years, many methods have been proposed to estimate the composition and wage effects, but they have generally not been successful in performing detailed decompositions (DiNardo, et al., 1996; Juhn, et al., 1993; Machado & Mata, 2005).

Recently, Firpo, Fortin and Lemieux (2009) introduced a method that further allows us to divide the total composition and wage effects into the contributions of specific covariates using recentered influence functions (RIF). The influence function (IF) of $\nu(F_Y)$ represents the influence of an individual observation on that statistic, and the recentering comes from adding the statistic $\nu(F_Y)$ to the influence function. This method is similar to the Oaxaca-Blinder decomposition of differences in the mean, but the outcome variable in the regression is replaced by the RIF of the statistic $\nu(F_Y)$. In other words, once the RIF is calculated, it is possible to run a regression of the RIF on the explanatory variables $X$ for both groups and perform an Oaxaca-Blinder decomposition, where the composition effect and the wage effect can be rewritten as

$$
\Delta_C^{\nu} = \left(E[X|G=1] - E[X|G=0]\right)^{\top} \gamma_0^{\nu}
$$

$$
\Delta_W^{\nu} = E[X|G=1]^{\top} \left(\gamma_1^{\nu} - \gamma_0^{\nu}\right)
$$

where $\gamma_0^{\nu}$ and $\gamma_1^{\nu}$ are the coefficients from RIF regressions on the explanatory variables in country 0 and 1, respectively. However, Firpo, Fortin and Lemieux (2007) point out that the decomposition above may not give consistent estimates of the composition and wage effects if the conditional expectation of the RIF is nonlinear in $X$. In other words, if the true relationship between wages and the observed characteristics $X$ is nonlinear but is approximated by a linear regression, then the regression coefficient on $X$ will change when the distribution of $X$ changes, even if the wage-setting mechanism is unchanged. This means that the price vectors $\gamma_0^{\nu}$ and $\gamma_1^{\nu}$ could differ simply because they are estimated over different distributions of $X$.
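The unweighted decomposition can be illustrated with a minimal sketch. It assumes a single covariate plus an intercept, and the data and the helper names `ols` and `oaxaca_blinder` are hypothetical, not taken from the source. The outcome values stand in for RIF values that have already been computed. Because the RIF averages to the statistic of interest and the regressions include an intercept, the two effects add up exactly to the difference in group means of the RIF.

```python
from statistics import fmean

def ols(x, y):
    """Intercept and slope from a one-covariate least-squares fit."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def oaxaca_blinder(x0, rif0, x1, rif1):
    """Split mean(rif1) - mean(rif0) into composition and wage effects.

    The difference in mean X is priced at country 0 coefficients; the
    wage effect collects the coefficient differences (including the
    intercept) evaluated at country 1 means.
    """
    a0, b0 = ols(x0, rif0)
    a1, b1 = ols(x1, rif1)
    xbar0, xbar1 = fmean(x0), fmean(x1)
    composition = (xbar1 - xbar0) * b0
    wage = (a1 - a0) + xbar1 * (b1 - b0)
    return composition, wage

# Hypothetical data: x is a skill score, rif_* are RIF values of log wages.
x0 = [1.0, 2.0, 3.0, 4.0, 5.0]
rif0 = [2.0, 2.9, 4.2, 5.1, 5.8]
x1 = [2.0, 3.0, 4.0, 5.0, 6.0]
rif1 = [2.5, 3.1, 3.4, 4.2, 4.6]

composition, wage = oaxaca_blinder(x0, rif0, x1, rif1)
total = fmean(rif1) - fmean(rif0)
print(composition, wage, total)
```

With several covariates the same logic applies coefficient by coefficient, which is what makes the detailed decomposition possible.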

As a solution to this problem, Firpo, Fortin and Lemieux (Firpo, et al., 2007; Fortin, et al., 2011) suggest combining the RIF regression method with a reweighting method introduced by DiNardo, Fortin and Lemieux (1996). By reweighting the distribution of $X$ in group 0 so that it is similar to that in group 1, we can construct a counterfactual country, which has the wage structure of country 0 but (approximately) the characteristics of country 1, which we use in the decomposition.10 The details of the reweighting method are described in DiNardo et al. (1996) and Fortin et al. (2011), but in short, the following reweighting function is estimated:

$$
\psi(X) = \frac{\Pr(X|G=1)}{\Pr(X|G=0)} = \frac{\Pr(G=1|X)\,/\,\Pr(G=1)}{\Pr(G=0|X)\,/\,\Pr(G=0)}
$$

The reweighting function is then used as weights to get the counterfactual mean of the covariates, $\bar{X}_{01}$, and the counterfactual regression coefficients, $\hat{\gamma}_{01}^{\nu}$. Now the difference $\gamma_1^{\nu} - \gamma_{01}^{\nu}$ reflects the true change in the wage structure, since it holds the distribution of characteristics unchanged.
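The reweighting step can be sketched as follows. The source estimates the conditional probabilities with a probit model. As a simplifying assumption, this sketch uses a single discrete covariate, so that cell frequencies can stand in for the fitted probabilities. The function name `reweighting_factors` and the data are hypothetical.

```python
from collections import Counter
from statistics import fmean

def reweighting_factors(x0, x1):
    """psi(x) = [Pr(G=1|x)/Pr(G=1)] / [Pr(G=0|x)/Pr(G=0)] for each x in x0.

    Conditional probabilities are taken from cell frequencies, which is
    an assumption that only works for a discrete covariate.
    """
    n0, n1 = len(x0), len(x1)
    p0, p1 = n0 / (n0 + n1), n1 / (n0 + n1)
    c0, c1 = Counter(x0), Counter(x1)
    psi = {}
    for x in c0:
        cell = c0[x] + c1[x]
        pr1_x = c1[x] / cell          # Pr(G=1 | X=x)
        pr0_x = c0[x] / cell          # Pr(G=0 | X=x)
        psi[x] = (pr1_x / p1) / (pr0_x / p0)
    return [psi[x] for x in x0]

# Hypothetical education levels (1 = low, 2 = intermediate, 3 = high).
x0 = [1, 1, 1, 2, 2, 3]   # country 0
x1 = [1, 2, 2, 3, 3, 3]   # country 1

psi = reweighting_factors(x0, x1)
# Counterfactual mean: country 0 observations weighted to resemble country 1.
xbar01 = sum(w * x for w, x in zip(psi, x0)) / sum(psi)
print(xbar01, fmean(x1))
```

In this stylized case the reweighted mean of country 0 matches the country 1 mean exactly. With a fitted probability model and continuous covariates the match is only approximate.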

10 In our case, we reweight the US to have the observable characteristics of Finland.

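effect $\hat{\Delta}_{C,d}^{\nu}$ and a specification error component $\hat{\Delta}_{C,se}^{\nu}$:

$$
\begin{aligned}
\hat{\Delta}_{C,rw}^{\nu} &= (\bar{X}_{01} - \bar{X}_{0})\,\hat{\gamma}_{0}^{\nu} + \bar{X}_{01}\,(\hat{\gamma}_{01}^{\nu} - \hat{\gamma}_{0}^{\nu}) \\
&= \hat{\Delta}_{C,d}^{\nu} + \hat{\Delta}_{C,se}^{\nu}
\end{aligned}
$$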

The specification error is related to the fact that the unweighted decomposition only provides a linear approximation of the composition effect $\Delta_C^{\nu}$. When the linear approximation of the composition effect is accurate, the specification error should be small. Therefore, calculating the specification error serves as a good specification test for the unweighted RIF decomposition.

The wage effect can similarly be expressed as the sum of the true wage effect and a reweighting error:

$$
\begin{aligned}
\hat{\Delta}_{W,rw}^{\nu} &= \bar{X}_{1}\,(\hat{\gamma}_{1}^{\nu} - \hat{\gamma}_{01}^{\nu}) + (\bar{X}_{1} - \bar{X}_{01})\,\hat{\gamma}_{01}^{\nu} \\
&= \hat{\Delta}_{W,d}^{\nu} + \hat{\Delta}_{W,re}^{\nu}
\end{aligned}
$$

where the reweighting error $\hat{\Delta}_{W,re}^{\nu}$ reflects the fact that the reweighted mean $\bar{X}_{01}$ is not exactly equal to $\bar{X}_{1}$. The reweighting error should approach zero when the reweighting works well.
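As a check on the algebra, the four components of the reweighted decomposition can be computed from the three sets of means and coefficients. The numbers below are made up for illustration, and the helper names `dot` and `sub` are hypothetical. The sketch prices the shift in covariate means at country 0 coefficients, which is the choice under which the four terms add up exactly to the overall difference.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def sub(a, b):
    """Element-wise difference a - b."""
    return [ai - bi for ai, bi in zip(a, b)]

# Hypothetical inputs, ordered as [intercept, skill]:
# covariate means for country 0, country 1, and reweighted country 0.
xbar0, xbar1, xbar01 = [1.0, 2.6], [1.0, 3.1], [1.0, 3.0]
# RIF-regression coefficients for the same three samples.
g0, g1, g01 = [1.2, 0.50], [1.6, 0.30], [1.1, 0.55]

composition_part = dot(sub(xbar01, xbar0), g0)   # shift in X at country 0 prices
specification_error = dot(xbar01, sub(g01, g0))  # zero if the linear model is exact
wage_part = dot(xbar1, sub(g1, g01))             # change in coefficients
reweighting_error = dot(sub(xbar1, xbar01), g01) # zero if reweighting is exact

total = dot(xbar1, g1) - dot(xbar0, g0)
print(composition_part, specification_error, wage_part, reweighting_error, total)
```

Large values of either error term relative to the corresponding effect would suggest revisiting the regression specification or the reweighting model.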

As Firpo, Fortin and Lemieux (2007) point out, identification of the composition and wage effects depends on two assumptions. The first assumption is ignorability, or unconfoundedness, which says that the distribution of unobserved factors that affect wage setting is the same in the two groups, conditional on $X$. This is a strong and ultimately untestable assumption, and it is easy to think of reasons why it could be violated. For example, the two countries differ in their degree of unionization. The second assumption is the overlapping support assumption, which says that there must be an overlap of the observable characteristics in the two groups. In other words, there must be no value of $X$ that is only observed in one group. This assumption is more easily testable.
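One simple way to inspect the overlapping support assumption, assuming the covariates are discrete, is to list the covariate cells that appear in only one group. This is an illustrative sketch with hypothetical data and a hypothetical function name, not the check used in the source. With continuous covariates one would more typically compare the distributions of the estimated group probabilities.

```python
def support_violations(x0, x1):
    """Covariate cells observed in exactly one of the two groups."""
    return sorted(set(x0) ^ set(x1))

# Hypothetical cells: (education level, gender).
x0 = [("low", "f"), ("high", "m"), ("mid", "f")]
x1 = [("low", "f"), ("high", "m"), ("high", "f")]

violations = support_violations(x0, x1)
print(violations)
```

An empty list would mean that every observed cell appears in both groups.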

group conditional on covariates $X$ for each observation in the sample using a probit model.11 Second, the predicted probabilities of belonging to group 0, $\widehat{\Pr}(G=0|X)$, and group 1, $\widehat{\Pr}(G=1|X)$, together with the sample shares of each group, $\widehat{\Pr}(G=0)$ and $\widehat{\Pr}(G=1)$, are used to calculate the reweighting function.12

Next, recentered influence functions (RIF) for the distributional statistics of interest (i.e., quantiles) are obtained non-parametrically as described in Firpo, Fortin and Lemieux (2009).13,14 Once the RIFs are obtained, two Oaxaca-Blinder decompositions are performed at each quantile by replacing the outcome variable (log hourly wages) with the RIF. To get the composition effect, we compare country 0 (US) to the counterfactual country (US with the characteristics of Finland). To get the wage effect, we compare country 1 (Finland) to the counterfactual country. In addition, we perform a decomposition comparing unweighted US and Finland to be able to calculate the specification error and the reweighting error. The specification error is defined as the difference between the “total unexplained” in the unweighted decomposition and the reweighted decomposition of the composition effect. The reweighting error is similarly defined as the difference between the “total explained” in the unweighted decomposition and the reweighted decomposition of the wage effect.
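For a quantile $q_\tau$, the RIF in Firpo, Fortin and Lemieux (2009) is $q_\tau + (\tau - 1\{y \le q_\tau\}) / f_Y(q_\tau)$, where $f_Y$ is the density of wages. The sketch below computes it on simulated log wages. The Gaussian kernel, the rule-of-thumb bandwidth, the simulated data and the function names are assumptions made for illustration, and sampling weights are ignored.

```python
import math
import random
from statistics import fmean, stdev

def kde(data, point, bandwidth):
    """Gaussian kernel density estimate of the data at a single point."""
    total = sum(math.exp(-0.5 * ((point - d) / bandwidth) ** 2) for d in data)
    return total / (len(data) * bandwidth * math.sqrt(2 * math.pi))

def rif_quantile(y, tau):
    """Return the RIF values for the tau-th quantile and the quantile itself."""
    ys = sorted(y)
    n = len(ys)
    q = ys[math.ceil(tau * n) - 1]        # sample quantile
    h = 1.06 * stdev(y) * n ** (-0.2)     # rule-of-thumb bandwidth (assumption)
    f = kde(y, q, h)                      # estimated density at the quantile
    # Observations at or below q get q + (tau - 1) / f, the rest q + tau / f.
    return [q + (tau - (yi <= q)) / f for yi in y], q

random.seed(0)
log_wage = [random.gauss(3.0, 0.5) for _ in range(2000)]  # simulated data
rif, q90 = rif_quantile(log_wage, 0.9)
print(q90, fmean(rif))
```

The RIF takes only two distinct values for a given quantile, and its sample mean is close to the quantile, which is what allows it to replace the outcome variable in a mean decomposition.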

We use the US as the reference country because this is consistent both with our previous analysis and with previous work (Blau & Kahn, 2005; Paccagnella, 2015). Our estimates are robust to choosing Finland as the reference country, and to

11 In the probit model we interact a female dummy with controls for numeracy test scores, education, experience and experience squared.

12 In our case, the estimated weights are multiplied by the PIAAC sampling weights.

13 We have also estimated the RIFs parametrically using RIF-OLS, and the results are very similar.

14 In the case of the variance and the Gini coefficient, the RIF is obtained parametrically by estimating a RIF-OLS regression where the covariates included are numeracy test score, education, gender, work experience and its square.

intermediate education as the reference group.15
