
6 Analyses and Results

6.5 Regression Assumptions

6.5.8 Regression Assumption 8

“At each set of values for the k independent variables, εj is normally distributed” (Berry, 1993: 12).

The last assumption by Berry (1993) requires that the error term is normally distributed, and it is important that this assumption is fulfilled in order to draw conclusions about the true parameters. If this regression assumption is violated, the estimates of the coefficients can be affected, leading to biased and inefficient coefficient estimators. To test whether the error term is normally distributed, we examine the skewness and kurtosis of the distributions. Skewness measures the symmetry of a distribution; the closer the measure is to zero, the closer the error term is to being normally distributed. It is therefore important that the requirement on skewness is met, as the regression coefficient estimators will otherwise be biased. Kurtosis, on the other hand, indicates whether the distribution has a shape or “tail” that deviates from the normal distribution, where high values indicate abnormal peakedness or flatness of the distribution.

An essential requirement is that the values of both skewness and kurtosis lie within +/-2, and at most 5 (Sandvik, 2013b). When we examine skewness and kurtosis for our first research model, we see that the values of both are extreme and hence do not meet these requirements. The lowest skewness value is -12.498 and the highest kurtosis value is 156.47, as can be seen in subchapter 6.1.
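The skewness and kurtosis check described above can be sketched in Python. This is an illustrative example on synthetic residuals (not the thesis data), using the `skew` and `kurtosis` functions from `scipy.stats`:

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(42)

# Synthetic residuals: normal errors contaminated with a few extreme
# negative values, mimicking a heavy left tail.
residuals = np.concatenate([rng.normal(0, 1, 1000), [-30, -25, -20]])

s = skew(residuals)
k = kurtosis(residuals, fisher=False)  # Pearson kurtosis; normal distribution = 3

# Rule of thumb from the text: |skewness| < 2 and kurtosis at most 5.
print(f"skewness = {s:.2f}, kurtosis = {k:.2f}")
print("within limits:", abs(s) < 2 and k <= 5)
```

Even three extreme observations out of a thousand are enough to push both measures far outside the stated limits, which is consistent with how sensitive these moments are to outliers.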

We have also examined the histograms for our regressions, which show abnormal peakedness and asymmetry in the distributions, meaning that the error term is not normally distributed. These can be seen in the appendices under each of the regression analyses. The histograms also show some outliers that lie outside the normal curve, as we can see, for instance, from the histogram below for hypothesis 1.

146 Minu Singh and Cigdem Yavuz

Graph 6.3 – Histogram for hypothesis 1

These outliers make the distribution asymmetrical, as they draw the distribution to the right, which results in a skewed rather than a normal distribution. Hence, we see that outliers, like extreme values, can cause the error term not to be normally distributed.

Extreme values are, as mentioned, values that deviate from the main trend of the observations in the individual variables, while outliers are observations in the relationship between the independent and dependent variables that deviate from the main trend of this relationship.

We have also constructed scatterplots for each of our variables and for the relationships between the independent and dependent variables, in order to observe any extreme values or outliers that can violate the assumption of a normally distributed error term; these can be seen in appendices N to Q. We see that there are extreme values in all of the variables in our first research model, and when we test the independent variables’ effects on the dependent variables, the scatterplots show that there are also outliers. When we examine these outliers carefully in our data, we see that they have the same values as the extreme values, and we therefore have to consider whether to keep or remove them from the data.
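A common way to screen a single variable for extreme values, complementing a visual inspection of scatterplots, is to flag observations that lie far from the variable's mean. The sketch below uses a three-standard-deviation rule on a hypothetical data frame; the column name and threshold are illustrative, not taken from the thesis:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical variable with one planted extreme value.
df = pd.DataFrame({"x": rng.normal(10, 2, 200)})
df.loc[0, "x"] = 60

# Standardize and flag observations more than 3 standard deviations out.
z = (df["x"] - df["x"].mean()) / df["x"].std()
extreme = df[z.abs() > 3]
print(extreme)
```

The flagged rows can then be compared against the outliers seen in the bivariate scatterplots, mirroring the check described above that the outliers and extreme values coincide.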

From the graphs below for the dependent variables and some of the independent variables in our first research model, we see that the variables have some extreme values that differ widely from the rest of the observations and can affect the variables’ symmetry. We have pointed out the extreme values with arrows.


Graph 6.4 – Scatterplots for variables with extreme values


When we then test the independent variables against the dependent variables, we see that the extreme values result in outliers. We can see this from the graphs below, and in appendix P.

Graph 6.5.1 – Scatterplots for regressions with extreme values

The outliers draw the regression line towards themselves, which results in biased regression coefficients, as the regression line is estimated inaccurately. Hence, we see that we have to remove the extreme values in order to avoid outliers in our analyses. To be certain that this is an appropriate approach, we have also conducted outlier analyses for our first research model, where we see that these outliers have the same values as the extreme values, as described in subchapter 6.1. We therefore remove only the extreme values from all of the variables, and not the outliers, as we do not want to lose too much important information and too many observations.
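The effect of an outlier dragging the regression line towards itself can be demonstrated on synthetic data. The sketch below fits a simple least-squares line with and without a single planted outlier (the data and the true slope of 2 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: a true slope of 2 with mild noise.
x = np.linspace(0, 10, 50)
y = 2 * x + rng.normal(0, 1, 50)

# Add one observation far below the main trend of the x-y relationship.
x_out = np.append(x, 10.0)
y_out = np.append(y, -50.0)

slope_with, _ = np.polyfit(x_out, y_out, 1)
slope_without, _ = np.polyfit(x, y, 1)

# The outlier drags the fitted line towards itself, biasing the slope.
print(f"slope with outlier:    {slope_with:.2f}")
print(f"slope without outlier: {slope_without:.2f}")
```

A single point out of fifty-one is enough to bias the slope estimate noticeably, which illustrates why removing the extreme values yields more accurately estimated coefficients.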


After removing the extreme values, we obtain more satisfactory skewness values and hence more accurately estimated and less biased regression coefficients. However, the skewness values still differ from zero, and the kurtosis values are still high, which indicates that the error term is not normally distributed, even though we have reduced the violation of this assumption. Below we present the graphs for some of the regressions without extreme values, which show a more accurate regression line that is not drawn by outliers. The regressions with and without extreme values for all of the other variables can be seen in appendix P.

Graph 6.5.2 – Scatterplots for regressions without extreme values



Another way to reduce the values of skewness and kurtosis is to transform our variables using the natural logarithm. This is not possible for the variables in our first research model, as they contain change values that are negative or zero. However, in our second research model we have continuous variables that can be transformed. From appendix Q, we see that these variables have extreme values, but that the problem of extreme values is reduced after transforming the variables. The exception is the natural logarithm of beta, which has one extreme value, as can be seen in the graph below.
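The effect of a natural-logarithm transformation on skewness can be sketched on a synthetic strictly positive, right-skewed variable (a lognormal one here; the data is illustrative, not the thesis beta variable). The logarithm is undefined for zero and negative values, which is why the transformation is unavailable for the change variables in the first research model:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(7)

# Strictly positive, right-skewed variable, as required for np.log.
beta = rng.lognormal(mean=0.0, sigma=1.0, size=500)
log_beta = np.log(beta)

print(f"skewness before: {skew(beta):.2f}")
print(f"skewness after:  {skew(log_beta):.2f}")
```

The transformation compresses the long right tail, pulling the skewness towards zero and thereby reducing the influence of extreme values, in line with what appendix Q reports for the second research model.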

Graph 6.6 – Scatterplot for natural logarithm of beta

We have chosen to remove this extreme value on the basis of the same discussion as above.

After removing the extreme value, we obtain skewness and kurtosis values within the requirements, which indicates that the assumption of a normally distributed error term is fulfilled in our second research model. Hence, we get a more accurate regression line without heavily biased regression coefficients, as can be seen in appendix Q. As mentioned, this assumption is not perfectly fulfilled for our first research model, and the eighth assumption is hence only partly fulfilled in our study.

As many of the regression assumptions are fulfilled in our study, we conclude that our analyses and results are accurate, valid, and not too biased. In the next chapter, we will discuss the implications and contribution of this study, as well as suggestions for further research.


7 Discussion

In this chapter, we will discuss the methodological and practical implications of our study. We will further present the contribution of the study, as well as suggestions for further research.