
7.2 Grey level entropy matrices

7.2.2 The GLEM4D-features

116 CHAPTER 7. RESULTS AND DISCUSSION

aneuploid histograms. This is because all three ploidy histograms indicate that a significant proportion of the cell images have large IODs and thus also low grey level (see section 3.1). However, as the patterns of the weight arrays are similar, the discussion of what the GLEM-features measure is also valid for all 134 patients. Also, because the grey level average and variance are obviously affected similarly by the inclusion of the patients with the tetraploid and polyploid histograms, the connection between the GLEM-features and the average and variance in grey level also remains valid.

Assumptions of the estimated Mahalanobis distance between the classes

Because the weight arrays are designed using the estimated Mahalanobis distance between the classes, it is interesting to investigate whether the underlying assumptions are met. To test these assumptions, we will assume that the samples within each element in the collection of the property arrays of all 134 patients can be seen as independent. We will then test the normality assumption of each prognosis class using the Lilliefors goodness-of-fit test [32] at significance level 0.05. This is a generalisation of the Kolmogorov-Smirnov test for the case of normality when the expectation and variance are unknown [32, p.399]. The assumption of equal variances will be tested using the standard F-test [11, pp.515–519] at significance level 0.05 (the null hypothesis will of course be that the two variances are equal). Note that this test is strongly dependent on the normality assumption [11, p.519]. In particular, the standard F-test is more dependent on the normality assumption than the pooled two-sample t-test, of which the estimated Mahalanobis distance between two classes can be seen as the T-statistic if the null hypothesis is taken to be equal expectations [11, p.519].
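The normality test described above can be sketched for the samples of a single element as follows. This is only a minimal illustration, not the implementation used to produce the results: the function names are hypothetical, and the critical value is estimated by Monte Carlo simulation instead of being read from Lilliefors' tables.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(sample)
    m = mean(sample)
    fitted = NormalDist(m, stdev(sample, m))  # expectation and variance are estimated
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def lilliefors_critical_value(n, alpha=0.05, n_sim=2000, seed=0):
    """Monte Carlo estimate of the (1 - alpha) quantile of the statistic under normality."""
    rng = random.Random(seed)
    sims = sorted(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_sim)
    )
    return sims[math.ceil((1.0 - alpha) * n_sim) - 1]


def normality_rejected(sample, alpha=0.05):
    """True if the normality hypothesis is rejected at significance level alpha."""
    return lilliefors_statistic(sample) > lilliefors_critical_value(len(sample), alpha)


if __name__ == "__main__":
    n = 50
    # Synthetic data: evenly spaced normal quantiles, and their exponentials (skewed).
    near_normal = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    skewed = [math.exp(x) for x in near_normal]
    print("near-normal sample rejected:", normality_rejected(near_normal))
    print("skewed sample rejected:", normality_rejected(skewed))
```

Because the mean and standard deviation are estimated from the same sample, the ordinary Kolmogorov-Smirnov critical values would be too large. This is why the sketch simulates the distribution of the statistic under normality.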

However, because ideally none of the tests would be rejected (as the appropriateness of using the estimated Mahalanobis distance between the classes can only be guaranteed in this case), we expect the standard F-test to perform acceptably, as the distributions are at least approximately normal when none of the normality tests are rejected.
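For reference, the statistics involved can be written out for a single element. The notation is an assumption made here for illustration: the sample means, sample variances and sample sizes of the two prognosis classes are denoted by \bar{x}_k, s_k^2 and n_k, and the estimated Mahalanobis distance is taken in its one-dimensional form with the pooled variance. The exact definition used for the weight arrays is the one given earlier in the thesis.

```latex
% Assumed notation: \bar{x}_k, s_k^2 and n_k are the sample mean, sample variance
% and sample size of prognosis class k (k = 1, 2) within one element.
\begin{align*}
s_p^2 &= \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, &
\hat{d} &= \frac{|\bar{x}_1 - \bar{x}_2|}{s_p}, \\
T &= \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}}
  \quad\Rightarrow\quad |T| = \hat{d}\,\sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \\
F &= \frac{s_1^2}{s_2^2} \sim F_{n_1 - 1,\, n_2 - 1}
  \quad \text{under normality and equal variances.}
\end{align*}
```

Under these assumptions the distance and the T-statistic differ only by a factor that depends on the class sizes, so for fixed class sizes ranking elements by one is equivalent to ranking them by the absolute value of the other.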

Figure 7.5 shows the result of testing the assumptions. We see from the images in the left and middle column that the normality assumptions are slightly questionable. In comparison with figure 7.3 we note that the assumptions are not rejected in the most discriminative elements. This is comforting, but only a natural consequence of the central limit theorem as these are also the elements with most occurrences2. The common variance assumption is slightly more frequently satisfied and also this assumption seems most appropriate in the more interesting elements. In total, we conclude that the underlying assumptions of the estimated Mahalanobis distance between the classes seem to be generally acceptable when using the GLEM-features.

7.2. GREY LEVEL ENTROPY MATRICES 117

Figure 7.5: The assumption of: left column) normality in good prognosis, middle column) normality in bad prognosis, right column) equal variances of the difference GLEM-feature when using the 102 patients and the cell area group: upper row) [2000,2999], middle row) [3000,3999], lower row) [4000,4999]. The corresponding tests are rejected in black pixels and not rejected in white pixels, both at significance level 0.05. The grey pixels correspond to elements where all relevant property arrays are zero.

feature)3. In comparison with the result of the negative GLEM-features in table 7.6, we see that the performance change is insignificant when using all 134 patients, but clearly significant for the 102 patients.

It is difficult to inspect the designed weight array of the GLEM4D-features because of its four dimensions. We will therefore take a different approach to get an understanding of what the GLEM4D-features measure. While they have two axes in common with the GLEM-features, the significantly improved performance tells us that the two added axes provide new or better information. A natural question is therefore whether both or only one of the added axes are of prognostic relevance. It turns out that only the area group axis is relevant. Indeed, evaluating the GLEM3D-features resulting from setting the window width in the GLEM4D-features to 9 gives an expected CCReq and an expected CCR which differ by less than 0.5 % (in absolute value) from the corresponding performance estimates of the negative GLEM4D-features, both when using all 134 patients and the 102 patients (the best adaptive texture feature and classification method were again the negative adaptive texture feature and NMSC, respectively).

3 With respect to the expected CCR. The expected CCReq of the difference GLEM4D-feature is 0.3 % higher than for the negative GLEM4D-feature when using all 134 patients, but because this is an insignificant amount and the difference in expected CCR was nearly 3 % in favour of the negative GLEM4D-feature, the negative GLEM4D-feature is considered to be the better of these two features. This conclusion can however be debated, as we are most interested in the CCReq, and the lower and upper limits of the PI are 1.9 % higher and lower, respectively, for the difference feature than for the negative feature, thus indicating that the difference GLEM4D-feature provides a more reliable measurement in terms of discriminating between the classes.

Table 7.7: The classification results of the negative GLEM4D-feature when using the classification method which attained the best expected CCReq; NMSC/LDC.

              All 134 patients           The 102 patients
CCReq         63.8 % [51.1 %, 76.5 %]    76.1 % [62.1 %, 89.4 %]
CCR           69.0 % [61.5 %, 75.6 %]    82.3 % [75.0 %, 90.4 %]
Specificity   71.3 % [60.6 %, 80.3 %]    86.8 % [75.6 %, 95.1 %]
Sensitivity   56.4 % [33.3 %, 83.3 %]    65.4 % [36.4 %, 90.9 %]

Using 28 (left) and 25 (right) learning patterns in each prognosis class.
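The performance measures reported in table 7.7 can be related to each other with a small sketch. It assumes that the CCReq is the average of the specificity and the sensitivity (which is consistent with the values in the table up to rounding) and that bad prognosis is the positive class. The function name and the confusion-matrix counts are hypothetical and serve only as illustration.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (CCReq, CCR, specificity, sensitivity) from confusion-matrix counts.

    Assumption: 'positive' means bad prognosis and 'negative' means good prognosis.
    """
    sensitivity = tp / (tp + fn)  # bad-prognosis patients classified correctly
    specificity = tn / (tn + fp)  # good-prognosis patients classified correctly
    ccr = (tp + tn) / (tp + fn + tn + fp)  # overall correct classification rate
    ccr_eq = 0.5 * (sensitivity + specificity)  # both classes weighted equally
    return ccr_eq, ccr, specificity, sensitivity


# Hypothetical counts, not taken from the data set.
print(classification_rates(tp=13, fn=7, tn=60, fp=20))

# Consistency check against the expected values reported in table 7.7:
# (CCReq, specificity, sensitivity) in percent.
table_7_7 = {"all 134 patients": (63.8, 71.3, 56.4),
             "the 102 patients": (76.1, 86.8, 65.4)}
for name, (ccr_eq, spec, sens) in table_7_7.items():
    print(name, "difference:", round(0.5 * (spec + sens) - ccr_eq, 2))
```

Unlike the CCR, the CCReq does not depend on the proportion of good and bad prognosis patients in the test set, which is why the two measures can rank features differently.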

With respect to the classification results of the GLEM3D-features and the improved performance over the GLEM-features, it is natural that the area group axis provides new prognostically relevant information. Indeed, the scatter plot in figure 7.6 shows that this is the case. Because of this new prognostically relevant axis, we are not sure to what extent the connection between the GLEM-features and the grey level average and variance is inherited by the GLEM4D-features.

Figure 7.6: Scatter plot of the negative GLEM4D-feature against the Area-feature when using the 102 patients. The blue plus sign represents good prognosis and the red asterisk symbol represents bad prognosis.


Figure 7.7: Scatter plot of the negative GLEM4D-feature against: left) the GreyLevelAverage-feature, right) the GreyLevelVariance-feature when using the 102 patients. The blue plus sign represents good prognosis and the red asterisk symbol represents bad prognosis.

We see from the scatter plots of the negative GLEM4D-feature against the grey level average and variance in figure 7.7 that this connection is greatly weakened, but still present. The presence of this connection is also indicated by both the greater separation along the negative GLEM4D-feature axis of figure 7.6 in comparison with the Area-feature axis and a comparison between the classification results of the negative GLEM3D-feature and the Area-feature (the CCReqs and CCRs of the negative GLEM3D-feature are about 2–6 % better than the corresponding performance estimates of the Area-feature for both all 134 patients and the 102 patients). We therefore conclude that the GLEM4D-features can be seen as combined measurements of the area and the grey level average and variance.

7.2.3 Comparison with the combination of the cell