

3.1.3 A relevant study

We now turn to the study by Kristensen et al. [29], which uses DNA ploidy analysis to classify a superset of our dataset containing 284 patients. We first describe the technical choices made in this study and then present its results.

Firstly, let us describe the choice of the images A and B in the computation of the IOD in equation (3.1). Because the segmentation in this dataset is done after the shade correction, just as in our dataset, it is natural to let A be a cell image, even though we could have let it be an original image and included a shade correction in B. Instead, we will let B include two other accommodations.

One of these is related to the density of the DNA, which is part of the problem indicated in section 2.2.1 concerning the refraction that is correlated with the presence of the nuclei. The other attempts to account for the average effect of any non-nuclear contribution of the tissue sample at the point of imaging.
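To make the roles of A and B concrete, here is a minimal sketch. It assumes the common optical-density form, in which each nucleus pixel contributes log10(B / A); the exact form of equation (3.1) is not reproduced here, and the function name, the mask argument and the toy values are hypothetical.

```python
import math

def integrated_optical_density(cell_image, reference, mask):
    """Sum log10(B / A) over the pixels inside the nucleus mask.

    Assumption: standard optical-density definition, with A the cell image
    and B a reference image; this may differ in detail from equation (3.1).
    """
    total = 0.0
    for row_a, row_b, row_m in zip(cell_image, reference, mask):
        for a, b, inside in zip(row_a, row_b, row_m):
            if inside:
                total += math.log10(b / a)
    return total

# Toy 2x2 example: the two nucleus pixels transmit 10 % and 1 % of the reference.
cell = [[10.0, 100.0], [1.0, 100.0]]
reference = [[100.0, 100.0], [100.0, 100.0]]
nucleus = [[True, False], [True, False]]
iod = integrated_optical_density(cell, reference, nucleus)
```

In this form, any accommodation placed in B only changes the per-pixel reference value, which is why the corrections can be described as a choice of B.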

For ploidy classification, the study used specially trained personnel, without knowledge of the recorded true prognosis, to manually perform the classification by applying a set of defined criteria. By using the average IOD of some imaged non-epithelial cells, a patient-dependent estimate of the IOD corresponding to diploid cells is obtained. It may seem unnecessary that this estimate depends on the patient, but it is actually rather important, as the true DNA content of diploid cells varies significantly between patients.
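The patient-dependent reference can be sketched as follows. This is only an illustration: the function names and the toy IOD values are hypothetical, and the conversion to C units assumes the convention of figure 3.1, where 2C denotes the estimated IOD of diploid cells.

```python
from statistics import mean

def estimate_diploid_iod(non_epithelial_iods):
    """Average IOD of the imaged non-epithelial cells of one patient."""
    if not non_epithelial_iods:
        raise ValueError("at least one reference cell is needed")
    return mean(non_epithelial_iods)

def to_c_units(iod, diploid_iod):
    """Express an IOD in C units, where the diploid reference equals 2C."""
    return 2.0 * iod / diploid_iod

# Hypothetical reference cells from a single patient.
reference_iods = [98.0, 102.0, 100.0]
diploid_iod = estimate_diploid_iod(reference_iods)
c_value = to_c_units(200.0, diploid_iod)  # a cell with twice the reference IOD
```

Because the reference is computed per patient, the same raw IOD can map to different C values for different patients.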

Using the estimated IOD of diploid cells, the trained personnel manually selected any present euploid peaks in the IOD histogram, where a euploid peak is defined in this thesis as a peak in the IOD histogram that approximately corresponds to a positive power of two times the estimated IOD of diploid cells. The trained personnel also selected any non-euploid peaks. In the following, we refer to all selected peaks, euploid or not, as present if they were visually selected by the trained personnel.
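The euploid definition above can be illustrated with a small check. Note that in the study the peaks were selected visually; the relative tolerance used here for "approximately" is an assumption, as is treating the diploid peak itself (exponent zero) as euploid. The function and constant names are hypothetical.

```python
import math

# Exponent k means a peak near 2**k times the estimated diploid IOD.
EUPLOID_NAMES = {0: "diploid", 1: "tetraploid", 2: "octaploid", 3: "hexadecaploid"}

def euploid_label(peak_iod, diploid_iod, rel_tol=0.1):
    """Return the euploid name of a peak, or None if it is non-euploid.

    rel_tol is an assumed tolerance; the study relied on visual judgement.
    """
    ratio = peak_iod / diploid_iod
    if ratio <= 0:
        return None
    exponent = round(math.log2(ratio))
    if exponent not in EUPLOID_NAMES:
        return None
    if abs(ratio / 2 ** exponent - 1.0) > rel_tol:
        return None
    return EUPLOID_NAMES[exponent]

label_near_double = euploid_label(205.0, 100.0)  # close to 2 times the reference
label_between = euploid_label(300.0, 100.0)      # between the 2x and 4x positions
```

A peak at three times the diploid IOD lies between the tetraploid and octaploid positions and is therefore reported as non-euploid under this tolerance.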

The criteria for a diploid histogram were that only a diploid peak was present, that the proportion of estimates in the tetraploid peak did not exceed 10 %, and that the proportion of IODs above 2.5 times the estimated IOD of diploid cells did not exceed 1 % when excluding estimates in euploid peaks. A histogram was classified as tetraploid in either of two cases: if a tetraploid peak was present, the proportion of its IODs exceeded 10 %, and the proportion of IODs above 2.5 times the estimated IOD of diploid cells did not exceed 1 % when excluding estimates in euploid peaks; or if a tetraploid and an octaploid peak were present and the proportion of IODs above 4.5 times the estimated IOD of diploid cells did not exceed 1 % when excluding estimates in euploid peaks. The presence of a diploid peak was optional in both cases. Furthermore, a histogram was classified as polyploid if an octaploid and a hexadecaploid peak were present, optionally together with a diploid peak and/or a tetraploid peak. Finally, the histogram was classified as aneuploid if none of the above criteria were met, i.e. if a non-euploid peak was present or if the proportion of DNA content estimates above the specified limit exceeded 1 %.¹ Figure 3.1 shows the result of applying the described IOD and ploidy classification on the images of some patients in our dataset (and this superset). [29, p.1495]
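The criteria can be summarised as a small rule-based function. This is a sketch of one strict reading of the rules, not the procedure used in the study, where peak selection and classification were performed manually. The function name, the peak labels and the argument names are hypothetical; the thresholds are the ones stated above.

```python
def classify_histogram(peaks, tetraploid_fraction, above_2_5, above_4_5):
    """Classify an IOD histogram from its selected peaks and proportions.

    peaks: labels of the present peaks, e.g. {"diploid", "tetraploid"};
           any non-euploid peak is labelled "non-euploid".
    tetraploid_fraction: proportion of IODs in the tetraploid peak.
    above_2_5, above_4_5: proportions of IODs above 2.5 and 4.5 times the
           diploid IOD, excluding estimates in euploid peaks.
    """
    peaks = set(peaks)
    without_diploid = peaks - {"diploid"}  # the diploid peak is optional below
    if peaks == {"diploid"} and tetraploid_fraction <= 0.10 and above_2_5 <= 0.01:
        return "diploid"
    if (without_diploid == {"tetraploid"} and tetraploid_fraction > 0.10
            and above_2_5 <= 0.01):
        return "tetraploid"
    if without_diploid == {"tetraploid", "octaploid"} and above_4_5 <= 0.01:
        return "tetraploid"
    if peaks - {"diploid", "tetraploid"} == {"octaploid", "hexadecaploid"}:
        return "polyploid"
    # Strict reading: everything else, including the theoretical case of a
    # lone diploid peak with more than 10 % tetraploid content, is aneuploid.
    return "aneuploid"

example = classify_histogram({"diploid", "tetraploid"}, 0.20, 0.005, 0.0)
```

Under this reading, a histogram with a non-euploid peak can never match the first four rules, so it always ends up as aneuploid.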

The results of applying the obtained ploidy type to estimate the prognosis of the patients are convincing. When separating on ploidy type, the estimated ten-year relapse-free survival rates are 95 %, 89 %, 70 % and 29 % for the diploid, tetraploid, polyploid and aneuploid classification groups, respectively. We can note that only ten patients are classified as polyploid, so this estimate is relatively unreliable, and the constructed confidence intervals (CIs) reveal that the uncertainty in the tetraploid group, with 64 patients, is also large. The diploid and aneuploid groups contain 113 and 91 patients, respectively. If we use the indicated patient classification of assigning patients with a diploid or tetraploid histogram to good prognosis and patients with a polyploid or aneuploid histogram to bad prognosis, the estimated ten-year relapse-free survival rates are 92 % and 34 %, respectively. [29, pp.1495–1496]

For better comparison with subsequent classification results, it is interesting to note the performance of this DNA ploidy analysis on our 134 patients. The classification result when using the indicated patient classification is shown in table 3.1. If we also exclude the patients with a tetraploid or polyploid histogram, as we will do in some of our subsequent classification results, we obtain the classification result in table 3.2. We see that the estimated performance of this method increases by about 2 % when excluding the patients with tetraploid and polyploid histograms.

¹ If we are very strict, we see that the case where only a diploid peak was selected, but the tetraploid peak contained more than 10 % of the estimated DNA contents, is not included. This is only of theoretical concern because the trained personnel would in practice have selected the tetraploid peak in this case and thus classified the histogram as tetraploid.

Figure 3.1: The IOD histogram with the result of applying the described IOD and ploidy classification on the images of some patients in our dataset. 2C is the estimated IOD of diploid cells and the other C's are multiples of this value. The true prognosis of these four patients is good for the IOD histograms classified as diploid, tetraploid and polyploid, and bad for the IOD histogram classified as aneuploid.

Table 3.1: Patient classification obtained by assigning patients with a diploid or tetraploid histogram to good prognosis and patients with a polyploid or aneuploid histogram to bad prognosis when using all 134 patients in our dataset. CCR is an acronym for correct classification rate.

Prognosis   Patients   Correctly classified   Misclassified   CCR
Good        94         80                     14              85.1 %
Bad         40         32                     8               80.0 %
Total       134        112                    22              83.6 %

Average of the CCRs for good and bad prognosis: 82.6 %
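The rates in table 3.1 follow directly from the counts, which the following short sketch recomputes. The counts are taken from the table; the function and variable names are hypothetical.

```python
def ccr(correct, total):
    """Correct classification rate as a fraction of the patients in a group."""
    return correct / total

# Counts from table 3.1: (correctly classified, patients) per true prognosis.
good_ccr = ccr(80, 94)
bad_ccr = ccr(32, 40)
total_ccr = ccr(80 + 32, 94 + 40)

# The average of the two per-class rates weights both prognosis groups
# equally, regardless of how many patients each group contains.
average_ccr = (good_ccr + bad_ccr) / 2

rates_in_percent = [round(100 * r, 1)
                    for r in (good_ccr, bad_ccr, total_ccr, average_ccr)]
```

The average of the per-class CCRs is slightly lower than the total CCR here because the smaller bad-prognosis group has the lower rate.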
