
Paper 11: Estimating the depth uncertainty in three-dimensional

5.11.1 Synopsis

Visual interaction in three-dimensional virtual space can be achieved by estimating an object's depth from the fixations of the left and right eyes. Current depth estimation methods, however, do not account for the presence of noise in the data. To address this problem, we note that any measured fixation point is a member of a statistical distribution defined by the level of noise in the measurement. We therefore propose a new numerical method that provides a range of depth values based on the uncertainty in the measured data. The main contribution of this paper is a new method to estimate the depth uncertainty in a virtual environment. This is explicitly linked to research contribution C5.
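To make the idea concrete, a minimal Monte Carlo sketch of this kind of uncertainty propagation is given below. It assumes a simple two-eye triangulation model with Gaussian fixation noise; the inter-pupillary distance, viewing distance, noise level, and triangulation geometry are illustrative assumptions, not the method or parameters of the paper.

```python
import numpy as np

# Minimal Monte Carlo sketch of depth uncertainty estimation (illustrative only).
# The fixated point is triangulated from the horizontal positions of the two
# on-screen fixation points; ipd, screen_dist, and sigma are assumed values.

def triangulate_depth(x_left, x_right, ipd=0.065, screen_dist=0.7):
    """Depth (m) of the fixated point from the two on-screen fixation positions."""
    disparity = x_right - x_left          # on-screen disparity between the eyes
    denom = ipd - disparity
    if abs(denom) < 1e-9:                 # gaze rays (nearly) parallel
        return np.inf
    return screen_dist * ipd / denom

rng = np.random.default_rng(0)
sigma = 0.003                             # assumed fixation noise on the screen (m)
x_l, x_r = -0.01, 0.01                    # measured fixation points of left/right eye

# Sample fixation points from the noise distribution and collect the depths.
depths = np.array([
    triangulate_depth(x_l + rng.normal(0.0, sigma), x_r + rng.normal(0.0, sigma))
    for _ in range(10_000)
])
lo, hi = np.percentile(depths, [2.5, 97.5])
print(f"median depth {np.median(depths):.2f} m, 95% range [{lo:.2f}, {hi:.2f}] m")
```

The printed interval is the kind of depth range, rather than a single depth value, that motivates the method.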

Chapter 6

Discussion

This chapter concludes the dissertation with an overview of the results obtained from the research papers, and the main research direction for future work.

6.1 Validating the visual saliency model

(a) Probability histogram (x-axis: saliency value; y-axis: probability; curves: fixated, non-fixated)

(b) Relative probabilities (x-axis: saliency value; y-axis: relative probability; curves: fixated, non-fixated)

Figure 6.1: Probability histograms and relative probabilities for the fixated and non-fixated regions for an average observer. The x-axis shows the saliency values obtained using the visual saliency algorithm (Itti, Koch, & Niebur, 1998).

In papers 4 and 6, we performed an experiment using linear discriminant analysis to try to separate the saliency values obtained from the model by (Itti, Koch, & Niebur, 1998) for locations that received fixations from those that received no fixations. The data were based on a subset of the images and corresponding fixations collected by (Judd, Ehinger, Durand, & Torralba, 2009), where we used 200 landscape images and all fifteen observers. In the experiment, we defined a fixated area as a square region of 100 by 100 pixels centred at the fixation point. Non-fixated areas were chosen randomly from parts of the image that contained a 100 by 100 pixel region without any fixations.

By collecting the values returned by the saliency algorithm local to those regions into two matrices, we were able to use discriminant analysis to determine whether the data of the two matrices are separable. In figure 6.1(a), we show the probability histograms of the fixated and non-fixated regions for all the observers. Here, each histogram is normalized such that the area under the curve is one. We note that the separation between the two sets is not ideal; rather, we find a considerable overlap between the two histograms, specifically in the middle range. We further note that there is a clear separation between the two sets for regions of the images that received no fixations, indicating that the method is good at predicting non-salient regions of the images. At a saliency value of about 0.3, the classification of the two sets is random.
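A minimal sketch of how such data could be assembled is shown below. The inputs `saliency_map` (a 2-D array of values in [0, 1]) and `fixations` (a list of (row, col) points) are hypothetical, and the code is illustrative rather than the implementation used in the papers; only the 100 by 100 window and the area-one normalization follow the description above.

```python
import numpy as np

# Illustrative sketch: gather saliency values in 100x100-pixel windows around
# fixated and non-fixated locations and build probability histograms that are
# normalized so the area under each curve is one.

def window_values(saliency_map, center, half=50):
    """Saliency values inside the 100x100 window centred at `center`."""
    r, c = center
    h, w = saliency_map.shape
    return saliency_map[max(r - half, 0):min(r + half, h),
                        max(c - half, 0):min(c + half, w)].ravel()

def fixation_histograms(saliency_map, fixations, bins=20, seed=0):
    rng = np.random.default_rng(seed)
    fix_arr = np.asarray(fixations)

    fixated = np.concatenate([window_values(saliency_map, p) for p in fixations])

    # Draw random window centres and keep those whose window holds no fixation.
    non_fixated = []
    h, w = saliency_map.shape
    while len(non_fixated) < len(fixations):
        r = int(rng.integers(50, h - 50))
        c = int(rng.integers(50, w - 50))
        if np.all(np.max(np.abs(fix_arr - [r, c]), axis=1) > 50):
            non_fixated.append(window_values(saliency_map, (r, c)))
    non_fixated = np.concatenate(non_fixated)

    edges = np.linspace(0.0, 1.0, bins + 1)
    p_fix, _ = np.histogram(fixated, bins=edges, density=True)   # area under curve = 1
    p_non, _ = np.histogram(non_fixated, bins=edges, density=True)
    return p_fix, p_non, edges
```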

To gain better insight into the ability of the algorithm to separate the image regions into fixated and non-fixated, we plotted the relative probabilities of the histograms. For the non-fixated histogram, the relative probabilities were obtained by dividing the area under the non-fixated probability histogram curve in a specific bin i by the area under the fixated histogram curve for the same bin. For the relative probability of the fixated histogram, the reciprocal value was calculated. These curves are plotted in figure 6.1(b), where we observe that for low saliency values the separation of non-fixated regions is ideal, and that the extent of the separation declines towards chance level as the saliency value increases.
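Stated compactly, this is a restatement of the verbal description above: with A_f(i) and A_nf(i) denoting the areas under the fixated and non-fixated probability histogram curves in bin i, the relative probabilities plotted in figure 6.1(b) are

```latex
R_{\mathrm{nf}}(i) \;=\; \frac{A_{\mathrm{nf}}(i)}{A_{\mathrm{f}}(i)},
\qquad
R_{\mathrm{f}}(i) \;=\; \frac{A_{\mathrm{f}}(i)}{A_{\mathrm{nf}}(i)}.
```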

We also note that the separation of the highly salient regions is nearly ideal. Based on this, we can conclude that the saliency algorithm by (Itti, Koch, & Niebur, 1998) is good at predicting non-salient and highly salient regions, but its performance drops in the middle range.

6.2 Proposed group based asymmetry algorithm

In papers 5 and 6, we set about unifying the mathematical description of saliency in a single metric. Based on the knowledge gained from research in image processing, where it has been shown that the dihedral group D4 can be used to encode edges and contrast, which are the main current descriptions of saliency, we chose to devise an algorithm that represents the level of saliency in an image region by virtue of the transformations of D4. In our experiment, we used a receiver operating characteristic (ROC) curve to compare the performance of the proposed method with that of (Itti, Koch, & Niebur, 1998). For the analysis, we used fixation data from 200 images and fifteen observers. We found that the proposed group based asymmetry (GBA) algorithm results in an AUC value of 0.81, which is better than that achieved with the visual saliency algorithm by (Itti, Koch, & Niebur, 1998), which gives an AUC of 0.77. Based on the results, we conclude that the transformations pertaining to the dihedral group D4 are a good metric for estimating salient image regions. In figure 6.2, we offer a visual comparison between the two algorithms: we show the fixations map and the saliency maps obtained from the proposed GBA algorithm and the visual saliency algorithm by (Itti, Koch, & Niebur, 1998) for an example image. We can see that the maps from both algorithms are quite similar.
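As an illustration of the underlying idea only, and not the published GBA implementation, an asymmetry score based on the eight transformations of D4 (the identity, three rotations, and four reflections) could be sketched as follows; the patch size and normalization are assumptions made for the example.

```python
import numpy as np

# Illustrative sketch of a group-based asymmetry style score: apply the eight
# transformations of the dihedral group D4 to a square patch and measure how
# far the patch is from being invariant under each of them.

def d4_transforms(patch):
    """Yield the eight D4 transformations of a square 2-D array."""
    for k in range(4):
        rotated = np.rot90(patch, k)
        yield rotated                 # rotation by k * 90 degrees
        yield np.fliplr(rotated)      # rotation followed by a reflection

def gba_score(patch):
    """Asymmetry of a square patch: mean deviation from its D4 transforms."""
    patch = patch.astype(float)
    deviations = [np.abs(patch - t).mean() for t in d4_transforms(patch)]
    return np.mean(deviations)

def gba_map(image, patch_size=16):
    """Slide a window over a grey-scale image and build a saliency-like map."""
    h, w = image.shape
    out = np.zeros((h // patch_size, w // patch_size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            block = image[i * patch_size:(i + 1) * patch_size,
                          j * patch_size:(j + 1) * patch_size]
            out[i, j] = gba_score(block)
    return out / out.max() if out.max() > 0 else out
```

Regions that change strongly under rotation and reflection, such as edges and high-contrast structures, receive high scores, which is the intuition behind using D4 as a saliency metric.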

In fact, both of them return the region containing the boat at the center as salient, which is also in agreement with the fixations map. The performance of the proposed GBA algorithm is compared with other state-of-the-art saliency models in the next section.
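For completeness, the ROC/AUC evaluation referred to above can be outlined as follows, with saliency values at fixated locations treated as positives and values at non-fixated locations as negatives; the data in the usage example are synthetic stand-ins, not results from the papers.

```python
import numpy as np

# Illustrative sketch of an ROC/AUC evaluation of a saliency map: sweep a
# threshold over the saliency values and integrate the true-positive rate
# over the false-positive rate.

def auc_from_samples(fixated_vals, non_fixated_vals, n_thresholds=101):
    """Area under the ROC curve obtained by sweeping a saliency threshold."""
    thresholds = np.linspace(1.0, 0.0, n_thresholds)   # strict to lax
    tpr = [(fixated_vals >= t).mean() for t in thresholds]
    fpr = [(non_fixated_vals >= t).mean() for t in thresholds]
    return np.trapz(tpr, fpr)                          # integrate TPR over FPR

# Toy usage with synthetic data standing in for real saliency values:
rng = np.random.default_rng(1)
fix = np.clip(rng.normal(0.6, 0.2, 5000), 0, 1)        # hypothetical fixated values
non = np.clip(rng.normal(0.4, 0.2, 5000), 0, 1)        # hypothetical non-fixated values
print(f"AUC = {auc_from_samples(fix, non):.2f}")       # roughly 0.76 for this toy data
```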

(a) Image from the database by (Judd, Ehinger, Durand, & Torralba, 2009)

(b) Fixations map

(c) Saliency map (Itti, Koch, & Niebur, 1998)

(d) Group based asymmetry map (GBA)

Figure 6.2: Comparison of visual saliency algorithms. Both algorithms return the region containing the boat at the center as salient, which is also in agreement with the fixations map obtained from the eye fixation data.

6.3 Proposed robust metric for the evaluation