
An Objective Metric to Represent the Perceived Quality

Chapter 5: Testing the Improvement Code

5.4 An Objective Metric to Represent the Perceived Quality

To assess the performance of the implemented quality-adaptive algorithm when streaming over an unreliable network, there is a need for an objective metric that can represent the perceived quality of layer-encoded video.

The PSNR is a popular metric for representing the objective quality of video data. It is described by the following mathematical expression:

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right) \qquad \text{(b)}$$

where MSE is the mean square error [15].

However, this metric does not represent the perceived quality of layer-encoded video well enough [4], so further details about it will not be provided in this thesis.

The lack of a metric to represent the perceived quality of layer-encoded video led to a new metric that was developed by Michael Zink (currently a postdoctoral fellow in the Computer Science Department at the University of Massachusetts in Amherst) for his dissertation about scalable Internet Video-on-Demand systems [4]. This metric is called the spectrum.

The objective quality metric spectrum is described by the following mathematical expression:
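As a sketch, assuming the definition given in Zink's work [4], the spectrum of a video $v$ over $T$ time slots can be written as:

$$s(v) = \sum_{t=1}^{T} z_t \left( h_t - \frac{1}{\sum_{j=1}^{T} z_j} \sum_{j=1}^{T} z_j h_j \right)^2$$

Here $h_t$ is the number of layers in time slot $t$, and $z_t \in \{0, 1\}$ indicates whether a step (a layer change) occurs in time slot $t$. When no step occurs at all, the sum is empty and the spectrum is taken to be 0.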

The spectrum captures the frequency (the number of layer variations) and the amplitude (the number of layers added or removed in each variation) of quality variations. The frequency of variations is represented by $z_t$: a step in a time slot corresponds to an increase or decrease in the number of video layers between two consecutive adaptation windows. A spectrum value of 0 represents the best possible quality, and the value increases as the quality decreases.
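The behaviour described above can be illustrated with a minimal Python sketch. It assumes the form of the spectrum given in [4], i.e. the sum over all steps of the squared deviation of the layer count from its mean over the steps. The function name `spectrum` and the rule used to derive $z_t$ from the layer counts (a step is any window whose layer count differs from the previous one) are assumptions made for illustration, not the thesis implementation.

```python
def spectrum(layers):
    """Spectrum of a session, given the number of layers per adaptation window.

    Assumed form (after [4]): s(v) = sum_t z_t * (h_t - m)^2, where m is the
    mean of h_t over the windows in which a step occurs (z_t = 1).
    """
    # Assumption: z_t = 1 when the layer count differs from the previous
    # window; the first window has no predecessor and never counts as a step.
    z = [0] + [int(cur != prev) for prev, cur in zip(layers, layers[1:])]
    steps = sum(z)
    if steps == 0:
        return 0.0  # no layer changes: the spectrum reports best quality
    mean = sum(zt * ht for zt, ht in zip(z, layers)) / steps
    return float(sum(zt * (ht - mean) ** 2 for zt, ht in zip(z, layers)))
```

Under these assumptions, any session with a constant number of layers yields a spectrum of 0.0, regardless of how many layers are received.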

Further investigation shows that the spectrum does not always capture the fact that a gradual reduction in video quality (a slow decrease in the number of video layers) is generally better than a rapid quality drop.


Figure 40: Rapid and Gradual Drops

In the two figures above, the spectrum calculation gives a result of 2 for both cases. In a crisis situation, the algorithm implemented in this thesis tries to adapt the video quality well by gradually adjusting the number of video layers to transmit. The algorithm thus rests on the assumption that a gradual quality change leads to better perceived quality than a rapid one, a distinction that the spectrum unfortunately does not represent well.

Another drawback of the spectrum metric is that it does not represent quality levels well either. According to the spectrum, constantly receiving only the base layer (lowest quality level) over a number of adaptation windows is just as good as constantly receiving all 16 layers (highest quality level) over the same windows.


Figure 41: Lowest and Highest Quality Reception

The two figures above show that the spectrum calculation does not take into account the fact that the quality level (level of video layers) has an influence on the perceived quality.

As long as there are no layer changes during the streaming session, the result calculated from the spectrum corresponds to perfect quality. In regard to the improvement code of this thesis, this is not an appropriate way to interpret the perceived video quality.

In order to achieve a reasonable assessment of the improvement code, it is necessary to develop a new simple metric for objective quality assessment with regard to the following two issues, as discussed above:

1) The metric must capture the fact that gradual quality changes are better than rapid quality changes.

2) The quality levels have an influence on the perceived video quality and must be taken into consideration by the metric.


Inspired by the spectrum, the following new objective quality metric is developed for this thesis, based on the two issues above. For easy reference, this new metric will be called spectrum2:

Notice that the definitions of $h_t$ and $z_t$ are the same as for the spectrum. In this thesis, the time slot mentioned above corresponds to an adaptation window of fixed duration. That is, there are $T$ adaptation windows in total for the streaming video. As with the spectrum metric, the lower the value, the better the perceived quality.

The first part of equation (d) corresponds to $\frac{1}{\mathit{average}}$, where average indicates the average number of received layers over all the adaptation windows. This means that the higher the quality level is, the lower the value of the ‘layer average’ part. The following example describes this situation.

Figure 42: Lowest and Highest Quality Level

Value of ‘layer average’ part: 1/1 = 1 (lowest quality level); 1/16 = 0.0625 (highest quality level)

As the calculations above show, if the entire streaming video is received at the lowest quality level, the ‘layer average’ part of equation (d) gives a value of 1. If, on the other hand, the highest quality level is achieved throughout, the value is 0.0625. In other words, this metric takes the quality level into consideration when representing the perceived quality, unlike the spectrum, for which the result in both cases above is 0. It should be noted that for perfect perceived quality spectrum2 gives a value of 0.0625, whereas the spectrum gives 0.
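The two values quoted above can be reproduced with a short Python sketch of the ‘layer average’ part. The function name `layer_average_part` and the requirement that every window holds at least the base layer are assumptions made for illustration.

```python
def layer_average_part(layers):
    """Return 1 / average, where average is the mean number of received
    layers over all adaptation windows (one list entry per window)."""
    # Assumption: at least the base layer is received in every window,
    # so the average is never zero.
    if not layers or min(layers) < 1:
        raise ValueError("each window must hold at least one layer")
    average = sum(layers) / len(layers)
    return 1.0 / average
```

For a session that stays at the base layer the function returns 1.0, and for a session that stays at 16 layers it returns 0.0625, independent of the number of windows.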

The second part of equation (d) corresponds to $\frac{\mathit{sum\_layer\_diff}}{\mathit{tot\_num\_layer\_change}}$, where sum_layer_diff indicates the sum of all layer differences between consecutive adaptation windows, and tot_num_layer_change indicates the total number of layer changes for the entire streaming session. In other words, this is a measure of the average layer difference per layer change. That is, the larger the quality changes are, the higher the value of the average layer difference. The following example describes this.

Figure 43: Higher and Lower Quality Changes

Value of ‘average layer difference’ part (one calculation per case)

The calculations above show that if the quality drops gradually, the ‘average layer difference’ part of equation (d) gives a smaller value than when the quality changes faster. This matches the way good perceived quality is defined in this thesis: gradual quality changes are preferable when streaming over a network with highly varying bandwidth.
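The difference between a gradual and a rapid drop can be sketched in Python as follows. The function name `average_layer_difference`, the convention of returning 0 when no layer change occurs, and the two layer sequences are hypothetical and are not the data behind the figures.

```python
def average_layer_difference(layers):
    """Return sum_layer_diff / tot_num_layer_change for a session, given
    the number of received layers per adaptation window."""
    # Absolute layer difference between each pair of consecutive windows.
    diffs = [abs(cur - prev) for prev, cur in zip(layers, layers[1:])]
    # Only windows in which the layer count actually changes are counted.
    changes = [d for d in diffs if d > 0]
    # Assumption: a session without any layer change contributes 0.
    return sum(changes) / len(changes) if changes else 0.0

# Hypothetical sessions that both end three layers below their start.
gradual = [4, 3, 2, 1, 1]  # three changes of one layer each
rapid = [4, 4, 4, 1, 1]    # one change of three layers
```

With these sequences the gradual drop yields 1.0 and the rapid drop yields 3.0, so the rapid drop is rated worse even though both sessions lose the same number of layers overall.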

Since quality changes play a significant role in the assessment of the video quality, the weight is put on the ‘average layer difference’ part of equation (d). That is, if the quality level of a streaming video is high but there are large quality changes during playout, its perceived quality is considered worse than that of a video in which both the quality level and the quality changes are low. The following example explains this more clearly.


Figure 44: Higher Quality Level, High Quality Change

Figure 45: Lower Quality Level, Low Quality Change

s2(v) for the streaming session of figure 44:

s2(v) =

The example above shows that even though the highest quality level (16 video layers) is reached most of the time during the streaming session of figure 44, spectrum2 still considers the perceived quality of this video to be worse than that of the session in figure 45, because of the large quality drop. As mentioned earlier, one of the purposes of the implemented quality-adaptive algorithm is to prevent large quality changes in the video from occurring, such as the one depicted in figure 44. Thus, it is appropriate that spectrum2 rates the quality changes of figure 44 as worse than those of figure 45.
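The comparison above can be sketched in Python under one explicit assumption: that the ‘layer average’ part and the ‘average layer difference’ part are simply added. This is consistent with the value of 0.0625 quoted for a session that stays at 16 layers, but the exact combination is not confirmed by the text. The function name `spectrum2` and the two sessions are hypothetical and are not the data of figures 44 and 45.

```python
def spectrum2(layers):
    """Sketch of spectrum2 for a list of received layer counts, one entry
    per adaptation window. Lower values mean better perceived quality."""
    # Assumption: at least the base layer is received in every window.
    if not layers or min(layers) < 1:
        raise ValueError("each window must hold at least one layer")
    # 'Layer average' part: 1 / (average number of received layers).
    layer_average_part = 1.0 / (sum(layers) / len(layers))
    # 'Average layer difference' part: sum_layer_diff / tot_num_layer_change.
    diffs = [abs(cur - prev) for prev, cur in zip(layers, layers[1:])]
    changes = [d for d in diffs if d > 0]
    difference_part = sum(changes) / len(changes) if changes else 0.0
    # Assumption: the two parts are added without further weighting.
    return layer_average_part + difference_part

# Hypothetical sessions: high level with a large drop, low level with small changes.
high_level_big_drop = [16, 16, 16, 8, 16, 16]
low_level_small_change = [4, 4, 3, 3, 4, 4]
```

Under this assumption the first session scores roughly 8.07 and the second roughly 1.27, so the session with the lower quality level but smaller changes is rated better, in line with the weighting described above.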

Based on the examples given, this new objective metric is shown to be reasonably suitable for representing the perceived quality of layer-encoded video. It can distinguish between gradual and rapid quality changes, and it bases the objective quality on the quality level of the video whenever there are no quality changes during a streaming session. This makes it appropriate enough for the objective assessment of the further test cases in this thesis.

Although this metric works for the objective assessments in this thesis, it is not guaranteed to work in other circumstances. Since the development of an objective metric is not one of the goals of this thesis, further investigation is probably required before this metric can be used in general.