
In document Perception-inspired Tone Mapping (pages 48-53)

3.3 Image Appearance

3.3.1 Perception of Lightness

Any luminance value can be perceived as literally any lightness value (shade of gray) depending on its context within the image. Initially, lightness was assumed to be equivalent to reflectance, which can be obtained by dividing luminance by the estimated illumination: a straightforward realization of lightness constancy. This assumption, however, has been undermined by empirical evidence.
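
This naive lightness-constancy model can be sketched in a few lines (the function name is ours, purely for illustration):

```python
import numpy as np

def naive_lightness(luminance, illumination):
    """Naive lightness-constancy model: treat lightness as reflectance,
    obtained by dividing luminance by an estimate of the illumination.
    This is precisely the assumption that empirical evidence undermined."""
    return np.asarray(luminance, dtype=float) / np.asarray(illumination, dtype=float)

# a surface reflecting 50 cd/m^2 under 100 cd/m^2 of illumination
# is assigned reflectance 0.5 regardless of image context
```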

The problem of lightness perception and lightness constancy has been studied extensively over the last two centuries; a detailed account can be found in [Palmer 1999]. At first, the Gestalt theorists rejected the assumption that luminance per se is the stimulus for lightness. The most prominent theories follow Wallach's observation [Wallach 1948] that the perceived lightness depends on the ratio of the luminance at edges between neighboring image regions. This inspired the retinex theory [Land and McCann 1971], in which it is assumed that even for remote image regions such a ratio can be determined through the edge integration of luminance ratios along an arbitrary path connecting those regions.
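
The edge-integration idea can be sketched as follows (a minimal illustration, not the published algorithm; the function name and the threshold value are our assumptions). Luminance ratios along a pixel path are accumulated in the log domain, and ratios close to one are discarded as gradual illumination changes:

```python
import numpy as np

def retinex_path_ratio(luminance, path, threshold=0.05):
    """Estimate the relative lightness of the last path pixel with respect
    to the first by integrating luminance ratios at edges along the path
    (in the spirit of Land and McCann's retinex).

    luminance: 2D array of linear luminance values
    path: list of (row, col) coordinates connecting two image regions
    threshold: log-ratios below this magnitude are treated as slow
               illumination gradients and ignored (assumed value)
    """
    log_ratio_sum = 0.0
    for (r0, c0), (r1, c1) in zip(path[:-1], path[1:]):
        log_ratio = np.log(luminance[r1, c1]) - np.log(luminance[r0, c0])
        if abs(log_ratio) > threshold:  # keep only sharp (reflectance) edges
            log_ratio_sum += log_ratio
    return float(np.exp(log_ratio_sum))
```

Because the ratios are integrated along an arbitrary connecting path, the same relative lightness can be recovered even for regions that share no common edge.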

Lightness can be well modeled by the retinex algorithm under the condition that the illumination changes slowly, which effectively means that sharp shadow borders cannot be processed properly. To overcome this problem, Gilchrist and his collaborators suggested that the human visual system performs an edge classification to distinguish illumination edges from reflectance edges [Gilchrist 1977]. This led to the concept of decomposing retinal images into so-called intrinsic images [Barrow and Tenenbaum 1978, Arend 1994], with reflectance, illumination, depth, and other information stored in independent image layers. The lightness perception theories based on intrinsic images can predict lightness constancy very successfully; however, they define only relative lightness values for the various scene regions. Their important shortcoming is the lack of a rule that would define the association between the predicted relative lightness and the perceived white, grays, and black across the whole scene. Furthermore, being developed for good lightness prediction, these theories fail to account for the lightness constancy failures typical of human vision [Gilchrist et al. 1999].

The mapping of relative lightness to the perceived shades of gray is solved by anchoring. There are several rules of anchoring, each of which defines a method to assign one particular absolute lightness value (e.g. white, black, middle gray) to one relative lightness value: the so-called anchor value. The remaining mapping can then be found immediately through the known lightness ratios. In particular, although anchoring was not initially part of the intrinsic image models, it can be applied to them directly by mapping the maximum value in the reflectance layer to white.
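
Such an anchoring step under the highest-luminance-is-white rule might look as follows (a sketch; the function name and the white reflectance value of 0.9 are our assumptions):

```python
import numpy as np

def anchor_to_white(relative_lightness, white_reflectance=0.9):
    """Anchor relative lightness values: the maximum relative value is
    declared white, and all remaining values follow immediately from
    the known lightness ratios."""
    rl = np.asarray(relative_lightness, dtype=float)
    return rl / rl.max() * white_reflectance
```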

The problem of lightness constancy failures and absolute lightness assignment, taken together, is addressed by the anchoring theory of lightness perception developed by Gilchrist et al. [Gilchrist et al. 1999]. One of the key arguments of the theory is that the lightness mapping can differ even within a single image, depending on the considered context. In this theory, such ambiguity is accounted for by the concept of frameworks. Frameworks are image components that are grouped in terms of Gestalt principles: mainly by common illumination, but also by proximity, similarity, co-planarity, good continuation, and common fate. An image is composed of multiple frameworks whose areas can overlap. The anchoring rule gives correct lightness estimates only when considered within one framework. The net lightness of a surface in an image can be found by estimating the influence of each framework on that surface and by calculating the weighted product of the lightness mappings within the individual frameworks.
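
The combination rule for a single surface can be sketched as a weighted geometric mean of the per-framework lightness estimates (a schematic reading of the theory; the function name and inputs are our assumptions):

```python
import numpy as np

def net_lightness(framework_lightness, influences):
    """Combine per-framework lightness estimates for one surface.

    framework_lightness: lightness assigned to the surface within each
                         framework it belongs to
    influences: the estimated influence (weight) of each framework on
                the surface; normalized to sum to one
    Returns the weighted product (geometric mean) of the estimates.
    """
    w = np.asarray(influences, dtype=float)
    w = w / w.sum()  # normalize the framework influences
    L = np.asarray(framework_lightness, dtype=float)
    return float(np.exp(np.sum(w * np.log(L))))
```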

The main weakness of the lightness perception theories is that they are given in a descriptive form and lack computational models. To account for correct lightness reproduction in tone mapping, we formulate a computational model of the anchoring theory of lightness perception in Chapter 5. Our choice of this theory is motivated by its sound explanation of the particular appearance of many experimental scenes and by its extensive experimental validation with human subjects.

Chapter 4

Real-time Tone Mapping for HDR Video

Low dynamic range (LDR) image and video contents, which are stored in the display-referred representation, are usually shown directly on a display. The parameters of a reference display and the preferable observation conditions are well defined in standards [ITU 1990] and guarantee that such a straightforward depiction delivers good quality. While the general parameters of displays usually follow the standard, the observation conditions may not match the reference. Users can compensate for this mismatch by adjusting a few parameters like contrast, brightness, and saturation to improve the picture; however, the range of adjustment is limited. Moreover, excessive adjustment of these controls may not only lie beyond the capabilities of a display, but may also reveal artifacts in the image contents. This happens because the display-referred representation contains only sufficient image and video quality to produce good results under the assumed conditions. For instance, too strong a contrast amplification would show contouring artifacts. On the other hand, a strong increase of brightness would not reveal the details of dark picture parts as one might perhaps expect, because the brightness of these parts lies outside the dynamic range of a reference display and therefore their contents have not been stored in the stream.

Contrary to the display-referred contents, scene-referred HDR contents are not limited to the capabilities of typical displays. HDR video compression [Mantiuk et al. 2004], for example, stores as wide a luminance range as the human eye can observe in a real-world scene. This in most cases largely exceeds the capabilities of displays; therefore such scene-referred contents require processing (tone mapping) prior to display. Such processing is performed on the side of the target display and thus has several notable advantages with respect to the display-referred representation. First, the HDR contents can be processed in such a way that delivers the best quality on the actually used display under the actual observation conditions. Second, the limited range of brightness and contrast adjustments can be relaxed and, moreover, the quality of their effect is improved. For instance, the brightness correction of HDR data reveals the contents of too bright or too dark picture parts because, unlike in the case of LDR data, the information there is not clipped. Finally, the ample amount of luminance information in HDR video permits adding new controls that take advantage of the available dynamic range


and can increase the realism of a picture. We observe that, for a range of luminance levels typical of certain scenes like nights or sunny days, the depiction of true luminance is not feasible on displays. Hence all such scenes appear almost the same on the screen, although an average observer would expect to see bright saturated colors only on a sunny day, while subdued grayish tones with low acuity are common at night and a veiling glare usually appears around bright lights. When these phenomena are ignored, clearly visible details appear unrealistic in dimly illuminated scenes, because the acuity of human vision is normally degraded in such conditions. On the other hand, perceptual effects like glare cannot be evoked because the maximum luminance of typical displays is not high enough. However, we are so used to the presence of such phenomena that adding glare to an image can increase the subjective brightness of the tone mapped image [Spencer et al. 1995]. Therefore, by simulating such perceptual phenomena, we can increase the realism of HDR contents by reducing the appearance mismatch between the real world and the display.

In our work, we focus on a real-time implementation of a high quality tone mapping operator and on the introduction of new controls that take advantage of the luminance range available in the stream. To match the real-world appearance of recorded HDR scenes, we enhance the tone mapping algorithm by incorporating the most significant perceptual effects that are related to the absolute luminance levels in the scene and to the optics of the eye (Section 3.1). Improving over previous work, we observe that these effects have much in common in terms of spatial analysis and show that making use of such similarities has a tremendous impact on the performance. Further, we add functionality for the convenient inspection of verbatim information in a selectable dynamic range. Such an inspection tool is necessary in cases where it is required to view the exact contents of the recorded scene, as for instance in forensic applications. We implement our approach in graphics hardware as a stand-alone HDR image/video processing module and achieve real-time performance. The computational overhead of our extensions to tone mapping is negligible. Although we primarily demonstrate the module in the context of HDR video playback, it can equally well be applied to the final stage of a real-time renderer or to other streams of HDR contents, such as input from a surveillance camera or data for visualization.

4.1 Previous Work

The tone mapping of HDR contents has been widely addressed in research and we have introduced most of the existing algorithms in Section 2.5. Simple algorithms, which are based on a tone reproduction curve, can be implemented very efficiently in graphics hardware [Drago et al. 2003], but such methods fail to reproduce fine details in HDR scenes. Most of the recent algorithms deliver a higher quality, but at the cost of increased complexity, and only a few are able to achieve interactive rates at 1 Mpx resolution [Goodnight et al. 2003]. In contrast, our work is unique in the sense that we aim at real-time tone mapping performance without compromising quality.
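
As a concrete illustration of the tone reproduction curve class of operators, consider a minimal logarithmic curve (for illustration only; this is not the exact formula of Drago et al.):

```python
import numpy as np

def log_tone_curve(luminance):
    """Minimal global tone reproduction curve: compress scene luminance
    into [0, 1] with a logarithmic mapping applied identically to every
    pixel. Being a single per-pixel curve, it cannot preserve fine local
    details in high-contrast regions."""
    L = np.asarray(luminance, dtype=float)
    return np.log1p(L) / np.log1p(L.max())
```

Because the same curve is applied everywhere, such operators map very cheaply to graphics hardware, which explains their efficiency; the loss of fine detail is the price of this simplicity.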

Certain perceptual effects, like the lack of visual acuity or color perception in night scenes, have already been accounted for in several tone mapping algorithms [Ferwerda et al. 1996, Ward et al. 1997, Durand and Dorsey 2000, Pattanaik et al. 2000]. These effects, however, have been discussed only in the context of global operators and have

