Mathematical Framework for Response Estimation

In document Perception-inspired Tone Mapping (pages 132-135)

The camera response is estimated from a set of input images based on the expectation maximization approach [Robertson et al. 2003]. The input images capture exactly the same scene, with correspondence at the pixel level, but the exposure parameters differ for each image. The exposure parameters have to be known, and the camera response is observed as a change in the output pixel values with respect to a known change in irradiance. For the sake of clarity, in this section we assume that the only varying parameter is the exposure time, but in the general case it is necessary to know how many times more or less energy has been captured during each exposure. Since the exposure time is proportional to the amount of light captured by an image sensor, it serves well as the required factor. In the mathematical formulas below, we follow the notation given in Table A.1 and consider only images with one channel.

There are two unknowns in the estimation process. The primary unknown, the camera response function I, models the relation between the camera output values and the irradiance at the camera sensor, or the luminance in the scene. The camera output values for a scene are provided as input images, but the irradiance x coming from the scene is the second unknown. The estimation process starts with an initial guess on the camera response function, which can for instance be a linear response, and consists of two steps that are iterated. First, the irradiance from the scene is computed from the input images based on the currently estimated camera response. Second, the camera response is refined to minimize the error in mapping pixel values from all input images to the computed irradiance. The process is terminated when the iteration step no longer improves the camera response. We explain the details of the process below.

Estimation of Irradiance

Assuming that the camera response function I is correct, the pixel values in the input images are mapped to the relative irradiance by using the inverse function I^-1. Such relative irradiance is proportional to the true irradiance from the scene by a factor influenced by the exposure parameters (e.g. exposure time), and the mapping is called linearization of camera output. The relative irradiance is further normalized by the exposure time t_i to estimate the amount of energy captured per unit of time in the input image i at pixel position j:

\[ x_{ij} = \frac{I^{-1}(y_{ij})}{t_i}. \tag{A.1} \]
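The linearization step of equation (A.1) can be sketched as follows; the 256-entry lookup table standing in for I^-1 (here simply linear) and all names are illustrative, not part of the original text.

```python
import numpy as np

# Hypothetical sketch of equation (A.1): linearize the camera output with
# the inverse response, then normalize by the exposure time t_i. The
# lookup table plays the role of I^-1 for an 8-bit camera.
inv_response = np.linspace(0.0, 1.0, 256)

def relative_irradiance(y, t):
    """Map 8-bit pixel values y of one exposure to irradiance per unit time."""
    return inv_response[y] / t

y = np.array([[0, 128, 255]], dtype=np.uint8)  # one tiny input image
x = relative_irradiance(y, t=0.5)              # exposure time t_i = 0.5 s
```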

Each of the x_i images contains a part of the full range of irradiance values coming from the scene. This range is determined by the exposure settings and is limited by the dynamic range of the camera sensor. The complete irradiance at the sensor is estimated from the weighted average of these partial captures:

\[ x_j = \frac{\sum_i w_{ij} \cdot x_{ij}}{\sum_i w_{ij}}. \tag{A.2} \]

The weights w_ij are determined for camera output values by the certainty model discussed later in this section. Importantly, the weights for the maximum and minimum camera output values are equal to 0, because the captured irradiance is bound to be incorrect in the pixels for which the sensor has been saturated or has captured no energy.
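The weighted average of equation (A.2), with the required zero weights at the extremes, might look as follows; the function names and the trivial weight function are illustrative.

```python
import numpy as np

# A minimal sketch of equation (A.2): merge the per-exposure irradiance
# estimates x_ij into a single estimate x_j by a weighted average.
# Saturated (255) and empty (0) pixels are forced to zero weight.
def merge_exposures(x_stack, y_stack, weight_fn):
    """x_stack, y_stack: arrays of shape (n_exposures, n_pixels)."""
    w = weight_fn(y_stack)
    w[(y_stack == 0) | (y_stack == 255)] = 0.0   # discard unreliable pixels
    return (w * x_stack).sum(axis=0) / np.maximum(w.sum(axis=0), 1e-12)

# Two exposures of two pixels; the first pixel is saturated in exposure 0.
x_stack = np.array([[4.0, 2.0], [2.0, 2.0]])
y_stack = np.array([[255, 100], [200, 50]])
merged = merge_exposures(x_stack, y_stack, lambda y: np.ones(y.shape))
```

Because the saturated sample is excluded, the first pixel's estimate comes entirely from the second exposure.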

Refinement of Camera Response

Assuming that the irradiance at the sensor x_j is correct, one can recapture the camera output values ȳ_ij in each of the input images i by using the camera response:

\[ \bar{y}_{ij} = I(t_i \cdot x_j). \tag{A.3} \]

In the ideal case, when the camera response I is perfectly estimated, the recaptured values ȳ_ij are equal to the observed values y_ij. During the estimation process, however, the camera response function needs to be optimized for each camera output value m by averaging the recaptured irradiance x_j over all pixels in the input images y_ij that are equal to m:

\[ E_m = \{(i,j) : y_{ij} = m\}, \tag{A.4} \]

\[ I^{-1}(m) = \frac{1}{\mathrm{Card}(E_m)} \sum_{(i,j) \in E_m} t_i \cdot x_j. \tag{A.5} \]
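The refinement of equations (A.4) and (A.5) can be sketched as a per-level average; the function and variable names below are illustrative.

```python
import numpy as np

# A sketch of equations (A.4)-(A.5): the inverse response I^-1 at output
# level m is refined as the average of the simulated exposures t_i * x_j
# over the index set E_m of all pixels whose camera output equals m.
def refine_inverse_response(y_stack, x, times, n_levels=256):
    inv = np.zeros(n_levels)
    exposure = times[:, None] * x[None, :]    # t_i * x_j for every (i, j)
    for m in range(n_levels):
        in_E_m = (y_stack == m)               # membership in E_m, eq. (A.4)
        if in_E_m.any():
            inv[m] = exposure[in_E_m].mean()  # average over E_m, eq. (A.5)
    return inv

inv = refine_inverse_response(np.array([[1, 1], [2, 2]]),
                              x=np.array([1.0, 3.0]),
                              times=np.array([1.0, 2.0]),
                              n_levels=4)
```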

Certainty Model

The presence of noise in the capture process is conveniently neglected in the capture model in equations (A.1) and (A.3). A complete capture model would require characterization of the possible sources of noise and the incorporation of appropriate noise terms into the equations. This would require further measurements and analysis of the particular capture technology in the camera, and thus is not practical. Instead, the noise term can be accounted for by an intuitive measure of confidence in the accuracy of the captured irradiance. In typical 8-bit cameras, for instance, one would expect high noise in the low camera output values, quantization errors in the high values, and good accuracy in the middle range.

An appropriate certainty model can be defined by the following Gaussian function:

\[ w(m) = \exp\left( -4 \cdot \frac{(m - 127.5)^2}{127.5^2} \right). \tag{A.6} \]

The certainty model can be further extended with knowledge about the capture process.

Normally, longer exposure times, which allow the sensor to capture more energy, tend to exhibit less random noise than short ones. Therefore, an improved certainty model for input images y_ij can be formulated as follows:

\[ w_{ij} = w(y_{ij}) \cdot t_i^2. \tag{A.7} \]

Such a weighting function minimizes the influence of noise on the estimation of irradiance in equation (A.2). This happens apart from the noise-reducing properties of the image averaging process itself.
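The complete certainty model, the Gaussian of equation (A.6) combined with the squared exposure time of equation (A.7), can be sketched as follows; the function names are illustrative.

```python
import numpy as np

# Sketch of the certainty model for an 8-bit camera: full confidence near
# the mid-range value 127.5 (eq. A.6), scaled by the squared exposure time
# so that longer, less noisy exposures receive larger weights (eq. A.7).
def certainty(m):
    return np.exp(-4.0 * (m - 127.5) ** 2 / 127.5 ** 2)

def exposure_weight(y, t):
    return certainty(y) * t ** 2

w_long = exposure_weight(127.5, 2.0)    # mid-range pixel, long exposure
w_short = exposure_weight(127.5, 0.5)   # same pixel, short exposure
```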

Minimization of Objective Function

After the initial assumption on the camera response I, which is usually linear, the response is refined by iteratively computing equations (A.2) and (A.5). At the end of every iteration, the quality of the estimated camera response is measured with the following objective function:

\[ O = \sum_{i,j} w(y_{ij}) \cdot \left( I^{-1}(y_{ij}) - t_i \cdot x_j \right)^2. \tag{A.8} \]

The objective function measures the error in the estimated irradiance for the input images y_ij when compared to the simulated capture of the true irradiance x_j. The certainty model ensures that the camera output values in the range of high confidence contribute more strongly to the error measure. The estimation process is terminated as soon as the objective function O falls below a predetermined threshold.
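The objective of equation (A.8) amounts to a certainty-weighted sum of squared residuals; a minimal sketch, with illustrative names:

```python
import numpy as np

# A sketch of the objective (A.8): the certainty-weighted squared error
# between the linearized pixel values I^-1(y_ij) and the simulated
# exposures t_i * x_j. A perfectly consistent response and irradiance
# pair yields O = 0.
def objective(inv_response, y_stack, x, times, weight_fn):
    linear = inv_response[y_stack]             # I^-1(y_ij)
    simulated = times[:, None] * x[None, :]    # t_i * x_j
    return float((weight_fn(y_stack) * (linear - simulated) ** 2).sum())

# A consistent toy configuration: linear response, exact irradiance.
O = objective(np.arange(4, dtype=float),
              np.array([[1, 2]]),
              x=np.array([1.0, 2.0]),
              times=np.array([1.0]),
              weight_fn=lambda y: np.ones(y.shape))
```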

The estimation process requires an additional constraint, because two dependent unknowns are calculated simultaneously. Precisely, the values of x_j depend on the mapping of I, and the equations are satisfied by infinitely many solutions for I which differ by a scale factor. Convergence to a single solution is enforced, in each iteration, through normalization of the inverse camera response I^-1 by the irradiance causing the medium camera output value, I^-1(m_med).
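Putting the pieces together, the whole estimation loop might be sketched as below, under simplifying assumptions: an 8-bit single-channel camera, a 256-entry lookup table for I^-1, the Gaussian certainty of equation (A.6), and a fixed iteration count instead of the threshold on O. All names are illustrative; the method itself follows Robertson et al.

```python
import numpy as np

def certainty(m):
    # Gaussian certainty model of equation (A.6).
    return np.exp(-4.0 * (m - 127.5) ** 2 / 127.5 ** 2)

def estimate_response(y_stack, times, n_iter=20):
    """y_stack: (n_exposures, n_pixels) of 8-bit values; times: (n_exposures,)."""
    inv = np.linspace(0.0, 1.0, 256)                 # initial guess: linear
    w = certainty(y_stack.astype(float)) * times[:, None] ** 2  # eq. (A.7)
    w[(y_stack == 0) | (y_stack == 255)] = 0.0       # unreliable extremes
    for _ in range(n_iter):
        # Step 1: irradiance from the current response, eqs. (A.1)-(A.2).
        x_ij = inv[y_stack] / times[:, None]
        x = (w * x_ij).sum(axis=0) / np.maximum(w.sum(axis=0), 1e-12)
        # Step 2: refine the inverse response, eqs. (A.4)-(A.5).
        exposure = times[:, None] * x[None, :]
        for m in range(256):
            in_E_m = (y_stack == m)
            if in_E_m.any():
                inv[m] = exposure[in_E_m].mean()
        # Remove the scale ambiguity: normalize by the mid-level response.
        inv = inv / inv[128]
    return inv, x

# Two exposures of a three-pixel scene captured by a roughly linear camera.
inv, x = estimate_response(np.array([[26, 51, 77], [51, 102, 153]]),
                           times=np.array([0.5, 1.0]))
```

A fixed iteration count is used here only to keep the sketch short; the text terminates instead when the objective O of equation (A.8) falls below a threshold.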
