Validation and Discussion - Atmospheric Correction over Coastal Waters Based on Machine Learning Models

This chapter presents the optimal results of AC and IOP retrieval using the ML models NN, PLSR, SGDR and SVR. Five statistical metrics were used to evaluate model performance: the squared Pearson correlation coefficient (R²), the average percentage difference (APD), the mean percentage bias (Bias), the root mean square difference (RMSD), and the normalized root mean square difference (NRMSD). The metrics, with their abbreviations, formulas and interpretations, are described in Tab. 5.1.

5.1 Atmospheric Correction Validation and Discussion

Three different AC approaches were tested in this study, each correcting for different atmospheric effects. The correction of L_rac to R_rs (ML1 in Fig. 4.1) is discussed first, followed by the correction of absorption and Rayleigh scattering, where L_rac was predicted from L_t (ML2), and the full AC of L_t to R_rs (ML3). As the main focus of this study was on ML1, the optimization and results of ML2 and ML3 are not described in the same level of detail as for ML1.

5.1.1 Atmospheric Correction of L_rac to R_rs

This section presents and discusses the AC of Rayleigh- and absorption-corrected radiance with all four ML models, where the goal was to predict R_rs from L_rac (ML1). The choice of the model-specific hyperparameters and Savitzky-Golay filters for this approach was based on the findings presented in section 4.4, where hyperparameter optimization based on AC of L_rac was carried out. The angle pre-processing step was also applied. The models are compared with each other in terms of the metrics described in Tab. 5.1, time complexity and interpretability. One sample of the input L_rac(λ) and output R_rs(λ) is shown in Fig. 5.1 to illustrate the spectral differences.

Table 5.1: Metrics used for validation with abbreviation, formula and interpretation. X_i and Y_i are the predicted (forecasted) and simulated (actual) data, respectively. N is the number of data points that is validated.

Metric | Formula | Interpretation
R² | | Measures the linear correlation between two variables X and Y. The value ranges from -1 to 1, where -1 is total negative linear correlation, 0 is no linear correlation, and 1 is total positive linear correlation.
APD | | Averaged percentage difference is the average of the absolute value of the relative change.
Bias | | Bias is the average of percentage errors, by which forecasts of a model differ from actual values of the quantity being forecast.
RMSD | $\sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i - Y_i)^2}$ | Root mean squared difference is an accuracy measure given as the square root of the mean of the squares of the deviations. It is sensitive to outliers and depends on the scale of the numbers used.
NRMSD | $\frac{\mathrm{RMSD}}{Y_{\max} - Y_{\min}}$ | Normalized RMSD is the RMSD divided by the difference of the maximum and minimum value. This property is not scale dependent and would be useful when comparing different bands.
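As an illustration of how the five metrics in Tab. 5.1 could be computed for one wavelength band, a minimal NumPy sketch is given here. The RMSD and NRMSD lines follow the formulas in the table. The APD and Bias lines assume that the relative difference is normalized by the simulated value Y_i and expressed in percent, which is an assumption and not confirmed by the table. The function name `validation_metrics` and the sample values are hypothetical.

```python
import numpy as np

def validation_metrics(x_pred, y_sim):
    """Return R2, APD [%], Bias [%], RMSD and NRMSD for one wavelength band."""
    x = np.asarray(x_pred, dtype=float)
    y = np.asarray(y_sim, dtype=float)
    r = np.corrcoef(x, y)[0, 1]            # Pearson correlation coefficient
    rel = (x - y) / y                      # ASSUMPTION: normalized by the simulated value
    rmsd = np.sqrt(np.mean((x - y) ** 2))  # RMSD as in Tab. 5.1
    return {
        "R2": r ** 2,
        "APD": 100.0 * np.mean(np.abs(rel)),
        "Bias": 100.0 * np.mean(rel),
        "RMSD": rmsd,
        "NRMSD": rmsd / (y.max() - y.min()),  # NRMSD as in Tab. 5.1
    }

# Made-up example values, not data from the study.
y_sim = np.array([0.002, 0.004, 0.006, 0.008])
x_pred = np.array([0.0022, 0.0038, 0.0063, 0.0080])
m = validation_metrics(x_pred, y_sim)
print(m)
```

Applying the function band by band and averaging over the bands would give mean values of the kind reported per model, under the same assumptions.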


Figure 5.1: One sample of input L_rac(λ) and output R_rs(λ) to illustrate the spectral differences. (Plot: spectra versus wavelength [nm] from 400 to 800 nm, with curves L_rac(λ) and R_rs(λ).)

Validation based on metrics and time complexity

The training data for the ML models included 81 wavelength bands of L_rac ranging from 400 to 800 nm, in addition to the three sun-target-sensor angles (θ0, θ, ∆φ). The predicted output was the remote sensing reflectance for the corresponding 81 wavelength bands.
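The input and output layout described above can be sketched as arrays. This is only a shape illustration: the spectra and angles are random stand-ins, the sample count is a placeholder, and the even 5 nm band spacing is an assumption inferred from 81 bands spanning 400 to 800 nm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5                              # placeholder sample count
wavelengths = np.linspace(400, 800, 81)    # ASSUMPTION: evenly spaced bands (5 nm)

# Random stand-ins for the L_rac spectra and the angles (theta_0, theta, delta_phi).
l_rac = rng.uniform(0.0, 0.025, size=(n_samples, wavelengths.size))
angles = rng.uniform(0.0, 60.0, size=(n_samples, 3))

X = np.hstack([l_rac, angles])             # model input: 81 bands + 3 angles
y = rng.uniform(0.0, 0.025, size=(n_samples, wavelengths.size))  # stand-in R_rs targets

print(X.shape, y.shape)
```

Each sample then has 84 input features and 81 output values, one per wavelength band.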

Mean metric values for each ML model were calculated, and the optimal results are shown in Tab. 5.2. Tab. 5.2 also shows the number of data points used for training (N_train), the time for fitting the models (T_fit) and the time for predicting the output (T_pred). The prediction time is the time taken to predict all the validation data divided by the number of validation samples (N_val). The corresponding metrics for each wavelength band are shown in Appendices A.1, A.2, A.3 and A.4.
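The per-sample prediction time defined above could be measured along the following lines. This is a sketch under assumptions: the helper `per_sample_prediction_time` is hypothetical, and the "model" is a fixed random linear map standing in for a fitted model, not one of the models from the study.

```python
import time
import numpy as np

def per_sample_prediction_time(predict, x_val):
    """Time one prediction pass over all validation samples and divide by N_val."""
    start = time.perf_counter()
    predict(x_val)
    elapsed = time.perf_counter() - start
    return elapsed / len(x_val)

# Hypothetical stand-in for a fitted model: a linear map from 84 inputs to 81 outputs.
rng = np.random.default_rng(0)
weights = rng.normal(size=(84, 81))
x_val = rng.normal(size=(1000, 84))

t_pred = per_sample_prediction_time(lambda x: x @ weights, x_val)
print(f"T_pred = {t_pred:.2e} s per sample")
```

A single timed pass is noisy; repeating the measurement and averaging would give a more stable estimate.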

Table 5.2: Optimal results for AC of Rayleigh- and absorption-corrected TOA radiance (L_rac) with NN, PLSR, SGDR and SVR, based on the mean of the different metrics. In addition, the time to fit the model (T_fit), the time to predict the output (T_pred) and the number of training data (N_train) are given.

Metrics | NN | PLSR | SGDR | SVR (rbf) | SVR (lin)
R² | 0.999 | 0.974 | 0.968 | 0.995 | 0.968
APD [%] | 4.42 | 34.1 | 22.03 | 21.4 | 32.1
Bias [%] | -0.40 | 9.20 | 5.70 | 3.62 | 11.8
NRMSD | 0.045 | 0.197 | 0.223 | 0.080 | 0.229
T_fit [s] | 675 | 166 | 424 | 1158 | 19743
T_pred [s/N] | 1.3×10⁻³ | 1.03×10⁻⁴ | 1.18×10⁻² | 2.75×10⁻² | 9.66×10⁻³
N_train | 91702 | 91702 | 91702 | 14250 | 14250

The NN model showed the best results for all the metrics (Tab. 5.2). The linear models PLSR, SGDR and SVR(lin) performed very similarly on the metrics, although PLSR was somewhat faster in both fitting and prediction. SVR(rbf) gave better results on R², Bias and NRMSD than all the linear models, with R² and NRMSD values close to those achieved with NN. In general, the two non-linear models, NN and SVR(rbf), performed better on all the metrics than the linear models.

The training time for SVR was much higher than for the other models, especially with the linear kernel, considering that it was trained on only 15.5 % of the data used for the other ML models. Even with much less data, however, the SVR(rbf) model achieved better metric values than the linear models. PLSR was the best with respect to training time (T_fit) and prediction time (T_pred). Nevertheless, the training time of the NN could be reduced to a level comparable to that of the linear models while still giving better metric values. The fitting time of the NN depended on the number of epochs used in training and on the batch size. The NN thus has a trade-off between accuracy and training time, where longer training generally increases the accuracy. However, once the NN model is fully trained, it runs very fast, and AC can be done without retraining the model each time. This is also the case for the linear models. SVR, on the other hand, might spend even more time on the prediction, as the input data have to be transformed with the non-linear kernel functions. This can also be seen in Tab. 5.2, where SVR(rbf) has the highest prediction time (T_pred = 2.75×10⁻² s/N).
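The comparison of training cost can be made concrete with the numbers in Tab. 5.2. The sketch below recomputes the share of the training data used by SVR and a fitting time per training sample. The per-sample normalization is an illustrative choice and is not a metric reported in the study.

```python
# Values copied from Tab. 5.2.
n_train = {"NN": 91702, "PLSR": 91702, "SGDR": 91702, "SVR (rbf)": 14250, "SVR (lin)": 14250}
t_fit = {"NN": 675, "PLSR": 166, "SGDR": 424, "SVR (rbf)": 1158, "SVR (lin)": 19743}

# Share of the full training set that SVR was trained on, in percent.
svr_share = 100 * n_train["SVR (rbf)"] / n_train["NN"]
print(f"SVR training share: {svr_share:.1f} %")

# Fitting time per training sample (illustrative normalization, in seconds).
fit_per_sample = {model: t_fit[model] / n_train[model] for model in t_fit}
for model, value in fit_per_sample.items():
    print(f"{model:10s} {value:.2e} s per training sample")
```

Normalized this way, both SVR variants cost more per training sample than NN, PLSR and SGDR, with the linear kernel the most expensive.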

Fig. 5.2 presents scatterplots of predicted and simulated R_rs for the wavelength bands 400, 500, 600 and 700 nm. These four bands were chosen as representative to illustrate the responses for individual bands in different parts of the wavelength range.

In the figure, each row represents results from one ML model and each column represents one of the four wavelength bands. The orange lines mark where the predicted and simulated data are equal, corresponding to R² equal to 1.0 and a perfect prediction. The wavelength band and the corresponding metrics R², APD and NRMSD are given in the bottom right and top left of each plot, respectively. The red and blue dots represent data points classified as Case 1 and Case 2 waters, respectively. There is no clear definition of how to separate data into Case 1 and Case 2 waters based on the remote sensing reflectance; in this study, spectra with R_rs(665) < 0.0005 were assigned to Case 1 and the rest to Case 2, as done in [33].
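The Case 1 and Case 2 criterion can be written as a one-line threshold on the 665 nm band. In the sketch below, the function name `classify_water_case`, the evenly spaced wavelength grid and the example spectra are assumptions made for illustration; only the threshold of 0.0005 on R_rs(665) comes from the text.

```python
import numpy as np

def classify_water_case(r_rs, wavelengths, threshold=0.0005):
    """Return 1 for spectra with R_rs(665) below the threshold, otherwise 2."""
    band = int(np.argmin(np.abs(wavelengths - 665.0)))  # index of the band nearest 665 nm
    return np.where(r_rs[:, band] < threshold, 1, 2)

wavelengths = np.linspace(400, 800, 81)    # ASSUMPTION: evenly spaced bands (5 nm)

# Three made-up spectra: uniformly low, uniformly high, and high except at 665 nm.
r_rs = np.full((3, wavelengths.size), 0.001)
r_rs[0, :] = 0.0002
r_rs[2, 53] = 0.0004                       # index 53 corresponds to 665 nm on this grid

cases = classify_water_case(r_rs, wavelengths)
print(cases)
```

Only the value at 665 nm decides the class, so the third spectrum is assigned to Case 1 even though all its other bands are high.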

From the scatterplots one can observe that the predictions closest to the ideal orange line are found for NN and SVR, which agrees with the results in Tab. 5.2, where NN and SVR have the highest values of R². The patterns of the scatterplots for PLSR and SGDR also look very similar, consistent with their similar metric values in Tab. 5.2. Common to all the ML models is that R² is lowest for the 400 nm band. Moreover, the metric values NRMSD, APD and R² vary between wavelength bands. The best values of APD and NRMSD are, for more or less all the ML models, found at the 500 nm band. How well the different bands are predicted varies between the ML models and should be investigated further.


Figure 5.2: Scatterplots of predicted and simulated R_rs(λ) from L_rac(λ) for the wavelength bands 400, 500, 600 and 700 nm (indicated with text), with corresponding R², APD and NRMSD values. Red and blue dots represent Case 1 and Case 2 waters, respectively.
