Stability with respect to tiny worst-case perturbations

Below follows a description of the test as applied to the networks for which we detect instabilities with respect to tiny perturbations.

I. On instabilities of deep learning in image reconstruction

[Figure I.4 panels. Row 1: Original x; x + r_1; Original x̃; x̃ + r_2. Row 2: MRI-VN f(Ax); MRI-VN f(A(x + r_1)); MED 50 f̃(Ãx̃); MED 50 f̃(Ã(x̃ + r_2)). Row 3: SoA from Ax; SoA from A(x + r_1); SoA from Ãx̃; SoA from Ã(x̃ + r_2).]

Figure I.4: Perturbations r_1, r_2 (created to simulate the worst-case effect) are added to the images x and x̃ in the first row. The reconstructions by the network f (MRI-VN), from Ax and A(x + r_1), and the network f̃ (MED 50), from Ãx̃ and Ã(x̃ + r_2), are shown in the second row. A is a subsampled discrete Fourier transform and Ã is a subsampled Radon transform. State-of-the-art (SoA) comparisons are shown in the last row.

[Figure I.5 panels. Rows 1 to 4, left to right: Original x_j + r_j; Original x_j + r_j (zoomed); network reconstruction f_j(A_j(x_j + r_j)) with Ell 50, DAGAN, MRI-VN and Deep MRI respectively; SoA from A_j(x_j + r_j). Last row, panel titles: Ell 50/Med 50; DAGAN; VN-MRI; Deep MRI.]

Figure I.5: First four rows: Images x_j plus structured perturbations r_j (in the form of text and symbols) are reconstructed from measurements y_j = A_j(x_j + r_j) with neural networks f_j and state-of-the-art (SoA) methods. The networks are: f_1 = Ell 50, f_2 = DAGAN, f_3 = MRI-VN, f_4 = Deep MRI. The sampling modalities A_j are: A_1 is a subsampled discrete Radon transform, A_2 is a subsampled discrete Fourier transform (single-coil simulation), A_3 is a superposition of subsampled discrete Fourier transforms (parallel MRI simulation with 15 coil elements), A_4 is a subsampled discrete Fourier transform (single coil).

Note that Deep MRI has not been trained with images containing any of the letters or symbols used in the perturbation, yet it is completely stable with respect to the structural changes. The same is true for the AUTOMAP network (see first column of Figure I.3). Last row: The figures show PSNR as a function of the subsampling rate for different networks. The red line indicates the subsampling ratio that the networks were trained for.

Deep MRI: In this test we perturb the image x with a sequence of perturbations {r_j}_{j=1}^{3} with |r_1| < |r_2| < |r_3| in order to simulate how the instabilities continuously transform the reconstructed image from a very high quality reconstruction to an almost unrecognisable distortion. This is illustrated in the lower row of Figure I.1. Note that the perturbations are almost invisible to the human eye, as demonstrated in the upper row of Figure I.1. The perturbations r_j are created by early stopping of the iterative algorithm that solves for the optimal worst-case perturbation. The purpose of this experiment is to demonstrate how the gradual change in perturbation creates artefacts that may be hard to identify as non-physical. Indeed, the worst-case perturbation r_3 clearly causes a reconstruction that, in a real-world situation, can be dismissed by a clinician as non-physical. However, for the smallest perturbation r_1, which is completely invisible to the human eye, the resulting reconstruction is hard to dismiss as non-physical and yet provides an incorrect representation of the actual image. Such examples could potentially lead to incorrect medical diagnosis. Note that state-of-the-art methods are not affected by the perturbation, as demonstrated in the rightmost column of Figure I.1. However, although this network is highly unstable with respect to certain tiny perturbations, it is highly stable with respect to small structured changes; see the 4th row of Figure I.5.
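The early-stopping construction described for Deep MRI can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the text: the trained network is replaced by a hypothetical toy map `toy_reconstruction`, the sampling operator is a small real matrix rather than a subsampled Fourier transform, and the worst-case search is written as plain gradient ascent on a penalised objective Q(r) = ½‖f(A(x + r)) − f(Ax)‖² − (λ/2)‖r‖². The point is only to show how stopping the same iteration at increasing iteration counts yields a sequence of perturbations of growing size.

```python
import numpy as np

def toy_reconstruction(y, W):
    """Hypothetical stand-in for a trained network f: y -> W tanh(y)."""
    return W @ np.tanh(y)

def worst_case_snapshots(A, W, x, r0, lam=0.5, step=0.05, stops=(5, 20, 200)):
    """Gradient ascent on Q(r) = 0.5*||f(A(x+r)) - f(Ax)||^2 - 0.5*lam*||r||^2.

    Returns the iterate r after each iteration count listed in `stops`
    (in increasing order of iteration count), i.e. early-stopped perturbations.
    """
    f_clean = toy_reconstruction(A @ x, W)
    r = r0.copy()
    snapshots = []
    for k in range(1, max(stops) + 1):
        y = A @ (x + r)
        d = toy_reconstruction(y, W) - f_clean
        # Chain rule: grad Q = A^T [ (1 - tanh(y)^2) * (W^T d) ] - lam * r
        grad = A.T @ ((1.0 - np.tanh(y) ** 2) * (W.T @ d)) - lam * r
        r = r + step * grad
        if k in stops:
            snapshots.append(r.copy())
    return snapshots

# Illustrative run on random data (all sizes and parameters are assumptions).
rng = np.random.default_rng(0)
n, m = 16, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)   # toy sampling operator
W = rng.standard_normal((n, m)) / np.sqrt(n)   # toy network weights
x = 0.1 * rng.standard_normal(n)               # toy "image"
r0 = 1e-3 * rng.standard_normal(n)             # tiny non-zero start (grad Q vanishes at r = 0)

for r in worst_case_snapshots(A, W, x, r0, lam=0.5, step=0.02):
    change = np.linalg.norm(toy_reconstruction(A @ (x + r), W) - toy_reconstruction(A @ x, W))
    print(f"|r| = {np.linalg.norm(r):.3e}, change in reconstruction = {change:.3e}")
```

The iteration starts from a small non-zero r because the gradient of Q is zero at r = 0. In a real experiment the toy map would be replaced by the trained network and the gradient would typically come from automatic differentiation.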

Note also that the instabilities are themselves stable, in the sense that they persist under further small perturbations. In particular, in Figure I.2 we demonstrate how a random Gaussian perturbation added to the perturbation r_1 still yields a substantial artefact (see also §I.A.1.8 in the SI).
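A check of this kind can be sketched as follows. The function name `artefact_persistence`, the noise level `sigma` and the toy reconstruction map in the usage lines are hypothetical choices for illustration, not the setup used for Figure I.2. The sketch compares the size of the artefact produced by r plus Gaussian noise with the size of the artefact produced by r alone; ratios close to 1 indicate that the artefact survives the added noise.

```python
import numpy as np

def artefact_persistence(recon, A, x, r, sigma, trials=100, seed=0):
    """Ratio ||f(A(x+r+e)) - f(Ax)|| / ||f(A(x+r)) - f(Ax)|| for Gaussian e.

    `recon` is any callable mapping measurements to an image. The baseline
    artefact ||f(A(x+r)) - f(Ax)|| is assumed to be non-zero.
    """
    rng = np.random.default_rng(seed)
    clean = recon(A @ x)
    base = np.linalg.norm(recon(A @ (x + r)) - clean)
    ratios = []
    for _ in range(trials):
        e = sigma * rng.standard_normal(x.shape)   # random Gaussian perturbation
        ratios.append(np.linalg.norm(recon(A @ (x + r + e)) - clean) / base)
    return np.array(ratios)

# Illustrative usage with a toy reconstruction map (an assumption, not a trained network).
A = np.eye(8)
x = np.linspace(0.0, 1.0, 8)
r = 0.05 * np.ones(8)
ratios = artefact_persistence(lambda y: np.tanh(5.0 * y), A, x, r, sigma=0.005)
print(f"min ratio = {ratios.min():.3f}, max ratio = {ratios.max():.3f}")
```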

AUTOMAP: This experiment is similar to the one above; however, in this case we add r̃_1, . . . , r̃_4 to the measurements y = Ax, where |r̃_1| < |r̃_2| < |r̃_3| < |r̃_4| and A is a subsampled discrete Fourier transform. In order to illustrate how small the perturbations are, we have visualised |x + r_j| in the first row of Figure I.3, where y + r̃_j = A(x + r_j). To emphasise how the network reconstruction completely deforms the image we have, inspired by the second test on structural changes, added a small structural change in the form of a heart that gradually disappears completely in the network reconstruction. This is demonstrated in the second row of Figure I.3, and the third row of Figure I.3 contains the reconstruction done by a state-of-the-art method. Note that the worst-case perturbations are completely different from the ones that cause the Deep MRI network to fail. Hence, the artefacts are also completely different. These perturbations are white-noise-like, and the reconstructions from the network give a similar impression. As this is a standard artefact in MRI, it is, however, not clear how to protect against such potentially harmful tiny noise. Indeed, a detail may be washed out, as shown in the experiment (note the heart inserted with slightly different intensities in the brain image), but the similarity to the standard artefact may make it difficult to judge that the image is untrustworthy.
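The relation y + r̃_j = A(x + r_j) between a measurement-domain perturbation and an image-domain perturbation can be made concrete with a short sketch. It assumes that A is a subsampled unitary 2-D DFT, so that AA* is the identity on the sampled frequencies, and it picks the minimum-norm choice r = A* r̃. That choice, the helper names, the random sampling mask and the image size are all assumptions for illustration; the text does not state how r_j was obtained.

```python
import numpy as np

def subsampled_dft(x, mask):
    """A x: unitary 2-D DFT of the image x, keeping only entries where mask is True."""
    return np.fft.fft2(x, norm="ortho")[mask]

def subsampled_dft_adjoint(y, mask):
    """A^* y: zero-fill the unsampled frequencies, then apply the inverse unitary DFT."""
    full = np.zeros(mask.shape, dtype=complex)
    full[mask] = y
    return np.fft.ifft2(full, norm="ortho")

rng = np.random.default_rng(0)
x = rng.random((32, 32))                      # toy image
mask = rng.random((32, 32)) < 0.3             # hypothetical sampling pattern
y = subsampled_dft(x, mask)                   # measurements y = A x

# A small perturbation added directly to the measurements.
r_tilde = 1e-2 * (rng.standard_normal(y.shape) + 1j * rng.standard_normal(y.shape))

# Image-domain perturbation with A r = r_tilde (minimum-norm choice r = A^* r_tilde).
r = subsampled_dft_adjoint(r_tilde, mask)

# The perturbed measurements agree with the measurements of the perturbed image.
print(np.allclose(y + r_tilde, subsampled_dft(x + r, mask)))

# Since r is complex in general, the perturbed image is shown via its modulus |x + r|.
perturbed_image = np.abs(x + r)
```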

MRI-VN: In this case we add one perturbation r_1 to the image, where r_1 is produced by letting the algorithm that searches for the worst-case perturbation run until it has converged. The results are shown in the first two columns of Figure I.4, and the conclusion is the same for the MRI-VN net as for Deep MRI and AUTOMAP: perturbations barely visible to the human eye, even when zooming in, yield substantial misleading artefacts. Note also that the perturbation has no effect on the state-of-the-art method.

Med-50: Here we add a perturbation r_2 that is also produced by running the algorithm until it has converged, and the results are visualised in the last two columns of Figure I.4. The Med-50 network is only moderately unstable with respect to tiny perturbations compared to Deep MRI, AUTOMAP and MRI-VN; however, severe artefacts are still clearly seen. It is worth noting that this network is used for the Radon transform, whose inverse is, from a stability point of view, a more unstable operator than the inverse of the Fourier transform.

Stability with respect to small structural changes
