Convex Variational Methods and Optimization Techniques for Image Processing

(1)

T reba ll F ina l de G rau

GRAU DE MATEMÀTIQUES

Convex Variational Methods and Optimization Techniques for Image

Processing

RAMON OLIVER BONAFOUX

Tutors

Bartomeu Coll Vicens Joan Duran Grimalt

Escola Politècnica Superior

Universitat de les Illes Balears

(2)

(3)

C ^ONTENTS

Contents i

Preface iii

1 Introduction 1

2 Functional Analysis 3

2.1 Banach spaces . . . 3

2.1.1 Linear operators . . . 5

2.1.2 Dual spaces . . . 8

2.2 Hilbert spaces . . . 8

2.2.1 Main properties of Hilbert spaces . . . 9

2.2.2 Adjoint operators . . . 12

2.3 L^pspaces . . . 16

2.3.1 Definition ofL^pspaces and main properties . . . 16

2.3.2 Duality, reflexivity and separablity . . . 17

2.3.3 L^p((0,T),X) spaces . . . . 19

2.4 Compactness and weak topologies in normed spaces . . . 19

3 The Calculus of Variations 29 3.1 Differential Calculus for normed spaces . . . 30

3.2 Introduction to Mathematical image processing . . . 32

3.3 Functional spaces . . . 34

3.3.1 Sobolev spaces . . . 34

3.3.2 The space of functions of bounded variation (BV) . . . 38

3.4 The Direct Method of the Calculus of Variations . . . 39

3.4.1 Coercivity, convexity and lower semicontinuity . . . 39

3.4.2 The direct method . . . 43

3.4.3 Integral functionals . . . 43

3.5 Tikhonov regularization for image processing problems . . . 44

4 Convex Analysis and Continuous Optimization 49 4.1 Subdifferentiability . . . 50

4.2 Legendre-Fenchel conjugate and proximal map . . . 55

(4)

ii CONTENTS

4.2.1 Maximal monotone operators . . . 57

4.2.2 Proximal operator . . . 57

4.3 Convex duality . . . 59

4.4 Proximal gradient algorithms . . . 61

4.4.1 Gradient descent algorithms . . . 61

4.4.2 Primal-dual algorithm . . . 61

5 Applications to image processing 63 5.1 The variational model . . . 63

5.2 Primal-dual optimization . . . 67

5.2.1 The computation of the primal variableuⁿ⁺¹ . . . 68

5.2.2 The computation of the dual variablespⁿ⁺¹andqⁿ⁺¹ . . . 69

5.3 Experimental results . . . 73

5.3.1 The denoising problem . . . 74

5.3.2 Interpolation . . . 75

6 Conclusions 77 A L^pspaces 79 A.0.1 Basic concepts . . . 79

A.0.2 Definition ofL^pspaces and main properties . . . 82

A.0.3 Duality, reflexivity and separablity . . . 85

B The Direct Method for integral functionals 91 B.0.1 Existence and uniqueness. . . 92

B.0.2 Euler-Lagrange equations . . . 95

C Codes used 99

Bibliography 117

(5)

P ^REFACE

Aquest document ha estat elaborat com a Treball de fi de Grau en Matemàtiques a la Uni- versitat de les Illes Balears, Espanya. El treball té un pes de 12 ECTS dins el pla d’estudis de la titulació. El mateix constitueix un epílog en la formació de Grau en Matemàtiques, de 240 ECTS, mitjançant el qual l’estudiant ha de demostrar que ha assolit els coneixements necessaris per obtenir el títol.

Més en concret, el present treball neix de l’interès de l’autor en aprofundir en algunes àrees de l’anàlisi matemàtica i aplicar-les a uns problemes concrets de processament d’imatges. En particular, l’estructura del treball és la següent:

• Al Capítol 2, es tracten temes generals d’anàlisi funcional, necessaris per la correcta comprensió de la resta del treball.

• Al Capítol 3, s’estudien els conceptes bàsics del càlcul de variacions, tot motivant el fort lligam que existeix amb el processament matemàtic d’imatges.

• Al Capítol 4, s’introdueix l’anàlisi convexa i algunes tècniques d’optmització. En particular, es detalla l’algorisme que es fa servir a la posterior aplicació a problemes d’imatges.

• Al Capítol 5, es proposa un nou model variacional per a la resolució de problemes de tractament d’imatges. A continuació, s’exposen els resultats experimentals obtinguts.

• A l’Apèndix A s’amplien els conceptes donats al treball en relació amb els espaisL^p.

• A l’Apèndix B es dóna una ampliació sobre l’estudi detallat dels funcionals integrals.

• A l’Apèndix C, es mostren els codis elaborats per realitzar l’experimentació que s’ha proposat.

El treball ha estat escrit amb anglès, volent així utilitzar la que ha esdevingut la llengua de comunicació internacional per continguts de caràcter científic. L’autor vol fer notar que és el primer document de llarga extensió que escriu en aquest idioma, esperant així que la falta de destresa en la redacció quedi, en certa manera, explicada i atenuada.

El cos teòric del treball ha estat elaborat combinant i sintetitzant continguts extrets de molt diverses referències biblogràfiques. Algunes parts, com demostracions senzilles que als llibres es deixen com problemes a resoldre pel lector, han estat elaborades per l’autor. Així mateix, l’autor ha desenvolupat els càlculs necessaris per poder estudiar l’aplicació donada al Capítol 5, dedicant-se també a la implementació dels algorismes necessaris i a la obtenció

(6)

iv PREFACE dels resultats experimentals. El llenguatge de programació utilitzat ha estat C++. El model proposat és nou i es projecta publicar-lo, així com alguns resultats, en algun àmbit de recerca.

En darrer lloc, l’autor demana disculpes per haver sobrepassat l’extensió recomanada per l’Escola Politècnica Superior per a l’elaboració del Treball de fi de Grau. Una vegada s’havia elaborat el treball, l’autor va considerar que suprimir parts del text no era possible sense tallar el fil conductor del treball.

A Palma de Mallorca, el 27 de juny de 2018, l’autor, Ramon Oliver.

(7)

C

HAPTER

1

I NTRODUCTION

Image processing problems have emerged as essential in our society. Indeed, in a world where technology and computers play a central role, images are always present in our everyday life.

Therefore, the necessity of improving the quality of certain images, such as medical images or remote sensing images, has called the attention of contemporary mathematics.

Several mathematical techniques can be used in order to process an image. Probability and analysis are among the most used for this purpose. In this work, we will consider an approach based on the Functional Analysis, in particular the Calculus of Variations, Continuous Optimization and Convex Analysis.

The Calculus of Variations, which arises as an alternative tool for the resolution of Partial Differential Equations, is the study of minimization problems in spaces of arbitrary dimension.

Investigating the existence and uniqueness of solutions, as well as their regularity, for a wide class of different minimization problems is the main concern of this area of Mathematics. Its applications, which are vast, include several topics of natural sciences and engineering.

In particular, the Calculus of Variations suits very well in image processing problems. Given an observed image to be improved, the aim is to define an appropriate minimization problem.

This problem is designed by taking into account some prior knowledge of the desired solution.

This method can be very well understood as an energy method for physics and engineering:

The further is an image from the expected form, the higher is the energy that we assign it, meaning that it will not be an optimal, minimum energy solution of the problem.

As one can expect, choosing the space where we consider the possible solutions is also a relevant issue. Indeed, not only the mathematical techniques used in each case will be different, but also the solutions we can expect in each case differ from one another. As a consequence, choosing the proper space to seek for the desired image is a central topic which motivates fruitful mathematical discussions.

Once existence, uniqueness and smoothness of solutions have been established, the

(8)

1. INTRODUCTION

next step is to compute such solutions. As happens in the field of Differential Equations, there is no hope for finding an explicit solution for most of the variational models, even if the existence of solutions has been guaranteed. Indeed, we only know that the solution of the variational problem satisfies the associated Euler-Lagrange equation, which is a partial differential equation. This means that we need to introduce algorithmic procedures in order to obtain approximations to the solution by generating a sequence of iterates.

In this context, the appearance of Convex Analysis in this work is highly justified. Since images contain discontinuities and regions which are non differentiable, we need to design efficient minimization strategies that allow these features. Convex Analysis allows generalizing the geometric concept of tangent line to non differentiable curves, which means that we can extend in the desired sense the necessary conditions for optimal solutions from differentiable functions to a wider class of functions.

Moreover, this approach made possible the design of several numerical algorithms suc- cessfully applied in image processing. As any numerical method, the results obtained by such algorithms are discrete, which may be an issue if the nature of the original problem is continuous. However, image processing problems are discrete, since images are composed of a finite number of pixels. As a consequence, the gap between the continuous and the discrete settings is reduced in imaging.

Lastly, the previous theoretical work would not be complete without showing some applications to image processing. We will consider the problems of denoising and conditioned interpolation of digital images. We will propose several variational models to solve them and perform several experimental results to compare their efficiency.

2

(9)

C

HAPTER

2

F ^UNCTIONAL A ^NALYSIS

Functional Analysis is vitally necessary for the development of this work. Banach spaces, Hilbert spacesand, particularly,L^pspacesconstitute the core where the Calculus of Variations takes place. In this chapter, we present the basic tools of Functional Analysis, restraining ourselves to the strictly necessary topics for this project.

2.1 Banach spaces

Banach spaces emerge when avector spaceendowed with a norm (usually called anormed vector spaceor, sometimes, just anormed space) is Cauchy-complete according to this norm.

This property is very useful because, in this case, the convergence of a sequence depends only on the terms of the sequence itself. This also means that a Banach space does not have any

“hole”, in the sense that it contains every cluster point of every subset. Just as metric spaces, there is a canonical procedure to extend a normed space to the “smallest” Banach space that contains it. This extension preserves the behavior of the norm.

On the other hand, Banach spaces have an important feature: while vector spaces studied in Linear Algebra are, in the vast majority of cases, finite dimensional, a number of interesting examples of Banach spaces are infinite-dimensional. This fact creates important differences between these two areas.

In particular, every finite-dimensional normed space is a Banach space and every linear operator is continuous with respect to the topology induced by the norm. These properties do not hold in general for infinite-dimensional normed spaces. Furthermore, every bounded and closed set is a compact set in a finite-dimensional space, which is not true for infinite- dimensional spaces.

We note that the notions ofsequenceandconvergenceare central tools in the study of these spaces. Our aim in this section is to give an overview of the main properties of Banach spaces.

(10)

2. FUNCTIONALANALYSIS

Definition 2.1(Banach space). Let(X,k·k)be a (real or complex) normed vector space.(X,k·k) is aBanach spaceif(X,d)is a (Cauchy-)complete metric space, where d is the metric induced by the norm.

Remark 2.1. From now on, we denote by E the scalar field, which will always beRorC. We now provide some examples of Banach spaces.

Example 2.1. ConsiderCⁿequipped with the normk·kp:Cⁿ→Randp≥1. Define kxkp=

Ã _n X

i=1

|xi|^p

!1/p

, ∀x∈Cⁿ.

It can be checked that (Cⁿ,k·kp) is a Banach space. SinceRⁿis a closed subset ofCⁿ, it is also a complete metric space and, hence, a Banach space.

The same can be said forCⁿendowed with the norm kxk∞= max

1≤x≤n|x_i|, ∀x∈Cⁿ.

All these spaces are finite-dimensional. 4

Example 2.2. Leta,b∈R,a<b, then the space of the real-valued functionsC([a,b]) equipped with the supremum norm is an infinite-dimensional Banach space.

In general, whenΩ⊂Rⁿ is bounded and closed, the setC(Ω) is a Banach space when endowed with the supremum norm.

The completeness of the spaces previously defined is given by the fact that every uniform

limit of continuous functions is a continuous function. 4

Example 2.3(Total Variation). Considerf : [a,b]→Rand letP={xi}^k_i₌₀be a partition of [a,b].

Thevariation of f over Pis

V(f;P)=

k

X

i=0

|f(xi+1)−f(xi)|. Thetotal variation of f over[a,b] is defined as

T V(f)=sup

P∈PV(f;P),

whereP is the set of all possible partitions of [a,b]. IfT V(f)< ∞, we say thatf is ofbounded variation. The set of all functions of bounded variation over [a,b] is denoted byBV([a,b]).

It is straightforward to show thatT V(f)=0 if and only iff is a constant function. Therefore, it is also clear thatT V is a seminorm, but not a norm, over the space of bounded variation functions. In order to obtain a norm, we just need to define

kfk =T V(f)+ |f(a)|,

which solves the problem. It can be shown that if f ∈C¹([a,b]), then T V(f)=

Z b

a |f⁰(x)|d x< ∞, 4

(11)

2.1. Banach spaces which explains why this magnitude is called “total variation”. As a consequence of this property, if we define (following [16])f1,f2: [0, 1]→Rsuch that

f1(x)=xsin µ1

x

¶

and f2(x)=x²sin µ1

x

¶ ,

then it can be proved thatT V(f1)= ∞butT V(f2)< ∞. This fact shows that there exist functions which are bounded in an interval but whose variation is not bounded, as illustrated

in Figure 2.1. 4

Figure 2.1: Plot of the functionsf1and f2on the interval [0, 1]. It can be seen howf1“varies”

much more thanf2in [0, 1]. This justifies thatT V(f1)= ∞andT V(f2)< ∞.

2.1.1 Linear operators

Once the structure has been defined, the next thing to do is the study of mappings that preserve this structure. In this case, these correspondences are bounded linear operators and their main properties will be presented in this section.

Definition 2.2(Bounded operator). Let(X,k·kX)and(Y,k·kY)be normed spaces and A:X→ Y a linear mapping. It is said that A is abounded operatorif there exists K >0such that

(12)

kAxkY ≤KkxkX,∀x∈X . If A is a bounded operator, we can definekAk_L(X,Y)=sup_x₆₌₀^k_k^Ax_x_k^k^Y

X , which is a norm on the space

L(X,Y)={A:X→Y, A linear and bounded}.

Remark 2.2. When there is no possible confusion, we will omit subscripts in the norms. In this case, we will also write X instead of(X,k·kX).

Proposition 2.1. The statements below are equivalent:

(1) A is continuous.

(2) A is continuous at some point.

(3) A is continuous at 0.

(4) A is uniformly continuous.

(5) A is bounded.

(6) If M⊂X is bounded, then A(M)⊂Y is bounded.

Proof. We prove the only implication which is not straightforward, which is (1)⇒(5).

LetAbe a continuous operator. Suppose thatAis not bounded, that is, for alln∈Nthere existsx_n ∈X\ {0} such thatkAx_nk >nkx_nk. Equivalently, definey_n =_nkx^xⁿ_n_k, thenkAy_nk =

kAxnk

nkx_nk>1 for alln∈N. Furthermore,kynk =_n¹ for alln∈N, which implies thatyn→0. By the continuity ofA,Ay_n→0, but this is not possible becausekAy_nk >1 for alln∈N.

We show now that every linear operator is bounded provided that the base space is finite dimensional.

Remark 2.3. From Proposition 2.1, if A:X→Y is a linear operator anddimX = n < ∞, then A is necessarily bounded.

Indeed, let{x1, . . . ,xn}be a basis of X . If zm→0, where zm=Pn

i=0α^m_i xi, thenα^m_i →0 for all i= 1, . . . ,n because{x₁, . . . ,x_n}is a linearly independent subset of X . Therefore,

Azm=A Ã _n

X

i=0

α^m_i xi

!

=

n

X

i=0

α^m_i Axi→0, from which we conclude that A is a bounded operator.

However, the following example shows that Remark 2.3 does not hold for infinite-dimensional normed spaces.

Example 2.4. Consider the spaceC([0, 1]) equipped with the correspondencek·k1:C([0, 1])→ Rsuch that

kfk1= Z 1

0 |f(x)|d x, ∀f ∈C([0, 1]).

6

(13)

2.1. Banach spaces Clearly,k·k1defines a norm onC([0, 1]). In fact, the same holds for the spaceC¹([0, 1]). This norm, which is called theL¹norm, will be one of the central objects of study of this work.

Consider thedifferential operator D:C¹([0, 1])→C([0, 1]) such that D(f)= f⁰ for all f ∈C([0, 1]). Clearly,D is well defined and linear. Take now the sequence {pn}_n∈N⊂C¹([0, 1]) such thatpn(x)=xⁿfor everyx∈X andn∈N. Notice that

kpnk1= Z 1

0

xⁿd x= 1

n+1→n0, meaning that {pn}_n∈Nconverges to 0 inC¹([0, 1]) with theL¹norm.

On the other hand,

kD(pn)k1= Z 1

0 nxⁿ⁻¹d x=1,

therefore {D(pn)}n∈Ndoes not converge to 0 inC([0, 1]), showing that the differential operator is not bounded.

Notice that the linear andunboundedoperatorDis far from being a pathological example and it appears in many problems. Therefore, the study of unbounded operators turns out to be very important in several areas of Analysis.

Figure 2.2: Plots ofp₁,p₅(left) andD(p₁),D(p₅) (right). As we can check, the area below the curvespnwill converge to 0 but the area corresponding toD(pn) will not.

4 Theorem 2.1. Let Y be a Banach space and X a normed space. ThenL(X,Y)is a Banach space.

Proof. It is straightforward to show thatL(X,Y) is well-defined as a normed vector space according to the norm given in Definition 2.2.

Let {A_n}_n∈N⊂L(X,Y) be a Cauchy sequence inL(X,Y). For anyx∈X, we have that kAnx−Amxk ≤ kAn−Amkkxk,

which shows that {Anx}_n∈Nis a Cauchy sequence inY and hence converges to somey∈Y sinceY is a Banach space. Accordingly, we can define an operator A :X →Y such that Ax=y=limnAnx.

(14)

• Ais a linear operator because of the linearity of the limit.

• We will prove that {An}_n∈Nconverges to AinL(X,Y). Since {An}_n∈Nis a Cauchy sequence, for allε>0 there existsn0∈Nsuch thatkAn−Amk <εfor alln,m≥n0. There- fore, ifn≥n₀

kAnx−Axk = klim

m (Anx−Amx)k =lim

m k(An−Am)xk ≤lim

m kAn−Amkkxk <εkxk, since the norm is a continuous function. As a consequence,An−A∈L(X,Y), which leads toA∈L(X,Y). We have also showed thatkAn−Ak < εfor alln ≥ n0, and hence {A_n}_n∈Nconverges toA.

2.1.2 Dual spaces

We now introduce adual spaceof an arbitrary normed spaceX, which is the generalization of a linear equation inRⁿ. Indeed, iff is an element (called alinear functional) of the (algebraic) dual space, then for anyb∈Rthe set {x∈X:f(x)=b} is defined as an hyperplane ofX and is an extension of the concept of hyperplane inRⁿ. That is, the linear expressionPn

i=1αnxn=b is generalized for arbitrary normed spaces.

Definition 2.3(Dual space). Let X be a normed space. A map f :X→E , (E=R,C) is called a functionalof X . Thealgebraic dual spaceof X is the set of all linear functionals. The topological dual spaceof X , denoted as X^?, is the set of all bounded linear functionals on X , that is, X^?=L(X,E).

Remark 2.4. From now on, we will call the topological dual space just the dual space. The topological dual space is a linear subspace of the algebraic dual space. If the space is finite dimensional, both dual spaces coincide.

Finally, we introduce the notions ofreflexiveandseparablespace. As we will see in Section 2.4, reflexivity and separability have a lot of nice properties regarding the convergence of sequences.

Definition 2.4(Bidual and reflexivite space). Let X be a normed space, thebidual spaceof X is the dual space of X^?, denoted by X^??. If X ∼=X^??as normed spaces, then X is said to be reflexive.

Definition 2.5(Separability). A normed space X isseparableif it contains a countable dense subset.

2.2 Hilbert spaces

Like in the finite-dimensional case, in Functional Analysis a whole new range of properties arise when a Banach space also posses an inner product, becoming aHilbert space. It is a well known fact that every inner product defines a norm, but the converse is not true in general.

8

(15)

2.2. Hilbert spaces The study of Hilbert spaces generalizes the geometric theory ofRⁿas an Euclidean space. In fact, a Hilbert space is somewhat an Euclidean space of arbitrary dimension. Furthermore, it is possible to extend the notion oforthogonalityto infinite dimensions, which adds considerably more structure to the space, and introduce infinitebasis, which are very useful for the study of Fourier series.

2.2.1 Main properties of Hilbert spaces

Definition 2.6(Hilbert space). Let X be a real or complex vector space.(X,〈·,·〉)is apre-Hilbert spaceif〈·,·〉:X×X→E is an inner product. If(X,k·k)is a Banach space with the induced norm k·k =p

〈·,·〉, then(X,〈·,·〉)is called aHilbert space.

Remark 2.5. Given a normed space X and a pre-Hilbert space Y , thenL(X,Y)with

∀A,B∈L(X,Y), 〈A,B〉(x)= 〈Ax,B x〉, ∀x∈X,

is a pre-Hilbert space. Moreover, if Y is a Hilbert space, thenL(X,Y)is also a Hilbert space due to Theorem 2.1.

Example 2.5. ConsiderCⁿwith the inner product

〈x,y〉 =

n

X

i=1

xiyi, ∀x,y∈Cⁿ. Sincep

〈·,·〉 = k·k2, by Example 2.1 we obtain that (Cⁿ,k·k2) is a Hilbert space. Moreover,Rⁿ

with the same inner product is also a Hilbert space. 4

Example 2.6. ConsiderC([a,b];C) with the inner product

〈f,g〉 = Z b

a f(x)g(x)d x, ∀f,g∈C([a,b];C),

which defines a pre-Hilbert space. However, it is possible to check that this space is not complete by consideringa= −1,b=1 and the sequence {fn}_n∈Nsuch that

f_n(x)=











0 ifx∈[−1, 0], nx ifx∈¡

0,¹_n¢ , 1 ifx∈£₁

n, 1¤ . By taking the Riemann integrable functionf : [−1, 1]→Rsuch that

f(x)=

(0 ifx∈[−1, 0], 1 ifx∈(0, 1], then

Z 1

−1

(f(x)−fn(x))²d x= Z ¹

n

0

(1−nx)²d x= 1 3n→0.

This implies that {fn}n∈Nis a Cauchy sequence which does not converge in (C([−1, 1];C),〈·,·〉)

becausef does not belong to this space. 4

(16)

The following result gives necessary and sufficient conditions for a norm to arise from an inner product.

Proposition 2.2. Let(X,〈·,·〉)be a pre-Hilbert space. Then, the following identities hold:

a) The parallelogram law:

kx+yk²+ kx−yk²=2(kxk²+ kyk²).

b) The polarization identity:

Re(〈x,y〉)=1

4(kx+yk²− kx−yk²) and

I m(〈x,y〉)=1

4(kx+i yk²− kx−i yk²).

In the real case, this identity becomes

〈x,y〉 =1

4(kx+yk²− kx−yk²).

Example 2.7. ConsiderC([a,b]) endowed with the norm kfk∞= max

x∈[a,b]|f(x)|, ∀f ∈C([a,b]).

The parallelogram law allows us to show that (C([a,b]),k·k∞) is not a pre-Hilbert space. Indeed, takef,g∈C([a,b]) such that

f(x)=1 and g(x)=x−a

b−a, ∀x∈[a,b].

Notice thatkf +gk²_∞=4,kf−gk²_∞=1 andkfk²_∞= kgk²_∞=1, which means that the parallelo-

gram law does not hold. 4

Proposition 2.3(Cauchy-Schwarz inequality). Let X be a pre-Hilbert space, then

|〈x,y〉| ≤ kxkkyk, ∀x,y∈X. The equality holds if and only if x and y are linearly dependent.

Proof. Ify=0, then the equality holds. Ify6=0, then we have that

0≤ kx−αyk²= 〈x,x〉 −α〈x,y〉 −α(〈y,x〉 −α〈y,y〉), ∀α∈E.

In particular, ifα= 〈y,x〉/〈y,y〉 = 〈x,y〉/〈y,y〉, then 0≤ 〈x,x〉 −〈y,x〉

〈y,y〉〈x,y〉 = kxk²−|〈x,y〉|² kyk² , from which we obtain the inequality.

If the equality holds, thenx−αy=0, which means thatxandyare linearly dependent.

Conversely, if there existsα6=0 such thatx=αy, then a simple substitution shows that the equality holds.

10

(17)

2.2. Hilbert spaces In a finite-dimensional space equipped with an inner product, projecting a vector into a subspace is an usual procedure. The following result shows that this can also be done in an infinite-dimensional Hilbert space as long as the subspace isclosed. Completeness turns out to be a key condition for this.

Theorem 2.2(Projection). Let X be a Hilbert space and M⊂X a closed (linear) subspace. For each x ∈X , consider d =inf{kx−yk:y ∈M}. Then, there exists a unique y ∈M such that kx−yk =d . Furthermore, x−y∈M^⊥.

Proof. Sincedis the infimum of the set {kx−yk:y∈M}, there exists {yn}n∈N⊂Msuch that kx−ynk →d. Using the parallelogram law of Proposition 2.2, we have that

2kyn−xk²+2kym−xk²= kyn+ym−2xk²+ kyn−ymk², ∀m,n∈N. Since 1/2(yn+ym)∈M,

kyn+ym−2xk²=4°

°

yn+ym

2 −x°

°

2

≥4d², (2.1)

and therefore

kyn−ymk²≤2kyn−xk²+2kym−xk²−4d²,

which implies that {yn}n∈Nis a Cauchy sequence. SinceXis a Hilbert space, {yn}n∈Nconverges to somey∈X. BecauseMis closed and {yn}n∈N⊂M, we havey∈M. This is the element ofM we look for because

ky−xk = klim

n yn−xk =lim

n kyn−xk =d.

To prove the uniqueness, lety0∈Mbe such thatkx−y0k =d. By using again the parallelogram law and (2.1), we have

ky−y0k²= k(y−x)−(y0−x)k²=2ky−xk²+2ky0−xk²− k(y−x)+(y0−x)k²

=4d²−4°

°

° y+y0

2 −x°

°

2

≤4d²−4d²=0,

which is equivalent toy=y0. We will now prove thatz=x−y∈M^⊥. Lety1∈M,y16=0 be fixed but arbitrary, and defineα=_〈^〈_y^z,y₁_,y¹₁^〉_〉. Then,

kz−αy1k²= 〈z,z〉 −α〈z,y1〉 −α(〈y1,z〉 −α〈y1,y1〉)= 〈z,z〉 − 〈y1,z〉

〈y1,y1〉〈z,y1〉

= kzk²−|〈y1,z〉|

ky₁k² ≤ kzk²= kx−yk²=d². (2.2) Sincey−αy1∈M, we have that

kz−αy₁k²= k(y−αy₁)−xk²≥d². (2.3) Combining (2.2) and (2.3), we conclude thatkz−αy1k =d. Notice that we have showed that the unique element that minimizes the distance fromxtoMisy. This means thaty=y−αy1, from which we getα=0 and thus〈z,y₁〉 =0, concluding the proof.

(18)

Remark 2.6. In the case that M is a closed convex set, the first part of Theorem 2.2 still holds.

SinceM∩M^⊥={0}, a straightforward consequence of Theorem 2.2 allows writing a Hilbert space as the direct sum of a closed subspace and its orthogonal.

Corollary 2.1. Let X be a Hilbert space and M⊂X a closed subspace, then X=M⊕M^⊥. The following result is a very important and well known property of Hilbert spaces. It shows that all bounded linear functionals can be expressed in terms of the inner product of the base space.

Theorem 2.3(Riesz representation Theorem). A functional f :X →E defined on a Hilbert space is bounded and linear if and only if there exists a unique y∈X such that f(x)= 〈x,y〉 for all x∈X . In other words, if X is a Hilbert space then X^?={〈·,y〉:y∈X}. Furthermore, kfk = kyk.

Proof. Giveny ∈X, consider the functionalfy defined as fy(x)= 〈x,y〉for allx∈X. The linearity offy is trivial. By Cauchy-Schwarz inequality (Proposition 2.3),|fy(x)| ≤ kxkkykand thuskf_yk ≤ kyk, from which we get thatf_y is bounded. We also have|f_y(y)| = kykkyk, so kfyk ≥ kyk. We conclude thatkfyk = kyk.

Let now f be a bounded linear functional onXand defineM=ker(f). Sincef is continuous,Mis a closed subspace ofX. IfM=X, thenf ≡0 and f(x)=0= 〈x, 0〉for allx∈Xand kfk =0. We now consider the caseM6=X. From Corollary 2.1, there existsw∈M^⊥such that βw6∈Mfor allβ6=0. Takeα=f(w)/kwk². We want to prove thatf(x)= 〈x,αw〉for allx∈X:

• Ifx∈M, thenf(x)=0=α〈x,w〉 = 〈x,αw〉.

• Ifx=βwwithβ6=0, thenf(x)=βf(w)=βα〈w,w〉 = 〈βw,αw〉 = 〈x,αw〉.

• For everyx∈X, we have thatx=x−βw+βwfor anyβ∈E. If we takeβ=f(x)/f(w), thenf(x−βw)=f(x)−(f(x)/f(w))f(w)=0. Therefore, we obtain from the two previous cases thatf(x)=f(x−βw)+f(βw)= 〈x,αw〉.

The uniqueness is straightforward since〈x,y〉 = 〈x,y⁰〉for allx∈Ximplies thaty=y⁰. The Riesz representation Theorem allows identifying the dual space of a Hilbert space with the space itself through a linear isometry. As a consequence, the following property follows immediately.

Corollary 2.2. Every Hilbert space is reflexive.

2.2.2 Adjoint operators

Any bounded linear operator in a Hilbert space can be associated with another one, itsadjoint operator. The adjoint operator extends the notion of the transposed of a matrix and has an important role for establishing conditions for the existence of solutions of several optimization problems. The existence and uniqueness of the adjoint operator is guaranteed by the Riesz representation Theorem.

12

(19)

2.2. Hilbert spaces Theorem 2.4. Let X,Y be Hilbert spaces and A∈L(X,Y). Then, there exists a unique A^?∈ L(Y,X), theadjoint operatorof A, such that〈Ax,y〉 = 〈x,A^?y〉for all(x,y)∈X×Y . Moreover, kAk = kA^?k.

Proof. For eachy∈Y consider the functional fy :X→Esuch that fy(x)= 〈Ax,y〉. SinceA is linear and so is the first component of the inner product, fy is linear. Cauchy-Schwarz inequality leads to

|fy(x)| ≤ kAxkkyk ≤(kAkkyk)kxk

and hence fy is also bounded. By using the Riesz representation Theorem, there exists a uniquez_y ∈X such that f_y(x)= 〈x,z_y〉for allx∈X. We defineA^?:Y →X as the operator such thatA^?y=zyfor ally∈Y. Then, we have that〈Ax,y〉 = 〈x,A^?y〉for allx∈Xandy∈Y. Finally, we will prove thatA^?is linear and bounded:

• A^?is linear. Lety,y⁰∈Y, then

〈x,A^?(y+y⁰)〉 = 〈Ax,y+y⁰〉 = 〈Ax,y〉 + 〈Ax,y⁰〉 = 〈x,A^?y+A^?y〉, ∀x∈X, from whichA^?(y+y⁰)=A^?y+A^?y⁰. Let nowα∈E, then

〈x,A^?(αy)〉 = 〈Ax,αy〉 =α〈Ax,y〉 =α〈x,A^?y〉 = 〈x,αA^?y〉, ∀x∈X, from which we conclude thatA^?(αy)=αA^?y.

• A^?is bounded. Simple operations lead to

kA^?yk²= 〈A^?y,A^?y〉 = 〈A(A^?y),y〉 ≤ kA(A^?y)kkyk ≤ kAkkA^?ykkyk,

which means thatkA^?yk ≤ kAkkykfor ally∈Y and thus A^?is bounded and verifies kA^?k ≤ kAk. Analogously, we obtainkAk ≤ kA^?kand we conclude thatkAk = kA^?k.

Notice that the unicity condition of the Riesz representation Theorem guarantees the unicity of the adjoint operator.

Thediscrete gradient, used when discretizing some optimization problems, can be studied as a linear bounded operator defined on a finite-dimensional Hilbert space.

Example 2.8(Discrete gradient). Consider the Hilbert spacesX=Rⁿ^×ⁿandY =X×X. We define thediscrete gradient operator∇:X→Y via forward differences and Neumann boundary conditions. That is, (∇u)i,j=((∇u)¹_i,j, (∇u)²_i,j) with

(∇u)¹_i,j=

(ui+1,j−ui,j ifi<n,

0 ifi=n, and (∇u)²_i,j=

(ui,j+1−ui,j if j<n,

0 if j=n.

It is obvious that∇is a linear operator and, therefore, bounded since all involved Hilbert spaces are finite-dimensional. Theorem 2.4 states that there exists its adjoint operator∇^? : Y → X

(20)

such that〈∇u,p〉 = 〈u,∇^?p〉for allu∈Xandp∈Y. We will prove that thediscrete divergence operatorsatisfies−div= ∇^?and also thatk∇k ≤p

8.

First, letp∈Y be denoted byp=(p¹,p²) withp¹,p²∈X. One checks that

〈−div(p),u〉 = −

n

X

i,j=1

(div(p))i,jui,j

and

〈p,∇u〉 =

n−1X

i,j=1

hp¹_i,j(u_i+1,j−u_i,j)+p²_i,j(u_i,j+1−u_i,j)i +

n−1X

j=1

p_n,j² (u_n,j+1−u_n,_j)

+

n−1

X

i=1

p_i,n¹ (u_i+1,n−u_i,n)=

n−1

X

i,j=2

(p¹_i−1,_j−p_i,j¹ +p²_i,j−1−p_i,j² )u_i,_j

+

n−1

X

j=2

(−p¹_1,j+p²_1,j−1−p²_1,j)u1,j+

n−1

X

j=2

(p_n−1,j¹ +p²_n,j−1−p²_n,j)un,j

+

n−1

X

i=2

(−p¹_i,1+p_i−1,1¹ −p_i,1² )ui,1+

n−1

X

i=2

(−p¹_i,n+p_i−1,n¹ +p²_i,n−1)ui,n

+(−p¹_1,1−p²_1,1)u1,1+(−p_1,n¹ +p²_1,n₋₁)u1,n+(p_n¹₋_1,1−p²_n,1)un,1+(p¹_n₋_1,n+p_n,n² ₋₁)un,n. We thus conclude that the discrete divergence is

(div(p))i,j=











p¹_i,j−p_i−1,j¹ if 1<i<n, p¹_i,j ifi=1,

−p_i−1,j¹ ifi=n, +











p²_i,j−p²_i,j−1 if 1<j<n, p²_i,j ifj=1,

−p²_i,j−1 ifj=n.

Now, we can compute a bound of the norm of this gradient operator. First of all, notice that k∇uk²=

n−1

X

i,j=1

[(ui+1,j−ui,j)²+(ui,j+1−ui,j)²]+

n−1

X

i=1

(ui+1,n−ui,n)²+

n−1

X

j=1

(un,j+1−un,j)²

=4

n−1

X

i,j=1

·µ1

2ui+1,j+1 2(−ui,j)

¶2

+ µ1

2ui,j+1+1 2(−ui,j)

¶2¸

+4

"_n

−1

X

i=1

µ1

2ui+1,n+1 2(−ui,n)

¶2

+

n−1

X

j=1

µ1

2un,j+1+1 2(−un,j)

¶2# .

As a consequence, if we apply that the quadratic real function is convex and rearrange the resulting terms conveniently, we obtain that

k∇uk²≤2

n−1X

i,j=1

[(ui+1,j)²+2(ui,j)²+(ui,j+1)²]+2

n−1X

i,j=1

[(ui+1,n)²+(ui,n)²+(un,j)²+(un,j+1)²]

≤2

n

X

i=2 n

X

j=1

(ui,j)²+4

n

X

i,j=1

(ui,j)²+2

n

X

i=1 n

X

j=2

(ui,j)²≤8kuk², which means thatk∇k ≤p

8. 4

14

(21)

2.2. Hilbert spaces We finally enumerate some properties of adjoint operators.

Proposition 2.4. Let X,Y,Z be Hilbert spaces and A∈L(X,Y), B∈L(Y,Z). Then, the follow- ing properties hold:

i) (A^?)^?=A, ii) (B A)^?=A^?B^?.

Proof. i) For allx∈X andy∈Y, we have that

〈A^?y,x〉 = 〈y, (A^?)^?x〉 ⇒ 〈x,A^?y〉 = 〈(A^?)^?x,y〉 ⇒ 〈(A^?)^?x,y〉 = 〈Ax,y〉, concluding that (A^?)^?=A.

ii) For allx∈Xandz∈Z, we have that

〈x, (B A)^?z〉 = 〈B Ax,z〉 = 〈Ax,B^?z〉 = 〈x,A^?B^?z〉, from which we conclude thatA^?B^?=(B A)^?.

Definition 2.7(Self-Adjoint operator). Let X be a Hilbert space and A∈L(X). A is aself- adjoint operatorif A=A^?.

Example 2.9. The projection into a linear subspace of a Hilbert space is a self-adjoint operator.

4 Proposition 2.5. If X,Y are Hilbert spaces and A∈L(X,Y), then A A^?∈L(X),kA A^?k = kAk² and A A^?is self-adjoint.

Proof. Letx,y∈X andα,β∈E. We have

A A^?(αx+βy)=A(αA^?x+βA^?y)=αA A^?x+βA A^?y, showing thatA A^?is linear. Moreover,A A^?is bounded since

kA A^?xk ≤ kAkkA^?xk ≤ kAkkA^?kkxk = kAk²kxk, ∀x∈X. We have proved thatkA A^?k ≤ kAk², but we also have

kAk²= sup

kxk=1kAxk²= sup

kxk=1〈Ax,Ax〉 = sup

kxk=1〈A^?Ax,x〉 ≤ kA A^?k, from which we deduce thatkA A^?k = kAk².

Finally, by Proposition 2.4, we obtain (A A^?)^?=(A^?)^?A^?=A A^?and we conclude thatA A^? is self-adjoint.

(22)

2.3 L

^p

spaces

The spaces of functions defined on measure spaces withpth power integrable (meaning that the integral is finite) constitute one of the most celebrated examples of Banach spaces in functional analysis. These spaces, calledLebesgueorL^pspaces, will be studied in some detail in this section since they have a key role in the theoretical discussion of image processing problems.

In this section, we summarize the most important properties ofL^pspaces. We performed a deeper study in Appendix A. In fact, this section can be very well understood as an abstract of Appendix A.

2.3.1 Definition of L^pspaces and main properties We now defineL^pspaces and prove the most basic properties.

Definition 2.8(L^p space). Let(X,S,µ)be a measure space and1≤p< ∞. The L^p spaceis defined as

L^p(µ)=

½

f :X→E measurable and Z

X|f|^pdµ< ∞

¾ . We consider the correspondencek·kp:L^p(µ)→Rsuch thatkfkp=¡R

X|f|^pdµ¢¹_p

. In this way, we can rewrite the L^pspace as

L^p(µ)=©

f :X→E measurable andkfkp< ∞ª .

Definition 2.9(L^∞space). Let f :X→E be a measurable function. Theessential supremumof f is defined askfk∞=ess supf =inf{λ≥0 :|f(x)| ≤λa.e.}. Therefore, the L^∞spaceis defined as

L^∞(µ)=©

f :X→E measurable andkfk∞< ∞ª . The following is one of the most celebrated inequalities forL^pspaces.

Proposition 2.6(Hölder’s inequality). If1/p+1/q=1with1≤p≤ ∞, f ∈L^p(µ)and g∈L^q(µ), then f g∈L¹(µ)andkf gk1≤ kfkpkgkq.

We now show howL^pspaces are, in fact, Banach spaces. Proofs can be found in Appendix A.

Proposition 2.7. If1≤p≤ ∞, then L^p(µ)is a vector space over E andk·kpis a seminorm.

Remark 2.7. In order to obtain a normed vector space, we consider the following equivalence relation over L^p(µ):

∀f,g∈L^p(µ), f ∼g⇔f =g a.e.

Notice that f =g a.e. is equivalent tokf−gkp=0. By considering the quotient space with the normk[f]kp= kfkp, where f ∈[f], we obtain a normed vector space. This norm is well defined since|kfkp− kgkp| ≤ kf −gkp=0if f,g∈[f].

From now on, we rename this normed space as L^p(µ).

16

(23)

2.3. L^pspaces The completeness of Lebesgue spaces is guaranteed by the result that follows.

Theorem 2.5(Riesz-Fischer Theorem). If1≤p≤ ∞, then(L^p(µ),k·kp)is a Banach space.

Remark 2.8. (i) We consider the L²(µ)space and the inner product

〈f,g〉 = Z

X

f g dµ.

We have that〈f,f〉 = kfk²₂and therefore L²(µ)is a Hilbert space.

(ii) For1≤p ≤ ∞such that p 6=2, L^p(µ) is not, in general, a Hilbert space because the parallelogram law does not necessarily hold.

(iii) Let us show that L^p(µ)for p6=2, is not a Hilbert space whenµis the Lebesgue measure overR. We just need to consider the functions f(x)=1for[0, 1]and f(x)=0elsewhere, and g(x)=1for x∈[1, 2]and g(x)=0elsewhere. Both functions are p-integrable for all 1≤p≤ ∞. We have that

kf+gk²p+ kf −gk²p=2²^p+2^p² =2(2)^p² and

2(kfk²p+ kgk²p)=2(1+1)=4.

Note that both terms are not equal when p6=2.

2.3.2 Duality, reflexivity and separablity

In the following lines we present a few more important properties ofL^pspaces. The detailed exposition of this section can be found in Appendix A.

The following result summarizes reflexivity.

Theorem 2.6. If p∈[1,∞), the dual space of L^p(µ)is L^q(µ)with1/p+1/q=1. The dual space of L^∞(µ)strictly contains L¹(µ). In conclusion, L^p(µ)is reflexive if and only if1<p< ∞.

We study next the separability of some kinds ofL^pspaces. In what follows, we consider Ω⊂Rⁿa bounded domain with regular boundary. Ifµis the Lebesgue measure overΩ, we then write writeL^p(µ)=L^p(Ω). This allows us to study separability ofL^pspaces in this context.

Definition 2.10(Space ofC^k functions with compact support). Let k∈N∪{0,∞}. We define C₀^k(Ω)={f ∈C₀^k(Ω)with compact support onΩ}.

Proposition 2.8. C0(Ω)is dense in L^p(Ω)for1≤p< ∞, i.e., for each f ∈L^p(Ω)there exists {g_n}_n∈N⊂C0(Ω)such thatkg_n−fkp→0.

Theorem 2.7. L^p(Ω)is separable for1≤p< ∞. Proposition 2.9. L^∞(Ω)is not separable.

(24)

Dual space Reflexive Separable L^p(Ω), p∈(1,∞) L^q(Ω), _p¹+¹_q =1 Yes Yes

L¹(Ω) L^∞(Ω) No Yes

L^∞(Ω) Strictly containsL¹(Ω) No No Table 2.1: Duality, reflexivity and separability for Lebesgue spaces.

As a consequence, the most important regarding duality, reflexivity and separability is summarized in Table 2.1

The following result provides a set of embeddings ofL^p(Ω) spaces and its proof has been done as an exercise.

Proposition 2.10. If1≤p≤q≤ ∞, then we have the following embeddings:

L^∞(Ω)⊂L^q(Ω)⊂L^p(Ω)⊂L¹(Ω).

Proof. Recall thatµ(Ω)< ∞sinceΩis bounded. Iff ∈L^∞(Ω), we have Z

Ω|f(x)|^qd x= Z

Ω\A|f(x)|^qd x≤ kfk^q∞µ(Ω)< ∞,

whereAis set of measure zero such that sup_x_∈Ω_\A|f(x)| = kfk∞. This means thatf ∈L^q(Ω).

Iff ∈L^q(Ω), we considerr such that 1/r+1/(q/p)=1. Using Hölder’s inequality we obtain Z

Ω|f(x)|^pd x≤ µZ

Ω|f(x)|^qd x

¶^p_qµZ

Ω|1Ω(x)|^rd x

¶¹_r

≤ kfk^pq(µ(Ω))¹^r < ∞, from whichf ∈L^p(Ω).

Iff ∈L^p(Ω), using again Hölder’s inequality:

Z

Ω|f(x)|d x≤ kfkp

µZ

Ω|1_Ω(x)|^p⁰d x

¶_p0¹

= kfkp(µ(Ω))^p0¹ < ∞, where 1/p+1/(p⁰)=1. We thus conclude thatf ∈L¹(Ω).

Remark 2.9. Let f : (0,∞)→R be given by f(x)=1/x. Notice that f ∈L²((1,∞))but f 6∈

L¹((1,∞)). This means that the hypothesisµ(Ω)< ∞is necessary.

Finally, we define the concept oflocally integrablefunction.

Definition 2.11(Locally integrable function). Let1≤p≤ ∞. We say that a function f islocally integrableand we write f ∈L^p_loc(Ω)if f|K∈L^p(K)for every compact set K⊂Ω.

Remark 2.10. From this definition it follows that L^p(Ω)⊂L_loc^p (Ω)for1≤p≤ ∞.

The following proposition is a fundamental result of the Calculus of Variations.

18

Convex Variational Methods and Optimization Techniques for Image Processing

T reba ll F ina l de G rau

GRAU DE MATEMÀTIQUES