Hybridization of gradient-based and gradient-free optimization techniques for simultaneous optimization of number of wells, their location and drilling time in a 2-dimensional reservoir

Study Program/Specialization: Petroleum Engineering/Reservoir Engineering
Spring Semester, 2019
Open/Restricted access

Writer: Mohammad Nezhadali
Faculty supervisor(s): Remus Gabriel Hanea

Thesis title: Hybridization of gradient-based and gradient-free optimization techniques for simultaneous optimization of number of wells, their location and drilling time in a 2-dimensional reservoir

Credits (ECTS): 30
Key words: Hybrid Optimization, Well Placement, Reservoir Optimization, Robust Optimization, Ensemble-based Optimization
Pages: 72 + Enclosure: 5
Stavanger, 15.06.2019

Abstract

Optimization has always been a challenging topic in every field of science, and reservoir management is no exception. With the advent of greater computational power, the use of mathematical algorithms in conjunction with reservoir simulators has grown drastically. These mathematical optimization algorithms are divided into two main categories, namely gradient-based algorithms and gradient-free algorithms.

Optimization in the petroleum industry is mostly focused on maximization of the Net Present Value (NPV) of a project. Gradient-based optimization techniques, though popular for this purpose, have some limitations, including the risk of convergence to local optima and difficulties in optimizing discrete and categorical variables such as well paths and well locations.

On the other hand, gradient-free optimization techniques, which mostly rely on probabilistic and stochastic principles, have their own limitations.

This study shows that hybridization of gradient-free and gradient-based optimization techniques has good potential to yield a robust reservoir optimization scheme which outperforms both gradient-free and gradient-based optimizers used separately. To this end, a serial optimization algorithm is devised by hybridizing a genetic algorithm, simulated annealing, and the stochastic gradient descent method. Employing this algorithm on several optimization problems in a two-dimensional oil field showed that it outperforms its forerunners in optimization of the number of wells in the field, their locations, and the drilling schedule.

The findings of this study can lay a foundation for the development of new optimization techniques which can raise the standards of both the complexity of the problems solved in the petroleum industry and their computational efficiency.

Acknowledgement

I would like to express my deepest appreciation to my supervisor, Professor Remus Gabriel Hanea, who has the attitude and the substance of a genius: he revitalized my passion for mathematics and mathematical problems in petroleum engineering, both as a teacher and as a supervisor, and continually and convincingly helped me throughout this last year of my studies. Without him, the completion of this thesis would not have been possible.

In addition, I would like to thank the PhD students Camilo Malagon Nieto and Brage Strand Kristoffersen, as well as Doctor Alireza Zare and Doctor Mathias Bellout, who were available for help and support whenever I faced problems.

I would also like to show my gratitude to the University of Stavanger for providing a good learning environment and the infrastructure required for the progress of this project.

Finally, I would like to express my deepest gratitude to my family for their continuous and heartfelt support and patience throughout the years of my life, and in particular through these two years of my studies.

Mohammad Nezhadali
Stavanger, Norway
June 2019

Table of Content

Abstract
Acknowledgement
Table of Content
List of Figures
List of Tables
Abbreviations
1 Introduction
2 Theory
2.1 Gradient-free optimization techniques
2.1.1 Nelder-Mead Simplex method
2.1.2 Genetic Algorithm
2.1.3 Particle Swarm Optimization
2.1.4 Simulated Annealing
2.2 Gradient-based optimization techniques
2.3 Hybridization of algorithms
3 Methodology
3.1 Ensemble methods and Robust Optimization
3.2 Optimization program structure
3.2.1 Genetic Algorithm
3.2.2 Live Genetic Algorithm (LGA)
3.2.3 Simulated Annealing
3.2.4 Gradient-based optimizer
3.2.5 Hybridizing the Algorithms
4 Results and discussion
4.1 Sensitivity analysis
4.1.1 Sensitivity analysis for the genetic algorithm
4.1.2 Simulated Annealing Sensitivity Analysis
4.2 Solved Problems
4.2.1 Primary problem
4.2.2 Main Problem
4.2.3 Problem of moderate drilling cost
4.2.4 Problem of costly drilling
4.2.5 Problem of different initial well locations
4.2.6 Problem of single pre-drilled well
5 Conclusion and future work
6 References
Appendix A—The Eclipse DATA file

List of Figures

Figure 2. 1 Evolution of Nelder-Mead method in a sample problem
Figure 2. 2 Probability distribution function example for the children in GA
Figure 2. 3 Schematic of compromise between particle velocity, its trust in itself and its trust in swarm
Figure 2. 4 An example of SA advancement
Figure 2. 5 Comparison of Gradient Descent and Stochastic Gradient Descent methods
Figure 3. 1 3D representation of the field structure
Figure 3. 2 Permeability fields of a sample ensemble of realizations
Figure 3. 3 Basic structure of the optimization algorithm
Figure 3. 4 Different formats of GA fitness function
Figure 3. 5 Flow chart of the GA used in this study
Figure 3. 6 The population list in LGA
Figure 3. 7 A comparison of GD and SGD
Figure 3. 8 SGD progress example
Figure 3. 9 A simplistic graph on metaheuristic neighborhood
Figure 3. 10 Hybrid algorithm flow chart
Figure 4. 1 Mean of NPV achieved by different inflation rates
Figure 4. 2 Maximum NPV achieved by different inflation rates
Figure 4. 3 Mean of the NPV achieved by different inflation rates
Figure 4. 4 Maximum NPV achieved by different inflation rates
Figure 4. 5 Mean of the NPV of population for different crossover ratios
Figure 4. 6 Maximum NPV achieved by different crossover ratios
Figure 4. 7 Mean NPV achieved for different elitism schemes
Figure 4. 8 Standard deviation of the maximum NPV achieved in several runs of the algorithm for different elitism schemes
Figure 4. 9 Mean of the NPV of the population in different mutation ratios
Figure 4. 10 Maximum NPV achieved in different mutation rates
Figure 4. 11 Mean of the NPV of the population for different population sizes
Figure 4. 12 Maximum NPV achieved for different population sizes
Figure 4. 13 Maximum NPV achieved for different acceptance chance ratio
Figure 4. 14 Maximum NPV achieved for different acceptance probability half-lives
Figure 4. 15 Maximum NPV in different optimization algorithms for primary problem
Figure 4. 16 Simulations ran in different optimization algorithms for primary problem
Figure 4. 17 Maximum NPV in different optimization algorithms for the main problem
Figure 4. 18 Simulations ran in different optimization algorithms for the main problem
Figure 4. 19 Final water saturation map of the ensemble in the main problem
Figure 4. 20 NPV evolution of the optimized solution and its ensemble and its comparison to a sub-optimal solution and its ensemble
Figure 4. 21 Maximum NPV achieved by different optimization algorithms for the problem of moderate drilling cost
Figure 4. 22 Number of simulations ran in different optimization algorithms for the problem of moderate drilling cost
Figure 4. 23 Maximum NPV achieved by different optimization algorithms for the problem of costly drilling
Figure 4. 24 Number of simulations ran in different optimization algorithms for the problem of costly drilling
Figure 4. 25 Maximum NPV achieved by different optimization algorithms for the problem of different initial well locations
Figure 4. 26 Number of simulations ran in different optimization algorithms for the problem of different initial well locations
Figure 4. 27 Maximum NPV achieved by different optimization algorithms for the problem of single pre-drilled well
Figure 4. 28 Number of simulations ran in different optimization algorithms for the problem of single pre-drilled well

List of Tables

Table 3. 1 The structural properties of the simulation case
Table 4. 1 Primary problem definition
Table 4. 2 Solution to primary problem
Table 4. 3 Definition of main problem
Table 4. 4 Solution to the main problem
Table 4. 5 Definition of the problem of moderate drilling cost
Table 4. 6 Solution to the problem of moderate drilling cost
Table 4. 7 Definition of the problem of costly drilling
Table 4. 8 Solution to the problem of costly drilling
Table 4. 9 Definition of the problem of different initial well locations
Table 4. 10 Solution to the problem of different initial well locations
Table 4. 11 Definition of the problem of single pre-drilled well
Table 4. 12 Solution to the problem of single pre-drilled well

Abbreviations

2-D - Two-dimensional
3-D - Three-dimensional
ANN - Artificial Neural Networks
AI - Artificial Intelligence
BHP - Bottom Hole Pressure
GA - Genetic Algorithm
GD - Gradient Descent
ICV - Intake Control Valve
LGA - Live Genetic Algorithm
n-D - n-Dimensional
NPV - Net Present Value
PSO - Particle Swarm Optimization
RO - Robust Optimization
SA - Simulated Annealing
SGD - Stochastic Gradient Descent

1 Introduction

Defined in a general way, optimization is the process of finding a minimum or maximum of an objective function in a search space while satisfying a set of constraints. In petroleum reservoir management, the objective function is normally the Net Present Value of the field over a certain period of time, which is a measure of how satisfactory the project is in economic terms. The search space is made up of the parameters forming the field development plan and production schedule, including the number of wells, ICVs, timing of drilling, the order in which wells are drilled, and injection or production switches, as well as several IOR methods and many other parameters, which together form a hyper-complex search space. Constraints such as limitations on production, environmental issues, wellhead and piping facilities, and reservoir restrictions normally confine the search space and complicate the problem even further. The resulting problem is almost impossible to solve in full; in order to be able to solve it, some simplifications of the search space are required.

In the first step, following the conventions of industry, some of the parameters are fixed or limited. Setting the well path to be fixed, or deciding on the IOR method prior to the optimization, are examples of this simplification. These simplifications must also be applied to the variables that change with time and need to be optimized. ICV openings, for instance, can change at any point in time; however, since this makes the optimization intractable, the ICV openings are set to be constant over certain periods of time and their modification is limited to discrete time intervals, every 3 months for example. These are all added to the initial simplification of the geological reality to an ensemble of geological models. The hypothesis that reality and its uncertainty can be modelled with an ensemble of static models and a dynamic model, whose accuracy in predicting reservoir behavior is still unknown, is another influential simplification. After all these steps, an optimization problem is formed which is, finally, computationally feasible to solve.

Setting aside the traditional methods of optimization and the novel machine learning techniques, the process of finding the optimized solution in the simplified search space is classically performed using mathematical optimization methods, which are generally divided into two categories: gradient-based optimization techniques and gradient-free optimization techniques [1]. As can be inferred from their names, the difference lies in whether a technique relies on calculating the gradient of the objective function over the search space. Despite being very successful and widely used in industry, gradient-based optimization techniques have problems handling discrete or categorical parameters. In petroleum engineering, the number of wells to be drilled and their locations are examples of such parameters. Gradient-free optimization techniques, on the other hand, are designed to overcome this kind of problem, yet they also have their own limitations.

The main objective of this study is to set up an initiative on the hybridization of gradient-based and gradient-free optimization techniques, in order to arrive at a more robust optimization scheme which can optimize the number of wells, their locations, and their times of drilling based on the preferred objective function. To this end, the performance of several gradient-based and gradient-free algorithms on reservoir optimization and similar problems was studied, and three algorithms were selected for hybridization: the Genetic Algorithm (GA), Simulated Annealing (SA), and Stochastic Gradient Descent (SGD). To cast the problem in a discrete form, which is closer to reality, it was assumed that well locations coincide with the centers of the grid cells and that each well is drilled in a certain year of the project; this reduces the problem to a discrete one of finding the proper number of wells, the grid cells in which they are drilled, and their years of drilling. Assuming that the wells are controlled by bottom hole pressure (BHP) for the whole production time, that the wells are vertical, which is the basic form of any such optimization problem, and that the field consists of 800 (40 by 20) grid cells, a sensitivity analysis was first performed on the parameters of the above-mentioned algorithms to tune them for the case study. Following that, all the algorithms, as well as several hybridizations of them, were tried on the field, and the quality of their optimization was assessed quantitatively.

As previous studies suggest, one of the best hybrid optimization schemes that has proved to work well in similar problems is the serial use of optimization algorithms, which can result in a better optimization scheme than the original algorithms alone [1-5]. Accordingly, the above-mentioned algorithms were first adjusted for optimization of some similar problems on the proposed petroleum field, and then merged serially with each other. Several case studies were then performed using the algorithms; six of these problems are presented in this thesis in order to compare the quality of the optimization algorithms.

Finally, potential extensions of this work and the application of its ideas in industry are briefly addressed.

2 Theory

Developments in technology have drastically increased the available computational power. As optimization entails running many simulations, this development has greatly aided progress in the optimization realm. While early methods were only capable of solving linear programming problems and simple heuristics, more novel methods can solve much more complex problems involving nonlinearity. The use and improvement of many new techniques such as discrete and continuous optimization, stochastic programming, metaheuristics, and genetic algorithms have made it possible to handle uncertainty and to solve much more complex problems [1].

The approaches towards optimization in the petroleum industry and literature can be categorized into three mainstreams: sensitivity analysis and the use of simulation tools; learning and machine learning based on intuition and data from the past; and optimization in the generic mathematical sense, which entails mathematical and programming algorithms [6, 7].

Even though the first two categories were long assumed to be outdated, in recent years, thanks to the rise of big data analysis and machine learning techniques such as clustering, regression methods, and neural networks, the second category, which relies on data-driven decisions, is emerging again. This emergence has proven successful both in optimization and in the substitution of simulators and models with proxies [8-12].

This study mainly focuses on the last category, in which several optimization schemes are investigated and some new hybridized methods are generated to attack problems in the petroleum industry. The optimization schemes in mathematics are divided into two main categories, gradient-free and gradient-based optimization, and both have been widely used in the petroleum industry. A brief overview of the main subcategories and their employment in the petroleum industry is given in this chapter.

2.1 Gradient-free optimization techniques:

There are several gradient-free optimization techniques which have been used in petroleum reservoir management. We cover them briefly here.

2.1.1 Nelder-Mead Simplex method:

This method uses a heuristic approach to search for the optimum solution. It is also referred to as the nonlinear simplex in the literature.

A simplex in an n-dimensional space is formed using n+1 affinely independent points, i.e., points that do not all lie in a common hyperplane. The n-dimensional space itself is formed by the control variables that are meant to be optimized. The n+1 points forming the simplex are a set of potential solutions that serve as the initial guess and are updated in every step so as to finally converge to the optimum solution.

At any step of the optimization process, the simplex either shrinks to a smaller size or is reflected, expanded, or contracted so that the mean objective value over the simplex vertices improves. This means that in every step we move towards the optimized solution, assuming that the problem has no sharply localized optima, which is a reasonable assumption here. In petroleum reservoir management, the models that we run for optimization are mostly smooth functions, with exceptions only in highly heterogeneous fields or fractured reservoirs. Additionally, very localized optima normally carry a higher risk of failure and are not of interest.

Figure 2. 1 Evolution of Nelder-Mead method in a sample problem

In general, even though this method does not require calculation of derivatives, which saves computational cost, it does not show good convergence speed in high dimensions, especially in problems with more than 10 design variables. This method has been used in reservoir production optimization in the literature. In a study by Rahmawati et al., two scenarios were considered for optimization of the NPV in a WAG project [13]. In the first scenario, the optimization variables were sales gas fraction, DPC temperature, gas-condensate re-injection fraction, and lean gas-condensate re-injection fraction. In the second scenario, the decision variables comprised sales gas fraction, DPC temperature, gas injection rate, water injection rate for the WAG scenario, and WAG period. This research showed that using the nonlinear simplex method for optimizing the NPV could result in 12 to 25 percent growth in the NPV compared to the initial engineering guess.

In another study, for a case study of optimization of the Brugge reservoir, the Nelder-Mead simplex reflection method was compared with four other optimization schemes: the Hooke-Jeeves pattern search method, the generalized pattern search method, a line-search derivative-free method, and a sequential quadratic programming method [14]. The control variables were the water cuts of the production wells at which the wells were shut. According to the study, the line-search derivative-free method was the most efficient among them, and Nelder-Mead was the slowest in convergence. The reason may be the large number of control variables (54 wells), which resulted in the slowness of the Nelder-Mead method.

In another piece of research by the same group, conducted on optimization of reservoir NPV, it was shown that, taking a set of water cuts as the optimization variables, a good initial guess can be made to start gradient-based optimization [15]. This study also confirmed that using a large number of search variables results in slow convergence of the Nelder-Mead method relative to other optimization methods.

In some older studies, the simplex method was used to optimize daily oil rate without the use of simulators, by linearizing the optimization problem. These resulted in sub-optimal solutions [16, 17].

2.1.2 Genetic Algorithm

A widely recognized optimization method, essentially inspired by the process of natural selection, is the genetic algorithm. This method was initially devised and used by John Holland to understand the evolution of life through computational simulation [18]. The method is based on three main principles:

• Survival of the fittest species

• Reproduction of the species and transferring their traits to the next generation

• Mutation and randomized modification of the species characteristics

A short introduction of this algorithm, as used in petroleum engineering, is provided in this section.

The algorithm starts by generating a population of possible solutions to the problem. In any optimization technique, we aim at optimizing an objective function subject to the problem's constraints. Accordingly, based on the objective function and some other side-functions, a term called "fitness" is defined. Each member of the population has its own objective function value and, consequently, its own fitness. This fitness is the essence of GA, inspired by natural selection. A member with higher fitness is a better fit to the environment and hence has a higher chance of reproduction, which means it is more probable that this member, or some of its traits, appears in the next generation. Good traits and good members are transferred from one generation to the next, and it is assumed that, as the objective function and fitness converge to their highest values in this probabilistic approach, the optimum solution to the problem is found.

The processes of reproduction, mutation, and generation, which are adopted from natural phenomena, require us to define their carrier as well: the genome.

Any member of the population is a set of control variables which leads to an objective function value; these control variables are called the phenotype, since they are defined in the real environment of the optimization. In order to make the aforementioned concepts such as reproduction possible, we have to encode the characteristics of any solution into a genome, which is called the genotype. Consider this example:

Plan A consists of drilling well W1A at location L1A and time T1A, followed by drilling well W2A at location L2A and time T2A. Plan B consists of drilling well W1B at location L1B and time T1B, followed by drilling well W2B at location L2B and time T2B. There are several methods of encoding, including binary encoding, decimal encoding, table encoding, etc. These two plans can be encoded simply, using decimal encoding, as a string of 16 digits: the first four digits give the location of the first well, the second four the timing of the first well, the third four the location of the second well, and the fourth four the timing of the second well.

Plan A:  | L1A | T1A | L2A | T2A |
Plan B:  | L1B | T1B | L2B | T2B |

If these two plans go into a reproduction process, we can use crossover to produce a pair of children from them. Crossover switches between the parents at certain locations. For example, if the crossover happens after the location of the first well, the children will be:

Plan C1: | L1A | T1B | L2B | T2B |
Plan C2: | L1B | T1A | L2A | T2A |

The crossover is normally performed in a probabilistic way, and its rate is adjusted so that the average number of crossovers in a pairing is about 1, mimicking the natural phenomenon [19]. Crossovers which break the structure and make the plan meaningless are avoided; this is why the crossover is normally performed at the end of a gene section, although this is not a strict rule, and in binary encoding it is not enforced. There are also continuous schemes for crossover. Consider a crossover between well locations, for example. In the scheme introduced previously, the child inherits each trait from one parent or the other, so any inherited trait matches one of the parents. However, if the variable search space is continuous, the location of the child's well can be anything between the locations of its parents' wells, or even anywhere in the search space. The probability density function of the well location in a one-dimensional space, with peaks at the locations of the parents' wells, shown in figure 2.2, is an example of continuous crossover [19].
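The crossover above is easy to express in code. The following MATLAB fragment is a minimal sketch with hypothetical genome values, cutting only at gene boundaries so that the children remain meaningful plans.

```matlab
% Minimal sketch of single-point crossover between two decimal-encoded
% plans (hypothetical values; cuts are restricted to gene boundaries).
parentA = [1101, 2034, 1502, 2036];   % [L1A T1A L2A T2A], assumed encoding
parentB = [1340, 2035, 1788, 2037];   % [L1B T1B L2B T2B]

cut = 1;                              % cut after the first gene (well location)
childC1 = [parentA(1:cut), parentB(cut+1:end)];   % L1A | T1B L2B T2B
childC2 = [parentB(1:cut), parentA(cut+1:end)];   % L1B | T1A L2A T2A
```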

Figure 2. 2 Probability distribution function example for the children in GA

Another technique used to mimic natural phenomena in this optimization method is mutation. As in natural processes, mutation can, by chance, act positively or negatively on the objective function. For the previous example, mutation can be introduced in the following way:

Plan:    | L1 | T1 | L2 | T2 |
Mutated: | L1 | T1* | L2 | T2 |

The drilling time of the first well is randomly changed to another time in the search space. The fitness of the mutated plan decides, probabilistically, whether it goes through reproduction again or not.

There are some other techniques that can help the optimization progress better. Elitism, for instance, makes it possible for some of the best results of a generation to survive to the next with probability one; hence the best results found so far are preserved. Niching divides the population into subpopulations so that each subpopulation occupies a certain proportion of the fitness landscape. Diversifying is another technique in genetic algorithms which targets exploration of the search space: it introduces another term into the objective function, for example the distance to other members of the population, with whose growth the exploration of the search space improves and, consequently, the results of the optimization become more reliable [19].

One of the difficulties when using the genetic algorithm is handling nonlinear constraints. When there is a discontinuity in some area of the search space, the children of a reproduction are not necessarily valid scenarios for the optimization process. Suppose one of the constraints says that the distance between any two wells has to be at least a certain amount. In the initial population, satisfying the criterion is not difficult since a human can be involved, but from that point on, human intervention at every step would be both time-consuming and bothersome; hence it has to be resolved in another way. In a study conducted by Emerick et al., this problem was resolved using an algorithm called Genocop III [20]. This technique repeatedly updates any invalid population member using the existing valid members until it finally becomes a feasible production scenario. An article using this technique targeted simultaneous optimization of the number of wells as well as their locations and trajectories [20]. The results showed that even though a totally random selection of the initial population gave better results than the base-case engineering guess, they were inferior to runs started from an initial engineering guess.

An application of genetic algorithms in reservoir optimization is the planning and optimization of unconventional wells. Defining and optimizing such wells is arduous in other optimization methods, since these wells are completely dependent on the geological characteristics of the area in which the well is to be drilled. Using the GA, though, a fixed chromosome of well characteristics can be defined, and as the drilling environment changes, the genome evolves to comply with the geological traits of the reservoir. A study by Yeten et al. used this method in conjunction with an artificial neural network as a proxy for reservoir simulation and a final-stage hill-climbing optimizer to optimize the choice of well type and well plan for unconventional wells. This method was more successful than randomly chosen wells [21].

In another study on well placement for water flooding in the Gulf of Mexico, the polytope algorithm, artificial neural networks, and kriging were introduced into the GA to add a hill-climbing theme to its stochastic approach. This method was shown to be computationally efficient, reducing the computational cost while also avoiding local minima [22]. The technique has also been used to optimize pipeline networks in use, separator pressure, gas injection rate, and well connection systems [23-25].

Of particular interest, the problem of well placement under uncertainty, with risk-taking capability built into the objective function, has also been investigated in some studies [22, 26]. This has been done by designing an objective function which is a weighted average based on the compromise between maximization of the objective function and minimization of its variance.

However, in another study, a comparison between GA and PSO was performed, and the superiority of PSO over GA was shown in the optimization of well types, including vertical, deviated, dual lateral, etc., over single and multiple reservoir realizations [27].

2.1.3 Particle Swarm Optimization:

Another gradient-free optimization technique inspired by nature is particle swarm optimization. This algorithm essentially mimics the flocking of birds. It starts from a random colony of possible answers to the problem and, using the fitness function and previous responses to the problem, tries to improve it. The basic idea behind PSO is as follows.

Initially, a population of possible answers is generated. Each member is assumed to be a particle moving in the search space, an n-dimensional space formed by the decision variables subject to some constraints. Each of these particles, at each time-step, has its own position, velocity, and direction of movement. Two terms are defined here: the best global position, and the best personal position of each particle. The position of a particle in the next time-step is a compromise between its current position, its velocity and direction of movement, the location of its best personal record, and the location of the best global record. The formulation of this compromise is shown in equations 2.1 and 2.2.

$$v_{i,j}^{k+1} = v_{i,j}^{k} + c_1 r_1 \left( xbest_{i,j}^{k} - x_{i,j}^{k} \right) + c_2 r_2 \left( xgbest_{j}^{k} - x_{i,j}^{k} \right) \qquad (2.1)$$

$$x_{i,j}^{k+1} = x_{i,j}^{k} + v_{i,j}^{k+1} \qquad (2.2)$$

In these formulas, $v_{i,j}^{k}$ is the j-th component of the velocity of the i-th particle at the k-th time-step, and $x_{i,j}^{k}$ is the corresponding component of its position. $r_1$ and $r_2$ are two random numbers drawn uniformly from (0, 1). $xbest_{i}$ and $xgbest$ are the best position found by the i-th particle and the best position found by the whole group so far, respectively. $c_1$ and $c_2$ are factors representing the particle's confidence in itself and in the swarm, respectively. A graphical representation of the update is shown in figure 2.3 [28].

Figure 2. 3 Schematic of compromise between particle velocity, its trust in itself and its trust in swarm

The velocities of the particles adaptively slow down as they converge to the solutions, and the solutions are obtained [29]. This method has recently become very popular for optimization problems faced in the petroleum industry.

In a study by Shirangi and Durlofsky, the development plan for the number, type, location, and controls of new wells is optimized using robust optimization. The uncertainty is taken into account using multiple realizations, and the objective function is the net present value of the entire project [30]. In some other studies, categorical variables such as well locations are defined as integer-valued parameters and are optimized simultaneously with continuous variables such as valve openings using PSO [31-33]. The objective function varies from the NPV to the total oil production of the field, as well as some other multi-objective functions defined previously.

2.1.4 Simulated Annealing

Simulated annealing is a probabilistic optimization method inspired by the annealing process in metallurgy, in which a piece of metal is tempered and cooled down several times consecutively so as to finally freeze the metal in its minimum-energy configuration. The method mimics this physical phenomenon to compute the minimum of the objective function through stochastic moves which become more and more limited as the algorithm progresses. Just as the metal reaches its minimum state of energy, the simulated annealing algorithm is meant to conduct the control variables to their minimum state of energy, in other words, the optimized solution of the problem [34].

Figure 2. 4 An example of SA advancement

This method has been used in the optimization of a wide range of problems in the petroleum industry, from minimum miscibility pressure prediction to fracture network modeling [5, 35].

2.2 Gradient-based optimization techniques:

Gradient-based optimization techniques are both diverse and widespread in the academic literature and industry. All of these methods entail four main steps [36]:

1. Convergence criteria check
2. Search direction computation
3. Step length calculation
4. Design variables update

Most gradient-based techniques, including the steepest descent method, the conjugate gradient method, the nonlinear conjugate gradient method, Newton's method, modified and quasi-Newton methods, the Davidon-Fletcher-Powell (DFP) method, the trust region method, etc., entail computation of derivatives of the objective function, e.g., gradients and Hessian matrices. As we use commercial simulators in the petroleum industry for calculation of the objective function, calculation of these quantities is either very costly or outright impossible. Accordingly, instead of using exact gradient-based techniques for optimization of the reservoir, we mostly use stochastic gradient descent techniques, which rely on numerical estimations of the gradient; these are not very good approximations of the steepest descent method, but they prove to be practically very successful.

In this method, a stochastic direction and a step length in the search space are chosen. The algorithm then updates to the next search location based on these two parameters and the current location, subject to the condition that the objective function improves. Figure 2.5 depicts a comparison between the gradient descent method and the stochastic gradient descent method in a two-dimensional search space [37].
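The following MATLAB fragment sketches this accept-if-improved stochastic step, with a hypothetical analytic objective standing in for a reservoir simulation.

```matlab
% Sketch of a stochastic descent step: random direction and step length,
% accepted only if the objective improves (hypothetical objective f).
f = @(x) sum((x - 1).^2);                % stand-in objective to minimize
x = zeros(1, 2);                         % current search location
for k = 1:100
    d = randn(1, 2); d = d / norm(d);    % stochastic search direction
    step = 0.5 * rand;                   % stochastic step length
    xNew = x + step * d;
    if f(xNew) < f(x)                    % move only when the objective improves
        x = xNew;
    end
end
```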

Figure 2. 5 Comparison of Gradient Descent and Stochastic Gradient Descent methods

These techniques are widely used in the optimization part of closed-loop reservoir management for several purposes, basically focusing on optimization of the continuous variables of the problem [38-42]. The technique has also been used in well-placement optimization under the hypothesis that well positions are continuous variables in a constrained search space [43].

2.3 Hybridization of algorithms

The attraction of simultaneously benefiting from the advantages of both categories of optimization methods, while minimizing their weaknesses in practice, has encouraged many scientists to devise hybridized techniques by which problems are handled more efficiently. The joint use of gravitational search and the Nelder-Mead algorithm has been employed to improve thin-wall structures in the automotive industry [44]. Hybridization of the genetic algorithm and particle swarm optimization has been used for training neural-network-based artificial intelligence (AI) in oceanography and rainfall forecasting [45]. Particle swarm optimization has been used in conjunction with simulated annealing to improve the prediction of minimum miscibility pressure of petroleum fluids, and a combination of the genetic algorithm and simulated annealing has been used to improve stochastic reservoir modeling [5, 46]. Hybrid algorithms combining the genetic algorithm and simulated annealing have also been used for optimization of multiple water reservoirs in the field of hydrology [47, 48]. The probabilistic genetic algorithm has also been used in a hybridized format with a deterministic gradient-based approach to optimize the total number and placement of wind turbines in a wind farm, a problem close to the one of interest here [4].

Siavashi et al., in their study, compared the genetic algorithm, particle swarm optimization, and a hybrid of the two on water-flooding optimization, in which the hybrid showed superiority over pure GA and pure PSO [2]. The genetic algorithm and finite-difference gradient methods have also been used in conjunction with artificial neural networks and kriging to optimize the number, location, and type of wells in a petroleum reservoir [3, 49]. The genetic algorithm has also been hybridized with polytope search in order to optimize the injection strategy in a reservoir so as to maximize oil production [50].

3 Methodology:

In order to implement the optimization process in test cases, we need both the optimization algorithms compiled on a computer platform and a case designed for this purpose. We start the tests with simple cases; as is usual for the initiation of research on optimization, our basic case is a 2-dimensional reservoir. The structural view of this reservoir is represented in figure 3.1, and a sample ensemble of its permeability realizations in figure 3.2. The problem to be solved is summarized in table 3.1, and a more comprehensive elaboration on the reservoir model characteristics can be found in appendix A.

Reservoir type: 2-phase, black oil
Dimensions: 20 x 40 x 1
Grid size: 100 m x 100 m x 20 m
Top depth: 2700 m
Initial pressure: 2000 psi
Porosity: 0.2
Initial water saturation: 0.25
Fault 1: Between rows 13 and 14, transmissibility: 0.2
Fault 2: Between rows 27 and 28, transmissibility: 0.05
Fetkovich aquifer properties: PI: 5, volume: 1.0E9, eastern side of the field
Wells: BHP controlled; set to be 500 psi

Table 3. 1 The structural properties of the simulation case

Figure 3. 1 3D representation of the field structure

3.1 Ensemble methods and Robust Optimization

One of the most difficult tasks in petroleum engineering is uncertainty quantification. One of the most popular methods to tackle this problem is using ensembles of models instead of a single deterministic model. In this research, as is usual in the literature and industry, we place the uncertainty on the static models; the dynamic model, which is our simulator, is assumed to be accurate for now. The uncertain parameter in this case is the permeability field; accordingly, an ensemble of realizations is acquired to account for this uncertainty. A sample of these permeability fields can be found in figure 3.2. All of the permeability fields are assumed equiprobable. This entails that, when calculating the objective function of the optimization process, the arithmetic mean of the NPVs over the ensemble is calculated and reported as the final value of the objective function for each case. This is referred to as Robust Optimization (RO).

One of the properties of an objective function formed from an ensemble of realizations is that it is smoother than that of any single member. This is a direct consequence of averaging. The NPV function of each ensemble member forms a potential surface in the multi-dimensional search space of the optimization problem. Adding up and normalizing two or more of these potential surfaces alleviates locally exclusive features and emphasizes the common features of all ensemble members. This results in a smoother potential surface for the objective function. Accordingly, finding the optimized solution with an ensemble-based method becomes easier relative to a single-realization method, simply because the chance of getting stuck in local optima decreases.

In order to demonstrate the quality of our algorithm, we try optimization of both single-realization and ensemble-based models and provide the results.

The field that we are planning to optimize has two pre-drilled wells in the first year of the field development plan. There is also a constraint on the number of wells that can be drilled in each year.

In this case, it is supposed that we can drill only one well per year. An aquifer on the eastern side of the field supports the field pressure against the drop due to production. The problem is defined by the number of wells to be drilled in the field, their locations, and the times at which they are drilled, subject to the aforementioned constraint. All of the wells produce at constant pressure from the time they are drilled. We aim at maximizing the NPV of the whole project. Drilling any well has its own constant cost, initially set to 5 million dollars for drilling and completion of the well.

The objective function for this problem is the Net Present Value of the project, calculated using formula 3.1 [51]:

$$\mathrm{NPV}(i, N) = \sum_{t=0}^{N} \frac{R_t}{(1+i)^t} \qquad (3.1)$$

In this formula, t denotes the time of the cash flow, i is the discount rate, and $R_t$ is the cash flow in period t.

For this problem, the parameters determining the cash flow are the oil price per barrel, the water treatment cost per barrel, the water injection cost per barrel, and the drilling cost per well.
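A minimal MATLAB sketch of equation 3.1 for one realization is given below; the economic figures and production volumes are assumed for illustration, and in the robust setting the reported objective would be the arithmetic mean of such NPVs over the ensemble.

```matlab
% Sketch of equation 3.1 with assumed cash-flow terms (one realization).
oilPrice = 60;            % USD per barrel (assumed)
waterTreatCost = 5;       % USD per barrel of produced water (assumed)
drillCost = 5e6;          % USD per well, drilling and completion
i = 0.08;                 % discount rate (assumed)

qo = [0 2e6 3e6 2.5e6];   % yearly produced oil, bbl (assumed)
qw = [0 1e5 4e5 8e5];     % yearly produced water, bbl (assumed)
wells = [1 1 0 0];        % wells drilled per year

R = oilPrice*qo - waterTreatCost*qw - drillCost*wells;  % yearly cash flow R_t
t = 0:numel(R)-1;
NPV = sum(R ./ (1 + i).^t);                             % equation 3.1
```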

Figure 3. 2 Permeability fields of a sample ensemble of realizations

3.2 Optimization program structure:

The optimization program consists of two basic parts: designing a test case and running the simulation on it. The simulations are run using the commercial Schlumberger Eclipse E100 simulator.

Designing the test cases, on the other hand, is done using a MATLAB program written to construct test cases based on the employed optimization technique. These optimization techniques include the genetic algorithm in several schemes, local search and metaheuristics, simulated annealing, and stochastic gradient descent.

The structure of the algorithm design code is depicted in figure 3.3. More comprehensive depictions of the particular algorithms follow in the next sub-chapters.

Figure 3. 3 Basic structure of the optimization algorithm

In the initialization process, the Eclipse files, a proper population of initial engineering guesses for the beginning of the optimization process, the optimization parameter setup, and several other parameters are decided and set.

The output of this section is then fed to the optimization algorithm. There, the algorithm modifies the Eclipse include file containing the well plan of the field, with which the Eclipse DATA file is made ready to run in the Eclipse simulator.

The DATA file is finally run in the Eclipse simulator, and the results of the run are fed back to the optimizer to calculate the objective function and to decide on the next run of the algorithm or on convergence.

3.2.1 Genetic Algorithm:

In this algorithm, a population of initial guesses, which can be either random or meaningful, is generated first. The structure of the genome is designed in a completely discrete manner. In any possible scenario, we either drill a well or not; the location of a well is a discrete encoded number which denotes the grid cell at whose center the well is drilled. In this case, we have used a column-following-row order for the grid numbering.

The drilling schedule is simply the order of the wells in the genome. Consider the following structure, for example:

389 1 47 1 221 0 180 1 98 0

In the above 5-year drilling schedule, each integer code in an odd cell denotes the grid cell at which a well is drilled, and the binary code in the following even cell shows whether that well is drilled or not. The size of the genome simply represents the possible drilling period, at the end of which the Net Present Value is calculated.
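Decoding such a genotype into a phenotype is straightforward; the MATLAB sketch below follows the pairing described above (variable names are illustrative).

```matlab
% Sketch: decode a genotype into the wells that are actually drilled.
genome = [389 1 47 1 221 0 180 1 98 0];  % 5-year plan of [gridNo, drilled?] pairs

gridNo  = genome(1:2:end);               % odd cells: grid cell of each well
drilled = genome(2:2:end);               % even cells: 1 = drill, 0 = skip
year    = 1:numel(gridNo);               % position in the genome = drilling year

plan = [year(drilled == 1); gridNo(drilled == 1)];  % drilled wells: [year; cell]
```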

After designing the genome, we should also design a proper evolutionary algorithm for it. In practice, a genetic algorithm consists of two main parts, both of which are inspired by nature: crossover and mutation.

Crossover, as mentioned in the previous chapter, consists of choosing two members of the population as parents and taking their genomes into the breeding process. In this process, a location in the genome is chosen; the first parts of the genomes are retained while the second parts are exchanged. In this research, the location of the genome cut is always chosen after the binary code, as any other cut would be meaningless, even though it might add to the exploration capability of the code. The crossover rate, i.e., the probability of a crossover happening as the genome is traversed, is set proportional to the inverse of the genome size; this is a rule of thumb, and a sensitivity analysis is carried out on it.

Mutation, on the other hand, is responsible for adding to the exploration capability of the code through a completely random change in the genome; the rate at which this change happens is also a factor to be investigated.

Elitism is another GA technique considered in this research. With elitism, some of the members with the highest fitness or objective function always survive to the next generation, regardless of the probability function and the probabilistic approach of the algorithm; hence the best results found so far are preserved. The ratio of elites carried from one generation to the next is also a number to be decided. With an increase in this ratio, the exploitation capacity of the code is boosted, but there is a chance that the algorithm gets stuck in local optima. The optimum ratio should be decided based on the problem characteristics and via sensitivity analysis.

The fitness function and its calculations are among the core segments of the genetic algorithm.

Initially, the objective function itself was taken as the fitness function, which literally means that the chance of any member to survive is proportional to its objective function. However, there is a problem with this approach. If the objective function hyper-surface is nearly flat, the fitness values of the worst and best solutions will be almost the same. This implies that, in every upcoming generation, the probabilities of survival of the best and the worst solutions will be very close to each other, resulting in a very low convergence speed. To attack this issue, we modified the fitness function to be not the objective function itself but an increasing function of it. Two different functions are proposed in this study, and a sensitivity analysis of both is carried out.

The first function keeps the fitness linear in the objective function. In fact, it is a linear transform into a space where the chance of the best solution to survive is K times the chance of the worst. Being an increasing function, it maintains the order of the solutions: if scheme A has a higher objective function than scheme B, it also has a higher chance of survival under the new fitness probability. The formulation of this function is presented as equation 3.2.

$$f_n = f_o - \frac{K \cdot \mathrm{Min}(f_o) - \mathrm{Max}(f_o)}{K - 1} \qquad (3.2)$$

where
f_n = new fitness function
f_o = old fitness function
K = the ratio of the chance of the best solution to that of the worst
Min(f_o), Max(f_o) = minimum and maximum of the fitness function among the most recent population

A final normalization of the function is required so that the sum of the probabilities equals one. A representation of the comparative probability functions is presented in figure 3.4.

Another scheme examined in this study uses an exponential transform to build the new fitness function. This transform gives a comparatively higher chance to solutions with higher objective functions than to those with lower ones; in other words, the probability density function is denser around the best solutions. This results in more exploitation in the algorithm, which has both positive and negative aspects that have to be balanced. A mathematical formulation of the function is given in equation 3.3.

$$f_n = \exp\!\left( \ln(K) \cdot \frac{f_o - \mathrm{Min}(f_o)}{\mathrm{Max}(f_o) - \mathrm{Min}(f_o)} \right) \qquad (3.3)$$

with the same notation as in equation 3.2.
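Both transforms are one-liners in practice. The MATLAB sketch below applies equations 3.2 and 3.3 to an example vector of mean NPVs and normalizes the results into survival probabilities; the numbers are illustrative only.

```matlab
% Sketch of the linear (3.2) and exponential (3.3) fitness transforms.
fo = [550 560 575 600 650 700] * 1e6;   % example mean NPVs of a population
K = 5;                                  % desired best-to-worst chance ratio

fnLin = fo - (K*min(fo) - max(fo)) / (K - 1);                % eq. 3.2
fnExp = exp(log(K) * (fo - min(fo)) / (max(fo) - min(fo)));  % eq. 3.3

pLin = fnLin / sum(fnLin);   % normalize so the probabilities sum to one
pExp = fnExp / sum(fnExp);
```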

Figure 3. 4 Different formats of GA fitness function: (a) original mean NPV of samples; (b) original fitness PDF; (c) linear fitness function; (d) exponential fitness function

As can be seen in figure 3.4, the fitness function was initially proportional to the mean NPV of the ensemble for each drilling schedule (b). In the example above (a), where cases with completely random well locations generate NPV values of about 550 million dollars, owing to the good quality of the reservoir, while much better optimized schemes produce about 700 million dollars of NPV, the probability of survival of the former would be 0.15 and that of the latter 0.18, which does not give a much greater advantage to the optimized solutions. Using this fitness function without any modification results in very slow convergence.

We have therefore proposed two different schemes, one linear (c) and one exponential (d), which set the exploitation/exploration ratio to our preferred adaptive rate.

Another constraint that we have enforced in the problem is the distance between the production wells. For the production to be closer to the optimal point, the spacing is adjusted so that the drainage areas of two production wells do not collide. Even though keeping the wells apart is not a mathematically provable idea, it is implemented widely in industry, so we add it to our list of constraints. Since we are in a 2-dimensional discrete space in this problem, we define the distance between two wells as the Manhattan distance, i.e., the number of cells a rook in chess would traverse to reach its destination, times the dimension of each cell movement. The constraint is that the Manhattan distance between any two wells has to be greater than or equal to a certain value D, which also helps us avoid unnecessary calculations. In this study we set D = 2, which prevents neighboring cells from containing wells simultaneously.
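A sketch of this feasibility check in MATLAB is given below; the column-following-row decoding of the grid numbers is an assumption made to match the numbering convention described in this chapter.

```matlab
% Sketch of the well-spacing constraint: the Manhattan distance between
% every pair of wells must be at least D (grid decoding is an assumption).
nRows = 40;                    % 40-by-20 grid, numbered column after column
D = 2;                         % minimum allowed Manhattan distance
gridNo = [389, 47, 180];       % grid cells of the planned wells (example)

col = floor((gridNo - 1) / nRows) + 1;
row = mod(gridNo - 1, nRows) + 1;
ok = true;
for a = 1:numel(gridNo)-1
    for b = a+1:numel(gridNo)
        if abs(row(a)-row(b)) + abs(col(a)-col(b)) < D
            ok = false;        % this plan violates the spacing constraint
        end
    end
end
```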

The Genetic Algorithm used in this study can be defined as follows:

1. Generate a meaningful ensemble of realizations

2. Generate an ensemble of initial guesses, i.e., an initial population of drilling scheme genotypes; this can be done either randomly by machine or manually by an engineer

3. Transform the genotypes to phenotypes and simulate for all the drilling schemes and their realizations using Schlumberger Eclipse E100

4. Calculate the mean NPV of all the realizations for any of the drilling schemes based on the Eclipse files

5. Calculate the fitness function for any of the phenotypes and make a lottery pot using this population and the elites list

6. Choose N pairs of parents by lottery and cross them to come up with a population of N children

7. Check whether the children satisfy the self-imposed constraints; otherwise, substitute the unfit children using new parents from the lottery pot

8. Transform the children genotypes to phenotypes and simulate for all the drilling schemes and their realizations using Schlumberger Eclipse E100

9. Calculate the mean NPV of all the realizations for any of the drilling schemes based on the Eclipse files

10. Compare the new generation with the previous generations and if the convergence criteria are not satisfied go to 6

11. Report the scheme with the highest mean NPV as the solution to the problem

A flow chart of this algorithm can be found in figure 3.5.
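The "lottery pot" of steps 5 and 6 is a fitness-proportional (roulette-wheel) draw; a minimal MATLAB sketch with example fitness values follows.

```matlab
% Sketch of lottery (roulette-wheel) parent selection from fitness values.
fitness = [0.26 0.05 0.12 0.30 0.27];    % example normalized fitness values
cdf = cumsum(fitness) / sum(fitness);    % cumulative selection probabilities

pickParent = @() find(rand <= cdf, 1);   % index of one drawn member
parents = [pickParent(), pickParent()];  % a lottery pair for crossover
```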

Figure 3. 5 Flow chart of the GA used in this study

The population size, which is also a very critical parameter for the runtime of the code, must be decided. A prevailing choice in the literature is the genome size [19]. Each unit of increase in the genome size adds another dimension to the search space, which makes the problem incomparably bigger, whereas adding one member to the population increases the computational cost only linearly.

Another problem faced in robust optimization is that we must generate several realizations and simulate all of them. The objective function for the optimization algorithm is ultimately the mean of the NPV, or of any other fitness function, over the ensemble. This implies that the computational cost increases linearly with the size of the ensemble; given the limited capacity of the available clusters, a robust optimization with an ensemble of size 10 takes 10 times more runtime than a normal single-realization optimization.

3.2.2 Live Genetic Algorithm (LGA)

As can readily be recognized from the problem and its size, the use of computing clusters seems inevitable. One of the problems with traditional sequential genetic algorithms is that the results of a full cycle of breeding, simulation, and objective function calculation must all be available at the end of the cycle before the next generation can be formed. This implies that if we have a cluster of 100 cores and our population size is, for example, 60, then 60 of the cores are assigned to the computations for the population members and the other 40 stay idle, unless a more complex algorithm is implemented to use all 100 cores for different realizations in the case of robust optimization. But even then, with the traditional genetic algorithm, reaching 100 percent usage of the computational power is impossible. To attack this problem, we design a new genetic algorithm scheme called the "Live Genetic Algorithm" (LGA), with which we can attain 100 percent usage of the computational cluster capacity.

The LGA, like the GA itself, is inspired by nature. Nature has no discrete generations: the second generation of a species does not wait for the first generation to end its lifetime before starting its own lifespan. Different generations and genome lines coexist but have a certain life expectancy. This is exactly what we have mimicked in designing the LGA. Starting with a population of genomes, we specify a lifetime "L" for each population member, meaning that each member can take part in L lotteries to be one of the parents. After pairing, if the child is fit to the environment, i.e., the set of constraints, the child goes to simulation over all of its realizations, and the necessary data about each of them, in addition to its genome, is stored in a live list. Each core can thus, in parallel, come to this list, choose a pair of parents from the last L members of the list, generate a fit child genome, run the simulations of all the realizations based on the phenotype built from the child's genome, and store the chief data at the end of the live list and update it. The algorithm and a schematic of the live GA can be seen below.

The Live Genetic Algorithm outline:

1. Generate an initial live list of drilling schemes of size L

2. Simulate the realizations of all the members in the initial population (live list)

3. Calculate the mean NPV for each of the population members based on the simulations

4. Calculate the fitness function for each of the members of the list

5. Add the last L members of the list to the lottery pot together with the elites list

6. If any core is free, choose two parents from the lottery pot based on the probabilistic approach and generate a child member

7. Simulate all the realizations based on the child scheme

8. Calculate the NPV of all the realizations of the child member

9. Calculate the mean NPV and the fitness function accordingly

10. Add the child member and the chief data of all of its realizations to the end of the list

11. Update the elites list

12. If the convergence criteria are not met, go to 5

13. Report the top of the elites list as the optimized scheme of this algorithm
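A minimal sketch of one LGA worker is given below; in practice one such loop runs on every free core. The shared live_list and elites structures, the lock, and the crossover, is_fit, evaluate, and converged callables are all assumptions made for illustration, and the fitness-proportional draw shown here is just one simple choice for the lottery.

import random

def lga_worker(live_list, elites, lock, L, n_elites,
               crossover, is_fit, evaluate, converged):
    """Each record in live_list and elites is a (fitness, genome) pair."""
    while not converged():
        with lock:
            pot = live_list[-L:] + elites                    # step 5: lottery pot
        weights = [max(f, 0.0) + 1e-9 for f, _ in pot]       # fitness-proportional draw
        (f1, p1), (f2, p2) = random.choices(pot, weights=weights, k=2)
        child = crossover(p1, p2)                            # step 6
        if not is_fit(child):                                # environment (constraint) check
            continue
        fitness = evaluate(child)                            # steps 7-9: all realizations
        with lock:
            live_list.append((fitness, child))               # step 10
            elites[:] = sorted(elites + [(fitness, child)],  # step 11: update elites
                               key=lambda rec: rec[0],
                               reverse=True)[:n_elites]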

As can be seen in figure 3.6, at any step the child is generated by selecting parents by lottery among the last L members of the list and an elites list; after simulation and calculation of its mean NPV, it is added to the end of the list.


Figure 3.6 The population list in LGA

3.2.3 Simulated Annealing:

This method is widely used, especially for problems with complex search spaces, owing to its probabilistic approach to the direction of movement. Thanks to this probabilistic approach, the algorithm becomes capable of dealing with the problem of local optima.

For the problem of our interest, the technique must be modified for a discrete search space [52]. For this purpose, the same graph that will be introduced in the gradient-based search is constructed. The two algorithms differ in that in the stochastic gradient descent method the neighboring scheme tried at each step is only accepted if it improves the objective function, which increases the risk of getting stuck in local optima, while in simulated annealing a worse solution is also accepted with some probability. This probability decreases with time, so that the method eventually becomes very close to the stochastic gradient descent method.


In the simulated annealing method, the probability of accepting a decrease in the objective function has to be a function of the step the algorithm is at. The functional form in equation 3.4 is proposed in this research for that purpose.

P(k) = P₀ exp(−αk)    (3.4)

where
k = the step (an integer increasing by one unit with every full set of simulations)
P(k) = the probability of accepting a decrease in NPV at the kth step
P₀ = the initial probability of accepting a decrease in NPV
α = the decay rate of the probability

The decay rate is adjusted based on how quickly we want to suppress moves that decrease the objective function. To quantify it we use half-life terminology: if we want the probability of accepting a reduction in the objective function to be halved every K steps, formula 3.5 can be used to determine α.

α = ln(2) / K    (3.5)

The α in equation 3.5 should be chosen such that the exploitation-to-exploration ratio is well adjusted for the problem.
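As a worked example of equations 3.4 and 3.5 (the numbers P₀ = 0.5 and K = 20 are purely illustrative):

import math

def acceptance_probability(k, p0=0.5, half_life=20):
    alpha = math.log(2) / half_life        # equation 3.5
    return p0 * math.exp(-alpha * k)       # equation 3.4

# acceptance_probability(0)  -> 0.5
# acceptance_probability(20) -> 0.25
# acceptance_probability(40) -> 0.125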

The following is the algorithm proposed in this research for simulated annealing.

1. Take one of the vertices as the initial guess

2. Update the probability of accepting a reduced objective function, P(k)

3. Randomly choose either time or space for the search in this step (not with equal chances)

4. Take one of the wells as the trial well for this step

5. Randomly choose one of the directions in which the cell can move (2 directions in the time domain and 4 directions in the space domain)

6. Simulate the neighboring scheme that is formed

7. Calculate the objective function and compare it with the objective function of the previous cell

8. Generate a random number R between 0 and 1 with uniform probability

9. If the objective function has increased or R is smaller than P(k), move to the new cell

10. If the convergence criteria are not satisfied, go to step 2

11. Report the final scheme and its NPV as the solution to the problem
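Putting the outline together, the following is a minimal sketch of the annealing loop. The random_neighbor callable (steps 3-5) and the objective callable (steps 6-7, i.e. the full set of simulations and the mean-NPV calculation) are hypothetical helpers, and a fixed step budget stands in for the convergence criteria.

import math
import random

def simulated_annealing(initial, random_neighbor, objective,
                        p0=0.5, half_life=20, max_steps=500):
    alpha = math.log(2) / half_life
    current, current_obj = initial, objective(initial)       # step 1
    for k in range(max_steps):
        p_k = p0 * math.exp(-alpha * k)                      # step 2: update P(k)
        candidate = random_neighbor(current)                 # steps 3-5
        candidate_obj = objective(candidate)                 # steps 6-7
        r = random.random()                                  # step 8
        # Step 9: accept improvements always, worse moves with probability P(k)
        if candidate_obj > current_obj or r < p_k:
            current, current_obj = candidate, candidate_obj
    return current, current_obj                              # step 11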

3.2.4 Gradient-based optimizer:

As mentioned in the previous chapter, the optimizers of this category try to reach the optimum solution by moving along the direction of steepest descent. Even though there are many successful gradient-based techniques in the literature, most of them are not applicable in the petroleum industry for a couple of major reasons. Firstly, the analytical calculation of derivatives is not technically possible, as we normally use commercial software for the simulation of our study cases. Secondly, if we want to calculate the derivative matrices numerically, it incurs a gigantic computational cost, as most of the time an optimization takes is spent on simulating the cases. Because of these two main reasons, and some other minor ones, the industry tries to cope with the problem through the use of stochastic gradient methods. In this subcategory, the search move is not in the direction of the steepest slope; instead, the move is in a random direction with the preferred sign of slope. This is a very poor approximation of the best move direction, but in practice it has shown to be the most successful among the gradient-based techniques. A two-dimensional symbolic representation of this technique is shown in figure 3.7 [37].

Figure 3.7 A comparison of GD and SGD

When we are in a discrete search space, such as the discrete (2+1)*n-dimensional search space of this problem, where 2 stands for the two spatial dimensions, 1 for the time dimension, and n for the number of wells, we can modify the continuous search into a metaheuristic search.
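A single move of this discrete stochastic search might look like the sketch below, where each well is represented by its two grid indices and a drilling-time index. The scheme representation and the objective callable are assumptions made for illustration, not the thesis implementation; note that, unlike simulated annealing, only improving moves are accepted.

import random

def stochastic_descent_step(scheme, objective, current_obj):
    """scheme: list of n wells, each a dict with keys 'i', 'j' (space) and 't' (time)."""
    trial = [dict(well) for well in scheme]          # copy the current vertex
    well = random.choice(trial)                      # pick one well at random
    axis = random.choice(['i', 'j', 't'])            # 2 spatial axes + 1 time axis
    well[axis] += random.choice([-1, 1])             # move to a neighboring vertex
    trial_obj = objective(trial)
    if trial_obj > current_obj:                      # keep only improving moves
        return trial, trial_obj
    return scheme, current_obj                       # otherwise stay put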
