Genetic Algorithm


INTRODUCTION

The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is based on natural selection, the mechanism that drives biological evolution. The genetic algorithm repeatedly modifies a population of individual solutions. At each step, it selects individuals from the current population to be parents and uses them to produce the children of the next generation. Over successive generations, the population "evolves" toward an optimal solution. The genetic algorithm can be applied to many optimization problems that are not well suited to traditional optimization techniques, such as problems with discontinuous, nondifferentiable, stochastic, or highly nonlinear objective functions. It can also address mixed-integer programming problems, in which some components are restricted to integer values.

INTRODUCTION TO OPTIMIZATION

The process of improving something is known as optimization. As indicated in the accompanying diagram, every process has a set of inputs and a set of outputs.

SET OF INPUTS → PROCESS → SET OF OUTPUTS

The term "optimization" refers to the process of determining input values so as to obtain the "best" output values. The meaning of "best" varies from context to context, but in mathematics it refers to maximizing or minimizing one or more objective functions by varying the input parameters.

The set of all possible solutions or values that the inputs can take is called the search space. In this search space lies a point or a set of points that gives the optimal solution. The aim of optimization is to find that point or set of points in the search space.

WHAT IS GENETIC ALGORITHM?

Nature has long served as a source of inspiration for all of humanity. Genetic Algorithms (GAs)
are search-based algorithms based on natural selection and genetics principles. GAs are a
subset of Evolutionary Computing, a considerably bigger discipline of computation.

GAs were developed at the University of Michigan by John Holland and his students and colleagues, most notably David E. Goldberg, and have since been tried on a variety of optimization problems with a high degree of success.
We have a pool or population of possible solutions to a problem in GAs. These solutions are
then subjected to recombination and mutation (as in natural genetics), resulting in the birth of
new children, and the process is repeated over generations. Each individual (or candidate
solution) is given a fitness value (based on its objective function value), and the fitter ones have
a better probability of mating and producing additional "fitter" individuals. This is consistent
with Darwin's "Survival of the Fittest" theory.
In this manner, we keep "evolving" better individuals or solutions over generations, until we reach a stopping criterion.

Genetic algorithms are randomized in nature, but they perform much better than random local search (in which we just try random solutions, keeping track of the best so far), because they also exploit historical information.

ADVANTAGES OF GENETIC ALGORITHM

GAs offer a number of advantages that have helped them become quite popular. These include
the following:

• Does not require any derivative information (which may not be available for many real-world problems).
• Is often faster and more efficient than exhaustive search with traditional methods.
• Has very good parallel capabilities.
• Optimizes both continuous and discrete functions, as well as multi-objective problems.
• Provides a list of "good" solutions rather than just a single solution.
• Always gets an answer to the problem, which gets better over time.
• Is useful when the search space is very large and there are a large number of parameters involved.

LIMITATIONS OF GENETIC ALGORITHM


GAs, like any other approach, have some drawbacks. GAs are not appropriate for all situations,
particularly those that are simple and for which derivative information is accessible.

• For some problems, repeatedly computing the fitness value can be computationally expensive.
• Being stochastic, there are no guarantees on the optimality or the quality of the solution.
• If not implemented properly, the GA may not converge to the optimal solution.

GENETIC ALGORITHM – MOTIVATION

Genetic Algorithms can deliver a "good-enough" solution "fast enough," which makes them attractive for solving optimization problems. The reasons why genetic algorithms are needed are as follows:

1. Solve difficult problems

There are a huge number of NP-Hard problems in computer science. This essentially means that even the most powerful computing systems would take a very long time (years!) to solve such a problem exactly. In this scenario, GAs prove to be an efficient tool for delivering usable near-optimal solutions in a short amount of time.
2. Gradient based methods failure

Traditional calculus-based methods work by starting at a random point and moving in the direction of the gradient until we reach the top of the hill. This technique is fast and efficient for single-peaked objective functions, such as the cost function in linear regression. In most real-world situations, however, we have a very complex search landscape made up of many peaks and many valleys, which causes such methods to fail, since they have an inherent tendency to get stuck at local optima, as shown in the figure below.

Figure 1: Potential genetic algorithm search area

3. Getting better solution fast


The Traveling Salesperson Problem (TSP), for example, has real-world applications in path
finding and VLSI design. Consider that you're using your GPS Navigation system, and it takes
a few minutes (or even hours) to compute the "best" route from point A to point B. In such
real-world applications, delays are unacceptable, hence a "good-enough" solution that is
supplied "quickly" is essential.
FUNDAMENTALS OF GENETIC ALGORITHM
This section covers the fundamental vocabulary needed to comprehend GAs. A general
structure of GAs is also described in pseudo-code and visual form. The reader is urged to fully
comprehend all of the concepts presented in this section and to keep them in mind while they
read the rest of the tutorial.
BASIC TERMINOLOGY
Before diving into the topic of Genetic Algorithms, it's important to understand some basic
terms that will be utilized throughout this lesson.

Population - It is a subset of all the possible (encoded) solutions to the given problem. The population of a GA is analogous to a population of human beings, except that instead of human beings we have candidate solutions representing them.
Chromosome - A chromosome is one such solution to the given problem.

Gene - A gene is one element position of a chromosome.


Allele - The value that a gene has for a certain chromosome is called an allele.

Figure 2: relation between population, chromosome, gene and allele

Genotype - Genotype is the population in the computation space. In the computation space, the solutions are represented in a way that can be easily understood and manipulated using a computing system.
Phenotype - Phenotype is the population in the actual real-world solution space, in which solutions are represented the way they appear in real-world situations.
Decoding and Encoding – For simple problems, the phenotype and genotype spaces are the same. In most cases, however, the phenotype and genotype spaces are different. Decoding is the process of transforming a solution from the genotype space to the phenotype space, while encoding is the transformation from the phenotype space to the genotype space. Since decoding is carried out repeatedly in a GA during fitness value calculation, it should be fast.
For example, consider the 0/1 Knapsack Problem. The phenotype space consists of solutions that just contain the item numbers of the items to be picked. In the genotype space, however, a solution can be represented as a binary string of length n (where n is the number of items): a 1 at position x indicates that the xth item is picked, while a 0 indicates that it is not. The genotype and phenotype spaces are different in this case.

[Diagram: the genotype space (computation space), e.g. the bit string 0 1 0 1 0 1 0 1, maps to the phenotype space (actual solution space) via decoding, and back via encoding.]

Figure 3: Genotype and Phenotype space

Fitness Function - A fitness function is a function that takes the solution as input and produces the suitability of the solution as output. In some cases, the fitness function and the objective function may be the same, while in others they differ based on the problem.
Genetic Operators - Genetic operators alter the genetic composition of the offspring. These include crossover, mutation, and selection, among others.
BASIC STRUCTURE
The basic structure of a GA is as follows –
We start with an initial population (which may be generated at random or seeded by other heuristics) and select parents from this population for mating. Crossover and mutation operators are then applied to the parents to generate new offspring. Finally, these offspring replace the existing individuals in the population, and the process repeats. In this way, genetic algorithms loosely mimic natural evolution.

Figure 4: Basic structure of genetic algorithm

The following pseudo-code describes a generalized GA −

GA()
   initialize population
   find fitness of population
   while (termination criteria not reached) do
      parent selection
      crossover with probability pc
      mutation with probability pm
      decode and fitness calculation
      survivor selection
      find best
   return best
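The pseudo-code above can be turned into a minimal runnable GA. The sketch below applies it to the OneMax problem (maximize the number of 1s in a bit string); the problem choice and all parameter values are illustrative, not part of the pseudo-code:

```python
import random

def fitness(chrom):
    """OneMax: the fitness of a bit string is its number of 1s."""
    return sum(chrom)

def run_ga(n_bits=20, pop_size=30, pc=0.9, pm=0.02, generations=100, seed=0):
    rng = random.Random(seed)
    # initialize population with random bit strings
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # parent selection: binary tournament
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < pc:
                # one-point crossover with probability pc
                cut = rng.randrange(1, n_bits)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            else:
                c1, c2 = p1[:], p2[:]
            for child in (c1, c2):
                # bit-flip mutation with probability pm per gene
                for i in range(n_bits):
                    if rng.random() < pm:
                        child[i] ^= 1
                offspring.append(child)
        pop = offspring[:pop_size]  # generational survivor selection
    return max(pop, key=fitness)   # find best

best = run_ga()
print(fitness(best))  # close to the maximum of 20 for this toy problem
```

Each stage of the pseudo-code (parent selection, crossover, mutation, survivor selection) appears as a commented step, so any stage can be swapped for the alternatives discussed later.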
GENOTYPE REPRESENTATION
One of the most important decisions to make while implementing a genetic algorithm is deciding the representation that we will use for our solutions. It has been observed that an improper representation can lead to poor performance of the GA.
As a result, selecting an appropriate representation and defining the mappings between the
phenotypic and genotype spaces is critical to the GA's performance.
In this section, we present some of the most commonly used representations for genetic algorithms. However, representation is highly problem-specific, and the reader might find that another representation, or a mix of the representations listed here, suits their problem better.
Binary Representation
This is one of the simplest and most widely used representations in GAs. In this type of representation, the genotype consists of bit strings.
For some problems, specifically those where the solution space consists of Boolean decision variables - yes or no - the binary representation is natural. Take the 0/1 Knapsack Problem, for example: if there are n items, a solution can be represented as a binary string of n elements, where the xth element tells whether item x is picked (1) or not (0).

0 0 1 0 1 1 1 0 0 1
Figure 5: Binary Representation

In other problems, specifically those dealing with numbers, we can represent the numbers in their binary form. The issue with this kind of encoding is that different bits have different significance (flipping a high-order bit changes the value far more than flipping a low-order one), so the mutation and crossover operators can have unintended consequences. Gray coding alleviates this to some extent, since a change in one bit does not have a massive effect on the decoded value.
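The standard binary/Gray conversion can be sketched as follows; note how the consecutive integers 3 and 4, which differ in three bits in plain binary, differ in only one bit in Gray code:

```python
def binary_to_gray(n):
    """Convert an integer to its Gray-code value (adjacent ints differ in 1 bit)."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert the Gray coding by folding in the shifted bits."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# 3 is 011 and 4 is 100 in binary (three differing bits);
# their Gray codes are 010 and 110 (a single differing bit)
print(binary_to_gray(3), binary_to_gray(4))  # 2 6
print(gray_to_binary(6))                     # 4
```

This is why a one-bit mutation on a Gray-coded chromosome moves the decoded value gently instead of jumping by a large power of two.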
Real Value Representation
For problems where we want to define the genes using continuous rather than discrete variables, the real-valued representation is the most natural. The precision of these real or floating-point numbers is, however, limited by the machine's precision.

0.5 0.2 0.6 0.8 0.7 0.4 0.3 0.9 0.4 0.7
Figure 6:Real value representation
Integer Representation
We cannot always limit the solution space to binary 'yes' or 'no' for discrete-valued genes. For example, if we want to encode the four directions - North, South, East, and West - we can encode them as {0, 1, 2, 3}. In such cases, integer representation is preferred.

5 2 6 8 7 4 3 9 4 7
Figure 7: Integer representation
Permutation Representation
In many problems, the solution is represented by an order of elements. In such cases, permutation representation is the most suitable.
A classic example of this representation is the travelling salesman problem (TSP). In it, the salesman has to take a tour of all the cities, visiting each city exactly once, before returning to the starting city, and the total distance of the tour has to be minimized. Since the solution to this TSP is naturally an ordering or permutation of all the cities, using a permutation representation makes sense for this problem.

1 5 9 8 7 4 2 3 6 0
Figure 8: Permutation representation
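A permutation chromosome for a toy TSP instance can be evaluated as below. The four city coordinates are made up for illustration; the tour wraps back to the starting city:

```python
import math

cities = [(0, 0), (0, 1), (1, 1), (1, 0)]  # four cities on a unit square

def tour_length(perm):
    """Total distance of the tour, including the return to the start."""
    total = 0.0
    for i in range(len(perm)):
        x1, y1 = cities[perm[i]]
        x2, y2 = cities[perm[(i + 1) % len(perm)]]  # wrap back to the start
        total += math.hypot(x2 - x1, y2 - y1)
    return total

print(tour_length([0, 1, 2, 3]))  # 4.0, the optimal tour around the square
print(tour_length([0, 2, 1, 3]))  # longer: this tour crosses the diagonal twice
```

Because every chromosome is a permutation, crossover operators for this representation (such as Davis' order crossover, listed at the end of this tutorial) must preserve the property that each city appears exactly once.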

GENETIC ALGORITHM – POPULATION


Population is a subset of solutions in the current generation. It can also be defined as a set of chromosomes. There are several things to keep in mind when dealing with a GA population.

• The diversity of the population should be maintained; otherwise, premature convergence may occur.
• A very large population can slow a GA down, while a smaller population may not provide a sufficient mating pool. The optimal population size therefore needs to be decided by trial and error.
The population is usually defined as a two-dimensional array of size population_size × chromosome_size.
Population Initialization
In a GA, there are two major ways for initializing a population. They are −

• Random initialization - populate the initial population with completely random solutions.
• Heuristic initialization - populate the initial population using a known heuristic for the problem.
It has been observed that seeding the entire population with a heuristic can leave the population with very similar solutions and little diversity. Experiments show that it is the random solutions that drive the population toward optimality. Therefore, rather than filling the entire population with heuristic-based solutions, we use heuristic initialization only to seed the population with a couple of good solutions, filling up the rest with random solutions.
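This mixed initialization can be sketched as follows. Here `greedy_solution` is a hypothetical stand-in for whatever problem-specific heuristic is available, and the 10% seeding fraction is an illustrative choice:

```python
import random

def greedy_solution(n_bits):
    """Hypothetical heuristic answer; a real GA would call a problem-specific heuristic."""
    return [1] * n_bits

def init_population(pop_size, n_bits, heuristic_fraction=0.1, seed=0):
    """Seed a small fraction of the population heuristically, fill the rest at random."""
    rng = random.Random(seed)
    n_seeded = int(pop_size * heuristic_fraction)
    pop = [greedy_solution(n_bits) for _ in range(n_seeded)]
    while len(pop) < pop_size:
        pop.append([rng.randint(0, 1) for _ in range(n_bits)])
    return pop

pop = init_population(20, 8)
print(len(pop))  # 20 individuals: 2 seeded, 18 random
```

Raising `heuristic_fraction` toward 1.0 reproduces exactly the low-diversity failure mode described above.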
It has also been discovered that in some circumstances, heuristic initialization merely affects
the population's initial fitness, but in the end, it is the variety of the solutions that leads to
optimality.
Population Models
There are two population models widely used, they are
Steady State
In steady state GA, each iteration produces one or two offspring who replace one or two
members in the population. Incremental GA is another name for steady state GA.
Generational
In a generational model, we create 'n' offspring, where n is the population size, and at the
conclusion of the iteration, the entire population is replaced by the new one.
GENETIC ALGORITHM - FITNESS FUNCTION
Simply stated, the fitness function is a function that takes a candidate solution to the problem as input and produces as output how "fit" or how "good" the solution is with respect to the problem under consideration.
The calculation of fitness value is performed several times in a GA, therefore it should be
quick. A sluggish fitness value computation can have a negative impact on a GA, making it
extremely slow.
In most cases, the fitness function and the objective function are the same, since the objective is to either maximize or minimize the given objective function. However, for more complex problems with multiple objectives and constraints, an algorithm designer might choose a different fitness function.
The following properties should be included in a fitness function:

• The fitness function should be sufficiently fast to compute.

• It must quantitatively measure how fit a given solution is, or how fit the individuals that can be produced from that solution are.

In some cases, calculating the fitness function directly might not be possible due to the inherent complexities of the problem at hand. In such cases, we use fitness approximation to suit our needs.
The image below shows the fitness calculation for a solution of the 0/1 Knapsack. It is a simple fitness function which scans the chromosome from left to right and sums up the profit values of the items being picked (those whose gene is 1), until the knapsack is full.
Item number:    0  1  2  3  4  5  6
Chromosome:     0  1  0  1  1  0  1
Profit values:  2  9  8  5  4  0  2
Weight values:  7  5  3  1  5  9  8

Knapsack capacity = 15
Total associated profit = 18 (the last picked item is skipped, as it would exceed the knapsack capacity)

Figure 9: Fitness function
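The left-to-right scan from Figure 9 can be sketched in Python, reproducing the profit of 18 for the chromosome shown. The function name `knapsack_fitness` is illustrative:

```python
def knapsack_fitness(chromosome, profits, weights, capacity):
    """Sum profits of picked items left to right until the knapsack is full."""
    total_profit, total_weight = 0, 0
    for gene, profit, weight in zip(chromosome, profits, weights):
        if gene == 1:
            if total_weight + weight > capacity:
                break  # knapsack full: stop scanning, skip the rest
            total_weight += weight
            total_profit += profit
    return total_profit

chromosome = [0, 1, 0, 1, 1, 0, 1]
profits    = [2, 9, 8, 5, 4, 0, 2]
weights    = [7, 5, 3, 1, 5, 9, 8]
print(knapsack_fitness(chromosome, profits, weights, capacity=15))  # 18
```

Because this function runs once per individual per generation, its single linear pass keeps fitness evaluation cheap, as the text above recommends.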

GENETIC ALGORITHM – PARENT SELECTION


Parent selection is the process of selecting parents which mate and recombine to create offspring for the next generation. Parent selection is very crucial to the convergence rate of the GA, as good parents drive individuals toward better and fitter solutions.
However, care should be taken to prevent one extremely fit solution from taking over the entire population in a few generations, as this leads to the solutions being close to one another in the solution space, thereby reducing diversity. Maintaining good diversity in the population is extremely crucial for the success of a GA. This taking over of the entire population by one extremely fit solution is known as premature convergence, and it is an undesirable condition in a GA.
Fitness Proportionate Selection
Fitness proportionate selection is one of the most popular ways of parent selection. Every individual can become a parent with a probability which is proportional to its fitness. Therefore, fitter individuals have a higher chance of mating and propagating their features to the next generation. Such a selection strategy thus applies selection pressure to the more fit individuals in the population, evolving better individuals over time.
Consider a circular wheel. The wheel is divided into n pies, where n is the number of individuals in the population. Each individual gets a portion of the circle which is proportional to its fitness value.
There are two ways to implement fitness proportional selection:
Roulette Wheel Selection
In a roulette wheel selection, the circular wheel is divided as described before. A fixed point is chosen on the wheel circumference, as shown, and the wheel is rotated. The region of the wheel which comes in front of the fixed point is chosen as the parent. For the second parent, the same process is repeated.

Chromosome   Fitness value
A            8.2
B            3.2
C            1.4
D            1.2
E            4.2
F            0.3

[Spin the roulette wheel; the slice that stops in front of the fixed point - here D - is selected as the parent.]

Figure 10: Roulette Wheel Selection

A fitter person obviously has a larger pie on the wheel and therefore a better probability of landing in
front of the fixed point when the wheel is revolved. As a result, the likelihood of selecting an
individual is exactly proportional to its fitness.
For implementation, we use the following steps −

• Calculate S = the sum of all fitness values.
• Generate a random number r between 0 and S.
• Starting from the top of the population, keep adding the fitness values to a partial sum P.
• The individual for which P first exceeds r is the chosen individual.
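These steps can be sketched directly in Python, using the chromosomes and fitness values from Figure 10:

```python
import random

def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one parent with probability proportional to fitness."""
    S = sum(fitnesses)          # total fitness of the population
    r = rng.uniform(0, S)       # random point on the wheel
    P = 0.0
    for individual, fit in zip(population, fitnesses):
        P += fit                # partial sum of fitnesses
        if P > r:
            return individual   # first individual whose slice covers r
    return population[-1]       # guard against floating-point round-off

pop = ["A", "B", "C", "D", "E", "F"]
fits = [8.2, 3.2, 1.4, 1.2, 4.2, 0.3]
picks = [roulette_wheel_select(pop, fits) for _ in range(1000)]
print(picks.count("A") > picks.count("F"))  # the much fitter A wins far more spins
```

Over many spins, A (fitness 8.2 of a total 18.5) is selected roughly 44% of the time, while F (0.3) is almost never selected, matching the "bigger slice of the pie" intuition.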
Stochastic Universal Sampling (SUS)

Stochastic Universal Sampling is quite similar to roulette wheel selection; however, instead of having just one fixed point, we have multiple fixed points, as shown in the image below. Therefore, all the parents are chosen in just one spin of the wheel. Such a setup also encourages the highly fit individuals to be chosen at least once.
It is to be noted that fitness proportionate selection methods do not work for cases where the fitness can take a negative value.
Chromosome   Fitness value
A            8.2
B            3.2
C            1.4
D            1.2
E            4.2
F            0.3

[One spin of the wheel; multiple equally spaced fixed points each select a parent.]

Figure 11: Stochastic universal selection
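A sketch of SUS in Python: a single spin fixes the first pointer, and the remaining pointers are spaced evenly around the wheel. The function name `sus_select` is illustrative:

```python
import random

def sus_select(population, fitnesses, n, rng=random):
    """Select n parents in one spin using n equally spaced pointers."""
    S = sum(fitnesses)
    step = S / n                   # distance between the fixed points
    start = rng.uniform(0, step)   # a single random spin of the wheel
    pointers = [start + i * step for i in range(n)]
    selected, P, i = [], 0.0, 0
    for pointer in pointers:
        # advance along the wheel until individual i's slice covers the pointer
        while P + fitnesses[i] < pointer:
            P += fitnesses[i]
            i += 1
        selected.append(population[i])
    return selected

pop = ["A", "B", "C", "D", "E", "F"]
fits = [8.2, 3.2, 1.4, 1.2, 4.2, 0.3]
parents = sus_select(pop, fits, 4)
print(len(parents))  # 4 parents from a single spin
```

With these fitness values, A's slice (8.2 of 18.5) is wider than the pointer spacing, so A is guaranteed to be chosen at least once, which is exactly the property the text describes.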

Tournament Selection
In K-way tournament selection, we select K individuals from the population at random and pick the best out of them to become a parent. The same process is repeated for selecting the next parent. Tournament selection is also very popular in the literature, as it can even work with negative fitness values.

Figure 12: Tournament selection
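Tournament selection is only a few lines of Python. The toy population of integers (scored by an identity fitness, including negative values) is illustrative:

```python
import random

def tournament_select(population, fitness_fn, k=3, rng=random):
    """K-way tournament: sample k individuals at random, keep the fittest."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness_fn)

pop = [-5, -2, 0, 3, 7, 10]   # individuals scored by a toy identity fitness
winner = tournament_select(pop, fitness_fn=lambda x: x, k=3)
print(winner in pop)  # True; fitter (larger) values tend to win their tournaments
```

Note that only comparisons between fitness values are used, never their magnitudes or signs, which is why tournament selection works even with negative fitness, and why the tournament size k directly tunes the selection pressure.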


Rank Selection
Rank selection also works with negative fitness values and is mostly used when the individuals in the population have very close fitness values (this happens usually at the end of the run). In that case, every individual gets an almost equal share of the pie under fitness proportionate selection, as shown in the image below, and so every individual, no matter how fit relative to the others, has an approximately equal probability of getting selected as a parent. This in turn leads to a loss of selection pressure toward fitter individuals, making the GA make poor parent selections in such situations.

Chromosome   Fitness value
A            8.1
B            8.0
C            8.05
D            7.95
E            8.02
F            7.99

[With such close fitness values, every slice of the wheel is nearly equal: there is no selection pressure.]

Figure 13: Rank Selection

In rank selection, we remove the concept of raw fitness while selecting a parent. Instead, every individual in the population is ranked according to its fitness, and the selection of the parents depends on the rank of each individual rather than its fitness value. Higher-ranked individuals are preferred over lower-ranked ones.
Chromosome   Fitness value   Rank
A            8.1             1
B            8.0             4
C            8.05            2
D            7.95            6
E            8.02            3
F            7.99            5
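One common way to implement rank selection is a sketch like the following: sort by fitness, give the worst individual weight 1 and the best weight n, and then run a roulette wheel over the rank weights instead of the raw fitnesses. The linear weighting scheme is one illustrative choice among several:

```python
import random

def rank_select(population, fitnesses, rng=random):
    """Pick one parent with probability proportional to rank, not raw fitness."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    # worst individual gets rank weight 1, best gets weight n
    weights = {idx: rank + 1 for rank, idx in enumerate(order)}
    total = sum(weights.values())
    r = rng.uniform(0, total)
    P = 0.0
    for idx in range(len(population)):
        P += weights[idx]
        if P > r:
            return population[idx]
    return population[-1]  # guard against floating-point round-off

pop = ["A", "B", "C", "D", "E", "F"]
fits = [8.1, 8.0, 8.05, 7.95, 8.02, 7.99]
print(rank_select(pop, fits) in pop)  # True; A gets weight 6, D only weight 1
```

Even though the raw fitnesses here differ by less than 2%, A is selected with probability 6/21 versus 1/21 for D, restoring the selection pressure that fitness proportionate selection loses in this situation.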

Random Selection
In this strategy, we randomly select parents from the existing population. Since there is no selection pressure toward fitter individuals, this strategy is usually avoided.
GENETIC ALGORITHM – CROSSOVER
The crossover operator is analogous to reproduction and biological crossover. In this, more than one parent is selected and one or more offspring are produced using the genetic material of the parents. Crossover is usually applied in a GA with a high probability – pc.
The crossover operators include −

• One-point crossover
• Multi-point crossover
• Uniform crossover
• Whole arithmetic recombination
• Davis' order crossover
