Comparison of A*, Euclidean and Manhattan Distance Using Influence Map in Ms. Pac-Man
Sudip Karki
Hari Sagar Ranjitkar
Faculty of Computing
Blekinge Institute of Technology
SE-371 79 Karlskrona Sweden
This thesis is submitted to the School of Computing at Blekinge Institute of Technology
in partial fulfillment of the requirements for the degree of Master of Science in Computer
Science. The thesis is equivalent to 20 weeks of full time studies.
Contact Information:
Author(s):
Sudip Karki
E-mail: sukc10@student.bth.se
University advisor(s):
Professor Lars Lundberg
Associate Prof. Stefan Johansson
Department of Computer Science and Engineering
We would like to thank Stefan J. Johansson and Lars Lundberg for their incredible
support and guidance in completing this task successfully. We would also like to
express our gratitude to Blekinge Institute of Technology for providing all the
necessary support and resources to complete this thesis.
List of Figures
1.1 A generic influence map in GO showing the black and white stones'
influence on their surroundings up to 8 tiles away. Figure with
permission from Niklas Hansson [8]. . . . . . . . . . . . . . . . . . 2
1.2 The intensity map of all positions in a map in ORTS. Figure with
the permission of Hagelback and Johansson [8] . . . . . . . . . . . 2
1.3 Manhattan Distance Representation . . . . . . . . . . . . . . . . . 5
1.4 A* Search algorithm Representation . . . . . . . . . . . . . . . . . 6
List of Tables
5.1 t-test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.2 t-test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.3 t-test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Contents
Abstract i
1 Introduction 1
1.1 Pac-Man . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Potential Field and Influence Maps . . . . . . . . . . . . . . . . . 1
1.3 Euclidean, Manhattan distances and A* search algorithm . . . . . 4
1.4 Difference among Euclidean, Manhattan distances and A* search
algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Related works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4 Experiments 19
4.1 Goal of the experiment . . . . . . . . . . . . . . . . . . . . . . . . 19
4.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Experiment Planning . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.4 Experiment Instrumentation . . . . . . . . . . . . . . . . . . . . . 21
4.4.1 Experiment Phase 1 . . . . . . . . . . . . . . . . . . . . . 22
4.4.2 Experiment Phase 2 . . . . . . . . . . . . . . . . . . . . . 22
7 Conclusion 38
8 Future Work 39
References 40
A Appendix 43
A.1 Ms. Pac-Man . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
A.1.1 Ghosts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
A.1.2 Lair . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.1.3 Pac-Man . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.1.4 Normal Pills . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.1.5 Power Pills . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.1.6 Fruits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.1.7 Junctions . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
A.1.8 Teleports . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Chapter 1
Introduction
1.1 Pac-Man
Pac-Man is a popular arcade game developed by Toru Iwatani for Namco and
originally distributed by Midway in 1980 [1]. The best-known variant of the
game, Ms. Pac-Man, was released in 1982. It introduced a female character,
new maze designs and several gameplay changes [10]. Screenshots of the game
are shown in Figure A.1 and Figure A.2 in the appendix. Ms. Pac-Man moves
around the maze, eating pills for points while trying to avoid the four ghosts
(Blinky, Pinky, Inky and Sue), who strive to eat Ms. Pac-Man [10]. The four
power pills that appear at the corners of the maze allow Ms. Pac-Man to eat the
ghosts for a specified time period to gain additional points; during this period
the ghosts turn blue.
An influence map is a representation in which the tiles adjacent to an object
are influenced by it, and those tiles in turn influence their own neighbours, so
that the influence propagates outward from the source [13]. An agent moving
towards a destination compares the influences among its possible look-ahead
positions and picks the path with the highest influence. This means the chosen
path is the one that brings the agent nearest to the destination.
There are some differences between potential fields and influence maps. The
propagation of a potential field is not terrain sensitive, i.e. it is not blocked
when there is a wall between source and destination. The propagation of an
influence map, on the other hand, is terrain sensitive: it can be blocked by a
wall, so the influence has to follow the maze paths. In this case, a shortest-path
algorithm (A*) can be used to compute the influence, since the influence follows
the paths of the maze and does not propagate through walls, whereas the
Euclidean and Manhattan distance metrics are used to compute a potential
field.
Potential field methods and influence maps are similar techniques used for
obstacle avoidance. Potential fields are widely used for robots and mobile robots,
whereas influence maps are used in the gaming industry [18].
A generic influence map in GO, showing the influence of black and white stones,
is represented in Figure 1.1. Figure 1.2 shows the intensity map of all positions
in a map in ORTS.
Figure 1.1 is an example of an influence map, whereas Figure 1.2 is an example
of a potential field. Influence maps are very important in Go board segmentation
[20]. Among the author's many analogies, board domination is the one he is
most interested in. In board domination, a black stone is situated in the
It is assumed that a game is in progress and the board position is stored. The
position is transferred to a 19×19 integer matrix by placing 50 for each black
stone, −50 for each white stone, and 0 elsewhere. Each positive point then sends
out +1 to each of its four neighbours, and each negative point sends out −1 to
each of its neighbours. These numbers are accumulated as the procedure is
repeated four times [20].
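The propagation procedure above can be sketched in a few lines of Python. The stone positions used in the example are illustrative assumptions, not taken from the thesis:

```python
def propagate(board, passes=4):
    """Each positive point sends +1 to its four neighbours and each
    negative point sends -1; contributions accumulate over several passes."""
    n = len(board)
    for _ in range(passes):
        delta = [[0] * n for _ in range(n)]
        for r in range(n):
            for c in range(n):
                if board[r][c] == 0:
                    continue
                sent = 1 if board[r][c] > 0 else -1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n and 0 <= cc < n:
                        delta[rr][cc] += sent
        for r in range(n):
            for c in range(n):
                board[r][c] += delta[r][c]
    return board

# 19x19 board: +50 for a black stone, -50 for a white stone, 0 elsewhere.
board = [[0] * 19 for _ in range(19)]
board[9][9] = 50    # a black stone (hypothetical position)
board[3][3] = -50   # a white stone (hypothetical position)
propagate(board)
```

After four passes the black stone's influence reaches tiles up to four steps away, while tiles near the white stone carry negative values.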
The author's other influence map analogies include drops of oil and water on
blotting paper, and heat conduction. In the heat conduction analogy, by altering
the thermal conductivity at any point on the plate we can change its influencing
ability. This seems simple at first glance, but it becomes more interesting when
we think of a composite plate, made of various materials joined together,
representing the terrain in the map. For instance, if a certain map section is
made of styrofoam, heat hardly conducts there, so we need to define how the
influence conducts through each terrain type. We can assume that mountainous
terrain conducts like styrofoam, plains like aluminium, roads like gold and
forest like steel (Jason Kester [4]). The heat conduction is analogous to the
amount of influence that spreads to the neighbours.
The use of influence maps is becoming more popular in the gaming industry.
Mat Buckland describes influence maps as a simple array of data, where each
element represents data about a specific world position; influence maps can be
regarded as a 2D grid overlaid on the world [5]. Influence maps are mostly used
in real-time strategy (RTS) games, where they represent the areas of various
agents. For instance, players have positive influence over energy sources and
negative influence over kill zones. Proper and efficient decision making is
possible when AI techniques query the influence maps [5].
There are some problems with using influence maps. The first is that the
calculation of influences is expensive. The cost varies with the type of
application: some applications need to update the whole map most of the time,
whereas others only need to update some regions [5].
An advantage of influence maps is that an AI entity can query them for
information about a given position in constant time. This is particularly
important in RTS games, since it greatly reduces the cost of calculations such
as finding a safe way out. The simple and easy visualization of influences is
another advantage of influence maps.
In the influence map model, the various agents in the game maze, such as
normal pills, power pills and ghosts, each radiate an influence over a certain
space, and the sum of all these influences gives the influence map of that space.
It is thus a property of the game world that indicates whether a region is
favourable or unfavourable to move into: if the computed value is positive, it is
favourable to move into that region, otherwise it is not. In Ms. Pac-Man, the
normal pills, power pills, edible ghosts, junctions and teleports emit positive
influences, whereas inedible ghosts emit negative influences.
Early attempts to construct Pac-Man controllers include the work of Koza,
Gallagher and Ryan. Koza used a genetic programming based approach as a
remedy to the deterministic behaviour of the ghosts' movement in Pac-Man.
Gallagher and Ryan applied a weighted set of rules that depends on the results
of the previous set of games.
Robles and Lucas applied tree-based search and obtained a very good result in
terms of score by reaching a depth of forty moves ahead in the search. This
work was carried further by Samothrakis et al., who used Monte Carlo tree
search with a certain tree depth, allowing better paths to be found than before.
In 2008, Wirth and Gallagher presented a solution based on influence maps.
Their model comprised three main parameters that have an intuitive relation
with the behaviour of the agents. They used a greedy algorithm together with
random, systematic and global exploration methods to show experimental
results exploring the model's performance over its parameter space [18]. The
model of Wirth and Gallagher leads to a three-dimensional optimization
problem. Their greedy algorithm took steps in random directions in the
parameter space, with the step size slowly reduced over the course of the search.
They used the Euclidean distance measure in their proposed equation, and
constructed the influence map model taking account only of dots, ghosts and
edible ghosts, but not the fruits or other attributes of the maze such as
junctions and lairs.
Although Wirth and Gallagher describe their approach as an influence map,
Svensson, in "Influence-map based controllers for Ms. Pac-Man and the Ghosts",
argues that it is in fact a potential field. Johan Svensson and Stefan J.
Johansson implemented influence map based controllers both for Ms. Pac-Man
and for the ghosts, using the A* distance measure for calculating the
influence [13].
Svensson claimed that Wirth and Gallagher did not implement an influence map
but rather a potential field, and that his own implementation is an influence
map. Although there is a slight difference between a potential field and an
influence map, both can be represented by the same equation. The major
differences between Wirth and Gallagher and Svensson are that they used
different Pac-Man controllers and different distance measures (Wirth and
Gallagher used Euclidean, Svensson used A*), and different algorithms to find
the optimal parameter space. These distances have not been compared in the
same environment with the same controller and the same algorithm. Since the
differences are small, the same equation is used for both a potential field and
an influence map, and the influence map is the more common term in the
gaming field, in the rest of this thesis we refer to both potential fields and
influence maps as influence maps. We also use a single equation that yields a
potential field or an influence map depending on the distance measure used.
Chapter 2
Problem Description and Statement
The three distance measures Euclidean, Manhattan and A* have not been
compared
1. Together.
The Ms. Pac-Man maze has a grid structure, i.e. the travel path used by the
ghosts and Pac-Man is either a straight line, a right angle or a combination of
right angles, so it is relevant to compare the Euclidean and Manhattan
distances. The three-way comparison is important because the distance measure
is not the only factor affecting the highest score; in addition to the distance
measure, various strategies can be applied in the Ms. Pac-Man game that will
yield a higher score, i.e.
one can implement the Euclidean distance measure with a good gaming
strategy and yield a higher score than an A* implementation.
3. Which distance measure among A*, Euclidean and Manhattan will provide
the highest score when run with the same algorithm, controller and environment?
For the literature review, studies related to Ms. Pac-Man, artificial intelligence
and finally path-finding solutions such as potential fields and influence maps
were considered.
We performed the search in six different stages, using a different set of keywords
each time. The initial keyword was Pacman, which yielded many research
papers. The search was then narrowed down by focusing on the advanced
version of Pac-Man, i.e. Ms. Pac-Man. After this, the search was refined by
focusing on one of the modern approaches, artificial intelligence. Similarly, we
targeted our search at potential fields and influence maps, since these two
terminologies are the focus areas of our research. The table shows the results
and the list of keywords used for searching the different databases. The
databases used were IEEE, Inspec and ACM.
The literature review yielded 271 journal and conference papers. These results
were filtered by an exclusion criterion to contain only full-text papers, which
left 157 results. This set was then refined to articles published after 1999, i.e.
within the last 15 years, which reduced it to 70 results. The titles and abstracts
of all 70 papers were reviewed and only the 15 papers that seemed relevant to
our thesis were selected. Most papers were excluded because they focused on
decision tree search algorithms; instead, papers dealing with hill-climbing
algorithms, potential fields and influence maps were chosen. Similarly, papers
implementing neural networks or genetic programming were not selected, since
a different artificial intelligence approach is used for calculating potential fields
and influence maps in our thesis. The references of the selected papers were
also taken into consideration while selecting papers.
2.4.2 Methodology
For RQ1 and RQ2, we will implement a mix of qualitative and quantitative
approaches. Firstly, we conduct our review through various types of literature
such
For RQ3, we will use a quantitative approach. The influence model design will
be used with various distance metric algorithms and implemented in Ms.
Pac-Man to play with the controller. The results generated from the game play
will be used to analyse and show performance. Performance is shown by
indicating how efficient each distance measure is, using the mean score from
the games.
2.4.3 Outcomes
The outcome we expect to achieve is a complete game that can be played with
the controller, together with graphs showing the performance of the algorithms
used in Ms. Pac-Man. The outcome includes individual graphs for each
distance measure, the distribution of the weights used for ghosts, edible ghosts,
power pills and normal pills, and frequency and density graphs. Finally, tabular
results include numeric values such as scores and optimal points gathered from
the games. These collected values will be used for statistical analysis of the
results.
Chapter 3
Ms. Pac-Man Implementation
Calculating this type of radiation from many objects is resource intensive, since
the calculation would have to be made in every tile. In our design, we use a
simpler and more efficient calculation in which we do not account for all the
influences made along a pathway back to the origin of an object, and we do not
let influences fade away completely. Instead, we calculate the influence of
objects only up to the tiles adjacent to Ms. Pac-Man's look-ahead positions.
power pills, normal pills, ghosts, edible ghosts and lairs. $w_O$ is the strength of
the influence of the object $O$, $lap_i$ is the look-ahead position of an adjacent
neighbour of Ms. Pac-Man, where $1 \leq i \leq 4$, and $d_m$ is the distance measure
between $lap_i$ and $O$, where $m$ is one of the distance measures:

$$m \in M = \{Eu, M, A^*\}$$

$Eu$, $M$ and $A^*$ are the Euclidean, Manhattan and A* distance measures
respectively; $M$ is the set of distance measures and $m$ is one of its members.
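The three distance measures can be illustrated with a minimal Python sketch, under the assumption of a small grid maze. Since every step on a uniform grid has equal cost, a plain breadth-first search returns the same maze distance that A* with an admissible heuristic would, so it stands in for the A* measure here:

```python
import math
from collections import deque

def euclidean(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar_distance(a, b, walls, size):
    """Shortest path length that respects walls. On a uniform-cost grid
    a breadth-first search returns the same distance A* would."""
    frontier, seen = deque([(a, 0)]), {a}
    while frontier:
        (r, c), d = frontier.popleft()
        if (r, c) == b:
            return d
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in walls and nxt not in seen):
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return math.inf

# A wall between (0,0) and (0,2) forces a detour: the maze-aware
# distance grows, while Euclidean and Manhattan ignore the wall.
walls = {(0, 1), (1, 1)}
print(euclidean((0, 0), (0, 2)))                 # 2.0
print(manhattan((0, 0), (0, 2)))                 # 2
print(astar_distance((0, 0), (0, 2), walls, 3))  # 6
```

This illustrates the terrain sensitivity discussed in Chapter 1: only the path-based measure is blocked by walls.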
1. Positive influence: objects which attract the agent. These objects radiate a
positive value in the maze.
2. Negative influence: objects which repel the agent. These objects radiate a
negative value.
We further divide these categorised objects into fields, where we discuss how
each individual object exerts its influence. The actual influence map functions
derived from the generic function are:
$$I_{foEg}(Eg, w_{Eg}, lap_i) = \max_{lap_i \in LAP} \sum_{k=1}^{4} \frac{w_{Eg}}{d_m(lap_i, Eg_k)}$$

where $I_{foEg}(Eg, w_{Eg}, lap_i)$ is the field of the edible ghosts, $Eg$ is the edible
ghost (the maximum number of edible ghosts is four) and $w_{Eg}$ is the weight of
the edible ghosts.
The look-ahead position $lap$ with the highest value of $I_m(lap)$ will be chosen
as the direction for Ms. Pac-Man.
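A minimal sketch of the edible-ghost field and the direction choice might look as follows. The weights, ghost positions and look-ahead positions are hypothetical, and only the edible-ghost term of the full influence sum is shown:

```python
def edible_ghost_field(ghosts, w_eg, lap, dist):
    """Sum of w_Eg / d_m(lap, Eg_k) over the (up to four) edible ghosts."""
    return sum(w_eg / dist(lap, g) for g in ghosts)

def choose_direction(lookaheads, ghosts, w_eg, dist):
    """Pick the look-ahead position with the highest influence."""
    return max(lookaheads,
               key=lambda lap: edible_ghost_field(ghosts, w_eg, lap, dist))

manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
ghosts = [(5, 5), (1, 9)]           # hypothetical edible-ghost tiles
laps = [(2, 2), (4, 4), (2, 8)]     # hypothetical look-ahead positions
print(choose_direction(laps, ghosts, 100, manhattan))  # (2, 8)
```

A full controller would add the analogous fields for normal pills, power pills, junctions, teleports and (negatively) inedible ghosts before taking the maximum.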
The algorithm starts by initializing random values for all five parameters. The
range of random values for ghosts, edible ghosts and power pills is between 1
and 400. The range of random values for normal pills is between 1 and 40; the
value for normal pills is not allowed above 40 because a high value for normal
pills leads to a local minimum problem. The value for junctions is fixed and set
to 2.
The proposed algorithm starts with one of the random values, decrementing it
as long as the score improves. Once the score no longer improves, the algorithm
starts incrementing the random value as long as the score improves. When the
value has been both incremented and decremented, the algorithm selects
between the incremented and decremented value based on the highest score.
This process is repeated for the rest of the random values. The final result is a
selection of five optimized values that yielded the highest score. This whole
process is referred to as a single iteration. The algorithm then moves to a
second iteration, which starts not from random values but from the optimized
values of the previous iteration.
The algorithm completes an iteration when all five values have been both
incremented and decremented. The first iteration starts with random values,
whereas each consecutive iteration starts with the optimized values from the
previous iteration. The iterations stop when the score can no longer be
improved. The result is the final set of optimized values that yielded the highest
score among all iterations. Figure 3.2 shows an example of the selection of an
optimal value based on the highest score: in the example, 27 is selected as the
optimal value. This process is repeated for all the sets of values.
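The increment/decrement procedure above can be sketched as coordinate-wise hill climbing. The `score` function below is a stub with a known optimum standing in for an actual Ms. Pac-Man game run, and the target values are hypothetical; parameter ranges follow the text:

```python
import random

def optimize(weights, score, max_iters=50):
    """Coordinate-wise hill climbing as described above: for each weight,
    step down while the score improves, then step up while it improves,
    keeping the best value found. Repeat over all weights until an
    iteration brings no improvement."""
    best = score(weights)
    for _ in range(max_iters):
        improved = False
        for i in range(len(weights)):
            for step in (-1, +1):        # decrement first, then increment
                while True:
                    trial = list(weights)
                    trial[i] += step
                    s = score(trial)
                    if s > best:
                        weights, best, improved = trial, s, True
                    else:
                        break
        if not improved:
            break
    return weights, best

# Stub score with a known optimum (a real run would play Ms. Pac-Man).
target = [200, 150, 300, 20, 2]
score = lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))
start = [random.randint(1, 400) for _ in range(3)] + [random.randint(1, 40), 2]
w, s = optimize(start, score)
print(w, s)   # converges to target, score 0
```

On a real game the score landscape is noisy and multi-modal, which is why the thesis restarts the procedure from 2000 random sets of weights.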
2000 random sets of weights will be processed through the algorithm, yielding
2000 final optimized sets of values.
4.2 Motivation
A quantitative approach is known for fetching quantitative data from controlled
experiments, which enables comparison and statistical analysis [19].
2. Hypothesis Formulation
To confirm that our proposed algorithm works better than random parameters
for finding optimal values, we have constructed the hypothesis:
"Random parameters are more efficient than the proposed algorithm in finding
optimal values."
(a) Hypothesis I
The hypothesis test compares the scores obtained from the optimization
algorithm with the scores obtained from the random parameter space in
Ms. Pac-Man. Our sample sizes are equal, whereas the variances are
assumed to be unequal. Therefore, we choose Welch's t-test, which is
intended for use with two samples of equal or unequal size with possibly
unequal variances.
Null hypothesis $H_0$:

$$\mu_{Random} \geq \mu_{Optimized}$$

$$t = \frac{\bar{X}_R - \bar{X}_O}{\sqrt{\dfrac{S_R^2}{N_R} + \dfrac{S_O^2}{N_O}}}$$

$$\text{Degrees of freedom } (f) = \frac{\left(\dfrac{S_R^2}{N_R} + \dfrac{S_O^2}{N_O}\right)^2}{\dfrac{S_R^4}{N_R^2(N_R-1)} + \dfrac{S_O^4}{N_O^2(N_O-1)}}$$
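Welch's t statistic and degrees of freedom can be computed from summary statistics alone. The sketch below uses the A* sample values reported later in the thesis (means 7594 and 10889, standard deviations 2809.3 and 1300.4, n = 2000 each) and reproduces t ≈ −47.6 and df ≈ 2818, matching Table 5.1:

```python
import math

def welch(mean_r, s_r, n_r, mean_o, s_o, n_o):
    """Welch's t statistic and degrees of freedom from summary statistics."""
    vr, vo = s_r ** 2 / n_r, s_o ** 2 / n_o
    t = (mean_r - mean_o) / math.sqrt(vr + vo)
    df = (vr + vo) ** 2 / (vr ** 2 / (n_r - 1) + vo ** 2 / (n_o - 1))
    return t, df

# Random vs. optimized scores for A*; H0 is rejected when |t| exceeds
# the critical value (1.960 at alpha = 0.05 for large df).
t, df = welch(7594, 2809.3, 2000, 10889, 1300.4, 2000)
print(round(t, 4), round(df, 3))
```

Note the degrees-of-freedom expression is algebraically the same as in the formula above, with $S^2/N$ factored out as the per-sample variance of the mean.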
Dependent variables: the values for normal pills, power pills, ghosts and edible
ghosts.
Independent variable: the junction value, which is fixed for each iteration.
2. Guidelines: As the controller plays the game by itself, we only need to verify
that the software does not abort abnormally. Data is collected and appended
on every run.
Altogether, three different experiments are conducted, one each for the A*,
Euclidean and Manhattan distance measures, with 10,000 game trials run for
each individual experiment.
The log parameters recorded during these experiments are the score, the
number of normal pills eaten, the number of ghosts eaten and the number of
power pills eaten; these recorded parameters belong to each individual game.
Chapter 5
Result and Analysis
Figures 5.1 (a) and 5.1 (b) were obtained using the A* distance measure.
Figure 5.1 (a) shows the unsorted highest scores obtained when applying the
algorithm to every set of random weights in the search space: the X axis is the
number of games, and the Y axis is the final high score obtained after applying
the algorithm. The maximum score in the graph is 16,934, whereas the
minimum score obtained is 2,071. Figure 5.1 (b) shows the frequency of the
highest average scores, grouped into intervals of 1,000.
The suitability of the algorithm is measured by four factors: score optimization,
the number of iterations used, the distribution of the weights used, and a
comparison with an equal sample using random sets of values. These four
factors are described in detail below.
Figure 5.2 (a) shows the optimization of scores when the algorithm is applied.
The optimization of a score is the difference between the first score obtained
from a set of random weights and the final highest score obtained using the
proposed algorithm. The collected optimized scores include results from both
incremented and decremented weights. The algorithm is able to optimize a
score by up to 8,000 points.
Figure 5.2 (b) shows the frequency of the optimized scores, grouped into
intervals of 500. The average improvement the algorithm achieves is between
2,000 and 4,000 points. A value of 0 means the algorithm was not able to
improve on the initial score obtained from the random weights; this happened
for 129 out of the 2,000 random sets of weights, i.e. 6.45% of the cases.
Figure 5.3 shows the number of sets of values used in each iteration; Chapter
3.5 describes the iterations in detail. The algorithm starts with random sets of
values and moves on to further iterations while there is still a possibility to
optimize. The algorithm always proceeds from iteration 1 to iteration 2.
The histogram shows that 1,300 random sets of values reached iteration 2 but
were unable to move to iteration 3 (iteration 1 always proceeds to iteration 2).
336 random sets of values reached iteration 3, 289 reached iteration 4, 58
reached iteration 5, 12 reached iteration 6, 2 reached iteration 7 and 3 reached
iteration 8.
In every iteration, the sets of values that score better than in the previous
iteration are used in the next iteration. Figure 5.3 shows that 35% of the
random sets of values passed beyond iteration 2. The maximum iteration the
algorithm reached is 8.
Figure 5.4: Set of weights between high scores 10,000 and 12,000 using A*
From Figure 5.1 (b) it is clear that, after applying the algorithm, scores most
frequently fall between 10,000 and 12,000. Therefore, the weights that yielded
scores between 10,000 and 12,000 are used to plot Figure 5.4 to examine the
distribution of weights. Figure 5.4 shows that the weights are randomly
distributed within the range 0 to 400, which shows that the algorithm is able
to explore the whole parameter space.
Figure 5.5: Set of weights greater than high score 12,000 using A*
Figure 5.5 is plotted with the weights that yielded scores greater than 12,000.
Here too the weights are found to be randomly distributed, so no clustered
region of optimal weights can be identified. The result was similar to the above
Figure 5.6: Count of games played in algorithm to collect high score using A*
2,000 random sets of values are used to obtain 2,000 final optimized sets of
values. These 2,000 final optimized sets are the highest scores obtained from
differing numbers of games.
The algorithm starts with four random weights, with the weight for the
junction fixed. Although the junction weight is fixed, meaning it is not given a
random weight like the other four parameters, its value is still incremented and
decremented. So an initial game is played with the random weights, and then
the weights are increased and decreased in iterations 1 and 2.
In iteration 1 all five weights are incremented and decremented at least once,
i.e. 10 games are played when the five weights are each incremented and
decremented once. Likewise, in iteration 2 at least 10 games are played. The
algorithm proceeds to iteration 2 even if the random weights in iteration 1
yield the highest score.
Thus at least 21 games are played (the initial game plus 10 games in iteration 1
plus 10 games in iteration 2).
In Figure 5.6, the X axis shows the number of games played and the Y axis
shows the number of optimized scores. For example, X = 21 and Y = 129
means that 129 optimized scores were collected where each optimized score is
the highest from 21 games; X = 22 and Y = 382 means that 382 optimized
scores were collected where each optimized score is the highest from 22 games.
In the algorithm the weights are always increased or decreased by 1; what if,
instead, random sets of weights were drawn every time? To compare the
algorithm with purely random sets of weights, the data from Figure 5.6 is used.
The result shown in Figure 5.6 is used to collect the same number of high
scores with an equal number of games using only random sets of weights. For
example, for X = 21 and Y = 129, 21 games are played with 21 random sets
and the highest score is recorded; this process is repeated until 129 highest
scores are collected. For X = 22 and Y = 382, 22 games are played with 22
random sets and the highest score is recorded; this process is repeated until
382 highest scores are collected.
In this way, games with only random sets are played following the same
distribution as described above. The result of this comparison is shown in
Figure 5.7.
Figure 5.7 shows the high scores obtained with only random sets of weights for
A*, using a distribution of games similar to that of Figure 5.6. In Figure 5.7,
the X axis shows the score obtained and the Y axis shows the frequency.
Comparing Figure 5.7 (highest scores obtained with random sets) with Figure
5.1 (b) (highest scores obtained by applying the algorithm) shows that the
algorithm performs better than using only random sets.
$$\text{Degrees of freedom } (f) = \frac{\left(\dfrac{2809.3^2}{2000} + \dfrac{1300.4^2}{2000}\right)^2}{\dfrac{2809.3^4}{2000^2(2000-1)} + \dfrac{1300.4^4}{2000^2(2000-1)}} = 2818.041$$

With $t_{\alpha,df} = 1.960$: $|-47.5985| > 1.960$, and hence $H_0$ is rejected.
$$\text{Degrees of freedom } (f) = \frac{\left(\dfrac{2047.1^2}{2000} + \dfrac{1193.2^2}{2000}\right)^2}{\dfrac{2047.1^4}{2000^2(2000-1)} + \dfrac{1193.2^4}{2000^2(2000-1)}} = 3216.73$$

With $t_{\alpha,df} = 1.960$: $|-32.26925| > 1.960$, and hence $H_0$ is rejected.
$$\text{Degrees of freedom } (f) = \frac{\left(\dfrac{3052.3^2}{2000} + \dfrac{1724.1^2}{2000}\right)^2}{\dfrac{3052.3^4}{2000^2(2000-1)} + \dfrac{1724.1^4}{2000^2(2000-1)}} = 3156.74$$

With $t_{\alpha,df} = 1.960$: $|-21.60468| > 1.960$, and hence $H_0$ is rejected.
The critical t value 1.960 is chosen from the t-distribution table for degrees of
freedom (df) larger than 2,000 at significance level 0.05.
The Bonferroni correction sets the significance cut-off at α/n, where α is the
significance level and n is the number of tests. For the above three tests with
significance level 0.05, the Bonferroni-corrected cut-off is 0.05/3 ≈ 0.0167. The
p-value of each individual hypothesis test is less than 0.00001, i.e. lower than
0.0167, and therefore the null hypothesis is rejected in each case.
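The Bonferroni check described above amounts to a one-line comparison; the p-values below follow the per-test bound reported in the text:

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each test whose p-value falls below alpha / n."""
    cutoff = alpha / len(p_values)
    return [p < cutoff for p in p_values]

# Each of the three per-test p-values was below 0.00001, well under
# the corrected cut-off 0.05 / 3.
print(bonferroni_reject([0.00001, 0.00001, 0.00001]))  # [True, True, True]
```

The correction guards against inflating the family-wise error rate when several hypotheses are tested at once.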
After collecting the experimental data, descriptive statistics are used to
describe and present the data graphically. Figures 5.9 and 5.10 show that the
average score of A* is higher than those of the Euclidean and Manhattan
distances. The densities of the average scores of Euclidean and Manhattan are
quite similar, whereas the average score of A* is roughly twice as high. The
mean scores from the experiment are 10,790, 7,469 and 6,108 for A*, Euclidean
and Manhattan respectively. The highest scores obtained from the experiment
are 30,930, 22,480 and 20,260 for A*, Euclidean and Manhattan respectively.
5.3 Conclusion
For RQ1, Section 5.1.1 and Figures 5.2 (a) and (b) show the improvement in
score, where the highest improvement was 8,000 points. Moreover, the
algorithm was able to improve 93.55% of the random parameter sets; only
6.45% could not be improved.
For RQ2, Section 5.1.4 and the Welch t-tests show that the algorithm performs
better than the random parameter space.
For RQ3, Experiment 2 clearly shows that A* provided the highest score,
30,930, compared to Euclidean and Manhattan. Moreover, the mean score of
A* is 10,790, which is greater than the mean scores of 7,469 and 6,108 for
Euclidean and Manhattan. We conclude that A* provides the highest score
when run with the same algorithm, controller and environment.
Chapter 6
Discussion and Validity Threats
6.1 Discussion
There are many algorithms we could have chosen for Experiment 1, such as hill
climbing, greedy algorithms or weighted algorithms. Our requirement was an
algorithm that could explore a huge search space under time constraints, so we
made an effort to introduce an algorithm that does so.
The results of Experiment 1 show that the proposed algorithm is suitable
for finding optimal points. The rise in score in Figure 5.2 clearly shows that
the algorithm was able to optimize the random points. However, we were not
satisfied, because the samples of random points and optimized points come from
different distributions. To make a fair comparison, we collected random points
from the same distribution, based on the number of games played in each iteration.
Figure 5.6 shows the scores collected together with the number of parameter
settings visited: 400 games were run, where each game visited at least 21 different
parameter settings, and the highest score among them is taken. The number of
parameter settings visited ranges from 21 to 62. Using this frequency table,
random games were run to obtain data from a similar distribution. The result in
Figure 5.7 shows that, even with samples from the same distribution, the algorithm
still performs better than random parameters. Looking deeper into the result,
the algorithm has optimized some scores to above 12,000.
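The "same distribution" sampling step described above can be sketched as follows. The frequency table here is invented for illustration; the thesis used the empirical table of parameter settings visited (from 21 to 62) collected from the 400 optimized games:

```python
import random

def sample_visit_counts(frequency, n_games, rng=random):
    """Draw a 'number of parameter settings visited' for each random
    game from the empirical frequency table of the optimized runs,
    so that both samples come from the same distribution."""
    counts = list(frequency)                  # the observed visit counts
    weights = [frequency[c] for c in counts]  # how often each count occurred
    return rng.choices(counts, weights=weights, k=n_games)

# Invented example table: visit count -> number of optimized games with it
table = {21: 250, 30: 100, 45: 40, 62: 10}
draws = sample_visit_counts(table, 400)
```

Each random game is then run with the drawn number of parameter settings, which removes the distribution mismatch between the two samples.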
This has been noted, and future research is suggested to implement a superior
version of the algorithm and compare it with recent results. Moreover, future
research also includes incrementing parameters by a percentage and using floating-point
values.
Every random value goes through iteration 2, and the games show that only 35%
of the values passed beyond iteration 2. The average scores obtained from the
random and the optimized algorithm are 7594 and 10889 respectively, where the
average increase in score is 30.25%. To confirm this further, we ran Welch t-tests,
which reject our null hypothesis and support our alternative hypothesis. This
strongly suggests that the optimization algorithm is better than random parameter
sampling. We performed three separate t-tests for the A*, Manhattan and Euclidean
distance measures, and all of them reject the null hypothesis in favour of the
alternative.
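Welch's t-test is appropriate here because it does not assume equal variances in the two score samples. A minimal standard-library sketch (the score samples are invented for illustration, not the thesis data):

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two
    independent samples with possibly unequal variances."""
    n1, n2 = len(a), len(b)
    se1, se2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Invented samples standing in for optimized vs. random scores
optimized = [10500, 11200, 10900, 11000, 10800]
random_scores = [7400, 7700, 7500, 7800, 7600]
t, df = welch_t(optimized, random_scores)  # large positive t
```

A large positive t against the Welch degrees of freedom yields a very small p-value, rejecting the null hypothesis that both samples share the same mean.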
In Experiment 2, the mean scores of A*, Euclidean and Manhattan over 10,000
runs are 10790, 7469 and 6108 respectively. The mean values suggest that the
performance of Euclidean and Manhattan is close. A*, on the other hand, performed
far better than Euclidean and Manhattan. The density graph clearly shows that
A* is the most effective of the three.
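The difference between the three distance measures can be illustrated with a small sketch (the maze and coordinates are invented for illustration; the thesis framework itself is not used here). Euclidean and Manhattan distance ignore the maze walls, whereas the path-based distance, which A* computes, does not; on an unweighted grid, a breadth-first search yields the same shortest-path length that A* with an admissible heuristic would find:

```python
import math
from collections import deque

def euclidean(p, q):
    """Straight-line distance; ignores the maze walls."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """Axis-aligned grid distance; also ignores the maze walls."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def maze_distance(grid, start, goal):
    """True shortest-path length around walls ('#'), the quantity
    A* computes on the Ms. Pac-Man graph."""
    rows, cols = len(grid), len(grid[0])
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        (r, c), d = queue.popleft()
        if (r, c) == goal:
            return d
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), d + 1))
    return math.inf  # goal unreachable

# A wall between two cells makes the path distance exceed Manhattan:
grid = ["..#..",
        "..#..",
        "....."]
```

Here `manhattan((0, 0), (0, 4))` is 4, but `maze_distance(grid, (0, 0), (0, 4))` is 8 because the path must go around the wall, which is why the path-based measure spreads influence more faithfully through the maze.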
Chapter 8
Future Work
Since the efficiency of the distance measures used in influence maps and potential
fields has been identified in this work, the proposed future work deals with
improving the algorithm used. The algorithm can be improved by using other
non-probability sampling techniques, or by incrementing parameters by a percentage
instead of a fixed increment of 1 during optimization. Future research could
compare the algorithm with other algorithms, such as an advanced hill-climbing
algorithm, in Ms. Pac-Man. The algorithm can also be applied in other environments
to verify how well it evolves optimal parameters. A similar experiment can be run
with an influence-map implementation for the ghosts. The solutions proposed in
the discussion can be worked on to further improve the algorithm.
References
[3] L.L. DeLooze and W.R. Viner. Fuzzy Q-learning in a nondeterministic environment:
developing an intelligent Ms. Pac-Man agent. In IEEE Symposium
on Computational Intelligence and Games, 2009. CIG 2009, pages 162–169,
September 2009.
[4] P.K. Egbert and S.H. Winkler. Collision-free object movement using vector
fields. IEEE Computer Graphics and Applications, 16(4):18–24, July 1996.
[6] Johan Hagelbäck and Stefan J. Johansson. Using multi-agent potential fields
in real-time strategy games. In Proceedings of the 7th International Joint
Conference on Autonomous Agents and Multiagent Systems - Volume 2, AA-
MAS ’08, pages 631–638, Richland, SC, 2008. International Foundation for
Autonomous Agents and Multiagent Systems.
[8] Stefan J. Johansson. A survey of the use of artificial potential fields and
influence maps in game AI research. 2013.
[10] P. Rohlfshagen and S.M. Lucas. Ms Pac-Man versus Ghost Team CEC 2011
competition. In 2011 IEEE Congress on Evolutionary Computation (CEC),
pages 70–77, June 2011.
[13] J. Svensson and S.J. Johansson. Influence map-based controllers for Ms.
PacMan and the ghosts. In 2012 IEEE Conference on Computational Intelligence
and Games (CIG), pages 257–264, September 2012.
[14] G. Synnaeve and P. Bessiere. A Bayesian model for RTS units control applied
to StarCraft. In 2011 IEEE Conference on Computational Intelligence and
Games (CIG), pages 190–196, September 2011.
[15] Tse Guan Tan, J. Teo, and P. Anthony. Uniform versus Gaussian mutators in
automatic generation of game AI in Ms. Pac-Man using hill-climbing. In 2010
International Conference on Information Retrieval Knowledge Management
(CAMP), pages 282–286, March 2010.
[18] N. Wirth and M. Gallagher. An influence map model for playing Ms. Pac-Man.
In Computational Intelligence and Games, 2008. CIG ’08. IEEE Symposium
On, pages 228–233, December 2008.
[19] Claes Wohlin, Per Runeson, Martin Höst, Magnus C. Ohlsson, Björn Regnell,
and Anders Wesslén. Experiment Process. In Experimentation in Software
Engineering, number 6 in The Kluwer International Series in Software Engineering,
pages 31–39. Springer US, January 2000.
[20] Albert L. Zobrist. A model of visual organization for the game of GO. In
Proceedings of the May 14–16, 1969, Spring Joint Computer Conference, AFIPS
’69 (Spring), pages 103–112, New York, NY, USA, 1969. ACM.
Appendix A
Appendix
Iteration 1
Start (random set):  52 111 88 2 2
Score   Parameters
6586    52 111 88 2 2
8737    51 111 88 2 2   <- highest (optimized value)
8597    50 111 88 2 2
5391    53 111 88 2 2
7535    51 110 88 2 2
6375    51 112 88 2 2
7700    51 111 87 2 2
7515    51 111 89 2 2
6070    51 111 88 1 2
7639    51 111 88 3 2

Iteration 2
Start (optimized value from Iteration 1):  51 111 88 2 2
Score   Parameters
8247    50 111 88 2 2
8634    49 111 88 2 2
8125    48 111 88 2 2
9137    52 111 88 2 2
6281    53 111 88 2 2
7658    52 110 88 2 2
7172    52 112 88 2 2
8820    52 111 87 2 2
6111    52 111 89 2 2
6372    52 111 88 1 2
9634    52 111 88 3 2   <- highest (optimized value)
7841    52 111 88 4 2

Iteration 3
Start (optimized value from Iteration 2):  52 111 88 3 2
Score   Parameters
6763    51 111 88 3 2
9087    53 111 88 3 2
6097    54 111 88 3 2
9574    53 110 88 3 2
8580    53 109 88 3 2
8562    53 112 88 3 2
6772    53 110 87 3 2
6657    53 110 89 3 2
7438    53 110 88 2 2
7837    53 110 88 4 2
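The iteration logs above can be read as a hill-climbing loop: in each iteration, +/-1 changes to the parameters of the current best setting are evaluated, the highest-scoring neighbour is kept, and it becomes the start of the next iteration. A simplified sketch of that reading (the `score` function stands in for running Ms. Pac-Man games with the given parameters; a toy quadratic function is used below):

```python
def optimize(score, params, iterations=3):
    """Simplified hill-climbing sketch of the iteration logs: try
    +/-1 changes to each parameter of the current best setting, keep
    the highest-scoring candidate, and carry it into the next
    iteration."""
    best = list(params)
    best_score = score(best)
    for _ in range(iterations):
        for i in range(len(best)):
            for delta in (-1, +1):
                cand = list(best)
                cand[i] += delta
                s = score(cand)          # in the thesis: play games
                if s > best_score:
                    best, best_score = cand, s
    return best, best_score

# Toy objective with a known optimum at (5, 3):
toy = lambda p: -(p[0] - 5) ** 2 - (p[1] - 3) ** 2
best, s = optimize(toy, [0, 0], iterations=10)
```

In the real experiments the score of a parameter setting is noisy, which is why each setting is evaluated over many games; the sketch above omits that averaging step.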
A maze is the layout of the game, which consists of ghosts, a lair, Pac-Man,
normal pills, power pills, fruits, junctions and teleports [11]. All these
components are described in detail below.
A.1.1 Ghosts
Ghosts hinder Pac-Man from eating the normal pills, power pills and fruits. When
all the normal pills are eaten, the game advances to the next level with a different
maze. As soon as Pac-Man eats a power pill, the ghosts become edible for
a short period of time, and eating those ghosts in succession accumulates points
in an exponential manner. Pac-Man earns extra points by eating fruits that appear
randomly in the maze. Pac-Man loses one of her three lives as soon as she
collides with an inedible ghost. When all three lives are lost, the
game ends. Ms. Pac-Man is the advanced version of Pac-Man. There are four
different types of ghosts in Ms. Pac-Man: Blinky (red), Pinky (pink),
Inky (cyan) and Sue (orange) [11].
A.1.2 Lair
At the start of each level, the ghosts are situated in the middle of the maze,
called the lair. The idle time before the ghosts leave the lair and start chasing
Pac-Man decreases with increasing levels.
A.1.3 Pac-Man
Pac-Man is the main agent in the maze, which can move up, down, left and right
to eat normal pills, power pills, fruits and edible ghosts to score points. Pac-Man
has three lives, and the game ends when all three lives are lost. When Pac-Man
scores 10000 points, she gets an extra life.
A.1.6 Fruits
Fruits appear randomly in a maze during a game and Pac-Man receives extra
points by eating these fruits.
A.1.7 Junctions
Junctions are places in the maze with three or four directions to choose from.
They are the places in the maze where Pac-Man and the ghosts can change
direction.
A.1.8 Teleports
Teleports of a maze are special junctions which are connected to the opposite side
of the maze.
Appendix A. Appendix 51
There are four different types of mazes, which are shown in Fig 1.2. The game
starts with maze A and moves successively to B, C and D as the level of the game
increases.
Ms. Pac-Man is the successor to Pac-Man. The two games are very similar, yet
they have a number of minor differences. In particular, the ghosts in Pac-Man
behave deterministically, whereas in Ms. Pac-Man the ghosts' movement has an
element of pseudo-randomness, making it impossible to learn fixed paths and leading
to a more challenging game. Ms. Pac-Man also features additional maze layouts
compared to Pac-Man [18].
Ms. Pac-Man scores 50 points as soon as she eats a power pill. Eating
ghosts in succession after eating a power pill doubles the score each
time. So, the maximum score that can be achieved from one power pill is 3050
(= 50 + 200 + 400 + 800 + 1600). If another power pill is eaten during the edible
stage of the ghosts, the ghost value is reset to 200 [11].
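The scoring chain can be checked with a short sketch (the function name is ours; the point values follow the description above):

```python
def power_pill_points(ghosts_eaten):
    """Points from one power pill: 50 for the pill itself, then
    200, 400, 800, 1600 for each of up to four ghosts eaten in
    succession while the ghosts remain edible."""
    points, ghost_value = 50, 200
    for _ in range(min(ghosts_eaten, 4)):
        points += ghost_value
        ghost_value *= 2   # each successive ghost doubles in value
    return points
```

Eating all four ghosts after one power pill yields `power_pill_points(4) == 3050`, matching the maximum stated above.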
The game starts in maze A with Ms. Pac-Man's three lives and the ghosts ready
to come out of the lair. Ms. Pac-Man starts to eat normal pills, power pills and
randomly located fruits, and the ghosts start to chase her. Once any of the power
pills is eaten, the ghosts become edible for a short period of time.
Ms. Pac-Man is one of the games included in yearly competition tracks such
as CIG [9]. Competitions for implementing Ms. Pac-Man controllers have been
held every year since 2007 [9, 16].