An Analysis of Play Style of Advanced Mahjong Players Toward The Implementation of Strong AI Player
To cite this article: Hiroshi Sato, Tomohiro Shirakawa, Akitoshi Hagihara & Kento Maeda (2017)
An analysis of play style of advanced mahjong players toward the implementation of strong AI
player, International Journal of Parallel, Emergent and Distributed Systems, 32:2, 195-205, DOI:
10.1080/17445760.2015.1049267
RESEARCH ARTICLE
1. Introduction
The research on artificial intelligence (AI) for game play has a long history (10). Among these, games with perfect information have been studied exhaustively, and excellent progress has been made in this area. For example, in chess, computer programs have been considered stronger than humans since Deep Blue, created by IBM researchers, defeated the world champion in 1997 (4). In Japanese chess, called Shogi, the situation has become quite similar: many AI programs now play at a level comparable to top-ranked human players (11).
On the other hand, human players are still stronger than computer programs in games with imperfect information, such as table games (8) or video games (13). In this type of game, it is usually very difficult to evaluate the game situation because a player can use only limited information. We think the game of mahjong is especially important because it is a multi-player game with imperfect information.
Research on mahjong as a testbed for artificial intelligence has only just begun, for several reasons. The primary reason, of course, is that the game is very difficult to tackle, but another significant reason is that it is regarded as gambling, in contrast to chess or Go, which are regarded as intellectual games. A further reason is that it is not a Western game; consequently, as the references of this article show, much of the literature is written in Japanese.
Several studies have applied learning techniques to implement a computer mahjong player. Chikayama uses a neural network as a learning tool (5). Komatsu proposes a Monte-Carlo-like search method (6). However, the strength of computer mahjong players so far is not very high.
Another line of research analyzes human behavior in mahjong. Fortunately, there are online mahjong sites, and we can obtain game records called 'haifu' from them.
Figure 1. Examples of specific patterns of tiles. Left: Eyes, Middle: Meld (Chow), Right: Meld (Pong).
Figure 2. An example of a legal hand, made from one pair of eyes and four melds.
Tosugeki-Tohoku analyzed two types of online mahjong sites and presented effective guidelines for selecting good moves to win (15, 14).
We think these two approaches should be merged in a good way. This paper does not aim to find an optimal strategy for this type of game. Instead, we aim to develop a method to determine a strategy that matches the opponents' strategies: as mentioned above, although it is too difficult to find the optimal strategy, it is not so hard to find a better strategy against a specific strategy. In other words, we develop a system that can estimate opponents' strategies from their behaviors. This estimation is based on the recorded behaviors of online players.
The rest of this paper consists of the following five sections. Section 2 explains the game of mahjong and how players' behavior is recorded. In Section 3, we analyze the behavior of top-level players using real game records (haifu). In Section 4, we classify the players based on the knowledge obtained from the haifu. In Section 5, we validate our results by implementing a simple AI mahjong program. Section 6 concludes this paper.
In order to evaluate concrete situations, we use the haifu of the online mahjong site 'Teng-ho' (12). We selected advanced players' haifu of 7,888 games at the Ho-Ou table, the place where only top-level players can enter.
In this case, we observe how many times a player changes tiles during the game. Equation (1) shows the result of a regression using the Teng-ho haifu data. Table 3 shows details of the regression.
T = 0.07x1 + 0.73x2 + 1.32x3 + 1.68x4 + 1.74x5 + 5.45 (1)
where,
T : An anticipated number of turns when the player becomes ready to complete,
x1 : The number of changing tile during turn 1 to turn 3,
Using this equation, we can predict the number of players who are ready to complete their hands. A large T means that the situation is risky.
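As a concrete illustration, Equation (1) can be evaluated as follows. This is a minimal sketch; the function name is ours, and since the excerpt defines only x1, we merely assume x2 through x5 are analogous tile-change counts for later turn windows.

```python
# Coefficients and intercept of the regression in Equation (1).
COEFFS = [0.07, 0.73, 1.32, 1.68, 1.74]
INTERCEPT = 5.45

def predicted_ready_turn(x):
    """Anticipated number of turns T until a player becomes ready to
    complete, given a length-5 vector of tile-change counts
    (x[0] = changes during turns 1-3; the rest assumed analogous)."""
    assert len(x) == len(COEFFS)
    return sum(c * xi for c, xi in zip(COEFFS, x)) + INTERCEPT
```

With all counts zero the prediction reduces to the intercept, 5.45 turns.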
Table 5. Division of the situations based on the player's condition and the environmental condition.

                                      Probability of completing a legal hand
                                      Small     Medium    Large
Number of players ready        1      s1-1      s1-2      s1-3
to complete                    2      s2-1      s2-2      s2-3
                               3      s3-1      s3-2      s3-3
Table 7. Result of principal component analysis of the behavior of the advanced players. Std. Dev.: standard deviation, Prop. Var.: proportion of variance, Cum. Prop.: cumulative proportion.
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9
S1–1 0.03 0.00 −0.01 0.03 0.03 −1.00 0.06 0.01 −0.03
S1–2 0.01 0.09 0.00 0.02 0.99 0.03 −0.03 −0.03 −0.01
S1–3 0.00 0.00 0.00 0.04 0.01 −0.02 −0.01 0.00 1.00
S2–1 0.99 −0.02 −0.13 0.00 −0.01 0.03 0.00 0.00 0.00
S2–2 0.02 0.99 −0.06 0.01 −0.09 0.00 0.01 0.06 0.00
S2–3 0.00 −0.01 0.02 0.99 −0.02 0.04 0.01 0.00 −0.03
S3–1 0.13 0.05 0.99 −0.02 −0.01 0.00 0.02 0.01 0.00
S3–2 0.00 0.06 0.01 0.01 −0.04 −0.02 −0.20 −0.98 −0.01
S3–3 0.00 0.00 0.02 0.09 −0.03 −0.06 −0.97 0.20 −0.02
Std.Dev. 5.17 4.48 4.06 3.79 3.37 3.08 2.95 2.89 2.17
Prop. Var. 0.22 0.17 0.13 0.12 0.10 0.08 0.07 0.07 0.04
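Loadings like those in Table 7 can be obtained with a standard principal component analysis. The sketch below uses a randomly generated stand-in for the players-by-situations behavior matrix (the real input comes from the haifu); only the procedure mirrors the analysis.

```python
import numpy as np

# Toy stand-in for the behavior matrix: 100 players x 9 situation
# segments (one column per s1-1 .. s3-3). Illustrative data only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))

# PCA via SVD of the column-centered matrix: rows of Vt are the
# component loadings; singular values give the standard deviations
# of the component scores.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
std_dev = s / np.sqrt(Xc.shape[0] - 1)
prop_var = std_dev**2 / np.sum(std_dev**2)
cum_prop = np.cumsum(prop_var)
```

The three reported rows of Table 7 correspond to `std_dev`, `prop_var`, and `cum_prop`.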
The criterion of division is the number of other players ready to complete, i.e. the environmental condition. We then evaluate the player's action using the risk of the discarded tile. Table 5 shows how the situations are split. In this study, we split the situations into 9 segments.
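The 9-way split of Table 5 can be expressed as a small lookup. The probability cut points below are illustrative assumptions; the excerpt does not state where "small", "medium", and "large" are divided.

```python
def situation_label(num_ready, prob):
    """Map the number of opponents ready to complete (1-3) and the
    player's probability of completing a legal hand to one of the 9
    segments s1-1 .. s3-3. The small/medium/large thresholds at 1/3
    and 2/3 are assumed, not taken from the paper."""
    assert 1 <= num_ready <= 3 and 0.0 <= prob <= 1.0
    bucket = 1 if prob < 1 / 3 else (2 if prob < 2 / 3 else 3)
    return f"s{num_ready}-{bucket}"
```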
Table 6 shows the average risk of discarded tiles.
In Table 6, the difference of situations does not affect the attitude toward taking risk. This means the behavior of top-level players is quite consistent. This result reminds us that consistent behavior is a key to success in many kinds of games and sports.
hand. The haifu data shows that, in these situations, the decisions of advanced players show variety. By contrast, these players act homogeneously in other situations, such as when all three other players seem to be ready to complete or when no other player seems to be ready. In the former case, it is obviously good to play defensively; in the latter, it is obviously good to play offensively.
5. Discussion
Through the analysis of the behavior of advanced mahjong players in Section 4, we find that there are two types of situations: one where all players behave in the same way, and the other where players behave in different manners. Considering that all the players in this haifu are top-level, we can say there is one decisive move in the former situation, and players cannot win without selecting these decisive moves. On the contrary, there are many acceptable moves in the latter situation, and players can choose their favorite moves. We validate this interpretation using a simple AI mahjong program.
where S(x) is the score of candidate discarded tile x, E(h) is the evaluation of the whole hand, R(x) is the risk of candidate discarded tile x, and a is a parameter (0 < a < 1). E(h) can be calculated from the melds, eyes, and doras in the hand. R(x) can be calculated from Table 4.
An AI program that implements this decision-making function selects, as the discarded tile, the tile x that maximizes S(x).
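The decision function can be sketched as follows. The exact combination of E(h) and R(x) is not shown in this excerpt, so the convex combination below, with weight a on hand value and 1 − a on risk, is an assumption consistent with the terms described; the function names are ours.

```python
def score(x, hand, a, evaluate_hand, risk):
    # Hypothetical S(x): trade off hand evaluation E(h) against discard
    # risk R(x). A larger a (0 < a < 1) makes the AI more offensive.
    return a * evaluate_hand(hand) - (1 - a) * risk(x)

def choose_discard(hand, a, evaluate_hand, risk):
    # The AI discards the tile x that maximizes S(x).
    return max(hand, key=lambda x: score(x, hand, a, evaluate_hand, risk))
```

With a toy risk table, since E(h) is the same for every candidate, the lowest-risk tile is discarded.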
5.2. The relation between average rank of the game and the variety of actions
We split the situations into two groups: (1) Situations 2–1 and 2–2, and (2) the others. This split corresponds to the result in Section 4. We then consider the following two cases. Case (a): the AI player does not vary its actions in Situations 2–1 and 2–2, but does vary them in the other situations. Case (b): the AI player varies its actions in Situations 2–1 and 2–2, but does not vary them in the other situations.
Tables 8 and 9 show the average rankings of the AI players. In Table 8, the lowest average rank is 2.60 and the highest is 2.65, so in case (a) the range of average rank is 0.05. In Table 9, on the other hand, the lowest average rank is 2.41 and the highest is 3.61, so in case (b) the range of average rank is 1.20. This is evidence that players' performance does not change when they vary their behavior in Situations 2–1 and 2–2, whereas performance changes significantly when they vary their strategy in the other situations.
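The range comparison can be reproduced directly from the reported values:

```python
# Average ranks reported for Tables 8 and 9 (values from the text).
case_a_low, case_a_high = 2.60, 2.65  # Table 8, case (a)
case_b_low, case_b_high = 2.41, 3.61  # Table 9, case (b)

range_a = case_a_high - case_a_low  # 0.05: performance barely moves
range_b = case_b_high - case_b_low  # 1.20: performance varies widely
```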
6. Conclusion
In this study, we tried to classify the play styles of mahjong players from haifu, the records of their behaviors. The players' attitude toward the risk of the discarded tile is used for evaluation. Analyzing the haifu of top-level players on the 'Teng-Ho' online mahjong site, we found that advanced players take the same attitude toward risk in almost every situation, but there are a few situations where these players behave differently. We classified the behavior in these situations by principal component analysis and found that advanced players' behaviors fall into one major cluster and three minor clusters. We also validated our finding using a simple computer simulation. As future work, we have to consider many properties neglected in this study, such as melding and riichi. We are then planning to create AI players based on these results.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
[1] S. Edwards, Standard: portable game notation specification and implementation guide. Available at http://www.
saremba.de/chessgml/standards/pgn/pgn-complete.htm.
[2] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd
ed, Springer, New York, 2009.
[3] A. Hollosi, SGF file format FF[4]. Available at http://www.red-bean.com/sgf/.
[4] F.H. Hsu, Behind Deep Blue: Building the Computer that Defeated the World Chess Champion, Princeton University
Press, New Jersey, 2002.
[5] R. Kitagawa, M. Miwa, and T. Chikayama, Learning of evaluation functions in accord with game records in mahjong (in
Japanese), Proceedings of the 12th Game Programming Workshop, Kanazawa, 2007, pp. 76–83.
[6] T. Komatsu, K. Narisawa, and A. Shinohara, Effective algorithm for decision making on hand-composing game (in Japanese), IPSJ SIG Tech. Rep. 2012-GI-28(8) (2012), pp. 1–8.