Final Project Paper


European Soccer Rankings

Seth Kurzynski

Advised by Dr. Matt Rissler

Abstract

Teams in England’s top tier of professional soccer (or football), the Premier League, can play in up to six different competitions in a given season. Given the high number of trophies that teams in England and across Europe compete for, it is often unclear who the best team truly is. This paper aims to address this problem by providing a sound ranking system for the teams in Europe’s top five leagues (England, Spain, Italy, Germany, France). The paper specifically explores the Colley and Massey ranking methods, which have commonly been used for March Madness. These methods were applied to each of Europe’s top five leagues individually, allowing us to gauge each team’s recent performance. The rankings looked strictly at each team’s performance in its own league, but they were then compared across leagues to see which team is doing best relative to its league opponents. This project provides sound insights into ranking league performance in European soccer, and it is also a springboard for future research into other and new ranking methods.

Introduction

In the world of sports, everyone wants to be the best. If you’re not playing to be the best, then what’s really the point of playing? In the quest to find who the best team truly is, many analysts obsess over rankings, constantly looking at different ways to compare teams and find who is on top. Whether you agree with their value or not, rankings are a part of all team sports today, and they are an important factor in seeding, predictions, and many other facets of sport. For my project, I will be comparing different methods of ranking European soccer teams, specifically focusing on the Colley and Massey methods of ranking.

The first aspect of this project that is important to understand is the way European soccer is structured, as it is very different from how American sports work. The governing body of European soccer is UEFA, the Union of European Football Associations, and all countries must abide by its rules and regulations. Each country has a domestic league and cup system. Each league consists of a set number of teams that play each other in a round-robin format, with the champion being whichever team has the most points at the end of the season. Teams receive three points for a win, one for a tie, and zero for a loss. Each country also has a cup, which is a single-elimination tournament that takes place throughout the league season. The majority of teams put a higher emphasis on winning the league than the cup, as the league winner is viewed as the true best team in a country.

Beyond domestic competitions, European soccer also has two tournaments that allow teams to compete against teams from other countries: the Champions League and the Europa League. Both tournaments follow the same structure, a group stage followed by a knockout phase, but they differ in who takes part in each. The Champions League consists of the teams that finished highest in their domestic leagues, while the Europa League consists of the best of the rest. In the group stage, teams are drawn into groups of four, with each team playing every other team in its group twice. The knockout phase is a single-elimination tournament in which each matchup is played over two games, one at each team’s home stadium, with the team scoring the most goals over the two games advancing. In the Champions League, the top two teams from each group qualify for the knockout phase, while the third-place team drops down to the Europa League knockout phase and the bottom team is knocked out. In the Europa League, the top two teams from each group qualify for the knockout phase, while the bottom two are knocked out. The final of each competition consists of only one game rather than two.

Current Ranking Methods

Currently, UEFA uses its own set of rankings to determine qualification and seeding for both European tournaments. According to UEFA’s website, “The club coefficients are based on the results of clubs competing in the five previous seasons of the UEFA Champions League and UEFA Europa League. The rankings determine the seeding of each club in relevant UEFA competition draws.”¹ UEFA also has a rule where a club will be seeded either by its club coefficient or by its association club coefficient. The association club coefficient is calculated the same way as the club coefficient, but it is based on how all teams from a certain association (country) do.²

These rankings play a very important role in determining how each team gets into tournaments and where it is seeded once it gets in, but are they truly a good way of ranking teams? The rankings serve their purpose of giving UEFA a way of seeding teams, but they raise the issue of recency. Do results from five years ago really matter in terms of how a team is doing right now? Realistically, those results say little about who the best team is at the moment, meaning UEFA’s ranking isn’t sound for identifying the best team at a given moment. These rankings also don’t factor in quality of opponent, which raises the possibility of a team being artificially ranked higher simply because it beat worse teams. This isn’t a huge issue given the purpose and time span of the rankings, but it is still worth pointing out. Overall, UEFA’s rankings fulfill their purpose but leave soccer fans hungry for a ranking system that shows who the best team is right now.

¹ “How the Club Coefficients Are Calculated,” UEFA, last modified August 23, 2019, https://www.uefa.com/memberassociations/uefarankings/club/ahbout/.
² UEFA, “How the Club Coefficients Are Calculated.”

Another ranking method used in soccer is called “SUM,” which is how FIFA, the governing body for international teams, ranks its teams. This method is fairly new, as it has only been in use since August of 2018, and it is based on the Elo method, which is a common ranking method in team sports and chess.³ The formula for “SUM” is:

$$P = P_{\text{before}} + I \cdot (W - W_e) \tag{1}$$

In this formula, $P$ stands for a team’s points after the match, $P_{\text{before}}$ is the number of points the team has coming into the match, and $I$ is the importance of the match; FIFA publishes a list giving the multiplier for each type of match. $W$ stands for the result of the match, which is 1 for a win, 0.5 for a draw, and 0 for a loss. $W_e$, the expected result of the match, is subtracted from the result of the match, and it is calculated using a specific formula:

$$W_e = \frac{1}{10^{-dr/600} + 1} \tag{2}$$

In this formula, $dr$ is the difference between the two teams’ ratings. The total points formula above adds to or subtracts from each team’s points total based on the result of each game. Teams are then ranked by the total number of points they have.
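To make the update rule concrete, the following is a minimal Python sketch of equations (1) and (2). It is not FIFA's implementation, nor the R code written for this project; the function names, the starting totals of 1500, and the importance multiplier of 25 are assumptions chosen only for illustration.

```python
def expected_result(dr: float) -> float:
    """Expected result W_e from equation (2), where dr is the rating difference."""
    return 1.0 / (10 ** (-dr / 600) + 1.0)


def updated_points(p_before: float, opponent_points: float,
                   importance: float, result: float) -> float:
    """Apply equation (1): P = P_before + I * (W - W_e)."""
    dr = p_before - opponent_points
    return p_before + importance * (result - expected_result(dr))


# Hypothetical example: two teams on 1500 points, importance multiplier of 25.
after_win = updated_points(1500, 1500, importance=25, result=1.0)
after_draw = updated_points(1500, 1500, importance=25, result=0.5)
print(after_win, after_draw)
```

Under these assumed inputs, a draw between equally rated teams leaves the total unchanged, while a win adds half of the importance multiplier, since the expected result for evenly matched teams is 0.5.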

³ “Men's Ranking Procedure,” FIFA, accessed March 15, 2021, https://www.fifa.com/fifa-worldranking/procedure/men.

Compared to UEFA’s ranking system, FIFA’s formula seems to be a more effective way of ranking teams, as it incorporates both the importance of a match and the ratings of the two teams. The rankings aren’t based simply on how far a team goes in a tournament; they are based on each match a team plays and how that match was expected to go. This makes the ranking more fluid and up to date, as a lower-ranked team that beats a higher-ranked team will move up further and faster than, say, a lower-ranked team making it far in a tournament under UEFA’s rankings. The importance of the match is also a valuable variable to include, as games are weighted differently but all games are still taken into consideration. This is effective because it doesn’t penalize a team heavily for a loss in a game that doesn’t matter, nor does it reward the victor too much. Another effective part of this ranking is that teams keep their point totals year after year. This means that results from years past still contribute to a team’s ranking, but they do not continue to have a large say in the team’s future, as each new result has more of an impact. This is one problem I have with UEFA’s ranking: it seems to place too much weight on previous results, as teams that did well three to four years ago are still higher up in the rankings even when they may not be playing well today.

Research

To develop our rankings, Dr. Rissler and I built a program in R that uses linear algebra to rank teams. I explored two different methods of ranking teams using linear algebra: the Colley Method and the Massey Method. The Colley Method takes the typical method of ranking teams by winning percentage and adds a dependence on the rating of the opponent.⁴ This means that a win over an opponent with a better record is weighted higher than a win over an opponent with a poor record. This aspect of the method is very important, as it could rank a team with a worse record higher based on the quality of its wins. The problem with this method, however, is that it doesn’t consider the score of a game.⁵ This means that a win by one goal carries the same weight as a five-goal thrashing.

⁴ Tim Chartier, Erich Kreutzer, Amy Langville, and Kathryn Pedings, “Bracketology: How Can Math Help?,” Dolciani Mathematical Expositions 43, no. 1 (2010): 55–70, doi:10.5948/upo9781614442004.006.

The Massey Method addresses this issue, as it factors in the winning margin. This method assumes that transitivity holds approximately, meaning that if team A beats team B by five and team B beats team C by ten, then team A will likely beat team C by fifteen.⁶ The Massey Method uses linear algebra and transitivity to predict who will win a game and by how much based on the previous results of both teams.⁷ The biggest drawback of this method is that transitivity often does not hold, and it is very hard to predict when it will fail.

Both the Colley and Massey methods have their own focus in their approach to ranking: Colley focuses on quality of win, while Massey looks at score. These methods are proven to be effective, but which is more effective for ranking soccer teams specifically? By its nature, soccer is a very low-scoring sport, meaning that the score of a game often does not tell the whole tale. For instance, the Seattle Sounders recently hosted Real Salt Lake in the 2021 MLS playoffs. Seattle dominated possession at 62%, posted 21 shots to Real’s 0, and got 15 corners to the opponent’s 1.⁸ Real Salt Lake won the game on penalty kicks. The score line would suggest this was a close game between two evenly matched teams, but that was far from the case, as Seattle dominated this game from a statistical point of view. This means that the Massey method may not be the most effective method for ranking soccer teams, given the low amount of scoring and how often scores do not match how a game really went. Looking at basketball, Massey makes more sense, as basketball has so many more points than soccer, meaning differences in points more accurately represent how the game went. This isn’t always the case, but it is generally true when comparing soccer and basketball.

⁵ Chartier, “Bracketology,” 58.
⁶ Chartier, “Bracketology,” 59.
⁷ Chartier, “Bracketology,” 59.
⁸ “MLS playoffs: Real Salt Lake stun Sounders in penalty shootout,” Match Report, ESPN, accessed November 28, 2021, https://www.espn.com/soccer/report/_/gameId/621672.

Ranking Methods Used in Research

Colley Method⁹

This method of ranking starts from the standard formula for winning percentage,

$$\frac{w_i}{t_i}, \tag{3}$$

where $w_i$ is the total number of wins and $t_i$ is the total number of games played for a given team, $i$. Laplace’s rule of succession is then applied to produce

$$r_i = \frac{1 + w_i}{2 + t_i}. \tag{4}$$

The formula for wins can then be decomposed into

$$w_i = \frac{w_i - l_i}{2} + \frac{w_i + l_i}{2} = \frac{w_i - l_i}{2} + \frac{t_i}{2}, \tag{5}$$

where $l_i$ is the number of losses for a given team, $i$. Looking closer at $\frac{t_i}{2}$: at the beginning of a season, each team’s rating starts at $\frac{1}{2}$, and the ratings will fluctuate around $\frac{1}{2}$ throughout the season, meaning

$$\frac{1}{2}(\text{total games}) \approx (\text{sum of opponents' ratings for all games played}), \tag{6}$$

which can be translated into

$$\frac{1}{2} t_i \approx \sum_{j \in O_i} r_j, \tag{7}$$

where $O_i$ is the set of opponents for a given team, $i$. This can then be substituted back into the equation for $w_i$ to produce

$$w_i \approx \frac{w_i - l_i}{2} + \sum_{j \in O_i} r_j. \tag{8}$$

This relationship is approximate, as the ratings of all of a team’s opponents may not average out to $\frac{1}{2}$ because one team may not play every other team. If equality is assumed, we find the rating of a given team, $i$, based on this equation:

$$r_i = \frac{1 + \frac{w_i - l_i}{2} + \sum_{j \in O_i} r_j}{2 + t_i}. \tag{9}$$

⁹ Chartier, “Bracketology,” 56-58.

Let’s look at an example to see how we can get ratings from this equation, using the results from a famous group from the 2019 Champions League.¹⁰

¹⁰ “2019 Champions League Table,” Google, accessed April 8, 2021, https://www.google.com/search?q=2019+champions+league+table&sxsrf=AOaemvKS0-5pJaIiTlxHHZfDEyrxrKVyEA%3A1638400852377&ei=VAOoYbbFFs3dtAaBg6WACw&oq=2019+champions+league+&gs_lcp=Cgdnd3Mtd2l6EAEYATIECAAQQzIKCAAQgAQQhwIQFDIECAAQQzIECAAQQzIECAAQQzIECAAQQzIECAAQQzIFCAAQgAQyBQgAEIAEMgUIABCABDoHCCMQsAMQJzoHCAAQRxCwAzoKCC4QyAMQsAMQQ0oECEEYAFDkAljkAmDiCmgBcAJ4AIABXogBXpIBATGYAQCgAQHIAQ_AAQE&sclient=gws-wiz#sie=lg;/g/11c743v3x1;2;/m/0c1q0;st;fp;1;;.

[Group table: Barcelona, Tottenham, Inter, and PSV]

(10)

Looking at each team’s wins, losses, and total games played, we can substitute those numbers into our equation to get a linear system of equations:

$$\begin{aligned}
R_{\text{Barcelona}} &= \frac{1 + \frac{5-1}{2} + 2R_{\text{Tottenham}} + 2R_{\text{Inter}} + 2R_{\text{PSV}}}{2+6} \\
R_{\text{Tottenham}} &= \frac{1 + \frac{3-3}{2} + 2R_{\text{Barcelona}} + 2R_{\text{Inter}} + 2R_{\text{PSV}}}{2+6} \\
R_{\text{Inter}} &= \frac{1 + \frac{3-3}{2} + 2R_{\text{Barcelona}} + 2R_{\text{Tottenham}} + 2R_{\text{PSV}}}{2+6} \\
R_{\text{PSV}} &= \frac{1 + \frac{1-5}{2} + 2R_{\text{Barcelona}} + 2R_{\text{Tottenham}} + 2R_{\text{Inter}}}{2+6}
\end{aligned} \tag{11}$$

One may question why the wins and losses don’t exactly match the number of wins and losses from the group; that is because we must factor ties in. Each tie counts as half a win and half a loss, so 0.5 is added to both the win and loss totals for each tie. We now have a system of equations that we can rearrange and put into a matrix that we can solve. Rearranging our values produces:

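$$\begin{aligned}
-\tfrac{3}{8} &= -R_{\text{Barcelona}} + \tfrac{2}{8}R_{\text{Tottenham}} + \tfrac{2}{8}R_{\text{Inter}} + \tfrac{2}{8}R_{\text{PSV}} \\
-\tfrac{1}{8} &= \tfrac{2}{8}R_{\text{Barcelona}} - R_{\text{Tottenham}} + \tfrac{2}{8}R_{\text{Inter}} + \tfrac{2}{8}R_{\text{PSV}} \\
-\tfrac{1}{8} &= \tfrac{2}{8}R_{\text{Barcelona}} + \tfrac{2}{8}R_{\text{Tottenham}} - R_{\text{Inter}} + \tfrac{2}{8}R_{\text{PSV}} \\
\tfrac{1}{8} &= \tfrac{2}{8}R_{\text{Barcelona}} + \tfrac{2}{8}R_{\text{Tottenham}} + \tfrac{2}{8}R_{\text{Inter}} - R_{\text{PSV}}
\end{aligned} \tag{12}$$

We can now put these values into matrix form:

$$\begin{bmatrix} -\frac{3}{8} \\ -\frac{1}{8} \\ -\frac{1}{8} \\ \frac{1}{8} \end{bmatrix} = \begin{bmatrix} -1 & \frac{2}{8} & \frac{2}{8} & \frac{2}{8} \\ \frac{2}{8} & -1 & \frac{2}{8} & \frac{2}{8} \\ \frac{2}{8} & \frac{2}{8} & -1 & \frac{2}{8} \\ \frac{2}{8} & \frac{2}{8} & \frac{2}{8} & -1 \end{bmatrix} \begin{bmatrix} R_{\text{Barcelona}} \\ R_{\text{Tottenham}} \\ R_{\text{Inter}} \\ R_{\text{PSV}} \end{bmatrix} \tag{13}$$

By multiplying both sides by −8 and then by the inverse of the resulting coefficient matrix, we can produce ratings:

$$\begin{bmatrix} 8 & -2 & -2 & -2 \\ -2 & 8 & -2 & -2 \\ -2 & -2 & 8 & -2 \\ -2 & -2 & -2 & 8 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 1 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} R_{\text{Barcelona}} \\ R_{\text{Tottenham}} \\ R_{\text{Inter}} \\ R_{\text{PSV}} \end{bmatrix} = \begin{bmatrix} 0.7 \\ 0.5 \\ 0.5 \\ 0.3 \end{bmatrix} \tag{14}$$

These ratings are interpreted in the same way as winning percentage, meaning they are simply the share of games we would expect each team to win. For example, we would expect Barcelona to win 70% of the time if this group continued.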
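The worked example can be checked with a short Python sketch. This is an independent illustration, not the R code written for this project: the `solve` helper is a hypothetical small Gauss-Jordan routine using exact fractions, and the win and loss totals are the tie-adjusted values from equation (11), with every pair of teams assumed to have met twice.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination in exact arithmetic."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


teams = ["Barcelona", "Tottenham", "Inter", "PSV"]
wins = [5, 3, 3, 1]      # tie-adjusted wins from equation (11)
losses = [1, 3, 3, 5]    # tie-adjusted losses from equation (11)
games_between = 2        # each pair of teams met twice in the group
n = len(teams)

# Colley matrix: 2 + t_i on the diagonal, minus the games played off it.
colley = [[2 + wins[i] + losses[i] if i == j else -games_between
           for j in range(n)] for i in range(n)]
# Right-hand side: 1 + (w_i - l_i) / 2.
rhs = [1 + Fraction(wins[i] - losses[i], 2) for i in range(n)]

ratings = solve(colley, rhs)
for team, rating in zip(teams, ratings):
    print(team, float(rating))
```

The matrix and right-hand side built here match the left side of equation (14), so the printed ratings should agree with the values given there.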


Massey Method¹¹

The Massey method stems from a matrix equation that represents each game played in a given league/season.

$$\begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \tag{15}$$

The equation represents a series of three games in which team 1 beat team 2 4-1, team 2 beat team 3 1-0, and team 3 beat team 1 4-2. $r_1$, $r_2$, and $r_3$ represent the ratings for teams 1, 2, and 3 of the example series. The matrix on the left records which team won and which lost each game. The first column corresponds to $r_1$, the second to $r_2$, and so on. Each row represents a specific game between two teams in the set; for example, the first row represents team 1 beating team 2, as the $r_1$ column has a positive 1 and the $r_2$ column has a negative 1. This row corresponds to the 3 in the vector on the right side of the equals sign, as $r_1 - r_2 = 3$. Each row on the left thus corresponds to the resulting score differential on the right. Let’s say we have another game where team 1 beats team 3, 1-0. Our resulting equation would thus be:

$$\begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 2 \\ 1 \end{bmatrix} \tag{16}$$

¹¹ Chartier, “Bracketology,” 59-60.

Each new result is added as a new row rather than being added into an existing element of the matrix. This means that for each game we add to a series/season, both the matrix and the vector on the right keep getting bigger, and the matrix will never end up having full rank.¹² This means that we cannot find an exact solution to the equation, but using the method of least squares we can minimize the residual error and get an approximate solution.¹³ To do this, we take the transpose of our results matrix on the left and multiply both sides by it:

$$\begin{bmatrix} 1 & 0 & -1 & 1 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 & 1 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \\ 2 \\ 1 \end{bmatrix} \tag{17}$$

Our new equation is now:

$$\begin{bmatrix} 3 & -1 & -2 \\ -1 & 2 & -1 \\ -2 & -1 & 3 \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix} \tag{18}$$

This new equation is interesting, as the diagonal of the left-side matrix represents the number of games each team has played, while the negative off-diagonal numbers represent the number of games between two specific teams. For instance, looking at the first row we can see team 1 has played three games, one against team 2 and two against team 3. The right-side vector is each team’s goal differential, meaning the number of goals scored minus the number of goals given up. The problem with this new system is that it does not have a unique solution: every row of the left-side matrix sums to zero, so adding the same constant to every rating leaves the equations unchanged. To account for this, we will replace the bottom row on the left with ones and the bottom entry on the right with a 0:

$$\begin{bmatrix} 3 & -1 & -2 \\ -1 & 2 & -1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix} \tag{19}$$

The row of ones with a zero on the right requires the ratings to sum to zero, which mirrors the fact that the goal differentials always sum to zero and pins down a single solution. Taking the inverse of the left side and solving we get:

$$\begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 0.533 \\ -0.667 \\ 0.133 \end{bmatrix} \tag{20}$$

¹² Chartier, “Bracketology,” 60.
¹³ Chartier, “Bracketology,” 60.

This produces our Massey ratings for each team. We interpret each rating as the number of goals we would expect each team to win or lose by against the average team in the set. Looking at team 1, we would expect them to win by about half a goal against the average opponent. We can also take the difference of two teams’ ratings to get the expected result of a match between them. Taking the difference in rating between team 1 and team 3 produces 0.4. We would thus expect team 1 to beat team 3 by 0.4 goals during the next match.
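The same steps can be sketched in Python for the four-game example. This is an illustration under stated assumptions rather than the R code used for the project: the `solve` helper is a hypothetical Gauss-Jordan routine in exact fractions, teams are indexed 0 to 2, and each game is stored as a (winner, loser, margin) triple.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination in exact arithmetic."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# (winner, loser, margin) for the four example games; team 1 is index 0.
games = [(0, 1, 3), (1, 2, 1), (2, 0, 2), (0, 2, 1)]
n = 3

massey = [[0] * n for _ in range(n)]
goal_diff = [0] * n
for winner, loser, margin in games:
    # Diagonal counts games played; off-diagonal counts games between a pair.
    massey[winner][winner] += 1
    massey[loser][loser] += 1
    massey[winner][loser] -= 1
    massey[loser][winner] -= 1
    goal_diff[winner] += margin
    goal_diff[loser] -= margin

# Replace the last row with ones and a zero so the ratings must sum to zero.
massey[-1] = [1] * n
goal_diff[-1] = 0

ratings = solve(massey, goal_diff)
print([round(float(r), 3) for r in ratings])
print(float(ratings[0] - ratings[2]))  # expected margin, team 1 over team 3
```

Accumulating the matrix game by game gives the same left and right sides as equation (18) without forming the transpose product explicitly, which is one common way to build the system for a full season.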

Weighting

The other aspect that can be considered when looking at these methods is weighting games by when they take place in the season. The methods by themselves treat each game as equal in terms of ranking, but this is often not the best way to indicate how good a team is. One can use a linear, logarithmic, or other model of weighting to determine the weight each game of the season should have on the final ranking of teams. Linear weighting would put a linear increase on the weight each game has toward the ranking, from 0 at the start of the season up to 1 at the end of the season.¹⁴ This can work, but often things in real life do not follow a linear pattern. Logarithmic weighting could be used to put a higher weight on games at the end of the season but with less disparity between games earlier in the season.¹⁵ Another alternative could be to use a bi-weekly scale that adds weight to each game based on the week in the season, with the weight increasing as the season goes on.¹⁶ All of these methods are viable, but we unfortunately did not have enough time to explore them further.
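As a rough illustration of the linear and bi-weekly ideas, the Python sketch below turns a game's day in the season into a weight and sums the weights of a team's wins. Weighting was not implemented in this project, so everything here is an assumption: the function names, the 280-day season, the 14-day step, and the sample results are all hypothetical.

```python
def linear_weight(day: int, season_days: int) -> float:
    """Weight rising linearly from 0 on the first day to 1 on the last."""
    return day / season_days


def biweekly_weight(day: int, season_days: int) -> float:
    """Step weight that increases once every two weeks (14 days)."""
    periods = -(-season_days // 14)  # number of two-week periods, rounded up
    return min(day // 14 + 1, periods) / periods


def weighted_wins(results, weight, season_days):
    """Sum the weights of the games won; results holds (day, won) pairs."""
    return sum(weight(day, season_days) for day, won in results if won)


# Hypothetical team: wins on days 7 and 210, a loss on day 100, 280-day season.
results = [(7, True), (100, False), (210, True)]
print(weighted_wins(results, linear_weight, 280))
print(weighted_wins(results, biweekly_weight, 280))
```

In a weighted version of either method, totals like these would presumably stand in for the raw win, loss, and game counts, so that late-season results move a team's rating more than early ones.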

Code

For this project, Dr. Rissler and I developed R code that takes in game results for each of Europe’s top five leagues, applies either the Colley or Massey method, and prints the resulting rankings to an Excel file. Both sets of code take in the results from a data set that pulls results from online and then apply their respective method. Each set of code does exactly what happened in each example explained above, just on a much larger scale.¹⁷ The rankings are then printed out to an Excel file that makes them easily accessible. All I had to do was run the code and it would print out current rankings for each league. We also handled each league on its own, so we have five sets of code, one for each league.

Results

After running the code and looking at each league, I made a table for each league that shows standings for this season (21/22) and last season (20/21) as well as Colley and Massey ratings for both seasons. One thing to note about the Colley and Massey ratings is that they consider each season individually, so no results from last season factor into the ratings for this season. The ratings are up to date as of November 10th, 2021.

¹⁴ Chartier, “Bracketology,” 61.
¹⁵ Chartier, “Bracketology,” 62.
¹⁶ Chartier, “Bracketology,” 63.
¹⁷ See Appendix.

France

Team 2020 Colley Rating 2020 Massey Rating 2020 League Finish 2021 Colley Rating 2021 Massey Rating 2021 League Standings
Paris Saint-Germain 0.714285714 1.45 2 0.813759151 1.139249787 1
RC Lens 0.547619048 0.025 7 0.609880906 0.70413741 2
OGC Nice 0.488095238 -0.075 9 0.610855714 0.844806627 3
Olympique Marseille 0.571428571 0.175 5 0.639715874 0.593656514 4
Stade Rennes 0.547619048 0.3 6 0.599606532 0.767072608 5
Montpellier HSC 0.523809524 -0.05 8 0.544423246 0.220859827 6
Olympique Lyon 0.69047619 0.95 4 0.543710318 0.079687991 7
RC Strasbourg 0.416666667 -0.225 15 0.502124397 0.441938453 8
Angers SCO 0.428571429 -0.45 13 0.537459773 0.265765487 9
FC Nantes 0.416666667 -0.2 18 0.479444073 0.095636279 10
AS Monaco 0.69047619 0.85 3 0.485477616 -0.028731354 11
Lille OSC 0.75 1.025 1 0.471343641 -0.210605639 12
FC Lorient 0.416666667 -0.45 16 0.45870008 -0.560095636 13
Troyes 2nd Tier 2nd Tier 2nd Tier 0.422464544 -0.425203514 14
Clermont Foot 2nd Tier 2nd Tier 2nd Tier 0.392872945 -0.878509288 15
Stade Reims 0.44047619 -0.2 14 0.409288844 -0.178377773 16
Stade Brest 0.404761905 -0.4 17 0.407513849 -0.324872631 17
Bordeaux 0.428571429 -0.35 12 0.427798663 -0.539983066 18
AS Saint-Etienne 0.452380952 -0.3 11 0.335495188 -1.00525034 19
FC Metz 0.464285714 -0.1 10 0.308064648 -1.001181742 20
Nimes Olympique 0.357142857 -0.775 19 2nd Tier 2nd Tier 2nd Tier
Dijon FCO 0.25 -1.2 20 2nd Tier 2nd Tier 2nd Tier

( 21 )

The first league explored is the French league, or Ligue 1, which is typically looked at as a one-horse race, as Paris Saint-Germain (PSG) are considered by far the best team talent-wise. PSG spends large amounts of money to build a team of some of the top players in the world, something no other team in the league is able to do. It was thus a great surprise when Lille won the league last season with a Colley rating of 0.75 and a Massey rating of 1.025. Lille did this by simply getting results with a good group of players: PSG was the better team in terms of goal scoring given their 1.45 Massey rating, but their 0.714 Colley rating shows they did not get as many results. Lille’s current position this season reflects a common theme in soccer of what happens to a smaller club when it wins a competition. Being a smaller club, Lille does not have the financial means to keep players when they find success, so three of their top players, Boubakary Soumare, Mike Maignan and Luiz Araujo, were all bought by other clubs in the summer transfer window.¹⁸ This is a common trend in European soccer, as bigger teams often come in and buy players from smaller clubs that find success, because smaller clubs simply can’t pay their players enough to get them to stay. Lille also experienced significant financial trouble during the pandemic, which further hindered them from sustaining their success.¹⁹ Looking at their ratings for this year, 0.47 Colley and -0.21 Massey, it is clear that they are not the same team that won the league last season.

Spain

Team 2020 Colley Rating 2020 Massey Rating 2020 League Finish 2021 Colley Rating 2021 Massey Rating 2021 League Standings
Real Sociedad 0.583333333 0.525 5 0.705097729 0.569867197 1
Real Madrid 0.75 0.975 2 0.721918594 1.116874391 2
Sevilla FC 0.678571429 0.5 4 0.716117037 0.988869565 3
Atletico Madrid 0.761904762 1.05 1 0.64659099 0.539039139 4
Real Betis 0.571428571 6.17E-17 6 0.57410985 0.180782387 5
Rayo Vallecano 2nd Tier 2nd Tier 2nd Tier 0.527733879 0.367135279 6
CA Osasuna 0.44047619 -0.275 11 0.554922786 -0.127342061 7
Athletic Bilbao 0.464285714 0.1 10 0.568717819 0.23832924 8
FC Barcelona 0.702380952 1.175 3 0.519015015 0.274330961 9
Valencia CF 0.44047619 -0.075 13 0.515010307 0.115184153 10
Espanyol Barcelona 2nd Tier 2nd Tier 2nd Tier 0.505591632 0.063900206 11
Villarreal CF 0.55952381 0.4 7 0.49903953 0.122983279 12
RCD Mallorca 2nd Tier 2nd Tier 2nd Tier 0.487259045 -0.403135518 13
CD Alaves 0.392857143 -0.525 16 0.402766428 -0.577865168 14
Celta Vigo 0.511904762 -0.05 8 0.37601395 -0.250132096 15
Cadiz CF 0.44047619 -0.55 12 0.400196065 -0.568996193 16
Granada CF 0.44047619 -0.45 9 0.392987745 -0.380914272 17
Elche CF 0.380952381 -0.525 17 0.37773489 -0.392360355 18
Levante UD 0.428571429 -0.275 14 0.267663005 -0.970988372 19
Getafe CF 0.392857143 -0.375 15 0.241513704 -0.905561764 20
SD Huesca 0.369047619 -0.475 18 2nd Tier 2nd Tier 2nd Tier
Real Valladolid 0.357142857 -0.575 19 2nd Tier 2nd Tier 2nd Tier
SD Eibar 0.333333333 -0.575 20 2nd Tier 2nd Tier 2nd Tier

( 22 )

The Spanish league, or La Liga, is more interesting than France’s, as there is more competition at the top of the league. Historically, Real Madrid and FC Barcelona have been considered the best teams in the league, with a few other teams occasionally usurping them. Currently, Real Sociedad finds themselves atop the league even though they do not have the highest Colley or Massey ratings. This is primarily because they have played one more game than Real Madrid and Sevilla FC and are only one point ahead. It is likely that once the teams are even in total games played Real Sociedad will not be in first. Aside from that, Real Sociedad sits so close to the top primarily because of their 0.71 Colley rating, meaning we would expect them to win about 71% of the time. Their 0.57 Massey rating suggests that they are not beating teams as soundly as Real Madrid (1.12) and Sevilla (0.99), but they are getting results at an almost identical rate.

¹⁸ Daniel Storey, “Lille and Inter Milan’s post-title financial struggles show why football may be broken beyond repair,” INews, August 13, 2021, https://inews.co.uk/sport/football/lille-inter-milan-european-football-finances-1147364.
¹⁹ Storey, “Lille.”

Another interesting team in La Liga is FC Barcelona, who finished third last season with a Colley rating of 0.70 and a Massey rating of 1.175. Their current numbers suggest they’re struggling much more than last season: their Colley rating is about 0.18 lower, and their Massey rating says they are beating opponents by almost a goal less than last year. This is likely because FC Barcelona lost their best player, Lionel Messi, who is widely considered one of the greatest players of all time. To give context for how good a player Lionel Messi is, he recently received a record seventh Ballon d’Or award, given to the best soccer player on the planet.²⁰ The rankings show that losing a player and goal scorer like that has a large effect on a team, as FC Barcelona clearly is not the same team they were in the previous season.

Another team that is interesting to look at is CA Osasuna, with a 0.55 Colley rating and a -0.13 Massey rating. According to the Colley rating, we would expect CA Osasuna to win about 55% of the time, but according to the Massey rating we would expect them to lose by 0.13 goals on average. This can be explained by CA Osasuna’s goal differential: they are the highest-placed team with a negative Massey rating, meaning they are winning more than they lose but they aren’t outscoring their opponents. This could be due to them winning by small margins and then having a few large losses where they lose by multiple goals. One may be surprised by this, but soccer is a very volatile sport where teams toward the middle of a league fluctuate a lot in terms of rankings.

²⁰ Gabriele Marcotti, “Why Lionel Messi's Ballon d'Or win shouldn't make you angry,” ESPN, November 30, 2021, https://www.espn.com/soccer/blog-marcottis-musings/story/4535190/why-messis-ballon-dor-win-shouldnt-make-you-angry.

The last interesting team to point out in La Liga is Rayo Vallecano, who were in the second tier last season. They currently find themselves in sixth with a 0.53 Colley rating and a 0.37 Massey rating. It may be surprising to see a recently promoted side doing so well, but this can often happen, especially early in a season. Recently promoted teams are often coming off many good performances in the previous season and thus ride that momentum into the new season in a new league. As a season goes on this often levels out. There also isn’t a specific reason to explain their ratings and standing other than the fact that they are finding ways to get good results. Often teams are simply playing well and having a good season, which is then reflected in the rankings.

Germany

Team 2020 Colley Rating 2020 Massey Rating 2020 League Finish 2021 Colley Rating 2021 Massey Rating 2021 League Standings
Bayern Munchen 0.763157895 1.527777778 1 0.797170809 2.521849066 1
Borussia Dortmund 0.631578947 0.805555556 3 0.70159018 1.068783983 2
SC Freiburg 0.486842105 3.52E-16 10 0.677321528 0.773659862 3
VfL Wolfsburg 0.631578947 0.666666667 4 0.554183066 -0.193400882 4
RasenBallsport Leipzig 0.657894737 0.777777778 2 0.580553922 1.070700984 5
Bayer Leverkusen 0.552631579 0.388888889 6 0.59285707 0.673876209 6
1. FSV Mainz 05 0.434210526 -0.472222222 12 0.528225833 0.316505037 7
1. FC Union Berlin 0.552631579 0.194444444 7 0.595170559 0.18075638 8
Bor. Monchengladbach 0.526315789 0.222222222 8 0.513176993 -0.050213141 9
1899 Hoffenheim 0.473684211 -0.055555556 11 0.469489111 0.156564673 10
1. FC Koln 0.381578947 -0.722222222 16 0.524914608 -0.051567841 11
VfL Bochum 2nd Tier 2nd Tier 2nd Tier 0.409393521 -0.740640042 12
Hertha BSC 0.407894737 -0.305555556 14 0.437100274 -0.925997373 13
Eintracht Frankfurt 0.631578947 0.444444444 5 0.43741135 -0.504262418 14
VfB Stuttgart 0.486842105 0.027777778 9 0.354058231 -0.644680999 15
FC Augsburg 0.394736842 -0.5 13 0.366073949 -0.919317269 16
Arminia Bielefeld 0.394736842 -0.722222222 15 0.335477414 -0.890078659 17
SpVgg Greuther Furth 2nd Tier 2nd Tier 2nd Tier 0.125831581 -1.84253757 18
Werder Bremen 0.368421053 -0.583333333 17 2nd Tier 2nd Tier 2nd Tier
FC Schalke 04 0.223684211 -1.694444444 18 2nd Tier 2nd Tier 2nd Tier

( 23 )
The German league, or Bundesliga, is similar to the French league, as it is often considered a one-horse race in which Bayern Munich wins while everyone else competes for second. Looking at Bayern’s 0.80 Colley rating and 2.52 Massey rating, it is clear that they are dominating the league. They sit atop the Massey ratings by about 1.5 goals, which can be explained by their +29 goal differential, which is 17 goals higher than the second-best team’s. Basically, Bayern often score lots of goals and win big, as they are no strangers to four- or five-goal wins. In fact, they are on track to shatter the Bundesliga goal differential record. Their consistency is also reflected in their ratings from last season: they weren’t as good as this season, but they were still significantly better than the rest of the league. Bayern are currently sitting on nine consecutive league titles, tied for the longest streak in Europe’s top five leagues, and they are easily on track to win their tenth.²¹

Another interesting team is SC Freiburg, who currently sit third after finishing tenth in the
league last season. SC Freiburg is a great example of how competitive the Bundesliga is aside
from Bayern, as there is often great fluctuation in the middle of the table. Last season, SC
Freiburg recorded a 0.49 Colley rating and a Massey rating of essentially 0, which have
improved significantly this season to 0.68 and 0.77. One reason for this is simply that
Freiburg have recorded good results against the typical top three of Bayern, Dortmund, and
RB Leipzig. Sometimes a long explanation isn’t necessary, as a team is just finding a good run
of form and getting results, which is the case for SC Freiburg.

21
“9 facts on Bayern's 9 league titles in a row,” FCBayern.com, May 22, 2021,
https://fcbayern.com/en/news/2021/05/champions-2021/9-facts-on-bayerns-9-league-titles-in-a-row.

Wolfsburg is similar to CA Osasuna in La Liga, as they currently have a Colley rating above
0.5 and a negative Massey rating. Wolfsburg currently has a goal differential of zero, which
explains why their Massey rating is so low. It seems that they are getting results but, when
they lose, they lose by multiple goals. One possible explanation is that Wolfsburg struggles
to play from behind; they may be a team that needs to get the lead and keep it rather than
trying to come from behind. Teams that play a counter-attacking style often have this problem:
if they get a lead early, they can sit back, play defense, and keep the lead. If a team like
this concedes first, it is hard to get back into the game because the style is defensive, and
if the team is struggling to defend, more goals can often be poured on throughout a match.
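The Wolfsburg pattern (a Colley rating above 0.5 alongside a negative Massey rating) can be reproduced on a toy league. The sketch below is a minimal, self-contained R example and is not the code used for the tables: the three teams A, B and C and their four results are made up for illustration, and all variable names are assumptions. It follows the same conventions as the appendix code, where a draw adds nothing to either side of the Colley system and the last Massey equation is replaced by a sum-to-zero constraint.

```r
# Hypothetical results: A wins twice by one goal and loses once by four.
games <- data.frame(
  home    = c("A", "A", "B", "B"),
  visitor = c("B", "C", "A", "C"),
  hgoal   = c(1, 1, 4, 1),
  vgoal   = c(0, 0, 0, 1),
  stringsAsFactors = FALSE
)
teams <- sort(unique(c(games$home, games$visitor)))
n <- length(teams)
h <- match(games$home, teams)     # row index of each home team
v <- match(games$visitor, teams)  # row index of each visiting team

# Colley: start from 2*I and b = 1, then add one game at a time.
C <- diag(2, n)
b <- rep(1, n)
for (i in seq_len(nrow(games))) {
  C[h[i], h[i]] <- C[h[i], h[i]] + 1
  C[v[i], v[i]] <- C[v[i], v[i]] + 1
  C[h[i], v[i]] <- C[h[i], v[i]] - 1
  C[v[i], h[i]] <- C[v[i], h[i]] - 1
  outcome <- sign(games$hgoal[i] - games$vgoal[i])  # 1 home win, 0 draw, -1 home loss
  b[h[i]] <- b[h[i]] + outcome / 2
  b[v[i]] <- b[v[i]] - outcome / 2
}
colley <- setNames(solve(C, b), teams)

# Massey: one row per game (+1 home, -1 visitor), right-hand side is the goal margin.
M <- matrix(0, nrow = nrow(games), ncol = n)
M[cbind(seq_len(nrow(games)), h)] <- 1
M[cbind(seq_len(nrow(games)), v)] <- -1
A <- t(M) %*% M
p <- as.vector(t(M) %*% (games$hgoal - games$vgoal))
A[n, ] <- 1  # replace the last equation so that the ratings sum to zero
p[n] <- 0
massey <- setNames(solve(A, p), teams)

print(round(colley, 3))
print(round(massey, 3))
```

In this made-up league, team A has the best Colley rating (above 0.5) because it wins more often than it loses, yet its Massey rating is negative because one heavy defeat outweighs two narrow wins.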

RB Leipzig is also very interesting, as they sit behind Wolfsburg even though they have a
+12 goal differential. They seem to be the opposite case of Wolfsburg, as they have lots of
big wins but struggle to see games out at times. This is reflected in their 0.58 Colley rating
and 1.07 Massey rating: they are winning a little more than they lose but rate about a goal a
game better than the average team. RB Leipzig is likely an attack-minded team that jumps on
and overwhelms teams at times but struggles to keep teams out of their goal in close games.
This is often the case with attack-minded teams, as they are so focused on attacking that they
often give openings to opponents defensively.

Italy

Team | 2020 Colley Rating | 2020 Massey Rating | 2020 League Finish | 2021 Colley Rating | 2021 Massey Rating | 2021 League Standings
SSC Napoli | 0.678571429 | 1.125 | 5 | 0.789268969 | 1.342770976 | 1
AC Milan | 0.702380952 | 0.825 | 2 | 0.839951107 | 1.172794877 | 2
Inter | 0.797619048 | 1.35 | 1 | 0.726857901 | 1.354181038 | 3
Atalanta | 0.702380952 | 1.075 | 3 | 0.630683448 | 0.505180842 | 4
Lazio Roma | 0.607142857 | 0.15 | 6 | 0.607710026 | 0.484171284 | 5
AS Roma | 0.583333333 | 0.325 | 7 | 0.531322235 | 0.457954135 | 6
ACF Fiorentina | 0.416666667 | -0.3 | 13 | 0.493813691 | 0.170358089 | 7
Juventus | 0.702380952 | 0.975 | 4 | 0.556460024 | 0.255323973 | 8
Bologna FC | 0.416666667 | -0.35 | 12 | 0.552634734 | -0.272294703 | 9
Hellas Verona | 0.44047619 | -0.125 | 10 | 0.527965646 | 0.436575815 | 10
Empoli FC | 2nd Tier | 2nd Tier | 2nd Tier | 0.428171316 | -0.609616072 | 11
Torino FC | 0.404761905 | -0.475 | 17 | 0.418704579 | 0.14045488 | 12
Sassuolo Calcio | 0.583333333 | 0.2 | 8 | 0.394930625 | -0.199639815 | 13
Udinese Calcio | 0.404761905 | -0.4 | 14 | 0.475065222 | -0.17485086 | 14
Venezia | 2nd Tier | 2nd Tier | 2nd Tier | 0.2825868 | -1.121525282 | 15
Spezia Calcio | 0.404761905 | -0.5 | 15 | 0.315247838 | -1.28462013 | 16
Genoa CFC | 0.428571429 | -0.275 | 11 | 0.319483569 | -0.775355714 | 17
Sampdoria | 0.488095238 | -0.05 | 9 | 0.349141202 | -0.742475256 | 18
Cagliari Calcio | 0.380952381 | -0.4 | 16 | 0.260001067 | -1.139388079 | 20
Benevento | 0.357142857 | -0.875 | 18 | 2nd Tier | 2nd Tier | 2nd Tier
Parma FC | 0.25 | -1.1 | 20 | 2nd Tier | 2nd Tier | 2nd Tier
FC Crotone | 0.25 | -1.175 | 19 | 2nd Tier | 2nd Tier | 2nd Tier

( 24 )

The Italian league, or Serie A, used to be looked at as a one-horse race, similar to the
German and French leagues, but this is no longer the case. Juventus had their streak of nine
straight league titles snapped by Inter Milan last season.22 Juventus’s decline last season
can mostly be attributed to the rise of other teams in Serie A such as Inter Milan, AC Milan
and Atalanta. Each of these three teams has gone through a rebuilding phase and come out of it
with a strong squad that competes for the title. It seemed to be down more to the success of
other teams than to a lack of success from Juventus, as they competed well in terms of Massey
rating but simply couldn’t get results at as high a rate as Inter Milan.

Juventus’s decline this season has been very evident in their ratings, as they have now
slipped to a 0.56 Colley rating and a 0.26 Massey rating. They represent a similar case to FC
Barcelona, as Juventus recently sold Cristiano Ronaldo, a player considered to be on the same
level as Lionel Messi. Ronaldo is one of the best goal scorers of all time, and his impact
can be seen in Juventus dropping about 0.7 in their Massey rating. Juventus simply isn’t able
to produce the same kind of goalscoring numbers that they could with Ronaldo.
22
“9 Facts on Bayern.”

Two teams that seem to be running away with the league are AC Milan and SSC Napoli, as they
sit tied for first with 32 points and a 10-2-0 record after 12 matches. Both are 7 points
ahead of Inter Milan and 10 points ahead of the rest of the league. This is reflected in
their Colley ratings of 0.84 and 0.79, respectively, which are significantly higher than those
of the rest of the league. Their Massey ratings are also very high, but Inter Milan actually
has a higher Massey rating than AC Milan due to a higher goal differential. This is a good
example of why both rating systems are helpful: the Massey rating does not tell the whole
story, as Inter Milan is winning games by multiple goals but simply isn’t winning at as high a
rate as SSC Napoli and AC Milan. AC Milan is also a slightly more defensive side, meaning they
aren’t going to win by as many goals, but they are actually winning at the highest rate in the
league. Looking at both ratings together gives us a fuller picture of how well each team is
doing and how they are getting those results.
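The disagreement between the two systems can be made explicit by ranking the same teams under each rating. The short R sketch below uses the 2021 ratings of the top three Serie A sides copied from the table above; the data frame and column names are assumptions made for illustration, and the ranks are computed among these three teams only.

```r
# 2021 ratings for the top three Serie A sides, copied from the table above.
serie_a <- data.frame(
  team   = c("SSC Napoli", "AC Milan", "Inter"),
  colley = c(0.789268969, 0.839951107, 0.726857901),
  massey = c(1.342770976, 1.172794877, 1.354181038),
  stringsAsFactors = FALSE
)

# rank() gives 1 to the smallest value, so negate to rank the best rating first.
serie_a$colley_rank <- rank(-serie_a$colley)
serie_a$massey_rank <- rank(-serie_a$massey)

# A positive gap means the team looks better by win rate than by goal margin.
serie_a$rank_gap <- serie_a$massey_rank - serie_a$colley_rank

print(serie_a[order(serie_a$colley_rank), ])
```

AC Milan comes out first by Colley but last of the three by Massey, while Inter is the reverse, which is the pattern described above.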

England

Team | 2020 Colley Rating | 2020 Massey Rating | 2020 League Finish | 2021 Colley Rating | 2021 Massey Rating | 2021 League Standings
Chelsea | 0.619047619 | 0.55 | 4 | 0.757536698 | 2.077001825 | 1
Manchester City | 0.75 | 1.275 | 1 | 0.724155321 | 1.621678499 | 2
West Ham United | 0.595238095 | 0.375 | 6 | 0.675111538 | 0.811961967 | 3
Liverpool | 0.630952381 | 0.65 | 3 | 0.702864229 | 1.949731764 | 4
Arsenal | 0.55952381 | 0.4 | 8 | 0.608482664 | 0.002591043 | 5
Manchester United | 0.678571429 | 0.725 | 2 | 0.541686934 | 0.194241341 | 6
Brighton & Hove Albion | 0.428571429 | -0.15 | 16 | 0.55933229 | -0.04431741 | 7
Wolverhampton Wanderers | 0.44047619 | -0.4 | 13 | 0.45274818 | -0.454448925 | 8
Tottenham Hotspur | 0.571428571 | 0.575 | 7 | 0.519309932 | -0.48495213 | 9
Crystal Palace | 0.428571429 | -0.625 | 14 | 0.595285436 | 0.481343833 | 10
Everton | 0.547619048 | -0.025 | 10 | 0.455178988 | -0.412323541 | 11
Leicester City | 0.595238095 | 0.45 | 5 | 0.509906812 | -0.191002023 | 12
Southampton | 0.416666667 | -0.525 | 15 | 0.485571437 | -0.180914163 | 13
Brentford | 2nd Tier | 2nd Tier | 2nd Tier | 0.44956253 | 0.09853325 | 14
Leeds United | 0.535714286 | 0.2 | 9 | 0.388380916 | -0.847345265 | 15
Aston Villa | 0.511904762 | 0.225 | 11 | 0.349002304 | -0.583735734 | 16
Watford | 2nd Tier | 2nd Tier | 2nd Tier | 0.312140569 | -0.980788039 | 17
Burnley | 0.392857143 | -0.55 | 17 | 0.376786759 | -0.324302218 | 18
Newcastle United | 0.44047619 | -0.4 | 12 | 0.27970048 | -1.0919957 | 19
Norwich City | 2nd Tier | 2nd Tier | 2nd Tier | 0.257255983 | -1.640958374 | 20
Fulham | 0.321428571 | -0.65 | 18 | 2nd Tier | 2nd Tier | 2nd Tier
West Bromwich Albion | 0.297619048 | -1.025 | 19 | 2nd Tier | 2nd Tier | 2nd Tier
Sheffield United | 0.238095238 | -1.075 | 20 | 2nd Tier | 2nd Tier | 2nd Tier

( 25 )

The English league, or Premier League, is widely considered the most popular and
competitive league at the moment due to the amount of competition for first place. There are
usually 3-5 teams in a given year that have the talent and depth to win the league, which
makes it very interesting for fans to follow. This competitiveness is also reflected in how
close the title race is, as 6 points separate 1st through 5th in the league. Perhaps the most
interesting team in the league right now is Arsenal, who started the season at the bottom of
the league with three straight losses but now sit in fifth having not lost a game since. They
have a goal differential of zero, which explains their Massey rating of about 0, but their
Colley rating of 0.61 shows that they are finding ways to get results. Arsenal has ground
their way back into the title race by winning close games. This may not be a recipe for
long-term success, as a team that often competes in close games is very susceptible to losing
more of those games than it wins, but for now Arsenal is finding ways to get results.

The team I personally support, Tottenham, find themselves in a similar part of the table
as last season but rate lower due to a negative Massey rating. We are finding ways to barely
squeak out results, but many bad losses have left us with our negative Massey rating. The
difference in Massey rating between years is largely due to Tottenham’s lack of goal scoring
output, specifically the lack of scoring from Harry Kane. Kane is considered one of the best
goal scorers in the league, and he has been way off the mark this season, scoring one goal in
twelve games.23 This is one of the main reasons for Tottenham’s lack of success, specifically
in the goal scoring department.

Manchester United is another team that has struggled in terms of goal scoring, as they
have dropped about half a goal in their Massey rating. This is very surprising given that the
team signed an incredible goal scorer in Cristiano Ronaldo this offseason. It seems that
Ronaldo has not hit his usual form yet for United, and other players have not matched their
form from last season, which brought them a second-place finish in the league. This may simply
be down to the competitiveness of the Premier League, as even the top teams struggle with poor
runs of form at times, and it seems that this is the case for United. They struggled similarly
last year but were able to put a good run of form together towards the end of the season to
secure second. This could well happen again this season once Ronaldo begins to hit his stride.

League leaders Chelsea are an interesting team, as they boast the highest goal
differential at +23 while allowing the fewest goals. They are winning more than last season
and are also beating the average opponent by about 1.5 more goals than last season. This
success is most likely due to recent player acquisitions and a new manager. Chelsea has bought
many players over the past few seasons who have helped strengthen the squad, and it seems that
many of them are finding their form. New manager Thomas Tuchel also seems to be doing very
well at implementing a system that works for Chelsea, as they are having no problems scoring
goals and keeping teams off the score sheet. If this keeps up, it is very possible we could
see Chelsea hoist another league title come spring.

23
“2021/22 Player Stats,” Premier League, accessed November 29, 2021, https://www.premierleague.com/stats.

The team right behind Chelsea in terms of goal differential is Liverpool, who boast a +20
goal differential. This is reflected in their Massey rating of almost 2, so their fourth-place
standing suggests that their struggles may come defensively. Their goalscoring is owed to the
form of league-leading scorer and assister Mohamed Salah, who is widely considered to be
having the best season of any player on the planet at the moment. The struggles are thus
likely coming on the defensive end, as Liverpool is struggling to keep teams off the
scoreboard and secure results in close games at a rate similar to Chelsea. Their Colley rating
is about 0.05 lower than that of the current league leaders, suggesting that they are very
close but a few games have kept them from truly having an elite season.

League Leaders

Team | 2021 Colley Rating | 2021 Massey Rating | Combined Rating | League
Bayern Munchen | 0.797170809 | 2.521849066 | 3.319019875 | Germany
Chelsea | 0.757536698 | 2.077001825 | 2.834538523 | England
Manchester City | 0.724155321 | 1.621678499 | 2.34583382 | England
SSC Napoli | 0.789268969 | 1.342770976 | 2.132039946 | Italy
Inter | 0.726857901 | 1.354181038 | 2.081038939 | Italy
AC Milan | 0.839951107 | 1.172794877 | 2.012745984 | Italy
Paris Saint-Germain | 0.813759151 | 1.139249787 | 1.953008938 | France
Real Madrid | 0.721918594 | 1.116874391 | 1.838792985 | Spain
Real Madrid 0.721918594 1.116874391 1.838792985 Spain

( 26 )

After compiling the rankings from each of the top five leagues into separate tables, it
seemed logical to find out who the best teams are in terms of league performance. The combined
rating column was thus born; it simply consists of a team’s Colley rating plus their Massey
rating. This led to the table above, which consists of eight teams drawn from the top five
leagues. One must keep in mind that this is simply a measure of a team’s performance in their
given league, as it does not consider cup or European tournament competitions. The dominance
of Chelsea and Bayern in their given leagues is highlighted in this table, as they both boast
a combined rating roughly 0.5 higher than all other teams, with Bayern nearly a whole point
ahead. The remaining six teams are separated by only about 0.5 points, suggesting similar
dominance in their given leagues.
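The combined rating described above is a plain sum, which can be sketched in a few lines of R. The Colley and Massey values below are copied from the League Leaders table; the data frame and column names are assumptions made for illustration and are not taken from the project code.

```r
# 2021 ratings copied from the League Leaders table.
leaders <- data.frame(
  team   = c("Bayern Munchen", "Chelsea", "Manchester City", "SSC Napoli",
             "Inter", "AC Milan", "Paris Saint-Germain", "Real Madrid"),
  colley = c(0.797170809, 0.757536698, 0.724155321, 0.789268969,
             0.726857901, 0.839951107, 0.813759151, 0.721918594),
  massey = c(2.521849066, 2.077001825, 1.621678499, 1.342770976,
             1.354181038, 1.172794877, 1.139249787, 1.116874391),
  league = c("Germany", "England", "England", "Italy",
             "Italy", "Italy", "France", "Spain"),
  stringsAsFactors = FALSE
)

# Combined rating = Colley rating + Massey rating, sorted best first.
leaders$combined <- leaders$colley + leaders$massey
leaders <- leaders[order(leaders$combined, decreasing = TRUE), ]

# Distance of every team from the top combined rating.
leaders$gap_to_leader <- leaders$combined[1] - leaders$combined

print(leaders[, c("team", "league", "combined", "gap_to_leader")])
```

Because Colley ratings fall between 0 and 1 while Massey ratings are measured in goals, the sum is driven mostly by the Massey term; this is worth keeping in mind when reading the combined column.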

This table begins to bring up the question of who truly is the best team currently in

European soccer. One could simply look at this and choose Bayern because of their dominance

in their league but without connections between leagues it is hard to say whether this is fully true

or not. UEFA’s Association club coefficients help us better decipher this as they give a concrete

rating and ranking of each league:

( 27 )

Even though Bayern is the best performer relative to their league, the German league is
actually rated the fourth best of the five leagues. This added information may sway someone
who has to place money on who they think the European champion will be at the end of the
season, as it seems that Chelsea may be the best bet. This is still fairly speculative, as
connections between leagues would be helpful, but the data used in this project did not allow
for that.

Going Forward/Conclusion

Going forward, the biggest aspect missing from this project was connections between the
leagues. These would’ve allowed for comparison between leagues and, in turn, a legitimate
ranking of all teams together. Unfortunately, the data used in this project did not have
up-to-date Champions League data, which would’ve provided the connections we needed. This also
could’ve allowed for mapping of rankings, as we would’ve been able to map each league and then
map them to one another. Mapping each league was a possibility, but time constraints
prohibited that option. Other ranking methods, such as Elo, also would’ve been beneficial to
explore and compare. Both the Colley and Massey methods proved to be effective at doing what
they are intended to do, but they both have drawbacks. It would’ve been nice to see another
common method such as Elo added to the tables to see how the results would’ve changed. Again,
time constraints kept this from happening, but it is still a reasonable concept to explore.
The last aspect that could’ve been explored is coming up with a new ranking method. This was
considered, but because of the time spent troubleshooting code and understanding our two
current methods, we were unable to do this. One thought was to make a ranking that used
statistics particular to soccer like possession, expected goals, shots, etc. Again, this
would’ve been a time-consuming process that wasn’t feasible within the project time window.

In conclusion, this project set out to explore the Colley and Massey methods in
European soccer. We were able to successfully apply these methods to 2020 and 2021 league
data for each of Europe’s top five leagues, producing rankings and tables. The two ranking
methods offered different approaches that provided more context to league table standings.
The Massey method diverged more from the league tables than the Colley method did, suggesting
that the Colley method aligns more closely with common league rankings. We were unable to
fully answer the question of who the best team in Europe is at the moment, but we were able to
get close in our table featuring the top league performers thus far. This seems a fitting end
to the project, as ultimately the question of who is best comes down to opinion until the
Champions League final is played, and even that may not fully satisfy some fans.

Appendix

France
france_current <- function(Season=2021){

s1<-s2<-myseason<-f1<-f2<-df1<-NULL
myseason<-Season
s2<-as.numeric(substr(myseason,3,4))
s1 <- s2+1

#f2=read.csv("http://www.football-data.co.uk/mmz4281/1617/F2.csv")
#df1 <- rbind(engsoccerdata::getCurrentData(f1,'F1',1),engsoccerdata::getCurrentData(f2,'F2',2))
f1=read.csv(paste0("http://www.football-data.co.uk/mmz4281/",s2,s1,"/F1.csv"))
df1 <- rbind(engsoccerdata::getCurrentData(f1,'F1',1,Season=myseason))

df1$Date <- as.Date(df1$Date, format="%Y-%m-%d")

fran <- engsoccerdata::france
if(identical(max(df1$Date), max(fran$Date))) warning("The returned dataframe contains data already included in 'france' dataframe")
tm <- engsoccerdata::teamnames
df1$home <- tm$name[match(df1$home,tm$name_other)]
df1$visitor <- tm$name[match(df1$visitor,tm$name_other)]
return(df1)
}

france_2020 <- function(Season=2020){

s1<-s2<-myseason<-f1<-f2<-df1<-NULL
myseason<-Season
s2<-as.numeric(substr(myseason,3,4))
s1 <- s2+1

#f2=read.csv("http://www.football-data.co.uk/mmz4281/1617/F2.csv")
#df1 <- rbind(engsoccerdata::getCurrentData(f1,'F1',1),engsoccerdata::getCurrentData(f2,'F2',2))
f1=read.csv(paste0("http://www.football-data.co.uk/mmz4281/",s2,s1,"/F1.csv"))
df1 <- rbind(engsoccerdata::getCurrentData(f1,'F1',1,Season=myseason))

df1$Date <- as.Date(df1$Date, format="%Y-%m-%d")

fran <- engsoccerdata::france
if(identical(max(df1$Date), max(fran$Date))) warning("The returned dataframe contains data already included in 'france' dataframe")
tm <- engsoccerdata::teamnames
df1$home <- tm$name[match(df1$home,tm$name_other)]
df1$visitor <- tm$name[match(df1$visitor,tm$name_other)]
return(df1)
}

#Load the packages that provide %>%, filter() and as_tibble()
library(dplyr)
library(tibble)

#Filtering Teams

FourTeamDataSet <- france_current() %>% filter(tier == "1")

#FourTeamDataSet %>% filter(home %in% all_teams, visitor %in% all_teams)
#The away side is stored in the visitor column, so use it to build the team list
all_teams <- unique(c(FourTeamDataSet$home,FourTeamDataSet$visitor))
#add all teams set

#Filtering Teams

France2020 <- france_2020() %>% filter(tier == "1")

all_teams2020 <- unique(c(France2020$home,France2020$visitor))
#add all teams set

#Use Colley Current


A=matrix(rep(0,length(all_teams)^2),nrow=length(all_teams))
b=rep(1,length(all_teams))
diag(A)=rep(2,length(diag(A)))

for(i in 1:length(FourTeamDataSet$hgoal)){
Team1=match(FourTeamDataSet$home[i],all_teams)
Team2=match(FourTeamDataSet$visitor[i],all_teams)

A[ Team1,Team2 ]=A[ Team1 , Team2] -1;


A[ Team2, Team1 ]=A[ Team2 ,Team1 ]-1;
A[ Team1 ,Team1 ]=A[ Team1 ,Team1 ]+1;
A[ Team2 ,Team2 ]=A[ Team2 ,Team2 ]+1;

if(FourTeamDataSet$hgoal[i]>FourTeamDataSet$vgoal[i]){
Share1 = 0.5
} else if(FourTeamDataSet$hgoal[i]==FourTeamDataSet$vgoal[i]){
Share1 = 0
} else {
Share1 = -0.5
}

Share2=-Share1

b[ Team1 ]=b[ Team1 ]- Share2


b[ Team2 ]=b[ Team2 ]- Share1
}

Ratingscolley=solve(A,b)

rankedteamscolley<-cbind(all_teams,as.numeric(Ratingscolley))
rankedteamscolley<-rankedteamscolley[ order(Ratingscolley,decreasing=TRUE), ]
rankingscolley<-cbind(seq(1,length(Ratingscolley)),rankedteamscolley)%>%
as_tibble()

## Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if `.name_repair` is omitted as of tibble 2.0.0.
## Using compatibility `.name_repair`.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

names(rankingscolley)<-c("Ranking","Team","Rating")
##row.names(rankingscolley)<-seq(nrow(rankingscolley))
write.csv(rankingscolley, paste("France Current Rankings Colley ",
format(Sys.time(),"%Y %m %d"),".csv",sep=""), row.names = FALSE)

#Massey
M=matrix(rep(0,length(all_teams)*length(FourTeamDataSet$hgoal)),nrow=length(FourTeamDataSet$hgoal))
b=rep(0,length(FourTeamDataSet$hgoal))

for(i in 1:length(FourTeamDataSet$hgoal)){
Team1=match(FourTeamDataSet$home[i],all_teams)
Team2=match(FourTeamDataSet$visitor[i],all_teams)

M[ i,Team1]=M[i , Team1] +1;


M[ i,Team2 ]=M[ i ,Team2 ]-1;

FourTeamDataSet$hgoal[i] - FourTeamDataSet$vgoal[i] -> goaldifference

b[ i ]=goaldifference
}

A=t(M)%*%M
A[length(all_teams),]=rep(1,length(all_teams))
b=t(M)%*%b
b[length(all_teams)]=0

Ratingsmassey = solve(A,b)

rankedteamsmassey<-cbind(all_teams,as.numeric(Ratingsmassey))

rankedteamsmassey<-rankedteamsmassey[ order(Ratingsmassey,decreasing=TRUE), ]
rankingsmassey<-cbind(seq(1,length(Ratingsmassey)),rankedteamsmassey)%>%
as_tibble()
names(rankingsmassey)<-c("Ranking","Team","Rating")
##row.names(rankingsmassey)<-seq(nrow(rankingsmassey))
write.csv(rankingsmassey, paste("France Rankings Massey ",
format(Sys.time(),"%Y %m %d"),".csv",sep=""), row.names = FALSE)

#Use Colley 2020


A=matrix(rep(0,length(all_teams2020)^2),nrow=length(all_teams2020))
b=rep(1,length(all_teams2020))
diag(A)=rep(2,length(diag(A)))

for(i in 1:length(France2020$hgoal)){
Team1=match(France2020$home[i],all_teams2020)
Team2=match(France2020$visitor[i],all_teams2020)

A[ Team1,Team2 ]=A[ Team1 , Team2] -1;


A[ Team2, Team1 ]=A[ Team2 ,Team1 ]-1;
A[ Team1 ,Team1 ]=A[ Team1 ,Team1 ]+1;
A[ Team2 ,Team2 ]=A[ Team2 ,Team2 ]+1;

if(France2020$hgoal[i]>France2020$vgoal[i]){
Share1 = 0.5
} else if(France2020$hgoal[i]==France2020$vgoal[i]){
Share1 = 0
} else {
Share1 = -0.5
}

Share2=-Share1

b[ Team1 ]=b[ Team1 ]- Share2


b[ Team2 ]=b[ Team2 ]- Share1
}

Ratingscolley2020=solve(A,b)

rankedteamscolley2020<-cbind(all_teams2020,as.numeric(Ratingscolley2020))
rankedteamscolley2020<-rankedteamscolley2020[
order(Ratingscolley2020,decreasing=TRUE), ]
rankingscolley2020<-
cbind(seq(1,length(Ratingscolley2020)),rankedteamscolley2020)%>% as_tibble()
names(rankingscolley2020)<-c("Ranking","Team","Rating")
##row.names(rankingscolley)<-seq(nrow(rankingscolley))

write.csv(rankingscolley2020, paste("France 2020 Rankings Colley ",


format(Sys.time(),"%Y %m %d"),".csv",sep=""), row.names = FALSE)

#Massey 2020
M=matrix(rep(0,length(all_teams2020)*length(France2020$hgoal)),nrow=length(France2020$hgoal))
b=rep(0,length(France2020$hgoal))

for(i in 1:length(France2020$hgoal)){
Team1=match(France2020$home[i],all_teams2020)
Team2=match(France2020$visitor[i],all_teams2020)

M[ i,Team1]=M[i , Team1] +1;


M[ i,Team2 ]=M[ i ,Team2 ]-1;

France2020$hgoal[i] - France2020$vgoal[i] -> goaldifference

b[ i ]=goaldifference
}

A=t(M)%*%M
A[length(all_teams2020),]=rep(1,length(all_teams2020))
b=t(M)%*%b
b[length(all_teams2020)]=0

Ratingsmassey2020 = solve(A,b)

rankedteamsmassey2020<-cbind(all_teams2020,as.numeric(Ratingsmassey2020))
rankedteamsmassey2020<-rankedteamsmassey2020[
order(Ratingsmassey2020,decreasing=TRUE), ]
rankingsmassey2020<-
cbind(seq(1,length(Ratingsmassey2020)),rankedteamsmassey2020)%>% as_tibble()
names(rankingsmassey2020)<-c("Ranking","Team","Rating")
##row.names(rankingsmassey)<-seq(nrow(rankingsmassey))
write.csv(rankingsmassey2020, paste("France 2020 Rankings Massey ",
format(Sys.time(),"%Y %m %d"),".csv",sep=""), row.names = FALSE)

Above is all code used to produce rankings for the French league. To modify for other leagues,

different functions were written that took in data for the specific league. Titles were also

changed to that country so it was clear which league we were using. Otherwise, most of the code

is uniform across all five leagues.
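Since the league-specific functions differ mainly in the CSV they download, the per-league part can be sketched as a single helper. The R example below is hypothetical and is not the code used in this project: `league_codes` and `season_url()` are names made up for illustration. Only the code F1 for France appears in the code above; the other four codes are assumptions based on football-data.co.uk's file naming and should be checked against the site before use. The helper only builds the address and does not download anything.

```r
# Assumed football-data.co.uk file codes; only "F1" is confirmed by the code above.
league_codes <- c(England = "E0", Spain = "SP1", Italy = "I1",
                  Germany = "D1", France = "F1")

# Hypothetical helper: build the results CSV address for a league and season.
# A season is identified by its starting year, as in france_current(Season=2021).
season_url <- function(league, season = 2021) {
  code <- league_codes[[league]]           # errors for an unknown league name
  yy <- as.numeric(substr(season, 3, 4))   # 2021 -> 21
  folder <- sprintf("%02d%02d", yy, (yy + 1) %% 100)  # 21 -> "2122"
  paste0("http://www.football-data.co.uk/mmz4281/", folder, "/", code, ".csv")
}

# Print the address for every league in the 2021 season.
for (lg in names(league_codes)) {
  cat(lg, ":", season_url(lg, 2021), "\n")
}
```

With a helper like this, a single function taking the league name as an argument could stand in for the five near-identical loaders, with the Colley and Massey sections left unchanged.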



Works Cited

Chartier, Tim, Erich Kreutzer, Amy Langville, and Kathryn Pedings. “Bracketology: How Can
Math Help?,” Dolciani Mathematical Expositions 43, no. 1 (2010): 55–70.
doi:10.5948/upo9781614442004.006.
ESPN. “MLS playoffs: Real Salt Lake stun Sounders in penalty shootout,” Match Report.
Accessed November 28, 2021. https://www.espn.com/soccer/report/_/gameId/621672.
FIFA. “Men's Ranking Procedure.” Accessed March 15, 2021.
https://www.fifa.com/fifa-worldranking/procedure/men.
Google. “2019 Champions League Table.” Accessed April 8, 2021.
https://www.google.com/search?q=2019+champions+league+table&sxsrf=AOaemvKS0-
5pJaIiTlxHHZfDEyrxrKVyEA
%3A1638400852377&ei=VAOoYbbFFs3dtAaBg6WACw&oq=2019+champions+leagu
e+&gs_lcp=Cgdnd3Mtd2l6EAEYATIECAAQQzIKCAAQgAQQhwIQFDIECAAQQzI
ECAAQQzIECAAQQzIECAAQQzIECAAQQzIFCAAQgAQyBQgAEIAEMgUIABCA
BDoHCCMQsAMQJzoHCAAQRxCwAzoKCC4QyAMQsAMQQ0oECEEYAFDkAljk
AmDiCmgBcAJ4AIABXogBXpIBATGYAQCgAQHIAQ_AAQE&sclient=gws-
wiz#sie=lg;/g/11c743v3x1;2;/m/0c1q0;st;fp;1;;.
Marcotti, Gabriele. “Why Lionel Messi's Ballon d'Or win shouldn't make you angry.” ESPN,
November 30, 2021.
https://www.espn.com/soccer/blog-marcottis-musings/story/4535190/why-messis-ballon-
dor-win-shouldnt-make-you-angry.
Premier League. “2021/22 Player Stats.” Accessed November 29, 2021.
https://www.premierleague.com/stats.
Storey, Daniel. “Lille and Inter Milan’s post-title financial struggles show why football may be
broken beyond repair.” INews, August 13, 2021. https://inews.co.uk/sport/football/lille-
inter-milan-european-football-finances-1147364.
UEFA. “How the Club Coefficients Are Calculated.” Last modified August 23, 2019.
https://www.uefa.com/memberassociations/uefarankings/club/ahbout/.
“9 facts on Bayern's 9 league titles in a row.” FCBayern.com, May 22, 2021.
https://fcbayern.com/en/news/2021/05/champions-2021/9-facts-on-bayerns-9-league-
titles-in-a-row.
