Wednesday, 14 August 2024

Web Tutorial: Python Matplotlib Heatmap (Part 1/2)

How's everyone doing?

After all we've been through with the Liverpool Football Club dataset, here's one more!

Today we'll make a heatmap using Python's matplotlib library. It's rather like a line chart in the sense that we plot data over two axes. The difference is that the data will be a two-dimensional list (a list of lists) rather than a flat one. Thus, while the data gathering is very straightforward - we ask the user if they want to see goals or appearances, and that's it! - the data massaging is a little more complex.

Let's start by taking the code for the line chart and renaming the lineChart() function to heatMap(). In place of the labels argument, we will have two arguments - seasons and players. Clear everything else except for the seasonName() function; we will still need that. Needless to say, we'll need the data as well.
import numpy as np
import matplotlib.pyplot as plt

def heatMap(seasons, players, vals, stat):
  pass  # placeholder so the script runs; the body comes in Part 2

def seasonName(year):
  return str(year) + "/" + str(year + 1)

data = {
  2017: {
    "Mohamed Salah": {"goals": 44, "appearances": 52},
    "Roberto Firminho": {"goals": 27, "appearances": 54},
    "Sadio Mane": {"goals": 20, "appearances": 44},
    "Alex Oxlade-Chamberlain": {"goals": 5, "appearances": 42}
  },
  2018: {
    "Mohamed Salah": {"goals": 27, "appearances": 52},
    "Roberto Firminho": {"goals": 16, "appearances": 48},
    "Sadio Mane": {"goals": 26, "appearances": 50},
    "Alex Oxlade-Chamberlain": {"goals": 0, "appearances": 2}
  },


After the data portion, clear everything; we'll start relatively fresh. Declare two lists - players and seasons. These will be used as arguments in the heatMap() function later.
  2022: {
    "Mohamed Salah": {"goals": 30, "appearances": 51},
    "Roberto Firminho": {"goals": 13, "appearances": 35},
    "Alex Oxlade-Chamberlain": {"goals": 1, "appearances": 13},
    "Diogo Jota": {"goals": 7, "appearances": 28},
    "Luis Diaz": {"goals": 5, "appearances": 21}
  }
}

players = []
seasons = []


Use a For loop. For every element in data, use seasonName() to derive the season name and then append to seasons. At the conclusion of the loop, you should have a full list of seasons.
players = []
seasons = []

for season in data:
  seasons.append(seasonName(season))
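
To see what this loop produces, here is a minimal sketch that runs on its own. The two-season dictionary named sample is an assumption for illustration only; the real data dictionary holds every season and the player stats.

```python
def seasonName(year):
  return str(year) + "/" + str(year + 1)

# Stand-in for the real dataset: only the keys (starting years) matter here.
sample = {2017: {}, 2018: {}}

seasons = []
for season in sample:
  # Iterating over a dictionary yields its keys, in insertion order.
  seasons.append(seasonName(season))

print(seasons)  # ['2017/2018', '2018/2019']
```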


Also in the loop, derive the current season's list of players by calling the keys() method on the current element of data and passing the result to the list() function. We can then add that entire list to players using the addition operator.
for season in data:
  seasons.append(seasonName(season))
  players = players + list(data[season].keys())


Now players holds a full list of players as well, but many of them will be repeats. What we'll do is use the fromkeys() method of Python's dict class to convert the list into a dictionary (removing all duplicates in the process, since dictionary keys are unique), then use the list() function to convert it back to a list. And finally, we'll sort players using the sort() method.
for season in data:
  seasons.append(seasonName(season))
  players = players + list(data[season].keys())
  
players = list(dict.fromkeys(players))
players.sort()
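
Here's a small standalone sketch of that de-duplication step, in case you want to watch it work in isolation. The short list of names is made up for the example; in the tutorial, players is built from the dataset.

```python
# A list with repeats, as we'd get after concatenating several seasons.
players = ["Mohamed Salah", "Sadio Mane", "Mohamed Salah", "Diogo Jota", "Sadio Mane"]

# Dictionary keys are unique, so fromkeys() drops repeats
# while keeping the order in which each name first appeared.
unique = list(dict.fromkeys(players))
print(unique)  # ['Mohamed Salah', 'Sadio Mane', 'Diogo Jota']

# sort() reorders the list in place, alphabetically.
unique.sort()
print(unique)  # ['Diogo Jota', 'Mohamed Salah', 'Sadio Mane']
```

Since we sort at the end anyway, sorted(set(players)) would give the same result here; the fromkeys() route is mainly useful when the original order matters.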


We follow up by declaring ans as an empty string, then creating a While loop that runs as long as ans is not 0.
players = list(dict.fromkeys(players))
players.sort()
  
ans = ""

while (ans != 0):


In the loop, print the options - "1" for goals, "2" for appearances, and "0" to exit.
ans = ""

while (ans != 0):
  print ("1: goals")
  print ("2: appearances")
  print ("0: Exit")


Inside this loop, after the print statements, create a nested infinite While loop.
while (ans != 0):
  print ("1: goals")
  print ("2: appearances")
  print ("0: Exit")

  while True:


Inside this While loop, we have a Try-except block.
while True:
  try:

  except ValueError:


Here, we'll use the input() function to ask the user for input, then use the int() function to convert the input to an integer. If the user enters something that isn't a whole number, such as a letter, int() raises a ValueError, which we will then handle by showing a message with the print() function.
while True:
  try:
    int(input("Select a stat"))
  except ValueError:
    print("Invalid option. Please try again.")


Now, assign the value of the input to ans, and then use a break statement. This means that if the input is valid and no exception is raised, we break out of the infinite loop; otherwise the break is skipped and the question is asked again.
while True:
  try:
    ans = int(input("Select a stat"))
    break
  except ValueError:
    print("Invalid option. Please try again.")
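
If you'd like to try this retry pattern without typing at the keyboard, here's a minimal sketch. The read_int() helper and its reader parameter are my own inventions for the example, not part of the tutorial's script; the reader just lets us feed in canned answers in place of input().

```python
def read_int(prompt, reader=input):
  # Keep asking until the text converts cleanly to an integer.
  while True:
    try:
      return int(reader(prompt))
    except ValueError:
      print("Invalid option. Please try again.")

# Simulate a user typing "hello" and then "2", so the sketch runs unattended.
fake_answers = iter(["hello", "2"])
ans = read_int("Select a stat: ", reader=lambda prompt: next(fake_answers))
print(ans)  # 2
```

Returning from inside the Try block does the same job as the break statement above: we only leave the loop once int() has succeeded.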


Then, depending on the value of ans, assign a value to stat. If ans is 0, we break out of the outer loop and the script ends. If ans is anything other than 0, 1 or 2, we use continue to restart the loop and show the options again.
while (ans != 0):
  print ("1: goals")
  print ("2: appearances")
  print ("0: Exit")

  while True:
    try:
      ans = int(input("Select a stat"))
      break
    except ValueError:
      print("Invalid option. Please try again.")
  
  if (ans == 1): stat = "goals"
  if (ans == 2): stat = "appearances"
  if (ans == 0): break
  if (ans > 2 or ans < 0): continue
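
As an aside, the same option-to-stat mapping could also be written as a dictionary lookup. This is just a sketch of an alternative; OPTIONS and pick_stat() are hypothetical names that don't appear in the tutorial's script.

```python
# Hypothetical lookup table: menu number -> key used in the dataset.
OPTIONS = {1: "goals", 2: "appearances"}

def pick_stat(ans):
  # Returns the stat name, or None when the number is not on the menu.
  return OPTIONS.get(ans)

print(pick_stat(1))  # goals
print(pick_stat(2))  # appearances
print(pick_stat(5))  # None
```

A None result would then play the role of the continue branch above, and adding a third stat later would only mean adding one entry to the table.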


As you can see, I entered "hello" and it was rejected, then I entered "5" and "-1", both of which resulted in the question being asked again. It was only after I entered a "0" that the script terminated.


Still inside the outer While loop, after the If statements, we will need a two-dimensional list, stats. We're going to populate stats using a nested For loop. The outer loop iterates through the seasons. We can't use the list seasons for this because it contains the proper season names (such as "2017/2018") instead of the keys in the dataset (such as 2017), so we'll use a list of the keys in data. In the inner loop, we iterate through players.
if (ans == 1): stat = "goals"
if (ans == 2): stat = "appearances"
if (ans == 0): break
if (ans > 2 or ans < 0): continue
  
stats = []

for season in list(data.keys()):
  for player in players:


In the outer loop, we declare an empty list, arr. Then in the inner loop, we use an If block to check if player is in the element of data pointed to by season.
stats = []

for season in list(data.keys()):
  arr = []
  for player in players:
    if (player in data[season]):

    else:


If so, we append the relevant stat to arr. If not, we simply append a zero.
for season in list(data.keys()):
  arr = []
  for player in players:
    if (player in data[season]):
      arr.append(data[season][player][stat])
    else:
      arr.append(0)


And finally, in the outer loop, we append arr to stats. By the end of this, stats should be a two-dimensional list of stats for each player by season, with 0s where the player was not present during the season (or simply made no appearances or scored no goals).
for season in list(data.keys()):
  arr = []
  for player in players:
    if (player in data[season]):
      arr.append(data[season][player][stat])
    else:
      arr.append(0)
      
  stats.append(arr)
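
To make the shape of stats concrete, here's a minimal sketch of the same idea on a cut-down dataset. I've kept only two seasons and three players (the numbers are taken from the dataset above), and condensed the inner loop into a list comprehension, which is a stylistic choice of mine and not what the tutorial's script uses.

```python
data = {
  2017: {"Mohamed Salah": {"goals": 44, "appearances": 52},
         "Sadio Mane": {"goals": 20, "appearances": 44}},
  2022: {"Mohamed Salah": {"goals": 30, "appearances": 51},
         "Diogo Jota": {"goals": 7, "appearances": 28}},
}
players = ["Diogo Jota", "Mohamed Salah", "Sadio Mane"]
stat = "goals"

stats = []
for season in data:
  # One row per season; a player missing from that season gets a 0.
  row = [data[season][p][stat] if p in data[season] else 0 for p in players]
  stats.append(row)

print(stats)  # [[0, 44, 20], [7, 30, 0]]
```

Each inner list is one season, and each position within it lines up with the same position in players.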


And then we call the heatMap() function, passing in seasons and players for the labels, then stats as the values, and stat as the string variable that we'll use in the title.
for season in list(data.keys()):
  arr = []
  for player in players:
    if (player in data[season]):
      arr.append(data[season][player][stat])
    else:
      arr.append(0)
      
  stats.append(arr)

heatMap(seasons, players, stats, stat)
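
Before handing these arguments to a plotting function, it can be handy to confirm that the grid has one row per season and one column per player. Here's a rough sketch of such a check; check_grid() is a hypothetical helper of my own, not something the tutorial's script defines.

```python
def check_grid(seasons, players, vals):
  # One row per season...
  if len(vals) != len(seasons):
    return False
  # ...and one column per player in every row.
  return all(len(row) == len(players) for row in vals)

seasons = ["2017/2018", "2022/2023"]
players = ["Diogo Jota", "Mohamed Salah", "Sadio Mane"]

print(check_grid(seasons, players, [[0, 44, 20], [7, 30, 0]]))  # True
print(check_grid(seasons, players, [[0, 44, 20], [7, 30]]))     # False
```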


Next

We implement a heatmap using the data we have obtained!
