Another Python EDA project – this time we use pytrends (an unofficial Google Trends API) to scrape search data and illustrate how Expected Goals (xG) has gone mainstream.
xG has come a long way
Expected Goals (xG) has come a long way. Since Opta’s Sam Green first introduced the concept in 2012, use of the metric has grown rapidly.
Stats Perform’s Jonny Whitmore has published a detailed explanation of xG here.
“Expected Goals (or xG) measures the quality of a chance by calculating the likelihood that it will be scored from a particular position on the pitch during a particular phase of play”
Jonny Whitmore (Stats Perform)
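To make the definition concrete, here is a minimal sketch of how per-shot probabilities are commonly added up into a match total. The shot list, the player labels and the `total_xg` helper are hypothetical, and the probabilities are invented for illustration rather than taken from any real xG model.

```python
# Illustrative only: these shot probabilities are invented, not real match data.
shots = [
    {"player": "Striker A", "xg": 0.76},     # a very high-quality chance
    {"player": "Midfielder B", "xg": 0.05},  # a speculative long-range effort
    {"player": "Striker A", "xg": 0.31},     # a decent close-range chance
]


def total_xg(shot_list):
    """Sum the scoring probabilities of a list of shots."""
    return sum(shot["xg"] for shot in shot_list)


match_xg = total_xg(shots)
print(f"Expected goals from {len(shots)} shots: {match_xg:.2f}")
```

Each shot carries a value between 0 and 1, so a side that creates a few good chances can finish with a higher xG than one that takes many poor shots.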
Although Expected Goals is now widely accepted by the football community, it did face significant pushback. One of the most famous incidents came in November 2017, when Sky Sports presenter Jeff Stelling shared his thoughts on the metric after Arsène Wenger (then Arsenal manager) quoted xG in an interview.
The key Jeff Stelling quote reads as follows: “He’s the first person I’ve ever heard take any notice of expected goals, which must be the most useless stat in the history of football! What does it tell you? It’s absolute nonsense, it really is”.
The point here is that Jeff Stelling felt safe making these remarks at the time. Because Expected Goals had not yet gone mainstream, he could play up to the “real football men” and to anti-intellectualism, if you like.
Now, it’s different. It is unlikely that a presenter in Jeff Stelling’s position would belittle the metric in a similar manner.
To illustrate how Expected Goals has moved into the footballing mainstream, we can use some fairly straightforward Python code to connect to Google Trends, pull the search data for xG and visualize how interest has changed over time. The key packages here are pytrends (an unofficial Google Trends API), pandas and matplotlib.
The result of the code is a basic graph (which has been dressed up a bit in Adobe Photoshop and InDesign).

Python code
The Python code for connecting to Google Trends and building the line chart is as follows.
Import modules
from pytrends.request import TrendReq
import matplotlib.pyplot as plt
import pandas as pd
Define the Google Trends search term and the timeframe that we want to look at
search_terms = ["xG"]
timeframe = "2015-01-01 2023-05-26"
Before we issue an actual request, we need to initialize a TrendReq object
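pytrends = TrendReq(hl='en-GB', tz=360)
hl = the host language for accessing Google Trends (here en-GB, British English)
tz = the timezone offset in minutes (the pytrends documentation gives 360 as US Central Standard Time)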
Build the payload for the request and request the interest_over_time for the search term
The geo parameter narrows the results down to a specific country – in this case GB (the country code for the United Kingdom)
pytrends.build_payload(search_terms, timeframe=timeframe, geo="GB")
xgtrends = pytrends.interest_over_time()
We have built the xgtrends pandas dataframe – inspect the first ten rows as a sample
xgtrends.head(10)
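Alongside one column per search term, the dataframe returned by interest_over_time also carries an isPartial flag column, which is usually not wanted in a summary or a chart. The sketch below shows one way to drop it and average the interest per calendar year. It runs on a synthetic stand-in frame so that no request to Google is needed; the `sample` values, the monthly spacing and the `yearly_interest` helper are assumptions made for illustration, not real Google Trends output.

```python
import pandas as pd

# Synthetic stand-in for the frame returned by interest_over_time():
# a date index, one column per search term and an "isPartial" flag.
# The numbers are invented and the monthly spacing is an assumption.
dates = pd.date_range("2015-01-01", periods=36, freq="MS")
sample = pd.DataFrame(
    {"xG": [i % 12 + 10 * (i // 12) for i in range(36)], "isPartial": False},
    index=dates,
)
sample.index.name = "date"


def yearly_interest(df):
    """Drop the isPartial flag (if present) and average interest per calendar year."""
    cleaned = df.drop(columns=["isPartial"], errors="ignore")
    return cleaned.groupby(cleaned.index.year).mean()


yearly = yearly_interest(sample)
print(yearly)
```

With the real data the same helper could be called as yearly_interest(xgtrends). Because Google Trends values are relative to the peak within the chosen timeframe, the yearly averages are only comparable within the same request.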
Plot the data using matplotlib
def xgplot_searchterms(df):
    fig = plt.figure(figsize=(15, 8))
    ax = fig.add_subplot(111)
    df.plot(ax=ax)
    plt.title("Google Trends with the search term 'xG' (2015-2023)", loc='left')
    plt.ylabel('Google relative search term frequency')
    plt.xlabel('year')
    plt.ylim((0, 110))
    plt.legend(loc='lower right')
    return ax

plt.style.use('_mpl-gallery')
ax = xgplot_searchterms(xgtrends)
plt.savefig('stelling.png', dpi=300, bbox_inches='tight')
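One optional refinement is to mark the Jeff Stelling incident directly on the chart rather than adding it afterwards in a design tool. The sketch below is one way to do that with matplotlib's axvline. It uses synthetic placeholder values instead of real Google Trends data, the `plot_with_marker` helper is hypothetical, and 1 November 2017 simply stands in for "November 2017" because the exact date is not given above.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic placeholder values, NOT real Google Trends data.
dates = pd.date_range("2015-01-01", "2023-05-01", freq="MS")
demo = pd.DataFrame({"xG": list(range(len(dates)))}, index=dates)


def plot_with_marker(df, column, marker_date, label):
    """Plot one column over time and mark a single date with a dashed vertical line."""
    fig, ax = plt.subplots(figsize=(15, 8))
    ax.plot(df.index, df[column], label=column)
    ax.axvline(marker_date, color="grey", linestyle="--", label=label)
    ax.set_ylabel("Google relative search term frequency")
    ax.set_xlabel("year")
    ax.legend(loc="lower right")
    return ax


# Assumed marker date: the first day of the month named in the article.
ax = plot_with_marker(demo, "xG", datetime(2017, 11, 1), "Stelling remarks (Nov 2017)")
```

This version plots with matplotlib directly rather than through df.plot, so that the dates on the x-axis and the marker date share the same units.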
Feedback welcome
Feel free to contact Brian McDonnell by email on sixtwofourtwo@gmail.com. Brian can also be contacted via @sixtwofourtwo, on LinkedIn or on Instagram.