Footballwebscraper
This is a web scraper for football data from FBRef.com. It can scrape data from games in the top 5 domestic leagues. It can be easily edited to scrape data from other leagues, as well as from other competitions such as the Champions League, domestic cup games, friendlies, etc.
Football Data Science
This project aims to obtain the data available about a player's performance in their respective domestic leagues (e.g. English Premier League) and visualize their performance data in comparison to other players. We have broken up the analysis into 2 types of players in order to ensure a useful form of comparison.
Explaining the Data from fbref (https://fbref.com/en/)
We extracted the data required for our analysis from the FBRef website, which provides a match-by-match breakdown of each player's performance. The script used to extract it can be found here: https://github.com/hoyishian/fantasypldatascience/blob/main/fbref_scout_extraction.py
There are 2 kinds of players in the data we collected: outfield players and goalkeepers.
Goalkeeper Players
Goalkeepers have only 1 set of statistics, which is made up of 7 groups of goalkeeping statistics. The following screenshot is taken from the FBRef website:

Outfield Players
Outfield players have 4 sets of statistics. They are explained in the Appendix section of this README.
Running the Python File
Step 1: Navigate to folder containing fbref_scout_extraction.py
Step 2: Decide on type of web-scraping action to run
There are 2 types of web scraping that can be done:
Player Statistics by league
Extract all statistics for all domestic league games for players in a given league
Similar Players
Extract the top 10 most similar players for every player in a given domestic league. The top 10 most similar players may not be from the same domestic league that has been selected.
Step 3a: For obtaining Player Statistics
- Run the following command
python fbref_scout_extraction.py
- You will then be given the following prompt:
Enter League Here (EPL, Ligue 1, Bundesliga, Serie A, La Liga). Press Enter when ready:
- Enter the league of interest and press enter. It will take approximately 20 to 30 minutes to completely scrape all data for players in a given domestic league.
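The prompt accepts one of five league names. As a minimal sketch of how such input could be checked before a 20 to 30 minute scrape starts, the snippet below normalizes what the user typed and rejects anything outside the five listed leagues. The helper name `normalize_league` and the constant `SUPPORTED_LEAGUES` are hypothetical and are not taken from fbref_scout_extraction.py, whose own input handling is not shown in this README.

```python
# Hypothetical helper: not part of fbref_scout_extraction.py.
# The five names below are the ones listed in the prompt.
SUPPORTED_LEAGUES = ("EPL", "Ligue 1", "Bundesliga", "Serie A", "La Liga")


def normalize_league(raw):
    """Return the canonical league name for user input, or raise ValueError."""
    # Collapse repeated whitespace and ignore letter case.
    cleaned = " ".join(raw.split()).casefold()
    for league in SUPPORTED_LEAGUES:
        if league.casefold() == cleaned:
            return league
    options = ", ".join(SUPPORTED_LEAGUES)
    raise ValueError(f"Unsupported league: {raw!r}. Choose one of: {options}")


if __name__ == "__main__":
    print(normalize_league("  la   liga "))
```

Failing fast on a typo avoids waiting through a long run that would produce no data.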
Step 3b: For obtaining similar players
- Ensure that you comment out the following line and uncomment the next line
- Run the following command
python fbref_scout_extraction.py
- You will then be given the following prompt:
Enter League Here (EPL, Ligue 1, Bundesliga, Serie A, La Liga). Press Enter when ready:
- Enter the league of interest and press enter. It will take approximately 20 to 30 minutes to completely scrape all data for players in a given domestic league.
Examples of Data
To see examples of what the data scraped from FBRef looks like, refer to the playerstats_(league) and goalkeeperstats_(league) files (for player statistics) and the similar_player_(league) file (for similar players).
EPL Data
Ligue 1 Data
Bundesliga Data
Serie A Data
La Liga Data
Appendix
Goalkeeper Statistics
Performance
SoTA -- Shots on Target Against
GA -- Goals Against
Save% -- Save Percentage ((Shots on Target Against - Goals Against) / Shots on Target Against)
CS -- Clean Sheets (Full matches by goalkeeper where no goals are allowed.)
PSxG -- Post-Shot Expected Goals (PSxG is expected goals based on how likely the goalkeeper is to save the shot)
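As a worked example of the Save% definition above, the sketch below computes the percentage from SoTA and GA. The function name `save_percentage` is hypothetical, and returning `None` when no shots on target were faced is an assumption made here; FBRef may handle that edge case differently.

```python
def save_percentage(sota, ga):
    """Save% = (SoTA - GA) / SoTA, expressed as a percentage.

    Returns None when no shots on target were faced (an assumption of
    this sketch, to avoid dividing by zero).
    """
    if sota == 0:
        return None
    return 100 * (sota - ga) / sota


if __name__ == "__main__":
    # 5 shots on target against, 1 goal conceded.
    print(save_percentage(5, 1))
```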
Penalty Kicks
PKatt -- Penalty Kicks Attempted
PKA -- Penalty Kicks Allowed
PKsv -- Penalty Kicks Saved
PKm -- Penalty Kicks Missed
Launched
Cmp -- Passes Completed (Passes longer than 40 yards)
PassAttemptedLong -- Passes Attempted (Passes longer than 40 yards)
Cmp% -- Pass Completion Percentage (Passes longer than 40 yards)
Passes
PassAtt -- Passes Attempted (Not including goal kicks)
Thr -- Throws Attempted
Launch% -- Percentage of Passes that were Launched (Not including goal kicks) (Passes longer than 40 yards)
AvgLen -- Average length of passes, in yards (Not including goal kicks)
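To make Launch% and AvgLen concrete, the sketch below derives both from a list of individual pass lengths. It assumes the list already excludes goal kicks and that every entry counts as an attempted pass; the name `launch_summary` and the input format are hypothetical, since FBRef publishes only the aggregated values.

```python
LAUNCH_THRESHOLD_YARDS = 40  # "launched" passes are longer than 40 yards


def launch_summary(pass_lengths_yards):
    """Return (launch_pct, avg_len) for a list of pass lengths in yards.

    Returns (None, None) for an empty list.
    """
    if not pass_lengths_yards:
        return None, None
    attempted = len(pass_lengths_yards)
    launched = sum(1 for d in pass_lengths_yards if d > LAUNCH_THRESHOLD_YARDS)
    launch_pct = 100 * launched / attempted
    avg_len = sum(pass_lengths_yards) / attempted
    return launch_pct, avg_len


if __name__ == "__main__":
    # Two of the four passes travel more than 40 yards.
    print(launch_summary([10, 20, 50, 60]))
```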
Goal Kicks
GoalKickAtt -- Goal Kicks Attempted
GKLaunch% -- Percentage of Goal Kicks that were Launched (Passes longer than 40 yards)
GKAvgLen -- Average length of goal kicks, in yards
Crosses
Opp -- Opponent's attempted crosses into penalty area
Stp -- Number of crosses into penalty area which were successfully stopped by the goalkeeper
Stp% -- Percentage of crosses into penalty area which were successfully stopped by the goalkeeper
Sweeper
#OPA -- Number of defensive actions outside of penalty area
AvgDist -- Average distance from goal to perform defensive actions
Outfield Player Statistics
Passing

Total
Cmp -- Passes Completed
PassAtt -- Passes Attempted
Cmp% -- Pass Completion Percentage (Minimum 30 minutes played per squad game to qualify as a leader)
PassTotDist -- Total distance, in yards, that completed passes have traveled in any direction
PassPrgDist -- Progressive Distance (Total distance, in yards, that completed passes have traveled towards the opponent's goal. Note: Passes away from opponent's goal are counted as zero progressive yards.)
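The difference between PassTotDist and PassPrgDist can be illustrated with a short sketch. It assumes each completed pass is given as a hypothetical pair `(total_yards, yards_towards_goal)`, where a negative second value means the pass moved away from the opponent's goal. FBRef does not expose per-pass data in this form, so the input format is purely illustrative.

```python
def pass_distances(completed_passes):
    """Return (total_distance, progressive_distance) in yards.

    completed_passes is a list of (total_yards, yards_towards_goal) pairs.
    Passes away from the opponent's goal add zero progressive yards.
    """
    total = sum(total_yards for total_yards, _ in completed_passes)
    progressive = sum(max(0, forward) for _, forward in completed_passes)
    return total, progressive


if __name__ == "__main__":
    passes = [(10, 8), (15, -5), (20, 0)]
    print(pass_distances(passes))
```

In this example the backward pass of 15 yards counts fully towards the total distance but contributes nothing to the progressive distance.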
Short (Passes between 5 and 15 yards)
Cmp.1 -- Passes Completed
Att.1 -- Passes Attempted
Cmp%.1 -- Pass Completion Percentage
Medium (Passes between 15 and 30 yards)
Cmp.2 -- Passes Completed
Att.2 -- Passes Attempted
Cmp%.2 -- Pass Completion Percentage
Long (Passes longer than 30 yards)
Cmp.3 -- Passes Completed
Att.3 -- Passes Attempted
Cmp%.3 -- Pass Completion Percentage
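The `.1`, `.2` and `.3` suffixes above distinguish the repeated Cmp, Att and Cmp% columns for short, medium and long passes. As a small sketch of how the suffixed names could be made easier to read during analysis, the helper below appends the distance range to a column name. The function `describe_column` and the output labels are hypothetical; the scraped files keep the suffixed names.

```python
# Suffix-to-range mapping, following the three groups listed above.
SUFFIX_TO_RANGE = {".1": "Short", ".2": "Medium", ".3": "Long"}


def describe_column(name):
    """Return a readable label for a suffixed passing column name.

    Names without a known suffix are returned unchanged.
    """
    for suffix, label in SUFFIX_TO_RANGE.items():
        if name.endswith(suffix):
            return f"{name[:-len(suffix)]} ({label})"
    return name


if __name__ == "__main__":
    for column in ("Cmp.1", "Att.3", "Cmp%.2", "PassAtt"):
        print(column, "->", describe_column(column))
```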
Performance
Ast -- Assists
xA -- xG Assisted (xG which follows a pass that assists a shot)
KP -- Key Passes (Passes that directly lead to a shot (assisted shots))
PassFinThird -- Passes into Final Third (Completed passes that enter the 1/3 of the pitch closest to the goal, Not including set pieces)
PPA -- Passes into Penalty Area (Completed passes into the 18-yard box, Not including set pieces)
CrsPA -- Crosses into Penalty Area (Completed crosses into the 18-yard box, Not including set pieces)
PassProg -- Progressive Passes (Completed passes that move the ball towards the opponent's goal at least 10 yards from its furthest point in the last six passes, or any completed pass into the penalty area. Excludes passes from the defending 40% of the pitch)
Goal and Shot Creation

SCA Types
SCA -- Shot-Creating Actions (The two offensive actions directly leading to a shot, such as passes, dribbles and drawing fouls. Note: A single player can receive credit for multiple actions and the shot-taker can also receive credit.)
PassLiveShot -- Completed live-ball passes that lead to a shot attempt
PassDeadShot -- Completed dead-ball passes that lead to a shot attempt. (Includes free kicks, corner kicks, kick offs, throw-ins and goal kicks)
DribShot -- Successful dribbles that lead to a shot attempt
ShLSh -- Shots that lead to another shot attempt
Fld -- Fouls drawn that lead to a shot attempt
DefShot -- Defensive actions that lead to a shot attempt
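The SCA definition credits the two offensive actions directly leading to a shot. The sketch below shows one way to read that rule: given the ordered actions that preceded a shot, it credits the last two and counts them per player. The input format and the name `shot_creating_actions` are hypothetical, and this is an interpretation of the definition above rather than FBRef's actual procedure.

```python
from collections import Counter


def shot_creating_actions(actions_before_shot):
    """Count SCA credit per player for a single shot.

    actions_before_shot is a list of (player, action_type) pairs, oldest
    first, covering the offensive actions that preceded the shot. Only the
    last two are credited, and one player may be credited twice.
    """
    credited = actions_before_shot[-2:]
    return Counter(player for player, _ in credited)


if __name__ == "__main__":
    sequence = [("A", "pass"), ("B", "dribble"), ("B", "pass")]
    print(shot_creating_actions(sequence))
```

Here player B receives two credits and player A none, which matches the note that a single player can receive credit for multiple actions.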
GCA Types
GCA -- Goal-Creating Actions (The two offensive actions directly leading to a goal, such as passes, dribbles and drawing fouls. Note: A single player can receive credit for multiple actions and the shot-taker can also receive credit.)
PassLiveGoal -- Completed live-ball passes that lead to a goal
PassDeadGoal -- Completed dead-ball passes that lead to a goal. (Includes free kicks, corner kicks, kick offs, throw-ins and goal kicks)
DribGoal -- Successful dribbles that lead to a goal
ShGoal -- Shots that lead to another goal-scoring shot
FldGoal -- Fouls drawn that lead to a goal
DefGoal -- Defensive actions that lead to a goal
OG -- Actions that led directly to an opponent scoring on their own goal
Defensive Actions

Tackles
Tkl -- Number of players tackled
TklW -- Tackles Won (Tackles in which the tackler's team wo
