googlePageSpeedR
This package makes a call to the Google PageSpeed Insights API and returns a data frame of results.
This means you can programmatically retrieve all of the analysis points accessible via the Google PageSpeed Insights webpage.
Useful for analysing your / your competitors' page performance and organising into glorious tabular format.
TODO
- Fix type coercion issues within results.
Install
devtools::install_github("Phippsy/googlePageSpeedR")
Simple Query
We define the URL to analyse and then pass it to the get_pagespeed() function.
library(googlePageSpeedR)
page <- "https://www.rstudio.com"
insights <- get_pagespeed(ps_url = page)
insights

We are given back a data frame containing a number of key metrics for the given page.
Definitions for each metric can be reviewed at https://developers.google.com/web/tools/chrome-user-experience-report/#metrics
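Because the exact columns depend on what the API returns, a quick way to browse the metrics without hard-coding any names is to transpose the one-row data frame (a sketch, assuming `insights` is the result from the query above):

```r
library(googlePageSpeedR)

insights <- get_pagespeed(ps_url = "https://www.rstudio.com")

# List the metric names the API actually returned:
names(insights)

# Transpose to one metric per row for easier reading:
t(insights)
```

This avoids assuming any particular column names, which may vary as the API evolves.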
Provide an API key
The PageSpeed Insights API doesn't seem to require an API key, but you can provide one. Get your key from the Google Cloud console, then pass it in your query.
my_key <- "abcdefgHIJK"
insights <- get_pagespeed(ps_url = page, key = my_key)
Multi-page query
Given that we have a convenient function for fetching PageSpeed Insights results, it's easy to iterate over many URLs and build up a nice, large collection of results.
I favour the tidyverse approach and map() collection of functions, but you could achieve the results below with for loops or apply() functions, as you prefer.
library(tidyverse)
urls <- list("http://www.rstudio.com", "http://www.reddit.com", "https://www.python.org/", "https://myspace.com/", "https://www.linux.org")
multi_results <- map_df(urls, get_pagespeed)
multi_results
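For reference, the for-loop / `apply()` route mentioned above might look like the following (a sketch, assuming each call to `get_pagespeed()` returns a one-row data frame):

```r
library(googlePageSpeedR)

urls <- c("http://www.rstudio.com", "https://www.python.org/")

# lapply() gives a list of one-row data frames, one per URL...
results_list <- lapply(urls, get_pagespeed)

# ...which do.call(rbind, ...) stacks into a single data frame,
# equivalent to the map_df() call above.
multi_results <- do.call(rbind, results_list)
multi_results
```

The tidyverse version is more compact, but this base-R form has no extra dependencies.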

Update - preventing data loss using possibly
Unfortunately, if any of your URLs returns an error HTTP status (e.g. Bad Request (HTTP 400)), then you will lose the data for all URLs when you call map_df().
A workaround for this is to wrap get_pagespeed() in the possibly() function from purrr. Many thanks to Johannes Radig for flagging this issue in this Twitter convo.
The drawback with the below code is that you won't know if any URLs fail - you'll have to cross-check the data frame of results against the list of URLs you submit.
urls <- list("http://www.rstudio.com", "http://www.reddit.com", "https://www.python.org/",
"http://www.thisisnotarealURLandisntgoingtowork.com", "https://myspace.com/", "https://www.linux.org")
# The code below would fail - not run
# multi_results <- map_df(urls, get_pagespeed)
# multi_results
# Adding our safe function
safe_pagespeed <- possibly(get_pagespeed, otherwise = NULL)
safe_multi_results <- map_df(urls, safe_pagespeed)
safe_multi_results
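If you do want to know which URLs failed, one option (a sketch using purrr, not part of this package) is to wrap the call in safely() instead of possibly(). safely() keeps the error alongside each result, so nothing is silently dropped:

```r
library(googlePageSpeedR)
library(purrr)

urls <- list("http://www.rstudio.com",
             "http://www.thisisnotarealURLandisntgoingtowork.com")

# safely() returns, for every URL, a list with $result and $error.
safe_pagespeed <- safely(get_pagespeed)
raw <- map(urls, safe_pagespeed)

# Identify the URLs whose calls errored:
failed <- urls[map_lgl(raw, ~ !is.null(.x$error))]
failed

# Bind the successful results into a data frame, as before
# (NULL results from failed calls are dropped by bind_rows):
ok_results <- map_df(raw, "result")
ok_results
```

This way you get the same data frame of successes plus an explicit list of failures to cross-check.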
Raw content query
If you prefer to have the API response converted directly into a list and select the values of interest, you can do this using get_pagespeed_content().
insights_content <- get_pagespeed_content(ps_url = page)
str(insights_content[1:5])
