TidyTuesdayWithPython

My weekly (or monthly) data visualisation practice with TidyTuesday data, using Python instead of R!

Tidy Tuesday with Python

My weekly (or monthly) data visualisation practice with TidyTuesday data, using Matplotlib and Python instead of R!

Projects

2 Aug 2022

Frogs spotted in Oregon. I experimented with circle packing, but it was such a pain that I will probably never do it again. Anyway, here's the code and the pic.

frogs

Code: Here

19 Jul 2022

This data set is so interesting that I got obsessed with wrangling it and forgot about the visualisation...

Here is a meaningless graph just for fun XD

image

Code: Here

12 Jul 2022

European flights. I don't know what I'm doing. Ideally this highlights how hard COVID hit air traffic.

image

Code: Here

5 Jul 2022

SF rent and lease distribution. I tried to do an animation and failed miserably. I still have to get up for work on Wednesday, so I'll try again next week lol.

image

Code: Here

28 June 2022

UK pay gap. image

Code: Here

21 June 2022

In honour of Juneteenth :) Did a lot of text processing to brush up on my regex. image

Code: Here

14 June 2022

The data set is on droughts in the US but I focused on California in this part. It isn't going well... IMG_6613

Then I also looked into the general trend for every state in the US but this is kind of unclear at first glance... us_droughts

Code: Here

7 June 2022

Holding companies donating to anti-LGBTQ politicians accountable. image Condensed: image

Code: Here

24 May 2022

Women's rugby. image

Code: Here

17 May 2022

Eurovision! And the drastic contrast between 2022 and 2021.

2021: image

Whereas 2022: image

All of our best wishes go to Ukraine <3

Plotted in Python using Basemap (a Matplotlib toolkit) and Geopy.

Code: Click here

3 May 2022

After a month of random COVID disruptions and UCLA DataFest I am finally back to TidyTuesday!

Today's raw data: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-05-03

Graphic: image

29 Mar 2022

Plotly is so cool!

ncaafunds

22 Mar 2022

Cheesiest plot I've made so far...

  • Raw data: [https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv]

Graphic: A graphic showing the trendiness of a selection of feminine baby names across the time span from 1960 to 2017.

XH 22 Mar 2022

08 Mar 2022

PENDING

01 Mar 2022

Tried GeoPandas and Geoplot for the first time! I'd say I would probably rather use seaborn next time though...

Data processing logic:

  • Raw data: [https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-01/stations.csv]
  • Map the points on the country map using the LONGITUDE and LATITUDE columns.
  • Colour code the points using the column FUEL_TYPE_CODE distinguishing the fuel types.
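The two steps above could be sketched roughly like this. It is not the notebook's actual code: the sample rows are invented stand-ins for `stations.csv`, the helper names `group_by_fuel_type` and `plot_stations` are hypothetical, and the plot is a plain Matplotlib scatter without the map background or any projection.

```python
import csv
import io
from collections import defaultdict

# Tiny stand-in for stations.csv; these rows are invented for illustration.
SAMPLE_CSV = """LONGITUDE,LATITUDE,FUEL_TYPE_CODE
-118.24,34.05,ELEC
-122.42,37.77,ELEC
-95.37,29.76,CNG
-87.63,41.88,E85
"""


def group_by_fuel_type(csv_file):
    """Return {fuel_type: ([longitudes], [latitudes])} from a stations CSV."""
    groups = defaultdict(lambda: ([], []))
    for row in csv.DictReader(csv_file):
        lons, lats = groups[row["FUEL_TYPE_CODE"]]
        lons.append(float(row["LONGITUDE"]))
        lats.append(float(row["LATITUDE"]))
    return dict(groups)


def plot_stations(groups, alpha=0.3):
    """Scatter each fuel type in its own colour (requires Matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so grouping works without it

    fig, ax = plt.subplots(figsize=(10, 6))
    for fuel_type, (lons, lats) in sorted(groups.items()):
        # One scatter call per fuel type gives one colour and one legend entry each.
        ax.scatter(lons, lats, s=4, alpha=alpha, label=fuel_type)
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    ax.legend(title="Fuel type")
    return fig


groups = group_by_fuel_type(io.StringIO(SAMPLE_CSV))
```

With the real file, the same function could be fed `open("stations.csv", newline="")` instead of the in-memory sample, and `plot_stations(groups)` would then draw the translucent points.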

Visual features:

  • Designs:
    • Map of the US (excluding Alaska and islands) as background with faint county borders
    • Translucent data points showing the density of the distribution clearly
    • Legend showing fuel types
  • Avenir typesetting! Avenir is the best
  • Also added Alt text

Issues:

  • Projection: whenever I apply a projection the session crashes, so for now the map looks kind of squished
  • The spots on the legend are so faint that it's hard to tell the colours apart
  • The legend labels are currently acronyms and might work better if I type in the full names

Plans:

  • Maybe fix the projection issue
  • Differentiate the colours more

Graphic: A graphic showing the alternative fuel station distribution in the US. The shape of the country, excluding Alaska and islands, is shown in the background with bright points indicating the occurrences of the stations in different regions. The colours mark the type of alt fuel the stations supply.

XH 01 Mar 2022

22 Feb 2022

Happy 22022022 palindrome day!

Content note: The raw data given by TidyTuesday this week involves comparison between countries, which might involve some political disputes and/or underlying assumptions. The raw data does not come from me and does not represent my political opinions. Please assess the credibility of the original data under your own judgements.

Data processing logic:

  • Raw data: [https://github.com/rfordatascience/tidytuesday/blob/master/data/2022/2022-02-22/freedom.csv]
  • Select the 'Status' column (Free, Partly Free and Not Free)
  • Extract the status of each country for every year (1995-2020), count the countries per status and funnel the counts into 3 dictionaries.
  • Show the trend of the number of countries in each status over the 26 years
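A minimal sketch of the counting step, assuming the CSV has `country`, `year` and `Status` columns and that the status codes are `F`, `PF` and `NF`. Only the `Status` column name comes from the notes above; the other names, the codes and the sample rows are assumptions for illustration, and `count_statuses` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Invented rows standing in for freedom.csv; column names other than
# 'Status' and the status codes F / PF / NF are assumptions.
SAMPLE_CSV = """country,year,Status
Country A,1995,F
Country B,1995,PF
Country C,1995,NF
Country A,1996,F
Country B,1996,NF
Country C,1996,NF
"""


def count_statuses(csv_file):
    """Return three {year: number_of_countries} dicts: free, partly free, not free."""
    free, partly_free, not_free = Counter(), Counter(), Counter()
    buckets = {"F": free, "PF": partly_free, "NF": not_free}
    for row in csv.DictReader(csv_file):
        # Each row is one country in one year, so add 1 to that year's bucket.
        buckets[row["Status"]][int(row["year"])] += 1
    return dict(free), dict(partly_free), dict(not_free)


free, partly_free, not_free = count_statuses(io.StringIO(SAMPLE_CSV))
```

Each dictionary maps a year to a count, so the three can be plotted as separate lines or filled areas over the same year axis.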

Visual features:

  • Designs:
    • Used mock-ggplot style with some modifications (facecolor etc.)
    • Translucent on-graph legend with sharp corners
    • All-filling solid colours with different shades
  • Avenir typesetting! Avenir is the best

Issues:

  • Sort of boring (I didn't have much time to make it fancier :(
  • Would probably work better if the graph were wider (i.e. the height could be decreased)
  • The grids in the background are useless since the area fills don't have an alpha (quick fix)

Plans:

  • Add alphas to the filled area under curves
  • Change graph dimensions
  • Improving the documentation and styling
  • Alt text

Graphic: image

XH 22 Feb 2022

8 Feb 2022 (actually using the data set from 25 Jan 2022)

I am not interested in random American Air Force people so I pulled out an old board game data set instead!

** LOGGING IN PROCESS NOT FINISHED **

Data processing logic:

  • Data on dog breeds and their different traits
  • Quantifying all qualitative descriptions into scores using text processing
  • Weighting and categorising each trait into two new parameters 'friendliness' and 'fluffiness'
  • Draw a scatter plot with each point corresponding to a breed, with fluffiness and friendliness as the two axes

Visual features:

  • Detecting overlapping or close-by points automatically and dodging their labels (still bugged :( )
  • Generating a new colour for each data point from the tab20b palette (or any other palette, might change it if in the mood)
  • Avenir typesetting! Avenir is the best
  • Annotation of the breed name beside each data point
  • Legend indexing all 190+ breed names
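One simple way to approach the overlap detection is a pairwise distance check followed by a greedy vertical nudge. This is only a sketch of that idea, not the detector used in the notebook: the function names, the `min_dist` threshold and the `step` size are all hypothetical.

```python
import math


def find_close_pairs(points, min_dist):
    """Return index pairs (i, j), i < j, whose points lie closer than min_dist."""
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < min_dist:
                pairs.append((i, j))
    return pairs


def dodge_labels(points, min_dist, step):
    """Return one vertical label offset per point.

    Greedy heuristic: for every close pair, the later point's label is pushed
    at least one step above the earlier point's label.
    """
    offsets = [0.0] * len(points)
    for i, j in find_close_pairs(points, min_dist):
        offsets[j] = max(offsets[j], offsets[i] + step)
    return offsets


# Invented (fluffiness, friendliness) coordinates: three crowded points, one isolated.
points = [(1.0, 1.0), (1.05, 1.0), (3.0, 3.0), (1.0, 1.08)]
offsets = dodge_labels(points, min_dist=0.2, step=0.1)
```

The offsets could then be added to the y position passed to `ax.annotate` for each breed name. The check is quadratic in the number of points, which should be fine for roughly 190 breeds.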

Issues:

  • The overlap detector does not work for certain few points for some reason
  • Graph too huge with too many data points -- hard to read! Don't know if there's a better way to present the data!
  • Might need to adjust some weightings a bit (as I don't own a dog myself, I am biased!)

Plans:

  • Indexing the position on the diagram for each breed and incorporating into the legend
  • Improving the documentation and styling (it is currently unfortunately a mess!)

Graphic: image

XH 8 Feb 2022

1 Feb 2022

Data processing logic:

  • Data on dog breeds and their different traits
  • Quantifying all qualitative descriptions into scores using text processing
  • Weighting and categorising each trait into two new parameters 'friendliness' and 'fluffiness'
  • Draw a scatter plot with each point corresponding to a breed, with fluffiness and friendliness as the two axes
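The scoring and weighting steps might look something like the sketch below. Everything specific in it is an assumption for illustration: the word-to-score table, the trait names, the weights and the sample breed are invented, and the notebook's real values may differ.

```python
import re

# Hypothetical word-to-score table; the real notebook may use other words and values.
WORD_SCORES = {
    "short": 1, "medium": 3, "long": 5,
    "smooth": 1, "wiry": 2, "double": 4, "curly": 5,
}


def text_to_score(description):
    """Average the scores of all known words found in a free-text description."""
    words = re.findall(r"[a-z]+", description.lower())
    scores = [WORD_SCORES[w] for w in words if w in WORD_SCORES]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical trait names and weights (each set of weights sums to 1).
FLUFFINESS_WEIGHTS = {"coat_length": 0.6, "coat_type": 0.4}
FRIENDLINESS_WEIGHTS = {
    "affectionate": 0.5,
    "good_with_children": 0.3,
    "open_to_strangers": 0.2,
}


def weighted_score(traits, weights):
    """Combine several trait scores into one parameter using the given weights."""
    return sum(traits[name] * weight for name, weight in weights.items())


# One invented breed: text traits are scored first, numeric traits are used as-is.
breed = {
    "coat_length": text_to_score("Long"),
    "coat_type": text_to_score("Double"),
    "affectionate": 5,
    "good_with_children": 5,
    "open_to_strangers": 4,
}
fluffiness = weighted_score(breed, FLUFFINESS_WEIGHTS)
friendliness = weighted_score(breed, FRIENDLINESS_WEIGHTS)
```

Each breed then becomes a single (fluffiness, friendliness) point for the scatter plot, and changing the weight tables is the only thing needed to try a different balance of traits.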

Visual features:

  • Detecting overlapping points or close-by points automatically and wrap/dodge off the labell