Data, data

A statistical analysis and exploration of Jorge Drexler's music and lyrics.

by Alex Ingberg

Data, data is a homage to the great Uruguayan musician and songwriter Jorge Drexler.

Using data pulled from both the Genius API and the Spotify API, I analyzed Jorge's music to get some insights and visualizations on his creative process and his songs in general, from both the lyrics side and the music theory side.

Word count, lexical and lyrical density, sentiment analysis, and musical components such as tempo, time signature and key are all taken into account. Finally, a gloom_index combines both lyrics and music.
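As a rough illustration of two of the lyric metrics listed above, here is a minimal sketch. It assumes that lexical density means the share of unique words in a song and that lyrical density means words per second of track length; the notebooks may define either one differently. The function names `tokenize`, `lexical_density` and `lyrical_density` are hypothetical and are not taken from the project code.

```python
import re


def tokenize(lyrics):
    """Lowercase the lyrics and return a list of words.

    A word is a run of letters (accented letters included),
    optionally followed by an apostrophe and more letters.
    """
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", lyrics.lower())


def lexical_density(lyrics):
    """Assumed definition: unique words divided by total words."""
    words = tokenize(lyrics)
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def lyrical_density(lyrics, duration_seconds):
    """Assumed definition: total words divided by track length in seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(tokenize(lyrics)) / duration_seconds


if __name__ == "__main__":
    sample = "la la la vida"  # made-up text, not real lyrics
    print(lexical_density(sample))       # 2 unique words out of 4
    print(lyrical_density(sample, 2.0))  # 4 words over 2 seconds
```

In this sketch the track length would come from the Spotify data and the lyrics from Genius, so both metrics can be computed per song once the two sources are joined.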

To check the analysis, go here.

To see the article published in Towards Data Science, go here.

To check its Spanish translation, go here.

To check how I built the database, go here.

Some cool samples from the visualizations:

NRC emotions through the years; Tempo by albums; Usage of keys; Top 10 songs with more words; Wordcloud; Lyrical density vs lexical density; Correlation in negative NRC emotions.

This whole project has been created using Python 3, Jupyter Notebook and a little bit of PyCharm.

I created the databases with pandas, BeautifulSoup, Spotipy (an amazing Python wrapper for the Spotify Web API), the Genius API and the Spotify Web API.

For the analysis, I used pandas, NumPy, Matplotlib, Seaborn, scikit-learn, SciPy, Natural Language Toolkit, wordcloud and py-lex.