Openalex
Code to upload the OpenAlex dump and split it into tables.
OpenAlex upload to BigQuery
The code is divided into two files:
- Upload: Bash commands to upload the dump.
- Model: BigQuery code for the relational model.
Diagrams for the tables are also provided.
Steps
The steps are as follows.
- First file (Upload):
  - Using a Google VM, we download the most recent OpenAlex dump.
  - The dump is uploaded as tables, one for each of the main OpenAlex entities.
  - Each table has just one column, holding each record as a JSON entry, and is uploaded to BigQuery using a project that is already set up (see "How to create projects").
- Second file (Model):
  - Everything is run in Google Colaboratory, taking advantage of its internal authorisation mechanism (see "Integrating Colab and BigQuery"). The queries are organised in sequence.
  - All the tables are split into fields, creating a column for each value.
  - New tables are created to connect the main ones.
  - New tables are created to explode the arrays of data nested inside the values in the tables.
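
The Upload step above stores every record as a single JSON entry in a one-column table. The repository does this with Bash commands. The Python sketch below only illustrates the shape of the data, assuming the dump is distributed as gzipped JSON Lines part files. The column name `raw`, the file name `part_000.gz` and the sample records are hypothetical.

```python
import gzip
import json
import os
import tempfile


def single_column_rows(path):
    """Yield one {"raw": <JSON string>} row per non-empty line of a gzipped JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                json.loads(line)  # validation only; raises ValueError on malformed JSON
                yield {"raw": line}


# Tiny stand-in for one part file of the dump (hypothetical records).
records = [
    {"id": "W1", "title": "First work"},
    {"id": "W2", "title": "Second work"},
]
tmp_dir = tempfile.mkdtemp()
part_path = os.path.join(tmp_dir, "part_000.gz")
with gzip.open(part_path, "wt", encoding="utf-8") as handle:
    for record in records:
        handle.write(json.dumps(record) + "\n")

# Each row has exactly one column holding the untouched JSON text.
rows = list(single_column_rows(part_path))
print(len(rows), rows[0]["raw"])
```

Keeping the JSON text untouched at this stage means the later Model step decides which fields become columns. A change in the dump's schema would then not break the upload itself.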

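The three Model operations listed above (splitting fields into columns, connecting the main tables, exploding arrays) can be illustrated in plain Python. This is a minimal sketch, not the repository's BigQuery code. The column name `raw`, the output table names and the sample record are hypothetical, and the field names (`authorships`, `concepts`) are assumed to follow the OpenAlex work schema.

```python
import json

# Hypothetical single-column rows: each holds one work as a JSON string.
raw_rows = [
    {"raw": json.dumps({
        "id": "W1",
        "title": "First work",
        "publication_year": 2020,
        "authorships": [{"author": {"id": "A1"}}, {"author": {"id": "A2"}}],
        "concepts": [{"id": "C1", "score": 0.9}, {"id": "C2", "score": 0.4}],
    })},
]

works = []           # main table: scalar fields split into columns
works_authors = []   # link table connecting works and authors
works_concepts = []  # exploded array: one row per (work, concept)

for row in raw_rows:
    record = json.loads(row["raw"])

    # 1. Split the JSON entry into columns.
    works.append({
        "id": record["id"],
        "title": record.get("title"),
        "publication_year": record.get("publication_year"),
    })

    # 2. Connect main entities through a link table.
    for authorship in record.get("authorships", []):
        works_authors.append({
            "work_id": record["id"],
            "author_id": authorship["author"]["id"],
        })

    # 3. Explode a nested array into its own table.
    for concept in record.get("concepts", []):
        works_concepts.append({
            "work_id": record["id"],
            "concept_id": concept["id"],
            "score": concept["score"],
        })

print(works)
print(works_authors)
print(works_concepts)
```

In BigQuery, the same operations would typically be written with JSON extraction functions and `UNNEST`. The exact queries are those in the Model file.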