LearnPythonforResearch

This repository provides everything you need to get started with Python for (social science) research.

<h1 align="center"> <img src="https://i.imgur.com/KZGIDj0.png" alt="Get started with Python for Research" title="Get started with Python for Research" /> </h1> <p align="center"> <a href="https://mybinder.org/v2/gh/TiesdeKok/LearnPythonforResearch/master?urlpath=lab"><img src="https://mybinder.org/badge_logo.svg"></a> <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/license-MIT-blue.svg"></a> <a href="https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=2UKM4JREAPTBG"><img src="https://img.shields.io/badge/buy%20me%20a-coffee-yellow.svg"></a> <img src="https://img.shields.io/badge/last%20updated-June%202020-3d62d1"> </p> <p align="center"> Want to learn how to use <strong>Python for (Social Science) Research</strong>? <br> This repository has everything that you need to get started! <br><br> <span style='font-size: 15pt'><strong>Author:</strong> Ties de Kok (<a href="https://www.TiesdeKok.com">Personal Page</a>)</span> </p>

Introduction
- Who is this repository for?
- How to use this repository?
Getting your Python setup ready
- Installing Anaconda
- Setting up Conda Environment
Using Python
- Jupyter Notebook/Lab
- Installing packages
Tutorial Notebooks
- Exercises
Code along
- Binder
- Local installation
Questions?
License
Special thanks

<h2 id="introduction">Introduction</h2>

The goal of this GitHub page is to provide you with everything you need to get started with Python for actual research projects.

<h3 id="audience">Who is this repository for?</h3>

The topics and techniques demonstrated in this repository are primarily oriented towards empirical research projects in fields such as Accounting, Finance, Marketing, Political Science, and other Social Sciences.

However, many of the basics are also perfectly applicable if you are looking to use Python for any other type of Data Science!

<h3 id="howtouse">How to use this repository?</h3>

This repository is written to facilitate learning by doing.

If you are starting from scratch I recommend the following:

1. Familiarize yourself with the Getting your Python setup ready and Using Python sections below.
2. Check the Code along! section to make sure that you can interactively use the Jupyter Notebooks.
3. Work through the 0_python_basics.ipynb notebook and try to get a basic grasp of the Python syntax.
4. Do the "Basic Python tasks" part of the exercises.ipynb notebook.
5. Work through the 1_opening_files.ipynb, 2_handling_data.ipynb, and 3_visualizing_data.ipynb notebooks.
   - Note: the 2_handling_data.ipynb notebook is very comprehensive; feel free to skip the more advanced parts at first.
6. Do the "Data handling tasks (+ some plotting)" part of the exercises.ipynb notebook.

If you are interested in web-scraping:

1. Work through the 4_web_scraping.ipynb notebook.
2. Do the "Web scraping" part of the exercises.ipynb notebook.

If you are interested in Natural Language Processing with Python:

Take a look at my Python NLP tutorial repository + notebook

If you are already familiar with the Python basics:

Use the notebooks provided in this repository selectively depending on the types of problems that you try to solve with Python.

Everything in the notebooks is purposely sectioned by the task description. So if you, for example, are looking to merge two Pandas dataframes together, you can use the Combining dataframes section of the 2_handling_data.ipynb notebook as a starting point.

<h2 id="setup">Getting your Python setup ready</h2>

There are multiple ways to get your Python environment set up. To keep things simple I will only provide you with what I believe to be the best and easiest way to get started: the Anaconda distribution + a conda environment.

<h3 id="anaconda">Anaconda Distribution</h3>

The Anaconda Distribution bundles Python with a large collection of Python packages from the (data) science Python ecosystem.

By installing the Anaconda Distribution you essentially obtain everything you need to get started with Python for Research!

<h4 id="anacondainstall">Step 1: Install Anaconda</h4>

1. Go to anaconda.com/download/
2. Download the Python 3.x version installer.
3. Install Anaconda.
   - It is worth taking note of the installation directory in case you ever need to find it again.
4. Check that the installation works by launching a command prompt (terminal) and typing python; it should say Anaconda at the top.
   - On Windows I recommend using the Anaconda Prompt.

Note: Anaconda also comes with the Anaconda Navigator. I haven't personally used it yet, but it might be convenient.

<h4 id="setupenv">Step 2: Set up the <i>learnpythonforresearch</i> environment</h4>

1. Make sure you've cloned/downloaded this repository: Clone repository
2. cd (i.e. change directory) to the folder where you cloned the repository or extracted the ZIP file
   - For example: cd "C:\Files\Work\Project_1"
   - Note: if you are changing to a folder on another drive, you might also have to switch drives by typing, for example, E:
3. Run the following command: conda env create -f environment.yml
4. Activate the environment with: conda activate LearnPythonforResearch

A full list of all the packages used is provided in the environment.yml file.
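To double check from within Python that the environment is active, a minimal sketch like the one below could be used. It relies on the CONDA_DEFAULT_ENV environment variable, which conda activate sets. The function name describe_interpreter is only an illustration and is not part of this repository.

```python
import os
import sys


def describe_interpreter(environ=None):
    """Return basic facts about the Python interpreter that is running."""
    environ = os.environ if environ is None else environ
    return {
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "executable": sys.executable,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

With the environment activated, conda_env should show the environment name. If it shows None or base, repeat the activate step before starting Python.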

<h4 id="pythonversion">Python 3 vs Python 2?</h4>

Python 3.x is the newer and superior version compared to Python 2.7, so I strongly recommend using Python 3.x whenever possible. Python 2.7 reached its end of life on January 1, 2020, so there is no reason to use it unless you are forced to work with old Python 2.7 code.
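To give a feel for what differs in practice, here is a small sketch of three Python 3 behaviors that often trip up people reading old Python 2.7 code. The variable names are purely illustrative.

```python
# In Python 3, "/" is always true division and "//" is floor division.
# (In Python 2.7, 3 / 2 evaluates to 1.)
true_division = 3 / 2
floor_division = 3 // 2

# In Python 3, strings are Unicode text; bytes are a separate type.
text = "caf\u00e9"              # the word "cafe" with an accented e
encoded = text.encode("utf-8")  # explicit conversion from text to bytes

# In Python 3, print is a regular function, so it accepts keyword arguments.
print(true_division, floor_division, sep=" | ")
print(len(text), len(encoded), sep=" | ")
```

The text has 4 characters but 5 bytes once encoded, because the accented character takes two bytes in UTF-8.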

<h2 id="usingpython">Using Python</h2>

Basic methods:

The native way to run Python code is by saving the code to a file with the ".py" extension and executing it from the console / terminal:

python code.py

Alternatively, you can run some quick code by starting a python or ipython interactive console by typing either python or ipython in your console / terminal.
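As an illustration of what such a file could contain, below is a minimal sketch of a script. The contents (the summarize function and the sample values) are made up for this example and do not come from the tutorial notebooks.

```python
# code.py (hypothetical example script)
import statistics


def summarize(values):
    """Return the mean and median of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def main():
    # Made-up sample values, purely for illustration.
    values = [1.0, 2.0, 3.0, 10.0]
    result = summarize(values)
    print(f"n={len(values)} mean={result['mean']} median={result['median']}")
    return result


# This block only runs when the file is executed directly (python code.py),
# not when the file is imported from another script or notebook.
if __name__ == "__main__":
    main()
```

Keeping the logic in functions and the execution under the `__name__` check makes it easy to later reuse the same functions from a Jupyter Notebook.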

<h3 id="jupyter">Jupyter Notebook/Lab</h3>

The above is, however, not very convenient for research purposes as we desire easy interactivity and good documentation options.
Fortunately, the awesome Jupyter Notebooks provide a great alternative way of using Python for research purposes.

Jupyter comes pre-installed with the Anaconda distribution so you should have everything already installed and ready to go.

Note on Jupyter Lab

JupyterLab 1.0: Jupyter’s Next-Generation Notebook Interface
JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data. JupyterLab is flexible: configure and arrange the user interface to support a wide range of workflows in data science, scientific computing, and machine learning. JupyterLab is extensible and modular: write plugins that add new components and integrate with existing ones.

Jupyter Lab is an additional interface layer that extends the functionality of Jupyter Notebooks, which are the primary way you interact with Python code.

What is the Jupyter Notebook?

From the Jupyter website:

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.

In other words, the Jupyter Notebook allows you to program Python code straight from your browser!
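Under the hood, a notebook file (.ipynb) is a JSON document that stores the code and text cells. The sketch below builds a tiny notebook structure by hand to make that concrete. The field names follow the version 4 notebook format as I understand it, and the cell contents and the helper count_cells are invented for this example.

```python
import json
from collections import Counter

# A hand-written, minimal notebook with one text cell and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My analysis\n", "Explanatory text goes here."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}


def count_cells(nb):
    """Count the cells in a notebook dictionary by cell type."""
    return dict(Counter(cell["cell_type"] for cell in nb["cells"]))


# Round-trip through JSON text, which is what gets saved in an .ipynb file.
as_text = json.dumps(notebook, indent=1)
restored = json.loads(as_text)
print(count_cells(restored))
```

In practice you never need to write this JSON yourself; the Jupyter interface creates and saves it for you.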

How does the Jupyter Notebook/Lab work in the background?

The diagram below sums up the basic components of Jupyter:

At the heart there is the Jupyter Server that handles everything, the Jupyter Notebook which is accessed and used through your browser, and the kernel that executes the code. We will be focusing on the natively included Python Kernel but Jupyter is language agnostic so you can also use it with other languages/software such as 'R'.

It is worth noting that in most cases you will be running the Jupyter Server on your own computer and will connect to it locally in your browser (i.e. you don't need to be connected to the internet). However, it is also possible to run the Jupyter Server on a different computer, for example a high performance computation server in the cloud, and connect to it over the internet.

How to start a Jupyter Notebook/Lab?

The primary method that I would recommend to start a Jupyter Notebook/Lab is to use the command line (terminal) directly:

1. Open your command prompt / terminal (on Windows I recommend the Anaconda Prompt)
2. Activate the right environment with conda activate LearnPythonforResearch
3. cd (i.e. change directory) to the desired starting directory
   - For example: cd "C:\Files\Work\Project_1"
   - Note: if you are changing to a folder on another drive, you might also have to switch drives by typing, for example, E:
4. Start the Jupyter Notebook/Lab server by typing: jupyter notebook or jupyter lab

This should automatically open up the corresponding Jupyter Notebook/Lab in your default browser. You can also manually go to the Jupyter Notebook/Lab by going to localhost:8888 with your browser. (You might be asked for a password or token, which you can find in the terminal window where the Jupyter server is running.)
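If the browser page does not load, it can help to check whether anything is listening on that port at all. The following is a minimal sketch using only the standard library. The helper name port_is_open is hypothetical, and it assumes the default port 8888 (Jupyter picks another port if 8888 is already taken, in which case the terminal output shows the actual address).

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable.
        return False


if port_is_open():
    print("Something is listening on localhost:8888")
else:
    print("Nothing is listening on localhost:8888; is the Jupyter server running?")
```

Note that this only checks that the port accepts connections; it does not confirm that the program behind it is Jupyter.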

***How to cl
