DataScienceResources

Open Source Data Science Resources.

Install / Use

/learn @jonathan-bower/DataScienceResources

Data Science Resources

Hello and welcome to the Data Science Resources repo. I originally built this repo so that I could have a place to host resources that are helpful to me. While building it, I realized that other people might also be interested. I have tried to curate content on data science topics, high-quality resources to learn from, and relevant blog posts.

The goal was to cover more than just the technical side of data science. I have tried to find material on building data science teams, business practices, use cases, product metrics, and data science career paths. I hope this is helpful.

Table Of Contents

1. Data Science Getting Started

2. Data Pipeline & Tools

3. Product

4. Career Resources

5. Open Source Data Science Resources

Data Science Getting Started

Data science is a multidisciplinary field covering, at the very minimum, statistics, programming, and machine learning (see Drew Conway's Venn diagram or the Cheat Sheet of a Modern Data Scientist). These topics are covered throughout this repo. I personally find the best way to learn a topic is to get my hands dirty quickly. With that in mind, I would get to work in Python and then add tools and theory to my toolkit as I come to understand them. If you haven't used Python before, I would strongly urge you to take the Codecademy course to familiarize yourself with the language and with programming in general. Good luck and have fun.

A note about order: I arranged the contents of the Pipeline & Tools section in the order of the data pipeline, starting with acquisition, then exploratory data analysis, cleaning data, model selection & evaluation, and finally visualization.
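As a rough illustration of that ordering, here is a minimal sketch of the pipeline stages using only the Python standard library. The inline data set, the function names, and the straight-line model are all hypothetical stand-ins chosen for brevity; they are not taken from any resource in this repo.

```python
import csv
import io

# Hypothetical inline data standing in for an acquired file or API response.
RAW_CSV = """hours,score
1,52
2,55
3,
4,61
5,68
"""


def acquire(text):
    """Acquisition: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def explore(rows):
    """Exploratory analysis: count rows and missing values per column."""
    missing = {col: sum(1 for r in rows if not r[col]) for col in rows[0]}
    return {"n_rows": len(rows), "missing": missing}


def clean(rows):
    """Cleaning: drop incomplete rows and convert strings to floats."""
    return [(float(r["hours"]), float(r["score"])) for r in rows if all(r.values())]


def fit_and_evaluate(points):
    """Modeling and evaluation: least-squares line and its mean absolute error."""
    xs, ys = zip(*points)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    mae = sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)
    return slope, intercept, mae


def visualize(points):
    """Visualization: a plain-text bar chart, one line per observation."""
    return [f"{x:>4.0f}h | {'#' * round(y / 5)}" for x, y in points]


if __name__ == "__main__":
    rows = acquire(RAW_CSV)
    print(explore(rows))
    points = clean(rows)
    print(fit_and_evaluate(points))
    print("\n".join(visualize(points)))
```

In practice each stage would usually be handled by a dedicated library rather than hand-written code. The point here is only the order in which the stages feed into one another.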

Start

Data Science Courses:

  • Coursera - Data Science Specialization at Coursera - many other courses available as well.
  • Udacity - Online MOOCs covering data science related topics.
  • Data Science Bootcamps - A collection of all bootcamps currently on the market as of April 5, 2014 by Ikechukwu Okonkwo.
  • Coursera Machine Learning Course - Andrew Ng's flagship Machine Learning course.
  • edX - edX courses related to data science.

Data Pipeline & Tools

Python

Python is my workhorse language because it has many data science and statistics libraries, works in production environments, and can be used on problems outside of data science. There are many other languages that could be useful but are not covered here: Julia, R, Cython, Pig, Scala, Java, etc.

Data Structures & CS Topics

Statistics

Some primers on understanding statistics and other resources to get a deeper understanding.

Stats/Engineering Libraries

A collection of workhorse libraries that are essential for any Python data scientist.

  • Pandas - Wes McKinney's pandas library for EDA on small to medium-sized data sets, for when you don't want to set up SQL infrastructure or when it isn't necessary. It has many other great applications beyond standing in for SQL on small to medium data sets.
  • SciPy - Open-source software for mathematics, science and engineering.
  • NumPy -
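
To make the pandas-instead-of-SQL point concrete, here is a small sketch of a grouped summary done entirely in memory. The `orders` table and its column names are made up for illustration, and the SQL in the comment is only an approximate equivalent.

```python
import pandas as pd

# Hypothetical order data; in practice this might be loaded with pd.read_csv.
orders = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west", "east"],
        "amount": [120.0, 80.0, 60.0, 200.0, 90.0],
    }
)

# Roughly equivalent to:
#   SELECT region, COUNT(amount) AS n_orders, SUM(amount) AS total
#   FROM orders GROUP BY region ORDER BY total DESC
summary = (
    orders.groupby("region")
    .agg(n_orders=("amount", "count"), total=("amount", "sum"))
    .sort_values("total", ascending=False)
    .reset_index()
)

print(summary)
```

No database or server is involved, which is the appeal for quick exploratory work on data that fits comfortably in memory.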