Datasciencectacontent

repository for Community Mentor content related to the Johns Hopkins University Data Science Specialization on Coursera

A Community Mentor's Guide to the Johns Hopkins University Data Science Specialization

Author: Len Greski

This repository contains content I developed as a student and as a Community Mentor in the Data Science Specialization from Johns Hopkins University, offered on Coursera. A number of people have developed content to help students work through the ten courses in the specialization. The main index for community-generated content across the Specialization is datasciencespecialization.github.io.

Repository Contents

As a participant and Community Mentor in courses in the curriculum, I have seen students run into the same issues again and again. Migrating the content to GitHub makes it easier to repost it in new runs of courses within the curriculum. Students get access to the experiences of prior students without my having to regularly cut and paste content from past sessions into the Discussion Forums, which are the primary mechanism for communication between students and Community Mentors.

<table>
<tr><th>File</th><th>Description</th></tr>
<tr><td valign="top">/markdown</td><td valign="top">Directory containing markdown files, the primary form of documentation for the content in the repository.</td></tr>
<tr><td valign="top">/markdown/images</td><td valign="top">Directory containing portable network graphics (PNG) files, which are used to illustrate the narrative content in other documentation.</td></tr>
<tr><td valign="top">README.md</td><td valign="top">File explaining the purpose and contents of the repository, with links to specific content by course.</td></tr>
</table>

The remainder of this README serves as a directory of the content, aligning individual documents with the course(s) for which the content is relevant.

Course 1: Data Scientist's Toolbox

  1. Course Prerequisites and Difficulty Levels. Provides an overview of the Data Science Specialization courses, explaining from a practical perspective which courses a student needs as prerequisites to other courses. While students may take more than one class at a time, it's important to know how information from earlier courses is used in subsequent ones. <br><br> The article also ranks the courses from most to least difficult, based on the author's experience in the curriculum as well as Discussion Forum feedback contributed by other students.
  2. Configuring RStudio to work with git / github - Mac OSX
  3. Configuring RStudio to work with git / github - Windows 7, 8, and 10
  4. Using Editor Modes in Discussion Forum Posts
  5. Buying a Computer for Data Science
  6. R and RStudio on Chromebook
  7. Installing R and RStudio on Chromebook. Walkthrough demonstrating how to install R and RStudio on a Chromebook with Crouton and Ubuntu Linux.

Issue: Students Struggle to Find URLs in Lecture Slides

If you're interested in the URLs for the lecture slides, they are available in the Data Science Specialization Courses GitHub repository. Each course is stored in a subdirectory within the repository, and the slides are written in R Markdown, a technique you'll learn in Reproducible Research.

Miscellaneous Articles about Data Science

  1. The Future of Data Analysis by John Tukey, a 1962 paper in which he challenges statisticians to move away from ever more complicated mathematics and tackle data analysis problems in more realistic ways.
  2. 50 Years of Data Science by David Donoho, a 2015 retrospective on John Tukey's 1962 paper.

Course 2: R Programming

START HERE

If you're new to the course and trying to figure out what to do in what order, start with these articles.

  1. Resources for R Programming. Provides a summary of student-generated content to support the course, some of which is indexed on the Data Science Specialization's github.io site.
  2. References for R Programming. Provides a list of references for R programming, ranging from beginning to advanced topics.
  3. Data Science Specialization: what is the value? Addresses a common question raised by students in R Programming who are frustrated by the amount of work they have to do on their own to complete quizzes and assignments.
  4. Swirl: common problems & getting help. Discusses a couple of frequent problems students have getting swirl to work on their computers, and provides URLs to support from the creators of swirl.
  5. R versus Python. Roundup of articles and surveys comparing R and Python, including usage, history, and pros / cons.

The next set of articles includes general commentary about the course, R programming in general, and R in relationship to other statistics packages.

  1. Commercial Statistics Packages: An Historical Perspective
  2. Configuring RStudio to work with git / github - Mac OSX
  3. A Data Frame is Also a List
  4. Forms of the Assignment Operator
  5. Forms of the Extract Operator
  6. S Objects, R Objects, and Lexical Scoping
  7. Thinking in R versus Thinking in SAS
  8. Strategy for the Programming Assignments
  9. Why is R More Difficult than SAS?
  10. R Onboarding for SAS Users
  11. References for R Programming. Provides a list of references for R programming, ranging from beginning to advanced topics.
  12. Object Oriented Programming and R. Explains how object oriented programming concepts are implemented in R, in response to a student question about accessing content output by the R linear models function, lm().
  13. Scoping in C/C++ vs. R. Compares variable scoping in R versus C/C++.

Posts regarding specifics of programming assignments

  1. Assignment 1: Breaking Down Pollutantmean
  2. Assignment 1: Breaking Down Complete
  3. Assignment 1: Breaking Down Corr
  4. Assignment 1: A SAS Version of Pollutantmean
  5. Assignment 1: Common Mistakes - Weighted vs. Unweighted Means
  6. Assignment 1: Common Mistakes - complete("specdata",332:1) fails
  7. Assignment 1: A More Elegant Solution
  8. Assignment 2: Demystifying makeVector
  9. Assignment 2: makeCacheMatrix as an Object
  10. Assignment 2: Using Github Desktop
  11. Assignment 2: Grading the SHA-1 Hash Code
  12. Assignment 3: Functions to Sort Data Frames

Miscellaneous Code Examples and Instructions

  1. Permanently Setting R Working Directory. Link to an R-bloggers.com article that explains how to set your working directory permanently in R (instead of RStudio).
  2. Tutorial: Downloading Files. Illustrates various ways of downloading files, including binary and text files.
  3. Creative Use of R: Downloading Course Lectures. Article illustrating how to use R to automate the download of lectures from Data Science Specialization courses, such as R Programming. Techniques used in this article help make research reproducible, as required for courses like Getting and Cleaning Data and Reproducible Research.
  4. How to Upgrade R without Losing Your Packages. Article by Kris Eberwein on datascienceriot.com that includes code to save your list of packages to an rds file and reinstall any packages that don't make it through the upgrade process.
  5. Common R Mistakes: Overwriting R Functions with Output Variables
  6. R Programming Cheat Sheet. Based on content from R for Everyone by Jared Lander.

Interesting R News and Blog Articles

  1. R vs. Python: 2016 Survey of Software used for Data Science. Overview of results from a 2016 KDnuggets Software Poll, written by Gregory Piatetsky. The follow-up article with expanded analysis is What Big Data, Data Science, Deep Learning software goes together, also on kdnuggets.com.
  2. R and Python vs. SAS and SPSS. Jeroen Kromme's take on the strengths and weaknesses of these languages, posted on r-bloggers.com.
  3. Scaling R for Data Science. August 2016 article by Federico Castanedo explaining three ways to scale R.
  4. Lexical Scoping and Statistical Computing. Article by Robert Gentleman and Ross Ihaka at the University of Auckland describing how lexical scoping works, and why it is valuable in statistical computing.
  5. Data Science Job Report 2017: R Passes SAS, But Python Leaves Them Both Behind. Bob Muenchen's take on the job market for various data science languages.
  6. Redmonk Programming Language Rankings: February 2020. Stephen O'Grady's analysis of the popularity of programming languages, based on their activity on GitHub and Stack Overflow.
  7. IEEE 2018 Language Rankings. R-bloggers article highlighting the 2018 IEEE programming language ra