EC524W20

Masters-level applied econometrics course, focusing on prediction, at the University of Oregon (EC424/524 during Winter quarter, 2020). Taught by Ed Rubin.

EC 524, Winter 2020

Welcome to Economics 524 (424): Prediction and machine-learning in econometrics, taught by Ed Rubin and Connor Lennon.

Schedule

Lecture Tuesday and Thursday, 10:00am–11:50am, 105 Peterson Hall

Lab Friday, 12:00pm–12:50pm, 102 Peterson Hall

Office hours

  • Ed Rubin (PLC 519): Thursday (2pm–3pm); Friday (1pm–2pm)
  • Connor Lennon (PLC 430): Monday (1pm–2pm)

Syllabus

Syllabus

Books

Required books

Suggested books

Lecture notes

000 - Overview (Why predict?)

  1. Why do we have a class on prediction?
  2. How is prediction (and how are its tools) different from causal inference?
  3. Motivating examples

Formats .html | .pdf | .Rmd

001 - Statistical learning foundations

  1. Why do we have a class on prediction?
  2. How is prediction (and how are its tools) different from causal inference?
  3. Motivating examples

Formats .html | .pdf | .Rmd

002 - Model accuracy

  1. Model accuracy
  2. Loss for regression and classification
  3. The bias-variance tradeoff
  4. The Bayes classifier
  5. KNN

Formats .html | .pdf | .Rmd

003 - Resampling methods

  1. Review
  2. The validation-set approach
  3. Leave-one-out cross validation
  4. k-fold cross validation
  5. The bootstrap

In-class: Validation-set exercise (Kaggle)

Formats .html | .pdf | .Rmd
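
As a rough illustration of the k-fold idea listed above, the following is a minimal sketch in base R. The simulated data, the name `sim_df`, and the helper `cv_mse()` are assumptions made for this example; they are not taken from the lecture notes.

```r
# Minimal k-fold cross validation for a linear model, using only base R.
set.seed(524)

# Simulated data (an assumption for illustration): y depends linearly on x.
n <- 100
sim_df <- data.frame(x = rnorm(n))
sim_df$y <- 1 + 2 * sim_df$x + rnorm(n)

# Hypothetical helper: estimate out-of-sample MSE by k-fold cross validation.
cv_mse <- function(data, k = 5) {
  # Randomly assign each row to one of k (nearly) equal-sized folds.
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  fold_mse <- sapply(seq_len(k), function(j) {
    train <- data[fold != j, ]
    test  <- data[fold == j, ]
    fit   <- lm(y ~ x, data = train)       # Fit on the other k - 1 folds.
    pred  <- predict(fit, newdata = test)  # Predict the held-out fold.
    mean((test$y - pred)^2)                # Held-out MSE for fold j.
  })
  mean(fold_mse)  # Average the k fold-level MSEs.
}

cv_5   <- cv_mse(sim_df, k = 5)
cv_loo <- cv_mse(sim_df, k = nrow(sim_df))  # Leave-one-out is the case k = n.
```

Setting `k` equal to the number of rows gives leave-one-out cross validation, which shows that the two approaches differ only in the number of folds.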

004 - Linear regression strikes back

  1. Returning to linear regression
  2. Model performance and overfit
  3. Model selection—best subset and stepwise
  4. Selection criteria

Formats .html | .pdf | .Rmd

005 - Shrinkage methods

  1. Ridge regression
  2. Lasso
  3. Elastic net

Formats .html | .pdf | .Rmd

006 - Classification intro

  1. Introduction to classification
  2. Why not regression?
  3. But also: Logistic regression
  4. Assessment: Confusion matrix, assessment criteria, ROC, and AUC

Formats .html | .pdf | .Rmd

007 - Decision trees

  1. Introduction to trees
  2. Regression trees
  3. Classification trees—including the Gini index, entropy, and error rate

Formats .html | .pdf | .Rmd

008 - Ensemble methods

  1. Introduction
  2. Bagging
  3. Random forests
  4. Boosting

Formats .html | .pdf | .Rmd

009 - Support vector machines

  1. Hyperplanes and classification
  2. The maximal margin hyperplane/classifier
  3. The support vector classifier
  4. Support vector machines

Formats .html | .pdf | .Rmd

Projects

Intro Predicting sales price in housing data (Kaggle)

Help: Kaggle notebooks

001 KNN and loss (Kaggle notebook). You will need to sign into your Kaggle account and then hit "Copy and Edit" to add the notebook to your account. Due 21 January 2020 before midnight.

002 Cross validation and linear regression (Kaggle notebook). Due 04 February 2020 before midnight.

003 Model selection and shrinkage (Kaggle notebook). Due 13 February 2020 before midnight.

004 Predicting heart disease (Kaggle competition) | Competition Due 20 February 2020 before midnight.

005 Classifying customer churn (Kaggle competition) | Competition Due in class 27 February 2020.

Class project Due 12 March 2020 before class.

Lab notes

000 - Workflow and cleaning

  1. General "best practices" for coding
  2. Working with RStudio
  3. The pipe (%>%)

Formats .html | .pdf | .Rmd
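
For readers new to the pipe, here is a small sketch of what `%>%` does. It assumes the magrittr package is installed (dplyr re-exports the same operator); the object names are made up for this example.

```r
library(magrittr)  # Provides %>%; dplyr also re-exports it.

# Nested call: read from the inside out.
nested <- sum(sqrt(c(1, 4, 9)))

# Piped call: read left to right. Each result becomes the first
# argument of the next function.
piped <- c(1, 4, 9) %>% sqrt() %>% sum()

# Both forms compute the same value.
identical(nested, piped)
```

The piped form is not faster; its benefit is readability once a chain of data-cleaning steps grows beyond two or three calls.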

001 - dplyr and Kaggle notebooks

  1. Finish previous lab on dplyr
  2. Working in (Kaggle) notebooks
  3. Kaggle contest notes

002 - Cross validation and simulation

  1. Cross-validation review
  2. CV and interdependence
  3. Writing functions
  4. Introduction to learning via simulation
  5. Simulation: CV and dependence

Formats .html | .pdf | .Rmd

Additional R script for simulation

003 - Data cleaning and dplyr

004 - Data cleaning and workflow with tidymodels

005 - Perceptrons and neural nets

Additional Data cleaning in R (with caret)

  • Converting numeric variables to categorical
  • Converting categorical variables to dummies
  • Imputing missing values
  • Standardizing variables (centering and scaling)
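
The four steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy data frame `toy_df` and its columns are hypothetical, and the linked notes use caret, which offers its own helpers for these steps (for example `preProcess()` and `dummyVars()`). Imputation comes first here so that the later steps do not carry the missing value forward.

```r
# Toy data (hypothetical), with one missing age.
toy_df <- data.frame(
  age    = c(23, 35, NA, 51, 64),
  region = c("west", "east", "west", "south", "east")
)

# Imputing missing values: replace NA with the median of the observed ages.
toy_df$age_imputed <- ifelse(
  is.na(toy_df$age),
  median(toy_df$age, na.rm = TRUE),
  toy_df$age
)

# Converting a numeric variable to categorical: bin age into three groups.
toy_df$age_group <- cut(
  toy_df$age_imputed,
  breaks = c(0, 30, 60, Inf),
  labels = c("young", "middle", "older")
)

# Converting a categorical variable to dummies: one indicator per level.
region_dummies <- model.matrix(~ region - 1, data = toy_df)

# Standardizing: center to mean 0 and scale to standard deviation 1.
toy_df$age_std <- as.numeric(scale(toy_df$age_imputed))
```

In a prediction setting, the median, mean, and standard deviation would normally be computed on the training data only and then applied to the test data, so that no information leaks from the held-out observations.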

Additional resources

R
