Email: [email protected]
Class Time: TR, 9:55 - 11:40 AM in Harney 430
Office Hours: TR, 12:00 - 12:30 PM in Harney 107B
Book: R for Data Science by Hadley Wickham and Garrett Grolemund
Syllabus: Link
By the end of this course, you will be able to:
- Proficiently wrangle, manipulate, and explore data using the R programming language
- Use contemporary R packages including ggplot2, tibble, tidyr, dplyr, knitr, and stringr
- Visualize, present, and communicate trends in a variety of data types
- Communicate results using R Markdown and R Shiny
- Formulate data-driven hypotheses using exploratory data analysis and introductory model building techniques
This course focuses on the basic techniques for making informed, data-driven decisions using the R programming language. It is not a statistics course, but it will give you the intuition to form hypotheses about complex questions through visualization, wrangling, manipulation, and exploration of data. Your grade will be based on the following components:
- Attendance (20%): You will lose 2% of this grade for every class you miss.
- Assignments (40%): Computational assignments will be given regularly throughout the course and are to be completed using RStudio and the knitr package.
- Case Studies (20%): You will be assigned applied case studies throughout the class that are to be completed using RStudio.
- Final Project (20%): The final project will be a computational case study that brings together the techniques learned throughout the semester. The project description will be provided around the midpoint of the semester.
- Extra Credit (+5%): Create a well-organized database of all R functions that you use throughout the semester. These include those mentioned in lectures, those introduced in homework, etc. Along with each function, give a brief description that details the use of the function. Also, organize these functions into categories according to their use.
- Undergraduate Research in Statistics and Data Science: Article from Amstat News
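For the extra credit item above, one possible way to organize the function database is as a data frame with one row per function. This is only a sketch: the column names (`name`, `package`, `category`, `description`), the category labels, and the four example entries are illustrative assumptions, not a required format.

```r
# Hypothetical layout for the extra credit function database.
# Column names and categories are assumptions chosen for illustration.
function_catalog <- data.frame(
  name = c("mean", "paste", "filter", "ggplot"),
  package = c("base", "base", "dplyr", "ggplot2"),
  category = c("Summaries", "Strings", "Wrangling", "Plotting"),
  description = c(
    "Computes the arithmetic mean of a numeric vector.",
    "Concatenates values into character strings.",
    "Keeps the rows of a data frame that satisfy a condition.",
    "Initializes a plot to which layers are added."
  ),
  stringsAsFactors = FALSE
)

# Sort by category, then by function name, so related functions sit together.
function_catalog <- function_catalog[
  order(function_catalog$category, function_catalog$name),
]

# Count how many functions have been recorded in each category.
category_counts <- table(function_catalog$category)
print(category_counts)
```

New functions can be appended with `rbind()` as they come up in lectures and homework, and the sorted data frame can be rendered as a table when the document is knit.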
Overall, this course will be split into two main parts: (1) learning the basics of how to code in R and (2) performing data analysis on real case studies and examples using data science techniques in R.
Introduction
Topic | Reading | Assignment | Due Date | In Class Code |
---|---|---|---|---|
Introduction - History of Data Science | Ch. 1 What is Data Science? | HW 1 | Thursday, 8/24 | |
R and RStudio | | HW 2 | Tuesday, 8/29 | My First Code |
R Packages and RMarkdown | | HW 3 | Tuesday, 9/5 | My First Knit |
Data Structures in R
Topic | Reading | Assignment | Due Date | In Class Code |
---|---|---|---|---|
Vectors, Matrices, and Arrays | | HW 4 | Tuesday, 9/12 | [Data Structures I] [Data Structures II] |
Lists and Data Frames | Ch. 20 in R for Data Science | | | Data Structures III |
Tibbles | Ch. 10 in R for Data Science | HW 5 | Tuesday, 9/26 | Tibbles |
String Analysis | Ch. 14 in R for Data Science | HW 6 | Thursday, 9/28 | String Analysis I |
String Analysis 2 | Ch. 14 in R for Data Science | HW 7 | Thursday, 10/5 | String Analysis II |
Factors | Ch. 15 in R for Data Science | | | Factors |
Data Wrangling and Plotting
Topic | Reading | Assignment | Due Date | In Class Code |
---|---|---|---|---|
Input and Output | | | | Input and Output |
Plotting in R | | HW 8 | Friday, 10/27 | Plotting 1 |
Wrangling Data | | | | |
Programming
Topic | Reading | Assignment | Due Date | In Class Code |
---|---|---|---|---|
Control Flow | Ch. 19 in R for Data Science | | | Functions 1 |
Writing Functions | Ch. 19 in R for Data Science | | | Functions 2 |
Functionals | Ch. 18 in R for Data Science | | | |
Statistical Modeling in R
Topic | Reading | Assignment | Due Date | In Class Code |
---|---|---|---|---|
Intro to Statistical Modeling in R | Ch. 23 and 24 in R for Data Science | | | |
Case Study | Data | Date |
---|---|---|
CS 1: Beer Review Analysis | beerdata.RData | September 12th, 2017 |
CS 2: Text Mining | [tweets.csv]; [stateoftheunion1790-2012.txt] | September 28th, 2017 |
CS 3: Building the Game of Blackjack | | November 8th, 2017 |
Description | Due Date |
---|---|
Project Signup | October 31st at 9:00 AM |
Final Project Description | November 30th at 9:00 AM |
- Monday, August 28th - Last day to add the class
- Friday, September 8th - Census date. Last day to withdraw with tuition reversal
- Tuesday, October 17th - Fall break! (no class)
- Friday, November 3rd - Last day to withdraw
- Thursday, November 23rd - Thanksgiving Holiday (no class)
- Tuesday, December 5th - Last day of class