1.5 credits
04 September 2013 - 16 October 2013
Mon Wed 9:30 - 11am in ESB 1042, a computing lab on the main ground floor of the Earth Sciences Building (ESB) at 2207 Main Mall
Instructor: Jennifer (Jenny) Bryan
TA: Song Cai
Google Group for Q & A: STAT545A_2013
github repository for course materials: https://github.com/jennybc/STAT545A
cm = class meeting
Monday Sept 02 is a statutory holiday. No class.
cm 01 | Wednesday Sept 04 | Introduction to the course (slides as PDF)
-
Complete the Google form. JB sent a link to all registered students. If you need the link, contact her by email.
-
Ask to join the STAT545A_2013 Google Group or play hard to get and wait for us to invite you.
-
Follow some of the links that interest you in the cm01 lecture slides (link to PDF above). Would be great if people started a thread on the Google group suggesting more or better blogs or articles hitting the same topics or pointing out broken links.
-
Work through
- Get R, RStudio, and some add-on packages set up
- Basics of R/RStudio, workspaces, and projects <-- Really important that you show up Monday having completed this tutorial. Have that toyline.R script ready to work with.
-
Sign up for an account at Rpubs.com. We will try this as the first and gentlest method of generating finished work for this course. JB has conducted a test and it's dead easy. We'll get you started on Monday.
cm 02 | Monday Sept 09 | Create first report, Deep Thoughts, Basic care and feeding of data in R (slides as PDF)
-
Sign up for an educational account at github.com/edu. Take advantage of their special deal for students, where you can get something like 5 private repositories for two years for free. We will experiment with students modifying the course repository via a browser-based workflow for those who do not wish to take the git plunge yet. I will not require you to create or share your own repositories -- you just need a github account in order to edit mine, i.e. the course repository. Use cases I have in mind:
- Submitting coursework by adding a link to work you've published on Rpubs.com
- Suggesting a dataset to work on later in the class or commenting on suggestions made by others
-
Work through
-
Feel free to start thinking about some datasets we could work with later in the class
cm 03 | Wednesday Sept 11 | R objects (beyond data.frames) and R Markdown (slides as PDF)
-
Submit homework 1 ASAP.
-
Submit homework 2 before class starts @ 9:30am Monday Sept. 16.
-
Work through
- R objects (beyond data.frames) and indexing. Note: you should be able to do the homework before completing this tutorial.
cm 04 | Monday Sept 16 | Data aggregation (slides as PDF)
-
Note that on Wednesday Sept 18 UBC will observe National Reconciliation Week. No class.
-
Study
- a tutorial on data aggregation
- a note on CSS, triggered by making attractive HTML tables in the data aggregation tutorial
-
Submit homework 3 before class starts @ 9:30am Monday Sept. 23.
-
Post a serious proposal for a dataset and/or make a thoughtful contribution to the discussion of an existing proposal. We need to start fleshing these out. The time needed for data assembly and cleaning is going to break your heart. Here are two places for these discussions
- Start or contribute to threads on the Google Group
- Edit the "dataset ideas" page, now that you all know how to propose changes to the course web materials!
-
Spend ~1 hr (or more if you are new to the command line and scripting) reading these resources about how to ask for help. Don't be paranoid, this is not specifically about you and that question you asked the other day! This material has been in the course for a couple of years. This is another aspect of the culture that one has to actually learn.
- The 9th circle of R hell in The R Inferno by Patrick Burns.
- How To Ask Questions The Smart Way by Eric Raymond.
- The posting guide for the R-help mailing list, which I recommend you NOT post to any time soon. (But by all means, search for answers to your questions there!) Turn to Jenny, Song, and the Google Group first for now.
cm 05 | Monday Sept 23 | Explore a quantitative variable, visuals via lattice (slides as PDF)
-
Work through
- Getting data out of R
- An external resource on writing your own functions. Pick one (or more) for yourself.
- Exploring a quantitative variable with lattice. We did the stripplot material. We'll do the rest in class on Wednesday.
-
Reading on the lattice package in the book Lattice: Multivariate Data Visualization with R by Deepayan Sarkar (2008). Links to the eBook and other resources can be found on my resources page.
- Ch. 1 Introduction (short! totally worth it)
- Ch. 2 A Technical Overview of lattice (skimming is OK; at least you'll know where to come back to when you're confused)
- Ch. 3 Visualizing Univariate Distributions (great companion to our work in class this week)
cm 06 | Wednesday Sept 25 | Explore a quantitative variable, visuals via lattice, cont'd (slides as PDF)
-
Work through
- Exploring a quantitative variable with lattice. We finished with densityplots, boxplots, histograms, etc.
- UNDER DEVELOPMENT! Curious about managing factors? Draft topic on factors
- Why are we using lattice and ggplot2? Read about the R graphics landscape
- Revisit homework 3 to see Jenny's solutions and access links to work of fellow students.
- To tidy or not to tidy? re: code tidying in knitr
-
Reading on the lattice package in the book Lattice: Multivariate Data Visualization with R by Deepayan Sarkar (2008). Links to the eBook and other resources can be found on my resources page.
- Ch. 5 Scatter Plots and Extensions. Most important sections (skim the rest?):
- 5.1 The standard scatter plot
- 5.3 Variants using the type argument
- 5.4 Scatter-plot variants for large data
-
Submit homework 4 before class starts @ 9:30am Monday Sept. 30. Make graphical companions to data aggregation output from homework 3.
cm 07 | Monday Sept 30 | Two quantitative variables + lattice details + writing figures to file (slides as PDF)
-
Work through
- A quick tour of xyplot()
- Gateway to more advanced lattice usage
- Writing a figure to file
- Revisit homework 4 to see JB's solutions and access links to work of fellow students.
-
Reading on the ggplot2 package in the book ggplot2: Elegant Graphics for Data Analysis. Links to the eBook and much more can be found on my resources page.
- Ch. 1 Introduction
- Ch. 2 Getting started with qplot() (although I think we will not use qplot())
-
Submit homework 5 before class starts @ 9:30am Monday Oct. 07.
-
Reading on the ggplot2 package in the book ggplot2: Elegant Graphics for Data Analysis. Links to the eBook and much more can be found on my resources page.
- Ch. 3 Mastering the grammar
-
Work through
- Using colors in R
- Mapping a factor into a color in base graphics (this is also a nice demonstration of match() and merge())
- Taking control of qualitative colors in lattice
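For a rough idea of what mapping a factor into a color with match() and merge() can look like, here is a minimal sketch. It is not taken from the tutorial itself: the toy data, the color choices, and the object names (dat, color_key, merged) are all made up for illustration.

```r
# Hypothetical toy data; countries, continents, and colors are illustrative only.
dat <- data.frame(
  country   = c("Canada", "Kenya", "Japan", "Brazil", "Ghana"),
  continent = factor(c("Americas", "Africa", "Asia", "Americas", "Africa")),
  stringsAsFactors = FALSE
)

# Lookup table with one color per factor level (assumed colors).
color_key <- data.frame(
  continent = c("Africa", "Americas", "Asia"),
  color     = c("darkorange", "steelblue", "forestgreen"),
  stringsAsFactors = FALSE
)

# Approach 1: match() gives, for each row of dat, the matching row of color_key.
# The row order of dat is preserved.
dat$color_match <- color_key$color[match(dat$continent, color_key$continent)]

# Approach 2: merge() joins the lookup table onto the data.
# By default merge() sorts on the key, so rows may come back in a different order.
merged <- merge(dat, color_key, by = "continent")
```

Either color vector could then be passed as the col argument of a base graphics call such as plot(). The match() version keeps the original row order, which matters if the colors are used alongside other vectors from dat.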
-
Work through
Monday Oct 14 is a statutory holiday. No class. Happy Thanksgiving!
cm 11 | Wednesday Oct 16 | Coding style, project organization, version control (slides as PDF)
-
Work through
-
Submit homework 6 by 12 noon Monday Oct. 21.
- Descriptive statistics on HW and course marks