STAT540 Statistical methods for high dimensional biology

Course Information

Credits and cross-listing

STAT 540 is a 3 credit course with a mandatory computing seminar.

Cross-listed as

  • STAT 540
  • BIOF 540
  • GSAT 540

Instructors

TA(s)

Google Group for Q & A (TAs will add students to group in due course)

STAT540_2014

GitHub repository

The vast majority of the course content, including the source for this website, can be found here:

https://github.com/jennybc/stat540_2014

Datasets

photoRec data

Time and Location

06 January 2014 - 07 April 2014

Lecture (Sec 201)

Time: Mon Wed 9:30 - 11am

Location: ESB 4192

Seminar / computing lab (S2A) -- REGISTRATION IS REQUIRED!

Time: officially runs Wed 12pm - 1pm; unofficially, students are welcome to come after class around 11am and begin a ~1 hour guided analysis with TA support. The TA will remain in the lab until 1pm to help those who start at 12pm and for general office hours.

Location: ESB 1042 and 1046

Prerequisites

Officially none, BUT here is the reality ...

  • Statistics: you should have already taken a university-level introductory statistics course.

  • Biology: no requirements, but you are expected to learn things like the difference between DNA and RNA and between a gene and a genome.

  • R: no experience required but be prepared to do a lot of self-guided learning. Go ahead and start now by installing R and the HIGHLY RECOMMENDED "integrated development environment" (IDE) RStudio! Students are expected to run R on their own computer or a computer they have plenty of access to and control over. The best set-up, if possible, is to bring your own laptop to the computing seminars.

Evaluation

  • Homework. Two assignments worth 25 points each. Homework #1 was due Thurs Feb 27. Homework #2 was due Fri March 28. Instructions for how to submit homework.

  • Group project. Groups formed and projects conceived during January/February. Primary deliverable is a poster, presented in the last class meeting. Each student also produces a short report. 40 points. For more information, go here.

  • 10 points for "other", e.g. participation in class, seminars, and the Google group, engagement with small computing exercises.

Syllabus

Week 1

seminar 00 | R, RStudio Set Up & Basics, borrowed from STAT 545A. Students complete/attempt on their own in advance. Bring any difficulties to first seminar.

lecture 01 | Introduction to high dimensional biology and the course (PP) | Mon Jan 06 | slides as PDF

lecture 02 | Overview / review of probability and statistical inference, 1 of 2 (JB) | Wed Jan 08 | slides as PDF

seminar 01 | R basics and exploring a small gene expression dataset | Wed Jan 08

  • R stuff 11am - 12pm (or later, if necessary)
  • Molecular biology/genetics 101 (LL), 12pm - 1pm | slides as PDF

Week 2

lecture 03 | Overview / review of probability and statistical inference, 2 of 2 (JB) | Mon Jan 13 | slides as PDF

lecture 04 | Exploratory analysis (PP) | Wed Jan 15 | slides as PDF

seminar 02 | Learn more R while reviewing probability (LL) | Wed Jan 15

Week 3

lecture 05 | Data QC and preprocessing (JB for GC-F) | Mon Jan 20 | slides as PDF

lecture 06 | Statistical inference: two group comparisons, e.g. differential expression analysis (JB) | Wed Jan 22 | slides as PDF

seminar 03 | R graphics AND knitr, R markdown, and git(hub) | Wed Jan 22

  • Introduction to knitr, R markdown, and git(hub) 11:15am - 12pm SJ will run a guided, hands-on tutorial in one of the labs
  • R graphics (LL) content is ready to work through in the other lab, from 12 - 1pm, or on your own.

Fri Jan 24: Project groups should be formed.

Week 4

lecture 07 | Statistical inference: more than two groups --> linear models (JB prep, GC-F deliver) | Mon Jan 27 | slides as PDF

lecture 08 | Statistical inference: linear models with 2 categorical covariates, greatest hits of linear models inference (JB prep, GC-F deliver) | Wed Jan 29 | slides as PDF

seminar 04 | Two group testing and data aggregation (SJ) | Wed Jan 29

Fri Jan 31: Initial project proposals due.

Week 5

lecture 09 | Statistical inference: linear models including a quantitative covariate, fitting many linear models at once (JB) | Mon Feb 03 | slides as PDF

lecture 10 | Large scale inference: Empirical Bayes, limma (JB) | Wed Feb 05 | slides as PDF

seminar 05 | Fitting and interpreting linear models (low volume) (SJ) | Wed Feb 05

Fri Feb 07: Homework #1 assignment is posted. Due Thurs Feb 27.

Week 6

Mon Feb 10 is Family Day; no class

lecture 11 | Large scale inference: multiple testing (JB) | Wed Feb 12 | slides as PDF

seminar 06 | Fitting and interpreting linear models (high volume), limma package | Wed Feb 12

Fri Feb 14: feedback to groups re: initial project proposals. Each group will be assigned an instructor or TA + instructor pair for extra support.

Week 7

(Mon Feb 17 UBC closed for mid-term break)

(Wed Feb 19 UBC closed for mid-term break)

Week 8

lecture 12 | Analysis of RNA-Seq data (PP), 1 of 2 | Mon Feb 24 | slides as PDF

lecture 13 | Analysis of RNA-Seq data (PP), 2 of 2 | Wed Feb 26 | slides as PDF

seminar 07 | RNA-Seq analysis (SJ) | Wed Feb 26

Thurs Feb 27: Homework #1 due.

Week 9

lecture 14 | Analysis of epigenetic data, focus on methylation (Elodie Portales-Casamar) | Mon Mar 03 | slides as PDF

Wed Mar 05: final project proposals due.

lecture 15 | Principal component analysis (PP) | Wed Mar 05 | slides as PDF

seminar 08 | Methylation analysis (LL) | Wed Mar 05

Fri Mar 07: Homework #2 assigned

Week 10

lecture 16 | Cluster analysis (GC-F) | Mon Mar 10 | slides as PDF

lecture 17 | Classification (GC-F) | Wed Mar 12 | slides as PDF

seminar 09 | Clustering and PCA (LL) | Wed Mar 12

Week 11

lecture 18 | Model and variable selection: cross validation and regularization (GC-F) | Mon Mar 17 | slides as PDF

lecture 19 | Regularization cont'd, Proteomics and missingness (GC-F) | Wed Mar 19 | slides as PDF: variable selection, proteomics and missing values

seminar 10 | Supervised learning, cross validation, variable selection (SJ) | Wed Mar 19

Week 12

lecture 20 | Analysis of gene function, 1 of 2: Gene set analysis (PP) | Mon Mar 24 | slides as PDF

lecture 21 | Analysis of gene function, 2 of 2 (PP) | Wed Mar 26 | slides as PDF

seminar 11 | TA office hours during seminar time ... group project work | Wed Mar 26

Fri Mar 28: Homework #2 due.

Week 13

lecture 22 | Resampling and the bootstrap (JB) | Mon Mar 31 | slides as PDF

lecture 23 | Guest lecture by STAT540 alum Dr. Sohrab Shah | Wed Apr 02

seminar 12 | TA office hours during seminar time ... group project work | Wed Apr 02

Week 14

lecture 24 | Poster session 9:30am - 12:00pm | Wed Apr 09 <-- NOTE THIS IS ON WEDNESDAY, instead of Monday. Location: Room 101, aka the multi-purpose room, on ground floor of the Michael Smith Building.

Seminars (guided analyses)

We will borrow some material from STAT 545A Exploratory Analysis, in addition to using content specific to STAT 540.