This course develops the foundations of predictive modeling by: introducing the
conceptual foundations of regression and multivariate analysis; developing
statistical modeling as a process that includes exploratory data analysis, model
identification, and model validation; and discussing the difference between the
uses of statistical models for statistical inference versus predictive modeling.
The high-level topics covered in the course include: exploratory data analysis,
statistical graphics, linear regression, automated variable selection, principal
components analysis, exploratory factor analysis, and cluster analysis.
I'm sharing artifacts from my completion of the course in the hope that someone may benefit from the approach taken. I'm sharing everything that I think is reasonable without treading on the publishers of the reference texts or the academic institution. If you've come upon this repository and use it for reference, please contact me and let me know what helped you. Alternatively, if you've found this and believe I'm in violation by sharing the material, please contact me first rather than issuing a take-down.
I've deliberately included the data along with the source code so that the work can be reproduced. However, I've omitted the course instruction, as sharing it would likely draw a claim from the institution. The course will also likely change significantly over time, so the value of this reference is limited. Still, I personally found references from decades prior helpful while working on this course, and felt that if I could benefit anyone I should share.
The overall caveat is that I'm just a student, typically out of time and under duress to complete the course material. I don't claim that this is a correct reference; rather, I know there are significant shortcomings. Read everything as a skeptic, and think critically rather than directly reusing something that I have done.
Northwestern's [Predictive Analytics](http://sps.northwestern.edu/program-areas/graduate/predictive-analytics/) program was the first I found that offers a fully accredited degree through a completely online program, with a curriculum oriented toward becoming a practitioner of Data Science. I'm excited to participate in the program and hope that the university doesn't take offense at my sharing this coursework.
I would like to see higher education adopt a more open form of education, much like what we're seeing in the massive open online courses (MOOCs) of late. I would further like to see institutions choose libre/open technology for their curricula, as the tools used in coursework are forms of expression, and proprietary ones become unavailable to students after separation from the institution. Where possible I've tried to use or reproduce work in libre/open form; however, time is always limited.