Lecture notes (in the form of slides) and exercises in Python, using ipython-notebook, for teaching data and media analysis. They include introductions to Python, Numpy, Scipy, Scikit-Learn, and SimpleCV, and cover the topics Supervised/Unsupervised Learning, Signal Analysis, Image Analysis, and Text and Web-Media Analysis.
The lecture notes are optimized for presentation. In order to use them, invoke
ipython nbconvert --to=slides --post serve path-to-lecture-notes
to start the presentation (a browser window should open automatically).
You can also try to get livereveal.js up and running in your ipython environment.
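If you want to present several notebooks in a row, the command above can be generated once per notebook. The following is a minimal sketch, not part of the lecture notes: `slide_commands` is a hypothetical helper name, and it assumes the notebooks are `.ipynb` files somewhere below the directory you pass in. It only builds the argument lists and does not start anything itself.

```python
from pathlib import Path


def slide_commands(notes_dir):
    """Build one nbconvert slide command per notebook found below notes_dir.

    Hypothetical helper: it mirrors the command shown above and returns
    argument lists without executing them.
    """
    notebooks = sorted(
        nb for nb in Path(notes_dir).rglob("*.ipynb")
        # Skip the autosave copies that the notebook server keeps.
        if ".ipynb_checkpoints" not in nb.parts
    )
    return [
        ["ipython", "nbconvert", "--to=slides", "--post", "serve", str(nb)]
        for nb in notebooks
    ]
```

Each returned list can be passed to `subprocess.run` to serve that notebook, one at a time.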
This work is largely based on a number of great tutorials and resources from all over the web, compiled by great people from very different domains. Without their effort and their willingness to make their hard work open access, I would not have been able to compile this tutorial. The individual contributions are listed at the beginning of every part.
- Introduction - Why, What, Who, How
- The point of view of Web Mining (Course Web Mining Project@University Passau)
- Part I: Scientific Programming in Python
- Introduction
- Programming Basics
- Numpy in a Nutshell
- [Exercise 3.1. Data Structures and Operations in Numpy](http://nbviewer.ipython.org/urls/raw.github.com/mgrani/LODA-lecture-notes-on-data-analysis/master/I.Data-Science-in-Python/exercises/Exercise%20DSiP-3-1-Numpy.ipynb)
- Scipy in a Nutshell
- Matplotlib in a Nutshell
- [Exercise DSiP-5-1-Analysing the Iris Dataset with Matplotlib](http://nbviewer.ipython.org/urls/raw.github.com/mgrani/LODA-lecture-notes-on-data-analysis/master/I.Data-Science-in-Python/exercises/Exercise%20DSiP-5-1-Analysing%20the%20Iris%20Dataset%20with%20Mathplotlib-.ipynb)
- Pandas-based Data Analysis
- [Exercise 6.1. Analysing New York Open Data with Pandas](http://nbviewer.ipython.org/urls/raw.github.com/mgrani/LODA-lecture-notes-on-data-analysis/master/I.Data-Science-in-Python/exercises/Exercise%20DSiP-6-1-Pandas-NYC-Open-Data.ipynb)
- Part II: Machine Learning and Data Mining [in Python]
- Machine Learning in a Nutshell with scikit-learn
- Machine Learning Basics
- On the Data
- Regression
- A simple Example
- Least Mean Square Algorithm - a Gradient Descent approach to regression
- Regression and Classification - using the analytical solution
- Concept Learning
- Measuring Performance
- Decision Trees
- Decision Tree Basics
- Impurity Functions
- Decision Tree Algorithms
- Decision Tree Pruning
- Statistical Learning
- Probability Basics
- Bayes Classification
- Graphical Models
- Linear Models
- Kernel Models
- Neural Networks
- Perceptron Learning
- Multilayer Perceptrons
- Deep Learning
- Ensemble Classifiers
- Cluster Analysis
- Dimensionality Reduction and Manifold Learning
- Dimensionality Reduction and Manifold Learning with scikit-learn
- Association Rules
- Reinforcement Learning
- Deep Learning
- Part III: Natural Language Processing [in Python]
- Part IV: Visual Analytics
- Part V: Social Network Analysis
- Part X. Web Mining Applications
This work is licensed under a Creative Commons 3.0 license.
Powered by zenodo (Join and contribute your own open material)