Tools and Techniques in Data Science

Credit Hours: 3

Prerequisites: None

Course Contents:

  • Introduction to Data Science
  • Data Science Life Cycle & Process
    • Asking the Right Questions
    • Obtaining Data
    • Understanding Data
    • Building Predictive Models
    • Generating Visualizations For Building Data Products
  • Introduction to Data (Types of Data and Datasets)
  • Data Quality (Measurement and Data Collection Issues)
  • Data Pre-processing Stages
    • Aggregation
    • Sampling
    • Dimensionality Reduction
    • Feature Subset Selection
    • Feature Creation
  • Algebraic & Probabilistic View of Data
  • Introduction to Python
  • Data Science Stack
    • Python
    • NumPy
    • Pandas
    • Matplotlib
  • Relational Algebra & SQL
  • Scraping & Data Wrangling
    • Assessing
    • Structuring
    • Cleaning & munging of data
  • Basic Descriptive & Exploratory Data Analysis
  • Introduction to Text Analysis
    • Stemming
    • Lemmatization
    • Bag of Words
    • TF-IDF
  • Introduction to Prediction and Inference
  • Supervised & Unsupervised Algorithms
  • Introduction to scikit-learn
  • Bias-Variance Trade-off
  • Model Evaluation & Performance Metrics
    • Accuracy
    • Contingency Matrix
    • Precision-Recall
    • F1 Score
    • Lift
  • Introduction to Map-Reduce paradigm
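
As a small illustration of the Model Evaluation & Performance Metrics topic above, here is a minimal sketch in plain Python that derives accuracy, precision, recall and F1 from the four cells of a binary contingency matrix. The function names (`confusion_counts`, `evaluate`) and the sample labels are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four cells of a binary contingency (confusion) matrix."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # true positive
        elif predicted == positive:
            fp += 1  # false positive
        elif actual == positive:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only (not real course data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In practice, scikit-learn's `sklearn.metrics` module provides equivalent functions such as `precision_score`, `recall_score` and `f1_score`; the hand-written version is shown only to make the definitions explicit.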

Course Assessment:

Mid Term Exam: 30 %

Terminal Exam: 40 %

Assignment: 20 %

Quizzes: 10 %

Reference Materials:

Books:

  1. Python for Data Analysis, 1st Edition, William McKinney
  2. An Introduction to Statistical Learning with Applications in R, 1st Edition, G. James, D. Witten, T. Hastie and R. Tibshirani
  3. Computational and Inferential Thinking: The Foundations of Data Science, 1st Edition, A. Adhikari and J. DeNero
  4. Data Mining and Analysis: Fundamental Concepts and Algorithms, 1st Edition, M. Zaki and W. Meira
  5. Data Science from Scratch, 1st Edition, Joel Grus
  6. Doing Data Science, 1st Edition, Cathy O'Neil and Rachel Schutt
  7. Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications, 1st Edition, Laura Igual

Useful Online Resources:

  1. Kaggle
  2. UCI Machine Learning Repository
  3. Introduction to Decision Trees
  4. Introduction to Natural Language Processing