lecture1.md

What is Data Science?

  • Data science is about drawing useful conclusions from large and diverse data sets through exploration, prediction, and inference.
    • exploration involves identifying patterns in information
    • prediction involves using information to make informed guesses about unknown values
    • inference involves quantifying our degree of certainty
  • The primary tools for
    • exploration are visualizations and descriptive statistics
    • prediction are machine learning and optimization
    • inference are statistical tests and models
  • Statistics is a central component of data science because it studies how to make robust conclusions based on incomplete information.
  • Computing is a central component of data science because it allows us to apply analysis techniques to the large and diverse data sets that arise in real-world applications, including
    • numbers
    • text
    • images
    • videos
    • sensor readings
  • Data Science is all of these things, but it is more than the sum of its parts because of the applications.
  • Through understanding a particular domain, data scientists learn to ask appropriate questions about their data and correctly interpret the answers provided by inferential and computational tools.

Why Data Science?

  • The degree of uncertainty for many decisions can be reduced sharply by
    • access to large data sets, and
    • the computational tools required to analyze them effectively
  • Data-driven decision making has already transformed a tremendous breadth of industries.
  • A wide range of academic disciplines are evolving rapidly to incorporate large-scale data analysis into their theory and practice.