LADCO/training-r-intro

Introduction to R for Air Quality Data Science

This repository contains training materials for learning the R programming language, tailored to air quality data science. The training is divided into lessons, each focusing on a different aspect of using R. Each lesson lives in its own folder, which includes a README.md file with the lesson material.

Prerequisites

These lessons do not assume that you have any experience using R or any other programming language. They do assume that you have some familiarity with air quality data. Below is a list of software and R packages that are used throughout the lessons.

Data Science and Programming

These lessons are meant to be self-learning materials for air quality data science using R. They also provide some instruction on how to program with R. These two topics are related, but it's helpful to understand the distinction.

  • Data Science is a collection of skills and methods for extracting insights from data. Unlike many uses of the term, ours does not include machine learning. Here, data science focuses on obtaining, storing, transforming, and visualizing data. Basic statistics and quality assurance are also touched on.

  • Programming, in our use of the term, is automating tasks using a computer language. In our case, we want to use R programming to automate air quality data science tasks and make our lives easier. We also want to use programming to handle a high volume of data and a wide variety of analyses.

The topics in these lessons are not strictly divided into data science lessons and programming lessons, but it may be helpful to keep the two separate in your mind as you progress through the material. Programming topics such as conditionals and loops (and the R version of loops, called apply functions) are difficult to understand at first. However, they are not essential for using R to read air quality data, transform it, and visualize the output for use in a report or presentation. Data science tasks are more straightforward, and they are the main topics in these lessons.
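To give a sense of what the programming topics mentioned above look like, here is a minimal sketch that flags high readings twice: once with a conditional inside a `for` loop, and once with `sapply()`, one of the apply functions. The ozone values, the `threshold_ppb` cutoff, and all variable names are made up for illustration and are not taken from the lessons.

```r
# Hypothetical hourly ozone readings in ppb (made-up values for illustration)
ozone_ppb <- c(31, 48, 72, 55)

# Illustrative cutoff only; not meant as a regulatory value
threshold_ppb <- 70

# Version 1: a for loop with a conditional
flags_loop <- character(length(ozone_ppb))
for (i in seq_along(ozone_ppb)) {
  if (ozone_ppb[i] > threshold_ppb) {
    flags_loop[i] <- "high"
  } else {
    flags_loop[i] <- "ok"
  }
}

# Version 2: the same logic with sapply(), which applies a function
# to each element and collects the results into a vector
flags_apply <- sapply(ozone_ppb, function(x) {
  if (x > threshold_ppb) "high" else "ok"
})

# Both versions should produce the same flags
print(flags_loop)
print(identical(flags_loop, flags_apply))
```

Both versions do the same job; the apply version is shorter, but the loop is often easier to read when you are starting out.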

Lessons

  1. Introduction to R: This lesson provides a basic introduction to R, including how to install and set up R and RStudio, an overview of R syntax, and how to perform simple operations.

  2. Functions and Importing Data: This lesson covers how to use functions in R, including built-in functions and functions from packages. It also discusses how to import data such as text data in CSV files and Excel workbooks.

  3. Subsetting, Sorting, and Combining Data Frames: This lesson covers how to subset data using indexing, logical operators, and the filter() function from dplyr. It also covers how to sort and combine data frames.

  4. Writing Functions, Conditionals, and Loops: This lesson introduces the concept of writing functions in R, using conditionals to control the flow of execution, and implementing loops for repetitive tasks.

  5. Plotting: This lesson focuses on visualizing data using various plotting techniques in R, including scatter plots, line graphs, and histograms.

  6. Basic Statistics: This lesson covers the basics of statistical analysis in R, including calculating means, medians, standard deviations, and correlations.

  7. Quality Assurance: This lesson discusses quality assurance in data analysis, including checking data types, handling outliers, dealing with missing data, and other common pitfalls.
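As a rough preview of the kind of tasks covered in lessons 3, 6, and 7, the sketch below uses only base R on a small made-up data frame. The site names, column names, and values are hypothetical, and the lessons themselves may use different functions (lesson 3, for example, uses `filter()` from dplyr for subsetting).

```r
# Made-up daily values; site names and measurements are hypothetical
aq <- data.frame(
  site  = c("A", "A", "A", "B", "B", "B"),
  pm25  = c(8.1, 12.4, NA, 15.0, 9.7, 11.2),  # one missing value on purpose
  ozone = c(35, 42, 40, 51, 38, 44)
)

# Quality assurance: inspect column types and count missing values
str(aq)
n_missing <- sum(is.na(aq$pm25))

# Subsetting: keep only the rows for site B using a logical condition
site_b <- aq[aq$site == "B", ]

# Sorting: order the site B rows from highest to lowest PM2.5
site_b_sorted <- site_b[order(site_b$pm25, decreasing = TRUE), ]

# Basic statistics: na.rm = TRUE skips the missing PM2.5 value
pm25_mean   <- mean(aq$pm25, na.rm = TRUE)
pm25_median <- median(aq$pm25, na.rm = TRUE)
pm25_sd     <- sd(aq$pm25, na.rm = TRUE)

# Correlation between PM2.5 and ozone, using only complete rows
pm25_ozone_cor <- cor(aq$pm25, aq$ozone, use = "complete.obs")

print(site_b_sorted)
print(c(mean = pm25_mean, median = pm25_median, sd = pm25_sd))
```

Note that without `na.rm = TRUE`, `mean()` returns `NA` when any value is missing, which is one of the common pitfalls that quality assurance checks are meant to catch.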

Contributing

Contributions to this repository are welcome. If you have a suggestion, please open an issue in this repository and let us know how we can improve the material. You can also submit a pull request.

Resources

Below is a list of helpful resources for learning R:

License

This project is licensed under the MIT License.
