
pbvahlst/support-workshops-f20

 
 


title: Support Workshops for Digital Literacy and Curriculum F20
place: online
time: Oct. 23, Oct. 30, Nov. 12, Nov. 19, Dec. 4
instructors: K.G. Kjelmann, R.D. Kristensen-McLachlan, M. Jacomy, & K.L. Nielbo

Support Workshops are optional hands-on introductions to a technical topic (e.g., data wrangling, web scraping, machine learning), offered exclusively to participants in the courses Digital Literacy and Digital Curriculum. There is no sign-up; just drop in when a workshop begins. Every workshop requires approximately one hour of preparation (see below), and tech support offers a pre-workshop preparation session one hour before each workshop (e.g., for a workshop that starts at 09:00, online preparation support begins at 08:00 in the same Zoom room). Zoom links are distributed to participants through Slack and the mailing list. For questions, please contact M. Andesen.

As several of the instructors are trained Software and Data Carpenters, we re-use material from The Carpentries' workshop curricula and, to a lesser extent, CodeRefinery's lesson material.

Support Workshops will continue in S21. If you have any requests for content or suggestions, please write to K.L. Nielbo. Planned topics for S21 are Machine Learning with Python #2 and Reproducible Coding with Python.

Workshops at a Glance

| Date | Time | Content | Instructor |
|---|---|---|---|
| Oct. 23 | 09:00-12:00 | Managing Data with OpenRefine | K.G. Kjelmann |
| Oct. 30 | 09:00-12:00 | Basic Scripting with Python* #1 | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Nov. 12 | 09:00-12:00 | Basic Scripting with Python* #2 | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Nov. 19 | 12:00-15:00 | Introduction to Web-Scraping | M. Jacomy & K.G. Kjelmann |
| Dec. 04 | 09:00-12:00 | Machine Learning with Python* #1 | R.D. Kristensen-McLachlan & K.L. Nielbo |

*) To accommodate R users, tech support will provide parallel scripts in R.

Managing Data with OpenRefine

A part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected or formatting made consistent. This step must be taken with the same care and attention to reproducibility as the analysis. OpenRefine is a powerful free and open source tool for working with messy data: cleaning it and transforming it from one format into another.

This lesson will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them literally months of work trying to make these edits by hand.
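To make the idea of "making formatting consistent" concrete, here is a minimal sketch in Python of grouping spelling variants under a normalized key. It is loosely modeled on the fingerprint keying idea that OpenRefine documents for clustering, but it is a simplified illustration and not OpenRefine's own code. The function names `fingerprint` and `cluster` and the sample values are assumptions made up for this example.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical spellings collide."""
    # Fold accented characters to their closest ASCII form.
    ascii_text = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    # Trim, lowercase, and replace punctuation with spaces.
    cleaned = ascii_text.strip().lower()
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    cleaned = cleaned.translate(table)
    # Sort the unique tokens so that word order no longer matters.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group raw values that share the same fingerprint."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return dict(groups)


# Hypothetical messy column, invented for illustration.
raw_values = ["Aarhus C", "aarhus c.", " C, Aarhus", "Odense", "ODENSE "]
clusters = cluster(raw_values)
for key, members in clusters.items():
    print(key, "<-", members)
```

Each cluster collects the variants that a person would then review and merge into one preferred spelling, which is the kind of edit the lesson performs interactively in OpenRefine.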

Preparation: install OpenRefine. First download it (DOWNLOAD), then follow the installation instructions (INSTALL). Tech support will help with installing OpenRefine on Oct. 23, 08:00-09:00.

The workshop is cloned and modified from OpenRefine for Social Science Data.

Basic Scripting with Python

The workshop, which consists of two sessions, introduces how researchers can use basic scripting in Python (and R) to manipulate data, automate analysis, and make research pipelines reproducible. The goal is to provide tools that make it easier to get more done with less work while, at the same time, facilitating open and reproducible science.

The lessons will teach you how to use variables, data structures, control structures, functions, and error handling in Python. We use Jupyter Notebooks to interactively run Python code in the browser.
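As a rough preview of how these building blocks fit together, the following sketch combines variables, a data structure, a loop, a function, and error handling in a few lines. The function `mean` and the sample readings are invented for illustration and are not taken from the lesson material.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


# Variables and a data structure: a dictionary mapping sample names to readings.
readings = {"sample_a": [2.0, 4.0, 6.0], "sample_b": [1.5, 2.5], "sample_c": []}

summary = {}
for name, values in readings.items():   # control structure: a loop
    try:
        summary[name] = mean(values)    # function call
    except ValueError as error:         # error handling
        print(f"Skipping {name}: {error}")
        summary[name] = None

print(summary)
```

The empty list for `sample_c` triggers the `ValueError`, so the script reports the problem and carries on instead of stopping halfway through the data.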

Preparation: Jupyter is offered in the cloud by tech support. Should you want to install it locally, please download and install the individual Anaconda Distribution, obtain the lesson material, and follow Option A: Jupyter Notebook.

| Episode | Content | Instructor |
|---|---|---|
| A Python Calculator | Basic data types; variable assignment | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Analyzing Tabular Data | Why tabular data?; processing tabular data | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Visualizing Tabular Data | Basic visualization; group visualizations | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Repeating Operations | Computers love repetitions; applying the same operation to different values | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Collections of Values | How to bundle values; list operations | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Analyzing Multiple Files | Reading many files; visualizing many files | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Logical Conditions | Making choices; Boolean operators | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Packaging Code in Functions | Function definition; defining vs. calling | R.D. Kristensen-McLachlan & K.L. Nielbo |
| Errors and Exceptions | Is Python rude?; error handling | R.D. Kristensen-McLachlan & K.L. Nielbo |
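The episodes on tabular data and multiple files can be sketched as follows. This example uses only the Python standard library so that it runs without extra installs; the lessons themselves may rely on other libraries. The file names, the data, and the helper `column_means` are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def column_means(path):
    """Read a headerless numeric CSV file and return the mean of each column."""
    with open(path, newline="") as handle:
        rows = [[float(cell) for cell in row] for row in csv.reader(handle)]
    # zip(*rows) turns the list of rows into a sequence of columns.
    return [sum(column) / len(column) for column in zip(*rows)]


# Create two small, made-up data files so the example runs anywhere.
workdir = Path(tempfile.mkdtemp())
(workdir / "data-01.csv").write_text("0,1,2\n2,3,4\n")
(workdir / "data-02.csv").write_text("1,1,1\n3,5,7\n")

# Process every matching file in a predictable (sorted) order.
results = {}
for path in sorted(workdir.glob("data-*.csv")):
    results[path.name] = column_means(path)
    print(path.name, results[path.name])
```

Wrapping the per-file work in a function and looping over a file pattern means the same analysis applies unchanged whether there are two files or two hundred.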

Introduction to Web Scraping and Crawling

The workshop introduces basic internet technology and shows how to automatically query and extract data available through the internet using Beautiful Soup and Hyphe:

| Episode | Content | Instructor |
|---|---|---|
| How does the internet work? | The structure of the internet and the web: IP, DNS, browser, HTML...; what you need to know as a scholar | M. Jacomy |
| Accessing the internet with Python | Making an HTTP request; downloading basic data | K.G. Kjelmann |
| Parsing HTML with Beautiful Soup in Python | Dealing with web data; writing a simple script | K.G. Kjelmann |
| Web Crawlers | Differences between scraping and crawling; different tools for different needs (harvesting, exploring, archiving...); an example with the crawler Hyphe | M. Jacomy & K.G. Kjelmann |
| Working with the internet | Methodological, ethical and legal considerations | M. Jacomy & K.G. Kjelmann |
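To give a feel for what "parsing HTML" means, here is a minimal sketch that extracts links from a page. Note the swap: the workshop uses Beautiful Soup, while this example uses Python's built-in `html.parser` as a stand-in so that it runs without installing anything. The class `LinkCollector`, the sample page, and the `example.org` addresses are assumptions invented for illustration, and no network request is made.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (text, absolute URL) pairs for every <a href=...> element."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page address.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# A made-up page; in practice the HTML would come from an HTTP request,
# for example urllib.request.urlopen(url).read().decode("utf-8").
page = """
<html><body>
  <h1>Workshops</h1>
  <a href="/openrefine">Managing Data</a>
  <a href="https://example.org/python">Basic <b>Scripting</b></a>
  <a name="top">no link here</a>
</body></html>
"""

parser = LinkCollector("https://example.org/")
parser.feed(page)
for text, url in parser.links:
    print(text, "->", url)
```

A scraper stops here, with data from one page. A crawler would feed the collected URLs back in as new pages to visit, which is the difference the Web Crawlers episode covers.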

Machine Learning with Python

The Support Holiday special is an introduction to machine learning with Python (and R). Throughout Digital Literacy and Curriculum, we have seen examples of how AI, machine learning, and deep learning can accelerate (and automate) research tasks in the humanities and social sciences. Now it is time to get our hands dirty! The workshop focuses on text classification with classical machine learning using scikit-learn in Python. At the end, we briefly extend the use case to deep neural networks and image classification.

| Episode | Content | Instructor |
|---|---|---|
| Data preparation | Feature engineering; train-test split | K.L. Nielbo |
| Model definition | Task definition; parameters vs. hyperparameters | K.L. Nielbo |
| Model training | Training goal; training steps | K.L. Nielbo |
| Model evaluation | Performance metrics; validation procedures | K.L. Nielbo |
| Parameter tuning | Hyperparameter tuning; art form vs. science | K.L. Nielbo |
| Application | Predictive vs. statistical modeling | K.L. Nielbo |
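The steps in the table (data preparation, model definition, training, evaluation) can be sketched end to end on a toy text classification task. Note the swap: the workshop uses scikit-learn, while this example implements a small multinomial Naive Bayes classifier with the standard library only, so that it runs without extra installs. The eight sentences, the labels, and the function names are invented for illustration, and with so little data the accuracy figure carries no real meaning.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    """Feature engineering: turn a text into lowercase word tokens."""
    return text.lower().split()


def train(examples):
    """Fit a multinomial Naive Bayes model; the counts are its parameters."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def predict(model, text, alpha=1.0):
    """Return the most probable label; alpha is a smoothing hyperparameter."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)
        denominator = sum(word_counts[label].values()) + alpha * len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][token] + alpha) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up toy corpus of (text, label) pairs.
data = [
    ("the film was wonderful and moving", "positive"),
    ("a wonderful cast and a great story", "positive"),
    ("great acting and a moving ending", "positive"),
    ("what a great and wonderful film", "positive"),
    ("the film was dull and boring", "negative"),
    ("a boring story and terrible acting", "negative"),
    ("terrible pacing and a dull ending", "negative"),
    ("what a dull and terrible film", "negative"),
]

# Train-test split: shuffle with a fixed seed, hold out a quarter for testing.
random.Random(0).shuffle(data)
split = int(0.75 * len(data))
train_set, test_set = data[:split], data[split:]

# Training uses only the training set; evaluation uses only held-out texts.
model = train(train_set)
correct = sum(predict(model, text) == label for text, label in test_set)
accuracy = correct / len(test_set)
print(f"accuracy on {len(test_set)} held-out texts: {accuracy:.2f}")
```

The word counts are parameters learned from the data, whereas `alpha` is a hyperparameter chosen by the analyst. Trying several values of `alpha` and comparing the held-out accuracy is a small-scale version of hyperparameter tuning.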
