Update course book
actions-user committed Sep 23, 2024
1 parent 8cb8fda commit 7a3d1d4
Showing 31 changed files with 5,503 additions and 20 deletions.
2 changes: 1 addition & 1 deletion .buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
- config: e15a3bad2f69ae135502627e72bce1d7
+ config: 86cd2658efcd234a534d2ab3e8470ff2
tags: 645f666f9bcd5a90fca523b33c5a78b7
@@ -71,7 +71,7 @@
"\n",
"## Section 1: Introduction to the Process of Open Science\n",
"\n",
- "In this section, you will review the definition of several common terms in the context of open science, including research products, data, software, and results. In addition, you will read examples that demonstrate how these open-science tools are used in practice. The lesson wraps up with an example of how one group openly shared their data, results, software, and paper."
+ "In this section, you will review the definition of several common terms in the context of open science, including research products, data, software, and results. In addition, you will read examples that demonstrate how these open-science tools are used in practice. The section wraps up with an example of how one group openly shared their data, results, software, and paper."
]
},
{
@@ -594,7 +594,7 @@
"source": [
"### Key Takeaways\n",
"\n",
- "In this lesson, you learned:\n",
+ "In this section, you learned:\n",
"\n",
"- The definition of science tools, common examples, and which part of the scientific workflow they can support.\n",
"- The definition and purpose of persistent identifiers. The usefulness of ORCIDs and DOIs in the scientific process.\n",
@@ -882,7 +882,7 @@
"source": [
"### Key Takeaways \n",
"\n",
- "In this lesson, you learned:\n",
+ "In this section, you learned:\n",
"\n",
"- The different types of scientific data, including primary, secondary, published, and metadata.\n",
"- A list of open science practices to implement FAIR principles that make data and results easily accessible to a wide range of people.\n",
@@ -1279,7 +1279,7 @@
"source": [
"### Key Takeaways\n",
"\n",
- "In this lesson, you learned:\n",
+ "In this section, you learned:\n",
"\n",
"- The usefulness of digital tools that manage foster collaboration, and house open code.\n",
"- How version control systems like Git and platforms like GitHub can increase collaboration and management of code.\n",
@@ -1510,7 +1510,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.19"
+ "version": "3.9.20"
}
},
"nbformat": 4,
1 change: 1 addition & 0 deletions _sources/tutorials/W1D3_OpenData/chapter_title.md
@@ -0,0 +1 @@
# Open Data
31 changes: 31 additions & 0 deletions _sources/tutorials/W1D3_OpenData/further_reading.md
@@ -0,0 +1,31 @@
## Key terms

**Copyright** – A type of intellectual property that protects original works of authorship as soon as an author fixes the work in a tangible form of expression. Copyright law covers many types of works, including data products and software as well as books, poems, paintings, photographs, illustrations, musical compositions, and many more.

Note: Raw data, which are considered facts, are not covered by copyright law.

**Data** – Factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation.

**Data License** – Data licenses give any data creator a way to grant the public permission to use their products under copyright law. Similarly, data licenses give data users clear guidelines regarding how they can reuse the material.

Note: Raw data are not covered by copyright law.

**Data Products** – Data Products are reusable assets that process data and generate insights that help organizations make better decisions. Data products can include datasets, data streams, data feeds, APIs, code, data models, analytics models, and dashboards.

**CC-BY and CC0 License** – CC-BY and CC0 are Creative Commons data licenses. CC-BY allows reusers to distribute, remix, adapt, and build upon the material in any medium or format so long as attribution is given to the creator. The license allows for commercial use. CC0 allows creators to give up their copyright and put their works into the worldwide public domain. CC0 allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, with no conditions.

**Data Management Plan** – A data management plan (DMP) describes the what, where, when, and who for data that will be created during a research project. Common components of data management plans include a description of the type, volume, and format of the data; where and when the data will be made available; and who will make the data available. The plan can also describe data variables, sources, accuracy, and precision if that information is available.

**Metadata** – Data that describes data. It can be global – describing the overall contents of a single file or collection of files – or local – describing an individual variable within the file. Typically, global metadata offers information about who created the file, information about the data set, what satellite/instrument/lab/etc. created the set, the DOI, and file format information, among other metadata fields. Local metadata about variables contains information such as the full/long name of the variable, any scaling factors or uncertainty information, and measurement units.

**Machine-Readable Persistent Identifiers (PID)** – A unique string that identifies an object, such as a dataset. Though the online location of the object may change, the PID will not; it will always lead back to the data, so citations that reference the PID remain valid.

**Findable (data)** – Data that is readily discoverable to both humans and machines. It should include a unique persistent identifier and rich metadata describing the data and context and be registered in an index that is searchable.

**Accessible (data)** – Data that can be accessed over standard communication protocols, with metadata that can be accessed even if the data itself is no longer available.

**Interoperable (data)** – Data that uses controlled ontologies and vocabularies so that it can be used and/or combined with other relevant data sets in different applications.

**Reusable (data)** – Data that has a clear license, detailed provenance, and an adequate description/definition; meets community/domain standards; and can be replicated or combined with other data.

**Dataflow** – The data workflow that includes how data are used, made, and shared. Different actors will have different (or multiple) roles in this workflow.
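
The distinction between global and local metadata in the **Metadata** entry can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration: the dataset layout, the field names, the placeholder DOI, and the `physical_values` helper do not come from the course material or from any particular metadata standard.

```python
# Hypothetical dataset record: every name and value here is illustrative only.
dataset = {
    # Global metadata describes the file or collection as a whole.
    "global_metadata": {
        "creator": "Example Lab",
        "instrument": "example-instrument",
        "doi": "10.0000/example",  # placeholder string, not a real DOI
        "file_format": "CSV",
    },
    # Local metadata describes one variable inside the file.
    "variables": {
        "sst": {
            "long_name": "sea surface temperature",
            "units": "K",
            "scale_factor": 0.01,  # stored integers * scale_factor = physical value
            "values": [29015, 29120],
        }
    },
}


def physical_values(ds, name):
    """Apply the variable's local scale factor to its stored values."""
    var = ds["variables"][name]
    return [round(v * var["scale_factor"], 2) for v in var["values"]]


if __name__ == "__main__":
    var = dataset["variables"]["sst"]
    print(var["long_name"], physical_values(dataset, "sst"), var["units"])
```

In this sketch, a reader needs the local metadata (scale factor and units) to interpret the stored numbers, while the global metadata tells them who produced the file and how to cite it.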