Commit

update README, fix misspelling (#12)
* update README, fix misspelling

* update description

* update description

* update README.md

* readme and vignettes typos

* readme version

* news update

* rm README.html, rerender README.md

---------

Co-authored-by: gorgitko <[email protected]>
pfeiferl and gorgitko authored Sep 14, 2024
1 parent ae3452f commit 7eb50d5
Showing 11 changed files with 70 additions and 41 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,4 @@ tests/testthat/run_pipeline_vignette_config_patches/*/*.yaml*
!tests/testthat/run_pipeline_vignette_config_patches/*/*.default.yaml
/doc/
/Meta/
README.html
8 changes: 7 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: scdrake
Type: Package
Title: A pipeline for droplet-based single-cell RNA-seq data secondary analysis implemented in the drake Make-like toolkit for R language
Version: 1.5.2
Version: 1.6.0
Authors@R:
c(
person(
Expand All @@ -17,6 +17,12 @@ Authors@R:
role = c("aut"),
email = "[email protected]"
),
person(
given = "Lucie",
family = "Pfeiferova",
role = c("aut"),
email = "[email protected]"
),
person(
given = "Michal",
family = "Kolar",
Expand Down
6 changes: 6 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
# scdrake 1.6.0
- `scdrake` now allows processing of spatial transcriptomics data from spot-based technologies (Visium).
- See `vignette("scdrake_spatial")`.
- Added annotation using user-defined marker genes.
- Updated `stage_input_qc` and `stage_norm_clustering` vignettes.

# scdrake 1.5.0

- Major refactoring:
Expand Down
16 changes: 10 additions & 6 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,10 @@ knitr::opts_chunk$set(
[![Overview and outputs](https://img.shields.io/badge/Overview%20&%20outputs-vignette("pipeline_overview")-informational)](https://bioinfocz.github.io/scdrake/articles/pipeline_overview.html)
[![Pipeline diagram](https://img.shields.io/badge/Pipeline%20diagram-Show-informational)](https://github.com/bioinfocz/scdrake/blob/main/diagrams/README.md)
![License](https://img.shields.io/github/license/bioinfocz/scdrake)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![Docker Image CI](https://github.com/bioinfocz/scdrake/actions/workflows/docker-ci.yml/badge.svg?branch=main)](https://github.com/bioinfocz/scdrake/actions/workflows/docker-ci.yml)

`{scdrake}` is a scalable and reproducible pipeline for secondary analysis of droplet-based single-cell RNA-seq data.
`{scdrake}` is a scalable and reproducible pipeline for secondary analysis of droplet-based single-cell RNA-seq data (scRNA-seq) and spot-based spatial transcriptomics data (SRT).
`{scdrake}` is an R package built on top of the `{drake}` package, a [Make](https://www.gnu.org/software/make)-like pipeline
toolkit for [R language](https://www.r-project.org).

Expand All @@ -34,9 +34,13 @@ The main features of the `{scdrake}` pipeline are:
- Import of scRNA-seq data:
[10x Genomics Cell Ranger](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger)
output, delimited table, or `SingleCellExperiment` object.
- Quality control and filtering of cells and genes, removal of empty droplets.
- Import of SRT data:
[10x Genomics Space Ranger](https://www.10xgenomics.com/support/software/space-ranger/latest/getting-started/what-is-space-ranger)
output, delimited table, or `SingleCellExperiment` object, and tissue positions file as in Space ranger.
- Quality control and filtering of cells/spots and genes, removal of empty droplets.
- Highly variable genes detection, cell cycle scoring, normalization, clustering, and dimensionality reduction.
- Cell type annotation.
- Spatially variable genes detection (for SRT data)
- Cell type annotation using reference sets, cell type annotation using user-provided marker genes.
- Integration of multiple datasets.
- Computation of cluster markers and differentially expressed genes between clusters (denoted as "contrasts").
- Rich graphical and HTML outputs based on customizable RMarkdown documents.
Expand Down Expand Up @@ -378,7 +382,7 @@ By contributing to this project, you agree to abide by its terms.
### Funding

This work was supported by [ELIXIR CZ](https://www.elixir-czech.cz) research infrastructure project
(MEYS Grant No: LM2018131) including access to computing and storage facilities.
(MEYS Grant No: LM2018131 and LM2023055) including access to computing and storage facilities.

### Software and methods used by `{scdrake}`

Expand All @@ -402,4 +406,4 @@ Many things are used by `{scdrake}`, but these are really worth mentioning:
- The code is styled automatically thanks to `{styler}`.
- The documentation is formatted thanks to `{devtools}` and `{roxygen2}`.

This package was developed using `{biocthis}`.
This package was developed using `{biocthis}`.
39 changes: 24 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,13 @@ outputs](https://img.shields.io/badge/Overview%20&%20outputs-vignette(%22pipelin
diagram](https://img.shields.io/badge/Pipeline%20diagram-Show-informational)](https://github.com/bioinfocz/scdrake/blob/main/diagrams/README.md)
![License](https://img.shields.io/github/license/bioinfocz/scdrake)
[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
stable](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![Docker Image
CI](https://github.com/bioinfocz/scdrake/actions/workflows/docker-ci.yml/badge.svg?branch=main)](https://github.com/bioinfocz/scdrake/actions/workflows/docker-ci.yml)

`{scdrake}` is a scalable and reproducible pipeline for secondary
analysis of droplet-based single-cell RNA-seq data. `{scdrake}` is an R
analysis of droplet-based single-cell RNA-seq data (scRNA-seq) and
spot-based spatial transcriptomics data (SRT). `{scdrake}` is an R
package built on top of the `{drake}` package, a
[Make](https://www.gnu.org/software/make)-like pipeline toolkit for [R
language](https://www.r-project.org).
Expand All @@ -28,11 +29,17 @@ The main features of the `{scdrake}` pipeline are:
- Import of scRNA-seq data: [10x Genomics Cell
Ranger](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger)
output, delimited table, or `SingleCellExperiment` object.
- Quality control and filtering of cells and genes, removal of empty
droplets.
- Import of SRT data: [10x Genomics Space
Ranger](https://www.10xgenomics.com/support/software/space-ranger/latest/getting-started/what-is-space-ranger)
output, delimited table, or `SingleCellExperiment` object, and
tissue positions file as in Space ranger.
- Quality control and filtering of cells/spots and genes, removal of
empty droplets.
- Highly variable genes detection, cell cycle scoring, normalization,
clustering, and dimensionality reduction.
- Cell type annotation.
- Spatially variable genes detection (for SRT data)
- Cell type annotation using reference sets, cell type annotation
using user-provided marker genes.
- Integration of multiple datasets.
- Computation of cluster markers and differentially expressed genes
between clusters (denoted as “contrasts”).
Expand Down Expand Up @@ -108,8 +115,8 @@ You can pull the Docker image with the latest stable `{scdrake}` version
using

``` bash
docker pull jirinovo/scdrake:1.5.2
singularity pull docker:jirinovo/scdrake:1.5.2
docker pull jirinovo/scdrake:1.6.0
singularity pull docker:jirinovo/scdrake:1.6.0
```

or list available versions in [our Docker Hub
Expand Down Expand Up @@ -151,7 +158,7 @@ docker run -d \
-e USERID=$(id -u) \
-e GROUPID=$(id -g) \
-e PASSWORD=1234 \
jirinovo/scdrake:1.5.2
jirinovo/scdrake:1.6.0
```

For Singularity, also make shared directories and execute the container
Expand Down Expand Up @@ -234,7 +241,7 @@ for `{scdrake}` and you can use it to install all dependencies by

``` r
## -- This is a lockfile for the latest stable version of scdrake.
download.file("https://raw.githubusercontent.com/bioinfocz/scdrake/1.5.2/renv.lock")
download.file("https://raw.githubusercontent.com/bioinfocz/scdrake/1.6.0/renv.lock", destfile = "renv.lock")
## -- You can increase the number of CPU cores to speed up the installation.
options(Ncpus = 2)
renv::restore(lockfile = "renv.lock", repos = BiocManager::repositories())
Expand All @@ -254,7 +261,7 @@ installed from the lockfile).

``` r
remotes::install_github(
"bioinfocz/scdrake@1.5.2",
"bioinfocz/scdrake@1.6.0",
dependencies = FALSE, upgrade = FALSE,
keep_source = TRUE, build_vignettes = TRUE,
repos = BiocManager::repositories()
Expand Down Expand Up @@ -321,7 +328,7 @@ vignette](https://bioinfocz.github.io/scdrake/articles/scdrake.html)
## Vignettes and other readings

See <https://bioinfocz.github.io/scdrake> for a documentation website of
the latest stable version (1.5.2) where links to vignettes below become
the latest stable version (1.6.0) where links to vignettes below become
real :-)

See <https://bioinfocz.github.io/scdrake/dev> for a documentation
Expand All @@ -341,6 +348,7 @@ website of the current development version.
- General information:
- Pipeline overview: `vignette("pipeline_overview")`
- FAQ & Howtos: `vignette("scdrake_faq")`
- Spatial extension: `vignette("scdrake_spatial")`
- Command line interface (CLI): `vignette("scdrake_cli")`
- Config files (internals): `vignette("scdrake_config")`
- Environment variables: `vignette("scdrake_envvars")`
Expand All @@ -352,8 +360,9 @@ website of the current development version.
- Stage `01_input_qc`: reading in data, filtering, quality
control -\> `vignette("stage_input_qc")`
- Stage `02_norm_clustering`: normalization, HVG selection,
dimensionality reduction, clustering, cell type annotation
-\> `vignette("stage_norm_clustering")`
SVG selection, dimensionality reduction, clustering,
(marker-based) cell type annotation -\>
`vignette("stage_norm_clustering")`
- Integration pipeline:
- Stage `01_integration`: reading in data and integration -\>
`vignette("stage_integration")`
Expand Down Expand Up @@ -436,8 +445,8 @@ contributing to this project, you agree to abide by its terms.
### Funding

This work was supported by [ELIXIR CZ](https://www.elixir-czech.cz)
research infrastructure project (MEYS Grant No: LM2018131) including
access to computing and storage facilities.
research infrastructure project (MEYS Grant No: LM2018131 and LM2023055)
including access to computing and storage facilities.

### Software and methods used by `{scdrake}`

Expand Down
2 changes: 1 addition & 1 deletion _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ navbar:
text: Integration pipeline guide
href: articles/scdrake_integration.html
spatial:
text: Spatial extention
text: Spatial extension
href: articles/scdrake_spatial.html
faq:
text: FAQ & Howtos
Expand Down
8 changes: 5 additions & 3 deletions inst/Rmd/single_sample/02_norm_clustering.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -153,7 +153,9 @@ downstream methods. We want to select genes that contain useful information abou
removing genes that contain random noise. This aims to preserve interesting biological structure without the variance
that obscures that structure, and to reduce the size of the data to improve computational efficiency of later steps.

More information in [OSCA](https://bioconductor.org/books/3.15/OSCA.basic/feature-selection.html)
In SRT, we can identify spatially variable genes (SVGs). We define SVGs as genes with spatially correlated patterns of expression across the tissue area. Based on the paper by Li et al. (2021), we decided to generate a combined set of HVGs and SVGs.

More information in [OSCA](https://bioconductor.org/books/3.15/OSCA.basic/feature-selection.html) and [BestPracticesST](https://lmweber.org/BestPracticesST/).
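
As a minimal sketch of what a combined HVG and SVG feature set could look like, the snippet below takes the union of two gene ID vectors. The function name `combine_feature_sets()` and the gene IDs are hypothetical and used only for illustration; this is not necessarily how `{scdrake}` builds the set internally.

```r
# Hypothetical helper (not part of scdrake): combine HVGs and SVGs into one
# feature set, keeping each gene once and preserving the order of first occurrence.
combine_feature_sets <- function(hvg_ids, svg_ids) {
  union(hvg_ids, svg_ids)
}

# Made-up gene IDs for illustration.
hvg_ids <- c("GeneA", "GeneB", "GeneC")
svg_ids <- c("GeneC", "GeneD")
combined_ids <- combine_feature_sets(hvg_ids, svg_ids)
```

A gene that is both highly variable and spatially variable (here `GeneC`) appears only once in the combined set.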

```{r, results = "asis"}
scdrake::catg0('**HVG metric: "{hvg_metric}"**\n\n')
Expand Down Expand Up @@ -363,11 +365,11 @@ if (!is.null(cfg$CELL_ANNOTATION_SOURCES)) {

```{r, results = "asis"}
if (isTRUE(cfg$MANUAL_ANNOTATION)) {
scdrake::md_header("Manual cell annotation", 1, extra = "{.tabset}")
scdrake::md_header("Marker-based cell annotation", 1, extra = "{.tabset}")
scdrake::catn(
glue::glue("**Annotation was done for {cfg$ANNOTATION_CLUSTERING}**"))
cat("\n\n")
cat("For manual annotation we modified an implemented function from the Giotto package. The enrichment Z score is calculated by using method (PAGE) from Kim SY et al., BMC bioinformatics, 2005 as $$ Z = \frac{((Sm – mu)*m^\frac{1}{2})}{delta} $$. \n
cat("For marker-based annotation we adapted a function from the Giotto package. The enrichment Z score is calculated using the PAGE method (Kim SY et al., BMC Bioinformatics, 2005) as $$ Z = \\frac{((Sm - mu)*m^\\frac{1}{2})}{delta} $$. \n
For each gene in each spot/cell, mu is the fold change values versus the mean expression
and delta is the standard deviation. Sm is the mean fold change value of a specific marker gene set
and m is the size of a given marker gene set.")
Expand Down
3 changes: 2 additions & 1 deletion vignettes/_vignette_signpost.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
- General information:
- Pipeline overview: `vignette("pipeline_overview")`
- FAQ & Howtos: `vignette("scdrake_faq")`
- Spatial extension: `vignette("scdrake_spatial")`
- Command line interface (CLI): `vignette("scdrake_cli")`
- Config files (internals): `vignette("scdrake_config")`
- Environment variables: `vignette("scdrake_envvars")`
Expand All @@ -19,7 +20,7 @@
- Pipelines and stages:
- Single-sample pipeline:
- Stage `01_input_qc`: reading in data, filtering, quality control -> `vignette("stage_input_qc")`
- Stage `02_norm_clustering`: normalization, HVG selection, dimensionality reduction, clustering, cell type annotation
- Stage `02_norm_clustering`: normalization, HVG selection, SVG selection, dimensionality reduction, clustering, (marker-based) cell type annotation
-> `vignette("stage_norm_clustering")`
- Integration pipeline:
- Stage `01_integration`: reading in data and integration -> `vignette("stage_integration")`
Expand Down
14 changes: 7 additions & 7 deletions vignettes/scdrake_spatial.Rmd
Original file line number Diff line number Diff line change
@@ -1,33 +1,33 @@
---
title: "Spatial extention"
title: "Spatial extension"
date: "`r glue::glue('<sup>Document generated: {format(Sys.time(), \"%Y-%m-%d %H:%M:%S %Z%z</sup>\")}')`"
package: scdrake
output:
BiocStyle::html_document:
toc: true
toc_float: true
vignette: >
%\VignetteIndexEntry{Spatial extention}
%\VignetteIndexEntry{Spatial extension}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

***

`{scdrake}` now offer spatial extension for the first stage (`01_input_qc`) and the second stage (`02_norm_clustering`) of the single-sample pipeline. The spatial possibility is aimed on Visium technology, respectively on spot-based technologies. Scdrake provides comparable results with Seurat, Giotto (R), as well as scanpy (Python). However, we strongly discourage usage of scdrake for other technologies than Visium. For futher analyses of spatial dataset we recommend [CARD](https://github.com/YMa-lab/CARD) for deconvolution and [CellChat2](https://github.com/SiYangming/CellChat2) or [IGAN](https://github.com/Zhu-JC/IGAN) for cell-cell interaction.
`{scdrake}` now offers a spatial extension for the first stage (`01_input_qc`) and the second stage (`02_norm_clustering`) of the single-sample pipeline. The spatial extension is aimed at Visium technology, i.e. at spot-based technologies. `{scdrake}` provides results comparable with Seurat, Giotto (R), as well as scanpy (Python), and corresponds to [Best Practices for Spatial Transcriptomics](https://lmweber.org/BestPracticesST/). For now, we discourage usage of `{scdrake}` for technologies other than Visium. For further analyses of the spatial dataset we recommend [CARD](https://github.com/YMa-lab/CARD) for deconvolution and [CellChat2](https://github.com/SiYangming/CellChat2) or [IGAN](https://github.com/Zhu-JC/IGAN) for cell-cell interaction.

This vignette should serve as a supplement to other vignettes, such as `vignette("stage_input_qc")` and `vignette("stage_norm_clustering")`.


***

## Spatial extention functions
## Spatial extension functions

***

### Spatial visualization

For (`01_input_qc`) and (`02_norm_clustering`) of the single-sample pipeline we now offer visualization of tissue, as pseudo tissue spot visualization. Spatial extention will add spot coordinates (array_col and array_row) from SpaceRanger tissue_possitions.csv file, and will filter away all spots, that are by SpaceRanger labeled as not in tissue. Visualization function are implemented from the [Giotto package](https://drieslab.github.io/Giotto_website/). Visualization is automatically used for quality control and dimension reduction results.
For the stages `01_input_qc` and `02_norm_clustering` of the single-sample pipeline we now offer visualization of tissue as a pseudo-tissue spot visualization. The spatial extension adds spot coordinates (`array_col` and `array_row`) from the SpaceRanger `tissue_positions.csv` file and filters out all spots that SpaceRanger labels as not in tissue. Visualization functions are adapted from the [Giotto package](https://drieslab.github.io/Giotto_website/). Visualization is automatically used for quality control and dimension reduction results.

***

Expand All @@ -37,8 +37,8 @@ For spatial analyses in stage 02_norm_clustering `vignette("stage_norm_clusterin

***

### Manual annotation
### Marker-based annotation

Manual annotation was implemented for both single-cell and spatial datasets. In summary, expression profiles and statistical metrics are computed for each cell/spot, the result is visualized using a heatmap and dimension reduction plot. For spatial datasets is enabled to visualized results in tissue coordinates, both enrichment plots for each annotation label (individual enrichment plots) and for overall results for each spot. Manual annotation is implemented from the [Giotto package](https://drieslab.github.io/Giotto_website/), the function is based on [Kim SY et al](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-6-144).
Marker-based annotation was implemented for both single-cell and spatial datasets. In summary, expression profiles and statistical metrics are computed for each cell/spot, and the result is visualized using a heatmap and a dimension reduction plot. For spatial datasets, results can also be visualized in tissue coordinates, both as enrichment plots for each annotation label (individual enrichment plots) and as overall results for each spot. Marker-based annotation is adapted from the [Giotto package](https://drieslab.github.io/Giotto_website/); the function is based on [Kim SY et al](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-6-144).
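
To make the PAGE-style enrichment score more concrete, here is a minimal sketch in base R. It assumes a genes x spots matrix of log-scale normalized expression, so that the fold change versus the mean expression becomes a difference. The function name `page_z_score()` and the toy data are hypothetical; the Giotto-derived function used by `{scdrake}` may differ in details.

```r
# Hypothetical helper (not the scdrake/Giotto implementation): PAGE-style
# enrichment Z score of one marker gene set for every spot/cell.
# `expr` is assumed to be a genes x spots matrix of log-scale expression.
page_z_score <- function(expr, markers) {
  markers <- intersect(markers, rownames(expr))
  m <- length(markers)                      # size of the marker gene set
  stopifnot(m > 0)
  # Fold change of each gene in each spot versus the mean expression of that gene
  # (a difference, because the values are assumed to be on the log scale).
  fold_change <- expr - rowMeans(expr)
  mu <- colMeans(fold_change)               # per-spot mean fold change over all genes
  delta <- apply(fold_change, 2, sd)        # per-spot standard deviation of fold changes
  s_m <- colMeans(fold_change[markers, , drop = FALSE])  # mean fold change of the markers
  (s_m - mu) * sqrt(m) / delta
}

# Toy data: genes g1 and g2 are high in spot s1, g3 and g4 are high in spot s2.
expr <- matrix(
  c(2, 2, 0, 0,
    0, 0, 2, 2),
  nrow = 4,
  dimnames = list(paste0("g", 1:4), c("s1", "s2"))
)
z <- page_z_score(expr, markers = c("g1", "g2"))
```

With this toy matrix, the marker set `g1`, `g2` scores positive in spot `s1` and negative in spot `s2`, which is the behavior the enrichment plots rely on.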

***
4 changes: 2 additions & 2 deletions vignettes/stage_input_qc.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -120,7 +120,7 @@ INPUT_QC_REPORT_RMD_FILE: "Rmd/single_sample/01_input_qc.Rmd"

**Type:** character scalar

A path to RMarkdown file used for HTML report of this pipeline stage. For spatial extention, the default RMarkdown file is `01_input_qc_spatial.Rmd`
A path to the RMarkdown file used for the HTML report of this pipeline stage. For the spatial extension, the default RMarkdown file is `01_input_qc_spatial.Rmd`.

***

Expand All @@ -143,7 +143,7 @@ You can also negate the selection by specifying `negate: true`.

***

#### Spatial extention
#### Spatial extension


```yaml
Expand Down