SQL queries (as separate files) occupy an awkward spot within R
pipelines. The goal of sqltargets is to offer a shorthand, tar_sql, to
reference and execute queries within a targets project.
You can install sqltargets from CRAN with:
install.packages("sqltargets")
You can install the development version of sqltargets with:
remotes::install_github("daranzolin/sqltargets")
library(targets)
library(sqltargets)
tar_dir({
  # Unparameterized SQL query:
  lines <- c(
    "-- !preview conn=DBI::dbConnect(RSQLite::SQLite())",
    "select 1 AS my_col",
    ""
  )
  writeLines(lines, "query.sql")
  # Include the query in a pipeline as follows.
  tar_script({
    library(tarchetypes)
    library(sqltargets)
    list(
      tar_sql(query, path = "query.sql")
    )
  }, ask = FALSE)
})
Use tar_load or targets::tar_load within a SQL comment to indicate
query dependencies. Check the dependencies of any query with
tar_sql_deps.
lines <- c(
  "-- !preview conn=DBI::dbConnect(RSQLite::SQLite())",
  "-- targets::tar_load(data1)",
  "-- targets::tar_load(data2)",
  "select 1 AS my_col",
  ""
)
query <- tempfile()
writeLines(lines, query)
tar_sql_deps(query)
#> [1] "data1" "data2"
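To make the comment convention concrete, here is a minimal base R sketch of how tar_load comments could be picked out of a query. It is an illustration only, not the implementation used by tar_sql_deps; the regular expression and the variable names (pattern, hits, deps) are assumptions made for this example.

```r
# Illustrative only: a hand-rolled scan for tar_load() comments.
# This is NOT how sqltargets implements tar_sql_deps().
lines <- c(
  "-- !preview conn=DBI::dbConnect(RSQLite::SQLite())",
  "-- targets::tar_load(data1)",
  "-- targets::tar_load(data2)",
  "select 1 AS my_col",
  ""
)

# Match SQL comment lines of the form `-- tar_load(x)` or
# `-- targets::tar_load(x)` and capture the target name.
pattern <- "^--[[:space:]]*(targets::)?tar_load\\(([^)]+)\\)"
hits <- regmatches(lines, regexec(pattern, lines))

# Keep only the lines that matched; the third element of each match
# is the second capture group, i.e. the target name.
deps <- vapply(Filter(length, hits), function(m) m[3], character(1))
deps
#> [1] "data1" "data2"
```

The preview line and the query body do not match the pattern, so only the two tar_load comments contribute names.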
Pass parameters (typically from another object in your targets project)
as a named list, and reference them in the query with ‘glue’ syntax: {param}.
query.sql
-- !preview conn=DBI::dbConnect(RSQLite::SQLite())
-- tar_load(query_params)
select id
from table
where age > {age_threshold}
tar_script({
  library(targets)
  library(tarchetypes)
  library(sqltargets)
  list(
    tar_target(query_params, list(age_threshold = 30)),
    tar_sql(query, path = "query.sql", query_params = query_params)
  )
}, ask = FALSE)
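To see what the {param} placeholder does in isolation, the sketch below interpolates a named list into a query string with glue::glue_data. It assumes the glue package is installed and only illustrates the syntax; how sqltargets performs the substitution internally (including any quoting or escaping) is not shown here, and the names template and sql are made up for this example.

```r
# Illustrative only: glue-style interpolation from a named list.
# sqltargets may substitute parameters differently under the hood.
library(glue)

query_params <- list(age_threshold = 30)
template <- "select id from table where age > {age_threshold}"

# glue_data() looks up each {name} in the list passed as its first argument.
sql <- glue_data(query_params, template)
sql
#> select id from table where age > 30
```

Every name inside braces must exist in the list, so the names in query_params need to match the placeholders used in the query file.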
Please note that the sqltargets project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Much of the code has been adapted from the excellent tarchetypes package. Special thanks to the authors and Will Landau in particular for revolutionizing data pipelines in R.