At today's coworking, @BriannaLind, @betolink, @andypbarrett, and I reviewed earthaccess, how to use it with R via reticulate, and next steps forward. The following are notes we can turn into concrete "todo" issues, here in the cookbook and for earthaccess. Some recent background: #158

A big point we came to through this conversation about Sarah Murphy's reticulate+xarray blog post: it's OK that the R code is running python code. The R syntax feels friendly to an R user; they aren't immediately concerned with (or aware of) the fact that this is "just" python code presented as R. They are hoping to do their science using the tool they know (here, R). In fact, it's more than OK, it's great: there's no need to rewrite in R to be able to use the awesomeness of xarray while helping awesome R users at the same time!
```r
## load R libraries
library(tidyverse)  # install.packages("tidyverse")
library(reticulate) # install.packages("reticulate")

## load python library
earthaccess <- reticulate::import("earthaccess")

## use earthaccess to search for data
granules <- earthaccess$search_data(
  concept_id = "C2036880672-POCLOUD",
  temporal = reticulate::tuple("2017-01", "2017-02") # with an earthaccess update, this could simply be c() or list()
)
```
The `granules` object is a list; each element is a JSON-style dictionary with some nested dictionaries. `granules[1]` returns the first result, with the big question: "what do we do from here?"
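As a hedged sketch of a first step, one way to poke at a result from R, assuming reticulate converts each granule's dictionary to a named R list (the structure shown by `str()` will depend on the collection's UMM-G metadata):

```r
# Inspect the first granule result (R lists are 1-indexed)
first_granule <- granules[[1]]

# Top two levels of the converted metadata structure
str(first_granule, max.level = 2)

# names() shows the available metadata keys
names(first_granule)
```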
It would be great if we could do this from earthaccess:

```r
earthaccess$download(granules, "Desktop") ## this doesn't work yet; we want a thin wrapper
```

In the meantime, like Bri said: the metadata is in CMR JSON, so we need to write R code to get the data type and a list of https links. Then we can skip the download and just use R.
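A minimal sketch of what that thin wrapper could look like, assuming python's `earthaccess.download()` behaves the same when called through reticulate (the `path` handling is an assumption, not tested from R):

```r
# Hypothetical thin R wrapper around python's earthaccess.download()
download_granules <- function(granules, path = ".") {
  earthaccess <- reticulate::import("earthaccess")
  # earthaccess.download() takes a list of granules and a local directory
  earthaccess$download(granules, path)
}

# files <- download_granules(granules, path = "Desktop")
```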
## Bri's awesome R CMR Trial

We ran Bri's code together, which is awesome and works on the staging hub with Julie's permissions. Exciting to combine these steps with earthaccess! (Hover over the top-right of a code chunk to copy it.)
```r
# Load libraries
library(httr)
library(jsonlite)
library(dplyr)
library(data.table)

# Define parameters to be used in the request; must be defined as a list
cmrURL <- 'https://cmr.earthdata.nasa.gov/search/granules.umm_json' # CMR API endpoint URL
parameters <- list(concept_id='C2021957657-LPCLOUD', # HLS
                   concept_id='C2021957295-LPCLOUD', # HLS
                   temporal='2021-10-17T00:00:00Z,2021-10-19T23:59:59Z') # temporal range

# Submit GET request, put retrieved response into getResponse.ls
getResponse.ls <- httr::GET(url=cmrURL, query=parameters)

cat('This request returned', getResponse.ls$headers$`cmr-hits`, 'granule hits, in',
    as.integer(as.numeric(getResponse.ls$headers$`cmr-hits`)/2000)+1, 'pages of results, and a',
    getResponse.ls$status_code, 'status code with',
    getResponse.ls$headers$`content-type`, 'content.', sep=" ")

# Extract content from getResponse.ls and isolate granule URLs for a single page
Content.ls <- fromJSON(content(getResponse.ls, as="text")) # Convert response content to a workable format
RelatedURLs.ls <- Content.ls$items$umm$RelatedUrls         # Component of the list that has the URLs of interest

# Define function (filteredURLs) to keep URLs whose "Type" key is "GET DATA"
filteredURLs <- function(x){
  dplyr::filter(x, Type == 'GET DATA')
}

filtered.ls <- lapply(RelatedURLs.ls, filteredURLs) # Apply function to each element of the list
granules.df <- do.call(rbind, filtered.ls)          # Combine all rows of the list into a single dataframe

# Isolate granule URLs for requests that return multiple pages of results
# hits <- as.numeric(getResponse.ls$headers$`cmr-hits`) # Number of hits in the GET request
# page_size <- 2000                                     # Page size
# page_numbers <- seq(1, (hits %/% page_size) + 1)      # Page numbers

# For each page of results, perform a GET request and add filtered URLs to LIST
# (1:1 here as a test; use the full page range, e.g. 1:12, for all pages)
LIST <- list()
for (n in 1:1){
  print(n)
  cmrURL <- 'https://cmr.earthdata.nasa.gov/search/granules.umm_json'
  getResponse.ls <- httr::GET(url=cmrURL, query=list(concept_id='C2021957657-LPCLOUD',
                                                     concept_id='C2021957295-LPCLOUD',
                                                     temporal='2021-10-17T00:00:00Z,2021-10-19T23:59:59Z',
                                                     page_size='2000',
                                                     page_num=n))
  Content.ls <- fromJSON(content(getResponse.ls, as="text"))
  RelatedURLs.ls <- Content.ls$items$umm$RelatedUrls
  LIST[[n]] <- lapply(RelatedURLs.ls, filteredURLs)
}

# Extract URLs from LIST into "completeURLlist"
x <- unlist(LIST)
x <- as.data.frame(x)
completeURLlist <- as.data.frame(x[x$x %like% "https", ])
```
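With the https links in hand, we could skip earthaccess entirely and fetch a file directly from R. A minimal sketch, with the caveats that these URLs require Earthdata Login credentials (here assumed to be in a `~/.netrc` file) and that deriving the output filename with `basename()` is an assumption about the URL structure:

```r
# Download the first granule URL to the working directory,
# relying on ~/.netrc for Earthdata Login authentication
library(httr)

url <- completeURLlist[1, 1]
outfile <- basename(url)
httr::GET(url,
          httr::write_disk(outfile, overwrite = TRUE),
          httr::config(netrc = TRUE))
```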
Notes on asynchronous requests: httr is not capable of asynchronous requests; we'd need to use either async, crul, or curl, with reformatted request parameters. Check this link to get started: https://docs.ropensci.org/crul/articles/how-to-use-crul.html
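A hedged sketch of what the asynchronous version might look like with crul's `Async` class (untested against these URLs; see the crul vignette linked above for the full request-parameter setup):

```r
# Fetch several granule URLs concurrently with crul
library(crul)

urls <- completeURLlist[1:3, 1]   # a few URLs to fetch
cc <- crul::Async$new(urls = urls)
res <- cc$get()                   # list of HttpResponse objects
sapply(res, function(r) r$status_code)
```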
This discussion was converted from issue #161 on October 17, 2024 18:09.
Our current R code is also in the cookbook: https://nasa-openscapes.github.io/earthdata-cloud-cookbook/how-tos/find-data/programmatic.html 🥳

## Ideas for next steps
When we run python functions using reticulate, the syntax uses `$`, as in `library$function`; for example, `earthaccess$open`.

The example above uses code from Luis' AGU poster. It would be awesome for this code to work so we could open a granule and look at it.
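As a small illustration of the `$` syntax (using numpy only because most reticulate setups have it; this is not earthaccess-specific):

```r
library(reticulate)

np <- import("numpy")   # python module becomes an R object
np$mean(c(1, 5, 9))     # python function called with $; returns 5
```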
Then...

## xarray

Building from above, really we'd like to not only open the file but be able to use xarray to stream the data (so we don't have to download it to look at it). This is possible! See this awesome post by Sarah Murphy: https://cougrstats.netlify.app/post/2021-04-21-using-python-in-r-studio-with-reticulate/
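A hedged sketch of what that could look like, combining `earthaccess$open` (which in python returns file-like objects) with xarray through reticulate. This is untested from R, and passing those file objects straight into `open_mfdataset` is an assumption about how reticulate hands them across:

```r
library(reticulate)

earthaccess <- import("earthaccess")
xr <- import("xarray")

earthaccess$login()                 # Earthdata Login credentials
granules <- earthaccess$search_data(
  concept_id = "C2036880672-POCLOUD",
  temporal = reticulate::tuple("2017-01", "2017-02")
)

files <- earthaccess$open(granules) # file-like objects, no download
ds <- xr$open_mfdataset(files)      # stream the data with xarray
ds
```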
So, we'd be able to run this code (copied from L5 of Luis' poster) with the R/reticulate syntax from the blog post.

## `earthaccess$download`
It would be great if we could do this from earthaccess:

`earthaccess$data_links(granules[0])`

This idea works in python:

`granules[0]$data_links`

It would be great if this could work in R.
Then we would need to open or download the NetCDF files. earthaccess does this for python; could reticulate let us use earthaccess from R, or would we need to find R-native NetCDF approaches? We would have to download. (A further step would be streaming them with xarray, via the https links.)

NetCDF-in-R resources:

- https://github.com/ropensci/tidync > netcdfs in R!
- https://pjbartlein.github.io/REarthSysSci/netCDF.html#reading-restructuring-and-writing-netcdf-files-in-r
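A minimal R-native sketch with tidync, assuming a NetCDF granule has already been downloaded (the filename is a placeholder):

```r
# Open a downloaded NetCDF file with tidync and pull variables into a tibble
library(tidync)

nc <- tidync("granule.nc")   # placeholder filename
nc                           # prints the available grids and variables
df <- hyper_tibble(nc)       # extract the active grid as a tidy tibble
```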
@BriannaLind, @betolink, @andypbarrett, and others: please expand ideas here or in linked issues as we tackle these going forward! 🎉
## R geospatial resources