
Finalizing & publishing CMIP6 documentation #11

Open
charlesbluca opened this issue Sep 2, 2020 · 8 comments

@charlesbluca
Member

I've populated the CMIP6 documentation site with an overview of:

  • Zarr and our rationale for using the format
  • ESM collections and how to use them
  • Pangeo Catalog and how it ties into accessing ESM collections

Is there anything else we could add in or clarify? If not, all that's left is to assign a domain name to this website (I assume cmip6.pangeo.io) and publish it!

cc @rabernat @naomi-henderson

@naomi-henderson
Collaborator

@charlesbluca, is this meant to be just for the AWS collection?

@charlesbluca
Member Author

I'm not sure - I know it's being made to coincide with CMIP6 being added to Amazon's public dataset program, but I figured the documentation there serves as a guide for data access on either platform. Ryan might have some more insight on this.

@naomi-henderson
Collaborator

@charlesbluca: I think the AWS intake-esm JSON file needs to be updated - it still points to the GCS catalog, etc.
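
A rough sketch of that update, assuming the intake-esm JSON descriptor records the location of its CSV catalog under a `catalog_file` key (as in the ESM collection spec). The descriptor contents and both URLs below are placeholders, not the real catalog locations:

```python
import json


def repoint_catalog(descriptor_text: str, new_catalog_url: str) -> str:
    """Return the descriptor JSON with `catalog_file` set to new_catalog_url."""
    descriptor = json.loads(descriptor_text)
    # Assumption: the descriptor stores the CSV location under "catalog_file".
    descriptor["catalog_file"] = new_catalog_url
    return json.dumps(descriptor, indent=2)


# Hypothetical descriptor that still points at a GCS-hosted catalog.
gcs_descriptor = json.dumps(
    {
        "esmcat_version": "0.1.0",
        "id": "example-cmip6",
        "catalog_file": "https://example.org/gcs/catalog.csv",
    }
)

aws_descriptor = repoint_catalog(
    gcs_descriptor, "https://example.org/aws/catalog.csv"
)
```

All other fields pass through unchanged, so the same helper could be reused if the catalog moves again.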

@rabernat
Member

rabernat commented Sep 4, 2020

Thanks so much for this work Charles! It's fantastic!

> is this meant to be just for the AWS collection?

If we are going to the trouble to publish this website, I think we should have it cover both the GCS and S3 versions. It's not hard to do, since they are nearly identical. We just need to be clear about how you would use one vs. the other.

@rabernat
Member

rabernat commented Sep 4, 2020

I left you a specific suggestion in #12.

@charlesbluca
Member Author

Good point @naomi-henderson - I brought up in an issue on cmip6-pipeline that we'll need to find some way to automate the process of replacing all instances of gs://cmip6 across all catalog files with s3://cmip6-pds. Hopefully, once we have a better idea of how to track changes on the Google bucket, I can move forward with that. For now, I'll update the files manually, since no new data is being added to the S3 bucket.
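
For illustration, the prefix replacement could look something like the sketch below. It assumes the catalog is a CSV whose store URLs live in a column named `zstore`, and that objects sit at the same relative paths in both buckets; the sample row is made up:

```python
import csv
import io

GCS_PREFIX = "gs://cmip6/"
S3_PREFIX = "s3://cmip6-pds/"


def rewrite_catalog(csv_text: str, column: str = "zstore") -> str:
    """Rewrite GCS store URLs in `column` so they point at the S3 bucket."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        path = row[column]
        # Only touch URLs that start with the GCS bucket prefix; the
        # trailing slash avoids matching other buckets named cmip6-*.
        if path.startswith(GCS_PREFIX):
            row[column] = S3_PREFIX + path[len(GCS_PREFIX):]
        writer.writerow(row)
    return out.getvalue()


# Hypothetical one-row catalog.
sample = "variable_id,zstore\ntas,gs://cmip6/example/store/\n"
converted = rewrite_catalog(sample)
```

Matching on the prefix rather than doing a blind string replace leaves any other text in the catalog untouched.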

I was thinking about adding a page before the cloud data overview that describes the differences between data access on GCP and AWS - probably just the two bucket names and the format for using each storage API. Another, potentially more elegant (but more complicated), solution would be to publish two versions of the documentation, one for GCP and one for AWS, at different endpoints (/gcp/ versus /aws/).
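
As a sketch of what such a page might boil down to, the two platforms could be described by a small lookup table. The bucket names come from this thread; the anonymous-access options (`token="anon"` for gcsfs, `anon=True` for s3fs) assume both buckets allow public reads, and `store_url` is a hypothetical helper:

```python
# Per-platform settings; storage_options are assumptions about public access.
BUCKETS = {
    "gcp": {"root": "gs://cmip6", "storage_options": {"token": "anon"}},
    "aws": {"root": "s3://cmip6-pds", "storage_options": {"anon": True}},
}


def store_url(provider: str, relative_path: str) -> str:
    """Build the full store URL for a path relative to the bucket root."""
    try:
        root = BUCKETS[provider]["root"]
    except KeyError:
        raise ValueError(
            f"unknown provider {provider!r}; expected one of {sorted(BUCKETS)}"
        ) from None
    return f"{root}/{relative_path.strip('/')}/"


aws_url = store_url("aws", "example/store")
gcp_url = store_url("gcp", "example/store")
```

Everything else in the access workflow would then stay the same across platforms, with only the root URL and storage options changing.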

@rabernat
Member

> I brought up in an issue on cmip6-pipeline that we'll need to find some way to automate the process of replacing all instances of gs://cmip6 across all catalog files with s3://cmip6-pds

For this, perhaps we could re-generate the catalog by crawling the cloud bucket, rather than just copying the GCS catalog to AWS. They might not be in sync.
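
A minimal sketch of the crawl-and-compare idea, assuming each Zarr store has consolidated metadata (a `.zmetadata` object at its root) and that the bucket's object keys have already been listed by some storage client. The keys and catalog entries below are invented:

```python
from typing import Iterable, Set


def find_zarr_stores(keys: Iterable[str], marker: str = ".zmetadata") -> Set[str]:
    """Return the prefixes of all stores that contain a `marker` object."""
    stores = set()
    for key in keys:
        parent, _, name = key.rpartition("/")
        if name == marker:
            stores.add(parent + "/")
    return stores


# Hypothetical listing of object keys in the bucket.
listed_keys = [
    "a/b/tas/.zmetadata",
    "a/b/tas/.zgroup",
    "a/b/tas/tas/0.0.0",
    "a/b/pr/.zmetadata",
]
# Hypothetical store paths taken from an existing catalog.
cataloged = {"a/b/tas/", "a/b/psl/"}

found = find_zarr_stores(listed_keys)
missing_from_catalog = found - cataloged  # in the bucket, not cataloged
missing_from_bucket = cataloged - found   # cataloged, not in the bucket
```

The two set differences show exactly where a copied catalog and the actual bucket contents have drifted apart.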

@charlesbluca
Member Author

Revisiting this issue as it's been a while - it seems like this documentation is at a good enough point that it could be officially shared in some capacity.

Do we want to get a dedicated domain for this site (such as cmip6.pangeo.io)? Or would it be satisfactory to just include a link to it on Pangeo's main site?
