Publish STAC meta data to production and fix data services networking (create NAT gateway) #101

Open
1 task done
smohiudd opened this issue Feb 2, 2024 · 17 comments

@smohiudd
Contributor

smohiudd commented Feb 2, 2024

What

All collections currently in staging should be published to the production instance.

  • The collections should include the provider and render meta data that is included in the veda-data repo.
  • Assets should reference the veda-data-store-production bucket
  • summaries should be included in all collections
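The checklist above could be turned into an automated check along these lines. This is only a sketch: the function name and the idea of passing sample asset hrefs separately are hypothetical, and the field names `providers`, `renders`, and `summaries` are assumed from STAC conventions rather than confirmed in this thread.

```python
from urllib.parse import urlparse

REQUIRED_FIELDS = ("providers", "renders", "summaries")


def check_collection(collection, asset_hrefs, expected_bucket):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat a missing key and an empty value the same way.
        if not collection.get(field):
            problems.append(f"missing or empty '{field}'")
    for href in asset_hrefs:
        parsed = urlparse(href)
        if parsed.scheme != "s3" or parsed.netloc != expected_bucket:
            problems.append(f"asset outside s3://{expected_bucket}: {href}")
    return problems


# Hypothetical collection document used only to exercise the check.
example = {
    "id": "example-collection",
    "providers": [{"name": "NASA VEDA", "roles": ["host"]}],
    "summaries": {},
}
print(check_collection(
    example,
    ["s3://veda-data-store-staging/example-collection/file.tif"],
    "veda-data-store-production",
))
```

Returning a list of problems rather than raising on the first one makes it easier to report everything wrong with a collection in a single pass.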

PI Objective

Objective 4: Publish production data
24.3 Objective 2: Publish STAC metadata into Production VEDA

Acceptance Criteria

  • NAT Instance created in MCP
  • All STAC meta data in staging is published in production including provider and renders meta data and referencing the veda-data-store-production bucket

Tasks

@smohiudd
Contributor Author

Are we using the correct extension versions in build stac?

Proj extension and raster ext versions: https://github.com/NASA-IMPACT/veda-data-airflow/blob/a47015ba2b327eb5d1f54958246cb6fb5b79ccb1/docker_tasks/build_stac/utils/stac.py#L12

cc: @anayeaye, @slesaad

@smohiudd
Contributor Author

Check if rio stac version is correct: https://github.com/NASA-IMPACT/veda-data-airflow/blob/a47015ba2b327eb5d1f54958246cb6fb5b79ccb1/docker_tasks/build_stac/requirements.txt#L7

Currently using 0.7.0 in airflow build stac

@anayeaye
Contributor

I confirmed that we want to use rio-stac>=0.8.0 to get the correct version of the proj extension.

I think we will also need a minor refactor to import the actual versions of the extensions used by rio-stac in Airflow's build_stac/utils/stac.py, as shown in the rio-stac documentation for building multi-asset items. Currently the utils method manually declares the projection version, so there may be other slight modifications to how the STAC item is created.
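To illustrate why a single source of truth for the versions matters, here is a minimal sketch of how the `stac_extensions` list can be derived from a version mapping. The helper name and the mapping are hypothetical, and the versions are hard-coded only for illustration; in the refactor described above they would come from rio-stac rather than being declared by hand.

```python
SCHEMA_URL_TEMPLATE = "https://stac-extensions.github.io/{name}/{version}/schema.json"


def extension_urls(versions):
    """Build stac_extensions schema URLs from a {extension name: version} mapping."""
    return [
        SCHEMA_URL_TEMPLATE.format(name=name, version=version)
        for name, version in versions.items()
    ]


# Hard-coded for illustration only (assumption); not the values the DAG should pin.
versions = {"projection": "v1.0.0", "raster": "v1.1.0"}
print(extension_urls(versions))
```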

@anayeaye
Contributor

The rio-stac version and the corresponding STAC extension versions are now updated in this PR: NASA-IMPACT/veda-data-airflow#125

@botanical botanical changed the title Publish STAC meta data to production Publish STAC meta data to production and fix data services networking (create NAT gateway) Apr 29, 2024
@botanical
Member

Summary of huddle on April 29, 2024

We're currently blocked on this being implemented https://jaas.gsfc.nasa.gov/servicedesk/customer/portal/2/GSD-3143 (creation of a NAT gateway).

Proof that there's a networking issue:
Screenshot of ingest API showing empty list of items
Screenshot of ingest API showing Service Unavailable Error

We also tested the workflow with STAC_INGESTOR_API_URL set to both https://77451h4b35.execute-api.us-west-2.amazonaws.com/ and https://dev.openveda.cloud/api/ingest/ and confirmed that the veda_ingest_raster DAG runs successfully with either value.

TODO

  • eventually add lifecycle policy to veda-pipeline-STAGE-mwaa-… buckets

cc @smohiudd @anayeaye @amarouane-ABDELHAK @ranchodeluxe

@botanical
Member

An additional service desk ticket was created on May 3rd, 2024 to update the network ACL rules to allow traffic for ephemeral port range and the ticket is currently in Security Review status.

@botanical
Member

Ephemeral port range testing

- /ingestions with https://staging-stac.delta-backend.com/collections/hls-swir-falsecolor-composite/items/Lahaina_HLS_2023-08-13_SWIR_falsecolor_cog succeeded

@anayeaye
Contributor

anayeaye commented May 14, 2024

Now that we are unblocked, here are the notes from a backfill planning session with @botanical @smohiudd @ividito

The big backfill plan

We plan to use https://staging-stac.delta-backend.com/collections as our source of truth for the collections to publish to the VEDA instances running in MCP (we’ll do some test runs in mcp-test before moving to production).

Promote to production working definition

Our target is to promote all the data currently staged in the UAH-hosted staging instance of VEDA to the MCP-hosted test and production stacks. At a high level:

  1. Copy all staging assets from veda-data-store-staging to veda-data-store using transfer DAG
  2. Publish STAC collections stored in git:veda-data/ingestion_data/collections to MCP hosted test and production catalogs using the /ingest-api/collections endpoint
  3. Trigger discovery-items workflow in the /discovery/ endpoint using the inputs from git:veda-data/ingestion_data//discovery-items

Detailed plan

  1. Identify the exceptions: a list of collection ids to skip (these are external and/or weird data like the LPDAAC hosted data, externally hosted vector collections, and collections with provider generated metadata like the HLS environmental justice events collections). This list of ids will be filtered out of the following automated promotion steps.
  2. Wipe out mcp-test collections that aren’t in the skip list (currently mcp-test is an older snapshot of the staging STAC catalog and the assets in this catalog are in the staging S3 bucket).
  3. Copy and update git:veda-data/ingestion_data/discovery-items to new ingestion_data/production/discovery-items folder (i.e. correct buckets veda-data-store-staging-->veda-data-store and bucket prefixes should all be collection id)
  4. Try this in mcp-test before scripting for production: iterate over the staging-stac inventory CSV and, for each collection id:
    • Skip it if it is in the exception list
    • Find the matching git:veda-data/ingestion_data/collection entry and publish that collection via veda-backend/ingest-api/collections (if not already published by earlier backfill efforts)
    • Find the matching git:veda-data/ingestion_data/production/discovery-items input JSON and use it as the POST body to the workflows-api/discovery endpoint to run workflows/discover-items
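The iteration in step 4 could be sketched roughly as follows. The function name and the report structure are hypothetical, and the two network steps are passed in as callables so the loop makes no assumptions about endpoints or authentication. The `id` column matches the inventory CSV described in this thread.

```python
import csv
import io


def promote(inventory_rows, skip_ids, publish_collection, run_discovery):
    """Publish each non-skipped collection, then trigger its item discovery.

    publish_collection and run_discovery are caller-supplied callables that
    take a collection id; they stand in for the real API requests.
    """
    report = {"skipped": [], "promoted": [], "failed": []}
    for row in inventory_rows:
        collection_id = row["id"]
        if collection_id in skip_ids:
            report["skipped"].append(collection_id)
            continue
        try:
            publish_collection(collection_id)
            run_discovery(collection_id)
            report["promoted"].append(collection_id)
        except Exception as err:  # keep going so one failure does not stop the backfill
            report["failed"].append((collection_id, str(err)))
    return report


# Dry run with stand-in callables and an in-memory inventory.
inventory = io.StringIO("id,title\nno2-monthly,NO2\neis_fire_perimeter,Fire\n")
calls = []
report = promote(
    csv.DictReader(inventory),
    skip_ids={"eis_fire_perimeter"},
    publish_collection=lambda cid: calls.append(("publish", cid)),
    run_discovery=lambda cid: calls.append(("discover", cid)),
)
print(report)
```

Collecting failures in the report instead of stopping at the first error keeps a long backfill moving and leaves a list to follow up on afterwards.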

Git:veda-data necessary changes

We will need to start thinking about a new release for upcoming changes to the ingestion DAGs. We discussed whether to manage this in a new branch and whether to move the discovery items into the veda-data-airflow project. For now we have decided to proceed with a slight change to the git:veda-data folder structure to accommodate a separate folder for each stage: the discovery-items configuration we currently have for staging data will move to /staging, and a new production/ folder will be created for inputs configured for production data.

Restructure folders

ingestion_data/
	Collections (automated validation on pr)
	/staging
		/discovery-items
		/dataset-config
	/production
		/discovery-items
		/dataset-config (probably not? Probably we just automate the collection+discovery items)

Update buckets and prefixes in discovery-items

Copy veda-data/staging/discovery-items to veda-data/production/discovery-items and

  1. Update to production veda-data-store bucket
  2. Update prefix to match the collection id (for many staged collections the prefix is somewhat arbitrary)
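A minimal sketch of the two updates, assuming each discovery-items config is a JSON object with `collection`, `bucket`, and `prefix` keys (an assumption about the file layout; the function name is hypothetical). The trailing slash on the prefix is a design choice of this sketch, not something decided in the thread.

```python
import copy


def to_production(config, production_bucket="veda-data-store"):
    """Return a copy of a discovery-items config pointed at the production bucket.

    Assumes the config is a dict with "collection", "bucket", and "prefix" keys.
    """
    updated = copy.deepcopy(config)
    updated["bucket"] = production_bucket
    # Use the collection id as the prefix, with a trailing slash so that ids
    # sharing a common stem (e.g. "no2-monthly" and "no2-monthly-diff") stay separate.
    updated["prefix"] = f"{updated['collection']}/"
    return updated


# Hypothetical staging config with an arbitrary prefix.
staging = {
    "collection": "no2-monthly",
    "bucket": "veda-data-store-staging",
    "prefix": "some-arbitrary-folder/",
}
print(to_production(staging))
```

Working on a copy leaves the staging config untouched, which fits the plan of keeping separate staging and production folders.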

Actions:

Backfill track

  • CSV to the ticket (and steps to reproduce) (@anayeaye)
  • Google sheet from csv (may be in an existing sheet) (@anayeaye)
  • Identify and document the exceptions list (all/done/??)
  • Delete non-exceptions from mcp-test in prep for prod data (?)
  • New branch of veda-data with new folder structure and updated discovery-items (@botanical )
    • New folder structure
    • Fix prefixes (random folders→exact collection id)
    • Fix bucket (veda-data-store-staging→veda-data-store)
  • Run all of it in mcp-test (everyone, TBD)
  • Run all of it in mcp-prod

Observability & monitoring in MCP track

  • Deploy veda-monitoring to MCP (@ividito @smohiudd will start to add that step to veda-deploy)

@anayeaye
Contributor

I added a new sheet to this working backfill Google spreadsheet and loaded an inventory, staging-collections.csv, that I generated from the staging STAC catalog in a notebook with this hacky loop:

from pystac_client import Client
import pandas as pd

def get_sample_files(collection):
    """Return key/href pairs for the assets of the collection's first item, if any."""
    sample_assets = []
    try:
        item = next(collection.get_items(), None)
        if item:
            for key, asset in item.assets.items():
                # Skip the derived preview; only the data assets are of interest.
                if key != "rendered_preview":
                    sample_assets.append({"key": key, "href": asset.get_absolute_href()})
    except Exception:
        # Best effort: if the item request fails, return whatever was collected.
        pass
    return sample_assets

STAC_API_URL = "https://staging-stac.delta-backend.com/"
catalog = Client.open(STAC_API_URL)

summaries = []
collections = list(catalog.get_collections())
for collection in sorted(collections, key=lambda x: x.id):
    summaries.append({
        "id": collection.id,
        "title": collection.title,
        "sample_files": get_sample_files(collection)
    })
df = pd.DataFrame(summaries)
df.to_csv("staging-collections.csv")
df

@botanical
Member

#121 PR to add new directory structure and update prefixes for production

@botanical
Member

botanical commented May 15, 2024

Potential collections to exclude are:

  • 'hls-l30-002-ej-reprocessed'
  • 'hls-s30-002-ej-reprocessed'
  • 'ls8-covid-19-example-data'
  • 'landsat-c2l2-sr-antarctic-glaciers-pine-island'
  • 'landsat-c2l2-sr-lakes-aral-sea'
  • 'landsat-c2l2-sr-lakes-tonle-sap'
  • 'landsat-c2l2-sr-lakes-lake-balaton'
  • 'landsat-c2l2-sr-lakes-vanern'
  • 'landsat-c2l2-sr-antarctic-glaciers-thwaites'
  • 'landsat-c2l2-sr-lakes-lake-biwa'
  • 'combined_CMIP6_daily_GISS-E2-1-G_tas_kerchunk_DEMO'
  • 'eis_fire_fireline'
  • 'eis_fire_newfirepix'
  • 'eis_fire_perimeter'
  • 'oco2-geos-l3-daily'

@smohiudd
Contributor Author

smohiudd commented May 17, 2024

The following discoveries failed in mcp-test:

  • ndvi_diff_Ian_2022-09-30_2022-09-05: Not in Staging Catalog, Bucket doesn't exist in veda-data-store
  • entropy_difference_2022-09-05_2022-09-30: Not in Staging Catalog, Bucket doesn't exist in veda-data-store
  • modis-lst-night-diff-2015-2022: 500 Server error, confirm logs
  • houston-lst-diff: External Bucket
  • nceo_africa_2017: External Bucket
  • EPA-annual-emissions_6B_Wastewater_Treatment_Domestic: Bucket exists in veda-data-store but empty
  • EPA-annual-emissions_6B_Wastewater_Treatment_Industrial: Bucket exists in veda-data-store but empty
  • EPA-annual-emissions_6D_Composting: Bucket exists in veda-data-store but empty

@botanical
Member

For posterity, the nceo_africa_2017 ingestion item:

{
    "id": "AGB_map_2017v0m_COG",
    "bbox": [
        -18.273529509559307,
        -35.054059016911935,
        51.86423292864056,
        37.73103856358817
    ],
    "type": "Feature",
    "links": [],
    "assets": {
        "cog_default": {
            "href": "s3://nasa-maap-data-store/file-staging/nasa-map/nceo-africa-2017/AGB_map_2017v0m_COG.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            "roles": [
                "data",
                "layer"
            ],
            "title": "Default COG Layer",
            "description": "Cloud optimized default layer to display on map",
            "raster:bands": [
                {
                    "scale": 1,
                    "nodata": "inf",
                    "offset": 0,
                    "sampling": "area",
                    "data_type": "uint16",
                    "histogram": {
                        "max": 429,
                        "min": 0,
                        "count": 11,
                        "buckets": [
                            405348,
                            44948,
                            18365,
                            6377,
                            3675,
                            3388,
                            3785,
                            9453,
                            13108,
                            1186
                        ]
                    },
                    "statistics": {
                        "mean": 37.58407913145342,
                        "stddev": 81.36678677343947,
                        "maximum": 429,
                        "minimum": 0,
                        "valid_percent": 50.42436439336373
                    }
                }
            ]
        }
    },
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [
                    -18.273529509559307,
                    -35.054059016911935
                ],
                [
                    51.86423292864056,
                    -35.054059016911935
                ],
                [
                    51.86423292864056,
                    37.73103856358817
                ],
                [
                    -18.273529509559307,
                    37.73103856358817
                ],
                [
                    -18.273529509559307,
                    -35.054059016911935
                ]
            ]
        ]
    },
    "collection": "nceo_africa_2017",
    "properties": {
        "proj:bbox": [
            -18.273529509559307,
            -35.054059016911935,
            51.86423292864056,
            37.73103856358817
        ],
        "proj:epsg": 4326,
        "proj:shape": [
            81024,
            78077
        ],
        "end_datetime": "2017-12-31T23:59:59+00:00",
        "proj:geometry": {
            "type": "Polygon",
            "coordinates": [
                [
                    [
                        -18.273529509559307,
                        -35.054059016911935
                    ],
                    [
                        51.86423292864056,
                        -35.054059016911935
                    ],
                    [
                        51.86423292864056,
                        37.73103856358817
                    ],
                    [
                        -18.273529509559307,
                        37.73103856358817
                    ],
                    [
                        -18.273529509559307,
                        -35.054059016911935
                    ]
                ]
            ]
        },
        "proj:transform": [
            0.0008983152841195214,
            0,
            -18.273529509559307,
            0,
            -0.0008983152841195214,
            37.73103856358817,
            0,
            0,
            1
        ],
        "start_datetime": "2017-01-01T00:00:00+00:00",
        "datetime": null
    },
    "stac_version": "1.0.0",
    "stac_extensions": [
        "https://stac-extensions.github.io/projection/v1.0.0/schema.json",
        "https://stac-extensions.github.io/raster/v1.1.0/schema.json"
    ]
}
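Items like this one are what the "External Bucket" failures refer to: the asset lives in a bucket other than the one the backfill targets. A small sketch of how such items could be flagged ahead of time, assuming items are available as plain dicts; the function name is hypothetical and the expected bucket is passed in rather than assumed.

```python
from urllib.parse import urlparse


def external_assets(item, expected_bucket):
    """Return {asset key: bucket} for S3 assets stored outside expected_bucket."""
    found = {}
    for key, asset in item.get("assets", {}).items():
        parsed = urlparse(asset["href"])
        if parsed.scheme == "s3" and parsed.netloc != expected_bucket:
            found[key] = parsed.netloc
    return found


# Trimmed copy of the item above, keeping only the fields the check reads.
item = {
    "id": "AGB_map_2017v0m_COG",
    "assets": {
        "cog_default": {
            "href": "s3://nasa-maap-data-store/file-staging/nasa-map/nceo-africa-2017/AGB_map_2017v0m_COG.tif"
        }
    },
}
print(external_assets(item, "veda-data-store"))
```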

@anayeaye
Contributor

anayeaye commented May 20, 2024

Here's a small first draft of an audit of the collections in veda-config datasets and the mcp-test stack. The unmatched collection ids are known special cases. Some of the empty collections are also expected but others may mean we need to tweak the discovery items configuration.

Note: I will update this comment with the results to account for the known special cases, such as externally hosted assets.

import requests
STAC_API_URL = "https://test.openveda.cloud/api/stac"
SRC_STAC_API_URL = "https://staging-stac.delta-backend.com"
VEDA_DATA_URL = "https://github.com/NASA-IMPACT/veda-data/tree/main/ingestion-data"

missing_collections = []
empty_collections = []
complete_collections = []
dashboard_collections = ['CMIP245-winter-median-pr', 'CMIP245-winter-median-ta', 'CMIP585-winter-median-pr', 'CMIP585-winter-median-ta', 'EPA-annual-emissions_1A_Combustion_Mobile', 'EPA-annual-emissions_1A_Combustion_Stationary', 'EPA-annual-emissions_1B1a_Abandoned_Coal', 'EPA-annual-emissions_1B1a_Coal_Mining_Surface', 'EPA-annual-emissions_1B1a_Coal_Mining_Underground', 'EPA-annual-emissions_1B2a_Petroleum', 'EPA-annual-emissions_1B2b_Natural_Gas_Distribution', 'EPA-annual-emissions_1B2b_Natural_Gas_Processing', 'EPA-annual-emissions_1B2b_Natural_Gas_Production', 'EPA-annual-emissions_1B2b_Natural_Gas_Transmission', 'EPA-annual-emissions_2B5_Petrochemical_Production', 'EPA-annual-emissions_2C2_Ferroalloy_Production', 'EPA-annual-emissions_4A_Enteric_Fermentation', 'EPA-annual-emissions_4B_Manure_Management', 'EPA-annual-emissions_4C_Rice_Cultivation', 'EPA-annual-emissions_4F_Field_Burning', 'EPA-annual-emissions_5_Forest_Fires', 'EPA-annual-emissions_6A_Landfills_Industrial', 'EPA-annual-emissions_6A_Landfills_Municipal', 'EPA-annual-emissions_6B_Wastewater_Treatment_Domestic', 'EPA-annual-emissions_6B_Wastewater_Treatment_Industrial', 'EPA-annual-emissions_6D_Composting', 'EPA-daily-emissions_5_Forest_Fires', 'EPA-monthly-emissions_1A_Combustion_Stationary', 'EPA-monthly-emissions_1B2a_Petroleum', 'EPA-monthly-emissions_1B2b_Natural_Gas_Production', 'EPA-monthly-emissions_4B_Manure_Management', 'EPA-monthly-emissions_4C_Rice_Cultivation', 'EPA-monthly-emissions_4F_Field_Burning', 'IS2SITMOGR4-cog', 'MO_NPP_npp_vgpm', 'OMI_trno2-COG', 'OMSO2PCA-COG', 'bangladesh-landcover-2001-2020', 'barc-thomasfire', 'blue-tarp-detection', 'blue-tarp-planetscope', 'caldor-fire-behavior', 'caldor-fire-burn-severity', 'campfire-albedo-wsa-diff', 'campfire-lst-day-diff', 'campfire-lst-night-diff', 'campfire-ndvi-diff', 'campfire-nlcd', 'co2-diff', 'co2-mean', 'combined_CMIP6_daily_GISS-E2-1-G_tas_kerchunk_DEMO', 'conus-reach', 'disalexi-etsuppression', 
'ecco-surface-height-change', 'eis_fire_perimeter', 'facebook_population_density', 'fldas-soil-moisture-anomalies', 'frp-max-thomasfire', 'geoglam', 'grdi-cdr-raster', 'grdi-filled-missing-values-count', 'grdi-imr-raster', 'grdi-shdi-raster', 'grdi-v1-built', 'grdi-v1-raster', 'grdi-vnl-raster', 'grdi-vnl-slope-raster', 'hls-bais2-v2', 'hls-l30-002-ej-reprocessed', 'hls-s30-002-ej-reprocessed', 'hls-swir-falsecolor-composite', 'houston-aod', 'houston-aod-diff', 'houston-landcover', 'houston-lst-day', 'houston-lst-diff', 'houston-lst-night', 'houston-ndvi', 'houston-urbanization', 'landsat-nighttime-thermal', 'lis-etsuppression', 'lis-global-da-evap', 'lis-global-da-gpp', 'lis-global-da-gpp-trend', 'lis-global-da-gws', 'lis-global-da-qs', 'lis-global-da-qsb', 'lis-global-da-streamflow', 'lis-global-da-swe', 'lis-global-da-totalprecip', 'lis-global-da-tws', 'lis-global-da-tws-trend', 'lis-tvegsuppression', 'lis-tws-anomaly', 'lis-tws-nonstationarity-index', 'lis-tws-trend', 'mtbs-burn-severity', 'nceo_africa_2017', 'nightlights-hd-1band', 'nightlights-hd-monthly', 'no2-monthly', 'no2-monthly-diff', 'snow-projections-diff-245', 'snow-projections-diff-585', 'snow-projections-median-245', 'snow-projections-median-585', 'social-vulnerability-index-household', 'social-vulnerability-index-household-nopop', 'social-vulnerability-index-housing', 'social-vulnerability-index-housing-nopop', 'social-vulnerability-index-minority', 'social-vulnerability-index-minority-nopop', 'social-vulnerability-index-overall', 'social-vulnerability-index-overall-nopop', 'social-vulnerability-index-socioeconomic', 'social-vulnerability-index-socioeconomic-nopop', 'sport-lis-vsm0_100cm-percentile']

for collection_id in sorted(set(dashboard_collections)):
    # Does the collection exist in the target (mcp-test) catalog?
    collections_url = f"{STAC_API_URL}/collections/{collection_id}"
    r = requests.get(collections_url)
    if r.status_code == 404:
        missing_collections.append(collection_id)
    else:
        # Compare the item counts reported by the target and source catalogs.
        items_url = f"{STAC_API_URL}/collections/{collection_id}/items"
        r = requests.get(items_url)
        items_matched = r.json().get("context").get("matched")

        src_items_url = f"{SRC_STAC_API_URL}/collections/{collection_id}/items"
        src_r = requests.get(src_items_url)
        src_items_matched = src_r.json().get("context").get("matched")

        src_match = items_matched == src_items_matched
        if not src_match:
            print(f"\n{collection_id=} {items_matched=} {src_items_matched=} {src_match=}!")
            print(f"{items_url=}")
            print(f"{src_items_url=}")
            # On a mismatch, check that a production discovery-items config exists.
            veda_data_discovery = f"{VEDA_DATA_URL}/production/discovery-items/{collection_id}.json"
            discovery = requests.get(veda_data_discovery)
            if discovery.status_code != 200:
                print(f"DISCOVERY CONFIG FOR {collection_id=} {discovery.reason=}!")
        else:
            complete_collections.append(collection_id)

        # Collections that exist in the target catalog but have no items yet.
        if not items_matched:
            empty_collections.append(collection_id)
            
print(f"\n{len(dashboard_collections)=}")
print(f"\n{len(complete_collections)=}\n{complete_collections=}")
print(f"\n{len(missing_collections)=}\n{missing_collections=}")
print(f"\n{len(empty_collections)=}\n{empty_collections=}")

collection_id='CMIP585-winter-median-pr' items_matched=0 src_items_matched=4 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/CMIP585-winter-median-pr/items'
src_items_url='https://staging-stac.delta-backend.com/collections/CMIP585-winter-median-pr/items'

collection_id='MO_NPP_npp_vgpm' items_matched=0 src_items_matched=12 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/MO_NPP_npp_vgpm/items'
src_items_url='https://staging-stac.delta-backend.com/collections/MO_NPP_npp_vgpm/items'

collection_id='bangladesh-landcover-2001-2020' items_matched=0 src_items_matched=2 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/bangladesh-landcover-2001-2020/items'
src_items_url='https://staging-stac.delta-backend.com/collections/bangladesh-landcover-2001-2020/items'

collection_id='campfire-lst-day-diff' items_matched=0 src_items_matched=1 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/campfire-lst-day-diff/items'
src_items_url='https://staging-stac.delta-backend.com/collections/campfire-lst-day-diff/items'

collection_id='campfire-nlcd' items_matched=1 src_items_matched=2 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/campfire-nlcd/items'
src_items_url='https://staging-stac.delta-backend.com/collections/campfire-nlcd/items'

collection_id='fldas-soil-moisture-anomalies' items_matched=0 src_items_matched=499 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/fldas-soil-moisture-anomalies/items'
src_items_url='https://staging-stac.delta-backend.com/collections/fldas-soil-moisture-anomalies/items'

collection_id='geoglam' items_matched=46 src_items_matched=47 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/geoglam/items'
src_items_url='https://staging-stac.delta-backend.com/collections/geoglam/items'

collection_id='hls-swir-falsecolor-composite' items_matched=0 src_items_matched=2 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/hls-swir-falsecolor-composite/items'
src_items_url='https://staging-stac.delta-backend.com/collections/hls-swir-falsecolor-composite/items'

collection_id='houston-lst-diff' items_matched=0 src_items_matched=1 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/houston-lst-diff/items'
src_items_url='https://staging-stac.delta-backend.com/collections/houston-lst-diff/items'

collection_id='houston-urbanization' items_matched=0 src_items_matched=1 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/houston-urbanization/items'
src_items_url='https://staging-stac.delta-backend.com/collections/houston-urbanization/items'

collection_id='lis-global-da-evap' items_matched=7062 src_items_matched=6849 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-evap/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-evap/items'

collection_id='lis-global-da-gpp' items_matched=7062 src_items_matched=6841 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-gpp/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-gpp/items'

collection_id='lis-global-da-gpp-trend' items_matched=0 src_items_matched=3 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-gpp-trend/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-gpp-trend/items'

collection_id='lis-global-da-gws' items_matched=2779 src_items_matched=6844 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-gws/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-gws/items'

collection_id='lis-global-da-streamflow' items_matched=0 src_items_matched=5998 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-streamflow/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-streamflow/items'

collection_id='lis-global-da-totalprecip' items_matched=6605 src_items_matched=7364 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-totalprecip/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-totalprecip/items'

collection_id='lis-global-da-tws' items_matched=7062 src_items_matched=6768 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-tws/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-tws/items'

collection_id='lis-global-da-tws-trend' items_matched=2 src_items_matched=3 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-global-da-tws-trend/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-global-da-tws-trend/items'

collection_id='lis-tws-anomaly' items_matched=6698 src_items_matched=7031 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-tws-anomaly/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-tws-anomaly/items'

collection_id='lis-tws-trend' items_matched=0 src_items_matched=1 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/lis-tws-trend/items'
src_items_url='https://staging-stac.delta-backend.com/collections/lis-tws-trend/items'

collection_id='mtbs-burn-severity' items_matched=1 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/mtbs-burn-severity/items'
src_items_url='https://staging-stac.delta-backend.com/collections/mtbs-burn-severity/items'

collection_id='nceo_africa_2017' items_matched=0 src_items_matched=1 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/nceo_africa_2017/items'
src_items_url='https://staging-stac.delta-backend.com/collections/nceo_africa_2017/items'

collection_id='nightlights-hd-1band' items_matched=7 src_items_matched=6 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/nightlights-hd-1band/items'
src_items_url='https://staging-stac.delta-backend.com/collections/nightlights-hd-1band/items'

collection_id='nightlights-hd-monthly' items_matched=0 src_items_matched=1134 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/nightlights-hd-monthly/items'
src_items_url='https://staging-stac.delta-backend.com/collections/nightlights-hd-monthly/items'

collection_id='no2-monthly' items_matched=0 src_items_matched=93 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/no2-monthly/items'
src_items_url='https://staging-stac.delta-backend.com/collections/no2-monthly/items'

collection_id='no2-monthly-diff' items_matched=1 src_items_matched=105 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/no2-monthly-diff/items'
src_items_url='https://staging-stac.delta-backend.com/collections/no2-monthly-diff/items'

collection_id='snow-projections-diff-585' items_matched=0 src_items_matched=40 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/snow-projections-diff-585/items'
src_items_url='https://staging-stac.delta-backend.com/collections/snow-projections-diff-585/items'

collection_id='snow-projections-median-245' items_matched=0 src_items_matched=40 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/snow-projections-median-245/items'
src_items_url='https://staging-stac.delta-backend.com/collections/snow-projections-median-245/items'

collection_id='snow-projections-median-585' items_matched=0 src_items_matched=40 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/snow-projections-median-585/items'
src_items_url='https://staging-stac.delta-backend.com/collections/snow-projections-median-585/items'

collection_id='social-vulnerability-index-household' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-household/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-household/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-household' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-household-nopop' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-household-nopop/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-household-nopop/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-household-nopop' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-housing' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-housing/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-housing/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-housing' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-housing-nopop' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-housing-nopop/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-housing-nopop/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-housing-nopop' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-minority' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-minority/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-minority/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-minority' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-minority-nopop' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-minority-nopop/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-minority-nopop/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-minority-nopop' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-overall' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-overall/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-overall/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-overall' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-overall-nopop' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-overall-nopop/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-overall-nopop/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-overall-nopop' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-socioeconomic' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-socioeconomic/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-socioeconomic/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-socioeconomic' discovery.reason='Not Found'!

collection_id='social-vulnerability-index-socioeconomic-nopop' items_matched=0 src_items_matched=5 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/social-vulnerability-index-socioeconomic-nopop/items'
src_items_url='https://staging-stac.delta-backend.com/collections/social-vulnerability-index-socioeconomic-nopop/items'
DISCOVERY CONFIG FOR collection_id='social-vulnerability-index-socioeconomic-nopop' discovery.reason='Not Found'!

collection_id='sport-lis-vsm0_100cm-percentile' items_matched=0 src_items_matched=2 src_match=False!
items_url='https://test.openveda.cloud/api/stac/collections/sport-lis-vsm0_100cm-percentile/items'
src_items_url='https://staging-stac.delta-backend.com/collections/sport-lis-vsm0_100cm-percentile/items'

len(dashboard_collections)=117

len(complete_collections)=74
complete_collections=['CMIP245-winter-median-pr', 'CMIP245-winter-median-ta', 'CMIP585-winter-median-ta', 'EPA-annual-emissions_1A_Combustion_Mobile', 'EPA-annual-emissions_1A_Combustion_Stationary', 'EPA-annual-emissions_1B1a_Abandoned_Coal', 'EPA-annual-emissions_1B1a_Coal_Mining_Surface', 'EPA-annual-emissions_1B1a_Coal_Mining_Underground', 'EPA-annual-emissions_1B2a_Petroleum', 'EPA-annual-emissions_1B2b_Natural_Gas_Distribution', 'EPA-annual-emissions_1B2b_Natural_Gas_Processing', 'EPA-annual-emissions_1B2b_Natural_Gas_Production', 'EPA-annual-emissions_1B2b_Natural_Gas_Transmission', 'EPA-annual-emissions_2B5_Petrochemical_Production', 'EPA-annual-emissions_2C2_Ferroalloy_Production', 'EPA-annual-emissions_4A_Enteric_Fermentation', 'EPA-annual-emissions_4B_Manure_Management', 'EPA-annual-emissions_4C_Rice_Cultivation', 'EPA-annual-emissions_4F_Field_Burning', 'EPA-annual-emissions_5_Forest_Fires', 'EPA-annual-emissions_6A_Landfills_Industrial', 'EPA-annual-emissions_6A_Landfills_Municipal', 'EPA-annual-emissions_6B_Wastewater_Treatment_Domestic', 'EPA-annual-emissions_6B_Wastewater_Treatment_Industrial', 'EPA-annual-emissions_6D_Composting', 'EPA-daily-emissions_5_Forest_Fires', 'EPA-monthly-emissions_1A_Combustion_Stationary', 'EPA-monthly-emissions_1B2a_Petroleum', 'EPA-monthly-emissions_1B2b_Natural_Gas_Production', 'EPA-monthly-emissions_4B_Manure_Management', 'EPA-monthly-emissions_4C_Rice_Cultivation', 'EPA-monthly-emissions_4F_Field_Burning', 'IS2SITMOGR4-cog', 'OMI_trno2-COG', 'OMSO2PCA-COG', 'barc-thomasfire', 'blue-tarp-detection', 'blue-tarp-planetscope', 'caldor-fire-behavior', 'caldor-fire-burn-severity', 'campfire-albedo-wsa-diff', 'campfire-lst-night-diff', 'campfire-ndvi-diff', 'co2-diff', 'co2-mean', 'conus-reach', 'disalexi-etsuppression', 'ecco-surface-height-change', 'eis_fire_perimeter', 'facebook_population_density', 'frp-max-thomasfire', 'grdi-cdr-raster', 'grdi-filled-missing-values-count', 'grdi-imr-raster', 'grdi-shdi-raster', 'grdi-v1-built', 'grdi-v1-raster', 'grdi-vnl-raster', 'grdi-vnl-slope-raster', 'hls-bais2-v2', 'houston-aod', 'houston-aod-diff', 'houston-landcover', 'houston-lst-day', 'houston-lst-night', 'houston-ndvi', 'landsat-nighttime-thermal', 'lis-etsuppression', 'lis-global-da-qs', 'lis-global-da-qsb', 'lis-global-da-swe', 'lis-tvegsuppression', 'lis-tws-nonstationarity-index', 'snow-projections-diff-245']

len(missing_collections)=3
missing_collections=['combined_CMIP6_daily_GISS-E2-1-G_tas_kerchunk_DEMO', 'hls-l30-002-ej-reprocessed', 'hls-s30-002-ej-reprocessed']

len(empty_collections)=29
empty_collections=['CMIP585-winter-median-pr', 'MO_NPP_npp_vgpm', 'bangladesh-landcover-2001-2020', 'campfire-lst-day-diff', 'eis_fire_perimeter', 'fldas-soil-moisture-anomalies', 'hls-swir-falsecolor-composite', 'houston-lst-diff', 'houston-urbanization', 'lis-global-da-gpp-trend', 'lis-global-da-streamflow', 'lis-tws-trend', 'nceo_africa_2017', 'nightlights-hd-monthly', 'no2-monthly', 'snow-projections-diff-585', 'snow-projections-median-245', 'snow-projections-median-585', 'social-vulnerability-index-household', 'social-vulnerability-index-household-nopop', 'social-vulnerability-index-housing', 'social-vulnerability-index-housing-nopop', 'social-vulnerability-index-minority', 'social-vulnerability-index-minority-nopop', 'social-vulnerability-index-overall', 'social-vulnerability-index-overall-nopop', 'social-vulnerability-index-socioeconomic', 'social-vulnerability-index-socioeconomic-nopop', 'sport-lis-vsm0_100cm-percentile']
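
For anyone reading the summary above, here is a rough sketch of how the three lists could be derived from per-collection item counts. This is an assumption about the logic, not the actual comparison script: the function name `classify_collections` and its dict inputs are hypothetical. It treats a collection as complete when the target and source counts agree, missing when the target has no such collection, and empty when the target count is 0. Under that reading, complete and empty are not mutually exclusive, which would be consistent with `eis_fire_perimeter` appearing in both lists.

```python
def classify_collections(dashboard_collections, target_counts, source_counts):
    """Sort collection ids into complete / missing / empty buckets.

    Hypothetical helper: target_counts and source_counts map a collection id
    to its matched item count. A collection absent from target_counts is
    treated as missing from the target catalog.
    """
    complete, missing, empty = [], [], []
    for collection_id in sorted(dashboard_collections):
        if collection_id not in target_counts:
            missing.append(collection_id)
            continue
        items_matched = target_counts[collection_id]
        src_items_matched = source_counts.get(collection_id, 0)
        # Assumed rule: "complete" means the two counts agree (src_match=True).
        if items_matched == src_items_matched:
            complete.append(collection_id)
        # Assumed rule: "empty" means the target holds no items at all,
        # so a collection with 0 items on both sides lands in both lists.
        if items_matched == 0:
            empty.append(collection_id)
    return complete, missing, empty


# The two social-vulnerability / sport-lis counts come from the log above;
# the eis_fire_perimeter counts are assumed for illustration only.
target = {
    "social-vulnerability-index-housing": 0,
    "sport-lis-vsm0_100cm-percentile": 0,
    "eis_fire_perimeter": 0,
}
source = {
    "social-vulnerability-index-housing": 5,
    "sport-lis-vsm0_100cm-percentile": 2,
    "eis_fire_perimeter": 0,
    "hls-l30-002-ej-reprocessed": 1,  # assumed count, absent from target
}
complete, missing, empty = classify_collections(source.keys(), target, source)
print(f"{complete=}\n{missing=}\n{empty=}")
```

If the real script treats the categories as disjoint, the `empty` check would need to move into an `elif`.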

@botanical
Member

PR #132 restructures the ingestions based on our conversation on Slack.

@j08lue
Contributor

j08lue commented Jun 7, 2024

Would it be easy enough to rename this collection here before / as we publish the production catalog?
