Merge branch 'main' of github.com:HopkinsIDD/rsv-forecast-hub
kjsato committed Jan 29, 2024
2 parents 73285c3 + 6b5058f commit 67e60a3
Showing 8 changed files with 23,829 additions and 22,363 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/validate-submission.yaml
@@ -52,7 +52,7 @@ jobs:
uses: actions/checkout@v2
with:
repository: HopkinsIDD/rsv-forecast-hub_data
token: ${{ secrets.KJ3_PAT }}
token: ${{ secrets.KJ3_PATC }}
path: ./rsv-forecast-hub_data
fetch-depth: 2

76 changes: 64 additions & 12 deletions README.md
@@ -2,23 +2,75 @@

**This repository is in development.**

This repository is designed to collect forecast data for the 2023-2024 RSV Forecast Hub run by Johns Hopkins University Infectious Disease Dynamics Group. This project collects forecasts for weekly new hospitalizations due to confirmed Respiratory Syncytial Virus (RSV).
## Rationale
Respiratory Syncytial Virus (RSV) is the #1 cause of hospitalizations in children under 5 in the United States. Tracking RSV-associated hospitalization rates helps public health professionals understand trends in virus circulation, estimate disease burden, and respond to outbreaks. Accurate predictions of hospital admissions linked to RSV will help support appropriate public health interventions and planning during the 2023-2024 RSV season, as COVID-19, influenza, and other respiratory pathogens are circulating.

# Forecasts of Confirmed RSV Hospital Admissions During the 2023-2024 Season
RSV-related hospitalizations are a major contributor to the overall burden of RSV in the United States. Tracking RSV-associated hospitalization rates helps public health professionals understand trends in virus circulation, estimate disease burden, and respond to outbreaks. Accurate predictions of hospital admissions linked to RSV will help support appropriate public health interventions and planning during the 2023-2024 RSV season, as COVID-19, influenza, and other respiratory pathogens are circulating.
This repository is designed to collect forecast data for the 2023-2024 RSV Forecast Hub run by Johns Hopkins University Infectious Disease Dynamics Group. This project collects forecasts for weekly new hospitalizations due to confirmed RSV.

**How to Participate:**
This RSV Forecast Hub will serve as a collaborative forecasting challenge for weekly laboratory confirmed RSV hospital admissions. Each week, participants are asked to provide national- and jurisdiction-specific probabilistic forecasts of the weekly number of confirmed RSV hospitalizations for the following three weeks.
## How to Participate:
This RSV Forecast Hub will serve as a collaborative forecasting challenge for weekly laboratory confirmed RSV hospital admissions. Each week, participants are asked to provide national- and jurisdiction-specific probabilistic forecasts of the weekly number of confirmed RSV hospitalizations for the following three weeks. The RSV Forecast Hub is open to any team willing to provide projections at the right temporal and spatial scales. We only require that participating teams share point estimates and uncertainty bounds, along with a short model description and answers to a list of key questions about design.

The RSV Forecast Hub is open to any team willing to provide projections at the right temporal and spatial scales. We only require that participating teams share point estimates and uncertainty bounds, along with a short model description and answers to a list of key questions about design. A major output of the forecast hub is ensemble estimates of the prediction targets. Model projections should be submitted via pull request to the model-output/ folder and associated metadata should be submitted at the same time to the model-metadata/ folder of this GitHub repository.
Those interested in participating, please read the README file and email Kimberlyn Roosa at [email protected].

Those interested in participating, please read the README file and email Kimberlyn Roosa at [email protected].
Model projections should be submitted via pull request to the [model-output/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/model-output) folder and associated metadata should be submitted at the same time to the [model-metadata/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/model-metadata) folder of this GitHub repository.

**Prediction Targets:**
The Respiratory Syncytial Virus Hospitalization Surveillance Network (RSV-NET) is a network that conducts active, population-based surveillance for laboratory-confirmed RSV-associated hospitalizations in children younger than 18 years of age and adults. The network currently includes 58 counties in 12 states that participate in the Emerging Infections Program (California, Colorado, Connecticut, Georgia, Maryland, Minnesota, New Mexico, New York, Oregon, and Tennessee) or the Influenza Hospitalization Surveillance Program (Michigan and Utah). RSV-NET covers almost 8% of the U.S. population. RSV-NET hospitalization data are preliminary and subject to change as more data become available. Case counts and rates for recent hospital admissions are subject to reporting lags that might increase around holidays or during periods of increased hospital utilization. As new data are received each month, previous case counts and rates are updated accordingly.
### RSV Model Calibration
**RSV-NET Dataset**

Participating teams are asked to provide national- and jurisdiction-specific quantile forecasts of the weekly number of confirmed RSV hospitalizations for the epidemiological week (EW) ending on the reference date as well as the three following weeks. Teams can, but are not required to, submit forecasts for all weekly horizons or all locations. We will use the specification of EWs defined by the CDC, which run Sunday through Saturday. The target end date for a prediction is the Saturday that ends an EW of interest and can be calculated using the expression: target end date = reference date + horizon * (7 days). There are standard software packages to convert from dates to epidemic weeks and vice versa (e.g. MMWRweek and lubridate for R and pymmwr and epiweeks for Python).
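
As a minimal sketch of the target end date expression above, the Python snippet below uses only the standard library. The function name `target_end_date` and the example reference date are illustrative assumptions, not part of the hub tooling.

```python
from datetime import date, timedelta


def target_end_date(reference_date: date, horizon: int) -> date:
    """Return reference_date + horizon * (7 days)."""
    return reference_date + timedelta(days=7 * horizon)


# Hypothetical reference date for illustration: 2024-01-27 is a Saturday.
reference = date(2024, 1, 27)

# Horizon 0 is the week ending on the reference date; 1 to 3 are the following weeks.
for horizon in range(4):
    print(horizon, target_end_date(reference, horizon).isoformat())
```

Because each horizon adds a whole number of weeks, every target end date falls on the same weekday as the reference date.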
The Respiratory Syncytial Virus Hospitalization Surveillance Network (RSV-NET) is a network that conducts active, population-based surveillance for laboratory-confirmed RSV-associated hospitalizations in children younger than 18 years of age and adults. The network currently includes 58 counties in 12 states that participate in the Emerging Infections Program (California, Colorado, Connecticut, Georgia, Maryland, Minnesota, New Mexico, New York, Oregon, and Tennessee) or the Influenza Hospitalization Surveillance Program (Michigan and Utah). Age- and state-specific data on laboratory-confirmed RSV hospitalization rates are available for 12 states and the US from RSV-NET spanning 2017-18 to present [(RSV-NET CDC Webpage)](https://www.cdc.gov/rsv/research/rsv-net/index.html). Age-specific weekly rates per 100,000 population are reported in this system.

# Acknowledgements
This repository follows the guidelines and standards outlined by the hubverse, which provides a set of data formats and open source tools for modeling hubs.
The data has been standardized and posted in the rsv-forecast-hub GitHub [target-data/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/target-data) folder and is updated weekly. **The target in this data is the weekly number of hospitalizations in each given state (inc_hosp variable), for all ages and for each age group.** To obtain counts, we have converted RSV-NET weekly rates based on state population sizes. This method assumes that RSV-NET hospitals are representative of the whole state. To obtain national US counts, we have used the rates provided for the “overall RSV-NET network”. The data covers 2017-present. Reported age groups include: [0-6 months], [6-12 months], [1-2 yr], [2-4 yr], [5-17 yr], [18-49 yr], [50-64 yr], and 65+ years. The standardized dataset includes week-, state-, and age-specific RSV counts (the target), rates, and population sizes.
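
The rate-to-count conversion described above can be sketched as follows. This is only an illustration of the arithmetic (rate per 100,000 multiplied by population); the function name and the example numbers are hypothetical, and whether or how the hub rounds the resulting counts is not stated here.

```python
def rate_to_count(rate_per_100k: float, population: int) -> float:
    """Convert a weekly rate per 100,000 population into an estimated count.

    Assumes the surveilled area is representative of the whole population.
    """
    return rate_per_100k * population / 100_000


# Hypothetical values for illustration only (not taken from RSV-NET).
example_rate = 2.5            # hospitalizations per 100,000 in one week
example_population = 6_000_000

print(rate_to_count(example_rate, example_population))
```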

**Note:** Different states joined RSV-NET in different years (between 2014 and 2018), and RSV surveillance throughout the network was initially limited to adults. Pediatric RSV surveillance began in the 2018-19 season. Further, RSV-NET hospitalization data are preliminary and subject to change as more data become available. Case counts and rates for recent hospital admissions are subject to reporting lags that might increase around holidays or during periods of increased hospital utilization. As new data are received each month, previous case counts and rates are updated accordingly.

The source of age distribution used for calibration (RSV-NET vs other estimates) should be provided in the abstract metadata that is submitted with the projections.

**Other RSV datasets available for calibration:**
A few auxiliary datasets have been posted in the GitHub repository [auxiliary-data/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/auxiliary-data) folder, including:

- State-specific CDC surveillance from NREVSS (only last year of data available)
- State-specific ED data (only last year of data available)

### Prediction Targets
Participating teams are asked to provide **quantile forecasts of the weekly incident confirmed RSV hospital admissions (counts) in the 12 RSV-NET states, nationally for all ages, and for a set of minimal age groups** for the epidemiological week ending on the reference date as well as the three following weeks. Teams can, but are not required to, submit forecasts for all weekly horizons or all locations.

**Weekly targets:**
- Weekly reported all-age and age-specific state-level incident hospital admissions, based on RSV-NET. This dataset is updated daily and covers 2017-2023. There should be no adjustment for reporting (i.e., the raw data from the RSV-NET dataset are the quantity to be projected).
- All targets should be number of individuals, rather than rates.

**Age target:**
- **Required:**
  - Weekly state-specific and national RSV hospitalizations for all ages (or 0-130) is the only mandatory age target.
- Additional age details (optional):
  - Weekly state-specific and national RSV hospitalizations among individuals <1 yr, 1-4, 5-17, 18-49, 50-64, and 65+ (most of the RSV burden on hospitalizations comes from the 0-1 and 65+ age groups)

### Timeline
*Add info here*

## Target Data
The [target-data/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/target-data) folder contains the RSV hospitalization data (also called "truth data") standardized from the [Weekly Rates of Laboratory-Confirmed RSV Hospitalizations from the RSV-NET Surveillance System](https://data.cdc.gov/Public-Health-Surveillance/Weekly-Rates-of-Laboratory-Confirmed-RSV-Hospitali/29hc-w46k/about_data).

The weekly hospitalization numbers per location will be used as truth data in the hub.

## Auxiliary Data
The repository stores and updates additional data relevant to the RSV modeling efforts in the [auxiliary-data/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/auxiliary-data) folder:

- Population and census data:
  - National and state-level names and FIPS codes as used in the Hub, with the associated population sizes.
  - State-level population size per year and per age from the US Census Bureau.

- Birth Rate:
  - Number of births and birth rate per state and per year from 1995 to 2022 inclusive.
  - Data from the US Census Bureau and from the Centers for Disease Control and Prevention, National Center for Health Statistics. National Vital Statistics System, Natality on CDC WONDER Online Database.

- RSV data:
  - The National Respiratory and Enteric Virus Surveillance System (NREVSS) data at national and state level.
  - The [Weekly Rates of Laboratory-Confirmed RSV Hospitalizations from the RSV-NET Surveillance System](https://data.cdc.gov/Public-Health-Surveillance/Weekly-Rates-of-Laboratory-Confirmed-RSV-Hospitali/29hc-w46k)
  - The [National Emergency Department Visits for COVID-19, Influenza, and Respiratory Syncytial Virus](https://www.cdc.gov/ncird/surveillance/respiratory-illnesses/index.html)

## Data License and Reuse:
All source code that is specific to the overall project is available under an open-source MIT license. Note that this license does not cover model code from the various teams, model scenario data, if any (available under specified licenses as described above), or auxiliary data.

## Acknowledgements
This repository follows the guidelines and standards outlined by the [hubverse](https://hubdocs.readthedocs.io/en/latest/), which provides a set of data formats and open source tools for modeling hubs.

6 changes: 6 additions & 0 deletions auxiliary-data/README.md
@@ -0,0 +1,6 @@
## Auxiliary Data
This folder is used to store additional data relevant to RSV modeling efforts.

### Location and Census Data
The folder [location_census/](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/auxiliary-data/location_census) contains:
- [location_census/locations.csv](https://github.com/HopkinsIDD/rsv-forecast-hub/tree/main/auxiliary-data/location_census/locations.csv) contains the state and national full name, 2 letter abbreviation, and FIPS code as used in the hub. The file also contains the population size.
3 changes: 2 additions & 1 deletion hub-config/tasks.json
@@ -10,6 +10,7 @@
"origin_date": {
"required": null,
"optional": [
"2023-10-15", "2023-10-22", "2023-10-29", "2023-11-05",
"2023-11-12", "2023-11-19", "2023-11-26", "2023-12-03",
"2023-12-10", "2023-12-17", "2023-12-24", "2023-12-31",
"2024-01-07", "2024-01-14", "2024-01-21", "2024-01-28",
@@ -88,7 +89,7 @@
},
"age_group":{
"required":["0-130"],
"optional":["0-0.99","1-4","5-17","5-64","18-49","50-64","65-130"]
"optional":["0-0.99","1-4","5-64","65-130"]
}
},
"output_type": {
89 changes: 86 additions & 3 deletions model-metadata/README.md
@@ -1,5 +1,88 @@
# Model metadata
# Model Metadata

This folder should contain metadata files for the models submitting to the hub, following the recommended [model metadata guidelines in our documentation](https://hubdocs.readthedocs.io/en/latest/format/model-metadata.html).
This folder contains metadata files for each team-model submitting to the RSV Forecast Hub.

Since some metadata fields may be specific to this hub, creators of the hub are encouraged to modify the template model metadata file so that it is a valid model metadata file for your project.
## Sub-directory
Each sub-directory within the [model-metadata/](https://github.com/HopkinsIDD/rsv-forecast-hub/edit/main/model-metadata) directory has the format:

```
team-model
```

where
- ```team``` is the abbreviated team name (```team_abbr```) and
- ```model``` is the abbreviated name of your model (```model_abbr```).

Both team and model should be less than 15 characters and must not include hyphens or spaces.
The ```team-model``` should correspond to the ```team-model``` sub-directory in the associated model-output folder containing the associated projections.

## Required Information
### ```team_name```
The name of your team that is less than 50 characters, no spaces. Will be displayed online.

### ```model_name```
The name of your model that is less than 50 characters, no spaces. Will be displayed online.

#### Abbreviation
### ```team_abbr```
An abbreviated name for your team that is less than 15 alphanumeric characters (```_``` also accepted, please avoid using other punctuation characters, including ```-```, ```/```, etc.).

### ```model_abbr```
An abbreviated name for your model that is less than 15 alphanumeric characters (```_``` also accepted, please avoid using other punctuation characters, including ```-```, ```/```, etc.).

#### Team-model name and filename
The team-model abbreviation used in all the file names must be in the format of ```[team_abbr]-[model_abbr]```, where each of the ```[team_abbr]``` and ```[model_abbr]``` are text strings that are each less than 15 alphanumeric characters that do not include a hyphen or whitespace.
Note that this is a uniquely identifying field in our system, so please choose this name carefully, as it may not be changed once defined. The model abbreviation will be displayed online.
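
As a rough sketch of the naming rule above, the check below accepts names of the form `[team_abbr]-[model_abbr]`. The regular expression is an assumption derived from the wording here (1 to 14 alphanumeric or `_` characters on each side of a single hyphen); it is not the hub's official validation, and the example names are hypothetical.

```python
import re

# Assumed pattern: 1 to 14 characters from [A-Za-z0-9_] on each side of one hyphen.
TEAM_MODEL_PATTERN = re.compile(r"[A-Za-z0-9_]{1,14}-[A-Za-z0-9_]{1,14}")


def is_valid_team_model(name: str) -> bool:
    """Return True if name looks like [team_abbr]-[model_abbr]."""
    return TEAM_MODEL_PATTERN.fullmatch(name) is not None


# Hypothetical names for illustration.
for candidate in ["teamA-modelB", "team-A-modelB", "team A-modelB"]:
    print(candidate, is_valid_team_model(candidate))
```

Using `fullmatch` rather than `search` ensures the whole string must follow the pattern, so a second hyphen or embedded whitespace is rejected.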

### ```model_contributers```
A list of all individuals involved in the forecasting effort, affiliations, and email addresses.
At least one contributor needs to have a valid email address. All email addresses provided will be added to an email distribution list for model contributors.
The syntax of this field should be

```
[
{
"name": "LastName FirstName",
"affiliation": "Affiliation",
"email": "user@address"
},
{
"name": "LastName FirstName",
"affiliation": "Affiliation"
},
{
"name": "LastName FirstName",
"affiliation": "Affiliation",
"email": "user3@address"
}
]
```
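
The requirement that at least one contributor provides an email address could be checked along these lines. This is a simplified sketch: the helper name is hypothetical, and testing for an `@` character is only a loose stand-in for real email validation.

```python
import json


def has_contact_email(contributors: list) -> bool:
    """Return True if at least one contributor entry has an email containing '@'."""
    return any("@" in str(entry.get("email", "")) for entry in contributors)


# Example entries following the syntax shown above (placeholder values).
example = json.loads("""
[
  {"name": "LastName FirstName", "affiliation": "Affiliation", "email": "user@address"},
  {"name": "LastName FirstName", "affiliation": "Affiliation"}
]
""")

print(has_contact_email(example))
```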

### ```website_url```
A URL to a website that has additional data about your model. We encourage teams to submit the most user-friendly version of their model, e.g. a dashboard or similar, that displays model output.
If you additionally have a data repository where you store forecasts and other model code, please include that in your methods section below.
If you only have a more technical site, e.g. a GitHub repo, please include that link here. Will be displayed online.

### ```model_version```
A version number or date in YYYY-MM-DD format to designate the version of the model used for submitted model projections. Will be displayed online.

### ```methods```
A brief description of your methodology that is less than 200 characters. Will be displayed online.

### ```license```
We encourage teams to submit under a "cc-by-4.0" license to allow the broadest possible uses, including private vaccine production (which would be excluded by the "cc-by-nc-4.0" license).
Alternatively, add the name and URL of the license used, as in ```cc-by-4.0, https://creativecommons.org/licenses/by/4.0/```, or add the value ```LICENSE.txt``` if a LICENSE.txt file was added within the folder.
Will be displayed online.

## Optional
### ```team_funding```
Acknowledgement of funding sources, by name of funding agency, grant title, and grant number.

### ```data_inputs```
A brief description of the data sources used to inform the model, using as much standard terminology as possible, that includes a source name and the type of data, such as ```CDC RSV-NET```, etc.

### ```methods_long```
A long description of your methodology.

### ```citation```
A bibliographic citation to a paper, website, or other object that people can consult to find out more about the model, in the style used by PubMed, such as ```"Smith, J, Smith S, Smith C. MyModel is the best model. Nature. 2020 Aug. doi: 10.1038/s12345-678-90123-45."```