The target-data folder contains the laboratory confirmed influenza hospital admission ("gold standard") data that FluSight forecasts are eventually compared to. Influenza hospitalization data are taken from the National Healthcare Safety Network Hospital Respitatory Dataset.
Table of Contents
FluSight hospital admission prediction targets (wk inc flu hosp
,wk flu hosp rate changes
, peak week inc flu hosp
, and peak inc flu hosp
) for this season will be based on the 'total number of patients hospitalized with influenza captured for the Wednesday of the reporting week' reported through CDC's NHSN (the dataset formerly known as HHS-Protect), Weekly Hospital Respiratory Data.
These data are released weekly.
Previously collected influenza data from the 2020-21, 2022-23, and 2023-2024 influenza seasons are included in the NHSN Weekly Hospital Respiratory Data as well as originally in the archived COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries dataset. Reporting of the influenza fields 33-35 became mandatory in February 2022, and additional details are provided in the current hospital reporting guidance and FAQs. Numbers of reporting hospitals increased after the period that reporting became mandatory in early 2022 but have since stabilized at high levels of compliance. The number of hospitals reporting these data each week by state are available in the totalconfflunewadm
variable found in the Weekly Hospital Respiratory Dataset.
Weekly official counts are publicly released on Fridays. Preliminary counts are released on Wednesdays. Official counts can be revised in subsequent data updates.
Please note the following detail from the dataset description:
"Data quality: While CDC reviews reported data for completeness and errors and corrects those found, some reporting errors might still exist within the data. CDC and partners work with reporters to correct these errors and update the data in subsequent weeks. Data reported as of December 1, 2020 are subject to thorough, routine data quality review procedures, including identifying and excluding invalid values from metric calculations and application of error correction methodology; data prior to this date may have anomalies that are not yet resolved. Data prior to August 1, 2020, are unavailable. As a result of data quality implementation and submission of any backfilled data, data and metrics might fluctuate or change week-over-week after initial posting."
This implies that some values may be repeated. Extra caution should be applied in these cases and in particular for interpreting data the most recent week of data, as hospitals report hospital admissions for the previous day (further detail below).
Some of these data are also available programmatically through the EpiData API. See accessing target (truth) data section below for details.
Percent of Emergency Department visits with a specified pathogen (COVID-19, Influenza, and Respiratory Syncytial Virus) out of all emergency department visits in a given epiweek are reported by the CDC National Syndromic Surveillance Program (NSSP) and provided in the 2023 Respiratory Virus Response - NSSP Emergency Department Visits dataset. Weekly national-level Emergency Department Visits for these pathogens are available for download by age-group in this landing page.
Additional historical influenza surveillance data from other surveillance systems are available at https://www.cdc.gov/flu/weekly/fluviewinteractive.htm. These data are updated every Friday at noon Eastern Time. The "cdcfluview" R package can be used to retrieve these data. Additional potential data sources are available in Carnegie Mellon University's Epidata API.
The hospitalization target data is computed based on the totalconfflunewadm
field which provides the new weekly admissions with a confirmed diagnosis of influenza.
For each horizon of predictions, we will use the specification of epidemiological weeks (EWs) defined by the US CDC which run Sunday through Saturday.There are standard software packages to convert from dates to epidemic weeks and vice versa (e.g. MMWRweek for R and pymmwr and epiweeks for Python).
Here are a few additional resources that describe these hospitalization data:
- data dictionary for the dataset
- the official document describing the guidance for hospital reporting
While we make efforts to create clean versions of the weekly target data, these should be seen as secondary sources to the original data at the HHS Protect site. National hospitalization, i.e., US, data are constructed from these data by summing the data across all 50 states, Washington DC (DC), and Puerto Rico (PR). Note that due to low counts in additional territories, jurisdictions such as the US Virgin Islands (VI) and American Samoa (AS) will not be included.
Note that reported data are occasionally revised as data are updated. Please see appendix below for information on accessing archived data versions.
A set of comma-separated plain text files are automatically updated every week with the latest observed values for incident hospitalizations. A corresponding CSV file is created in target-data/target-hospital-admissions.csv
.
Version history of NHSN hospitalization data can be accessed through the COVID-19 Reported Patient Impact ad Hospital Capacity by State Timeseries Archive Repository.Our collaborators at the Delphi Group at
CMU have provided resources to make these data (as well as archived versions) available through their Delphi Epidata
API. State-level hospitalization time series data prepared from the old reporting form, as well as prior versions of this time series, are available under the "covidcast" endpoint of the API with the "hhs" data source name and confirmed_admissions_influenza_1d
signal name.
Note: The covidcast "hhs" data source shifts dates so that the time_values reflect date of admissions, not the date after admissions for the time period when data was reported using the previous_day_admission_influenza_confirmed field.
Various other data sets and their version histories, including data sets not available elsewhere, are available and continue to be added to the Epidata API. For influenza this season, these include:
- National Syndromic Surveillance Program data
- Google Symptoms data
- Additional fields for state-level HHS&NHSN hospitalization reporting (old form)
- Facility-level hospitalization reporting (old form)
- Facility lookup
Other data available in the API are listed in the API documentation and signal discovery tool (in development). To access these data, teams can utilize the epidatr R package or epidatpy Python package. Some basic examples of pulling old influenza data are available here.