Project Tycho Level 1 Data

Please note: Version 2 of Project Tycho is now available. This listing contains Version 1.0.0 (Level 1) data and was last updated on 2017-12-07.

1. Dataset Content and Format

Project Tycho data include counts of infectious disease cases or deaths per time interval. Each count is one data point. Project Tycho level 1 data include counts that have been standardized for a specific published analysis. Standardization of level 1 data consisted of converting the various types of counts into a common format and excluding counts that are not required for the intended analysis. In addition, external data such as population data may have been integrated with disease data to derive rates or for other applications.

Version 1.0.0 of level 1 data includes counts at the state level for smallpox, polio, measles, mumps, rubella, hepatitis A, and whooping cough (pertussis), and at the city level for diphtheria. The time period covered varies by disease and falls between 1916 and 2010. This version includes case counts as well as incidence rates per 100,000 population based on historical population estimates. These data have been used by investigators at the University of Pittsburgh to estimate the impact of vaccination programs in the United States, published in the New England Journal of Medicine: http://www.nejm.org/doi/full/10.1056/NEJMms1215400. See this paper for additional methods and detail about the origin of level 1 version 1.0.0 data.
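As a rough illustration of what "incidence per 100,000 population" means, the sketch below scales a case count by a population estimate. The function name `incidence_per_100k` and the numbers are hypothetical and are not taken from the dataset or from the published methods; see the paper for how the rates in this file were actually derived.

```python
def incidence_per_100k(cases: int, population: int) -> float:
    """Return the number of cases per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be a positive number")
    # Multiply first so that simple integer inputs divide cleanly.
    return cases * 100_000 / population


# Illustrative figures only; they do not come from the Tycho data.
example_rate = incidence_per_100k(cases=250, population=5_000_000)
print(example_rate)
```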

Level 1 version 1.0.0 data are provided as a CSV file with 7 columns:
- epi_week: a six-digit number that represents the year and epidemiological week for which disease cases or deaths were reported (yyyyww)
- state: the two-letter postal abbreviation of the state for which a count has been reported
- loc: the name of the state or city for which a count has been reported, written in capital letters
- loc_type: the type of location (STATE or CITY) for which a count has been reported
- disease: the disease for which a count has been reported: HEPATITIS A, MEASLES, MUMPS, PERTUSSIS, POLIO, RUBELLA, SMALLPOX, or DIPHTHERIA
- cases: the number of cases reported for the specified disease, epidemiological week, and location
- incidence_per_100000: the number of cases per 100,000 people, computed using historical population counts for cities and states as reported by the US Census Bureau
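The column layout described above could be read along the following lines. This is a minimal sketch using only the Python standard library. It assumes the file has a header row with exactly these column names and that `cases` and `incidence_per_100000` are always numeric, which may not hold for the real file. The sample rows and the helper names `split_epi_week` and `read_rows` are made up for illustration.

```python
import csv
import io

# Hypothetical sample in the seven-column layout; the values are invented.
SAMPLE = """epi_week,state,loc,loc_type,disease,cases,incidence_per_100000
192801,PA,PENNSYLVANIA,STATE,MEASLES,120,1.25
192802,PA,PENNSYLVANIA,STATE,MEASLES,95,0.99
"""


def split_epi_week(value: str) -> tuple[int, int]:
    """Split a yyyyww string into (year, week)."""
    if len(value) != 6 or not value.isdigit():
        raise ValueError(f"expected six digits (yyyyww), got {value!r}")
    return int(value[:4]), int(value[4:])


def read_rows(handle):
    """Parse rows from an open CSV handle into dictionaries with typed fields."""
    rows = []
    for record in csv.DictReader(handle):
        year, week = split_epi_week(record["epi_week"])
        rows.append({
            "year": year,
            "week": week,
            "state": record["state"],
            "loc": record["loc"],
            "loc_type": record["loc_type"],
            "disease": record["disease"],
            "cases": int(record["cases"]),
            "incidence_per_100000": float(record["incidence_per_100000"]),
        })
    return rows


rows = read_rows(io.StringIO(SAMPLE))
total_cases = sum(row["cases"] for row in rows)
print(len(rows), total_cases)
```

To read an actual download, the `io.StringIO(SAMPLE)` argument would be replaced by a file opened with `open(path, newline="")`, where `path` is wherever the CSV was saved.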

2. Citation

Willem G. van Panhuis, John Grefenstette, Su Yon Jung, Nian Shong Chok, Anne Cross, Heather Eng, Bruce Y. Lee, Vladimir Zadorozhny, Shawn Brown, Derek Cummings, Donald S. Burke. Contagious Diseases in the United States from 1888 to the Present. NEJM 2013; 369(22): 2152-2158.

3. Contact Information

In case of questions or ideas, please contact Project Tycho via email ([email protected]) or via the website (www.tycho.pitt.edu).