An Abandoned Public New York City Dataset Shows Early January 2020 Cases
Whence the specimens though?
A public dataset in New York City Open Data shows that specimens taken from New Yorkers in early January 2020 tested positive for SARS-CoV-2.
The file, named “COVID-19 Outcomes by Testing Cohorts: Cases, Hospitalizations, and Deaths”, was launched by the NYC Department of Health and Mental Hygiene (DOHMH) on April 28, 2020, and reports “outcomes (confirmed cases, hospitalizations, and deaths) for cohorts defined by each date of specimen collection (specimen_date).” Data were last updated on October 2, 2021.
The Testing Cohort file differs from the city’s “COVID-19 Daily Counts of Cases, Hospitalizations, and Deaths” in that it’s organized first by extraction date (i.e., the date each data point was extracted from what DOHMH called a “live disease surveillance database”), then by the date a specimen was taken. Outcomes are shown by the date a specimen was collected.1
For example, if an NYC resident tested positive for the virus and was later hospitalized, both the test and the hospitalization show under the specimen date, not the date of the hospitalization.2
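To make the difference concrete, here is a minimal sketch of the two ways of counting. The patient records and function names are invented for illustration; they are not taken from either NYC file, and the real DOHMH pipeline is not public in this form.

```python
from collections import Counter
from datetime import date

# Hypothetical patient records, invented for illustration only.
patients = [
    {"specimen_date": date(2020, 3, 10), "admit_date": date(2020, 3, 14)},
    {"specimen_date": date(2020, 3, 10), "admit_date": None},
    {"specimen_date": date(2020, 3, 12), "admit_date": date(2020, 3, 12)},
]

def hospitalizations_by_specimen_date(records):
    """Cohort-style: count each hospitalized patient under the specimen date."""
    return Counter(
        r["specimen_date"] for r in records if r["admit_date"] is not None
    )

def hospitalizations_by_admit_date(records):
    """Daily-counts-style: count each hospitalization under its admission date."""
    return Counter(
        r["admit_date"] for r in records if r["admit_date"] is not None
    )

cohort_view = hospitalizations_by_specimen_date(patients)
daily_view = hospitalizations_by_admit_date(patients)
```

In this toy example, the patient swabbed on March 10 and admitted on March 14 is counted under March 10 in the cohort view and under March 14 in the daily view.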
When were these early specimens added?
As far as I can tell, a specimen from January 2020 was first added to the file on November 12, 2020.
On December 25, 2020, a death was added to the specimen.
Another specimen was added to the January 1, 2020 date, with a hospitalization.
But by January 21, 2021, the January 1, 2020 specimens were removed. [Update: Given the 1/1/2020 specimen was first entered on 11/12/2020, it may be a simple data entry error.]
Other specimens taken in the first month of 2020 were added to the database in January 2021, for a total of 233 specimens tested, with 23% confirmed as cases.
Whence these specimens?
According to the dataset dictionary, all the tests and results in the Testing Cohort file were “passively reported to the NYC Health Department by hospital, commercial, and public health laboratories.” No other information about sources is given.
The first case of covid-19 confirmed by New York officials was a 39-year-old female healthcare worker who had returned from a trip to Iran, “tested positive,” and wasn’t hospitalized.
A study published in June 2021 reported positive samples taken in the later weeks of January, from New Yorkers who came to hospitals with influenza-like illness (ILI) but tested negative for what the authors call “routine respiratory pathogens.”3 Retrospective testing of the specimens collected as early as January 25, 2020, identified SARS-CoV-2 RNA. While it’s possible these results were added to the city’s Testing Cohort dataset, they don't account for the early January 2020 specimens.
Could the specimens be from either the CDC or another country testing New Yorkers who were traveling to or from China in the last weeks of 2019 or the outset of 2020?
Dr. Nancy Messonnier, Director of the CDC’s National Center for Immunization and Respiratory Diseases, told reporters in mid-February 2020 that “the CDC and its partners” had screened more than 30,000 passengers from China. However, Messonnier also said the screening began mid-January, which is too late to be those earliest Testing Cohort specimens.
[Update: The January 2020 specimens and cases could also be “Checkuary” mistakes. That is, they are actually from January 2021 but were entered as January 2020 by mistake. Such errors aren’t as good an excuse for the February 2020 specimens, however.]
NYC DOHMH gives no explanation in the dataset as to why the last extraction is October 2, 2021, or why they stopped updating the file.
Why does it matter?
The earlier SARS-CoV-2 was circulating in New York City, the more questions Americans should have about the city’s cataclysmic death toll.
Although the virus was billed as highly contagious, evidence of a deadlier-than-flu virus silently spreading is nowhere in the mortality data for the five boroughs in the months and weeks leading up to mid-March 2020.
If specimens from 2019 exist, NYC DOHMH should test those immediately, and release the results to the public.
On 4/12/23, I posted an update about my correspondence with NYC DOHMH about this dataset in a note.
NYC DOHMH’s definitions for each data column in the file follow:
extract_date: Date of extraction from live disease surveillance database
specimen_date: Date of specimen collection, equivalent to diagnosis date
number_tested: Count of NYC residents newly tested for SARS-CoV-2
number_confirmed: For patients with specimens collected for SARS-CoV-2 testing on the given specimen_date, the count of those same patients who were confirmed to be COVID-19 cases
number_hospitalized: For patients with specimens collected for SARS-CoV-2 testing on the given specimen_date and confirmed to be COVID-19 cases, the count of those same patients who were ever hospitalized
number_deaths: For patients with specimens collected for SARS-CoV-2 testing on the given specimen_date and confirmed to be COVID-19 cases, the count of those same patients who died
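For anyone who wants to retrace when a given specimen date entered or left the file, the following is a rough sketch of one way to do it with the columns defined above. The rows in SAMPLE_CSV are made up for illustration and are not actual values from the dataset; the ISO date format and the function names are also assumptions, so the parsing would need adjusting to match a real download.

```python
import csv
import io
from datetime import date

# Made-up rows in the column layout defined above; NOT actual values from the file.
# ISO dates (YYYY-MM-DD) are an assumption about the format.
SAMPLE_CSV = """extract_date,specimen_date,number_tested,number_confirmed,number_hospitalized,number_deaths
2020-11-11,2020-03-01,10,2,1,0
2020-11-12,2020-01-01,1,1,0,0
2020-11-12,2020-03-01,10,2,1,0
2020-12-25,2020-01-01,1,1,0,1
2021-01-21,2020-03-01,10,2,1,0
"""

def history_for_specimen_date(csv_text, target):
    """Return (extract_date, row) pairs for one specimen_date, oldest extract first."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if date.fromisoformat(row["specimen_date"]) == target:
            rows.append((date.fromisoformat(row["extract_date"]), row))
    rows.sort(key=lambda pair: pair[0])
    return rows

def extracts_missing(csv_text, target):
    """List extract dates in which the target specimen_date does not appear at all."""
    seen, present = set(), set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        extract = date.fromisoformat(row["extract_date"])
        seen.add(extract)
        if date.fromisoformat(row["specimen_date"]) == target:
            present.add(extract)
    return sorted(seen - present)

target = date(2020, 1, 1)
history = history_for_specimen_date(SAMPLE_CSV, target)
first_seen = history[0][0] if history else None
missing = extracts_missing(SAMPLE_CSV, target)
```

Here first_seen is the earliest extract containing the target specimen date, history shows how its counts changed from one extract to the next, and missing lists the extracts where that specimen date is absent, which is how a later removal would show up.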
By contrast, the Daily Counts dataset starts on February 29, 2020, and shows cases, hospitalizations, and deaths by date of diagnosis, admission, and occurrence, respectively.
Tested with multiplex diagnostic panels. The authors give BioMerieux FilmArray Respiratory Panel and Cepheid Xpert® Xpress Flu/RSV as examples.
This solidifies my original idea that coronavirus was always here. You can make a case that something happened to cause a 19 strain but most people had a natural immunity to coronavirus. Public health intervention actually made this all worse
There was something bad circulating in SE Wisconsin in the fall of 2019. My whole family got it and it lasted a month for me. Three of us went to doctors (me three times) and nobody was sure what it was