
Collecting COVID-19 testing data from around the world


watronfire/COVID-19-Testing-Results

 
 


Crawl cases and save to file

The purpose of this project is to crawl various government websites and collect COVID-19 testing data. This will hopefully aid in quantifying and comparing responses between countries.

Spiders

Spiders are currently available for the following countries/states. This is not necessarily the same as the above table because some data is collected manually.

| Country | Region | Source URL | Notes | Additional URL | Scrape method |
| --- | --- | --- | --- | --- | --- |
| Bahrain | Asia | Bahrain Ministry of Health | Table is in an easily accessible format, but translation isn't easy. | | Scrapy |
| Japan | Asia | covid19japan GitHub | Collects data from a number of government sources which I can't read/parse. | | GitHub |
| Malaysia | Asia | Malaysia Ministry of Health | | | Scrapy |
| Pakistan | Asia | Pakistan National Institute of Health | PDF including testing data from provinces is listed here. | | Manual |
| Palestine | Asia | Corona Virus (COVID-19) in Palestine | Provides an API for grabbing the current data. Unclear where to get historic data. | | Scrapy |
| South Korea | Asia | Coronavirus-Dataset GitHub | Had data for tests performed and positive cases up to March 20th. Unclear if still being updated. | | GitHub |
| Vietnam | Asia | Vietnam Ministry of Health | Easily parsable table. Also provides testing information at the state-wide level, which isn't utilized at the moment. | | Scrapy |
| Costa Rica | Central America | Costa Rica Ministry of Health | | | Manual |
| Austria | Europe | Austria Ministry of Public Affairs | Lists the number of tests performed and positive cases for the entire country as well as all federal states. | | Scrapy |
| Czech Republic (Czechia) | Europe | Czech Republic Ministry of Health | Current cases at link, which can be scraped. Past data pulled from Wikipedia. | 2020 coronavirus pandemic in the Czech Republic | Scrapy |
| Estonia | Europe | Estonia Government | Tests performed and positive tests provided, but historic data and deaths grabbed manually from interactive application. | CoronaCard | Scrapy |
| Finland | Europe | Finland Public Health Institute | Current data is presented on this webpage; historical data is probably available in daily press releases. | | Scrapy |
| Greece | Europe | 2020 coronavirus pandemic in Greece | Information released by the Greek government in daily PDFs. Will take values from Wikipedia. | | Wikipedia |
| Hungary | Europe | Hungary Government | Lists total cases and positive cases. Past cases through the Wayback Machine. | | Scrapy |
| Iceland | Europe | Iceland Government | Positive cases can be parsed, but total tested is only available in the interactive graphs. Provides a download data option though. | | Manual |
| Italy | Europe | COVID-19 GitHub | Presidenza del Consiglio dei Ministri is publishing all data on a GitHub repository. | | GitHub |
| Latvia | Europe | Latvia Center for Disease Prevention and Control | Official Twitter account uploads daily test results. Haven't found a source for deaths. | | Scrapy |
| Lithuania | Europe | Lithuania Ministry of Health | Can parse directly from daily news releases. Historical values were collected from interactive map. | | Scrapy |
| Poland | Europe | @micalrg's Google Doc | Polish government is tweeting out daily data, which is being recorded by @micalrg. | | Manual |
| Portugal | Europe | Portugal Ministry of Health | Releases number of tests performed and positive tests in an interactive table. Can't parse with Scrapy but will pull manually. | | Manual |
| Romania | Europe | Romania Ministry of Health | Data taken from daily afternoon press briefings. Has to be translated, so there might be errors. | | Manual |
| United Kingdom | Europe | UK Government | Cumulative test counts are released daily. Data for Northern Ireland and Scotland are also being recorded by @Tomwhite on GitHub. | covid-19-uk-data GitHub | Scrapy/GitHub |
| Alberta, Canada | North America | Alberta Provincial Government | Collated test data can be found on the website provided. Unable to parse, but can be added manually. | | Manual |
| British Columbia, Canada | North America | British Columbia Center for Disease Control | | | Scrapy |
| Manitoba, Canada | North America | Manitoba Government | | | Scrapy |
| Canada National Lab | North America | Canada Government | Total number of cases doesn't match negative + positive, so the difference is recorded as pending. | | Scrapy |
| New Brunswick, Canada | North America | New Brunswick Provincial Government | | | Scrapy |
| NL, Canada | North America | Newfoundland and Labrador Government | | | Scrapy |
| Nova Scotia, Canada | North America | Nova Scotia Provincial Government | | | Scrapy |
| NWT, Canada | North America | Northwest Territories Health and Social Services | | | Scrapy |
| Ontario, Canada | North America | Ontario Provincial Government | | | Scrapy |
| Quebec, Canada | North America | Quebec Ministry of Health and Social Services | | | Scrapy |
| Saskatchewan, Canada | North America | Saskatchewan Government | | | Scrapy |
| Yukon, Canada | North America | Yukon Government | | | Scrapy |
| USA | North America | Covid Tracking Project | Official sources aren't too good. Will pull from The COVID Tracking Project. Spiders are available for a number of states as backup. | | GitHub |
| Australian Capital Territory | Oceania | Australia Capital Health Department | | | Scrapy |
| New South Wales, Australia | Oceania | NSW Health Department | Press briefings are available at the link; each is individually grabbed and parsed. | | Scrapy |
| Philippines | Oceania | Philippines Department of Health | Can find negative and positive test results, but not deaths. Need an additional source besides interactive maps. | | Scrapy |
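The note on the Canada National Lab row describes a small piece of arithmetic: when the reported total exceeds the sum of positive and negative results, the remainder is recorded as pending. A minimal sketch of that calculation follows. The function name `pending_tests` and the example figures are hypothetical, chosen only for illustration, and are not taken from the project's code or data.

```python
def pending_tests(total: int, positive: int, negative: int) -> int:
    """Return the number of tests not yet resolved as positive or negative."""
    pending = total - (positive + negative)
    if pending < 0:
        # More resolved results than reported tests: the source figures disagree.
        raise ValueError("positive + negative exceeds total tests")
    return pending


# Hypothetical figures, for illustration only.
print(pending_tests(total=1000, positive=50, negative=900))  # prints 50
```

Raising on a negative difference is one possible design choice; a spider could instead log the inconsistency and record zero pending.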

Data

Refer to the diagram below to see which dates are available for each country. Black indicates unavailable, and white indicates available. Updated data is typically added at 9 PM PST.

(Diagram: completeness)
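The availability information behind such a diagram can be sketched as a grid with one row per country and one cell per day. The example below is an assumption about how that grid might be built, not the project's actual plotting code; the function `completeness_grid` and the sample records are hypothetical.

```python
from datetime import date, timedelta


def completeness_grid(records, start, end):
    """Map each country to a list of booleans, one per day from start to end inclusive.

    True means a record exists for that day (white in the diagram);
    False means it is missing (black).
    """
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    seen = set(records)
    countries = sorted({country for country, _ in records})
    return {c: [(c, d) in seen for d in days] for c in countries}


# Hypothetical records, for illustration only.
records = [
    ("Austria", date(2020, 3, 20)),
    ("Austria", date(2020, 3, 22)),
    ("Italy", date(2020, 3, 21)),
]
grid = completeness_grid(records, date(2020, 3, 20), date(2020, 3, 22))
for country, row in grid.items():
    # "1" marks an available day, "0" a missing one.
    print(country, "".join("1" if available else "0" for available in row))
```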

Dependencies

Configuration

  1. Modify the pipeline (covid19/pipeline.py) to specify where to save data.
  2. Run the pipeline with the covid19/update_all.py script. Countries marked for manual scraping in the Spiders table must be updated separately.
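As a rough illustration of step 1, the sketch below shows what a pipeline that controls the save location could look like, assuming covid19/pipeline.py follows Scrapy's item-pipeline interface (`open_spider`, `process_item`, `close_spider`). The class name `CsvExportPipeline`, the column names, and the output path are all hypothetical; the project's real pipeline may be organized differently.

```python
import csv
import os
import tempfile


class CsvExportPipeline:
    """Scrapy-style item pipeline that appends each scraped item to a CSV file."""

    # Hypothetical column names; the project's real item fields may differ.
    FIELDS = ["country", "date", "tests", "positive"]

    def __init__(self, path="testing_data.csv"):
        # Hypothetical output location; change this to control where data is saved.
        self.path = path

    def open_spider(self, spider):
        # Write the header only when the file is new or empty.
        is_new = not os.path.exists(self.path) or os.path.getsize(self.path) == 0
        self.file = open(self.path, "a", newline="", encoding="utf-8")
        self.writer = csv.DictWriter(
            self.file, fieldnames=self.FIELDS, extrasaction="ignore"
        )
        if is_new:
            self.writer.writeheader()

    def process_item(self, item, spider):
        self.writer.writerow(dict(item))
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        self.file.close()


# Stand-alone demonstration with a made-up item, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pipeline = CsvExportPipeline(os.path.join(tmp, "testing_data.csv"))
    pipeline.open_spider(spider=None)
    pipeline.process_item(
        {"country": "Austria", "date": "2020-03-20", "tests": 100, "positive": 5},
        spider=None,
    )
    pipeline.close_spider(spider=None)
    with open(pipeline.path, encoding="utf-8") as handle:
        print(handle.read())
```

In a Scrapy project, a pipeline class like this is enabled through the `ITEM_PIPELINES` setting; the class itself needs no Scrapy import, which is why the demonstration can call its methods directly.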
