This codebase is designed to track measures of the effective reproduction number (Rt).
Simply put, Rt represents the mean number of cases generated by an infectious individual. A value below 1 means that each infectious individual infects fewer than one person on average, so the spread of the virus decreases. A value above 1 means the spread of the virus is increasing.
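To make the threshold at 1 concrete, here is a minimal sketch of how expected case counts change from one generation of infections to the next when the reproduction number described above (called Rt in the scripts below) is held constant. This is a deliberately simplified, deterministic illustration and not the estimation method used in this codebase; the function name and the sample values are hypothetical.

```python
def expected_cases_by_generation(initial_cases, rt, generations):
    """Return expected case counts per generation for a constant Rt.

    Simplifying assumption: every infectious individual generates exactly
    `rt` new cases on average, and Rt does not change over time.
    """
    cases = [float(initial_cases)]
    for _ in range(generations):
        # Each generation is the previous one scaled by Rt.
        cases.append(cases[-1] * rt)
    return cases


# Rt below 1: each generation is smaller than the last (spread decreases).
shrinking = expected_cases_by_generation(100, 0.8, 3)

# Rt above 1: each generation is larger than the last (spread increases).
growing = expected_cases_by_generation(100, 1.5, 3)

print(shrinking)
print(growing)
```

In practice Rt varies over time and has to be estimated from noisy case data, which is what the scripts in this repository are for.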
To set up your environment using pip:
pip install -r requirements.txt
Recommended Python versions: > 3.5
To verify that the codebase runs on your local system, run the following:
cd ./src/
chmod +x *.sh
# The shell files below pull in data and shapefiles for the various geographies
./get_rt.sh
./get_county_data.sh
# To calculate Rt for all counties in a state, run the generate_rt.py script
# You can add multiple states to your list
python generate_rt.py --filtered_states DE --output_path='../data/rt_county/'
# To instead calculate Rt for the entire state, add the --state_level_only flag
python generate_rt.py --filtered_states DE --state_level_only --output_path='../data/rt_state/'
# To consolidate the various states you ran above into one file, run the commands below
python generate_rt_combine.py --files_path='../data/rt_state/' --state_level
python generate_rt_combine.py --files_path='../data/rt_county/'
# Finally, to create your HTML output plots, run the command below
python generate_plots.py --country_name="USA"
cd ../
If this works correctly, you should see a “USA” folder created inside the output/ folder.
Open ./output/USA/country_county_static.html to explore the plots locally.
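As a convenience, the check for the generated plot can be scripted. The sketch below assumes it is run from the repository root (the directory you return to after `cd ../`) and that the output file sits at the path given above; the helper names `report_path` and `open_report` are hypothetical and not part of this codebase, and the file name is only documented here for the USA.

```python
from pathlib import Path
import webbrowser


def report_path(country_name="USA", output_dir="output"):
    """Build the path to the static HTML plot described above."""
    return Path(output_dir) / country_name / "country_county_static.html"


def open_report(country_name="USA", output_dir="output"):
    """Open the generated plot in the default browser if it exists.

    Returns True if the file was found and a browser was asked to open it,
    and False if the file is missing (for example, the pipeline did not run).
    """
    path = report_path(country_name, output_dir)
    if not path.is_file():
        return False
    webbrowser.open(path.resolve().as_uri())
    return True


# Example usage from the repository root:
# if not open_report("USA"):
#     print("Plot not found; check that generate_plots.py completed.")
```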
- What geographies are currently supported?
  - Currently, the US and England are the only supported geographies.
- How do I add a new geography?
  - To add a new geography, you need access to daily case values for your country, states, and counties (or equivalent geographical breakdowns).
  - This codebase was expanded to include England in a later iteration. You could build a similar pipeline using this notebook as a starting point.
- The codebase takes very long to run. What can I do to speed things up?
  - More powerful compute hardware can greatly speed up the calculations (especially at the county level). Additional tips and tricks can be found in the rt-condax.sh file (as an example). Theano’s thread locks limit parallelization. This can be overcome by leveraging the theano.NOBACKUP flag. You can read further about Theano flags here.
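For the parallelization point above, here is a minimal sketch of how Theano flags can be passed to a child process through the `THEANO_FLAGS` environment variable. It illustrates one general technique, giving each worker its own compilation directory through Theano's `base_compiledir` setting so that workers do not wait on a shared compile-directory lock. Whether this resolves the slowdown for this codebase is an assumption, it is not taken from rt-condax.sh, and it is separate from the flag named above. The helper name and the directory layout are hypothetical.

```python
import os
from pathlib import Path


def theano_env_for_worker(worker_id, base_dir="/tmp/theano_compile"):
    """Return a copy of the environment with a per-worker Theano compile dir.

    Assumption: separate compile directories reduce lock contention when
    several Rt jobs run in parallel. `base_dir` is a hypothetical location.
    """
    env = dict(os.environ)
    compile_dir = Path(base_dir) / f"worker_{worker_id}"
    flag = f"base_compiledir={compile_dir}"
    # THEANO_FLAGS is a comma-separated list of key=value settings,
    # so keep anything the user has already configured.
    existing = env.get("THEANO_FLAGS", "")
    env["THEANO_FLAGS"] = f"{existing},{flag}" if existing else flag
    return env


# Example usage, mirroring the county-level command shown earlier:
# import subprocess
# subprocess.run(
#     ["python", "generate_rt.py", "--filtered_states", "DE",
#      "--output_path=../data/rt_county/"],
#     env=theano_env_for_worker("DE"),
# )
```

Launching one such process per state keeps each job's compiled artifacts apart, at the cost of recompiling the model once per worker.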
Case count data is sourced from USA Facts (USA) and Data.gov.uk (England).
Population data is sourced from County Health Rankings (USA) and Data.gov.uk (England).
This repo is licensed under the MIT license.
- Bettencourt, Ribeiro (2008). Real Time Bayesian Estimation of the Epidemic Potential of Emerging Infectious Diseases.
- Salvatier J., Wiecki T.V., Fonnesbeck C. (2016). Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2:e55. DOI: 10.7717/peerj-cs.55
- Kevin Systrom and Thomas Vladeck (2020). Realtime Rt mcmc. Source notebook.
- Oliphant, T. E. (2006). A guide to NumPy (Vol. 1). Trelgol Publishing USA.
- Kumar R., Carroll C., Hartikainen A., Martin O. A. (2019). ArviZ a unified library for exploratory analysis of Bayesian models in Python.
- Plotly Technologies Inc. (2015). Collaborative data science. Montréal, QC.