@psthomas
Last active September 16, 2020 22:17
Comparing microcovid estimated infections to seroprevalence studies
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import urllib\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# New York"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"contents = urllib.request.urlopen(\n",
" \"https://data.covidactnow.org/latest/us/states/NY.OBSERVED_INTERVENTION.timeseries.json\"\n",
").read().decode('utf8')\n",
"nydata = json.loads(contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"https://www.microcovid.org/paper/7-basic-method#step-two-underreporting-factor\n",
"\n",
"We use these multipliers:\n",
"* If the percentage of positive tests is 5% or lower, we suggest a 6x underreporting factor.[4]\n",
"* If the percentage of positive tests is between 5% and 15%, we suggest a 8x factor.\n",
"* If the percentage of positive tests is greater than 15%, we suggest at least a 10x factor. This indicates dangerously little testing in your area compared to the number of infected people.\n"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [],
"source": [
"def get_multiplier(el):\n",
" # Assume a 10x multiplier if no test positivity ratio\n",
" # is available. Mainly because missing data is early\n",
" # during rapid growth phase for NY, Spain.\n",
" if pd.isna(el):\n",
" return 10 #6\n",
" elif el <= 0.05:\n",
" return 6\n",
" elif el <= 0.15:\n",
" return 8\n",
" elif el > 0.15:\n",
" return 10"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>date</th>\n",
" <th>testPositivityRatio</th>\n",
" <th>multiplier</th>\n",
" <th>cumulativeConfirmedCases</th>\n",
" <th>new_cases</th>\n",
" <th>estimated_cases</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2020-03-02</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2020-03-03</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" <td>10.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2020-03-04</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>11.0</td>\n",
" <td>9.0</td>\n",
" <td>90.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2020-03-05</td>\n",
" <td>0.363636</td>\n",
" <td>10</td>\n",
" <td>22.0</td>\n",
" <td>11.0</td>\n",
" <td>110.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2020-03-06</td>\n",
" <td>0.380282</td>\n",
" <td>10</td>\n",
" <td>44.0</td>\n",
" <td>22.0</td>\n",
" <td>220.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>192</th>\n",
" <td>2020-09-10</td>\n",
" <td>0.008971</td>\n",
" <td>6</td>\n",
" <td>446637.0</td>\n",
" <td>756.0</td>\n",
" <td>4536.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>193</th>\n",
" <td>2020-09-11</td>\n",
" <td>0.009063</td>\n",
" <td>6</td>\n",
" <td>447498.0</td>\n",
" <td>861.0</td>\n",
" <td>5166.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>194</th>\n",
" <td>2020-09-12</td>\n",
" <td>0.009099</td>\n",
" <td>6</td>\n",
" <td>448347.0</td>\n",
" <td>849.0</td>\n",
" <td>5094.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>195</th>\n",
" <td>2020-09-13</td>\n",
" <td>0.009317</td>\n",
" <td>6</td>\n",
" <td>449072.0</td>\n",
" <td>725.0</td>\n",
" <td>4350.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>196</th>\n",
" <td>2020-09-14</td>\n",
" <td>0.009357</td>\n",
" <td>6</td>\n",
" <td>449658.0</td>\n",
" <td>586.0</td>\n",
" <td>3516.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>197 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" date testPositivityRatio multiplier cumulativeConfirmedCases \\\n",
"0 2020-03-02 NaN 10 1.0 \n",
"1 2020-03-03 NaN 10 2.0 \n",
"2 2020-03-04 NaN 10 11.0 \n",
"3 2020-03-05 0.363636 10 22.0 \n",
"4 2020-03-06 0.380282 10 44.0 \n",
".. ... ... ... ... \n",
"192 2020-09-10 0.008971 6 446637.0 \n",
"193 2020-09-11 0.009063 6 447498.0 \n",
"194 2020-09-12 0.009099 6 448347.0 \n",
"195 2020-09-13 0.009317 6 449072.0 \n",
"196 2020-09-14 0.009357 6 449658.0 \n",
"\n",
" new_cases estimated_cases \n",
"0 0.0 0.0 \n",
"1 1.0 10.0 \n",
"2 9.0 90.0 \n",
"3 11.0 110.0 \n",
"4 22.0 220.0 \n",
".. ... ... \n",
"192 756.0 4536.0 \n",
"193 861.0 5166.0 \n",
"194 849.0 5094.0 \n",
"195 725.0 4350.0 \n",
"196 586.0 3516.0 \n",
"\n",
"[197 rows x 6 columns]"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics = pd.DataFrame(nydata['metricsTimeseries'])\n",
"metrics = metrics[['date', 'testPositivityRatio']]\n",
"metrics['date'] = pd.to_datetime(metrics['date'])\n",
"metrics['multiplier'] = metrics['testPositivityRatio'].apply(get_multiplier)\n",
"\n",
"cases = pd.DataFrame(nydata['actualsTimeseries'])\n",
"cases = cases[['date', 'cumulativeConfirmedCases']]\n",
"cases['date'] = pd.to_datetime(cases['date'])\n",
"cases.sort_values(by='date', ascending=True)\n",
"cases.set_index('date', inplace=True)\n",
"cases['new_cases'] = cases['cumulativeConfirmedCases'].diff()\n",
"cases = cases.dropna()\n",
"cases.reset_index(drop=False, inplace=True)\n",
"\n",
"est_cases = metrics.merge(cases, on='date')\n",
"est_cases['estimated_cases'] = est_cases['new_cases'] * est_cases['multiplier']\n",
"est_cases"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>testPositivityRatio</th>\n",
" <th>multiplier</th>\n",
" <th>cumulativeConfirmedCases</th>\n",
" <th>new_cases</th>\n",
" <th>estimated_cases</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>194.000000</td>\n",
" <td>197.000000</td>\n",
" <td>197.000000</td>\n",
" <td>197.000000</td>\n",
" <td>197.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>0.109436</td>\n",
" <td>7.370558</td>\n",
" <td>312442.619289</td>\n",
" <td>2282.522843</td>\n",
" <td>20623.624365</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>0.147683</td>\n",
" <td>1.752475</td>\n",
" <td>148239.338938</td>\n",
" <td>2906.914878</td>\n",
" <td>29992.285098</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>0.007526</td>\n",
" <td>6.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>0.010094</td>\n",
" <td>6.000000</td>\n",
" <td>251608.000000</td>\n",
" <td>626.000000</td>\n",
" <td>3768.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>0.014273</td>\n",
" <td>6.000000</td>\n",
" <td>383591.000000</td>\n",
" <td>779.000000</td>\n",
" <td>4674.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>0.173796</td>\n",
" <td>10.000000</td>\n",
" <td>417056.000000</td>\n",
" <td>2715.000000</td>\n",
" <td>21720.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>0.507358</td>\n",
" <td>10.000000</td>\n",
" <td>449658.000000</td>\n",
" <td>12274.000000</td>\n",
" <td>122740.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" testPositivityRatio multiplier cumulativeConfirmedCases \\\n",
"count 194.000000 197.000000 197.000000 \n",
"mean 0.109436 7.370558 312442.619289 \n",
"std 0.147683 1.752475 148239.338938 \n",
"min 0.007526 6.000000 1.000000 \n",
"25% 0.010094 6.000000 251608.000000 \n",
"50% 0.014273 6.000000 383591.000000 \n",
"75% 0.173796 10.000000 417056.000000 \n",
"max 0.507358 10.000000 449658.000000 \n",
"\n",
" new_cases estimated_cases \n",
"count 197.000000 197.000000 \n",
"mean 2282.522843 20623.624365 \n",
"std 2906.914878 29992.285098 \n",
"min 0.000000 0.000000 \n",
"25% 626.000000 3768.000000 \n",
"50% 779.000000 4674.000000 \n",
"75% 2715.000000 21720.000000 \n",
"max 12274.000000 122740.000000 "
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"est_cases.describe()"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x116c1e880>"
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x432 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"est_cases.plot(x='date', y=['new_cases', 'estimated_cases', 'multiplier'],\n",
" kind='line', grid=True, logy=True, figsize=(8,6))"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Observed infection rate: 0.013446895403880041\n",
"Estimated infection rate: 0.13439822148757238\n"
]
}
],
"source": [
"# Just assume April 23 was the end date, even though that's when it was reported\n",
"total_observed_cases = est_cases[est_cases['date'] < '2020-04-23']['new_cases'].sum()\n",
"total_estimated_cases = est_cases[est_cases['date'] < '2020-04-23']['estimated_cases'].sum()\n",
"pop = nydata['population']\n",
"print('Observed infection rate:', total_observed_cases/pop)\n",
"print('Estimated infection rate:', total_estimated_cases/pop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"New York state did a preliminary [seroprevalence study](https://twitter.com/NYGovCuomo/status/1253352837255438338) completed on April 23rd that estimated a New York City infection rate of `21.2%` and a statewide rate of `13.9%`. The multiplier approach does really well, estimating `13.4%` seroprevalence at that date. This was early in pandemic, how well does this approach work now that some locations have built up more testing capacity?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Spain"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"spainsource = pd.read_csv('https://covid.ourworldindata.org/data/owid-covid-data.csv')"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>date</th>\n",
" <th>new_cases</th>\n",
" <th>positive_rate</th>\n",
" <th>multiplier</th>\n",
" <th>estimated_cases</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>36918</th>\n",
" <td>2020-02-01</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>10.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36927</th>\n",
" <td>2020-02-10</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>10.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36942</th>\n",
" <td>2020-02-25</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>10.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36943</th>\n",
" <td>2020-02-26</td>\n",
" <td>6.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>60.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36944</th>\n",
" <td>2020-02-27</td>\n",
" <td>8.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>80.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37139</th>\n",
" <td>2020-09-09</td>\n",
" <td>8866.0</td>\n",
" <td>0.101</td>\n",
" <td>8</td>\n",
" <td>70928.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37140</th>\n",
" <td>2020-09-10</td>\n",
" <td>10764.0</td>\n",
" <td>0.104</td>\n",
" <td>8</td>\n",
" <td>86112.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37141</th>\n",
" <td>2020-09-11</td>\n",
" <td>12183.0</td>\n",
" <td>0.107</td>\n",
" <td>8</td>\n",
" <td>97464.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37144</th>\n",
" <td>2020-09-14</td>\n",
" <td>27404.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>274040.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37145</th>\n",
" <td>2020-09-15</td>\n",
" <td>9437.0</td>\n",
" <td>NaN</td>\n",
" <td>10</td>\n",
" <td>94370.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>184 rows × 5 columns</p>\n",
"</div>"
],
"text/plain": [
" date new_cases positive_rate multiplier estimated_cases\n",
"36918 2020-02-01 1.0 NaN 10 10.0\n",
"36927 2020-02-10 1.0 NaN 10 10.0\n",
"36942 2020-02-25 1.0 NaN 10 10.0\n",
"36943 2020-02-26 6.0 NaN 10 60.0\n",
"36944 2020-02-27 8.0 NaN 10 80.0\n",
"... ... ... ... ... ...\n",
"37139 2020-09-09 8866.0 0.101 8 70928.0\n",
"37140 2020-09-10 10764.0 0.104 8 86112.0\n",
"37141 2020-09-11 12183.0 0.107 8 97464.0\n",
"37144 2020-09-14 27404.0 NaN 10 274040.0\n",
"37145 2020-09-15 9437.0 NaN 10 94370.0\n",
"\n",
"[184 rows x 5 columns]"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spaindata = spainsource.copy()\n",
"spaindata = spaindata[spaindata.iso_code == 'ESP']\n",
"spaindata = spaindata[['date', 'new_cases', 'positive_rate']]\n",
"spaindata['date'] = pd.to_datetime(spaindata['date'])\n",
"# Get rid of zero days, for plot\n",
"spaindata = spaindata[spaindata['new_cases'] != 0]\n",
"spaindata['multiplier'] = spaindata['positive_rate'].apply(get_multiplier)\n",
"spaindata['estimated_cases'] = spaindata['new_cases']*spaindata['multiplier']\n",
"spaindata"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>new_cases</th>\n",
" <th>positive_rate</th>\n",
" <th>multiplier</th>\n",
" <th>estimated_cases</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>184.000000</td>\n",
" <td>125.000000</td>\n",
" <td>184.000000</td>\n",
" <td>184.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>3278.081522</td>\n",
" <td>0.046296</td>\n",
" <td>7.760870</td>\n",
" <td>28052.423913</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>4477.298175</td>\n",
" <td>0.050818</td>\n",
" <td>1.782548</td>\n",
" <td>39448.516695</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>-713.000000</td>\n",
" <td>0.008000</td>\n",
" <td>6.000000</td>\n",
" <td>-7130.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>395.500000</td>\n",
" <td>0.013000</td>\n",
" <td>6.000000</td>\n",
" <td>2394.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>1359.500000</td>\n",
" <td>0.025000</td>\n",
" <td>8.000000</td>\n",
" <td>8361.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>4701.000000</td>\n",
" <td>0.072000</td>\n",
" <td>10.000000</td>\n",
" <td>45402.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>27404.000000</td>\n",
" <td>0.278000</td>\n",
" <td>10.000000</td>\n",
" <td>274040.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" new_cases positive_rate multiplier estimated_cases\n",
"count 184.000000 125.000000 184.000000 184.000000\n",
"mean 3278.081522 0.046296 7.760870 28052.423913\n",
"std 4477.298175 0.050818 1.782548 39448.516695\n",
"min -713.000000 0.008000 6.000000 -7130.000000\n",
"25% 395.500000 0.013000 6.000000 2394.000000\n",
"50% 1359.500000 0.025000 8.000000 8361.000000\n",
"75% 4701.000000 0.072000 10.000000 45402.500000\n",
"max 27404.000000 0.278000 10.000000 274040.000000"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spaindata.describe()"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x11797d3a0>"
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x432 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# A bit of a mess because not smoothed, but multiplier appears to be working.\n",
"spaindata.plot(x='date', y=['new_cases', 'estimated_cases', 'multiplier'],\n",
" kind='line', grid=True, logy=True, figsize=(8,6))"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Observed infection rate: 0.00486500339404247\n",
"Estimated infection rate: 0.04692102835710837\n"
]
}
],
"source": [
"# https://en.wikipedia.org/wiki/Demographics_of_Spain\n",
"spainpop = 47007367\n",
"spaincases = spaindata[spaindata['date'] <= '2020-05-13']['new_cases'].sum()\n",
"spainestimates = spaindata[spaindata['date'] <= '2020-05-13']['estimated_cases'].sum()\n",
"print('Observed infection rate:', spaincases/spainpop)\n",
"print('Estimated infection rate:', spainestimates/spainpop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Based on the total reported count, `0.48%` of the nation has been infected as of 5/13/2020, while [seroprevalence](https://www.reuters.com/article/us-health-coronavirus-spain-study/spanish-antibody-study-points-to-5-of-population-affected-by-coronavirus-idUSKBN22P2RP) [says](https://www.vox.com/2020/5/16/21259492/covid-antibodies-spain-serology-study-coronavirus-immunity) `5%`. So there's a 10x undercount of official cases. The model predicts around `4.7%` of Spain has been infected (`3.05%` if you fill in missing early values with 6x multiplier instead of 10x). Overall this seems to perform really well, but I still wonder if it will undercount in places with really good testing capacity? Or will places with good testing capacity not tend to have large outbreaks? "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}