Hacklytics 2020

"Change the world..."

Team Members: Tanishq Sandhu, Hongrui Lyu, Danny Kim, S. Kamreen

What is a major problem currently affecting our world?

...The Coronavirus (COVID-19)


Step 1: First, we looked for a specific question to investigate:
Is closing the border actually ineffective in containing an epidemic, as the CDC claims? Also, out of our own curiosity, does access to healthcare affect the spread of epidemics, even given how quickly they are transmitted?

Step 2: Next, we looked for a localized past epidemic that we could analyze and generalize from:
The 2014 Outbreak of Ebola in Sierra Leone


Step 3: We then searched through datasets for features that particularly interested us, compiled them into one table, and processed and cleaned the data to prepare it for our analysis:
* See above for the column headers of our final CSV *

Step 4: For our visual analysis, we concluded that using Tableau to show the movement of Ebola to neighboring countries, in conjunction with data on when those countries closed their borders, would be most meaningful.

Step 5: For our statistical analysis, we used a paired t-test to determine whether there is a statistically significant relationship between closing borders and epidemic spread. We also used a linear regression model to look at the correlation between healthcare access and the concentration of infected individuals.

Visualization: Whether Closing Borders Affects Epidemic Spread (2014)

Statistical Analysis


Machine Learning for Access to Healthcare


We built a linear regression model to examine whether good healthcare is associated with fewer Ebola cases. We scored each country’s healthcare system on three levels: 0, 1, or 2. The score is determined by the number of doctors per 100,000 people in that country. Countries with a single-digit number at the lower end of the range received a 0, those with a single-digit number at the higher end received a 1, and countries with a two-digit number received a 2. To get the average rate of infection, we subtracted consecutive rows of the value column from each other, stored the differences in a new column, and took the mean of those rates for each country.

Once we had the new column, we made a scatter plot to visualize the result and added a fitted line to show the direction of the trend. We found a negative correlation of -0.708 between quality of healthcare and number of infections. This does not mean that better healthcare is the cause of the decrease in, or prevention of, new Ebola cases, but further experiments could be designed to test this correlation.

Nevertheless, the result helps support the CDC and WHO suggestion about borders, since it points to what may be a more effective way to combat the disease. Our paired t-test suggests that closing borders has little to no effect on the transmission of disease, while strengthening healthcare, whether our own or our neighbors’, could help fight the disease.
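The scoring and correlation steps described above can be sketched in plain Python. This is only an illustration, not the team's actual code: the function names, the cutoff of 5 between "lower" and "higher" single-digit values, the reading of "subtracting rows" as differences between consecutive case counts, and all of the numbers in the demo are assumptions made for the example.

```python
from statistics import mean
from typing import Sequence, Tuple


def healthcare_score(doctors_per_100k: float, single_digit_split: float = 5.0) -> int:
    """Map doctors per 100,000 people to a 0/1/2 score.

    The split between 'lower' and 'higher' single digits is an assumption;
    the write-up does not state the exact cutoff.
    """
    if doctors_per_100k >= 10:
        return 2
    return 1 if doctors_per_100k >= single_digit_split else 0


def mean_infection_rate(case_counts: Sequence[float]) -> float:
    """Average difference between consecutive rows of a case-count column."""
    diffs = [later - earlier for earlier, later in zip(case_counts, case_counts[1:])]
    return mean(diffs)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient (both inputs must vary)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Hypothetical placeholder data, NOT the project's dataset.
doctors = [2.0, 7.0, 14.0, 1.0, 9.0, 25.0]
case_columns = [
    [10, 40, 95], [5, 20, 30], [1, 2, 4],
    [20, 80, 150], [3, 9, 12], [0, 1, 1],
]

scores = [healthcare_score(d) for d in doctors]
rates = [mean_infection_rate(c) for c in case_columns]
print("r =", pearson_r(scores, rates))
print("slope, intercept =", fit_line(scores, rates))
```

A negative `r` and a negative slope would correspond to the downward trend line described above. With only three score levels, a rank-based measure might be worth comparing against as well.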

View Python Code on GitHub


Efficacy of Closed Borders in Containing Epidemics


When we started this project, we wanted to test the CDC and WHO suggestion that shutting down borders can be harmful rather than helpful. The indicator we used is the number of people infected before and after a country closed its borders. Our population is all of the infected cases across all 6 countries, while our samples are the infected cases in each individual country. Because the parameter we are testing is the mean number of cases before and after the border closure, we decided to use a paired t-test. We determined the two main explanatory variables by looking at reports of when each country closed its borders. Our null hypothesis was that closing borders has no effect on the spread of disease, while our alternative hypothesis was that closing borders increases the risk of more people becoming sick. We labeled all reports before the border closure as the variable before(country name) and all cases after the closure as the variable after(country name).

Once the data was set up, we ran the test, and the results were quite interesting. Our p-value was about 0.07797, higher than our alpha of 0.05. Because of this, we fail to reject the null hypothesis: we have insufficient evidence to conclude that closing borders increases the risk of the disease spreading. However, this doesn’t necessarily mean that the CDC and WHO were wrong. The spread of disease is one consideration, but border closures can also damage access to valuable resources, state relations, and the global economy. Our dataset also had its limitations, since it did not record every single case of the 2014 Ebola crisis.
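For readers who want to see the mechanics of the test, here is a minimal Python sketch of a one-sided paired t-test using only the standard library. The team's own analysis was done in R, so this is not their code. It assumes one before/after pair per country and a one-sided alternative (more cases after closure), the function names are made up for illustration, and the numbers in the demo are hypothetical placeholders rather than the project's data.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t_statistic(before: Sequence[float], after: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for the paired differences after - before."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t: float, df: int, steps: int = 10000) -> float:
    """One-sided p-value P(T >= t), via Simpson's rule on the pdf over [0, |t|].

    `steps` must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 - area if t >= 0 else 0.5 + area


# Hypothetical placeholder values for 6 countries, NOT the project's data.
before = [12.0, 30.0, 8.0, 15.0, 22.0, 5.0]
after = [18.0, 41.0, 9.0, 14.0, 35.0, 11.0]

t_stat, df = paired_t_statistic(before, after)
p_value = upper_tail_p(t_stat, df)
alpha = 0.05
print("t =", t_stat, "df =", df, "p =", p_value)
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The decision rule in the last line mirrors the reasoning above: a p-value above alpha means the null hypothesis is not rejected. In practice a statistics library would normally supply the p-value directly; the numerical integration here only keeps the example self-contained.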

View R Code on GitHub

Executive Summary

Based on our analysis, we concluded that governments should shift their focus from border control to improving healthcare in order to contain epidemics and prevent their spread, both internally and to neighboring countries. Although methods such as border control may provide instant gratification, investment in healthcare infrastructure will prove beneficial in the long term, not only in containing epidemics but also in preventing them in the first place.
