Duke DataFest Analysis Reveals How COVID-19 Impacts Communities Already Suffering from Health Disparities
This is the first installment in a series of articles profiling winners of the Duke ASA DataFest: COVID-19 Virtual Data Challenge, co-sponsored by the Duke University Department of Statistical Science and the Duke AI Health Institute.
By Rabail Baig
Aside from altering the very fabric of daily life across the United States and the world, the COVID-19 pandemic has exposed the many existing shortcomings and inequities of the American healthcare system. The burgeoning public health crisis has resulted in more than 5 million confirmed cases nationwide and close to 163,000 deaths as of the beginning of August. However, some communities and groups have been disproportionately impacted, as shown by a prize-winning analysis from Duke’s Meredith Brown, Matt Feder, and Pouya Mohammadi, presented at this year’s Duke American Statistical Association (ASA) DataFest: COVID-19 Virtual Data Challenge.
In “Regression Analysis of COVID-19’s Effect on Different Communities,” which won in the “Judge’s Pick” category in the DataFest Challenge, Brown, Feder, and Mohammadi examined how the COVID-19 pandemic is disproportionately affecting low-income communities and people of color in the United States. Combining data collected by the New York Times and the U.S. Census, the researchers found that the proportion of Black persons living in a given county is highly correlated with deaths per capita. They also found a strong correlation between the number of cases in a county and the proportion of county residents living in poverty. And because the relationship is logarithmic, as the number of cases in a county multiplies, deaths per capita in impoverished counties increase at a greater rate than deaths per capita in wealthier counties.
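For readers curious about what a county-level analysis of this kind can look like, here is a minimal Python sketch. It is not the team's code or data. The county labels, numbers, and field names are invented placeholders, and the two helper functions are plain textbook formulas for a Pearson correlation and a least-squares slope.

```python
import math

# Hypothetical toy records; every value below is invented for illustration.
# county -> (confirmed cases, deaths), in the style of a case-count table
covid = {"A": (100, 2), "B": (400, 12), "C": (1600, 64), "D": (6400, 320)}
# county -> (population, share of residents living in poverty)
census = {"A": (50_000, 0.08), "B": (60_000, 0.12),
          "C": (80_000, 0.18), "D": (100_000, 0.25)}


def join_counties(covid, census):
    """Combine the two sources on county and derive deaths per capita."""
    rows = []
    for county in sorted(covid.keys() & census.keys()):
        cases, deaths = covid[county]
        population, poverty = census[county]
        rows.append({
            "county": county,
            "cases": cases,
            "poverty": poverty,
            "deaths_per_capita": deaths / population,
        })
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


rows = join_counties(covid, census)
poverty = [r["poverty"] for r in rows]
deaths_pc = [r["deaths_per_capita"] for r in rows]
log_cases = [math.log(r["cases"]) for r in rows]

# Correlation between poverty share and deaths per capita
r_poverty = pearson(poverty, deaths_pc)
# Change in deaths per capita per unit increase in log(cases)
slope_log_cases = ols_slope(log_cases, deaths_pc)
print(f"r = {r_poverty:.3f}, slope on log(cases) = {slope_log_cases:.6f}")
```

Regressing on the logarithm of cases, rather than on raw case counts, is one common way to model a relationship described as logarithmic. Whether the team set up its model this way is an assumption.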
Although submitted in April of this year, their analysis seems prescient now, five months into the COVID-19 pandemic. This is particularly true in light of the emergence of the largest contemporary social movement demanding equal rights, justice, and an end to police brutality for Black people, whose fatality rate at the hands of police officers is 2.8 times higher than that of white people and who are also dying from COVID-19 at a rate 2.5 times higher than white people.
“We wanted to treat this topic as not only a public health crisis but also something rooted in systems of inequality that involve access to healthcare, housing, transportation, and so many other interconnected parts of our lives,” noted Meredith, Matt, and Pouya, all of whom are majoring in Computer Science and Statistical Science at Duke. For these students, the contest was their first exposure to using data science to help understand a real-world issue.
Their analysis, built on some of the early information culled from pandemic hotspots such as New York, explored how a person’s social determinants of health could have a substantial effect on their susceptibility to acquiring COVID-19.
“We know that in the United States, there is a large disparity of wealth and resources throughout its citizens. Due to issues like systemic racism, the oppression of minorities, and a lack of healthcare options available to America’s lower income brackets, it is very possible that these communities will be ravaged by the impacts of COVID-19 significantly more than others with greater privileges and resources,” said the team.
Subsequent events have proven them correct. As of June 12, 2020, age-adjusted hospitalization rates were highest among non-Hispanic Black persons, followed by Hispanic or Latino persons, according to the Centers for Disease Control and Prevention (CDC), with non-Hispanic Black persons approximately 5 times more likely to be hospitalized than non-Hispanic white persons. An NPR analysis further confirmed that in 32 states plus Washington, D.C., African Americans are dying at rates higher than their proportion of the population.
“These issues have been present in our country for ages, and the systemic racism has permeated almost every aspect of our lives, from policing to healthcare to housing. The latter two have had huge effects on the discriminatory impacts that the coronavirus has had on the black community,” said the team, expressing their support for the ongoing Black Lives Matter Movement. “The movement is helping to lead this long overdue reckoning on race and helping the country work to dismantle its systems of oppression.”
Their findings underscore the dire need for the government’s public health response to encompass help for marginalized communities of color in meeting basic needs, including food, wage support, and even temporary housing, especially among those exposed to or sick from the virus. As the team’s work shows, reducing these disparities and providing access to vital resources is crucial for helping communities respond to the virus effectively in ways that ensure the safety and well-being of everyone.
About the Researchers
Meredith and Matt are rising juniors, and Pouya is a rising senior. All three were classmates in the Duke Statistical Science 360 course, which focuses on Bayesian methods, and share a passion for the real-world implications of data science. The group found the DataFest experience challenging yet enthralling.
“One of the most surprising things that we realized during the course of our work was how much focus of solving a problem with data science is on cleaning and organizing the data into a form that we can use,” said the contestants, explaining how in the process of doing so, they were able to reinforce their knowledge of statistical analysis while learning new coding methods that allowed them to create the RShiny application that showcased their work.
Founded at UCLA by Dr. Robert Gould, DataFests are now sponsored by the American Statistical Association and held on campuses around the world. The Duke ASA DataFest: COVID-19 Virtual Data Challenge was co-sponsored by the Duke University Department of Statistical Science and the Duke AI Health Institute, and was open to all undergraduate and master’s-level students across Duke, encouraging participants to use data science to explore unique effects of the COVID-19 pandemic on daily life and different aspects of the social fabric of the United States.
“We had an incredible time participating in DataFest this year, which was an improvement on last year’s event as it allowed us more space to work at our own pace and there was better communication all around between teammates and organizers thanks to Zoom,” said the contestants.
This summer, Matt is working on a team using data science in cybersecurity techniques to improve the detection of malicious requests on Duke’s servers. Pouya is currently at Amazon working as a software development engineer intern on a team that is developing an open-source machine learning library for time series data. Meredith is doing COVID-19 research as a summer intern with the Duke +DS program, as well as working as a teaching assistant for classes in the Computer Science and Statistical Science Departments.