Duke DataFest Analysis Supports Effectiveness of Social Distancing in Reducing the Spread of COVID-19

This is the second installment in a series of articles profiling winners of the Duke ASA DataFest: COVID-19 Virtual Data Challenge, co-sponsored by the Duke University Department of Statistical Science and the Duke AI Health Institute.

By Rabail Baig

Across the world and in the United States, multiple studies have shown that social distancing is effective at reducing the spread of SARS-CoV-2 at both the interpersonal and statewide levels. An early analysis of social distancing in the United States amid the COVID-19 pandemic, presented at this year’s Duke American Statistical Association (ASA) DataFest: COVID-19 Virtual Data Challenge by Duke undergraduates Shannon Houser and Jack Lichtenstein, echoed those findings and won the “Best Visualizations” prize at the contest.

Using data available from Google Mobility Reports, the duo explored how factors such as population density, initial number of positive coronavirus cases per capita, governor’s political affiliation, and official shelter-in-place orders influenced the magnitude of a state’s social distancing early during the COVID-19 pandemic.

“In the early stages of the pandemic, I came across people discussing Google Mobility Reports on my social media feed, where Google aggregates anonymized tracking data from people’s smartphones, providing insights into movement trends of individuals in a particular region or country over time and across different places such as their homes, workplaces, grocery stores, parks and transit stations et cetera,” explained Jack, who thought the dataset was a perfect fit for the DataFest challenge.

“Social distancing was especially interesting to us because we are both from New Jersey and we were under extreme lockdown conditions for a few months when cases were at an all-time high in March,” added Shannon. “We wanted to learn more about which states were social distancing, to what effect, and what factors led to social distancing measures.”

An important step in the team’s analysis was to examine each state’s social distancing and general preventative measures while controlling for its current case count and population size. They divided the country into four clusters based on population density and confirmed cases, using data aggregated at the end of March. Their analysis then focused on identifying trends in social distancing while controlling for these clusters.
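For readers curious about what a grouping step of this kind can look like in practice, here is a minimal sketch. The article does not say which clustering method the team used, so k-means is an assumption made purely for illustration, as are the function names (`cluster_regions`, `kmeans`, `farthest_first`) and the choice to standardize both features. The eight regions and their numbers are placeholders, not real state figures and not the team’s data.

```python
from statistics import mean, pstdev

# Placeholder values for illustration only: NOT real state figures and NOT
# the team's data. Each entry is (population density, cases per 100k).
HYPOTHETICAL_REGIONS = {
    "A": (20, 5), "B": (25, 6),      # sparse, few cases
    "C": (900, 5), "D": (950, 7),    # dense, few cases
    "E": (22, 80), "F": (28, 85),    # sparse, many cases
    "G": (920, 82), "H": (940, 90),  # dense, many cases
}


def standardize(values):
    """Rescale to mean 0 and standard deviation 1 so both features weigh equally."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic seeding: begin with the first point, then repeatedly
    add the point that is farthest from all seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, k, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, recompute
    centroids as cluster means, and stop once the centroids no longer move."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(mean(dim) for dim in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def cluster_regions(regions, k=4):
    """Return a {region name: cluster label} mapping."""
    names = list(regions)
    density = standardize([regions[n][0] for n in names])
    cases = standardize([regions[n][1] for n in names])
    return dict(zip(names, kmeans(list(zip(density, cases)), k)))


if __name__ == "__main__":
    for name, label in cluster_regions(HYPOTHETICAL_REGIONS).items():
        print(name, label)
```

With the placeholder values above, regions that resemble each other on both measures end up with the same label. Later comparisons of mobility trends can then be made within each cluster, which is one common way of “controlling for” the grouping.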

The findings from Shannon and Jack’s analysis offer further support for the effectiveness of social distancing. They also reveal that the practice of distancing was not randomly distributed throughout the country in the early days of the pandemic, but was influenced by factors including population density, initial number of cases per capita, the political affiliation of the sitting governor, whether that governor issued a stay-at-home order, and the geographical location of the state.

“Our findings have surprisingly been pretty accurate when assessing the response of the U.S. government to COVID-19, which was very much reactive instead of proactive since the very first cases,” said Shannon and Jack.

In the initial months of the pandemic’s spread in the United States, social distancing and preventative measures were taken seriously only in states that experienced outbreaks of cases. As social distancing yielded results and began slowing the spread, some states decided to reopen and experienced a resurgence of cases that led in turn to an uptick in the number of deaths.

The team’s analysis also highlighted how partisanship and politics have had an outsize effect on the trajectory of the coronavirus pandemic, which at present has resulted in over 5 million cases and more than 163,000 deaths in the United States.

“The handling of the pandemic was politicized in ways unimaginable back when we were working on the project, so much so that even wearing or not wearing a mask was somehow contrived as a political statement,” said the team. “The politicized nature of the U.S.’s early response was alarming, but the catastrophic results of the politicization are truly heartbreaking.”

About the Researchers

Sharing a fondness for applied mathematics, sports, and Chick-fil-A, Shannon and Jack are both New Jersey natives and rising juniors at Duke. While Jack is majoring in data science, Shannon is majoring in statistical science with a double minor in biology and chemistry. Jack works for the Duke Men’s Basketball Team as a video and data analyst, providing statistical analyses and reports for the team’s coaches. Shannon, who hopes to become a doctor, is currently working in a wet lab at Duke, studying Parkinson disease under Duke’s Laurie Sanders, PhD, and was a finalist for the Environmental Mutagenesis and Genomics Society’s 2020 Young Scientist Awards. She is also doing statistical analysis with Duke’s Project ADAM, investigating the cardiac arrest preparedness of North Carolina schools. Both found the DataFest to be an enlightening and exciting experience.

“It was my very first time using real-world data that isn’t already cleaned and made tidy for you,” said Shannon.

“We put in a lot of elbow grease in pulling different data sources together and cleaning and tidying different datasets to ensure compatibility,” added Jack, who, along with Shannon, found working on data visualizations very stimulating and, later, rewarding.

Founded at UCLA by Dr. Robert Gould, DataFests are now sponsored by the American Statistical Association and held on campuses around the world. The Duke ASA DataFest: COVID-19 Virtual Data Challenge was co-sponsored by the Duke University Department of Statistical Science and the Duke AI Health Institute, and was open to all undergraduate and master’s-level students across Duke, encouraging participants to use data science to explore unique effects of the COVID-19 pandemic on daily life and different aspects of the social fabric of the United States.

“I’m primarily interested in sports analytics and data, and I had never worked with medical or health-related data before the DataFest, so it was a new and interesting experience,” said Jack, who was beyond excited to win in the “Best Visualizations” category.

“It felt really good to be selected in a winning category, especially visualizations which are so important in the field,” said Shannon.

During the summer, Shannon is working on analyzing survey results from Project ADAM and is grateful to be able to work on the beach. Jack is doing a virtual summer internship called Data+, which according to him is “another fantastic opportunity that Duke provides for students to work with real-world data.”

Both miss the campus, their classrooms, and their friends.