Researchers publish social media data early for pandemic response

Researchers at Georgia State University published a data set of more than 140 million tweets related to COVID-19 ahead of schedule to serve as a resource for the global research community.
These are the most common keyword phrases that Georgia State researchers found in tweets related to the pandemic. (Georgia State University)

To help represent the spread and impact of the coronavirus pandemic, researchers at Georgia State University on Monday released a data set of more than 140 million tweets related to COVID-19 as a resource for the global research community.

The work is part of research that collects and tracks social media chatter to understand mobility patterns during natural disasters, but researchers decided to release their data before finalizing their own results to assist other researchers studying the current pandemic.

“It was a big decision to make to release the data before having a few papers prepared on it, but it is for the common good,” Juan Banda, assistant professor of computer science at Georgia State and lead researcher on the project, said in a press release. “We are all on the same planet together, and any additional data that could be easily available for other researchers to analyze can make the difference. I am a big believer in open-science, and this is definitely a time where it’s important to have the greatest number of eyes on the research.”

The research team began collecting tweets about the coronavirus on March 10, using keywords like COVID19, CoronavirusPandemic, COVID-19, 2019nCoV, CoronaOutbreak, coronavirus and WuhanVirus, and now has a data set of more than 140 million tweets dating back to January 1. The team says the data offers new insight into the outbreak, including information on travel, displacement, diagnoses and treatment, as well as a historical record of the outbreak’s timing. The researchers also hope the work will help identify how people get and use information related to the pandemic on social media.


“This dataset will allow researchers to investigate the spread of misinformation relating to COVID-19, study the change in population behaviors and sentiments as the virus spreads in different geographic areas and quantify the effects of social distancing efforts and changes in human mobility patterns over the course of the pandemic,” Gerardo Chowell, chair of Georgia State’s department of population health sciences, said in the release.

As scientists around the world work to reduce the spread and impact of COVID-19, the researchers hope their work can improve future outcomes and even encourage the public to change its behavior.

“Indirectly, by being able to tackle sources of disinformation and highlight instances of people not following rules, I believe we can get everybody to do their part in flattening the curve,” Banda said. “In a future scenario, having this data will allow researchers to be better prepared and build systems to detect community transmission, and devise interventions to not be in the current position we are now.”

Written by Betsy Foresman

Betsy Foresman was an education reporter for EdScoop from 2018 through early 2021, where she wrote about the virtues and challenges of innovative technology solutions used in higher education and K-12 spaces. Foresman also covered local government IT for StateScoop, on occasion. Foresman graduated from Texas Christian University in 2018 — go Frogs! — with a BA in journalism and psychology. During her senior year, she worked as an intern at the Center for Strategic and International Studies in Washington, D.C., and moved back to the capital after completing her degree because, like Shrek, she feels most at home in the swamp. Foresman previously worked at Scoop News Group as an editorial fellow.
