Researchers publish social media data early for pandemic response

Researchers at Georgia State University published a data set of more than 140 million tweets related to COVID-19 ahead of schedule to serve as a resource for the global research community.
These are the most common keyword phrases that Georgia State researchers found in tweets related to the pandemic. (Georgia State University)

To help represent the spread and impact of the coronavirus pandemic, researchers at Georgia State University on Monday released a data set of more than 140 million tweets related to COVID-19 as a resource for the global research community.

The work is part of research that collects and tracks social media chatter to understand mobility patterns during natural disasters, but the team decided to release its data before finalizing its own results to assist others studying the current pandemic.

“It was a big decision to make to release the data before having a few papers prepared on it, but it is for the common good,” Juan Banda, assistant professor of computer science at Georgia State and lead researcher on the project, said in a press release. “We are all on the same planet together, and any additional data that could be easily available for other researchers to analyze can make the difference. I am a big believer in open-science, and this is definitely a time where it’s important to have the greatest number of eyes on the research.”

The research team began collecting tweets about the coronavirus on March 10, using keywords like COVID19, CoronavirusPandemic, COVID-19, 2019nCoV, CoronaOutbreak, coronavirus and WuhanVirus, and now has a data set of more than 140 million tweets dating back to January 1. The data, which includes information on travel, displacement, diagnoses and treatment as well as a historical record of the outbreak’s timing, provides new insight into the outbreak, the team says. The researchers also hope the work will help identify how people get and use pandemic-related information on social media.


“This dataset will allow researchers to investigate the spread of misinformation relating to COVID-19, study the change in population behaviors and sentiments as the virus spreads in different geographic areas and quantify the effects of social distancing efforts and changes in human mobility patterns over course of the pandemic,” Gerardo Chowell, chair of Georgia State’s department of population health sciences, said in the release.

As scientists around the world work to reduce the spread and impact of COVID-19, the researchers hope their work can improve future outcomes and even encourage the public to change its behavior.

“Indirectly, by being able to tackle sources of disinformation and highlight instances of people not following rules, I believe we can get everybody to do their part in flattening the curve,” Banda said. “In a future scenario, having this data will allow researchers to be better prepared and build systems to detect community transmission, and devise interventions to not be in the current position we are now.”
