Carnegie Mellon students develop AI tool to analyze government reports

Four graduate students at Carnegie Mellon University developed GovScan, a PDF search tool with a ChatGPT-styled interface.

A team of four graduate students at Carnegie Mellon University developed a new application that uses generative AI to improve the usability of government reports.

The tool, GovScan, seeks to supercharge PDF search and improve efficiency for government employees, researchers and analysts with a ChatGPT-styled interface that analyzes PDF documents uploaded into its database, according to a video demonstration.

“The tool might not seem all that flashy, but the utility of it against the sheer volume of data is significant,” professor Christopher Goranson, who helped lead the students in the tool’s development, said in a press release. “The team took the time to really understand the challenges facing their partner and then created something that directly addressed the problem.” 

If a policy analyst asks GovScan a question about which states provided child care funding for low-income, single-parent households, the tool scans all PDF reports in its database and provides a list of results that cites sources for further review.
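
The search-and-cite pattern described above can be illustrated with a minimal sketch. This is not GovScan's code: it assumes the PDF text has already been extracted page by page, it swaps in simple word-overlap scoring for the generative AI model, and the names `Page`, `Hit`, `search` and the sample file names are all hypothetical.

```python
from dataclasses import dataclass
import re


@dataclass
class Page:
    source: str   # hypothetical PDF file name
    number: int   # 1-based page number within that PDF
    text: str     # text assumed to be already extracted from the page


@dataclass
class Hit:
    source: str
    page: int
    score: int
    snippet: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search(pages: list[Page], question: str, top_k: int = 3) -> list[Hit]:
    """Rank pages by how many distinct question words they contain."""
    query_terms = tokenize(question)
    hits = []
    for page in pages:
        score = len(query_terms & tokenize(page.text))
        if score > 0:
            hits.append(Hit(page.source, page.number, score, page.text[:80]))
    # Highest score first; ties broken by source name, then page number.
    hits.sort(key=lambda h: (-h.score, h.source, h.page))
    return hits[:top_k]


# Invented sample data for illustration only, not taken from any real report.
PAGES = [
    Page("state_a_report.pdf", 12,
         "Child care funding was provided to low-income single-parent households."),
    Page("state_a_report.pdf", 40, "Appendix: glossary of terms."),
    Page("state_b_report.pdf", 7, "Funding for road maintenance increased."),
]

if __name__ == "__main__":
    question = "Which states provided child care funding for single-parent households?"
    for hit in search(PAGES, question):
        print(f"{hit.source}, p. {hit.page} (score {hit.score}): {hit.snippet}")
```

Each result carries its file name and page number, so a reviewer can go back to the cited page and verify it rather than trust the ranking alone.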


The tool was developed in coordination with government employees who were tasked with reviewing child care funding reports from all 50 states, each report running to hundreds of pages. Pinpointing specific data points used to take workers hours of searching; with GovScan, it takes about 30 seconds, according to the release.

“It reduces the cognitive load for researchers,” Aakash Dolas, one of the students who developed GovScan, said in the release. “The saved time and effort free up humans to spend their time and attention on analyzing and understanding the results.”

“It’s not efficiency for efficiency’s sake,” student Tyler Faris said in the release. “It’s efficiency for better decision-making and better management.”

The work is available on GitHub under an MIT open-source license. It includes the code the students created for the query engine and data pipeline that enable GovScan’s operation. According to the release, the students are continuing the project’s development.
