
Kids need to experiment with AI

In this contributed Op-Ed, Emelia Probasco argues that to understand the strengths and weaknesses of AI, kids need to get their hands on AI in low-stakes settings.
(Photo illustration: the home page of the OpenAI ChatGPT app displayed on a laptop screen in London, February 3, 2023. Photo by Leon Neal/Getty Images)

The president just signed an executive order on AI education for youth, aimed at promoting AI literacy through teacher training, student apprenticeships, and other workforce development efforts. As the administration and schools around the country turn to the task, I have a personal recommendation: promote more AI experiments at science fairs.

My son recently decided to do his middle school science fair project on AI. This was surprising because I study AI, and I can confirm that he is in the prime of his independence-from-mom years. What surprised me more, however, were the results, which were worse than I expected.  

The project posed 60 Trivial Pursuit questions to three major generative AI chatbots: ChatGPT, Claude, and Gemini. He wanted to see which chatbot got the most answers correct and which ones hallucinated. A hallucination is when a generative AI chatbot confidently states an incorrect answer to a question. For example, in response to the Trivial Pursuit question “What science fiction writer played a part in the development of the radar?” ChatGPT confidently stated:

“H.G. Wells, primarily known as a science fiction writer, is noted for his forward-thinking ideas that paralleled technological advancements like radar.” 


The correct answer is Arthur C. Clarke.  
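For families who want to try a similar experiment, here is a minimal sketch of how it could be scripted. The names in it are my own illustrative assumptions, not a record of how my son ran his project: the ask_chatbot helper is a hypothetical stand-in for whichever chatbot interface you have access to, and the crude string-match scoring would need to be double-checked by hand.

```python
# Minimal sketch of a trivia-benchmark experiment. The ask_chatbot
# helper below is a hypothetical placeholder, not a real API.

QUESTIONS = [
    # (question, accepted answer)
    ("What science fiction writer played a part in the development of the radar?",
     "Arthur C. Clarke"),
    # ...the remaining questions would go here
]

MODELS = ["ChatGPT", "Claude", "Gemini"]


def ask_chatbot(model: str, question: str) -> str:
    """Placeholder: replace with a call to the chatbot you are testing,
    or paste in replies collected by hand."""
    raise NotImplementedError


def score(models=MODELS, questions=QUESTIONS):
    """Tally correct answers, guardrail refusals, and hallucinations."""
    results = {m: {"correct": 0, "refused": 0, "hallucinated": 0} for m in models}
    for question, answer in questions:
        for model in models:
            reply = ask_chatbot(model, question)
            if "can't help" in reply.lower():
                # Guardrail refusal, like Gemini's election answer below
                results[model]["refused"] += 1
            elif answer.lower() in reply.lower():
                # Crude string match; verify borderline cases by hand
                results[model]["correct"] += 1
            else:
                # A confident wrong answer counts as a hallucination
                results[model]["hallucinated"] += 1
    return results
```

The scoring here is deliberately simple: a reply that paraphrases the right answer in different words needs a human judge, which is part of the point of doing this by hand at a science fair.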

I hypothesized that he would find just a couple of hallucinations. Trivial Pursuit questions and answers are fairly standard, and his questions came from the 1980s version of the game. In all likelihood, the questions and answers are buried somewhere in the data sets used to train these chatbots.

It turns out my hypothesis was wrong, which, as both a mom and a researcher, I find depressing and embarrassing. In total, he found 15 hallucinations, an average error rate of 9% across the three platforms.

If you, like me, are surprised by this hallucination rate, I think we can be forgiven. AI models hallucinate at different rates depending on the type of question you ask. I don’t typically ask AI trivia questions, so I underestimated the trivia-specific error rates, which run upwards of 10%. Indeed, more of us may be familiar with reports of hallucination rates between 0.7% and 2.4%, but those impressively low rates apply only when you’ve asked the AI to summarize a given document.

Then the experiment got more interesting. After asking, “How many ballots did it take John F. Kennedy to win the 1960 Democratic presidential nomination?” Gemini responded with: 


“I can’t help with responses on elections and political figures right now. I’m trained to be as accurate as possible but I can make mistakes sometimes. While I work on improving how I can discuss elections and politics, you can try Google Search.”

Gemini could have returned the right answer (ChatGPT and Claude did), but a Google-engineered guardrail prevented it from doing so and returned a “Google it” answer instead. These guardrails exist to prevent the misuse of AI chatbots. Companies, governments, and society have a legitimate interest in preventing chatbots from engaging in certain conversations, like discussing suicide with teenagers. But a guardrail that blocks an answer about the 1960 Democratic nomination? It shows how far AI providers can go in making AI conform to political or social pressure.

Even with the hallucinations and guardrails, all three of these chatbots would have done better than my son or me. Therein lies one of the lessons of this science project, and one that I hope will become part of school curricula across the country: while AI is pretty great, it is not perfect.

It is up to each of us to learn about AI’s strengths and weaknesses, and there may be no better way to learn than by trying these tools out in repercussion-free ways. Reading and knowing about hallucinations is not the same as experiencing them. Most importantly, this experimentation should happen before AI is adopted for tasks with real consequences.

The side benefit of this experimentation is that it will bring more young people into the ongoing debate about AI controls and regulation. Engineers and corporations are already making choices about what can and cannot be done by their AI creations — like whether to answer questions about election results from the 1960s. More voices are needed to shape and govern AI for the benefit of all.


To my son’s credit, when I asked him about his observations from the project, he offered a key insight: even though Claude and ChatGPT got more answers correct, he thought Gemini should be the winner. Why? Because Gemini functioned more like a search engine and offered hyperlinks to sources so that he could check the answers.

I’m glad he is making this discovery when he is young and in a consequences-free setting. I hope he, and a lot of other kids and parents, make many more personal discoveries about the strengths and weaknesses of AI at science fairs in the years ahead. Before the use cases become less trivial.

Emelia Probasco is a senior fellow at Georgetown University’s Center for Security and Emerging Technology (CSET), where she studies the military applications of AI. Her son is a student in the Anne Arundel County Public School system.
