The hype around AI tutors is easy to get swept up in. The evidence, however, counsels restraint.

Several studies have found that chatbot tutors can backfire: students lean on them too heavily, receive spoon-fed solutions, and fail to retain the material. Even AI tutors explicitly designed to withhold answers haven't reliably outperformed traditional instruction.

Yet the researchers behind these sobering findings haven't abandoned the field. Many are still experimenting, searching for a design that actually works.

One promising direction has less to do with how an AI tutor explains a concept and more to do with which problems it asks a student to tackle next.

A team at the University of Pennsylvania — including some researchers already on record as AI skeptics — recently tested this idea in a study involving nearly 800 Taiwanese high school students learning Python programming. Every student worked with the same AI tutor, built to guide rather than give away answers. The single variable was sequencing.

Half the students followed a fixed curriculum: problems progressed from easy to hard in a predetermined order. The other half received a personalized sequence, with the AI continuously adjusting problem difficulty based on each student's performance and their interactions with the chatbot.

The approach draws on a well-established principle in education: the "zone of proximal development." Problems that are too easy produce boredom; problems that are too hard produce frustration. The goal is to keep students in a productive middle ground — challenged enough to grow, but not so overwhelmed that they disengage.
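To make the "productive middle ground" concrete, here is a minimal sketch of one way a system could keep a student inside a target success band. It is not the study's method: the class name `DifficultyController`, the five difficulty levels, the four-problem window and the 50-to-80 percent band are all assumptions chosen for illustration.

```python
from collections import deque


class DifficultyController:
    """Hypothetical rule-based controller; not the algorithm used in the study."""

    def __init__(self, levels=5, window=4, low=0.5, high=0.8):
        self.levels = levels          # number of difficulty levels (assumed)
        self.level = 1                # start at the easiest level
        self.low = low                # below this success rate: too hard
        self.high = high              # above this success rate: too easy
        self.recent = deque(maxlen=window)  # most recent outcomes only

    def record(self, solved: bool) -> int:
        """Record one attempt and return the level for the next problem."""
        self.recent.append(1 if solved else 0)
        if len(self.recent) == self.recent.maxlen:
            rate = sum(self.recent) / len(self.recent)
            if rate > self.high and self.level < self.levels:
                self.level += 1       # likely bored: step up
                self.recent.clear()   # judge the new level on fresh attempts
            elif rate < self.low and self.level > 1:
                self.level -= 1       # likely frustrated: step down
                self.recent.clear()
        return self.level


controller = DifficultyController()
for outcome in [True, True, True, True]:
    next_level = controller.record(outcome)
print(next_level)  # 2
```

Four straight successes push the student up one level; a run of failures at the new level would bring them back down.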

Students in the personalized group outperformed their peers on a final exam. The researchers characterized the gap as equivalent to six to nine months of additional schooling — a striking claim for an after-school online course that ran just five months. The AI tutor's creator, Angel Chung, a doctoral student at the Wharton School, was candid about the limitations of that comparison, calling her conversion of statistical units "not a perfect estimate." The draft paper, posted online in March 2026, has not yet undergone peer review.

Even with those caveats, the result offers early evidence that relatively modest design choices — in this case, calibrating problem difficulty to the individual student in real time — can meaningfully affect outcomes.

Chung noted that ChatGPT's responses can already feel deeply personal, since they're generated in direct response to each student's specific questions. But that kind of conversational responsiveness isn't the same as genuine pedagogical personalization. "Students usually don't know what they don't know," she said. "The student doesn't have the ability to ask the right questions to get the best tutoring."

To address that gap, Chung's team paired a large language model with a separate machine-learning algorithm that tracks how students engage with the course platform — how they answer practice questions, how often they revise their code, and the nature of their chatbot conversations — and uses that behavioral data to select the next problem.
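The article does not describe the algorithm in detail, so the following is only a rough sketch of the general idea: turn behavioral signals into a score, then use that score to choose the next problem's difficulty. It swaps in a simple epsilon-greedy bandit, which is not necessarily what Chung's team built. The `Interaction` fields, the `reward` weights and the `EpsilonGreedySelector` class are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Interaction:
    solved: bool        # did the student answer the practice problem correctly
    revisions: int      # how many times the student revised the code
    hint_requests: int  # chatbot messages asking for help


def reward(x: Interaction) -> float:
    """Hypothetical score: success after real effort, with little hint reliance."""
    if not x.solved:
        return 0.0
    effort = min(x.revisions, 5) / 5         # capped at 5 revisions
    reliance = min(x.hint_requests, 5) / 5   # capped at 5 hint requests
    return 0.5 + 0.5 * effort - 0.3 * reliance


class EpsilonGreedySelector:
    """Picks the difficulty level with the best average reward, exploring sometimes."""

    def __init__(self, levels, epsilon=0.1, seed=0):
        self.levels = list(levels)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {lvl: 0 for lvl in self.levels}
        self.values = {lvl: 0.0 for lvl in self.levels}

    def next_level(self):
        untried = [lvl for lvl in self.levels if self.counts[lvl] == 0]
        if untried:
            return untried[0]                      # try every level once first
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.levels)    # occasional exploration
        return max(self.levels, key=lambda lvl: self.values[lvl])

    def update(self, level, x: Interaction):
        self.counts[level] += 1
        n = self.counts[level]
        # Incremental running mean of the reward observed at this level.
        self.values[level] += (reward(x) - self.values[level]) / n


selector = EpsilonGreedySelector([1, 2, 3], epsilon=0.0)
selector.update(1, Interaction(solved=True, revisions=0, hint_requests=0))
selector.update(2, Interaction(solved=True, revisions=3, hint_requests=1))
selector.update(3, Interaction(solved=False, revisions=5, hint_requests=4))
print(selector.next_level())  # 2
```

In this toy run, level 2 wins because the student solved the problem after several revisions and with little help. A real system would also need to model how a student's knowledge changes over time, which a plain bandit ignores.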

Figure: How different students interact with the chatbot tutor. Source: Chung et al., Effective Personalized AI Tutors via LLM-Guided Reinforcement Learning, March 2026.

The implication is that personalization isn't just about tailoring explanations — it's about tailoring the learning path itself.

That idea predates generative AI by decades. Long before ChatGPT, education researchers built "intelligent tutoring systems" that attempted something similar: model what a student knew and deliver the right next problem accordingly. Those earlier systems couldn't hold a natural conversation, but they could offer hints and instant feedback. Rigorous research showed that well-designed versions produced significant learning gains.

Their persistent weakness was engagement. Students often simply didn't want to use them.

Modern AI tools may help solve that problem. A chatbot that converses in a natural, almost human way could hold students' attention in ways that older systems couldn't.

The University of Pennsylvania study offers some evidence for this. Students in the personalized group spent roughly three additional minutes per problem compared with those in the fixed-sequence group — translating to about an hour more practice per module. The researchers believe this increased engagement, not just the sequencing itself, drove the better outcomes.
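The two figures above fit together only if each module contains about 20 problems. The article does not state that number; the quick check below infers it from the rounded figures, so treat it as approximate.

```python
# Both inputs are the rounded figures reported in the article.
extra_minutes_per_problem = 3   # "roughly three additional minutes per problem"
extra_minutes_per_module = 60   # "about an hour more practice per module"

# Inferred, not reported: problems per module implied by the two figures.
implied_problems_per_module = extra_minutes_per_module / extra_minutes_per_problem
print(implied_problems_per_module)  # 20.0
```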

Prior knowledge also shaped the results. Students who were new to Python benefited most from personalized sequencing; those with existing Python experience did equally well under either approach. Students from less selective high schools also appeared to gain more from personalization.

Figure: How students' background affected results. All students had access to the same AI tutor. The treatment difference compares a personalized sequence of problem difficulty against a fixed sequence progressing from easy to hard. Source: Chung et al., Effective Personalized AI Tutors via LLM-Guided Reinforcement Learning, March 2026.

There's an important caveat about who these students were. All of them had voluntarily enrolled in an optional programming course to bolster their college applications. Most were highly motivated, came from educated households, and many already had some coding experience.

Whether a similar approach would work for less motivated students — particularly those who are academically behind and most in need of support — remains an open question.

One potential answer may lie in combining the new with the old. Ken Koedinger, a professor at Carnegie Mellon University and a pioneer of intelligent tutoring systems, is exploring how new AI models can flag struggling students to remote human tutors in real time, enabling a person to step in before a student completely disengages. "We are having more success," Koedinger said.

For now, at least, humans aren't obsolete.

Contact staff writer Jill Barshay at 212-678-3595 or jillbarshay.35 on Signal.

This story about AI tutors was produced by The Hechinger Report, a nonprofit, independent news organization focused on education.

The post The quest to build a better AI tutor appeared first on The Hechinger Report.

