AI Tutor Review: Do OpenAI, Google, and Anthropic’s Student Modes Work?

Introduction

The integration of AI into education is accelerating, with major tech companies investing heavily in AI-powered learning tools. OpenAI, Google, and Anthropic have recently unveiled learning and study versions of their models, designed to function as AI tutors. Mashable tested these tools to assess their effectiveness, taking into account the broader context of AI’s role in education and its potential pitfalls. This review asks whether the tools genuinely assist learning or fall short of expectations.

The Rise of AI Tutors and Contextual Challenges

The introduction of AI tutors reflects a growing trend toward incorporating technology into education, mirroring the earlier adoption of laptops and libraries. However, the “five percent problem” names a recurring challenge in education tech design: highly motivated students (roughly 5%) can benefit, while the majority (around 95%) may not experience substantial improvements. This is partly due to a focus on short-term gains like grades rather than deeper understanding. Moreover, many AI tools lack personalization and context, which hinders their ability to cater to individual learning styles and curricula.

Methodology: Testing AI Tutors

To evaluate these tools, Mashable drew questions directly from the New York Regents Exam, New York State Common Core Standards, AP exams, and social science curricula from the Southern Poverty Law Center’s Learning for Justice program. Rather than focusing on typical STEM prompts, the review incorporated humanities questions to assess the tools’ capabilities in areas where AI has faced criticism. The goal was to simulate a student’s typical approach: beginning with a request for homework help and letting the conversation flow naturally until it stopped being helpful.

Key Insights from Experts

Hamsa Bastani, an associate professor at the University of Pennsylvania’s Wharton School, emphasizes that understanding the behavior of average students is crucial. She notes that existing generative AI chatbots are often repurposed rather than specifically designed for learning, leading to inadequate safeguards against revealing answers. Dylan Arena, chief data science and AI officer at McGraw Hill, uses the analogy of retrofitting a modern motor into an older machine, highlighting the challenge of integrating AI into existing educational frameworks and the need for deeper personalization beyond simply asking for user information.

Review of AI Tutors

  • ChatGPT: While adept at practice tests and clarifying grading standards, it frequently provides answers directly and prioritizes rote practice, making it frustrating for complex questions.
  • Gemini: Excels in math instruction and offers options like flashcards and quizzes, but is prone to generating unhelpful assessments and emphasizes practice over comprehension.
  • Claude: Focuses on the learning process rather than perfect marks, making it suitable for social science learners willing to build critical thinking skills, but can be overly Socratic and demanding.

Let’s Get Real: Limitations and Concerns

Despite their varying approaches, all three AI tutors share common limitations:

  • Design: The chatbot format, with its constrained text window and lack of visual elements, is not ideal for learning, particularly when it comes to understanding complex concepts.
  • Personalization: AI tutors lack context about individual students and their curricula, hindering the ability to truly personalize lessons.
  • Social Awareness: AI tutors often lack the flexibility and social awareness of a human teacher, so conversations can fall into an endless cycle of refinement without ever reaching a definitive answer.

McGraw Hill’s Approach: A Safer Alternative

McGraw Hill offers a different approach, integrating AI tools directly into its educational materials. This eliminates the need for student-provided context and offers a more controlled environment. Furthermore, the company’s assessment tool ALEKS provides a safer way to build personal student learning profiles and feed them back into its AI features.

Conclusion: Chatbots Can’t Replace Great Teachers

While AI tutors may offer some benefits, they cannot replace the expertise and adaptability of a human teacher. Ultimately, the current generation of chatbots struggles to provide a truly personalized and engaging learning experience. More work is needed to create AI tools that address these limitations and align with the social and contextual aspects of effective learning.