This program is tentative and subject to change.

Thu 19 Feb 2026 14:40 - 15:00 at Meeting Room 100 - Assessment and Feedback

Cryptic compiler error messages continue to present a significant barrier for novice programmers, especially in foundational languages like C. Although large language models (LLMs) can generate accurate and comprehensible error explanations, their computational requirements, propensity for over-assistance, and privacy concerns constrain their suitability for widespread adoption in educational tools. This work investigates how Supervised Fine-Tuning (SFT) can enhance the performance of smaller, open-source models when explaining C compiler errors to students in introductory programming courses (CS1/2). We derive a training dataset of 40,000 input-output pairs from CS1/2 student C compiler errors to fine-tune three open-source models: Qwen3-4B, Llama-3.1-8B, and Qwen3-32B. Model performance was assessed through a dual evaluation framework involving expert human reviewers and a large-scale automated analysis of 8,000 responses using an ensemble of models as judges. Our results indicate that SFT significantly improves both expert and LLM-as-judge ratings in smaller open-source models, with reduced gains in the larger model. We analyse the trade-offs between model size and quality, and validate LLM-as-judge by demonstrating inter-rater agreement with experts. Our findings demonstrate that fine-tuning smaller models on high-quality data is a viable strategy for creating specialised pedagogical tools. We provide a replicable methodology for enabling broader access to advanced AI capabilities within educational contexts, especially with smaller, economical models.

Thu 19 Feb

Displayed time zone: Central Time (US & Canada)

13:40 - 15:00
Assessment and Feedback - Papers at Meeting Room 100
13:40
20m
Talk
Assessing Student Proficiency in Foundational Developer Tools Through Live Checkoffs
Papers
Connor McMahon pc, Lauren Feldman University of North Carolina at Chapel Hill
14:00
20m
Talk
Understanding Student Interaction with AI-Powered Next-Step Hints: Strategies and Challenges
Papers
Anastasiia Birillo JetBrains Research, Aleksei Rostovskii JetBrains Research, Yaroslav Golubev JetBrains Research, Hieke Keuning Utrecht University
14:20
20m
Talk
Personalized Exam Prep (PEP): Scaling No-Stakes, No-LLM Dialogue-Based Assessments in a Large CS Course
Papers
Kelly Cochran pc, Chris Piech Stanford University
14:40
20m
Talk
Fine-Tuning Open-Source Models as a Viable Alternative to Proprietary LLMs for Explaining Compiler Messages
Papers
Lorenzo Lee Solano University of New South Wales, Sydney, Charles Koutcheme Aalto University, Juho Leinonen Aalto University, Alexandra Vassar University of New South Wales, Sydney, Jake Renzella University of New South Wales, Sydney