Fine-Tuning Open-Source Models as a Viable Alternative to Proprietary LLMs for Explaining Compiler Messages
This program is tentative and subject to change.
Cryptic compiler error messages continue to present a significant barrier for novice programmers, especially in foundational languages like C. Although large language models (LLMs) can generate accurate and comprehensible error explanations, their computational requirements, propensity for over-assistance, and privacy concerns constrain their suitability for widespread adoption in educational tools. This work investigates how supervised fine-tuning (SFT) can enhance the performance of smaller, open-source models when explaining C compiler errors to students in introductory programming courses (CS1/2). We derive a training dataset of 40,000 input-output pairs from CS1/2 students' C compiler errors and use it to fine-tune three open-source models: Qwen3-4B, Llama-3.1-8B, and Qwen3-32B. We assess model performance through a dual evaluation framework: expert human review and a large-scale automated analysis of 8,000 responses using an ensemble of models as judges. Our results indicate that SFT significantly improves both expert and LLM-as-judge ratings for the smaller open-source models, with reduced gains for the larger model. We analyse the trade-offs between model size and quality, and validate the LLM-as-judge approach by demonstrating inter-rater agreement with expert reviewers. Our findings demonstrate that fine-tuning smaller models on high-quality data is a viable strategy for building specialised pedagogical tools, and we provide a replicable methodology for broadening access to advanced AI capabilities in educational contexts, particularly with smaller, economical models.
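The abstract does not detail the fine-tuning pipeline beyond the dataset size and model choices, but the general recipe it describes can be sketched. Below is a minimal, hedged sketch using Hugging Face TRL's SFTTrainer; the file name compiler_errors.jsonl, the error/explanation field names, and all hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
# Illustrative SFT sketch: fine-tune a small open-source model on
# compiler-error -> explanation pairs with Hugging Face TRL.
# File name, field names, and hyperparameters are assumptions, not the
# authors' exact setup.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed JSONL layout: one training pair per record, e.g.
# {"error": "<C compiler diagnostic plus code context>",
#  "explanation": "<pedagogical explanation of the error>"}
dataset = load_dataset("json", data_files="compiler_errors.jsonl", split="train")

def to_messages(example):
    # Convert each pair to the chat format SFTTrainer accepts, so the
    # model's own chat template is applied automatically during training.
    return {
        "messages": [
            {"role": "user", "content": example["error"]},
            {"role": "assistant", "content": example["explanation"]},
        ]
    }

dataset = dataset.map(to_messages, remove_columns=dataset.column_names)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B",  # smallest of the three models studied
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen3-4b-compiler-sft",
        num_train_epochs=1,             # illustrative values only
        per_device_train_batch_size=4,
        learning_rate=2e-5,
    ),
)
trainer.train()
```

Training on user/assistant pairs in this way teaches the model to respond directly to a raw compiler diagnostic, so a deployed educational tool would only need to forward the student's compiler output as the user turn.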
Thu 19 Feb (displayed time zone: Central Time, US & Canada)

Session: 13:40 - 15:00

13:40 (20m Talk) Assessing Student Proficiency in Foundational Developer Tools Through Live Checkoffs (Papers)

14:00 (20m Talk) Understanding Student Interaction with AI-Powered Next-Step Hints: Strategies and Challenges (Papers)
Anastasiia Birillo (JetBrains Research), Aleksei Rostovskii (JetBrains Research), Yaroslav Golubev (JetBrains Research), Hieke Keuning (Utrecht University)

14:20 (20m Talk) Personalized Exam Prep (PEP): Scaling No-Stakes, No-LLM Dialogue-Based Assessments in a Large CS Course (Papers)

14:40 (20m Talk) Fine-Tuning Open-Source Models as a Viable Alternative to Proprietary LLMs for Explaining Compiler Messages (Papers)
Lorenzo Lee Solano (University of New South Wales, Sydney), Charles Koutcheme (Aalto University), Juho Leinonen (Aalto University), Alexandra Vassar (University of New South Wales, Sydney), Jake Renzella (University of New South Wales, Sydney)