Fine-Tuning Open-Source Models as a Viable Alternative to Proprietary LLMs for Explaining Compiler Messages
Cryptic compiler error messages continue to present a significant barrier for novice programmers, especially in foundational languages like C. Although large language models (LLMs) can generate accurate and comprehensible error explanations, their computational requirements, propensity for over-assistance, and privacy concerns constrain their suitability for widespread adoption in educational tools. This work investigates how Supervised Fine-Tuning (SFT) can enhance the performance of smaller, open-source models when explaining C compiler errors to students in introductory programming courses (CS1/2). We derive a training dataset of 40,000 input-output pairs from CS1/2 students' C compiler errors and use it to fine-tune three open-source models: Qwen3-4B, Llama-3.1-8B, and Qwen3-32B. We assess model performance through a dual evaluation framework combining expert human review with a large-scale automated analysis of 8,000 responses scored by an ensemble of LLM judges. Our results indicate that SFT significantly improves both expert and LLM-as-judge ratings for the smaller open-source models, with reduced gains for the larger model. We analyse the trade-offs between model size and output quality, and validate the LLM-as-judge approach by demonstrating inter-rater agreement with the experts. Our findings demonstrate that fine-tuning smaller models on high-quality data is a viable strategy for creating specialised pedagogical tools, and we provide a replicable methodology for broadening access to advanced AI capabilities in educational contexts, particularly through smaller, economical models.
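As a rough illustration of the SFT recipe summarised above (not the authors' exact pipeline), the sketch below fine-tunes Qwen3-4B on prompt-completion pairs of compiler errors and explanations using Hugging Face TRL; the dataset file name, JSONL schema, and hyperparameters are assumptions for illustration.

```python
# Minimal SFT sketch (illustrative only), assuming a recent version of TRL.
# File name, schema, and hyperparameters are assumptions, not the paper's
# exact configuration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed JSONL schema:
# {"prompt": "<compiler error + student code excerpt>",
#  "completion": "<pedagogical explanation>"}
train_ds = load_dataset("json", data_files="cs1_compiler_errors_sft.jsonl", split="train")

config = SFTConfig(
    output_dir="qwen3-4b-error-explainer",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=2,      # illustrative; tune against a held-out split
    learning_rate=2e-5,
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B",   # TRL loads the model and tokenizer from this hub ID
    args=config,
    train_dataset=train_ds,
)
trainer.train()
```

Under the setup described in the abstract, the same recipe would be repeated for Llama-3.1-8B and Qwen3-32B, with the resulting checkpoints then evaluated by expert reviewers and the LLM-judge ensemble.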