Enhancing Undergraduate Data Science Education Through Structured, Project-Based Learning
How do we move undergraduate data science education beyond textbook exercises and toward authentic professional practice? This lightning talk presents a replicable two-project model implemented in our “Data Science Fundamentals” course, which has served over 150 students across multiple semesters. The scaffolded project sequence is designed to build both collaborative and technical skills.
The model begins with a group-based data visualization project focused on communication: students develop a proposal, provide structured peer feedback, and present their findings. It then transitions to an individual quantitative analysis project emphasizing technical rigor and reproducibility, requiring model training and publication of a formal report on GitHub. This progression is supported by key scaffolding elements, including milestone check-ins and a guest lecture from a university librarian on dataset discovery.
In course evaluations, which averaged an 85% response rate, students report a marked increase in confidence in applying data science techniques and a strong appreciation for the projects’ real-world relevance. These self-reported results suggest that the structured sequence helps prepare students for the multifaceted nature of data science work.
The talk will briefly describe the course structure, share reusable prompts and rubrics, and outline open questions we intend to explore next, such as measuring transfer to later courses and sustaining equitable teamwork at scale. We aim to spark discussion, find collaborators, and gather feedback on adapting this model across diverse institutional contexts.