• MINE ÇETINKAYA-RUNDEL, University of Edinburgh, Duke University, RStudio
• MINE DOGUCU, University of California, Irvine
• WENDY RUMMERFIELD, University of California, Irvine



Statistics education research, Data science, Teaching statistics, Statistics curriculum, R language


Many data science applications involve generating questions, acquiring data and preparing it for analysis (be it exploratory, inferential, or modeling-focused), and communicating findings. Most data science curricula address each of these steps as separate units in a course or as separate courses. Open-ended term projects, on the other hand, allow students to put each of these steps into practice, sequentially and iteratively. In this paper we discuss what we mean by data science projects, why they are crucial in introductory data science courses, who works on these projects and how, when in the term they can be implemented, and where they can be shared.

