The 5Ws and 1H of Term Projects in the Introductory Data Science Classroom

Authors

  • Mine Çetinkaya-Rundel, University of Edinburgh, Duke University, RStudio https://orcid.org/0000-0001-6452-2420
  • Mine Dogucu, University of California Irvine
  • Wendy Rummerfield, University of California Irvine

DOI:

https://doi.org/10.52041/serj.v21i2.37

Keywords:

Statistics education research, Data science, Teaching statistics, Statistics curriculum, R language

Abstract

Many data science applications involve generating questions, acquiring data and preparing it for analysis (be it exploratory, inferential, or modeling focused), and communicating findings. Most data science curricula address each of these steps in separate units of a course or in separate courses. Open-ended term projects, on the other hand, allow students to put all of these steps into practice, sequentially and iteratively. In this paper, we discuss what we mean by data science projects, why they are crucial in introductory data science courses, who works on these projects and how, when in the term they can be implemented, and where they can be shared.

Published

2022-07-04