• SUSANNE PODWORNY, Paderborn University
• SVEN HÜSING, Paderborn University
• CARSTEN SCHULTE, Paderborn University




Statistics education research, Jupyter Notebooks, Data science, Reproducible data analysis, Epistemic programming


Aspects of data science surround us in many contexts, for example regarding climate change, air pollution, and other environmental issues. To open the data science “black box” for lower secondary school students, we developed a data science project focussing on the analysis of self-collected environmental data. We embed this project in computer science education, which enables us to use a new knowledge-based programming approach for data analysis with Jupyter Notebooks and the programming language Python. In this paper, we evaluate the second cycle of this project, which took place in a ninth-grade computer science class. In particular, we present how the students coped with Jupyter Notebooks as a professional tool for doing statistical investigations and which insights they gained.

