Sensitivity Analysis and Temporal Stability of Student Success Predictors based on Different Data Sources in Education

Marija Pokos Lukinec[0009-0006-1564-5970], Dijana Oreški[0000-0002-3820-0126], and Dino Vlahek[0000-0002-3911-8685]
University of Zagreb, Faculty of Organization and Informatics, Pavlinska 2, 42000, Varaždin, Croatia
mapokos@foi.hr,
dijana.oreski@foi.hr,
dvlahek@foi.hr
DOI: 10.46793/eLearning2025.058PL

 

Abstract. Student success prediction is a central topic in educational data mining and learning analytics, as institutions increasingly rely on data-driven approaches to improve learning outcomes. However, the dynamic nature of educational environments raises questions about the long-term reliability of the predictive features used in these models. This study investigates the temporal stability of features identified through sensitivity analysis of predictive models that integrate data from several sources: the e-learning system, student attendance records, teacher opinions, and meteorological data. The stability of success predictors is modeled using two machine learning algorithms, Random Forest and Gradient Boosted Decision Tree. Regression metrics are used to assess the accuracy of the models and thereby the reliability of the predictive features over time. Identifying the relevant success predictors and their temporal stability provides insight into which predictors remain significant in the long term. The results support the development of robust predictive models and highlight key features that contribute to the reliable analysis of student success outcomes.

Keywords: Predictive Data Modeling, Machine Learning, Stability of Success Predictors, Learning Management System, Learning Analytics, Random Forest, Gradient Boosted Tree
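
The evaluation idea outlined in the abstract can be illustrated with a minimal sketch. The abstract does not name the regression metrics or the stability measure, so the sketch assumes RMSE, a range-normalized RMSE, and the coefficient of variation of a feature's importance across time periods. The feature names and importance values are hypothetical and are not results from the study.

```python
import math
from statistics import mean, pstdev


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the observed values (one common normalization)."""
    value_range = max(y_true) - min(y_true)
    return rmse(y_true, y_pred) / value_range


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean; lower means more stable."""
    return pstdev(values) / mean(values)


# Hypothetical importance scores of two features in three consecutive periods
# (illustrative numbers only, not taken from the paper).
importances = {
    "lms_activity": [0.40, 0.42, 0.38],
    "attendance": [0.30, 0.10, 0.50],
}

# A feature whose importance varies little across periods gets a low score.
stability = {
    name: coefficient_of_variation(scores)
    for name, scores in importances.items()
}

for name, cv in sorted(stability.items(), key=lambda item: item[1]):
    print(f"{name}: CV = {cv:.3f}")
```

In practice the importance scores would come from the trained Random Forest and Gradient Boosted Decision Tree models for each period; under this assumed measure, a feature with a low coefficient of variation would be read as temporally stable.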

Acknowledgment

This paper is supported by the Croatian Science Foundation under the project SIMON: Intelligent system for automatic selection of machine learning algorithms in social sciences, UIP-2020-02-6312.


 

Source: Proceedings of the 16th International Conference on e-Learning (ELEARNING2025)