
Outlier Detection as a Tool for Reinforcing Data Analysis and Prediction in Education

Abstract : Educational institutions seek to design effective mechanisms that improve academic results, enhance the learning process, and prevent dropout. Analysing and predicting students' performance can reveal weaknesses in educational programmes and identify students with learning difficulties. This motivates the development of techniques and data-driven models that aim to enhance teaching and learning. Classical models usually ignore outlier students, whose characteristics are uncommon or inconsistent, although these students may carry significant information for domain experts and may affect prediction models. Outliers in education have barely been explored, and their impact on prediction models has not yet been studied in the literature. The thesis therefore aims to investigate outliers in educational data and to extend the existing knowledge about them. It presents three case studies of outlier detection covering different educational contexts and data representations (a numerical dataset from a German university, a numerical dataset from a Russian university, and a sequential dataset from French nursing schools). For each case, a data preprocessing approach is proposed that takes the peculiarities of the dataset into account. The prepared data are then used to detect outliers when the ground truth is unknown. The characteristics of the detected outliers are explored and analysed, which extends the understanding of students' behaviour in the learning process. One of the main tasks in the educational domain is to develop tools that help improve academic results and reduce attrition. Many studies therefore aim to build performance prediction models that can identify students with learning difficulties who need special help. The second goal of the thesis is to study the impact of outliers on prediction models.
The two most common prediction tasks in the educational field are considered: (i) dropout prediction and (ii) final score prediction. The prediction models are compared across different prediction algorithms and with respect to the presence of outliers in the training data. This thesis opens new avenues for investigating students' performance in educational environments. Understanding outliers and the reasons for their appearance can help domain experts extract valuable information from the data. Outlier detection might form part of the pipeline in early warning systems that detect students at high risk of dropping out. Furthermore, the behavioural tendencies of outliers can serve as a basis for providing recommendations to students in their studies or for making decisions about improving the educational process.
Contributor : ABES STAR
Submitted on : Thursday, June 30, 2022 - 5:06:39 PM
Last modification on : Tuesday, July 5, 2022 - 3:49:13 AM
Long-term archiving on : Monday, October 3, 2022 - 4:02:21 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03710491, version 1


Daria Novoseltseva. Outlier Detection as a Tool for Reinforcing Data Analysis and Prediction in Education. Education. Université Paul Sabatier - Toulouse III, 2022. English. ⟨NNT : 2022TOU30009⟩. ⟨tel-03710491⟩


