Educational Data Mining: A Comparative Study to Predict Student Academic Performance Using Deep Neural Network and Other Machine Learning Techniques
Syed Amin Ullah, Institute of Computer Sciences and Information Technology (ICS/IT), The University of Agriculture Peshawar, Pakistan.
Mohib Ullah, Institute of Computer Sciences and Information Technology (ICS/IT), The University of Agriculture Peshawar, Pakistan.
Rafiullah Khan, Institute of Computer Sciences and Information Technology (ICS/IT), The University of Agriculture Peshawar, Pakistan.
Kamran Ullah, Institute of Computer Sciences and Information Technology (ICS/IT), The University of Agriculture Peshawar, Pakistan.
Yasir Ahmed, Institute of Computer Sciences and Information Technology (ICS/IT), The University of Agriculture Peshawar, Pakistan.
Atta Ur Rehman, Institute of Computer Sciences and Information Technology (ICS/IT), The University of Agriculture Peshawar, Pakistan.
Corresponding Author:
Syed Amin Ullah (syedaminullah.kmu@gmail.com)
Abstract:
Educational data mining (EDM) is an emerging discipline that encompasses various techniques for exploring and analyzing educational data to better understand students’ learning capabilities. It is particularly useful for analyzing and predicting students’ academic performance. Such prediction has become essential for educational institutions seeking to improve student learning, enhance the quality of education, and raise overall institutional performance. This research attempts to predict a student’s academic performance based on previous records. It also compares the performance of several classification techniques: Naïve Bayes (NB), J48, Random Forest (RF), Hoeffding Tree (HT), Random Tree (RT), Deep Neural Network (DNN), Multi-Layer Perceptron (MLP), Simple Logistic Regression (SL), Logistic Regression (LR), Reduced Error Pruning Tree (REPTR), and Lazy K-Nearest Neighbor (LBK). A five-year Bachelor’s degree program dataset obtained from educational institutions in Peshawar, Khyber Pakhtunkhwa, Pakistan has been used in the present work. The experimental results show that all the classifiers were able to predict students’ academic performance. The findings indicate that, among all the algorithms, Naïve Bayes and Hoeffding Tree outperformed the rest, with an accuracy of 59.45%, a recall of 59.5%, and an F-Measure of 57.8%. The J48 classifier, on the other hand, achieved a precision of 62.4%, which was higher than that of the other classification algorithms.
Keywords:
Educational Data Mining; Data Mining; Academic Performance; Classification; Machine Learning; Cross-Validation; SSC; HSSC; HEIs