A comparative study of deep learning techniques in software fault prediction
Authors
Ha Thi Minh Phuong, The University of Danang - Vietnam-Korea University of Information and Communication Technology, Vietnam
Dang Thi Kim Ngan, The University of Danang - Vietnam-Korea University of Information and Communication Technology, Vietnam
Nguyen Thanh Binh, The University of Danang - Vietnam-Korea University of Information and Communication Technology, Vietnam
Abstract
Software fault prediction (SFP) is an important approach in software engineering that helps ensure software quality and reliability. Predicting software faults helps developers identify faulty components in software systems. Several studies focus on software metrics, which are fed into machine learning models to predict faulty components. However, such metrics may not capture the semantic and structural information of software that is needed to build fault prediction models with better performance. Therefore, this paper examines the effectiveness of deep learning models, including Deep Belief Networks (DBN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM), for constructing fault prediction models based on contextual information. The experiments were conducted on seven Apache datasets, with Precision, Recall, and F1-score as the performance metrics. The comparison results show that LSTM and RNN are promising techniques for building highly accurate fault prediction models.
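For reference, the three performance metrics named in the abstract follow the standard definitions for binary classification. The sketch below is not the authors' evaluation code: the function name `precision_recall_f1` and the toy labels are hypothetical, and it assumes that 1 marks a faulty component and 0 a clean one.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels, where 1 = faulty."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # faulty, predicted faulty
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, predicted faulty
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # faulty, predicted clean

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels for eight components (not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of Precision and Recall, so it stays low unless both are high. That makes it a common summary metric when faulty components are a minority of the dataset.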