Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı

Pınar Tüfekci; Çetin Mutlu Önal

doi:10.29130/dubited.1287453

Research Article

Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı

Year 2024, Volume: 12 Issue: 1, 307 - 319, 26.01.2024

Pınar Tüfekci Çetin Mutlu Önal

https://doi.org/10.29130/dubited.1287453

Abstract

Gelişen teknoloji sayesinde bilgiye kolay erişim sağlansa da, bu durum kötü amaçlı eylemlerin artışına da sebep olmuştur. Android işletim sistemlerinde sıklıkla rastlanan kötü amaçlı yazılımlar (malware), kullanıcıların cihazındaki verilere erişerek büyük bir tehdit oluşturmaktadır. Bu çalışma, kötü amaçlı yazılımları tespit etmek amacıyla yüksek doğruluklu ve güvenilir bir model geliştirmeyi hedeflemektedir. Modelleme çalışmalarında popüler bir veri seti olan DREBIN-215 Android Malware Dataset kullanılmıştır. Makine Öğrenmesi algoritmaları arasından Support Vector Machines (SVM), Gradient Boosting (GB), Multi Layer Perceptron (MLP), Naïve Bayes (MNB), K-En Yakın Komşu (KNN) ve Random Forest (RF) algoritmaları uygulanmıştır. Algoritmaların performansları, varsayılan parametreler ve GridSearch yöntemiyle elde edilen en iyi hiperparametre değerlerinin kullanılmasıyla değerlendirilmiştir. En başarılı model, SVM algoritmasıyla en iyi hiperparametrelerin uygulanması sonucu %99.07 doğruluk oranıyla elde edilmiştir.

Keywords

Kötü amaçlı yazılım, Makine öğrenmesi, Support Vector Machines, Gradient Boosting, Multi Layer Perceptron.

References

[1] A. T. Kabakuş, İ. A. Doğru and A. Çetin, "Android kötücül yazılım tespit ve koruma sistemleri", Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 31, no. 1, pp. 9-16, Feb. 2015.
[2] M. Grace, Y. Zhou, Q. Zhang, S. Zou and X. Jiang, "RiskRanker: scalable and accurate zero-day android malware detection", MobiSys '12: Proceedings of the 10th international conference on Mobile systems, applications, and services, June 2012, Pages 281–294, https://doi.org/10.1145/2307636.2307663
[3] N. Zhang, Y. Tan, C. Yang and Y. Li, "Deep learning feature exploration for Android malware detection", Applied Soft Computing, Volume 102, April 2021, https://doi.org/10.1016/j.asoc.2020.107069
[4] A. Razgallah, R. Khoury, S. Halle and K. Khanmohammadi, "A survey of malware detection in Android apps: Recommendations and perspectives for future research", Computer Science Review, Volume 39, February 2021, https://doi.org/10.1016/j.cosrev.2020.100358
[5] A. Guerra-Manzanares, M. Luckner and H. Bahsi, "Concept drift and cross-device behavior: Challenges and implications for effective android malware detection", Computers & Security, Volume 120, September 2022, https://doi.org/10.1016/j.cose.2022.102757
[6] F. Ou and J. Xu, "S3Feature: A static sensitive subgraph-based feature for android malware detection", Computers & Security, Volume 112, January 2022, https://doi.org/10.1016/j.cose.2021.102513
[7] A. Martin, R. Lara-Cabrera and D. Camacho, "Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset", Information Fusion, Volume 52, December 2019, Pages 128-142, https://doi.org/10.1016/j.inffus.2018.12.006
[8] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei and Q. Zheng, "IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture", Computer Networks, Volume 171, 22 April 2020, https://doi.org/10.1016/j.comnet.2020.107138
[9] A. Ananya, P. Vinod and M. Shojafar, "SysDroid: A Dynamic ML-based Android Malware Analyzer using System Call Traces", Cluster Computing, December 2020, DOI:10.1007/s10586-019-03045-6
[10] K. Lin, X. Xu and F. Xiao, "MFFusion: A Multi-level Features Fusion Model for Malicious Traffic Detection based on Deep Learning", Computer Networks, Volume 202, 15 January 2022, https://doi.org/10.1016/j.comnet.2021.108658
[11] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher and M. Portmann, "Graph neural network-based android malware classification with jumping knowledge", 2022 IEEE Conference on Dependable and Secure Computing (DSC) , 1–9, 2022.
[12] L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. J. Ross and G. Stringhini, "Mamadroid: Detecting android malware by building markov chains of behavioral models", ACM Trans. Priv. Secur. 22, 14:1–14:3, 2019. [13] Y. Wu, X. Li, D. Zou, W. Yang, X. Zhang and H. Jin, "Malscan: Fast market-wide mobile malware scanning by social-network centrality analysis", in: 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA, USA, IEEE. pp. 139–150, 2019.
[14] P. Xu, C. Eckert and A. Zarras, "Detecting and categorizing android malware with graph neural networks", in: SAC ’21: The 36th ACM/SIGAPP Symposium on Applied Computing, pp. 409–412, 2021.
[15] H. Gao, S. Cheng and W. Zhang, "Gdroid: Android malware detection and classification with graph convolutional network", Comput. Secur. 106, 2021.
[16] M.S. Rana, S. S. M. M. Rahman, and A. H. Sung, "Evaluation of tree based machine learning classifiers for android malware detection", Computational Collective Intelligence: 10th International Conference, ICCCI 2018, Bristol, UK, September 5-7, 2018, Proceedings, Part II 10. Springer International Publishing, 2018.
[17] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, and I. Molloy, I. "Using probabilistic generative models for ranking risks of android apps", In Proceedings of the 2012 ACM conference on Computer and communications security (pp. 241-252), 2012.
[18] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C. E. R. T. Siemens, "Drebin: Effective and explainable detection of android malware in your pocket", In Ndss, Vol. 14, pp. 23-26, 2014.
[19] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-an and H. Ye, "Significant Permission Identification for Machine-Learning-Based Android Malware Detection", in IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3216-3225, July 2018, doi: 10.1109/TII.2017.2789219.
[20] M. Qiao, A. H. Sung and Q. Liu, "Merging Permission and API Features for Android Malware Detection", 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 2016, pp. 566-571, doi: 10.1109/IIAI-AAI.2016.237.
[21] A. Aydın , İ. A. Doğru and M. Dörterler , "Makine Öğrenmesi Algoritmalarıyla Android Kötücül Yazılım Uygulamalarının Tespiti", Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 22, no. 2, pp. 1087-1094, Aug. 2018, doi:10.19113/sdufbed.20066
[22] A. Güngör , İ. Dogru , N. Barışçı and S. Toklu , "Görüntü tabanlı özelliklerden ve makine öğrenmesi yöntemlerinden faydalanılarak kötücül yazılım tespiti", Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 38, no. 3, pp. 1781-1792, Jan. 2023, doi:10.17341/gazimmfd.994289
[23] A. Utku, İ. A. Doğru and M. A. Akcayol, "Decision tree based android malware detection system", 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 2018, pp. 1-4, doi: 10.1109/SIU.2018.8404151.
[24] Z. Liu, R. Wang, N. Japkowicz, H. M. Gomes, B. Peng and W. Zhang, "SeGDroid: An Android malware detection method based on sensitive function call graph learning", Expert Systems with Applications, 2023, https://doi.org/10.1016/j.eswa.2023.121125
[25] S. Yang, Y. Wang, H. Xu, F. Xu and M. Chen, "An Android Malware Detection and Classification Approach Based on Contrastive Lerning", Computers & Security Volume 123, 2022, https://doi.org/10.1016/j.cose.2022.102915
[26] J. Sahs and L. Khan, "A Machine Learning Approach to Android Malware Detection," 2012 European Intelligence and Security Informatics Conference, Odense, Denmark, 2012, pp. 141-147, 2012 doi: 10.1109/EISIC.2012.34.
[27] Ö. Kiraz and İ. A. Doğru, "Android Kötücül Yazılım Tespit Sistemleri İncelemesi", Düzce Üniversitesi Bilim ve Teknoloji Dergisi, vol. 5, no. 1, pp. 281-298, Jan. 2017.
[28] S. Haykin, "Neural Networks: A Comprehensive Foundation", Prentice- Hall, Ontario, 837s, 1999.
[29] J. Friedman, "Greedy Function Approximation: A Gradient Boosting Machine", The Annals of Statistics, 29(5), 11-28, 2000.
[30] V. N. Vapnik "The Nature of Static Learing Theory", Springer, 314s, 2000. [31] J. VanderPlas, "Python Data Science Handbook Essential Tools for Working with Data", O'Reilly Media, 2016.
[32] G. O. Campos, A. Zimek, J. Sander, R.J.G.B. Campello, B. Micenková, E. Schubert, I. Assent and M.E. Houle, "On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study", Data Mining and Knowledge Discovery, vol. 30, no. 4, pp. 891–927, 2016. [33] L. Breiman, "Random Forests", Statistics Department University of California Berkeley, 1- 33, 2001.
[34] B. J. Erickson and F. Kitamura, “Magician's Corner: 9. Performance Metrics for Machine Learning Models”, Radiology. Artificial intelligence vol. 3, 3e, 2021, doi:10.1148/ryai.2021200126.

Using Machine Learning Algorithms for Malware Detection

Year 2024, Volume: 12 Issue: 1, 307 - 319, 26.01.2024

Pınar Tüfekci Çetin Mutlu Önal

https://doi.org/10.29130/dubited.1287453

Abstract

Although advanced technology has facilitated easy access to information, it has also led to an increase in malicious activities. Malware, frequently encountered in Android operating systems, poses a significant threat by accessing users' data on their devices. This study aims to develop a highly accurate and reliable model for detecting malware. The modeling work used the popular and imbalanced dataset, DREBIN-215 Android Malware Dataset. Machine Learning algorithms such as Support Vector Machines (SVM), Gradient Boosting (GB), Multi Layer Perceptron (MLP), Naïve Bayes (MNB), K-Nearest Neighbor (KNN) and Random Forest (RF) were applied. The performance of the algorithms was evaluated using default parameters and the best hyperparameter values obtained through the GridSearch method. The most successful model was achieved with an accuracy rate of 99.07% by applying the best hyperparameters in the SVM algorithm.

Keywords

Malware, Machine learning, Support Vector Machines, Gradient Boosting, Multi Layer Perceptron

References

[1] A. T. Kabakuş, İ. A. Doğru and A. Çetin, "Android kötücül yazılım tespit ve koruma sistemleri", Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 31, no. 1, pp. 9-16, Feb. 2015.
[2] M. Grace, Y. Zhou, Q. Zhang, S. Zou and X. Jiang, "RiskRanker: scalable and accurate zero-day android malware detection", MobiSys '12: Proceedings of the 10th international conference on Mobile systems, applications, and services, June 2012, Pages 281–294, https://doi.org/10.1145/2307636.2307663
[3] N. Zhang, Y. Tan, C. Yang and Y. Li, "Deep learning feature exploration for Android malware detection", Applied Soft Computing, Volume 102, April 2021, https://doi.org/10.1016/j.asoc.2020.107069
[4] A. Razgallah, R. Khoury, S. Halle and K. Khanmohammadi, "A survey of malware detection in Android apps: Recommendations and perspectives for future research", Computer Science Review, Volume 39, February 2021, https://doi.org/10.1016/j.cosrev.2020.100358
[5] A. Guerra-Manzanares, M. Luckner and H. Bahsi, "Concept drift and cross-device behavior: Challenges and implications for effective android malware detection", Computers & Security, Volume 120, September 2022, https://doi.org/10.1016/j.cose.2022.102757
[6] F. Ou and J. Xu, "S3Feature: A static sensitive subgraph-based feature for android malware detection", Computers & Security, Volume 112, January 2022, https://doi.org/10.1016/j.cose.2021.102513
[7] A. Martin, R. Lara-Cabrera and D. Camacho, "Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset", Information Fusion, Volume 52, December 2019, Pages 128-142, https://doi.org/10.1016/j.inffus.2018.12.006
[8] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei and Q. Zheng, "IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture", Computer Networks, Volume 171, 22 April 2020, https://doi.org/10.1016/j.comnet.2020.107138
[9] A. Ananya, P. Vinod and M. Shojafar, "SysDroid: A Dynamic ML-based Android Malware Analyzer using System Call Traces", Cluster Computing, December 2020, DOI:10.1007/s10586-019-03045-6
[10] K. Lin, X. Xu and F. Xiao, "MFFusion: A Multi-level Features Fusion Model for Malicious Traffic Detection based on Deep Learning", Computer Networks, Volume 202, 15 January 2022, https://doi.org/10.1016/j.comnet.2021.108658
[11] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher and M. Portmann, "Graph neural network-based android malware classification with jumping knowledge", 2022 IEEE Conference on Dependable and Secure Computing (DSC) , 1–9, 2022.
[12] L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. J. Ross and G. Stringhini, "Mamadroid: Detecting android malware by building markov chains of behavioral models", ACM Trans. Priv. Secur. 22, 14:1–14:3, 2019. [13] Y. Wu, X. Li, D. Zou, W. Yang, X. Zhang and H. Jin, "Malscan: Fast market-wide mobile malware scanning by social-network centrality analysis", in: 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA, USA, IEEE. pp. 139–150, 2019.
[14] P. Xu, C. Eckert and A. Zarras, "Detecting and categorizing android malware with graph neural networks", in: SAC ’21: The 36th ACM/SIGAPP Symposium on Applied Computing, pp. 409–412, 2021.
[15] H. Gao, S. Cheng and W. Zhang, "Gdroid: Android malware detection and classification with graph convolutional network", Comput. Secur. 106, 2021.
[16] M.S. Rana, S. S. M. M. Rahman, and A. H. Sung, "Evaluation of tree based machine learning classifiers for android malware detection", Computational Collective Intelligence: 10th International Conference, ICCCI 2018, Bristol, UK, September 5-7, 2018, Proceedings, Part II 10. Springer International Publishing, 2018.
[17] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, and I. Molloy, I. "Using probabilistic generative models for ranking risks of android apps", In Proceedings of the 2012 ACM conference on Computer and communications security (pp. 241-252), 2012.
[18] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C. E. R. T. Siemens, "Drebin: Effective and explainable detection of android malware in your pocket", In Ndss, Vol. 14, pp. 23-26, 2014.
[19] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-an and H. Ye, "Significant Permission Identification for Machine-Learning-Based Android Malware Detection", in IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3216-3225, July 2018, doi: 10.1109/TII.2017.2789219.
[20] M. Qiao, A. H. Sung and Q. Liu, "Merging Permission and API Features for Android Malware Detection", 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 2016, pp. 566-571, doi: 10.1109/IIAI-AAI.2016.237.
[21] A. Aydın , İ. A. Doğru and M. Dörterler , "Makine Öğrenmesi Algoritmalarıyla Android Kötücül Yazılım Uygulamalarının Tespiti", Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 22, no. 2, pp. 1087-1094, Aug. 2018, doi:10.19113/sdufbed.20066
[22] A. Güngör , İ. Dogru , N. Barışçı and S. Toklu , "Görüntü tabanlı özelliklerden ve makine öğrenmesi yöntemlerinden faydalanılarak kötücül yazılım tespiti", Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 38, no. 3, pp. 1781-1792, Jan. 2023, doi:10.17341/gazimmfd.994289
[23] A. Utku, İ. A. Doğru and M. A. Akcayol, "Decision tree based android malware detection system", 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 2018, pp. 1-4, doi: 10.1109/SIU.2018.8404151.
[24] Z. Liu, R. Wang, N. Japkowicz, H. M. Gomes, B. Peng and W. Zhang, "SeGDroid: An Android malware detection method based on sensitive function call graph learning", Expert Systems with Applications, 2023, https://doi.org/10.1016/j.eswa.2023.121125
[25] S. Yang, Y. Wang, H. Xu, F. Xu and M. Chen, "An Android Malware Detection and Classification Approach Based on Contrastive Lerning", Computers & Security Volume 123, 2022, https://doi.org/10.1016/j.cose.2022.102915
[26] J. Sahs and L. Khan, "A Machine Learning Approach to Android Malware Detection," 2012 European Intelligence and Security Informatics Conference, Odense, Denmark, 2012, pp. 141-147, 2012 doi: 10.1109/EISIC.2012.34.
[27] Ö. Kiraz and İ. A. Doğru, "Android Kötücül Yazılım Tespit Sistemleri İncelemesi", Düzce Üniversitesi Bilim ve Teknoloji Dergisi, vol. 5, no. 1, pp. 281-298, Jan. 2017.
[28] S. Haykin, "Neural Networks: A Comprehensive Foundation", Prentice- Hall, Ontario, 837s, 1999.
[29] J. Friedman, "Greedy Function Approximation: A Gradient Boosting Machine", The Annals of Statistics, 29(5), 11-28, 2000.
[30] V. N. Vapnik "The Nature of Static Learing Theory", Springer, 314s, 2000. [31] J. VanderPlas, "Python Data Science Handbook Essential Tools for Working with Data", O'Reilly Media, 2016.
[32] G. O. Campos, A. Zimek, J. Sander, R.J.G.B. Campello, B. Micenková, E. Schubert, I. Assent and M.E. Houle, "On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study", Data Mining and Knowledge Discovery, vol. 30, no. 4, pp. 891–927, 2016. [33] L. Breiman, "Random Forests", Statistics Department University of California Berkeley, 1- 33, 2001.
[34] B. J. Erickson and F. Kitamura, “Magician's Corner: 9. Performance Metrics for Machine Learning Models”, Radiology. Artificial intelligence vol. 3, 3e, 2021, doi:10.1148/ryai.2021200126.

There are 31 citations in total.

Details

Primary Language	Turkish
Subjects	Engineering
Journal Section	Articles
Authors	Pınar Tüfekci 0000-0003-4842-2635 Çetin Mutlu Önal 0009-0009-9136-2608
Publication Date	January 26, 2024
Published in Issue	Year 2024 Volume: 12 Issue: 1

Cite

APA	Tüfekci, P., & Önal, Ç. M. (2024). Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. Düzce Üniversitesi Bilim Ve Teknoloji Dergisi, 12(1), 307-319. https://doi.org/10.29130/dubited.1287453
AMA	Tüfekci P, Önal ÇM. Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. DUBİTED. January 2024;12(1):307-319. doi:10.29130/dubited.1287453
Chicago	Tüfekci, Pınar, and Çetin Mutlu Önal. “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”. Düzce Üniversitesi Bilim Ve Teknoloji Dergisi 12, no. 1 (January 2024): 307-19. https://doi.org/10.29130/dubited.1287453.
EndNote	Tüfekci P, Önal ÇM (January 1, 2024) Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. Düzce Üniversitesi Bilim ve Teknoloji Dergisi 12 1 307–319.
IEEE	P. Tüfekci and Ç. M. Önal, “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”, DUBİTED, vol. 12, no. 1, pp. 307–319, 2024, doi: 10.29130/dubited.1287453.
ISNAD	Tüfekci, Pınar - Önal, Çetin Mutlu. “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”. Düzce Üniversitesi Bilim ve Teknoloji Dergisi 12/1 (January 2024), 307-319. https://doi.org/10.29130/dubited.1287453.
JAMA	Tüfekci P, Önal ÇM. Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. DUBİTED. 2024;12:307–319.
MLA	Tüfekci, Pınar and Çetin Mutlu Önal. “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”. Düzce Üniversitesi Bilim Ve Teknoloji Dergisi, vol. 12, no. 1, 2024, pp. 307-19, doi:10.29130/dubited.1287453.
Vancouver	Tüfekci P, Önal ÇM. Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. DUBİTED. 2024;12(1):307-19.

Download Cover Image

Article Files

Full Text