Academic Editors: Zbigniew Kotulski, Yuanzhang Li, Yu-an Tan, Chen Liang and Hongfei Zhu
Received: 17 February 2025
Revised: 1 April 2025
Accepted: 4 April 2025
Published: 8 April 2025
Citation: Alhogail, A.; Alharbi, R.A. Effective ML-Based Android Malware Detection and Categorization. Electronics 2025, 14, 1486. https://doi.org/10.3390/electronics14081486
Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Effective ML-Based Android Malware Detection and Categorization
Areej Alhogail 1,* and Rawan Abdulaziz Alharbi 2
1 STCs Artificial Intelligence Chair, Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
2 Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia; [email protected]
* Correspondence: [email protected]
Abstract: The rapid proliferation of malware poses a significant challenge to
digital security, necessitating the development of advanced techniques for malware
detection and categorization. In this study, we investigate Android malware detection
and categorization using a two-step machine learning (ML) framework combined with
feature engineering. The proposed framework first performs binary categorization
to detect malware and then applies multi-class categorization to assign detected
malware to types such as adware, banking Trojans, SMS malware, and riskware. Feature
selection techniques such as chi-squared testing and select-from-model (SFM) were
employed to reduce dimensionality and enhance model performance. Various ML
classifiers were evaluated, and the proposed model achieved an accuracy of
97.82% for malware detection and 96.09% for malware categorization. The proposed
framework outperforms existing approaches, demonstrating that feature engineering
combined with random forest (RF) models is effective while remaining computationally
efficient. This research contributes a robust and interpretable framework for Android
malware detection that is resource-efficient and practical for real-world applications.
It also offers a scalable approach that practitioners can use to deploy efficient
malware detection systems. Future work will focus on real-time implementation and
adaptive methodologies to address evolving malware threats.
Keywords: malware detection; malware categorization; malware classification; cybersecurity;
mobile application; mobile security; machine learning
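As a rough illustration of the two-step idea summarized in the abstract, the following Python sketch scores binary features with a Pearson chi-squared statistic, keeps the top k, and chains a detection stage with a categorization stage. It is not the authors' implementation: the toy data, the function names (chi2_score, select_top_k, two_step_predict), and the rule-based stand-in classifiers are all assumptions made for illustration, and the select-from-model step and the random forest classifiers are not reproduced here.

```python
from collections import Counter
from typing import Callable, List, Sequence


def chi2_score(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Pearson chi-squared statistic for a binary feature vs. a binary label."""
    n = len(labels)
    counts = Counter(zip(feature, labels))  # observed 2x2 contingency table
    score = 0.0
    for f in (0, 1):
        row_total = counts[(f, 0)] + counts[(f, 1)]
        for y in (0, 1):
            col_total = counts[(0, y)] + counts[(1, y)]
            expected = row_total * col_total / n
            if expected > 0:  # skip empty cells (e.g. a constant feature)
                score += (counts[(f, y)] - expected) ** 2 / expected
    return score


def select_top_k(X: Sequence[Sequence[int]], y: Sequence[int], k: int) -> List[int]:
    """Return the indices of the k features with the highest chi-squared score."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def two_step_predict(
    sample: Sequence[int],
    is_malware: Callable[[Sequence[int]], bool],
    categorize: Callable[[Sequence[int]], str],
) -> str:
    """Step 1: binary detection. Step 2: categorize only flagged samples."""
    if not is_malware(sample):
        return "benign"
    return categorize(sample)


# Hypothetical toy data: rows are apps, columns are binary features
# (for example, whether a given permission is requested).
X = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = malware, 0 = benign

selected = select_top_k(X, y, k=2)


# Hypothetical stand-ins for trained classifiers; a real system would
# plug in fitted models here instead of hand-written rules.
def toy_detector(sample: Sequence[int]) -> bool:
    return sample[0] == 1


def toy_categorizer(sample: Sequence[int]) -> str:
    return "adware" if sample[2] == 1 else "riskware"


print(selected)
print(two_step_predict([0, 1, 1], toy_detector, toy_categorizer))
print(two_step_predict([1, 1, 1], toy_detector, toy_categorizer))
```

In this sketch the constant second feature receives a score of zero and is dropped, which is the kind of dimensionality reduction that chi-squared filtering is meant to provide. The categorizer is only consulted for samples that the detector flags, mirroring the detect-then-categorize order of the framework.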
1. Introduction
As cybersecurity threats and the measures taken to address them have evolved,
malware has remained one of the most alarmingly effective threats against
digital systems globally. “Malware” is an umbrella term for malicious software:
harmful programs designed to gain access to, intercept, or impair the
functionality of computer systems and networks. The growth in the number and
complexity of malware attacks calls for effective and reliable techniques for malware
detection and categorization, and accurate categorization in particular plays a
crucial role. According to the Official Cybercrime Report [1], the global
economic losses from cybercrime due to malware abuse were estimated to have reached
$8 trillion in 2023. Malware attacks cause costly losses through fraud and threaten
personal, national, and infrastructural security.
In today’s rapidly evolving digital landscape, malware is increasingly being
crafted with sophisticated techniques that allow it to bypass traditional detection