ABSTRACT
The widespread occurrence of mobile malware still poses a significant security threat to billions of smartphone users. To counter this threat, several machine learning-based detection systems have been proposed within the last decade. These methods have achieved impressive detection results in many settings without requiring the manual crafting of signatures. Unfortunately, recent research has demonstrated that these systems often suffer significant performance drops over time if the underlying data distribution changes, a phenomenon referred to as concept drift. So far, however, it remains an open question which main factors cause the drift in the data and, in turn, the drop in performance of current detection systems. To address this question, we present a framework for the in-depth analysis of datasets affected by concept drift. The framework makes it possible to gain a better understanding of the root causes of concept drift, a fundamental stepping stone for building robust detection methods. To examine the effectiveness of our framework, we use it to analyze a commonly used dataset for Android malware detection as a first case study. Our analysis yields two key insights into the drift that affects several state-of-the-art methods. First, we find that most of the performance drop can be explained by the rise of two malware families in the dataset. Second, we can determine how the evolution of certain malware families and even goodware samples affects the classifier's performance. Our findings provide a novel perspective on previous evaluations conducted using this dataset and, at the same time, show the potential of the proposed framework to obtain a better understanding of concept drift in mobile malware and related settings.
Index Terms
- Drift Forensics of Malware Classifiers