Drift Forensics of Malware Classifiers

Authors:
Theo Chow, King's College London, London, United Kingdom (ORCID: 0009-0003-2125-8828)
Zeliang Kan, King's College London & University College London, London, United Kingdom (ORCID: 0009-0007-4740-1134)
Lorenz Linhardt, TU Berlin & BIFOLD, Berlin, Germany (ORCID: 0000-0002-5533-5524)
Lorenzo Cavallaro, University College London, London, United Kingdom (ORCID: 0000-0002-3878-2680)
Daniel Arp, TU Berlin, Berlin, Germany (ORCID: 0000-0003-3628-794X)
Fabio Pierazzi, King's College London, London, United Kingdom (ORCID: 0000-0002-1254-1758)

AISec '23: Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security, November 2023, Pages 197–207. https://doi.org/10.1145/3605764.3623918

Published: 26 November 2023

ABSTRACT

Mobile malware remains a significant security threat to billions of smartphone users. To counter this threat, several machine learning-based detection systems have been proposed over the last decade. These methods achieve impressive detection results in many settings without requiring manually crafted signatures. Unfortunately, recent research has demonstrated that these systems often suffer significant performance drops over time when the underlying data distribution changes, a phenomenon referred to as concept drift. It remains an open question, however, which factors are the main causes of the drift in the data and, in turn, of the drop in performance of current detection systems. To address this question, we present a framework for the in-depth analysis of datasets affected by concept drift. The framework helps to better understand the root causes of concept drift, a fundamental stepping stone for building robust detection methods. To examine the effectiveness of our framework, we use it to analyze a commonly used dataset for Android malware detection as a first case study. Our analysis yields two key insights into the drift that affects several state-of-the-art methods. First, we find that most of the performance drop can be explained by the rise of two malware families in the dataset. Second, we can determine how the evolution of certain malware families and even goodware samples affects the classifier's performance. Our findings provide a novel perspective on previous evaluations conducted using this dataset and, at the same time, show the potential of the proposed framework to obtain a better understanding of concept drift in mobile malware and related settings.
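As a rough illustration of the kind of question the abstract raises (how much of a performance drop over time can be attributed to particular malware families), the following minimal Python sketch computes the F1 score per time period, once on all samples and once with a chosen family excluded. This is not the authors' framework, which the abstract does not describe in detail. The record layout, the family names `famA` and `famB`, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def f1_score(pairs):
    """F1 of the malware class (label 1) from (y_true, y_pred) pairs."""
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_by_period(records, exclude_families=()):
    """Compute F1 per time period, optionally skipping some families.

    Each record is (period, family, y_true, y_pred); family is None
    for goodware. This layout is an assumption made for the sketch.
    """
    buckets = defaultdict(list)
    for period, family, y_true, y_pred in records:
        if family in exclude_families:
            continue
        buckets[period].append((y_true, y_pred))
    return {period: f1_score(pairs) for period, pairs in sorted(buckets.items())}


# Hypothetical toy data: famB appears only in the second month
# and is missed by the (imaginary) classifier.
records = [
    ("2015-01", "famA", 1, 1),
    ("2015-01", "famA", 1, 1),
    ("2015-01", None, 0, 0),
    ("2015-02", "famA", 1, 1),
    ("2015-02", "famB", 1, 0),
    ("2015-02", "famB", 1, 0),
    ("2015-02", None, 0, 0),
]

overall = f1_by_period(records)
without_b = f1_by_period(records, exclude_families={"famB"})
print(overall)
print(without_b)
```

Comparing the two dictionaries period by period gives a crude estimate of how much of the drop coincides with the excluded family. A real analysis would also need to account for class balance and the time-aware train/test split.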

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Reasoning about belief and knowledge
2. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Malware and its mitigation

Published in
AISec '23: Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security
November 2023
252 pages
ISBN: 9798400702600
DOI: 10.1145/3605764
Program Chairs:
Maura Pintor, University of Cagliari, Italy
Xinyun Chen, Google Brain, USA
Florian Tramèr, ETH Zürich, Switzerland
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
concept shift
explainable ai
machine learning
malware
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 94 of 231 submissions, 41%