Generating End-to-End Adversarial Examples for Malware Classifiers Using Explainability
by
Ishai Rosenberg, Shai Meir, Jonathan Berrebi, Ilay Gordon, Guillaume Sicard, and Eli David
2022
Abstract
In recent years, the topic of explainable machine learning (ML) has been
extensively researched. Up until now, this research has focused on use cases of
regular ML users, such as debugging an ML model. This paper takes a different
stance and shows that adversaries can leverage explainable ML to bypass malware
classifiers that use multiple feature types. Previous adversarial attacks
against such classifiers only add new features rather than modifying existing
ones, in order to avoid harming the functionality of the modified malware
executable. Current attacks use a single algorithm that both selects which
features to modify and modifies them blindly, treating all features the same.
In this paper, we present a different approach. We split the adversarial
example generation task into two parts: first, we find the importance of all
features for a specific sample using explainability algorithms, and then we
perform a feature-specific modification, feature by feature. In order to apply
our attack in black-box scenarios, we introduce the concept of transferability
of explainability: applying explainability algorithms to different classifiers,
using different feature subsets and trained on different datasets, still
results in a similar subset of important features. We conclude that
explainability algorithms can be leveraged by adversaries, and thus advocates
of training more interpretable classifiers should consider the trade-off that
such classifiers are more vulnerable to adversarial attacks.
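
To make the two-stage procedure in the abstract concrete, below is a minimal
Python sketch, not the authors' implementation. It assumes a stand-in
classifier trained on synthetic data, uses a simple occlusion-style per-sample
importance measure in place of the paper's explainability algorithms, and the
MODIFIABLE index set and modify() helper are hypothetical placeholders for
functionality-preserving, feature-specific edits.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Stand-in "malware classifier" trained on synthetic data (class 1 = malicious).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def feature_importance(model, x, baseline=0.0):
    """Per-sample, occlusion-style importance: drop in the malicious-class
    probability when a single feature is ablated (stand-in for an
    explainability algorithm)."""
    p_orig = model.predict_proba(x[None, :])[0, 1]
    scores = []
    for i in range(x.shape[0]):
        x_abl = x.copy()
        x_abl[i] = baseline
        scores.append(p_orig - model.predict_proba(x_abl[None, :])[0, 1])
    return np.argsort(scores)[::-1]  # most important feature first

# Feature-specific modification stage: only features an attacker can change
# without breaking the executable's functionality may be touched (assumption).
MODIFIABLE = {0, 2, 3, 5}

def modify(x, i):
    x = x.copy()
    x[i] += 1.0  # placeholder for a feature-specific, functionality-preserving edit
    return x

def generate_adversarial(model, x, max_changes=4):
    """Greedily modify the most important modifiable features until the sample
    is classified as benign (class 0) or the budget is exhausted."""
    ranked = [j for j in feature_importance(model, x) if j in MODIFIABLE]
    for i in ranked[:max_changes]:
        x = modify(x, i)
        if model.predict(x[None, :])[0] == 0:
            break
    return x

x_malicious = X[y == 1][0]
x_adv = generate_adversarial(clf, x_malicious)
print("original:", clf.predict(x_malicious[None, :])[0],
      "adversarial:", clf.predict(x_adv[None, :])[0])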
Archived Files and Locations
application/pdf, 1.6 MB (arXiv:2009.13243v2): arxiv.org (repository), web.archive.org (webarchive)