DeepSec Talk 2024: Blackbox Android Malware Detection Using Machine Learning and Evasion Attacks Techniques – Professor Dr. Razvan Bocu

Sanna/ September 18, 2024/ Conference

Over the past ten years, researchers have extensively explored the vulnerability of Android malware detectors to adversarial examples by developing evasion attacks. The feasibility of these attacks in real-world scenarios is debatable, however. Most published papers assume that attackers know the details of the target classifiers used for malware detection. In reality, malicious actors have only limited access to the target classifiers.

This talk presents a problem-space adversarial attack designed to evade blackbox Android malware detectors in real-world scenarios. The proposed approach builds a collection of problem-space transformations derived from benign donors that share opcode-level similarity with malware applications, as determined by an n-gram-based approach. These transformations are then used to make malware instances appear legitimate through an iterative and incremental manipulation strategy.

The presentation will describe a manipulation model based on a query-efficient optimization algorithm, which can identify the required sequences of transformations and apply them to the malware applications. The model has been evaluated on over 1,000 malware applications, and the results show that the approach can generate real-world adversarial examples in both software and hardware-related scenarios. The experiments that we conducted show that the proposed model can trick various malware detectors into classifying malware as legitimate. More precisely, the proposed model achieves evasion rates of 90%–95% against detectors such as DREBIN, Sec-SVM, ADE-MA, MaMaDroid, and Opcode-SVM. The average number of required computational operations lies in the range 1 to 7.

Additionally, it is worth noting that the proposed adversarial attack remains stealthy against the detection engines of three popular commercial antivirus products. The obtained evasion rate is 87%, which further supports the proposed model’s relevance for real-world scenarios.

We asked Dr. Razvan Bocu a few more questions about his talk.

Please tell us the top 5 facts about your talk.

The talk concerns the assessment of techniques that are used to enhance and fine-tune Android malware detectors. We propose a comprehensive and generalized evasion attack, which can bypass black-box Android malware classifiers through a two-step process: (i) preparation and (ii) manipulation.

The first step involves implementing a donor selection technique to create an action set, that is, a collection of problem-space transformations in the form of code snippets known as gadgets. These gadgets are derived by conducting program slicing on publicly available benign apps, known as donors. By injecting a gadget into a malware app, specific payloads from a benign donor can be incorporated into the malware app. The proposed technique uses an n-gram-based similarity method (an n-gram is a sequence of n adjacent symbols in a particular order) to identify suitable donors, particularly benign apps that exhibit similarities to malware apps at the opcode level (an opcode specifies the operation to be executed). Applying transformations derived from these donors to malware apps can make them appear legitimate (benign), or move them towards blind spots of machine learning classifiers. We propose a black-box evasion attack that generates real-world Android adversarial examples (AEs) that adhere to problem-space constraints.
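The donor selection idea can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the talk: apps are reduced to flat lists of opcode names, similarity is measured as the Jaccard overlap of their n-gram sets (the talk only states that the similarity is n-gram-based), and the function names opcode_ngrams, ngram_similarity, and rank_donors are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def opcode_ngrams(opcodes: Sequence[str], n: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of n-grams (tuples of n adjacent opcodes)."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def ngram_similarity(a: Sequence[str], b: Sequence[str], n: int = 3) -> float:
    """Jaccard overlap of two opcode n-gram sets (an assumed metric)."""
    grams_a, grams_b = opcode_ngrams(a, n), opcode_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0  # sequences too short to form any n-gram
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def rank_donors(malware: Sequence[str],
                donors: Dict[str, Sequence[str]],
                n: int = 3,
                top_k: int = 2) -> List[str]:
    """Return the names of the top_k benign donors most similar to the malware."""
    ranked = sorted(donors.items(),
                    key=lambda item: ngram_similarity(malware, item[1], n),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Toy opcode sequences, invented for illustration only.
malware_ops = ["const", "invoke", "move", "return"]
donor_pool = {
    "donor_a": ["const", "invoke", "move", "goto"],
    "donor_b": ["add", "sub", "mul", "div"],
}
best_donors = rank_donors(malware_ops, donor_pool, n=2, top_k=1)
```

In this toy setup the donor that shares most opcode bigrams with the malware sample is ranked first; gadgets would then be sliced from such donors.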

To the best of our knowledge, this is one of the few studies in the Android domain that successfully evades ML-based malware detectors by manipulating the malware samples themselves, without performing feature-space perturbations. We show that this is a query-efficient attack capable of deceiving various black-box ML-based malware detectors with minimal querying.
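A query-limited manipulation loop of this kind can be sketched as a simple random search. This is only an illustration under stated assumptions, not the algorithm presented in the talk: the detector is assumed to return a maliciousness score, an app is modeled as a set of tokens rather than a real APK, injecting a gadget is modeled as a set union, and random_search_evasion and toy_detector are hypothetical names.

```python
import random
from typing import Callable, FrozenSet, Sequence, Tuple

App = FrozenSet[str]


def random_search_evasion(app: App,
                          gadgets: Sequence[App],
                          query: Callable[[App], float],
                          max_queries: int = 10,
                          threshold: float = 0.5,
                          seed: int = 0) -> Tuple[App, int, bool]:
    """Inject randomly chosen gadgets, keeping only those that lower the score.

    Returns the manipulated app, the number of detector queries used,
    and whether the final score fell below the detection threshold.
    """
    rng = random.Random(seed)
    best_app, best_score = app, query(app)
    queries = 1
    while best_score >= threshold and queries < max_queries:
        candidate = best_app | rng.choice(gadgets)  # additive change only
        score = query(candidate)
        queries += 1
        if score < best_score:  # keep the transformation only if it helps
            best_app, best_score = candidate, score
    return best_app, queries, best_score < threshold


SUSPICIOUS = frozenset({"send_sms", "read_contacts"})


def toy_detector(app: App) -> float:
    """Stand-in black box: fraction of tokens that look suspicious."""
    return len(app & SUSPICIOUS) / len(app)


# Toy data, invented for illustration only.
malware_app = frozenset({"send_sms", "read_contacts"})
benign_gadgets = [
    frozenset({"draw_ui", "load_image", "play_sound"}),
    frozenset({"show_map", "sync_clock", "open_help"}),
]
adv_app, used_queries, evaded = random_search_evasion(
    malware_app, benign_gadgets, toy_detector)
```

Because gadgets are only added, the original malicious payload is preserved, which mirrors the problem-space requirement that the manipulated app keeps its functionality.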

We also assess the practicality of the proposed evasion attack under real-world constraints by evaluating how well it deceives popular commercial antivirus products. Specifically, our findings show that the proposed approach may significantly diminish the effectiveness of three popular commercial antivirus products, achieving an average evasion rate of approximately 86%.

How did you come up with it? Was there something like an initial spark that set your mind on creating this talk?

The topic grew out of concrete practical concerns. Although Android is based on a heavily customized Linux kernel, which is relatively immune to malware attacks, certain attack patterns can cause considerable functional damage, up to the total logical destruction of the affected devices. Therefore, the proper optimization of the relevant detection engines is essential.

Why do you think this is an important topic?

The growing sophistication of malware patterns makes it increasingly difficult to calibrate existing detection engines efficiently. As a result, mobile device manufacturers may face substantial problems, while end users may suffer the loss of essential functionality, or even the corruption or total loss of valuable and sensitive personal data. I therefore consider any endeavour that improves the accuracy of the malware detectors involved to be highly relevant.

Is there something you want everybody to know – some good advice for our readers maybe?

Regular users with limited technical skills should always exercise caution when using their mobile devices, both with the online resources they access and with the applications they install, including those from the Play Store. Additionally, technically skilled people should know that the mobile threat landscape is continuously and rapidly evolving. The talk presents a durable and highly adaptable approach, which may be used to properly train malware detectors.

A prediction for the future – what do you think will be the next innovations or future downfalls when it comes to your field of expertise / the topic of your talk in particular?

Although it is a pressing need, the proper calibration of Android malware detectors is still insufficiently addressed, both conceptually and at an empirical, real-world level. Therefore, I predict that this topic will become significantly more attractive to relevant actors from both industry and academia.

Professor Dr. Razvan Bocu received a B.S. degree in computer science, a B.S. degree in sociology, and an M.S. degree in computer science from Transilvania University of Brasov, Romania, in 2005, 2007, and 2006, respectively. He also received a Ph.D. degree from the National University of Ireland, Cork, in 2010. He is a Research and Teaching Staff Member in the Department of Mathematics and Computer Science at the Transilvania University of Brasov. He is the author or coauthor of over 60 technical papers, together with six books and book chapters. Dr. Bocu is an editorial reviewing board member of 28 technical journals in information technology and biotechnology, including prestigious journals such as the Journal of Network and Computer Applications, IEEE Transactions on Dependable and Secure Computing, and the International Journal of Computers Communications & Control. He is also a Research Scientist with Siemens Industry Software, Brasov, Romania. In this capacity, he supervises research projects with strategic business value.
