Adversarial Malware Attack and Defense

Machine learning (ML)-based malware detectors are widely used in antivirus (AV) systems because of their speed and scalability in detecting malware. However, malware developers can exploit weaknesses in these models by creating adversarial samples - malware that has been subtly altered to evade detection. This type of attack is known as an adversarial attack. In this article, we’ll examine the strategies attackers use to evade detection, explore methods for automating adversarial sample generation, and discuss potential solutions for strengthening malware detectors against such attacks.

Destroying Advanced ML-Based Malware Detectors: A Competition Series

Several well-known technology and security companies (Endgame, CUJO AI, Microsoft, MRG Effitas, VMRay, and Nvidia) organized a public contest, the Machine Learning Security Evasion Competition (MLSEC), challenging participants to craft adversarial attacks against state-of-the-art ML-based malware detectors. Over the course of three years, Ceschin et al. [1,2,3] evaded every model with adversarial attacks, exposing how vulnerable ML-based malware detection systems are to adversarial threats.

Competition 2019

In the first year, the competition used three machine learning models for malware detection: MalConv, Non-Negative MalConv, and LightGBM. Both MalConv and Non-Negative MalConv are deep neural network (DNN) models that take the raw byte content of a file as input to determine its maliciousness. While MalConv weighs both malicious and benign indicators, Non-Negative MalConv looks only for malicious evidence. The third model, LightGBM, is a gradient-boosted decision tree model that relies on a wide range of static features - PE (Portable Executable) headers, file size, imported libraries, and other characteristics of the binary - to classify it. The winning team employed two adversarial techniques to bypass these malware detection models.
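
To make the byte-level models concrete, here is a minimal PyTorch sketch of a MalConv-style classifier, assuming the commonly published design (byte embedding, gated 1D convolution, global max pooling). The hyperparameters are illustrative, and this is not the exact competition model. Because the score is max-pooled over every byte position, anything appended to the file feeds directly into the features - which is what the first attack described below exploits.

```python
import torch
import torch.nn as nn

class MalConvLike(nn.Module):
    """Simplified MalConv-style classifier over raw bytes (illustrative sketch)."""

    def __init__(self, embed_dim=8, channels=128, kernel_size=500, stride=500):
        super().__init__()
        # 256 byte values shifted by +1, with 0 reserved for padding
        self.embed = nn.Embedding(257, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, channels, kernel_size, stride=stride)
        self.gate = nn.Conv1d(embed_dim, channels, kernel_size, stride=stride)
        self.fc = nn.Sequential(nn.Linear(channels, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x):                  # x: (batch, length) byte IDs in [0, 256]
        e = self.embed(x).transpose(1, 2)  # -> (batch, embed_dim, length)
        h = self.conv(e) * torch.sigmoid(self.gate(e))  # gated convolution
        h = torch.max(h, dim=2).values     # global max pooling over byte positions
        return torch.sigmoid(self.fc(h))   # maliciousness score in [0, 1]
```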

The first strategy was to bias the malware detectors by appending extra random bytes to the malware binaries. This worked well against MalConv, but Non-Negative MalConv and LightGBM were resistant to it. Since DNN-based models like MalConv treat each byte of the raw binary as a feature, adding a large number of random bytes makes most input features look benign, leading MalConv to misclassify the malware as benign. Non-Negative MalConv, which looks only for malicious signs, and LightGBM, which does not treat every byte as a feature, remained robust to this method. The authors also tried appending goodware strings instead of random bytes, and surprisingly, this tactic bypassed all three models. It was later discovered that Non-Negative MalConv had a bug that allowed evasion through goodware strings, highlighting the critical need for thorough testing of such solutions. Although appending data alters the binary, it remains executable because the OS loader parses the header and ignores the appended bytes. Enforcing stricter loading policies that block tampered binaries from running could help mitigate data-appending attacks, underscoring that the operating system also has a role in defending against them.
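
Since the appended bytes sit past the end of the sections declared in the PE header (the overlay), a plain file append is enough to illustrate the attack. The sketch below harvests printable strings from a benign binary and appends them to a sample; all file names and the string-extraction heuristic are assumptions, not the competition code.

```python
import re
from pathlib import Path

def extract_strings(path: Path, min_len: int = 6) -> bytes:
    """Collect printable ASCII strings from a benign binary (like the `strings` tool)."""
    data = path.read_bytes()
    return b"\n".join(re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data))

def append_goodware_strings(malware: Path, goodware: Path, out: Path) -> None:
    """Write a copy of `malware` with goodware strings appended as overlay data.

    The PE header is untouched, so the OS loader ignores the appended bytes
    and the sample keeps running exactly as before.
    """
    out.write_bytes(malware.read_bytes() + extract_strings(goodware))

# Example with hypothetical paths:
# append_goodware_strings(Path("sample.exe"), Path("calc.exe"), Path("sample_adv.exe"))
```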

The second approach focused on hiding the malicious code. The authors used the UPX packer to obscure the PE structure and evade detection: UPX compresses the original PE and stores it inside the sections of a small stub executable that unpacks it at runtime. This strategy was unsuccessful, however, because the models were biased against UPX, classifying any file packed with it as malicious. Since UPX is a widely used tool, many malware samples in the training sets were likely packed with it, which explains the bias. The team also explored using droppers to embed the malware into a new file, but traditional droppers generated large binaries that exceeded the competition’s size limit. To address this, they developed a custom dropper that embedded the original malware as a PE binary resource at compile time and extracted it at runtime. Ultimately, the authors automated malware generation by combining both methods - hiding the malware with the dropper and biasing the detectors by appending goodware data. This combined approach bypassed all models for every sample.
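
For reference, the packing step itself is a one-liner around the UPX command-line tool - this is exactly the transformation the models had learned to flag. The sketch assumes `upx` is installed and on PATH, and the paths are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

def pack_with_upx(src: Path, dst: Path, level: str = "-9") -> None:
    """Compress a copy of the binary with UPX (requires the `upx` tool on PATH).

    In the competition this backfired: the target models treated UPX-packed
    files as malicious, so packed samples scored worse, not better.
    """
    shutil.copyfile(src, dst)
    subprocess.run(["upx", level, str(dst)], check=True)

# Example with hypothetical paths:
# pack_with_upx(Path("sample.exe"), Path("sample_packed.exe"))
```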

Competition 2020

In the second year, the organizers invited both defense and attack strategies. The team’s first-place defense model was based on the Random Forest algorithm, leveraging categorical, textual, and numerical attributes extracted from PE binaries. Since adversarial samples can transfer across models and only black-box access was available to the competing models, the team focused on attacking their own model. The previous year’s winning attack strategy no longer worked for every sample, because the defense model could detect the presence of the FindResource API call used by the dropper. To overcome this, the team developed a dropper that mimicked a benign application (Calculator), which also uses the FindResource API. By imitating Calculator, the dropper bypassed their own model. However, this strategy failed against another competition model, likely because it detected part of the embedded payload. To hide the payload, the team employed two techniques - encoding the malware binary as a base64 string and XORing it with a key. This approach ultimately bypassed all the competing models for all samples.
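
The payload-hiding step lends itself to a short sketch: XOR the malware bytes with a repeating key, base64-encode the result for embedding in the dropper, and reverse both steps at runtime. The function names and key below are hypothetical.

```python
import base64
from itertools import cycle

def encode_payload(payload: bytes, key: bytes) -> bytes:
    """XOR the payload with a repeating key, then base64-encode it for embedding."""
    xored = bytes(b ^ k for b, k in zip(payload, cycle(key)))
    return base64.b64encode(xored)

def decode_payload(blob: bytes, key: bytes) -> bytes:
    """Reverse the encoding at runtime: base64-decode, then XOR with the same key."""
    xored = base64.b64decode(blob)
    return bytes(b ^ k for b, k in zip(xored, cycle(key)))

# Round-trip example with a hypothetical key:
# blob = encode_payload(open("payload.bin", "rb").read(), b"s3cr3t")
# assert decode_payload(blob, b"s3cr3t") == open("payload.bin", "rb").read()
```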

Competition 2021

In the third year, the organizers introduced a new rule for attack solutions, prohibiting dropping files to the filesystem. This aligns with a real-world threat model: even if the dropper evades detection, conventional antivirus tools can detect the dropped executable. To work within this restriction, the team extracted the embedded payload directly in memory at runtime and injected it into a legitimate process, a technique known as process hollowing.

The winning team of these three competitions showed that such adversarial attacks also affect the ML models used by real antivirus solutions.

Adversarial Malware Generation Strategies

Researchers have explored various methods for automating large-scale adversarial malware generation in both whitebox and blackbox settings.

In a whitebox setting, the attacker has access to the model’s gradients, allowing them to evaluate whether the attack is progressing effectively. Generative Adversarial Networks (GANs) have emerged as a popular technique for generating synthetic data that closely mimics real samples. The recent success of GANs in producing highly realistic images (such as those on thispersondoesnotexist.com) has inspired cybersecurity researchers to explore GANs for generating adversarial malware in the whitebox setting. Wang et al. [5] propose the Mal-LSGAN model to generate effective adversarial malware samples. Mal-LSGAN couples two neural networks - a Generator that creates new adversarial samples and a Discriminator that judges whether a generated sample is malicious. With each iteration, both networks improve, eventually reaching a point where their performance converges. However, this approach operates on feature vectors rather than binary data, meaning the generated feature vectors do not correspond to actual executable malware. Mapping these feature vectors back to executable binaries remains an active area of research.
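
The sketch below shows a minimal generator/discriminator pair over feature vectors in PyTorch - a generic GAN in the spirit of Mal-LSGAN rather than the authors’ implementation. The feature dimension, noise size, and layer widths are assumptions.

```python
import torch
import torch.nn as nn

FEATURE_DIM, NOISE_DIM = 2381, 64   # assumed sizes (e.g., EMBER-like features)

class Generator(nn.Module):
    """Maps a malware feature vector plus noise to an adversarial feature vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM + NOISE_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, FEATURE_DIM), nn.Sigmoid())

    def forward(self, features, noise):
        perturbed = self.net(torch.cat([features, noise], dim=1))
        # Only allow feature additions, so the original functionality is notionally kept.
        return torch.maximum(features, perturbed)

class Discriminator(nn.Module):
    """Substitute detector scoring feature vectors as malicious (high) or benign (low)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1))

    def forward(self, features):
        return self.net(features)
```

Training alternates between the two networks: the generator is rewarded when the discriminator scores its output as benign (the LSGAN variant uses a least-squares loss for this step).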

In contrast, a blackbox setting provides only limited access to the model, and the attacker does not know the model’s confidence levels. Instead, blackbox models return either hard labels (benign or malicious) or soft labels (a score indicating the binary’s level of maliciousness). Blackbox attacks typically rely on random transformations of the input malware, which can be inefficient. One strategy to improve them is to transfer adversarial samples generated in a whitebox setting to the blackbox model. Another approach uses genetic algorithms to perturb the input malware. Demetrio et al. [4] introduced a genetic algorithm-based method called GAMMA (Genetic Adversarial Machine learning Malware Attack), which automatically generates a large number of malware samples through functionality-preserving manipulations while minimizing the number of queries and the payload size. Instead of random manipulations, it uses bytes taken from goodware samples.
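
A heavily simplified genetic loop in the spirit of GAMMA might look like the following sketch (not the authors’ implementation): each candidate encodes which goodware byte chunks to append, fitness combines the black-box detector’s score with a penalty on the injected size, and standard selection, crossover, and mutation evolve the population. `score_sample` and `goodware_chunks` are hypothetical inputs.

```python
import random

def evolve(malware: bytes, goodware_chunks: list, score_sample,
           pop_size: int = 20, generations: int = 50, size_penalty: float = 1e-7):
    """Black-box search for a low-scoring variant built by appending goodware chunks.

    `score_sample(sample: bytes) -> float` is the only access to the detector
    (a soft label). Fitness = detector score + size_penalty * appended bytes,
    lower is better. Assumes `goodware_chunks` holds several byte chunks.
    """
    def build(genome):                        # genome: one on/off flag per chunk
        extra = b"".join(c for c, keep in zip(goodware_chunks, genome) if keep)
        return malware + extra

    def fitness(genome):
        sample = build(genome)
        return score_sample(sample) + size_penalty * (len(sample) - len(malware))

    population = [[random.random() < 0.5 for _ in goodware_chunks]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)                      # best (lowest) first
        parents = population[: pop_size // 2]             # selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(a))             # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.1:                     # occasional mutation
                i = random.randrange(len(child))
                child[i] = not child[i]
            children.append(child)
        population = parents + children
    return build(min(population, key=fitness))
```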

One might wonder whether these advanced adversarial malware generation strategies matter in the real world. While it’s true that most attackers lack the technical expertise to develop such sophisticated tactics on their own, they don’t need that level of skill. A vast underground market makes a wide range of advanced attack tools readily available even to less-skilled adversaries. For example, attackers can rent botnets of thousands of compromised machines without needing the expertise to build or manage them. In the same way, ordinary adversaries can obtain sophisticated adversarial malware generation tools in these markets, allowing them to bypass modern defenses without understanding the underlying techniques.

Defending Against Adversarial Attacks

The success of adversarial attacks on ML-based malware detection methods highlights the critical need for robustness in these systems. Simply achieving high accuracy during evaluation is not enough; robustness against adversarial attacks is the key metric that defines the effectiveness of a solution. We have seen that adversarial approaches can bias the model or hide the malicious content to evade detection. Researchers have explored the same two ideas - biasing and hiding - to harden defense models against adversarial attacks.

To bias a defense model towards detecting adversarial binaries, we can train it on a large number of automatically generated adversarial samples. Lucas et al. [6] demonstrate adversarial training methods that improve the robustness of raw-binary malware classifiers such as MalConv against advanced adversarial attacks. The authors explored three strategies for generating adversarial samples: the In-Place Replacement approach swaps instructions for equivalent ones, the Displacement approach moves instructions within the binary using jumps to preserve the execution order, and the Kreuk (appending) approach adds bytes to the end of the executable. Of these, the Kreuk approach is the least computationally expensive, while the other two require more processing power; however, models trained with Kreuk samples are slightly less robust than those trained with the other two. Despite the tradeoff between training cost and robustness, all three approaches have shown promising results in defending against adversarial attacks.
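
Conceptually, adversarial training just folds freshly perturbed binaries into every training batch. Below is a minimal PyTorch-style sketch, assuming a hypothetical `perturb` function that applies one of the three attacks above and a classifier that outputs a maliciousness probability; it is an illustration of the idea, not the authors’ training code.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, bytes_batch, labels, perturb):
    """One training step on a mix of clean and adversarially perturbed binaries.

    `perturb(bytes_batch, labels, model)` is assumed to apply an attack such as
    in-place replacement, displacement, or byte appending to the inputs.
    """
    model.train()
    adv_batch = perturb(bytes_batch, labels, model)      # generate adversarial variants
    inputs = torch.cat([bytes_batch, adv_batch])         # train on clean + adversarial
    targets = torch.cat([labels, labels]).float()

    optimizer.zero_grad()
    scores = model(inputs).squeeze(1)                    # maliciousness scores in [0, 1]
    loss = F.binary_cross_entropy(scores, targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```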

Analogous to the attackers’ hiding strategy, researchers use moving target defense (MTD) to increase complexity and uncertainty for attackers by shifting the attack surface. One variant randomly selects a defense model to answer each attacker query; this randomness makes the defense unpredictable and keeps attackers from knowing whether their strategy is succeeding. Rather than relying on random selection, Chhabra et al. [7] propose a reinforcement learning (RL) based method that dynamically chooses the most effective defense strategy by learning from the adversarial attacks it encounters. Both the random-selection and RL-based MTD approaches have shown promising results in defending against adversarial attacks.
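
Here is a minimal sketch of the random-selection variant, assuming a pool of pre-trained detectors exposed as callables: each query is answered by a randomly chosen model, so repeated probing never reveals a single stable decision boundary.

```python
import random

class MovingTargetDetector:
    """Answer each query with a randomly chosen detector from a pool (random MTD)."""

    def __init__(self, detectors):
        # `detectors` is a list of callables: detector(sample_bytes) -> score in [0, 1]
        self.detectors = detectors

    def predict(self, sample: bytes, threshold: float = 0.5) -> bool:
        detector = random.choice(self.detectors)   # a new model for every query
        return detector(sample) >= threshold       # True means "malicious"

# An RL-based variant would replace `random.choice` with a learned policy that
# picks the detector expected to be most robust against the observed queries.
```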

References

[1] Ceschin, Fabrício, et al. “Shallow security: On the creation of adversarial variants to evade machine learning-based malware detectors.” Proceedings of the 3rd Reversing and Offensive-oriented Trends Symposium. 2019. Link

[2] Ceschin, Fabricio, et al. “No need to teach new tricks to old malware: Winning an evasion challenge with XOR-based adversarial samples.” Reversing and Offensive-oriented Trends Symposium. 2020. Link

[3] Ceschin, Fabricio, et al. “Adversarial Machine Learning, Malware Detection, and the 2021’s MLSEC Competition.” 2021. Link

[4] Demetrio, Luca, et al. “Functionality-preserving black-box optimization of adversarial windows malware.” IEEE Transactions on Information Forensics and Security 16 (2021): 3469-3478. Link

[5] Wang, Jianhua, et al. “Mal-LSGAN: An Effective Adversarial Malware Example Generation Model.” 2021 IEEE Global Communications Conference (GLOBECOM). IEEE, 2021. Link

[6] Lucas, Keane, et al. “Adversarial training for raw-binary malware classifiers.” 32nd USENIX Security Symposium (USENIX Security 23). 2023. Link

[7] Chhabra, Anshuman, and Prasant Mohapatra. “Moving Target Defense against Adversarial Machine Learning.” Proceedings of the 8th ACM Workshop on Moving Target Defense. 2021. Link