Large Language Models (LLM) in Cybersecurity

Introduction

Large language models (LLMs) have revolutionized a wide range of fields, from natural language processing to software development. Trained on vast datasets, including code from the internet, LLMs can generate human-like text and code in multiple programming languages based on user inputs. In the context of cybersecurity, LLMs offer both significant potential and substantial risks. On the one hand, they can assist in tasks like assisting developers in writing code, vulnerability repair, and reverse engineering. On the other hand, LLMs introduce new threats, such as the generation of insecure code, susceptibility to data poisoning, and even automated malware creation, empowering attackers with sophisticated tools. This article explores both the benefits and the challenges that LLMs present to cybersecurity.

Vulnerability in LLM Generated Code

Attackers often exploit bugs and vulnerabilities in legitimate programs. Bugs can be any software issues, such as crashes, improper functionality, or buffer overflows. Vulnerabilities, however, refer specifically to security-related bugs, such as crashes or buffer overflows, that attackers can exploit. To make programs resilient against exploitation, both the safety and security of the software must be ensured.

Safety in a program is achieved when it functions as intended under expected conditions. Security, however, requires testing the program with unusual or unexpected inputs to ensure it does not exhibit unintended behavior when faced with inputs outside its specifications. While traditional software validation primarily addresses a program’s safety properties, assessing security resilience requires additional methods. For this purpose, security tools like fuzzers (designed to test with random or malformed inputs) and static analyzers like Github CodeQL (used to detect known vulnerability patterns in code) are widely used.

In recent years, LLM-based code assistants like OpenAI’s Codex and GitHub Copilot have gained popularity for helping developers write code faster and more efficiently. However, the open-source code these models are trained on is largely unvetted and may contain vulnerabilities. Adversaries can also intentionally add vulnerable code snippets to public GitHub repositories to potentially influence LLM training datasets negatively.

Researchers have investigated the prevalence of bugs and vulnerabilities in codes generated by LLMs. In one study, Pearce et al.[1] assessed the security risks in code produced by GitHub Copilot, focusing on scenarios related to high-risk cybersecurity weaknesses listed in MITRE’s top 25 Common Weakness Enumeration (CWE). To evaluate vulnerability, the authors prompted Copilot to generate code within contexts prone to these specific weaknesses. The study found that approximately 40% of the code snippets generated by Copilot contained vulnerabilities. The authors conclude that, while LLM-based code assistants like Copilot can improve productivity, they should be used with caution due to security risks.

While Pearce et al.’s study focused on code entirely generated by LLMs, a comprehensive security evaluation must also consider how developers interact with these tools. Sandoval et al.[2] conducted a user study comparing two groups - one with access to Github Copilot and one without. Their findings revealed that the AI-assisted group produced security-critical bugs at a rate only slightly higher (by 10%) than the non-assisted group, while also producing more functional code overall. Moreover, the study showed that developers, rather than the LLM, introduced the majority of bugs within the AI-assisted group’s code. Since humans perform better with LLM assistance than without it, the continued use of LLM-based code assistants is well justified.

Vulnerability Repair with LLM

Vulnerabilities and bugs are persistent challenges in the software development life cycle. Security tools like fuzzers and static analyzers are widely used to detect and address these issues. However, thier effective integration into the development workflow requires significant expertise and planning. Incorporating these tools into the Continuous Integration and Continuous Delivery (CI/CD) pipeline can be resource-intensive and demands advanced skills.

As codebases evolve, regression testing becomes essential to ensure that previous functionality remains intact. Static analyzer tools like GitHub CodeQL provide faster analysis but primarily detect known CWE patterns, limiting their precision. Fuzzing, on the other hand, tests programs with random inputs but requires substantial resources. Both approaches are complex and demand a high level of skill to implement effectively.

To reduce costs and simplify security workflows, recent research explores using LLMs to automatically repair vulnerabilities and bugs, potentially making secure development more accessible to developers at any skill level. Pearce et al.[3] investigated the capabilities of LLMs in this area and found promising results. Through extensive experiments, they identified optimal model parameters and input prompts, discovering that low temperature setting (which promote more deterministic outputs) produces more predictable and accurate fixes, while higher temperatures yield creative but often flawed solutions. Detailed prompts - including comments and additional code context - significantly improve success rates, particularly for complex issues.

While LLMs show potential in addressing simple, localized vulnerabilities, they are not yet ready for handling complex, real-world cases. A major limitation is the token restriction, which limits LLM’s ability to process the multi-file context often needed for real-world codebases. Another limitation is that LLMs perform only static analysis. Many real-world programs contain vulnerabilities, such as race conditions in multithreaded applications, which can only be identified at runtime. To address these cases, LLMs could be paired with external agents, such as traditional security tools, which provide additional context on runtime behavior and help fix dynamic vulnerabilities.

Reverse Engineering with LLM

Reverse engineering is the process of analyzing a software to understand its functionality, which can be applied for both defensive and malicious purposes. In particular, malware reverse engineering can provide valuable insights for detection and mitigation strategies. This task is human-intensive, relying heavily on expertise and experience. LLMs, trained to recognize functional intent in open-source software through function names, variable names, and code comments, has potential in reverse engineering by providing code explanations.

Recent research has examined whether LLMs can aid software reverse engineering by generating such explanations. Pearce et al.[4] conducted a study on LLM performance in information extraction tasks across various real-world programs, including malware and industrial control systems. A common challenge in reverse engineering is that the target software is often only available in binary format or in an obfuscated state. To simulate this challenge, the authors prepared source code instances using two methods: (1) obfuscating the source code by randomizing variable names, function names, and values, and (2) compiling and then decompiling the source code. They then created a quiz with binary classification and open-ended questions, prompting the LLM with either obfuscated or decompiled code along with quiz questions.

In a zero-shot setting, the LLM correctly answered 53.39% (72,754 out of 136,260) of the questions, suggesting that while LLMs show potential in reverse engineering, they are not yet fully equipped for the task. This is largely due to information loss during compilation, which converts code from a high-level to a low-level abstraction, making accurate decompilation to high-level abstractions difficult.

Possible improvements for LLMs in reverse engineering could involve integrating external information sources, such as running the software in a sandbox to observe behaviors and feeding these into the LLM. Another approach could involve using multiple decompilation tools, as different tools produce varied outputs, they together could provide a more complete context of the target program.

Poisoning attacks against LLMs

Since LLMs are trained on unvetted, open-source code repositories, they are vulnerable to data poisoning attacks. Attackers can exploit this by creating multiple fake GitHub accounts and populating repositories with insecure or malicious code. If malicious code patterns become prevalent in the dataset for a specific context, the LLM may start generating that malicious code when prompted in that same context. This type of attack is similar to the Sybil attack where attackers create multiple fake identities to gain influence.

LLM poisoning is particularly effective because even small modifications, like altering a single variable value, can facilitate many attacks. For instance, if an attacker influences the LLM to generate code that uses ECB encryption mode instead of secure modes like CBC, the resulting ciphertext would be vulnerable to eavesdropping and known plaintext attacks. This is because ECB mode is deterministic, allowing patterns from the original data to be visible in the ciphertext.

Simple poisoning attacks that directly inject malicious code into training dataset are generally ineffective for poisoning LLM training data, as static analysis tools can easily detect them. To evade such detection, Aghakhani et al.[5] propose the COVERT and TROJANPUZZLE methods. The COVERT method hides malicious code within docstrings or comments. However, advanced filters may still detect such patterns. Instead of directly inserting insecure code, the TROJANPUZZLE method exploits LLM’s substitution capabilities and replaces key components with random tokens or placeholders. This approach bypasses both static analysis and advanced filtering since the full malicious payload does not explicitly appear in the dataset.

These types of attacks are extremely difficult to prevent. Aghakhani et al. demonstrate that the fine-pruning approach - removing inactive neurons (those not triggered by clean data) and retraining on clean data - only slightly mitigates the TROJANPUZZLE attack but also degrades the model’s overall performance on coding tasks. This highlights the urgent need to develop resilient methods for training secure code suggestion models or to establish rigorous testing processes for code suggestions before they reach programmers. Current LLM architectures combine both data and control planes, complicating efforts to defend against these attacks. Separating the control and data planes in LLM architectures is an open challenge that may offer a solution to these vulnerabilities.

Automatic Malware Generation with LLM

LLMs are trained on vast amounts of code from publicly accessible repositories. Since these sources are not vetted, both legitimate and malware code can be present in the training data. Consequently, LLMs have the potential to generate malware code. However, ethical filters typically built into LLM applications prevent the generation of such content, blocking malware code generation requests. Despite this, attackers can bypass these filters by applying jailbreaking techniques that exploit prompt engineering.

To automate the generation of functional and effective malware using LLMs, Botacin proposed GPThreats-3 [6], an approach that first creates individual malware building blocks using GPT-3 and then combines these blocks. While these building blocks are not malicious by themselves, they become harmful when assembled in a specific context. For example, an encryption routine could be used by both ransomware and legitimate backup software. Although this method is effective, it is not efficient. The generated building blocks often contain systematic errors that must be corrected before they can be used in the final malware. Additionally, creating a single functional building block requires multiple LLM queries. Each query yields different variations, allowing attackers to combine the functional building blocks into thousands of unique, functional malware samples.

Attackers can distribute these variants to victims, mimicking server-side polymorphism, which enables them to evade detection by constantly altering the malware’s code or characteristics while retaining its core functionality. This technique allows attackers to customize malware distribution based on the victim’s country, operating system, or antivirus tool. GPT-3’s generated malware variations reportedly evade many antivirus tools on the VirusTotal service, indicating that LLMs can produce effective malware.

This approach differs significantly from bootstrapping code with an LLM-based code assistant because it automates the entire process. Consequently, it requires advanced skills to generate the malware building blocks, correct systematic errors in these blocks, and combine them effectively. While less-skilled attackers may struggle to use this method independently, more sophisticated attackers could incorporate it into a malware generation pipeline and offer malware generation as a service to less-skilled attackers.

Conclusion

In conclusion, while many assume that LLMs possess true understanding of natural text and code, they are fundamentally based on statistical patterns. This misconception is a fallacy. However, due to the vast amount of data LLMs are trained on, their statistical outputs are often practical and effective in addressing many cybersecurity-related challenges.

Moreover, LLMs are likely to be integrated into security workflows as intuitive interfaces for complex tasks, such as generating firewall rules. This allows even less skilled users to engage with advanced security tasks, similar to how usable security concepts (generating strong passwords with patterns) make complex tasks more accessible. As LLMs continue to evolve, they will undoubtedly play a crucial role in improving both the usability and effectiveness of cybersecurity tools. LLMs will continue to be utilized in scenarios where they outperform humans or existing tools.

References

[1] Pearce, Hammond, et al. “Asleep at the keyboard? assessing the security of github copilot’s code contributions.” 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 2022. Link

[2] Sandoval, Gustavo, et al. “Lost at c: A user study on the security implications of large language model code assistants.” 32nd USENIX Security Symposium (USENIX Security 23). 2023. Link

[3] Pearce, Hammond, et al. “Examining zero-shot vulnerability repair with large language models.” 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023. Link

[4] Pearce, Hammond, et al. “Pop quiz! can a large language model help with reverse engineering?.” arXiv preprint arXiv:2202.01142 (2022). Link

[5] Aghakhani, Hojjat, et al. “Trojanpuzzle: Covertly poisoning code-suggestion models.” 2024 IEEE Symposium on Security and Privacy (SP). IEEE, 2024. Link

[6] Botacin, Marcus. “Gpthreats-3: Is automatic malware generation a threat?.” 2023 IEEE Security and Privacy Workshops (SPW). IEEE, 2023. Link