Hallucination Control: Benefits and Risks of Deploying LLMs as Part of Security Processes
July 29, 2024 | securityboulevard.com

Large language models (LLMs) trained on vast quantities of data can make security operations teams smarter. LLMs provide in-line suggestions and guidance on response, audits, posture management and more. Most security teams are experimenting with or already using LLMs to reduce manual toil in workflows, for both mundane and complex tasks.

For example, an LLM can email an employee to ask whether they intended to share a proprietary document, then process the response and present a recommendation to a security practitioner. An LLM can also be tasked with translating a request to look for supply chain attacks on open source modules into action, spinning up agents focused on specific conditions (new contributors to widely used libraries, improper code patterns), with each agent primed for that specific condition.
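To make the agent idea concrete, here is a minimal, hypothetical Python sketch. The `WatchAgent` class, the `CONDITIONS` table and the stubbed `fake_llm` callable are illustrative assumptions rather than any real product API: each agent wraps the same underlying LLM call with a prompt primed for a single supply chain condition.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical condition-specific "agents": each one wraps the same
# underlying LLM call with a prompt primed for one supply chain signal.
@dataclass
class WatchAgent:
    name: str
    system_prompt: str
    llm: Callable[[str, str], str]  # (system_prompt, event) -> assessment

    def review(self, event: str) -> str:
        return self.llm(self.system_prompt, event)

# Illustrative conditions only; a real deployment would define its own.
CONDITIONS = {
    "new-contributor": "Flag first-time contributors pushing to widely used libraries.",
    "suspicious-pattern": "Flag obfuscated, encoded, or network-calling code in diffs.",
}

def spin_up_agents(llm: Callable[[str, str], str]) -> List[WatchAgent]:
    return [WatchAgent(name, prompt, llm) for name, prompt in CONDITIONS.items()]

if __name__ == "__main__":
    # Stub LLM so the sketch runs without a real model behind it.
    fake_llm = lambda system, event: f"[{system[:30]}...] reviewed: {event}"
    for agent in spin_up_agents(fake_llm):
        print(agent.name, "->", agent.review("a pull request adds a postinstall script to a popular package"))
```

The design point is simply that one shared model can back many narrowly scoped agents, each constrained by its own prompt, rather than one agent trying to watch everything at once.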

That said, these powerful AI systems carry significant risks that are unlike the risks security teams are accustomed to managing. Models powering security LLMs can be compromised through prompt injection or data poisoning. Continuous feedback loops and machine learning algorithms without sufficient human guidance can allow bad actors to probe controls and then induce poorly targeted responses. LLMs are prone to hallucinations, even in limited domains; even the best LLMs make things up when they don’t know the answer.

Security processes and AI policies around LLM use and workflows will become more critical as these systems become more common across cybersecurity operations and research. Making sure those processes are followed, measured and accounted for in governance systems will prove crucial to ensuring that CISOs can provide sufficient governance, risk and compliance (GRC) coverage to meet new mandates like the NIST Cybersecurity Framework 2.0.

The Huge Promise of LLMs in Cybersecurity

CISOs and their teams constantly struggle to keep up with the rising tide of new cyberattacks. According to Qualys, the number of CVEs reported in 2023 hit a new record of 26,447. That’s up more than 5X from 2013.

This challenge has only become more taxing as the attack surface of the average organization grows larger with each passing year. AppSec teams must secure and monitor many more software applications. Cloud computing, APIs, multi-cloud and virtualization technologies have added further complexity. With modern CI/CD tooling and processes, application teams can ship more code, faster and more frequently. Microservices have splintered monolithic apps into numerous APIs and attack surfaces, and punched many more holes in firewalls for communication with external services or customer devices.

Advanced LLMs hold tremendous promise to reduce the workload of cybersecurity teams and to improve their capabilities. AI-powered coding tools have widely penetrated software development: GitHub research found that 92% of developers are using or have used AI tools for code suggestion and completion. Most of these “copilot” tools have some security capabilities. Programmatic disciplines with relatively binary outcomes, such as coding (code will either pass or fail unit tests), are well suited to LLMs. Beyond code scanning for software development and in the CI/CD pipeline, AI could be valuable for cybersecurity teams in several other ways:

  • Enhanced Analysis: LLMs can process massive amounts of security data (logs, alerts, threat intelligence) to identify patterns and correlations invisible to humans. They can do this across languages, around the clock and across numerous dimensions simultaneously. This opens new opportunities for security teams. LLMs can burn down a stack of alerts in near real-time, flagging the ones most likely to be severe (a minimal triage sketch follows this list). Through reinforcement learning, the analysis should improve over time.
  • Automation: LLMs can automate security team tasks that normally require conversational back and forth. For example, when a security team receives an indicator of compromise (IoC) and needs to ask the owner of an endpoint whether they signed into a device or are located outside their normal work zone, the LLM can handle that exchange, follow up with questions as required and provide links or instructions. This used to be an interaction an IT or security team member had to conduct themselves. LLMs can also provide more advanced functionality; for example, Microsoft Copilot for Security can generate incident analysis reports and translate complex malware code into natural language descriptions.
  • Continuous Learning and Tuning: Unlike previous machine learning systems for security policies and comprehension, LLMs can learn on the fly by ingesting human ratings of their responses and by retraining on newer pools of data that may not be contained in internal log files. Using the same underlying foundation model, cybersecurity LLMs can be tuned for different teams and their needs, workflows, or regional or vertical-specific tasks. This also means the entire system can instantly be as smart as the model, with changes propagating quickly across all interfaces.
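As referenced above, here is a minimal sketch of what LLM-assisted alert triage could look like. It assumes a hypothetical `score_with_llm` callable that returns a severity estimate between 0.0 and 1.0; the stubbed scorer exists only so the example runs without a model behind it.

```python
from typing import Callable, Dict, List

# Hypothetical severity-triage helper: sends each alert summary to an LLM
# scorer and returns the queue ordered by predicted severity, highest first.
def triage_alerts(
    alerts: List[Dict[str, str]],
    score_with_llm: Callable[[str], float],  # assumed: returns 0.0-1.0 severity
) -> List[Dict[str, str]]:
    scored = [(score_with_llm(a["summary"]), a) for a in alerts]
    return [alert for _, alert in sorted(scored, key=lambda pair: pair[0], reverse=True)]

if __name__ == "__main__":
    # Stub scorer so the sketch runs; a real deployment would call the model here.
    def fake_score(summary: str) -> float:
        return 0.9 if "credential" in summary.lower() else 0.2

    queue = [
        {"id": "A-1", "summary": "Unusual login location for service account"},
        {"id": "A-2", "summary": "Possible credential dumping on build server"},
    ]
    for alert in triage_alerts(queue, fake_score):
        print(alert["id"], alert["summary"])
```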

Risks of LLMs for Cybersecurity

As a new technology with a short track record, LLMs have serious risks. Worse, understanding the full extent of those risks is challenging because LLM outputs are not 100% predictable or programmatic. For example, LLMs can “hallucinate” and make up answers or answer questions incorrectly, based on imaginary data. Before adopting LLMs for cybersecurity use cases, one must consider potential risks including:

  • Prompt Injection: Attackers can craft malicious prompts specifically to produce misleading or harmful outputs, exploiting the LLM’s tendency to generate content based on whatever prompts it receives. In cybersecurity use cases, prompt injection may be most dangerous as an insider attack, or as an attack by an unauthorized user who uses prompts to permanently alter system outputs by skewing model behavior, generating inaccurate or invalid outputs for other users of the system. (A minimal input-guard sketch follows this list.)
  • Data Poisoning: The training data LLMs rely on can be intentionally corrupted, compromising their decision-making. In cybersecurity settings, where organizations are likely using models trained by tool providers, data poisoning might occur during the tuning of the model for the specific customer and use case. The risk here could be an unauthorized user adding bad data — for example, corrupted log files — to subvert the training process. An authorized user could also do this inadvertently. The result would be LLM outputs based on bad data.
  • Hallucinations: As mentioned previously, LLMs may generate factually incorrect, illogical, or even malicious responses due to misunderstandings of prompts or underlying data flaws. In cybersecurity use cases, hallucinations can result in critical errors that cripple threat intelligence, vulnerability triage and remediation, and more. Because cybersecurity is a mission-critical activity, LLMs must be held to a higher standard of managing and preventing hallucinations in these contexts.
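One partial defense against prompt injection is to keep untrusted text out of the instruction channel and screen it before it reaches the model. The sketch below is a simplified illustration under that assumption; the marker patterns and function names are hypothetical, and pattern matching alone will not stop a determined attacker.

```python
import re

# Hypothetical guard: keep untrusted text (emails, tickets, log lines) in a
# clearly delimited data block and screen it for common injection phrasing
# before it reaches the model's instruction channel.
INJECTION_MARKERS = [
    r"ignore (all|previous) instructions",
    r"you are now",
    r"reveal your system prompt",
]

def looks_like_injection(untrusted_text: str) -> bool:
    return any(re.search(p, untrusted_text, re.IGNORECASE) for p in INJECTION_MARKERS)

def build_prompt(task_instruction: str, untrusted_text: str) -> str:
    if looks_like_injection(untrusted_text):
        raise ValueError("Possible prompt injection; route to human review.")
    # Delimiting untrusted content does not make injection impossible,
    # but it keeps data and instructions visibly separate for auditing.
    return f"{task_instruction}\n\n<untrusted_data>\n{untrusted_text}\n</untrusted_data>"

if __name__ == "__main__":
    print(build_prompt("Summarize this alert for the on-call analyst.",
                       "Failed SSH logins from 10.0.0.5 over the last hour."))
```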

As AI systems become more capable, their information security deployments are expanding rapidly. To be clear, many cybersecurity companies have long used pattern matching and machine learning for dynamic filtering. What is new in the generative AI era are interactive LLMs that provide a layer of intelligence atop existing workflows and pools of data, ideally improving the efficiency and enhancing capabilities of cybersecurity teams. In other words, GenAI can help security engineers do more with less effort and the same resources, yielding better performance and accelerated processes.

Strategies for Mitigating LLM Risks in Cybersecurity

While most CISOs and CIOs have created AI policies, it is no surprise that more extensive due diligence, oversight and governance are required for the use of AI in a cybersecurity context. According to Deloitte’s annual cyberthreat report, 66% of organizations suffered ransomware attacks, IoT malware attacks rose 400%, and in 2023, 91% of organizations had to remediate a supply chain attack affecting the code or systems they used.

That’s because the long-standing cybersecurity practices that worked in the past haven’t caught up to the capabilities and threats presented by large language models. LLMs trained on vast quantities of data can make both security operations teams and the threats they’re trying to mitigate smarter. Because LLMs are different from other security tools, a different set of approaches is required to mitigate their risks. Some involve new security technologies; others are tried-and-true tactics modified for LLMs. These include:

  • Adversarial Training: As part of the fine-tuning or testing process, cybersecurity users should expose LLMs to inputs designed to test their boundaries and induce the LLM to break the rules or behave maliciously. It works best at the training or tuning stage before the system is fully implemented. This can involve generating adversarial examples using techniques such as adding noise, crafting specific misleading prompts, or using known attack patterns to simulate potential threats. That said, CISOs should have their teams (or the vendors) perform adversarial attacks on an ongoing basis to ensure compliance and identify risks or failures.
  • Build in Explainability: In LLMs, explainability is the ability to explain why a specific output was produced. This requires cybersecurity LLM vendors to add a layer of explainability to their LLM-powered tools; research into explaining the deep neural networks that underlie LLMs is still at an early stage. Tellingly, few security LLMs today promise explainability. That’s because reliable explainability is very hard to build, and even the largest, best-resourced LLM makers struggle to do it. The lack of explainability leads logically to the next few mitigation steps.
  • Continuous Monitoring: Putting systems in place to monitor security controls is not novel; asset inventories and security posture management tools attempt this. LLMs are a different case, however: continuous monitoring must detect anomalous or unexpected LLM outputs in real-world use, which is particularly challenging when the outputs are unpredictable and potentially infinite. Large AI providers like OpenAI and Anthropic are deploying dedicated LLMs to monitor their LLMs, a spy to catch a spy, so to speak. In the future, most LLM deployments will come in pairs: one for output and use, the other for monitoring.
  • Human-in-the-Loop: Because LLMs are so novel and potentially risky, organizations should combine LLM suggestions with human expertise for critical decision-making. However, keeping a human in the loop does not completely solve the problem. Research on human decision-making when people are paired with AIs has shown that LLMs which appear more authoritative induce human operators to “take their hands off the wheel” and trust the AI too much. To address this, CISOs and their teams need to create a security process in which LLMs are not trusted or assigned so much responsibility that human operators become dependent on them and unable to spot LLM errors and hallucinations. One approach is to introduce LLMs initially in a “Suggestion Only” mode, where they provide advice and guidance but are not permitted to enact changes, share information or otherwise interact with systems and other people without explicit permission from their human operator (a minimal sketch of such a gate follows this list).
  • Sandboxing and Gradual Deployment: It is crucial to thoroughly test LLMs in isolated environments before live deployment. This is related to adversarial training but different in that the LLM should be test-driven in circumstances nearly identical to real cybersecurity processes and workflows. That testing should even include real attacks based on real-world vulnerabilities and TTPs in play in the field. Most security controls and tools are put through a similar sandbox process, with good reason: cybersecurity is so multifaceted and complex, with organizations deploying dozens of tools, that unexpected interactions and behaviors can emerge.
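As a rough illustration of the “Suggestion Only” mode described above, the following hypothetical Python sketch gates any proposed action behind explicit human approval. The `Suggestion` class and `enact` helper are illustrative assumptions, not a real tool’s API.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical "Suggestion Only" gate: the LLM may propose an action, but
# nothing executes until a named human operator explicitly approves it.
@dataclass
class Suggestion:
    action: str          # e.g. "isolate host BUILD-07"
    rationale: str       # the model's explanation, kept for the audit trail
    approved_by: str = ""

def enact(suggestion: Suggestion, execute: Callable[[str], None], approver: str) -> None:
    if not approver:
        raise PermissionError("Suggestion-only mode: human approval required.")
    suggestion.approved_by = approver  # record who signed off
    execute(suggestion.action)

if __name__ == "__main__":
    s = Suggestion(action="isolate host BUILD-07",
                   rationale="Matches TTPs of known ransomware staging behavior.")
    print("LLM suggests:", s.action, "|", s.rationale)
    enact(s, execute=lambda a: print("Executing:", a), approver="analyst.on.call")
```

The point of the gate is auditability as much as safety: every executed action carries both the model’s rationale and the name of the human who approved it.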

LLMs have introduced a greater risk of the unexpected, so their integration, usage and maintenance protocols should be extensive and closely monitored. Once the CISO is satisfied that the LLM will be safe and effective enough, deployment should be gradual and methodical. A good approach is to deploy the LLM initially for less critical and less complex tasks, then slowly introduce it into the most cognitively challenging workflows and processes, where good judgment is essential.

Wrapping AI Security in Process Management

Successfully integrating AI-driven solutions like LLMs requires a robust process management framework. The application of process mining in cybersecurity provides a comprehensive way to manage and optimize these integrations, ensuring that AI deployments enhance rather than hinder security efforts. Some key considerations for deploying process mining to monitor compliance with proper LLM security practices include:

  • Integration and Orchestration: Successful AI security implementations depend on seamless integration with existing security tools and processes. Gutsy’s process mining technology offers detailed insights into how these tools interact, highlighting inefficiencies and areas for improvement. This ensures that AI solutions work harmoniously within the broader security ecosystem rather than creating isolated silos.
  • Facilitating Gradual Deployment and Sandboxing: Gutsy’s process mining facilitates thorough testing by simulating real-world scenarios and workflows, allowing organizations to identify and address potential security process challenges in a safe environment before deploying LLMs. Tuning Gutsy to capture the right metrics and the right human actions in the LLM security process will ensure better compliance in production.
  • Continuous Monitoring and Improvement: AI systems must be continuously monitored to ensure they operate correctly and adapt to new threats. Gutsy’s real-time event data collection and analysis provide an ongoing understanding of security processes, allowing organizations to quickly identify and rectify any issues. This proactive approach helps maintain the reliability and accuracy of AI-driven security measures.
  • Eliminating Bias and Assumptions: By using actual, observable event data, Gutsy removes human biases and assumptions from the analysis process. This leads to more accurate and reliable insights, which are crucial for fine-tuning AI models and ensuring their outputs are based on factual data. This is particularly important in avoiding the propagation of errors through AI systems.

A New Era Where Cybersecurity Tools Make Mistakes

Given the “black box” nature of LLMs, CISOs cannot expect 100% guarantees that these systems will not make mistakes. All major LLM vendors address this issue clearly in their FAQs and responsible AI disclosures; some will offer to indemnify users against LLM errors. For CISOs looking to leverage the clear power and utility of LLMs, the right mindset is to accept that these systems will make errors and will require a certain amount of additional “babysitting.” If LLMs can sufficiently improve team performance and enable teams to better manage and respond to the rising tide of threats, then a handful of mistakes and hallucinations are, in the aggregate, an acceptable trade-off. For CISOs looking to deploy LLMs to their teams (and their teams are certainly using LLMs somewhere in their workflows already), the key is to construct a security process and set of checks that minimizes the risk of LLM-driven problems and ensures that, on balance, these powerful new assistants deliver better cybersecurity and enhanced capabilities to human operators facing an increasingly daunting array of cyberthreats.


Source: https://securityboulevard.com/2024/07/hallucination-control-benefits-and-risks-of-deploying-llms-as-part-of-security-processes/