Sandfly's Agentless Linux EDR + AI Are a Powerful Combo

Sandfly's Agentless Linux EDR + AI Are a Powerful Combo
Sandfly BlogAt Sandfly, we love the potential of Artificial Intelligence. In fact, we have actively 2026-7-2 22:54:19 Author: sandflysecurity.com(查看原文) 阅读量:4 收藏

Sandfly Blog

At Sandfly, we love the potential of Artificial Intelligence. In fact, we have actively used and integrated Large Language Models (LLMs) for Linux forensic analysis and threat investigation since Sandfly 5.5 was released last year.

LLMs are unparalleled at connecting disparate dots, summarizing complex system behaviors, and helping security analysts understand potential compromises at lightning speed. More than this, LLMs help security teams understand the arcane and often confusing world of Linux attacks and compromise.

So when customers bring up AI, we make it clear that the future of Linux security isn't EDR versus AI, it is using EDR to make AI more useful, accurate, and safe. In this post we want to discuss how Sandfly is leveraging AI and discuss the reasons why AI needs a solid detection and response foundation to perform best.

Very Confident Wrong Answers

While integrating AI into Sandfly, we've found that there are places where it can be leveraged extremely well, and other areas where it may cause trouble. Our overriding conclusion is that AI systems cannot do anything without accurate, structured data. Like virtually all computer science problems: Garbage in, garbage out.

More than this, LLMs can give very confident wrong answers, especially when you let them speculate on data that has big gaps or is inaccurate. Data accuracy and completeness are key to allowing LLMs to perform at their best. If you shovel a bunch of random data into them, they may output wrong information or waste time and money going down rabbit holes that don't lead to the actual problem.

AI May Not Have Repeatable Results

LLMs are largely non-deterministic, meaning the answers you get from them for the same question may vary, often significantly, each time you ask.

If you let an AI investigate a raw Linux host, for example, without a structured data layer guiding it, it may take a different path every single time. It might find a problem in session A, but completely miss critical attack indicators during session B. Customers need repeatability and assurance. By using Sandfly to gather a standardized and predictable baseline of data first, the LLM is given a reliable map for thorough analysis. It also assures that performance and stability metrics are maintained across hosts because Sandfly runs identically each time, rather than acting randomly during an investigation.

Guarding Forensic Soundness

When human investigators or automated tools look at a system, maintaining forensic soundness is critical. For instance, Sandfly’s collection engine is explicitly designed not to modify file access times (MAC times) when inspecting data. If an LLM is given direct access to run standard system binaries across a machine, it can inadvertently alter these access times and pollute the forensic environment in other ways. Feeding an LLM safely collected, forensically sound data keeps the host intact.

Stealth Rootkits and Compromised Systems

If an AI agent directly executes standard Linux commands like ps, ls, or top on a compromised host, it relies on the integrity of those binaries. If the machine is infected with a stealth kernel rootkit, those binaries may simply lie to the AI. LLMs cannot easily decloak hidden processes, compromised binaries, or cloaked files on their own.

On the other hand, Sandfly has specialized forensic engines designed to bypass standard OS layers, uncover the hidden truth, and hand that verified data to the LLM to build the full picture. Stealth rootkits are easily decloaked this way, as opposed to relying on potentially compromised systems to misrepresent their data. Sandfly is purpose-built to go into hostile environments to get answers.

A recent example is the Scales eBPF rootkit that deployed sophisticated hiding along with multiple persistence mechanisms to steal credentials. This rootkit was distributed through a supply-chain attack affecting over 1,500 Linux packages. Sandfly easily detected and decloaked this rootkit, and the alerts passed into an LLM give clear and consistent results as seen below.

Logs vs. Active Threat Hunting

We have hundreds of thousands of pieces of Linux malware in our research library, and we hate to break it to you, but almost all of it goes to great lengths not to show up in any logs. So, if you're only looking at Linux logs for attacks, even with AI, you are likely to be disappointed. Even worse, attackers also disable auditing systems, blind EDR agents, perform low observability system operations, and use living-off-the-land tactics to appear normal. Often, they are doing all of the above!

As a result, taking a bunch of Linux logs and dumping them into an AI system is likely to miss significant malicious activity. In our view, serious malware and threat actors need active hunting to find them. Sandfly moves around Linux networks actively looking for trouble, not just waiting for something to happen. The data we gather is deliberate, focused, and not grabbed by chance.

Active data gathering enables accurate, efficient analysis rather than forcing an LLM to guess on incomplete logs or passively gathered telemetry. If you remember our earlier point, LLMs get into trouble when you allow them to guess. Active threat hunting grabs high-fidelity data that an LLM can accurately interpret immediately.

Bridging the Gap on Security Hygiene

While LLMs excel at analyzing text and patterns, they generally cannot efficiently hunt for things like SSH key compromises, bad passwords that lead to access, or configuration drift across a massive fleet on their own. They need this data gathered and structured to make them useful. These non-EDR components are incredibly valuable for Linux security. Sandfly can efficiently gather these security artifacts, giving the AI the exact context it needs to evaluate risk and impacts.

Again as an efficiency layer, a system like Sandfly tells the LLM: "This admin account has a bad password," or "This SSH key was banned but is on this host." There is no ambiguity about the problem, unlike trying to read intentions from spurious logs and authentication messages.

Ensuring Safety and Preventing Prompt Injection

Safety is the most significant architectural concern when dealing with AI and cybersecurity. Giving an LLM direct access to an infrastructure endpoint, for instance, introduces massive risks. If a machine is compromised, a malicious actor can use prompt injection (like naming a file or process with malicious instructions) to hijack the AI's execution capabilities. Not just this, but the LLM may decide to initiate actions directly on a host despite safety protocols, and these may be destructive. The LLM must be isolated from the investigation mechanism so it cannot be directly targeted, and if targeted, it cannot be made to do something destructive or cause further compromise.

Sandfly provides safety, as the LLM never gets direct access to a host, nor can it make Sandfly run unauthorized commands.

Isolation Layer Architecture for Safety

Sandfly uses an isolation layer philosophy with AI to ensure safety and performance. You can leverage AI for analysis, but Sandfly collects the data reliably, accurately, and safely.

Instead of choosing between manual legacy triage or giving an AI unconstrained access, the ideal architecture separates collection from judgment:

Purpose-built collection: An agentless, read-only collector gathers a forensically sound data from the host in a safe way, even if the host is compromised. Sandfly is built to hunt for attackers and rootkits with specialized tools to deal with these threats.
Off-host AI analysis: The LLM analyzes Sandfly data in an isolated environment but cannot execute actions directly on the host. Sandfly sits between the LLM and the host to gather data in a safe and controlled way.
Enrich context as needed: If the LLM needs more context, it can read or request additional forensic data that has been safely gathered by Sandfly. This provides the data LLMs need to narrow down the problem and make a conclusion without ever acting directly on the target host.

This approach keeps the AI agent in a constrained environment with solid data, and it works strongly for Linux attack analysis. Attacks directed against the LLM are not a risk, as the LLM must act through a read-only intermediary like Sandfly to do the actual legwork on the system. The LLM is also constrained against accidental actions it may try on a host system. Malicious commands cannot be run, and data integrity is preserved.

Bring Your Own Model (BYOM)

It’s tempting for security vendors to claim their problem is so unique it requires a proprietary AI model. However, this just forces customers to manage a fragmented, expensive portfolio of AI tools.

Our research showed that building a custom model is completely unnecessary for our problem space of Linux security. Because leading frontier models have already trained on the internet's collective knowledge of Linux, malware, and exploits, they all perform exceptionally well when fed Sandfly's precise data. Furthermore, open-weight models running on-premise (even on consumer-grade GPUs) are now remarkably close in performance to commercial counterparts for most investigations.

We don't need to reinvent the wheel. Sandfly’s Bring Your Own Model (BYOM) approach lets you use your existing LLM licenses from leading frontier companies or run local models to ensure your data never goes to a third party. You get the best AI analysis available without the cost, complexity, or headache of supporting "yet another AI model."

Elevating Linux Security with Sandfly and AI

Sandfly's robust data combined with a customer-selected LLM provides a turnkey way to get Linux security expertise in-house. Our agentless approach is the safest, most compatible, and most robust foundation possible for AI LLM integration without risky endpoint agents.

We love what AI can do for threat hunting, but it only works when fed the truth. By combining the deep knowledge of AI with the safety and precision of agentless EDR, we are delivering the fastest, simplest, and most effective Linux attack detection possible.

文章来源: https://sandflysecurity.com/blog/sandflys-agentless-linux-edr-ai-are-a-powerful-combo
如有侵权请联系:admin#unsafe.sh