Machines Triage. Humans Decide.
2026-6-4 14:57:11 Author: binarydefense.com

The AI-orchestrated cyberespionage campaign that Anthropic disclosed in late 2025, the most agentic offensive operation publicly documented, required humans at four to six decision points per campaign. The attackers had every incentive to run the operation fully autonomously. They kept humans in the loop anyway, for the calls that mattered. They knew where the AI couldn't be trusted to decide.

That same structural question is the one defenders have to answer. Where do you let the agent decide, and where do you keep a human in command? It is not a values question. It is a capability question. Agents are excellent at one kind of work and terrible at another. The architecture that wins routes each kind to the side that does it well.

Human-driven security is not nostalgia. It is the operating system.

Deterministic work is where agents earn their keep

The work agents do well has a few features in common. The rules are stable. The training corpus is large. The output is falsifiable. The cost of a wrong answer is small enough that you can correct it without an incident report.

Most of tier-1 SOC work falls into that bucket. PowerShell deobfuscation. YARA matching against known malware families. MITRE ATT&CK technique tagging. IOC enrichment against threat intel feeds. Anomaly detection against established user-behavior baselines. Baseline correlation across endpoint, identity, and network telemetry. Each of those tasks has a deterministic core: there is a right answer, the answer can be checked against ground truth, and the failure modes are bounded.

This is the layer the agentic-SOC category is genuinely good at. The "reduces the repetitive first 80 percent of triage" framing is descriptive, not aspirational. The bulk of what arrives in an alert queue does not require judgment. It requires fast, accurate, repeatable processing. Putting humans on that work doesn't make the work better. It makes the humans tired.

The wins on this layer are real. A SOC running well-tuned agents against deterministic triage produces fewer false escalations, shorter per-ticket investigation times, and more consistent enrichment. Those are real operational gains. They are also the easiest gains to measure, which is why the category emphasizes them.

Ambiguous judgment is where humans stay irreplaceable

The other layer is harder to describe because it is not a list of tasks. It is a property of certain decisions: they require judgment under ambiguity, with material consequences, on incomplete evidence.

A concrete example helps. An alert fires on a domain admin account performing bulk database export at 2am. The user is on the on-call rotation and has a legitimate reason to be active. The export volume is unusual but not unprecedented. Is this exfiltration or maintenance? The model can list the evidence and weight each indicator. It cannot tell you, with confidence and reproducibility, what to do next. The right answer depends on context the model does not have access to: the user's recent project history, the company's quarterly close calendar, what the security team negotiated with that engineering manager last month, whether the org is in a sensitive contract renewal that would change the cost of a wrong call.

That is the shape of the ambiguous layer. The evidence is incomplete. The context is local. The cost of being wrong is asymmetric. The decision is not "is this malicious" but "what is the right next step given what we know and what we don't."

Models hallucinate confidently in this space. When a model encounters ambiguity, it tends to produce a confident-sounding answer that pattern-matches to its training distribution rather than to the specific situation in front of it. That failure mode is fine for low-stakes work. It is not fine for incident response, where a wrong containment decision can take a production system down or let an actual intrusion continue.

Humans handle this layer not because humans are smarter than models in some general sense. They are not. Humans handle it because the calculus on ambiguous decisions involves stakes, context, and accountability that the model has no exposure to. A senior analyst makes a containment call partly based on the alert, partly based on knowing the CFO's calendar, partly based on having been on the receiving end of a bad call last year. That blend is not something an agent can be trained on. It is something a human carries forward across cases.

The handoff between them is the architecture that matters

If those are the two layers, the architecture question is how they connect. The naive answer is "the agent handles tier-1, the human handles tier-3, tier-2 is the handoff." That description is roughly right but undersells what good handoff design actually requires.

The architecture that works has four specific properties.

First, every model verdict is explainable. The analyst sees what the model saw: confidence score, rules that fired, YARA matches, MITRE technique mappings, threat intel correlations, sandbox behavior if available. There is no black box at the handoff. The analyst can audit why the model concluded what it concluded and overrule it on evidence the model didn't have or didn't weight correctly.
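As a rough illustration of what "no black box at the handoff" could mean in data terms, the sketch below bundles the evidence listed above into one record the analyst can read. The class name `Verdict`, its field names, and the sample values are assumptions made for this example, not the schema of any particular product.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Verdict:
    """Hypothetical record of what the model saw when it reached a verdict."""
    alert_id: str
    label: str                                 # e.g. "malicious" or "benign"
    confidence: float                          # 0.0 to 1.0
    rules_fired: Tuple[str, ...] = ()
    yara_matches: Tuple[str, ...] = ()
    mitre_techniques: Tuple[str, ...] = ()
    intel_matches: Tuple[str, ...] = ()
    sandbox_summary: Optional[str] = None      # None when no sandbox run exists

    def explain(self) -> str:
        """Render every piece of evidence so nothing is hidden from the analyst."""
        lines = [
            f"alert {self.alert_id}: {self.label} (confidence {self.confidence:.2f})",
            "rules fired: " + (", ".join(self.rules_fired) or "none"),
            "YARA matches: " + (", ".join(self.yara_matches) or "none"),
            "MITRE techniques: " + (", ".join(self.mitre_techniques) or "none"),
            "threat intel: " + (", ".join(self.intel_matches) or "none"),
            "sandbox: " + (self.sandbox_summary or "not available"),
        ]
        return "\n".join(lines)


# Sample values are made up for illustration.
verdict = Verdict(
    alert_id="A-1001",
    label="malicious",
    confidence=0.91,
    rules_fired=("encoded_powershell",),
    mitre_techniques=("T1059.001",),
)
report = verdict.explain()
```

Printing "none" and "not available" explicitly is a deliberate choice in this sketch: the analyst can see which evidence sources were empty rather than guessing whether they were checked.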

Second, the model has explicit boundaries on what it can decide alone. A high-confidence verdict on a low-stakes finding (block a known-bad domain at the perimeter) can execute without analyst sign-off. A medium-confidence verdict on a high-impact action (isolate a production endpoint) requires explicit analyst approval. The boundary is set by the cost asymmetry, not by the model's confidence alone. This is the LLM-as-judge pattern: the LLM produces a reasoned verdict, but it cannot override the ML score without corroborating evidence such as sandbox results, YARA hits, or threat intel matches.
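A minimal sketch of such a boundary follows. The 0.95 threshold, the two-level `Impact` scale, and the function names are assumptions chosen for the example; the sketch also assumes that every high-impact action needs approval regardless of confidence, which is one possible reading of "set by the cost asymmetry."

```python
from enum import Enum
from typing import Sequence


class Impact(Enum):
    LOW = "low"      # e.g. block a known-bad domain at the perimeter
    HIGH = "high"    # e.g. isolate a production endpoint


class Route(Enum):
    AUTO_EXECUTE = "auto_execute"
    ANALYST_APPROVAL = "analyst_approval"


AUTO_CONFIDENCE_FLOOR = 0.95  # assumed threshold, tune per environment


def route_action(confidence: float, impact: Impact) -> Route:
    """Decide whether an action may run without analyst sign-off."""
    # Cost asymmetry comes first: high-impact actions always go to a human.
    if impact is Impact.HIGH:
        return Route.ANALYST_APPROVAL
    # Low-impact actions run alone only when confidence is high.
    if confidence >= AUTO_CONFIDENCE_FLOOR:
        return Route.AUTO_EXECUTE
    return Route.ANALYST_APPROVAL


def final_label(ml_label: str, llm_label: str, corroboration: Sequence[str]) -> str:
    """Let the LLM verdict override the ML score only with corroborating evidence."""
    if llm_label != ml_label and not corroboration:
        return ml_label
    return llm_label
```

For instance, `route_action(0.99, Impact.HIGH)` still returns `Route.ANALYST_APPROVAL` in this sketch, and `final_label("malicious", "benign", [])` keeps the ML label because the LLM offered no sandbox result, YARA hit, or threat intel match to back its disagreement.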

Third, analyst decisions feed back into the model. When a human overrules, the system records why. When a human escalates, the pattern is captured. Over time, the model gets better at routing the same shape of case to the human earlier, and the false-positive surface narrows. The feedback loop is not optional. Without it, the agent layer drifts and the human layer gets noisier over time.
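One simple way to picture the loop is below. It records each override with its reason and routes an alert type to a human earlier once analysts have overruled the model on it often enough. This counting rule is a crude stand-in for real model retraining, and `Override`, `FeedbackLog`, and the threshold are hypothetical names and values.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Override:
    """One case where the analyst disagreed with the model, and why."""
    alert_type: str
    model_label: str
    analyst_label: str
    reason: str


class FeedbackLog:
    def __init__(self, early_route_after: int = 3) -> None:
        # Assumed policy: after this many overrides, skip straight to a human.
        self.early_route_after = early_route_after
        self._overrides: List[Override] = []

    def record(self, override: Override) -> None:
        self._overrides.append(override)

    def override_counts(self) -> Counter:
        """Count overrides per alert type."""
        return Counter(o.alert_type for o in self._overrides)

    def route_to_human_early(self, alert_type: str) -> bool:
        return self.override_counts()[alert_type] >= self.early_route_after


feedback = FeedbackLog(early_route_after=2)
for reason in ("on-call maintenance window", "approved quarterly export"):
    feedback.record(Override("bulk_db_export", "malicious", "benign", reason))
```

After the two overrides above, `feedback.route_to_human_early("bulk_db_export")` is true, while an alert type with no recorded overrides stays on the automated path.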

Fourth, the audit trail is complete. Every model verdict and every human decision is logged in a way that lets a third party reconstruct what happened and why. This matters more in 2026 than it did in 2022 because regulatory and board-level scrutiny of autonomous decisions is going to keep rising. Building this in early is cheaper than retrofitting it after a board asks the question. Post 5 picks up the accountability angle directly.
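The post does not prescribe a logging mechanism. One common technique for making a decision log tamper-evident is a hash chain, sketched here under that assumption. The field names and the `append_entry` and `verify` helpers are illustrative, and a production log would also carry timestamps and case identifiers.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(body: Dict[str, str]) -> str:
    """Hash an entry body deterministically (sorted keys, UTF-8)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, str]], actor: str, action: str, reason: str) -> Dict[str, str]:
    """Append a model verdict or human decision, chained to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "reason": reason, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify(log: List[Dict[str, str]]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: List[Dict[str, str]] = []
append_entry(audit_log, "model", "recommend isolate host", "confidence 0.82, YARA hit")
append_entry(audit_log, "analyst_a", "declined isolation", "host is in a maintenance window")
```

A third party holding the log can call `verify(audit_log)` to confirm that no entry was altered after the fact, and read the `reason` fields to reconstruct why each decision was made.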

Those four properties together (explainable verdicts, explicit decision boundaries, feedback loops, complete audit trails) are what differentiate a real machines-and-humans architecture from a layered set of disconnected tools. The kit you can buy that does all four well is uncommon. Building it in-house at mid-market scale is even harder.

Designing your SOC around the distinction

The applied version of all of this is a design question you can run against your own SOC this week.

For each process in your SOC playbook, ask three questions. Is the decision deterministic, or does it require judgment under ambiguity? What is the cost of being wrong, both in absolute terms and in how lopsided the costs of a wrong action and a missed one are? Is there context outside the alert data that materially affects the right answer?

Processes that are deterministic, low-cost-of-error, and self-contained can be agent-owned. Alert enrichment, IOC correlation, log normalization, deduplication, common-case malware triage against known families. Move those off your senior analysts' queue. They are burning expensive judgment time on cheap pattern-matching, and they will be more useful and less burnt out if you stop asking them to do it.

Processes that involve ambiguous judgment, asymmetric cost-of-error, or external context belong to humans. Attribution calls. Containment decisions on production systems. Threat hunting hypotheses that don't map cleanly to existing detection rules. Communication with business stakeholders during an active investigation. Decisions about whether to disclose externally. None of those benefit from agent autonomy. All of them benefit from agent assistance: the agent surfaces evidence, runs the deterministic enrichment, drafts the timeline. But the call stays with a human.
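The sorting rule in the last two paragraphs can be written down as a small function. This is a sketch that treats each of the three questions as a yes/no answer, which is a simplification; the `Process` fields and the returned labels are names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """Answers to the three design questions for one playbook process."""
    name: str
    deterministic: bool        # is there a checkable right answer?
    low_cost_of_error: bool    # can a wrong answer be corrected cheaply?
    self_contained: bool       # is all needed context in the alert data?


def owner(process: Process) -> str:
    """Agent-owned only if all three hold; otherwise a human makes the call."""
    if process.deterministic and process.low_cost_of_error and process.self_contained:
        return "agent"
    return "human, agent-assisted"


playbook = [
    Process("IOC correlation", True, True, True),
    Process("containment on production systems", False, False, False),
    Process("external disclosure decision", False, False, False),
]
assignments = {p.name: owner(p) for p in playbook}
```

Note the asymmetry in the rule: a single "no" is enough to keep the decision with a human, which mirrors the "or" in the paragraph above.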

The processes that sit in the middle are where most SOCs are still working out the boundary. Some triage decisions, some escalation routing, some response actions. Walk those one at a time. The right boundary is rarely "fully autonomous" or "fully manual." It is usually "agent processes the case to a specific point, hands to a human at the decision moment, human approves or modifies, system logs the decision."
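That four-step shape (agent prepares, hands off, human approves or modifies, system logs) might look like the following in miniature. The `Case` fields and the two helper functions are hypothetical, and the in-memory list stands in for whatever logging system a real SOC uses.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Case:
    case_id: str
    evidence: List[str]
    proposed_action: str
    final_action: Optional[str] = None   # unset until a human decides
    decided_by: Optional[str] = None
    log: List[str] = field(default_factory=list)


def agent_prepare(case_id: str, evidence: List[str], proposed_action: str) -> Case:
    """Agent processes the case up to the decision point, then stops."""
    case = Case(case_id, list(evidence), proposed_action)
    case.log.append(f"agent proposed: {proposed_action}")
    return case


def human_decide(case: Case, analyst: str, action: Optional[str] = None) -> Case:
    """Approve the proposal as-is (action=None) or replace it with a modified action."""
    case.final_action = action if action is not None else case.proposed_action
    case.decided_by = analyst
    verb = "approved" if case.final_action == case.proposed_action else "modified to"
    case.log.append(f"{analyst} {verb}: {case.final_action}")
    return case


case = agent_prepare("C-42", ["export volume 3x baseline", "user is on call"], "isolate host")
case = human_decide(case, "analyst_a", action="monitor and page the data owner")
```

The agent never sets `final_action` itself in this sketch; only `human_decide` does, so the decision moment is explicit in the code path as well as in the log.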

That design discipline, process by process, decision by decision, is the part that takes the longest and pays the most. You cannot buy your way to it. You can only build it.

Human-driven isn't nostalgia

The framing of machines versus humans in security operations is the wrong question. The right question is "what kind of decision is this, and where should it live?" The answer comes out the same way every time. Deterministic work runs better on machines. Ambiguous judgment runs better through humans. The architecture that wins routes each to the side that does it well, with an explicit handoff between them.

The agentic-SOC category is good at the first layer. The wins are real. The category does not yet have a clean answer for the second layer, and pretending it does is what produced the false dichotomy between "AI replaces analysts" and "AI is dangerous." Neither is right. The two layers are different work, requiring different tools, with the value created in the handoff between them.

That is what human-driven means. Not "AI is bad." Not "humans are smarter." The decisions that matter most stay with humans because that is the structural property of those decisions, not a values claim about people.

The next post picks up the practical side. What does the analyst role actually look like in 2027? What changes about training, retention, and the day-to-day work of being on the SOC floor? 


Source: https://binarydefense.com/resources/blog/machines-triage-humans-decide