Endpoint security software faces a tough challenge — it needs to be able to rapidly distinguish between desired and unwanted behavior with few false positives and false negatives, and attackers work hard to obfuscate (or cloak) their malicious code to prevent detection by security scanners.
To maximize protection, security software wants visibility into attack chains at their weakest — after any obfuscation has been stripped away, immediately before (or during) execution. Unfortunately, this is hard, because most security software hooks are either very low-level (e.g. kernel drivers watching file and process activity) and thus lack context, or very high-level (e.g. scanning downloaded files), where obfuscation is in place.
The Antimalware Scan Interface (AMSI) is a Windows platform mechanism that allows host applications to call out to security software before performing sensitive operations (e.g. running script, elevating UAC, invoking ActiveX objects, etc). The security software can scan the data buffers and return an “Allow”/“Deny” verdict based on its threat intelligence. AMSI is pluggable on both sides: any host application can call into AMSI, and any security software can receive the calls. On a default Windows system, hosts include cscript, wscript, PowerShell, the .NET Framework, UAC, WMI, etc. Microsoft Office desktop applications call AMSI to scan VBA macros. Microsoft Defender (and most 3rd party AV products) will scan data buffers provided by the host and return verdicts based on threat intelligence. Microsoft Defender for Endpoint also uses AMSI to implement several of its attack surface reduction rules.
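To make the host side of that conversation concrete, here is a minimal sketch of an application asking AMSI for a verdict on a buffer, written in Python with ctypes purely for brevity (a real host would more likely use the C/C++ API directly). The exported functions and the AMSI_RESULT values come from amsi.h; the app name "ExampleHost", the content name, and the helper names are placeholders I made up for illustration.

```python
import ctypes
import sys

# AMSI_RESULT values from amsi.h.
AMSI_RESULT_CLEAN = 0
AMSI_RESULT_NOT_DETECTED = 1
AMSI_RESULT_DETECTED = 32768


def result_is_malware(result: int) -> bool:
    """Python equivalent of the AmsiResultIsMalware macro in amsi.h."""
    return result >= AMSI_RESULT_DETECTED


def scan_buffer(data: bytes, content_name: str, app_name: str = "ExampleHost"):
    """Return the AMSI_RESULT for `data`, or None when not running on Windows.

    `app_name` and `content_name` are illustrative placeholders.
    """
    if sys.platform != "win32":
        return None  # AMSI only exists on Windows

    amsi = ctypes.WinDLL("amsi")
    handle = ctypes.c_void_p  # HAMSICONTEXT / HAMSISESSION are pointer-sized
    hresult = ctypes.c_long   # HRESULT: negative values signal failure

    amsi.AmsiInitialize.argtypes = [ctypes.c_wchar_p, ctypes.POINTER(handle)]
    amsi.AmsiInitialize.restype = hresult
    amsi.AmsiOpenSession.argtypes = [handle, ctypes.POINTER(handle)]
    amsi.AmsiOpenSession.restype = hresult
    amsi.AmsiScanBuffer.argtypes = [
        handle, ctypes.c_void_p, ctypes.c_ulong,
        ctypes.c_wchar_p, handle, ctypes.POINTER(ctypes.c_int),
    ]
    amsi.AmsiScanBuffer.restype = hresult
    amsi.AmsiCloseSession.argtypes = [handle, handle]
    amsi.AmsiCloseSession.restype = None
    amsi.AmsiUninitialize.argtypes = [handle]
    amsi.AmsiUninitialize.restype = None

    context = handle()
    if amsi.AmsiInitialize(app_name, ctypes.byref(context)) < 0:
        raise OSError("AmsiInitialize failed")
    try:
        session = handle()
        if amsi.AmsiOpenSession(context, ctypes.byref(session)) < 0:
            raise OSError("AmsiOpenSession failed")
        try:
            buf = ctypes.create_string_buffer(data)
            result = ctypes.c_int(0)
            hr = amsi.AmsiScanBuffer(context, buf, len(data), content_name,
                                     session, ctypes.byref(result))
            if hr < 0:
                raise OSError("AmsiScanBuffer failed")
            return result.value
        finally:
            amsi.AmsiCloseSession(context, session)
    finally:
        amsi.AmsiUninitialize(context)
```

A host would then refuse to run the content when result_is_malware(scan_buffer(script_bytes, "example.js")) is true. Note that it is entirely up to the host to honor that answer.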
AMSI protections help improve blocking in common initial access vectors (e.g. running a downloaded .js file) and improve protection over traditional AV signatures (which can be fooled by obfuscation) and behavior monitoring (which might detect misbehavior too late). It’s especially useful against fileless threats in which the attack code was never on disk for a traditional AV sensor to scan.
The documentation for application developers who want to call into AMSI is pretty good. For security software developers, there’s both documentation and a sample AMSI scanner on GitHub.
The biggest limitation in AMSI is that a host application must call it. Unlike most security software sensors, which rely on kernel drivers and process injection techniques to provide security for every application, AMSI requires the explicit participation of an application to call out to AMSI and respect its verdict.
Hosts must choose when they call AMSI. For example, a host might choose not to call AMSI if it doesn’t think a given script is potentially dangerous (e.g. the Windows Scripting hosts may only call AMSI if they see a potentially-dangerous object created).
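As a rough illustration of that kind of gating, the sketch below simulates a host that only asks for a scan the first time a script creates an object it considers sensitive. Everything here is hypothetical: the run_script function, the scan callback (standing in for a call into AMSI), and the SENSITIVE_PROGIDS list are mine, and are not taken from how any real Windows script host decides.

```python
from typing import Callable, Iterable

# Hypothetical list for illustration only; not the actual trigger set
# used by any real script host.
SENSITIVE_PROGIDS = {"WScript.Shell", "Scripting.FileSystemObject"}


def run_script(script_text: str,
               requested_objects: Iterable[str],
               scan: Callable[[str], bool]) -> str:
    """Simulate a host that scans a script on its first sensitive object creation.

    `scan` stands in for a call into AMSI and returns True if the content
    is allowed to run.
    """
    scanned = False
    for progid in requested_objects:
        if progid in SENSITIVE_PROGIDS and not scanned:
            scanned = True
            if not scan(script_text):
                return "blocked"
        # A real host would create the object and keep executing here.
    return "completed"
```

The trade-off is visible in the sketch: a script that never touches anything on the list is never shown to the security software at all.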
Similarly, AMSI is never called to evaluate JavaScript running inside your browser (Edge, Chrome, Firefox, etc). Unlike cscript and wscript, which run “shell scripts” with access to powerful capabilities (writing files, launching processes, etc), browsers execute “web platform” JavaScript inside tight security sandboxes that prevent interaction with files and other objects on the system. Browsers compete on runtime performance, and the overhead of security scans on sandboxed script would yield a poor cost/benefit.
As with any other security software component, attackers have attempted to tamper with AMSI, using techniques that range from simple to ingenious. The majority of such techniques require that the attacker has already at least partially compromised the victim PC (to manipulate the execution environment), and a major goal of AMSI is to foil the attacks that would grant the attacker initial access to start with.
AMSI has been around for a long time now, and hasn’t seen a ton of changes over the years. However, there has recently been greatly renewed interest in empowering security software without requiring that each vendor write its own kernel drivers, and the model used by AMSI is a great one for providing powerful visibility and control in user-space.
To that end, I’d love to see more scenarios for AMSI-like callouts. Most notable among them is URL Reputation. Today, myriad applications present, transmit, and load content from URLs, but security software often is not well-positioned to “see” those URLs. Except for security code directly integrated into browsers (e.g. SafeBrowsing for Chrome, SmartScreen for Edge), security software is often forced to “guess” where a given network request is going via low-level packet-sniffing. This approach (at best) supplies only the hostname (not the full URL) and will soon become even less reliable, as browsers further improve their privacy against network sniffers. Imagine if every application could easily call out to the platform security software and ask “Hey, I’m going to go grab these 40 URLs. Any objections?”
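To be clear, no such URL callout exists in Windows today. But as a thought experiment, the shape of it could be as simple as the sketch below, where an application hands a batch of full URLs to whatever providers are registered and gets back a per-URL answer. The names check_urls, UrlProvider, and example_provider are all invented for this illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical provider signature: receives full URLs and returns the
# subset it objects to.
UrlProvider = Callable[[List[str]], Iterable[str]]


def check_urls(urls: Iterable[str],
               providers: Iterable[UrlProvider]) -> Dict[str, bool]:
    """Return {url: allowed}; a URL is denied if any provider objects to it."""
    batch = list(urls)
    allowed = {url: True for url in batch}
    for provider in providers:
        for url in provider(batch):
            if url in allowed:
                allowed[url] = False
    return allowed


def example_provider(urls: List[str]) -> List[str]:
    """Toy provider that objects to one specific path on an otherwise fine host."""
    blocklist = {"https://example.com/downloads/bad.js"}
    return [url for url in urls if url in blocklist]


verdicts = check_urls(
    ["https://example.com/index.html", "https://example.com/downloads/bad.js"],
    [example_provider],
)
```

The toy provider blocks one path while allowing another on the same host, which is exactly the distinction that hostname-only packet sniffing can’t make.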
Another scenario is malicious browser extensions. Today, security software can watch the browser load an extension’s code out of its filesystem location and block those loads if the files are known to be malicious, but at the browser level this blocking has a poor UX. All the browser “sees” is that an extension failed to load, but it doesn’t know why. It will still think that the extension should be present and loading, and can’t tell the user that the file was blocked for security reasons. Browsers don’t expose an API for security software to even know what extensions are installed and enabled; security inventory software that wants to report extensions up to the SOC must parse browsers’ internal configuration files (which is unsupported) to try to determine which extensions are allowed. If browsers could call into AMSI before loading an extension, this could provide a much better UX.
A further scenario is invocation of App Protocols. App Protocols provide the easiest way for an attacker to escape a browser’s security sandbox and are thus of prime interest to attackers. However, security software typically doesn’t get to directly observe the URL launched by the browser. Instead, it must rely on kernel sensors (e.g. this callback) to watch for CreateProcess calls, and this might not be sufficient: not every protocol invocation results in a CreateProcess call (e.g. some result in a COM object creation) and a kernel sensor lacks important context (e.g. the URL of the page that asked to launch the protocol). If browsers could call into AMSI before invoking an App Protocol, they could provide a better UX and improve threat intelligence.
Any other scenarios I haven’t thought of yet?
-Eric