tl;dr
In our Research and Intelligence Fusion Team (RIFT) we applied an incremental anomaly detection model to detect suspicious TLS certificates. This model gives security operations teams the opportunity to detect suspicious behavior in real time, even though the traffic itself is encrypted. This blog post discusses the research that we performed and the model that we applied with the Half-Space-Trees algorithm.
Introduction
Encrypted network traffic is both a challenge and an advantage for cyber security defenders. Protocols such as Transport Layer Security (TLS) are widely used by organizations as a mitigation to prevent eavesdroppers from viewing sensitive data.
However, adversaries have adopted encryption too. As a result, more network traffic is being encrypted, which means defenders can no longer do deep packet inspection the way they once did.
There is often (meta)data to consider when looking at encrypted traffic which still has operational value. In this blogpost, we describe the research on the characteristics of TLS certificates that we conducted and the incremental machine learning model that we applied to detect the anomalous certificates. This model gives us the following advantages:
Adversaries may install TLS certificates too
Nowadays, TLS certificates are widely used by organizations as a form of authentication and to prevent eavesdroppers from viewing sensitive data. If you visit the domain fox-it.com, most browsers show the lock symbol in the address bar. The lock symbol means that the TLS certificate of the domain you visit is trusted by the browser and that an encrypted connection over TLS is established. In addition, the certificate should assure you that the data you get back is actually coming from fox-it.com.
However, adversaries may install TLS certificates too (MITRE, T1608.003) (i). The certificates can be used for credibility, to spoof the identity of the victim, or to encrypt traffic to stay undetected. Adversaries can obtain certificates in different ways, most commonly by:
Research at RIFT
Our research started with investigating a dataset containing malicious and legitimate TLS certificates. An example of a legitimate certificate is the certificate used by the fox-it.com domain. You can observe that the attributes in the fields are properly filled in (the attribute abbreviations are listed in Table 1). The Subject Name contains the information of the owner (NCC Group), and the Issuer Name contains the information of the Certificate Authority (Entrust).
Subject Name:
C=GB, L=Manchester, O=NCC Group PLC, CN=www.nccgroup[.]com
Issuer Name:
C=US, O=Entrust, Inc., OU=See http://www.entrust[.]net/legal-terms, OU=(c) 2012 Entrust, Inc. – for authorized use only, CN=Entrust Certification Authority – L1K
Example 1. TLS certificate used by fox-it.com
The second example is an anomalous certificate that was used in Cobalt Strike. This is a clear example of a self-signed certificate: the Issuer Name is identical to the Subject Name instead of naming a Certificate Authority. Furthermore, the Organization name “lol” (O) and the empty Organizational Unit (OU) look anomalous. If you investigate further, you may find that the domain in the Common Name (CN) attribute is related to Ryuk ransomware (v, vi).
Subject Name:
C=US, ST=TX, L=Texas, O=lol, OU=, CN=idrivedownload[.]com
Issuer Name:
C=US, ST=TX, L=Texas, O=lol, OU=, CN=idrivedownload[.]com
Example 2. TLS certificate used in Cobalt Strike for Ryuk ransomware
Attribute | Meaning |
---|---|
C | Country of the entity |
ST | State or province |
L | Locality |
O | Organizational name |
OU | Organizational unit |
CN | Common name |
We conducted an exploratory analysis of our dataset of known legitimate and malicious certificates, drawing on the knowledge of our security operations centers. Furthermore, we ran supervised models on our dataset. By applying white-box algorithms such as the Random Forest, we identified features that helped identify the malicious certificates. For example, the number of empty attributes in a certificate has a statistical relationship with how likely the certificate is to be used for malicious activities. However, we did not want to train a model solely on our known malicious certificates, but rather to broaden the scope of our detection and find new patterns in new data. Hence, we wanted to apply a model that can detect anomalies in real time in an unsupervised way.
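To make the idea of certificate features concrete, here is a minimal sketch of how the distinguished-name strings from the examples above could be turned into a few numeric features. The function names, the feature names and the simple `KEY=value` parsing are assumptions made for illustration, not the feature set used in the research; comparing subject and issuer is only a heuristic for self-signed certificates, and production code would parse certificates with a proper X.509 library.

```python
import re

# Assumption: only these attribute abbreviations occur in the name strings.
# Splitting only on commas that are followed by a known key keeps values
# such as "Entrust, Inc." in one piece.
_ATTR_SPLIT = re.compile(r",\s*(?=(?:CN|OU|ST|C|S|L|O)=)")


def parse_dn(dn):
    """Parse 'C=US, O=lol, OU=' into {'C': ['US'], 'O': ['lol'], 'OU': ['']}."""
    attrs = {}
    for part in _ATTR_SPLIT.split(dn.strip()):
        key, _, value = part.partition("=")
        attrs.setdefault(key.strip(), []).append(value.strip())
    return attrs


def certificate_features(subject, issuer):
    """Derive a few illustrative (hypothetical) features from two name strings."""
    subj = parse_dn(subject)
    iss = parse_dn(issuer)
    subject_values = [v for values in subj.values() for v in values]
    return {
        # Attributes that are present but left empty, e.g. "OU=".
        "n_empty_subject_attrs": sum(1 for v in subject_values if not v),
        # Heuristic for a self-signed certificate: subject equals issuer.
        "subject_equals_issuer": int(subj == iss),
        # Whether at least one non-empty Organizational Unit is given.
        "has_organizational_unit": int(any(subj.get("OU", []))),
    }


# Subject and issuer of Example 2 are identical.
example_2 = "C=US, ST=TX, L=Texas, O=lol, OU=, CN=idrivedownload[.]com"
features = certificate_features(example_2, example_2)
```

For the certificate in Example 2 this flags one empty attribute and a subject that equals the issuer, the two properties called out in the text.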
Anomaly detection: Taking the isolation-based approach
The Isolation Forest was the first isolation-based anomaly detection model, created by Liu et al. in 2008 (viii). Since anomalies are by definition rare and behave differently, the authors reasoned that anomalies are easy to isolate from the rest of the data, which became the foundation of the isolation-based approach. The Isolation Forest measures how easily an observation is isolated from the rest of the data by the number of splits needed in a binary tree structure (viii). The more anomalous an observation, the closer to the root of the tree (and thus the faster) it gets isolated. The advantage of this approach is that it needs little memory and computation, in contrast to density-based and distance-based approaches, which perform many calculations (viii, ix). More importantly, the isolation-based approach has repeatedly proven to be a very effective method for detecting anomalies (viii, ix).
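The isolation idea can be illustrated with a deliberately simplified, one-dimensional sketch: keep splitting the data at a random point, follow the side that contains the observation, and count the splits until it is alone. This is not the full Isolation Forest (which subsamples the data, picks random features and normalizes the path length into a score); the function names and the toy data are assumptions for illustration only.

```python
import random


def isolation_depth(point, data, rng, max_depth=50):
    """Count random splits until `point` is alone in its partition."""
    current = list(data)
    depth = 0
    while len(current) > 1 and depth < max_depth:
        lo, hi = min(current), max(current)
        if lo == hi:
            break  # identical values cannot be separated any further
        split = rng.uniform(lo, hi)
        # Keep only the side of the split that contains the point.
        if point < split:
            current = [v for v in current if v < split]
        else:
            current = [v for v in current if v >= split]
        depth += 1
    return depth


def mean_isolation_depth(point, data, n_trees=200, seed=0):
    """Average the depth over many random 'trees' to reduce variance."""
    rng = random.Random(seed)
    total = sum(isolation_depth(point, data, rng) for _ in range(n_trees))
    return total / n_trees


# Toy data: 50 values packed between 0.4 and 0.6, plus one far-away value.
data = [0.40 + 0.004 * i for i in range(50)] + [5.0]
inlier_depth = mean_isolation_depth(data[25], data)
outlier_depth = mean_isolation_depth(5.0, data)
```

The far-away value is usually cut off by the very first split, while a value inside the dense cluster needs several splits, so `outlier_depth` comes out clearly smaller than `inlier_depth`.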
Half-Space-Trees: What’s in a name?
The Half-Space-Trees (HST) algorithm is an unsupervised, isolation-based anomaly detection algorithm and an incremental learning successor of the Isolation Forest. The HST is an ensemble method, meaning it consists of multiple single half-space-trees (x). Graph 1 demonstrates how a simple half-space-tree isolates anomalies. Besides being an incremental learner for streaming data, a major advantage of the HST is that it can build trees without data: it only needs the dimensions of the data space. In this way, the trees can be built quickly and efficiently for fast anomaly detection (ix, x). Another advantage for us is that the HST is available in the River package for incremental machine learning with streaming data (xi).
Graph 1: An example of 2-dimensional data in a window divided by two simple half-space-trees; the visualization is inspired by the original paper (x). A single half-space-tree divides the window space into half-spaces based on the features in the data. Each half-space-tree picks its splits at random and continues until it reaches the configured tree height. The half-space-tree counts the number of data points per subspace and assigns a mass score to that subspace (represented by the colors). The subspaces that contain most data points are considered high-mass subspaces, and the subspaces with few or no data points are considered low-mass subspaces. Most data points are expected to fall in high-mass subspaces, because isolating them would take many more splits, i.e. a higher tree. The masses from all single half-space-trees are summed to obtain the final anomaly score of the HST (x).
Testing the model on our dataset
The HST was initially trained on legitimate certificates. When the model observes a suspicious TLS certificate it should isolate the certificate rapidly and give a high anomaly score. We tested the model on our test data that included both legitimate and malicious certificates.
The anomaly scores fall between 0 and 1. The closer the anomaly score is to 1, the easier it was to isolate the certificate and the more likely it is that the certificate is anomalous. For example, our certificate from Example 1, used by fox-it.com, received an anomaly score of 0.43. The certificate from Example 2, used by Ryuk ransomware, received an anomaly score of 0.84. The performance metrics on our test set can be found in Table 2.
Performance Metric | Score |
---|---|
Precision | 0.95 |
Recall | 0.98 |
F-Score | 0.96 |
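
As a quick consistency check on the metrics above, and assuming that the F-Score in the table is the F1 score (the harmonic mean of precision and recall, which the text does not state explicitly), the reported value can be recomputed from the other two:

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the table.
f1 = f_score(0.95, 0.98)
print(round(f1, 2))  # 0.96
```

Under that assumption the three reported numbers agree with each other.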
Testing the model in the security operations centers
After these results, we tested the model in our global security operations centers to explore how the HST performs on real-life streaming data. After training on the sensors, we analyzed and tuned the outputs that the model generated. For example, we saw that suitable anomaly thresholds can differ per sensor, so we tuned them per sensor as well.
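One possible way to keep a separate threshold per sensor is sketched below: each sensor keeps a bounded history of its recent anomaly scores, and its threshold is a high quantile of that history. This is a hypothetical approach for illustration only; the class name, the quantile, the history size and the default threshold are all assumptions, and the text does not describe how the thresholds were actually tuned.

```python
from collections import defaultdict, deque


class SensorThresholds:
    """Derives a per-sensor alert threshold from recent anomaly scores."""

    def __init__(self, quantile=0.99, history=1000, default=0.8, min_samples=50):
        self.quantile = quantile
        self.default = default          # used until a sensor has enough history
        self.min_samples = min_samples
        self.scores = defaultdict(lambda: deque(maxlen=history))

    def update(self, sensor_id, score):
        """Record a new anomaly score for this sensor."""
        self.scores[sensor_id].append(score)

    def threshold(self, sensor_id):
        """Return the empirical quantile of the sensor's recent scores."""
        window = sorted(self.scores[sensor_id])
        if len(window) < self.min_samples:
            return self.default
        index = min(len(window) - 1, int(self.quantile * len(window)))
        return window[index]

    def is_alert(self, sensor_id, score):
        return score >= self.threshold(sensor_id)


thresholds = SensorThresholds()
for i in range(100):
    thresholds.update("quiet-sensor", 0.1 + 0.002 * i)  # scores up to about 0.3
    thresholds.update("noisy-sensor", 0.5 + 0.004 * i)  # scores up to about 0.9

quiet_alert = thresholds.is_alert("quiet-sensor", 0.6)
noisy_alert = thresholds.is_alert("noisy-sensor", 0.6)
```

With these toy histories the same score of 0.6 raises an alert on the quiet sensor but not on the noisy one, which is the kind of per-sensor difference described above.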
The good, the bad, the weird
Keep in mind that a high anomaly score does not necessarily indicate malicious behavior; it can also be a sign of weirdness or novelty. Hence, to improve the detection results, we combined our model with rules and other models. The insights from our research helped with creating these rules and choosing other models to combine it with. In the end, it is also a matter of feedback loops.
Conclusions
We applied an unsupervised, incremental anomaly detection model with the Half-Space-Trees algorithm in our security operations centers. Even though the surrounding data may be encrypted, the model is able to detect anomalous TLS certificates rapidly and send an alert to the SOC analyst. Because we combine the model with other rules and models, the precision and attribution of the alerts are enhanced.
We encourage you to look at TLS Certificates
We would like to encourage other cyber security defenders to look at the characteristics of TLS certificates to detect malicious activities despite encrypted traffic. Encryption does not equal invisibility, and there is often (meta)data to consider when searching for anomalous behavior. In particular, as a Data Science team we found that the Half-Space-Trees algorithm is an effective and quick anomaly detector for streaming data.
References
[i] <https://attack.mitre.org/techniques/T1608/003/>
[ii] NCC Group & Fox-IT. (2021). “Incremental Machine Learning by Example: Detecting Suspicious Activity with Zeek Data Streams, River, and JA3 Hashes.”
[iii] Mokbel, M. (2021). “The State of SSL/TLS Certificate Usage in Malware C&C Communications.” Trend Micro. <https://www.trendmicro.com/en_us/research/21/i/analyzing-ssl-tls-certificates-used-by-malware.html>
[iv] <https://sslbl.abuse.ch/statistics/>
[v] <https://attack.mitre.org/software/S0446/>
[vi] Goody, K., Kennelly, J., Shilko, J., Elovitz, S., & Bienstock, D. (2020). “Kegtap and SingleMalt with Ransomware Chaser.” FireEye. <https://www.fireeye.com/blog/jp-threat-research/2020/10/kegtap-and-singlemalt-with-a-ransomware-chaser.html>
[vii] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., Housley, R., and W. Polk. (2008). “Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile”, RFC 5280, DOI 10.17487/RFC5280. <https://datatracker.ietf.org/doc/html/rfc5280>
[viii] Liu, F. T., Ting, K. M., & Zhou, Z. (2008). “Isolation Forest.” Eighth IEEE International Conference on Data Mining, pp. 413-422. doi: 10.1109/ICDM.2008.17. <https://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/icdm08b.pdf?q=isolation-forest>
[ix] Togbe, M.U., Chabchoub, Y., Boly, A., Barry, M., Chiky, R., & Bahri, M. (2021). “Anomalies Detection Using Isolation in Concept-Drifting Data Streams.” Comput., 10, 13. <https://www.mdpi.com/2073-431X/10/1/13>
[x] Tan, S., Ting, K., & Liu, F. T. (2011). “Fast Anomaly Detection for Streaming Data.” pp. 1511-1516. doi: 10.5591/978-1-57735-516-8/IJCAI11-254. <https://www.ijcai.org/Proceedings/11/Papers/254.pdf>
[xi] <https://riverml.xyz>