Advanced persistent threats (APTs) are among the most feared threats in cyberspace. They are well known for their use of highly sophisticated techniques, long-lasting intrusions, and their slow and stealthy movement through infiltrated infrastructures. Often backed by nation states, APT groups possess the resources to target sensitive and classified information from governmental as well as economic institutions. Besides financial profit and geopolitical destabilization, exfiltrating confidential intellectual property is the central motivator for many threat actors. Key precautionary measures are early warnings and actionable threat intelligence for detecting intruders, but how does one find a reliable, up-to-date and trustworthy threat intelligence source?
Let’s try to find an answer to that question by retracing the journey through the jungle of threat intelligence (TI) providers that DCSO took this summer. But be careful, there are some pitfalls ahead of us!
Our Initial Motivation
As part of our Threat Detection and Hunting (TDH) service, DCSO operates an Intrusion Detection System (IDS) located in customer networks in order to detect known threats and network traffic anomalies. In this context, we continuously strive to improve our detection capabilities, especially with regard to advanced threats, and therefore require a reliable source of APT-related threat intelligence. Consequently, we formed a working group to conduct an internal assessment of threat intelligence providers with members of DCSO’s Threat Intelligence, Threat Detection & Hunting, and Technology Scouting & Evaluation teams. While aiming to find the best threat intelligence provider for sourcing APT-related indicators for TDH’s network sensors, we first wanted to get an overview of the TI market in order to subsequently select promising providers. Then, we were hoping to compare them based on objective criteria and metrics. Did everything work out as planned, you may ask? Well, let’s have a look together.
First, it was necessary to establish a target definition of the ideal data feed. Besides features like a low false-positive rate, constantly updated indicators, and both accurate and comprehensive time information, we also wanted additional intelligence options, such as curated IDS rule sets or broad contextual information. Along the lines of this idealized data feed, we then tailored test scenarios to ensure a fair and comparable evaluation of the TI providers. To convert the TI feeds into a comparable format, we relied on the MISP project with its integrated normalization and data correlation features, aiming to channel threat data into our MISP instance (also check out our MISP-dockerized project on GitHub). After finalizing the testing methodology, the evaluation of the selected candidates started as soon as the first APIs and data feeds became accessible.
Jumping into the Evaluation
Since our evaluation focused on APT-related threat intelligence, we expected indicators to be already attributed to the corresponding actors. Unfortunately, some vendors did not meet this basic requirement, and consequently, we had to exclude them from our regular assessment. While studying the data schemas of the remaining vendors and inspecting the first data arriving at our analysis backend, we quickly realized that the delivered TI data were too diverse to fit into a single comparable schema. Notably, there were the following challenges:
- Different data formats and delivery methods, some of which are not natively supported by the MISP platform
- Heterogeneous approaches to marking data as APT-related (if it was tagged at all)
- Either no context or contextual information on various, incompatible detail levels
As a consequence, we changed our initial plan, discarded our MISP-based comparison idea, and continued evaluating the TI feeds intrinsically based on their characteristics. By doing so, we reduced the potential risk of data loss during data conversion actions and completed a fairer assessment per feed, keeping the various flavors of threat intelligence services in mind. In the end, we built custom data loaders, saved each feed’s raw data separately, and only merged specific attributes across all vendors for statistical analysis.
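To illustrate the idea of per-vendor loaders that only merge a few shared attributes, here is a minimal sketch. It is not our actual tooling: the two vendor formats, their field names (`ioc`, `kind`, `threat_actor`, `indicator`, `indicator_type`, `group`), and the common record layout are all assumptions made up for this example.

```python
import csv
import io
import json
from typing import Dict, Iterator, List, Optional

# Common attributes merged across vendors; everything else stays in the raw data.
Record = Dict[str, Optional[str]]


def load_vendor_a(raw_json: str) -> Iterator[Record]:
    """Hypothetical vendor A: JSON list with 'ioc', 'kind', 'threat_actor' keys."""
    for item in json.loads(raw_json):
        yield {
            "vendor": "vendor_a",
            "type": item["kind"],
            # Simplification: lowercasing suits domains and hashes, not URL paths.
            "value": item["ioc"].strip().lower(),
            "actor": item.get("threat_actor"),
        }


def load_vendor_b(raw_csv: str) -> Iterator[Record]:
    """Hypothetical vendor B: CSV with 'indicator', 'indicator_type', 'group'."""
    for row in csv.DictReader(io.StringIO(raw_csv)):
        yield {
            "vendor": "vendor_b",
            "type": row["indicator_type"],
            "value": row["indicator"].strip().lower(),
            # An empty column means the indicator is not attributed.
            "actor": row["group"] or None,
        }


def merge(*feeds: Iterator[Record]) -> List[Record]:
    """Concatenate the normalized records of all feeds."""
    return [record for feed in feeds for record in feed]


# Made-up sample data using documentation-reserved names and addresses.
RAW_A = '[{"ioc": "Evil.example", "kind": "domain", "threat_actor": "ACTOR-1"}]'
RAW_B = "indicator,indicator_type,group\n198.51.100.7,ip,\n"

merged = merge(load_vendor_a(RAW_A), load_vendor_b(RAW_B))
```

Keeping the raw data untouched and deriving only this thin merged view means a loader mistake can be fixed and rerun without losing any vendor-specific detail.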
Even with this approach, we unfortunately had to make compromises in some cases. A significant issue arose while comparing the provided timestamps: almost every vendor followed its own approach to adding timing information to indicators, such as first seen, last seen/updated, or expiry dates, if available at all. Sometimes timing information was only available for a feed as a whole, while other feeds added it directly at the event or indicator level. Timestamps were not used uniformly, even within the same technology stack, which made it nearly impossible to give an accurate verdict about the timeliness of indicators in retrospect. On the other hand, is indicator timeliness of APT-related data really a dominant data quality criterion, especially considering that APTs are usually long-lasting attacks with high dwell times?
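A small sketch of what harmonizing such timestamps could look like is shown below. The field names in `FIRST_SEEN_FIELDS` are hypothetical, and treating timestamps without an offset as UTC is an assumption that would need to be checked per vendor.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def to_utc(value: Union[str, int, float, None]) -> Optional[datetime]:
    """Parse an epoch value or ISO 8601 string into an aware UTC datetime."""
    if value is None or value == "":
        return None
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    # Older Python versions do not accept a trailing 'Z' in fromisoformat().
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Hypothetical vendor-specific names for the "first seen" information.
FIRST_SEEN_FIELDS = ("first_seen", "firstSeen", "created")


def first_seen(indicator: dict) -> Optional[datetime]:
    """Return the first matching timestamp field, or None if there is none."""
    for field in FIRST_SEEN_FIELDS:
        if field in indicator:
            return to_utc(indicator[field])
    return None
```

Note that this only unifies the representation. It cannot recover what a vendor actually means by "first seen", which is the harder part of the problem described above.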
Shifting the focus to indicator overlaps and false-positive hits made it possible to determine more precise details about the feeds’ characteristics. Looking at those metrics told us a lot about each vendor’s approach to sharing APT data: Is there an overlap with publicly available information (OSINT)? Does another vendor ship the same indicator data, even faster or with more context? Are Content Delivery Networks (CDNs) or Alexa Top 100k domains included? And ultimately, what level of confidence (regarding APTs) should one give the information observed?
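The overlap questions above boil down to simple set arithmetic once indicator values are normalized. The following is a minimal sketch with made-up sample data; using Jaccard similarity as the overlap measure is our choice for this illustration, not necessarily the metric used in the assessment.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def pairwise_overlap(feeds: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Jaccard similarity (shared / combined indicators) for every pair of feeds."""
    result = {}
    for a, b in combinations(sorted(feeds), 2):
        union = feeds[a] | feeds[b]
        result[(a, b)] = len(feeds[a] & feeds[b]) / len(union) if union else 0.0
    return result


def share_in(feed: Set[str], reference: Set[str]) -> float:
    """Fraction of a feed's indicators that also appear in a reference set."""
    return len(feed & reference) / len(feed) if feed else 0.0


# Made-up sample data; real sets would come from the per-vendor loaders.
feeds = {
    "vendor_a": {"evil.example", "bad.example", "cdn.example"},
    "vendor_b": {"evil.example", "other.example"},
}
osint = {"evil.example"}            # indicators already publicly known
popular_domains = {"cdn.example"}   # stand-in for a top-domains or CDN list

overlaps = pairwise_overlap(feeds)
osint_share_b = share_in(feeds["vendor_b"], osint)
benign_share_a = share_in(feeds["vendor_a"], popular_domains)
```

A high `share_in` value against a list of popular domains hints at likely false positives, while a high value against OSINT suggests the feed adds little beyond what is freely available.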
Finally, we created regional heat maps for the APT-related information, in which we mapped network and file indicators to adversaries from specific regions (e.g., China, Russia, etc.). Such an overview can tell a lot about the focus areas of TI providers, which ultimately helps to find the data provider matching one’s own threat model (e.g., if you’re doing business in the petroleum industry, you may want to focus on Iranian threat actors). The quality of the heat map representation depends, however, on the contextual data provided; during our assessment, we had to consider a significant number of indicators as unknown or untagged, since even simple actor or region information was not available.
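The data behind such a heat map is essentially a count of indicators per vendor and region. Here is a minimal sketch; the actor names and the `ACTOR_REGION` mapping are placeholders invented for this example, and the record layout (`vendor`, `actor`) is an assumption.

```python
from collections import Counter
from typing import Iterable

# Placeholder actor-to-region mapping; a real one would come from vendor context.
ACTOR_REGION = {"ACTOR-1": "China", "ACTOR-2": "Russia"}


def region_counts(records: Iterable[dict]) -> Counter:
    """Count indicators per (vendor, region); untagged ones land in 'unknown'."""
    counts = Counter()
    for record in records:
        region = ACTOR_REGION.get(record.get("actor"), "unknown")
        counts[(record["vendor"], region)] += 1
    return counts


records = [
    {"vendor": "vendor_a", "value": "evil.example", "actor": "ACTOR-1"},
    {"vendor": "vendor_a", "value": "bad.example", "actor": "ACTOR-1"},
    {"vendor": "vendor_b", "value": "198.51.100.7", "actor": "ACTOR-2"},
    {"vendor": "vendor_b", "value": "other.example", "actor": None},
]

counts = region_counts(records)
```

The size of the `unknown` bucket per vendor is itself a useful signal, since it shows how much of a feed cannot be placed on the map at all.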
Long story short: selecting a threat intelligence provider is awfully difficult, and finding the best one is even harder. Comprehensive solution testing already demands a lot of time in general, but with threat intelligence providers, new and unexpected challenges arise that can slow down the evaluation process significantly or demand additional trade-offs. Thankfully, we found a lot of high-quality product documentation and received generous access to labeled data and additional contextual information. Based on the test dimensions introduced before, our assessment provided in-depth insight into the mechanics and internals of threat intelligence feeds, as well as their advantages and limitations. Even if we were not able to compare the feeds to the extent initially desired, we were able to find the best-fitting data providers for our particular use cases, while focusing on enhancing our TDH service.
If you are a data scientist or security researcher facing similar questions, we would strongly suggest that you try to understand your threat model and use cases first. Also, find a definition of APT data for yourself: What makes an IoC APT’ish for you? Do you require explicitly tagged APT information? Which is more important to you, indicator data or context? Once you know your focus, develop internal benchmarks to ensure usable results, but be prepared to skip some desired analysis steps or data fields if you run into dead ends. You won’t be able to measure everything, and you’ll probably come up with many new questions and considerations that you haven’t had before. Therefore, it doesn’t make sense to struggle with every detail; make sure not to lose sight of the bigger picture and your overall focus.
More insight, also covering suggestions for vendors and blue teamers, is given in a talk with Alicia Hickey and Dror-John Roecher held at this year’s hack.lu.
Who we are
The Threat Intelligence Team helps clients to reduce the threat posed by adversaries to their networks by leveraging the power of collaborative defense in combination with comprehensive analytics and contextualized threat intelligence. DCSO delivers actionable intelligence on all levels, from atomic Indicators of Compromise (IoC) to insights into the political, economic and cultural context of adversaries.