ACRE

How do you know if you have enough detections?

Security leaders frequently ask a deceptively simple question: do we have enough detections? After years of working in detection engineering, I haven’t found a satisfying answer, so this post will introduce ACRE (ATT&CK Coverage Ratio Evaluation), a metric designed to answer that question in a way that is intuitive, time‑bound, and meaningful across organizations.

The Heatmap Problem: “Enough” Is Undefined

Most organizations try to answer the sufficiency question by mapping their rules to MITRE ATT&CK® and visualizing coverage as a heatmap. ATT&CK catalogs a vast number of techniques that attackers can use to achieve their objectives, so it seems wise to try to detect all of them. The intent is good, but the results are often misleading.¹ A typical ATT&CK heatmap could look like the one below:

Heatmap A, src

Heatmaps like this raise more questions than they answer, especially when they are being shown to senior management:

What does “good” coverage look like?
- Should every technique be green? Is that even possible?
What should we do next to improve?
- Should we turn the colorless parts to red?
- Should we make the red cells yellow?
- What are the thresholds for the colors anyway?

Because a heatmap (especially one built with ATT&CK Navigator, a wonderfully flexible tool) can represent anything we want, and its colors and gradients can likewise mean anything we want, we need a reason or a standard for consistent representation. In other words, the heatmap alone does not answer the question “Is this enough?”

Another major challenge is that heatmaps are too complex to show progress. Look at Heatmap A and newly introduced Heatmap B. Let’s assume they represent two snapshots from the same organization. Which one do you think came first? If you were presenting this to an executive, what story would you tell about the changes?

Heatmap B, from this helpful reference on how to use ATT&CK Navigator

One more substantial problem is that MITRE ATT&CK® is not flat, but a heatmap is. Flattening is normal for models (“all models are wrong, some are useful”), but for detection coverage specifically, a flat heatmap oversimplifies to the point where it is no longer useful.

Heatmaps fail to define a clear goal, allow measured progress, or guide prioritization decisions. They don’t tell us whether we should add detections where we have none, or improve detections where we already have some. In short:

While ATT&CK heatmaps can be useful tools, they should not be used to track detection coverage!

What a Useful Metric Looks Like

Before introducing a solution, it’s worth defining what standards we should use to evaluate it. A useful detection coverage metric must be:

Intuitive – an executive should understand it without explanation
Time‑bound – it should show improvement or regression over time
Aligned – it should improve as teams do meaningful detection work
Translatable – it should be meaningful across organizations

Traditional metrics like “number of alerts” or “attacks blocked” fail these tests. Detection coverage, however, can meet them—if measured correctly.

ACRE methodology

ACRE stands for ATT&CK Coverage Ratio Evaluation. At its core, ACRE is the ratio of the techniques we detect² to the techniques that are detectable:

ACRE = (Techniques we detect) / (Techniques that are detectable)

Putting ACRE into a time-series line chart addresses all of our problems:

Inherently time-bound
Intuitive (“line go up” is a straightforward way to know if things are going well)
Aligned to detection engineering technical outcomes, not just tracking activity
Translates well across organizations through a consistent measurement

Over two years of generating this metric at my organization, it has helped uncover real detection gaps, accurately represent detection engineering work to leadership, and justify investment in collaborative detection research. However, producing this simple ratio required solving three hard problems:

What should the denominator be for this evaluation?
What ATT&CK techniques are actually detectable?
What counts toward the numerator?

Problem 1: Defining the Denominator

Many teams default to using all ATT&CK techniques as the denominator (this is also inherent to the heatmap visualization). This is flawed. ATT&CK includes techniques that cannot be detected today, or perhaps ever, because the telemetry is insufficient, and others that can never be completely detected because the technique is too broad. In other words, some techniques should only be detected opportunistically.

Including techniques like these in coverage assessments makes “100% coverage” impossible and meaningless. The denominator must be limited to techniques that are realistically detectable.

Of course, there is the other side of the problem as well: ATT&CK does not include the entire denominator. The project was originally designed (in 2013) to focus on “Windows enterprise networks” and “endpoint telemetry” (per the FAQ). While ATT&CK has grown to include other platforms since then (see ATT&CK platform overview), most techniques are linked to Windows endpoints. But given that MITRE ATT&CK is the best-defined technique inventory we have, we should adopt a satisficing approach (a term coined by Herbert A. Simon) and use realistically detectable ATT&CK techniques as our denominator for now, while we work out what other techniques might be missing, especially for other platforms like Trusted Service Infrastructure (TSI).

Problem 2: Determining What Is Detectable

Manually evaluating hundreds of ATT&CK techniques is impractical and could lead to inconsistent results. Instead, ACRE leverages MITRE CAR Coverage, which compares multiple open‑source detection libraries and shows which techniques are commonly detected. This provides a defensible, evidence‑based rule for deciding what is detectable:

If multiple independent detection libraries detect this technique, it is detectable.

That said, CAR-based coverage (derived from the website above) has limitations:

It does not cleanly separate coverage by operating system (if a cross-OS technique like PowerShell shows up in any detection library for any OS, it’s counted as covered)
It can overcount detections via naive string matching³
The calculation list is outdated (some techniques listed in CAR (like T1108) have since been deprecated)

To address this, ACRE:

Splits coverage analysis by OS (see MITRE ATT&CK® is not flat)
Parses library detection formats rather than string-searching for technique IDs
Defines a minimum rule count for detectability (avoiding cases where a rule author incorrectly applies a technique ID)
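As a rough illustration of the second point, the sketch below contrasts naive string matching with reading technique IDs from a rule's structured tag field. It assumes a Sigma-style rule that has already been loaded into a dictionary, with tags of the form `attack.t1218.005`. The function names and the sample rule are hypothetical and are not taken from the ACRE skills.

```python
import re

# Sigma-style ATT&CK technique tag, e.g. "attack.t1059" or "attack.t1059.001".
TECHNIQUE_TAG = re.compile(r"^attack\.(t\d{4}(?:\.\d{3})?)$", re.IGNORECASE)
# Naive pattern: a technique ID anywhere in raw text.
ANY_TECHNIQUE_ID = re.compile(r"t\d{4}(?:\.\d{3})?", re.IGNORECASE)


def techniques_from_tags(rule: dict) -> set[str]:
    """Return only the technique IDs declared in the rule's `tags` list."""
    found = set()
    for tag in rule.get("tags", []):
        match = TECHNIQUE_TAG.match(tag)
        if match:
            found.add(match.group(1).upper())
    return found


def techniques_from_text(raw_text: str) -> set[str]:
    """Naive approach: collect every technique ID mentioned anywhere."""
    return {hit.upper() for hit in ANY_TECHNIQUE_ID.findall(raw_text)}


# Hypothetical rule: tagged with one technique, while its description
# merely mentions another one that it does not detect.
rule = {
    "title": "Suspicious Mshta Child Process",
    "description": "Unrelated to T1108, which this rule does not detect.",
    "tags": ["attack.defense-evasion", "attack.t1218.005"],
}
raw_text = " ".join(str(value) for value in rule.values())

print(techniques_from_tags(rule))      # only the tagged technique
print(techniques_from_text(raw_text))  # also picks up the stray mention
```

The structured parse credits the rule with T1218.005 only, while the text search also counts T1108 because the ID appears in the description.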

Furthermore, since we should comprehensively detect the middle of the attack, ACRE applies a higher weight for covering techniques in mid-stage tactics (effectively, after execution but before impact).
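A minimal sketch of how the weighted denominator could be assembled from parsed open-source rules. The threshold of 5 rules and the 1.5× weight for the listed tactics come from the example report later in this post. The rule records, the `technique_tactics` mapping, the function names, and the choice to apply the higher weight when any of a technique's tactics is mid-stage are all assumptions made for illustration.

```python
from collections import Counter

MIN_RULES = 5  # detectability threshold, per the example report
MID_STAGE_TACTICS = {
    "persistence", "privilege-escalation", "defense-evasion",
    "credential-access", "discovery", "lateral-movement", "collection",
}
MID_STAGE_WEIGHT = 1.5


def detectable_pairs(rule_records, min_rules=MIN_RULES):
    """Count rules per (technique, OS) and keep pairs at or above the threshold."""
    counts = Counter((r["technique"], r["os"]) for r in rule_records)
    return {pair for pair, n in counts.items() if n >= min_rules}


def technique_weight(technique, technique_tactics):
    """Assumption: the higher weight applies if any tactic is mid-stage."""
    tactics = technique_tactics.get(technique, set())
    return MID_STAGE_WEIGHT if tactics & MID_STAGE_TACTICS else 1.0


# Hypothetical parsed rules: one record per (rule, technique, OS).
rule_records = (
    [{"technique": "T1055", "os": "windows"}] * 6
    + [{"technique": "T1055", "os": "linux"}] * 2
    + [{"technique": "T1059.001", "os": "windows"}] * 5
)
technique_tactics = {
    "T1055": {"defense-evasion", "privilege-escalation"},
    "T1059.001": {"execution"},
}

pairs = detectable_pairs(rule_records)
denominator = sum(technique_weight(t, technique_tactics) for t, _ in pairs)
print(sorted(pairs), denominator)
```

Here T1055 on Linux falls below the threshold and is left out, so the weighted denominator is 1.5 (T1055 on Windows) plus 1.0 (T1059.001 on Windows).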

Problem 3: Defining the Numerator

The hardest question is deceptively simple: How many detections are “enough” for a technique?

My thinking on this topic is evolving (I now think the answer depends on the type of technique), but some standard principles remain.

Summiting the Pyramid Levels shows that many existing detections look at ephemeral or tool-based properties, detecting “instances” rather than “procedures”. We cannot know how many level 1-3 detections are necessary to cover a technique, because new tools can be created at any point, obviating the detections.

However, approaches like the Technique Research Report (TRR) show us that we can identify all possible procedures for many techniques. With a complete list of procedures for all techniques we need to monitor, we could measure our detections against that procedure list. Unfortunately, there are many more techniques to research and document in this way, and building out the full procedure list will take time (and collaborative research). Until then, ACRE adopts the satisficing approach again:

If we have at least one valid detection for a technique × host OS pair, we count it as covered.

This allows us to measure breadth of coverage, even if depth remains unmeasured. In other words, the metric shows which ATT&CK techniques are completely uncovered, and those gaps should drive detection research projects (why are those techniques uncovered?). Better yet, because ACRE uses open-source rules as its source, there’s a ready-made reference for the types of rules that could be added to fill each gap.
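The “at least one valid detection per pair” rule can be sketched as a small scoring function. This is an illustration under assumptions rather than the acre-coverage skill itself: the data, the names, and the decision to give no credit for coverage outside the detectable set are all hypothetical.

```python
def acre_score(detectable, covered, weights):
    """Weighted share of detectable (technique, OS) pairs with a detection.

    detectable: set of (technique, os) pairs in the denominator
    covered:    set of (technique, os) pairs with one or more in-house rules
    weights:    technique ID -> weight (defaults to 1.0)
    """
    def total(pairs):
        return sum(weights.get(technique, 1.0) for technique, _ in pairs)

    # Assumption: coverage outside the denominator earns no credit.
    counted = detectable & covered
    gaps = sorted(detectable - covered)
    score = total(counted) / total(detectable) if detectable else 0.0
    return score, gaps


# Hypothetical data for illustration only.
detectable = {
    ("T1055", "windows"), ("T1218", "windows"), ("T1059.001", "windows"),
}
in_house_rules = [
    {"name": "Injection rule A", "technique": "T1055", "os": "windows"},
    {"name": "Injection rule B", "technique": "T1055", "os": "windows"},
    {"name": "Cron persistence", "technique": "T1053.003", "os": "linux"},
]
# Two rules for the same pair still count once: breadth, not depth.
covered = {(r["technique"], r["os"]) for r in in_house_rules}
weights = {"T1055": 1.5, "T1218": 1.5, "T1059.001": 1.0}

score, gaps = acre_score(detectable, covered, weights)
print(f"{score:.4f}", gaps)
```

With these inputs the score is 1.5 / 4.0 = 0.375, and the two uncovered Windows pairs come back as the gap list to research next.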

Using ACRE

For all that background, it’s quite easy to use ACRE.

Generate your ACRE score and find gaps with skills I’ve provided below.
Do this regularly - at least quarterly - so you can get credit for your detection improvements and see if the denominator (detectable techniques) evolves.

Generating ACRE score

Use the provided platform-coverage skill and the Security-Detections-MCP to calculate which techniques are detectable (solving problem 2)
- Note: the skill is written to work with Claude Code, but any GenAI product should be able to adapt it. Just ask!
- Also, your GenAI product should be able to install the MCP for you - note that if you’re using Windows, it’s much easier to use Windows Subsystem for Linux (WSL) than the native Windows installation.
Export your custom detections and run the acre-coverage skill to calculate your ACRE score (solving problem 3). Your export must include the following fields for each rule:
- Creation Date
- MITRE ATT&CK techniques that the rule covers
- OS the rule covers — if your export doesn’t include this, ask a GenAI tool to infer it from your data source or log source field
- Rule name and/or ID (for subsequent human analysis)
After running those two skills, you’ll get a report and an acre_history.json file you can use to track progress over time. Export the data to Excel, add a simple line chart, and you have an enterprise-ready metric that’s also useful to defenders.
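For the Excel step, one option is to flatten the history file into CSV first. The sketch below assumes the history is a JSON list of entries with a `date` and a per-platform `scores` mapping. That shape is a guess, so adjust the keys to match what the skill actually writes. The second entry is invented for illustration.

```python
import csv
import io
import json

# Assumed shape of the history file; the real file may differ.
sample_history = """
[
  {"date": "2026-04-13", "scores": {"Windows": 0.1115, "Linux": 0.1561}},
  {"date": "2026-07-13", "scores": {"Windows": 0.1327, "Linux": 0.1784}}
]
"""


def history_to_csv(history):
    """Flatten history entries: one row per run, one column per platform."""
    platforms = sorted({p for entry in history for p in entry["scores"]})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", *platforms])
    for entry in sorted(history, key=lambda e: e["date"]):
        row = [entry["scores"].get(p, "") for p in platforms]
        writer.writerow([entry["date"], *row])
    return buffer.getvalue()


history = json.loads(sample_history)
csv_text = history_to_csv(history)
print(csv_text)
```

Each platform becomes its own column, so a line chart over the date column gives one line per platform.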

ACRE output example

Here’s an example of what the acre-coverage skill will generate when you run it:

acre_output-example
ACRE Score Report — 2026-04-13

Scores

| Platform | ACRE Score | Covered Techniques | Weighted Score |
| --- | --- | --- | --- |
| Windows | 0.1115 (11.2%) | 25 / 226 detectable | 35.0 / 314.0 |
| Linux | 0.1561 (15.6%) | 15 / 98 detectable | 21.0 / 134.5 |
| macOS | 0.1038 (10.4%) | 4 / 39 detectable | 5.5 / 53.0 |

Detectable threshold: ≥5 open-source rules (Sigma + Splunk + Elastic combined)

Tactic weight (1.5×): persistence, privilege-escalation, defense-evasion, credential-access, discovery, lateral-movement, collection

This is the first run — trend history will appear on the next run after you add detections.

Top Coverage Gaps — Windows

320 total gaps across all platforms.

| Technique | Name | Weight | Sigma | Splunk | Elastic |
| --- | --- | --- | --- | --- | --- |
| T1218 | System Binary Proxy Execution | 1.5× | 140 | 14 | 18 |
| T1027 | Obfuscated Files or Information | 1.5× | 85 | 6 | 3 |
| T1055 | Process Injection | 1.5× | 26 | 26 | 11 |
| T1087.002 | Domain Account | 1.5× | 21 | 27 | 7 |
| T1036 | Masquerading | 1.5× | 34 | 9 | 11 |
| T1218.005 | Mshta | 1.5× | 7 | 12 | 26 |
| T1098 | Account Manipulation | 1.5× | 16 | 13 | 18 |
| T1003 | OS Credential Dumping | 1.5× | 23 | 7 | 15 |
| T1068 | Exploitation for Privilege Escalation | 1.5× | 10 | 17 | 16 |
| T1133 | External Remote Services | 1.5× | 11 | 31 | 1 |

Top Coverage Gaps — Linux

| Technique | Name | Weight | Sigma | Splunk | Elastic |
| --- | --- | --- | --- | --- | --- |
| T1068 | Exploitation for Privilege Escalation | 1.5× | 5 | 8 | 37 |
| T1548.003 | Sudo and Sudo Caching | 1.5× | 0 | 35 | 19 |
| T1543 | Create or Modify System Process | 1.5× | 0 | 3 | 36 |
| T1574 | Hijack Execution Flow | 1.5× | 0 | 0 | 33 |
| T1078 | Valid Accounts | 1.5× | 0 | 2 | 32 |
| T1574.006 | Dynamic Linker Hijacking | 1.5× | 2 | 5 | 32 |
| T1082 | System Information Discovery | 1.5× | 9 | 5 | 23 |
| T1014 | Rootkit | 1.5× | 1 | 3 | 22 |
| T1547.006 | Kernel Modules and Extensions | 1.5× | 1 | 7 | 20 |
| T1562.001 | Disable or Modify Tools | 1.5× | 3 | 5 | 20 |

Top Coverage Gaps — macOS

| Technique | Name | Weight | Sigma | Splunk | Elastic |
| --- | --- | --- | --- | --- | --- |
| T1087.002 | Domain Account | 1.5× | 0 | 11 | 1 |
| T1562.001 | Disable or Modify Tools | 1.5× | 1 | 3 | 8 |
| T1082 | System Information Discovery | 1.5× | 4 | 2 | 5 |
| T1133 | External Remote Services | 1.5× | 1 | 5 | 4 |
| T1110 | Brute Force | 1.5× | 0 | 9 | 1 |
| T1005 | Data from Local System | 1.5× | 0 | 0 | 8 |
| T1543.004 | Launch Daemon | 1.5× | 2 | 0 | 5 |
| T1555.001 | Keychain | 1.5× | 1 | 0 | 5 |
| T1078.003 | Local Accounts | 1.5× | 4 | 0 | 5 |
| T1647 | Plist File Modification | 1.5× | 0 | 1 | 5 |

Output Files

| File | Contents |
| --- | --- |
| `acre_current.json` | Today’s scores |
| `acre_history.json` | History (1 entry — baseline established) |
| `acre_gaps.json` | 319 prioritized gaps with rule file paths |
| `parsed_detections.json` | 50 normalized detection records |
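The figures in the Scores table suggest that the reported ACRE score is the weighted ratio rather than the raw technique count. A quick check under that assumption, using only numbers from the report above:

```python
# Per platform: covered, detectable, weighted covered, weighted detectable,
# and the score as printed in the report.
report = {
    "Windows": (25, 226, 35.0, 314.0, 0.1115),
    "Linux": (15, 98, 21.0, 134.5, 0.1561),
    "macOS": (4, 39, 5.5, 53.0, 0.1038),
}

for platform, (covered, detectable, w_cov, w_det, reported) in report.items():
    weighted = round(w_cov / w_det, 4)          # weighted ratio
    unweighted = round(covered / detectable, 4)  # raw technique ratio
    matches = weighted == reported
    print(platform, weighted, unweighted, detectable - covered, matches)

# Uncovered (technique, OS) pairs summed over all platforms.
total_gaps = sum(d - c for c, d, *_ in report.values())
print(total_gaps)
```

The weighted ratios reproduce the reported scores for all three platforms, and the summed gap count of 319 matches the number listed for `acre_gaps.json`.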

What’s next to improve ACRE?

Better defining the numerator: how much does a given rule increase coverage?
- Decomposing techniques into TRRs will allow more precise detection measurements, along with a host of other benefits like the ability to validate vendor detection effectiveness
Better defining the denominator:
- ATT&CK is well-covered for host operating systems, but we need similar data for Trusted Service Infrastructure (TSI) and other types of systems that must be defended
- I’m collecting different types of Technique matrices that will need to be deduplicated and combined to expand the denominator

Footnotes

¹ It doesn’t help that MITRE ATT&CK® is not designed for detection!
² Technically, detection rules can’t detect techniques, but they can detect ATT&CK Procedures and Instances that are linked to a technique.
³ It seems like the data was generated by doing text searches in the detection repos, but that means it matched deprecated Sigma rules or documentation in other repos.
