
Why isn’t anomaly detection well-regarded in cybersecurity?


Anomaly detection has long been viewed as a promising approach to detecting potential cybersecurity attacks. With the sheer volume of logs, alerts, and network traffic, the task of identifying unusual activity is daunting. Traditional machine learning (ML) methods have been applied to anomaly detection for years, but despite their early promise, they are not well regarded by cybersecurity practitioners. As recent articles and research have indicated, ML-based anomaly detection is often seen as contributing to a flood of alerts that overwhelms security teams, making a bad problem worse.

[Image: Just another day in the SOC]

I believe there are a few reasons that past approaches to anomaly detection via machine learning have failed to live up to their promise. Please share your thoughts as well; by learning from what has NOT worked, perhaps we can find better paths toward more effective and efficient cybersecurity.

Even if I had labels, they’d lead me astray

Traditional ML-based anomaly detection relies heavily on human-made, pre-labeled attack data, then attempts to recognize similar attack patterns in new data.
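To make that concrete, here is a minimal sketch of the traditional supervised approach. The dataset, feature columns, and label column are all hypothetical stand-ins, not a reference to any particular product or corpus:

```python
# A minimal sketch of traditional supervised detection; the file name,
# feature names, and "attack" label column are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

flows = pd.read_csv("labeled_flows.csv")  # hypothetical: each row labeled attack = 0/1
features = ["bytes_sent", "bytes_received", "duration", "dst_port"]
X_train, X_test, y_train, y_test = train_test_split(
    flows[features], flows["attack"], test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(f"Held-out accuracy: {clf.score(X_test, y_test):.3f}")
# The model can only flag patterns resembling the labeled attacks it was
# shown; a genuinely novel technique looks like any other benign flow.
```

A sketch like this is only as good as its labels, which is precisely the problem.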

There are a couple of challenges with this approach:

First, attack labels are very hard to come by. While MITRE has done an incredible job cataloging common attack patterns in MITRE ATT&CK, those sequences are defined, by design, at a high level. As someone who has spent countless hours looking for good attack labels from academic papers and research institutes, I know they are few and far between; the high-quality ones we have found can be useful for evaluating a model’s accuracy but are insufficient as the raw material for training a state-of-the-art model.

Second, and perhaps even more importantly, models and rules based on specific attack patterns are blind to novel attacks. Novel attacks matter more than ever, partly because attackers are innovating with the help of generative AI, and partly because the pace of innovation in operations (e.g., SaaS, PaaS, and API platforms) has opened far more room for novelty in attacks.

Beyond the difficulty of finding useful attack labels, a variety of other challenges have limited the effectiveness of anomaly detection models for cybersecurity.

Humans all the way down

Ironically, while ML systems are built in part to augment and improve upon the rules-based approaches that still dominate in security, traditional ML depends on features designed by humans. Feature design requires the same understanding of connections between data that authoring rules-based indicators does. And model designers are often less qualified to judge which features matter than traditional detection engineers, who sit closer to security operations.

The all-too-human frailty of these models is particularly important in anomaly detection in noisy systems such as enterprise and service provider networks. These networks are changing constantly, and the permutations of routes through the network and states of the entities emitting logs are effectively infinite.

For example, a spike in network traffic between two entities on the network might trigger an alert based on volume alone. But what if that spike is due to a legitimate software update or a routine maintenance task? An accurate rule or feature set might need to consider the time of day, the day of the week, the calendar itself, and the duration, sequence, and frequency of events, plus context such as every prior role a particular entity, individual, or server has played on a given network, and the relationships of these events to most if not all of the other events occurring at the same time. The complexity is staggering, and beyond human comprehension whether we are tasked with writing rules or selecting features.
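To see how quickly that context explodes, here is a sketch of just one narrow slice of it: a per-pair, per-hour-of-day volume baseline. The flow-log file and column names are illustrative assumptions:

```python
# A sketch of one slice of the hand-engineered context described above;
# "flows.csv" and all column names are hypothetical.
import pandas as pd

logs = pd.read_csv("flows.csv", parse_dates=["timestamp"])
logs["hour"] = logs["timestamp"].dt.hour
logs["weekday"] = logs["timestamp"].dt.weekday

# Per source/destination pair, per hour of day: mean and std of bytes moved.
baseline = (
    logs.groupby(["src", "dst", "hour"])["bytes"]
        .agg(["mean", "std"])
        .reset_index()
)
scored = logs.merge(baseline, on=["src", "dst", "hour"])
scored["zscore"] = (scored["bytes"] - scored["mean"]) / scored["std"]
alerts = scored[scored["zscore"] > 3]  # naive fixed threshold

# Even this ignores maintenance windows, role changes, event ordering, and
# every concurrent flow elsewhere on the network. Each omission is a human
# judgment call, and each one is a source of false positives.
```

And that is one feature family, for one signal, on one pair of entities; the real context the paragraph above describes multiplies this across the whole network.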

The more you build, whether via rules or machine learning systems, the more brittle the system becomes, in part because staying accurate means stretching the understanding of us mere humans.

[Image: Attacks and attack vectors have changed. Add another rule, tweak the old one, and compare to your baseline. You do have a baseline, right?]

Even if you could feature-engineer the relationships between all entities on a network (end-user devices included, of course), current approaches to training and delivering insights have a hard time scaling to the sheer volume of data. As a result, the more data you have, the more frequent and disruptive the false positives become.

Lack of Transfer Learning

One additional major limitation of traditional ML models is their inability to perform transfer learning. Transfer learning, which has been a breakthrough in many AI fields, allows a model to take the knowledge it has gained in one domain and apply it to another, related domain. Unfortunately, most traditional ML approaches in cybersecurity are narrowly trained on specific datasets or attack patterns and cannot generalize well beyond these. This means that for each new dataset or environment, the model essentially starts from scratch, limiting its utility and requiring significant retraining. In contrast, true foundation models are able to adapt to new contexts with minimal retraining, allowing them to recognize anomalies in vastly different network setups without losing their effectiveness.
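As a rough illustration of the contrast, the following sketch (in PyTorch) freezes a hypothetical pretrained log encoder and retrains only a small classification head for a new environment. The model class, checkpoint file, and hyperparameters are stand-ins, not any real pretrained artifact:

```python
# A conceptual sketch of transfer learning for log anomaly detection;
# the architecture and checkpoint are hypothetical.
import torch
import torch.nn as nn

class LogEncoder(nn.Module):
    """Hypothetical encoder pretrained on logs from many environments."""
    def __init__(self, vocab_size=10_000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(dim, 2)  # normal vs. anomalous

    def forward(self, tokens):
        # tokens: (batch, seq_len) of log-event IDs
        return self.head(self.encoder(self.embed(tokens)).mean(dim=1))

model = LogEncoder()
model.load_state_dict(torch.load("pretrained_log_encoder.pt"))  # hypothetical checkpoint

# Transfer learning: freeze the pretrained encoder and adapt only the small
# head to the new environment, instead of starting from scratch every time.
for p in model.encoder.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-4)
```

The point is the shape of the workflow: most of the learned knowledge is reused, and only a thin layer is adapted per environment.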

Inability to Capture Complex Event Sequences and Time Signatures

Another significant shortcoming of traditional ML methods is their poor handling of complex sequences of events and the timing of those events. For many types of cyber attacks, the sequence and timing of actions are critical indicators. Traditional models often fail to consider how events unfold over time or their dependencies on one another. While early attempts with models like RNNs and LSTMs tried to address this by looking at event sequences, they still suffer from issues like vanishing gradients, which limit their ability to retain information about events that occurred farther back in time. This is a serious drawback when detecting sophisticated, slow-burning attacks that unfold over hours or even days. Current models struggle to contextualize these long-term dependencies, further limiting their effectiveness in real-world cybersecurity applications.
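Here is a minimal sketch of that sequence-modeling approach in PyTorch, with an illustrative event vocabulary and sequence length, to show where the long-range problem bites:

```python
# A minimal sketch of the RNN/LSTM approach to event sequences;
# the event vocabulary size and sequence length are illustrative.
import torch
import torch.nn as nn

class SequenceDetector(nn.Module):
    def __init__(self, n_event_types=500, dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_event_types, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, 1)

    def forward(self, events):            # events: (batch, seq_len) event IDs
        out, _ = self.lstm(self.embed(events))
        return self.head(out[:, -1])      # score from the final hidden state

# A slow-burning attack that spans days can mean sequences of thousands of
# events; by the final step, gradients flowing back to the earliest events
# have vanished, so the model effectively forgets how the intrusion began.
detector = SequenceDetector()
scores = detector(torch.randint(0, 500, (8, 10_000)))  # 8 long sequences
```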

For now, let’s conclude with this: while anomaly detection is a well-known discipline, it has largely failed in cybersecurity due to a variety of factors, including but not limited to those mentioned here.

Humans are simply not able to conceptualize the permutations and complexity of what constitutes normal flows and typical behaviors by entities on any network of real complexity. And even if we reduce the role of humans to “just” feature engineering, we are still exposing our all-too-human limitations when we build these models.

It really is a case of “humans all the way down.”

In the next post, I’ll dive into a particularly promising approach that emerged from graph theory and begin to discuss the role of deep learning in anomaly detection. I am motivated by a strong belief that “more of the same” is an irresponsible approach to cybersecurity. The attackers are winning. We need to do better, and to do so we should learn from prior attempts to apply technologies from outside of cybersecurity, including anomaly detection.
