Understanding LogLM: The deep learning foundation model built for threat detection

Foundation models have transformed how we approach complex problems in artificial intelligence. Security researchers have begun exploring whether this approach works for cybersecurity telemetry. DeepTempo's LogLM (Log Language Model) provides an answer: deep learning foundation models trained on operational data can learn behavioral structures that reveal attacker intent.

LogLM is not a language model repurposed for security. It is a transformer-based foundation model built specifically to understand how systems communicate. Where traditional detection relies on signatures or deviation from baseline, DeepTempo's two-stage architecture learns behavioral representations that classifiers interpret to identify attacker intent. This distinction matters because attackers can make individual network flows appear normal. They cannot make the behavioral timeline structure normal.

How foundation models work for threat detection

Foundation models in cybersecurity operate differently from their language-model counterparts. Purpose-built security foundation models are encoder-focused systems that create behavioral embeddings rather than generating text. They excel at capturing the structural relationships within complex patterns through learned representations.

The transformer architecture enables this through self-attention mechanisms that process entire behavioral timelines rather than individual events. Where recurrent networks struggle with longer contexts, transformers maintain effectiveness across extended timelines. This proves critical for detecting attacker intent that manifests across multiple flows.
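
To make the idea concrete, here is a minimal sketch of a transformer encoder attending over a full behavioral timeline at once. The feature counts, dimensions, and pooling choice are illustrative assumptions, not DeepTempo's implementation.

```python
# Minimal sketch (not DeepTempo's implementation): a transformer encoder that
# attends over an entire behavioral timeline at once. Sizes are assumptions.
import torch
import torch.nn as nn

FLOW_FEATURES = 16   # assumed number of features per flow
EMBED_DIM = 64       # assumed embedding width
TIMELINE_LEN = 128   # assumed number of flows in one behavioral timeline

encoder_layer = nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
project = nn.Linear(FLOW_FEATURES, EMBED_DIM)   # lift flow features into model space

timeline = torch.randn(1, TIMELINE_LEN, FLOW_FEATURES)  # one synthetic timeline
hidden = encoder(project(timeline))   # self-attention relates every flow to every other flow
embedding = hidden.mean(dim=1)        # pool into a single timeline embedding
print(embedding.shape)                # torch.Size([1, 64])
```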

The two-stage architecture

DeepTempo implements threat detection through deliberate separation between behavioral learning and intent classification. This design reflects a fundamental insight: understanding how systems operate differs from determining what a specific operational pattern attempts to accomplish.

Stage one: LogLM foundation model. The foundation model learns behavioral representations from flow telemetry between endpoints. Raw NetFlow data undergoes canonical pairing to ensure bidirectional communication appears consistent. The system constructs behavioral timelines representing how two endpoints interact over time. Each flow contributes multiple features that characterize the communication pattern.
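
As a rough illustration of what canonical pairing and timeline construction can look like, the sketch below groups synthetic flow records by an order-independent endpoint pair and sorts each group by time. The field names are assumptions, not DeepTempo's actual schema or feature set.

```python
# Illustrative sketch of canonical pairing and timeline construction.
# Field names (src, dst, ts, bytes) are assumptions, not DeepTempo's schema.
from collections import defaultdict

def canonical_pair(flow):
    """Order the endpoint pair so A->B and B->A map to the same key."""
    a, b = flow["src"], flow["dst"]
    return (a, b) if a <= b else (b, a)

def build_timelines(flows):
    """Group flows by canonical endpoint pair and sort each group by time."""
    timelines = defaultdict(list)
    for flow in flows:
        timelines[canonical_pair(flow)].append(flow)
    for pair in timelines:
        timelines[pair].sort(key=lambda f: f["ts"])
    return timelines

flows = [
    {"src": "10.0.0.5", "dst": "10.0.0.9", "ts": 100, "bytes": 420},
    {"src": "10.0.0.9", "dst": "10.0.0.5", "ts": 101, "bytes": 1310},
]
print(build_timelines(flows))  # both directions land in one bidirectional timeline
```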

LogLM processes these behavioral timelines through transformer layers that create embedding vectors. Similar behavioral timelines cluster together in this embedding space. The model learns structural similarities without understanding intent. A reconnaissance behavioral timeline might cluster near other reconnaissance activity because their structural patterns align, not because LogLM recognizes "reconnaissance."
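
A quick way to picture "clustering in embedding space" is cosine similarity between timeline embeddings. The vectors below are random stand-ins rather than real model output, but they show the check an embedding space makes possible.

```python
# Sketch: structurally similar timelines should land near each other in
# embedding space. Cosine similarity is one conventional way to measure this;
# the embeddings here are random stand-ins, not real LogLM output.
import torch
import torch.nn.functional as F

recon_a = torch.randn(64)                      # embedding of a reconnaissance-like timeline
recon_b = recon_a + 0.1 * torch.randn(64)      # a structurally similar timeline
routine = torch.randn(64)                      # an unrelated operational timeline

print(F.cosine_similarity(recon_a, recon_b, dim=0))  # close to 1.0
print(F.cosine_similarity(recon_a, routine, dim=0))  # near 0 on average
```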

Stage two: Intent classifiers. Separate classifier heads interpret the behavioral embeddings to assign intent. A binary classifier determines whether a behavioral timeline represents operational activity or attacker behavior. The multi-label classifier then maps malicious behavioral timelines to MITRE ATT&CK tactics.

This separation proves essential. The foundation model creates embeddings that represent behavioral timeline structures. Classifiers determine what those structures mean. When new attack techniques emerge, retraining classifiers requires less data than rebuilding the foundation model's behavioral representations.
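
A minimal sketch of what such classifier heads could look like, reading a frozen timeline embedding, appears below. The tactic list and layer sizes are illustrative assumptions.

```python
# Sketch of the two classifier heads described above, reading a fixed
# timeline embedding. Tactic list and head sizes are illustrative.
import torch
import torch.nn as nn

EMBED_DIM = 64
TACTICS = ["TA0043", "TA0008", "TA0011"]  # Reconnaissance, Lateral Movement, C2

binary_head = nn.Linear(EMBED_DIM, 1)             # operational vs attacker behavior
tactic_head = nn.Linear(EMBED_DIM, len(TACTICS))  # multi-label ATT&CK tactics

embedding = torch.randn(1, EMBED_DIM)             # stand-in for a LogLM embedding
p_malicious = torch.sigmoid(binary_head(embedding))
p_tactics = torch.sigmoid(tactic_head(embedding)) # independent per-tactic scores

print(float(p_malicious), dict(zip(TACTICS, p_tactics.squeeze(0).tolist())))
```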

What LogLM learns from behavioral timelines

Individual network flows often appear normal during attacks. An attacker scanning a network might operate within expected parameters and use legitimate methods. Examined alone, each connection looks benign. The behavioral timeline structure reveals the malicious intent.

LogLM trains on diverse operational data including both routine system activity and attacker behavior. The model develops representations that encode how flows relate within a single behavioral timeline. These relationships capture structural patterns in how endpoints communicate. The classifiers then interpret these representations to identify structural signatures of malicious intent.

Traditional anomaly detection trains only on benign data and flags deviations from a baseline, so an attacker operating within normal parameters can evade it. Because LogLM trains on both benign and malicious behavioral timelines, it creates embeddings where structurally similar timelines cluster together, and the classifiers interpret those embeddings to identify reconnaissance patterns independent of whether the activity deviates from any expected baseline.
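
The toy comparison below illustrates the mechanical difference: a baseline detector scores a single flow by its deviation from benign statistics, while an intent classifier scores the whole timeline embedding. Every number and model here is a synthetic stand-in.

```python
# Toy contrast (illustrative assumptions throughout): a baseline detector
# scores each flow by deviation from benign statistics, while the intent
# classifier reads the timeline embedding instead of per-flow deviation.
import torch
import torch.nn as nn

benign_mean, benign_std = 500.0, 120.0    # assumed baseline of bytes per flow
attacker_flow_bytes = 480.0               # a flow that stays within normal parameters
z_score = abs(attacker_flow_bytes - benign_mean) / benign_std
print(f"anomaly z-score: {z_score:.2f}")  # ~0.17: far below any alert threshold

classifier = nn.Linear(64, 1)             # stand-in intent classifier head
timeline_embedding = torch.randn(1, 64)   # stand-in LogLM timeline embedding
print(torch.sigmoid(classifier(timeline_embedding)))  # verdict based on the embedding
```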

Detection without attack chain reconstruction

DeepTempo evaluates each behavioral timeline independently. The system does not reconstruct full attack chains or track progression across multiple behavioral timelines. When the system flags reconnaissance, that determination comes from classifiers interpreting LogLM embeddings that match learned reconnaissance patterns, not from tracking subsequent lateral movement.

This differs fundamentally from correlation engines attempting to piece together attack narratives. Each behavioral timeline exhibits operational intent or attacker intent based on its structure. Detection happens at the behavioral timeline level. The classifier interprets the embedding and assigns probability across intent categories.
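
Sketched below, each timeline embedding receives its own independent verdict; nothing is correlated across timelines. The heads reuse the illustrative two-head layout from earlier and are stand-ins, not DeepTempo's models.

```python
# Sketch: each behavioral timeline is scored on its own; nothing is
# correlated across timelines. Heads and labels are illustrative stand-ins.
import torch
import torch.nn as nn

binary_head = nn.Linear(64, 1)
tactic_head = nn.Linear(64, 3)   # e.g. TA0043, TA0008, TA0011

def score_timeline(embedding):
    """Score one timeline embedding independently of all others."""
    return {
        "malicious": torch.sigmoid(binary_head(embedding)).item(),
        "tactics": torch.sigmoid(tactic_head(embedding)).squeeze(0).tolist(),
    }

# Three separate timelines, three independent verdicts, no chain reconstruction.
for emb in torch.randn(3, 1, 64):
    print(score_timeline(emb))
```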

Attackers cannot make the structure of a reconnaissance behavioral timeline match operational patterns while still accomplishing reconnaissance objectives. The structure itself reveals intent. This proves more robust than tracking multi-stage progressions, because attackers can introduce delays, use different endpoints, or fragment activities across infrastructure to evade correlation.

Zero-shot detection capability

Foundation models demonstrate zero-shot performance on tasks they were not explicitly trained for. LogLM exhibits this capability because it learns general behavioral structures rather than specific attack instances. This generalization differs from that of language models adapted for security, because LogLM trains on operational data patterns rather than natural language.

When deployed in new environments, LogLM creates embeddings that cluster structurally similar behavioral timelines together. Classifiers trained to interpret these embeddings can then identify malicious patterns even when specific attack techniques differ from training examples. This emerges from training on diverse operational data. The embeddings capture fundamental structural properties of how attacker behavior differs from operational activity. These properties transfer across environments because they reflect intrinsic characteristics of behavioral timeline structures.

The classifier layer still requires environment adaptation. But the behavioral understanding encoded in LogLM embeddings provides a strong starting point. Organizations achieve effective detection with significantly less labeled data than traditional machine learning approaches require.
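
One common way to express that adaptation in code is to freeze the foundation model and fit only a small classifier head on a modest labeled sample, as in this hypothetical sketch; the stand-in foundation model and synthetic data are assumptions for illustration.

```python
# Sketch of environment adaptation under the assumptions above: the foundation
# model stays frozen and only a lightweight classifier head is trained.
import torch
import torch.nn as nn

foundation = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64))  # stand-in for LogLM
for param in foundation.parameters():
    param.requires_grad = False          # behavioral representations stay fixed

head = nn.Linear(64, 1)                  # only this small head is trained
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# A small labeled sample from the new environment (synthetic here).
features = torch.randn(32, 16)
labels = torch.randint(0, 2, (32, 1)).float()

for _ in range(10):                      # a few passes suffice for a head this small
    optimizer.zero_grad()
    loss = loss_fn(head(foundation(features)), labels)
    loss.backward()
    optimizer.step()
```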

Why this matters for detection teams

Traditional detection systems miss attacks that stay within expected parameters. Signature-based tools catch known threats. Anomaly detection flags deviations. Both fail when attackers understand these detection mechanisms and craft attacks accordingly.

DeepTempo detects based on structural representations learned from diverse training data. An attacker attempting to blend into normal traffic patterns still produces behavioral timeline structures that create distinctive embeddings. LogLM learned to represent these structures during training on operational data that included both benign and malicious behavioral timelines across diverse environments. Classifiers trained on these embeddings identify malicious intent.

This works without per-environment tuning. No rules. No signatures. No baseline configuration. LogLM deploys with pre-trained behavioral representations that generalize across infrastructure types. Detection happens because classifiers interpret these representations to recognize structural patterns indicating malicious intent regardless of how carefully attackers attempt to blend in.

Technical implications

The two-stage architecture separates concerns that traditional detection conflates. LogLM learns representation. Classifiers interpret meaning. This separation enables continuous improvement where classifier updates capture emerging threats without requiring foundation model retraining. It also provides explainability through embedding space analysis. Similar behavioral timelines cluster together, allowing analysts to understand why the system flagged specific activity.
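
As a hypothetical example of embedding-space analysis, the sketch below retrieves the labeled timelines closest to a flagged embedding so an analyst can see what the activity resembles. The reference set and labels are synthetic.

```python
# Sketch of embedding-space explanation: retrieve the most similar known
# timelines for a flagged embedding. Reference data and labels are synthetic.
import torch
import torch.nn.functional as F

reference = torch.randn(100, 64)                 # embeddings of labeled timelines
labels = ["reconnaissance" if i % 4 == 0 else "operational" for i in range(100)]

flagged = torch.randn(64)                        # embedding the system just flagged
sims = F.cosine_similarity(reference, flagged.unsqueeze(0), dim=1)
top = torch.topk(sims, k=5).indices

for i in top.tolist():
    print(labels[i], round(sims[i].item(), 3))   # what the flagged activity clusters near
```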

The transformer architecture scales effectively across behavioral timeline lengths. Where older neural network designs struggled with long sequences, transformers maintain performance as behavioral timelines extend. This proves critical for detecting patient attackers who space their activities over time. The behavioral timeline structure remains detectable regardless of timing.

Foundation models in cybersecurity represent an architectural shift from reactive detection to proactive behavioral representation. LogLM creates embeddings that capture structural properties of behavioral timelines during training on diverse operational data. During operation, classifiers interpret these embeddings to identify malicious patterns. Detection comes from structural representation and interpretation, not from deviation measurement or signature matching.

MITRE: Reconnaissance (TA0043), Lateral Movement (TA0008), Command and Control (TA0011)

What we learned

Deep learning foundation models can learn behavioral representations that generalize across environments and attack techniques. The key insight involves training on structural patterns in operational data rather than attempting to define malicious behavior through rules or baselines. Transformers provide the architectural foundation for processing behavioral timelines effectively. Separating behavioral learning from intent classification enables better adaptation to evolving threats.

Detection happens at the behavioral timeline level based on structural signatures that reveal intent. Attackers can make individual flows appear normal. They cannot make the behavioral timeline structure normal while accomplishing malicious objectives. This fundamental constraint makes intent-based detection more robust than signature matching or anomaly detection.

The approach works without environment-specific tuning because foundation models learn general behavioral representations during training on diverse data. LogLM creates embeddings that capture structural properties of behavioral timelines. These embeddings transfer across infrastructure types because they encode intrinsic patterns in how operational activity structures differ from attacker behavior structures. Classifiers interpret these embeddings to identify threats.

The result: earlier detection of threats that traditional tools miss.

Get in touch to run a 30-day, risk-free assessment in your environment. DeepTempo will analyze your existing data to identify active threats and catch what your existing NDRs and SIEMs might be missing.

See the threats your tools can’t.

DeepTempo’s LogLM works with your existing stack to uncover evolving threats that traditional systems overlook — without adding complexity or replacing what already works.