The category that gets conflated
Walk a security trade show floor in 2026 and "AI security" means at least three different things at different booths. It can mean an analyst copilot wrapped around a SIEM that drafts queries in natural language and summarizes alert clusters. It can mean a code analysis tool that uses large language models to find vulnerabilities. It can mean a detection product that runs a foundation model trained on telemetry to find attacks directly.
These are different products with different use cases, different engineering, and different value to a SOC. The marketing collapses them into one category labeled "AI," and the consequence is that buyers compare products that do not actually compete.
What analyst copilots do well
Analyst copilots are useful and worth investing in. The use cases are clear.
Query drafting. Analysts who do not write KQL or SPL fluently get production-quality queries in seconds (see the sketch after this list). This compresses the gap between junior and senior analysts and reduces time-to-investigation on routine alerts.
Alert summarization. A cluster of related alerts gets a one-paragraph summary. Analysts spend less time reading raw events and more time deciding what to do.
Incident narrative generation. When an investigation is complete, the copilot produces a draft incident report. Analysts edit rather than start from scratch.
Knowledge lookup. What does this MITRE technique mean, what are the indicators of this CVE, how do I respond to this kind of phishing. The copilot pulls from documentation faster than a human.
Tier-1 escalation drafting. Junior analysts who are unsure whether to escalate get a structured second opinion that reduces both false escalation and missed escalation.
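To make the query-drafting use case concrete, here is a minimal sketch of the workflow, assuming a generic chat-completion client. The llm_complete callable and the prompt are hypothetical stand-ins, not any specific vendor's implementation.

```python
# Minimal sketch of copilot query drafting. `llm_complete` is a hypothetical
# stand-in for any chat-completion client; the prompt is illustrative, not a
# production prompt.

SYSTEM_PROMPT = (
    "You translate plain-English investigation questions into KQL for "
    "a SIEM. Return only the query, no commentary."
)

def draft_kql(question: str, llm_complete) -> str:
    """Turn an analyst's plain-English question into a draft KQL query.

    llm_complete: hypothetical callable (system_prompt, user_prompt) -> str.
    The output is a draft for the analyst to review, not a query to run blind.
    """
    return llm_complete(SYSTEM_PROMPT, question)

# The kind of question a tier-1 analyst might ask:
# draft_kql("failed sign-ins for admin accounts in the last 24 hours",
#           llm_complete=my_client)
```

The shape of the workflow is the point: one human question, one model call, one draft a human reviews. That shape is exactly what breaks down when the input is a telemetry stream instead of a question.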
SOCs that adopt analyst copilots well report measurable reductions in investigation time and improvements in junior-analyst output. The category exists for a reason.
What analyst copilots do not do
The category has limits, and understanding them matters.
They do not detect attacks the underlying tools missed. A copilot wrapped around a SIEM operates on the data the SIEM ingests and the alerts the SIEM produced. If the SIEM did not catch the attack, the copilot has nothing to summarize. The detection gap is unchanged by adding a copilot.
They hallucinate on technical specifics. General-purpose LLMs produce confident output that is sometimes wrong. In a security investigation, a hallucinated indicator or MITRE mapping is worse than no answer.
They are slow at production telemetry volumes. A copilot can summarize one alert quickly. Reading the underlying telemetry stream of a million events per minute is not what they do.
They cost money per query. Frontier LLM API pricing is reasonable for human-driven workflows. It is not reasonable for machine-driven detection at telemetry volumes. Cost-per-detection in a copilot architecture is one to three orders of magnitude higher than what a SOC can sustain.
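A back-of-envelope calculation shows the gap. Every number below is an assumption chosen only to illustrate the order of magnitude; none of it is vendor pricing.

```python
# Back-of-envelope only. All numbers are assumptions chosen to show the
# order-of-magnitude gap, not vendor pricing.

EVENTS_PER_MINUTE = 1_000_000        # the telemetry volume cited above
TOKENS_PER_EVENT = 50                # assumed average tokens per log event
LLM_PRICE_PER_MTOK = 1.00            # assumed frontier-LLM price, $ per 1M input tokens
SMALL_MODEL_PRICE_PER_MTOK = 0.001   # assumed cost of a small purpose-built model

def monthly_cost(price_per_mtok: float) -> float:
    tokens_per_month = EVENTS_PER_MINUTE * 60 * 24 * 30 * TOKENS_PER_EVENT
    return tokens_per_month / 1_000_000 * price_per_mtok

print(f"LLM on the telemetry stream: ${monthly_cost(LLM_PRICE_PER_MTOK):,.0f}/month")
print(f"Purpose-built model:         ${monthly_cost(SMALL_MODEL_PRICE_PER_MTOK):,.0f}/month")
# With these assumptions the gap is 1,000x: squarely inside the one-to-three
# orders of magnitude range described above.
```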
These are not flaws in the copilots. They are properties of the architecture.
Why detection foundation models are a different category
Detection foundation models are a different architecture for a different purpose. Training data is operational telemetry, not natural language. Output is embeddings, not text. The downstream consumer is a classifier that produces structured detections, not a human reading prose. Latency is measured in milliseconds. Cost-per-inference is small enough to operate at telemetry volumes.
These properties are what make a foundation model useful for detection. They are also what make the model useless as an analyst copilot. A LogLM cannot summarize an incident in plain English. It produces embeddings that classifiers turn into detections. That is what it is for.
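For contrast with the copilot sketch above, here is the shape of that pipeline. Every name in it is hypothetical, and the threshold is illustrative; this is not DeepTempo's API, just the structural point that there is no natural-language step on the hot path.

```python
# Shape of a detection-foundation-model pipeline, with hypothetical names
# throughout. Not DeepTempo's API: the point is structural. Telemetry in,
# embeddings in the middle, structured detections out.

from dataclasses import dataclass

@dataclass
class Detection:
    event: str        # the raw log event that triggered the detection
    technique: str    # e.g. a MITRE ATT&CK ID such as "T1110"
    score: float      # classifier confidence

def detect(events: list[str], embed_telemetry, classify) -> list[Detection]:
    """embed_telemetry: hypothetical model call, log event -> vector.
    classify: hypothetical classifier head, vector -> (technique, score)."""
    detections = []
    for event in events:
        vector = embed_telemetry(event)        # embeddings, not prose
        technique, score = classify(vector)    # structured output, not text
        if score > 0.9:                        # threshold is illustrative
            detections.append(Detection(event, technique, score))
    return detections
```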
The two categories are complementary. A SOC running both gets faster human investigation (from the copilot) and better detection coverage (from the detection model). A SOC running only one gets one of those benefits.
What we built and what we did not
DeepTempo built LogLM as a detection foundation model, and then built a copilot that interprets LogLM's results and adds context around them. We did not try to use a copilot for detection. The reason is structural. Building a copilot well is a different engineering project than building a detection model. The training data is different. The architecture is different. The evaluation discipline is different. The deployment infrastructure is different. The cost economics are different.
Trying to do both with one model produces a product that is mediocre at both. Several vendors are trying this, and the results are predictable. We built LogLM as a purpose-built detection model because the value to a SOC is largest where existing tools fail, and existing tools fail at detection, not at investigation.
How the engineering effort split
It is worth being concrete about what building LogLM required, because it explains why the categories are different. Training data assembly demanded ongoing engineering investment: curating substantial volumes of operational telemetry across cloud, data center, and OT environments, with adversarial coverage and privacy-aware handling. None of this is what an analyst copilot needs.
The tokenizer is custom, designed to preserve the structure of operational telemetry rather than shred identifiers into subword fragments. An analyst copilot uses a natural-language tokenizer because its inputs are natural language. The architecture decisions about context length, attention pattern, and inference cost were made for production telemetry volumes. The classifier head against MITRE ATT&CK is trained on labeled adversarial activity. The evaluation discipline is built around zero-shot accuracy, adaptation curves, false positive rate at scale, and adversarial robustness.
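A toy contrast makes the tokenizer point visible. Both functions below are illustrative caricatures, not LogLM's tokenizer or any real BPE implementation; the log line and field names are made up.

```python
# Illustrative contrast between natural-language subword tokenization and
# structure-preserving telemetry tokenization. Both are caricatures.

import re

LOG_LINE = "src=10.0.4.17 dst=203.0.113.9 dport=443 action=allow"

def subword_style(line: str) -> list[str]:
    # A BPE-style tokenizer has no notion of fields; an IP address tends
    # to shred into fragments like "10", ".", "0", ".", "4", ".", "17".
    return re.findall(r"[A-Za-z]+|\d+|[^\w\s]", line)

def field_aware(line: str) -> list[str]:
    # A telemetry tokenizer can keep each key=value pair whole, so the
    # model sees each identifier as a single unit with a stable position.
    return line.split()

print(subword_style(LOG_LINE))
print(field_aware(LOG_LINE))
```

The second form is what lets a downstream classifier treat an IP address as one identifier rather than a run of digit fragments, which matters when the signal lives in relationships between fields rather than in word pieces.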
These are different products. They share the word AI in marketing copy. They share little else.
