How it started
Vigil began as a side project, in response to a growing trend. A vendor wraps a reasoning model in proprietary middleware, charges an enterprise $40,000 to $200,000 a year to start, and gives the security team no way to inspect why an agent made a decision, no path to extend the platform for their environment, and no mechanism to share improvements with anyone else. The agent logic is a black box. The integrations are proprietary. The data leaves your environment. That is not a security tool. It is a dependency with a dashboard.
Vigil is the alternative: fully open-source, Apache 2.0, local-first, every agent's reasoning readable, every workflow a text file you can modify and share. You can clone it and have a running SOC in under three minutes.
What the architecture looks like
The design follows three layers that interlock cleanly: agents, workflows, and integrations. Keeping those layers separate was the most important structural decision early on, because it means you can extend any one of them without touching the others.
Agents are specialists. Each one has a defined role, a tuned reasoning mode, and access to a specific set of tools. The Triage Agent runs in fast mode, because its job is noise reduction at volume. The Investigator runs in deep reasoning mode, because its job is root cause reconstruction across correlated evidence. The Responder has a confidence threshold: it auto-approves containment actions above 0.90 and routes everything below that to a human reviewer. That threshold is configurable. The automation boundary is yours to set. Vigil currently ships with 13 agents covering triage, investigation, MITRE mapping, correlation, response, reporting, detection engineering, enrichment, identity analysis, network forensics, malware analysis, compliance mapping, and case management.
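A minimal sketch of what that role separation might look like in code. The agent names mirror the shipped ones, but the AgentSpec shape and the tool names are illustrative assumptions, not Vigil's actual API:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical shape of an agent definition: one role, one
    reasoning mode, and a tightly scoped tool set."""
    name: str
    reasoning_mode: str                       # "fast" or "deep"
    tools: tuple[str, ...] = field(default=())

# Three of the shipped agents, with made-up tool names for illustration.
TRIAGE = AgentSpec("triage", "fast", ("alert_queue", "severity_scorer"))
INVESTIGATOR = AgentSpec("investigator", "deep", ("log_search", "timeline"))
RESPONDER = AgentSpec("responder", "fast", ("isolate_host", "disable_account"))
```

The point of the shape is the constraint: an agent can only call what is in its `tools` tuple, which is what keeps each specialist's responsibility boundary tight.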
Workflows are where agents become useful to an actual analyst. A workflow chains agents into a complete end-to-end playbook, triggered by a single command. The four that ship with Vigil today cover incident response from alert to audit-ready documentation, full investigation with timeline reconstruction across all data sources, proactive threat hunting using MITRE ATT&CK as the framework, and forensic analysis with chain-of-custody documentation. Defining a new workflow means writing a SKILL.md file that specifies agent sequence, phase-level tool access, and natural-language instructions for each step. If your team has a workflow that works, you can encode it in a file and share it with the community.
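As an illustration only (the actual SKILL.md schema is defined in the repo), a new workflow file might look something like this, with a hypothetical workflow name and phase contents:

```markdown
# Skill: credential-theft-response

## Phase 1: Triage
Agent: triage
Tools: alert_queue, severity_scorer
Instructions: Score the alert and drop known-benign noise before escalating.

## Phase 2: Investigate
Agent: investigator
Tools: log_search, timeline
Instructions: Reconstruct the login timeline and identify the initial access vector.
```

Because the whole playbook is a text file, it can be diffed, reviewed, and shared the same way code is.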
Integrations run on MCP. The Model Context Protocol is an open standard, and building on it means every MCP server in the security ecosystem is a potential Vigil integration without custom code on our side. Vigil ships with 30+ integrations covering Splunk, CrowdStrike, VirusTotal, Shodan, Hybrid Analysis, Jira, Slack, and DeepTempo, among others. Adding a new one means wrapping an existing API in an MCP server, not petitioning a vendor roadmap.
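The pattern is simple enough to sketch in plain Python. An MCP server is essentially a process that advertises named tools and dispatches calls to them; real servers use an MCP SDK speaking JSON-RPC over stdio, so everything below, including the `lookup_ip` tool, is an illustrative stand-in rather than the protocol itself:

```python
import json

# Registry of tools this (toy) server advertises.
TOOLS: dict[str, dict] = {}

def tool(name: str, description: str):
    """Register a function as a callable tool."""
    def register(fn):
        TOOLS[name] = {"description": description, "fn": fn}
        return fn
    return register

@tool("lookup_ip", "Query a (hypothetical) threat-intel API for an IP")
def lookup_ip(ip: str) -> dict:
    return {"ip": ip, "reputation": "unknown"}  # stubbed API call

def handle(request: str) -> str:
    """Dispatch a JSON request of the form {"tool": ..., "args": {...}}."""
    req = json.loads(request)
    result = TOOLS[req["tool"]]["fn"](**req["args"])
    return json.dumps(result)
```

Wrapping a security API this way is the entire integration story: once the tool is advertised, any MCP-speaking agent can call it without bespoke glue code.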
Raw alert
↓
Triage Agent (fast mode) → severity score, noise filter
↓
Investigator (deep mode) → timeline, root cause, correlated evidence
↓
Responder → blast radius, containment action
≥ 0.90 confidence → auto-approved
< 0.90 confidence → routed to human reviewer
↓
Reporter → MITRE-tagged incident report, audit-ready
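The Responder's automation boundary in the diagram above reduces to a single comparison. The function name and return shape here are illustrative; in Vigil the threshold itself is configurable:

```python
AUTO_APPROVE_THRESHOLD = 0.90  # configurable automation boundary

def route_containment(action: str, confidence: float) -> str:
    """Auto-approve high-confidence containment actions; escalate the rest."""
    if confidence >= AUTO_APPROVE_THRESHOLD:
        return f"auto-approved: {action}"
    return f"queued for human review: {action}"
```

Lowering the threshold widens the automation surface; raising it routes more decisions to a human, which is the knob a security team actually wants to own.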
What was actually hard to build
Getting agent coordination right took more time than I expected. The simple sequential pipeline works for clean cases, but breaks when an investigation requires an agent to branch, backtrack, or pass partial findings to a parallel agent. The Claude Agent SDK handles coordination and state management, which removed a category of problems, but getting the context boundaries right (what each agent carries forward from the previous phase and what it discards) required careful thought and real test scenarios, not just design intuition.
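One way to make a context boundary explicit, sketched here with hypothetical names, is to give each phase handoff a fixed shape so that anything not in it is discarded by construction:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PhaseContext:
    """What one phase hands to the next. Everything not listed here is
    deliberately discarded at the boundary."""
    alert_id: str
    severity: float
    findings: tuple[str, ...]   # distilled conclusions, not raw logs

def triage_to_investigator(raw_alert: dict, severity: float) -> PhaseContext:
    # The triage phase forwards only its distilled verdict; the raw alert
    # payload stays behind so the next agent's context stays small.
    return PhaseContext(
        alert_id=raw_alert["id"],
        severity=severity,
        findings=(f"triage severity {severity:.2f}",),
    )
```

A frozen, enumerated handoff like this makes "what crosses the boundary" a reviewable design decision instead of an accident of what happened to be in the conversation.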
The detection engineering layer was its own problem. Vigil ships with 7,200+ detection rules spanning Sigma, Splunk SPL, Elastic, and KQL. Getting the Detection Engineer agent to do useful coverage analysis across rule formats, rather than just counting rules, required tight prompt design and concrete test cases where the expected output was defined before I wrote a line of agent code. The agent now identifies gaps by MITRE tactic, generates new rule templates, and evaluates whether an existing rule actually fires on the telemetry you have. That is more useful than a rule count.
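The core of format-agnostic coverage analysis, stripped to a sketch. The rule records and tactic names below are illustrative; real rules would be parsed from their Sigma, SPL, Elastic, and KQL sources before landing in this normalized shape:

```python
from collections import Counter

# Hypothetical normalized rule records: only the tactic tag matters here.
RULES = [
    {"id": "r1", "format": "sigma", "tactic": "initial-access"},
    {"id": "r2", "format": "spl",   "tactic": "initial-access"},
    {"id": "r3", "format": "kql",   "tactic": "persistence"},
]

TACTICS = ["initial-access", "persistence", "lateral-movement"]

def coverage_gaps(rules, tactics, min_rules=1):
    """Report tactics covered by fewer than min_rules rules, across all formats."""
    counts = Counter(r["tactic"] for r in rules)
    return [t for t in tactics if counts[t] < min_rules]
```

Counting per tactic rather than per format is what turns "7,200 rules" into an answer to the question an analyst actually has: where am I blind?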
Authentication is still in progress. The current build runs in dev mode that bypasses it. That is a known gap and I am not going to hide it. I would rather ship something real and say clearly where the rough edges are than ship something polished that obscures them.
The Auto-Contributor experiment
One of the more unusual things built into Vigil is the Auto-Contributor tool. You point it at a proprietary AI SOC vendor's website or a PDF of their capability claims, it analyzes what they do, compares it against what Vigil and the open-source ecosystem already cover, and generates ready-to-file GitHub issues with acceptance criteria for the gaps. The goal is to make Vigil a superset of every proprietary AI SOC, one contribution at a time.
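The gap-analysis step at the heart of that loop can be sketched as a set difference. The capability names and issue shape here are hypothetical stand-ins for what the real tool extracts from vendor material:

```python
def gap_issues(vendor_claims: set[str], vigil_coverage: set[str]) -> list[dict]:
    """For each vendor capability Vigil lacks, draft a GitHub-issue stub
    with a placeholder acceptance criterion."""
    return [
        {
            "title": f"Add capability: {cap}",
            "body": f"Acceptance: a workflow or agent demonstrably covers '{cap}'.",
        }
        for cap in sorted(vendor_claims - vigil_coverage)
    ]
```

The hard part, of course, is not this diff but extracting honest, comparable capability claims from marketing material, which is where the reasoning model earns its keep.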
It is an early experiment. But the underlying point is worth taking seriously: as reasoning models improve, the bottleneck in open-source development stops being "can we write the code" and starts being "can we correctly identify the right work to do." Auto-Contributor is a first attempt to make that identification systematic, and to explore what contribution looks like when the contributor is itself an agent.
What I would do differently
I would start with a cleaner database schema. The Postgres models reflect several iterations of the data model that happened in parallel with feature development, and there is real debt there. I would also invest earlier in benchmarking infrastructure. We need quantitative measures of agent quality across workflow types, and right now the evaluation is too manual. The contrib/benchmarking directory is the start of fixing that.
The design lesson I keep returning to: every agent wants to do too much, and every workflow wants to handle every edge case. The agents that work best in practice are the ones where the responsibility boundary is drawn tight and the tool set is small and directly serves that responsibility. That applies to individual agents, to workflows, and to the project as a whole.
Where it goes next
The roadmap includes expanded MCP integrations for Sentinel, Elastic, Palo Alto Cortex, Google Chronicle, and AWS Security Hub. A community skills catalog is in progress. Additional LLM backends, including local models via Ollama, are scoped. Federated deployment for organizations running agents across multiple environments with data staying local to each is on the list. Agentic red teaming, through collaboration with Stanford's Artemis project and others, is in active development.
None of that matters as much as what happens in the next few weeks. If security engineers who run real SOCs clone this, run it against real data, find what breaks, and open a pull request, the project gets better. That is how open-source security tooling actually works. Vigil is early. The architecture is solid, the agents are running, and the workflows are production-tested. What it needs now is the community.
