One founder. Parallel AI sessions. Adversarial quality gates. Built using resonant software principles: native interfaces for both human and AI, shared state, and either party able to act on behalf of the other. This page discloses who does what, what the controls are, and what the numbers look like.
Everything below is counted from git history, the work graph, and pytest collection across both repos. The idea was conceived March 6, 2026. It launched March 15.
Commit breakdown: 105 feature, 148 fix, 25 test, 14 refactor, 35 site, 35 work tracking. Launch day alone: 142 commits across 8 parallel tracks. The fix count reflects a codebase under active adversarial review. Fixes land fast because the test suite catches regressions before they compound.
Every feature goes through five stages. The stages are sequential. Work within each stage runs in parallel across independent AI sessions.
An AI session researches the codebase, existing patterns, and constraints, then produces a design. The design is reviewed and iterated before any code is written. The human makes all architecture and priority decisions. The AI produces options and trade-off analysis.
The approved design is broken into outcome-scoped work units. Each one has a verifiable done condition, is self-contained enough to survive context loss between sessions, and fits in one AI session. Dependencies are wired into a graph. The human reviews the decomposition before work starts.
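A work graph like the one described can be sketched with Python's standard-library `graphlib`. This is an illustrative model only: the `WorkUnit` fields and example unit names are assumptions, not the project's actual schema.

```python
# Hypothetical sketch of a work-unit graph; field names are illustrative.
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

@dataclass
class WorkUnit:
    id: str
    done_condition: str          # verifiable check, e.g. "pytest tests/test_auth.py passes"
    depends_on: list[str] = field(default_factory=list)

def dispatch_order(units: list[WorkUnit]) -> list[str]:
    """Return an order in which units can start; units whose dependencies
    are all met can be dispatched to parallel AI sessions."""
    graph = {u.id: set(u.depends_on) for u in units}
    return list(TopologicalSorter(graph).static_order())

units = [
    WorkUnit("design-api", "spec reviewed"),
    WorkUnit("impl-auth", "tests green", depends_on=["design-api"]),
    WorkUnit("impl-audit", "tests green", depends_on=["design-api"]),
]
order = dispatch_order(units)  # design-api first; the two impl units can run in parallel
```

Because each unit carries its own done condition and dependency list, a session that loses context can be restarted from the graph alone.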
One AI session implements the work unit. Four independent AI sessions review the implementation for correctness, edge cases, API contract compliance, and test quality. The human dispatches sessions and resolves conflicts between reviewers.
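The four-reviewer fan-out might look like the following sketch. The four lenses come from the text; `run_review` and its return shape are hypothetical stand-ins for dispatching an independent AI session.

```python
# Illustrative only: run_review is a placeholder for an AI session dispatch.
from concurrent.futures import ThreadPoolExecutor

LENSES = ["correctness", "edge cases", "API contract compliance", "test quality"]

def run_review(lens: str, diff: str) -> dict:
    # Stand-in: each independent session reviews the same diff through one lens.
    return {"lens": lens, "findings": []}

def review(diff: str) -> list[dict]:
    # Fan out one reviewer per lens; all four run concurrently.
    with ThreadPoolExecutor(max_workers=len(LENSES)) as pool:
        return list(pool.map(lambda lens: run_review(lens, diff), LENSES))

reports = review("example diff")
# Conflicting findings across lenses are resolved by the human dispatcher.
```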
Every code change runs through the full test suite: unit tests, integration tests, and end-to-end tests. Tests are written before the implementation (red-green). The suite must be green before any commit lands on main. CI enforces the gate.
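The "green before commit" gate could be enforced locally as a pre-commit hook along these lines. The exact pytest invocation and hook wiring are assumptions; per the text, the authoritative gate lives in CI.

```python
# Hedged sketch of a pre-commit test gate; the real gate is enforced in CI.
import subprocess
import sys

def gate(cmd: tuple[str, ...] = ("pytest", "-q")) -> int:
    """Run the full suite; any nonzero exit code blocks the commit."""
    result = subprocess.run(list(cmd))
    if result.returncode != 0:
        print("Test suite is red: commit blocked.", file=sys.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(gate())
```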
After all implementation lands, a multi-pass adversarial review runs across the entire codebase. Five independent AI sessions each take one pass: security, bugs, dead code, anti-patterns, test coverage. Findings are triaged by severity. Ship blockers are fixed before release.
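Severity triage with a ship-blocker cutoff can be sketched as below. The severity levels and the cutoff (critical/high blocks release) are assumptions, not the project's actual taxonomy.

```python
# Sketch of severity triage; levels and the blocker cutoff are assumed.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
BLOCKING = ("critical", "high")

def triage(findings: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split findings into ship blockers (fixed before release) and a tracked backlog."""
    ordered = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
    blockers = [f for f in ordered if f["severity"] in BLOCKING]
    backlog = [f for f in ordered if f["severity"] not in BLOCKING]
    return blockers, backlog

blockers, backlog = triage([
    {"id": 1, "severity": "low"},
    {"id": 2, "severity": "critical"},
])
```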
Four sweeps have run across both repos. A pre-launch sweep, a post-fix resweep, a hardened sweep focused on token theft and agent corruption, and a verification sweep. Together they produced 232 findings. 108 have been fixed. The rest are triaged and tracked.
The sweeps finding real issues is the point. A clean result would be suspicious. Of the 232 findings, 108 are fixed; the remaining 124 are tracked and prioritized.
Daily activity from March 6 (first design session) through March 16 (day after launch). Data extracted from git history, the work graph, and session transcripts. The charts measure resonance at the AI-app edge: agent dispatch density, flaw injection rate, and the ratio of work completed to work discovered.
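The completed-to-discovered ratio is a simple burn metric; a minimal sketch follows, with invented sample numbers (the real series comes from the work graph).

```python
# Sketch of the completed-to-discovered ratio; sample data is invented.
def burn_ratio(completed: list[int], discovered: list[int]) -> float:
    """Ratio of work units completed to work units discovered over the window.
    Above 1.0 the backlog shrinks; below 1.0 it grows."""
    found = sum(discovered)
    return sum(completed) / found if found else float("inf")

ratio = burn_ratio([3, 5, 8], [4, 4, 4])  # 16 completed / 12 discovered
```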
| Role | What it means |
|---|---|
| Architect | Makes all design and priority decisions. Sets constraints. Approves decompositions before work starts. |
| Dispatcher | Assigns work to parallel AI sessions. Routes the right task to the right capability tier. Resolves conflicts between reviewers. |
| Ship authority | Decides what blocks a release versus what ships later. No finding is dismissed without human triage. |
| Quality gate | Reviews sweep findings. Approves or escalates. The multi-layer process is the review. |
The human does not write code. Quality assurance comes from the process: test-driven development enforced by CI, four independent reviewers per implementation, and a five-pass adversarial sweep before every release. The git history is the record.
Mallcop uses AI to find security problems in your environment. We used the same approach to build it. That raises a legitimate question: if AI can make mistakes, does that mean mallcop has bugs that compromise the analysis it produces?
The answer is the process. Every commit requires green tests. Every implementation goes through four independent reviews. Every release goes through a five-pass adversarial sweep. Four sweeps have found 232 real issues. 108 have been fixed. The rest are tracked and open.
If this methodology produces trustworthy results, the product works. If it does not, you should not use it. The code is Apache 2.0. Read it.