Self-Improvement

What self-improvement means

Self-improvement is not autonomous code modification without oversight. It is a structured loop with safety gates at every step.

Mallcop monitors its own investigation quality over time. When it identifies a weakness - a false positive pattern in triage, a detector that consistently misclassifies a particular event type, a parser template that drifts from a service's actual log format - a coding agent is invoked to propose a fix.

The fix is validated against the shakedown harness before anything is applied. If validation passes, the fix is applied according to the artifact's approval level. By default, code changes require human review, while prompt changes can be applied autonomously.

Artifact safety tiers

Different artifacts have different blast radii. Mallcop scales its caution to the risk: the lower the tier, the less approval a change needs by default.

Tier	Artifact	Default Approval	Reasoning
1	Prompt instructions (`POST.md`)	Autonomous	Pure text. Shakedown validates behavior. No code execution path.
2	Parser templates (`apps/*/parser.yaml`)	Autonomous	Declarative YAML. Matched against log lines. Limited blast radius.
3	Declarative detector rules (YAML)	Gated	Affects what gets flagged. Must pass shakedown and be presented to a human before it is applied.
4	Event classification code (Python)	Human review	Code execution. Can affect all events. Requires human sign-off.
5	New detector code (Python)	Human review	New code execution path. Always requires human review.

Approval levels can be adjusted in mallcop.yaml. See Configuration: Self-Improvement. You can set Tier 3 to autonomous if you trust the shakedown harness for your deployment. You can set Tier 1 to human if you want to review every prompt change.
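As a rough illustration of how defaults and overrides could combine, here is a minimal sketch. The `effective_approval` helper and the dictionary names are hypothetical, not part of Mallcop; the keys mirror the `tiers` entries shown in the Configuration section, and the level names come from the Approval levels table.

```python
# Default approval level per artifact type, mirroring the tier table.
# All names here are illustrative assumptions, not Mallcop internals.
DEFAULT_APPROVAL = {
    "prompts": "autonomous",              # Tier 1
    "parser_templates": "autonomous",     # Tier 2
    "declarative_detectors": "gated",     # Tier 3
    "event_classification": "human",      # Tier 4
    "new_detectors": "human",             # Tier 5
}

VALID_LEVELS = {"autonomous", "gated", "human", "disabled"}


def effective_approval(artifact_type, overrides=None):
    """Return the approval level for an artifact type.

    `overrides` stands in for the `tiers` mapping from mallcop.yaml;
    anything not overridden falls back to the default for its tier.
    """
    if artifact_type not in DEFAULT_APPROVAL:
        raise KeyError(f"unknown artifact type: {artifact_type}")
    level = (overrides or {}).get(artifact_type, DEFAULT_APPROVAL[artifact_type])
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown approval level: {level}")
    return level
```

With this sketch, `effective_approval("declarative_detectors", {"declarative_detectors": "autonomous"})` models trusting the shakedown harness for Tier 3, and `effective_approval("prompts", {"prompts": "human"})` models reviewing every prompt change.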

The improvement loop

Quality monitoring

The quality system tracks investigation outcomes over time. It looks for patterns: the same finding type being escalated repeatedly when it should be resolved, investigation reasoning that mentions missing context, or shakedown failures after an update.

Weakness detection

When a pattern exceeds a threshold (e.g., a scenario type failing the shakedown more than 30% of the time over the last 30 days), the quality system creates an improvement task with the specific weakness identified.
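The threshold check can be sketched as a failure rate over a rolling window. This is an illustration only: the `is_weakness` function and the shape of `runs` are assumptions, and the comparison is strict because the text says "more than 30%"; the real quality system may treat the boundary differently.

```python
from datetime import datetime, timedelta


def is_weakness(runs, now, threshold=0.30, window_days=30):
    """Decide whether one scenario type should trigger an improvement task.

    runs: list of (timestamp, passed) tuples for a single scenario type
          (a hypothetical record format).
    Returns True when the failure rate inside the window exceeds threshold.
    """
    cutoff = now - timedelta(days=window_days)
    recent = [passed for ts, passed in runs if ts >= cutoff]
    if not recent:
        return False  # no recent data, nothing to judge
    failure_rate = recent.count(False) / len(recent)
    return failure_rate > threshold
```

For example, a scenario type that failed 5 of its last 10 runs within the window would be flagged, while one that failed exactly 3 of 10 would not under this strict comparison.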

Fix proposal

A coding agent receives the weakness description, the failing scenarios, and the artifact to modify, and proposes a fix. Each cycle's spend is capped by max_donuts_per_cycle. If the budget is exhausted without a valid proposal, the task is deferred.

Shakedown validation

The proposed fix is tested against the full shakedown harness. A pass means the fix resolves the target weakness without introducing regressions. On a failure, the coding agent gets another attempt with the failure context, up to a maximum of 3 retries (max_retries).

Apply or escalate

If shakedown passes: apply according to the artifact's approval level (autonomous, gated, or human review). If all retries fail: create a human-review task with the failing scenarios, attempted fixes, and shakedown output.
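The propose, validate, retry, and escalate steps above can be sketched as a small control loop. Everything here is hypothetical: `propose` and `validate` are placeholder callables standing in for the coding agent and the shakedown harness, and the sketch assumes "3 retries" means one initial attempt plus up to three more.

```python
def run_improvement_task(propose, validate, max_retries=3):
    """Drive one improvement task to an outcome.

    propose(failure_context) -> a proposed fix, or None if the budget
                                ran out before a valid proposal.
    validate(fix)            -> (passed, shakedown_output)
    Assumption: one initial attempt plus up to `max_retries` retries.
    """
    attempts = []            # (fix, shakedown_output) for each try
    failure_context = None   # the first attempt has no prior failure
    for _ in range(1 + max_retries):
        fix = propose(failure_context)
        if fix is None:
            return {"outcome": "deferred", "attempts": attempts}
        passed, output = validate(fix)
        attempts.append((fix, output))
        if passed:
            # The caller applies the fix per the artifact's approval level.
            return {"outcome": "apply", "fix": fix, "attempts": attempts}
        failure_context = output  # feed the failure into the next try
    # All retries failed: hand the attempt history to a human.
    return {"outcome": "human_review", "attempts": attempts}
```

The `attempts` list carries the attempted fixes and their shakedown output, which is the material a human-review task would need.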

Shakedown validation

The shakedown harness runs every proposed fix against the full scenario suite. It uses real LLM calls - not mocks. It grades:

Chain action: Did the actor chain resolve vs. escalate correctly?
Triage action: Did triage make the right call?
Reasoning quality: Did the reasoning mention required evidence? Did it avoid forbidden shortcuts?
Tool coverage: Did the actor call the tools needed to reach its conclusion?
Token efficiency: Did it stay within budget?
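To make the reasoning-quality dimension concrete, here is a deliberately simple sketch that checks for required evidence and forbidden shortcuts by case-insensitive substring matching. The `grade_reasoning` function, its parameters, and the "Forbidden shortcut" note wording are assumptions for illustration; the actual harness may grade reasoning quite differently.

```python
def grade_reasoning(reasoning, required_terms, forbidden_phrases):
    """Return a list of failure notes; an empty list means a pass.

    Substring matching is a stand-in for whatever the real harness does.
    """
    text = reasoning.lower()
    notes = []
    for term in required_terms:
        if term.lower() not in text:
            notes.append(f'Missing reasoning: "{term}" not mentioned')
    for phrase in forbidden_phrases:
        if phrase.lower() in text:
            # Hypothetical note format, not taken from real output.
            notes.append(f'Forbidden shortcut: "{phrase}" used')
    return notes
```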

Failures are machine-readable: each failure includes a fix_target (which file to edit) and a fix_hint (what kind of change to make). This feeds directly back into the coding agent's next attempt.

# Shakedown output for a failing scenario
Scenario: AF-02-distributed-spray
Verdict:  FAIL
Notes:
  - Wrong triage action: expected escalate, got resolve
  - Missing reasoning: "distributed" not mentioned
Fix target: triage/POST.md
Fix hint: Add guidance for distinguishing single-account vs. distributed spray patterns
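If you want to consume this report in your own tooling, a parser along the following lines may be enough. It is a sketch that assumes the text layout shown above (`Key: value` lines plus an indented `Notes:` list); `parse_shakedown_report` is a hypothetical helper, and the harness may expose a different machine-readable form.

```python
def parse_shakedown_report(text):
    """Parse a shakedown report into a dict.

    Keys are lowercased with spaces replaced by underscores, so
    "Fix target" becomes "fix_target". Notes are collected in a list.
    """
    report = {"notes": []}
    in_notes = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        if in_notes and line.startswith("- "):
            report["notes"].append(line[2:])
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        in_notes = key.strip() == "Notes"
        if not in_notes:
            report[key.strip().lower().replace(" ", "_")] = value.strip()
    return report
```

Applied to the report above, this yields `fix_target` set to `triage/POST.md` and two entries in `notes`.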

Approval levels

Level	Behavior
`autonomous`	Apply immediately after shakedown pass. No notification unless requested.
`gated`	Require shakedown pass. Present proposed change to human before applying.
`human`	Always require human review. A shakedown pass is required before review, but it does not approve the change by itself.
`disabled`	Never modify this artifact type automatically.
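One way to read the table is as a decision on what happens after a shakedown run. The sketch below is an interpretation, not Mallcop's implementation: `next_step` and the action names it returns are hypothetical labels chosen for illustration.

```python
def next_step(level, shakedown_passed):
    """Map an approval level and a shakedown result to a next action.

    The returned strings are illustrative labels, not Mallcop states.
    """
    actions = {
        "autonomous": "apply_now",         # apply immediately after a pass
        "gated": "present_to_human",       # show the change before applying
        "human": "request_human_review",   # a pass alone never applies it
    }
    if level == "disabled":
        return "skip"  # never modify this artifact type automatically
    if level not in actions:
        raise ValueError(f"unknown approval level: {level}")
    if not shakedown_passed:
        # Every non-disabled level needs a shakedown pass first.
        return "retry_or_escalate"
    return actions[level]
```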

CLI: mallcop improve

# Run one improvement cycle (quality analysis → fix → validate → apply)
mallcop improve

# Preview what would be improved without applying
mallcop improve --dry-run

# Run shakedown harness only (no improvement logic)
mallcop improve --shakedown-only

# Check current quality metrics
mallcop improve --status

Configuration

self_improvement:
  enabled: true

  tiers:
    prompts: autonomous           # POST.md files
    parser_templates: autonomous  # apps/*/parser.yaml
    declarative_detectors: gated  # detector YAML rules
    event_classification: human   # Python code in connectors/detectors
    new_detectors: human          # new Python detector plugins

  # Max donuts per improvement cycle
  # Prevents runaway spend on hard-to-fix weaknesses
  max_donuts_per_cycle: 10

  # Minimum failure rate before triggering improvement for a scenario type
  # (0.3 = 30% of runs for that scenario type failing)
  weakness_threshold: 0.30

  # Shakedown retries before escalating to human
  max_retries: 3

Self-improvement runs as part of mallcop watch when enabled, or on demand via mallcop improve. It consumes donuts from your tier allocation. mallcop status shows self-improvement's share of your appetite.
