Twelve built-in detectors. All rule-based - no LLM, no probabilistic drift. Detectors operate as Airlock 2: they compare structured events against the learned baseline and emit findings. The AI sees findings, not events.
All detectors run after every mallcop scan when you call mallcop detect.
During the 14-day learning period, findings are informational only - no escalation.
After learning, full severity applies.
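The learning-period rule can be pictured as a small gate on severity. This is a sketch only: `effective_severity`, `LEARNING_DAYS`, and the `"info"` label are hypothetical names, and counting the period from the first scan is an assumption.

```python
from datetime import datetime, timedelta

# Assumption: the 14-day learning period is counted from the first scan.
LEARNING_DAYS = 14


def effective_severity(severity, first_scan, now):
    """Downgrade a finding to informational while the baseline is still being learned."""
    if now - first_scan < timedelta(days=LEARNING_DAYS):
        return "info"  # hypothetical label for "informational only"
    return severity
```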
Flags actors not seen in the baseline period.
An "actor" is any principal that appears in events: user accounts, service principals, API keys, CI bots. If mallcop hasn't seen this actor before, it's flagged. Applies to all sources and event types.
Finding: new-actor [warn]
Actor: deploy-bot-staging (service principal)
Source: azure
First seen: 2026-03-11T01:52Z
Triage context:
- Created by: baron@3dl.dev
- Azure CLI, not portal
- Matches recent deployment repo setup pattern
Most new-actor findings are resolved at triage in one LLM call. The triage actor checks the baseline, sees the creating user's recent activity, and resolves the finding if it fits a known pattern (onboarding, CI setup, etc.).
Tuning: baseline.actor_min_events controls how many events an actor needs before being "known".
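A minimal sketch of the check described above, assuming events are dicts with an `actor` key and the baseline is a per-actor event count. Both shapes, and the `new_actors` function itself, are illustrative and not mallcop's actual schema.

```python
def new_actors(events, baseline_counts, actor_min_events=1):
    """Return actors in `events` that the baseline does not yet treat as known.

    Assumption: `baseline_counts` maps actor -> number of baseline events.
    """
    flagged = []
    for event in events:
        actor = event["actor"]
        known = baseline_counts.get(actor, 0) >= actor_min_events
        if not known and actor not in flagged:
            flagged.append(actor)  # report each new actor once
    return flagged
```

Raising `actor_min_events` in this sketch makes rarely seen actors count as new for longer.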
Fires when an event indicates elevated permissions.
New admin role, new repo owner, new org admin, new subscription-level role assignment.
Severity is critical. Goes directly to investigation if triage can't attribute it to a known human action in context.
Sources: azure, github, m365
Event types watched: role_assignment, collaborator_added, permission_change, admin_action
Fires on N+ authentication failures within a time window.
Catches brute force, credential stuffing, and distributed spray patterns. The threshold is configurable. By default: 5 failures in 10 minutes from the same IP, or 20 failures across any IPs targeting the same account.
The triage actor distinguishes between fat-finger failures (one user, interactive session, same time of day as normal) and spray attacks (many users, unusual IPs, automated timing).
# Tune in mallcop.yaml
detectors:
  auth-failure-burst:
    threshold_per_ip: 5
    threshold_per_account: 20
    window_minutes: 10
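The two thresholds above can be read as a sliding-window count per IP and per account. The sketch below shows one way to do that. The `(timestamp, ip, account)` tuple shape and the `failure_bursts` name are assumptions for illustration, not mallcop's implementation.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def failure_bursts(failures, threshold_per_ip=5, threshold_per_account=20,
                   window_minutes=10):
    """Return sorted ("ip" | "account", value) keys that reach their threshold.

    Assumption: `failures` is an iterable of (timestamp, ip, account) tuples.
    """
    window = timedelta(minutes=window_minutes)
    limits = {"ip": threshold_per_ip, "account": threshold_per_account}
    recent = defaultdict(deque)  # key -> timestamps still inside the window
    hits = set()
    for ts, ip, account in sorted(failures):
        for key in (("ip", ip), ("account", account)):
            stamps = recent[key]
            stamps.append(ts)
            # Drop failures that have fallen out of the window.
            while ts - stamps[0] > window:
                stamps.popleft()
            if len(stamps) >= limits[key[0]]:
                hits.add(key)
    return sorted(hits)
```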
Flags events outside the baseline time-of-day pattern for a known actor.
Mallcop tracks per-actor activity windows from the baseline period. A developer who always works 9am-7pm Eastern triggers this detector if they appear at 3am UTC (10pm or 11pm Eastern, depending on daylight saving). Timing anomalies combined with privilege changes or new resource access are strong signals.
The shakedown suite tests competing signals: a developer who works both US East and European hours (wide baseline window) should not trigger this detector on evening activity.
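One way to picture the per-actor window is as a set of UTC hours that were active during the baseline. This is a sketch under that assumption. How mallcop actually stores activity windows is not stated here, and `is_unusual_time` is a hypothetical helper.

```python
from datetime import datetime, timezone


def is_unusual_time(event_time, baseline_hours):
    """True when the event's UTC hour was never active in the baseline.

    Assumption: `event_time` is timezone-aware and `baseline_hours` is a set
    of integers 0-23 in UTC.
    """
    hour = event_time.astimezone(timezone.utc).hour
    return hour not in baseline_hours


# 9am-7pm Eastern (UTC-5, ignoring daylight saving) covers UTC hours 14-23.
eastern_dev = set(range(14, 24))
# A developer active in both US East and European hours has a wider set.
wide_dev = set(range(7, 24))
```

With these sets, a 03:00 UTC event is unusual for both developers, while a 22:00 UTC event is unusual for neither.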
Fires when event volume exceeds N× the baseline frequency.
Measured per (source, event_type) pair. A normal day generates ~50 Azure Activity Log events; a day with 250 triggers this at the default 5× multiplier.
Catches data exfiltration (bulk reads), runaway processes (API hammering), and enumeration.
Deploy days legitimately generate high volume. The triage actor checks recent commit activity and resolved prior volume anomalies to distinguish normal spikes from attacks.
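The per-pair comparison can be sketched as a counter checked against a baseline daily average. The dict shapes and the `volume_anomalies` name are assumptions. The sketch treats reaching the multiplier as a trigger so that it matches the 50 to 250 example above.

```python
from collections import Counter


def volume_anomalies(events, baseline_daily, multiplier=5.0):
    """Return (source, event_type) pairs whose count reaches multiplier x baseline.

    Assumption: `baseline_daily` maps (source, event_type) -> typical daily count.
    """
    counts = Counter((e["source"], e["event_type"]) for e in events)
    flagged = []
    for pair, seen in counts.items():
        expected = baseline_daily.get(pair)
        # Pairs with no baseline are skipped here; they are a different signal.
        if expected and seen >= multiplier * expected:
            flagged.append(pair)
    return sorted(flagged)
```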
Flags events containing prompt injection patterns in string fields.
Scans sanitized string fields for known prompt injection signatures: SQL injection user agents, embedded LLM instructions (IGNORE PREVIOUS INSTRUCTIONS variants), and structured content in free-text fields. Runs before Airlock 2 validation to catch malicious payloads that survived initial sanitization.
The shakedown suite tests benign SQL keywords in log messages (e.g., a developer running SELECT queries normally). The detector must not fire on legitimate SQL in application logs.
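A signature scan of this kind can be sketched with a few regular expressions. The three patterns below are hypothetical examples chosen for illustration, not mallcop's signature list. They match injection shapes rather than bare SQL keywords, so an ordinary SELECT in a log message passes.

```python
import re

# Hypothetical signatures; a real list would be longer and maintained over time.
SIGNATURES = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"'\s*or\s+1\s*=\s*1", re.IGNORECASE),  # classic SQL tautology
    re.compile(r"union\s+select", re.IGNORECASE),
]


def injection_hits(fields):
    """Return the names of string fields that match any signature."""
    return [
        name
        for name, value in fields.items()
        if isinstance(value, str) and any(p.search(value) for p in SIGNATURES)
    ]
```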
Fires when a container app's parser unmatched-line ratio exceeds threshold.
When a container app updates its logging format, the existing parser template stops matching. Unmatched lines accumulate. When the ratio exceeds the threshold (default: 30%), this detector fires. This catches:
The triage actor checks for recent deployments to the app to distinguish updates from anomalies.
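The unmatched-line ratio can be sketched as follows, assuming the parser template is a compiled regular expression that must match a whole line. `unmatched_ratio` and `parser_drifted` are hypothetical helpers for illustration.

```python
import re


def unmatched_ratio(lines, template):
    """Fraction of log lines the parser template fails to match."""
    if not lines:
        return 0.0
    unmatched = sum(1 for line in lines if template.fullmatch(line) is None)
    return unmatched / len(lines)


def parser_drifted(lines, template, threshold=0.30):
    """True when the unmatched ratio exceeds the threshold (default 30%)."""
    return unmatched_ratio(lines, template) > threshold
```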
Flags when external entities are granted access to internal resources.
An "external entity" is any principal not previously seen accessing your internal resources: a new cross-account role, a new OAuth app granted access, an external collaborator added to a private repo. The baseline tracks the set of known external relationships.
Flags when a known actor touches a resource outside their baseline relationship set.
Mallcop maintains per-actor resource relationships in the baseline: which Key Vaults, storage accounts, repos, databases each actor has historically accessed. A developer who has never touched the production key vault suddenly querying secrets is a strong signal for lateral movement or insider threat.
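The relationship check amounts to a set-membership test per actor. A minimal sketch, assuming the baseline is a dict from actor to the set of resources they have touched; the event shape and the `unusual_access` name are illustrative.

```python
def unusual_access(events, baseline_resources):
    """Return (actor, resource) pairs where a known actor touches a new resource.

    Assumption: `baseline_resources` maps actor -> set of resource identifiers.
    Actors missing from the baseline are skipped; they are a new-actor concern.
    """
    flagged = []
    for event in events:
        known = baseline_resources.get(event["actor"])
        if known is not None and event["resource"] not in known:
            flagged.append((event["actor"], event["resource"]))
    return flagged
```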
Scans git repositories for leaked credentials, hardcoded secrets, and security antipatterns.
Checks git history, not just the latest commit. Catches API keys, tokens, passwords, and other secrets that were committed at any point, even if subsequently removed. Works across all configured repos.
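Scanning history rather than the working tree can be sketched by reading the full patch log and checking every added line. The two patterns and the function names below are assumptions for illustration; `git log -p --all` is the standard way to print patches for every commit on every ref.

```python
import re
import subprocess

# Hypothetical pattern set; real scanners carry many more rules.
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded-password": re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
}


def history_patch(repo_path):
    """Patch text for every commit on every ref, so removed secrets still appear."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "--all"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_secrets(patch_text):
    """Return (pattern_name, line) pairs for added lines that match a pattern."""
    hits = []
    for line in patch_text.splitlines():
        # Only added lines matter; "+++" is the diff file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((name, line[1:].strip()))
    return hits
```

Calling `find_secrets(history_patch("path/to/repo"))` would cover a whole repository under these assumptions.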
Detects known malicious skill patterns and IOCs in OpenClaw AI agent installations.
Monitors OpenClaw agent installations for skills that match known indicators of compromise. Checks skill manifests, behavior patterns, and known-bad signatures. Fires when a skill doesn't match expected behavior or contains known malicious patterns.
Detects insecure OpenClaw gateway configuration.
Monitors OpenClaw gateway settings for insecure configurations: overly permissive access controls, missing authentication, exposed endpoints, or configuration changes that weaken the security posture of AI agent installations.
Tier 3 artifacts (declarative detector rules) can be proposed by the self-improvement loop and applied autonomously after passing the shakedown harness. You can also write them yourself.
# plugins/detectors/unusual-geo/manifest.yaml
name: unusual-geo
description: Flags logins from countries not in the baseline geo set for a known actor
version: 0.1.0
sources:
  - azure
  - m365
event_types:
  - login
severity_default: warn
rules:
  - condition: event.metadata.country NOT IN baseline.actor_countries[event.actor]
    finding_template:
      title: "Login from unusual country: {event.metadata.country}"
      description: "Actor {event.actor} logged in from {event.metadata.country}. Baseline countries: {baseline.actor_countries[event.actor]}"
For detectors that need more logic, scaffold a full plugin:
mallcop scaffold detector unusual-geo
# Generates:
plugins/detectors/unusual-geo/
  manifest.yaml   # capabilities declaration
  detector.py     # implements DetectorBase
  tests.py        # contract tests
# detector.py
from mallcop.detectors._base import DetectorBase, Finding


class UnusualGeoDetector(DetectorBase):
    def detect(self, events, baseline):
        findings = []
        for event in events:
            # Only login events carry the country metadata this detector needs.
            if event.event_type != "login":
                continue
            country = event.metadata.get("country")
            known = baseline.actor_countries(event.actor)
            # Skip events with no country and actors with no baseline countries.
            if country and known and country not in known:
                findings.append(Finding(
                    detector="unusual-geo",
                    severity="warn",
                    title=f"Login from unusual country: {country}",
                    event_id=event.id,
                    actor=event.actor,
                ))
        return findings
Verify: mallcop verify plugins/detectors/unusual-geo/