Detection-as-Code | Zion Boggan

Repository layout

Rules live as Sigma YAML under rules/, split by platform and ATT&CK tactic. One compiler walks the tree and emits a query per backend into dist/{splunk,esql,kusto}/, mirroring the source layout. Tests, the CI workflow, and a Makefile of one-liners sit alongside.

rules/
  windows/              credential-access, execution, persistence, defense-evasion, initial-access
  linux/                credential-access, execution, persistence
tools/convert.py        compile every rule to Splunk / Elastic / Sentinel
tests/test_rules.py     schema, ATT&CK tagging, unique IDs, correlation references
.github/workflows/      lint -> test -> convert on every push
dist/                   generated queries (CI artifact; gitignored)
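The mirroring between rules/ and dist/ comes down to a path rewrite. A minimal sketch, assuming a hypothetical output_path helper and a placeholder .txt suffix (the actual extension convert.py writes is not stated here):

```python
from pathlib import Path


def output_path(rule: Path, backend: str,
                rules_root: Path = Path("rules"),
                dist_root: Path = Path("dist")) -> Path:
    """Map a rule file to its mirrored location under dist/<backend>/."""
    # Path of the rule relative to rules/, e.g. windows/credential-access/x.yml
    relative = rule.relative_to(rules_root)
    # ".txt" is an assumed placeholder suffix, not the project's confirmed choice.
    return (dist_root / backend / relative).with_suffix(".txt")
```

Keeping the platform and tactic directories in the output path means a generated query can be traced back to its source rule by location alone.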

The pinned toolchain is small and exact: sigma-cli==3.0.2 with the Splunk, Elasticsearch, and Kusto backends, plus the Sysmon and Windows processing pipelines. Both the linter and the converter run from that single requirements set.

sigma-cli==3.0.2
pysigma-backend-splunk==2.1.0
pysigma-backend-elasticsearch==2.0.3
pysigma-backend-kusto==1.0.1
pysigma-pipeline-sysmon==2.0.0
pysigma-pipeline-windows==2.0.0
pytest==8.3.3
PyYAML==6.0.3

A rule in full

Each rule is tuned past the naive version of its detection. The LSASS rule matches the specific granted-access masks that Mimikatz, comsvcs MiniDump, and similar tooling actually request, rather than the broad 0x1010 alone, and it excludes known legitimate readers. Source, verbatim:

title: Suspicious LSASS Process Access
id: dcfda42d-c1a7-4106-aa96-7912201d9221
status: experimental
description: >
  Detects process access to lsass.exe with access rights commonly used to read
  process memory (credential dumping). Tuned to the granted-access masks seen with
  Mimikatz, comsvcs MiniDump, and similar tooling rather than the broad 0x1010 alone.
references:
  - https://attack.mitre.org/techniques/T1003/001/
  - https://github.com/SwiftOnSecurity/sysmon-config
author: Zion Boggan
date: 2026-05-12
tags:
  - attack.credential_access
  - attack.t1003.001
logsource:
  product: windows
  category: process_access
detection:
  selection:
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess:
      - '0x1010'
      - '0x1410'
      - '0x143a'
      - '0x1438'
      - '0x1fffff'
  filter_known:
    SourceImage|endswith:
      - '\wininit.exe'
      - '\csrss.exe'
      - '\MsMpEng.exe'
      - '\wmiprvse.exe'
  condition: selection and not filter_known
falsepositives:
  - EDR and AV products legitimately reading LSASS; baseline and add to filter_known.
level: high
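To make the condition concrete, the rule's logic can be read as a predicate over a single Sysmon process-access event. This is a sketch for illustration only, not part of the repository: matches_lsass_rule is a hypothetical name, the event is a plain dict, and comparisons are lowercased because Sigma string matching is case-insensitive by default.

```python
# Granted-access masks from the rule's selection block.
SUSPICIOUS_ACCESS = {"0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff"}

# Source images from filter_known, lowercased for case-insensitive matching.
KNOWN_READERS = ("\\wininit.exe", "\\csrss.exe", "\\msmpeng.exe", "\\wmiprvse.exe")


def matches_lsass_rule(event: dict) -> bool:
    """Evaluate 'selection and not filter_known' against one event."""
    target = event.get("TargetImage", "").lower()
    source = event.get("SourceImage", "").lower()
    access = event.get("GrantedAccess", "").lower()

    # selection: both fields must match (keys within a map are ANDed).
    selection = target.endswith("\\lsass.exe") and access in SUSPICIOUS_ACCESS
    # filter_known: any listed suffix matches (list values are ORed).
    filter_known = source.endswith(KNOWN_READERS)
    return selection and not filter_known
```

An event from an unlisted binary requesting 0x1410 on lsass.exe matches; the same request from MsMpEng.exe does not, and neither does an access mask outside the list.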

The compiler and pipeline selection

tools/convert.py walks every .yml under rules/ and shells out to sigma convert once per backend. The processing pipeline is chosen from each rule's logsource: Sysmon for the process, file, image-load, network, and DNS categories; Windows-audit for the Security and System channels (service installs, scheduled tasks). Rules with no matching pipeline are converted with --without-pipeline so generic logic still compiles.

CATEGORY_PIPELINE = {
    "process_creation": "sysmon",
    "process_access": "sysmon",
    "image_load": "sysmon",
    "file_event": "sysmon",
    "network_connection": "sysmon",
    "dns_query": "sysmon",
}
SERVICE_PIPELINE = {
    "security": "windows-audit",
    "system": "windows-audit",
}


def pipeline_for(rule: dict) -> str | None:
    ls = rule.get("logsource", {})
    if ls.get("product") == "windows":
        if ls.get("category") in CATEGORY_PIPELINE:
            return CATEGORY_PIPELINE[ls["category"]]
        if ls.get("service") in SERVICE_PIPELINE:
            return SERVICE_PIPELINE[ls["service"]]
    return None

Each rule then runs through sigma convert -t <backend> -s, with -p <pipeline> appended when one matched. A non-zero exit surfaces the last stderr line as the skip reason, so a broken rule is loud rather than silent.
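The invocation described above could be assembled roughly as follows. This is a sketch rather than the code in tools/convert.py: build_command, skip_reason, and convert_rule are hypothetical names, and only the flags already mentioned (-t, -s, -p, --without-pipeline) are used.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_command(rule: Path, backend: str, pipeline: str | None) -> list[str]:
    """Assemble the sigma convert argv for one rule and one backend."""
    cmd = ["sigma", "convert", "-t", backend, "-s"]
    if pipeline:
        cmd += ["-p", pipeline]            # pipeline matched from the logsource
    else:
        cmd.append("--without-pipeline")   # generic rule, no pipeline applies
    cmd.append(str(rule))
    return cmd


def skip_reason(stderr: str) -> str:
    """Return the last non-empty stderr line as a short failure reason."""
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    return lines[-1] if lines else "unknown error"


def convert_rule(rule: Path, backend: str, pipeline: str | None):
    """Run the conversion; return (query, None) or (None, reason)."""
    proc = subprocess.run(build_command(rule, backend, pipeline),
                          capture_output=True, text=True)
    if proc.returncode != 0:
        return None, skip_reason(proc.stderr)
    return proc.stdout, None
```

Passing the command as a list avoids shell quoting problems with rule paths, and returning the reason alongside the result lets the caller report every failure in one pass.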

All three backends

The LSASS source above compiles to each target without the logic being re-derived. Each output carries the same selection and the same exclusion list, expressed in its own dialect. Splunk SPL:

EventID=10 TargetImage="*\\lsass.exe" GrantedAccess IN ("0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff") NOT (SourceImage IN ("*\\wininit.exe", "*\\csrss.exe", "*\\MsMpEng.exe", "*\\wmiprvse.exe"))

Elastic ES|QL:

from * metadata _id, _index, _version | where EventID==10 and ends_with(TargetImage, "\\lsass.exe") and (GrantedAccess in ("0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff")) and not (ends_with(SourceImage, "\\wininit.exe") or ends_with(SourceImage, "\\csrss.exe") or ends_with(SourceImage, "\\MsMpEng.exe") or ends_with(SourceImage, "\\wmiprvse.exe"))

Microsoft Sentinel / Defender KQL:

EventID == 10 and ((TargetImage endswith "\\lsass.exe" and (GrantedAccess in~ ("0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff"))) and (not((SourceImage endswith "\\wininit.exe" or SourceImage endswith "\\csrss.exe" or SourceImage endswith "\\MsMpEng.exe" or SourceImage endswith "\\wmiprvse.exe"))))

Correlation rules

Single events are often informational; the alert is in the aggregate. SSH brute force is modelled as a Sigma correlation over a per-event base rule. The base rule is rated informational and matches one failed authentication:

title: SSH Authentication Failure
name: ssh_auth_failure
id: cc6fd1c9-b264-4be8-bb53-b6f4e2af9776
status: experimental
description: Base detection for a single failed SSH authentication, used by the brute-force correlation.
logsource:
  product: linux
  service: sshd
detection:
  selection:
    - Message|contains: 'Failed password for'
    - Message|contains: 'Invalid user'
    - Message|startswith: 'Connection closed by authenticating user'
  condition: selection
level: informational

The correlation rule references that base by name and fires on eight or more failures from one source IP inside a two-minute window:

correlation:
  type: event_count
  rules:
    - ssh_auth_failure
  group-by:
    - src_ip
  timespan: 2m
  condition:
    gte: 8

Because a correlation needs its referenced rule present in the same collection, these are compiled together (make correlations converts the whole linux/credential-access/ directory at once).
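What the correlation asks of a backend can be sketched in plain Python as a per-source sliding window. This is one reading of event_count for illustration only; brute_force_sources is a hypothetical name, and real backends may bucket time differently than a true sliding window.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def brute_force_sources(events, threshold=8, window=timedelta(minutes=2)):
    """Return source IPs with at least `threshold` events inside any window.

    `events` is an iterable of (timestamp, src_ip) pairs, one per match
    of the base rule.
    """
    recent = defaultdict(deque)   # src_ip -> timestamps still inside the window
    flagged = set()
    for ts, src_ip in sorted(events):
        timestamps = recent[src_ip]
        timestamps.append(ts)
        # Drop events that have aged out of the window ending at `ts`.
        while ts - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) >= threshold:
            flagged.add(src_ip)
    return flagged


if __name__ == "__main__":
    start = datetime(2026, 5, 12, 3, 0, 0)
    burst = [(start + timedelta(seconds=10 * i), "10.0.0.5") for i in range(8)]
    print(brute_force_sources(burst))
```

Grouping by src_ip keeps a slow trickle of failures from many hosts from adding up to an alert, while eight failures from one host inside two minutes still trips the threshold.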

The test suite

pytest gates rule quality before anything compiles. Schema, ATT&CK tagging, unique IDs, and correlation references are all enforced. The technique tag is matched against a regex and tactic tags against the full ATT&CK enterprise set:

import re

import pytest

# RULES (the list of rule paths) and load() (the YAML loader) are defined
# elsewhere in tests/test_rules.py and are not shown here.
TECHNIQUE_RE = re.compile(r"^attack\.t\d{4}(\.\d{3})?$")


@pytest.mark.parametrize("path", RULES, ids=[p.name for p in RULES])
def test_rule_schema(path):
    rule = load(path)
    for field in ("title", "id", "status", "description", "tags", "level"):
        assert rule.get(field), f"{path.name} missing {field}"
    assert "detection" in rule or "correlation" in rule, f"{path.name} has no detection/correlation"
    assert rule["level"] in {"informational", "low", "medium", "high", "critical"}
    assert rule["status"] in {"experimental", "test", "stable", "deprecated", "unsupported"}


@pytest.mark.parametrize("path", RULES, ids=[p.name for p in RULES])
def test_attack_tags(path):
    tags = load(path).get("tags", [])
    techniques = [t for t in tags if TECHNIQUE_RE.match(t)]
    assert techniques, f"{path.name} has no ATT&CK technique tag"
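The tactic half of the tagging check is not shown above. One way it might look, as a sketch: the set lists the fourteen ATT&CK enterprise tactics in the underscore form the LSASS rule uses, and invalid_attack_tags is a hypothetical helper. As written it would also reject group and software tags, which this rule set does not appear to use.

```python
import re

TECHNIQUE_RE = re.compile(r"^attack\.t\d{4}(\.\d{3})?$")

# ATT&CK enterprise tactics, written the way the rules tag them.
ENTERPRISE_TACTICS = {
    "reconnaissance", "resource_development", "initial_access", "execution",
    "persistence", "privilege_escalation", "defense_evasion",
    "credential_access", "discovery", "lateral_movement", "collection",
    "command_and_control", "exfiltration", "impact",
}


def invalid_attack_tags(tags):
    """Return attack.* tags that are neither a technique nor a known tactic."""
    bad = []
    for tag in tags:
        if not tag.startswith("attack."):
            continue                      # other namespaces are out of scope
        if TECHNIQUE_RE.match(tag):
            continue                      # e.g. attack.t1003.001
        if tag[len("attack."):] in ENTERPRISE_TACTICS:
            continue                      # e.g. attack.credential_access
        bad.append(tag)
    return bad
```

A test built on this would assert that the returned list is empty for every rule, so a misspelled tactic fails the build instead of quietly dropping out of coverage reporting.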

The correlation check loads every rule's name and asserts that each correlation's references resolve, so a renamed or deleted base rule fails the build instead of silently producing an empty alert:

def test_correlation_refs_resolve():
    # Parse each rule file once, then reuse the parsed documents.
    rules = {path: load(path) for path in RULES}
    names = {rule["name"] for rule in rules.values() if rule.get("name")}
    for path, rule in rules.items():
        corr = rule.get("correlation")
        if corr:
            for ref in corr.get("rules", []):
                assert ref in names, f"{path.name} references unknown rule '{ref}'"
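The unique-ID check listed among the tests is not shown in this section. A minimal sketch of how it could work, assuming a hypothetical duplicate_ids helper that takes already-parsed rules keyed by file name:

```python
from collections import defaultdict


def duplicate_ids(rules):
    """Return {rule_id: [file names]} for every id used by more than one rule.

    `rules` maps a file name to its parsed rule dict. Rules without an id
    are skipped here; the schema test already reports those.
    """
    seen = defaultdict(list)
    for filename, rule in rules.items():
        rule_id = rule.get("id")
        if rule_id:
            seen[rule_id].append(filename)
    return {rule_id: files for rule_id, files in seen.items() if len(files) > 1}
```

Reporting the file names alongside the shared ID makes the usual cause, a rule copied as a starting point without regenerating its UUID, quick to find.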

CI workflow

Every push and pull request to main runs a single validate job that lints, tests, and compiles in order, then uploads the generated queries as an artifact. The trimmed workflow:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - name: Lint Sigma rules
        run: sigma check rules/
      - name: Schema + ATT&CK tests
        run: pytest -q
      - name: Convert to Splunk / Elastic / Sentinel
        run: python tools/convert.py
      - uses: actions/upload-artifact@v4
        with:
          name: converted-queries
          path: dist/

A conversion error fails the job, so a rule that lints but does not compile to one of the three backends never merges. The dist/ tree is gitignored and rebuilt from source on every run.
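For a conversion error to fail the job, the converter has to exit non-zero, since that is all a workflow step checks. How convert.py does this is not shown; the sketch below is one plausible shape, with exit_code as a hypothetical helper fed one (rule, backend, error) tuple per conversion.

```python
def exit_code(results):
    """Return the process exit status for a batch of conversions.

    `results` is an iterable of (rule, backend, error) tuples, where
    `error` is None on success or a short reason string on failure.
    """
    failures = [(rule, backend, error)
                for rule, backend, error in results
                if error is not None]
    for rule, backend, error in failures:
        # Report every failure before exiting so one run lists them all.
        print(f"FAILED {rule} [{backend}]: {error}")
    return 1 if failures else 0
```

A script would end with sys.exit(exit_code(results)), which is what turns a single broken backend into a red build.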

ATT&CK coverage and validation

The set is deliberately focused and high-signal, covering the techniques that show up most in real triage: credential dumping, phishing-to-execution, persistence, and brute force. It spans nine techniques across five tactics, with every rule tagged and tuned past the naive version.

Tactic	Technique	Detection	Platform	Level
Initial Access	T1566.001	Office spawns scripting host / LOLBin	Windows	high
Execution	T1059.001	PowerShell EncodedCommand	Windows	medium
Execution	T1059.004	Reverse shell one-liner	Linux	high
Defense Evasion	T1218.011	Suspicious rundll32	Windows	high
Persistence	T1543.003	New service installed (7045)	Windows	high
Persistence	T1053.005	Scheduled task created (4698)	Windows	medium
Persistence	T1543.002	Systemd persistence	Linux	medium
Credential Access	T1003.001	Suspicious LSASS access	Windows	high
Credential Access	T1110	SSH brute force (correlation)	Linux	high

CI proves the rules lint, pass their tests, and compile. Behaviour is validated separately: in a companion purple-team lab, Atomic Red Team fires each technique and the matching detection is confirmed in a Wazuh SIEM before a rule is promoted here. New detections are only added after they survive that loop, which keeps the set small, high-signal, and grounded in observed telemetry rather than copied from public rule dumps.