Repository layout
Rules live as Sigma YAML under rules/, split by platform and ATT&CK tactic. One compiler walks the tree and emits a query per backend into dist/{splunk,esql,kusto}/, mirroring the source layout. Tests, the CI workflow, and a Makefile of one-liners sit alongside.
rules/
  windows/             credential-access, execution, persistence, defense-evasion, initial-access
  linux/               credential-access, execution, persistence
tools/convert.py       compile every rule to Splunk / Elastic / Sentinel
tests/test_rules.py    schema, ATT&CK tagging, unique IDs, correlation references
.github/workflows/     lint -> test -> convert on every push
dist/                  generated queries (CI artifact; gitignored)
The pinned toolchain is small and exact: sigma-cli==3.0.2 with the Splunk, Elasticsearch, and Kusto backends, plus the Sysmon and Windows processing pipelines. Both the linter and the converter run from that single requirements set.
sigma-cli==3.0.2
pysigma-backend-splunk==2.1.0
pysigma-backend-elasticsearch==2.0.3
pysigma-backend-kusto==1.0.1
pysigma-pipeline-sysmon==2.0.0
pysigma-pipeline-windows==2.0.0
pytest==8.3.3
PyYAML==6.0.3
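The mirroring of rules/ into dist/<backend>/ described at the top of this section can be sketched as a pure path mapping. This is an illustration under assumptions, not the code in tools/convert.py: the function name output_path, the example rule file name, and the ".txt" output suffix are all hypothetical; only the rules/ root and the three dist/ directory names come from the text above.

```python
from pathlib import PurePosixPath

# Backend directory names taken from dist/{splunk,esql,kusto}/ in the layout above.
BACKENDS = ("splunk", "esql", "kusto")


def output_path(rule_path: str, backend: str, suffix: str = ".txt") -> PurePosixPath:
    """Map rules/<platform>/<tactic>/<rule>.yml to dist/<backend>/<platform>/<tactic>/<rule><suffix>.

    The suffix is a placeholder; the real converter may name its outputs differently.
    """
    relative = PurePosixPath(rule_path).relative_to("rules")
    return PurePosixPath("dist") / backend / relative.with_suffix(suffix)


if __name__ == "__main__":
    # Hypothetical rule file name, used only to show the mapping.
    for backend in BACKENDS:
        print(output_path("rules/windows/credential-access/lsass_access.yml", backend))
```

Keeping the mapping free of filesystem calls makes it easy to unit-test separately from the conversion itself.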
A rule in full
Each rule is tuned beyond the naive version of its detection. The LSASS rule, for example, matches the granted-access masks that Mimikatz, comsvcs MiniDump, and similar tooling actually request rather than the broad 0x1010 alone, and excludes known legitimate readers. Source, verbatim:
title: Suspicious LSASS Process Access
id: dcfda42d-c1a7-4106-aa96-7912201d9221
status: experimental
description: >
  Detects process access to lsass.exe with access rights commonly used to read
  process memory (credential dumping). Tuned to the granted-access masks seen with
  Mimikatz, comsvcs MiniDump, and similar tooling rather than the broad 0x1010 alone.
references:
  - https://attack.mitre.org/techniques/T1003/001/
  - https://github.com/SwiftOnSecurity/sysmon-config
author: Zion Boggan
date: 2026-05-12
tags:
  - attack.credential_access
  - attack.t1003.001
logsource:
  product: windows
  category: process_access
detection:
  selection:
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess:
      - '0x1010'
      - '0x1410'
      - '0x143a'
      - '0x1438'
      - '0x1fffff'
  filter_known:
    SourceImage|endswith:
      - '\wininit.exe'
      - '\csrss.exe'
      - '\MsMpEng.exe'
      - '\wmiprvse.exe'
  condition: selection and not filter_known
falsepositives:
  - EDR and AV products legitimately reading LSASS; baseline and add to filter_known.
level: high
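To make the rule's logic concrete, here is a minimal Python sketch of what "selection and not filter_known" means for a single event. It is not part of the repository and not how Sigma backends evaluate rules. The function name matches_lsass_rule and the sample events are hypothetical, and the lower-casing mirrors Sigma's default case-insensitive string matching.

```python
# Values copied from the rule above; suffixes are lower-cased for comparison.
ACCESS_MASKS = {"0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff"}
KNOWN_READERS = ("\\wininit.exe", "\\csrss.exe", "\\msmpeng.exe", "\\wmiprvse.exe")


def matches_lsass_rule(event: dict) -> bool:
    """Return True when the event satisfies `selection and not filter_known`."""
    target = event.get("TargetImage", "").lower()
    source = event.get("SourceImage", "").lower()
    access = event.get("GrantedAccess", "").lower()

    # selection: the target is lsass.exe AND the access mask is one of the listed values.
    selection = target.endswith("\\lsass.exe") and access in ACCESS_MASKS
    # filter_known: the accessing process is one of the expected readers.
    filter_known = source.endswith(KNOWN_READERS)
    return selection and not filter_known


if __name__ == "__main__":
    # Hypothetical event: an unexpected process opening LSASS with a listed mask.
    event = {
        "TargetImage": r"C:\Windows\System32\lsass.exe",
        "SourceImage": r"C:\Temp\tool.exe",
        "GrantedAccess": "0x1410",
    }
    print(matches_lsass_rule(event))
```

The same event with SourceImage ending in \MsMpEng.exe is suppressed by filter_known, which is where a baselined EDR or AV process would be added.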
The compiler and pipeline selection
tools/convert.py walks every .yml under rules/ and shells out to sigma convert once per backend. The processing pipeline is chosen from each rule's logsource: Sysmon for the process, file, image-load, network, and DNS-query categories; Windows-audit for the Security and System channels (service installs, scheduled tasks). Rules with no matching pipeline are converted with --without-pipeline so generic logic still compiles.
CATEGORY_PIPELINE = {
    "process_creation": "sysmon",
    "process_access": "sysmon",
    "image_load": "sysmon",
    "file_event": "sysmon",
    "network_connection": "sysmon",
    "dns_query": "sysmon",
}
SERVICE_PIPELINE = {
    "security": "windows-audit",
    "system": "windows-audit",
}
def pipeline_for(rule: dict) -> str | None:
    ls = rule.get("logsource", {})
    if ls.get("product") == "windows":
        if ls.get("category") in CATEGORY_PIPELINE:
            return CATEGORY_PIPELINE[ls["category"]]
        if ls.get("service") in SERVICE_PIPELINE:
            return SERVICE_PIPELINE[ls["service"]]
    return None
Each rule then runs through sigma convert -t <backend> -s, with -p <pipeline> appended when one matched. A non-zero exit surfaces the last stderr line as the skip reason, so a broken rule is loud rather than silent.
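The invocation described above could look roughly like the following. This is a sketch under assumptions rather than the contents of tools/convert.py: the helper names build_command, skip_reason, and convert_one are hypothetical, and the backend target names are assumed to match the dist/ directory names. The flags (-t, -s, -p, --without-pipeline) are the ones named in the text.

```python
import subprocess
from typing import Optional


def build_command(rule_path: str, backend: str, pipeline: Optional[str]) -> list:
    """Assemble the sigma convert argument list for one rule and one backend."""
    cmd = ["sigma", "convert", "-t", backend, "-s"]
    if pipeline:
        cmd += ["-p", pipeline]
    else:
        # No pipeline matched the logsource: convert the generic logic as-is.
        cmd.append("--without-pipeline")
    cmd.append(rule_path)
    return cmd


def skip_reason(stderr: str) -> str:
    """Return the last non-empty stderr line, used as the reported skip reason."""
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    return lines[-1] if lines else "unknown error"


def convert_one(rule_path: str, backend: str, pipeline: Optional[str]) -> tuple:
    """Run the conversion; return (True, query) on success or (False, reason) on failure."""
    proc = subprocess.run(
        build_command(rule_path, backend, pipeline),
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        return False, skip_reason(proc.stderr)
    return True, proc.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with rule paths, and keeping build_command separate from the subprocess call lets the flag logic be tested without sigma-cli installed.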
All three backends
The LSASS source above compiles to each target without the logic being re-derived: the same selection and the same exclusion list, expressed in three dialects. Splunk SPL:
EventID=10 TargetImage="*\\lsass.exe" GrantedAccess IN ("0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff") NOT (SourceImage IN ("*\\wininit.exe", "*\\csrss.exe", "*\\MsMpEng.exe", "*\\wmiprvse.exe"))
Elastic ES|QL:
from * metadata _id, _index, _version | where EventID==10 and ends_with(TargetImage, "\\lsass.exe") and (GrantedAccess in ("0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff")) and not (ends_with(SourceImage, "\\wininit.exe") or ends_with(SourceImage, "\\csrss.exe") or ends_with(SourceImage, "\\MsMpEng.exe") or ends_with(SourceImage, "\\wmiprvse.exe"))
Microsoft Sentinel / Defender KQL:
EventID == 10 and ((TargetImage endswith "\\lsass.exe" and (GrantedAccess in~ ("0x1010", "0x1410", "0x143a", "0x1438", "0x1fffff"))) and (not((SourceImage endswith "\\wininit.exe" or SourceImage endswith "\\csrss.exe" or SourceImage endswith "\\MsMpEng.exe" or SourceImage endswith "\\wmiprvse.exe"))))
Correlation rules
Single events are often informational; the alert is in the aggregate. SSH brute force is modelled as a Sigma correlation over a per-event base rule. The base rule is tagged informational and matches one failed authentication:
title: SSH Authentication Failure
name: ssh_auth_failure
id: cc6fd1c9-b264-4be8-bb53-b6f4e2af9776
status: experimental
description: Base detection for a single failed SSH authentication, used by the brute-force correlation.
logsource:
  product: linux
  service: sshd
detection:
  selection:
    - Message|contains: 'Failed password for'
    - Message|contains: 'Invalid user'
    - Message|startswith: 'Connection closed by authenticating user'
  condition: selection
level: informational
The correlation rule references that base by name and fires on eight or more failures from one source IP inside a two-minute window:
correlation:
  type: event_count
  rules:
    - ssh_auth_failure
  group-by:
    - src_ip
  timespan: 2m
  condition:
    gte: 8
Because a correlation needs its referenced rule present in the same collection, these are compiled together (make correlations converts the whole linux/credential-access/ directory at once).
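The event_count semantics (group by src_ip, count base-rule matches, fire at eight or more within two minutes) can be illustrated with a small sliding-window counter. This is a sketch for intuition only: the function name brute_force_sources and the sample data are hypothetical, and each SIEM backend implements the window in its own query language, possibly with fixed time buckets rather than a sliding window.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def brute_force_sources(events, threshold=8, window=timedelta(minutes=2)):
    """Return the source IPs with at least `threshold` failures inside any `window`.

    `events` is an iterable of (timestamp, src_ip) pairs, one per base-rule match.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps still inside the window
    fired = set()
    for ts, ip in sorted(events):
        queue = recent[ip]
        queue.append(ts)
        # Drop failures that are older than the window relative to this event.
        while ts - queue[0] > window:
            queue.popleft()
        if len(queue) >= threshold:
            fired.add(ip)
    return fired


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 12, 0, 0)
    # Hypothetical data: one source fails every 10 seconds, another every 30 seconds.
    fast = [(start + timedelta(seconds=10 * i), "10.0.0.5") for i in range(8)]
    slow = [(start + timedelta(seconds=30 * i), "10.0.0.9") for i in range(8)]
    print(brute_force_sources(fast + slow))
```

Both sources produce eight failures, but only the fast one packs them into two minutes, which is the distinction the correlation is meant to draw.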
The test suite
pytest gates rule quality before anything compiles. Schema, ATT&CK tagging, unique IDs, and correlation references are all enforced. The technique tag is matched against a regex and tactic tags against the full ATT&CK enterprise set:
import re
import pytest
# RULES (the rule file paths) and load() (parses one rule file) are defined elsewhere in the test module and not shown here.
TECHNIQUE_RE = re.compile(r"^attack\.t\d{4}(\.\d{3})?$")
@pytest.mark.parametrize("path", RULES, ids=[p.name for p in RULES])
def test_rule_schema(path):
    rule = load(path)
    # Every rule must carry the core metadata fields.
    for field in ("title", "id", "status", "description", "tags", "level"):
        assert rule.get(field), f"{path.name} missing {field}"
    assert "detection" in rule or "correlation" in rule, f"{path.name} has no detection/correlation"
    assert rule["level"] in {"informational", "low", "medium", "high", "critical"}
    assert rule["status"] in {"experimental", "test", "stable", "deprecated", "unsupported"}
@pytest.mark.parametrize("path", RULES, ids=[p.name for p in RULES])
def test_attack_tags(path):
    tags = load(path).get("tags", [])
    # At least one tag must be an ATT&CK technique or sub-technique ID.
    techniques = [t for t in tags if TECHNIQUE_RE.match(t)]
    assert techniques, f"{path.name} has no ATT&CK technique tag"
The correlation check loads every rule's name and asserts that each correlation's references resolve, so a renamed or deleted base rule fails the build instead of silently producing an empty alert:
def test_correlation_refs_resolve():
    names = {load(p).get("name") for p in RULES if load(p).get("name")}
    for path in RULES:
        corr = load(path).get("correlation")
        if corr:
            for ref in corr.get("rules", []):
                assert ref in names, f"{path.name} references unknown rule '{ref}'"
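The unique-ID and tactic-tag checks are mentioned above but not shown. A minimal sketch of how such checks might be written follows; the helper names duplicate_ids and unknown_tactic_tags are hypothetical and the repository's actual tests may differ. The tactic names are the fourteen ATT&CK Enterprise tactics in Sigma's underscore tag form.

```python
import re
from collections import Counter

# The fourteen ATT&CK Enterprise tactics, written as Sigma tags spell them.
ENTERPRISE_TACTICS = {
    "reconnaissance", "resource_development", "initial_access", "execution",
    "persistence", "privilege_escalation", "defense_evasion", "credential_access",
    "discovery", "lateral_movement", "collection", "command_and_control",
    "exfiltration", "impact",
}

# ID-style tags such as attack.t1003.001 (techniques), attack.g0001, attack.s0002.
ATTACK_ID_RE = re.compile(r"^[a-z]\d{4}(\.\d{3})?$")


def duplicate_ids(rules):
    """Return the sorted rule IDs that appear in more than one parsed rule."""
    counts = Counter(rule.get("id") for rule in rules)
    return sorted(rid for rid, n in counts.items() if rid and n > 1)


def unknown_tactic_tags(tags):
    """Return attack.* tags that are neither ID-style tags nor a known tactic."""
    unknown = []
    for tag in tags:
        if not tag.startswith("attack."):
            continue  # other namespaces (for example cve.*) are out of scope here
        name = tag[len("attack."):]
        if ATTACK_ID_RE.match(name) or name in ENTERPRISE_TACTICS:
            continue
        unknown.append(tag)
    return unknown
```

A test would then assert that both helpers return an empty list for the loaded rule set, so a copy-pasted ID or a misspelled tactic such as attack.credential-access fails the build.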
CI workflow
Every push and pull request to main runs a single validate job that lints, tests, and compiles in order, then uploads the generated queries as an artifact. The trimmed workflow:
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - name: Lint Sigma rules
        run: sigma check rules/
      - name: Schema + ATT&CK tests
        run: pytest -q
      - name: Convert to Splunk / Elastic / Sentinel
        run: python tools/convert.py
      - uses: actions/upload-artifact@v4
        with:
          name: converted-queries
          path: dist/
A conversion error fails the job, so a rule that lints but does not compile to one of the three backends never merges. The dist/ tree is gitignored and rebuilt from source on every run.
ATT&CK coverage and validation
The rule set is deliberately focused and high-signal, covering the techniques that show up most in real triage: credential dumping, phishing-to-execution, persistence, and brute force. It spans nine techniques across five tactics, and every rule is tagged and tuned beyond the naive version.
| Tactic | Technique | Detection | Platform | Level |
|---|---|---|---|---|
| Initial Access | T1566.001 | Office spawns scripting host / LOLBin | Windows | high |
| Execution | T1059.001 | PowerShell EncodedCommand | Windows | medium |
| Execution | T1059.004 | Reverse shell one-liner | Linux | high |
| Defense Evasion | T1218.011 | Suspicious rundll32 | Windows | high |
| Persistence | T1543.003 | New service installed (7045) | Windows | high |
| Persistence | T1053.005 | Scheduled task created (4698) | Windows | medium |
| Persistence | T1543.002 | Systemd persistence | Linux | medium |
| Credential Access | T1003.001 | Suspicious LSASS access | Windows | high |
| Credential Access | T1110 | SSH brute force (correlation) | Linux | high |
CI proves the rules lint, pass their tests, and compile. Behaviour is validated separately: in a companion purple-team lab, Atomic Red Team fires each technique and the matching detection is confirmed in a Wazuh SIEM before a rule is promoted here. New detections are only added after they survive that loop, which keeps the set small, high-signal, and grounded in observed telemetry rather than copied from public rule dumps.
