Secure CI/CD Pipeline

The four gates

A cheap ruff lint runs first as a fail-fast gate. The three security scans then fan out in parallel, each declaring needs: lint; the test job waits on all three; and the SOC notification runs last with if: always(), so failures are reported too, not just green runs. The workflow holds only the contents: read and security-events: write permissions.

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install ruff==0.6.9
      - run: ruff check .

  sast:
    runs-on: ubuntu-latest
    needs: lint
    steps:
      - uses: actions/checkout@v4
      - uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/default
            p/python
            p/flask
            .semgrep/rules.yml
          generateSarif: "1"

Job	Tool	What it stops
`lint`	ruff	Style plus the `S` security rule set
`sast`	Semgrep	OWASP/Flask packs plus four custom rules
`secrets`	gitleaks	Committed credentials, full history on PRs
`dependencies`	pip-audit	Pinned packages with known advisories
`test`	pytest	Regressions, with coverage reported
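
The ordering described above can be sketched as a dependency graph. This is an illustration only, not part of the workflow: the NEEDS mapping is an assumption reconstructed from the prose (the three scans need lint, test needs the scans, and notify-soc reads needs.test.result), and stages() is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Each job maps to the jobs it waits on (assumed from the description above).
NEEDS = {
    "lint": set(),
    "sast": {"lint"},
    "secrets": {"lint"},
    "dependencies": {"lint"},
    "test": {"sast", "secrets", "dependencies"},
    "notify-soc": {"test"},
}


def stages(needs):
    """Group jobs into waves; jobs in the same wave can run in parallel."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(stages(NEEDS), start=1):
        print(number, ", ".join(wave))
```

The sketch shows ordering only; it does not model if: always(), which is what lets the notifier run even when an earlier job fails.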

Custom Semgrep rules

The upstream packs (p/default, p/python, p/flask) catch the common cases; .semgrep/rules.yml adds four rules for patterns that kept slipping through. Three are ERROR severity and block the merge; the 0.0.0.0 bind is a WARNING nudge to confirm intent.

rules:
  - id: flask-debug-true
    languages: [python]
    severity: ERROR
    message: Running Flask with debug=True exposes the Werkzeug debugger and allows remote code execution.
    patterns:
      - pattern: $APP.run(..., debug=True, ...)

  - id: subprocess-shell-true
    languages: [python]
    severity: ERROR
    message: subprocess call with shell=True and a non-literal argument is a command injection risk.
    patterns:
      - pattern: subprocess.$FN(..., shell=True, ...)
      - pattern-not: subprocess.$FN("...", shell=True, ...)

The subprocess rule uses a pattern-not to exempt fully-literal command strings, so it fires only when the command is not a string literal, which is the case where attacker-controllable input could reach the shell. The fourth rule, jwt-decode-without-verification, matches both verify=False and an options dict that disables verify_signature, the two ways a forged token gets accepted.
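
To make the subprocess rule concrete, here is a small sketch of three call shapes and how the rule is expected to treat them. The function names are hypothetical and the comments describe expected Semgrep behaviour based on the patterns shown above, not verified scan output.

```python
import subprocess
import sys


def run_literal():
    # Fully literal command string: expected to be exempted by pattern-not.
    return subprocess.run("echo hello", shell=True, capture_output=True, text=True)


def run_unsafe(user_arg):
    # Non-literal command with shell=True: expected to be flagged.
    return subprocess.run(
        "echo " + user_arg, shell=True, capture_output=True, text=True
    )


def run_safe(user_arg):
    # Argument list and no shell: the rule does not match, and shell
    # metacharacters in user_arg are passed through as plain text.
    return subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_arg],
        capture_output=True,
        text=True,
    )
```

Passing a value such as "a; echo pwned" to run_safe returns it unchanged, because no shell ever interprets the semicolon.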

Secret + dependency scanning

The secrets job checks out with fetch-depth: 0 so gitleaks scans the full history on a pull request, not just the tip commit, and runs against a project config that extends the defaults with a generic API-key rule plus an allowlist for test fixtures and documented placeholders:

[[rules]]
id = "generic-api-key"
description = "Generic API key assignment"
regex = '''(?i)(api[_-]?key|secret|token)["'\s:=]{1,4}[a-z0-9]{24,}'''
keywords = ["api_key", "apikey", "secret", "token"]
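
The rule's regex can be exercised outside gitleaks to see what it catches. The sketch below is an approximation: it compiles the same pattern with Python's re module, whereas gitleaks uses Go's regexp engine, and looks_like_secret and the sample strings in the usage note are made up for illustration.

```python
import re

# Same pattern as the generic-api-key rule above, compiled with Python's re.
GENERIC_API_KEY = re.compile(
    r'''(?i)(api[_-]?key|secret|token)["'\s:=]{1,4}[a-z0-9]{24,}'''
)


def looks_like_secret(line):
    """Return True if the line contains a keyword followed by 24+ alphanumerics."""
    return GENERIC_API_KEY.search(line) is not None
```

A line such as api_key = "a1b2c3d4e5f6a7b8c9d0e1f2" matches, while token = "changeme" does not, because the value is shorter than 24 characters. A placeholder like "<your-key-here>" is also skipped by the regex itself, since the value must start with an alphanumeric character.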

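The dependency gate runs pip-audit -r requirements.txt --strict --desc against the pinned manifest. pip-audit exits non-zero whenever a listed package carries a known advisory; --strict additionally fails the run if any dependency cannot be collected for auditing instead of skipping it, and --desc prints the advisory text into the log. Because the manifest is pinned, the diff between a passing and a failing run is a single pinned version.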

SARIF and the Security tab

Semgrep is invoked with generateSarif: "1" and the SARIF file is uploaded through github/codeql-action/upload-sarif@v3 with if: always(), so findings surface under the repository's Security tab and as inline pull-request annotations rather than living only in the job log. Uploading on always() means a failing SAST run still publishes its findings instead of swallowing them when the step exits non-zero.
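
For readers who have not opened a SARIF file, the sketch below walks the standard SARIF 2.1.0 layout (runs, results, locations) and pulls out what the Security tab displays. The SAMPLE document is a hand-written illustration, not output from this pipeline, and findings() is a hypothetical helper.

```python
# Hand-written SARIF fragment for illustration; real Semgrep output has more fields.
SAMPLE = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "semgrep"}},
            "results": [
                {
                    "ruleId": "flask-debug-true",
                    "level": "error",
                    "message": {"text": "Running Flask with debug=True"},
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "app.py"},
                                "region": {"startLine": 12},
                            }
                        }
                    ],
                }
            ],
        }
    ],
}


def findings(sarif):
    """Yield (rule id, file, line) for every located result in a SARIF document."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri")
                line = physical.get("region", {}).get("startLine")
                yield result.get("ruleId"), uri, line


if __name__ == "__main__":
    for rule_id, uri, line in findings(SAMPLE):
        print(f"{uri}:{line} {rule_id}")
```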

SOC notifier

The final notify-soc job posts the run outcome (repository, commit, actor, status, and a link back to the run) to a Shuffle webhook, passing PIPELINE_STATUS: ${{ needs.test.result }} so the payload reflects whether the gates actually passed. The notifier in scripts/notify_soc.py uses only the Python standard library, validates that the webhook is an http(s) URL, and degrades gracefully: if SHUFFLE_WEBHOOK_URL is unset, the job no-ops instead of failing the build.

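import json
import os
import sys
from urllib import request


def main():
    # No webhook configured: skip quietly so the build is not failed
    hook = os.environ.get("SHUFFLE_WEBHOOK_URL")
    if not hook:
        print("SHUFFLE_WEBHOOK_URL not set, skipping SOC notification")
        return 0
    # Reject anything that is not an http(s) URL before building a request
    if not hook.lower().startswith(("https://", "http://")):
        print("SHUFFLE_WEBHOOK_URL must be an http(s) URL", file=sys.stderr)
        return 1
    event = build_event()  # defined elsewhere in the script
    body = json.dumps(event).encode("utf-8")
    req = request.Request(
        hook, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    # (excerpt ends here; the request is then sent)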

A network failure reaching the webhook returns 0 on purpose: the SOC being unreachable should not flip an otherwise-green build red. In the homelab this webhook feeds a SOC automation lab, so a failed security gate opens a TheHive case the same way a Wazuh alert does.
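
The parts of the notifier not shown in the excerpt might look roughly like the sketch below. It is a guess at the shape, not the project's code: the payload keys, the send() helper, the injectable opener, and the 10-second timeout are all assumptions. The GITHUB_* names are GitHub's default environment variables, and PIPELINE_STATUS is the variable the workflow passes in.

```python
import json
import os
from urllib import error, request


def build_event(env=os.environ):
    """Assemble the fields named in the text (payload key names are assumed)."""
    server = env.get("GITHUB_SERVER_URL", "https://github.com")
    repo = env.get("GITHUB_REPOSITORY", "")
    run_id = env.get("GITHUB_RUN_ID", "")
    return {
        "repository": repo,
        "commit": env.get("GITHUB_SHA", ""),
        "actor": env.get("GITHUB_ACTOR", ""),
        "status": env.get("PIPELINE_STATUS", "unknown"),
        "run_url": f"{server}/{repo}/actions/runs/{run_id}",
    }


def send(hook, event, opener=request.urlopen, timeout=10):
    """POST the event; always return 0 so an unreachable SOC never fails the build."""
    body = json.dumps(event).encode("utf-8")
    req = request.Request(
        hook, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with opener(req, timeout=timeout) as resp:
            print(f"SOC notified, HTTP {resp.status}")
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        print(f"SOC webhook unreachable, continuing: {exc}")
    return 0
```

Making the opener a parameter is a choice made for this sketch so the failure path can be exercised in a test without any network access.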

The sample app + tests

The target is a minimal Flask task API: health, list, create, fetch, and delete endpoints backed by a lock-guarded in-memory store. Five pytest tests cover the happy path plus the edge cases that matter for an API: missing required fields return 400, unknown task IDs return 404, and deletes return 204. The store is reset between tests so each runs against a clean fixture.

def test_delete(client):
    created = client.post("/tasks", json={"title": "temp"}).get_json()["task"]
    assert client.delete(f"/tasks/{created['id']}").status_code == 204
    assert client.get(f"/tasks/{created['id']}").status_code == 404


def test_missing_task(client):
    assert client.get("/tasks/999").status_code == 404

The app itself binds to 127.0.0.1 and never sets debug=True, so it passes its own custom Semgrep rules: the rules are written against the mistakes the app deliberately avoids.
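
The lock-guarded in-memory store and its per-test reset could be sketched as follows. This is one possible shape using only the standard library, not the app's actual code; TaskStore and its method names are hypothetical.

```python
import threading


class TaskStore:
    """In-memory task store; every access to shared state holds the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = {}
        self._next_id = 1

    def create(self, title):
        with self._lock:
            task = {"id": self._next_id, "title": title}
            self._tasks[task["id"]] = task
            self._next_id += 1
            return dict(task)  # return a copy so callers cannot mutate the store

    def get(self, task_id):
        with self._lock:
            task = self._tasks.get(task_id)
            return dict(task) if task else None

    def delete(self, task_id):
        with self._lock:
            return self._tasks.pop(task_id, None) is not None

    def reset(self):
        # Called between tests so each one starts from an empty store.
        with self._lock:
            self._tasks.clear()
            self._next_id = 1
```

A pytest fixture would call reset() before handing out the test client, which is what keeps a fixed ID such as 999 reliably absent in test_missing_task.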

What fails the build

Each gate fails the run for a concrete, reproducible reason, and because every gate is its own job the red check names the cause directly:

lint: ruff finds a style violation or an S security-rule hit
sast: any ERROR-severity Semgrep finding, custom or upstream (debug=True, shell=True on non-literal input, jwt.decode without verification)
secrets: gitleaks matches a credential anywhere in PR history that is not allowlisted
dependencies: pip-audit finds a pinned package with a known advisory, or under --strict cannot audit one
test: any of the five pytest tests regresses

Running make all executes the same chain locally (lint, sast, secrets, deps, test), provided semgrep and gitleaks are on the PATH and the rest has been pip-installed by make install, so a developer sees before pushing the same failure the pipeline would surface after.