Perseus - Zion Boggan

What it is

Perseus is a centralized command layer for a self-hosted homelab. Instead of SSH-ing into five machines and juggling terminals, I send a natural-language message in Discord and the right agent handles it. A chat router classifies the intent (infrastructure query, content task, trend hunt, compliance check) and dispatches to one of 20 registered agents, each wired to the LLM backend that suits its workload.

The division is deliberate: latency-sensitive or high-volume tasks (content curation, trend analysis, social drafting) go to a local Ollama model at zero marginal cost. Tasks that need strong reasoning (compliance risk assessment) hit GPT-4o-mini. Strategic analysis goes to Grok. The system tracks per-request token spend against a monthly budget ceiling and sends an alert before spend reaches that ceiling.
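
A minimal sketch of how the budget check could work. The class name, the 80% warning fraction, and the $10 ceiling are assumptions for illustration, not values from the actual system.

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Accumulates per-request spend against a monthly ceiling (hypothetical helper)."""
    monthly_ceiling_usd: float    # assumed: set by the operator
    alert_fraction: float = 0.8   # assumed: warn at 80% of the ceiling
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add one request's cost and return the resulting budget status."""
        self.spent_usd += cost_usd
        if self.spent_usd >= self.monthly_ceiling_usd:
            return "over"
        if self.spent_usd >= self.alert_fraction * self.monthly_ceiling_usd:
            return "warn"
        return "ok"


tracker = BudgetTracker(monthly_ceiling_usd=10.0)
print(tracker.record(0.0))  # a local Ollama call adds no spend -> ok
print(tracker.record(8.5))  # crosses the 80% threshold -> warn
```

In this sketch a "warn" status is the point where a Discord alert would be sent; local calls still pass through record() with a cost of zero so every request shows up in the history.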

The routing layer

The /chat endpoint is the core. Every incoming message, whether from Discord or direct REST, flows through an intent classifier: a linear chain of keyword checks that falls through to a general-purpose brain when nothing matches. The classifier is intentionally simple. It is a fast substring scan over the lowercased command string, not an LLM call, so routing itself costs no tokens.

Once a message is classified, the dispatcher looks up the agent in a registry that records which backend it uses (ollama, gpt4o-mini, grok-fast) and whether it runs locally. The agent's prompt is built by a template function (build_prompt) that injects prior context when available. The call then goes to the selected backend, the cost is computed from the token usage in the response, and the result is logged to PostgreSQL.
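
The registry lookup and prompt construction described above can be sketched as follows. Only the "local" key and the build_prompt(agent_name, state) signature appear in the real excerpt; the "backend" key, the template strings, the shape of state, and resolve_backend are assumptions made for illustration.

```python
# Hypothetical registry entries: backend name plus a local/remote flag.
AGENT_REGISTRY = {
    "trend_hunter": {"backend": "ollama", "local": True},
    "compliance_scanner": {"backend": "gpt4o-mini", "local": False},
}

# Hypothetical prompt templates, one per agent.
PROMPT_TEMPLATES = {
    "trend_hunter": "List emerging niches worth covering this week.",
    "compliance_scanner": "Assess the following content for policy risk.",
}


def build_prompt(agent_name: str, state: dict) -> str:
    """Return the agent's template, prefixed with prior context when there is any."""
    prompt = PROMPT_TEMPLATES[agent_name]
    context = state.get("context")  # assumed key for prior conversation context
    if context:
        prompt = f"Prior context:\n{context}\n\n{prompt}"
    return prompt


def resolve_backend(agent_name: str) -> str:
    """Look up an agent's backend, defaulting to the local model for unknown agents."""
    return AGENT_REGISTRY.get(agent_name, {}).get("backend", "ollama")


print(resolve_backend("compliance_scanner"))
print(build_prompt("trend_hunter", {"context": "Last run covered retro gaming."}))
```

Defaulting unknown agents to the local backend mirrors the excerpt's .get("local", True): a registry miss costs nothing rather than triggering a paid API call.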

main.py: intent classification and agent dispatch (secrets-free; all credentials loaded from env)
# Content agent triggers: keyword scan before falling through to the general brain
lc = command.lower()  # lowercase once so the trigger phrases match case-insensitively
agent_name = None
if any(t in lc for t in ["post ideas", "post idea", "generate post", "social post"]):
    agent_name = "social_publisher"
elif any(t in lc for t in ["find trends", "trending niches", "trend hunt", "viral trends"]):
    agent_name = "trend_hunter"
elif any(t in lc for t in ["curate content", "filter content", "dedup", "deduplicate"]):
    agent_name = "content_curator"
elif any(t in lc for t in ["rewrite this", "rephrase this", "rewrite caption"]):
    agent_name = "content_rewriter"
elif any(t in lc for t in ["compliance scan", "policy scan", "content scan"]):
    agent_name = "compliance_scanner"

if agent_name:
    prompt = build_prompt(agent_name, state)
    local = AGENT_REGISTRY.get(agent_name, {}).get("local", True)
    cost = 0.0
    if local:
        result_data = call_ollama(prompt)
        result = result_data.get("result", "")
    else:
        # compliance_scanner routes to GPT-4o-mini; tracks token cost
        gr = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
            json={"model": "gpt-4o-mini", "messages": [{"role": "user", "content": prompt}],
                  "temperature": 0.5, "max_tokens": 200},
            timeout=30,
        )
        gr.raise_for_status()  # surface HTTP errors here instead of a KeyError on "choices"
        rj = gr.json()
        result = rj["choices"][0]["message"]["content"].strip()
        u = rj.get("usage", {})
        cost = (u.get("prompt_tokens", 0) / 1e6 * 0.15) + (u.get("completion_tokens", 0) / 1e6 * 0.60)
    save_to_db(agent_name, command, result, cost)
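
The cost line in the excerpt can be pulled into a small helper and checked by hand. This is a sketch: the function and constant names are hypothetical, and the rates are the per-million-token figures used in the excerpt.

```python
# Per-million-token rates in USD, taken from the excerpt above.
PROMPT_RATE_PER_M = 0.15
COMPLETION_RATE_PER_M = 0.60


def request_cost(usage: dict) -> float:
    """Compute USD cost from an OpenAI-style usage dict; missing fields count as zero."""
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    return (prompt_tokens / 1e6 * PROMPT_RATE_PER_M
            + completion_tokens / 1e6 * COMPLETION_RATE_PER_M)


# 1,200 prompt tokens plus 200 completion tokens (the excerpt's max_tokens cap):
# 1200 / 1e6 * 0.15 = 0.00018 and 200 / 1e6 * 0.60 = 0.00012
cost = request_cost({"prompt_tokens": 1200, "completion_tokens": 200})
print(f"${cost:.6f}")  # $0.000300
```

At these rates a single capped request costs a fraction of a cent, which is why the tracking is done per request and summed over the month.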

Infrastructure control

Beyond content agents, Perseus manages the homelab directly. SSH commands are classified into three tiers before they run: a blocklist of destructive operations (rm -rf /, dd if=/dev/zero, shutdown) that are rejected outright; a read-only auto-approve list (ps, nvidia-smi, ollama list, df, systemctl status) that runs immediately; and everything else, which parks in a PENDING_EXEC queue until an explicit !exec approval arrives. The host is resolved by alias (gpu → the right LXC IP, vpn → WireGuard node) so commands can be issued by role rather than by address.
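
The three-tier check can be sketched as below. The command lists come from the description above; the matching rules (substring match for the blocklist, prefix match for read-only commands, and refusing auto-approval when shell metacharacters are present) and the placeholder addresses are assumptions, not the actual implementation.

```python
BLOCKED_PATTERNS = ["rm -rf /", "dd if=/dev/zero", "shutdown"]
READ_ONLY_PREFIXES = ["ps", "nvidia-smi", "ollama list", "df", "systemctl status"]
# Assumption: chaining or redirection disqualifies a command from auto-approval.
SHELL_METACHARS = set(";|&`$><")

# Placeholder addresses from the documentation range, not real hosts.
HOST_ALIASES = {"gpu": "192.0.2.10", "vpn": "192.0.2.20"}


def classify_command(cmd: str) -> str:
    """Return 'blocked', 'auto', or 'pending' for a proposed SSH command."""
    normalized = " ".join(cmd.split())  # collapse runs of whitespace
    if any(pattern in normalized for pattern in BLOCKED_PATTERNS):
        return "blocked"
    if not (set(normalized) & SHELL_METACHARS):
        for prefix in READ_ONLY_PREFIXES:
            # Match the whole command or the prefix followed by arguments,
            # so "ps aux" is approved but "psql" is not.
            if normalized == prefix or normalized.startswith(prefix + " "):
                return "auto"
    return "pending"


def resolve_host(alias: str) -> str:
    """Map a role alias to an address; an unknown alias raises KeyError."""
    return HOST_ALIASES[alias]


for cmd in ["df -h", "systemctl restart ollama", "rm -rf /", "ps aux; reboot"]:
    print(cmd, "->", classify_command(cmd))
```

Checking the blocklist first means a destructive command can never be rescued by also matching a read-only prefix, and anything the sketch does not recognize lands in the pending tier by default.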

The same infrastructure map also feeds the health monitor, which polls all nodes in parallel for CPU, RAM, disk, GPU utilization and temperature, and Ollama model status, and sends a Discord alert when any threshold trips.
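
A sketch of the parallel check using a thread pool. The threshold values, metric names, and the canned fetch_metrics probe are all assumptions; a real probe would query each node over SSH or HTTP.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed limits: percent for cpu/ram/disk, degrees Celsius for gpu_temp.
THRESHOLDS = {"cpu": 90.0, "ram": 90.0, "disk": 85.0, "gpu_temp": 80.0}


def fetch_metrics(node: str) -> dict:
    """Stand-in probe returning canned readings for two hypothetical nodes."""
    canned = {
        "gpu": {"cpu": 35.0, "ram": 60.0, "disk": 50.0, "gpu_temp": 84.0},
        "vpn": {"cpu": 5.0, "ram": 20.0, "disk": 30.0},
    }
    return canned[node]


def breaches(metrics: dict) -> list:
    """Return the names of metrics that exceed their threshold."""
    return [name for name, value in metrics.items()
            if name in THRESHOLDS and value > THRESHOLDS[name]]


def check_all(nodes: list) -> dict:
    """Probe every node concurrently and map node -> list of tripped metrics."""
    with ThreadPoolExecutor(max_workers=len(nodes) or 1) as pool:
        results = list(pool.map(fetch_metrics, nodes))  # map preserves input order
    alerts = {}
    for node, metrics in zip(nodes, results):
        tripped = breaches(metrics)
        if tripped:
            alerts[node] = tripped
    return alerts


print(check_all(["gpu", "vpn"]))  # {'gpu': ['gpu_temp']}
```

Threads suit this workload because each probe spends its time waiting on the network, so one slow node does not hold up the others. Any non-empty result is what would be forwarded to Discord.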

Why I built it

Most homelab "automation" is a pile of shell scripts you have to remember. Perseus is the alternative: one interface, one context window, a history of what ran and what it cost. The Discord layer means I can check a failing cron job or queue a content draft from my phone without opening a terminal. The cost tracking means I never hit a surprise API bill, because the system warns me as spend approaches the ceiling.