Cerberus Watchtower: a control tower for my dev machine
I built a local dashboard that auto-discovers every repo on my machine, renders architecture views per stack, reviews bundles of commits as an architectural delta, shows every listening socket, and watches my Claude Code agent sessions. Five lenses, about 8,500 lines, and a bet about what development is turning into.
By Igor Riera
In early June I spent five days building a tool I’d been missing for a year. Cerberus Watchtower is a local dashboard — a control tower — for my development machine: every repository, every architecture, every listening port, every AI agent session, and every commit, on one screen.
This post is what it does, how the interesting parts work, and the design decision I’d defend hardest.
The problem
My working set is wide: a .NET solution with 44 projects, several Vue applications, Python services, infrastructure and config repos — about fifteen active repositories. On top of that, I increasingly work by running multiple Claude Code agent sessions in parallel across those repos.
Two kinds of situational awareness kept failing. The first is structural: “what does this system actually look like right now” — which projects reference which, where a request enters and which handler serves it. That map lives in my head, and my head’s copy rots. The second is operational: an agent session finishes, or blocks on a permission prompt, and sits idle while I’m heads-down in another window. Work stalls silently.
Both are observability problems, so the tool is an observability tool.
The shape
A Python FastAPI backend on 127.0.0.1:8765 — loopback only, by design — serving a Vue 3 frontend. It launches in a chromeless Edge app window with its own taskbar identity, so it behaves like a native app without being one. An idempotent PowerShell launcher checks whether the server is already up and either starts it or just opens the window.
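The launcher itself is PowerShell, but its idempotent check is easy to sketch in Python. This is a minimal illustration, not the actual launcher: `server_is_up` and `launch` are hypothetical names, and `start_server` and `open_window` are stand-in callables for whatever really starts the backend and opens the Edge app window.

```python
import socket
from typing import Callable


def server_is_up(host: str = "127.0.0.1", port: int = 8765, timeout: float = 0.5) -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def launch(
    start_server: Callable[[], None],
    open_window: Callable[[], None],
    is_up: Callable[[], bool] = server_is_up,
) -> str:
    """Start the server only if it is not running, then open the window either way."""
    if is_up():
        open_window()
        return "reused"
    start_server()  # a real launcher would also wait here until the port answers
    open_window()
    return "started"
```

Running `launch` twice in a row is safe: the second call finds the port answering and only opens the window.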
Five tabs: Review queue, Activity, Architecture, History, Ports.
Architecture views: an always-current map
Repo discovery is automatic. The config lists root directories; anything under them with a .git folder gets registered, and the stack is inferred from contents — a .sln means .NET, package.json with Vue means Vue, pyproject.toml means Python, Terraform files mean infrastructure. New repo on disk, refresh, it’s on the dashboard.
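A minimal sketch of that discovery pass, assuming marker files are checked in the order listed above. The function names, the `max_depth` limit, and the `"generic"` fallback label are illustrative assumptions, not the tool's actual code.

```python
import json
from pathlib import Path


def infer_stack(repo: Path) -> str:
    """Guess a repo's stack from marker files; fall back to 'generic'."""
    if any(repo.glob("*.sln")):
        return "dotnet"
    package_json = repo / "package.json"
    if package_json.is_file():
        try:
            manifest = json.loads(package_json.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            manifest = {}  # unreadable manifest: treat as "no Vue found"
        deps: dict = {}
        if isinstance(manifest, dict):
            for key in ("dependencies", "devDependencies"):
                section = manifest.get(key)
                if isinstance(section, dict):
                    deps.update(section)
        if "vue" in deps:
            return "vue"
    if (repo / "pyproject.toml").is_file():
        return "python"
    if any(repo.glob("*.tf")):
        return "infrastructure"
    return "generic"


def discover_repos(roots: list[Path], max_depth: int = 3) -> dict[Path, str]:
    """Register every directory under the roots that contains a .git entry."""
    found: dict[Path, str] = {}

    def walk(directory: Path, depth: int) -> None:
        if (directory / ".git").exists():
            found[directory] = infer_stack(directory)
            return  # do not descend into a repo looking for nested repos
        if depth >= max_depth:
            return
        try:
            children = [p for p in directory.iterdir() if p.is_dir()]
        except OSError:
            return  # unreadable directory: skip it rather than fail discovery
        for child in children:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return found
```

Because the registry is rebuilt from disk on every refresh, there is no separate "add repo" step to forget.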
Each stack gets views that match how that stack is actually structured:
- .NET renders the project dependency graph (solution structure plus ProjectReferences plus NuGet packages), a MediatR request-to-handler index, and full data flows: endpoint to dispatched request to handler to the service implementations it calls. Against the primary solution, the scanner traces 431 endpoints to their handlers, with about one miss, across ~263 handlers and ~3,800 parsed methods, in roughly six seconds.
- Python shows the internal import graph, declared dependencies, FastAPI/Flask endpoints, and entry points.
- Vue shows the component and store import graph, the routes, and which components call which API endpoints.
- Everything else — config repos, Terraform, mixed — falls back to a structural inventory: language breakdown, directory map, docs index. No stack renders an error page.
Every scan is also diffed against the previous one. Drift detection compares stack-appropriate signatures — endpoints, dependency edges, packages, modules — and surfaces a “changes detected” badge when the structure moved. The dashboard doesn’t just show the map; it tells you when the territory changed.
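Drift detection of this kind reduces to set differences per signature category. A minimal sketch, assuming each scan is summarized as a mapping from category name to a set of string signatures (that representation and the name `diff_signatures` are assumptions for illustration):

```python
def diff_signatures(
    previous: dict[str, set[str]],
    current: dict[str, set[str]],
) -> dict[str, dict[str, list[str]]]:
    """Return added/removed signatures per category; empty dict means no drift."""
    changes: dict[str, dict[str, list[str]]] = {}
    for kind in sorted(previous.keys() | current.keys()):
        before = previous.get(kind, set())
        after = current.get(kind, set())
        added = sorted(after - before)
        removed = sorted(before - after)
        if added or removed:
            changes[kind] = {"added": added, "removed": removed}
    return changes


last_scan = {"endpoints": {"GET /orders", "POST /orders"}, "packages": {"MediatR"}}
this_scan = {"endpoints": {"GET /orders", "GET /orders/{id}"}, "packages": {"MediatR"}}
drift = diff_signatures(last_scan, this_scan)
show_badge = bool(drift)  # the "changes detected" badge is just "is the diff non-empty"
```

Categories that did not change are left out of the result, so the badge condition stays a single truthiness check.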
For prose, a narrator generates per-project explanations of module responsibilities by shelling out to the Claude CLI, keyed by a hash of the architecture structure — so narratives cache until the structure actually changes, and an unchanged repo never costs a second model call.
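The caching idea can be sketched as follows. This is an illustration under assumptions: the architecture is taken to be a JSON-serializable dict, `generate` is a stand-in callable for the Claude CLI call (whose actual invocation is not shown here), and the cache is in-memory rather than on disk.

```python
import hashlib
import json
from typing import Callable


def structure_hash(architecture: dict) -> str:
    """Stable hash of the structure; key order does not affect the result."""
    canonical = json.dumps(architecture, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class NarrativeCache:
    """Generate a narrative once per distinct structure, then reuse it."""

    def __init__(self, generate: Callable[[dict], str]) -> None:
        self._generate = generate  # hypothetical stand-in for the model call
        self._store: dict[str, str] = {}

    def narrative_for(self, architecture: dict) -> str:
        key = structure_hash(architecture)
        if key not in self._store:
            self._store[key] = self._generate(architecture)
        return self._store[key]
```

One caveat with this sketch: lists inside the structure should be sorted before hashing, otherwise a mere reordering of scan results would look like a structural change and trigger a fresh model call.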
The design decision I’d defend hardest
The scanners are regex and AST heuristics. There is no Roslyn, no compiler integration, no language server.
This was deliberate, and it’s the call I’d defend over any other in the codebase. Compiler-grade analysis is correct but heavy: it wants build context, package restores, and per-language toolchains, and it breaks when the code doesn’t compile — which, on an active dev machine, is often. Heuristic scanning is approximately correct and nearly free: the .NET scanner gets 431 of ~432 endpoint traces right with regular expressions over field declarations and type patterns.
For a dashboard you glance at thirty times a day, freshness beats precision. An always-current map that’s right to within one endpoint is more useful than a perfect map you stopped regenerating because it was slow. The scanners also never raise — a parse failure degrades to less detail, not to a broken tab.
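To make the trade-off concrete, here is a minimal sketch of a heuristic scanner in that spirit. It is not one of the tool's scanners: the pattern below only recognizes ASP.NET-style `[HttpGet("...")]` attributes, and the function names are illustrative. The point is the shape: plain regex over source text, and a wrapper that turns any failure into "less detail" instead of an exception.

```python
import re
from pathlib import Path

# Matches attributes such as [HttpGet("orders/{id}")] or a bare [HttpPost].
HTTP_ATTRIBUTE = re.compile(
    r'\[Http(Get|Post|Put|Delete|Patch)(?:\(\s*"([^"]*)"\s*\))?\]'
)


def scan_endpoints(source: str) -> list[tuple[str, str]]:
    """Return (VERB, route) pairs found in C# source text; route may be empty."""
    return [
        (match.group(1).upper(), match.group(2) or "")
        for match in HTTP_ATTRIBUTE.finditer(source)
    ]


def scan_file_safely(path: Path) -> list[tuple[str, str]]:
    """Never raise: an unreadable or odd file simply contributes no endpoints."""
    try:
        return scan_endpoints(path.read_text(encoding="utf-8", errors="replace"))
    except Exception:
        return []
```

A scanner like this needs no build, no package restore, and no compiling code, which is exactly why it will miss the occasional unusual declaration.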
History: looking backwards at committed work
The other four lenses answer “what is true right now.” Using the tool daily surfaced the gap: I also wanted to look backwards — at what a stretch of commits actually changed, structurally, not just line by line. So, about a week after the first build, I added a fifth lens at /history.
It does three things. It lists commits: per repo, merged newest-first into a universal feed across the whole registry, or scoped to a single agent session. It opens any single commit as metadata plus a per-file diff. And it assembles bundles: several commits reviewed together, either a contiguous range or an arbitrary cherry-picked set, as one combined diff.
The part I care about is what sits next to that combined diff: the architectural delta the bundle produced. To compute it, the scanner reconstructs the repo’s architecture as it stood at a past commit — extracted read-only with git archive, no worktree checkout, no branch switching — runs the same stack-aware scan the live Architecture view runs, and caches the result forever, keyed by repo-and-commit. Two of those snapshots get structurally diffed, and the result drills exactly like the live architecture map: the dependency graph highlights the nodes a bundle added, and you can walk endpoint-to-handler-to-service on the resulting structure. So a bundle isn’t just “+1,400 −300 across eighteen files” — it’s “these three endpoints and two handlers are new, and here’s the shape they added.”
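A minimal sketch of the snapshot step, assuming `git` is on the PATH and the commit is given as a full SHA (a branch name would make "cache forever" unsafe, since branches move). `extract_snapshot`, `scan_at_commit`, and the dict used as a cache are hypothetical names for illustration; `scan` stands in for the stack-aware scanner.

```python
import io
import subprocess
import tarfile
import tempfile
from pathlib import Path
from typing import Callable


def extract_snapshot(repo: Path, commit: str, dest: Path) -> Path:
    """Unpack the tree at `commit` into `dest`; the repo's worktree is never touched."""
    archive = subprocess.run(
        ["git", "-C", str(repo), "archive", "--format=tar", commit],
        check=True,
        capture_output=True,
    ).stdout
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
        tar.extractall(dest)
    return dest


def scan_at_commit(
    repo: Path,
    commit: str,
    scan: Callable[[Path], dict],
    cache: dict,
    extract: Callable[[Path, str, Path], Path] = extract_snapshot,
) -> dict:
    """Scan the architecture as of `commit`, at most once per (repo, commit)."""
    key = (str(repo), commit)
    if key not in cache:
        with tempfile.TemporaryDirectory() as tmp:
            cache[key] = scan(extract(repo, commit, Path(tmp)))
    return cache[key]
```

Two results of `scan_at_commit`, one per end of a bundle, can then be compared with the same kind of structural diff used for live drift detection.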
This is also where Activity and History meet. Each agent session links to the commits it most likely produced — matched by time window and file overlap — so a “Commits →” jump takes you from “this agent did work in that repo” to “here’s exactly what landed, and what it did to the architecture.”
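The matching rule can be sketched like this. The data shapes, the ten-minute slack after the session ends, and the "any shared file counts" threshold are assumptions for illustration; the post only states that matching uses time window and file overlap.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Session:
    started: datetime
    ended: datetime
    files_edited: frozenset[str]


@dataclass(frozen=True)
class Commit:
    sha: str
    committed: datetime
    files: frozenset[str]


def likely_commits(
    session: Session,
    commits: list[Commit],
    slack: timedelta = timedelta(minutes=10),  # hypothetical grace period
) -> list[Commit]:
    """Commits made during (or shortly after) the session that touch its files."""
    matches = []
    for commit in commits:
        in_window = session.started <= commit.committed <= session.ended + slack
        shares_files = bool(session.files_edited & commit.files)
        if in_window and shares_files:
            matches.append(commit)
    return matches
```

The result is a "most likely" link, not proof of authorship: two sessions editing the same file in the same hour would both match the same commit.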
Activity: observability for agent sessions
Claude Code writes session transcripts as JSONL files. A parser extracts, per session: the working directory, the terminal tab title (which is effectively the agent’s own one-line summary of what it’s doing), tool call counts, files edited, and the last user prompt. Live events arrive separately, through a hook emitter registered with Claude Code itself: a stdlib-only script that appends an event and exits, adding effectively no latency to the agent. It covers session starts, file edits, stops, and permission prompts.
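A minimal sketch of what such an emitter can look like, assuming the hook payload arrives as a JSON object on standard input. The function names, the `received_at` field, and the log location are hypothetical; the payload's own field names are passed through untouched rather than guessed at.

```python
import json
import time
from pathlib import Path
from typing import TextIO


def append_event(payload: dict, log_path: Path) -> None:
    """Append one event as a single JSON line, stamped with arrival time."""
    record = {"received_at": time.time(), **payload}
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")


def main(stdin: TextIO, log_path: Path) -> int:
    """Read one payload, log it, and always exit cleanly."""
    try:
        payload = json.load(stdin)
        if isinstance(payload, dict):
            append_event(payload, log_path)
    except Exception:
        pass  # telemetry must never block or fail the agent
    return 0
```

As a hook script this would end with a call along the lines of `sys.exit(main(sys.stdin, some_log_path))`. Doing nothing but one append keeps the emitter cheap, and leaves all parsing to the dashboard side.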
The synthesis is an attention model. Every session is in one of three states: active (working), waiting (an unresolved permission prompt — the agent is blocked on me), or done. The dashboard sorts attention-needing sessions forward, and a Windows toast fires when a session enters a state that needs me — with a per-session cooldown, and a rule that a stop only toasts if the session had been running at least 45 seconds, so quick conversational turns don’t spam.
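The state machine and toast rule fit in a few lines. This is a sketch under assumptions: the 45-second threshold comes from the description above, while the cooldown length, the field names, and the function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    ACTIVE = "active"    # the agent is working
    WAITING = "waiting"  # blocked on an unresolved permission prompt
    DONE = "done"        # the session has stopped


MIN_RUN_SECONDS = 45    # a stop only toasts after this much runtime
COOLDOWN_SECONDS = 120  # hypothetical per-session cooldown


@dataclass
class SessionView:
    started_at: float
    stopped: bool = False
    pending_permission: bool = False
    last_toast_at: Optional[float] = None


def state_of(session: SessionView) -> State:
    if session.pending_permission:
        return State.WAITING
    if session.stopped:
        return State.DONE
    return State.ACTIVE


def should_toast(session: SessionView, previous: State, now: float) -> bool:
    """Toast only on a transition into a state that needs the human."""
    current = state_of(session)
    if current is previous or current is State.ACTIVE:
        return False
    if session.last_toast_at is not None and now - session.last_toast_at < COOLDOWN_SECONDS:
        return False
    if current is State.DONE and now - session.started_at < MIN_RUN_SECONDS:
        return False  # quick conversational turn: stay quiet
    return True
```

Sorting the dashboard then only needs a priority per state, with waiting sessions first.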
This tab changed my behavior the most. Running parallel agents without it, I was the scheduler and the alerting system. Now the machine is.
Ports: what is this box actually serving
The Ports tab polls the OS socket table every ten seconds: every TCP listener and UDP bind, with process name, PID, command line, uptime, and memory. Known services get labels. Established connections group by remote host.
The column that earns the tab its place: bind address. Loopback-only versus 0.0.0.0 — local-only versus exposed-to-the-LAN — is a security-relevant distinction that’s invisible until you put it on a screen. Dev machines accumulate listeners the way drawers accumulate cables; now mine are inventoried continuously.
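Classifying a bind address needs nothing beyond the standard library. A minimal sketch, where the function name and the label strings are illustrative assumptions:

```python
import ipaddress


def exposure(bind_address: str) -> str:
    """Classify a listener's bind address by how far it is reachable."""
    try:
        address = ipaddress.ip_address(bind_address)
    except ValueError:
        return "unknown"  # e.g. "*" or an empty string from some socket tables
    if address.is_loopback:
        return "local-only"  # 127.0.0.0/8 or ::1
    if address.is_unspecified:
        return "all-interfaces"  # 0.0.0.0 or ::, reachable from the LAN
    return "specific-interface"
```

For example, `exposure("127.0.0.1")` gives `"local-only"` and `exposure("0.0.0.0")` gives `"all-interfaces"`, which is the distinction the column exists to surface.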
What it deliberately is not
It’s local-only — the server binds loopback and nothing is exposed. It’s read-only toward the repos — all git access is read-only subprocess calls, and even the history lens reconstructs past architectures with git archive rather than ever checking a commit out; the tool will never mutate a repository. It doesn’t link across stacks (a .NET endpoint isn’t traced into the Vue component that calls it — per-stack maps only, for now). And the notification layer is Windows-specific; elsewhere it degrades to a silent no-op.
By the numbers: about 8,500 lines of Python, Vue, and JavaScript across 67 source files, with 18 Vue components, 8 scanner modules, and 70 tests that run against synthetic fixtures rather than real repos. Twelve commits, June 3rd to June 15th.
The bet underneath it
Five days for the first four lenses, and a sixth day a week later for the fifth — fast for this much tool, and the speed is itself part of the story: most of Watchtower was built by the same agent sessions it now monitors.
As more of the work is done by agents in parallel, the scarce skill shifts: less writing the change, more maintaining situational awareness over several systems changing at once. The tools for that mostly don’t exist yet — production observability got dashboards, states, and alerts a decade ago, but the development side of the desk never did.