The NSC Dashboard is a real working multi-agent system you can study. It does three unrelated things:
- Retro PC fleet control — Win98/XP/Linux machines, live video streaming, game benchmarks
- Market research pipeline — scans Reddit/HN/ProductHunt, scores opportunities, generates blueprints, builds apps, deploys to Azure
- Customer app build queues — fix submissions, compilation, UI test runs, E2E fix loops
Architecture at a glance
- Frontend: React + Tailwind, served by FastAPI
- Backend: FastAPI (Python), runs inside Docker
- Agent orchestration: openclaw cron jobs (a lightweight cron-style scheduler) trigger Python scripts on schedules
- LLM routing: three providers (Ollama on a local 5090, Anthropic Claude API, GitHub Copilot API), selected per task via models.json
- State: SQLite (research.db) for pipeline state, Postgres for customer apps, encrypted secrets via Fernet
What makes it work
Decomposed responsibility. Each agent does one thing:
- market-research.py scans for opportunities
- blueprint-builder.py converts opportunities to buildable specs
- roadmap-generator.py plans implementation
- fix-submission-agent.py picks up queued customer fixes
- external-game-cataloger.py scrapes game metadata weekly
Agents communicate via DB, not direct calls. Blueprint builder writes rows to a table; roadmap generator polls that table on its schedule. This is deliberately loose — any agent can crash and the next cycle picks up where it left off.
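The producer/consumer handoff described above can be sketched with stdlib sqlite3. The table and column names here are illustrative, not the dashboard's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS blueprints (
        id INTEGER PRIMARY KEY,
        opportunity TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending'   -- 'pending' -> 'done'
    )
""")

def producer_writes(opportunity: str) -> None:
    """Producer side: emit an artifact as a row, nothing more."""
    conn.execute("INSERT INTO blueprints (opportunity) VALUES (?)", (opportunity,))
    conn.commit()

def consumer_polls() -> list[tuple[int, str]]:
    """Consumer side: pick up pending rows on its own schedule."""
    rows = conn.execute(
        "SELECT id, opportunity FROM blueprints WHERE status = 'pending'"
    ).fetchall()
    for row_id, _ in rows:
        conn.execute("UPDATE blueprints SET status = 'done' WHERE id = ?", (row_id,))
    conn.commit()
    return rows

producer_writes("ai-todo-app")
pending = consumer_polls()
```

Because the row outlives both processes, either side can crash and the next cycle sees exactly the same state.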
Multi-GPU routing. The 5090 runs heavy 30B+ models; the 4080 (on a second machine at 192.168.1.82) handles 8B-14B throughput work. market-research.py has a GPU_DEVICES dict that routes models appropriately.
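A minimal sketch of what a GPU_DEVICES-style routing dict could look like. The model names are assumptions; only the 192.168.1.82 host comes from the text, and 11434 is Ollama's default port:

```python
# Route heavy models to the local 5090, throughput models to the remote 4080.
GPU_DEVICES = {
    "qwen2.5:32b":  {"host": "localhost",    "gpu": "rtx5090"},  # 30B+ class
    "llama3.1:8b":  {"host": "192.168.1.82", "gpu": "rtx4080"},  # 8B-14B class
    "qwen2.5:14b":  {"host": "192.168.1.82", "gpu": "rtx4080"},
}

def ollama_url(model: str) -> str:
    """Resolve which Ollama endpoint should serve a given model."""
    device = GPU_DEVICES.get(model, {"host": "localhost"})
    return f"http://{device['host']}:11434/api/generate"
```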
Claude for hard problems, Ollama for volume. Quick classification, embeddings, and summaries go to Ollama; code generation and complex reasoning go to Claude via API. Cost tracks task difficulty rather than call volume.
Lessons for your own agent system
- Database as message bus is underrated. Your agents get durability, debuggability, and easy admin inspection for free.
- Separate the scheduler from the agents. Cron/openclaw fires agents; agents don't sleep-poll. Easier to scale.
- Per-task LLM routing saves money. Don't use Claude for everything.
- Ship a dashboard early. You'll need it to debug. The NSC Dashboard reads state from every agent's DB and shows it in one pane.
- Make everything idempotent. Agent crashed mid-run? Next cycle re-reads the DB and picks up. No manual recovery.
Code reading path
Start in the NSC repo at:
- openclaw-agents/workspace/tools/market-research/AGENT.md — the pipeline walkthrough
- dashboard/backend/main.py — the FastAPI routes
- openclaw-agents/workspace/tools/market-research/market-research.py — the orchestrator
- openclaw-agents/config/cron-jobs.json — all scheduled agents
The whole system is ~15k LOC. Readable in a weekend.
Related
- Controlling retro PCs with AI agents →
- Building multi-agent orchestrators →
- How to self-host Claude API access →
Architecture diagram (logical)
┌──────────────────────────────────────────────────────────────────┐
│ NSC Dashboard │
│ │
│ ┌────────────┐ ┌─────────────┐ ┌────────────────────────┐ │
│ │ Frontend │ │ FastAPI │ │ openclaw cron │ │
│ │ (React + │──▶│ backend │◀──│ scheduler │ │
│ │ Tailwind)│ │ /api/* │ │ (runs Python jobs) │ │
│ └────────────┘ └─────────────┘ └────────────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────┐ ┌───────────┐ ┌─────────────────────┐ │
│ │ Postgres │ │ SQLite │ │ Per-agent Python │ │
│ │ (cust. │ │ research │ │ scripts (~8) │ │
│ │ apps) │ │ .db │ │ │ │
│ └───────────┘ └───────────┘ └─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────┐ │
│ │ LLM providers (3) │ │
│ │ - Ollama (local 5090) │ │
│ │ - Claude API │ │
│ │ - GitHub Copilot │ │
│ └──────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
Agent roster (current, April 2026)
| Agent script | Schedule | Purpose |
|---|---|---|
| market-research.py | every 30 min | scan Reddit/HN/ProductHunt for opportunities |
| blueprint-builder.py | on-demand + nightly | opportunity → app blueprint |
| roadmap-generator.py | on-demand + nightly | blueprint → implementation roadmap |
| fix-submission-agent.py | every 2 min | dequeue user-submitted fixes, apply, re-test |
| game-library-scanner.py | daily 6am | scan SMB share for new game installers |
| external-game-cataloger.py | weekly Sun 5am | crawl Mobygames / ScreenScraper |
| daily-briefing.py | daily 7:30am | calendar + news → morning email |
| retro-multiplayer-refresh.py | daily 8:30am | scrape public Q3/UT/Q2 server lists, push favourites |
| real-estate-agent.py | daily 7am | housing search aggregator |
All nine agents communicate via DB — never direct RPC. An agent that produces an artifact writes a row; another agent that consumes that artifact polls the same table on its own schedule.
How we tested and compared
The numbers and patterns in this article are from the running NSC Dashboard production deployment — ~50k LLM calls / month across all nine agents, ~3 years of operational history. The architecture has evolved through three major rewrites (monolith → separate scripts + shared DB → current "agents-over-DB" pattern), and the current shape is what survived.
Cross-references for the patterns: Claude Code best practices (Anthropic's guide to agentic workflows), Aider repo (reference for a simpler single-agent architecture), and LiteLLM docs (which the dashboard uses as its LLM proxy layer).
The rule-set that makes it work
1. One agent, one job. Every script does one thing. market-research.py doesn't build blueprints; blueprint-builder.py doesn't submit fixes. Clear boundaries mean you can run each in isolation, test each independently, and replace any one without affecting others.
2. DB is the bus. Agents never call each other directly. This makes failure modes cleaner — if blueprint-builder.py crashes mid-run, the producer (market-research.py) doesn't care; the crashed agent just re-runs on its next schedule.
3. Cron for scheduling, not queues. openclaw-cron runs each agent on a schedule. Queues add complexity (another system to monitor) for no benefit at our scale. If we ever hit 10M calls / month we'd re-evaluate; at 50k we don't.
4. One config file (models.json) picks the model per agent. No hard-coded model in any script. Swap Ollama for Claude for Copilot with a one-line config change. See self-hosted Claude proxy and LiteLLM guide for the LLM layer.
5. Secrets encrypted at rest via Fernet. claude_api_key, github_token, azure_credentials in research.db are all Fernet-encrypted. Key file is colocated with the DB, bind-mounted into Docker, 0600 perms.
6. The dashboard UI is read-only. All LLM work runs in cron jobs; the dashboard only displays state. This lets us iterate on the UI without impacting the pipeline.
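Rule 4 reduces to a tiny lookup layer. The models.json structure sketched below is an assumption, not the real file's format:

```python
import json
import pathlib
import tempfile

# Write a hypothetical models.json; in the real system this file already exists.
cfg_dir = pathlib.Path(tempfile.mkdtemp())
(cfg_dir / "models.json").write_text(json.dumps({
    "default": "ollama/llama3.1:8b",
    "agents": {
        "blueprint-builder": "anthropic/claude-sonnet",
        "market-research": "ollama/qwen2.5:14b",
    },
}))

def model_for(agent: str) -> str:
    """One config file picks the model; no model name is hard-coded in any script."""
    cfg = json.loads((cfg_dir / "models.json").read_text())
    return cfg["agents"].get(agent, cfg["default"])
```

Swapping an agent from Ollama to Claude is then a one-line edit to the JSON, with no code change.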
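Rule 5 uses the cryptography library's Fernet recipe. A minimal round-trip sketch (key handling simplified here; in the deployment described above the key lives in a 0600 file bind-mounted into Docker):

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in production, loaded from the colocated key file
f = Fernet(key)

# Only ciphertext ever touches research.db; decrypt on demand inside an agent.
token = f.encrypt(b"sk-ant-example-not-real")
plaintext = f.decrypt(token)
```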
Trade-offs we've accepted
- Latency for cheaper orchestration. Opportunities scanned at the top of the hour wait 30 min before being picked up. For this workflow, fine.
- SQLite, not Postgres, for pipeline state. Faster at our size, trivial to operate, and fine under a one-writer, many-readers pattern. Would not scale past one machine.
- Docker Compose, not Kubernetes. One dashboard container, one Ollama container, one Caddy reverse proxy. Production-grade for our needs; would not suit a 10-person ops team.
- Python monorepo. One git repo holds all agent code. Simpler for a solo or tiny team; would rot with more contributors.
The UI layer
Frontend pulls dashboard state from FastAPI REST endpoints; websockets push live events (build progress, fix status). Every entity is cross-linked — opportunity → blueprint → build → roadmap → marketing → ready-to-deploy app — so an editor can click through the full pipeline for any customer app. See CLAUDE.md in NSC Dashboard for the "every linkable entity must be linked" rule.
Alternative architectures we rejected
- LangGraph. Overhead-heavy for our needs; the state-graph model fits tightly-scripted workflows better than our loose "agent-on-a-cron" pattern.
- n8n. Great for SaaS-glue; less good for Python-heavy LLM work. We'd end up writing custom nodes for everything.
- Celery + RabbitMQ. Classic Python worker stack. Too much ops surface for our scale; openclaw cron is simpler and does the job.
- Single monolith FastAPI app. Tried this first. Crashes in one agent took down the whole app. Splitting into scripts + DB fixed that.
Frequently asked questions
Can I copy this architecture for my own project?
Yes — it's a well-understood pattern (shared-DB workers). The NSC Dashboard source is on GitHub as a reference. Copy the cron-jobs pattern; you'll likely want Postgres instead of SQLite if you have concurrent writers.
How do you handle LLM cost?
LiteLLM proxy in front of everything (setup guide) with per-agent virtual keys and monthly spend caps. Any agent that hits its cap pauses until the admin bumps it or next month rolls over.
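The pause-on-cap behaviour reduces to a simple budget gate. A hedged sketch — the numbers and the in-memory store are hypothetical, and real enforcement lives in the LiteLLM proxy:

```python
# Hypothetical per-agent spend tracking; the proxy would back this with real data.
monthly_spend = {"blueprint-builder": 41.20}   # USD spent so far this month
monthly_cap = {"blueprint-builder": 40.00}     # per-agent cap

def agent_may_call(agent: str) -> bool:
    """An agent that hits its cap pauses until the cap is raised or the month rolls over."""
    return monthly_spend.get(agent, 0.0) < monthly_cap.get(agent, float("inf"))
```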
What if an agent needs state that doesn't fit in the DB?
We use /tmp/<agent-name>/ for work-in-progress file artifacts (generated app repos during build, scraped HTML during crawl). Ephemeral; don't commit to the DB what can be re-derived. This is the same pattern Claude Code uses for per-session scratch space.
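The scratch-space convention might look like the helper below; the directory layout is the only thing taken from the text, and the function name is made up:

```python
import pathlib
import tempfile

def scratch_dir(agent_name: str) -> pathlib.Path:
    """Per-agent ephemeral workspace under the system temp dir."""
    d = pathlib.Path(tempfile.gettempdir()) / agent_name
    d.mkdir(parents=True, exist_ok=True)
    return d

# Work-in-progress artifacts go here, never into the DB; everything is re-derivable.
wip = scratch_dir("blueprint-builder") / "generated-repo"
wip.mkdir(exist_ok=True)
```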
How do you debug a multi-agent run?
Each agent writes to a per-agent log file; all logs tail into a central log aggregator (we use Tempo via Grafana). For a specific opportunity-to-deployed-app trace, follow the shared trace_id that every agent stamps on rows it writes.
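The shared trace_id convention can be sketched like so (schema illustrative): the first agent in a pipeline mints an id, and every downstream agent stamps it on the rows it writes.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opportunities (id INTEGER PRIMARY KEY, trace_id TEXT)")
conn.execute("CREATE TABLE blueprints (id INTEGER PRIMARY KEY, trace_id TEXT)")

trace_id = uuid.uuid4().hex   # minted once, at pipeline entry
conn.execute("INSERT INTO opportunities (trace_id) VALUES (?)", (trace_id,))
conn.execute("INSERT INTO blueprints (trace_id) VALUES (?)", (trace_id,))
conn.commit()

# Debugging one run is then a matter of querying every table (and grepping
# every log) for a single trace_id.
linked = conn.execute(
    "SELECT COUNT(*) FROM blueprints WHERE trace_id = ?", (trace_id,)
).fetchone()[0]
```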
Could this run on Mac Mini / smaller hardware?
Yes. The current deploy runs on a single box with an RTX 5090 for local Ollama. Drop Ollama and lean entirely on cloud APIs, and this runs comfortably on an M2 Mac Mini. You'd pay more in cloud inference costs, and you'd lose the privacy/latency benefits of local models.
Sources
- Anthropic — Claude Code best practices — reference for agentic patterns in engineering workflows.
- LiteLLM documentation — LLM-proxy layer used by the dashboard.
- Aider GitHub repository — reference for a simpler single-agent architecture.
- r/LocalLLaMA — community patterns for local-GPU + cloud-fallback hybrids.
Related guides
- Building multi-agent AI orchestrators
- Self-hosted Claude proxy
- Self-hosting an OpenAI-compatible LLM gateway
- Controlling retro PCs with AI agents
— SpecPicks Editorial · Last verified 2026-04-21
