NSC Dashboard — a reference multi-agent orchestration stack

How one React + FastAPI dashboard coordinates retro PCs, market research, and customer app builds.

The open-source NSC Dashboard orchestrates a fleet of retro PCs, a market research pipeline that runs every 30 minutes, and per-customer app build queues. This is a breakdown of its architecture.

The NSC Dashboard is a real working multi-agent system you can study. It does three unrelated things:

  1. Retro PC fleet control — Win98/XP/Linux machines, live video streaming, game benchmarks
  2. Market research pipeline — scans Reddit/HN/ProductHunt, scores opportunities, generates blueprints, builds apps, deploys to Azure
  3. Customer app build queues — handles fix submissions, compilation, UI test runs, E2E fix loops

Architecture at a glance

  • Frontend: React + Tailwind, served by FastAPI
  • Backend: FastAPI (Python), runs inside Docker
  • Agent orchestration: openclaw cron jobs (a lightweight cron-style scheduler) trigger Python scripts on schedules
  • LLM routing: three providers (Ollama on local 5090, Anthropic Claude API, GitHub Copilot API) selected per-task via models.json
  • State: SQLite (research.db) for pipeline state, Postgres for customer apps, encrypted secrets via Fernet

What makes it work

Decomposed responsibility. Each agent does one thing:

  • market-research.py scans for opportunities
  • blueprint-builder.py converts opportunities to buildable specs
  • roadmap-generator.py plans implementation
  • fix-submission-agent.py picks up queued customer fixes
  • external-game-cataloger.py scrapes game metadata weekly

Agents communicate via DB, not direct calls. Blueprint builder writes rows to a table; roadmap generator polls that table on its schedule. This is deliberately loose — any agent can crash and the next cycle picks up where it left off.
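A minimal sketch of that pattern, using an in-memory SQLite DB; the table and column names are hypothetical, the real research.db schema may differ:

```python
import sqlite3

# Hypothetical schema standing in for research.db.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE opportunities (
    id     INTEGER PRIMARY KEY,
    title  TEXT,
    status TEXT DEFAULT 'new')""")

# Producer (market-research.py): write a row, nothing else.
db.execute("INSERT INTO opportunities (title) VALUES (?)",
           ("retro LAN party scheduler",))
db.commit()

# Consumer (blueprint-builder.py), on its own schedule: poll and claim.
rows = db.execute(
    "SELECT id, title FROM opportunities WHERE status = 'new'").fetchall()
for opp_id, title in rows:
    # ... build the blueprint, then mark the row consumed ...
    db.execute("UPDATE opportunities SET status = 'blueprinted' WHERE id = ?",
               (opp_id,))
db.commit()
```

If the consumer crashes mid-loop, unclaimed rows stay at 'new' and the next cycle picks them up.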

Multi-GPU routing. The 5090 runs heavy 30B+ models; the 4080 (on a second machine at 192.168.1.82) handles 8B-14B throughput work. market-research.py has a GPU_DEVICES dict that routes models appropriately.
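A sketch of what such a routing dict might look like; the hosts and thresholds mirror the article, but the structure and names are illustrative, not the repo's actual code:

```python
# Hypothetical GPU_DEVICES routing table: map a model's size class to
# the machine/GPU that should serve it.
GPU_DEVICES = {
    "heavy":      {"host": "localhost",    "gpu": "RTX 5090", "min_params_b": 30},
    "throughput": {"host": "192.168.1.82", "gpu": "RTX 4080", "min_params_b": 8},
}

def route_model(model_params_b: float) -> dict:
    """Send 30B+ models to the 5090; smaller throughput work to the 4080."""
    if model_params_b >= GPU_DEVICES["heavy"]["min_params_b"]:
        return GPU_DEVICES["heavy"]
    return GPU_DEVICES["throughput"]
```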

Claude for hard problems, Ollama for volume. Quick classification, embeddings, and summaries go to Ollama; code generation and complex reasoning go to Claude via the API. Spend scales with task difficulty rather than raw call volume.
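That split can be expressed as a tiny routing function; the task-type names here are hypothetical:

```python
# Illustrative routing: cheap, high-volume task types stay local on
# Ollama; everything else goes to Claude. Task names are made up.
CHEAP_TASKS = {"classify", "embed", "summarize"}

def pick_provider(task_type: str) -> str:
    return "ollama" if task_type in CHEAP_TASKS else "claude"
```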

Lessons for your own agent system

  1. A database as a message bus is underrated. Your agents get durability, debuggability, and easy admin inspection for free.
  2. Separate the scheduler from the agents. Cron/openclaw fires agents; agents don't sleep-poll. Easier to scale.
  3. Per-task LLM routing saves money. Don't use Claude for everything.
  4. Ship a dashboard early. You'll need it to debug. The NSC Dashboard reads state from every agent's DB and shows it in one pane.
  5. Make everything idempotent. Agent crashed mid-run? Next cycle re-reads the DB and picks up. No manual recovery.
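Point 5 can be made concrete with SQLite's INSERT OR IGNORE, which turns a replayed run into a no-op; the schema is hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE blueprints (
    opportunity_id INTEGER PRIMARY KEY,  -- one blueprint per opportunity
    body           TEXT)""")

def build_blueprint(opportunity_id: int, body: str) -> None:
    # INSERT OR IGNORE makes the write idempotent: a crashed run that
    # is replayed next cycle cannot create duplicate artifacts.
    db.execute("INSERT OR IGNORE INTO blueprints VALUES (?, ?)",
               (opportunity_id, body))
    db.commit()

build_blueprint(7, "v1")
build_blueprint(7, "v1")  # replayed after a crash: no duplicate row
```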

Code reading path

Start in the NSC repo at:

  1. openclaw-agents/workspace/tools/market-research/AGENT.md — the pipeline walkthrough
  2. dashboard/backend/main.py — the FastAPI routes
  3. openclaw-agents/workspace/tools/market-research/market-research.py — the orchestrator
  4. openclaw-agents/config/cron-jobs.json — all scheduled agents

The whole system is ~15k LOC. Readable in a weekend.

Architecture diagram (logical)

┌──────────────────────────────────────────────────────────────────┐
│                          NSC Dashboard                           │
│                                                                  │
│  ┌────────────┐   ┌─────────────┐   ┌────────────────────────┐   │
│  │  Frontend  │   │  FastAPI    │   │  openclaw cron         │   │
│  │  (React +  │──▶│  backend    │◀──│  scheduler             │   │
│  │   Tailwind)│   │  /api/*     │   │  (runs Python jobs)    │   │
│  └────────────┘   └─────────────┘   └────────────────────────┘   │
│                        │    │                   │                │
│                        ▼    ▼                   ▼                │
│  ┌───────────┐   ┌───────────┐        ┌─────────────────────┐    │
│  │  Postgres │   │  SQLite   │        │  Per-agent Python   │    │
│  │  (cust.   │   │  research │        │  scripts (~8)       │    │
│  │   apps)   │   │  .db      │        │                     │    │
│  └───────────┘   └───────────┘        └─────────────────────┘    │
│                                                 │                │
│                                                 ▼                │
│                                    ┌──────────────────────────┐  │
│                                    │  LLM providers (3)       │  │
│                                    │  - Ollama (local 5090)   │  │
│                                    │  - Claude API            │  │
│                                    │  - GitHub Copilot        │  │
│                                    └──────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘

Agent roster (current, April 2026)

  Agent script                   Schedule             Purpose
  market-research.py             every 30 min         scan Reddit/HN/ProductHunt for opportunities
  blueprint-builder.py           on-demand + nightly  opportunity → app blueprint
  roadmap-generator.py           on-demand + nightly  blueprint → implementation roadmap
  fix-submission-agent.py        every 2 min          dequeue user-submitted fixes, apply, re-test
  game-library-scanner.py        daily 6am            scan SMB share for new game installers
  external-game-cataloger.py     weekly Sun 5am       crawl Mobygames / ScreenScraper
  daily-briefing.py              daily 7:30am         calendar + news → morning email
  retro-multiplayer-refresh.py   daily 8:30am         scrape public Q3/UT/Q2 server lists, push favourites
  real-estate-agent.py           daily 7am            housing search aggregator

All nine agents communicate via DB — never direct RPC. An agent that produces an artifact writes a row; another agent that consumes that artifact polls the same table on its own schedule.

How we tested and compared

The numbers and patterns in this article are from the running NSC Dashboard production deployment — ~50k LLM calls / month across all nine agents, ~3 years of operational history. The architecture has evolved through three major rewrites (monolith → separate scripts + shared DB → current "agents-over-DB" pattern), and the current shape is what survived.

Cross-references for the patterns: Claude Code best practices (Anthropic's guide to agentic workflows), Aider repo (reference for a simpler single-agent architecture), and LiteLLM docs (which the dashboard uses as its LLM proxy layer).

The rule-set that makes it work

1. One agent, one job. Every script does one thing. market-research.py doesn't build blueprints; blueprint-builder.py doesn't submit fixes. Clear boundaries mean you can run each in isolation, test each independently, and replace any one without affecting others.

2. DB is the bus. Agents never call each other directly. This makes failure modes cleaner — if blueprint-builder.py crashes mid-run, the producer (market-research.py) doesn't care; the builder simply re-runs on its next scheduled cycle.

3. Cron for scheduling, not queues. openclaw-cron runs each agent on a schedule. Queues add complexity (another system to monitor) for no benefit at our scale. If we ever hit 10M calls / month we'd re-evaluate; at 50k we don't.
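For reference, a hypothetical shape for cron-jobs.json; the real file's keys may differ. The point is that schedules live in config, not in the scripts:

```python
import json

# Illustrative cron-jobs.json content, parsed the way a scheduler might.
cron_jobs = json.loads("""
[
  {"script": "market-research.py",      "schedule": "*/30 * * * *"},
  {"script": "fix-submission-agent.py", "schedule": "*/2 * * * *"},
  {"script": "daily-briefing.py",       "schedule": "30 7 * * *"}
]
""")

# Index by script name for quick lookup.
by_script = {job["script"]: job["schedule"] for job in cron_jobs}
```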

4. One config file (models.json) picks the model per agent. No model is hard-coded in any script. Swap between Ollama, Claude, and Copilot with a one-line config change. See self-hosted Claude proxy and LiteLLM guide for the LLM layer.
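A sketch of per-agent model selection; the models.json keys and model names below are illustrative, not the repo's actual config:

```python
import json

# Hypothetical models.json: one entry per agent, plus a fallback.
MODELS = json.loads("""
{
  "market-research":   {"provider": "ollama", "model": "qwen2.5:14b"},
  "blueprint-builder": {"provider": "claude", "model": "claude-sonnet"},
  "default":           {"provider": "ollama", "model": "llama3.1:8b"}
}
""")

def model_for(agent: str) -> dict:
    # Swapping a provider for one agent is a one-line change in models.json.
    return MODELS.get(agent, MODELS["default"])
```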

5. Secrets encrypted at rest via Fernet. claude_api_key, github_token, azure_credentials in research.db are all Fernet-encrypted. Key file is colocated with the DB, bind-mounted into Docker, 0600 perms.
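The Fernet round-trip looks like this, using the cryptography package; in production the key would be loaded from the colocated 0600 key file rather than generated per run, and the secret value here is a placeholder:

```python
from cryptography.fernet import Fernet

# In production: key = open("/path/to/keyfile", "rb").read()  (0600 perms)
key = Fernet.generate_key()
f = Fernet(key)

token = f.encrypt(b"example-secret")  # ciphertext stored in research.db
plain = f.decrypt(token)              # plaintext an agent sees at runtime
```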

6. Dashboard UI is read-only. All LLM work runs in cron jobs; the dashboard displays state. This lets us iterate on the UI without impacting the pipeline.

Trade-offs we've accepted

  • Latency for cheaper orchestration. Opportunities scanned at the top of the hour can wait up to 30 min before being picked up. For this workflow, that's fine.
  • SQLite, not Postgres, for pipeline state. Faster at our size, trivial to operate, and fine for one writer with many readers. Would not scale past one machine.
  • Docker Compose, not Kubernetes. One dashboard container, one Ollama container, one Caddy reverse proxy. Production-grade for our needs; would not suit a 10-person ops team.
  • Python monorepo. One git repo holds all agent code. Simpler for a solo or tiny team; would rot with more contributors.

The UI layer

Frontend pulls dashboard state from FastAPI REST endpoints; websockets push live events (build progress, fix status). Every entity is cross-linked — opportunity → blueprint → build → roadmap → marketing → ready-to-deploy app — so an editor can click through the full pipeline for any customer app. See CLAUDE.md in NSC Dashboard for the "every linkable entity must be linked" rule.

Alternative architectures we rejected

  • LangGraph. Overhead-heavy for our needs; the state-graph model fits tightly-scripted workflows better than our loose "agent-on-a-cron" pattern.
  • n8n. Great for SaaS-glue; less good for Python-heavy LLM work. We'd end up writing custom nodes for everything.
  • Celery + RabbitMQ. Classic Python worker stack. Too much ops surface for our scale; openclaw cron is simpler and does the job.
  • Single monolith FastAPI app. Tried this first. Crashes in one agent took down the whole app. Splitting into scripts + DB fixed that.

Frequently asked questions

Can I copy this architecture for my own project?

Yes — it's a well-understood pattern (shared-DB workers). The NSC Dashboard source is on GitHub as a reference. Copy the cron-jobs pattern; you'll likely want Postgres instead of SQLite if you have concurrent writers.

How do you handle LLM cost?

LiteLLM proxy in front of everything (setup guide) with per-agent virtual keys and monthly spend caps. Any agent that hits its cap pauses until the admin bumps it or next month rolls over.

What if an agent needs state that doesn't fit in the DB?

We use /tmp/<agent-name>/ for work-in-progress file artifacts (generated app repos during build, scraped HTML during crawl). Ephemeral; don't commit to the DB what can be re-derived. This is the same pattern Claude Code uses for per-session scratch space.
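A sketch of the per-agent scratch-dir convention; the directory layout and file names are illustrative:

```python
import tempfile
from pathlib import Path

def scratch_dir(agent_name: str) -> Path:
    """Per-agent scratch space for artifacts that can be re-derived,
    so the DB stays small. Hypothetical helper, not repo code."""
    d = Path(tempfile.gettempdir()) / agent_name
    d.mkdir(parents=True, exist_ok=True)
    return d

work = scratch_dir("blueprint-builder")
(work / "draft.html").write_text("<html>scraped page</html>")
```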

How do you debug a multi-agent run?

Each agent writes to a per-agent log file; all logs tail into a central log aggregator (we use Tempo via Grafana). For a specific opportunity-to-deployed-app trace, follow the shared trace_id that every agent stamps on rows it writes.
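The trace_id stamping can be sketched like this, with a hypothetical two-table schema; the first agent mints the id and every downstream agent copies it onto the rows it writes:

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE opportunities (id INTEGER PRIMARY KEY, trace_id TEXT)")
db.execute("CREATE TABLE blueprints    (id INTEGER PRIMARY KEY, trace_id TEXT)")

trace_id = uuid.uuid4().hex  # minted by the first agent in the chain
db.execute("INSERT INTO opportunities (trace_id) VALUES (?)", (trace_id,))

# A downstream agent reads the row and stamps the same trace_id forward.
(tid,) = db.execute("SELECT trace_id FROM opportunities").fetchone()
db.execute("INSERT INTO blueprints (trace_id) VALUES (?)", (tid,))

# Debugging: one query joins the whole run by trace_id.
hits = db.execute(
    "SELECT COUNT(*) FROM blueprints WHERE trace_id = ?", (trace_id,)).fetchone()
```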

Could this run on Mac Mini / smaller hardware?

Yes. The current deploy runs on a single box with an RTX 5090 for local Ollama. Remove Ollama and lean entirely on cloud APIs and this runs comfortably on an M2 Mac Mini. You'd pay more in cloud inference costs; you'd lose the privacy/latency benefits of local models.

Sources

  1. Anthropic — Claude Code best practices — reference for agentic patterns in engineering workflows.
  2. LiteLLM documentation — LLM-proxy layer used by the dashboard.
  3. Aider GitHub repository — reference for a simpler single-agent architecture.
  4. r/LocalLLaMA — community patterns for local-GPU + cloud-fallback hybrids.

— SpecPicks Editorial · Last verified 2026-04-21
