Docs — StackChan tech reference¶
Curated reference for the StackChan voice robot stack. The top-level README.md
covers how to deploy it; these docs cover what it is underneath — hardware,
protocols, upstream model facts, and capabilities we aren't yet using.
Every file here cites upstream sources so a future agent (or human) can re-verify claims against the canonical specs rather than trusting our paraphrase.
Start here if you want…¶
| If you're trying to… | Read |
|---|---|
| Understand the overall shape | architecture.md |
| Know what the physical robot can do | hardware.md |
| Understand the voice pipeline (ASR/TTS/VAD) | voice-pipeline.md |
| Understand the default voice LLM (Tier1Slim + escalation) | tier1slim.md |
| Understand the brain (model matrix + ZeroClaw) | brain.md |
| Run different models on voice vs. Discord | multi-daemon-split.md |
| Know what's on the wire between components | protocols.md |
| See every cross-layer signal at a glance | interaction-map.md |
| Know what mode the robot is in (and what the LEDs mean) | modes.md |
| Find features we aren't using yet | latent-capabilities.md |
| Pick an LLM backend | llm-backends.md |
| Jump to an upstream repo or spec | references.md |
File map¶
docs/
├── README.md ← you are here (index)
├── architecture.md ← high-level data flow, actor responsibilities
├── hardware.md ← M5Stack StackChan body + firmware lineage + MCP tool catalog
├── voice-pipeline.md ← xiaozhi-esp32-server, FunASR/Whisper, VAD, Piper/EdgeTTS
├── tier1slim.md ← two-tier voice LLM provider + escalation contract
├── brain.md ← model matrix (Tier1Slim + ZeroClaw), bridge, OpenRouter
├── multi-daemon-split.md ← split voice + Discord across two ZeroClaw daemons
├── protocols.md ← Xiaozhi WebSocket, MCP-over-WS, ACP JSON-RPC, emotion
├── interaction-map.md ← every cross-layer signal: source, dest, protocol, notes
├── modes.md ← behavioural mode taxonomy + LED contract + transitions
├── latent-capabilities.md ← upstream features we could wire up (cross-refs ROADMAP.md)
├── llm-backends.md ← side-by-side comparison of LLM backend options
└── references.md ← canonical URLs, licenses, model cards, spec docs
Conventions these docs follow¶
- TL;DR at the top of each file — 3-6 bullets, scannable in the first 40 lines.
- Tables over prose for dense facts — specs, tunables, method signatures.
- Grep-bait headers — e.g.
## MCP tool handshake,## session/prompt— so you can navigate by header search. - Relative links only —
[voice-pipeline.md](./voice-pipeline.md); never absolute paths. - Freshness footer — every non-index file ends with
Last verified: YYYY-MM-DD. - Placeholders for per-deployment values —
<XIAOZHI_HOST>,<ZEROCLAW_HOST>, etc. (mapping lives with the deployer, not in this repo). - Soft claims where unverified — if a fact came from a secondary source or we couldn't verify, the text says so rather than pretending to cite upstream.
Relationship to the rest of the repo¶
../README.md— deployment & ops (commands, layout, troubleshooting).../CLAUDE.md— agent orientation for this repo specifically.../bridge.py,../zeroclaw.py,../edge_stream.py,../fun_local.py,../piper_local.py— canonical source for the custom provider patches.- These
docs/— the why and the what else is possible behind the above.
When docs here are stale¶
Each sub-file has a Last verified: date. Freshness decays roughly as follows:
| Topic | Half-life | Why |
|---|---|---|
| Hardware spec | Years | M5Stack CoreS3 revisions are slow |
| Protocol spec | Months | xiaozhi is actively evolving |
| Model facts (Qwen3) | Weeks-months | OpenRouter pricing and model revisions churn |
| Latent capabilities | Months | Upstream adds features regularly |
If you're reading this a year from now, treat the protocol + model claims as starting points for re-verification, not ground truth.
Last verified: 2026-05-17.