Voice Catalog
A short, curated list of TTS voices that play well with Dotty's persona — warm, cheerful, easy on the ear at low volume on a tiny speaker. The full upstream catalogues are huge; this page is the opinionated subset we've actually listened to and like.
For instructions on switching, see Swap Voice. For an automated download of any Piper voice listed below, see the install helper section at the bottom.
Quick guide
- Piper runs locally, no cloud, no jitter. Prefer it for reliability.
- EdgeTTS has more variety and naturalness but needs internet.
- "Best for" is opinion only; try a couple, since your room and speaker matter.
- Sample rate: 22050 Hz is the Piper default; the firmware resamples transparently. File sizes are approximate.
Piper voices
All voices live on the public mirror at huggingface.co/rhasspy/piper-voices.
Each voice ships as a .onnx model plus a .onnx.json config; both are needed.
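As a rough sketch of how a voice key maps to its two files, the helper below builds download URLs from a key. The directory layout (`<family>/<locale>/<name>/<quality>/`) and the `main` branch are assumptions about the mirror, and `piper_voice_urls` is a hypothetical name, not part of the project's install script.

```python
# Sketch: build download URLs for a Piper voice key such as "en_GB-cori-medium".
# ASSUMPTION: the mirror stores files as <family>/<locale>/<name>/<quality>/<key>.*
# on the "main" branch; check the repository listing before relying on this.
BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/main"


def piper_voice_urls(key: str) -> tuple[str, str]:
    """Return (model_url, config_url) for a key like 'en_US-amy-medium'."""
    parts = key.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected <locale>-<name>-<quality>, got {key!r}")
    locale, name, quality = parts
    family = locale.split("_")[0]  # "en_US" -> "en"
    stem = f"{BASE_URL}/{family}/{locale}/{name}/{quality}/{key}"
    return f"{stem}.onnx", f"{stem}.onnx.json"


model_url, config_url = piper_voice_urls("en_GB-cori-medium")
print(model_url)
print(config_url)
```

Voice names that contain underscores (for example `en_GB-southern_english_female-low`) still split cleanly, because only hyphens separate the three parts of a key.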
| Key | Lang | Quality | Character | Best for | Size |
|---|---|---|---|---|---|
| en_US-amy-medium | en_US | medium | Warm, friendly | Kid + Adult | ~63 MB |
| en_US-amy-low | en_US | low | Warm, friendly | Kid + Adult | ~28 MB |
| en_US-kristin-medium | en_US | medium | Cheerful, bright | Kid Mode | ~63 MB |
| en_US-hfc_female-medium | en_US | medium | Neutral, clear | Adult | ~63 MB |
| en_US-lessac-medium | en_US | medium | Neutral, articulate | Adult | ~63 MB |
| en_US-lessac-low | en_US | low | Neutral, articulate | Adult | ~28 MB |
| en_US-libritts_r-medium | en_US | medium | Multi-speaker | Both | ~75 MB |
| en_GB-cori-medium | en_GB | medium | Soft, warm UK | Kid + Adult | ~63 MB |
| en_GB-jenny_dylan-medium | en_GB | medium | Playful, lively UK | Kid Mode | ~63 MB |
| en_GB-southern_english_female-low | en_GB | low | Cheerful UK | Kid + Adult | ~28 MB |
| en_GB-alba-medium | en_GB | medium | Scottish, cosy | Both | ~63 MB |
| en_GB-semaine-medium | en_GB | medium | Neutral UK | Adult | ~63 MB |
The default voice that ships with make fetch-models is
en_GB-cori-medium — a safe, friendly starting point.
Notes on quality tiers
- low (16 kHz, ~28 MB) is fine for casual chat on a small speaker. It synthesizes at well over realtime even on a Pi 4.
- medium (22050 Hz, ~63 MB) is the sweet spot for desk listening.
- high exists for some voices (~110 MB), but the difference is hard to hear through the StackChan's tiny driver, so skip it.
EdgeTTS voices
EdgeTTS calls Microsoft's cloud, which means latency jitter and
occasional throttling, but you get a much wider voice pool. Use the slug
in the voice: field under TTS.EdgeTTS (or TTS.StreamingEdgeTTS).
| Slug | Lang | Character | Best for |
|---|---|---|---|
| en-AU-NatashaNeural | en-AU | Warm, friendly AU | Kid + Adult |
| en-AU-WilliamNeural | en-AU | Calm, neutral AU | Adult |
| en-GB-SoniaNeural | en-GB | Warm, professional UK | Both |
| en-GB-MaisieNeural | en-GB | Young, cheerful UK | Kid Mode |
| en-US-AriaNeural | en-US | Bright, expressive US | Both |
| en-US-JennyNeural | en-US | Friendly assistant US | Both |
To list every available voice yourself:
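One possible approach, assuming the `edge-tts` Python package is installed (the source does not name a tool, so this is an assumption), is to call its `list_voices()` coroutine and filter by locale. The helper names `filter_voices`, `fetch_voices`, and `list_english_voices` are hypothetical. The same package also ships an `edge-tts --list-voices` command-line option.

```python
import asyncio


def filter_voices(voices: list[dict], locale_prefix: str) -> list[str]:
    """Return sorted ShortName slugs whose Locale starts with locale_prefix."""
    return sorted(
        v["ShortName"] for v in voices if v.get("Locale", "").startswith(locale_prefix)
    )


async def fetch_voices() -> list[dict]:
    """Fetch the live voice list (needs internet and `pip install edge-tts`)."""
    import edge_tts  # imported lazily so filter_voices works without the package

    return await edge_tts.list_voices()


def list_english_voices() -> list[str]:
    """Convenience wrapper: every slug whose locale starts with 'en-'."""
    return filter_voices(asyncio.run(fetch_voices()), "en-")
```

Calling `print("\n".join(list_english_voices()))` prints one slug per line, ready to paste into the `voice:` field. Nothing is fetched until that function is called, so the filter can be tried offline with hand-written entries.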
Install helper
To download any Piper voice from the table above into models/piper/:
    make voice-list                                         # show this catalog
    make voice-install VOICE=en_US-kristin-medium           # download only
    make voice-install VOICE=en_US-kristin-medium APPLY=1   # download + edit .config.yaml
The same script is at scripts/voice-install.sh if you'd rather call
it directly. Run ./scripts/voice-install.sh --help for flags.
After installing a Piper voice, run make doctor to verify the file is
in place, then restart the server: docker compose restart xiaozhi-server.
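For a quick manual check of the same condition, a minimal sketch follows. It only tests that both files exist and are non-empty under `models/piper/`; it is not what `make doctor` actually runs, and `missing_voice_files` is a hypothetical helper.

```python
from pathlib import Path


def missing_voice_files(voice: str, models_dir: Path = Path("models/piper")) -> list[Path]:
    """Return the expected files for `voice` that are absent or empty."""
    expected = [models_dir / f"{voice}.onnx", models_dir / f"{voice}.onnx.json"]
    return [p for p in expected if not p.is_file() or p.stat().st_size == 0]


missing = missing_voice_files("en_US-kristin-medium")
print("ok" if not missing else f"missing: {[str(p) for p in missing]}")
```

Run it from the repository root so the relative `models/piper` path resolves, or pass a different directory as the second argument.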
How to switch voices
See Swap Voice for the full walkthrough on
editing .config.yaml for either backend. The short version:
    selected_module:
      TTS: LocalPiper
    TTS:
      LocalPiper:
        voice: en_US-kristin-medium
        model_path: /opt/xiaozhi-esp32-server/models/piper/en_US-kristin-medium.onnx
        config_path: /opt/xiaozhi-esp32-server/models/piper/en_US-kristin-medium.onnx.json
Then docker compose restart xiaozhi-server.
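Because the three LocalPiper fields all derive from the voice key, the settings above can be generated rather than typed. The sketch below assumes the container path from the example and uses a hypothetical helper, `local_piper_config`; it prints the structure as JSON for comparison by eye and does not edit `.config.yaml`.

```python
import json

# Container path taken from the example config; adjust if your mount differs.
MODELS_ROOT = "/opt/xiaozhi-esp32-server/models/piper"


def local_piper_config(voice: str, root: str = MODELS_ROOT) -> dict:
    """Build the selected_module/TTS settings for one Piper voice key."""
    return {
        "selected_module": {"TTS": "LocalPiper"},
        "TTS": {
            "LocalPiper": {
                "voice": voice,
                "model_path": f"{root}/{voice}.onnx",
                "config_path": f"{root}/{voice}.onnx.json",
            }
        },
    }


print(json.dumps(local_piper_config("en_GB-alba-medium"), indent=2))
```

Generating the paths from one key avoids the common slip of changing `voice` while leaving `model_path` or `config_path` pointing at the previous model.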
Last verified: 2026-05-17.