Character counts include SSML tags. For text longer than the limit, split it into multiple requests.
Rate limits are tuned per product because the workloads differ: TTS audio is a cost-per-call workload, while voice-agent traffic is chatty, interactive UI traffic.
Applies to /v1/audio/speech and /v1/audio/stream.
Applies to /v1/agents/*, /v1/tools/*, /v1/conversations/*, /v1/tests/*, /v1/knowledge-bases/*, and /v1/memories/*.
Burst is the peak bucket capacity. A fresh bucket absorbs the burst in a single second, then refills at the sustained rate. This lets a console page load or batch operation fire many parallel requests without hitting 429, while still capping long-running abuse at the sustained rate.
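The burst-then-refill behavior described above is what a token bucket does. The sketch below is a client-side model of that idea, not the service's actual implementation; the class name, method names, and the optional `now` argument (included so the timing can be simulated) are all illustrative assumptions.

```python
import time


class TokenBucket:
    """Illustrative model of a burst capacity plus a sustained refill rate."""

    def __init__(self, capacity, refill_per_sec, now=None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # a fresh bucket starts full
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, now=None):
        """Return True if one request may be sent at time `now`."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a capacity of 60 and a refill rate of 20 per second, a fresh bucket admits 60 requests at once, rejects the next one, and admits roughly 20 more after each further second.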
Concurrency limits cap the number of simultaneous in-flight requests per account.
Applies to /v1/audio/speech and /v1/audio/stream.
Applies to the authenticated voice-agent endpoints listed above. The primary target is POST /v1/agents/{id}/conversations, which allocates a live-call session.
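One way to stay under a concurrency cap is to bound the number of in-flight requests on the client. The sketch below does this with a fixed-size thread pool; `run_bounded` and its parameters are hypothetical names, and each task stands in for whatever function sends a single request. Because limits are counted per account, a cap inside one process does not account for other processes or services using the same account.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, max_in_flight):
    """Run zero-argument callables with at most `max_in_flight` active at once.

    Results are returned in the same order as `tasks`.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() re-raises any exception raised inside a task.
        return [future.result() for future in futures]
```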
All limits apply per account, not per API key.
When you exceed rate or concurrency limits, the API returns 429 Too Many Requests with a Retry-After header.
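A client can honor the Retry-After header by waiting before it retries. The sketch below is a minimal version of that loop. It assumes Retry-After carries a number of seconds (the header may also be an HTTP date, which this sketch treats as unparseable and replaces with a default wait). `send` is a hypothetical callable that performs one request and returns `(status_code, headers, body)`, with `headers` as a plain dict.

```python
import time


def call_with_retry(send, max_attempts=5, default_wait=1.0, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # Assumption: Retry-After is a number of seconds.
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except (TypeError, ValueError):
            wait = default_wait
        sleep(max(0.0, wait))
    raise RuntimeError("still rate limited after %d attempts" % max_attempts)
```

Passing `sleep` as a parameter keeps the function easy to test; in normal use the default `time.sleep` applies.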
For texts exceeding 20,000 characters, split into chunks and process sequentially:
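A minimal sketch of the splitting step, assuming plain-text input: it prefers to break after sentence-ending punctuation, then at a space, and only hard-cuts when no boundary exists. `split_text` is an illustrative helper, not part of the API. It is not SSML-aware, so SSML input would need a splitter that keeps tags intact.

```python
def split_text(text, limit=20000):
    """Split `text` into chunks of at most `limit` characters."""
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        window = rest[:limit]
        # Prefer the last sentence boundary inside the window.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1  # keep the punctuation with the chunk
        else:
            cut = window.rfind(" ")  # fall back to the last space
        if cut <= 0:
            cut = limit  # no usable boundary: hard cut
        chunks.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        chunks.append(rest)
    return chunks
```

Each chunk can then be sent as its own request, one after another, and the resulting audio concatenated in order.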
The request is rejected with an error response. Split your text into smaller chunks within the allowed limits.
Upgrade to a paid plan for 20 req/sec on TTS (with 15 concurrent requests) and 20 req/sec with a burst of 60 on voice-agent endpoints. Enterprise customers can request custom limits by contacting sales.
Track usage through the Speechify Console dashboard.