API: Concurrency Limits for TTS Endpoints

We’ve introduced concurrency limits to prevent abuse and ensure fair usage across all users. These limits restrict the number of simultaneous in-flight text-to-speech requests per user.

Concurrency Limits

Plan Type    Concurrent Requests
Free         1 concurrent request
Paid         15 concurrent requests

What This Means

  • Free users can have 1 in-flight TTS request at a time
  • Paid users can have up to 15 in-flight TTS requests at a time
  • Concurrency limits are enforced per user account, not per API key
  • When the limit is exceeded, you’ll receive a 429 Too Many Requests response with a Retry-After header
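
One way to handle the 429 response described above is to wait for the interval given in Retry-After and then try again. The following is a minimal sketch, not an official client: `send` is a hypothetical callable that performs one request and returns a `(status_code, headers, body)` tuple, and the one-second fallback and five-attempt cap are assumptions rather than documented values. The sketch accepts both forms that HTTP allows for Retry-After (a number of seconds or an HTTP date), since the format used by these endpoints is not stated here.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, default=1.0):
    """Convert a Retry-After header value into a number of seconds to wait."""
    if header_value is None:
        return default  # assumption: fall back to a short fixed wait
    value = header_value.strip()
    try:
        # Form 1: delay in seconds, e.g. "3"
        return max(0.0, float(value))
    except ValueError:
        pass
    try:
        # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body),
    where headers is a dict keyed by the exact name "Retry-After".
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(retry_after_seconds(headers.get("Retry-After")))
    # Still limited after the last attempt: hand the 429 back to the caller.
    return status, headers, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. If your HTTP library exposes headers through a case-insensitive mapping, the `headers.get("Retry-After")` lookup works unchanged; with a plain dict, the key must match exactly.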

Affected Endpoints

  • POST /v1/audio/speech
  • POST /v1/audio/stream

Concurrency limits work alongside rate limits, and a request must satisfy both to proceed. If you exceed the concurrency limit, wait for at least one of your in-flight requests to complete before sending a new one.
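
A client can avoid most 429 responses by capping its own in-flight requests before they reach the API. The sketch below shows one possible approach using a semaphore. `ConcurrencyGate`, `synthesize_all`, and the `synthesize` callable are hypothetical names introduced for illustration; `synthesize` stands in for whatever function sends a single request to one of the affected endpoints. The default of 15 matches the Paid limit in the table above, and Free accounts would use 1.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyGate:
    """Allow at most `limit` calls to run at the same time."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, fn, *args, **kwargs):
        # Blocks until a slot is free, then releases it when fn returns or raises.
        with self._slots:
            return fn(*args, **kwargs)


def synthesize_all(texts, synthesize, limit=15):
    """Run synthesize(text) for every text, never exceeding `limit` at once.

    `synthesize` is a hypothetical callable that performs one TTS request.
    Results come back in the same order as `texts`.
    """
    gate = ConcurrencyGate(limit)
    workers = max(1, min(32, len(texts)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda text: gate.run(synthesize, text), texts))
```

Because limits are enforced per user account rather than per API key, a gate like this only helps if every process that uses the account shares the same budget. A single in-process semaphore does not coordinate across machines, so handling 429 responses remains necessary as a fallback.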