# Rate Limiting
Understand how rate limits work, detect when you hit them, and implement retry logic that keeps your application running smoothly.
## How Rate Limits Work
Conduit.im enforces rate limits to ensure fair usage and protect upstream providers. Limits are applied per API key and are measured in requests per minute. When you exceed a limit, the API returns a 429 Too Many Requests response.
Note: Rate limits are separate from spending limits. A spending limit caps how much money a key can spend; a rate limit caps how many requests it can make in a time window.
## Detecting a Rate Limit
When you are rate-limited, the API returns HTTP 429 with a JSON error body and a `Retry-After` header indicating how many seconds to wait:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 5
Content-Type: application/json

{
  "error": {
    "message": "Rate limit exceeded. Please try again later.",
    "code": "RATE_LIMIT_EXCEEDED",
    "timestamp": "2026-03-09T14:32:00.000Z",
    "requestId": "req_abc123"
  }
}
```

## Rate Limit Error Codes
The `code` field tells you exactly which limit was hit so you can respond appropriately:
| Code | Retryable | Description |
|---|---|---|
| `RATE_LIMIT_EXCEEDED` | Yes | Too many requests per minute — wait and retry |
| `CHAT_RATE_LIMIT_EXCEEDED` | Yes | Chat-specific rate limit hit — slow down chat requests |
| `RATE_LIMIT_QUOTA_EXCEEDED` | No | Daily quota exhausted — resets at midnight UTC |
| `API_KEY_LIMIT_EXCEEDED` | No | Per-key spending limit reached — increase it in the dashboard |
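The retryability rules in the table can be encoded in a small helper. This is a sketch — the `is_retryable` function is our own, not part of any SDK:

```python
# Error codes the table marks as safe to retry after a delay.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "CHAT_RATE_LIMIT_EXCEEDED"}


def is_retryable(error_code: str) -> bool:
    """Return True if the code indicates a transient limit that
    back-off-and-retry can resolve; False if user action is needed."""
    return error_code in RETRYABLE_CODES
```

Checking the code before retrying keeps your client from hammering the API on errors that will never succeed, such as an exhausted daily quota.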
## Exponential Back-off
The recommended retry strategy is exponential back-off with jitter. Each retry waits longer than the last, and a random jitter prevents all clients from retrying at the same moment:
| Attempt | Base delay | With jitter (typical) |
|---|---|---|
| 1 | 1 s | 0.5 – 1.5 s |
| 2 | 2 s | 1 – 3 s |
| 3 | 4 s | 2 – 6 s |
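The delays in the table follow a simple formula: the base delay doubles each attempt (2^(attempt − 1) seconds) and the jitter multiplier is drawn from [0.5, 1.5). As a minimal sketch (`backoff_delay` is a hypothetical helper, not part of any SDK):

```python
import random


def backoff_delay(attempt: int, base: float = 1.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based):
    exponential growth with a random +/-50 % jitter applied."""
    exponential = base * 2 ** (attempt - 1)  # 1 s, 2 s, 4 s, ...
    return exponential * (0.5 + random.random())
```

For attempt 3 the base delay is 4 s, so the jittered delay falls between 2 s and 6 s, matching the table's last row.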
### JavaScript / TypeScript

```javascript
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.ok) return await res.json();

    // Only retry rate limits and server errors
    if (res.status !== 429 && res.status < 500) {
      const { error } = await res.json();
      throw new Error(error.message);
    }

    // Prefer the server's Retry-After header if present
    const retryAfter = res.headers.get("Retry-After");
    const baseDelay = retryAfter
      ? Number(retryAfter) * 1000
      : 1000 * 2 ** attempt;

    // Add random jitter (±50 %)
    const jitter = baseDelay * (0.5 + Math.random());
    await new Promise((r) => setTimeout(r, jitter));
  }
  throw new Error("Max retries exceeded");
}
```

### Python
```python
import random
import time

import requests


def fetch_with_retry(url, headers, json_body, max_retries=3):
    for attempt in range(max_retries):
        res = requests.post(url, headers=headers, json=json_body)
        if res.ok:
            return res.json()

        # Only retry rate limits and server errors
        if res.status_code != 429 and res.status_code < 500:
            raise Exception(res.json()["error"]["message"])

        # Prefer the server's Retry-After header if present
        retry_after = res.headers.get("Retry-After")
        base_delay = float(retry_after) if retry_after else 2 ** attempt

        # Add random jitter (±50 %)
        jitter = base_delay * (0.5 + random.random())
        time.sleep(jitter)
    raise Exception("Max retries exceeded")
```

## The Retry-After Header
When the API returns a 429, it includes a `Retry-After` header with the number of seconds to wait. Always respect this value — it is the fastest safe retry time:
```javascript
const retryAfter = response.headers.get("Retry-After");
if (retryAfter) {
  await new Promise((r) => setTimeout(r, Number(retryAfter) * 1000));
  // Now safe to retry
}
```

**Important:** Retrying before the `Retry-After` window elapses will result in another 429 and may extend the cool-down period.
## Best Practices
### Use a request queue

Instead of sending requests as fast as possible, enqueue them and process at a controlled rate (e.g., one request per 100 ms). This avoids hitting limits in the first place.

### Set per-key spending limits

Configure spending limits on each API key to prevent runaway costs. A spending limit is a hard cap, not a rate limit, but it provides an extra safety net.

### Don't retry non-retryable errors

Only retry `RATE_LIMIT_EXCEEDED` and server errors (5xx). Errors like `RATE_LIMIT_QUOTA_EXCEEDED` or `API_KEY_LIMIT_EXCEEDED` require user action, not retries.

### Add jitter to back-off

Without jitter, multiple clients that hit a limit at the same time will all retry together, causing a "thundering herd." Random jitter spreads retries out and improves success rates.

### Cap the number of retries

Set a maximum (e.g., 3–5 retries). After exhausting retries, surface a clear error to the user rather than blocking indefinitely.
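The thundering-herd effect behind the jitter advice above can be seen in a quick simulation. This is a toy sketch, not Conduit.im code: ten clients that hit a limit at the same instant all retry together without jitter, but spread out with it:

```python
import random

random.seed(0)  # deterministic for the example

BASE_DELAY = 1.0  # seconds until first retry
CLIENTS = 10

# Without jitter: every client retries at exactly the same moment.
no_jitter = [BASE_DELAY for _ in range(CLIENTS)]

# With +/-50 % jitter: retries spread across 0.5-1.5 s.
with_jitter = [BASE_DELAY * (0.5 + random.random()) for _ in range(CLIENTS)]

print(len(set(no_jitter)))   # every retry lands at one instant
print(sorted(with_jitter))   # retries are spread across the window
```

Without jitter the server absorbs all ten retries in the same instant and likely rate-limits them again; with jitter the load is spread across a full second.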
## Example: Simple Request Queue
A basic queue that spaces out requests to stay under the rate limit:
```javascript
class RequestQueue {
  constructor(minIntervalMs = 100) {
    this.queue = [];
    this.minInterval = minIntervalMs;
    this.processing = false;
  }

  enqueue(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      if (!this.processing) this.#process();
    });
  }

  async #process() {
    this.processing = true;
    while (this.queue.length > 0) {
      const { fn, resolve, reject } = this.queue.shift();
      try {
        resolve(await fn());
      } catch (err) {
        reject(err);
      }
      await new Promise((r) => setTimeout(r, this.minInterval));
    }
    this.processing = false;
  }
}

// Usage
const queue = new RequestQueue(200); // max 5 requests per second
const result = await queue.enqueue(() =>
  fetchWithRetry(url, options)
);
```

## Next Steps
You now know how to detect, handle, and prevent rate limit errors. Continue learning: