
Best Practices

Patterns and recommendations for building reliable, cost-effective, and secure applications with the Conduit.im API.

API Key Security

Never expose your API key in client-side code. Keys embedded in JavaScript bundles, mobile apps, or public repositories can be extracted and misused.

Use environment variables

Store keys in environment variables or a secrets manager. Never hard-code them in source files.

// Good — read from environment
const API_KEY = process.env.CONDUIT_API_KEY;

// Bad — hard-coded secret
const API_KEY = "cnd_live_abc123...";

Proxy through your backend

Browser and mobile clients should call your own server, which then forwards the request to Conduit.im with the API key attached server-side.

Use separate keys per environment

Create distinct keys for development, staging, and production. If a dev key leaks, your production traffic is unaffected.

Rotate keys regularly

Rotate keys periodically and revoke any that may have been compromised. You can manage keys from the API Keys dashboard.

Error Handling

Always check the HTTP status code and handle errors gracefully:

class RetryableError extends Error {
  constructor(message, status) {
    super(message);
    this.status = status;
  }
}

async function callConduit(body) {
  const res = await fetch("https://api.conduit.im/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.CONDUIT_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });

  if (!res.ok) {
    const { error } = await res.json();

    // Non-retryable client errors — surface to the user
    if (res.status < 500 && res.status !== 429) {
      const err = new Error(`[${error.code}] ${error.message}`);
      err.status = res.status;
      throw err;
    }

    // Retryable — implement back-off (see Rate Limiting guide)
    throw new RetryableError(error.message, res.status);
  }

  return await res.json();
}
  • Retry 429 and 5xx errors with exponential back-off
  • Never retry 401 or 402 — these require user action
  • Log the requestId for debugging and support

Cost Optimisation

Set spending limits

Configure per-key spending limits (daily, weekly, or monthly) to cap costs and prevent runaway usage.

Use max_tokens

Always set max_tokens to the maximum you actually need. This prevents unexpectedly long (and expensive) responses.

{
  "model": "gpt-4",
  "messages": [...],
  "max_tokens": 500    // cap the response length
}

Choose the right model

More capable models cost more per token. Use a smaller model for simple tasks (classification, extraction) and reserve larger models for complex reasoning. Browse the Models page to compare pricing.

Trim conversation history

Every token in the messages array counts towards input costs. For long conversations, keep a sliding window of recent messages or summarise older turns.
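One sliding-window approach, keeping the system message pinned (the default window of 10 messages is an arbitrary starting point; tune it for your application):

```javascript
// Keep all system messages plus the most recent `maxTurns` other messages.
function trimHistory(messages, maxTurns = 10) {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(-maxTurns)];
}
```

Summarising older turns into a single system or assistant message is a complementary technique when the dropped context still matters.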

Monitor usage

Use the Usage API to track your balance and transaction history. Set up alerts when your balance drops below a threshold.

Performance

Use streaming for user-facing apps

Streaming delivers tokens as they are generated, dramatically reducing perceived latency. Users see the first words almost immediately instead of waiting several seconds for the complete response.
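A sketch of consuming a streamed response. This assumes Conduit.im uses the common OpenAI-style `stream: true` parameter and `data:` server-sent-events framing; check the Streaming guide for the exact wire format:

```javascript
// Parse one SSE chunk into content deltas. Assumes the common framing of
// `data: {json}` lines terminated by `data: [DONE]`.
function parseSSEChunk(chunk) {
  const deltas = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") break;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) deltas.push(delta);
  }
  return deltas;
}

// Usage sketch: stream the response body and print tokens as they arrive.
async function streamChat(body) {
  const res = await fetch("https://api.conduit.im/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.CONDUIT_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ ...body, stream: true }),
  });
  const decoder = new TextDecoder();
  for await (const chunk of res.body) {
    for (const token of parseSSEChunk(decoder.decode(chunk, { stream: true }))) {
      process.stdout.write(token);
    }
  }
}
```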

Set timeouts

Always set a request timeout so a slow upstream provider doesn't block your application indefinitely. Use an AbortController in JavaScript or the timeout parameter in Python's requests.

// JavaScript — 30-second timeout
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 30_000);

const res = await fetch(url, {
  ...options,
  signal: controller.signal,
});
clearTimeout(timer); // cancel the pending abort once the request completes

Keep prompts concise

Shorter prompts mean faster time-to-first-token and lower costs. Put essential context first and remove filler text.

Cache repeated requests

If multiple users ask the same question, cache the response on your server. This eliminates duplicate API calls and reduces both latency and cost.
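A minimal in-memory cache keyed on the request body (a sketch only; a production system may prefer Redis or a bounded LRU with eviction, and the 5-minute TTL is an arbitrary choice):

```javascript
// Cache responses keyed on the full request body so that different
// prompts, models, or parameters never collide.
const cache = new Map();
const TTL_MS = 5 * 60 * 1000; // 5 minutes; tune for your use case

async function cachedCall(body, doRequest) {
  const key = JSON.stringify(body);
  const hit = cache.get(key);
  if (hit && Date.now() - hit.at < TTL_MS) return hit.value;
  const value = await doRequest(body);
  cache.set(key, { at: Date.now(), value });
  return value;
}
```

Only cache deterministic requests (low temperature, no per-user context); caching personalised responses can leak one user's data to another.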

Prompt Engineering

Use system messages effectively

Set the tone, persona, and constraints in the system message. This is more reliable than putting instructions in the user message.

{
  "messages": [
    {
      "role": "system",
      "content": "You are a customer support agent for Acme Corp. Be concise and helpful. Only answer questions about Acme products. If unsure, say so."
    },
    {
      "role": "user",
      "content": "How do I reset my password?"
    }
  ]
}

Be specific about output format

If you need JSON, bullet points, or a particular structure, say so explicitly in the prompt. This reduces post-processing and improves reliability.
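For example, a request that spells out the expected structure in the prompt (the schema shown is illustrative):

```json
{
  "model": "gpt-4",
  "messages": [
    {
      "role": "user",
      "content": "Extract the customer's name and email from the text below. Respond with only a JSON object of the form {\"name\": \"...\", \"email\": \"...\"} and no other text.\n\nText: ..."
    }
  ]
}
```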

Use temperature wisely

Lower temperature (0.0–0.3) for factual or deterministic tasks. Higher temperature (0.7–1.0) for creative writing or brainstorming.
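For example, pinning down a deterministic extraction task:

```json
{
  "model": "gpt-4",
  "messages": [...],
  "temperature": 0.2
}
```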

Reliability

Implement retry with back-off

Transient failures happen. Use exponential back-off with jitter for 429 and 5xx responses.
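A sketch of a retry wrapper using full jitter. It assumes thrown errors carry a numeric status, as in the callConduit example above; the retry count and base delay are illustrative defaults:

```javascript
// Retry retryable failures (429 and 5xx) with exponential back-off
// and full jitter: each delay is random in [0, baseMs * 2^attempt].
async function withRetry(fn, { retries = 5, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const retryable = err.status === 429 || err.status >= 500;
      if (!retryable || attempt >= retries) throw err;
      const delay = Math.random() * baseMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Full jitter spreads retries out so that many clients failing at once do not all hammer the API again at the same instant.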

Have a fallback model

If your primary model is temporarily unavailable, fall back to an alternative. Since Conduit.im provides all models through the same interface, switching is a one-line change.

const MODELS = ["gpt-4", "claude-3-sonnet", "gemini-pro"];

async function callWithFallback(messages) {
  for (const model of MODELS) {
    try {
      return await callConduit({ model, messages });
    } catch (err) {
      if (err.status === 404) continue; // model unavailable, try next
      throw err;                        // non-model error, don't mask it
    }
  }
  throw new Error("All models unavailable");
}

Log request IDs

Every error response includes a requestId. Log it so you can reference it when contacting support.
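A sketch of capturing it at the call site. This assumes the requestId appears in the JSON error body; verify the exact field location against the API reference:

```javascript
// Log the requestId from a failed response so it can be quoted to support.
async function logFailure(res) {
  const payload = await res.json();
  console.error(
    `Conduit.im error ${res.status}`,
    `requestId=${payload.requestId ?? "unknown"}`,
    payload.error?.message ?? ""
  );
  return payload.requestId;
}
```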

Quick Reference Checklist

  • API key stored in environment variable, never in client code
  • All API calls proxied through your own backend
  • Separate API keys for dev, staging, and production
  • Spending limits configured on every key
  • max_tokens set on every request
  • Retry logic with exponential back-off and jitter
  • Request timeout configured (e.g., 30 seconds)
  • Streaming enabled for user-facing interfaces
  • Conversation history trimmed to control costs
  • Error requestId values logged for debugging

Next Steps