Three endpoints, an auth header, a JSON body. A URL shortener API is one of the easiest integrations on any backlog, and the quickstart gets you a working short link in a few minutes. What the quickstart skips is everything that happens when the integration runs at volume: the rate limiter pushing back, a transient 503 mid-batch, a job queue that delivers the same message twice. Get those wrong and you get duplicate links, dropped work, and a 429 storm that makes things worse.
This post is the production-hardening companion to the API quickstart. It covers the three mechanics that separate a demo from a reliable integration: rate limits and how to pace against them, which errors to retry and how to back off, and idempotency keys that keep a retry from creating a second link. The examples use Elido's API, but the patterns are the same against any well-built link shortener API. If you treat short links as infrastructure you manage from code, the broader case for that is in short links as Terraform.
Rate Limits: a Token Bucket and Three Headers
Elido meters the API with a token bucket, scoped per workspace. The published sustained rates are 10 requests per second on Free, 100 on Pro, 500 on Business, and a negotiated ceiling on Enterprise. Pro carries a burst capacity of 200, which means a full bucket lets you fire 200 requests at once before the rate settles back to the sustained 100 per second. Most link-creation jobs fit inside the burst and never feel the limit at all.
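To make the burst-versus-sustained distinction concrete, here is a toy token bucket using the Pro numbers quoted above (100 per second sustained, burst of 200). The `TokenBucket` class and `admitted` helper are illustrative names, and the model is a sketch of the general technique, not a description of how Elido's limiter is built.

```typescript
// Toy token bucket. Time is passed in explicitly so the model is deterministic.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerSec: number; // sustained refill rate
  private readonly capacity: number;   // burst size

  constructor(ratePerSec: number, capacity: number, nowMs: number) {
    this.ratePerSec = ratePerSec;
    this.capacity = capacity;
    this.tokens = capacity; // the bucket starts full
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request is admitted, false if it would get a 429.
  tryTake(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Fire `count` requests at the same instant and count how many get through.
function admitted(bucket: TokenBucket, count: number, nowMs: number): number {
  let ok = 0;
  for (let i = 0; i < count; i++) {
    if (bucket.tryTake(nowMs)) ok++;
  }
  return ok;
}

const pro = new TokenBucket(100, 200, 0);
const burst = admitted(pro, 250, 0);             // 200 admitted, 50 rejected
const oneSecondLater = admitted(pro, 250, 1000); // 100 admitted after 1 s of refill
```

The first volley drains the full burst; a second later only the sustained rate's worth of tokens has come back.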
You do not have to guess where you stand. Every response carries three headers:
X-RateLimit-Limit - the current per-second ceiling.
X-RateLimit-Remaining - tokens left in the current window.
X-RateLimit-Reset - the Unix timestamp when the bucket refills.
A well-behaved client reads X-RateLimit-Remaining and slows down before it hits zero, rather than sprinting into a wall of 429s and reacting after the fact. Proactive pacing keeps throughput smooth; reactive retrying after every rejection wastes round trips and, if every client retries at the same instant, manufactures a thundering herd.
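One way to pace proactively is to turn the three headers into a delay before the next request. The sketch below assumes X-RateLimit-Reset is expressed in Unix seconds; `readRateLimit`, `pacingDelayMs`, and the 20% low-water mark are hypothetical choices for illustration, not documented SDK behavior.

```typescript
interface RateLimitState {
  limit: number;        // X-RateLimit-Limit
  remaining: number;    // X-RateLimit-Remaining
  resetUnixSec: number; // X-RateLimit-Reset (assumed here to be Unix seconds)
}

type HeaderLookup = (name: string) => string | null;

// Missing headers become NaN so they are never mistaken for "0 remaining".
function headerNumber(get: HeaderLookup, name: string): number {
  const raw = get(name);
  return raw === null || raw.trim() === "" ? NaN : Number(raw);
}

function readRateLimit(get: HeaderLookup): RateLimitState | null {
  const limit = headerNumber(get, "X-RateLimit-Limit");
  const remaining = headerNumber(get, "X-RateLimit-Remaining");
  const resetUnixSec = headerNumber(get, "X-RateLimit-Reset");
  if (![limit, remaining, resetUnixSec].every(Number.isFinite)) return null;
  return { limit, remaining, resetUnixSec };
}

// How long to pause before the next request.
function pacingDelayMs(state: RateLimitState, nowMs: number, lowWater = 0.2): number {
  const msUntilReset = Math.max(0, state.resetUnixSec * 1000 - nowMs);
  if (state.remaining <= 0) return msUntilReset;          // empty: wait for the refill
  if (state.remaining > state.limit * lowWater) return 0; // plenty left: no delay
  return msUntilReset / state.remaining;                  // spread what is left evenly
}

// Example with hard-coded header values and a fixed clock.
const sample = new Map([
  ["X-RateLimit-Limit", "100"],
  ["X-RateLimit-Remaining", "10"],
  ["X-RateLimit-Reset", "10"],
]);
const state = readRateLimit((name) => sample.get(name) ?? null);
const delayMs = state ? pacingDelayMs(state, 9_000) : 0; // 1000 ms left / 10 tokens
```

With a real fetch response, the lookup would be `(name) => res.headers.get(name)`. The design choice is to slow down gradually as the bucket drains instead of stopping dead at zero.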
When you genuinely need to create thousands of links, do not loop the single-create endpoint. POST /v1/links/bulk accepts up to 1000 links in one request and counts as a single unit against the rate limit. One bulk call moves a thousand links for the cost of one token; a thousand single calls burn a thousand tokens and most of your burst. The bulk path is how the Google Sheets import moves a campaign's worth of links without tripping the limiter.
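The client-side half of the bulk path is just batching. A minimal sketch of splitting a large job into requests of at most 1000 links follows; `chunk` and `BULK_MAX` are illustrative names, and the request body for POST /v1/links/bulk is left out because its shape is not covered here.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const BULK_MAX = 1000; // per-request ceiling quoted above

// 2500 placeholder destinations become three bulk requests instead of 2500 single ones.
const urls = Array.from({ length: 2500 }, (_, i) => `https://example.com/p/${i}`);
const batches = chunk(urls, BULK_MAX);
const batchSizes = batches.map((b) => b.length);
```

Each batch then costs one token, so this job spends three tokens in total.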
A 429 Too Many Requests - the status RFC 6585 reserves for exactly this - comes back with a retry_after value telling you how many seconds to wait. Respect it. That number is the limiter telling you precisely when a token will be available, which is better information than any guess your backoff would produce.
Retries: Which Codes, and How to Back Off
Not every error is worth retrying, and retrying the wrong one is how a small failure becomes an outage. Sort the responses into two piles.
Retry these, because they are transient: 429 (you were too fast), and 500, 502, 503, 504 (a server-side or gateway fault that may clear on its own). Do not retry these, because the same request will fail identically: 400 (the payload is invalid), 401 (the token is missing or wrong), 403 (the token lacks the scope), 404 (the resource is not there or not yours), and 409 (a slug conflict or a stale-version edit). The first pile is "wait and try again." The second is "fix the code or the input." Retrying a 400 in a tight loop just turns a bug into a denial-of-service attack on yourself.
For the retryable codes, the algorithm that matters is exponential backoff with jitter. Plain exponential backoff - double the wait each attempt - still synchronizes clients, because every client that failed at the same moment also retries at the same moments. Adding randomness spreads them out. AWS's write-up on exponential backoff and jitter is the canonical reference and shows why the jittered version dramatically cuts contention. A compact version in TypeScript:
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

async function withRetry(
  call: () => Promise<Response>,
  max = 5,
): Promise<Response> {
  let attempt = 0;
  while (true) {
    const res = await call();
    if (res.ok || !RETRYABLE.has(res.status) || attempt >= max) return res;
    // Honor server guidance first; otherwise back off exponentially with full jitter.
    // A missing or non-numeric Retry-After parses to 0 or NaN and takes the fallback.
    const retryAfter = Number(res.headers.get("retry-after"));
    const wait =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000 + Math.random() * 1000 // never earlier than the server asked
        : Math.random() * Math.min(1000 * 2 ** attempt, 20_000); // full jitter
    await new Promise((r) => setTimeout(r, wait));
    attempt++;
  }
}
Three things make this safe rather than dangerous. It caps attempts, so a persistent fault fails loudly instead of spinning forever. It honors Retry-After when the server sends it, falling back to computed backoff only when it does not. And it jitters, so a fleet of workers recovering from the same blip does not stampede in lockstep. The official SDKs implement this same policy out of the box - @elido/sdk, elido-python, and the Go client retry exactly the five transient codes with jittered backoff - which is the main reason to reach for an SDK over a hand-rolled HTTP client.
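It helps to know what the backoff schedule costs in wall-clock time. The sketch below recomputes the per-attempt ceilings from the loop above (1-second base, 20-second cap, five retries) for the case where the server sends no Retry-After; `backoffCeilings` is a hypothetical helper written for this illustration.

```typescript
// Upper bound of the jittered wait before each retry, in milliseconds.
function backoffCeilings(maxRetries: number, baseMs = 1000, capMs = 20_000): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) =>
    Math.min(baseMs * 2 ** attempt, capMs),
  );
}

const ceilings = backoffCeilings(5);                      // [1000, 2000, 4000, 8000, 16000]
const worstCaseMs = ceilings.reduce((a, b) => a + b, 0);  // 31000
const expectedMs = worstCaseMs / 2;                       // 15500: mean of uniform [0, ceiling)
```

So a request that fails every time gives up after at most about 31 seconds of waiting, and about half that on average. With five retries the 20-second cap is never reached; it only matters if you raise the attempt limit.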
There is one rule that ties retries to the next section: a retry of a create is only safe if the create is idempotent. Otherwise every retry risks a second link.
Idempotency: How to Not Create Duplicate Links
The classic failure looks like this. Your worker creates a short link, the link is created, but the 200 never makes it back - the connection drops on the return trip. The worker sees a timeout, assumes failure, and retries. Now you have two links for one campaign. At scale, the dashboard fills with /foo, /foo-1, /foo-2, and the duplicates skew every report downstream.
Idempotency keys close that gap. Send an Idempotency-Key header on a mutating request - any string up to 255 characters - and the server stores the response against it. Present the same key again and you get the original response back, status code and body, without the operation running twice. The pattern is the same one Stripe documents for idempotent requests, and it is the standard way to make an unreliable network safe for writes.
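A toy in-memory model of the server side shows why a replay is harmless. This is a sketch of the general store-and-replay technique, not Elido's implementation: `IdempotentCreator`, the `lnk_` ID format, and the sample key are made up for illustration, and workspace scoping and expiry are left out.

```typescript
interface StoredResponse {
  status: number;
  body: string;
}

class IdempotentCreator {
  private readonly responses = new Map<string, StoredResponse>();
  private created = 0;

  // `key` undefined models a request sent without an Idempotency-Key header.
  create(destinationUrl: string, key?: string): StoredResponse {
    if (key !== undefined) {
      const cached = this.responses.get(key);
      if (cached !== undefined) return cached; // replay: the operation does not run again
    }
    this.created++;
    const res: StoredResponse = {
      status: 200,
      body: JSON.stringify({ id: `lnk_${this.created}`, destinationUrl }),
    };
    if (key !== undefined) this.responses.set(key, res);
    return res;
  }

  linkCount(): number {
    return this.created;
  }
}

const server = new IdempotentCreator();
const first = server.create("https://example.com/sale", "order-12345-link");
const retry = server.create("https://example.com/sale", "order-12345-link");
const linksAfterKeyedRetry = server.linkCount(); // 1: the retry was a replay

// Without a key, the same retry creates a second, duplicate link.
server.create("https://example.com/sale");
server.create("https://example.com/sale");
const linksAfterUnkeyedRetry = server.linkCount(); // 3
```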
The detail that makes or breaks it is where the key comes from. Do not generate a random key per attempt - that defeats the point, because each retry then looks like a new operation. Derive it from a stable business identifier so the same logical action always produces the same key:
const link = await elido.links.create(
  { destinationUrl: order.landingUrl },
  { idempotencyKey: `order-${order.id}-link` },
);
Now a retry of the same job carries order-12345-link again, hits the stored response, and returns the link that already exists. Exactly one link per order, no matter how many times the queue redelivers. This is what lets you combine the backoff loop above with creates safely: the retry and the idempotency key are two halves of the same guarantee.
Two boundaries to keep in mind. The key is scoped per workspace: the same key in two workspaces creates two links, which is correct for a multi-tenant API but surprises teams that assume keys are global. And the cache is not forever - on Elido it holds for 24 hours keyed on (workspace, key). A retry within the window deduplicates; a retry three days later, from a stuck job that finally drained, will create a fresh link. For multi-day batches, do not lean on the key alone. Persist the link ID returned by the first success and look it up before re-issuing. The IETF has been standardizing this header in the Idempotency-Key draft, which likewise leaves key expiry to the server, so the retention window is something every client has to plan around.
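The persist-and-look-up step can be sketched as a thin wrapper around the create call. Everything here is hypothetical scaffolding: `LinkStore` stands in for whatever durable storage you already have (a database table, for instance), and `createLink` stands in for the real API call, which receives the derived idempotency key.

```typescript
// Durable mapping from a business ID to the link that was created for it.
interface LinkStore {
  get(orderId: string): string | undefined;
  set(orderId: string, linkId: string): void;
}

async function ensureLink(
  orderId: string,
  store: LinkStore,
  createLink: (idempotencyKey: string) => Promise<string>,
): Promise<string> {
  const existing = store.get(orderId);
  if (existing !== undefined) return existing; // already created: skip the API entirely

  // Still send the key, so a crash between the create and the write to the
  // store is covered by server-side deduplication inside the 24-hour window.
  const linkId = await createLink(`order-${orderId}-link`);
  store.set(orderId, linkId);
  return linkId;
}
```

A `Map<string, string>` satisfies `LinkStore` for local testing. The local record covers the long tail past the cache window, and the key covers the short gap before the record is written.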
If you are wiring an API integration today and want it to survive its own retries, start on a free workspace, generate a service-account token, and put an idempotency key on your very first create rather than retrofitting one after the duplicates show up.
Putting It Together
A production-grade create call is the three mechanics stacked. Pace against the rate-limit headers so you rarely hit 429. Wrap the call in jittered backoff that retries only the transient codes and respects Retry-After. Carry an idempotency key derived from a business ID so the retry is safe. With the official SDK, the first two come for free and you supply only the key:
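import { Elido, ElidoRateLimitError } from "@elido/sdk";

// App-side placeholders, not part of the SDK: swap in your own order model
// and whatever "retry this job later" signal your queue understands.
interface Order { id: string; landingUrl: string }
class RetryableJobError extends Error {
  constructor(public readonly retryAfter: number) {
    super("rate limited; retry the job later");
  }
}

const elido = new Elido({ token: process.env.ELIDO_TOKEN! });

export async function shortenForOrder(order: Order) {
  try {
    return await elido.links.create(
      { destinationUrl: order.landingUrl, tags: [`order:${order.id}`] },
      { idempotencyKey: `order-${order.id}-link` },
    );
  } catch (err) {
    if (err instanceof ElidoRateLimitError) {
      // SDK already retried with backoff; we are still limited. Defer the job.
      throw new RetryableJobError(err.retryAfter);
    }
    throw err; // non-retryable: surface it
  }
}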
None of this is exotic. It is the same discipline any write-heavy API deserves, applied to links. The reward is an integration that does the right thing under load instead of quietly corrupting your link inventory. For the read side of the same API - pulling click data back out without hammering the limiter - the tradeoffs are in webhooks versus polling for click tracking, and the full endpoint surface lives on the API and SDKs page and the developer solutions overview.