Shipping the Bitly migration: a worker, a token, a 30-minute budget

The first migration source for our Tier-3 integration rollout shipped today. Paste a Bitly Generic Access Token, pick a group, click Start. Five minutes later every link sits on s.elido.me/<slug> (or your custom domain) with the Bitly slug preserved.

This post is the engineering write-up — what's in the code, what's deliberately left out, and why the worker is in-process for now.

Why Bitly first

Five vendors are queued in the rollout plan: Bitly, Rebrandly, Short.io, Dub.co, TinyURL. Bitly is first because the SEO and acquisition gravity sits on one specific search query: "Bitly alternative". Every other migration source benefits from sharing the worker scaffolding we put in place for Bitly. Order is engineering cost ascending; SEO is the tie-breaker.

The four other vendors will land in the next four weeks against the same import_jobs table.

Data model

The whole feature is one table:

CREATE TABLE import_jobs (
    id                  BIGSERIAL    PRIMARY KEY,
    workspace_id        BIGINT       NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
    source_vendor       TEXT         NOT NULL,
    source_token_id     BIGINT       REFERENCES service_tokens(id) ON DELETE SET NULL,
    target_domain_id    BIGINT       NOT NULL REFERENCES domains(id) ON DELETE CASCADE,
    status              TEXT         NOT NULL DEFAULT 'queued',
    conflict_strategy   TEXT         NOT NULL DEFAULT 'suffix',
    source_filter       JSONB        NOT NULL DEFAULT '{}'::jsonb,
    total_items         INT          NOT NULL DEFAULT 0,
    imported_items      INT          NOT NULL DEFAULT 0,
    skipped_items       INT          NOT NULL DEFAULT 0,
    failed_items        INT          NOT NULL DEFAULT 0,
    error_log           JSONB        NOT NULL DEFAULT '[]'::jsonb,
    -- timestamps + check constraints elided
);

source_token_id is nullable on purpose. TinyURL has no public API for free accounts, so its path is a CSV upload — no token. CSV uploads still get a row in the same table so the dashboard surfaces a single "import progress" UI for all five sources.

source_filter is a JSONB bag for vendor-specific things: {group_guid: "..."} for Bitly, {project_slug: "..."} for Dub, {domain_id: 123} for Short.io. We could split it into typed columns once we know what's actually variant; until then JSONB keeps the schema flat.
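Reading the bag back on the Go side could look roughly like the sketch below. The struct names, the decodeFilter helper, and the vendor strings ("bitly", "dub", "shortio") are assumptions made for illustration; only the three JSON keys come from the paragraph above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical per-vendor views of the source_filter JSONB bag.
type BitlyFilter struct {
	GroupGUID string `json:"group_guid"`
}

type DubFilter struct {
	ProjectSlug string `json:"project_slug"`
}

type ShortIOFilter struct {
	DomainID int64 `json:"domain_id"`
}

// decodeFilter picks a struct by vendor name and unmarshals raw into it.
// The vendor strings are assumed values for source_vendor.
func decodeFilter(vendor string, raw []byte) (any, error) {
	var dst any
	switch vendor {
	case "bitly":
		dst = &BitlyFilter{}
	case "dub":
		dst = &DubFilter{}
	case "shortio":
		dst = &ShortIOFilter{}
	default:
		return nil, fmt.Errorf("no filter shape for vendor %q", vendor)
	}
	if err := json.Unmarshal(raw, dst); err != nil {
		return nil, fmt.Errorf("decode source_filter: %w", err)
	}
	return dst, nil
}

func main() {
	f, err := decodeFilter("bitly", []byte(`{"group_guid":"Bk1234"}`))
	fmt.Println(f.(*BitlyFilter).GroupGUID, err) // Bk1234 <nil>
}
```

If the variant fields settle down, the same structs would map one-to-one onto typed columns.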

error_log is a JSONB array of {source_id, source_slug, reason} so the dashboard can render "12 of 4,302 links could not be migrated" without a separate table or a join. The worker truncates at 1,000 entries — beyond that you have a structural problem and the count alone is the actionable signal.
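The truncation rule amounts to a capped append. A minimal sketch, assuming a hypothetical ImportError type and appendCapped helper (the worker's real types aren't shown in this post):

```go
package main

import "fmt"

// errorLogCap mirrors the 1,000-entry cap described above.
const errorLogCap = 1_000

// ImportError is a hypothetical shape for one error_log entry.
type ImportError struct {
	SourceID   string `json:"source_id"`
	SourceSlug string `json:"source_slug"`
	Reason     string `json:"reason"`
}

// appendCapped adds e to log unless the cap is already reached.
// The failure counter is tracked separately, so the count stays
// accurate even when the detail is dropped.
func appendCapped(log []ImportError, e ImportError) []ImportError {
	if len(log) >= errorLogCap {
		return log
	}
	return append(log, e)
}

func main() {
	var log []ImportError
	failed := 0
	for i := 0; i < 1_200; i++ {
		failed++
		log = appendCapped(log, ImportError{
			SourceID: fmt.Sprintf("bitly-%d", i),
			Reason:   "slug conflict",
		})
	}
	fmt.Println(len(log), failed) // 1000 1200
}
```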

The worker

A single goroutine per kicked-off job. The worker lives in api-core (services/api-core/internal/imports/bitly.go) for v1 — fewer moving parts, no inter-service event bus, and the per-job context is bounded by a 30-minute timeout.

const (
    MaxLinksPerImport = 50_000
    ImportRunBudget   = 30 * time.Minute
    progressEvery     = 50
    errorLogCap       = 1_000
    bitlyPageSize     = 100
)

These five constants do most of the work. They aren't config knobs; they're the contract.

MaxLinksPerImport is a guardrail, not a product limit. Most users have under 5,000 bitlinks. Above 50k we want a chunked migration with explicit checkpointing, so the worker hard-fails with an instruction to email migrate@elido.app. Tomorrow it points at a paid concierge SKU; today it routes to the inbox.

ImportRunBudget is the deploy-friendliness budget. A 50k account at ~5 inserts/sec takes roughly three hours (50,000 / 5 = 10,000 seconds, about 2.8 hours); we'd rather fail fast and re-run than deploy over a long-running goroutine. Above 50k or above 30 minutes, see the resumability TODO at the bottom of the file.

Pagination

Bitly's API is well-behaved. GET /v4/groups/{guid}/bitlinks?size=100 returns links plus a pagination.next URL. Empty next means done. The whole loop is:

page := fmt.Sprintf("%s/v4/groups/%s/bitlinks?size=%d",
    BitlyAPIBase, url.PathEscape(opts.GroupGUID), bitlyPageSize)

for page != "" {
    resp, err := w.fetchPage(ctx, opts.Token, page)
    if err != nil { /* mark failed */ return }

    for _, link := range resp.Links {
        // ... resolve slug, insert, update counters ...
    }
    page = strings.TrimSpace(resp.Pagination.Next)
}

We trust Bitly's pagination cursor. If they return the same next URL twice we'll loop, but that's never happened in testing — and the 30-minute budget caps the damage.

Conflict resolution

When a Bitly slug collides with an Elido link that already exists on the target domain, the worker has to choose. The user picks the strategy when they kick the job off:

suffix (default): walk mylink-2, mylink-3, … up to 50. Past 50 we treat it as an error — that signals a pathological collision and they should clean up the source first.
skip: leave the existing Elido link alone, log the source row to error_log, count as skipped.
fail: abort the whole job on the first conflict. For users who want strict 1:1 semantics.

The lookup is a single indexed read on (domain_id, slug):

func (w *BitlyWorker) resolveSlug(ctx context.Context, domainID int64, desired, strategy string) (string, error) {
    if _, err := w.links.GetByDomainSlug(ctx, domainID, desired); err != nil {
        if errors.Is(err, pgx.ErrNoRows) {
            return desired, nil
        }
        return "", fmt.Errorf("slug lookup: %w", err)
    }
    switch strategy {
    case "skip": return "", nil
    case "fail": return "", fmt.Errorf("slug %q already exists", desired)
    case "suffix":
        // maxSuffix is the cap of 50 described above.
        for i := 2; i <= maxSuffix; i++ {
            candidate := fmt.Sprintf("%s-%d", desired, i)
            if _, err := w.links.GetByDomainSlug(ctx, domainID, candidate); err != nil {
                if errors.Is(err, pgx.ErrNoRows) { return candidate, nil }
                return "", fmt.Errorf("slug lookup: %w", err)
            }
        }
        return "", fmt.Errorf("more than %d collisions, giving up", maxSuffix)
    }
    return "", fmt.Errorf("unknown conflict_strategy %q", strategy)
}

This is sequential lookup, not insert-with-conflict. We pay an extra read per row but get a deterministic suffix walk and a much friendlier error message — the alternative is fishing for a uniqueness violation in pgx and parsing the constraint name out of the error string.

What we don't migrate

Click history. Bitly does not expose per-click data for export — only aggregate counters per link, and only on Pro plans. So we surface this on every single surface the user sees: the dashboard recipe page, the marketing landing, the import progress UI, and the FAQ section of /migrate-from/bitly. New clicks land in Elido analytics from the cutover moment forward.

We considered fetching /v4/bitlinks/{id}/clicks/summary per link to seed an "imported click count" metric. Rejected: it triples the API calls and gives a single fuzzy number that can't drive any actual analysis. If you need historical clicks, you need them in GA4 or your own warehouse anyway.

QR designs and Bitly campaigns are also dropped. They're vendor-specific structures that don't map cleanly. The Bitly-imported links carry an imported:bitly tag so you can filter them in bulk — most users use that to assign a default Elido CTA overlay or campaign post-hoc.

Token handling

The token never lands on disk. The HTTP handler accepts it in the request body, drops it into a BitlyJobOptions struct, and hands it to the worker via the goroutine launch:

bgCtx := context.WithoutCancel(r.Context())
go h.worker.Run(bgCtx, job.ID, imports.BitlyJobOptions{
    Token:     req.Token,
    GroupGUID: req.GroupGUID,
})

source_token_id stays NULL. ADR-0036's service_tokens table exists and we'll wire migrations into it for the Tier-2 paste-token integrations (Mailchimp, Brevo, Klaviyo, …) where the value of persistence is recurring use. For one-shot migrations the operational benefit doesn't justify the storage surface — the user pastes the token once, the worker runs, the token is gone.

context.WithoutCancel is the new-to-me piece. The handler's request context is normally how Go programs propagate cancellation. We need the opposite: the worker should outlive the HTTP request that kicked it off. WithoutCancel (Go 1.21+) keeps the context's values (logger, trace IDs) but strips its cancellation signal and its deadline.

Resumability and the deploy problem

The worker is in-process. A deploy mid-import kills the goroutine. We accept that for v1 because:

Most jobs finish in under five minutes. Deploys are infrequent at the import-y times of day.
The import_jobs row records last_progress_at. A scheduler tick every 5 minutes flips any running row with no progress in the last 30 minutes to failed with a clear "worker stalled" reason, so users aren't left wondering what happened.
Re-running is idempotent under suffix and skip strategies — already-imported links are detected and resolved per the strategy. No data corruption.

That's the trade. For accounts north of 10,000 links, resumability earns its keep: the plan is to record the Bitly pagination cursor in import_jobs.source_filter and pick up where the last run left off. That's the next iteration.

What's measurable

Ship a feature, instrument a feature. The handler emits structured zap logs for every job lifecycle event:

import: starting bitly run — workspace, target domain, conflict strategy, group GUID
import: bitly run complete — imported, skipped, failed, total
imports stuck-sweep flipped jobs to failed — count

We aren't graphing these in production yet — the first batch of real-user runs will tell us what to alert on. Initial guess: stuck-sweep count > 0 in any 1-hour window is a paging signal, because it means a worker died and the user's UI is stuck on running longer than they should tolerate.

What's next

Same scaffolding, four more vendors:

Rebrandly — GET /v1/links?limit=25 paginated. Slashtag → slug 1:1 when the slug is free.
Short.io — GET /links?limit=150&domain_id=…. Per-domain pagination; we list domains first so the user can pick a source.
Dub.co — GET /api/links?projectSlug=…&limit=100. Folders + tags preserved; this is the easiest of the four.
TinyURL — CSV upload only. Public TinyURL has no API; Pro plans export CSV. We accept the CSV directly and skip the vendor-side roundtrip.

Each lands behind the same import_jobs row and the same dashboard polling UI. The vendor-specific worker stays in services/api-core/internal/imports/<vendor>.go.

If you've been holding off on a Bitly comparison because the migration story was hand-wavy, the migration story isn't hand-wavy anymore. Try it — token to last imported link in under ten minutes for typical accounts.