crate-api 2.0.0
crate API (v2, cluster-first)
crate is the aggregation gateway for a fleet of music-data producers. You ask it about an artist, a label, or a festival; it joins every signal the fleet can see about that entity and hands you back one composed picture — collector behavior, editorial coverage, live-circuit demand, breakout momentum, web presence — each piece labelled with where it came from and how fresh it is.
Two things make crate different from a typical catalogue API, and the rest of this doc exists to teach them:
- Identity is a cluster_id, not a Discogs/MusicBrainz/Bandcamp id. The same artist scattered across those platforms collapses to one canonical key. You resolve to it once, then key everything off it.
- An empty answer is a first-class answer. crate returns HTTP 200 and tells you plainly what it can’t see (present:false, a null field, state:"honest_gap") instead of 404-ing or fabricating data. Only 4xx/5xx are errors.
v2 is cluster-first: the agent (artist / label / festival) is the prime resource, and sources (Bandcamp, Discogs) and releases attach to it as dimensions rather than being top-level nouns. If you do nothing else, internalize the cold-start recipe: resolve a name or pasted link → cluster_id → dossier. GET /api/v2 returns this recipe live, and GET /api/v2/resolve?q=<name> is the front door.
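The cold-start recipe can be sketched as two URL builders plus one decision step. This is a minimal illustration, not the official client: the host `crate.example` is hypothetical, and the assumption that the resolve body carries a top-level `cluster_id` field is inferred from the text rather than from the spec.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Hypothetical host: the source documents only the /api/v2 paths.
BASE_URL = "https://crate.example"


def resolve_url(query: str) -> str:
    """Step 1: the front door, GET /api/v2/resolve?q=<name or pasted link>."""
    return BASE_URL + "/api/v2/resolve?" + urlencode({"q": query})


def dossier_url(cluster_id: str) -> str:
    """Step 2: address the artist dossier by the resolved cluster_id."""
    # The id is opaque, so it is passed through verbatim (only URL-escaped).
    return BASE_URL + "/api/v2/artist/" + quote(cluster_id, safe="")


def next_step(resolve_body: dict) -> Optional[str]:
    """Return the dossier URL, or None when /resolve reports an honest gap."""
    # Assumption: the resolve body exposes a top-level "cluster_id" field.
    cluster_id = resolve_body.get("cluster_id")
    return None if cluster_id is None else dossier_url(cluster_id)
```

A caller would fetch `resolve_url(name)`, parse the JSON body, and follow `next_step(body)` only when it is not `None`.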
Authentication
The API is keyed: every operation requires an X-API-Key header (ck_(live|test)_<…> format) except the two public front-door endpoints that opt out — GET /api/v2 (the root index) and GET /api/v2/openapi.json (this spec). One family of endpoints uses a different credential entirely — see beacons below.
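As a rough sketch of the keyed/public split, the helper below returns the headers to send for a given path. The 32-character base62 tail follows the ApiKeyAuth scheme description; the function name and the client-side format check are illustrative assumptions, not part of the API.

```python
import re

# Key shape per the ApiKeyAuth scheme: ck_(live|test)_<32-base62>.
API_KEY_RE = re.compile(r"ck_(live|test)_[0-9A-Za-z]{32}")

# The two public front-door endpoints that opt out of the key.
PUBLIC_PATHS = {"/api/v2", "/api/v2/openapi.json"}


def headers_for(path: str, api_key: str) -> dict:
    """Return the auth headers for a path (hypothetical helper)."""
    if path in PUBLIC_PATHS:
        return {}  # no credential needed
    if not API_KEY_RE.fullmatch(api_key):
        raise ValueError("key does not match ck_(live|test)_<32-base62>")
    return {"X-API-Key": api_key}
```

Checking the key shape locally only catches typos early; the server remains the authority on whether a key is valid.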
Concepts
cluster_id — the canonical artist identity
cluster_id is crate’s prime key for an artist: a pe-norm-v1 hex string derived from the artist’s name, designed so that the same artist’s Discogs page, MusicBrainz entry, and Bandcamp profile all collapse to one cluster_id. Key all artist data off it.
- It is an opaque string: pass it through verbatim. Never numericize it, parse it, or assume structure.
- cluster_id: null is an honest gap, not an error: crate couldn’t resolve a canonical identity for that lookup. Roughly half of the long-tail booking artists have neither a Discogs id nor an MBID, so null is normal.
- You get one by calling GET /api/v2/resolve (from a name, a pasted link, or a foreign id). You then address the artist dossier directly as GET /api/v2/artist/{key}, where {key} is the 64-hex cluster_id or a human slug.
- A 64-hex key resolves identity directly from the cluster. It deliberately skips the Discogs lookup so a hex address never silently re-anchors onto a same-name Discogs row. Foreign locators (discogs:<id>, mbid:<uuid>) are not canonical addresses: convert them via /resolve first, or /artist/{key} returns 400 (the response’s next field is a ready-to-call /resolve URL).
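One way to respect the addressing rules above is to route foreign locators to /resolve up front and pass everything else through untouched. This is a sketch under assumptions: the helper name is made up, and sending a locator in the `q` parameter is a guess (the source shows `?q=` only for names).

```python
from urllib.parse import quote, urlencode

# Prefixes of the foreign locators named in the text.
FOREIGN_PREFIXES = ("discogs:", "mbid:")


def route_key(key: str) -> str:
    """Pick the path to call for a key (hypothetical helper)."""
    if key.startswith(FOREIGN_PREFIXES):
        # Not a canonical address: convert via /resolve first.
        # Assumption: the locator travels in the q parameter.
        return "/api/v2/resolve?" + urlencode({"q": key})
    # A cluster_id or slug is opaque: forward it verbatim (URL-escaped only).
    return "/api/v2/artist/" + quote(key, safe="")
```

The helper never inspects the shape of a non-locator key, which keeps it in line with the opaque-id rule.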
dossier · grains — the composed per-entity picture
A dossier is the full picture crate composes for one entity by joining every fleet signal. v2 has three agent grains:
| grain | addressed by | what it is |
|---|---|---|
| artist | GET /api/v2/artist/{key} or GET /api/v2/dossier/artist/{slug} | identity + collector behavior + editorial + emergence + live presence + web + compositions + discography + bandcamp dimensions… |
| label | GET /api/v2/label/{key} or GET /api/v2/dossier/label/{slug} | label identity + sublabel→parent lineage + collector behavior |
| festival | GET /api/v2/dossier/festival/{slug} | de-fragmented festival identity + consolidated editions ⋈ lineup |
Releases are not a top-level grain in v2. In the cluster-first model a release (a Discogs release-group, what a consumer calls a “release”) attaches to the artist as the dossier’s discography dimension, keyed by the artist’s cluster_id — there is no standalone release resource. Likewise Bandcamp is a dimension of the artist dossier (bandcamp_emergence / bandcamp_tastemaker), not a top-level surface.
Every dossier facet carries a classified state (e.g. present, empty/absent, honest_gap) and the dossier ships a provenance manifest — an array where each entry names the producer, sourceTable, refreshCadence, tier, and honestGapState for a field. Read GET /api/v2/dossier/manifest (the data dictionary) to discover the entire field surface across grains — including grains that are deliberately unavailable (e.g. song, because the fleet has no track key; and the demoted master grain, whose detail now lives in the artist’s discography) — without hitting every entity endpoint.
honest gap — empty is 200, not 404
This is crate’s defining principle: crate shows what it can see and is explicit about what it can’t, rather than 404-ing or faking data. An unresolved or empty lookup returns HTTP 200 with one of:
- present: false (e.g. a dossier dimension with no match),
- a null field (cluster_id: null, identity: null),
- state: "honest_gap" on a dossier facet.
Branch on the body, not the status, for “did I get data?”. An unresolved artist slug returns identity:null at 200 — never 404. Reserve your error handling for genuine 4xx/5xx.
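The "branch on the body" rule might look like the following classifier. It is a sketch only: where each marker sits in a real response (top level versus nested in a facet) is an assumption, and the three return labels are invented for illustration.

```python
# Fields the text names as nullable honest-gap markers.
NULLABLE_KEYS = ("cluster_id", "identity")


def classify(status: int, body: dict) -> str:
    """Return "error", "gap", or "data" for one response or facet."""
    if status >= 400:
        return "error"  # only genuine 4xx/5xx count as errors
    if body.get("present") is False or body.get("state") == "honest_gap":
        return "gap"
    if any(k in body and body[k] is None for k in NULLABLE_KEYS):
        return "gap"
    return "data"
```

Note that a 200 with `identity: null` lands in "gap", not "error", so it can be shown to the user as "nothing known yet" rather than retried.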
resolved_via · resolved_from — binding tier and match method
When crate resolves an identity it tells you two orthogonal things:
- resolved_via = the binding tier, i.e. how trustworthy the identity is:
  - 'discogs': canonical, Discogs-bound (verified).
  - 'cluster': observed/unverified. The identity came from the booking graph with no Discogs bind. Surface it flagged as unverified, never as canonical.
  - null: did not resolve.
- resolved_from = how you addressed it on /resolve: 'url' (a pasted link), 'name', or 'locator' (a foreign id). matched_on names the surface that matched; note explains a recognized-but-unresolved link (e.g. a Twitter URL crate recognizes but does not yet cross-reference).
A 64-hex cluster_id address always yields resolved_via: 'cluster' (observed tier) by design.
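A client could map the binding tier to a display label along these lines. The label strings and the function name are illustrative choices; only the three `resolved_via` values come from the text.

```python
from typing import Optional


def trust_label(resolved_via: Optional[str]) -> str:
    """Map the binding tier to a UI label (labels are hypothetical)."""
    if resolved_via == "discogs":
        return "verified"      # canonical, Discogs-bound
    if resolved_via == "cluster":
        return "unverified"    # observed tier: never present as canonical
    if resolved_via is None:
        return "unresolved"    # did not resolve
    raise ValueError("unexpected resolved_via: %r" % (resolved_via,))
```

Raising on an unknown tier, instead of defaulting to "verified", keeps a future tier from being shown as canonical by accident.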
the cube · cube_quadrant — the behavioral-signal model
The cube is crate’s behavioral model of a release: a 3-bit code (string, e.g. "101") placing it on three independent behavioral axes — who OWNS it (collector), who PLAYS it (DJ), who WRITES about it (critic). Each bit is 0/1, so the eight quadrants run "000" (“No signal”) through "111" (“Full intersection”): "100" = collector-only, "010" = DJ-only, "101" = collector + critic, and so on. The collector-vs-DJ split — who owns it vs who plays it — is the heart of the model. cube_quadrant: null means it isn’t yet classified (an honest gap). You meet cube_quadrant (with companion owner_count / dj_count / critic_count magnitudes + a link_to_cube explorer deep-link) on the master-grain result rows of GET /api/v2/search.
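The 3-bit code can be unpacked into named axes as sketched below. The bit order (collector, DJ, critic) follows the examples in the text; the axis key names and the function itself are assumptions made for illustration.

```python
from typing import Dict, Optional

# Bit order per the examples: "100" collector-only, "010" DJ-only.
AXES = ("collector", "dj", "critic")


def decode_quadrant(code: Optional[str]) -> Optional[Dict[str, bool]]:
    """Turn a cube_quadrant string into per-axis flags; None stays None."""
    if code is None:
        return None  # not yet classified (an honest gap)
    if len(code) != 3 or set(code) - {"0", "1"}:
        raise ValueError("cube_quadrant must be three 0/1 characters")
    return {axis: bit == "1" for axis, bit in zip(AXES, code)}
```

For example, `decode_quadrant("101")` yields collector and critic set with DJ unset, which matches the "collector + critic" reading above.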
tastemakers · breakouts — discovery surfaces
Two read-only discovery surfaces, served from offline-published snapshots (no DB checkout) and fail-soft: each returns a state of present / empty / degraded, where degraded is a 200 honest-gap (a read failure never 500s), and stale:true flags a snapshot older than 7 days.
- tastemakers (GET /api/v2/tastemakers, …/ones-to-watch): influential curators and the richest artist-grain analytics crate has: rank, own-tier, brokerage score, corroborating axes, lead-times, Bandcamp demand. ?limit= bounds each array (1..200).
- breakouts (GET /api/v2/breakouts): emerging artists on the rise (“ones to watch”): booking-momentum signal cross-validated against press. ?tier=breakout|rising and ?corroboration=corroborated|booking_ahead filter it; ?limit= is clamped to 200.
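A breakouts query could be assembled as follows. The filter values and the 200 cap come from the text; validating them client-side and the helper name are assumptions, since the server clamps `limit` on its own.

```python
from typing import Optional
from urllib.parse import urlencode

TIERS = {"breakout", "rising"}
CORROBORATIONS = {"corroborated", "booking_ahead"}
MAX_LIMIT = 200


def breakouts_path(tier: Optional[str] = None,
                   corroboration: Optional[str] = None,
                   limit: Optional[int] = None) -> str:
    """Build the GET /api/v2/breakouts path with optional filters."""
    params = {}
    if tier is not None:
        if tier not in TIERS:
            raise ValueError("tier must be breakout or rising")
        params["tier"] = tier
    if corroboration is not None:
        if corroboration not in CORROBORATIONS:
            raise ValueError("unknown corroboration filter")
        params["corroboration"] = corroboration
    if limit is not None:
        # Mirror the documented server-side clamp so logs show the real value.
        params["limit"] = min(int(limit), MAX_LIMIT)
    return "/api/v2/breakouts" + ("?" + urlencode(params) if params else "")
```

Whatever the query, the response still needs its `state` and `stale` fields checked before the rows are trusted.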
beacons — search-event telemetry (different credential)
Beacons are client-side telemetry about search behavior: POST /api/v2/search-events/observed (a result was served from cache) and …/refined (the user changed facets). They are not authenticated with your X-API-Key. Each search response issues a short-lived per-search JWT bound to one search_event_id; send it as Authorization: Bearer <token>, and the token must match the body’s search_event_id. Bodies are capped at 512 bytes and beacons are idempotent (a duplicate is a 204 no-op). Beacon 400s carry a Zod flattened details object (not the array shape of normal validation errors), so they use a distinct error schema.
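A beacon request might be prepared like this. Only `search_event_id`, the Bearer header, and the 512-byte cap come from the text; the JSON content type, the `extra` fields, and the helper name are assumptions.

```python
import json
from typing import Dict, Optional, Tuple

MAX_BEACON_BYTES = 512  # documented body cap


def build_beacon(search_event_id: str, token: str,
                 extra: Optional[dict] = None) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a search-event beacon (hypothetical helper)."""
    # The token is bound to this search_event_id, so the body must carry it.
    payload = {"search_event_id": search_event_id}
    payload.update(extra or {})  # any further fields are assumptions
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(body) > MAX_BEACON_BYTES:
        raise ValueError("beacon body exceeds 512 bytes")
    headers = {
        "Authorization": "Bearer " + token,   # not X-API-Key
        "Content-Type": "application/json",  # assumed, not stated in the source
    }
    return headers, body
```

Because a duplicate beacon is a 204 no-op, resending the same body after a timeout should be safe.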
sparse fieldsets — ?fields= (opt-out trim)
The artist dossier is default-rich: GET /api/v2/artist/{key} returns the whole dossier in one round-trip. If you want less, ?fields=identity,discography trims to the named top-level facets (the envelope is always present). An unknown field returns 400 invalid_fields with the exact valid set and a copy-pasteable example — so you never guess.
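Building a trimmed dossier request is mostly string assembly, as in this sketch. The helper name is hypothetical, and the facet names are not validated locally because the valid set is whatever the 400 invalid_fields response reports.

```python
from typing import Iterable, Optional
from urllib.parse import quote


def artist_path(key: str, fields: Optional[Iterable[str]] = None) -> str:
    """Build GET /api/v2/artist/{key}, optionally trimmed with ?fields=."""
    path = "/api/v2/artist/" + quote(key, safe="")
    names = list(fields or [])
    if names:
        # Comma-separated top-level facet names, as in ?fields=identity,discography.
        path += "?fields=" + ",".join(names)
    return path
```

Omitting `fields` keeps the default-rich behavior, so the same helper serves both the full and the trimmed call.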
Bandcamp in v2 — analytical dimensions, and the link-only posture
In v2 Bandcamp is an analytical dimension of the artist dossier, not a release surface. Two facets carry it, both keyed by the artist’s cluster_id and both signals/metrics (not per-release listings):
- bandcamp_emergence: purchase-backed demand signals (emergence class, demand lead/ratio, owner reach, wishlist demand, distinct releases, earliest-wished date).
- bandcamp_tastemaker: early-supporter quality scores (supporter-cohort size, aesthetic-quality scores, mean first-buyer earliness).
Per-release Bandcamp listings — and the bandcamp_item_id / track_url per-release fields — are not a v2 surface; they belong to the v1 Bandcamp endpoints. (The discography dimension is the Discogs release-group attachment, keyed by discogsMasterId — distinct from Bandcamp.)
crate is link-only everywhere it does carry links: every artwork item is a hotlink url with rehost:false — crate never fetches or re-hosts bytes (a Cover Art Archive URL is best-effort and may 404 if no cover exists), and crate never stores Bandcamp audio streams (tokenized, expiring, out of ToS bounds).
Opaque ids — the universal rule
cluster_id is always a string, always opaque. Round-trip it verbatim; do not parse, increment, numericize, or infer structure from it. (The same discipline applies to any id crate hands you.) This keeps your client correct across id-scheme changes.
Versioning
The API major version lives in the URL path (/api/v2). The spec’s info.version (currently 2.0.0) bumps on every spec change and is drift-guarded: the document is generated from code, so docs cannot drift from runtime behavior. Operations carry stable operationIds for codegen, and keyed 2xx responses declare X-RateLimit-Limit / X-RateLimit-Remaining / X-RateLimit-Reset headers (back off on 429 using retry_after_seconds / Retry-After). v2 is the cluster-first stable major; the frozen v1 predecessor remains available during a time-boxed, announced deprecation — the route-by-route migration map is at /docs/migration/v1-to-v2.
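Backing off on 429 could be handled with a small helper like the one below. Preferring the body's `retry_after_seconds` over the `Retry-After` header, and the 1-second fallback, are design choices made for this sketch, not documented behavior.

```python
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str],
                        body: dict, default: float = 1.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if no retry is due."""
    if status != 429:
        return None
    hinted = body.get("retry_after_seconds")
    if isinstance(hinted, (int, float)):
        return float(hinted)
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("retry-after")
    if raw is not None:
        try:
            return float(raw)
        except ValueError:
            pass  # Retry-After may also be an HTTP-date; not parsed here
    return default
```

A caller would sleep for the returned value and retry, ideally with a cap on the number of attempts.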
Authentication
ApiKeyAuth
Customer API key in ck_(live|test)_<32-base62> format
Security scheme type: apiKey
Header parameter name: X-API-Key
BeaconBearerAuth
Short-lived per-search beacon JWT (issued with the search response), sent as Authorization: Bearer <token>. Distinct from the X-API-Key customer key; the token is bound to a single search_event_id.
Security scheme type: http
Bearer format: JWT