Sources & Methodology

How every number on this board is produced — and why you can check it.

The cardinal rule

No citation, no entry. Every prediction links to a real, retrievable source — a tweet, a Nostr note, a podcast timestamp, an article — with the verbatim quote. If we can’t cite it, it isn’t here.

Where the data comes from

We pull public Bitcoin price calls from four kinds of source:

X — timelines via the logged-in browser GraphQL API, keyword-filtered to price calls. First-party (the account is the person).
Nostr — the outbox model (a person’s kind:10002 write relays → their notes), via NDK. Cited with njump.me.
Podcasts — Fountain transcripts where available; audio-only shows (Coin Stories, Pomp, Stephan Livera…) transcribed locally with Whisper. Cited with the episode + timestamp.
Web — articles and interviews, researched and quoted.

A language model extracts candidate predictions from the raw text; a second, adversarial pass verifies them; non-Bitcoin calls (a stock, gold, an index) are rejected.

How outcomes are decided

Each call is resolved by code, not opinion, against a daily Bitcoin price oracle using the touch rule: a price target counts if BTC traded at that level at any point before the deadline; a floor/ceiling counts if it held until the deadline. Direction/timing calls carry the resolver’s judgment with the price evidence shown, so you can overrule it.
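A minimal sketch of the touch rule, to make the resolution logic concrete. It is not the board's actual resolver: the `DailyBar` shape (daily low and high), the function names, and the choice to include both the day the call was made and the deadline day are all assumptions, and the prices are made-up numbers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyBar:
    """One day of oracle data (hypothetical shape: daily low and high in USD)."""
    day: date
    low: float
    high: float


def bars_in_window(bars, made, deadline):
    """Bars from the day the call was made through the deadline, inclusive (an assumption)."""
    window = [b for b in bars if made <= b.day <= deadline]
    if not window:
        raise ValueError("no price data in the call's window")
    return window


def target_touched(bars, target, made, deadline):
    """Touch rule: the target lies inside the price range covered during the window."""
    window = bars_in_window(bars, made, deadline)
    return min(b.low for b in window) <= target <= max(b.high for b in window)


def floor_held(bars, floor, made, deadline):
    """A floor holds if no daily low in the window went below it."""
    return all(b.low >= floor for b in bars_in_window(bars, made, deadline))


def ceiling_held(bars, ceiling, made, deadline):
    """A ceiling holds if no daily high in the window went above it."""
    return all(b.high <= ceiling for b in bars_in_window(bars, made, deadline))


# Illustrative data only.
bars = [
    DailyBar(date(2024, 1, 1), 42_000, 44_000),
    DailyBar(date(2024, 1, 2), 43_500, 46_000),
    DailyBar(date(2024, 1, 3), 41_000, 45_000),
]
made, deadline = date(2024, 1, 1), date(2024, 1, 3)
print(target_touched(bars, 45_500, made, deadline))  # True
print(target_touched(bars, 50_000, made, deadline))  # False
print(floor_held(bars, 40_000, made, deadline))      # True
print(ceiling_held(bars, 45_000, made, deadline))    # False
```

Comparing the target against the whole window's range, rather than each day's range separately, means a level that price jumped across between two daily bars still counts as touched.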

How it’s scored

Flat is the headline — the hit rate, every resolved call weighted equally. No confidence or boldness weighting, because people hide behind those.

Diff-adjusted uses the same outcomes, weighted by each call’s specificity (a precise, dated number scores high; “up only, eventually” scores ~0), so a good score can’t be farmed with vague calls. Specificity is derived from the cited claim, never self-reported.

The board is tiered by track-record depth (Proven 10+ / Established 5–9 / Building 1–4 / On deck) so people are compared like-for-like — a lucky 100%-on-2 sits with its peers, not atop the proven names.

The scoring algorithm, exactly

Nothing hidden. Both scores run over a person’s resolved calls (PENDING excluded); a PARTIAL counts as half a hit.

outcome = 1 (correct) | 0.5 (partial) | 0 (wrong)

FLAT = 100 · (correct + 0.5·partial) / resolved          — every call weighted 1
DIFF = 100 · Σ(specificity · outcome) / Σ(specificity)   — weighted by specificity
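
The two formulas above can be sketched in a few lines. The outcome labels, the `(outcome, specificity)` tuple layout, and returning `None` when there is nothing to divide by are assumptions made for illustration; the source does not say how an empty or all-zero-specificity record is displayed.

```python
# Outcome values as defined above; PENDING is deliberately absent so it is skipped.
OUTCOME_VALUE = {"CORRECT": 1.0, "PARTIAL": 0.5, "WRONG": 0.0}


def resolved_only(calls):
    """Keep only resolved calls. Each call is an (outcome, specificity) pair."""
    return [(o, s) for o, s in calls if o in OUTCOME_VALUE]


def flat_score(calls):
    """FLAT: every resolved call weighted 1."""
    resolved = resolved_only(calls)
    if not resolved:
        return None  # assumption: no resolved calls means no score
    return 100 * sum(OUTCOME_VALUE[o] for o, _ in resolved) / len(resolved)


def diff_score(calls):
    """DIFF: the same outcomes weighted by specificity."""
    resolved = resolved_only(calls)
    total_weight = sum(s for _, s in resolved)
    if total_weight == 0:
        return None  # assumption: only zero-specificity calls means no score
    return 100 * sum(s * OUTCOME_VALUE[o] for o, s in resolved) / total_weight


calls = [("CORRECT", 84), ("WRONG", 38), ("PARTIAL", 0), ("PENDING", 84)]
print(flat_score(calls))            # 50.0
print(round(diff_score(calls), 1))  # 68.9
```

In this example the vague partial hit (specificity 0) lifts the flat score but contributes nothing to the diff-adjusted one, while the precise correct call dominates it.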

Specificity (0–100) is derived from the cited claim — never self-reported — as the sum of three signals (capped at 100):

Horizon (made → deadline): tight <90d = 48 · near <1y = 32 · medium <2y = 18 · long = 8 · open-ended = 0
Date precision: specific-day = 28 · month-end = 18 · quarter-end = 12 · year-end = 6 · none = 0
Numeric threshold: +24 when a price number is given

So “$69k by Jul 5” (made Jan 1, so a near horizon of roughly six months) = 32 + 28 + 24 = 84; “above $100k in the next 5 years” = 8 + 6 + 24 = 38; “number go up” = 0.
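
The three signals can be reproduced with a short sketch. The point values come from the table above; the function names, the precision labels, reading “1y” as 365 days and “2y” as 730 days, and the example years are assumptions. Treating “in the next 5 years” as a year-end date follows the worked example above rather than any stated rule.

```python
from datetime import date
from typing import Optional

# Point values from the table above; the key names are hypothetical labels.
DATE_PRECISION_POINTS = {
    "day": 28,
    "month_end": 18,
    "quarter_end": 12,
    "year_end": 6,
    "none": 0,
}


def horizon_points(made: date, deadline: Optional[date]) -> int:
    """Points for the gap between when the call was made and its deadline."""
    if deadline is None:
        return 0  # open-ended
    days = (deadline - made).days
    if days < 90:
        return 48  # tight
    if days < 365:
        return 32  # near (assumes 1y = 365 days)
    if days < 730:
        return 18  # medium (assumes 2y = 730 days)
    return 8  # long


def specificity(made: date, deadline: Optional[date], precision: str, has_number: bool) -> int:
    """Sum of the three signals, capped at 100."""
    points = horizon_points(made, deadline) + DATE_PRECISION_POINTS[precision]
    if has_number:
        points += 24
    return min(100, points)


# "$69k by Jul 5", made Jan 1 (the year is arbitrary here)
print(specificity(date(2024, 1, 1), date(2024, 7, 5), "day", True))         # 84
# "above $100k in the next 5 years"
print(specificity(date(2024, 1, 1), date(2028, 12, 31), "year_end", True))  # 38
# "number go up"
print(specificity(date(2024, 1, 1), None, "none", False))                   # 0
```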

Tiers by resolved-call count: Proven 10+ · Established 5–9 · Building 1–4 · On deck 0. Within a tier, rank is by flat score (ties break toward more calls).
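
As an illustration of the tier and ranking rule, here is a small sketch. The `Person` record, its field names, and the sample entries are hypothetical; only the tier boundaries and the ordering rule (flat score first, more resolved calls on a tie) come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    resolved: int          # number of resolved calls
    flat: Optional[float]  # flat score, None when nothing has resolved yet


def tier(resolved: int) -> str:
    """Tier by resolved-call count."""
    if resolved >= 10:
        return "Proven"
    if resolved >= 5:
        return "Established"
    if resolved >= 1:
        return "Building"
    return "On deck"


def rank_within_tier(people):
    """Highest flat score first; ties break toward more resolved calls."""
    return sorted(people, key=lambda p: (-(p.flat or 0.0), -p.resolved))


# Made-up entries for illustration.
people = [
    Person("alice", 12, 58.3),
    Person("bob", 2, 100.0),
    Person("carol", 15, 58.3),
    Person("dave", 0, None),
]

board = {}
for p in people:
    board.setdefault(tier(p.resolved), []).append(p)

for name, members in board.items():
    print(name, [p.name for p in rank_within_tier(members)])
# Proven ['carol', 'alice']
# Building ['bob']
# On deck ['dave']
```

The perfect score on two calls stays in the Building tier, so it never outranks the two Proven entries despite their lower hit rate.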

A living scoreboard

Scores move as new calls are logged and open ones resolve. New people get a deep backfill; everyone’s recent calls are swept incrementally; a periodic deeper pass fills gaps. Nothing is ever overwritten — the raw record is append-only.

Currently tracking 232 people. Download all the raw data →