Sources & Methodology
How every number on this board is produced — and why you can check it.
The cardinal rule
No citation, no entry. Every prediction links to a real, retrievable source — a tweet, a Nostr note, a podcast timestamp, an article — with the verbatim quote. If we can’t cite it, it isn’t here.
Where the data comes from
We pull public Bitcoin price calls from four kinds of source:
- X — timelines via the logged-in browser GraphQL API, keyword-filtered to price calls. First-party (the account is the person).
- Nostr — the outbox model (a person’s kind:10002 write relays → their notes), via NDK. Cited with njump.me.
- Podcasts — Fountain transcripts where available; audio-only shows (Coin Stories, Pomp, Stephan Livera…) transcribed locally with Whisper. Cited with the episode + timestamp.
- Web — articles and interviews, researched and quoted.
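The Nostr outbox lookup above starts from a person's kind:10002 relay list. As a minimal sketch, assuming the event has already been fetched and decoded into a dict, the write relays can be picked out as follows. The function name and the example relay URLs are hypothetical. This is not the board's NDK code, only an illustration of the tag layout defined in NIP-65.

```python
def write_relays(relay_list_event: dict) -> list[str]:
    """Return the relay URLs that a kind:10002 event marks as writable."""
    if relay_list_event.get("kind") != 10002:
        raise ValueError("expected a kind:10002 relay list event")
    relays = []
    for tag in relay_list_event.get("tags", []):
        # Relay entries look like ["r", <url>] or ["r", <url>, "read" | "write"].
        if len(tag) < 2 or tag[0] != "r":
            continue
        marker = tag[2] if len(tag) > 2 else None
        # NIP-65: a missing marker means the relay is used for both read and write.
        if marker in (None, "write"):
            relays.append(tag[1])
    return relays


# Hypothetical event used only for illustration.
example_event = {
    "kind": 10002,
    "tags": [
        ["r", "wss://relay.example.com"],
        ["r", "wss://write.example.com", "write"],
        ["r", "wss://read.example.com", "read"],
    ],
}
print(write_relays(example_event))
```

A person's notes would then be requested from the returned relays rather than from a fixed global list.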
A language model extracts candidate predictions from the raw text; a second, adversarial pass verifies them; non-Bitcoin calls (a stock, gold, an index) are rejected.
How outcomes are decided
Each call is resolved by code, not opinion, against a daily Bitcoin price oracle using the touch rule: a price target counts if BTC traded at that price at any point before the deadline; a floor or ceiling counts if it held through the deadline. Direction and timing calls carry the resolver’s judgment, shown alongside the price evidence so that you can overrule it.
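The touch rule can be sketched in a few lines. This is an illustration under stated assumptions, not the board's resolver: it assumes the oracle exposes one low and one high per day, treats the window as inclusive of both the day the call was made and the deadline, and uses hypothetical names (`DailyCandle`, `target_touched`, `floor_held`).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyCandle:
    """One day of oracle data (assumed shape: daily low and high)."""
    day: date
    low: float
    high: float


def in_window(candles, made_on: date, deadline: date):
    """Yield the candles between the call date and the deadline, inclusive."""
    return (c for c in candles if made_on <= c.day <= deadline)


def target_touched(candles, target: float, made_on: date, deadline: date) -> bool:
    """A price target counts if any day's range contains it."""
    return any(c.low <= target <= c.high for c in in_window(candles, made_on, deadline))


def floor_held(candles, floor: float, made_on: date, deadline: date) -> bool:
    """A floor counts if the price never traded below it in the window."""
    return all(c.low >= floor for c in in_window(candles, made_on, deadline))


def ceiling_held(candles, ceiling: float, made_on: date, deadline: date) -> bool:
    """A ceiling counts if the price never traded above it in the window."""
    return all(c.high <= ceiling for c in in_window(candles, made_on, deadline))
```

The range test in `target_touched` is a simplification: if the price jumped clean over a target between two daily candles, a real resolver would need to decide whether that counts as a touch.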
How it’s scored
Flat is the headline — the hit rate, every resolved call weighted equally. No confidence or boldness weighting, because people hide behind those.
Diff-adjusted takes the same outcomes and weights each one by the call’s specificity (a precise, dated number scores high; “up only, eventually” scores ~0), so a good score can’t be farmed with vague calls. Specificity is derived from the cited claim, never self-reported.
The board is tiered by track-record depth (Proven 10+ / Established 5–9 / Building 1–4 / On deck) so people are compared like-for-like — a lucky 100%-on-2 sits with its peers, not atop the proven names.
The scoring algorithm, exactly
Nothing hidden. Both scores run over a person’s resolved calls (PENDING excluded); a PARTIAL counts as half a hit.
outcome = 1 (correct) | 0.5 (partial) | 0 (wrong)
FLAT = 100 · (correct + 0.5·partial) / resolved    (every call weighted 1)
DIFF = 100 · Σ(specificity · outcome) / Σ(specificity)    (weighted by specificity)
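The two formulas translate directly into code. The sketch below assumes each call is a `(status, specificity)` pair. The status labels `CORRECT` and `WRONG` are placeholders chosen for the example (only `PARTIAL` and `PENDING` are named on this page), and returning `None` when there is nothing to score is likewise an assumption.

```python
# Outcome values from the formula; anything else (e.g. PENDING) is excluded.
OUTCOME_VALUE = {"CORRECT": 1.0, "PARTIAL": 0.5, "WRONG": 0.0}


def resolved_only(calls):
    """Drop calls whose status is not a resolved outcome."""
    return [(status, spec) for status, spec in calls if status in OUTCOME_VALUE]


def flat_score(calls):
    """FLAT = 100 * (correct + 0.5 * partial) / resolved."""
    resolved = resolved_only(calls)
    if not resolved:
        return None
    return 100 * sum(OUTCOME_VALUE[s] for s, _ in resolved) / len(resolved)


def diff_score(calls):
    """DIFF = 100 * sum(specificity * outcome) / sum(specificity)."""
    resolved = resolved_only(calls)
    total_weight = sum(spec for _, spec in resolved)
    if total_weight == 0:
        return None  # only zero-specificity calls: nothing to weight by
    return 100 * sum(spec * OUTCOME_VALUE[s] for s, spec in resolved) / total_weight


# Made-up track record: one specific hit, one vague miss, one vague partial.
calls = [("CORRECT", 84), ("WRONG", 38), ("PARTIAL", 38), ("PENDING", 84)]
print(flat_score(calls))  # 50.0
print(diff_score(calls))  # 64.375
```

In this example the diff-adjusted score is higher than the flat score because the one clean hit was also the most specific call.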
Specificity (0–100) is derived from the cited claim — never self-reported — as the sum of three signals (capped at 100):
- Horizon (made → deadline): tight <90d = 48 · near <1y = 32 · medium <2y = 18 · long = 8 · open-ended = 0
- Date precision: specific-day = 28 · month-end = 18 · quarter-end = 12 · year-end = 6 · none = 0
- Numeric threshold: +24 when a price number is given
So “$69k by Jul 5” (made Jan 1) = 32 + 28 + 24 = 84; “above $100k in the next 5 years” = 8 + 6 + 24 = 38; “number go up” = 0.
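The three signals and the worked examples above can be reproduced with a short function. This is a sketch, assuming "<1y" and "<2y" mean fewer than 365 and 730 days and that the date-precision class has already been read off the cited claim. The function and parameter names are hypothetical, and the years in the usage lines are placeholders, since the examples above give none.

```python
from datetime import date
from typing import Optional

DATE_PRECISION_POINTS = {
    "specific-day": 28,
    "month-end": 18,
    "quarter-end": 12,
    "year-end": 6,
    "none": 0,
}


def horizon_points(made_on: date, deadline: Optional[date]) -> int:
    """Points for the gap between when the call was made and its deadline."""
    if deadline is None:
        return 0  # open-ended
    days = (deadline - made_on).days
    if days < 90:
        return 48  # tight
    if days < 365:
        return 32  # near (assumed boundary: 365 days)
    if days < 730:
        return 18  # medium (assumed boundary: 730 days)
    return 8  # long


def specificity(made_on: date, deadline: Optional[date],
                date_precision: str, has_price_number: bool) -> int:
    """Sum of horizon, date precision, and numeric threshold, capped at 100."""
    total = (horizon_points(made_on, deadline)
             + DATE_PRECISION_POINTS[date_precision]
             + (24 if has_price_number else 0))
    return min(total, 100)


# "$69k by Jul 5" (made Jan 1)
print(specificity(date(2024, 1, 1), date(2024, 7, 5), "specific-day", True))  # 84
# "above $100k in the next 5 years", scored as year-end precision
print(specificity(date(2024, 1, 1), date(2028, 12, 31), "year-end", True))    # 38
# "number go up"
print(specificity(date(2024, 1, 1), None, "none", False))                     # 0
```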
Tiers by resolved-call count: Proven 10+ · Established 5–9 · Building 1–4 · On deck 0. Within a tier, rank is by flat score (ties break toward more calls).
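Tier assignment and within-tier ordering are simple enough to show in full. The sketch below assumes each person is represented as a `(name, flat_score, resolved_count)` tuple; the tuple layout and function names are illustrative only.

```python
def tier(resolved_count: int) -> str:
    """Map a resolved-call count to its tier."""
    if resolved_count >= 10:
        return "Proven"
    if resolved_count >= 5:
        return "Established"
    if resolved_count >= 1:
        return "Building"
    return "On deck"


def rank_within_tier(people):
    """Sort by flat score, highest first; ties go to the person with more calls."""
    return sorted(people, key=lambda p: (-p[1], -p[2]))


# Made-up people, all in the Proven tier.
proven = [("a", 60.0, 10), ("b", 75.0, 11), ("c", 60.0, 14)]
print([name for name, _, _ in rank_within_tier(proven)])  # ['b', 'c', 'a']
```

Because ranking happens inside a tier, a person with two resolved calls and a 100% flat score is only ever compared with others in Building.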
A living scoreboard
Scores move as new calls are logged and open ones resolve. New people get a deep backfill; everyone’s recent calls are swept incrementally; a periodic deeper pass fills gaps. Nothing is ever overwritten — the raw record is append-only.
Currently tracking 232 people. Download all the raw data →