The composite score in one formula
Each component is normalized to [0, 100]. The weighted sum is bounded at 100. Three strategy modes change the weights:
Mode: Balanced (default)
55 / 20 / 15 / 10 — covers most use cases.
Mode: Momentum (chase the runners)
70 / 10 / 10 / 10 — weights recent moves heavily, less concerned with scarcity.
Mode: Long-Term Hold (grail premium)
30 / 40 / 15 / 15 — flips weights toward PSA pop scarcity, smooths over short-term noise.
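As a rough illustration, the weighted sum under each mode could look like the sketch below. The component order (momentum, scarcity, value, liquidity) is an assumption inferred from the mode descriptions and the component sections that follow. The names MODE_WEIGHTS and composite_score are hypothetical and are not taken from compute_score.

```python
# Assumed weight order: (momentum, scarcity, value, liquidity).
MODE_WEIGHTS = {
    "balanced": (0.55, 0.20, 0.15, 0.10),
    "momentum": (0.70, 0.10, 0.10, 0.10),
    "long_term_hold": (0.30, 0.40, 0.15, 0.15),
}


def composite_score(momentum, scarcity, value, liquidity, mode="balanced"):
    """Weighted sum of four components, each clamped to [0, 100]."""
    weights = MODE_WEIGHTS[mode]
    components = (momentum, scarcity, value, liquidity)
    clamped = [min(100.0, max(0.0, c)) for c in components]
    # Weights sum to 1, so the result stays within [0, 100].
    return sum(w * c for w, c in zip(weights, clamped))
```

Because every mode's weights sum to 1 and each component is clamped first, the result cannot exceed 100 regardless of mode.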
Component math
Momentum (0-100)
Recent-weighted (60% 7-day, 40% 30-day). Centered at 50 (flat = neutral). A 25% 30D run produces momentum ≈ 70.
Live delta source: until 5 days of price_history.jsonl data have accumulated, deltas are workbook-formula based and are suppressed in the public-facing ticker. Real 30D % activates after the 5-day mark.
Scarcity (0-100)
Logarithmic decay against PSA 10 population. Pop 100 → scarcity ≈ 68. Pop 10,000 → scarcity ≈ 36. Pop unknown → default 60.
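One curve consistent with both quoted data points is scarcity = 100 − 16 × log10(pop). The sketch below uses that fit. The actual coefficients in the scoring code are not stated here, so treat the constants, the clamping, and the handling of non-positive populations as assumptions.

```python
import math

DEFAULT_SCARCITY = 60.0  # used when the PSA 10 population is unknown


def scarcity_score(psa10_pop):
    """Log-decay scarcity fitted to the two quoted points (assumed constants)."""
    if psa10_pop is None or psa10_pop <= 0:
        # Assumption: a missing or non-positive population counts as unknown.
        return DEFAULT_SCARCITY
    raw = 100.0 - 16.0 * math.log10(psa10_pop)
    return min(100.0, max(0.0, raw))
```

Under this fit, each tenfold increase in population costs 16 points.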
Value (0-100)
Where target_buy = 0.80 × live_price. Cards trading below the 80% MV threshold score higher. Floored at 20 so chase cards never zero out.
Liquidity (0-100)
Lookup by tier:
- T1 Vintage Grails (Charizard 1st Ed, Pikachu Illustrator): 65
- T2 Modern Chase (Moonbreon, SIRs): 85
- T3 Mid-Modern: 55
- T4 Sealed/Specialty: 70
- T5 Modern Bulk: 45
- Unknown: 50
Signal thresholds
| Score | Signal | What it means |
|---|---|---|
| ≥ 75 | STRONG BUY | Algo says load up. Top-decile composite. |
| 60–74 | BUY | Positive — worth a position. |
| 40–59 | HOLD | Neutral. Existing positions OK, no new entries. |
| 25–39 | TRIM | Reduce exposure; rotate capital. |
| < 25 | SELL | Exit. Algo flags structural weakness. |
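The table maps onto a simple threshold ladder. The sketch below is illustrative only: the function name is hypothetical, and it assumes fractional scores fall into half-open bands (for example, 74.9 is still BUY).

```python
def signal_for(score):
    """Map a composite score to its signal label."""
    if score >= 75:
        return "STRONG BUY"
    if score >= 60:
        return "BUY"
    if score >= 40:
        return "HOLD"
    if score >= 25:
        return "TRIM"
    return "SELL"
```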
Data sources
1. PriceCharting (primary for graded slabs)
Per-grade aggregate prices (PSA 7/8/9/9.5/10). 3-6 month trailing window. This is the trusted benchmark — when PPT and PC disagree, PC wins.
2. PokemonPriceTracker (graded fallback + ungraded reference)
Per-grade eBay sold-comp medians (last ~30 days). More responsive to recent volatility but sensitive to thin-sample outliers. Used when PC has no data for the card, or paired with TCGPlayer market for raw NM.
3. TCGPlayer mp-search-api (live raw market)
No-auth public endpoint. Best signal for raw NM cards because it's the price Brandon's competitors actually see on TCGPlayer.
4. PokemonTCG.io (card universe + images)
20,000+ card catalog with TCGPlayer + Cardmarket price overlays. Powers the searchable Card Universe + card image enrichment.
5. eBay sold-listings scrape (when available)
Browse API integration pending. Direct scraping is blocked from data-center IPs (Akamai protection); GitHub Actions cron cannot reach eBay sold pages from its runners.
Sanity guards
Every price overlay passes through bidirectional sanity guards before reaching production:
- Lower: reject if live < 20% × workbook when workbook ≥ $100 (catches URL mismatches to cheap unrelated cards)
- Upper: reject if live > 10× workbook on sub-$1k cards (catches URL mismatches to grail-card auction records, e.g. matching a $5 Pikachu to a $300k trophy)
- Grade mapping: reject if PC grade_10 / grade_9 > 50× in the same payload (parser bug detector)
- Thin sample: reject PPT data with fewer than 2 sales when live differs >3× from workbook
- Match score floor: PriceCharting URL discovery requires a match score ≥ 6 (name + set + grade-marker tokens) before accepting a discovered URL
Rejected entries are visible in the Price Audit tab with their original price, source, and rejection reason. Users can manually override via the same tab.
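The price guards above could be sketched as a single check that returns a rejection reason. This is a sketch under stated assumptions, not the production code: the function and parameter names are hypothetical, "sub-$1k" is read as workbook < $1,000, "differs >3×" is read as either direction, and the match-score floor is left out because it applies to URL discovery rather than to prices.

```python
def check_overlay(live, workbook, pc_grade_10=None, pc_grade_9=None, ppt_sales=None):
    """Return a rejection reason, or None if the live price is accepted.

    Assumes workbook > 0. ppt_sales is only passed for PPT-sourced prices.
    """
    if workbook >= 100 and live < 0.20 * workbook:
        return "lower"
    if workbook < 1000 and live > 10 * workbook:
        return "upper"
    if pc_grade_10 is not None and pc_grade_9 and pc_grade_10 / pc_grade_9 > 50:
        return "grade_mapping"
    if ppt_sales is not None and ppt_sales < 2:
        # Assumption: "differs >3x" means more than 3x above or below.
        if live > 3 * workbook or live < workbook / 3:
            return "thin_sample"
    return None
```

Returning the reason rather than a boolean matches the Price Audit tab's need to display why an entry was rejected.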
What we explicitly don't model
- Tournament-meta tier shifts (qualitative; no clean signal)
- Anime/social-media virality (would require sentiment API; not built)
- Vault/auction sentiment (proprietary; not public-data)
- Reddit/Twitter buzz scores (data files exist; not wired into Score yet)
Flip Bot — TCGPlayer + eBay BIN arbitrage scanner
A separate engine on top of the quant model. Where the score answers "should I want this card?", the flip bot answers "is there an underpriced live listing I should grab right now?".
Core formula (per live listing)
Fair-value selection (variant-aware)
- Raw NM: pricecharting.ungraded → workbook price → live_price (in that order)
- Sealed: workbook hand-curated price (sealed products lack reliable third-party medians)
- Graded: not currently scanned by the bot — TCGP/eBay BIN graded markets are too thin to flip reliably
Confidence tiers
ROI magnitude is a tell — real arbs cluster 20–50%. Above 50% means either the listing is a steal or (more often) the fair-value reference is stale. The bot labels rather than auto-hides:
| ROI | Tier | What it means |
|---|---|---|
| 20–49% | OK | Real spread within historical norms. Trust the alert. |
| 50–99% | VERIFY | Big spread — eyeball before buying. Often legit but check the listing. |
| ≥ 100% | SUSPECT | Almost always bad fair-value data or a fake/scam listing. Don't blind-click. |
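In code, the tiering reduces to a few comparisons. The function name is hypothetical, and returning None below 20% is an assumption (the table does not label sub-20% spreads).

```python
def confidence_tier(roi_pct):
    """Label a listing by ROI percentage; None means below the 20% floor."""
    if roi_pct >= 100:
        return "SUSPECT"
    if roi_pct >= 50:
        return "VERIFY"
    if roi_pct >= 20:
        return "OK"
    return None
```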
False-positive guards
- Sub-30% floor: if buy_total < 0.30 × fair_value, reject (wrong-product fuzzy match)
- Locale gate: JP watchlist rows reject English products and vice-versa (English Moonbreon ≠ JP Moonbreon)
- Bundle exclusion: watchlist rows with variant "Set", "Bundle", "Mixed" excluded (can't compare to single-card listings)
- Title accessory blocklist: eBay listings whose titles contain "case", "display", "sleeve", "proxy", "custom", etc. rejected before pricing
- Dedupe: source-aware key (source:cardId:listingId) with 7-day TTL so reruns don't re-alert
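A minimal sketch of three of these guards plus the dedupe key follows. All names are hypothetical rather than taken from flip_bot.py, the blocklist holds only the terms quoted above (the real list is longer), the substring match is a simplification that would also hit words like "showcase", and the locale gate is omitted.

```python
import time

TTL_SECONDS = 7 * 24 * 3600
ACCESSORY_TERMS = ("case", "display", "sleeve", "proxy", "custom")  # partial list
EXCLUDED_VARIANTS = {"Set", "Bundle", "Mixed"}


def passes_guards(title, variant, buy_total, fair_value):
    """Apply the sub-30% floor, bundle exclusion, and title blocklist."""
    if buy_total < 0.30 * fair_value:
        return False  # likely a wrong-product fuzzy match
    if variant in EXCLUDED_VARIANTS:
        return False
    lowered = title.lower()
    return not any(term in lowered for term in ACCESSORY_TERMS)


def should_alert(seen, source, card_id, listing_id, now=None):
    """Return True once per listing within the TTL; `seen` maps key -> timestamp."""
    now = time.time() if now is None else now
    key = f"{source}:{card_id}:{listing_id}"
    last = seen.get(key)
    if last is not None and now - last < TTL_SECONDS:
        return False
    seen[key] = now
    return True
```

Including the source in the key means the same card listed on both TCGPlayer and eBay alerts once per channel rather than once overall.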
Two sourcing channels
TCGPlayer path: public mp-search-api.tcgplayer.com endpoint, ~3 minutes for the
full 320-card sweep. TCGP sellers are mostly informed dealers so spreads tend to be tight but frequent.
eBay BIN path: scrapes ebay.com/sch/i.html?LH_BIN=1 sorted price-ascending. eBay has casual
sellers who under-price 30–50% below sold median on the same card — bigger spreads but rarer hits. A circuit
breaker aborts the scan if Akamai flags the IP (typically 8 consecutive blocked responses), so TCGP results
remain valid even when eBay is throttled.
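The circuit breaker can be pictured as a counter of consecutive blocked responses. The class below is a hypothetical sketch, with the threshold of 8 taken from the description above. How flip_bot.py detects a blocked response is not covered here.

```python
class CircuitBreaker:
    """Trips after `threshold` consecutive blocked responses."""

    def __init__(self, threshold=8):
        self.threshold = threshold
        self.consecutive_blocked = 0

    def record(self, blocked):
        """Record one response; any successful response resets the streak."""
        self.consecutive_blocked = self.consecutive_blocked + 1 if blocked else 0
        return self.tripped

    @property
    def tripped(self):
        return self.consecutive_blocked >= self.threshold
```

A scan loop would call record() after each request and stop the eBay path once it returns True, leaving the TCGPlayer results untouched.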
Source code: flip_bot.py.
Run output (qualified + near-misses + thresholds) is published to flip_bot_results.json on every scan
and rendered live on the homepage and the 💸 Flip Bot tab.
Refresh cadence
- Every 4 hours: GitHub Actions cron runs the full scraper, regenerates live_prices.json, homepage_feed.json, cards.html, and the Market Brief.
- Every 60s: Marketing site auto-refetches /homepage_feed.json from Cloudflare Pages (falls back to GitHub raw).
- On click: "↻ Refresh Now" button on the homepage forces immediate refetch.
Open data, audit-ready
Every artifact mentioned here is in the public trading-desk repo. The scoring code is in streamlit_app.py (function compute_score). The scraper is live_prices.py. The sweep tools are sweep_pc_now.py and sweep_ppt_holdouts.py. Fork, validate, suggest changes.
Layer 2: Self-Learning + Forward Models
The composite score above is the static foundation. Layer 2 is the adaptive layer — models that re-tune themselves, surface forward-looking opportunities, and prove their own batting average.
🔮 Future Winners — 6-component predictive model
The composite score is backward-looking (it summarizes what the price IS). The Future Winners model is forward-looking (what the price IS LIKELY TO BECOME in 30-90 days).
Seed weights: 0.25 / 0.20 / 0.15 / 0.15 / 0.15 / 0.10 (re-tuned by the optimizer — see next).
- momentum_rising: magnitude + acceleration of 7-day vs 30-day price change
- scarcity: PSA 10 population inverse-log scaled, gem-rate penalty
- accessibility: liquidity proxy — under-$5K cards score higher (broader buyer pool)
- hidden_alpha: chase-variant premium (SIR / Full Art / Gold Star multipliers)
- tier_premium: T1 / T2 designation weight
- liquidity: raw NM availability on TCGP, eBay sold cadence
Output: future_winners.json. Picks visible on the homepage's 🔮 Future Winners section.
🧠 Self-Learning Engine — Pearson-correlation optimizer
Static weights are a guess. The Self-Learning Engine retunes them empirically using realized 7-day returns. The process:
- Every Future Winners emission logs its 6 component scores to fw_emissions.jsonl
- After 7 days, the emission's 7-day-forward realized return is computed from price_history.jsonl
- Pearson correlation is computed between each component's score and realized return
- New weights = 0.7 × old_weights + 0.3 × correlation-derived_weights (Bayesian-style smoothing)
- Weights clamped to [0.05, 0.40] per component, renormalized to sum to 1.0
- Updated fw_weights.json is consumed by the next Future Winners build
Maturity threshold: minimum 30 mature samples (≥7 days old) before any retrain runs. Below threshold, system falls back to seed weights. Initial retrain ~ 7-10 days after launch as emissions accumulate.
Outputs: fw_weights.json (current), fw_weights_history.jsonl (every update with sample count + correlations for auditability). Visible on the homepage 🧠 Self-Learning Engine card.
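The retrain step can be sketched as follows. Several details are assumptions: the mapping of seed weights to components follows the list order above, negative correlations are clipped to zero before being normalized into "correlation-derived" weights, and a zero-variance component is given a correlation of 0. The function names are hypothetical.

```python
import math

SEED_WEIGHTS = {  # assumed to follow the component list order
    "momentum_rising": 0.25, "scarcity": 0.20, "accessibility": 0.15,
    "hidden_alpha": 0.15, "tier_premium": 0.15, "liquidity": 0.10,
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def retune(old_weights, samples, returns, min_samples=30):
    """Blend old weights with correlation-derived weights, clamp, renormalize."""
    if len(samples) < min_samples:
        return dict(SEED_WEIGHTS)  # below maturity threshold: seed weights
    corr = {k: max(0.0, pearson([s[k] for s in samples], returns))
            for k in old_weights}
    total = sum(corr.values())
    if total == 0:
        return dict(old_weights)  # no usable signal: keep current weights
    blended = {k: 0.7 * old_weights[k] + 0.3 * corr[k] / total
               for k in old_weights}
    clamped = {k: min(0.40, max(0.05, w)) for k, w in blended.items()}
    norm = sum(clamped.values())
    return {k: w / norm for k, w in clamped.items()}
```

Note that renormalizing after clamping can nudge a weight slightly past 0.40. The sketch keeps the clamp-then-renormalize order described above rather than iterating until both constraints hold.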
💰 Arbitrage Scanner v2 — multi-path EV model
For every card, the scanner enumerates 4 profit paths and reports the one with highest expected value:
Path A · RAW → PSA 10 (grade-and-flip)
Probabilities by era: Modern (SwSh+) = 25/45/20/10 for PSA 10/9/8/≤7. Mid (XY-SM) = 15/40/25/20. Vintage is excluded from this path because PriceCharting "ungraded" prices reflect damaged copies, which distorts the math.
Path B · RAW → PSA 9 Floor
Conservative variant: PSA 10 hits treated as PSA 9 (locks in lower upside, lower variance). Same probability table.
Path C · PSA 9 → PSA 10 Cross-Grade
Buy a PSA 9 slab, crack it, resubmit. Probabilities: 20% upgrade, 60% stays at 9, 20% downgrade. Only viable when p10/p9 ≥ 3.0× and PSA 9 entry < $5,000.
Path D · RAW Retail Flip
Instant arbitrage: TCG-listed vs eBay-sold spread ≥ 15%. No grading required, ~7-day hold.
Ceiling: 350% ROI cap — anything above is treated as data artifact, not opportunity. Fees modeled: 13.25% eBay FVF, $25 PSA grading, $5 inbound ship, $10 outbound. Output: arbitrage_v2.json, top 100 by EV-profit.
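For Paths A and B, the expected-value arithmetic might look like the sketch below. The probability tables, fees, and ROI cap come from the text above. Everything else is an assumption: where each fee is applied (final-value fee on the expected sale, outbound shipping deducted from proceeds, grading and inbound shipping added to cost), the per-grade price dictionary supplied by the caller, and all names.

```python
GRADE_PROBS = {
    "modern": {"10": 0.25, "9": 0.45, "8": 0.20, "7_or_lower": 0.10},
    "mid": {"10": 0.15, "9": 0.40, "8": 0.25, "7_or_lower": 0.20},
}
EBAY_FVF = 0.1325
GRADING_FEE = 25.0
INBOUND_SHIP = 5.0
OUTBOUND_SHIP = 10.0
ROI_CAP = 3.50  # 350%


def grade_and_flip_ev(raw_cost, prices, era, floor_at_psa9=False):
    """Return (expected profit, ROI, within_cap) for grading a raw card.

    prices maps each grade bucket to an assumed sale price. With
    floor_at_psa9=True, PSA 10 outcomes are priced as PSA 9 (Path B).
    """
    expected_gross = 0.0
    for grade, prob in GRADE_PROBS[era].items():
        price = prices["9"] if (floor_at_psa9 and grade == "10") else prices[grade]
        expected_gross += prob * price
    net_proceeds = expected_gross * (1 - EBAY_FVF) - OUTBOUND_SHIP
    total_cost = raw_cost + GRADING_FEE + INBOUND_SHIP
    profit = net_proceeds - total_cost
    roi = profit / total_cost
    return profit, roi, roi <= ROI_CAP
```

With hypothetical prices of $400 / $150 / $80 / $40 and a $50 raw copy, the modern table gives an expected gross of $187.50 and a profit of roughly $72.66 under these fee assumptions.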
📊 Past Picks Scoreboard — model batting average
For every signal logged in signal_log.jsonl, compute realized return from emission-time price to current price:
Bucketed by signal type and age (<1d / 1-7d / 7-30d / 30d+). Win rate + avg return reported per bucket. Anything < 24h flagged as "pending" (too fresh to call). Visible in the 🧠 card as a strip beneath the weights.
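A sketch of the bucketing, using the standard return definition (current − emission) / emission. The record field names are hypothetical, not the signal_log.jsonl schema, and counting any positive return as a "win" for every signal type is a simplifying assumption.

```python
from collections import defaultdict


def realized_return(emission_price, current_price):
    return (current_price - emission_price) / emission_price


def age_bucket(age_days):
    if age_days < 1:
        return "<1d"
    if age_days < 7:
        return "1-7d"
    if age_days < 30:
        return "7-30d"
    return "30d+"


def scoreboard(signals):
    """Group signals by (signal, age bucket); report win rate and avg return."""
    buckets = defaultdict(list)
    for s in signals:
        ret = realized_return(s["emission_price"], s["current_price"])
        buckets[(s["signal"], age_bucket(s["age_days"]))].append(ret)
    return {
        key: {
            "n": len(rets),
            "win_rate": sum(r > 0 for r in rets) / len(rets),
            "avg_return": sum(rets) / len(rets),
            "pending": key[1] == "<1d",  # too fresh to call
        }
        for key, rets in buckets.items()
    }
```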
📒 Trading Journal — Brandon's real trades
Reads two workbook sheets:
- Realized P&L: closed trades (Sale ID, CardID, buy date, sell date, qty, total cost, gross sale, fees%, net proceeds, P&L, holding days)
- Inventory: open positions (Item ID, CardID, Acquired, Qty, Cost/Unit)
Computes portfolio ROI, win rate, average hold days, and the LT vs ST tax split (LT = held more than 365 days, which qualifies for the long-term capital gains rate). Output: trading_journal.json. Visible in the homepage 📒 Track Record section.
🔥 Set Heat Tracker — macro-level signal
Per-set composite. When a set's heat_score > 100, the whole set is structurally heating up — individual card moves tend to correlate. Useful for "is this an Evolving Skies pump or just one card?" decisions.
📦 Sealed Market Scanner
Separate market from singles. Buckets products into Booster Box, ETB, Booster Bundle, Premium Collection, Collection Box, Theme Deck, Other. Surfaces:
- Top BUYs (composite-score > 60 sealed products)
- Undervalued (live price more than 15% below workbook price)
- Cheapest / priciest per-pack (booster boxes apples-to-apples comparison)
🔍 Pipeline Health Self-Audit
Every cron cycle ends with pipeline_self_audit.py checking 19 output files for:
- Existence: file present on disk
- Min size: catches "wrote empty file" bugs
- JSON validity (where applicable)
- Freshness: modify-time and embedded generated_at within max_age_hours
Each file gets GREEN / YELLOW / RED. Overall pipeline health surfaces as a colored dot in the homepage sub-nav. Output: pipeline_health.json.
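A per-file check along these lines might look like the sketch below. It is not pipeline_self_audit.py itself: the function name and return shape are hypothetical, the rule that stale files are YELLOW while missing, undersized, or invalid files are RED is an assumption, and the embedded generated_at check is omitted for brevity.

```python
import json
import os
import time


def audit_file(path, min_bytes, max_age_hours, is_json=True, now=None):
    """Return (status, reason) for one output file."""
    now = time.time() if now is None else now
    if not os.path.exists(path):
        return "RED", "missing"
    if os.path.getsize(path) < min_bytes:
        return "RED", "too small"
    if is_json:
        try:
            with open(path, encoding="utf-8") as fh:
                json.load(fh)
        except (ValueError, OSError):
            # json.JSONDecodeError is a subclass of ValueError.
            return "RED", "invalid JSON"
    age_hours = (now - os.path.getmtime(path)) / 3600
    if age_hours > max_age_hours:
        return "YELLOW", "stale"  # assumed severity for stale files
    return "GREEN", "ok"
```

Running the cheapest checks first (existence, then size) avoids parsing files that are already known to be broken.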
Full automation cron pipeline
Every 4 hours, GitHub Actions runs: