Closed-loop intelligence: settled bets feed back into grading weights.
- Grade accuracy tracking per stat type (A/B/C/D hit rates)
- Signal accuracy tracking (which deltas predict outcomes)
- Kill condition effectiveness (hit_rate_with vs without)
- Conservative weight adjustment (20% cap, 50-pick minimum)
- 4 new DB tables: grade_accuracy, signal_accuracy, kill_condition_accuracy, weight_history
- Desk-tier endpoints: /api/model/accuracy, /api/model/insights
Spec complete, ready to build when Phase 3 deployment is stable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feature 4.1 — Model Learning Loop
Overview
The Model Learning Loop is a closed-loop system: every settled bet feeds back into the grading engine. It tracks which signals, kill conditions, and composite weights actually predict outcomes. Over time the model self-calibrates: grades get sharper, kill conditions are validated or deprecated, and the weight distribution shifts toward the signals that predict best.
Dependencies
- Feature 1.3 — Prop Analysis Engine (grader.js weights to tune)
- Feature 1.5 — Bet Submission (settled bets with outcomes)
- Feature 1.4 — Database Schema (outcomes, picks, performance tables)
The Loop
User scans parlay → grades assigned with current weights
↓
User places bet → logs in tracker
↓
Game plays out → user settles bet (won/lost/push)
↓
System records: grade predicted X, actual result was Y
↓
Accumulate enough data (50+ settled picks per signal)
↓
Recalculate signal accuracy → adjust grading weights
↓
Next scan uses improved weights
New Database Tables
grade_accuracy
Tracks accuracy of each grade level over time.
CREATE TABLE public.grade_accuracy (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    grade TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D')),
    stat_type TEXT NOT NULL,
    total_picks INT NOT NULL DEFAULT 0,
    hits INT NOT NULL DEFAULT 0,
    misses INT NOT NULL DEFAULT 0,
    pushes INT NOT NULL DEFAULT 0,
    hit_rate NUMERIC(5,2),
    expected_hit_rate NUMERIC(5,2),
    calculated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX idx_grade_accuracy_unique ON public.grade_accuracy(grade, stat_type);
signal_accuracy
Tracks how predictive each individual signal is.
CREATE TABLE public.signal_accuracy (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    signal_name TEXT NOT NULL,
    signal_value TEXT NOT NULL,
    stat_type TEXT NOT NULL,
    total_picks INT NOT NULL DEFAULT 0,
    hits INT NOT NULL DEFAULT 0,
    hit_rate NUMERIC(5,2),
    avg_edge_when_hit NUMERIC(5,2),
    avg_edge_when_miss NUMERIC(5,2),
    predictive_score NUMERIC(5,2),
    calculated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX idx_signal_accuracy_unique ON public.signal_accuracy(signal_name, signal_value, stat_type);
kill_condition_accuracy
Tracks whether kill conditions actually prevent bad bets.
CREATE TABLE public.kill_condition_accuracy (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    kill_condition TEXT NOT NULL,
    total_triggered INT NOT NULL DEFAULT 0,
    picks_with_condition INT NOT NULL DEFAULT 0,
    hits_with_condition INT NOT NULL DEFAULT 0,
    hit_rate_with NUMERIC(5,2),
    hit_rate_without NUMERIC(5,2),
    effectiveness NUMERIC(5,2),
    calculated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX idx_kill_accuracy_unique ON public.kill_condition_accuracy(kill_condition);
weight_history
Audit trail of weight changes over time.
CREATE TABLE public.weight_history (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    weight_set JSONB NOT NULL,
    reason TEXT NOT NULL,
    sample_size INT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
Signal Tracking
On every pick created (via parlay scan), store the individual signal values alongside the pick. This is already partially captured in picks.reasoning, but we need structured data for aggregation.
New column on picks table (migration 003):
ALTER TABLE public.picks ADD COLUMN signal_snapshot JSONB;
signal_snapshot stores:
{
"season_delta": 1.8,
"recent_delta": 2.3,
"situational_delta": 1.5,
"line_edge": 0.5,
"home_away_signal": "bullish",
"rest_signal": "neutral",
"vs_opponent_signal": "strong_bullish",
"kill_conditions": ["blowout_risk"],
"composite": 2.1,
"weights_version": 1
}
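A minimal sketch of how the snapshot above might be assembled at pick-creation time. The `buildSignalSnapshot` name and the camelCase `graderResult` fields are assumptions for illustration; only the snake_case snapshot keys come from this spec.

```javascript
// Hypothetical helper: builds the signal_snapshot JSONB payload for a new pick.
// The graderResult field names are assumed, not taken from grader.js.
function buildSignalSnapshot(graderResult, weightsVersion) {
  const numericFields = ['seasonDelta', 'recentDelta', 'situationalDelta', 'lineEdge', 'composite'];
  for (const field of numericFields) {
    if (!Number.isFinite(graderResult[field])) {
      throw new TypeError(`graderResult.${field} must be a finite number`);
    }
  }
  return {
    season_delta: graderResult.seasonDelta,
    recent_delta: graderResult.recentDelta,
    situational_delta: graderResult.situationalDelta,
    line_edge: graderResult.lineEdge,
    home_away_signal: graderResult.homeAwaySignal,
    rest_signal: graderResult.restSignal,
    vs_opponent_signal: graderResult.vsOpponentSignal,
    // Copy the array so later mutation of the grader result cannot alter the snapshot.
    kill_conditions: [...(graderResult.killConditions || [])],
    composite: graderResult.composite,
    weights_version: weightsVersion,
  };
}

const snapshot = buildSignalSnapshot(
  {
    seasonDelta: 1.8, recentDelta: 2.3, situationalDelta: 1.5, lineEdge: 0.5,
    homeAwaySignal: 'bullish', restSignal: 'neutral', vsOpponentSignal: 'strong_bullish',
    killConditions: ['blowout_risk'], composite: 2.1,
  },
  1
);
```

Validating the numeric fields up front keeps malformed snapshots out of the aggregation queries, which would otherwise have to filter them later.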
Accuracy Calculation Pipeline
Triggered periodically (on every 10th bet settlement, or daily cron).
Step 1: Grade Accuracy
For each grade (A, B, C, D) × stat_type:
- Count settled picks with that grade + stat
- Count hits (outcome = 'hit')
- Calculate hit_rate = hits / total * 100
- Compare to expected:
- A should hit ~70-80%
- B should hit ~55-65%
- C should hit ~45-55%
- D should hit ~30-40%
- If the actual hit_rate diverges from expected_hit_rate by more than 10 percentage points: flag for weight adjustment
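The grade accuracy step could be sketched as follows. The single expected value per grade is assumed to be the midpoint of each range above, and the `'miss'` and `'push'` outcome strings are assumptions (only `'hit'` is named in this spec).

```javascript
// Midpoints of the expected ranges; an assumption, since the spec lists ranges only.
const EXPECTED_HIT_RATE = { A: 75, B: 60, C: 50, D: 35 };
const DIVERGENCE_THRESHOLD = 10; // percentage points

const round2 = (n) => Math.round(n * 100) / 100;

// picks: [{ grade, stat_type, outcome }], outcome assumed 'hit' | 'miss' | 'push'.
function computeGradeAccuracy(picks) {
  const rows = new Map();
  for (const { grade, stat_type, outcome } of picks) {
    const key = `${grade}|${stat_type}`;
    if (!rows.has(key)) {
      rows.set(key, { grade, stat_type, total_picks: 0, hits: 0, misses: 0, pushes: 0 });
    }
    const row = rows.get(key);
    row.total_picks += 1;
    if (outcome === 'hit') row.hits += 1;
    else if (outcome === 'push') row.pushes += 1;
    else row.misses += 1;
  }
  return [...rows.values()].map((row) => {
    // Follows the formula literally: pushes stay in the denominator.
    const hit_rate = round2((row.hits * 100) / row.total_picks);
    const expected_hit_rate = EXPECTED_HIT_RATE[row.grade];
    const flagged = Math.abs(hit_rate - expected_hit_rate) > DIVERGENCE_THRESHOLD;
    return { ...row, hit_rate, expected_hit_rate, flagged };
  });
}

// Example: 10 grade-A and 10 grade-D picks on points.
const makePicks = (grade, outcome, n) =>
  Array.from({ length: n }, () => ({ grade, stat_type: 'points', outcome }));
const gradeRows = computeGradeAccuracy([
  ...makePicks('A', 'hit', 8), ...makePicks('A', 'miss', 2),
  ...makePicks('D', 'hit', 6), ...makePicks('D', 'miss', 4),
]);
```

In this example the A row lands at 80% (within 10 points of 75, not flagged) and the D row at 60% (25 points above 35, flagged). An empty input returns an empty array, which covers the zero-settled-picks case.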
Step 2: Signal Accuracy
For each signal (season_delta, recent_delta, etc.):
- Group picks by signal_value bucket:
- "strong_bullish" (delta >= 4)
- "bullish" (2-4)
- "lean" (0.5-2)
- "neutral" (< 0.5)
- bearish equivalents
- For each bucket:
- hit_rate = hits / total (a fraction from 0 to 1 in the formula below; the table and API report it as a percentage)
- avg_edge_when_hit vs avg_edge_when_miss
- predictive_score = (hit_rate - 0.5) * log(total)
(rewards accuracy AND sample size)
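A rough sketch of the bucketing and scoring described above. The bearish label names are assumed mirror images of the bullish ones, the handling of exact boundaries (2 counts as "bullish", 0.5 as "lean") is a choice made here, and the natural logarithm is an assumption because the spec does not state the base.

```javascript
// Bucket a numeric delta. Positive labels come from the spec; the bearish
// label names are assumptions ("bearish equivalents" is all the spec says).
function bucketDelta(delta) {
  const size = Math.abs(delta);
  if (size < 0.5) return 'neutral';
  const direction = delta > 0 ? 'bullish' : 'bearish';
  if (size >= 4) return `strong_${direction}`;
  if (size >= 2) return direction;
  return delta > 0 ? 'lean' : 'lean_bearish';
}

// hits / total is a fraction (0..1) here. Math.log is the natural log (assumed base).
function predictiveScore(hits, total) {
  if (total <= 0) return 0; // no data: treat the bucket as carrying no signal
  return (hits / total - 0.5) * Math.log(total);
}
```

A bucket that hits exactly half the time scores 0 regardless of sample size, and a bucket with a single pick also scores 0 because log(1) is 0, so tiny samples cannot dominate.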
Step 3: Kill Condition Effectiveness
For each kill condition:
- Picks where this condition triggered: hit_rate_with
- Picks where this condition did NOT trigger: hit_rate_without
- effectiveness = hit_rate_without - hit_rate_with
(positive = the kill condition correctly identifies bad bets)
- If effectiveness < 5 percentage points: kill condition may not be useful
- If effectiveness > 20 percentage points: kill condition is highly predictive
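The comparison above might look like the sketch below. The pick shape (`outcome`, `kill_conditions`) mirrors the signal_snapshot field names; the verdict labels and the `null` return for a one-sided sample are assumptions.

```javascript
// picks: [{ outcome, kill_conditions: string[] }]. Rates are percentages.
function killConditionEffectiveness(picks, condition) {
  const triggered = { total: 0, hits: 0 };
  const clean = { total: 0, hits: 0 };
  for (const pick of picks) {
    const side = pick.kill_conditions.includes(condition) ? triggered : clean;
    side.total += 1;
    if (pick.outcome === 'hit') side.hits += 1;
  }
  // Without picks on both sides there is nothing to compare.
  if (triggered.total === 0 || clean.total === 0) return null;

  const hit_rate_with = (triggered.hits * 100) / triggered.total;
  const hit_rate_without = (clean.hits * 100) / clean.total;
  // Positive means picks with the condition hit less often, as intended.
  const effectiveness = hit_rate_without - hit_rate_with;

  let verdict = 'moderate'; // label for the 5-20 band is an assumption
  if (effectiveness < 5) verdict = 'may_not_be_useful';
  else if (effectiveness > 20) verdict = 'highly_predictive';

  return { kill_condition: condition, hit_rate_with, hit_rate_without, effectiveness, verdict };
}

// Example: 3 of 10 triggered picks hit, 6 of 10 untriggered picks hit.
const mk = (outcome, conds, n) =>
  Array.from({ length: n }, () => ({ outcome, kill_conditions: conds }));
const blowout = killConditionEffectiveness(
  [
    ...mk('hit', ['blowout_risk'], 3), ...mk('miss', ['blowout_risk'], 7),
    ...mk('hit', [], 6), ...mk('miss', [], 4),
  ],
  'blowout_risk'
);
```

Here the condition scores an effectiveness of 30 points (60% without, 30% with), which falls in the "highly predictive" band.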
Step 4: Weight Adjustment
Current weights: season=1.0, recent=1.5, situational=1.2, lineEdge=0.8
If a signal's predictive_score is above the baseline:
→ Increase that weight
If a signal's predictive_score is below the baseline:
→ Decrease that weight
Adjustment formula:
new_weight = current_weight * (1 + (predictive_score - baseline) * learning_rate)
learning_rate = 0.1 (conservative, small steps)
baseline = reference predictive_score that leaves a weight unchanged (value not yet fixed in this spec)
Constraints:
- No weight can drop below 0.3 or exceed 3.0
- Total weight sum stays within 3.5-5.5 range
- Changes capped at 20% per adjustment cycle
- Minimum 50 picks per signal before adjusting
Store new weights in weight_history.
Apply new weights to grader.js (load from DB on startup, fallback to defaults).
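One way the formula and constraints could fit together is sketched below. The `baseline` is passed in as a parameter because the spec does not fix its value, and rejecting the whole set when the sum leaves the 3.5-5.5 range is an assumption (the spec states the constraint but not how to enforce it).

```javascript
const LEARNING_RATE = 0.1;
const MIN_WEIGHT = 0.3;
const MAX_WEIGHT = 3.0;
const MAX_CHANGE = 0.2; // 20% per adjustment cycle
const MIN_PICKS = 50;
const MIN_SUM = 3.5;
const MAX_SUM = 5.5;

const clamp = (value, lo, hi) => Math.min(Math.max(value, lo), hi);

// Adjust one weight: formula first, then the 20% cap, then the absolute bounds.
function adjustWeight(current, predictiveScore, baseline, sampleSize) {
  if (sampleSize < MIN_PICKS) return current; // not enough data yet
  const raw = current * (1 + (predictiveScore - baseline) * LEARNING_RATE);
  const capped = clamp(raw, current * (1 - MAX_CHANGE), current * (1 + MAX_CHANGE));
  return clamp(capped, MIN_WEIGHT, MAX_WEIGHT);
}

// weights: { season, recent, ... }; stats: { [name]: { predictiveScore, totalPicks } }.
// Assumption: if the adjusted sum leaves [3.5, 5.5], keep the current weights.
function adjustWeights(weights, stats, baseline) {
  const next = {};
  for (const [name, current] of Object.entries(weights)) {
    const s = stats[name];
    next[name] = s ? adjustWeight(current, s.predictiveScore, baseline, s.totalPicks) : current;
  }
  const sum = Object.values(next).reduce((a, b) => a + b, 0);
  return sum < MIN_SUM || sum > MAX_SUM ? { ...weights } : next;
}

const defaults = { season: 1.0, recent: 1.5, situational: 1.2, lineEdge: 0.8 };
const adjusted = adjustWeights(defaults, { recent: { predictiveScore: 1.0, totalPicks: 100 } }, 0);
```

With a baseline of 0 and a predictive_score of 1.0, the recent weight moves from 1.5 to about 1.65, a 10% step that stays under the 20% cap. Applying the cap before the absolute bounds means a weight near 0.3 or 3.0 is still pinned to the bound even when the capped step would cross it.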
Endpoints
GET /api/model/accuracy (auth required, Desk tier only)
Returns current model accuracy stats.
Response (200):
{
"grade_accuracy": [
{ "grade": "A", "stat_type": "points", "total": 120, "hit_rate": 72.5, "expected": 75.0 },
{ "grade": "B", "stat_type": "points", "total": 200, "hit_rate": 58.0, "expected": 60.0 }
],
"signal_accuracy": [
{ "signal": "recent_delta", "value": "bullish", "stat_type": "points", "hit_rate": 68.0, "predictive_score": 4.2 }
],
"kill_condition_effectiveness": [
{ "condition": "blowout_risk", "effectiveness": 22.5, "triggered": 45 },
{ "condition": "low_minutes", "effectiveness": 18.0, "triggered": 30 }
],
"current_weights": {
"season": 1.0, "recent": 1.5, "situational": 1.2, "lineEdge": 0.8,
"version": 3, "last_updated": "2026-04-15T00:00:00Z"
},
"total_settled_picks": 850,
"model_confidence": "high"
}
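Both model endpoints are restricted to the Desk tier. A framework-agnostic sketch of that check follows; the `user.tier` field and the lowercase tier strings are assumptions, and the 401 for unauthenticated callers is a conventional choice rather than something this spec states.

```javascript
// Hypothetical gate for /api/model/* routes. Returns the HTTP status to send.
// Assumes the auth layer attaches a user object with a `tier` string.
function authorizeModelAccess(user) {
  if (!user) return { status: 401, error: 'Authentication required' };
  if (user.tier !== 'desk') return { status: 403, error: 'Desk tier required' };
  return { status: 200 };
}
```

Keeping the decision in a pure function makes the "free/analyst get 403" case easy to unit test without spinning up the HTTP layer.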
GET /api/model/insights (auth required, Desk tier only)
Returns human-readable insights from the learning loop.
Response (200):
{
"insights": [
{
"type": "signal_outperforming",
"message": "Recent form (last 10 games) is the strongest predictor for points props. It outperforms season average by 12%.",
"action": "Recent form weight increased from 1.5 to 1.65."
},
{
"type": "kill_condition_validated",
"message": "blowout_risk is your most effective kill condition. Props in blowout games hit 15% less often.",
"action": "No change needed — working as designed."
},
{
"type": "grade_calibration",
"message": "Grade A picks on rebounds are hitting at 68% instead of expected 75%. Sample is small (40 picks) — monitoring.",
"action": "No weight change yet. Need 50+ picks to adjust."
}
],
"next_recalculation_at": "2026-04-20T00:00:00Z"
}
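As an illustration of how a grade_calibration insight like the one above could be derived from a grade accuracy row, here is a small sketch. The function name and the wording of the "eligible" action are assumptions; the input fields follow the /api/model/accuracy response.

```javascript
const MIN_PICKS_TO_ADJUST = 50;

// row mirrors one grade_accuracy entry: { grade, stat_type, total, hit_rate, expected }.
// Message wording is illustrative, not a fixed template from the spec.
function gradeCalibrationInsight(row) {
  const enoughData = row.total >= MIN_PICKS_TO_ADJUST;
  const sampleNote = enoughData ? '' : ` Sample is small (${row.total} picks).`;
  return {
    type: 'grade_calibration',
    message:
      `Grade ${row.grade} picks on ${row.stat_type} are hitting at ${row.hit_rate}% ` +
      `instead of expected ${row.expected}%.${sampleNote}`,
    action: enoughData
      ? 'Eligible for weight adjustment on the next cycle.'
      : `No weight change yet. Need ${MIN_PICKS_TO_ADJUST}+ picks to adjust.`,
  };
}

const insight = gradeCalibrationInsight({
  grade: 'A', stat_type: 'rebounds', total: 40, hit_rate: 68, expected: 75,
});
```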
Service Architecture
src/
├── services/
│ ├── modelLearningService.js # Orchestrator: triggers accuracy calc + weight adjustment
│ ├── accuracyCalculator.js # Grade, signal, kill condition accuracy from settled data
│ └── weightAdjuster.js # Computes new weights, stores history, applies to grader
├── routes/
│ └── model.js # GET /api/model/accuracy, GET /api/model/insights
Integration Points
On bet settlement (betService.js):
After settling a bet:
1. Check the total number of settled picks across all users
2. Every 10th settlement: trigger modelLearningService.recalculate()
3. Both the count and the recalculation are global (not per-user): all users' data feeds the model
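The settlement hook could be sketched like this, counting settlements globally as step 3 describes. The `countSettledPicks` and `recalculate` callbacks are hypothetical stand-ins for the database query and for modelLearningService.recalculate().

```javascript
const RECALC_EVERY = 10;

// True on the 10th, 20th, 30th... settlement overall.
function shouldRecalculate(globalSettledCount) {
  return globalSettledCount > 0 && globalSettledCount % RECALC_EVERY === 0;
}

// Hypothetical hook called by betService.js after a bet is settled.
// Both callbacks are injected so the trigger logic stays testable.
async function onBetSettled(countSettledPicks, recalculate) {
  const total = await countSettledPicks();
  if (!shouldRecalculate(total)) return false;
  await recalculate();
  return true;
}
```

Because the count is read after the settlement is written, two concurrent settlements could both observe the same multiple of 10; recalculation should be safe to run twice, or the trigger should be guarded by a lock.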
On pick creation (parlayScanService.js):
When creating a pick:
1. Attach signal_snapshot JSONB with all signal values + current weights version
2. This enables retrospective analysis of which weights were active when the pick was made
On grader startup (grader.js):
On first call:
1. Load latest weight_set from weight_history table
2. If no weights in DB: use hardcoded defaults
3. Cache weights in memory, refresh every hour
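A sketch of the load, fallback, and hourly refresh behaviour described above. `loadLatestRow` is a hypothetical function that returns the newest weight_history row (or null); keeping the previous weights when the database read fails is an added assumption.

```javascript
const DEFAULT_WEIGHTS = { season: 1.0, recent: 1.5, situational: 1.2, lineEdge: 0.8 };
const REFRESH_MS = 60 * 60 * 1000; // refresh every hour

// Use the stored weight_set when a row exists, otherwise the hardcoded defaults.
function resolveWeights(row) {
  return row && row.weight_set ? row.weight_set : DEFAULT_WEIGHTS;
}

function isStale(loadedAt, now) {
  return now - loadedAt >= REFRESH_MS;
}

// loadLatestRow: async () => latest weight_history row or null (hypothetical).
function createWeightCache(loadLatestRow, now = Date.now) {
  let cached = null;
  let loadedAt = 0;
  return async function getWeights() {
    if (cached && !isStale(loadedAt, now())) return cached;
    try {
      cached = resolveWeights(await loadLatestRow());
    } catch (err) {
      // Assumption: on a DB error keep the last known weights, else the defaults.
      cached = cached || DEFAULT_WEIGHTS;
    }
    loadedAt = now();
    return cached;
  };
}
```

Injecting `now` makes the hourly refresh testable without waiting, and loading lazily on the first `getWeights()` call matches the "On first call" wording above.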
Acceptance Criteria
- Every settled pick is reflected in the grade_accuracy, signal_accuracy, and kill_condition_accuracy tables after the next recalculation
- Grade accuracy tracks hit rate per grade per stat type
- Signal accuracy tracks predictive score per signal per stat type
- Kill condition effectiveness measures hit_rate_with vs hit_rate_without
- Weight adjustment runs after every 10th settlement (global)
- Weight changes are capped at 20% per cycle, bounded 0.3-3.0
- Weight history is stored with reason and sample size
- GET /api/model/accuracy returns current stats (Desk tier only)
- GET /api/model/insights returns human-readable insights
- signal_snapshot JSONB attached to every new pick
- Grader loads weights from DB on startup, falls back to defaults
- Minimum 50 picks per signal before weight adjustment triggers
Test Plan
Unit Tests (accuracyCalculator.js)
- Correctly computes hit rate from settled picks
- Groups by grade + stat_type
- Groups by signal + value + stat_type
- Kill condition effectiveness: difference between with/without
- Handles zero settled picks gracefully
Unit Tests (weightAdjuster.js)
- Increases weight when predictive_score exceeds baseline
- Decreases weight when predictive_score below baseline
- Caps changes at 20% per cycle
- Enforces min/max bounds (0.3-3.0)
- Stores weight history with correct reason
- Does not adjust with < 50 picks per signal
Integration Tests
- Full loop: create pick with signal_snapshot → settle → accuracy updated → weights adjusted
- GET /api/model/accuracy returns correct stats
- GET /api/model/insights generates relevant insights
- Desk tier only: free/analyst get 403
- Weight changes reflected in next grading call
Open Questions
- Global vs per-user model: This spec uses a global model (all users' data combined). Per-user models would require significantly more data. Global is correct for MVP — the model learns from collective intelligence. Per-user customization can layer on top later.
- Cold start: With < 50 settled picks, no adjustments fire. The hardcoded defaults carry the system until enough data accumulates. This is intentional — bad adjustments on small samples would be worse than no adjustments.