feat: Feature 1.2 (NBA stats FastAPI service) + Feature 1.4 (database schema)

Feature 1.2: Python FastAPI microservice wrapping nba_api
- GET /stats/season-avg, /stats/last-n, /stats/splits, /players/search
- Redis caching (24hr/1hr/6hr/7day), 0.6s rate limiting, PRA derived stat
- 27 Python tests passing

Feature 1.4: Complete Supabase database schema
- 6 tables: users, picks, scan_sessions, bets, outcomes, performance
- RLS enabled on all tables with auth.uid() policies
- 3 triggers: auto-create user, updated_at, scan count reset
- 37 schema validation tests passing
- Migration SQL ready, pending manual apply (WSL2 DNS blocker)

Total: 92 tests (65 Node.js + 27 Python), all passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kev
2026-03-21 10:58:58 -04:00
parent 00409fd6cd
commit 3da1b4242c
27 changed files with 2360 additions and 16 deletions
+288
@@ -0,0 +1,288 @@
# Feature 1.2 — NBA_API Stats Wrapper (FastAPI Microservice)
## Overview
A Python FastAPI microservice wrapping the `nba_api` library. Provides player stats (season averages, last N games, situational splits) via internal HTTP endpoints. Caches aggressively in Redis. Called by the Node.js backend — not exposed to the public internet.
## Dependencies
- None (builds parallel with Feature 1.1)
- Downstream consumers: Feature 1.3 (Prop Analysis Engine) needs season averages + recent game data to grade props
## Tech
- **Language:** Python 3.11+
- **Framework:** FastAPI + uvicorn
- **Data source:** `nba_api` (free, no key required)
- **Cache:** Redis (shared instance with Node backend)
- **Port:** 8000 (internal only)
## Directory Structure
```
nba-service/
├── app/
│   ├── main.py              # FastAPI app, routes
│   ├── services/
│   │   └── stats.py         # nba_api wrapper functions
│   ├── utils/
│   │   ├── cache.py         # Redis caching helpers
│   │   └── player_map.py    # Player name -> nba_api player_id resolution
│   └── config.py            # Settings (Redis URL, cache TTLs)
├── tests/
│   ├── test_stats.py        # Unit tests for stats service
│   └── test_routes.py       # Integration tests for endpoints
├── requirements.txt
└── README.md
```
## Endpoints
### GET /stats/season-avg
Returns a player's season averages for the current season.
**Query params:**
| Param | Type | Required | Description |
|-----------|--------|----------|--------------------------------------------|
| player | string | yes | Player full name (e.g., "Nikola Jokic") |
| stat_type | string | no | Filter to specific stat (default: all) |
| season | string | no | NBA season (default: current, e.g., "2025-26") |
**Response (200):**
```json
{
"player": "Nikola Jokic",
"player_id": 203999,
"team": "DEN",
"season": "2025-26",
"source": "cache | live",
"stats": {
"points": 26.3,
"rebounds": 12.4,
"assists": 9.1,
"threes": 1.1,
"blocks": 0.7,
"steals": 1.4,
"pra": 47.8,
"turnovers": 3.2,
"games_played": 65,
"minutes": 34.2
}
}
```
### GET /stats/last-n
Returns a player's averages over their last N games.
**Query params:**
| Param | Type | Required | Default | Description |
|-----------|--------|----------|---------|------------------------------------|
| player | string | yes | | Player full name |
| n | int | no | 10 | Number of recent games (max: 30) |
| stat_type | string | no | all | Filter to specific stat |
**Response (200):**
```json
{
"player": "Nikola Jokic",
"player_id": 203999,
"team": "DEN",
"last_n": 10,
"source": "cache | live",
"stats": {
"points": 28.1,
"rebounds": 13.0,
"assists": 10.2,
"threes": 1.3,
"blocks": 0.8,
"steals": 1.5,
"pra": 51.3,
"turnovers": 2.9,
"games_played": 10,
"minutes": 35.1
}
}
```
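A minimal sketch of how the last-N averages could be derived from game-log rows. It assumes each row is a dict keyed by nba_api column names and that rows arrive newest first; both are assumptions (the game-log structure is still listed under Open Questions), and `average_last_n` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical subset of the column mapping, enough to illustrate the idea.
COLUMN_TO_STAT = {"PTS": "points", "REB": "rebounds", "AST": "assists"}


def average_last_n(game_logs: list, n: int = 10, max_n: int = 30) -> dict:
    """Average each mapped stat over the most recent n games.

    Assumes game_logs is ordered newest first (not verified against
    real nba_api output).
    """
    if not 1 <= n <= max_n:
        raise ValueError(f"n must be between 1 and {max_n}")
    recent = game_logs[:n]
    if not recent:
        # A player with no games yields no averages.
        return {"games_played": 0}
    stats = {
        name: round(mean(game[col] for game in recent), 1)
        for col, name in COLUMN_TO_STAT.items()
    }
    # PRA is summed per game first, then averaged across games.
    stats["pra"] = round(mean(g["PTS"] + g["REB"] + g["AST"] for g in recent), 1)
    stats["games_played"] = len(recent)
    return stats
```

`games_played` reports how many games were actually averaged, so a player with fewer than `n` games returns a smaller count rather than an error.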
### GET /stats/splits
Returns situational splits for a player.
**Query params:**
| Param | Type | Required | Default | Description |
|------------|--------|----------|---------|------------------------------------------|
| player | string | yes | | Player full name |
| stat_type | string | yes | | Stat to split (points, rebounds, etc.) |
| split_type | string | yes | | One of: home_away, rest_days, vs_team |
| opponent | string | no | | Required when split_type=vs_team (e.g., "LAL") |
**Response (200) — home_away split:**
```json
{
"player": "Nikola Jokic",
"player_id": 203999,
"team": "DEN",
"stat_type": "points",
"split_type": "home_away",
"source": "cache | live",
"splits": {
"home": { "avg": 27.8, "games": 33 },
"away": { "avg": 24.9, "games": 32 }
}
}
```
**Response (200) — rest_days split (back-to-back detection):**
```json
{
"player": "Nikola Jokic",
"stat_type": "points",
"split_type": "rest_days",
"source": "cache | live",
"splits": {
"b2b": { "avg": 23.1, "games": 8 },
"1_day_rest": { "avg": 26.5, "games": 40 },
"2_plus_days_rest": { "avg": 28.2, "games": 17 }
}
}
```
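One way the three `rest_days` buckets could be assigned, sketched under assumptions: game dates are calendar dates in the arena's local schedule, a back-to-back means the previous game was the day before, and a season opener (no previous game) counts as fully rested. The function names are hypothetical.

```python
from datetime import date
from typing import Optional


def rest_bucket(game_date: date, previous_game_date: Optional[date]) -> str:
    """Classify a game by days of rest since the player's previous game."""
    if previous_game_date is None:
        # Assumption: the first game in the list counts as fully rested.
        return "2_plus_days_rest"
    days_between = (game_date - previous_game_date).days
    if days_between <= 1:
        return "b2b"  # played on consecutive calendar days
    if days_between == 2:
        return "1_day_rest"  # exactly one off day in between
    return "2_plus_days_rest"


def bucket_games(game_dates: list) -> dict:
    """Count games per rest bucket for a list of game dates in any order."""
    counts = {"b2b": 0, "1_day_rest": 0, "2_plus_days_rest": 0}
    previous = None
    for current in sorted(game_dates):
        counts[rest_bucket(current, previous)] += 1
        previous = current
    return counts
```

Averaging a stat per bucket then only requires grouping the game-log rows by the label returned from `rest_bucket`.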
**Response (200) — vs_team split:**
```json
{
"player": "Nikola Jokic",
"stat_type": "points",
"split_type": "vs_team",
"opponent": "LAL",
"source": "cache | live",
"splits": {
"vs_opponent": { "avg": 30.5, "games": 3 },
"vs_all_others": { "avg": 25.8, "games": 62 }
}
}
```
### GET /players/search
Resolves player name to nba_api player ID. Used internally.
**Query params:**
| Param | Type | Required | Description |
|--------|--------|----------|----------------------|
| name | string | yes | Player name (partial match OK) |
**Response (200):**
```json
{
"results": [
{ "player_id": 203999, "full_name": "Nikola Jokic", "team": "DEN", "is_active": true }
]
}
```
### GET /health
Health check for the microservice.
**Response (200):**
```json
{ "status": "ok", "cache": "connected" }
```
### Error Responses
| Status | When | Body |
|--------|----------------------------------------|-----------------------------------------------|
| 400 | Missing required param or invalid value | `{ "error": "player is required" }` |
| 404 | Player not found in nba_api | `{ "error": "Player not found: Xyz" }` |
| 503 | nba_api unreachable or rate limited | `{ "error": "NBA stats service unavailable" }`|
## Data Shape (Internal)
```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    player: str
    player_id: int
    team: str          # 3-letter abbreviation
    season: str        # e.g., "2025-26"
    stats: dict        # { "points": 26.3, "rebounds": 12.4, ... }
    games_played: int
    minutes: float
```
## Stat Mapping
Map nba_api column names to our internal stat names:
| nba_api Column | Internal stat_type |
|----------------|--------------------|
| PTS | points |
| REB | rebounds |
| AST | assists |
| FG3M | threes |
| BLK | blocks |
| STL | steals |
| TOV | turnovers |
| (PTS+REB+AST) | pra (computed) |
| MIN | minutes |
| GP | games_played |
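The mapping table can be expressed as a small lookup plus the computed PRA field. This is a sketch only: `map_stats` is a hypothetical helper, and the input is assumed to be a plain dict keyed by nba_api column names.

```python
# nba_api column name -> internal stat name, taken from the table above.
STAT_COLUMNS = {
    "PTS": "points",
    "REB": "rebounds",
    "AST": "assists",
    "FG3M": "threes",
    "BLK": "blocks",
    "STL": "steals",
    "TOV": "turnovers",
    "MIN": "minutes",
    "GP": "games_played",
}


def map_stats(row: dict) -> dict:
    """Translate an nba_api-style row into internal stat names and add PRA."""
    stats = {internal: row[col] for col, internal in STAT_COLUMNS.items() if col in row}
    if all(key in stats for key in ("points", "rebounds", "assists")):
        # Round to one decimal so float noise does not leak into the response.
        stats["pra"] = round(stats["points"] + stats["rebounds"] + stats["assists"], 1)
    return stats
```

Columns missing from the input are skipped, and `pra` is only emitted when all three of its components are present.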
## Caching Strategy
- **Store:** Redis (same instance as Node backend)
- **Key patterns:**
- Season avg: `nba:season:{player_id}:{season}` — TTL: 24 hours
- Last N: `nba:last:{player_id}:{n}` — TTL: 1 hour
- Splits: `nba:splits:{player_id}:{stat_type}:{split_type}` — TTL: 6 hours
- Splits with `split_type=vs_team`: `nba:splits:{player_id}:{stat_type}:vs_team:{opponent}` (same 6-hour TTL; the opponent is part of the key so results for different opponents do not overwrite each other)
- Player search: `nba:player:{name_normalized}` — TTL: 7 days
- **On cache hit:** Return with `"source": "cache"`
- **On cache miss:** Fetch from nba_api, store, return with `"source": "live"`
- **On nba_api failure:** Return stale cache if available, 503 if not
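The hit, miss, and failure rules above can be sketched with an in-memory stand-in for Redis. One point the sketch makes explicit: Redis deletes a key once its TTL expires, so serving stale data needs the old value kept somewhere past the TTL (for example a second, longer-lived key). Here expired entries are simply retained. `StaleFallbackCache`, `UpstreamUnavailable`, and `get_with_cache` are hypothetical names, not part of the service.

```python
import time
from typing import Any, Callable, Optional


class StaleFallbackCache:
    """In-memory stand-in for Redis that keeps expired entries for fallback."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_fresh(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry and entry[0] > self._clock():
            return entry[1]
        return None

    def get_stale(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        return entry[1] if entry else None

    def set(self, key: str, value: Any, ttl_seconds: int) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


class UpstreamUnavailable(Exception):
    """Raised when neither the upstream nor the cache can answer (maps to 503)."""


def get_with_cache(cache: StaleFallbackCache, key: str, ttl_seconds: int,
                   fetch: Callable[[], Any]):
    """Return (data, source) following the hit / miss / failure rules."""
    cached = cache.get_fresh(key)
    if cached is not None:
        return cached, "cache"
    try:
        data = fetch()
    except Exception:
        stale = cache.get_stale(key)
        if stale is None:
            raise UpstreamUnavailable(key)
        return stale, "cache"
    cache.set(key, data, ttl_seconds)
    return data, "live"
```

A route handler would catch `UpstreamUnavailable` and translate it into the 503 body from the Error Responses table.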
## Player Name Resolution
- nba_api requires player_id for most endpoints
- Use `nba_api.stats.static.players.find_players_by_full_name()` for lookup
- Cache player_id mappings for 7 days (player IDs are stable; the cached `team` value can lag a trade by up to 7 days)
- Support partial matching: "Jokic" should resolve to "Nikola Jokic"
- If multiple matches, return all and let caller disambiguate
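A sketch of the normalization and partial-match behavior described above, run against a plain list of player dicts rather than nba_api itself. `normalize_name` and `search_players` are hypothetical helpers; the accent stripping is an added assumption so that "Jokic" matches "Jokić".

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace (usable in cache keys)."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def search_players(query: str, players: list) -> list:
    """Return every player whose normalized full name contains the query."""
    needle = normalize_name(query)
    if not needle:
        return []
    # All matches are returned; the caller disambiguates when there are several.
    return [p for p in players if needle in normalize_name(p["full_name"])]
```

The same normalized string can fill the `{name_normalized}` slot of the player-search cache key, so "Jokic", "jokic", and " JOKIC " share one entry.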
## nba_api Rate Limiting
- nba_api hits NBA.com endpoints which are rate-limited (no official docs)
- Add a 0.6s delay between nba_api calls to avoid getting blocked
- If a request fails with connection error, retry once after 2s
- If retry fails, serve from cache or return 503
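The delay and single-retry rules could look roughly like this. It is a single-threaded sketch with injectable `clock` and `sleep` so it can be tested without waiting; `Throttle` and `call_with_retry` are hypothetical names, and an async service would need a non-blocking sleep instead of `time.sleep`.

```python
import time
from typing import Any, Callable, Optional


class Throttle:
    """Enforce a minimum gap between consecutive upstream calls."""

    def __init__(self, min_interval: float = 0.6,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()


def call_with_retry(fetch: Callable[[], Any], throttle: Throttle,
                    retry_delay: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call fetch once, and once more after retry_delay on a connection error."""
    throttle.wait()
    try:
        return fetch()
    except ConnectionError:
        sleep(retry_delay)
        throttle.wait()
        return fetch()  # a second failure propagates to the caller
```

When the second attempt also raises, the caller is expected to fall back to the cache or return 503, as the last bullet describes.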
## Acceptance Criteria
1. `GET /stats/season-avg?player=Nikola Jokic` returns correct season averages
2. `GET /stats/last-n?player=Nikola Jokic&n=10` returns last 10 game averages
3. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=home_away` returns home/away split
4. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=rest_days` returns B2B/rest splits
5. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=vs_team&opponent=LAL` returns vs-team split
6. `GET /players/search?name=Jokic` resolves to correct player with ID and team
7. Season averages are cached for 24 hours; subsequent calls return from cache
8. Last N averages are cached for 1 hour
9. Player search results are cached for 7 days
10. If nba_api is unreachable, stale cache is returned; if no cache, 503
11. All timestamps and dates use UTC
12. `GET /health` returns status and cache connectivity
## Test Plan
### Unit Tests (stats.py)
- Maps nba_api raw stats to internal stat names correctly
- Computes PRA from PTS + REB + AST
- Handles missing stats gracefully (player with 0 games returns empty)
- Player search returns correct matches for full and partial names
- Back-to-back detection: identifies games on consecutive days
### Unit Tests (cache.py)
- Cache hit returns stored data
- Cache miss returns None
- TTLs are set correctly per data type
### Integration Tests (routes)
- Full request cycle: GET /stats/season-avg with mocked nba_api -> verify response shape
- GET /stats/last-n with n=5, n=10, n=30 -> correct game counts
- GET /stats/splits for each split_type -> correct response shapes
- Player search with partial name -> returns matches
- Error: invalid player name -> 404
- Error: missing required param -> 400
- Error: nba_api down with warm cache -> returns stale data
- Error: nba_api down with cold cache -> returns 503
- Health check returns ok when Redis is connected
## Open Questions
- **nba_api game log structure:** Need to verify exact column names for game logs (used in last-n and splits). Will confirm during implementation with a test call.
- **Current season string:** nba_api uses format "2025-26" — need to compute this dynamically based on current date (season starts in October).
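For the season-string question, a minimal sketch of the date arithmetic, assuming the season rolls over on October 1. The exact cutover day is an assumption, and `season_string` is a hypothetical helper.

```python
from datetime import date


def season_string(today: date, season_start_month: int = 10) -> str:
    """Return the NBA season label (e.g. "2025-26") that contains today."""
    # Dates before the start month belong to the season that began last year.
    start_year = today.year if today.month >= season_start_month else today.year - 1
    end_suffix = (start_year + 1) % 100
    return f"{start_year}-{end_suffix:02d}"
```

The two-digit suffix is zero-padded so a century boundary produces "1999-00" rather than "1999-0".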
+330
@@ -0,0 +1,330 @@
# Feature 1.4 — Database Schema (Supabase + RLS)
## Overview
Complete PostgreSQL schema in Supabase for all BetonBLK data. Uses Supabase Auth for user identity. Row Level Security (RLS) on all tables ensures users can only access their own data. Service role key used by backend for admin operations.
## Dependencies
- None (builds parallel with Features 1.1, 1.2)
- Downstream consumers: Feature 1.5 (Bet Submission), Feature 2.1 (Parlay Scan), Feature 3.4 (Stripe)
## Auth Model
- **Provider:** Supabase Auth
- **Identity:** `auth.users` table (managed by Supabase)
- **Extension:** Our `public.users` table references `auth.users.id` as FK
- **RLS:** Enabled on all tables. Policies use `auth.uid()` to scope access.
- **Backend access:** Service role key bypasses RLS for server-side operations
## Tables
### users
Extends Supabase Auth with app-specific profile data.
```sql
CREATE TABLE public.users (
id UUID PRIMARY KEY REFERENCES auth.users(id) ON DELETE CASCADE,
email TEXT NOT NULL,
tier TEXT NOT NULL DEFAULT 'free' CHECK (tier IN ('free', 'analyst', 'desk')),
scan_count INT NOT NULL DEFAULT 0,
scan_reset_date TIMESTAMPTZ NOT NULL DEFAULT (date_trunc('month', now()) + interval '1 month'),
stripe_customer_id TEXT,
founder_status BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
**RLS Policies:**
```sql
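-- Users can read their own row
CREATE POLICY "users_select_own" ON public.users
  FOR SELECT USING (auth.uid() = id);
-- Users can update their own row. RLS filters rows, not columns, so this
-- policy alone does not stop a user from changing tier or scan_count; those
-- backend-only columns need separate protection (e.g., column-level privileges).
CREATE POLICY "users_update_own" ON public.users
  FOR UPDATE USING (auth.uid() = id)
  WITH CHECK (auth.uid() = id);
-- Rows are normally inserted by the handle_new_user trigger (SECURITY DEFINER,
-- so it runs with its owner's privileges). Despite its name, this policy lets
-- an authenticated user insert a row for their own id; the service role
-- bypasses RLS and needs no policy.
CREATE POLICY "users_insert_service" ON public.users
  FOR INSERT WITH CHECK (auth.uid() = id);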
```
### picks
Individual prop analysis results from scans.
```sql
CREATE TABLE public.picks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES public.users(id) ON DELETE CASCADE,
player TEXT NOT NULL,
stat_type TEXT NOT NULL,
line NUMERIC(5,1) NOT NULL,
book TEXT NOT NULL,
direction TEXT NOT NULL CHECK (direction IN ('over', 'under')),
grade TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D')),
edge_pct NUMERIC(5,2),
reasoning TEXT,
kill_conditions TEXT[],
confidence NUMERIC(4,2),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_picks_user_id ON public.picks(user_id);
CREATE INDEX idx_picks_created_at ON public.picks(created_at);
```
**RLS Policies:**
```sql
CREATE POLICY "picks_select_own" ON public.picks
FOR SELECT USING (auth.uid() = user_id);
CREATE POLICY "picks_insert_own" ON public.picks
FOR INSERT WITH CHECK (auth.uid() = user_id);
```
### scan_sessions
Groups picks into a single scan/parlay analysis session.
```sql
CREATE TABLE public.scan_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES public.users(id) ON DELETE CASCADE,
legs UUID[] NOT NULL DEFAULT '{}',
final_grade TEXT CHECK (final_grade IN ('A', 'B', 'C', 'D')),
kill_conditions TEXT[],
correlation_notes TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_scan_sessions_user_id ON public.scan_sessions(user_id);
```
**RLS Policies:**
```sql
CREATE POLICY "scan_sessions_select_own" ON public.scan_sessions
FOR SELECT USING (auth.uid() = user_id);
CREATE POLICY "scan_sessions_insert_own" ON public.scan_sessions
FOR INSERT WITH CHECK (auth.uid() = user_id);
```
### bets
User-submitted bets (via screenshot, quick slip, or sportsbook sync).
```sql
CREATE TABLE public.bets (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES public.users(id) ON DELETE CASCADE,
amount NUMERIC(10,2) NOT NULL,
potential_payout NUMERIC(10,2),
slip_data JSONB NOT NULL,
book TEXT NOT NULL,
bet_type TEXT NOT NULL CHECK (bet_type IN ('straight', 'parlay', 'teaser', 'round_robin')),
submission_method TEXT NOT NULL CHECK (submission_method IN ('screenshot', 'quickslip', 'sync')),
status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'won', 'lost', 'push', 'void')),
placed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
settled_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_bets_user_id ON public.bets(user_id);
CREATE INDEX idx_bets_status ON public.bets(status);
CREATE INDEX idx_bets_placed_at ON public.bets(placed_at);
```
**slip_data JSONB shape:**
```json
{
"legs": [
{
"player": "Nikola Jokic",
"stat_type": "points",
"line": 26.5,
"direction": "over",
"odds": -110
}
],
"total_odds": -110,
"raw_text": "Jokic points 26.5 over $20 DraftKings"
}
```
**RLS Policies:**
```sql
CREATE POLICY "bets_select_own" ON public.bets
FOR SELECT USING (auth.uid() = user_id);
CREATE POLICY "bets_insert_own" ON public.bets
FOR INSERT WITH CHECK (auth.uid() = user_id);
CREATE POLICY "bets_update_own" ON public.bets
FOR UPDATE USING (auth.uid() = user_id)
WITH CHECK (auth.uid() = user_id);
```
### outcomes
Tracks actual results for each pick after the game is played.
```sql
CREATE TABLE public.outcomes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
pick_id UUID NOT NULL REFERENCES public.picks(id) ON DELETE CASCADE,
result TEXT NOT NULL CHECK (result IN ('hit', 'miss', 'push')),
actual_value NUMERIC(5,1) NOT NULL,
logged_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX idx_outcomes_pick_id ON public.outcomes(pick_id);
```
**RLS Policies:**
```sql
-- Users can see outcomes for their own picks
CREATE POLICY "outcomes_select_own" ON public.outcomes
FOR SELECT USING (
EXISTS (
SELECT 1 FROM public.picks WHERE picks.id = outcomes.pick_id AND picks.user_id = auth.uid()
)
);
-- Insert by service role only (backend resolves outcomes)
-- No INSERT policy for anon/authenticated — backend uses service role
```
### performance
Aggregated performance metrics per user per time period.
```sql
CREATE TABLE public.performance (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES public.users(id) ON DELETE CASCADE,
period TEXT NOT NULL CHECK (period IN ('weekly', 'monthly', 'all_time')),
roi NUMERIC(6,2),
win_rate NUMERIC(5,2),
sample_size INT NOT NULL DEFAULT 0,
total_wagered NUMERIC(10,2) DEFAULT 0,
total_profit NUMERIC(10,2) DEFAULT 0,
calculated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_performance_user_id ON public.performance(user_id);
CREATE UNIQUE INDEX idx_performance_user_period ON public.performance(user_id, period);
```
**RLS Policies:**
```sql
CREATE POLICY "performance_select_own" ON public.performance
FOR SELECT USING (auth.uid() = user_id);
-- Insert/update by service role only (backend calculates performance)
```
## Triggers
### Auto-create user profile on signup
```sql
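CREATE OR REPLACE FUNCTION public.handle_new_user()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO public.users (id, email)
  VALUES (NEW.id, NEW.email);
  RETURN NEW;
END;
-- An empty search_path is the usual hardening for SECURITY DEFINER functions;
-- it works here because the table name is schema-qualified.
$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = '';
CREATE TRIGGER on_auth_user_created
  AFTER INSERT ON auth.users
  FOR EACH ROW EXECUTE FUNCTION public.handle_new_user();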
```
### Auto-update updated_at on users table
```sql
CREATE OR REPLACE FUNCTION public.update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_updated_at
BEFORE UPDATE ON public.users
FOR EACH ROW EXECUTE FUNCTION public.update_updated_at();
```
### Monthly scan count reset
```sql
CREATE OR REPLACE FUNCTION public.reset_scan_count()
RETURNS TRIGGER AS $$
BEGIN
IF NEW.scan_reset_date <= now() THEN
NEW.scan_count = 0;
NEW.scan_reset_date = date_trunc('month', now()) + interval '1 month';
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_scan_reset
BEFORE UPDATE ON public.users
FOR EACH ROW EXECUTE FUNCTION public.reset_scan_count();
```
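To make the trigger's behavior easier to reason about (and to unit test outside the database), here is a Python mirror of the same logic. It is an illustration only: the function names are hypothetical and the row is assumed to be a plain dict with timezone-aware datetimes.

```python
from datetime import datetime, timezone


def next_reset_date(now: datetime) -> datetime:
    """Mirror of date_trunc('month', now()) + interval '1 month'."""
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=now.tzinfo)


def apply_scan_reset(row: dict, now: datetime) -> dict:
    """Return the row as the BEFORE UPDATE trigger would rewrite it."""
    if row["scan_reset_date"] <= now:
        # The trigger overwrites scan_count with 0, whatever the UPDATE set it to.
        return {**row, "scan_count": 0, "scan_reset_date": next_reset_date(now)}
    return row
```

Two consequences follow from the SQL as written. The reset only happens when a row is updated, so an idle user's `scan_count` stays at its old value until the next write. And if the first update after the reset date is itself a scan increment, that increment is replaced by 0, so the scan goes uncounted.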
## Migration File Structure
```
supabase/
└── migrations/
└── 001_initial_schema.sql # All tables, indexes, RLS, triggers
```
Single migration file for the initial schema. Future changes get their own numbered migration files.
## Implementation Notes
- All `TIMESTAMPTZ` columns store UTC. Application layer always sends UTC.
- `gen_random_uuid()` is built into PostgreSQL 13 and later, so it is available in Supabase without relying on pgcrypto (which provided it on older versions).
- `slip_data` uses JSONB for flexibility — different bet types have different shapes.
- `legs` in scan_sessions is a UUID array referencing picks. Not a FK constraint (allows flexibility).
- Performance table uses a unique index on (user_id, period) — upsert on recalculation.
## Acceptance Criteria
1. All 6 tables created successfully in Supabase
2. RLS enabled on every table
3. RLS policies enforce user-scoped access (user can only read/write their own data)
4. Service role key can bypass RLS (for backend operations)
5. `handle_new_user` trigger fires on auth.users insert, creating a public.users row
6. `updated_at` auto-updates on users table modifications
7. `scan_reset_date` logic resets scan_count when month rolls over
8. All indexes created
9. Constraints enforced: tier values, grade values, bet_type values, status values
10. All timestamps stored as TIMESTAMPTZ in UTC
## Test Plan
### Schema Validation Tests
- Each table exists with correct columns and types
- All constraints reject invalid values (e.g., tier='gold' fails, grade='E' fails)
- Foreign keys enforce referential integrity (delete user cascades to picks, bets, etc.)
- Unique indexes prevent duplicates (one outcome per pick, one performance row per user+period)
### RLS Tests
- Authenticated user can SELECT their own rows from all tables
- Authenticated user CANNOT select another user's rows
- Authenticated user can INSERT into picks, bets, scan_sessions with their own user_id
- Authenticated user CANNOT insert with a different user_id
- Anon user cannot access any table
- Service role can read/write all rows (bypasses RLS)
### Trigger Tests
- Creating auth.users row auto-creates public.users row with correct email
- Updating users row auto-updates updated_at timestamp
- Scan count resets to 0 when scan_reset_date has passed
### Integration Tests (from Node.js)
- Supabase client with anon key + JWT can read user's own data
- Supabase client with service role can insert/read any data
- Full flow: create user -> insert pick -> insert outcome -> read performance
## Open Questions
- **scan_sessions.legs as UUID[]:** Using a Postgres array instead of a junction table. Simpler for now, but limits query flexibility. Acceptable for MVP; can migrate to a junction table if needed.
- **Performance recalculation:** Currently a stored row. Could instead be a Postgres view computed on the fly. Stored row chosen for read performance at scale. Backend job recalculates periodically.