feat: Feature 1.2 (NBA stats FastAPI service) + Feature 1.4 (database schema)

Feature 1.2: Python FastAPI microservice wrapping nba_api
- GET /stats/season-avg, /stats/last-n, /stats/splits, /players/search
- Redis caching (24hr/1hr/6hr/7day), 0.6s rate limiting, PRA derived stat
- 27 Python tests passing

Feature 1.4: Complete Supabase database schema
- 6 tables: users, picks, scan_sessions, bets, outcomes, performance
- RLS enabled on all tables with auth.uid() policies
- 3 triggers: auto-create user, updated_at, scan count reset
- 37 schema validation tests passing
- Migration SQL ready, pending manual apply (WSL2 DNS blocker)

Total: 92 tests (65 Node.js + 27 Python), all passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kev
2026-03-21 10:58:58 -04:00
parent 00409fd6cd
commit 3da1b4242c
27 changed files with 2360 additions and 16 deletions
@@ -0,0 +1,288 @@
# Feature 1.2 — NBA_API Stats Wrapper (FastAPI Microservice)
## Overview
A Python FastAPI microservice wrapping the `nba_api` library. Provides player stats (season averages, last N games, situational splits) via internal HTTP endpoints. Caches aggressively in Redis. Called by the Node.js backend — not exposed to the public internet.
## Dependencies
- None (built in parallel with Feature 1.1)
- Downstream consumers: Feature 1.3 (Prop Analysis Engine) needs season averages + recent game data to grade props
## Tech
- **Language:** Python 3.11+
- **Framework:** FastAPI + uvicorn
- **Data source:** `nba_api` (free, no key required)
- **Cache:** Redis (shared instance with Node backend)
- **Port:** 8000 (internal only)
## Directory Structure
```
nba-service/
├── app/
│   ├── main.py              # FastAPI app, routes
│   ├── services/
│   │   └── stats.py         # nba_api wrapper functions
│   ├── utils/
│   │   ├── cache.py         # Redis caching helpers
│   │   └── player_map.py    # Player name -> nba_api player_id resolution
│   └── config.py            # Settings (Redis URL, cache TTLs)
├── tests/
│   ├── test_stats.py        # Unit tests for stats service
│   └── test_routes.py       # Integration tests for endpoints
├── requirements.txt
└── README.md
```
## Endpoints
### GET /stats/season-avg
Returns a player's season averages for the current season.
**Query params:**
| Param | Type | Required | Description |
|-----------|--------|----------|--------------------------------------------|
| player | string | yes | Player full name (e.g., "Nikola Jokic") |
| stat_type | string | no | Filter to specific stat (default: all) |
| season | string | no | NBA season (default: current, e.g., "2025-26") |
**Response (200):**
```json
{
"player": "Nikola Jokic",
"player_id": 203999,
"team": "DEN",
"season": "2025-26",
"source": "cache | live",
"stats": {
"points": 26.3,
"rebounds": 12.4,
"assists": 9.1,
"threes": 1.1,
"blocks": 0.7,
"steals": 1.4,
"pra": 47.8,
"turnovers": 3.2,
"games_played": 65,
"minutes": 34.2
}
}
```
### GET /stats/last-n
Returns a player's averages over their last N games.
**Query params:**
| Param | Type | Required | Default | Description |
|-----------|--------|----------|---------|------------------------------------|
| player | string | yes | | Player full name |
| n | int | no | 10 | Number of recent games (max: 30) |
| stat_type | string | no | all | Filter to specific stat |
**Response (200):**
```json
{
"player": "Nikola Jokic",
"player_id": 203999,
"team": "DEN",
"last_n": 10,
"source": "cache | live",
"stats": {
"points": 28.1,
"rebounds": 13.0,
"assists": 10.2,
"threes": 1.3,
"blocks": 0.8,
"steals": 1.5,
"pra": 51.3,
"turnovers": 2.9,
"games_played": 10,
"minutes": 35.1
}
}
```
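As a rough illustration of the last-N averaging described above, the sketch below averages a few columns over the most recent games. It assumes the game log is a list of dicts ordered newest-first and keyed by nba_api-style column names; both are assumptions, since the exact game log structure is still listed under Open Questions. The function name `average_last_n` is hypothetical.

```python
from statistics import mean


def average_last_n(game_log, n=10, max_n=30):
    """Average PTS/REB/AST over the n most recent games.

    Assumes game_log is ordered newest-first (an assumption, not
    confirmed against nba_api output).
    """
    if not 1 <= n <= max_n:
        # The spec caps n at 30; out-of-range values map to a 400.
        raise ValueError(f"n must be between 1 and {max_n}")
    recent = game_log[:n]
    if not recent:
        return {"games_played": 0}
    columns = ("PTS", "REB", "AST")
    averages = {col: round(mean(g[col] for g in recent), 1) for col in columns}
    # Report how many games were actually averaged (may be fewer than n).
    averages["games_played"] = len(recent)
    return averages
```

If the player has appeared in fewer than `n` games, `games_played` reflects the smaller count so callers can judge the sample size.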
### GET /stats/splits
Returns situational splits for a player.
**Query params:**
| Param | Type | Required | Default | Description |
|------------|--------|----------|---------|------------------------------------------|
| player | string | yes | | Player full name |
| stat_type | string | yes | | Stat to split (points, rebounds, etc.) |
| split_type | string | yes | | One of: home_away, rest_days, vs_team |
| opponent | string | no | | Required when split_type=vs_team (e.g., "LAL") |
**Response (200) — home_away split:**
```json
{
"player": "Nikola Jokic",
"player_id": 203999,
"team": "DEN",
"stat_type": "points",
"split_type": "home_away",
"source": "cache | live",
"splits": {
"home": { "avg": 27.8, "games": 33 },
"away": { "avg": 24.9, "games": 32 }
}
}
```
**Response (200) — rest_days split (back-to-back detection):**
```json
{
"player": "Nikola Jokic",
"stat_type": "points",
"split_type": "rest_days",
"source": "cache | live",
"splits": {
"b2b": { "avg": 23.1, "games": 8 },
"1_day_rest": { "avg": 26.5, "games": 40 },
"2_plus_days_rest": { "avg": 28.2, "games": 17 }
}
}
```
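One way the rest-day buckets above could be derived from game dates is sketched here. The bucket boundaries (consecutive calendar days count as a back-to-back) and the choice to place the first game of the log in `2_plus_days_rest` are assumptions; `rest_bucket` and `split_by_rest` are hypothetical names.

```python
from datetime import date


def rest_bucket(prev_game, game):
    """Classify a game by the gap since the previous game."""
    if prev_game is None:
        # Assumption: no earlier game in the log counts as well rested.
        return "2_plus_days_rest"
    gap = (game - prev_game).days
    if gap <= 1:
        return "b2b"            # played on consecutive days
    if gap == 2:
        return "1_day_rest"     # exactly one off day in between
    return "2_plus_days_rest"


def split_by_rest(games, stat="PTS"):
    """games: list of (date, row) tuples in any order."""
    buckets = {"b2b": [], "1_day_rest": [], "2_plus_days_rest": []}
    prev = None
    for game_date, row in sorted(games, key=lambda g: g[0]):
        buckets[rest_bucket(prev, game_date)].append(row[stat])
        prev = game_date
    return {
        name: {"avg": round(sum(v) / len(v), 1) if v else None, "games": len(v)}
        for name, v in buckets.items()
    }
```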
**Response (200) — vs_team split:**
```json
{
"player": "Nikola Jokic",
"stat_type": "points",
"split_type": "vs_team",
"opponent": "LAL",
"source": "cache | live",
"splits": {
"vs_opponent": { "avg": 30.5, "games": 3 },
"vs_all_others": { "avg": 25.8, "games": 62 }
}
}
```
### GET /players/search
Resolves a player name to its nba_api player ID. Used internally.
**Query params:**
| Param | Type | Required | Description |
|--------|--------|----------|----------------------|
| name | string | yes | Player name (partial match OK) |
**Response (200):**
```json
{
"results": [
{ "player_id": 203999, "full_name": "Nikola Jokic", "team": "DEN", "is_active": true }
]
}
```
### GET /health
Health check for the microservice.
**Response (200):**
```json
{ "status": "ok", "cache": "connected" }
```
### Error Responses
| Status | When | Body |
|--------|----------------------------------------|-----------------------------------------------|
| 400 | Missing required param or invalid value | `{ "error": "player is required" }` |
| 404 | Player not found in nba_api | `{ "error": "Player not found: Xyz" }` |
| 503 | nba_api unreachable or rate limited | `{ "error": "NBA stats service unavailable" }`|
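The 400 cases for `/stats/splits` could be checked with a small framework-independent helper like the one below. Only the `"player is required"` message comes from the table above; the other messages, the `VALID_SPLIT_TYPES` constant, and the function name are assumptions for illustration.

```python
VALID_SPLIT_TYPES = {"home_away", "rest_days", "vs_team"}


def validate_splits_params(player, stat_type, split_type, opponent=None):
    """Return (status, body) for a rejected request, or None if valid."""
    required = (("player", player), ("stat_type", stat_type), ("split_type", split_type))
    for name, value in required:
        if not value:
            return 400, {"error": f"{name} is required"}
    if split_type not in VALID_SPLIT_TYPES:
        # Message wording is an assumption, not part of the spec.
        return 400, {"error": f"invalid split_type: {split_type}"}
    if split_type == "vs_team" and not opponent:
        return 400, {"error": "opponent is required when split_type=vs_team"}
    return None
```

Keeping validation separate from the route handler makes the 400 paths testable without starting the service.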
## Data Shape (Internal)
```python
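from dataclasses import dataclass


@dataclass
class PlayerStats:
    """Internal representation of one player's averaged stats."""
    player: str        # full name
    player_id: int     # nba_api player ID
    team: str          # 3-letter abbreviation
    season: str        # e.g., "2025-26"
    stats: dict        # { "points": 26.3, "rebounds": 12.4, ... }
    games_played: int
    minutes: float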
```
## Stat Mapping
Map nba_api column names to our internal stat names:
| nba_api Column | Internal stat_type |
|----------------|--------------------|
| PTS | points |
| REB | rebounds |
| AST | assists |
| FG3M | threes |
| BLK | blocks |
| STL | steals |
| TOV | turnovers |
| (PTS+REB+AST) | pra (computed) |
| MIN | minutes |
| GP | games_played |
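The mapping table translates directly into a lookup dict plus one derived field. This is a minimal sketch, assuming each nba_api row arrives as a dict keyed by column name; `STAT_MAP` and `to_internal_stats` are hypothetical names.

```python
# nba_api column name -> internal stat_type, taken from the table above.
STAT_MAP = {
    "PTS": "points",
    "REB": "rebounds",
    "AST": "assists",
    "FG3M": "threes",
    "BLK": "blocks",
    "STL": "steals",
    "TOV": "turnovers",
    "MIN": "minutes",
    "GP": "games_played",
}


def to_internal_stats(row):
    """Rename known columns and add the computed PRA stat."""
    stats = {internal: row[col] for col, internal in STAT_MAP.items() if col in row}
    # PRA is only meaningful when all three components are present.
    if all(key in stats for key in ("points", "rebounds", "assists")):
        total = stats["points"] + stats["rebounds"] + stats["assists"]
        stats["pra"] = round(total, 1)
    return stats
```

Rounding PRA to one decimal avoids float noise such as `47.800000000000004` leaking into responses.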
## Caching Strategy
- **Store:** Redis (same instance as Node backend)
- **Key patterns:**
- Season avg: `nba:season:{player_id}:{season}` — TTL: 24 hours
- Last N: `nba:last:{player_id}:{n}` — TTL: 1 hour
- Splits: `nba:splits:{player_id}:{stat_type}:{split_type}` — TTL: 6 hours
- Splits with `split_type=vs_team`: append the opponent, `nba:splits:{player_id}:{stat_type}:vs_team:{opponent}`, so results for different opponents do not share a key (same 6-hour TTL)
- Player search: `nba:player:{name_normalized}` — TTL: 7 days
- **On cache hit:** Return with `"source": "cache"`
- **On cache miss:** Fetch from nba_api, store, return with `"source": "live"`
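- **On nba_api failure:** Return stale cache if available, 503 if not. Serving stale data requires keeping entries in Redis beyond their freshness TTL, since Redis deletes a key once it expires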
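The hit, miss, and failure rules above can be sketched without Redis by using a plain dict as a stand-in store. Everything here is an assumption about one possible design: each entry carries its own `stored_at` timestamp so that freshness is checked in code and an expired entry can still be served when the upstream call fails. `get_or_fetch` and `UpstreamUnavailable` are hypothetical names.

```python
import time

# Freshness windows in seconds, from the key patterns above.
TTL_SECONDS = {
    "season": 24 * 3600,
    "last": 3600,
    "splits": 6 * 3600,
    "player": 7 * 24 * 3600,
}


class UpstreamUnavailable(Exception):
    """nba_api failed and no cached copy exists (would map to a 503)."""


def get_or_fetch(store, key, kind, fetch, now=time.time):
    """Return (value, source) where source is "cache" or "live"."""
    entry = store.get(key)  # (value, stored_at) or None
    if entry is not None and now() - entry[1] < TTL_SECONDS[kind]:
        return entry[0], "cache"          # fresh hit
    try:
        value = fetch()
    except Exception as exc:
        if entry is not None:
            return entry[0], "cache"      # stale copy beats an error
        raise UpstreamUnavailable(key) from exc
    store[key] = (value, now())
    return value, "live"
```

With a real Redis store, the same idea would mean giving keys an expiry longer than the freshness window, or none at all, so that a stale copy is still present when nba_api is down.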
## Player Name Resolution
- nba_api requires player_id for most endpoints
- Use `nba_api.stats.static.players.find_players_by_full_name()` for lookup
- Cache player_id mappings for 7 days (a player's ID does not change; only the cached team can go stale after a trade)
- Support partial matching: "Jokic" should resolve to "Nikola Jokic"
- If multiple matches, return all and let caller disambiguate
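The `{name_normalized}` cache key and the partial-match rule could work along these lines. This sketch does not call nba_api; it matches against an in-memory list so the behavior is easy to see. The normalization rules (strip accents, lowercase, collapse whitespace) and both function names are assumptions.

```python
import unicodedata


def normalize_name(name):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so an accented "c" compares equal to a plain "c".
    ascii_like = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(ascii_like.lower().split())


def search_players(query, players):
    """Return every player whose normalized name contains the query."""
    needle = normalize_name(query)
    return [p for p in players if needle in normalize_name(p["full_name"])]
```

Returning a list even for a single hit keeps the response shape the same whether or not the caller needs to disambiguate.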
## nba_api Rate Limiting
- nba_api calls NBA.com endpoints, which are rate-limited; the limits are not officially documented
- Add a 0.6s delay between nba_api calls to avoid getting blocked
- If a request fails with a connection error, retry once after 2s
- If retry fails, serve from cache or return 503
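The spacing and retry rules above might look like the following. This is a single-threaded sketch with injectable clock and sleep functions so it can be tested without real delays; the `Throttle` class and `call_with_retry` are hypothetical names, and treating `ConnectionError` as the retryable failure is an assumption.

```python
import time


class Throttle:
    """Enforce a minimum interval between outgoing calls."""

    def __init__(self, min_interval=0.6, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()


def call_with_retry(fn, throttle, retry_delay=2.0, sleep=time.sleep):
    """Call fn once; on a connection error, wait and try one more time."""
    throttle.wait()
    try:
        return fn()
    except ConnectionError:
        sleep(retry_delay)
        throttle.wait()
        # A second failure propagates so the caller can fall back to
        # cached data or return a 503.
        return fn()
```

If the service handles requests concurrently, the throttle would also need a lock; that is left out here.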
## Acceptance Criteria
1. `GET /stats/season-avg?player=Nikola Jokic` returns correct season averages
2. `GET /stats/last-n?player=Nikola Jokic&n=10` returns last 10 game averages
3. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=home_away` returns home/away split
4. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=rest_days` returns B2B/rest splits
5. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=vs_team&opponent=LAL` returns vs-team split
6. `GET /players/search?name=Jokic` resolves to correct player with ID and team
7. Season averages are cached for 24 hours; subsequent calls return from cache
8. Last N averages are cached for 1 hour
9. Player search results are cached for 7 days
10. If nba_api is unreachable, stale cache is returned; if no cache, 503
11. All timestamps and dates use UTC
12. `GET /health` returns status and cache connectivity
## Test Plan
### Unit Tests (stats.py)
- Maps nba_api raw stats to internal stat names correctly
- Computes PRA from PTS + REB + AST
- Handles missing stats gracefully (a player with 0 games returns an empty stats object)
- Player search returns correct matches for full and partial names
- Back-to-back detection: identifies games on consecutive days
### Unit Tests (cache.py)
- Cache hit returns stored data
- Cache miss returns None
- TTLs are set correctly per data type
### Integration Tests (routes)
- Full request cycle: GET /stats/season-avg with mocked nba_api -> verify response shape
- GET /stats/last-n with n=5, n=10, n=30 -> correct game counts
- GET /stats/splits for each split_type -> correct response shapes
- Player search with partial name -> returns matches
- Error: invalid player name -> 404
- Error: missing required param -> 400
- Error: nba_api down with warm cache -> returns stale data
- Error: nba_api down with cold cache -> returns 503
- Health check returns ok when Redis is connected
## Open Questions
- **nba_api game log structure:** Need to verify exact column names for game logs (used in last-n and splits). Will confirm during implementation with a test call.
- **Current season string:** nba_api uses format "2025-26" — need to compute this dynamically based on current date (season starts in October).
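For the season string question, a date-based computation could look like this. It assumes the season label switches on October 1 and uses the current UTC date by default, in line with the UTC acceptance criterion; the exact switchover day and the function name `current_season` are assumptions.

```python
from datetime import date, datetime, timezone


def current_season(today=None, start_month=10):
    """Return the season label, e.g. "2025-26", for a given date."""
    if today is None:
        today = datetime.now(timezone.utc).date()
    # From October onward the season is named after the current year;
    # before that it still belongs to the season that began last year.
    start_year = today.year if today.month >= start_month else today.year - 1
    return f"{start_year}-{(start_year + 1) % 100:02d}"
```

The two-digit suffix is zero-padded so that a season spanning a century boundary renders as `1999-00` rather than `1999-0`.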