# Feature 1.2 - NBA_API Stats Wrapper (FastAPI Microservice)

## Overview

A Python FastAPI microservice wrapping the `nba_api` library. Provides player stats (season averages, last N games, situational splits) via internal HTTP endpoints. Caches aggressively in Redis. Called by the Node.js backend; not exposed to the public internet.

## Dependencies

- None (builds parallel with Feature 1.1)
- Downstream consumers: Feature 1.3 (Prop Analysis Engine) needs season averages + recent game data to grade props

## Tech

- **Language:** Python 3.11+
- **Framework:** FastAPI + uvicorn
- **Data source:** `nba_api` (free, no key required)
- **Cache:** Redis (shared instance with Node backend)
- **Port:** 8000 (internal only)

## Directory Structure

```
nba-service/
├── app/
│   ├── main.py              # FastAPI app, routes
│   ├── services/
│   │   └── stats.py         # nba_api wrapper functions
│   ├── utils/
│   │   ├── cache.py         # Redis caching helpers
│   │   └── player_map.py    # Player name -> nba_api player_id resolution
│   └── config.py            # Settings (Redis URL, cache TTLs)
├── tests/
│   ├── test_stats.py        # Unit tests for stats service
│   └── test_routes.py       # Integration tests for endpoints
├── requirements.txt
└── README.md
```

## Endpoints

### GET /stats/season-avg

Returns a player's season averages for the current season.

**Query params:**

| Param     | Type   | Required | Description                                    |
|-----------|--------|----------|------------------------------------------------|
| player    | string | yes      | Player full name (e.g., "Nikola Jokic")        |
| stat_type | string | no       | Filter to specific stat (default: all)         |
| season    | string | no       | NBA season (default: current, e.g., "2025-26") |

**Response (200):**

```json
{
  "player": "Nikola Jokic",
  "player_id": 203999,
  "team": "DEN",
  "season": "2025-26",
  "source": "cache | live",
  "stats": {
    "points": 26.3,
    "rebounds": 12.4,
    "assists": 9.1,
    "threes": 1.1,
    "blocks": 0.7,
    "steals": 1.4,
    "pra": 47.8,
    "turnovers": 3.2,
    "games_played": 65,
    "minutes": 34.2
  }
}
```

### GET /stats/last-n

Returns a player's averages over their last N games.

**Query params:**

| Param     | Type   | Required | Default | Description                      |
|-----------|--------|----------|---------|----------------------------------|
| player    | string | yes      |         | Player full name                 |
| n         | int    | no       | 10      | Number of recent games (max: 30) |
| stat_type | string | no       | all     | Filter to specific stat          |

**Response (200):**

```json
{
  "player": "Nikola Jokic",
  "player_id": 203999,
  "team": "DEN",
  "last_n": 10,
  "source": "cache | live",
  "stats": {
    "points": 28.1,
    "rebounds": 13.0,
    "assists": 10.2,
    "threes": 1.3,
    "blocks": 0.8,
    "steals": 1.5,
    "pra": 51.3,
    "turnovers": 2.9,
    "games_played": 10,
    "minutes": 35.1
  }
}
```

### GET /stats/splits

Returns situational splits for a player.

**Query params:**

| Param      | Type   | Required | Default | Description                                    |
|------------|--------|----------|---------|------------------------------------------------|
| player     | string | yes      |         | Player full name                               |
| stat_type  | string | yes      |         | Stat to split (points, rebounds, etc.)         |
| split_type | string | yes      |         | One of: home_away, rest_days, vs_team          |
| opponent   | string | no       |         | Required when split_type=vs_team (e.g., "LAL") |

**Response (200), home_away split:**

```json
{
  "player": "Nikola Jokic",
  "player_id": 203999,
  "team": "DEN",
  "stat_type": "points",
  "split_type": "home_away",
  "source": "cache | live",
  "splits": {
    "home": { "avg": 27.8, "games": 33 },
    "away": { "avg": 24.9, "games": 32 }
  }
}
```

**Response (200), rest_days split (back-to-back detection):**

```json
{
  "player": "Nikola Jokic",
  "stat_type": "points",
  "split_type": "rest_days",
  "source": "cache | live",
  "splits": {
    "b2b": { "avg": 23.1, "games": 8 },
    "1_day_rest": { "avg": 26.5, "games": 40 },
    "2_plus_days_rest": { "avg": 28.2, "games": 17 }
  }
}
```

**Response (200), vs_team split:**

```json
{
  "player": "Nikola Jokic",
  "stat_type": "points",
  "split_type": "vs_team",
  "opponent": "LAL",
  "source": "cache | live",
  "splits": {
    "vs_opponent": { "avg": 30.5, "games": 3 },
    "vs_all_others": { "avg": 25.8, "games": 62 }
  }
}
```

### GET /players/search

Resolves player name to nba_api player ID. Used internally.

**Query params:**

| Param | Type   | Required | Description                    |
|-------|--------|----------|--------------------------------|
| name  | string | yes      | Player name (partial match OK) |

**Response (200):**

```json
{
  "results": [
    {
      "player_id": 203999,
      "full_name": "Nikola Jokic",
      "team": "DEN",
      "is_active": true
    }
  ]
}
```

### GET /health

Health check for the microservice.

**Response (200):**

```json
{
  "status": "ok",
  "cache": "connected"
}
```

### Error Responses

| Status | When                                    | Body                                           |
|--------|-----------------------------------------|------------------------------------------------|
| 400    | Missing required param or invalid value | `{ "error": "player is required" }`            |
| 404    | Player not found in nba_api             | `{ "error": "Player not found: Xyz" }`         |
| 503    | nba_api unreachable or rate limited     | `{ "error": "NBA stats service unavailable" }` |

## Data Shape (Internal)

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    player: str
    player_id: int
    team: str          # 3-letter abbreviation
    season: str        # e.g., "2025-26"
    stats: dict        # { "points": 26.3, "rebounds": 12.4, ... }
    games_played: int
    minutes: float
```

## Stat Mapping

Map nba_api column names to our internal stat names:

| nba_api Column | Internal stat_type |
|----------------|--------------------|
| PTS            | points             |
| REB            | rebounds           |
| AST            | assists            |
| FG3M           | threes             |
| BLK            | blocks             |
| STL            | steals             |
| TOV            | turnovers          |
| (PTS+REB+AST)  | pra (computed)     |
| MIN            | minutes            |
| GP             | games_played       |

## Caching Strategy

- **Store:** Redis (same instance as Node backend)
- **Key patterns:**
  - Season avg: `nba:season:{player_id}:{season}` (TTL: 24 hours)
  - Last N: `nba:last:{player_id}:{n}` (TTL: 1 hour)
  - Splits: `nba:splits:{player_id}:{stat_type}:{split_type}` (TTL: 6 hours); for `vs_team`, append `:{opponent}` so that different opponents do not share a cache entry
  - Player search: `nba:player:{name_normalized}` (TTL: 7 days)
- **On cache hit:** Return with `"source": "cache"`
- **On cache miss:** Fetch from nba_api, store, return with `"source": "live"`
- **On nba_api failure:** Return stale cache if available, 503 if not

## Player Name Resolution

- nba_api requires player_id for most endpoints
- Use `nba_api.stats.static.players.find_players_by_full_name()` for lookup
- Cache player_id mappings for 7 days (players don't change teams mid-game)
- Support partial matching: "Jokic" should resolve to "Nikola Jokic"
- If multiple matches, return all and let caller disambiguate

## nba_api Rate Limiting

- nba_api hits NBA.com endpoints, which are rate-limited (the limits are not officially documented)
- Add a 0.6s delay between nba_api calls to avoid getting blocked
- If a request fails with a connection error, retry once after 2s
- If the retry fails, serve from cache or return 503

## Acceptance Criteria

1. `GET /stats/season-avg?player=Nikola Jokic` returns correct season averages
2. `GET /stats/last-n?player=Nikola Jokic&n=10` returns last 10 game averages
3. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=home_away` returns home/away split
4. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=rest_days` returns B2B/rest splits
5. `GET /stats/splits?player=Nikola Jokic&stat_type=points&split_type=vs_team&opponent=LAL` returns vs-team split
6. `GET /players/search?name=Jokic` resolves to correct player with ID and team
7. Season averages are cached for 24 hours; subsequent calls return from cache
8. Last N averages are cached for 1 hour
9. Player search results are cached for 7 days
10. If nba_api is unreachable, stale cache is returned; if no cache, 503
11. All timestamps and dates use UTC
12. `GET /health` returns status and cache connectivity

## Test Plan

### Unit Tests (stats.py)

- Maps nba_api raw stats to internal stat names correctly
- Computes PRA from PTS + REB + AST
- Handles missing stats gracefully (player with 0 games returns empty)
- Player search returns correct matches for full and partial names
- Back-to-back detection: identifies games on consecutive days

### Unit Tests (cache.py)

- Cache hit returns stored data
- Cache miss returns None
- TTLs are set correctly per data type

### Integration Tests (routes)

- Full request cycle: GET /stats/season-avg with mocked nba_api -> verify response shape
- GET /stats/last-n with n=5, n=10, n=30 -> correct game counts
- GET /stats/splits for each split_type -> correct response shapes
- Player search with partial name -> returns matches
- Error: invalid player name -> 404
- Error: missing required param -> 400
- Error: nba_api down with warm cache -> returns stale data
- Error: nba_api down with cold cache -> returns 503
- Health check returns ok when Redis is connected

## Open Questions

- **nba_api game log structure:** Need to verify exact column names for game logs (used in last-n and splits). Will confirm during implementation with a test call.
- **Current season string:** nba_api uses format "2025-26", so this needs to be computed dynamically based on the current date (season starts in October).
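
One possible way to derive the current season string is sketched below. It is a minimal sketch, not a settled design: the function name `current_season` and the constant `SEASON_START_MONTH` are hypothetical, and the October cutoff is taken from the note above rather than from any nba_api call. Under this rule, offseason dates (July through September) map to the season that just ended, which may or may not be the desired default.

```python
from datetime import date, datetime, timezone
from typing import Optional

# Assumption from this spec: a new season starts in October.
SEASON_START_MONTH = 10


def current_season(today: Optional[date] = None) -> str:
    """Return a season string such as "2025-26" for the given date.

    Hypothetical helper: defaults to today's date in UTC, in line with
    the acceptance criterion that all dates use UTC.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()

    # From October onward the season starts in the current calendar year;
    # before October we are still in the season that began last year.
    if today.month >= SEASON_START_MONTH:
        start_year = today.year
    else:
        start_year = today.year - 1

    # Two-digit suffix of the following year, zero-padded ("1999-00").
    end_suffix = (start_year + 1) % 100
    return f"{start_year}-{end_suffix:02d}"
```

Passing an explicit `date` keeps the function deterministic in unit tests, for example `current_season(date(2026, 3, 1))`, while production callers can omit the argument.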