Practice extracting a year-month bucket from a date column and grouping on the derived value. SUBSTR (or strftime) on service_date, COUNT(*), and ordered output — the pattern behind every utilization trend chart in payer analytics.
Finance is rebuilding the utilization trend deck for the next operating review. From fact_claims, give me a count of claims per service month — service_month formatted as YYYY-MM, and claim_count — sorted chronologically. Use service_date as the source. Our claims data spans 2024-01 through 2025-12, so 24 monthly buckets is what I expect. This is the trend line every cost-trend conversation starts with; if a month spikes 20%+ above its neighbors, that's the conversation.
| Column | Type | Key |
|---|---|---|
| claim_id | INT | PK |
| member_id | INT | FK → dim_members |
| provider_id | INT | FK → dim_providers_payer |
| plan_id | INT | FK → dim_plans |
| service_date | TEXT | |
| cpt_code | TEXT | |
| place_of_service | TEXT | |
| billed_amt | REAL | |
| allowed_amt | REAL | |
| paid_amt | REAL | |
| status | TEXT |
Each hint you reveal reduces the XP you can earn. Try the query first.
Single-table aggregate over fact_claims. The bucketing happens by deriving a year-month string from the service_date column and grouping on that derived value.
service_date is stored as YYYY-MM-DD. The first seven characters are the year-month bucket — SUBSTR is the most direct extraction in SQLite. (strftime works too, but SUBSTR is shorter and the cost is identical here.)
Two output columns: the service_month string and the count, named for the metric. Sort ascending on the month so the trend reads left-to-right; lexicographic sort on YYYY-MM is the same as chronological — that's why the format matters.