Cohort retention from snapshots. Practice multi-CTE structure (cohort sizing CTE + retention check CTE), date arithmetic with INTERVAL, LEFT JOIN to preserve cohorts where the m3_active count is zero, and computing a percentage with explicit rounding.
Investor update tomorrow — they want our 90-day cohort retention curve. For each signup-month cohort whose +90-day mark falls inside our snapshot window (2025-09-13 to 2026-03-31), give me four columns: cohort_month, cohort_size, m3_active (count still active 90 days after signup), and retention_pct (rounded to 1 decimal). Two CTEs feels right — one for cohort sizing, one for the +90-day check. Sort cohort_month ascending.
| Column | Type | Key |
|---|---|---|
| snapshot_date | TEXT | PK |
| subscription_id | INT | PKFK → fact_subscriptions |
| customer_id | INT | FK → dim_customers |
| state | TEXT | |
| plan_id | INT | FK → dim_plans |
| mrr | REAL | |
| paused_until | TEXT |
| Column | Type | Key |
|---|---|---|
| customer_id | INT | PK |
| TEXT | ||
| first_name | TEXT | |
| last_name | TEXT | |
| country | TEXT | |
| signup_date | TEXT | |
| acquisition_channel | TEXT | |
| is_subscriber | INT |
Each hint you reveal reduces the XP you can earn. Try the query first.
First CTE (cohort_size): from dim_customers WHERE is_subscriber = 1, group by DATE_TRUNC('month', signup_date::date), COUNT(*).
Second CTE (m3_active): join dim_customers to fact_subscription_snapshots ON snapshot_date = signup_date::date + INTERVAL '90 days' AND state = 'active'. This is the date-arithmetic crux.
Restrict cohorts to those whose +90-day mark is inside the snapshot window: signup_date BETWEEN '2025-06-15' AND '2025-12-31' approximately. Cleaner: filter inside the CTE on signup_date::date + 90 BETWEEN '2025-09-13' AND '2026-03-31'.
Final SELECT: LEFT JOIN cohort_size to m3_active to preserve cohorts with zero retention; ROUND(100.0 * m3 / size, 1).