Every analyst eventually owns the retention query. At a SaaS company it’s "what percent of customers who signed up in January were still active in February, March, April." At a marketplace it’s "do buyers come back in week 2." At a consumer app it’s the D1 / D7 / D30 retention chart that every investor asks about on day one. The math is always the same; the SQL is always where people get stuck.
This post walks through the full build: what a cohort actually is (and isn’t), how to pick the anchor date, how to write the grid query cleanly, the one NULL trap that catches everyone, and how to read the resulting triangle the way stakeholders read it.
Cohorts vs segments (the easy confusion)
A segment is a group defined by what users *are* — their plan tier, their country, their signup source. A cohort is a group defined by *when* something happened to them — the month they signed up, the week they first purchased, the day they hit 10 sessions. Retention analysis lives in cohort-land because we care about time-since-event, not about static attributes.
The easiest way to see the difference: segments stay fixed when you re-run the query next week, but cohorts grow — July cohort is July cohort forever, but every week there’s a new "this week’s cohort" that didn’t exist before.
Building the cohort anchor date
The first thing a retention query has to settle is the anchor: what event defines membership in a cohort? Common choices: signed_up_at (the default for most SaaS), first_purchase_at (for ecommerce), onboarded_at (for products with a multi-step setup). Pick the event that best represents "user is now a real user."
-- Anchor = signup month. One row per customer, with their cohort label.
WITH cohorts AS (
SELECT
customer_id,
DATE_TRUNC('month', signed_up_at)::date AS cohort_month
FROM dim_customers
WHERE signed_up_at IS NOT NULL
)
SELECT cohort_month, COUNT(*) AS cohort_size
FROM cohorts
GROUP BY cohort_month
ORDER BY cohort_month;Two design choices worth noting. First, DATE_TRUNC buckets at month-granularity — weekly is also common (swap to WEEK), daily is too granular to read. Second, the WHERE signed_up_at IS NOT NULL drops customers who somehow made it into the dim table without a signup date; that’s almost always data-quality exhaust and you don’t want them skewing the cohort size.
The retention grid query
Here’s the canonical shape. Join the cohort anchor to an activity fact, compute "months since signup" per activity row, then pivot.
WITH cohorts AS (
SELECT
customer_id,
DATE_TRUNC('month', signed_up_at)::date AS cohort_month
FROM dim_customers
),
activity AS (
SELECT
c.cohort_month,
c.customer_id,
(EXTRACT(YEAR FROM s.session_at) - EXTRACT(YEAR FROM c.cohort_month)) * 12
+ (EXTRACT(MONTH FROM s.session_at) - EXTRACT(MONTH FROM c.cohort_month)) AS months_since_signup
FROM cohorts c
JOIN fact_sessions s USING (customer_id)
)
SELECT
cohort_month,
months_since_signup,
COUNT(DISTINCT customer_id) AS active_users
FROM activity
WHERE months_since_signup BETWEEN 0 AND 11
GROUP BY cohort_month, months_since_signup
ORDER BY cohort_month, months_since_signup;Note COUNT(DISTINCT customer_id) rather than COUNT(*) — a customer with five sessions in the same month should still count once. This is the line that separates "retention query that looks right" from "retention query that overcounts by 3x."
To turn this into the percentage triangle that stakeholders want, divide each cell by the cohort size (month 0 count). That’s easiest as a second CTE:
WITH grid AS (
-- ... the query above ...
),
sizes AS (
SELECT cohort_month, active_users AS cohort_size
FROM grid
WHERE months_since_signup = 0
)
SELECT
g.cohort_month,
g.months_since_signup,
g.active_users,
s.cohort_size,
ROUND(100.0 * g.active_users / NULLIF(s.cohort_size, 0), 1) AS retention_pct
FROM grid g
JOIN sizes s USING (cohort_month)
ORDER BY g.cohort_month, g.months_since_signup;The "didn’t return" NULL trap
The fix looks like this:
WITH cohort_months AS (
SELECT DISTINCT DATE_TRUNC('month', signed_up_at)::date AS cohort_month
FROM dim_customers
),
full_grid AS (
SELECT
cm.cohort_month,
m AS months_since_signup
FROM cohort_months cm
CROSS JOIN generate_series(0, 11) AS m
)
SELECT
fg.cohort_month,
fg.months_since_signup,
COUNT(DISTINCT a.customer_id) AS active_users
FROM full_grid fg
LEFT JOIN activity a
ON a.cohort_month = fg.cohort_month
AND a.months_since_signup = fg.months_since_signup
GROUP BY fg.cohort_month, fg.months_since_signup
ORDER BY fg.cohort_month, fg.months_since_signup;Build a weekly retention grid for customers who signed up in the last 12 weeks. Each cell should be the percent of the signup cohort that had a session in that week (0 for no sessions, not missing).
dim_customers(customer_id, signed_up_at), fact_sessions(customer_id, session_at)
One row per (cohort_week, weeks_since_signup) pair. Columns: cohort_week, weeks_since_signup, cohort_size, active_users, retention_pct. Expect a triangle-ish shape where retention_pct starts at 100% at week 0 and decreases.
Show solution
-- Full retention grid with LEFT JOIN so empty cells appear as 0.
WITH cohorts AS (
SELECT
customer_id,
DATE_TRUNC('week', signed_up_at)::date AS cohort_week
FROM dim_customers
WHERE signed_up_at >= DATE('now', '-84 days')
),
cohort_sizes AS (
SELECT cohort_week, COUNT(*) AS cohort_size FROM cohorts GROUP BY cohort_week
),
full_grid AS (
SELECT cs.cohort_week, cs.cohort_size, w AS weeks_since_signup
FROM cohort_sizes cs
CROSS JOIN generate_series(0, 11) AS w
),
activity AS (
SELECT
c.cohort_week,
FLOOR((s.session_at - c.cohort_week) / 7)::int AS weeks_since_signup,
COUNT(DISTINCT c.customer_id) AS active_users
FROM cohorts c
JOIN fact_sessions s USING (customer_id)
GROUP BY c.cohort_week, weeks_since_signup
)
SELECT
fg.cohort_week,
fg.weeks_since_signup,
fg.cohort_size,
COALESCE(a.active_users, 0) AS active_users,
ROUND(100.0 * COALESCE(a.active_users, 0) / NULLIF(fg.cohort_size, 0), 1) AS retention_pct
FROM full_grid fg
LEFT JOIN activity a
ON a.cohort_week = fg.cohort_week
AND a.weeks_since_signup = fg.weeks_since_signup
ORDER BY fg.cohort_week, fg.weeks_since_signup;Reading the triangle (what stakeholders ask)
Once the grid is built, stakeholders ask three questions in order. First: is retention improving? Compare cohorts row-by-row at the same months_since_signup — if May’s month-3 retention is higher than February’s month-3 retention, something’s working.
Second: where do we lose people? Find the biggest percentage-point drop across consecutive MOBs within a single cohort. If retention goes 100% → 42% from MOB 0 to MOB 1, month 1 is where onboarding is failing. If it’s flat at 100% → 94% → 92% and then plunges at month 3, investigate what happens at month 3 (free trial ending, invoice landing, seasonal effect).
Third: what’s the long-run floor? Retention decays asymptotically to some number — the "loyal core." A healthy SaaS business has a floor in the 60–75% range by month 12; a consumer app might floor at 10–20% by month 3. If yours floors lower, you either have a product-market-fit problem or a cohort definition problem — often the latter.
Retention queries look intimidating the first time because of the date arithmetic and the pivot, but the underlying pattern — anchor, join, bucket, pivot — is the same one you already use for campaign performance and funnel reporting. Master the triangle once, and you can build it on any product schema inside a day of learning the tables.