Technique

Cohort Retention in SQL: The Query Every Analyst Eventually Owns

The retention triangle is the single most-requested chart in product analytics. Here’s how it’s built — cohort anchor, grid query, NULL traps, and how to read what stakeholders actually ask for.

11 min read

Every analyst eventually owns the retention query. At a SaaS company it’s "what percent of customers who signed up in January were still active in February, March, April." At a marketplace it’s "do buyers come back in week 2." At a consumer app it’s the D1 / D7 / D30 retention chart that every investor asks about on day one. The math is always the same; the SQL is always where people get stuck.

This post walks through the full build: what a cohort actually is (and isn’t), how to pick the anchor date, how to write the grid query cleanly, the one NULL trap that catches everyone, and how to read the resulting triangle the way stakeholders read it.

Cohorts vs segments (the easy confusion)

A segment is a group defined by what users *are* — their plan tier, their country, their signup source. A cohort is a group defined by *when* something happened to them — the month they signed up, the week they first purchased, the day they hit 10 sessions. Retention analysis lives in cohort-land because we care about time-since-event, not about static attributes.

The easiest way to see the difference: segments stay fixed when you re-run the query next week, but cohorts grow — July cohort is July cohort forever, but every week there’s a new "this week’s cohort" that didn’t exist before.

Building the cohort anchor date

The first thing a retention query has to settle is the anchor: what event defines membership in a cohort? Common choices: signed_up_at (the default for most SaaS), first_purchase_at (for ecommerce), onboarded_at (for products with a multi-step setup). Pick the event that best represents "user is now a real user."

-- Anchor = signup month. One row per customer, with their cohort label.
WITH cohorts AS (
  SELECT
    customer_id,
    DATE_TRUNC('month', signed_up_at)::date AS cohort_month
  FROM dim_customers
  WHERE signed_up_at IS NOT NULL
)
SELECT cohort_month, COUNT(*) AS cohort_size
FROM cohorts
GROUP BY cohort_month
ORDER BY cohort_month;

Two design choices worth noting. First, DATE_TRUNC buckets at month-granularity — weekly is also common (swap to WEEK), daily is too granular to read. Second, the WHERE signed_up_at IS NOT NULL drops customers who somehow made it into the dim table without a signup date; that’s almost always data-quality exhaust and you don’t want them skewing the cohort size.

The retention grid query

Here’s the canonical shape. Join the cohort anchor to an activity fact, compute "months since signup" per activity row, then pivot.

WITH cohorts AS (
  SELECT
    customer_id,
    DATE_TRUNC('month', signed_up_at)::date AS cohort_month
  FROM dim_customers
),
activity AS (
  SELECT
    c.cohort_month,
    c.customer_id,
    (EXTRACT(YEAR  FROM s.session_at) - EXTRACT(YEAR  FROM c.cohort_month)) * 12
      + (EXTRACT(MONTH FROM s.session_at) - EXTRACT(MONTH FROM c.cohort_month)) AS months_since_signup
  FROM cohorts c
  JOIN fact_sessions s USING (customer_id)
)
SELECT
  cohort_month,
  months_since_signup,
  COUNT(DISTINCT customer_id) AS active_users
FROM activity
WHERE months_since_signup BETWEEN 0 AND 11
GROUP BY cohort_month, months_since_signup
ORDER BY cohort_month, months_since_signup;

Note COUNT(DISTINCT customer_id) rather than COUNT(*) — a customer with five sessions in the same month should still count once. This is the line that separates "retention query that looks right" from "retention query that overcounts by 3x."

To turn this into the percentage triangle that stakeholders want, divide each cell by the cohort size (month 0 count). That’s easiest as a second CTE:

WITH grid AS (
  -- ... the query above ...
),
sizes AS (
  SELECT cohort_month, active_users AS cohort_size
  FROM grid
  WHERE months_since_signup = 0
)
SELECT
  g.cohort_month,
  g.months_since_signup,
  g.active_users,
  s.cohort_size,
  ROUND(100.0 * g.active_users / NULLIF(s.cohort_size, 0), 1) AS retention_pct
FROM grid g
JOIN sizes s USING (cohort_month)
ORDER BY g.cohort_month, g.months_since_signup;

The "didn’t return" NULL trap

The fix looks like this:

WITH cohort_months AS (
  SELECT DISTINCT DATE_TRUNC('month', signed_up_at)::date AS cohort_month
  FROM dim_customers
),
full_grid AS (
  SELECT
    cm.cohort_month,
    m AS months_since_signup
  FROM cohort_months cm
  CROSS JOIN generate_series(0, 11) AS m
)
SELECT
  fg.cohort_month,
  fg.months_since_signup,
  COUNT(DISTINCT a.customer_id) AS active_users
FROM full_grid fg
LEFT JOIN activity a
  ON a.cohort_month = fg.cohort_month
 AND a.months_since_signup = fg.months_since_signup
GROUP BY fg.cohort_month, fg.months_since_signup
ORDER BY fg.cohort_month, fg.months_since_signup;
Exercise

Build a weekly retention grid for customers who signed up in the last 12 weeks. Each cell should be the percent of the signup cohort that had a session in that week (0 for no sessions, not missing).

Schema hint

dim_customers(customer_id, signed_up_at), fact_sessions(customer_id, session_at)

Expected

One row per (cohort_week, weeks_since_signup) pair. Columns: cohort_week, weeks_since_signup, cohort_size, active_users, retention_pct. Expect a triangle-ish shape where retention_pct starts at 100% at week 0 and decreases.

Show solution
-- Full retention grid with LEFT JOIN so empty cells appear as 0.
WITH cohorts AS (
  SELECT
    customer_id,
    DATE_TRUNC('week', signed_up_at)::date AS cohort_week
  FROM dim_customers
  WHERE signed_up_at >= DATE('now', '-84 days')
),
cohort_sizes AS (
  SELECT cohort_week, COUNT(*) AS cohort_size FROM cohorts GROUP BY cohort_week
),
full_grid AS (
  SELECT cs.cohort_week, cs.cohort_size, w AS weeks_since_signup
  FROM cohort_sizes cs
  CROSS JOIN generate_series(0, 11) AS w
),
activity AS (
  SELECT
    c.cohort_week,
    FLOOR((s.session_at - c.cohort_week) / 7)::int AS weeks_since_signup,
    COUNT(DISTINCT c.customer_id) AS active_users
  FROM cohorts c
  JOIN fact_sessions s USING (customer_id)
  GROUP BY c.cohort_week, weeks_since_signup
)
SELECT
  fg.cohort_week,
  fg.weeks_since_signup,
  fg.cohort_size,
  COALESCE(a.active_users, 0) AS active_users,
  ROUND(100.0 * COALESCE(a.active_users, 0) / NULLIF(fg.cohort_size, 0), 1) AS retention_pct
FROM full_grid fg
LEFT JOIN activity a
  ON a.cohort_week = fg.cohort_week
 AND a.weeks_since_signup = fg.weeks_since_signup
ORDER BY fg.cohort_week, fg.weeks_since_signup;

Reading the triangle (what stakeholders ask)

Once the grid is built, stakeholders ask three questions in order. First: is retention improving? Compare cohorts row-by-row at the same months_since_signup — if May’s month-3 retention is higher than February’s month-3 retention, something’s working.

Second: where do we lose people? Find the biggest percentage-point drop across consecutive MOBs within a single cohort. If retention goes 100% → 42% from MOB 0 to MOB 1, month 1 is where onboarding is failing. If it’s flat at 100% → 94% → 92% and then plunges at month 3, investigate what happens at month 3 (free trial ending, invoice landing, seasonal effect).

Third: what’s the long-run floor? Retention decays asymptotically to some number — the "loyal core." A healthy SaaS business has a floor in the 60–75% range by month 12; a consumer app might floor at 10–20% by month 3. If yours floors lower, you either have a product-market-fit problem or a cohort definition problem — often the latter.

Retention queries look intimidating the first time because of the date arithmetic and the pivot, but the underlying pattern — anchor, join, bucket, pivot — is the same one you already use for campaign performance and funnel reporting. Master the triangle once, and you can build it on any product schema inside a day of learning the tables.

Build these reflexes against real data.

caseSQL runs 100+ missions against a realistic star schema with planted data-quality issues. Free tier runs in your browser; upgrade to Pro for interview prompts.

Keep reading