Technique

CTEs and Recursive Queries for Analysts

Why `WITH` changed how analysts write SQL, when to chain vs nest, and the one recursive pattern that every analyst will eventually need (org charts, category trees, session chains).

9 min read

The WITH statement — the common table expression, or CTE — is the single biggest readability improvement SQL has seen in the last twenty years. And yet many analysts still write deeply nested subqueries because that’s what their courses taught. This post is the case for CTEs becoming your default, plus a short tour of the recursive variant that unlocks hierarchical data problems every analyst eventually faces.

Why CTEs replaced most subqueries

Both forms below return the same result. One of them is pleasant to read six weeks later.

-- Subquery form: reads inside-out, nested, hard to comment
SELECT segment, AVG(ltv) AS avg_ltv
FROM (
  SELECT c.segment, p.ltv
  FROM dim_customers c
  JOIN (
    SELECT customer_id, SUM(amount) AS ltv
    FROM fact_purchases
    WHERE purchased_at >= '2025-01-01'
    GROUP BY customer_id
  ) p USING (customer_id)
) x
GROUP BY segment;

-- CTE form: reads top-down, each block named, each block comment-able
WITH recent_ltv AS (
  -- One row per customer with their 2025 lifetime value
  SELECT customer_id, SUM(amount) AS ltv
  FROM fact_purchases
  WHERE purchased_at >= '2025-01-01'
  GROUP BY customer_id
),
segment_ltv AS (
  -- Join LTV back to customer segment
  SELECT c.segment, r.ltv
  FROM dim_customers c
  JOIN recent_ltv r USING (customer_id)
)
SELECT segment, AVG(ltv) AS avg_ltv
FROM segment_ltv
GROUP BY segment;

The CTE form is longer by a few lines. It’s also the form that survives code review, that a colleague can land on and modify confidently, and that comments usefully when the logic gets complex. For any query longer than three clauses, the CTE form is the correct default.

Chain vs nest: the readability decision

Once you’re using CTEs, the next decision is whether to chain (multiple CTEs in sequence) or nest (one CTE referenced from multiple places). The rule is simpler than it sounds:

  • Chain when each step transforms the data further and the next step only needs the previous step’s output. Most analyst work looks like this: filter → aggregate → join → filter again → final select.
  • Nest (well: reference a CTE from multiple places) when the same intermediate result is needed in two places. Don’t copy-paste the subquery; define it once as a CTE and reference it twice.

Recursive CTEs: the one pattern analysts need

A recursive CTE references itself. That sounds esoteric; it turns out to be the one SQL feature that makes hierarchical data tractable without writing a loop in Python. The skeleton is always the same shape:

WITH RECURSIVE walk AS (
  -- BASE CASE: the starting set of rows
  SELECT node_id, parent_id, 0 AS depth
  FROM nodes
  WHERE parent_id IS NULL          -- roots

  UNION ALL

  -- RECURSIVE CASE: one level deeper each iteration
  SELECT n.node_id, n.parent_id, w.depth + 1
  FROM nodes n
  JOIN walk w ON n.parent_id = w.node_id
)
SELECT * FROM walk;

Every recursive CTE has two parts: a base case that seeds the recursion, and a recursive case that generates the next layer from the previous one, joined via UNION ALL. The database keeps iterating until the recursive case returns zero new rows, then stops.

Org charts, category trees, and session chains

Three places recursive CTEs show up in real analyst work. Each one is a hiring-interview-level problem and a real ticket you’ll see.

Org chart: count all reports under a manager

-- How many people (direct + indirect) report to each manager?
WITH RECURSIVE reports AS (
  SELECT employee_id, manager_id, employee_id AS top_manager, 0 AS depth
  FROM dim_employees

  UNION ALL

  SELECT r.employee_id, e.manager_id, r.top_manager, r.depth + 1
  FROM reports r
  JOIN dim_employees e ON e.employee_id = r.manager_id
)
SELECT
  top_manager,
  COUNT(*) - 1 AS total_reports  -- minus 1 to exclude the manager themselves
FROM reports
GROUP BY top_manager;

Category tree: all descendants of a parent category

E-commerce category trees are classic recursive territory. "Electronics → Computers → Laptops → Gaming Laptops" is four levels deep; a product tagged to Gaming Laptops should roll up into all three parent categories for reporting.

Session chains: next-page journeys

Less obviously recursive: "starting from landing page X, what’s the most common 5-step user journey?" can be written as a recursive CTE that walks next-page edges from each session. In practice most analyst teams solve this with window functions, but the recursive form handles variable-length journeys that LAG/LEAD can’t.

Exercise

Given dim_employees(employee_id, name, manager_id), return every employee along with their full chain of management above them — as a comma-separated string like "Alice > Bob > Charlie > CEO".

Schema hint

CEO’s manager_id is NULL. Chain depth is at most ~10 levels in this org.

Expected

One row per employee, with a chain column containing the name path from the employee up to the CEO.

Show solution
WITH RECURSIVE chain AS (
  SELECT
    employee_id,
    name,
    manager_id,
    name AS path,
    1 AS depth
  FROM dim_employees

  UNION ALL

  SELECT
    c.employee_id,
    c.name,
    e.manager_id,
    c.path || ' > ' || e.name,
    c.depth + 1
  FROM chain c
  JOIN dim_employees e ON e.employee_id = c.manager_id
  WHERE c.depth < 15  -- cycle guard
)
SELECT employee_id, name, path AS chain
FROM chain c1
WHERE depth = (
  SELECT MAX(depth) FROM chain c2
  WHERE c2.employee_id = c1.employee_id
);

CTEs are the single biggest lever for writing SQL that other humans can read. Chain them by default, reach for the recursive variant the next time a manager asks you something about hierarchy, and you’ll stop writing the kind of inside-out nested subqueries that separate juniors from seniors in code review.

Build these reflexes against real data.

caseSQL runs 100+ missions against a realistic star schema with planted data-quality issues. Free tier runs in your browser; upgrade to Pro for interview prompts.

Keep reading