If you join a credit-risk team — a bank, a fintech lender, a BNPL, an auto loan portfolio — by year two you will own two queries: the vintage curve and the roll-rate matrix. They sit behind every portfolio review, every board slide on credit quality, every loss forecast. Getting them right is table stakes. Getting them right *quickly* is the difference between a seat at the weekly risk review and a seat at the junior desk.
This post is the practical how-to: what each one is, why stakeholders care, how to write it from a realistic banking schema, and how it gets asked about in an interview. Everything here runs against the caseSQL banking schema — dim_loans, fact_originations, fact_loan_status — so you can try each query end-to-end.
What a vintage curve actually is
A vintage curve tracks a cohort of loans originated in the same month, measuring how they perform over their lifetime. The x-axis is "months on book" (MOB): 1, 2, 3, ... 24. The y-axis is a risk metric — most commonly the cumulative default rate or the 30+ day delinquency rate. Each line on the chart is one origination month.
The curve’s shape tells you whether newer loans are performing worse than older loans at the same age, which is a leading indicator of credit quality. If the April vintage is at 2% default by MOB 6 and the October vintage is at 4% default by MOB 6, something changed in underwriting between those months — and the credit officer wants to know before month 12.
Writing a monthly vintage in SQL
The core move is joining originations to the status history on months_on_book rather than on calendar date. Every warehouse has slightly different date arithmetic — in PostgreSQL:
WITH originations AS (
SELECT
loan_id,
DATE_TRUNC('month', originated_at)::date AS vintage_month
FROM fact_originations
),
loan_ages AS (
SELECT
o.vintage_month,
o.loan_id,
s.status,
(EXTRACT(YEAR FROM s.status_date) - EXTRACT(YEAR FROM o.vintage_month)) * 12
+ (EXTRACT(MONTH FROM s.status_date) - EXTRACT(MONTH FROM o.vintage_month)) AS months_on_book
FROM originations o
JOIN fact_loan_status s USING (loan_id)
)
SELECT
vintage_month,
months_on_book,
COUNT(DISTINCT loan_id) AS loans,
SUM(CASE WHEN status = '30_day_delinquent' THEN 1 ELSE 0 END) AS dq_30,
ROUND(100.0 * SUM(CASE WHEN status = '30_day_delinquent' THEN 1 ELSE 0 END)
/ NULLIF(COUNT(DISTINCT loan_id), 0), 2) AS dq_rate_pct
FROM loan_ages
WHERE months_on_book BETWEEN 1 AND 24
GROUP BY vintage_month, months_on_book
ORDER BY vintage_month, months_on_book;

Three details to notice. First, the DATE_TRUNC rolls originations up to the month level — daily vintages are too granular to read in a chart. Second, the months_on_book math uses year-and-month arithmetic rather than AGE() or EXTRACT(DAYS), because stakeholders expect MOB 12 to mean "exactly one year later," not "365 or 366 days later depending on the leap year." Third, the NULLIF(..., 0) on the denominator keeps a vintage with zero loans from blowing up the query with a division by zero.
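The year-times-twelve trick is easy to sanity-check outside the warehouse. A minimal Python sketch of the same arithmetic (the dates are made up for illustration):

```python
# Same months-on-book arithmetic as the SQL above: year*12 + month deltas,
# so MOB 12 always means "twelve calendar months later," leap year or not.
from datetime import date

def months_on_book(originated: date, status: date) -> int:
    return (status.year - originated.year) * 12 + (status.month - originated.month)

print(months_on_book(date(2023, 2, 28), date(2024, 2, 29)))  # 12, despite the leap day
print(months_on_book(date(2023, 11, 15), date(2024, 1, 3)))  # 2, crossing the year boundary
```

Day-based arithmetic (AGE(), date subtraction) would put the leap-year case at 366 days and force a rounding decision; the year-and-month form sidesteps it entirely.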
Build a monthly vintage for the first 12 months on book, showing the cumulative 30+ day delinquency rate (not the per-month rate). A loan that went 30-DQ in MOB 3 should count as delinquent in MOB 3, 4, 5, ... 12.
fact_originations(loan_id, originated_at), fact_loan_status(loan_id, status_date, status)
One row per (vintage_month, months_on_book) pair, with columns loans, cum_dq, cum_dq_rate_pct. Cumulative means the percent never decreases as MOB increases for a given vintage.
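Before writing the SQL, the cumulative property is worth pinning down. A small Python sketch with invented first-delinquency months (None means the loan never went 30-DQ) shows why the rate can only grow with MOB:

```python
# Once a loan's first 30-DQ month is fixed, the cumulative count at each
# MOB can only grow. first_dq_mob values here are invented, one per loan.
first_dq_mob = [3, None, 7, 3, None, 11]

loans = len(first_dq_mob)
cum_rates = []
for mob in range(1, 13):
    cum_dq = sum(1 for m in first_dq_mob if m is not None and m <= mob)
    cum_rates.append(round(100.0 * cum_dq / loans, 2))

# the series never decreases — the property the prompt asks for
assert all(a <= b for a, b in zip(cum_rates, cum_rates[1:]))
print(cum_rates[2], cum_rates[-1])  # 33.33 at MOB 3, 66.67 at MOB 12
```

The SQL version does the same thing with a MIN(...) FILTER per loan and a comparison of first-DQ MOB against each month on book.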
Show solution
WITH first_dq AS (
SELECT
loan_id,
MIN(status_date) FILTER (WHERE status = '30_day_delinquent') AS first_dq_date
FROM fact_loan_status
GROUP BY loan_id
),
vintages AS (
SELECT
o.loan_id,
DATE_TRUNC('month', o.originated_at)::date AS vintage_month,
f.first_dq_date
FROM fact_originations o
LEFT JOIN first_dq f USING (loan_id)
),
panel AS (
SELECT
v.vintage_month,
m AS months_on_book,
COUNT(*) AS loans,
COUNT(*) FILTER (
WHERE v.first_dq_date IS NOT NULL
AND (EXTRACT(YEAR FROM v.first_dq_date) - EXTRACT(YEAR FROM v.vintage_month)) * 12
+ (EXTRACT(MONTH FROM v.first_dq_date) - EXTRACT(MONTH FROM v.vintage_month)) <= m
) AS cum_dq
FROM vintages v
CROSS JOIN generate_series(1, 12) AS m
GROUP BY v.vintage_month, m
)
SELECT
vintage_month,
months_on_book,
loans,
cum_dq,
ROUND(100.0 * cum_dq / NULLIF(loans, 0), 2) AS cum_dq_rate_pct
FROM panel
ORDER BY vintage_month, months_on_book;

Roll rates: the other side of the coin
Where vintage curves answer "how are loans performing as they age," roll rates answer "what percent of loans currently in delinquency bucket X will move to bucket X+1 next month." The matrix has delinquency buckets on both axes — Current, 30-DQ, 60-DQ, 90-DQ, 120+-DQ — and each cell is the probability of transitioning from the row state this month to the column state next month.
Credit officers use roll rates to forecast losses. If 45% of 60-DQ loans historically roll to 90-DQ, and 80% of 90-DQ loans eventually charge off, then the 60-DQ population tells you roughly how many dollars will charge off in two months. That forecast lands in the loss reserve (allowance for credit losses) on the balance sheet.
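The back-of-the-envelope version of that forecast, in Python with the illustrative rates from the paragraph above and an invented 60-DQ balance:

```python
# Two-step loss forecast: balance now in 60-DQ, times the 60->90 roll rate,
# times the 90->charge-off rate. The rates come from the text; the balance
# is a made-up example, not caseSQL data.
dq60_balance = 10_000_000        # dollars currently 60 days delinquent
roll_60_to_90 = 0.45             # share of 60-DQ that rolls to 90-DQ
roll_90_to_co = 0.80             # share of 90-DQ that eventually charges off

expected_chargeoffs = dq60_balance * roll_60_to_90 * roll_90_to_co
print(round(expected_chargeoffs))  # 3600000
```

That $3.6M is roughly what a credit officer would pencil into the reserve for charge-offs arriving in two months.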
WITH status_pairs AS (
SELECT
curr.loan_id,
curr.status AS from_status,
next.status AS to_status
FROM fact_loan_status curr
JOIN fact_loan_status next
ON next.loan_id = curr.loan_id
-- compare month buckets, not exact dates: month-end snapshots drift under
-- interval arithmetic (Jan 31 + 1 month = Feb 28, Feb 28 + 1 month = Mar 28,
-- which would miss a Mar 31 snapshot)
AND DATE_TRUNC('month', next.status_date)
= DATE_TRUNC('month', curr.status_date) + INTERVAL '1 month'
)
SELECT
from_status,
to_status,
COUNT(*) AS transitions,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY from_status), 2) AS pct_of_from
FROM status_pairs
GROUP BY from_status, to_status
ORDER BY from_status, to_status;

The SUM(COUNT(*)) OVER (PARTITION BY from_status) is the trick — an analytic function returning the total transitions out of each from_status bucket, which becomes the denominator for the percentage. This is one of those queries where window functions earn their complexity.
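The row-normalization that window function performs can be sketched in a few lines of Python — the same arithmetic over a toy transition-count table (the counts are illustrative, not from the caseSQL schema):

```python
# Row-normalize a transition-count matrix into roll rates: each cell is
# divided by the total transitions out of its from_status, exactly what
# SUM(COUNT(*)) OVER (PARTITION BY from_status) supplies in the SQL.
from collections import defaultdict

transitions = [
    ("current", "current", 950), ("current", "30_dq", 50),
    ("30_dq", "current", 30), ("30_dq", "30_dq", 25), ("30_dq", "60_dq", 45),
]

row_totals = defaultdict(int)
for frm, to, n in transitions:
    row_totals[frm] += n          # denominator: all exits from `frm`

roll = {(frm, to): round(100.0 * n / row_totals[frm], 2)
        for frm, to, n in transitions}

print(roll[("30_dq", "60_dq")])   # 45 of 100 exits from 30_dq -> 45.0
```

Each row of the resulting matrix sums to 100%, which is also a cheap sanity check to run on the SQL output.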
Net charge-off rate in one CTE
The classic follow-up once roll rates are in hand: net charge-off (NCO) rate. Charge-offs minus recoveries, divided by average outstanding balance. Annualized. Stakeholders want it monthly, by product, sometimes by vintage. The query is short but every piece has a reason:
WITH monthly_balance AS (
-- whole-book balance per month and product: the NCO denominator must cover
-- the entire portfolio, not just the loans that charged off; with one
-- snapshot row per loan per month, the monthly sum is the book's balance
SELECT
DATE_TRUNC('month', status_date)::date AS month,
product,
SUM(outstanding_balance) AS avg_balance
FROM fact_loan_status
GROUP BY 1, 2
)
SELECT
DATE_TRUNC('month', s.charge_off_date)::date AS month,
s.product,
SUM(s.charge_off_amount) AS gross_co,
SUM(s.recovery_amount) AS recoveries,
SUM(s.charge_off_amount) - SUM(s.recovery_amount) AS net_co,
b.avg_balance,
ROUND(
12 * 100.0 * (SUM(s.charge_off_amount) - SUM(s.recovery_amount))
/ NULLIF(b.avg_balance, 0),
2
) AS nco_rate_annualized_pct
FROM fact_loan_status s
JOIN monthly_balance b
ON b.month = DATE_TRUNC('month', s.charge_off_date)::date
AND b.product = s.product
WHERE s.charge_off_date IS NOT NULL
GROUP BY 1, 2, b.avg_balance
ORDER BY month, product;

How this work shows up in an interview
Risk-analyst interviews at banks and fintechs almost always include a vintage-curve question in some form. Common phrasings: "how would you detect that credit quality is deteriorating?" (answer: vintage curves), "how would you forecast losses for next quarter?" (answer: roll rates), "walk me through building an allowance model" (answer: roll rates chained into an NCO forecast). Interviewers rarely want perfect SQL; they want to see that you understand why the shape of the data matters.
A few signal-sending moves during the interview: mention that you’d exclude the first and last MOB of each vintage from cross-vintage comparisons (edge effects); mention that roll rates drift with the macro cycle and point estimates from one quarter can be misleading in a recession; mention that NCO isolates the realized loss but provision-to-NCO gaps can signal management’s view on the cycle. Each of those is one sentence that makes you sound like a year-two analyst, not a year-zero one.
None of this is exotic SQL — it’s CASE WHEN, window functions, and careful NULL handling. What makes it senior work is the business fluency on top: knowing which buckets matter, why the macro overlay exists, how loss reserves read on the balance sheet. That’s the bar caseSQL’s banking path is written to.