First LEFT JOIN with IS NULL anti-join. Find ICD codes in the codebook (dim_icd) that never appear in fact_diagnoses. Sets up the Hard-tier anti-join intuition behind the M24 data-quality audit.
Codebook utilization audit, follow-up to last quarter's coder training. We added a bunch of codes to dim_icd to support the ED expansion (sepsis, AKI, respiratory failure variants), but I'm hearing the coding team isn't reaching for them. Show me every code in dim_icd that has *zero* rows in fact_diagnoses — those are the codes we trained on but nobody's documenting. Just code, description, and body system. Sort by code. If the list is short we can run targeted refreshers; if it's long we have a documentation problem, not a knowledge problem.
| Column | Type | Key |
|---|---|---|
| icd_code | TEXT | PK |
| icd_description | TEXT | |
| ccsr_category | TEXT | |
| ccsr_description | TEXT | |
| body_system | TEXT | |
| is_chronic | INT | |
| hcc_flag | INT |
| Column | Type | Key |
|---|---|---|
| diagnosis_id | INT | PK |
| encounter_id | INT | FK → fact_encounters |
| icd_code | TEXT | FK → dim_icd |
| dx_type | TEXT | |
| poa_indicator | TEXT | |
| dx_rank | INT |
Each hint you reveal reduces the XP you can earn. Try the query first.
LEFT JOIN dim_icd to fact_diagnoses ON icd_code (NOT INNER JOIN — INNER would only keep codes WITH activity, which is the opposite of what Maya wants).
Filter WHERE fd.icd_code IS NULL — the codes that didn't match any fact_diagnoses row. That's the anti-join idiom.
Three columns: icd_code, icd_description, body_system. ORDER BY icd_code.