Window function for total-of-group share + NULL filtering on a HMDA-coded column
Pre-LAR denial review. Pull the distribution of `denial_reason_1` codes across denied applications (`action_taken = 3`). Skip rows where `denial_reason_1 IS NULL` — those would fail FFIEC validation anyway and don't belong on this slide. Three columns: `denial_reason_1` (integer code), `denied_count`, `pct_of_denials` (4 decimals, share of the denial population not the full app population). Sort by count descending, then code ascending.
| Column | Type | Key |
|---|---|---|
| app_id | INT | PK |
| customer_id | INT | FK → retail_customers |
| product_id | INT | FK → retail_products |
| application_date | TEXT | |
| requested_amount | REAL | |
| requested_term | INT | |
| loan_purpose | TEXT | |
| action_taken | INT | |
| denial_reason_1 | INT | |
| aus_recommendation | TEXT | |
| rate_spread | REAL | |
| lien_status | INT | |
| occupancy | INT | |
| ethnicity | INT | |
| race | INT | |
| sex | INT | |
| age_bucket | TEXT | |
| property_tract | TEXT | FK → lending_geography |
Each hint you reveal reduces the XP you can earn. Try the query first.
Two filters compose: action_taken restricted to denied applications, AND `denial_reason_1 IS NOT NULL`. The latter catches the handful of incomplete records that would fail FFIEC validation anyway and don't belong on the slide.
For "share of group total in the same SELECT," reach for an aggregate window function — the family that lets you compute a grand total alongside each grouped row without a self-join or a separate CTE. Look up `SUM(...) OVER ()` and how it composes with COUNT.
Three projected columns; two-key sort (count desc, code asc). Round the share to 4 decimals so the deck formats cleanly.