JOIN application fact to customer dim, GROUP BY state with COUNT
I'm building the geographic exposure slide for the CRA committee. Count applications by the applicant's home state. Two columns — `state`, `application_count` — and please sort by application_count descending, then state ascending for the tie-break. Skip walk-in apps where customer_id is NULL — I want only the apps tied to an existing customer record.
| Column | Type | Key |
|---|---|---|
| app_id | INT | PK |
| customer_id | INT | FK → retail_customers |
| product_id | INT | FK → retail_products |
| application_date | TEXT | |
| requested_amount | REAL | |
| requested_term | INT | |
| loan_purpose | TEXT | |
| action_taken | INT | |
| denial_reason_1 | INT | |
| aus_recommendation | TEXT | |
| rate_spread | REAL | |
| lien_status | INT | |
| occupancy | INT | |
| ethnicity | INT | |
| race | INT | |
| sex | INT | |
| age_bucket | TEXT | |
| property_tract | TEXT | FK → lending_geography |
| Column | Type | Key |
|---|---|---|
| customer_id | INT | PK |
| household_id | TEXT | |
| first_name | TEXT | |
| last_name | TEXT | |
| date_of_birth | TEXT | |
| state | TEXT | |
| segment | TEXT | |
| fico_current | INT | |
| estimated_income | REAL | |
| kyc_risk | TEXT | |
| acquisition_channel | TEXT | |
| onboarded_date | TEXT | |
| status | TEXT |
Each hint you reveal reduces the XP you can earn. Try the query first.
The state column lives on the customer dim, not on applications — JOIN through `customer_id` to reach it.
Walk-in apps have a NULL customer_id. An INNER JOIN already drops them implicitly; an explicit `IS NOT NULL` filter just makes the population assumption visible. Fair-lending memos always document who's in scope.
Two-key sort: count descending so the headline state leads, then state ascending so equal-count states tie-break alphabetically and the ranking stays deterministic across runs.