Use GROUP BY with TRIM to clean trailing whitespace, ORDER BY DESC, and LIMIT
What are our top 10 most common diagnoses by encounter volume? Use the diagnosis_code field. I suspect there may be duplicates from whitespace issues in the EHR extract — our IT team has flagged this before. Clean it up before you count.
| Column | Type | Key |
|---|---|---|
| encounter_id | INT | PK |
| patient_id | INT | FK → dim_patients |
| provider_id | INT | FK → dim_providers |
| department_id | INT | FK → dim_departments |
| admit_date | TEXT | |
| discharge_date | TEXT | |
| diagnosis_code | TEXT | |
| diagnosis_description | TEXT | |
| encounter_type | TEXT | |
| discharge_disposition | TEXT |
Each hint you reveal reduces the XP you can earn. Try the query first.
Single-table top-N by aggregate count, with a string-cleaning step before the grouping key.
TRIM the diagnosis code in BOTH the SELECT and the GROUP BY, then ORDER BY count descending and LIMIT to the top slice.
Without TRIM, codes that differ only by trailing whitespace split into separate groups and the count splits with them.