First aggregation on fact_diagnoses. Reveals the Principal / Admit / Secondary distribution that every downstream mission filters on. GROUP BY a categorical column with COUNT(*).
Heads up — first day on quality and I'm trying to get my bearings on the diagnosis table. I see fact_diagnoses has a dx_type column with values like 'Principal' and 'Secondary' but I don't actually know how many of each we have in the data. Can you pull a count by dx_type so I have a baseline before I start filtering on it for the readmission stuff? Just dx_type and count, sorted with the biggest bucket on top.
| Column | Type | Key |
|---|---|---|
| diagnosis_id | INT | PK |
| encounter_id | INT | FK → fact_encounters |
| icd_code | TEXT | FK → dim_icd |
| dx_type | TEXT | |
| poa_indicator | TEXT | |
| dx_rank | INT |
Each hint you reveal reduces the XP you can earn. Try the query first.
Single table — fact_diagnoses. GROUP BY dx_type, COUNT(*) AS dx_count.
Three values to expect: Principal (one per encounter that has any dx), Admit (admit-time documentation, on inpatient/observation only), Secondary (comorbidities, multiple per encounter possible).
ORDER BY dx_count DESC so the biggest bucket leads — Riley's asking for the baseline view, biggest first.