First two-table join: fact_enrollments to dim_sites on site_id. Practice INNER JOIN, GROUP BY on a dimension attribute, and ranking sites by enrollment volume — the most common operational report at any sponsor or CRO.
The site management call is at 2pm. I need a roster of every site name with how many total enrollments they’ve recorded across all trials — site_name and enrollment_count, sorted with the highest-enrolling site at the top. Tiebreaker: site_name ascending so the order is stable when two sites tie. Pull from fact_enrollments joined to dim_sites on site_id. This is the standing scorecard our site managers see weekly; if a site is at the bottom three weeks running, we have a conversation.
| Column | Type | Key |
|---|---|---|
| enrollment_id | INT | PK |
| trial_id | INT | FK → dim_trials |
| protocol_id | INT | FK → dim_protocols |
| patient_id | INT | FK → dim_trial_patients |
| site_id | INT | FK → dim_sites |
| enrolled_date | TEXT | |
| status | TEXT | |
| withdrawal_reason | TEXT |
| Column | Type | Key |
|---|---|---|
| site_id | INT | PK |
| site_name | TEXT | |
| city | TEXT | |
| state | TEXT | |
| region | TEXT | |
| site_type | TEXT | |
| enrollment_capacity | INT | |
| pi_name | TEXT | |
| activated_date | TEXT |
Each hint you reveal reduces the XP you can earn. Try the query first.
Two-table join: enrollments to sites on the site key. Group on the site's human-readable name so the report header is meaningful.
The aggregate is a row count per site. Name the output column for the metric.
Stable ordering: rank descending by the metric, then alphabetic on site name as the tiebreaker. Result has one row per site that appears in fact_enrollments (INNER JOIN); every site in dim_sites has at least one enrollment in this dataset, so expect 15 rows.