Clinical Trials & Research Analytics Path · Mission 6 of 25Medium

Arm balance — flag the imbalanced trials

Conditional aggregation across two tables: per-trial enrollment by arm, then a window-style total to compute each arm's share. Practice the core randomization-balance audit pattern that biostatisticians run on every trial before unblinding.

Back to Clinical Trials & Research Analytics

The Brief

Elena VasquezBiostatisticianslack-dm

DSMB pre-read needs a randomization audit. Find every trial where any single arm is carrying more than 70% of the post-randomization enrollments — that's the threshold the protocols flag as needing investigator review. Join fact_enrollments to dim_protocols to dim_trials. Skip status = 'screen_fail' (those are pre-randomization). Per-trial total = sum of arm enrollments. Arm share = arm count / trial total. I want trial_name, arm_name, arm_n, arm_share_pct (rounded to 1 decimal), sorted by arm_share_pct DESC. Two trials should pop — one we already know about and one that's a surprise.

You'll practice

Conditional aggregationWindow over GROUP BYFloat promotion

Tables & columns available

fact_enrollmentsfact8 columns

Column	Type	Key
enrollment_id	INT	PK
trial_id	INT	FK → dim_trials
protocol_id	INT	FK → dim_protocols
patient_id	INT	FK → dim_trial_patients
site_id	INT	FK → dim_sites
enrolled_date	TEXT
status	TEXT
withdrawal_reason	TEXT

dim_protocolsdim6 columns

Column	Type	Key
protocol_id	INT	PK
trial_id	INT	FK → dim_trials
arm_name	TEXT
intervention_type	TEXT
atc_class	TEXT
version	TEXT

dim_trialsdim10 columns

Column	Type	Key
trial_id	INT	PK
nct_id	TEXT
trial_name	TEXT
phase	INT
condition	TEXT
therapeutic_area	TEXT
sponsor	TEXT
status	TEXT
start_date	TEXT
end_date	TEXT

Hints (4)

Each hint you reveal reduces the XP you can earn. Try the query first.

Hint 1

Three-table join: enrollments to protocols (for arm name) to trials (for trial name). Exclude screen_fail enrollments — they're pre-randomization and don't affect arm balance.

Hint 2

Two aggregations are needed: per-arm count and per-trial total. The per-trial total can be computed as a window aggregate over the per-arm GROUP BY result (a window function on top of an aggregate is the canonical pattern here).

Hint 3

Cast the arm/total ratio to a float before comparing against 0.70 — integer division silently rounds the share to 0 and filters out every match.

Hint 4

Round the share to one decimal for display. Sort by share descending. Row count is small — only a couple of trials hit the threshold.

← Previous

Mission 5

Mission 7