Clinical Trials & Research Analytics Path · Mission 6 of 25Medium

Arm balance — flag the imbalanced trials

Conditional aggregation across two tables: per-trial enrollment by arm, then a window-style total to compute each arm's share. Practice the core randomization-balance audit pattern that biostatisticians run on every trial before unblinding.

The Brief

Elena VasquezBiostatisticianslack-dm

DSMB pre-read needs a randomization audit. Find every trial where any single arm is carrying more than 70% of the post-randomization enrollments — that's the threshold the protocols flag as needing investigator review. Join fact_enrollments to dim_protocols to dim_trials. Skip status = 'screen_fail' (those are pre-randomization). Per-trial total = sum of arm enrollments. Arm share = arm count / trial total. I want trial_name, arm_name, arm_n, arm_share_pct (rounded to 1 decimal), sorted by arm_share_pct DESC. Two trials should pop — one we already know about and one that's a surprise.

You'll practice

Conditional aggregationWindow over GROUP BYFloat promotion

Tables & columns available

fact_enrollmentsfact8 columns
ColumnTypeKey
enrollment_idINTPK
trial_idINTFK → dim_trials
protocol_idINTFK → dim_protocols
patient_idINTFK → dim_trial_patients
site_idINTFK → dim_sites
enrolled_dateTEXT
statusTEXT
withdrawal_reasonTEXT
dim_protocolsdim6 columns
ColumnTypeKey
protocol_idINTPK
trial_idINTFK → dim_trials
arm_nameTEXT
intervention_typeTEXT
atc_classTEXT
versionTEXT
dim_trialsdim10 columns
ColumnTypeKey
trial_idINTPK
nct_idTEXT
trial_nameTEXT
phaseINT
conditionTEXT
therapeutic_areaTEXT
sponsorTEXT
statusTEXT
start_dateTEXT
end_dateTEXT

Hints (4)

Each hint you reveal reduces the XP you can earn. Try the query first.

Hint 1

Three-table join: enrollments to protocols (for arm name) to trials (for trial name). Exclude screen_fail enrollments — they're pre-randomization and don't affect arm balance.

Hint 2

Two aggregations are needed: per-arm count and per-trial total. The per-trial total can be computed as a window aggregate over the per-arm GROUP BY result (a window function on top of an aggregate is the canonical pattern here).

Hint 3

Cast the arm/total ratio to a float before comparing against 0.70 — integer division silently rounds the share to 0 and filters out every match.

Hint 4

Round the share to one decimal for display. Sort by share descending. Row count is small — only a couple of trials hit the threshold.