Conditional aggregation grouped by a denormalized dim attribute. Practice the same SUM/COUNT refund-rate pattern from M4 but joined through dim_listings to surface category-level signal — different cut, same skill.
Category review with the merchant team. Refund rate by category — same shape as M4 but cut by dim_listings.category instead of marketplace. Four columns: category, total_orders, refunded_orders, refund_rate_pct (rounded 1 decimal). Sort refund_rate_pct descending, category ascending. Apparel typically tops the refund rate in marketplace data — sizing returns are inherent to the category.
| Column | Type | Key |
|---|---|---|
| order_id | INT | PK |
| listing_id | INT | FK → dim_listings |
| marketplace_id | INT | FK → dim_marketplaces |
| order_ts | TEXT | |
| units | INT | |
| gross_amount | REAL | |
| refunded | INT |
| Column | Type | Key |
|---|---|---|
| listing_id | INT | PK |
| marketplace_id | INT | FK → dim_marketplaces |
| asin_or_listing_id | TEXT | |
| product_title | TEXT | |
| category | TEXT | |
| list_price | REAL | |
| current_status | TEXT |
Each hint you reveal reduces the XP you can earn. Try the query first.
JOIN dim_listings on listing_id to pull category.
GROUP BY l.category so the refund rate aggregates at the category grain.
SUM(refunded) for refunded count, COUNT(*) for total.
ROUND(100.0 * SUM(refunded) / COUNT(*), 1) AS refund_rate_pct.