Normalize inconsistent category casing and calculate each category's share of total revenue
Can you build me a product category revenue breakdown? I want to see each category, its total revenue, and what percentage of overall revenue it represents. I've heard there might be a casing issue in the category data that's splitting one category into two rows — can you normalize it so we're not double-counting? Sort by total revenue descending.
| Column | Type | Key |
|---|---|---|
| sale_id | INT | PK |
| employee_id | INT | FK → dim_employees |
| product_id | INT | FK → dim_products |
| region_id | INT | FK → dim_regions |
| sale_date | TEXT | |
| quantity | INT | |
| revenue | REAL | |
| discount | REAL |
| Column | Type | Key |
|---|---|---|
| product_id | INT | PK |
| product_name | TEXT | |
| category | TEXT | |
| unit_price | REAL | |
| launch_date | TEXT |
Each hint you reveal reduces the XP you can earn. Try the query first.
Revenue lives in the fact table but categories are in the product dimension — you'll need to connect them. Before aggregating, check whether the category column has a data quality issue that could split what should be one group into two.
Use LOWER(category) to normalize 'Electronics' and 'electronics' into one bucket
For the percentage, divide each category's revenue by the total: SUM(revenue) / (SELECT SUM(revenue) FROM fact_sales)