Investigate all 10 planted data quality issues across the BI schema
Something is very wrong with the numbers the board deck pulled last week. Revenue totals don't match between reports. Product counts look off. One region is showing up that shouldn't exist. I need a full data quality audit before we present anything externally. Find every issue you can — I've been told there are at least 10 problems buried in this database. Document what you find.
| Column | Type | Key |
|---|---|---|
| employee_id | INT | PK |
| first_name | TEXT | |
| last_name | TEXT | |
| TEXT | ||
| department | TEXT | |
| title | TEXT | |
| hire_date | TEXT | |
| salary | INT | |
| manager_id | INT | |
| is_active | INT |
| Column | Type | Key |
|---|---|---|
| product_id | INT | PK |
| product_name | TEXT | |
| category | TEXT | |
| unit_price | REAL | |
| launch_date | TEXT |
| Column | Type | Key |
|---|---|---|
| region_id | INT | PK |
| region_name | TEXT | |
| country | TEXT | |
| timezone | TEXT |
| Column | Type | Key |
|---|---|---|
| sale_id | INT | PK |
| employee_id | INT | FK → dim_employees |
| product_id | INT | FK → dim_products |
| region_id | INT | FK → dim_regions |
| sale_date | TEXT | |
| quantity | INT | |
| revenue | REAL | |
| discount | REAL |
| Column | Type | Key |
|---|---|---|
| target_id | INT | PK |
| employee_id | INT | FK → dim_employees |
| quarter | INT | |
| year | INT | |
| target_amount | REAL | |
| actual_amount | REAL |
Each hint you reveal reduces the XP you can earn. Try the query first.
Before aggregating fact_sales, audit dim_products — if the same product name appears on two rows, every downstream JOIN double-counts it. Think about how you'd spot a name that repeats.
Look for the ghost region: SELECT r.region_name, COUNT(s.sale_id) FROM dim_regions r LEFT JOIN fact_sales s ON r.region_id = s.region_id GROUP BY r.region_name
Investigate hire dates: SELECT employee_id, hire_date FROM dim_employees WHERE hire_date > '2026-01-01' OR hire_date LIKE '%/%'