Filter noise rows out of a SUM using WHERE — the data-quality-before-reporting pattern
Ops team is reporting a total spend number that doesn't reconcile to the GL — by a lot. I suspect they're summing raw `banking_transactions.amount` without excluding reversals and chargebacks. Please give me the clean total: SUM of `amount` where `is_reversal = 0` AND `is_chargeback = 0`. Call the column `total_clean_spend`. Single number.
| Column | Type | Key |
|---|---|---|
| txn_id | INT | PK |
| account_id | INT | FK → banking_accounts |
| posted_date | TEXT | |
| amount | REAL | |
| merchant_category | TEXT | |
| channel | TEXT | |
| is_reversal | INT | |
| is_chargeback | INT |
Each hint you reveal reduces the XP you can earn. Try the query first.
Reversal rows carry the SAME positive amount as the original transaction (absolute-value convention in this source system). If you sum them in, you double-count.
Chargebacks are flagged on the original transaction itself — not as a separate line. Exclude them the same way.
SUM the amount with both flags filtered to 0 in the WHERE clause, alias the column total_clean_spend. Worth running the unfiltered SUM separately to see how big the gap is.