Data Cleaning · Medium

When do you need fuzzy string matching, and what's the approach?

Seen at: Salesforce · Stripe · Experian

You need to deduplicate a customer table with names like 'John Doe', 'Jon Doe', 'J. Doe', 'John Doe Jr.'. What's your approach?
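One possible way to approach this, sketched with only the Python standard library: normalize each name, score pairs with an edit-based similarity, and treat anything above a threshold as a candidate duplicate for review. The helper names (`normalize`, `similarity`, `candidate_pairs`), the suffix list, the 0.8 score given to a matching initial, and the 0.85 threshold are all illustrative assumptions, not values from the question. On a real table you would likely also block on another field (email, postcode, last-name prefix) before comparing, since scoring every pair grows quadratically.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Assumed list of generational suffixes; extend for your data.
SUFFIXES = {"jr", "sr", "ii", "iii"}


def normalize(name):
    """Lowercase, drop punctuation, and split off a generational suffix."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    suffix = next((t for t in tokens if t in SUFFIXES), "")
    core = [t for t in tokens if t not in SUFFIXES]
    return core, suffix


def similarity(a, b):
    """Score two names in [0, 1] from first-name and last-name similarity."""
    core_a, _ = normalize(a)
    core_b, _ = normalize(b)
    if not core_a or not core_b:
        return 0.0
    last = SequenceMatcher(None, core_a[-1], core_b[-1]).ratio()
    first_a, first_b = core_a[0], core_b[0]
    if len(first_a) == 1 or len(first_b) == 1:
        # An initial is only weak evidence: assumed partial credit of 0.8.
        first = 0.8 if first_a[0] == first_b[0] else 0.0
    else:
        first = SequenceMatcher(None, first_a, first_b).ratio()
    return 0.5 * last + 0.5 * first


def candidate_pairs(names, threshold=0.85):
    """Return (a, b, score, suffix_differs) for pairs at or above threshold.

    Compares all pairs, which is fine for a small sample only.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            suffix_differs = normalize(a)[1] != normalize(b)[1]
            pairs.append((a, b, score, suffix_differs))
    return pairs


if __name__ == "__main__":
    sample = ["John Doe", "Jon Doe", "J. Doe", "John Doe Jr.", "Jane Smith"]
    for a, b, score, suffix_differs in candidate_pairs(sample):
        note = " (suffix differs, review manually)" if suffix_differs else ""
        print(f"{a!r} ~ {b!r}: {score:.2f}{note}")
```

The suffix is kept out of the score and reported separately because 'John Doe' and 'John Doe Jr.' may well be two different people; a name match alone is usually not enough to merge records, so in practice the score would be combined with other fields before anything is deduplicated.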

