Are Duplicate Records Eroding Your Bottom Line?
Fuzzy matching and deduplication help achieve a single view of customers for
competitive advantage.
Bad data is a challenge for all organizations. A recent TDWI study reported that
organizations lose more than $611 billion each year due to bad data.
One major root cause of bad data is duplicate records. Conflicting
data corrupts the integrity of databases and prevents organizations from
gaining a single, accurate and organized view of enterprise data. Poor data
quality that includes duplicate records restricts the implementation of
mission-critical initiatives and creates the risk of poor business decisions by
clouding critical customer, employee, and financial information.
Duplicate records come from many sources including acquisitions or mergers,
legacy systems, data migrations and data entry errors. Regardless of the source,
these issues quickly become costly for your business through program
inefficiencies, missed opportunities, and erroneous views of information.
There is, however, a way to turn this chaotic data into actionable data – by
identifying and eliminating duplicate records. Deduplication of data through the
use of merge/purge solutions is an important component of the data cleansing
process. But identifying duplicate records has its own set of challenges.
- Phonetic Matching
- N-gram or Q-gram-based Algorithms
- Jaro-Winkler Algorithm
- More Fuzzy Algorithms
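
As a rough illustration of two of the techniques listed above, the sketch below scores string similarity with character bigrams (Jaccard overlap) and with Jaro-Winkler. The function names, the bigram size, and the 0.1 prefix scale are illustrative assumptions, not taken from any particular merge/purge product. Phonetic matching is not shown.

```python
def ngrams(text, n=2):
    """Return the set of character n-grams in text (lowercased)."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Jaccard overlap of the two n-gram sets, from 0.0 to 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        # Both strings are shorter than n: fall back to exact comparison.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def jaro(s1, s2):
    """Jaro similarity (case-sensitive), from 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order are transpositions.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(x != y for x, y in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro score boosted for a shared prefix of up to four characters."""
    score = jaro(s1, s2)
    prefix = 0
    for x, y in zip(s1[:4], s2[:4]):
        if x != y:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)
```

For example, `jaro_winkler("MARTHA", "MARHTA")` scores about 0.961, so a threshold somewhere below that would flag the pair as a likely duplicate. The right threshold depends on your data and on how costly a false match is.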
1. Standardize your data
2. Determine your business needs
3. Monitor your data
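
The first step above, standardizing your data, might look like the following minimal sketch. It assumes records are dictionaries with `name` and `address` fields. The abbreviation map and the function names are hypothetical, and exact matching on the standardized values stands in for a fuzzy comparison.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation map; real rules would come from your own data.
ABBREVIATIONS = {
    "st": "street",
    "ave": "avenue",
    "rd": "road",
    "inc": "incorporated",
}


def standardize(value):
    """Lowercase, drop punctuation, collapse spaces, expand abbreviations."""
    value = value.lower()
    # Replace anything that is not a word character or whitespace with a space.
    value = re.sub(r"[^\w\s]", " ", value)
    tokens = value.split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def group_duplicates(records):
    """Group records whose standardized name and address are identical."""
    groups = defaultdict(list)
    for record in records:
        key = (standardize(record["name"]), standardize(record["address"]))
        groups[key].append(record)
    # Only groups with more than one record are duplicate candidates.
    return [group for group in groups.values() if len(group) > 1]


records = [
    {"name": "Acme, Inc.", "address": "12 Oak Ave"},
    {"name": "ACME Inc", "address": "12 Oak Avenue"},
    {"name": "Bolt Ltd", "address": "9 Elm Rd"},
]
duplicate_groups = group_duplicates(records)
```

Here the two Acme records end up in one group because both standardize to the same name and address. Standardizing first means any later fuzzy comparison sees fewer superficial differences.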