Are Duplicate Records A Blow to Your Bottom Line?
By Abby Garcia Telleria, copywriter & editor of Melissa Data’s DQ Insider newsletter

Is the problem of poor data quality an ongoing challenge for your organization?

A recent TDWI study reported that organizations lose more than $611 billion each year due to bad data. Even more shocking – 42 percent of organizations have made no effort to monitor the quality of their data, according to a study by The Information Difference.

One major root cause of poor data quality is duplicate records. Duplicate records – at any level or amount – can adversely affect the performance of your database, as they can tarnish vital customer contact information. Conflicting data prevents your organization from gaining a single, accurate view of your customer, which creates the risk of making poor business decisions.

Duplicate records also lead to increased manual labor, lackluster results from response-driven customer initiatives, and storage and retrieval optimization concerns.

But there is a way you can turn chaotic data into actionable information – by identifying and eliminating duplicate records.

The Problem of Finding Non-Exact Matches
Deduplication of data through merge/purge solutions is one of the critical components of the data cleansing process. But the biggest roadblock to identifying duplicates lies in detecting non-exact matching duplicate records – data that appears to be multiple sets of unique information but is actually the same record entered more than once. These “non-exact matching” duplicates are very difficult to identify.

For example, a ‘Beth Smith’ at ‘United Data Machines’ can be recorded in the same database, or in a different one, as ‘Smithe, Elizabeth’ at ‘UDM’. In reality, Beth Smith and Elizabeth Smithe are the same person, but your organization might identify the data as two different contacts, with two different order histories, two different buying patterns, etc.

The most effective merge/purge applications use fuzzy matching algorithms that identify these hard-to-spot, non-exact matching duplicates.

Fuzzy matching is a mathematical process that determines the similarities between data sets, information and facts – where the outcome is not simply true or false but a degree of likelihood, hence the word “fuzzy.” The process compares any data type of any length and from any place in a field to find non-exact matches.
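To make the idea of a graded outcome concrete, here is a minimal sketch using Python's standard-library difflib.SequenceMatcher. It is not one of the algorithms named in this article, and it is not a description of how any particular merge/purge product works. The function names and the 0.85 threshold are assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score between 0.0 and 1.0; 1.0 means identical after lowercasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_probable_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag a pair as a likely duplicate when the score reaches the threshold.

    The threshold is a hypothetical starting point, not a recommended value.
    """
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    for left, right in [("Smith", "Smithe"), ("Smith", "Jones")]:
        print(left, right, round(similarity(left, right), 3),
              is_probable_duplicate(left, right))
```

The result is a score rather than a yes/no answer. Where to draw the line between "duplicate" and "distinct" is a business decision that usually needs tuning against real records.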

Types of Fuzzy Matching Algorithms
There are several different algorithms that can be implemented as part of the deduplication process.

Phonetic matching - Uses a phonetic algorithm to detect “alike-sounding” relationships between words. Phonetic matching allows you to perform approximate searches, instead of just ‘exact’ matches – thus enabling your organization to find variations of a name.
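As an illustration of phonetic matching, the sketch below implements American Soundex, one well-known phonetic algorithm. The article does not say which phonetic algorithm it has in mind, so the choice of Soundex is an assumption, as is the function name.

```python
# Map each coded consonant to its Soundex digit.
_CODES = {}
for _digit, _letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for _ch in _letters:
        _CODES[_ch] = str(_digit)


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name ("" if no letters)."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:4]


if __name__ == "__main__":
    for n in ("Smith", "Smithe", "Smyth", "Beth", "Elizabeth"):
        print(n, soundex(n))
```

'Smith', 'Smithe' and 'Smyth' all receive the same code, so they would be grouped as candidates. 'Beth' and 'Elizabeth' do not, which shows that nickname variations need something beyond sound-alike matching, such as a nickname lookup table.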

N-gram or Q-gram-based - N-gram (or q-gram) models are primarily used in statistical natural language processing. An n-gram is a subsequence of n items from a given sequence – which can be phonemes, syllables, letters, words or base pairs, as defined by Wikipedia. For deduplication, two values are compared by how many n-grams they have in common.
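A minimal sketch of n-gram comparison follows. It splits each string into overlapping character bigrams and scores the pair with the Jaccard index (shared n-grams divided by all distinct n-grams). Using character bigrams and Jaccard is an assumption made for illustration, since the article does not specify a particular scoring rule, and the function names are hypothetical.

```python
def ngrams(text: str, n: int = 2) -> set:
    """Return the set of overlapping character n-grams in text (lowercased)."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Jaccard similarity of the two n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        # Both strings are shorter than n; fall back to exact comparison.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    print(sorted(ngrams("Smith")))
    print(ngram_similarity("Smith", "Smithe"))
    print(ngram_similarity("Smith", "Jones"))
```

Because the comparison works on small overlapping fragments, a single inserted or mistyped character changes only a few n-grams and the score stays high.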

Jaro-Winkler - The Jaro-Winkler distance is a measure of similarity between two strings that gives extra weight to strings sharing a common prefix. It is mainly used in the area of record linkage for duplicate detection.
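The Jaro-Winkler measure can be sketched in plain Python as shown here. This follows the commonly published definition (a match window of half the longer string's length minus one, a prefix of up to four characters, and a scaling factor of 0.1). It is offered as an illustration under those assumptions, not as the implementation used by any specific product, and the function names are hypothetical.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between 0.0 (nothing in common) and 1.0 (identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        # Look for an unused matching character within the window.
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


if __name__ == "__main__":
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
    print(round(jaro_winkler("DIXON", "DICKSONX"), 4))
```

The comparison is case-sensitive as written, so callers would normally lowercase and trim both values before scoring them.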

Power and Performance in One Package
Most mailing software packages integrate a deduplication function as part of the address hygiene process, not only to eliminate bad addresses from the mailing but also to find and eliminate duplicate records. The combined process ultimately saves the mailer money on both printing and postage.

Mailers who need a more potent merge/purge deduping process can choose to incorporate a more sophisticated program that identifies the most difficult-to-detect duplicate records and also enables USPS® CASS™ address verification for superior list hygiene.

Melissa Data’s MatchUp software facilitates the most sophisticated record matching process available with its advanced fuzzy matching algorithms, along with CASS address verification routines – for organizations that are serious about cleaning up and clearing out the waste.

Protecting the Value of Your Database
The true value of your database is determined by one fundamental component – the quality of your data. Without data that is reliable, accurate and up to date, your organization can’t deliver trusted customer information throughout the enterprise. Finding the most difficult-to-detect duplicate records through the use of deduplication solutions like MatchUp helps streamline your database, improve marketing efficiency, and achieve a unified view of your customer.

---Source: To learn more about how to implement deduplication technology into your operations, download the white paper, “Are Duplicate Records Eroding Your Bottom Line?” Or call 1-800-MELISSA to get a free trial of MatchUp.
 
