Are You Afraid to Say Goodbye to Your Data?
By Dylan Jones, founder, Data Quality Pro
When identifying data in scope for a migration, I
typically start from the premise that ALL data is
out of scope, unless someone can justify its
existence. (This forces the emphasis back on the
business to justify its use of the data.)
In most cases, at least half of the information has
been found to have limited value and could be cut from
the target system, typically with significant cost
savings, because every item of data incurs modeling,
data quality, data mapping, transfer coding, and
extensive validation work.
The Causes of Growth
Why have data volumes grown so excessively? There
are plenty of reasons:
1. Storage is affordable and accessible;
2. Data warehouses need feeding with data (but how
much of that data is transformed to actions?);
3. Applications are designed that have
attributes/data structures that are bloated and in
many cases redundant;
4. System silos lead to replication of corporate
data;
5. Mergers and acquisitions are commonplace, and data
often comes with the deal; and
6. There is no archive strategy.
I believe the main reason data volumes are growing,
though, is simply the last point: Organizations are
not very good at developing an archival strategy to
remove stale data. The impacts of this growth are
numerous:
• Increased staffing costs to tune and manage the
data;
• Additional cooling/infrastructure costs;
• Reduced query performance;
• Backup windows become compromised;
• Data integration and data migration become far
more complex and costly;
• New IT projects take longer and are more prone to
failure; and
• Slower performance lowers knowledge worker
productivity and increases costs.
The Impact of Stale Data on Data Quality Management
There is a danger in assessing data quality across
stale data, as it can dramatically skew your
findings.
If data quality was found to be poor historically
(perhaps there was a lack of completeness in the
past, but now there are far fewer data “gaps”), we
may incorrectly assume that our improvement process
is working correctly.
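One way to see the skew is to measure the same metric separately for stale and current records. The following is a minimal sketch, not a definitive method: the sample records, the `email` attribute, and the cutoff date are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical records: (created_on, email). None marks a missing value.
records = [
    (date(2004, 3, 1), None),
    (date(2005, 6, 15), None),
    (date(2006, 1, 9), "a@example.com"),
    (date(2009, 11, 2), "b@example.com"),
    (date(2010, 2, 20), "c@example.com"),
    (date(2010, 5, 5), None),
]


def completeness(rows):
    """Share of rows whose value is populated (0.0 for an empty input)."""
    if not rows:
        return 0.0
    filled = sum(1 for _, value in rows if value is not None)
    return filled / len(rows)


def split_by_cutoff(rows, cutoff):
    """Separate rows into (stale, current) around an assumed cutoff date."""
    stale = [r for r in rows if r[0] < cutoff]
    current = [r for r in rows if r[0] >= cutoff]
    return stale, current


# Assumed boundary between "historical" and "current" business activity.
cutoff = date(2009, 1, 1)
stale, current = split_by_cutoff(records, cutoff)

print(f"all records:     {completeness(records):.2f}")
print(f"stale records:   {completeness(stale):.2f}")
print(f"current records: {completeness(current):.2f}")
```

Reporting only the blended figure hides the fact that the stale and current populations behave differently, which is the distortion described above.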
I recall an organization that, upon receipt of their
new data profiling tool, pointed it at their billing
system.
They were horrified to find thousands of historical
errors in tariff coding, product code allocation,
and many other issues. The problem was that the
company had shifted its business model from
offering products to focusing far more on services.
In addition, many of the customers incorrectly
billed in the past had terminated their accounts. By
running a data quality assessment on this historical
data, the company was, in fact, gaining no real
insight into data quality across its current
business model.
Yes, they discovered they had badly designed
processes, but a workshop with the knowledge workers
confirmed the same insight within a few minutes.
What they should have been focusing on originally is
how data quality impacts their business TODAY.
Designing an Archive Strategy: Getting Reuse from
the Data Quality Team
There are a number of techniques common to the data
quality practitioner that can play a useful role in
the decommissioning of your corporate data:
• Information chain mapping: Helps identify the flow
of information across the enterprise so that any
downstream data consumers can be assessed for
potential impact from decommissioned data.
• Data profiling: Analyzing the statistics of data
elements (records/attributes) can help identify
redundant data that can be eliminated.
• Data matching/relationship discovery: Helps
identify dependent data in disparate systems so that
a synchronized process of data removal can take
place.
• CRUD analysis: Identifying which applications
Create, Read, Update, or Delete data is of great
importance when determining which datasets can be
archived.
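Two of the techniques above, data profiling and CRUD analysis, can be sketched in a few lines. This is an illustration under stated assumptions, not a production approach: the column samples, the access log, and every dataset and application name are hypothetical, and a real assessment would draw on actual system metadata and audit logs.

```python
from collections import defaultdict

# Hypothetical column samples taken from a source system.
columns = {
    "customer_id": [1, 2, 3, 4],
    "fax_number": [None, None, None, None],
    "region_code": ["EU", "EU", "EU", "EU"],
}


def profile(values):
    """Return basic statistics for one column."""
    non_null = [v for v in values if v is not None]
    return {
        "null_ratio": 1 - len(non_null) / len(values) if values else 1.0,
        "distinct": len(set(non_null)),
    }


def redundancy_candidates(cols):
    """Columns that are entirely empty or hold at most one distinct value."""
    found = []
    for name, values in cols.items():
        stats = profile(values)
        if stats["null_ratio"] == 1.0 or stats["distinct"] <= 1:
            found.append(name)
    return sorted(found)


# Hypothetical access log: (application, dataset, operation),
# where operation is one of C, R, U, D.
access_log = [
    ("billing", "invoices", "C"),
    ("billing", "invoices", "R"),
    ("reporting", "invoices", "R"),
    ("reporting", "legacy_tariffs", "R"),
    ("crm", "customers", "U"),
]


def crud_matrix(log):
    """Map dataset -> application -> set of operations observed."""
    matrix = defaultdict(lambda: defaultdict(set))
    for app, dataset, op in log:
        matrix[dataset][app].add(op)
    return matrix


def read_only_datasets(matrix):
    """Datasets that no application creates, updates, or deletes."""
    return sorted(
        dataset
        for dataset, apps in matrix.items()
        if all(ops <= {"R"} for ops in apps.values())
    )


print(redundancy_candidates(columns))
print(read_only_datasets(crud_matrix(access_log)))
```

Columns flagged by the profile and datasets that are only ever read are candidates for review with the business, not automatic removals. Datasets that never appear in the log at all would need to be found by comparing against a full inventory.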
So, What Next?
Archiving data is nearly always initiated by IT. If
you’re on the business side, start the discussions
now and play your part, because there are
significant benefits to the business community in
archiving off data. Waiting for something magical
to happen without your involvement means it will
simply never get done.
(Note that we’re talking about archiving, typically
to a readily accessible medium, not deleting.)
The data can still be maintained in an offline store
for compliance or reporting requirements, but
particularly if you want to reduce the costs of your
data quality management efforts and create a more
effective workforce, it may just be the time to
collaborate with your IT colleagues and begin the
essential activity of creating an archive process.
If you find after several months that on no occasion
did you need to dip into the archive to retrieve
some past information, it may be time to archive to
tape, store offsite, and cut loose for good.
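The check described in the last paragraph can be expressed as a simple rule over a retrieval log. The sketch below is only an illustration: the archive entries, their names, and the 180-day reading of "several months" are assumptions, not figures from the article.

```python
from datetime import date, timedelta

# Assumption: "several months" is interpreted here as 180 days.
QUIET_PERIOD = timedelta(days=180)

# Hypothetical archive register: when each dataset was archived
# and every date on which someone retrieved data from it.
archive = {
    "orders_2004": {"archived_on": date(2010, 1, 4), "retrievals": []},
    "tariffs_2006": {
        "archived_on": date(2010, 1, 4),
        "retrievals": [date(2010, 6, 1)],
    },
    "leads_2008": {"archived_on": date(2010, 6, 20), "retrievals": []},
}


def ready_for_offline(entries, today, quiet=QUIET_PERIOD):
    """Datasets untouched since archiving (or last retrieval) for `quiet`."""
    ready = []
    for name, info in entries.items():
        last_touch = max([info["archived_on"], *info["retrievals"]])
        if today - last_touch >= quiet:
            ready.append(name)
    return sorted(ready)


print(ready_for_offline(archive, today=date(2010, 7, 22)))
```

Only datasets that have sat untouched for the whole quiet period are suggested for tape or offsite storage; anything retrieved recently stays on the accessible medium.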
Source: Data Quality Pro, July 22, 2010
(www.dataqualitypro.com). Dylan Jones is the
founder of Data Quality Pro. Reach him at http://www.dataqualitypro.com/data-quality-dylan-jones.