Monitor the Quality of your Master Data


WWW.PLATON.NET

Monitor the Quality of your
Master Data

THOMAS RAVN

TRA@PLATON.NET

March 16th, 2010, San Francisco

© Platon

Platon


A leading Information Management consulting firm


Independent of software vendors


Headquartered in Copenhagen, Denmark


220+ employees in 9 offices


300+ customers and 800+ projects


Founded in 1999


Employee owned company

“Platon received good feedback in our satisfaction survey. Clients cited the following strengths:
experience and skill of consultants, business focus and the ability to remain focused on the needs of the
client, and a strong methodological approach”

Gartner
July 2008


Key Concepts and Definitions

Information Management

“Information Management is the discipline of managing and leveraging
information in a company as a strategic asset”

MDM

“Master Data Management (MDM) is the structured management of Master
Data in terms of definitions, governance, architecture, technology
and processes”

Data Governance

“Data Governance is the cross-functional discipline of managing,
improving, monitoring, maintaining, and protecting data”

DQM

“Data Quality Management is the discipline of ensuring high quality data
in enterprise systems”

Components of an effective MDM approach

Business ownership, responsibility, accountability
Formalize business ownership and stewardship around data.

Common definitions
To be able to share data you need to share definitions and business rules.
Definitions require management, rigor and documentation.

Effective Master Data processes
Capturing Master Data efficiently needs to be built into the business
processes. Equally, consistent usage of Master Data needs to be
ensured across business processes and business functions.

Data Quality Management
Measure and monitor the quality of data.

Protect, validate and integrate data in IT applications
Control in which systems Master Data is entered and how it is
synchronized across systems. Manage Master Data Repository.

IT Change control
Ensure that Master Data is taken into account each and every time a
business process or an IT system is changed.


Typical Data Problems - 1

No        Name                   Address                         Purchase
90328574  IBM                    187 N.Pk. Str. Salem NH 01456   8,494.00
90328575  I.B.M. Inc.            187 N.Pk. St. Sarem NH 01456    3,432.00
90328575  International Bus. M.  187 No. Park St Salem NH 04156  2,243.00
09243242  Int. Bus. Machines     187 Park Ave Salem NH 04156     5,900.00
12398732  Inter-Nation Consults  15 Main St. Andover MA 02341    6,800.00
99643413  Int. Bus. Consultants  PO Box 9 Boston MA 02210        10,243.00
43098436  I.B. Manufacturing     Park Blvd. Boston MA 04106      15,999.00

How much did we spend with IBM last year?
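
The question can be made concrete with a small Python sketch. It sums the Purchase column from the table twice: once with an exact match on the name "IBM", and once after a simple name normalization. The `normalize` helper and its rules (uppercase, drop punctuation, drop a trailing "INC") are assumptions for illustration, not a method from the presentation.

```python
import re

# Rows copied from the table: (vendor no, name, purchase amount).
VENDORS = [
    ("90328574", "IBM", 8494.00),
    ("90328575", "I.B.M. Inc.", 3432.00),
    ("90328575", "International Bus. M.", 2243.00),
    ("09243242", "Int. Bus. Machines", 5900.00),
    ("12398732", "Inter-Nation Consults", 6800.00),
    ("99643413", "Int. Bus. Consultants", 10243.00),
    ("43098436", "I.B. Manufacturing", 15999.00),
]


def normalize(name: str) -> str:
    """Hypothetical rule: uppercase, drop punctuation and a trailing 'INC'."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", name.upper()).strip()
    return re.sub(r"\s+INC$", "", cleaned)


def spend(rows, key, transform=lambda s: s):
    """Sum the purchases of all rows whose (transformed) name equals key."""
    return sum(amount for _, name, amount in rows if transform(name) == key)


exact_total = spend(VENDORS, "IBM")                   # only the first row matches
normalized_total = spend(VENDORS, "IBM", normalize)   # first two rows match
```

Even with normalization, rows such as "Int. Bus. Machines" are still missed, and nothing in the data says whether "I.B. Manufacturing" is the same company at all. Answering the question reliably takes agreed definitions and stewardship, not just string handling.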



Typical Data Problems - 2

Is this the same customer?

Name             Street         Zip Code  City
CAFÉ SPORTSCLUB  15 3rd Street  10001     New York
CAFÉ SPORT KLUB  15 Third St.   .         NYC

Are these the same products?

Description, System 1             Description, System 2
1/1L Mathilde Cafe Ice Cappucino  1 L Cappucino - Mathilde Cafe
45+ FETA M/OLI+HVIDL 60G, 45+     FETA W/OLIVES & GARLIC 60G, 45+
YOGHURT PÆRE/BANAN, 1000ML        1000 ML YOG. PEACH/BANANA

Typical Problems - 3

A common problem is overloading of fields, which is the misuse
of a field compared to its intended use. This often happens because
the field the user wanted wasn’t available in the application

Sometimes a field might even have been used for different
purposes by different parts of the organization

Customer No  Name  Email          Fax
1234         John  john@mail.com  Vip Customer
3368         Pete  pete@mail.com  Tel: 11223344
2345         Bob   bob@mail.com
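
Overloaded fields like the Fax column above can often be found with a simple pattern check. The Python sketch below flags non-empty Fax values that do not look like a phone number. The pattern is an assumed convention (optional "+", then 7 to 15 digits with optional spaces or hyphens); a real data standard would define its own.

```python
import re

# Rows from the table; the empty string stands for a blank Fax field.
CUSTOMERS = [
    {"customer_no": "1234", "name": "John", "fax": "Vip Customer"},
    {"customer_no": "3368", "name": "Pete", "fax": "Tel: 11223344"},
    {"customer_no": "2345", "name": "Bob", "fax": ""},
]

# Assumed format rule: optional '+', then 7 to 15 digits,
# each pair optionally separated by one space or hyphen.
FAX_PATTERN = re.compile(r"^\+?\d(?:[ -]?\d){6,14}$")


def overloaded_fax(rows):
    """Return customer numbers whose Fax field is filled but is not a number."""
    return [
        row["customer_no"]
        for row in rows
        if row["fax"] and not FAX_PATTERN.fullmatch(row["fax"])
    ]


suspects = overloaded_fax(CUSTOMERS)
```

Both filled values are flagged: one holds a customer classification, the other a telephone number with a prefix. Blank values are left to a separate completeness check.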


Where Does the Bad Data Come
From?


State is a required field regardless of country


Top 5 Sources of Bad Data

1. Lack of ownership and clearly defined responsibility

2. Lack of common definitions for data

3. Lack of control of field usage

4. Lack of process control

5. Lack of synchronization between systems

What is Good Data Quality?


Larry English:

Quality exists solely in the eye of a customer of a product or service based on the value they
perceive

Information quality is consistently meeting ‘end customers’ expectations through information
and information services, enabling them to perform their jobs effectively

To define information quality, one must identify the "customer" of the data - the knowledge
worker who requires data to perform his or her job

Platon definition:

Data Quality is the degree to which data meets the defined standards

“Information producers will create information only to
the quality level for which they are trained, measured and held
accountable.”

Larry English

“The Law of Information Creation”

Data Standards & Data Quality

It’s all about the Meta Data…

Good Meta Data is a prerequisite for achieving great data quality
(inferred from the “trained” part of the “Law of Information Creation”)

You can only achieve high quality data if you have
standards to measure against!

Defining Good Data standards

For every entity define:

Definition and keys

Life cycle

Classification(s)

Hierarchies

Business Owner(s)

For every field define:

Business description

Data entry format and conventions

Definition owner

Stakeholders

Consider what a user needs to know to produce high quality data

Data Standards - An Example

Challenges

Relating the data definitions to the process documentation

Keeping the definitions up to date

The same piece of information may be entered in multiple different systems

Defining Good Data standards


There are two basic approaches to defining your data standards

1. Define a system independent Enterprise Information Model and then
map attributes to system fields, or

2. Define data definitions for a system (screen/table) specific view of
data

If you have one primary system where a data entity is used, option
2 is preferable

If you have many different systems where the same data entity is
used, option 1 is preferable

Generating Garbage


Garbage In = Garbage Out

Quality (Standard 1) In + Quality (Standard 2) In = Garbage Out


Data Quality Monitoring


Like most other things, data quality can only be managed properly
if it is measured and monitored


A data quality monitoring concept is necessary to ensure that you
identify:

Trends in data quality

Data quality issues before they impact critical business processes

Areas where process improvements are needed

Data Quality Monitoring


For this to work, clearly defined standards, targets for data quality
and follow-up mechanisms are required


There is little point in monitoring the quality of your data if no one in
the business feels responsible and if clear business rules for data
have not yet been defined


Thus a data quality monitoring concept should go hand in hand
with a data governance model

The Dimensions of Data Quality

Accuracy: Does data reflect the real world objects or a trusted source?

Integrity: Are business rules on field and table relationships met?

Consistency: Are shared data elements synchronized correctly across
the system landscape?

Completeness: Do we have all required data?

Validity: Are all data values within the valid domain for the field?

Timeliness: Are data available at the time needed?

KPI Examples in the different dimensions

Dimension     KPI Example

Completeness  Pct. of active customer records with an email address

Validity      Pct. of active US customers with a phone number of 10 digits

Accuracy      Pct. of active customers with a mailing address that is
              verified as correct against Dun & Bradstreet

Consistency   Pct. of customer records shared between our CRM system
              and our ERP system that have identical values for name,
              address and telephone number

Integrity     Pct. of active product records with [type] = “Service” where
              [weight] = 0, or Pct. of open sales orders that refer to an
              active customer

Timeliness    Pct. of supplier records where the time from request of a
              new record to completion and release of the record is less
              than 24 hours
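
As a rough illustration, the Completeness and Validity examples can be computed with a few lines of Python. The sample records and field names (`active`, `country`, `email`, `phone`) are made up for this sketch, and "10 digits" is interpreted as ten digits after stripping all other characters.

```python
import re

# Made-up sample records; the field names are assumptions for illustration.
CUSTOMERS = [
    {"active": True, "country": "US", "email": "a@example.com", "phone": "212-555-0100"},
    {"active": True, "country": "US", "email": "", "phone": "555-0100"},
    {"active": True, "country": "DK", "email": "b@example.com", "phone": "+45 12 34 56 78"},
    {"active": False, "country": "US", "email": "", "phone": ""},
]


def pct(records, predicate):
    """Percentage of records that satisfy predicate (None if there are no records)."""
    records = list(records)
    if not records:
        return None
    return 100.0 * sum(1 for r in records if predicate(r)) / len(records)


active = [r for r in CUSTOMERS if r["active"]]

# Completeness: pct. of active customer records with an email address.
completeness_email = pct(active, lambda r: bool(r["email"].strip()))

# Validity: pct. of active US customers with a phone number of 10 digits.
active_us = [r for r in active if r["country"] == "US"]
validity_us_phone = pct(active_us, lambda r: len(re.sub(r"\D", "", r["phone"])) == 10)
```

Both KPIs share the same `pct` helper, so every measure ends up on the same 0 to 100 scale where higher is better.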


The Dimensions of Data Quality

[Figure: the six dimensions (Completeness, Validity, Integrity, Timeliness, Consistency, Accuracy) plotted by Business Impact against Difficulty of Measurement]

The steps in building a monitoring concept

Building a data quality monitoring concept involves the
following five basic steps:

1. Identify stakeholders

2. Conduct interviews with stakeholders and selected business users

3. Identify data quality candidate KPIs

4. Select KPIs for data quality monitoring

5. For each KPI, define details

Finding Good Data Quality KPIs

Perform a thorough data assessment (profiling) exercise searching for
common data quality problems and look for abnormalities

Collect business input:

Business process requirements

Data quality pain points

Business Intelligence

Business KPIs

[Figure: the data assessment and the business input produce KPI Candidates, which lead to the DEFINED KPIs (columns: KPI, Frq, Target, UoM)]

To find good data quality KPIs collect business input through
interviews with stakeholders (use Interviewing Technique) and a
data assessment. The technique Data Profiling contains more
details on how to analyze data

Tying Data Quality KPIs to
Business Processes


It is essential that KPIs are not just made up, so your organization
has something to measure


Don’t measure data quality because it’s great to have high quality
data. Measure it because your business processes depend on it


Derive data quality KPIs from business process requirements


Start with a high level business process like procurement (also
known as a macro process) and then break it down.









Tying Data Quality KPIs to Business Processes

Macro process: Procurement

Process: Spend analysis, Vendor Selection

Data Entity Scope: Vendor Master Data, Material Master Data
Define the data entities used within the process

Data quality requirements:

No duplicate vendors

Correct industry code for vendors

Correct placement in hierarchy (parent vendor)

Correct email address for vendors

Is the required data quality aspect meaningful to monitor?
It may be better to improve data validation or perhaps
problems are not experienced

Business Meta Data
Business Meta Data is required to define the actual KPIs.
Ex: A vendor record is uniquely defined as an address of a
vendor where we place orders, receive shipments from or…..

DEFINED KPIs (columns: KPI, Frq, Target, UoM; rows: A, B, C)

Tying Data Quality KPIs to
Business Processes


Using a simple model like the one illustrated on the previous slide
allows you to tie data quality KPIs to business processes and to
business stakeholders


This relationship is critical for the success of the data quality
monitoring initiative. Clearly illustrating how poor data quality
impacts specific business processes is instrumental in getting the
executive support and the business buy in


When conducting data quality KPI interviews you may encounter
KPI suggestions like “measure if there is a valid relationship
between gross weight and product type”. Ask why this is important
and which process this is important for


A particular data quality KPI may be important for multiple different
processes. Document the relationship to all relevant processes








Defining Data Quality KPI’s


Data quality KPIs should express the important characteristics of quality
of a particular data element


Typically units of measures are percentages, ratios, or number of
occurrences


For consistency reasons, try to harmonize the measures. If for instance one
measure is “number of customers without a postal code” while another is
“percentage of customers with a valid VAT-no”, a list of measures will look
strange, since one measure should be as high as possible, and the other as
low as possible


A good simple approach is to define all data quality KPIs as percentages,
with 100% meaning all records meet the criteria behind the KPI


Be careful not to define too many measures, as this will just make the
organizational implementation more difficult


Pay attention to controlling fields (like material type) that may determine
rules like whether a specific attribute is required
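
The harmonization advice above can be sketched as a small Python helper that turns a "number of bad records" count into a "pct. of records meeting the rule" figure. The function name and the sample figures (2,000 customers, 150 without a postal code) are hypothetical.

```python
def as_pct_meeting(total_records: int, failing_records: int) -> float:
    """Convert a count of failing records into the pct. of records that pass."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    if not 0 <= failing_records <= total_records:
        raise ValueError("failing_records must be between 0 and total_records")
    return 100.0 * (total_records - failing_records) / total_records


# Hypothetical figures: 2,000 customers, 150 of them without a postal code.
pct_with_postal_code = as_pct_meeting(2000, 150)
```

Expressed this way, "customers without a postal code" becomes "pct. of customers with a postal code" and can sit next to "pct. of customers with a valid VAT-no" in the same list, with 100% as the best value for both.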


Defining Hierarchies


Use hierarchical measures where possible, so that measures can be
rolled up in regions and countries for instance


In the below example a KPI related to customer data is broken down in
individual countries to allow detailed follow up


A concern here is that fields may be used differently in different
countries. Given the below data insight, it might make sense to define a
separate KPI for CA and perhaps ignore MX and US


KPI: Customer Fax number correctly formatted
Avg. Value: 25%

Segment       Value  Recs
US Customers  5%     85,000
CA Customers  43%    38,000
MX Customers  77%    19,000

Data Insight

Fax numbers are not required for US customers since all communication is
done via email.

Fax is the primary communication channel with Canadian customers.

Only some customers in Mexico have a fax machine.
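
A short Python sketch of the roll-up: it assumes the average is weighted by the number of records per country, and that values and record counts pair with the countries in the order the slide lists them (US, CA, MX). Under those assumptions the result is close to the 25% shown.

```python
# (segment, KPI value in pct., number of records), in the slide's order.
SEGMENTS = [
    ("US Customers", 5.0, 85_000),
    ("CA Customers", 43.0, 38_000),
    ("MX Customers", 77.0, 19_000),
]


def rolled_up(segments):
    """Record-weighted average of the segment values."""
    total_records = sum(recs for _, _, recs in segments)
    return sum(value * recs for _, value, recs in segments) / total_records


def simple_mean(segments):
    """Unweighted mean, shown only for comparison."""
    return sum(value for _, value, _ in segments) / len(segments)


overall = rolled_up(SEGMENTS)        # weighted by record count
unweighted = simple_mean(SEGMENTS)   # treats every country equally
```

The weighted figure is dominated by the large US segment, where the field is not required at all. That is the reason to look at the breakdown rather than the rolled-up value alone.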



Defining KPI Thresholds


Along with each KPI two thresholds should be defined:


Lowest acceptable value


Without specifying the lowest acceptable value (or worst value), it’s difficult
to know when to react


If the measure falls below this threshold action is required


Target value


Without target values, you don’t know when the quality is ok. Remember
fit-for-purpose


Specifying a low and a target threshold allows for traffic light
reporting that provides an easy overview


Defining appropriate thresholds can be difficult as even a single
product record with wrong dimensions may cause serious process
impact. But without any indication of when to be alerted any form
of automated monitoring is difficult




Target Value: 95 %

Lowest acceptable value: 80 %
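
Traffic light reporting with the two thresholds can be sketched in Python as follows. The colour names and the handling of boundary values (a value exactly at a threshold counts as meeting it) are assumptions; the defaults use the 80% and 95% example values, and the KPI is assumed to be a percentage where higher is better.

```python
def traffic_light(value: float, lowest_acceptable: float = 80.0, target: float = 95.0) -> str:
    """Map a KPI value to red / yellow / green using the two thresholds."""
    if lowest_acceptable > target:
        raise ValueError("lowest_acceptable must not exceed target")
    if value < lowest_acceptable:
        return "red"      # below the lowest acceptable value: action is required
    if value < target:
        return "yellow"   # acceptable, but the target has not been reached
    return "green"        # target reached


statuses = [traffic_light(v) for v in (72.0, 80.0, 94.9, 95.0)]
```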




Indirect Measures


Consider critical fields (e.g. weight of a product or customer type)
where the correct value is of utmost importance, but it’s close to
impossible to define the rules to check if a new value entered is
correct….



One approach is to measure indirectly, for instance by reporting
which users have changed these values for which products over the
last 24 hours, week or whatever period is appropriate in your organization
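
A minimal Python sketch of such an indirect measure, assuming the system keeps a change log with a timestamp, user, product, field, old value and new value per change. The log layout, the sample entries and the set of critical fields are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical change-log entries: (timestamp, user, product, field, old, new).
CHANGE_LOG = [
    (datetime(2010, 3, 15, 9, 30), "anna", "P-100", "weight", "1.0", "10.0"),
    (datetime(2010, 3, 15, 16, 5), "ben", "P-200", "description", "Bolt", "Bolt M8"),
    (datetime(2010, 3, 13, 11, 0), "anna", "P-300", "weight", "2.5", "2.4"),
]

# Fields where a wrong value is costly but hard to validate by rule (assumed list).
CRITICAL_FIELDS = {"weight", "customer_type"}


def recent_critical_changes(log, now, window=timedelta(hours=24)):
    """List changes to critical fields made within the window ending at now."""
    return [
        (user, product, field, old, new)
        for ts, user, product, field, old, new in log
        if field in CRITICAL_FIELDS and now - window <= ts <= now
    ]


report = recent_critical_changes(CHANGE_LOG, now=datetime(2010, 3, 16, 8, 0))
```

The report does not say whether a new value is correct. It gives a data steward a short list of changes to review, such as a product weight that went from 1.0 to 10.0.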




Cross field KPIs and Process
KPIs


Common KPIs that are not related to a single field


Number of new customer records created this week


Average time from request to completion of a new material record


Number of materials with a non-unique description (or pct. of materials
with a unique description)


Number of vendors, where a different payment is defined in different
purchasing organizations


Number of open sales orders referring to an inactive customer
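
Two of these KPIs can be sketched in Python. The record layouts and sample data are made up; the sketch also assumes that an order pointing to an unknown customer counts as a problem, and that descriptions are compared ignoring case and surrounding whitespace.

```python
from collections import Counter

# Made-up sample data; keys and field names are assumptions for illustration.
CUSTOMERS = {"C1": {"active": True}, "C2": {"active": False}}
SALES_ORDERS = [
    {"order_no": "SO-1", "customer": "C1", "status": "open"},
    {"order_no": "SO-2", "customer": "C2", "status": "open"},
    {"order_no": "SO-3", "customer": "C2", "status": "closed"},
    {"order_no": "SO-4", "customer": "C9", "status": "open"},
]
MATERIALS = [("M1", "Bolt M8"), ("M2", "bolt m8"), ("M3", "Nut M8")]


def open_orders_on_inactive_customers(orders, customers):
    """Open orders whose customer is inactive or missing from the customer master."""
    return [
        o["order_no"]
        for o in orders
        if o["status"] == "open"
        and not customers.get(o["customer"], {}).get("active", False)
    ]


def materials_with_non_unique_description(materials):
    """Materials whose description (case-insensitive) is used more than once."""
    counts = Counter(desc.strip().lower() for _, desc in materials)
    return [mat for mat, desc in materials if counts[desc.strip().lower()] > 1]


bad_orders = open_orders_on_inactive_customers(SALES_ORDERS, CUSTOMERS)
duplicate_descriptions = materials_with_non_unique_description(MATERIALS)
```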





Think Prevention!


Every possible business rule related to completeness, integrity,
consistency and validity should be enforced by the system at the
time of data entry.


If it isn’t, consider implementing a data input validation rule rather
than allowing bad data to be entered and then measuring it!


However, there are cases, where the business logic of a field is too
ambiguous to be enforced by a simple input validation rule.


Process (workflow) adjustments may also be the answer.
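
A minimal Python sketch of an input validation rule applied at data entry. It picks up the earlier example of State being required regardless of country and makes the rule depend on the country instead. The set of countries that require a state and the field names are assumptions for illustration.

```python
# Assumption for this sketch: only these countries require a state/province.
COUNTRIES_REQUIRING_STATE = {"US", "CA", "MX"}


def validate_address(record: dict) -> list:
    """Return a list of validation errors; an empty list means the record may be saved."""
    errors = []
    country = record.get("country", "").strip().upper()
    if not country:
        errors.append("country is required")
    if country in COUNTRIES_REQUIRING_STATE and not record.get("state", "").strip():
        errors.append("state is required for country " + country)
    return errors


ok = validate_address({"country": "DK", "city": "Copenhagen"})
rejected = validate_address({"country": "us", "city": "New York"})
```

Rejecting the record at entry keeps the bad value out of the system. A rule that forces a state for every country instead pushes users to enter dummy values that later have to be measured and cleaned up.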





Documentation of KPIs

KPI Name: A meaningful name of the KPI that expresses what is being measured

Objective: Why do you measure this? What business processes are impacted if the data is not ok?

Dimensions: What data quality dimensions (integrity, validity, etc.) is this KPI related to?

Frequency of measure: How often do you wish to report on this KPI? Daily, weekly or monthly?

Unit of measure: What is the unit of the KPI? Number of records, pct. of records, number of bad values, etc.?

Lowest acceptable measure: Threshold that indicates if the data quality aspect the KPI represents is at a minimal
acceptable level. The value here must be in the unit of measure of the KPI.

Target value: At what value is the KPI considered to represent data quality at a high level?

Responsible: The person responsible for the particular KPI.

Formula: The tables and fields that are used to analyze and calculate the KPI. This is the functional
design formula that forms the basis for the technical implementation.

Hierarchies: When reporting on a KPI it is very useful to be able to slice and dice the measure according to
different dimensions or hierarchies. For a customer data KPI for instance, good hierarchies
would be regions, country, company code and account group.
Being able to view the KPI through a hierarchy also makes it easier to follow up with specific
groups of business users.

Notes and assumptions: If certain assumptions are made about the KPI, make sure to document them here
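
The documentation template can also be kept in a structured form so that reports and checks read from one definition. The Python sketch below mirrors the fields listed above. The class name, the sample values (including the responsible role and the formula text) and the check that the lowest acceptable value does not exceed the target are assumptions, and the check only makes sense for KPIs where higher is better.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KpiDefinition:
    name: str
    objective: str
    dimensions: List[str]
    frequency: str             # e.g. "daily", "weekly", "monthly"
    unit_of_measure: str       # e.g. "pct. of records"
    lowest_acceptable: float   # in the unit of measure of the KPI
    target: float              # in the unit of measure of the KPI
    responsible: str
    formula: str               # functional description of tables and fields used
    hierarchies: List[str] = field(default_factory=list)
    notes: str = ""

    def __post_init__(self):
        # Assumes a higher value is better, as with percentage KPIs.
        if self.lowest_acceptable > self.target:
            raise ValueError("lowest_acceptable must not exceed target")


# Hypothetical example entry.
example_kpi = KpiDefinition(
    name="Active customers with an email address",
    objective="Order confirmations are sent by email",
    dimensions=["completeness"],
    frequency="weekly",
    unit_of_measure="pct. of records",
    lowest_acceptable=80.0,
    target=95.0,
    responsible="Customer data steward",
    formula="active customer records with non-empty email / active customer records",
    hierarchies=["region", "country", "company code", "account group"],
)
```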


Remember!


Quality is in the Eye of the beholder!


Data quality is defined by our Information Customers


Data is not always clean or dirty in itself - it may depend on the
viewpoint and a defined standard


Focus on what’s important to those that use the data

Monitoring Process - A simple example

[Flowchart: Publish KPI -> Analyze KPIs -> Low value in KPI? (Y/N) -> Evaluate root cause -> Plan corrective actions -> Implement Improvements]


Monitor the Quality of your Master
Data

Thomas Ravn

Practice Director, MDM


E: tra@platon.net

M: +1 646-400-2862


PLATON US INC.

5 PENN PLAZA, 23rd Floor

NEW YORK NY 10001 www.platon.net