Download this webcast's PPT Presentation. - Neudesic

lapclassManagement

Nov 6, 2013 (3 years and 5 months ago)

50 views

SQL S
ERVER

D
ATA

Q
UALITY

S
ERVICES

Marc Jellinek

Principal Consultant


Neudesic

marc.jellinek@neudesic.com

A
BOUT

M
E

Experience


Principal Consultant
-

Neudesic


Assistant Director (SQL Team)


Application Engineering at Ernst &
Young


IT Manager at MLB Network


Sr. Technology Specialist at Microsoft

Technologies


Microsoft SQL Server 6.0, 6.5, 7.0, 2000, 2005, 2008, 2008 R2 and
2012


Relational Engine, Analysis Services, Integration Services and
Reporting Services

Marc Jellinek


marc.jellinek@neudesic.com

S
ESSION

O
BJECTIVES


Introduction to SQL Server Data Quality Services (DQS)


Understanding the problem


Demo


Where do we go from here?

S
ETTING

THE

STAGE


Building from the SQL Server Series: Master Data Services in SQL
Server 2012, presented by Patrick Gallucci


http://www.neudesic.com/media/webcasts/20120501/20120501.wmv
(start at 6:26
)


PASS MDM/DQS Virtual Chapter
http://
masterdata.sqlpass.org


Based on demos from “SQL Server 2012 Developers Update”

Data Quality

Issue

Sample Data

Problem

Standard

Are data elements consistently
defined and understood?

Gender code = M, F, U in one system
and Gender code = 0, 1, 2 in another
system

Complete

Is all necessary data present?

20% of customers’

l慳t 湡浥⁩猠 l慮kⰠ

50┠潦ozip
-
c潤敳o慲攠99999

Accurate

Does the data accurately
represent reality or a verifiable
source?

A Supplier is listed as ‘Active’ but
w敮t 潵o映扵獩湥獳⁳楸 y敡e猠慧a

Valid

Do data values fall within
acceptable ranges?

Salary

values should be between

60,000
-
120,000

Unique

Data

appears several times

Both

John Ryan and Jack Ryan appear
in the system



are they the same
person?

T
HE

D
ATA

Q
UALITY

P
ROBLEM

S
PACE

W
HAT

S

T
HE

P
ROBLEM


My name is
Marc

Jellinek


Marc

<> “Mark”, “
Marck
” or “March”


Jellinek

<> “
Jelinek
”, “
Jellineck
”, “
Jelineck
”, “
Jelliner
”, “
Jeliner
” or

Jellyneck


R

r

W
HAT

S

THE

P
ROBLEM

Jellinek

Jelinek

Jellineck

Jelineck

Jelliner

Jeliner

Jellyneck

Marc

Mark

Marck

March

T
HE

NIGHTMARE

S
CENARIO

The Customer Dimension


Jellinek
, Marc


Jellinek, Mark


Jellinek,
Marck


Jellinek, March


Jelinek
, Marc


Jelinek
, Mark


Jelinek
,
Marck


Jelinek
, March


Jellineck
, Marc


Jellineck
, Mark


Jellineck
,
Marck


Jellineck
,
March


Jelineck
, Marc


Jelineck
, Mark



Jelineck
,
Marck


Jelineck
, March


Jelliner
, Marc


Jelliner
, Mark


Jelliner
,
Marck


Jelliner
, March


Jelliner
, Marc


Jelliner
, Mark


Jelliner
,
Marck


Jelliner
, March


Jellyneck
, Marc


Jellyneck
, Mark


Jellyneck
,
Marck


Jellyneck
, March



A
NALYTIC

I
MPACT


Average Revenue per customer


Average Profit per customer


Number of customers


Customers per Geography


Customers by Income


Customers by Gender


Customers by Educational Level


Customers by Product Bought


O
BLIGATORY

T
RUISMS


The accuracy of your reporting is determined by the accuracy of your
data (Garbage In, Garbage Out)


Decisions made based on data will only be as good as the data on
which you are basing your decisions.


You can’t manage what you can’t measure. Inaccurate
measurements lead to interesting management challenges.


T
HE

D
ATA

Q
UALITY

S
OLUTION

S
PACE

Amend
, remove or enrich
data that is incorrect or
incomplete. This includes
correction,
enrichment
and standardization.

Identifying
, linking or
merging related entries
within or across sets of
data.

Cleansing

Matching

Profiling

Monitoring

Analysis
of the data
source to provide insight
into the quality of the
data and help to identify
data quality issues.

Tracking
and
monitoring

the state of Quality

activities and Quality

of
Data.

SQL S
ERVER

2012 D
ATA

Q
UALITY

S
ERVICES

High quality data is
critical to effective
business intelligence
and to business
activities

DQS
is an on
-
premise
Data Quality product in
SQL
Server 2012,
extendible
with
knowledge from
multiple parties thru
Azure
DataMarket

Richer DQ knowledge
and capabilities in the
cloud will make it
even easier to provide
high quality data

Data Quality Services (DQS) is a
Knowledge
-
Driven
data
quality
solution
enabling IT Pros and data stewards to
easily improve the quality of their
data

Included with SQL Server 2012 Enterprise and BI Editions

K
EY

D
ATA

Q
UALITY

S
ERVICES

C
ONCEPTS

Knowledge
-
Driven

Semantics


Knowledge Discovery


Based on a
Data Quality
Knowledge Base

(DQKB)

Data Domains
capture the
semantics
of your data

Acquires additional knowledge
the more you use
it

Open and Extendible

Easy to use


Add
user
-
generated
knowledge

&

3
rd

party
reference data providers

User
experience designed for
increased productivity

DQS A
RCHITECTURE

Matching

Reference
Data

DQ Clients

DQS UI

DQ Server

DQ Projects Store

Common Knowledge Store

Knowledge Base Store

DQ Engine

3
rd

Party /
Internal

MS DQ

Domains Store

Reference
Data Services

Reference
Data Sets

SSIS DQ
Component

DQ Active Projects

MS Data
Domains

Local
Data
Domains

Published
KBs

Knowledge
Discovery

Data Profiling
& Exploration

Cleansing

Knowledge
Discovery and
Management

Interactive
DQ Projects

Data
Exploration

Azure Market Place

Categorized
Reference Data

Categorized Reference
Data Services

Reference Data API

(Browse, Get, Update…)

RD Services API

(Browse, Set,
Validate…)

MDS Excel

Add in

D
ATA

Q
UALITY

S
ERVICES

P
ROCESSES

Build

Use

DQ Projects

Knowledge

Management

Integrated

Profiling

K
nowledge

Base

B
ASIC

D
EFINITIONS


Knowledge Base


Stores all the knowledge related to a specific type of data source


Container for domains


Domain


Semantic representation of a type of data in a data field or column


Trusted values, invalid values and erroneous data


Synonym associations, term relationships, validation and business rules,
matching policies


Matching Rule


Set of rules and conditions that determine a match or duplicate

D
ATA

Q
UALITY

S
ERVICES

C
OMPONENTS


Data Quality Server


Data Quality Client


DQS Cleansing Component for SQL Server Integration


Data Quality Processes in Master Data Management

D
ATA

Q
UALITY

S
ERVICES

C
OMPONENTS


Data Quality Server


SQL Server Databases


DQS_MAIN


DQS Stored Procedures, the DQS Engine and published Knowledge Bases


DQS_PROJECTS


Data required for knowledge base management and DQS project activities


DQS_STAGING_AREA


Intermediate staging area where source data is copied and processed

D
ATA

Q
UALITY

S
ERVICES

C
OMPONENTS


Data Quality Client


Standalone application


Designed for both data stewards and DQS Administrators


Perform knowledge management, data quality projects and administration in
one user interface


Allows for domain management, matching policy creation, data cleansing,
matching, profiling, monitoring and server administration.


Can be installed on a remote computer

D
ATA

Q
UALITY

S
ERVICES

C
OMPONENTS


DQS Cleansing Component in SQL Server Integration Services


Performs data cleansing as a part of an SSIS package


Alternative to running a cleansing project within the Data Quality
Services Client application


Data Quality Processes within Master Data Services


Perform de
-
duplication on source data and master data within the
Microsoft SQL Server Data Services Add
-
in for Microsoft Excel.

DEMO

R
ESOURCES


DBI207: Using Knowledge to Cleanse Data with Data Quality Services

http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/DBI207


Data Quality Services Blog

http://blogs.msdn.com/b/dqs


Books Online for SQL Server
-

Data Quality Services

http://technet.microsoft.com/en
-
us/library/ff877925(SQL.110).aspx



Install Data Quality Services

http://technet.microsoft.com/en
-
us/library/gg492277.aspx


Used Master Data Services Configuration Manager, set up IIS

http://msdn.microsoft.com/library/ee633744%28SQL.110%29.aspx


Troubleshoot Installation and Configuration Issues (Master Data Services in SQL Server 2012

http://go.microsoft.com/fwlink/?LinkId=226284



SQL Server 2012 Developer Training Kit Web Installer

http://www.microsoft.com/en
-
us/download/details.aspx?id=27721



SQL Server 2012 Update for Developers Training Workshop

http://social.technet.microsoft.com/wiki/contents/articles/6981.sql
-
server
-
2012
-
update
-
for
-
developers
-
training
-
workshop.aspx



SQL Server 2012 Update for Developers Training Kit

http://social.technet.microsoft.com/wiki/contents/articles/6982.sql
-
server
-
2012
-
developer
-
training
-
kit
-
bom
-
en
-
us.aspx



SQL Server 2012 Update for Developers Training Kit Content

http://social.technet.microsoft.com/wiki/contents/articles/6976.sql
-
server
-
2012
-
developer
-
training
-
kit
-
content
-
en
-
us.aspx



MSDN


Data Quality Services

http://msdn.microsoft.com/en
-
us/library/ff877925(v=sql.110).
aspx


MSDN Discussions


Data Quality Services

http://
social.msdn.microsoft.com/Forums/en
-
US/sqldataqualityservices/threads



Technet



Data Quality Services

http://
technet.microsoft.com/en
-
us/library/ff877925.aspx



PASS MDM/DQS Virtual Chapter

http://masterdata.sqlpass.org



Driving Innovation with SAP
On
-
Premise and Beyond

Monish
Nagisetty

Daniel
Sepp

7 June 2012

10:00
-
11:00

PDT

http://www.clicktoattend.co
m/?id=159718


Delivering a Semantic Model
for Ad
-
Hoc Reporting

Steve Muise

21 June 2012

10:00
-
11:00

PDT

http://www.clicktoattend.co
m/?id=159917

Exploring

the New
Hadoop

Implementation in Azure

Jacob Saunders

10 July 2012

10:00
-
11:00

PDT

http://www.clicktoattend.co
m/?id=160452

Amerishore



an Innovative,
Socially Conscious Approach
to Offshoring

Tracy
Derr

17 July

2012

10:00
-
11:00 PDT

http://www.clicktoattend.co
m/?id=160650

Data Integration
Improvements within

SQL
Server 2012

Kola Bolarin

24 July 2012

10:00
-
11:00 PDT

http://www.clicktoattend.co
m/?id=160322


EDI:

The Reports of My Death
Have Been Greatly
Exaggerated

Abhilash

Shanmug

9 August 2012

10:00
-
11:00 PDT

http://www.clicktoattend.co
m/?id=159720


http://www.neudesic.com/About/Events/Pages/RecentWebcasts.aspx






THANK YOU