Graph Data Analytics
www.globalids.com
Arka Mukherjee, Ph.D.
Global IDs
Arka.Mukherjee@globalids.com
Resolving Complexity at an Enterprise Scale
© 2013 Global IDs
2
Proprietary
Topics
1. The “Complex Data” Context
2. Current Challenges
3. Governance Methodology
The “Complex Data” Context
The Big Shift
The cost structure is unsustainable
The cost of managing information is going up exponentially.
The complexity growth is unmanageable
1. Complex data ecosystems
2. Highly dynamic
3. Limited traceability
4. Systemic risk: hard to measure
Financial Services Institutions
Question
How can enterprises handle the cost and complexity of managing complex data landscapes?
Global IDs Focus
To organize enterprise data landscapes
Global IDs: Product Suite
© Global IDs Inc. (2001-2013)
Global IDs Software Products
Suites: Metadata Governance Suite, Master Data Governance Suite, Enterprise Data Governance Suite, Big Data Governance Suite
Objectives: Create Transparency, Improve Quality, Accelerate Integration, Embed Analytics
Function: Deliverable
1. Discover: Comprehensive Data Asset Inventory
2. Ingest: Inventory of External Data Assets
3. Profile: Semantic Metadata Repository
4. Classify: Business Taxonomies
5. Map: Business Ontologies
6. Search: Enterprise Search
7. Model: Master Data Models
8. Monitor: Change Monitors, Impact Analysis
9. Rules: Rules Repository
10. Validation: Data Quality Metrics
11. Stewardship: RACI Matrix of Data Stewards
12. Dashboards: Master Data Governance Portals
13. Move: Data Repositories in Relational Databases or Hadoop
14. Standardize: Enriched Master Data
15. Integrate: Integrated Master Data
16. Distribute: Data Services for Master Data
17. Analyze: Reporting and Ad-Hoc Analysis
18. Measure: KPIs and Trend Metrics
19. Link: Graph Databases with Linked Data
20. Visualize: Dashboards and Infographics
Legend: Under Development Using Hadoop Stack
Challenges
The typical Financial Institution’s
# Databases: > 1,000
# Tables: > 200,000
# Columns: > 2,000,000
Question
How can we understand the relationships across 2,000,000 attributes?
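To give a rough sense of the scale behind this question, the arithmetic below counts how many attribute pairs there would be if every attribute were compared with every other one exactly once. The exhaustive pairwise comparison is an assumption made for illustration; the slides do not say that this is how relationships are found.

```python
import math

# Attribute count taken from the slide: 2,000,000 columns.
attributes = 2_000_000

# Assumption: a naive approach compares every unordered pair once.
pairs = math.comb(attributes, 2)

print(f"{pairs:,} pairwise comparisons")
```

Under that assumption the count is on the order of two trillion pairs, which suggests why candidate relationships need to be narrowed down automatically rather than reviewed by hand.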
Converging Data Variety
Data Content: Structured, Unstructured, Multi-Structured
Converging Data Ecosystems
Data Ecosystems: Social Data, Enterprise Data, Machine Data
Current Approaches do not Scale
# Databases
Small: > 1,000
Average: > 10,000
Large: > 100,000
A New Approach is Required
5 : Utilize Graph Structures for Governance
Graph Analytics: Use Cases
Key Challenges
• Vast diversity and volume of metadata and data
• Storage and indexing of metadata to facilitate search and navigation
• Understanding the connection between different pieces of metadata (Crosswalk)
Utilize Graph Structures for Storing Complex Data
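As an illustration of the idea, here is a minimal sketch of metadata held as an undirected graph, with databases, tables, and columns as nodes and a crosswalk link between two columns. The class name `MetadataGraph`, the node naming scheme, and the sample assets are all hypothetical; nothing here reflects the actual Global IDs storage model.

```python
from collections import defaultdict, deque


class MetadataGraph:
    """Hypothetical in-memory graph: nodes are metadata assets, edges are links."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_edge(self, a, b):
        # Undirected link, e.g. containment (table-column) or a crosswalk.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def reachable(self, start):
        # Breadth-first traversal: every asset connected to `start`.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in self.edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen


graph = MetadataGraph()
# Containment edges (sample names are made up for this sketch).
graph.add_edge("db:crm", "table:crm.customer")
graph.add_edge("table:crm.customer", "column:crm.customer.cust_id")
graph.add_edge("db:billing", "table:billing.account")
graph.add_edge("table:billing.account", "column:billing.account.customer_id")
# Crosswalk edge: two columns believed to hold the same identifier.
graph.add_edge("column:crm.customer.cust_id", "column:billing.account.customer_id")

print("db:billing" in graph.reachable("db:crm"))
```

The single crosswalk edge is what connects the two otherwise separate databases, which is the kind of navigation a tabular catalog makes awkward.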
Use Case 1: Enterprise Metadata Search with Hadoop
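One common way to make metadata searchable is an inverted index built in a map step and a reduce step. The sketch below runs both steps in a single Python process purely for illustration. It does not use any Hadoop API, and the function names, asset identifiers, and descriptions are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def map_phase(asset_id, description):
    # Emit one (token, asset_id) pair per distinct token.
    for token in set(tokenize(description)):
        yield token, asset_id


def reduce_phase(pairs):
    # Group asset ids by token to form the inverted index.
    index = defaultdict(set)
    for token, asset_id in pairs:
        index[token].add(asset_id)
    return index


def search(index, query):
    # Return assets whose descriptions contain every query token.
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Sample metadata, invented for this sketch.
assets = {
    "crm.customer.cust_id": "Customer identifier in CRM",
    "billing.account.customer_id": "Customer identifier used for billing",
    "billing.account.currency": "Billing currency code",
}

pairs = [p for asset_id, text in assets.items() for p in map_phase(asset_id, text)]
index = reduce_phase(pairs)

print(sorted(search(index, "customer identifier")))
```

In a distributed setting the map and reduce functions would run across many nodes, but the shape of the index would be the same.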
Use Case 2: Unstructured Data Integration
Use Case 3: Cross-Database Similarity Mapping
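The slide does not say which similarity measure is used. As one possible illustration, the sketch below scores column pairs from two databases by the Jaccard overlap of their distinct values and keeps the pairs above a threshold. The function names, the threshold of 0.5, and the sample data are assumptions.

```python
from itertools import product


def jaccard(values_a, values_b):
    # Share of distinct values common to both columns.
    a, b = set(values_a), set(values_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def map_similar_columns(db_a, db_b, threshold=0.5):
    # Compare every column in db_a with every column in db_b.
    matches = []
    for (name_a, vals_a), (name_b, vals_b) in product(db_a.items(), db_b.items()):
        score = jaccard(vals_a, vals_b)
        if score >= threshold:
            matches.append((name_a, name_b, score))
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Hypothetical column samples from two databases.
crm = {
    "customer.cust_id": [101, 102, 103, 104],
    "customer.country": ["US", "DE", "IN"],
}
billing = {
    "account.customer_id": [102, 103, 104, 105],
    "account.currency": ["USD", "EUR"],
}

matches = map_similar_columns(crm, billing)
for name_a, name_b, score in matches:
    print(f"{name_a} ~ {name_b}: {score:.2f}")
```

Value overlap is only one signal. A production mapping would likely also weigh column names, data types, and value patterns.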
Use Case 4: Graph Analytics
Demo
Methodology
What we do
1. Scan
2. Analyze
3. Map / Organize
4. Govern
Automation
1 : Scan
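As a small illustration of what a scan step might collect, the sketch below inventories the tables and columns of a database by reading its catalog. It uses an in-memory SQLite database as a stand-in; the function name `scan_sqlite` and the sample schema are hypothetical, and a real scanner would query each database platform's own catalog.

```python
import sqlite3


def scan_sqlite(conn):
    """Return {table_name: [(column_name, declared_type), ...]}."""
    inventory = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        inventory[table] = [(col[1], col[2]) for col in columns]
    return inventory


# Sample schema, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE account (account_id INTEGER, cust_id INTEGER)")

inventory = scan_sqlite(conn)
for table, columns in inventory.items():
    print(table, columns)
```

Repeating this across every database yields the raw inventory that the later analysis and mapping steps work from.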
2 : Semantic Analysis
3 : Automate Semantic Mapping
4 : Link the Data Landscape
Thank You!