NIST BIG DATA WG
Reference Architecture Subgroup
Intermediate Report
Co-chairs:
Orit Levin (Microsoft)
James Ketner (AT&T)
Don Krapohl (Augmented Intelligence)
July 24th, 2013
Reference Architecture Objectives
• Addresses a broad range of stakeholders (e.g., data owners, industries, academia, policy makers)
• Wide scope:
  • Encompasses the whole data life cycle in the ecosystem
  • Can be applied to different use cases (including various verticals)
  • Represents different system architectures (e.g., an enterprise data warehouse, a distributed cloud-based system using multiple service providers)
• Focus:
  • Potentially with an initial focus on Big Data analytics and tools
• Assists in identifying security and privacy issues
• Agnostic to any specific technologies
7/24/2013, NIST Big Data WG / Ref Arch Sub-group
RA Diagram Independent Submissions
• Different styles and perspectives, but easy to map between them
  • Data centric (Wo Chang)
  • Data Flow centric (Orit Levin, Bob Marcus)
  • Technology Layers / Stack diagram (Gary Mazzaferro)
• The vocabulary used in these submissions and on the mailing list has been compiled and submitted as M-0057
Abstract Reference Architecture
by Wo Chang / NIST
Independent RA Proposals: Big Data Sources, Usage, Transformation, and Infrastructure
Data Flow Diagram by Bob Marcus
Technology Stack / Layers Diagram by G. Mazzaferro
Data Flow Ecosystem Diagram by Orit Levin
Data Sources and Usage
Data Flow Diagram by Bob Marcus
Technology Stack / Layers Diagram by G. Mazzaferro
Data Flow Ecosystem Diagram by Orit Levin
Infrastructure: Storage, Security, and Management
Data Flow Diagram by Bob Marcus
Technology Stack / Layers Diagram by G. Mazzaferro
Data Flow Ecosystem Diagram by Orit Levin
Data Transformation: Processing, Analytics, and Visualization
Data Flow Diagram by Bob Marcus
Technology Stack / Layers Diagram by G. Mazzaferro
Data Flow Ecosystem Diagram by Orit Levin
Draft Agreement / Rough Consensus
• Transformation includes
  • Processing functions
  • Analytic functions
  • Visualization functions
• Data Infrastructure includes
  • Data stores
  • In-memory DBs
  • Analytic DBs
[Diagram blocks: Sources; Transformation; Usage; Data Infrastructure; Security; Management; Cloud Computing; Network]
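
The draft agreement can be read as a small taxonomy: a set of top-level blocks, two of which have agreed contents so far. The sketch below records that as plain data. The block and function names come from the slide; the dictionary layout and the helper names (`block_of`, `undefined_blocks`) are illustrative assumptions, not part of any subgroup submission.

```python
# Illustrative sketch only. Block names and members are copied from the
# slide; the data layout and helper functions are assumptions.

# Blocks whose contents have rough consensus.
DRAFT_RA = {
    "Transformation": [
        "Processing functions",
        "Analytic functions",
        "Visualization functions",
    ],
    "Data Infrastructure": [
        "Data stores",
        "In-memory DBs",
        "Analytic DBs",
    ],
}

# All top-level blocks that appear in the draft diagram.
DIAGRAM_BLOCKS = [
    "Sources", "Transformation", "Usage", "Data Infrastructure",
    "Security", "Management", "Cloud Computing", "Network",
]


def block_of(item):
    """Return the block the draft agreement assigns an item to, or None."""
    for block, items in DRAFT_RA.items():
        if item in items:
            return block
    return None


def undefined_blocks():
    """List diagram blocks whose contents are not yet agreed."""
    return [b for b in DIAGRAM_BLOCKS if b not in DRAFT_RA]


if __name__ == "__main__":
    print(block_of("Analytic DBs"))
    print(undefined_blocks())
```

Keeping the agreed contents separate from the full block list makes it easy to see which blocks still need definitions from the taxonomy work.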
Next Steps and AIs
• Deliverable I: Write the White Paper draft showing one or more approaches (e.g., Data Flow and Stack) using the same or similar terminology
  • AI: Chairs will start the draft of the document incorporating the submissions to the Ref Arch subgroup
  • AI: Close cooperation between the “Ref Arch” and “Def&Tax” sub-groups. Output: taxonomy for the RA diagrams with definitions for major entities/blocks; Input: M-0057.
• Deliverable II: A draft of a single RA requires more discussion and inputs based on the work of all sub-groups
  • AI: Chairs will start the draft of the document incorporating the findings of the Ref Arch subgroup
  • AI: Review the latest contributions to the Ref Arch and incorporate their findings (see email from Yuri Demchenko / University of Amsterdam)
  • AI: Close cooperation with the “Use Cases” and “Security” sub-groups to identify the areas of focus for “zooming” into their architecture
Backup Slides
Submitted RAs
Data Centric by Wo Chang / NIST
Data Flow Diagram by Bob Marcus
Data Flow Ecosystem Diagram by Orit Levin
[Diagram labels: Individual Data Transfer; Big Data Transfer; Selected Data Storage and Retrieval; Big Data Storage and Retrieval; Data Objects; Data Sources; Data Usage; Government (incl. health & financial institutions); Industries / Businesses; Network Operators / Telecom; Academia; Data Transformation; Collection; Conditioning; Aggregation; Matching; Data Mining; Data Infrastructure; Storage & Retrieval; Management; Security; Anonymized; Pseudo-anonymized; PII; Volume; Variety; Velocity]
Technology Layers / Stack diagram by Gary Mazzaferro
Mapping to Technologies and Use Cases
Prepared by the authors of the original RAs
An Example of Cloud Computing Usage in Big Data Ecosystem
[Diagram labels: Individual Data Transfer; Big Data Transfer; Selected Data Storage and Retrieval; Big Data Storage and Retrieval; Data Objects; Data Sources; Data Usage; Government (incl. health & financial institutions); Industries / Businesses; Network Operators / Telecom; Academia; Data Transformation; Collection; Aggregation; Matching; Data Mining; Data Infrastructure; Data Warehouse; Cloud Provider / Service Layer; SaaS; PaaS; IaaS; Volume; Variety; Velocity]
Use Case: Advertising
[Diagram labels: Online Data Aggregator; Offline Data Aggregator; Data Subject / Person; Online Sources; Public Records (commons, government, etc.); Offline Sources; Internal Records; Other devices (Smart Grid, surveillance, scientific, etc.); End User devices incl. OS (mobile phones, etc.); Applications (search, publishers, etc.); Appl. with customers (communications, social network, etc.); Web Browsers; Match/Bridge Service; Networks; Government, health, financial institutions, academia; Industries / Businesses; Network Operators; Collection; Data Mining; Person Attribution; Data Management Platforms (DMPs); UI: Do Not Track (DNT); HTTP: DNT; Analytic Cookie; DMP Cookie; Match Cookie; DPI; Match Container Tag or Pixel request; DMP Container Tag or Pixel request; Users; Advertising Industry Ecosystem; SSP; DSP; AdNet; AdX; Agency; Publisher; Advertiser; Control; Aggregated; De-identified; PII; 1st Party; 2nd Party; 3rd Party; Contextual; Behavioral; Data Collection; Data Creation; Big Data Transfer; Individual Data Transfer]
Use Case: Enterprise Data Warehouse
[Diagram labels: Individual Data Transfer; Big Data Transfer; Selected Data Storage and Retrieval; Big Data Storage and Retrieval; Data Objects; Data Sources; Archives; Files; Online Transaction Processing (OLTP) Systems; MS Office Documents; Manual; Data Transformation; Extraction, Transformation, and Loading (ETL); Data Mining / Knowledge Discovery in Databases (KDD); Online Analytical Processing (OLAP); Managed Report Environment (MRE); Data Infrastructure; Staging Area; Operational Data Store; Central Data Warehouse; Department Data Mart; Regional Data Mart; Subject Data Mart; Application Data Mart; Functional Data Mart; Management; Security; Data Usage]