Slides - MSDN

whirrtarragon, Electronics - Devices

21 Nov 2013 (3 years and 8 months ago)

70 views

SELECT COUNT(*) FROM ParkingLot
WHERE type = 'AUTO'
AND color = 'RED'

red cars
last hour
Doesn’t seem like a great solution…

This is the streaming data paradigm in a nutshell



Ask questions about data in flight.
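One way to picture a standing query over data in flight is a counter that is updated as each car event arrives, instead of re-running COUNT(*) over a stored table. The sketch below is a minimal Python illustration, not StreamInsight code; the class name RedCarCounter, the on_event method, and the shape of an event are assumptions made for the example.

```python
from collections import deque


class RedCarCounter:
    """Standing query: number of red cars seen in the trailing time window."""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self._timestamps = deque()  # arrival times of red cars, oldest first

    def on_event(self, timestamp, vehicle_type, color):
        """Feed one event (timestamps must be non-decreasing); return the current count."""
        if vehicle_type == "AUTO" and color == "RED":
            self._timestamps.append(timestamp)
        # Expire red cars that fell out of the window (timestamp - window, timestamp].
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)
```

Each call does a small amount of work per event, so the answer to "red cars in the last hour" is always current without scanning stored rows.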

[Figure: $ value of analytics versus time of interest (years, months, days, hrs, min, sec, up to the present). Labels: Forecasting in Enterprises; Historical Trend Analysis; Traditional DW Analytics; Active DW analytics; Web Analytics, Ad placement, Financial Services, Smart Grids, Monitoring, Systems mgmt, Health Care, Manufacturing, etc.]

[Figure: Facts/sec. (100 to 100,000) versus time of interest (years, months, days, hrs, min, sec, up to the present). Labels: ET time in ETL; Load time in ETL; custom-built solutions that carry huge development and customization costs.]

The load barrier is dictated by the current design choices of the solution, e.g., loading into databases or persisting into files. It is intrinsic because, in current approaches, no processing can be done until the data is loaded.

Microsoft StreamInsight

[Diagram. Sources: devices, sensors; web servers; stock tickers & news feeds. Layers: Data Bus, Caching, Processing, Visualization, Distribution. Components: Message Bus; Service Broker; ETL; Reference Data; Cache; In-memory Database; Intra-Day Cubes; Historic Cubes; Operational Analytics; Automated Decisions. Outputs: Operational Dashboard (Ticking-Snapshot); Reporting Dashboard (Refreshed); Static Reports; Mining, Validation, "What-If" Scenarios. Flows: Refresh (Push); Re-compute (Pull).]

Analytical results need to reflect important changes in business reality
immediately and enable responses to them with minimal latency

                   Database Applications               Event-driven Applications
Query Paradigm     Ad-hoc queries or requests          Continuous standing queries
Latency            Seconds, hours, days                Milliseconds or less
Data Rate          Hundreds of events/sec              Tens of thousands of events/sec or more
Query Semantics    Declarative relational analytics    Declarative relational and temporal analytics

[Diagram: request/response for database applications; input stream, event, output stream for event-driven applications.]

[Figure: CEP Target Scenarios. Axes: Aggregate Data Rate (Events/sec.), from 0 to ~1 million; Latency, from months down to < 1 ms. Application classes shown: Relational Database Applications; Data Warehousing Applications; Web Analytics Applications; Operational Analytics Applications, e.g., Logistics, etc.; Manufacturing Applications; Monitoring Applications; Financial trading Applications.]


[Diagram: StreamInsight Application Development and StreamInsight Application at Runtime]

Event sources: devices, sensors; web servers; event stores & databases; stock ticker, news feeds
Input Adapters
StreamInsight Engine: standing queries (query logic)
Output Adapters
Event targets: pagers & monitoring devices; KPI dashboards, SharePoint UI; trading stations; event stores & databases
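The flow in this diagram (event sources, input adapters, standing query logic, output adapters, event targets) can be mimicked with chained generators. This is a rough Python sketch under stated assumptions, not the StreamInsight adapter SDK: the function names, the "sensor,value" line format, and the list used as a sink are all hypothetical.

```python
def input_adapter(raw_lines):
    """Hypothetical input adapter: turn raw 'sensor,value' lines into event dicts."""
    for line in raw_lines:
        sensor, value = line.split(",")
        yield {"sensor": sensor, "value": float(value)}


def query_logic(events, threshold):
    """Standing query: pass through only readings above the threshold."""
    for event in events:
        if event["value"] > threshold:
            yield event


def output_adapter(events, sink):
    """Hypothetical output adapter: push each result into a sink (here, a list)."""
    for event in events:
        sink.append(f"ALERT {event['sensor']}: {event['value']}")


def run(raw_lines, threshold, sink):
    """Wire source -> input adapter -> query -> output adapter -> target."""
    output_adapter(query_logic(input_adapter(raw_lines), threshold), sink)
```

Because every stage is a generator, each event flows through the whole chain as soon as it is read; nothing is loaded into a store first.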

Industry trends
  Data acquisition costs are negligible
  Raw storage costs are small and continue to decrease
  Processing costs are non-negligible
  Data loading costs continue to be significant

[Cycle diagram: Monitor KPIs; Manage business via KPI-triggered actions; Record raw data (history); Mine historical data; Devise new KPIs]

CEP advantage
  Process data incrementally, i.e., while it is in flight
  Avoid loading while still doing the processing you want
  Seamless querying for monitoring, managing and mining
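Incremental processing means a result is updated from each new event rather than recomputed from stored history. A running mean is a small illustration of the idea; the Python sketch below is an assumption-laden example (the class name RunningMean is made up here) and does not show how StreamInsight computes aggregates internally.

```python
class RunningMean:
    """Keep an average up to date without storing the events themselves."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        """Fold one new value into the mean and return the updated mean."""
        self.count += 1
        # Move the mean toward the new value by 1/count of the difference.
        self.mean += (value - self.mean) / self.count
        return self.mean
```

Only two numbers are kept as state, which is why no load step is needed before the KPI can be monitored.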


[Diagram: Data Stream in; Event Processing Engine; Stream Data Store & Archive; Asset Specs & Parameters (Lookup); Data Stream out. Caption: Asset Instrumentation for Data Acquisition, Subscriptions to Data Feeds]

Manufacturing:
  Sensor on plant floor
  React through device controllers
  Aggregated data
  10,000 events/sec

Power, Utilities:
  Energy consumption
  Outages
  Smart grids
  100,000 events/sec

Financial Services:
  Stock & news feeds
  Algorithmic trading
  Patterns over time
  Super-low latency
  100,000 events/sec

Web Analytics:
  Click-stream data
  Online customer behavior
  Page layout
  100,000 events/sec

Visual trend-line and KPI monitoring; Batch & product management; Automated anomaly detection; Real-time customer segmentation; Algorithmic trading; Proactive condition-based maintenance

Threshold queries; Event correlation from multiple sources; Pattern queries
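The threshold and pattern queries listed in this scenario overview can be combined in one small example: raise a signal when a reading stays above a threshold for several consecutive events. The Python sketch below is illustrative only; the function name sustained_threshold and its parameters are assumptions, not part of any StreamInsight API.

```python
def sustained_threshold(readings, threshold, min_run):
    """Yield the index at which `min_run` consecutive readings have exceeded `threshold`."""
    run = 0
    for index, value in enumerate(readings):
        # Extend the current run on a high reading, reset it otherwise.
        run = run + 1 if value > threshold else 0
        if run == min_run:
            yield index
```

For the readings [1, 5, 6, 2, 7, 8, 9] with threshold 4 and min_run 2, the signal fires at indexes 2 and 5, once per sustained run.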

Demo Scenario: Market Monitor

[Diagram: Input Adapters (Push, Push, Push, Pull), StreamInsight, Output Adapters]

Asset Class   Ticker   Exchange   SUM Volume   SUM Bid   SUM Ask
Stock         MSFT     NASDAQ     100          100       100
Stock         IBM      NASDAQ     200          200       200
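The Market Monitor output is a grouped aggregate: sums of volume, bid and ask per asset class, ticker and exchange. A minimal Python sketch of that grouping follows; the dictionary field names and the function aggregate_quotes are assumptions for illustration, and the demo's actual query is not shown in the slides.

```python
from collections import defaultdict


def aggregate_quotes(quotes):
    """Sum volume, bid and ask per (asset_class, ticker, exchange) key."""
    totals = defaultdict(lambda: {"volume": 0, "bid": 0, "ask": 0})
    for quote in quotes:
        key = (quote["asset_class"], quote["ticker"], quote["exchange"])
        for field in ("volume", "bid", "ask"):
            totals[key][field] += quote[field]
    return dict(totals)
```

In a streaming setting the same totals would be maintained per window and pushed to the output adapter as they change, rather than returned once at the end.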

Timestamps/Metadata
Long    pumpID
String  Type
String  Location
Double  flow
Double  pressure
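This event layout pairs timestamps/metadata with a typed payload. As a rough Python analogue, the payload and its envelope could be modelled as two small immutable records. The class names, the snake_case field names, and the single timestamp field are assumptions for illustration; the slide only lists the payload fields and their types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PumpReading:
    """Payload fields as listed on the slide."""
    pump_id: int      # Long pumpID
    pump_type: str    # String Type
    location: str     # String Location
    flow: float       # Double flow
    pressure: float   # Double pressure


@dataclass(frozen=True)
class PointEvent:
    """Hypothetical envelope: a timestamp plus the payload."""
    timestamp: float  # assumed: application time in seconds
    payload: PumpReading
```

Keeping the timestamp outside the payload mirrors the slide's split between metadata and data, so queries can reason about time without touching the payload fields.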














LINQ Example

GROUP&APPLY, WINDOW:

from e3 in MyStream3
group e3 by e3.i into SubStream
from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)
select new { i = SubStream.Key, a = win.Avg(e => e.f) };
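To make the semantics of this query concrete: events are grouped by a key, and for each group an average is computed over overlapping windows of a fixed size that start every hop interval. The Python sketch below imitates that over a finite list of events. It is not StreamInsight code; the tuple layout, the function name, and the choice to align window starts to multiples of the hop size from time 0 are assumptions.

```python
from collections import defaultdict


def hopping_window_avg(events, window_size, hop_size):
    """Average value per (window_start, key).

    events: iterable of (timestamp, key, value) with timestamp >= 0.
    Each window covers [window_start, window_start + window_size) and
    window starts are multiples of hop_size.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, key) -> [total, count]
    for timestamp, key, value in events:
        # Earliest window start (a multiple of hop_size) whose window still contains the event.
        first = (int((timestamp - window_size) // hop_size) + 1) * hop_size
        start = max(0, first)
        while start <= timestamp:
            acc = sums[(start, key)]
            acc[0] += value
            acc[1] += 1
            start += hop_size
    return {k: total / count for k, (total, count) in sums.items()}
```

With window_size=300 and hop_size=3 this corresponds to the FiveMinutes and ThreeSeconds arguments in the LINQ query, expressed in seconds; each event then contributes to up to 100 overlapping windows.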

LINQ Example

JOIN, PROJECT, FILTER:

from e1 in MyStream1
join e2 in MyStream2
on e1.ID equals e2.ID
where e1.f2 == "foo"
select new { e1.f1, e2.f4 };
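The three operators in this query can be read as: match events from two streams on ID, keep only matches where f2 equals "foo", and output just f1 and f4. The Python sketch below shows that logic as a hash join over two finite lists. It deliberately ignores event lifetimes, which a temporal join in a CEP engine would also take into account, and the dictionary-based event shape and function name are assumptions.

```python
from collections import defaultdict


def join_filter_project(stream1, stream2):
    """Equi-join on ID, filter on f2 == 'foo', project to f1 and f4."""
    # Index the second stream by ID so each lookup is constant time.
    by_id = defaultdict(list)
    for e2 in stream2:
        by_id[e2["ID"]].append(e2)
    for e1 in stream1:
        if e1["f2"] != "foo":  # filter
            continue
        for e2 in by_id.get(e1["ID"], []):  # join
            yield {"f1": e1["f1"], "f4": e2["f4"]}  # project
```

Applying the filter before probing the index avoids building pairs that would be discarded anyway; the result is the same as filtering after the join.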

[Diagram: operator pipelines. Join, Filter, Project; Grouping, Window, Project & Aggregate]

Event processing engines are deployed at multiple places on different scales:
  At the edge: close to the data source. CEP for lightweight processing and filtering.
  In the mid-tier: consolidate related data sources. CEP for aggregation and correlation of in-flight events.
  In the data center: historical archive, mining, large scale correlation. CEP for complex analytics including historical data.

[Diagram: StreamInsight instances at each tier, from Data Sources (Devices, Sensors, Web servers, Feeds) through Aggregation & Correlation to Complex Analytics & Mining]

Workload, by edition (Standard; Enterprise; Datacenter; Parallel Data Warehouse)

Custom/Packaged OLTP Apps
  Standard: 4 procs, 64GB RAM, Backup Compression
  Enterprise: 8 procs, 2TB RAM, Adv. Security, Backup Compression
  Datacenter: >8 procs, OS Max, Adv. Security, Backup Compression
  Parallel Data Warehouse: N/A

Server Consolidation
  Standard: 1 VM/license
  Enterprise: 4 VMs/license, Resource Governor, App & Multi-Server Mgmt (up to 25 instances)
  Datacenter: Unlimited Virtualization, Resource Governor, App & Multi-Server Mgmt (> 25 instances)
  Parallel Data Warehouse: N/A

Data Warehousing
  Scale-Up DW, Data Compression, 10s of TBs, up to 30 TB with FastTrack
  Scale-Up DW, Data Compression, 10s of TBs
  Scale-Out DW, 10s - 100s of TBs

Business Intelligence
  Standard: Dept/Team BI
  Enterprise: Enterprise-Scale BI, Master Data Services, PowerPivot Mgmt
  Datacenter: Enterprise-Scale BI, Master Data Services, PowerPivot Mgmt
  Parallel Data Warehouse: Integrated with SSIS, SSAS and SSRS

Complex Event Processing (StreamInsight)
  Standard: <5000 events/sec & > 5 s latency
  Enterprise: <5000 events/sec & > 5 s latency
  Datacenter: >5000 events/sec & < 5 s latency
  Parallel Data Warehouse: Future coverage

Industries: Manufacturing; Utilities; Oil & Gas; Financial Services; Web Analytics; Telco

Scenarios: Alarming, Notifications, Real-Time Analysis; AMI/SmartGrid, Outage Management; Well Monitoring, Operational Intelligence; Risk Management, Market Monitoring; Behavioral Targeting, Load Monitoring; CDR Aggregation

ISV: OSIsoft, Matrikon, ICONICS; OSIsoft, Matrikon, Telvent, ICONICS; OSIsoft, Matrikon; Lab49, IMGroup; MSFT AdCenter, XBox, DPE

SI: Logica; Logica; Logica; Hitachi Consulting; Lab49, IMGroup; MSFT AdCenter, XBox, DPE


CEP platform from Microsoft to build event-driven applications

Event-driven applications are fundamentally different from traditional database applications: queries are continuous, consume and produce streams, and compute results incrementally

Flexible adapter SDK with high performance to connect to different event sources and sinks

The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data

Development experience with .NET, C#, LINQ and Visual Studio 2008 and 2010

http://www.microsoft.com/sqlserver/2008/en/us/R2-complex-event.aspx
http://blogs.msdn.com/streaminsight/
http://blogs.msdn.com/b/streaminsight/archive/2010/10/25/releasing-streaminsight-v1-1.aspx
http://msdn.microsoft.com/en-us/library/ee362541(SQL.105).aspx
http://streaminsight.codeplex.com/
http://europe.msteched.com/topic/list/