Data Quality - MSDN

23 Oct 2013

SELECT COUNT(*) FROM ParkingLot
WHERE type = 'AUTO'
AND color = 'RED'

red cars
last hour
Doesn’t seem like a great solution…

This is the streaming data paradigm in a nutshell: ask questions about data in flight.
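To make the "red cars in the last hour" question concrete, here is a minimal Python sketch of a standing query that keeps a count over a trailing one-hour window as events arrive, instead of first loading them into a table and running SELECT COUNT(*). It is an illustration only, not StreamInsight code: the class name SlidingWindowCounter, the event dictionaries, and the sample timestamps are all assumptions made for this example.

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingWindowCounter:
    """Counts matching events seen within a trailing time window (hypothetical helper)."""

    def __init__(self, window, predicate):
        self.window = window        # length of the trailing window
        self.predicate = predicate  # decides which events are counted
        self._times = deque()       # timestamps of matching events, oldest first

    def _expire(self, now):
        # Drop timestamps that fell out of the window (now - window, now].
        cutoff = now - self.window
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()

    def observe(self, timestamp, event):
        """Feed one event (timestamps assumed non-decreasing); return the current count."""
        if self.predicate(event):
            self._times.append(timestamp)
        self._expire(timestamp)
        return len(self._times)


def is_red_auto(event):
    # Same condition as the WHERE clause in the SQL query.
    return event["type"] == "AUTO" and event["color"] == "RED"


counter = SlidingWindowCounter(timedelta(hours=1), is_red_auto)
start = datetime(2013, 10, 23, 9, 0)
events = [
    (start, {"type": "AUTO", "color": "RED"}),
    (start + timedelta(minutes=10), {"type": "AUTO", "color": "BLUE"}),
    (start + timedelta(minutes=30), {"type": "AUTO", "color": "RED"}),
    (start + timedelta(minutes=70), {"type": "TRUCK", "color": "RED"}),
]
# One answer per arriving event: the query result is itself a stream.
counts = [counter.observe(ts, ev) for ts, ev in events]
```

The answer is updated on every arrival and old events are discarded as soon as they leave the window, so nothing has to be stored beyond the last hour of matching timestamps.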

Microsoft StreamInsight

[Diagram: data flow from sources to consumers. Sources: devices, sensors; web servers; stock tickers and news feeds. Stage labels: Data Bus, Caching, Processing, Visualization, Distribution. Component labels: Message Bus; Operational Analytics; Automated Decisions; In-memory Database; Intra-Day Cubes; Historic Cubes; Reference Data Cache; Operational Dashboard (Ticking Snapshot); Reporting Dashboard (Refreshed); Static Reports; Mining, Validation, "What-If" Scenarios. Arrow labels: Refresh (Push), Re-compute (Pull).]

Target Scenarios

[Chart: application classes plotted by Aggregate Data Rate (Events/sec.), from 0 to ~1 million, against Latency, from months down to < 1 ms. Classes shown: Relational Database Applications; Data Warehousing Applications; Web Analytics Applications; Operational Analytics Applications (e.g., logistics); Manufacturing Applications; Monitoring Applications; Financial Trading Applications.]

Standing Queries

[Diagram: StreamInsight Application Development and StreamInsight Application at Runtime. Event sources (devices, sensors; web servers; event stores and databases; stock ticker, news feeds) pass through Input Adapters into the StreamInsight Engine, which runs standing queries (Query Logic). Results pass through Output Adapters to event targets (event stores and databases; pagers and monitoring devices; KPI dashboards, SharePoint UI; trading stations).]

Industry trends

Data acquisition costs are negligible
Raw storage costs are small and continue to decrease
Processing costs are non-negligible
Data loading costs continue to be significant

[Cycle diagram: Record raw data (history); Mine historical data; Devise new KPIs; Monitor KPIs; Manage business via KPI-triggered actions.]

Stream processing advantage

Act in almost real time
Process data incrementally, i.e., while it is in flight
Avoid loading while still doing the processing you want
Seamless querying for monitoring, managing and mining
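Processing data incrementally and acting on KPI-triggered conditions can be sketched as follows. This is a minimal Python illustration, not StreamInsight code: the RunningKpi class, the sample readings, and the limit of 18.0 are assumptions chosen for the example. The running mean is updated in constant time per event, so no history has to be loaded before the KPI can be monitored.

```python
class RunningKpi:
    """Maintains a running mean one event at a time (hypothetical helper)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: fold the new value in without keeping past values.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def monitor(readings, limit):
    """Return (running means, indexes at which the KPI exceeded the limit)."""
    kpi = RunningKpi()
    means, triggered = [], []
    for index, value in enumerate(readings):
        mean = kpi.update(value)
        means.append(mean)
        if mean > limit:
            # In a real system this is where a KPI-triggered action would fire.
            triggered.append(index)
    return means, triggered


means, triggered = monitor([10.0, 20.0, 30.0], limit=18.0)
```

Only the count and the current mean are kept in memory, which is the sense in which the data is processed "while it is in flight".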


[Diagram: Asset Instrumentation for Data Acquisition and Subscriptions to Data Feeds produce a Data Stream into the Event Processing Engine, which uses a Stream Data Store & Archive and a Lookup of Asset Specs & Parameters.]

Manufacturing: sensor on plant floor; react through device controllers; aggregated data; 10,000 events/sec
Web Analytics: click-stream data; online customer behavior; page layout; 100,000 events/sec
Financial Services: stock & news feeds; algorithmic trading; patterns over time; super-low latency; 100,000 events/sec
Power Utilities: energy consumption; outages; smart grids; 100,000 events/sec

Query types: threshold queries; event correlation from multiple sources; pattern queries

Uses: visual trend-line and KPI monitoring; batch & product management; automated anomaly detection; real-time customer segmentation; algorithmic trading; proactive condition-based maintenance

PI Input adapter feeds Process Values (data) and Quality Indicator (quality) tags into the piInput stream.


// Join the allDataStream with the threshold reference stream
// to create the alerting stream
var alertStream = from e in allDataStream
                  join th in thresholdsStream
                  on e.Path equals th.thresholdName
                  where e.Value > th.value
                  select new PIEvent<Double>
                  {
                      Annotation = e.Annotation,
                      Id = e.Id,
                      IsAnnotated = e.IsAnnotated,
                      IsEdited = e.IsEdited,
                      IsQuestionable = e.IsQuestionable,
                      Path = e.Path.Replace(".PV", ".ALARMTRIG"),
                      Status = e.Status,
                      StatusText = e.StatusText,
                      Timestamp = e.Timestamp,
                      Value = e.Value
                  };
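The logic of the alerting query can be tried out without a StreamInsight installation. The following Python sketch mirrors it under simplifying assumptions: the thresholds are a static dictionary keyed by tag path rather than a reference stream, the PIEvent class here is a hypothetical stand-in carrying only three of the fields, and the tag names and values are invented. A StreamInsight join also takes event times into account, which this sketch ignores.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PIEvent:
    """Hypothetical stand-in for the PIEvent<Double> type used in the query."""
    path: str
    value: float
    timestamp: str


def alert_stream(all_data, thresholds):
    """Yield an alarm event for every reading that exceeds its tag's threshold.

    thresholds maps a tag path to its limit (assumed static for this sketch).
    """
    for e in all_data:
        limit = thresholds.get(e.path)  # "join ... on e.Path equals th.thresholdName"
        if limit is not None and e.value > limit:  # "where e.Value > th.value"
            # Copy the event, renaming the tag from process value to alarm trigger.
            yield replace(e, path=e.path.replace(".PV", ".ALARMTRIG"))


readings = [
    PIEvent("pump1.PV", 5.0, "t1"),
    PIEvent("pump1.PV", 12.0, "t2"),
    PIEvent("pump2.PV", 99.0, "t3"),  # no threshold defined, so no alert
]
alerts = list(alert_stream(readings, {"pump1.PV": 10.0}))
```

As in the query, readings without a matching threshold produce no output, and every other field of the event is carried over unchanged.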

http://europe.msteched.com/topic/list/