Microsoft Semantic Engine - MSDN

21 Oct 2013

MICROSOFT SEMANTIC ENGINE

Unified Search, Discovery and Insight

Significant Content is Outside Structured Storage (RDBMS, OLAP, BI)

Integration of this Content is Prohibitively Expensive (Time, Money, Resources)

Extracting Insight, Analytics, and Recommendations is even harder

Situation is a Confluence of Search | Predictive Analytics | Large-Scale Collaborative Filtering

Having all forms of digital information on a single platform allows people to blend unstructured and structured content and to drive insight and decision making


Microsoft Semantic Engine provides a combination of technologies to form a contextual understanding of all digital content

Critical Business Need

Analysts gather documents, media and web content about “Business Analytics”, “Data Integration” and “Search and Discovery”

Core Machine Learning

Unsupervised learning infers a “Unified Information Access” concept cluster based on automated analysis of content

Efficient Data Aggregation

The cluster gains in relevance from mining across unstructured and structured sources added from ERP and BI systems

User Relevance Boost

Users (BDM) re-label the cluster as “Unified Search, Discovery and Insight” and the engine adopts it, further boosting that cluster's relevance

Collaborative Boost

Analysts collate this content, requiring multi-resolution super-clusters with embedded sub-clusters

Business Decision Making

The CxO explores the super-cluster and drafts a business plan for her new division
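
The scenario above (content is clustered automatically, a user re-labels a cluster and its relevance is boosted, and clusters are nested into super-clusters) can be sketched roughly as follows. This is only an illustration under stated assumptions: the deck does not describe the engine's actual algorithms, so the bag-of-words cosine similarity, the `threshold` and `boost_factor` parameters, and every name below are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def term_vector(text: str) -> Counter:
    """Bag-of-words term counts; a stand-in for real content analysis."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Cluster:
    label: str
    members: list = field(default_factory=list)
    centroid: Counter = field(default_factory=Counter)
    boost: float = 1.0  # hypothetical relevance multiplier
    children: list = field(default_factory=list)  # sub-clusters, if any


def cluster_documents(docs, threshold=0.3):
    """Greedy single-pass clustering: join the closest cluster or start a new one."""
    clusters = []
    for doc in docs:
        vec = term_vector(doc)
        best = max(clusters, key=lambda c: cosine(vec, c.centroid), default=None)
        if best is not None and cosine(vec, best.centroid) >= threshold:
            best.members.append(doc)
            best.centroid.update(vec)  # accumulate term counts
        else:
            label = vec.most_common(1)[0][0] if vec else "(empty)"
            clusters.append(Cluster(label, [doc], Counter(vec)))
    return clusters


def relabel(cluster, new_label, boost_factor=1.5):
    """A user-supplied label replaces the inferred one and raises the boost."""
    cluster.label = new_label
    cluster.boost *= boost_factor


def super_cluster(label, children):
    """Group existing clusters under one parent (multi-resolution view)."""
    return Cluster(label, children=list(children))


docs = [
    "business analytics data report",
    "data integration business analytics",
    "quarterly travel expenses",
]
clusters = cluster_documents(docs)
relabel(clusters[0], "Unified Search, Discovery and Insight")
top = super_cluster("Unified Information Access", clusters)
```

A greedy single pass keeps the sketch short; a production system would more likely use a proper clustering algorithm over richer features.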

|
Search and Collaboration
| Personalized search, discovery and organization

Legal
| Precedent and subject based search over large scale textual corpuses

Life Sciences
| Systems biology with large volume data correlation and search

Government Services
| Intelligence, real-time analytics, visualization, clustering

Social Networking
| Social graph relevance mining, ranking criteria auto tuning

|
Unified Search, Discovery and Insight

Automatic Clustering and Organization

Meaning-Driven Indexing, Classification and Storage

Scalable Content Processing over all Content Types

Instant On Experience for Out of Box Value

|
Search, Discover and Organize features exposed via sample UX gallery

Seamless installation and indexing of desktop, email and web content

Fully documented Managed APIs used in UX gallery and JavaScript / C# samples

|
Streams | Descriptors (Properties) | Kinds (Concepts)

Streams processed into contextualized and indexed concepts for search | discovery | organization

KR_CLIENT_225.docx | STREAM

LEGAL DOCUMENT | CONCEPT

BILLABLE WORK | CONCEPT

EVIDENCE | CONCEPT

DEPOSITION | CONCEPT

EXTRACTED PROPERTIES | PROPERTY

LEGAL CASE [xxx] | CONCEPT CLUSTER

SEARCH AND SHARE

MDP

|
The engine consists of a self-contained set of pluggable services

Text Processing

Image Processing

Video Processing

Audio Processing

Supervised Machine Learning

Clustering

MDI (RBV)

Conceptual Search

Inference

Sequence Store (Suffix Tree)

Distributed Content Store

Ontology and Taxonomy Management

Semantic Engine

Search and Markup

Trend and Predictive Analysis

Automatic Organization

Recommendation and Discovery
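
The "Sequence Store (Suffix Tree)" service listed above is not described further in the deck. As a rough illustration of the idea behind suffix-based lookup, the sketch below uses a sorted suffix list (a simpler relative of a suffix tree, swapped in here for brevity) to find every occurrence of a pattern. The class name and methods are hypothetical and are not the engine's API.

```python
class SequenceStore:
    """Hypothetical sequence store backed by a sorted list of suffix offsets.

    A real suffix tree offers faster construction and lookup; the sorted
    suffix list used here only illustrates the lookup principle.
    """

    def __init__(self, text: str) -> None:
        self.text = text
        # Naive construction: sort suffix start offsets by the suffix text.
        self.suffixes = sorted(range(len(text)), key=lambda i: text[i:])

    def find(self, pattern: str) -> list:
        """Return the sorted start offsets of every occurrence of pattern."""
        if not pattern:
            return []
        size = len(pattern)
        lo, hi = 0, len(self.suffixes)
        # Binary search for the first suffix whose prefix is >= pattern.
        while lo < hi:
            mid = (lo + hi) // 2
            start = self.suffixes[mid]
            if self.text[start:start + size] < pattern:
                lo = mid + 1
            else:
                hi = mid
        # All matching suffixes are contiguous from that position.
        hits = []
        while lo < len(self.suffixes) and self.text.startswith(
            pattern, self.suffixes[lo]
        ):
            hits.append(self.suffixes[lo])
            lo += 1
        return sorted(hits)


store = SequenceStore("banana")
offsets = store.find("ana")  # offsets of both overlapping matches
```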

|
The logical architecture partitions analysis, indexing and storage

API 1 | API 2 | API 3

Analysis 1 | Analysis 2 | Analysis 3

Staging | Core | Index | Stream

Store(<content>) | Annotate(<kind>) | Index(<content>) | Organize(<kinds>) | Search(<query>)

Text | Image | Audio | Video
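
The five operations named on this slide (Store, Annotate, Index, Organize, Search) can be pictured with a small in-memory stand-in. This is a sketch under assumptions: the deck gives only the operation names, so the class, the method signatures, the integer item ids and the plain inverted index below are all hypothetical and do not reflect the actual managed API.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Item:
    content: str
    kinds: set = field(default_factory=set)


class SemanticEngineSketch:
    """Hypothetical in-memory stand-in for the five operations on the slide."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.items = {}                    # item id -> Item
        self.inverted = defaultdict(set)   # term -> item ids

    def store(self, content: str) -> int:
        """Store content and index it; returns the new item id."""
        item_id = next(self._ids)
        self.items[item_id] = Item(content)
        self.index(item_id)
        return item_id

    def annotate(self, item_id: int, kind: str) -> None:
        """Attach a kind (concept) to a stored item."""
        self.items[item_id].kinds.add(kind)

    def index(self, item_id: int) -> None:
        """Add the item's terms to a simple inverted index."""
        for term in self.items[item_id].content.lower().split():
            self.inverted[term].add(item_id)

    def organize(self, kinds) -> dict:
        """Group item ids under each requested kind."""
        return {
            kind: sorted(i for i, item in self.items.items() if kind in item.kinds)
            for kind in sorted(kinds)
        }

    def search(self, query: str) -> list:
        """Return ids of items containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.inverted.get(t, set()) for t in terms))
        return sorted(hits)


engine = SemanticEngineSketch()
doc = engine.store("Deposition transcript for client")
engine.annotate(doc, "Evidence")
results = engine.search("deposition client")
```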

|
Designed to be hassle-free out of the box

Several programming languages and frameworks supported

CLR/.NET, JavaScript, TSQL, C++

|
Sample of storing a stream in the system

Initiates the content processing, classification, and indexing

|
Sample of search and recommendations

Returns contextual results from the store and the web

|
Seamless Integration in Windows Desktop Federated Search

Expose Meaning-Driven Indexing and Semantic Actions

Zero Learning Curve

|
Importers

Files

Plug-Ins

Semantic Engine

Database: Kind, Descriptor, Stream, KindLink, ListKind

|
KindID | SourceUri
00000000-1111 | C:\My Documents\Saint Germain Des Pres Cafe (Finest electro-jazz compilation)\05 Track 5.wma

StreamID | KindID | StreamUri | Format | Stream
11111111-2222 | 00000000-1111 | | audio/x-ms-wma | 0xFFD8FFE000104A4649460001…

DescriptorID | KindID | Type | Attribute | Value
10000000-0000 | 00000000-1111 | Classification | Audio | 1.0
20000000-0000 | 00000000-1111 | Metadata | Name | 05 Track 5.wma
30000000-0000 | 00000000-1111 | Metadata | Item Type | Windows Media Audio File
40000000-0000 | 00000000-1111 | Metadata | Length | 00:05:22
50000000-0000 | 00000000-1111 | Metadata | WM/ProviderStyle | Electronica
60000000-0000 | 00000000-1111 | Audio | Tonality/Major | 0.78
70000000-0000 | 00000000-1111 | Audio | Tempo/Moderato | 0.79
80000000-0000 | 00000000-1111 | Classification | Music | .8
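
To make the descriptor rows above concrete, here is a minimal sketch that holds them in memory and reads back the classifications for one kind. The row values are taken from the slide. The `Descriptor` class, the `classifications` helper, and the reading of a Classification row's Value as a confidence score are assumptions, not part of the documented schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    descriptor_id: str
    kind_id: str
    type: str
    attribute: str
    value: str


KIND_ID = "00000000-1111"

# Rows copied from the Descriptor table on the slide.
DESCRIPTORS = [
    Descriptor("10000000-0000", KIND_ID, "Classification", "Audio", "1.0"),
    Descriptor("20000000-0000", KIND_ID, "Metadata", "Name", "05 Track 5.wma"),
    Descriptor("30000000-0000", KIND_ID, "Metadata", "Item Type", "Windows Media Audio File"),
    Descriptor("40000000-0000", KIND_ID, "Metadata", "Length", "00:05:22"),
    Descriptor("50000000-0000", KIND_ID, "Metadata", "WM/ProviderStyle", "Electronica"),
    Descriptor("60000000-0000", KIND_ID, "Audio", "Tonality/Major", "0.78"),
    Descriptor("70000000-0000", KIND_ID, "Audio", "Tempo/Moderato", "0.79"),
    Descriptor("80000000-0000", KIND_ID, "Classification", "Music", ".8"),
]


def classifications(descriptors, kind_id, min_confidence=0.0):
    """Return {attribute: score} for the Classification rows of one kind.

    Assumption: the Value column of a Classification row is a numeric
    confidence score.
    """
    result = {}
    for d in descriptors:
        if d.kind_id == kind_id and d.type == "Classification":
            score = float(d.value)
            if score >= min_confidence:
                result[d.attribute] = score
    return result


all_labels = classifications(DESCRIPTORS, KIND_ID)
confident = classifications(DESCRIPTORS, KIND_ID, min_confidence=0.9)
```

Keeping every descriptor as a (Type, Attribute, Value) row means new analysis results can be appended without changing the table layout.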

|
All change data is returned to MSE as one XML block

MSE data is exposed through custom views keyed to the users' primary keys
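
The deck does not show what the XML change block looks like. Purely as an illustration of batching row changes into a single XML document and reading them back by primary key, the sketch below uses invented element and attribute names (`changes`, `change`, `table`, `key`, `op`); none of them come from the product.

```python
import xml.etree.ElementTree as ET


def build_change_block(changes) -> str:
    """Serialize (table, primary_key, operation) tuples into one XML string.

    The element and attribute names are hypothetical.
    """
    root = ET.Element("changes")
    for table, primary_key, operation in changes:
        ET.SubElement(root, "change", table=table, key=str(primary_key), op=operation)
    return ET.tostring(root, encoding="unicode")


def parse_change_block(xml_text: str) -> list:
    """Read the block back as (table, key, operation) tuples of strings."""
    root = ET.fromstring(xml_text)
    return [(c.get("table"), c.get("key"), c.get("op")) for c in root.iter("change")]


block = build_change_block([("Orders", 42, "update"), ("Customers", 7, "insert")])
changes = parse_change_block(block)
```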

|
Seamless Integration of Meaning-Driven Indexing in ALL SQL Tables

Expose Meaning-Driven Indexing via T-SQL

PARTING THOUGHTS

Unified Search, Discovery and Insight over Every Digital Artifact

Extensible and Scalable Semantic Platform

Zero Learning Curve


channel9.msdn.com/learn
Built by Developers for Developers….

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.