Oct 21, 2013

How to survive the document & data tsunami?

Lambda Verdonckt

Business Analyst
TenForce




We know how to handle large data, regardless of the technology used.

Semantic Technology




The only purpose-built technology to survive a tsunami of documents and data.

Semantic Technology




Leveraging information in old systems, no need to change the current way of working.

How did we end up here in the first place?

Semantic Technology

Turns the web of documents into a web of data.

Turns the web as a virtual library into a virtual database.

TenForce applies these technologies in corporate environments.


How to survive the document & data tsunami?

Semantic Technology


1. State-of-the-art
2. Examples
3. Future

Semantic Technology

The meaning of the data is encoded separately











The only purpose-built technology for handling a tsunami of data, in a flexible way.



data

Software understands the data and can reason about it

(JohnDoe, type, Customer)

(JohnDoe, owns, Account123)

(Account123, type, BankingAccount)


model

[Diagram: Customer, type, Person, owns, Account]

=> ontology, thesaurus, taxonomy etc.
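The data/model split above can be sketched in a few lines of Python. This is only an illustration, not TenForce's implementation: the data triples are copied from the slide, while the `subClassOf` predicate, the two model triples and both function names are assumptions added for the example (the slide's diagram does not spell out how its nodes are connected).

```python
# Data triples, copied from the slide.
data = {
    ("JohnDoe", "type", "Customer"),
    ("JohnDoe", "owns", "Account123"),
    ("Account123", "type", "BankingAccount"),
}

# Hypothetical model triples: "subClassOf" and this hierarchy are assumptions.
model = {
    ("Customer", "subClassOf", "Person"),
    ("BankingAccount", "subClassOf", "Account"),
}


def infer_types(data, model):
    """Add (x, type, D) whenever (x, type, C) and (C, subClassOf, D) hold."""
    facts = set(data)
    changed = True
    while changed:  # repeat until no new fact is derived
        changed = False
        for (s, p, o) in list(facts):
            if p != "type":
                continue
            for (c, q, d) in model:
                if q == "subClassOf" and c == o and (s, "type", d) not in facts:
                    facts.add((s, "type", d))
                    changed = True
    return facts


def persons_owning_accounts(facts):
    """Query: every Person that owns something typed as Account."""
    return sorted(
        s
        for (s, p, o) in facts
        if p == "owns"
        and (s, "type", "Person") in facts
        and (o, "type", "Account") in facts
    )


inferred = infer_types(data, model)
print(persons_owning_accounts(inferred))
```

The query never mentions Customer or BankingAccount: the software reaches JohnDoe through the model, which is the sense in which the meaning is encoded separately from the data.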

Semantic Technology Standards

A set of standards & tools to work with large data sets



Semantic Technology Architectures


TenForce Semantic Offering

Consultancy

Projects

Training

Products

Semantic Technology



Assessment



Architectures


Modeling


Validation


Standards compliance


End-to-end projects




mixed teams


research projects


EU framework


Unique Training Offer


Introduction


Modeling


Programming


and many others



How to survive the document & data tsunami?

Semantic Technology


1. State-of-the-art
2. Examples
3. Future

Semantic Technology Solutions

The ‘semantic web’ is an application of semantic technology

Corporate solutions built with semantic technology include:

Knowledge Bases

Automatic Categorization & Archiving

Natural Language Processing in documents





Semantic Technology Solutions

TenForce projects

Publications Office of the EU: a thesaurus of European activities

Wolters Kluwer Globally: building a multilingual publishing bus

DG Employment of the EC: a taxonomy of European Skills, Competences & Occupations


Semantic Technology Solutions

Advanced examples

New York Times: automatic categorization & archiving with Linked Data

Amdocs: telecom solutions for pro-active decision support

Audi: modeling behaviour to make testing less error-prone

How to survive the document & data tsunami?

Semantic Technology


1. State-of-the-art
2. Examples
3. Future

Industry Analysts

Gartner: high benefit rating (2010)
“Semantic technologies offer … options that now are difficult or impossible”

HP: top 10 trend in BI (2010)
“New approaches are needed, and semantic technologies hold part of the solution.”

A vision of the data web

LOD2, a European FP7 project



Build the infrastructure for the web of data


Opportunities & challenges for all of us!



Future



We know the tsunami is coming, the question is who will be ready to survive?

www.tenforce.com

lambda.verdonckt@tenforce.com

twitter.com/LambdaVerdonckt


BACK-UP SLIDES

Semantic Technology Solutions

Knowledge Bases

Knowledge is captured in a model, making the DB a KB

Allows managing & sharing knowledge instead of mere storage

>50% of companies indicate the need to share stored knowledge (VALUE-IT)

Better & faster retrieval of information for decision support

Human-readable: typical CRM with search functionality
Machine-readable: expert systems, incl. reasoning, e.g. clinical decision support

Rules are part of the data instead of hard-coded: more readily adaptable to changing needs, while interoperable with existing DBs
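A minimal sketch of "rules are part of the data", under stated assumptions: the facts, the rule format (`if` / `then` keys) and the `apply_rules` helper are all hypothetical and only illustrate the idea that behaviour changes by editing data rather than code.

```python
# Facts and rules are both plain data. All names below are hypothetical.
facts = {
    ("JohnDoe", "type", "Customer"),
    ("JohnDoe", "contractStatus", "Expired"),
}

# Each rule: if a subject satisfies every (predicate, object) condition,
# add the conclusion (predicate, object) for that subject.
rules = [
    {
        "if": [("type", "Customer"), ("contractStatus", "Expired")],
        "then": ("suggestedAction", "RenewalOffer"),
    },
]


def apply_rules(facts, rules):
    """Apply every rule once to every subject (single pass, no chaining)."""
    derived = set(facts)
    subjects = {s for (s, _, _) in facts}
    for rule in rules:
        for s in subjects:
            if all((s, p, o) in derived for (p, o) in rule["if"]):
                p, o = rule["then"]
                derived.add((s, p, o))
    return derived


result = apply_rules(facts, rules)
print(("JohnDoe", "suggestedAction", "RenewalOffer") in result)
```

Adapting to a changed need means appending to or editing the `rules` list; `apply_rules` itself stays untouched. A real expert system would also chain rules until nothing new is derived, which this single-pass sketch deliberately leaves out.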


Semantic Technology Solutions

Automatic Categorization & Archiving

Categorization based on controlled vocabularies (taxonomies, thesauri, ontologies)

makes content more searchable: better!

eliminates cost of labour-intensive processes: cheaper!

vs. user-driven categorization & tagging (web 2.0)

Remark: Look at Evri as an online example!
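Categorization against a controlled vocabulary can be sketched as a lookup of known labels in the text. The mini-thesaurus below is invented for illustration, and `categorize` is a hypothetical helper; production systems work with far richer vocabularies and matching.

```python
import re

# Hypothetical mini-thesaurus: concept -> labels (synonyms) that signal it.
VOCABULARY = {
    "Banking": ["bank", "account", "loan"],
    "Telecom": ["call center", "billing", "mobile"],
}


def categorize(text, vocabulary=VOCABULARY):
    """Return the sorted concepts whose labels occur as whole words in the text."""
    lowered = text.lower()
    hits = set()
    for concept, labels in vocabulary.items():
        for label in labels:
            # \b keeps "bank" from matching inside a longer word.
            if re.search(r"\b" + re.escape(label) + r"\b", lowered):
                hits.add(concept)
                break
    return sorted(hits)


print(categorize("The customer asked about a loan during a call center session."))
```

Because the categories come from a fixed vocabulary rather than from free user tags, two documents about the same topic end up under the same concept even when they use different words for it.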

Semantic Technology Solutions

Natural Language Processing

Software that analyzes the structure and meaning of textual information

analyze texts,

identify terms & concepts,

extract information,

understand meaning

Automatic categorization & archiving based on NLP

Tools: Alchemy, OpenCalais, PoolParty
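The first two steps listed above (analyze texts, identify terms) can be approximated with a toy frequency count. This is a deliberately simple stand-in and makes no claim about how Alchemy, OpenCalais or PoolParty work; the stop-word list and both function names are assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Analyze: split the text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(text, top_n=3):
    """Identify: return the most frequent non-stop-word tokens."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically, for a deterministic result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


print(candidate_terms("The taxonomy of skills and the taxonomy of occupations", top_n=2))
```

Counting words identifies candidate terms but not their meaning; linking those terms to concepts in a vocabulary or ontology is what the later steps on the slide add.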

Wolters Kluwer Global

Multilingual publishing system in an EU context for Legal, Tax & Regulatory

2010 TenForce

DG Employment of the EU Commission

ESCO, a taxonomy of European Skills, Competences & Occupations



DG Employment of the EU Commission

A Semantic Job Portal to leverage the information in ESCO and other information on the web



Advanced examples

Publishing

New York Times


in-house developed vocabulary

automatic categorization & archiving

published as Linked Data (open to the world!)

http://data.nytimes.com/


Advanced examples

Telecom

Amdocs









Knowing why a customer is calling saves 3’ per call (or 0,30)!

[Diagram labels: RDF; billing, social fora, call center logs, ...; advanced inference; pro-active decision support]

Advanced examples

Manufacturing

Audi (Ontoprise)

Testing electronic systems in cars using simulations

huge amounts of data are recorded

to be collected and analyzed

time-consuming & error-prone

Need for a standardized way to describe

desired system behaviour

known error-cases

Solution: ontology-driven & visualized