Ontology-Based Computing

Oct 22, 2013

Kenneth Baclawski

Northeastern University and Jarg

The Onslaught


Increasingly large amounts of information are
becoming accessible electronically.


The information sources are increasingly
complicated.


The diversity of types of information source
is also increasing.


Technologies are emerging to cope with this
onslaught: ontology-based computing.

Ontologies


Shared understanding within a community
of people


Declarative specification of entities and
their relationships with each other


Constraints and rules that permit reasoning
within the ontology


Behavior associated with stated or inferred
facts

Relational Database Schemas


Well-established technique for specifying
the structure of shared data, not for
communication between people or agents


Declarative specification but of tables, not
of entities and relationships


Some constraints are expressible but no
significant rules (such as inheritance)


No explicit behavior


Standard language is SQL.



Object-Oriented Schemas


Emerging technology for communication
between software components


Declarative specifications


Constraints and some rules


Several ways to specify behavior


The Unified Modeling Language (UML) is
the standard OO modeling language.


Logic


Very expressive but very difficult to use.
Not designed for communication.


Most logical languages are not based on
entities and relationships.


Very powerful inferencing capabilities.


Do not usually have any associated
behavior.


Many examples: Prolog, KIF, Slang, ...

XML DTDs and XML Schema


A DTD defines a hierarchical document type.
XML Schema also defines data types. Both are
designed for communication over the Web.


Good support for entities and hierarchical
relationships; awkward for others.


Constraints can be imposed on the
hierarchical structure and on data types.


Behavior can be specified procedurally.

Knowledge Representations


Very well developed branch of AI. Many
tools, but mostly academic. Not yet used
for communication over the Web.


Powerful language for specifying entities
and their relationships.


Most are linked with inference engines.


Behavior is typically handled in an ad hoc
manner.

RDF and DAML


Resource Description Framework (RDF) is
a knowledge representation language
serialized in XML. It is a World Wide Web
Consortium (W3C) Recommendation.


The DARPA Agent Markup Language
(DAML) is an extension of RDF to serve as
the basis for ontology-based computing
over the Web: the Semantic Web.

Ontological Reasoning in RDF

[Figure: RDF graph. Node labels: Class, Property, Person, Fish, owns, Wanda, Wendy. Edge labels: type, domain, range, owns. Annotation: Mermaid?]

Type constraint violation: The range of owns is Fish.

OR

There is no inconsistency: Wanda is a fish!
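
The two readings of the range declaration can be sketched in a few lines of Python. This is a minimal illustration, not a real RDF toolkit. The facts are assumptions, since the figure's edges are not fully recoverable: Wendy and Wanda are both typed as Person, Wendy owns Wanda, and owns has domain Person and range Fish. The function names are hypothetical.

```python
# Sketch: two readings of a range declaration on a property.
# The facts below are ASSUMED for illustration; they are not taken verbatim
# from the slide's figure.
TYPE, DOMAIN, RANGE = "type", "domain", "range"

triples = {
    ("Person", TYPE, "Class"),
    ("Fish", TYPE, "Class"),
    ("owns", TYPE, "Property"),
    ("owns", DOMAIN, "Person"),
    ("owns", RANGE, "Fish"),
    ("Wendy", TYPE, "Person"),
    ("Wanda", TYPE, "Person"),
    ("Wendy", "owns", "Wanda"),
}


def declared_ranges(graph):
    """Return (property, class) pairs for every range declaration."""
    return {(s, o) for s, p, o in graph if p == RANGE}


def range_violations(graph):
    """Constraint reading: the object must already be typed with the range class."""
    violations = []
    for prop, cls in declared_ranges(graph):
        for s, p, o in graph:
            if p == prop and (o, TYPE, cls) not in graph:
                violations.append((s, p, o, cls))
    return sorted(violations)


def infer_range_types(graph):
    """Inference reading: using the property entails the object's type."""
    inferred = set(graph)
    for prop, cls in declared_ranges(graph):
        for s, p, o in graph:
            if p == prop:
                inferred.add((o, TYPE, cls))
    return inferred


violations = range_violations(triples)
closure = infer_range_types(triples)
wanda_types = sorted(o for s, p, o in closure if s == "Wanda" and p == TYPE)

print(violations)   # constraint reading reports Wendy owns Wanda
print(wanda_types)  # inference reading types Wanda as both Fish and Person
```

Under the inference reading the same check finds nothing to report, because Wanda has been inferred to be a Fish as well as a Person (the "Mermaid?" case).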

[Figure: graph labeled DAML. Node labels: Class, Property, Restriction, College, Student, majors, 1, Arts & Sciences, Engineering, George. Edge labels: type, domain, range, subClassOf, onProperty, maxCardinality, majors, equivalentTo.]

Cardinality constraint violation: George can’t have two majors

OR

There is no inconsistency: Engineering = Arts & Sciences
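
The cardinality example has the same two readings, sketched below in Python. The facts are assumptions based on the figure's labels: George is a Student who majors in both Engineering and Arts & Sciences, and majors is restricted to a maximum cardinality of 1. The helper names are hypothetical and this is not a DAML reasoner.

```python
# Sketch: two readings of a maxCardinality 1 restriction.
# Facts are ASSUMED from the figure's labels, for illustration only.
facts = {
    ("George", "type", "Student"),
    ("Engineering", "type", "College"),
    ("Arts & Sciences", "type", "College"),
    ("George", "majors", "Engineering"),
    ("George", "majors", "Arts & Sciences"),
}
MAX_CARDINALITY = {"majors": 1}  # stands in for the Restriction node


def values_by_subject(facts, limits):
    """Group the values of each restricted property by (subject, property)."""
    grouped = {}
    for s, p, o in facts:
        if p in limits:
            grouped.setdefault((s, p), set()).add(o)
    return grouped


def cardinality_violations(facts, limits):
    """Constraint reading: distinct names are distinct things, so count them."""
    return sorted(
        (s, p, sorted(vals))
        for (s, p), vals in values_by_subject(facts, limits).items()
        if len(vals) > limits[p]
    )


def infer_equivalences(facts, limits):
    """Inference reading: with a limit of 1, all values must name one thing."""
    equivalences = set()
    for (s, p), vals in values_by_subject(facts, limits).items():
        if limits[p] == 1 and len(vals) > 1:
            ordered = sorted(vals)
            for other in ordered[1:]:
                equivalences.add((ordered[0], "equivalentTo", other))
    return sorted(equivalences)


violations = cardinality_violations(facts, MAX_CARDINALITY)
equivalences = infer_equivalences(facts, MAX_CARDINALITY)
print(violations)
print(equivalences)
```

Which reading applies depends on whether different names are assumed to denote different things. The constraint reading assumes they do; the inference reading does not.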

Representing information


Relational database: records


OO database: objects and links


Logic: facts


XML: documents


Knowledge Representations: annotations


All of these are graph structures: entities
related to other entities by relationships.
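
As a rough illustration of the graph view, the Python sketch below flattens a database record and an XML document into (entity, relationship, entity) triples. The naming scheme for nodes and the "child" and "text" relationship names are assumptions made for this example, not a standard mapping.

```python
# Sketch: a record and an XML document both reduce to labeled edges.
# Node naming and the "child"/"text" relationships are ASSUMED conventions.
import xml.etree.ElementTree as ET


def record_to_triples(table, key, row):
    """One triple per non-key column: (row node, column name, value)."""
    subject = f"{table}/{row[key]}"
    return [(subject, col, str(val)) for col, val in row.items() if col != key]


def xml_to_triples(xml_text):
    """One 'child' edge per nested element and one 'text' edge per text node."""
    root = ET.fromstring(xml_text)
    triples = []

    def walk(elem, path):
        for i, child in enumerate(elem):
            child_path = f"{path}/{child.tag}[{i}]"
            triples.append((path, "child", child_path))
            if child.text and child.text.strip():
                triples.append((child_path, "text", child.text.strip()))
            walk(child, child_path)

    walk(root, root.tag)
    return triples


record_triples = record_to_triples(
    "student", "id", {"id": 7, "name": "George", "college": "Engineering"}
)
xml_triples = xml_to_triples(
    "<student><name>George</name><college>Engineering</college></student>"
)
print(record_triples)
print(xml_triples)
```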

Where is the meaning?



Databases: select-project-join queries



Logic: rules determined by unification



XML: XSLT patterns



Knowledge Representations: templates


All of these are forms of graph matching.
The units of meaning are small connected
subgraphs that I call motifs.
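
Graph matching against a small connected pattern can be sketched as follows. This is a minimal backtracking matcher in Python, assuming triples of strings and a "?name" convention for variables. Both the convention and the sample facts are hypothetical, chosen only to illustrate the idea.

```python
# Sketch: match a small connected pattern against a graph of triples.
# Terms starting with "?" are variables (an ASSUMED convention).
def match(pattern, graph, binding=None):
    """Return every variable binding under which all pattern triples occur in graph."""
    binding = binding or {}
    if not pattern:
        return [binding]
    first, rest = pattern[0], pattern[1:]
    results = []
    for triple in graph:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            results.extend(match(rest, graph, candidate))
    return results


# Sample facts, assumed for illustration.
graph = [
    ("Wendy", "type", "Person"),
    ("Wanda", "type", "Fish"),
    ("Wendy", "owns", "Wanda"),
    ("George", "type", "Student"),
]

# Pattern: a person who owns a fish.
pattern = [
    ("?p", "type", "Person"),
    ("?p", "owns", "?f"),
    ("?f", "type", "Fish"),
]

bindings = match(pattern, graph)
print(bindings)
```

The shared variables play the role of join conditions, which is why a select-project-join query, a rule body, and a template can all be read as the same kind of pattern.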

Ontology Infrastructure


Simply introducing a language is not enough.
There must be an infrastructure to support
ontology-based computing, including:


Ontology development tools


Content creation systems


Storage and retrieval systems


Ontology reasoning, mediation, ...


Integration with applications

Ontology Development


Ontologies can be developed using
graphical tools specifically for ontologies or
by adapting existing tools such as CASE
tools.


Testing ontologies is not easy because they
include constraints and inference rules.


Ontology testing is analogous to type
checking in programming languages.

Content Creation


Databases: Data warehousing technology


Text: Natural Language Processing (NLP)


Image processing


Direct creation of content


No matter how the content is created, it must
be tested using consistency checking.

Storage and Retrieval


Scaling up will require high-performance,
distributed storage and indexing technology.


The natural units for indexing are the motifs
(precomputed joins), but the number of
motifs is large.


Jarg Corporation has developed a scalable,
high-performance indexing technology for
ontology-based knowledge representations.
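
To make the idea of indexing by small subgraphs concrete, here is a toy in-memory sketch in Python. It is not Jarg's technology, whose details are not given here. It assumes that a fragment is either a single triple or a pair of triples sharing a node, and it scores documents by the number of fragments they share with the query. All names and sample data are hypothetical.

```python
# Sketch: an inverted index keyed by small connected subgraphs.
# ASSUMPTIONS: fragments are single triples or pairs of triples that share a
# node; scoring is a plain count of shared fragments. Not a real product's design.
from collections import defaultdict
from itertools import combinations


def fragments(triples):
    """Enumerate single triples and connected pairs of triples."""
    triples = sorted(set(triples))
    frags = {frozenset([t]) for t in triples}
    for a, b in combinations(triples, 2):
        if {a[0], a[2]} & {b[0], b[2]}:  # the two triples share a node
            frags.add(frozenset([a, b]))
    return frags


def build_index(docs):
    """Map each fragment to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, triples in docs.items():
        for frag in fragments(triples):
            index[frag].add(doc_id)
    return index


def search(index, query_triples):
    """Rank documents by how many query fragments they contain."""
    scores = defaultdict(int)
    for frag in fragments(query_triples):
        for doc_id in index.get(frag, ()):
            scores[doc_id] += 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "doc1": [("Wendy", "owns", "Wanda"), ("Wanda", "type", "Fish")],
    "doc2": [("George", "majors", "Engineering"), ("George", "type", "Student")],
    "doc3": [("Wendy", "owns", "Wanda"), ("Wendy", "type", "Person")],
}
index = build_index(docs)
ranking = search(index, [("Wendy", "owns", "Wanda"), ("Wanda", "type", "Fish")])
print(ranking)
```

Even in this toy version the number of pair fragments grows quadratically with the number of triples in a document, which hints at why the number of motifs becomes a scaling concern.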

Jarg Architecture

[Figure: Jarg architecture. Indexing path labels: Document, NLP, Knowledge Representation, fragmentation, Knowledge Fragments, Distributed Index Engine. Query path labels: Query, NLP, Knowledge Representation, fragmentation, Knowledge Motifs, Matching, Documents.]

Conclusion


Ontology-based computing is emerging as a
natural evolution of existing technologies to
cope with the information onslaught.


Ontology-based technology must be
scalable if it is to contribute to the solution
rather than add to the problem.


Consistency checking is important for the
development of ontologies and content.


Bibliography


Semantic Web: www.w3.org/2001/sw


Ontologies: www.ontology.org


Unified Modeling Language: www.omg.org/uml


Knowledge Interchange Format: logic.stanford.edu/kif


Specware and Slang: www.kestrel.edu


XML and XML Schema: www.w3.org/xml


RDF and RDFS: www.w3.org/rdf


DAML: www.daml.org


Notation 3: www.w3.org/DesignIssues/Notation3.html


Consistency checking: vis.home.mindspring.com


Jarg Knowledge Engine: www.jarg.com