The Semantic Web and its Implications for E-learning








Copyright © 2002 Telematica Instituut

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from or via Telematica Instituut (http://www.telin.nl).


Colophon

Date: 1 May 2002
Version: 1.3
Change: -
Project reference: E-linCC (extra deliverable on Semantic Web)
TI reference: TI/RS/2002/026
URL: [URL]
Access permissions: Anyone
Status: Final
Editors: Jacco van Ossenbruggen (CWI) and Mettina Veenstra (Telematica Instituut)
Company: Telematica Instituut, CWI
Author(s): Jacco van Ossenbruggen (CWI), Rogier Brussee (Telematica Instituut), Lynda Hardman (CWI) and Mettina Veenstra (Telematica Instituut)
Synopsis: In this document we explain the basics and the status of Semantic Web technology. The document gives parties interested in the Semantic Web a basis to assess the relevance of the Semantic Web for their own applications, especially for e-learning related applications.




Preface

The history of the Web can be discussed in terms of three generations. First generation Web content encodes information in hand-written (HTML) Web pages. Second generation Web content generates HTML pages on demand, e.g. by filling in templates with content retrieved dynamically from a database or by transforming structured (XML) documents using style sheets (e.g. XSLT). Third generation Web pages will make use of rich mark-up along with metadata (e.g. RDF) schemes to make the content not only machine readable but also machine processable -- the first step towards the Semantic Web.

The goal of the Semantic Web is to allow applications to assess to what extent information is useful for a specific user or goal. Thus, the Semantic Web will provide support for applications that need to “understand” the content that is now implicitly hidden in Web documents. In this document we describe the state of the art of Semantic Web technology and its short, medium and long term expectations, in order to give parties interested in the Semantic Web a basis to assess the relevance of the Semantic Web for their own applications. We discuss at length the relevance of Semantic Web technology for e-learning, especially the adaptation of e-learning content to the user, and conclude that general purpose technology such as the Semantic Web does have considerable advantages for e-learning.




Table of Contents

1 Introduction
1.1 Goal and structure of this document
1.2 Semantic Web: some historical background
1.3 Basic concepts and terminology
2 A short state of the art of the Semantic Web
2.1 URIs and Unicode
2.2 XML
2.3 RDF
2.4 RDF Schema
2.5 Ontology languages: DAML+OIL and beyond
3 Cuypers: from 2nd to 3rd Generation Multimedia
4 Semantic Web and e-learning standards
4.1 Introduction
4.2 E-learning content standards
4.3 More flexibility with Semantic Web technology
4.4 Bonus advantage
4.5 When will Semantic Web technology be applied in practice?
5 Short, medium and long term expectations of the Semantic Web
5.1 Short term
5.2 Medium term
5.3 Long term
References




1 Introduction

1.1 Goal and structure of this document

The goal of this document is to explain the basics and status of Semantic Web technology, and to give sufficient information to allow other parties to assess the relevance of the Semantic Web for their own applications, especially for e-learning related applications.

In this chapter we start with some historical background and basic concepts and terminology with respect to the Semantic Web. In Chapter 2 the state of the art of Semantic Web technology is described. We also describe Cuypers, CWI's presentation generation system for second generation multimedia Web content, and discuss its relationship to the third generation Web, the (developing) Semantic Web (Chapter 3). We discuss the application of the Semantic Web to e-learning in Chapter 4 and propose a combination of e-learning standards and Semantic Web technology in order to make e-learning content optimally reusable and adaptable. A final chapter discusses expectations of the Semantic Web in the short, medium and (speculative) long term.

1.2 Semantic Web: some historical background

In the Internet days predating the Web, easy access to documents over the net was hindered by the wide variety of incompatible file formats, platform-specific document processing software, and different network transport protocols with their associated arcane command-line syntax. The Web changed all this so dramatically that it is now, just 10 years later, difficult to imagine that access to information via the Internet was, not so long ago, restricted to a small elite of mostly technically oriented academics. The short evolution of the Web has already been described in terms of first, second and third generation Web content [5].

In the first generation, the initial success of the Web was based on the browser, the single desktop application that provides its user a uniform interface to a wide variety of information on the Internet. The main incompatibilities described above were addressed by only three basic standards. URIs provide a simple but universal naming scheme, and HTTP a simple but fast transfer protocol. In theory, HTML was designed to provide the ``glue'' between various information resources in the form of hyperlinks, and to serve as a default document format Web servers could resort to when other available formats were not understood by the client. In practice, however, HTML turned out to be the format that was also used to put the bulk of the content on the Web [1]. A major problem of first generation Web content was that all this HTML content was manually written. This proved too inflexible when dealing with content that is stored in existing databases or that is subject to frequent changes. For larger quantities of hand-written documents, keeping up with changing browser technology or updating the ``look and feel'' proved to be hard.

In the current, second generation Web, the required flexibility is provided by a range of technologies based on automatic generation of HTML content. Approaches vary from filling in HTML templates with content from a database back-end, to applying CSS style sheets and XML-based technology. The main goal is to give the presentation the appropriate look and feel while storing the content itself in a form free of presentation and browser related details. Current trends on the Web make the flexibility provided by second generation Web technology even more relevant. The PC-based Web browser is no longer the only device used to access the Web. Content providers need to continuously adapt their content to take into account the significant differences between Web access using PCs and alternative devices, ranging from small-screen mobile phones and hand-held computers to set-top boxes for wide-screen, interactive TV [9]. Additional flexibility is required to take into account the different languages, cultural backgrounds, skills and abilities of the wide variety of users that may access their content.

By providing flexibility in terms of presentation and user interaction, second generation Web technology directly addresses the needs of human readers. It also provides the technology that allows parts of the manual document production process to be replaced by a (semi-)automatic process. It provides little or no support, however, for applications that need to ``understand'' the content that is now implicitly hidden in Web documents. The goal of the third generation Web -- the Semantic Web -- is to address the needs of this type of application.

1.3 Basic concepts and terminology

Most automatic processing, other than presenting a document to a human reader, currently involves ``guessing'' what the original author intended when the document was written. A common example is today's comparison-shopping, where sites help consumers to compare products and prices from different on-line stores. Collecting the data that feeds these sites, while partly a manual process, is largely automated by deploying shopbots. A shopbot is a piece of software that searches the Web for information on the price and quality of goods and services.

The technique used by today's shopbots is often referred to as ``screen-scraping'', because the shopbots interact with the on-line store's Web server through HTML pages that were designed for display on the screen. Note that the information the shopbot is looking for usually resides in a well-structured, machine understandable format in the on-line store's database. However, the shopbot has no access to this database, but only to the Web pages that were generated using data extracted from it. As a result, the shopbot designer has to try to find clever heuristics to automatically reconstruct the original structured information, based on the implicit information on the Web site. Typically, these heuristics are hard to develop, error-prone and only able to reconstruct part of the information. In addition, they are often sensitive to the (graphical) design of the website and thus need revision every time the on-line store changes its website.

Needless to say, screen-scraping is, at least in theory, an overly complex, expensive and fragile solution to this problem. The underlying problem is that first and second generation Web technology focuses primarily on content presentation, while the intended semantics of the content is only interpretable by humans and thus remains implicit for machines.

Third generation Web technology addresses this problem by making these intended semantics explicit, by supporting content that is both human and machine processable. Machine-processable content is the main prerequisite for the more intelligent Web services that constitute the ``Semantic Web'' as envisioned by Tim Berners-Lee and others [1], [2].




Information that is explicitly added to documents to assist automatic processing is usually referred to as metadata, that is, ``data about other data''. The process of adding metadata to existing information is known as annotation. Since annotation is, in most cases, a manual process that needs to be done by domain experts, it is very expensive and should thus be done ``right'' from the start. The type of information that is sufficiently relevant to add is, however, highly application dependent and, since it involves human interpretation of the sources being annotated, also highly subjective. Often, material is annotated at a time when it is not yet known which applications will use the metadata in the future, which makes the situation even worse. The encoding of metadata can also take many forms, ranging from trivially simple to utterly complex, and from very informal natural text to highly formalised logic-based notations.

Metadata-related issues touch the core of all information sciences, and models and technology for processing metadata have been influenced by many communities, in particular the digital library (DL) community and the knowledge representation (KR) community.

Within the DL community, metadata is, first of all, seen as a way of supporting cataloguing and retrieving large document collections. This has resulted in standards that address such issues, most notably the Dublin Core [4]. The Dublin Core basically standardises a set of 15 commonly agreed upon metadata elements of the type that one can expect to find in every library catalogue, including title, subject, creator, language, creation date, etc.
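To give an impression (this sketch is ours, not taken from the Dublin Core specification, and the page URL and values are invented), a few Dublin Core elements could be attached to a Web page using the RDF/XML syntax introduced in Chapter 2:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- A few of the 15 Dublin Core elements, applied to a fictitious page -->
  <rdf:Description rdf:about="http://www.example.org/apostle-paul.html">
    <dc:title>The Apostle Paul</dc:title>
    <dc:creator>Rembrandt van Rijn</dc:creator>
    <dc:subject>17th-century Dutch painting</dc:subject>
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>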

The metadata and document-centred focus of the DL community can be contrasted with the information modelling approach of the knowledge representation (KR) community, where the focus is on representing the underlying content rather than describing the document itself. For KR researchers, a well-designed, powerful infrastructure for adding metadata to Web documents forms the basis for publishing explicit, formalised forms of knowledge directly on the Web. To what extent, and how, this knowledge is associated with existing, informal Web documents are often considered secondary issues.

When it comes to sharing and communicating explicit knowledge, a key concept is the notion of an ontology. Within KR, ontologies are often defined as a ``specification of a conceptualisation'', that is, an explicit and commonly agreed-upon definition of the objects and concepts that play a role in a certain domain. These are specified along with the relations among them and the rules that limit the interpretation of the concepts. Given an ontology about a certain domain, parties that need to share and communicate knowledge do this by making an ontological commitment: a statement that both people and applications (agents) will use the terminology specified in the ontology according to the specified rules.

Despite the differences between the DL and KR approaches, many applications combine elements from both worlds. Ontologies, for example, are often used to improve automatic processing of metadata. By making a commitment to a specific ontology, users can be assisted in making annotations in a more systematic and consistent way. In addition, applications may use the ``background knowledge'' specified by the ontology in addition to the metadata itself. For example, when the metadata of a particular page about a painting only specifies that the painting is painted by Rembrandt van Rijn, a query for ``17th-century Dutch masters'' will not return the page. When the metadata is combined with an ontology stating that Rembrandt is indeed classified as a 17th-century Dutch master, the page can be returned in response to the query.


Another example of an approach that attempts to combine elements from both worlds is the MPEG-7 ISO standard [13]. Where Dublin Core limits itself to describing generic, bibliographical properties of documents, and the KR community focuses on modelling the underlying content, MPEG-7 attempts to combine both aspects and apply them to multimedia documents. This has resulted in a very large specification, and it is still too early to assess whether this ambitious approach will have any serious impact on multimedia applications in practice.

The ideas developed within the KR community have had such a significant influence on the design and development of the Semantic Web that one could come under the impression that the Semantic Web is ``only'' about re-wrapping well-known knowledge representation languages in a Web compatible syntax. This is not the case. There are characteristics that make the Semantic Web fundamentally different from traditional KR.

First of all, the amount of information that can be found on the Web is several orders of magnitude larger than the sizes of the knowledge bases that are common in stand-alone expert systems. This makes reasoning over all knowledge on the Semantic Web impractical, if not impossible. Reasoning on the Semantic Web will thus always be in the context of an explicitly defined set of documents that need to be taken into account.

Second, knowledge on the Web is distributed. That means, for example, that ontologies may reuse parts of other ontologies by reference and not, as is common in KR, by physically importing the needed fragment. In addition, it is unlikely that applications can fetch, load and parse all required knowledge at start-up time. An approach where knowledge fragments are retrieved on demand is more efficient, but also more complicated to realise.

Third, knowledge on the Semantic Web is heterogeneous and dynamic. Like all information on the Web, knowledge will be added, changed and removed all the time. It will differ in quality, it may not be trustworthy and it is potentially inconsistent. Semantic Web applications need to be able to deal with these features, which are, to a large extent, missing from current generation KR models and tools.

Finally, the process of knowledge modelling itself may change dramatically. Just like other information published on the Web, knowledge will be used and reused in contexts that were not foreseen by the original author. Ontological definitions will be extended by third parties, and combined with other ontologies in unpredictable ways.

Most of the issues mentioned above warrant new research, and will certainly not be supported by the first versions of the languages that are discussed in the following section.





2 A short state of the art of the Semantic Web

The ultimate goal of the Semantic Web is to allow machines to assess to what extent information found for the user is accurate and can be trusted, as depicted by the top layer in Figure 2-1. In order to reach this level of sophistication, more complex tasks are carried out by increasing the number of co-operating layers of specific-purpose languages and processing tools.



Figure 2-1: The layers of the Semantic Web as envisioned by Tim Berners-Lee, as presented during a talk at XML 2001 (see http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html).


2.1 URIs and Unicode

The basis of the whole Web pyramid is still the uniform naming scheme provided by the concept of the URI. The importance of the URI is often overlooked, but it is, to a certain extent, the defining characteristic of the Web. Anything that wants to be part of the Web needs to have a URI and, vice versa, anything that has a URI is by definition part of the Web. Note that this does not imply that something needs to be available electronically over the Internet to be part of the Web.

While earlier versions of HTML used to have a West-European bias by only allowing the ISO Latin-1 character set, the current Web infrastructure now supports a wide variety of other languages by allowing the full range of characters specified by Unicode [20].


2.2 XML

On top of this layer, the current, XML-based ``document web'' is built. This layer includes not only XML itself, but also XML Schema [30] and XML namespaces [3]. The current Web uses the syntactic rules specified by this layer, on top of which self-describing document languages such as XHTML [28], SMIL [25] and SVG [7] are defined. These documents are called self-describing because they have a text-based syntax with mark-up that is meaningful to human readers. For example, just by looking at its raw encoding, the content of a well-written HTML document could be interpreted by a human reader even when there is no HTML displaying software available (compare this with proprietary binary document formats, whose content is lost when the associated applications are no longer available).

XML already allows document languages to be specified in such a way that they can be shared by parties over the Internet. Due to the many XML-related standards and the common availability of the required tool support, a surprisingly great deal of interoperability results from just committing to the syntactic rules of XML. To list a few examples (a small sketch follows the list), just the simple fact that a document or data structure is encoded in XML allows you to:

- Specify the way your data should be presented to your users, by defining style sheets in languages such as CSS or XSLT. Free, off-the-shelf browsers are available to present XML data according to the rules specified in the style sheet.

- Specify, for a class of documents, what document structures are allowed in your application domain by defining an XML Schema. Again, free and off-the-shelf tools are available that can validate your documents against the rules defined in the schema.

- Rapidly prototype and build domain-specific applications by reusing commonly available, reusable XML software components such as XML parsers.
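As a minimal sketch of the first two points (the file names, element names and schema are invented for this example and are not part of any standard), consider a small XML instance that refers to both a style sheet and a schema:

<?xml version="1.0" encoding="UTF-8"?>
<!-- An off-the-shelf browser can render this document via the style sheet -->
<?xml-stylesheet type="text/xsl" href="catalogue.xsl"?>
<!-- An off-the-shelf validator can check the document against catalogue.xsd -->
<catalogue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="catalogue.xsd">
  <painting id="apostlePaul">
    <title>The Apostle Paul</title>
    <painter>Rembrandt van Rijn</painter>
  </painting>
</catalogue>

Nothing in this fragment tells a machine what a painting or a painter is; the tags are only meaningful to human readers, which is exactly the limitation discussed next.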

The XML level of "machine understanding", however, breaks down in that XML is only able to define the syntax of the elements in a language and their "grammatical" structure. There is no understanding of anything other than the hierarchical syntactical structure of the document. What is needed is some way of specifying the semantics that is supposed to be communicated by the syntactical XML document structures [7]. Currently, the implicit semantics of an XML document can, if the author of the document employed markup using well-designed self-descriptive tag names, be perfectly clear to a human reader. However, to make these semantics explicit, and to communicate them in a machine understandable way, XML in itself is not sufficient. Other layers, built on top of XML, are required to accomplish this.

2.3 RDF

As stated above, a key concept in making these semantics explicit is the notion of metadata. Metadata is data that describes other data on the Web. From a theoretical perspective, metadata is just a special type of data, and there is no absolute boundary between data and metadata. On a practical level, however, metadata benefits from having languages and tools that are especially designed to facilitate the encoding and processing of metadata. This is the motivation behind the development of RDF (Resource Description Framework) [26]. While built as a layer on top of XML, RDF itself was also designed as a layer on top of which more specific metadata languages could be built.

The fundamental building block of RDF is the statement, which is used to define a property of a specific resource. The value of each property is either another resource (specified by a URI) or a literal (a string encoded conforming to syntax rules specified by XML). The name of a property can be any (namespace qualified) XML name. In short, each RDF statement is basically a triple, consisting of the resource being described, the name of a certain property and the value of this property.

For example, suppose we have a resource on the Web that shows a picture of a painting by Rembrandt van Rijn. That this Web resource is indeed painted by Rembrandt could, by interpreting the surrounding text, be obvious to human readers, but not to a machine. To make this explicit, one could state it explicitly in RDF, and add this statement as metadata to the Web page. In RDF triple terminology: the URL of the page (say, the URL #apostlePaul) would denote the resource, the ``painted-by'' label the property, and the string ``Rembrandt van Rijn'' the value.



Figure 2-2: Simple graphical representation of an RDF triple.

RDF triples can be expressed graphically, as in Figure 2-2, or in XML syntax, as in Figure 2-3. Note that there is no single XML syntax for interchanging RDF, but at least two commonly used alternatives: the basic serialisation syntax and the abbreviated syntax.


<!-- Serialization syntax: -->
<rdf:Description rdf:about="#apostlePaul">
  <painted-by>Rembrandt van Rijn</painted-by>
</rdf:Description>

<!-- Abbreviated syntax: -->
<rdf:Description rdf:about="#apostlePaul" painted-by="Rembrandt van Rijn" />

Figure 2-3: Example of two XML serialisations of the same RDF statement.


RDF triples can be linked, chained and nested. Resources can be the subject of multiple properties as well as being reused as the value of multiple properties. Chains can be formed by using the value of the first triple as the subject of the following triple (see Figure 2-4). Note that this holds only when the value is a resource, not when it is a literal. Triples can be (arbitrarily) nested, so that any triple can be treated as an object (termed reification) and re-used as a resource (see Figure 2-5). Together, these allow the creation of arbitrary graph structures.



Figure 2-4: Linking of two RDF triples.






Figure 2-5: Reification: nested triples model statements about statements.
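Since Figures 2-4 and 2-5 are graphical, the following sketch gives a textual impression in RDF/XML (the ex: vocabulary, the born-in and stated-by properties and the statement URI are invented for this example):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.org/terms#">
  <!-- Chaining: the value of the first triple (#Rembrandt) is a resource
       and acts as the subject of the second triple -->
  <rdf:Description rdf:about="#apostlePaul">
    <ex:painted-by rdf:resource="#Rembrandt"/>
  </rdf:Description>
  <rdf:Description rdf:about="#Rembrandt">
    <ex:born-in>Leiden</ex:born-in>
  </rdf:Description>

  <!-- Reification: #stmt1 treats the painted-by triple as a resource,
       so that a further statement can be made about it -->
  <rdf:Statement rdf:about="#stmt1">
    <rdf:subject rdf:resource="#apostlePaul"/>
    <rdf:predicate rdf:resource="http://www.example.org/terms#painted-by"/>
    <rdf:object rdf:resource="#Rembrandt"/>
  </rdf:Statement>
  <rdf:Description rdf:about="#stmt1">
    <ex:stated-by>Rijksmuseum catalogue</ex:stated-by>
  </rdf:Description>
</rdf:RDF>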

Examples of current RDF usage include the following (see http://www.w3.org/RDF/#projects for an up-to-date overview):

- Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable on-line metadata standards that support a broad range of purposes and business models.

- Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.

- XMLNews-Meta is a suite of specifications for exchanging news and information using open Web standards.

- MusicBrainz metadata initiative is designed to create a portable and flexible means of storing and exchanging metadata related to digital audio and video tracks, based on RDF/XML and Dublin Core.

- RDFPic is a tool to embed an RDF description of an image (digitised photograph) into the image itself. This tool implements the work described in ``Describing and retrieving photos using RDF and HTTP''.

2.4 RDF Schema

While RDF allows complex graphs of metadata to be encoded, RDF itself does not associate any specific semantics with these graphs. For a generic RDF parser, #apostlePaul is just a URI, painted-by just a property label, and ``Rembrandt van Rijn'' just a string literal. Defining what the vocabulary used in an RDF graph actually means is left to the application.

However, just as it is often useful in an XML context to define which syntax constructs are valid in which syntactic combinations, in RDF it is often useful to define, for a specific application, what set of semantic concepts the application is supposed to recognise, and what basic semantic relations hold among those concepts. RDF Schema [27] defines a language on top of RDF that supports this. By predefining a small RDF vocabulary for defining other RDF vocabularies, one can use RDF Schema to specify the vocabulary used in a particular application domain. RDF Schema extends the RDF data model by allowing organisation of properties in a hierarchical fashion, that is, one can declare one property to be a subPropertyOf another property. In addition, one can group properties that belong to the same type of resource in a Class. Classes themselves can be organised in a subClassOf hierarchy. Web resources can be declared to be of a certain class by using the type property. One can define constraints on properties by defining the domain and range of each property to be of a specific class.





Figure 2-6: Fragment of the RDF Schema associated with the triple of Figure 2-2.

For example, a fragment of an RDF Schema for our example RDF triple is depicted in Figure 2-6. It defines an Artist and an Artifact class, with Painter and Painting as their respective subclasses. In addition, it constrains the painted-by property, by only allowing instances of type Painting in its domain, and instances of type Painter in its range.
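Since Figure 2-6 is graphical, the following RDF/XML sketch gives an impression of what such a schema fragment might look like (this is our reconstruction from the description above, not the original figure):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="Artist"/>
  <rdfs:Class rdf:ID="Artifact"/>
  <rdfs:Class rdf:ID="Painter">
    <rdfs:subClassOf rdf:resource="#Artist"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Painting">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <!-- painted-by may only be used on Paintings, with a Painter as value -->
  <rdf:Property rdf:ID="painted-by">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="#Painter"/>
  </rdf:Property>
</rdf:RDF>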

Note that an RDF Schema can, just as an XML Schema, be used for validation purposes. The example schema can, for instance, be used to warn about undefined classes and properties, and to identify violations of the domain and range constraints. There is, however, an important conceptual difference between RDF Schema and XML Schema. An XML Schema-based validator would check only the (serialisation) syntax structure of the RDF encoding, while an RDF Schema-based validator would check the structure of the RDF graph, independent of which XML (or other) serialisation syntax was used (for more insights with respect to the relationship between XML Schema and RDF Schema, see [12]).

In addition, the schema in the example already gives sufficient information to allow basic queries in terms of the semantics of the concepts and their relationships in the application domain. For example, one could select all paintings that are painted by a specific painter. Such queries are much harder when they have to be phrased in terms of the XML syntax structure used to encode the information.

Finally, RDF Schemas could, in principle, provide the basis for some very low-level reasoning services, mainly subsumption based on the class and property hierarchies. In practice, however, this has turned out to be rather problematic, since there exists no commonly agreed upon formal semantics or inference model for RDF or RDF Schema. While RDF applications might provide some limited reasoning services, the behaviour of such services has not been defined, and different implementations are likely to come up with different results when reasoning over the same set of RDF documents. RDF's notion of reification makes it hard to develop a formalised semantics, which has made reification one of the more controversial features of RDF [10].

While the need for formal semantics and inference models may be less urgent at this level, they are critical ingredients for the upper layers of the Semantic Web (e.g. the logic layer and above in Figure 2-1). This may explain why, at the time of writing, RDF Schema has been a W3C Candidate Recommendation since March 2000, and has still not moved to a full W3C Recommendation.

Even though RDF Schema is not yet a full W3C Recommendation, a number of key RDF Schema applications are already under development. CC/PP [29] provides an RDF-based framework for defining the vocabularies that are needed to define profiles. In addition, it also provides a small vocabulary that can be reused across different profiles. A typical example of a CC/PP profile is the User Agent Profile developed by the WAP Forum [32]. This profile provides a commonly agreed upon mechanism to communicate the (technical) capabilities of mobile phones to servers and proxies. The CC/PP framework, however, is sufficiently flexible to allow the definition of profiles that focus on more user-centred aspects of a delivery context.
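As an impression only (the dev: vocabulary below is invented for this example and is not the actual CC/PP or User Agent Profile vocabulary), an RDF-based device profile might look roughly as follows:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dev="http://www.example.org/device-vocab#">
  <!-- Hypothetical profile describing the capabilities of a mobile phone -->
  <rdf:Description rdf:about="http://www.example.org/profiles/phone-x">
    <dev:screenWidth>120</dev:screenWidth>
    <dev:screenHeight>160</dev:screenHeight>
    <dev:audioCapable>yes</dev:audioCapable>
  </rdf:Description>
</rdf:RDF>

A server or proxy receiving such a profile can then select or adapt content to fit the declared capabilities.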

2.5 Ontology languages: DAML+OIL and beyond

As discussed above, ontologies are used to explicitly specify a set of (domain-specific) concepts and the relations among them. While ontologies are not new in knowledge-based applications, the topic received much wider attention when people began to realise that Web applications will not be able to communicate unless they agree on the terminology used.

One of the major results of the European On-To-Knowledge project is the Ontology Inference Layer [11] and the associated Ontology Interchange Language, both known under the acronym OIL.

OIL combines the efficient reasoning support and formal semantics of Description Logics, the rich modelling primitives commonly provided by frame languages, and a standard for syntactical exchange notations based on the languages discussed above. Further work on the language was carried out jointly by European and American researchers in the context of DARPA's Agent Markup Language project, and the language was renamed DAML+OIL.

The serialisation syntax of DAML+OIL reuses as much of RDF and RDF Schema as possible. As a result, one could be led to believe that DAML+OIL provides only minor syntactic extensions to RDF Schema. This is, however, not the case. The major difference between the two lies not at the syntactic level, but at the semantic level. In contrast to RDF and RDF Schema, DAML+OIL has a formally defined semantics that is needed for implementing efficient and automatic reasoning over a set of statements. RDF and RDF Schema both lack such a reasoning model.

While DAML+OIL does not support RDF's notion of reification and containers, it provides a formal semantics as well as a number of new features. First of all, it allows a class to be specified in terms of a set union, intersection or complement of other classes. Second, it supports a number of extra property constraints (including cardinality constraints) which can, in contrast to RDF, be applied with a local scope. It also allows statements about properties, such as stating that a property describes a transitive relation, or that one property is the inverse of another property. In addition, where RDF distinguishes only between Web resources and (string) literals, DAML+OIL inherits the full data-typing power of XML Schema.
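A hedged sketch of two of these features in DAML+OIL syntax, reusing the painting vocabulary of the previous sections (the class and property names are ours, not taken from the specification):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:daml="http://www.daml.org/2001/03/daml+oil#">
  <!-- One property declared as the inverse of another -->
  <daml:ObjectProperty rdf:ID="painted-by">
    <daml:inverseOf rdf:resource="#paints"/>
  </daml:ObjectProperty>
  <!-- A local cardinality constraint: a Painting has exactly one painter -->
  <daml:Class rdf:ID="Painting">
    <rdfs:subClassOf>
      <daml:Restriction daml:cardinality="1">
        <daml:onProperty rdf:resource="#painted-by"/>
      </daml:Restriction>
    </rdfs:subClassOf>
  </daml:Class>
</rdf:RDF>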

Initially, DAML+OIL was developed in parallel to the W3C's Semantic Web activity. More recently, however, the DAML+OIL specification [21] has become the starting point for the W3C Web Ontology working group (WebOnt). The working group's charter is to define an ontology Web language (tentatively known as the Web Ontology Language, OWL) for the ontology layer in Figure 2-1. The working group has just started, and this makes it hard to make in-depth statements about OWL and the technology used for realising the logic, proof and trust layers in Figure 2-1. Given the formal nature of these upper layers, however, it seems inevitable that the ontology layer upon which they are built needs to provide a sound formal basis. On the other hand, the Web also has a clear need for a more flexible metadata framework, with informal and potentially application-dependent semantics.

Even with the current state of the art, developing Semantic Web applications requires a decision on using RDF/RDF Schema or DAML+OIL. The right choice will largely depend on whether the advantages of DAML+OIL's inference and reasoning support outweigh the advantages of RDF's flexibility and reification. To exploit the advantages of DAML+OIL, one has to be willing to pay the price: conformance to the strict and predefined semantics of the language, and the absence of reification. On the other hand, RDF's flexible semantics (or lack of formal semantics) also comes at a price: little or no tool support when it comes to reasoning and inferencing.

How the final versions of RDF Schema (still a Candidate Recommendation) and the Web Ontology Language will cater for these two different needs is, at the time of writing, still unclear. The current requirements draft published by the working group [31] suggests OWL will be a language with several layers of language features, thus allowing applications to commit only to the level they actually need.

Another issue that plays a role is the XML versus RDF discussion: should OWL be built on top of RDF(S), the approach suggested by Figure 2-1 and currently applied by DAML+OIL? In this approach, one runs the risk of making integration with current XML-based applications harder. In addition, one needs to solve the problems related to the lack of formal semantics for several RDF constructs; RDF's syntax decisions will also limit OWL developers in choosing more elegant XML serialisations for OWL, etc. By bypassing RDF, and building OWL directly on XML, these problems can be avoided. This approach, however, runs the risk of the development of two incompatible Semantic Webs: an XML/OWL-based knowledge Web versus an XML/RDF Schema-based metadata Web.

The XML vs. RDF question within the WebOnt working group is closely related to one of the big controversies surrounding the Semantic Web in general: the question whether the advantages of developing a common Semantic Web language stack, such as shown in Figure 2-1, really outweigh the more pragmatic approach of defining knowledge interchange formats directly in XML on a per application domain and per user community basis. This is the approach many e-business initiatives are currently taking. In theory, a Semantic Web-based approach would require less a priori commitment between the different user groups, and would enable the use of generic (free and commercial) tools. First, the Semantic Web would standardise more levels of the information stack; agreement about these levels and the possibility of using off-the-shelf tools would thus come ``for free''. In the XML-based alternative, users from a specific community would need to agree on these levels first, and then develop their own tools. Second, when new users join in, adding their own set of knowledge bases and tools, the Semantic Web promises a better infrastructure for interoperability between the two worlds. It still remains to be seen to what extent these promises prove to be realistic in practice.

A final issue, discussed in more depth in the next chapter, is the strong focus of both the second generation ``Document Web'' and the third generation Semantic Web on (XML) text and page-based layout. Much of today's Web technology does not support multimedia, with its binary, time-sensitive streams and scene-based layout.






3 Cuypers: from 2nd to 3rd Generation Multimedia


The particular focus of CWI's media group is on the generation of Web-based multimedia presentations. Multimedia is not as flexible as presentations based on text-flow, so the current suite of 2nd generation document processing tools is inadequate for coping with the wide range of variations in, e.g., display devices (for details see [22]). For example, in Figure 3-1 a presentation is shown describing a particular painting technique while simultaneously displaying a series of paintings illustrating the technique. In this case there is only sufficient screen space to display one illustration, so the illustrations are displayed as a timed slide-show. If the screen were larger, the presentation would be able to show multiple illustrations simultaneously. Alternatively, if the screen were smaller, the illustrations would need to be displayed after the text. It is this type of flexibility which needs to be made available in circumstances where there are insufficient human authors to create all the variations needed. To solve this problem within the 2nd generation Web context, we have developed the Cuypers prototype multimedia generation engine.



Figure 3-1: Example SMIL presentation about the use of chiaroscuro in the works of Rembrandt.


Cuypers is a research prototype system, developed to experiment with the generation of Web-based presentations as an interface to semi-structured multimedia databases. Cuypers explores a set of abstractions, both on the document and on the presentation level, that are geared towards interactive, time-based and media-centric presentations. Cuypers uses a set of easily extensible transformation rules specified in Prolog, exploiting Prolog's built-in support for backtracking. The system uses a constraint solver embedded in Prolog, enabling it to backtrack when the transformation process generates a set of unsolvable constraints.

Cuypers operates in the context of the environment depicted in Figure 3-2. It assumes a server-side environment containing a multimedia database management system, an intelligent multimedia information retrieval system, the Cuypers generation engine itself, an off-the-shelf HTTP server and -- optionally -- an off-the-shelf streaming media server. At the client side, a standard Web client suffices. Given a rhetorical (or other type of semantic) structure and a set of design rules, the system generates a presentation that adheres to the limitations of the target platform and supports the user's preferences.



Figure 3-2: The environment of the Cuypers generation engine.


Experience gained from the development of earlier prototypes (e.g. work done by Bailey [17]), however, proved that for most applications the conceptual gap between an abstract, presentation-independent document structure and a final-form multimedia presentation is too large to be specified by a single, direct transformation. Instead, we take an incremental approach, and define the total transformation in terms of smaller steps, each of which performs a transformation to another level of abstraction. These levels are depicted in Figure 3-3 and consist of the semantic, communicative device, qualitative constraint, quantitative constraint and final-form presentation levels.



Figure 3-3: The layers of the Cuypers generation engine.





1. Final-form presentation level -- At the lowest level of abstraction, we define the final-form presentation. This encodes the presentation in a document format that is readily playable by the end user's Web browser or media player, in our case SMIL. This level is needed to ensure that the end user's Web client remains independent of the abstractions used internally in the Cuypers system, and that the end user can use off-the-shelf Web clients to view the presentations generated.

2. Quantitative constraints level -- To be able to generate presentations with the same information using different document formats, we need to abstract away from the final-form presentation. On this level of abstraction, the desired temporal and spatial layout of the presentation is specified by a set of format-independent constraints, from which the final-form layout can be derived automatically.

An example of a quantitative constraint is ``the x-coordinate of the top-right corner of picture X should be at least 10 pixels smaller than the x-coordinate of the top-left corner of picture Y''. While more abstract than the final-form presentation, a specification at this level provides sufficient information for the Cuypers system to be able to automatically generate the final-form presentation. An off-the-shelf numeric constraint solver is used to determine whether or not a given layout specification can be realised and, if so, to generate any numeric layout specifications needed.

Numeric, or quantitative, constraints are necessary to determine whether a specific layout can be realised with respect to specific requirements. For many other purposes, however, these constraints are too low-level and contain too much detail. Larger differences cannot be solved at this level of abstraction. Qualitative constraints are introduced to solve these problems.

3. Qualitative constraints level -- An example of a qualitative constraint is ``caption X is positioned below picture Y'', and backtracking to produce alternatives might involve trying right or above, etc. Some final-form formats allow specification of the document on this level. In these cases, the Cuypers system only generates and solves the associated numeric constraints to check whether the presentation can be realised at all; it subsequently discards the solution of the constraint solver and uses the qualitative constraints directly to generate the final-form output.

While qualitative constraints solve many of the problems associated with quantitative constraints, they are still not suited for dealing with relatively large differences in layout, e.g., as in the mobile phone versus desktop browser example given above. Therefore, another level of abstraction is introduced: the communicative device.

4. Communicative device level -- The highest level of abstraction describing the presentation's layout makes use of communicative devices [18]. These are similar to the patterns of multimedia and hypermedia interface design described by [16] in that they describe the presentation in terms of well-known spatial, temporal and hyperlink presentation constructs. An example of a communicative device described in [18] is the bookshelf. This device can be effectively used in multimedia presentations to present a sequence of media items, especially when it is important to communicate the order of the media items in the sequence. How the bookshelf determines the precise layout of a given presentation in terms of lower-level constraints can depend on a number of issues. For example, depending on the cultural background of the user, it may order a sequence of images from left to right, top to bottom or vice versa. Also its overflow strategy, that is, what to do if there are too many images to fit on the screen, may depend on the preferences of the user and/or author of the document. It may decide to add a ``More info'' hyperlink to the remaining content in HTML; alternatively, it could split the presentation up into multiple scenes that are sequentially scheduled over time in SMIL.

While the communicative device level is a very high-level description of the presentation, we still need a bridge from the domain-dependent semantics as stored in the multimedia information retrieval system to the high-level hypermedia presentation devices. To solve this problem, we introduce one last level of abstraction: the semantic structure level.

5. Semantic structure level -- This level completely abstracts from the presentation's layout and hyperlink navigation structure and describes the presentation purely in terms of higher-level, ``semantic'' relationships.

In the current Cuypers system we focus on the rhetorical aspects of the presentation, because this applies well to the application domains for which we are currently building prototypes (e.g. generating multimedia descriptions of artwork for educational purposes).

Depending on the target application, however, other types of semantic relations can be used as well. Possible alternatives include abstractions based on the presentation's narrative structure for story-telling applications, or abstractions based on an explicit discourse model. From the perspective of the Cuypers architecture, any set of semantic relations can be chosen as long as it meets the following two requirements (a sketch of such a structure follows this list):

1. it should sufficiently abstract from all presentation details so that these can be adequately adapted by the lower levels of the system, and

2. it should provide sufficient information so that the relations can be used to generate an adequate set of communicative devices that convey the intended semantics to the end user.
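As an impression of what such a semantic structure might look like as input (the rst: vocabulary is invented for this sketch; it is not Cuypers' actual input format), a rhetorical structure for the chiaroscuro presentation could be encoded along these lines:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rst="http://www.example.org/rhetorics#">
  <!-- A nucleus (the definition of chiaroscuro) elaborated by a
       sequence of example paintings; no layout is specified here -->
  <rst:Elaboration rdf:ID="presentationRoot">
    <rst:nucleus rdf:resource="#chiaroscuroDefinition"/>
    <rst:satellite rdf:resource="#apostlePaul"/>
    <rst:satellite rdf:resource="#nightWatch"/>
  </rst:Elaboration>
</rdf:RDF>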

The current implementation already uses a declarative encoding of the design, user and platform knowledge. These different types of knowledge are, however, still intertwined. This part of the system needs to be redesigned to be able to manipulate the different types of knowledge through interfaces that are tailored to the different tasks and roles of the users that will need to control them, and to be able to encode the required knowledge in a declarative and reusable way. We expect the Semantic Web to play a key role in supplying languages to express the required knowledge and, more importantly, in supplying tools able to process the knowledge.




In particular, the types of information we need to make explicit are a user profile, a platform profile, a discourse or rhetorical model, graphic and information design rules, and a domain ontology of the selected topic. Given modular declarative descriptions of these types of information, they need to be fed back into the five-layer processing model of the system, as described above.

Another use of Semantic Web technology within Cuypers is to be able to use knowledge on the Web rather than operating on a single multimedia database, thus providing access to a greater range of materials. This, however, brings with it the limitation of only being able to use material which is adequately annotated.






<body>
  <par>
    <text region="title" src="...queries to the multimedia database..."/>
    <text region="descr" src="...to find description of 'chiaroscuro'..."/>
    <seq>
      <par dur="10" id="apostlePaul">
        <img region="img" src="...the image of the paintings..."/>
        <text region="ptitle" src="...the title of the painting..."/>
      </par>
      <par dur="10"> ... 2nd painting+title ... </par>
      <par dur="10"> ... 3rd painting+title ... </par>
      ...
    </seq>
  </par>
</body>
</smil>

Figure 3-4: SMIL encoding of the presentation shown in Figure 3-1.

An additional benefit of making the information used during the generation process explicit is that it can be retained and inserted in the presentation format that is finally generated. For example, the skeleton of the SMIL file that is generated for the presentation shown in Figure 3-1 can be seen in Figure 3-4. The generated file can, moreover, be enriched with explicit information on, for example, the domain ontology information used in deciding the rhetorical structure of the presentation. The head of the SMIL file of the presentation, enriched with RDF and DAML+OIL code, is shown in Figure 3-5.



<smil xmlns="http://www.w3.org/2000/SMIL20/CR">
  <head>
    <meta name="generator" content="CWI/Cuypers 1.0"/>
    <metadata>
      <rdf:RDF xml:lang="en"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:oil="http://www.ontoknowledge.org/oil/rdf-schema/2000/11/10-oil-standard"
          xmlns:museum="http://ics.forth.gr/.../museum.rdf"
          xmlns:token="http://www.token2000.nl/ontologies/additions.rdf">

        <rdf:Property rdf:about="http://www.token2000.nl/additions.rdf#painted-by">
          <oil:inverseRelationOf
              rdf:resource="http://ics.forth.gr/.../museum.rdf#paints"/>
        </rdf:Property>

        <museum:Museum rdf:ID="Rijksmuseum"/>
        <museum:Painter rdf:ID="Rembrandt">
          <museum:fname>Rembrandt</museum:fname>
          <museum:lname>Harmenszoon van Rijn</museum:lname>
        </museum:Painter>

        <museum:Painting rdf:about="#apostlePaul">
          <museum:exhibited rdf:resource="#Rijksmuseum" />
          <museum:technique>chiaroscuro</museum:technique>
          <token:painted-by rdf:resource="#Rembrandt" />
        </museum:Painting>
      </rdf:RDF>
    </metadata>
    <layout>...</layout>
  </head>
  ...

Figure 3-5: SMIL encoding of the presentation shown in Figure 3-1 with accompanying RDF and DAML+OIL mark-up.




4 Semantic Web and e-learning standards

The goal of e-learning is to transfer knowledge and skills from human beings to human beings with computers as an intermediary. Technology that aims to facilitate the transfer of semantically rich information between computers, such as Semantic Web technology, therefore seems like a natural fit to help in this process. In this chapter we investigate the ways in which this technology could contribute to the e-learning field.

4.1 Introduction

If we assume that the designers of e-learning content want the computer to provide learners with (new) knowledge, they may choose one of the two strategies below:

1. Create content that is new, understandable and useful for most of its target group. This will often lead to courses that either start from the very basics or leave out part of their audience.

2. Make content modular, and provide a great number of variations on the content by assembling different courses from various combinations of pieces. Alternatively, assume that a human teacher or the learner himself assembles a course from the pieces. This will lead to more tailor-made content and thus, in general, to better learning experiences for learners.

A human teacher giving a course using a book has some idea of his audience. He will thus be able to provide background information or (alternative) explanations, or to decide to skip certain parts of the book. Because e-learning content sometimes needs to stand on its own, the computer may have to take on the role of the teacher or coach.

If a designer of e-learning content chooses the first strategy described above, the computer will not be able to take on this role, since it has no understanding of the internal structure of the content. In other words, in the perception of the computer the content is atomic. It has no knowledge whatsoever of the modules or pieces the content consists of, the characteristics of these modules, or their mutual relations.

The second strategy, the modularity approach, is the point of departure of the e-learning standards discussed in the next section. It is much easier, and thus cheaper, to assemble a course from existing pieces than to start from scratch. It is also a much more flexible approach, since one can easily assemble different courses for different users by combining those building blocks in various ways. Since the different modules are “labelled” with metadata such as keywords, creation date, creator and age range of the intended audience, teachers and learners can easily retrieve and reuse them. It would be very powerful, however, if this assembly could be fully or semi-automated. This is exactly what Semantic Web technology could provide. As we saw in Section 1.2, the goal of the Semantic Web is to address the needs of applications that must “understand” the content they process.

4.2 E-learning content standards

The primary goal of e-learning standards such as the Sharable Content Object Reference Model (SCORM) of the Advanced Distributed Learning (ADL) initiative is to make it possible to access and (re)use educational content on a variety of e-learning platforms [35]. Another goal of ADL SCORM is to be able to adapt content as much as possible to specific users. The ADL SCORM vision expressed on the web site reflects these two goals in the following way: “Provide access to the highest quality education and training, tailored to individual needs, delivered cost-effectively, anywhere and anytime.”

The core of SCORM is made up of the metadata standard and the packaging standard. SCORM has in mind a model where learning material is divided into smaller reusable modules that can be enriched with metadata. The SCORM standard is based on several other standards, of which the most important is the IMS standard [36].

We will mainly discuss the SCORM standard here because it gives the clearest interpretation of how content is supposed to be aggregated and sequenced. The SCORM standard extends the IMS standard with small additions to the IMS metadata and packaging standards. Moreover, it defines a run-time library and a detailed protocol for how this run-time library must interact with a learning management system. SCORM also uses the Aviation Industry CBT (Computer-Based Training) Committee (AICC) standard to query the learning management system for information on the user that is using a given learning object [37].

A SCORM learning object that follows the protocol for the run-time environment is called a SCO. It is the smallest unit that a learning management system may track, but it can itself be a compound document consisting of smaller objects.

In this chapter we focus on the tailoring or adaptation aspect, since that is the point where Semantic Web technology offers considerably more flexibility than the current e-learning standards. In this section we describe the current adaptation functionality of ADL SCORM. In the next section we demonstrate the surplus value of Semantic Web technology in this respect.

There are two mechanisms available to make content adaptable. In the first place, an author can pre-assemble several so-called organisations. An organisation is a tree whose leaves are SCOs. In Figure 4-1 each organisation makes up a variant of a course about XML. Depending on the knowledge, goals or preferences of the user, one of the three variants can be chosen. Conventionally, the tree is serialised in XML and must be contained in a document or manifest called imsmanifest.xml.






Figure 4-1: Three example organisations sharing the same SCOs.

The other mechanism that the standard provides for adaptation is that a SCO can be
launched or not, depending on whether a learner has seen other SCOs or on whether the
user passed or failed to complete a SCO within a given time frame. Below, a summarised
version of the manifest describing the courses in Figure 4-1 is given. The two
highlighted parts in the organisation of the full XML course (marked with comments in
the listing below) are examples of this "SCO-launching" type of adaptation. The first
indicates that the exercises of the introduction must be completed within 3 minutes. The
second indicates that before starting on the W3 School exercises, the "regular" exercises
need to be completed. Note that the AICC standard that we mentioned earlier in this
section is applied here.



<manifest xmlns = "http://www.imsproject.org/xsd/imscp_rootv1p1p2"
          xmlns:adlcp = "http://www.adlnet.org/xsd/adl_cp_rootv1p1"
          identifier = "Example_course"
          version = "1">

  <!-- metadata on this course in general -->
  <metadata>
    <schema>ADL SCORM</schema>
    <schemaversion>1.2</schemaversion>
    <adlcp:location>ExampleCourse_MetaData.xml</adlcp:location>
  </metadata>

  <organizations default = "Full">

    <organization identifier = "Full" structure = "hierarchical">
      <title>Full XML course</title>
      <!-- ... -->
      <!-- Full introduction consists of three sub sections -->
      <item identifier = "Full_introduction">
        <title>Full_introduction</title>
        <item identifier = "Introduction_to_XML"
              identifierref = "SCO_Introduction_to_XML">
          <title>Introduction_to_XML</title>
        </item>
        <item identifier = "Exercises"
              identifierref = "SCO_Exercises">
          <title>Exercises</title>
          <!-- adaptation (1): the exercises must be completed within 3 minutes -->
          <adlcp:maxtimeallowed>00:03:00</adlcp:maxtimeallowed>
        </item>
        <item identifier = "Exercises_W3"
              identifierref = "SCO_Exercises_W3">
          <title>Exercises_W3</title>
          <!-- adaptation (2): the regular exercises are a prerequisite -->
          <adlcp:prerequisites type = "aicc_script">
            Exercises
          </adlcp:prerequisites>
        </item>
      </item>
      <!-- ... -->
    </organization>

    <!-- Full course without a section about DTDs -->
    <organization identifier = "No_DTD" structure = "hierarchical">
      <!-- ... -->
    </organization>

    <!-- A short version of the course -->
    <organization identifier = "Short" structure = "hierarchical">
      <!-- ... -->
    </organization>

  </organizations>

  <resources>
    <resource>
      <!-- list of resources including a description of the SCOs -->
    </resource>
  </resources>
</manifest>





4.3 More flexibility with Semantic Web technology


The previous section pointed out that the possibilities for adaptation of educational
content within ADL SCORM are currently fairly limited. The system can adapt a course
to a user either by picking one of several predefined selections or by deciding whether or
not to present a specific SCO, depending on the user's history.


Semantic Web technology offers more flexibility. It includes languages such as RDF
which make it possible to express user profiles and terminal profiles as well as metadata.
These languages also make it possible to express inference rules. An example of an
inference rule is: children under 7 years should not be offered text files without the
accompanying audio file. Another example of an inference rule is: if the user's terminal
has no audio capabilities, do not select any audio files.
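
To make this concrete, the second rule could be written roughly as follows in the
Notation3 (N3) rule syntax processed by engines such as CWM [33]. This is only a
sketch: the prof: and lo: vocabulary terms are hypothetical placeholders invented for
this example, not part of any existing standard; a real application would commit to
vocabularies such as CC/PP [29] for terminal profiles.

@prefix prof: <http://example.org/terminal-profile#> .
@prefix lo:   <http://example.org/learning-object#> .

# Hypothetical rule: if the terminal used in a session has no audio
# capabilities, exclude every audio learning object from the selection.
{ ?session  prof:usesTerminal  ?terminal .
  ?terminal prof:audioCapable  "false" .
  ?object   lo:mediaType       "audio" . }
=>
{ ?session  lo:excludes  ?object . } .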


Furthermore, Semantic Web technology includes inference engines which make it
possible to generate a tailor-made course for a specific user, based on his or her user
profile, the terminal profile and the metadata of the various learning objects. This is
done by combining the user and/or terminal profile with the metadata of the available
learning objects by means of the inference rules. Thus, if the user profile indicates that a
user is 5 years old, text files will never be presented without audio support, according to
the first example inference rule given above.


The previous implies that relations between learning objects are essential in the
adaptation process. For instance, in the above it is relevant for the inference engine to
know that audio file A is the audio version of text file T. Semantic Web languages such
as RDF also make it possible to express relations between learning objects. Relations are
important in order to give an application an understanding of the structure of the base
document (that can be tailored to the needs of a user).
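
A few hypothetical RDF statements, again in N3, illustrate the kind of facts such an
engine combines: a user profile, metadata on two learning objects, and the relation
between them. All vocabulary terms are again placeholders invented for this sketch.

@prefix up: <http://example.org/user-profile#> .
@prefix lo: <http://example.org/learning-object#> .

# User profile: this learner is 5 years old.
<http://example.org/users/anna>  up:age  "5" .

# Metadata: T is a text file, A is an audio file, and A is the
# audio version of T.
<http://example.org/objects/T>  lo:mediaType  "text" .
<http://example.org/objects/A>  lo:mediaType  "audio" ;
    lo:isAudioVersionOf  <http://example.org/objects/T> .

Combined with the first example rule given above, an inference engine can conclude
from these facts that T may only be presented to this learner together with A.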


Hence, in principle, Semantic Web technology makes it possible for computers to become
designers, since an application can be provided with knowledge about users, terminals
and information (including the relations between the separate information objects), as
well as with knowledge about the inferences human designers make. Time will tell
whether this will lead to more or less acceptable results. At the very least, automatic
course designers will be useful supporters of human teachers and designers, for instance
in preparing a first rough version of a tailor-made course.


A major advantage of applying general purpose technology, such as the Semantic Web,
in combination with e-learning standards to achieve adaptability is the fact that general
purpose technology receives more interest from organisations all over the world than
proprietary technology, and therefore tends to develop more quickly.

4.4 Bonus advantage


Proprietary metadata standards such as IMS metadata for e-learning and MPEG-7 for
multimedia have one fundamental flaw: they are almost useless outside of the
comparatively limited world they were developed for. In order to use them, for example
for searching, one needs a search engine that understands these standards. This is really
clumsy, as part of the attributes in both standards are based on more generic standards
such as Dublin Core. It would not be hard to use an XSL-T transformation to generate
e.g. Dublin Core information, but it does require explicit effort.


A solution to the problem sketched above is to use an RDF serialisation for the metadata.
In such a serialisation it becomes the route of least resistance to use and extend existing
metadata vocabularies. Thus the Dublin Core subset of IMS metadata and MPEG-7 can
simply be taken to be Dublin Core, such that a standard search engine, e.g. Google,
recognises it.
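
A small N3 sketch of this idea: the dc: properties below are standard Dublin Core terms,
while the ims: property is a hypothetical placeholder for an IMS-specific extension that
coexists with them in the same description.

@prefix dc:  <http://purl.org/dc/elements/1.1/> .
@prefix ims: <http://example.org/ims-metadata#> .

# dc:title and dc:creator can be used directly by any Dublin Core
# aware tool; IMS-aware tools additionally understand the extension.
<http://example.org/courses/xml-intro>
    dc:title             "Introduction to XML" ;
    dc:creator           "J. Teacher" ;
    ims:typicalAgeRange  "12-16" .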

4.5 When will Semantic Web technology be applied in practice?

In this chapter a rosy picture of the possibilities of the Semantic Web for e-learning was
sketched. However, in Chapter 2 we indicated that Semantic Web technology is still at an
early stage. There is already some experimental technology available, but a massive
amount of research on ontology languages and their corresponding inference engines will
be performed in the coming years.

At the Telematica Instituut, we are currently exploring, on a proof of concept basis, the
possibilities of generating organisation documents on the fly, based on templates with
inference rules, IMS metadata, some additional metadata and a simple user profile, all
expressed in RDF [38]. The logic engine CWM (Closed World Machine) [33] is used to
compose a course that is suitable for the learner. CWM can natively use the experimental
logic ontology of the W3C Semantic Web activity [34].
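
As a sketch of what such an experiment looks like in practice: CWM is invoked on the
command line with the N3 files to merge, and its --think option computes the deductive
closure of the combined input. The file names below are hypothetical.

cwm userprofile.n3 metadata.n3 rules.n3 --think > organisation.n3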

Since the current Semantic Web infrastructure is still experimental, it will not be possible
to apply the Semantic Web in e-learning practice in the short term. However, for the
medium and long term the application of the Semantic Web in e-learning becomes more
realistic. The medium and long-term expectations of the Semantic Web for e-learning are
described in the next chapter.





5 Short, medium and long term expectations of the Semantic Web

The development of the Semantic Web is at an extremely early stage and few
applications are currently up and running. This makes reliable predictions extremely
difficult to make. In this section we analyse potential Semantic Web applications. By
making the potential benefits and fundamental problems of the Semantic Web explicit,
we point out what the important issues are, and their implications for successful uptake
in the short, medium and long term.

5.1 Short term

Uses of the Semantic Web in the short term will emerge in situations where local benefit
is gained immediately, without having to rely on a more global uptake. The usage of
Semantic Web technology may not even be obvious to end users, but hidden behind the
scenes (similar to the majority of current XML deployment, which is not visible to
end-users but applied server-side). While these applications use Semantic Web
technology, they will add little to the perception of the Semantic Web as a whole. An
example of such an application is the use of RDF in Mozilla's configuration and
preference files. Here, RDF is used as a local storage format, since the application's data
is more readily described in RDF's graph of triples model than in XML's ordered
hierarchy model.

Similarly, applications which require the exchange of simple EER/UML-like
class/subclass data models over the Web, such as CASE and database modeling tools,
can use RDF Schema as the exchange format. This provides a common syntax for easily
agreed-upon data modeling semantics.
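
For example, a minimal class/subclass model exchanged this way could look as follows
in N3 notation (the ex: terms are invented for this sketch; rdfs: is the standard RDF
Schema vocabulary):

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/model#> .

# A two-class model: every Exercise is a LearningObject.
ex:LearningObject  a  rdfs:Class .
ex:Exercise        a  rdfs:Class ;
    rdfs:subClassOf  ex:LearningObject .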

User groups who are currently creating their own ontologies in their own languages, for
example in the biology, medical and arts fields, are able to provide Web-compatible
serialisations of their ontologies using the current version of DAML+OIL. The question
is whether the currently available language (DAML+OIL) and the language to be
developed for the Web (OWL) are sufficiently powerful for their purposes (e.g. the full
complexity of thesauri like the AAT, the Art and Architecture Thesaurus, goes beyond
the expressive capabilities of DAML+OIL).

5.2 Medium term

While applications that will emerge in the short term use currently available technology
in a local context, medium term applications will use current technology in a more
global, distributed context. For example, the use of Dublin Core for annotating
documents on the Web is only useful for finding, e.g., all articles written by a certain
author when all articles are annotated with the corresponding Dublin Core attribute.
Similarly, providing CC/PP descriptions for devices is only effective when sufficient
descriptions are available and are made use of in the complete information chain.

Another class of medium term applications are those that use newly-developed
technology in a local context, for instance educational content adaptation services using
local documents or databases. Such applications might, for example, use the future Web
Ontology Language OWL in a local context to provide advanced knowledge-intensive
inference and reasoning. These applications will be similar to today's knowledge-intensive
applications, with the added benefit of using ontologies and data in a format that is easily
exchangeable over the Web, and the availability of off-the-shelf tool support. Other
examples include support for agent-based Semantic Web services among specific user
groups, such as supplier/merchant extranet services, for example services that allow
product profiles to be compared among a number of manufacturers that have committed
to a specific ontology.

5.3 Long term

Long term use of the Semantic Web will be in applications that use yet-to-be-developed
technology requiring uptake on a global scale. For instance, in the field of e-learning it
may become possible to automatically generate courses based on learning objects from
all over the world. Another example of this type of application is the scenario sketched at
the beginning of the Scientific American article "The Semantic Web" by Berners-Lee et
al. [2]. The scenario sketches the ultimate goal of the Semantic Web: a Web where
software agents are able to access a wide range of web services to autonomously perform
a wide range of complex tasks on behalf of their user or user groups.

While this scenario has led to high expectations of the Semantic Web (and contributed
significantly to the hype that surrounds it), one can doubt its feasibility, even in the
longer term. These types of applications will only work if all of the many parties
involved participate and obey the right protocols, on various levels. It requires parties to:

- employ sufficiently rich metadata annotations on all their Web content;
- commit to common vocabularies whose expressivity goes far beyond that of, for
  example, RDF Schema and DAML+OIL;
- commit to yet-to-be-developed standards for Web service description, discovery,
  deployment, etc.;
- commit to yet-to-be-developed standards for Semantic Web query languages;
- perform all processing in a way that can be controlled, verified and trusted by the
  end-user.

In addition to these socio-economical problems, a fundamental conceptual problem is the
"automatic lookup" of terms across ontologies, needed to make applications work that did
not a priori commit to a common ontology. This can only be done on a "best effort"
basis, which may suffice for many applications, but not for the type of applications
described in this scenario, where trust is of key importance. An important open
architectural issue is the level of distribution that is required to realise the amount of
storage and processing required. Part of the initial success of the Web can be explained
by its relatively simple client/server model, a model that still dominates the Web today:
document storage is centralised at the server-side, as is large scale processing such as
that performed by search engines. An alternative, potentially more powerful approach is
a peer-to-peer model for storing and processing data. Its success has already been
demonstrated in projects such as Seti@Home [19] (distributed, client-side computing)
and Napster [15] (distributed, client-side storage).




The peer-to-peer approach is also being exploited by the Grid Forum [8]. Grid computing
originated in the particle physics community (as did the current Web...), this time driven
by the need to process the huge amount of data that will be generated by next generation
particle accelerators. Instead of relying on a few supercomputers (whose power would be
insufficient and whose costs would be too high), the grid would distribute the work over
a large number of ordinary desktop computers connected over the Internet. The concept
was soon adopted by other research communities that required large amounts of storage
and processing resources (ranging from climate simulations to DNA analysis). The
metaphor of the electricity grid has inspired this model of computing, where the global
Internet offers a wide range of computing resources that are available everywhere.

Most of the current grid-related projects focus on the lower-level aspects of the common
middleware layer needed to do distributed computing in an efficient, safe and
manageable way, that is, building the infrastructure for the "data grid" and the "compute
grid". Other projects, however, have already started investigating the building of a
"knowledge grid" on top of these layers. The goals of such a knowledge grid are in
essence identical to those of the Semantic Web. There is no reason why the architecture
of the Semantic Web should be restricted to the client/server model, and the computing
power and distributed management of the grid model might eventually facilitate the more
complex distributed reasoning required for scenarios such as that sketched by
Berners-Lee et al. [2].




References

[1] Tim Berners-Lee. Weaving the Web. Orion Business, 1999.

[2] Tim Berners-Lee, James Hendler, and Ora Lassila. The Semantic Web. Scientific
American, May 2001.

[3] Tim Bray, Dave Hollander, and Andrew Layman. Namespaces in XML. W3C
Recommendations are available at http://www.w3.org/TR, January 14, 1999.

[4] Warwick Cathro. Metadata: An Overview. In Standards Australia Seminar: Matching
Discovery and Recovery, August 1997. See also http://dublincore.org.

[5] S. Decker, D. Fensel, F. van Harmelen, I. Horrocks, S. Melnik, M. Klein, and
J. Broekstra. Knowledge Representation on the Web. In F. Baader, editor, International
Workshop on Description Logics (DL'00), 2000.

[6] Stefan Decker, Sergey Melnik, Frank van Harmelen, Dieter Fensel, Michel Klein,
Jeen Broekstra, Michael Erdmann, and Ian Horrocks. The Semantic Web: The roles of
XML and RDF. IEEE Internet Computing, 15(3):63-74, October 2000.

[7] Jon Ferraiolo. Scalable Vector Graphics (SVG) 1.0 Specification. W3C
Recommendations are available at http://www.w3.org/TR/, 4 September 2001.

[8] Global Grid Forum. See http://www.gridforum.org/.

[9] Lynda Hardman and Jacco van Ossenbruggen. Device Independent Multimedia
Authoring. In W3C Workshop on Web Device Independent Authoring, Bristol, UK,
October 3-4, 2000.

[10] Patrick Hayes. RDF Model Theory. Work in progress. W3C Working Drafts are
available at http://www.w3.org/TR, 25 September 2001.

[11] I. Horrocks, D. Fensel, J. Broekstra, M. Erdmann, C. Goble, F. van Harmelen,
M. Klein, S. Staab, R. Studer, and E. Motta. The Ontology Inference Layer OIL. On the
www.ontoknowledge.org website.

[12] Jane Hunter and Carl Lagoze. Combining RDF and XML Schemas to Enhance
Interoperability Between Metadata Application Profiles. In The Tenth International
World Wide Web Conference [14], pages 457-466.

[13] International Organization for Standardization/International Electrotechnical
Commission. MPEG-7: Context and Objectives, 1998. Work in progress.

[14] IW3C2. The Tenth International World Wide Web Conference, Hong Kong, May
1-5, 2001.

[15] Napster Inc. Napster. See http://www.napster.com/.

[16] Gustavo Rossi, Daniel Schwabe, and Alejandra Garrido. Design Reuse in
Hypermedia Applications Development. In The Proceedings of the Eighth ACM
Conference on Hypertext and Hypermedia, pages 57-66, Southampton, UK, April 1997.
ACM Press.

[17] Lloyd Rutledge, Brian Bailey, Jacco van Ossenbruggen, Lynda Hardman, and Joost
Geurts. Generating Presentation Constraints from Rhetorical Structure. In Proceedings
of the 11th ACM Conference on Hypertext and Hypermedia, pages 19-28, San Antonio,
Texas, USA, May 30 - June 3, 2000. ACM.

[18] Lloyd Rutledge, Jim Davis, Jacco van Ossenbruggen, and Lynda Hardman.
Inter-dimensional Hypermedia Communicative Devices for Rhetorical Structure. In
Proceedings of the International Conference on Multimedia Modeling 2000 (MMM00),
pages 89-105, Nagano, Japan, November 13-15, 2000.

[19] Seti@Home. Search for Extraterrestrial Intelligence at Home. See
http://setiathome.ssl.berkeley.edu/.

[20] The Unicode Consortium. The Unicode Standard, Version 3.0. Addison-Wesley
Developers Press, Reading, Mass., 2000.

[21] Frank van Harmelen, Peter F. Patel-Schneider, and Ian Horrocks. Reference
description of the DAML+OIL (March 2001) ontology markup language.
http://www.daml.org/2001/03/reference.html. Contributors: Tim Berners-Lee, Dan
Brickley, Dan Connolly, Mike Dean, Stefan Decker, Pat Hayes, Jeff Heflin, Jim Hendler,
Ora Lassila, Deb McGuinness, Lynn Andrea Stein, ...

[22] Jacco van Ossenbruggen, Joost Geurts, Frank Cornelissen, Lloyd Rutledge, and
Lynda Hardman. Towards Second and Third Generation Web-Based Multimedia. In The
Tenth International World Wide Web Conference [14], pages 479-488.

[23] Jacco van Ossenbruggen and Lynda Hardman. Multimedia on the Semantic Web. In
Herre van Oostendorp, Andrew Dillon, and Leen Breure, editors, Creation, Use and
Deployment of Digital Information. Erlbaum, publication planned late 2002.

[24] Jacco van Ossenbruggen, Lynda Hardman, and Lloyd Rutledge. Hypermedia and
the Semantic Web: A research agenda. Technical Report INS-R0105, CWI, 2001.

[25] W3C. Synchronized Multimedia Integration Language (SMIL) 1.0 Specification.
W3C Recommendations are available at http://www.w3.org/TR/, June 15, 1998. Edited
by Philipp Hoschka.

[26] W3C. Resource Description Framework (RDF) Model and Syntax Specification.
W3C Recommendations are available at http://www.w3.org/TR, February 22, 1999.
Edited by Ora Lassila and Ralph R. Swick.

[27] W3C. Resource Description Framework (RDF) Schema Specification 1.0. W3C
Candidate Recommendations are available at http://www.w3.org/TR, 27 March 2000.
Edited by Dan Brickley and R.V. Guha.

[28] W3C. XHTML 1.0: The Extensible HyperText Markup Language: A Reformulation
of HTML 4.0 in XML 1.0. W3C Recommendations are available at
http://www.w3.org/TR/, January 26, 2000.

[29] W3C. Composite Capability/Preference Profiles (CC/PP): Structure and
Vocabularies. Work in progress. W3C Working Drafts are available at
http://www.w3.org/TR, 15 March 2001. Edited by Graham Klyne, Franklin Reynolds,
Chris Woodrow and Hidetaka Ohto.

[30] W3C. XML Schema Part 0: Primer. W3C Recommendations are available at
http://www.w3.org/TR/, May 2, 2001. Edited by David C. Fallside.

[31] W3C. Requirements for a Web Ontology Language. Work in progress. W3C
Working Drafts are available at http://www.w3.org/TR, 7 March 2002. Edited by Jeff
Heflin, Raphael Volz and Jonathan Dale.

[32] Wireless Application Group. WAP-174: WAG UAPROF User Agent Profile
Specification, 1999.

[33] S.B. Palmer. CWM - Closed World Machine. See http://infomesh.net/2001/cwm/
(link 5-4-2002).

[34] W3C Semantic Web, http://www.w3.org/2001/sw/ (link 5-4-2002).

[35] Advanced Distributed Learning Network (ADLnet), http://www.adlnet.org
(link 5-4-2002).

[36] IMS Global Learning Consortium, http://www.imsglobal.org/ (link 19-1-2002).

[37] Aviation Industry CBT (Computer-Based Training) Committee (AICC).
http://www.aicc.org (link 25-4-2002).

[38] Rogier Brussee, Martin Alberink and Mettina Veenstra. Using Semantic Web
techniques for e-learning. To appear in Proceedings of SCI2002, Orlando, July 2002.

[39] Martin Alberink, Guido Annokée, Rogier Brussee, Marjan Grootveld, Henk de
Poot, Patrick Strating, Janine Swaak (ed.), Mettina Veenstra and Carla Verwijs.
State-of-the-art e-learning. Telematica Instituut, 2001.