Sharing and Browsing
Since Santa Barbara
Focus on morpho
Decided to build ontology (to be
discussed later in this talk)
Decided to build supporting tools
smart search engine (Hedwig)
Some work on xml markup
Currently there is no general way for
researchers in the endangered
languages community to
electronically share information.
The Web is the most likely tool that
could provide a solution.
The current WWW is not adequate.
An Example from the WWW:
What about other data formats?
(comparative) word lists
'Grammatical case suffixes' are those which
express grammatical relations (subject,
object, indirect object), like /karriny
(4). A noun without a case suffix is
interpreted as having Absolutive case
/nanttu/ in (4) and /wangarri/ in (5)
as being the main predicator, or as
agreeing with some argument with
/pulyurrulyurru/ in (5).
(from J. Simpson 1998)
ji +ajjul nyirri
njina nanttu, ngapa
ERG +3pl.S put
PAST.CONT humpy, water
'The people were erecting humpies for fear of the rain.'
nyi +ama wangarri kumppu pulyurrulyurru.
PAST.PUN +he rock ABS big.ABS red.ABS
'He placed a big red hill.' [JS:PND:RS]
Other elements that appear as verbal
prefixes include modals
as well as directional elements
'come'. These are placed
in the immediate pre
after the tense. This is shown by the
(from Mchombo 1998)
a maûngu . . .
pumpkins . . .
'The lion did not just smash them, the pumpkins . . .'
'The lion is going to smash some pumpkins.'
Take advantage of new Web
Build a community of practice on the
What is the Semantic Web?
The Semantic Web
New markup: <xml>, <rdf>, <owl>
New tools: smart search engines
ontologies, new editors
Meaning is encoded explicitly.
Pages are interpreted by a reasoner.
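A minimal sketch of what "meaning encoded explicitly" and "interpreted by a reasoner" could look like in practice. Statements are stored as subject-predicate-object triples, and one rule (an instance of a subclass is also an instance of its superclass) derives new statements. The predicate names (`subClassOf`, `type`), the function names, and the sample facts are illustrative assumptions, not part of any EMELD tool.

```python
# Hypothetical facts as (subject, predicate, object) triples.
TRIPLES = {
    ("ErgativeCase", "subClassOf", "GrammaticalCase"),
    ("GrammaticalCase", "subClassOf", "Case"),
    ("suffix_1", "type", "ErgativeCase"),
}


def infer_types(triples):
    """Return the triples plus every 'type' statement implied by subClassOf."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new statement can be derived
        changed = False
        for (s, p, o) in list(inferred):
            if p != "type":
                continue
            for (sub, q, sup) in list(inferred):
                is_parent = q == "subClassOf" and sub == o
                if is_parent and (s, "type", sup) not in inferred:
                    inferred.add((s, "type", sup))
                    changed = True
    return inferred


def instances_of(cls, triples):
    """List everything that is an instance of cls, directly or by inference."""
    return sorted(
        s for (s, p, o) in infer_types(triples) if p == "type" and o == cls
    )
```

A query such as `instances_of("Case", TRIPLES)` finds `suffix_1` even though no one stated that fact directly; a keyword search over plain pages would miss it.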
An Example from the Semantic Web
New markup adds functionality to
existing <html> documents.
nocturnal burrowing mammal of the grasslands of Africa that feeds on
termites; sole extant representative of the order Tubulidentata
WordNet for 'aardvark'
1. nocturnal burrowing mammal of the grasslands of Africa that feeds on
termites; sole extant representative of the order Tubulidentata
<rdfs:comment>nocturnal burrowing mammal of the grasslands of Africa that
feeds on termites; sole extant representative of the order Tubulidentata
WordNet for 'aardvark'<br><br>
1. nocturnal burrowing mammal of the grasslands of Africa that
feeds on termites; sole extant representative of the order Tubulidentata<br>
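Because the description sits inside an explicit `<rdfs:comment>` element, a program can retrieve it without guessing at page layout. The sketch below uses Python's standard `xml.etree.ElementTree`; the sample document and the `http://example.org/aardvark` identifier are made up for illustration and are not the markup shown on the slide.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical, well-formed RDF/XML carrying the aardvark gloss.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:rdfs="{RDFS_NS}">
  <rdf:Description rdf:about="http://example.org/aardvark">
    <rdfs:comment>nocturnal burrowing mammal of the grasslands of Africa
      that feeds on termites; sole extant representative of the order
      Tubulidentata</rdfs:comment>
  </rdf:Description>
</rdf:RDF>"""


def comments(rdf_xml):
    """Map each described resource to its rdfs:comment text."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        node = desc.find(f"{{{RDFS_NS}}}comment")
        if node is not None and node.text:
            # Collapse line breaks and indentation into single spaces.
            result[about] = " ".join(node.text.split())
    return result
```

Calling `comments(SAMPLE)` returns a dictionary keyed by resource identifier, so the same function works unchanged on any document that uses `rdfs:comment`.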
Crucial component of the Semantic Web
A resource that explicitly defines
what entities can exist in a domain,
i.e., the endangered languages
A resource that defines what
relations hold between entities
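The two roles of an ontology named above (which entities can exist, which relations hold between them) can be sketched as a small data structure. The class names (`Noun`, `CaseSuffix`) and the relation `hasSuffix` are hypothetical examples chosen for this sketch; they are not drawn from the EMELD ontology.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    classes: set = field(default_factory=set)
    # relation name -> (domain class, range class)
    relations: dict = field(default_factory=dict)

    def add_class(self, name):
        """Declare that entities of this kind can exist in the domain."""
        self.classes.add(name)

    def add_relation(self, name, domain, range_):
        """Declare a relation; both ends must be known classes."""
        if domain not in self.classes or range_ not in self.classes:
            raise ValueError(f"unknown class in relation {name!r}")
        self.relations[name] = (domain, range_)

    def allows(self, subject_class, relation, object_class):
        """Check whether a statement fits the declared relation."""
        return self.relations.get(relation) == (subject_class, object_class)


def build_example():
    """Build a tiny illustrative ontology (names are assumptions)."""
    onto = Ontology()
    onto.add_class("Noun")
    onto.add_class("CaseSuffix")
    onto.add_relation("hasSuffix", "Noun", "CaseSuffix")
    return onto
```

With this in place, `build_example().allows("Noun", "hasSuffix", "CaseSuffix")` is accepted, while the reversed statement is rejected, which is the kind of check a reasoner relies on.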
OWL Web Ontology Language
Plays a role analogous to that of <html> on the Web
The most current “standard”
Semantic Web language
Under development at the W3C:
Search tools for the Semantic Web
Editors for composing Semantic Web
An extensible data model
A Search Engine
EMELD Arizona’s prototype (Hedwig)
out of service)
demo on Sunday
EMELD Arizona’s prototype (name?)
demo on Sunday
A Good Data Model for Creating a
Community of Practice
Language data should be searchable
Authors or communities want control
over their data (local/distributed).
Local control should be balanced with
data interoperability (Semantic Web).
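One way to picture the balance between local control and interoperability: each project keeps its own labels, and a per-project mapping links those labels to shared concepts. The project names, tags, concept identifiers, and records below are all invented for illustration; this is a sketch under those assumptions, not a description of how the EMELD tools work.

```python
# Hypothetical mappings from each project's own labels to shared concepts.
LOCAL_TO_SHARED = {
    "project_a": {"ERG": "ErgativeCase", "ABS": "AbsolutiveCase"},
    "project_b": {"ergative": "ErgativeCase", "absolutive": "AbsolutiveCase"},
}

# Hypothetical records, each tagged in its project's own terminology.
RECORDS = [
    {"project": "project_a", "form": "example-1", "tag": "ERG"},
    {"project": "project_b", "form": "example-2", "tag": "ergative"},
    {"project": "project_b", "form": "example-3", "tag": "absolutive"},
]


def search(concept, records, mappings):
    """Return the forms whose local tag maps to the shared concept."""
    hits = []
    for rec in records:
        shared = mappings.get(rec["project"], {}).get(rec["tag"])
        if shared == concept:
            hits.append(rec["form"])
    return hits
```

A search for `"ErgativeCase"` returns forms from both projects even though neither changed its own tags; each project only maintains its own mapping table.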
Local Control with Broad Access
No need to standardize your
terminology or abandon tradition.
No need to learn <xml> (it doesn’t
Use EMELD tools to put your data on
the Semantic Web
Maintain your data
See our website: