YAGO-NAGA

Oct 21, 2013 (3 years and 8 months ago)

YAGO-NAGA Project

Presented by: Mohammad Dwaikat

To: Dr. Yuliya Lierler

CSCI 8986

Fall 2012

Agenda

What is YAGO-NAGA?

Why YAGO-NAGA?

How YAGO-NAGA Works

Demonstration

YAGO-NAGA Sub-Projects

What is YAGO-NAGA?

Harvesting, Searching, and Ranking Knowledge from the Web.

Building a conveniently searchable, large-scale, highly accurate knowledge base of common facts in a machine-processable representation.

Harvested knowledge about millions of entities and facts about their relationships, from Wikipedia and WordNet with careful integration of these two sources.

What is YAGO-NAGA?

Its vision is a confluence of Semantic Web (Ontologies), Social Web (Web 2.0), and Statistical Web (Information Extraction) assets towards a comprehensive repository of human knowledge.

YAGO

Yet Another Great Ontology (YAGO) knowledge base.

It is a huge semantic knowledge base, derived from Wikipedia, WordNet, and GeoNames.

It knows almost 10 million entities (e.g. persons, organizations, cities), and 120 million facts about these entities.

It has a manually confirmed accuracy of 95%.

YAGO is an ontology that is anchored in time and space.

It attaches a temporal dimension and a spatial dimension to many of its facts and entities.

YAGO

It contains all the entities and ontological facts extracted from Wikipedia (from 2010-08-17), with categories mapped to the WordNet class hierarchy.

It also contains multi-lingual data from the Universal WordNet (UWN).

YAGO

It contains all the entities and facts from GeoNames (from a dump of August 2010).

It also contains textual and structural data from Wikipedia.

  All links and anchor texts between the YAGO entities.

  All Wikipedia category names.

  The titles of references.

YAGO

It is particularly suited for disambiguation purposes, as it contains a large number of names for entities. It also knows the gender of people.

YAGO is the resulting knowledge base; its facts are represented as RDF (Resource Description Framework) triples.

Methods and prototype systems have been developed for querying, ranking, and exploring knowledge.

NAGA

Not Another Google Answer (NAGA) is a new semantic search engine which provides ranked answers to queries based on statistical models.

It can operate on knowledge bases that are organized as graphs with labeled nodes and edges, so-called relationship graphs.

As of now, NAGA uses a projection of YAGO as its knowledge base.

The underlying query language supports keyword search for the casual user as well as graph-based queries with regular expressions for the expert user.

Consider These Questions

Which German Nobel laureate survived both world wars and outlived all four of his children?

  The answer is Max Planck.

Which politicians are also accomplished scientists?

  The German chancellor Angela Merkel and Benjamin Franklin.

How are Max Planck, Angela Merkel, Jim Gray, and the Dalai Lama related?

  All four have doctoral degrees from German universities.

Why YAGO-NAGA?

Three major research directions:

  Semantic-Web-style knowledge repositories, such as SUMO, OpenCyc, and WordNet.

  Large-scale information extraction.

  Social tagging and Web 2.0 communities that constitute the social Web. Wikipedia is another example of the Social Web paradigm.

The challenge is how to extract the important facts from the Web and organize them into an explicit knowledge base that captures entities and semantic relationships among them.

How YAGO-NAGA Works

YAGO adopts concepts from the standardized SPARQL Protocol and RDF Query Language for RDF data but extends them through more expressive pattern matching and ranking.

The prototype system that implements these features is NAGA.

Query for the YAGO Knowledge Base

A big US city with two airports, one named after a World War II hero, and one named after a World War II battlefield?

Select Distinct ?c Where {
  ?c type City . ?c locatedIn USA .
  ?a1 type Airport . ?a2 type Airport .
  ?a1 locatedIn ?c . ?a2 locatedIn ?c .
  ?a1 namedAfter ?p . ?p type WarHero .
  ?a2 namedAfter ?b . ?b type BattleField . }
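
To make the pattern matching in the query above concrete, here is a minimal Python sketch that evaluates the same ten triple patterns over a tiny in-memory fact set. It is not how NAGA or any SPARQL engine is implemented. The entity names in TRIPLES (including the made-up Springfield entries) and the helper names is_var and match are assumptions chosen for illustration only.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Toy facts, assumed for illustration; not taken from the YAGO dump.
TRIPLES: Set[Triple] = {
    ("Chicago", "type", "City"),
    ("Chicago", "locatedIn", "USA"),
    ("OHare_Airport", "type", "Airport"),
    ("Midway_Airport", "type", "Airport"),
    ("OHare_Airport", "locatedIn", "Chicago"),
    ("Midway_Airport", "locatedIn", "Chicago"),
    ("OHare_Airport", "namedAfter", "Butch_OHare"),
    ("Butch_OHare", "type", "WarHero"),
    ("Midway_Airport", "namedAfter", "Battle_of_Midway"),
    ("Battle_of_Midway", "type", "BattleField"),
    # Hypothetical distractor city with a single, unnamed-after airport.
    ("Springfield", "type", "City"),
    ("Springfield", "locatedIn", "USA"),
    ("Springfield_Airport", "type", "Airport"),
    ("Springfield_Airport", "locatedIn", "Springfield"),
}

# The triple patterns of the query; terms starting with "?" are variables.
QUERY: List[Triple] = [
    ("?c", "type", "City"), ("?c", "locatedIn", "USA"),
    ("?a1", "type", "Airport"), ("?a2", "type", "Airport"),
    ("?a1", "locatedIn", "?c"), ("?a2", "locatedIn", "?c"),
    ("?a1", "namedAfter", "?p"), ("?p", "type", "WarHero"),
    ("?a2", "namedAfter", "?b"), ("?b", "type", "BattleField"),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(patterns: List[Triple], triples: Set[Triple],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every variable binding that satisfies all patterns (naive backtracking)."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p_term, t_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or check consistency with an earlier binding.
                if candidate.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# "Select Distinct ?c" corresponds to collecting the ?c values into a set.
cities = {b["?c"] for b in match(QUERY, TRIPLES)}
print(cities)
```

On this toy data only Chicago satisfies every pattern, because the hypothetical Springfield airport has no namedAfter facts. Ranking, which NAGA adds on top of matching, is not modeled here.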

Structured Knowledge Queries

Growing the Knowledge Base

[Figure: architecture diagram. Components shown: WordNet + Wikipedia, YAGO Core Extractors, YAGO Core Checker, YAGO Core, Web sources, YAGO Gatherer, Hypotheses, YAGO Scrutinizer, growing YAGO. Annotations: "knows all entities", "focus on facts".]

Information Extraction from Wikipedia

Subj.                Pred.          Obj.
Stanford University  type           Private University
Stanford University  hasPresident   J.L.Hennessy
Stanford University  hasStudents    15,319
Stanford University  foundedBy      L.Stanford
Stanford University  foundedIn      1891

Combine knowledge from WordNet & Wikipedia.

Additional Gazetteers (geonames.org).
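
The subject-predicate-object facts on this slide can be held in a very simple data structure. The following Python sketch stores the five Stanford facts as triples and looks them up by subject and predicate. The Triple type and the objects_of helper are hypothetical names introduced here; YAGO itself distributes its facts as RDF, not as Python objects.

```python
from collections import namedtuple

# A fact is a (subject, predicate, object) triple, as in the table on the slide.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

FACTS = [
    Triple("Stanford University", "type", "Private University"),
    Triple("Stanford University", "hasPresident", "J.L.Hennessy"),
    Triple("Stanford University", "hasStudents", "15,319"),
    Triple("Stanford University", "foundedBy", "L.Stanford"),
    Triple("Stanford University", "foundedIn", "1891"),
]


def objects_of(subject, predicate, facts):
    """Return all objects o such that (subject, predicate, o) is a known fact."""
    return [t.object for t in facts
            if t.subject == subject and t.predicate == predicate]


print(objects_of("Stanford University", "foundedIn", FACTS))
```

Objects are kept as plain strings here, exactly as they appear on the slide; a real knowledge base would type them (numbers, dates, entity references).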

YAGO Knowledge Base

Searching & Ranking RDF Graphs in NAGA

[Figure: three example query graphs. Node and edge labels as recoverable from the slide:]

Queries with regular expressions:

  $x (hasFirstName | hasLastName) Ling . $x type scientist . $x worksFor $y . $y locatedIn* Zhejiang

  $x (coAuthor | advisor)* Beng Chin Ooi

Discovery queries:

  $x bornIn Kiel . $x type scientist

  $x hasWon Nobel prize . $x diedOn $a . $x hasSon $y . $y diedOn $b . $a > $b

Connectedness queries:

  Thomas Mann * Goethe (both of type German novelist)

Ranking based on confidence, compactness and relevance.
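
An edge label followed by a star, such as locatedIn* on this slide, can be read as "follow zero or more edges with this label", and an alternation such as (coAuthor | advisor)* as "follow zero or more edges carrying any of these labels". The Python sketch below illustrates that reading with a breadth-first search over a toy edge list. It is only an illustration of the idea, not NAGA's query processor; the person names and the functions reachable and works_in are assumptions made for the example.

```python
from collections import deque

# Toy relationship graph as (subject, label, object) edges.
# The two people are hypothetical; the place facts are only illustrative.
EDGES = {
    ("Zhejiang_University", "locatedIn", "Hangzhou"),
    ("Hangzhou", "locatedIn", "Zhejiang"),
    ("Zhejiang", "locatedIn", "China"),
    ("Kiel_University", "locatedIn", "Kiel"),
    ("Alice_Ling", "worksFor", "Zhejiang_University"),
    ("Bob_Ling", "worksFor", "Kiel_University"),
}


def reachable(start, labels, edges):
    """Nodes reachable from start via zero or more edges whose label is in labels.

    A single label models `label*`; several labels model `(a | b)*`.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in edges:
            if s == node and p in labels and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen


def works_in(region, edges):
    """People $x with: $x worksFor $y . $y locatedIn* region."""
    return {s for s, p, o in edges
            if p == "worksFor" and region in reachable(o, {"locatedIn"}, edges)}


print(works_in("Zhejiang", EDGES))
```

Because the star allows zero steps, an employer that is itself the region would also match; here the match goes through two locatedIn edges (university, city, province).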

YAGO Server: UI & API

YAGO-UI

  Interactive online demo

  RDF with time, space & provenance annotations

  SPARQL + keywords

YAGO-API

  Two basic WebServices:

    processQuery(String query)

    getYagoEntitiesByNames(String[] names)

www.mpi-inf.mpg.de/yago-naga/demo.html


Browse through the YAGO knowledge base.

  https://d5gate.ag5.mpi-sb.mpg.de/webyagospotlx/Browser

Ask queries on YAGO using SPOTLX patterns. View the results on a map and timeline.

  https://d5gate.ag5.mpi-sb.mpg.de/webyagospotlx/WebInterface

YAGO

YAGO-NAGA Sub-Projects

More than 13 sub-projects of YAGO-NAGA.

AIDA is a method, implemented in an online tool, for disambiguating mentions of named entities that occur in natural-language text or Web tables.

  https://d5gate.ag5.mpi-sb.mpg.de/webaida/

Names, Surface Patterns & Paraphrases

Which chemist was born in London?

(I) Named entity disambiguation

  chemist -> wordnet_chemist, wordnet_pharmacist

  born -> Bertran_de_Born, Born_Identity_(Movie), Born_(Album)

  London -> London_UK, London_Arkansas, Antonio_London

(II) Mapping surface patterns onto semantic relations

  <person> was_born_in <location> -> bornIn(<person>, <location>)

  <person> was_born_in <date> -> bornOn(<person>, <date>)

(III) Paraphrases of questions

  <person> [was] born in <location>

  <location>-born <person>

  NN VBD VBN IN NNP/LOC -> bornIn(<person>, <location>)
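
Step (II) above, where the same surface pattern maps to bornIn or bornOn depending on the type of its second argument, can be sketched in a few lines of Python. This is an assumed, simplified illustration rather than the project's method: the lookup table, the toy gazetteer KNOWN_LOCATIONS, the date regex, and the function names argument_type and to_fact are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Accepts a year or an ISO-style date; a deliberately crude stand-in for date typing.
DATE_RE = re.compile(r"^\d{4}(-\d{2}-\d{2})?$")

# Toy gazetteer standing in for real entity typing.
KNOWN_LOCATIONS = {"London_UK", "London_Arkansas"}

# (surface pattern, type of the second argument) -> semantic relation
PATTERN_TO_RELATION = {
    ("was_born_in", "location"): "bornIn",
    ("was_born_in", "date"): "bornOn",
}


def argument_type(arg: str) -> Optional[str]:
    """Classify an argument as 'date', 'location', or unknown (None)."""
    if DATE_RE.match(arg):
        return "date"
    if arg in KNOWN_LOCATIONS:
        return "location"
    return None


def to_fact(person: str, pattern: str, arg: str) -> Optional[Tuple[str, str, str]]:
    """Map '<person> pattern <arg>' to (relation, person, arg), or None if no rule applies."""
    relation = PATTERN_TO_RELATION.get((pattern, argument_type(arg)))
    if relation is None:
        return None
    return (relation, person, arg)


print(to_fact("Michael_Faraday", "was_born_in", "London_UK"))
print(to_fact("Michael_Faraday", "was_born_in", "1791-09-22"))
```

The design point is that the relation is chosen by the pair (pattern, argument type), not by the pattern alone, which is what lets "was born in" yield two different relations.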

References

YAGO-NAGA Project:

  http://www.mpi-inf.mpg.de/yago-naga/

YAGO:

  http://yago-knowledge.org

NAGA:

  http://www.mpi-inf.mpg.de/yago-naga/naga/demo.html