BioNLP@DBCLS

Ontology ranking


What is ontology ranking?


Selecting ontologies


Evaluating ontologies



When a ranking algorithm is developed


Evaluation of the algorithm



Our specific case


OntoFinder/Factory


Input: Set of relevant terms


Output: Set of ontologies



Rank the ontologies according to input terms.
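
A minimal sketch of this ranking step, assuming per-ontology metric scores (coverage, richness, popularity, as discussed below) are precomputed and normalised to [0, 1]; the metric names, weights and example values are illustrative assumptions, not the actual OntoFinder/Factory implementation:

    # A minimal sketch, assuming precomputed per-ontology metric scores in [0, 1];
    # names, weights and example values are illustrative, not OntoFinder/Factory's.
    def rank_ontologies(metric_scores, weights=(0.5, 0.3, 0.2)):
        """metric_scores: {ontology_id: (coverage, richness, popularity)}.
        Returns ontology ids sorted best-first by the weighted sum of the metrics."""
        w_cov, w_rich, w_pop = weights

        def score(onto):
            cov, rich, pop = metric_scores[onto]
            return w_cov * cov + w_rich * rich + w_pop * pop

        return sorted(metric_scores, key=score, reverse=True)

    # Toy example with three candidate ontologies.
    print(rank_ontologies({"GO": (0.9, 0.7, 0.8), "CHEBI": (0.6, 0.8, 0.5), "FMA": (0.4, 0.9, 0.3)}))

In the real system the three scores would be computed from the input terms and the ontology content; only the weighted-sum-and-sort pattern is shown here.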


Selection of previous methods

(Columns: Paper; Coverage; Richness; Popularity; Comment)

Alani et al. (AKTiveRank), 2005
Coverage: Class Match Measure (CMM)
Richness: Centrality (CEM, deprecated), Density (DEM), Betweenness (BEM)
Popularity: N/A
Comment: Also uses semantic similarity.

Sabou et al., 2006
Coverage: Topic coverage
Richness: Richness of knowledge
Popularity: Popularity
Comment: A more theoretical paper; no real implementation.

Jonquet et al. (Recommender), 2009
Coverage: Term matching
Richness: Structure measure
Popularity: Connectivity (mapping extension)
Comment: They also use size as a metric (to normalize the final score); however, the normalized score performed worse overall than the non-normalized score. Concentrated on the biomedical domain.

Martinez-Romero et al., 2012 (in Recommender Systems for the Social Web, Springer)
Coverage: (Context) coverage, defined as CCScore
Richness: (Semantic) richness (SRScore), calculated from the Relatives Index, the Additional Information Index and the Similar Knowledge Index
Popularity: Popularity
Comment: Main contribution is the use of Web 2.0 for popularity. They also use semantic expansion with the WordNet and UMLS dictionaries.

Park et al., 2011
Coverage: Topic coverage, concept match
Richness: Richness
Popularity: N/A
Comment: Adds "semantic similarity between relations" metrics (RMM, Taxo). Problem: not many ontologies contain relation information/labels. Also provides a query interface to avoid polysemy.

Park et al.

Relation Match Measure (RMM) is defined as a combination of:

Concept match (exact match, partial match, synonymous match).

Relation label match: the degree of correspondence between the relation between the search terms and the relation between the concepts matched by the search terms.

Distance: the minimum path length between the matched concepts (direct match = directly connected).

Neighbour match: can the domain and range concepts be connected with the help of their neighbour nodes, in addition to their original links?
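
A rough sketch of how these four components could be combined into one score; Park et al.'s exact formula and weights are not reproduced here, the component names and the equal weighting below are assumptions:

    # Illustrative combination only; Park et al.'s exact formula and weights are
    # not reproduced, this merely mirrors the four components listed above.
    CONCEPT_MATCH = {"exact": 1.0, "synonym": 0.8, "partial": 0.5, "none": 0.0}

    def rmm(concept_match, label_match, path_length, neighbour_bonus):
        """concept_match: 'exact' / 'synonym' / 'partial' / 'none';
        label_match: 0..1 agreement between the query relation and the matched relation label;
        path_length: minimum path length between the matched concepts (1 = directly connected);
        neighbour_bonus: 0..1 credit when domain/range concepts connect only via neighbour nodes."""
        distance_score = 1.0 / path_length if path_length > 0 else 0.0
        return (CONCEPT_MATCH[concept_match] + label_match + distance_score + neighbour_bonus) / 4.0

    # Example: exact concept match, strong relation-label agreement, directly connected concepts.
    print(rmm("exact", 0.9, 1, 0.0))   # 0.725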




Martinez-Romero et al.:

Expansion of input terms with WordNet and UMLS.

Weights for each metric were recommended by experts.

Previous approaches have 4 main drawbacks:

1. not completely automatic,

2. input is restricted to a single word,

3. popularity is not considered or not correctly assessed,

4. semantics from relations in ontologies are ignored.

Three metrics again (coverage, richness, popularity).

No word disambiguation (compared to AKTiveRank).
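
The WordNet part of this term expansion can be illustrated with NLTK; a sketch assuming the nltk package and its WordNet corpus are installed, with the UMLS expansion omitted:

    # WordNet-only expansion sketch (UMLS omitted); assumes the nltk package and
    # its 'wordnet' corpus are available (nltk.download('wordnet')).
    from nltk.corpus import wordnet as wn

    def expand_term(term):
        """Return the input term plus the lemma names of all its WordNet synsets."""
        expanded = {term}
        for synset in wn.synsets(term.replace(" ", "_")):
            for lemma in synset.lemma_names():
                expanded.add(lemma.replace("_", " "))
        return sorted(expanded)

    print(expand_term("tumor"))   # e.g. adds 'neoplasm' and 'tumour'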


Evaluation of previous methods

(Columns: Paper; Evaluation method; Comments)

Alani et al. (AKTiveRank), 2005
Evaluation method: User study.
Comments: Correlation: rankings were compared using the Pearson correlation coefficient.

Sabou et al., 2006
Evaluation method: No evaluation.

Jonquet et al. (Recommender), 2009
Evaluation method: User study.
Comments: A questionnaire was defined in which users (from EBI and Stanford) ranked the ontologies. Note: the Recommender did not identify small ontologies that were expected by the evaluators. Also, the importance of the metrics was defined/ranked by the users.

Martinez-Romero et al., 2012
Evaluation method: No evaluation.
Comments: However, weights for the metrics used were defined by 5 experts.

Park et al., 2011
Evaluation method: User study plus comparison with previous research.
Comments: Three experiments were performed: human ranking, changing weights and parameters, and comparison to AKTiveRank. The three most important metrics were identified.
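
The correlation-based comparison used in AKTiveRank-style evaluations can be sketched as follows; this assumes SciPy is installed, and the rank vectors are toy data, not results from any of the papers above:

    # Sketch of the correlation-based comparison used in AKTiveRank-style
    # evaluations; SciPy assumed, rank vectors are toy data.
    from scipy.stats import pearsonr

    algorithm_ranks = [1, 2, 3, 4, 5]   # positions assigned by the ranking algorithm
    human_ranks = [2, 1, 3, 5, 4]       # positions assigned by human evaluators

    r, p_value = pearsonr(algorithm_ranks, human_ranks)
    print(f"Pearson r = {r:.2f} (p = {p_value:.2f})")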

OntoFinder: Challenges


Fact: No “perfect algorithm” for evaluation/ranking.



Which metrics to use and how to weight them?


Metrics and weights from previous works?


Define new/improve old metrics:


Coverage - improve string similarity?


Until now: exact match, partial match, synonyms, longest only, edit distance, n-grams (?).


We: head word matching and other string-similarity measures (see the sketch at the end of this slide).



Add different metrics for popularity?


Number of views on BioPortal?


PubMed references?



How to evaluate the results of the algorithm?


Human evaluators?


Automatic evaluation method? (Brank et al. 2006)


Ontology-based annotation:


Select ontologies -> Annotate -> Compare the result with a "gold standard" (CRAFT?)


How?


ML approach (?)
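
A sketch of two of the string-similarity measures mentioned under Coverage above, edit distance and character n-grams; pure Python, with the n-gram size and example labels as illustrative assumptions rather than OntoFinder's actual settings:

    # Two of the string-similarity measures mentioned under Coverage; the n-gram
    # size and the example labels are assumptions, not OntoFinder's settings.
    def edit_distance(a, b):
        """Levenshtein distance via dynamic programming."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    def ngram_similarity(a, b, n=3):
        """Dice coefficient over character n-grams (1.0 = identical gram sets)."""
        grams = lambda s: {s[i:i + n] for i in range(len(s) - n + 1)}
        ga, gb = grams(a.lower()), grams(b.lower())
        if not ga or not gb:
            return 0.0
        return 2 * len(ga & gb) / (len(ga) + len(gb))

    print(edit_distance("tumour", "tumor"))                            # 1
    print(round(ngram_similarity("heart valve", "cardiac valve"), 2))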