INEX: Evaluating content-oriented XML retrieval


INEX: Evaluating content-oriented XML retrieval

Mounia Lalmas

Queen Mary University of London

http://qmir.dcs.qmul.ac.uk

Outline

Content-oriented XML retrieval

Evaluating XML retrieval: INEX


XML Retrieval

Traditional IR is about finding documents relevant to a user's
information need, e.g. an entire book.

XML retrieval allows users to retrieve document components that
are more focused on their information needs, e.g. a chapter of a
book instead of the entire book.

The structure of documents is exploited to identify which document
components to retrieve.


Structured Documents

Linear order of words, sentences, paragraphs …

Hierarchy or logical structure of a book's chapters, sections …

Links (hyperlinks), cross-references, citations …

Temporal and spatial relationships in multimedia documents

[Figure: the logical structure of a book - chapters, sections, paragraphs]


World Wide Web

[Figure: a mock web page; its scrambled filler text illustrates that
web documents are only loosely structured, and that retrieval on the
web is an important topic of today's research]


Structured Documents

Explicit structure formalised through document representation
standards (mark-up languages):

Layout: LaTeX (publishing), HTML (Web publishing)

Structure: SGML, XML (Web publishing, engineering), MPEG-7 (broadcasting)

Content/Semantic: RDF, DAML+OIL, OWL (semantic web)


World Wide Web

[Figure: the same mock web page, now annotated with the explicit
mark-up shown in the examples below]


<b><font size=+2> SDR </font></b>
<img src="qmir.jpg" border=0>

<section>
  <subsection>
    <paragraph> … </paragraph>
    <paragraph> … </paragraph>
  </subsection>
</section>

<Book rdf:about="book">
  <rdf:author="…"/>
  <rdf:title="…"/>
</Book>

XML: eXtensible Mark-up Language

Meta-language (user-defined tags) currently being adopted as the
document format language by the W3C

Used to describe content and structure (and not layout)

Grammar described in a DTD (used for validation)

<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> John </fnm> <snm> Smith </snm> </author>
  <chapter>
    <title> Introduction into XML retrieval </title>
    <paragraph> …. </paragraph>
  </chapter> …
</lecture>

<!ELEMENT lecture (title, author+, chapter+)>
<!ELEMENT author (fnm*, snm)>
<!ELEMENT fnm (#PCDATA)>
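A DTD like the fragment above can be checked mechanically. A minimal
sketch, assuming Python with the lxml library (the chapter, title,
paragraph and snm declarations are filled in by me; the slide only
shows part of the DTD):

# Sketch: validating the lecture document against its DTD with lxml.
import io
from lxml import etree

dtd = etree.DTD(io.StringIO("""
<!ELEMENT lecture (title, author+, chapter+)>
<!ELEMENT author (fnm*, snm)>
<!ELEMENT chapter (title, paragraph+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT fnm (#PCDATA)>
<!ELEMENT snm (#PCDATA)>
"""))

doc = etree.XML(
    "<lecture><title>Structured Document Retrieval</title>"
    "<author><fnm>John</fnm><snm>Smith</snm></author>"
    "<chapter><title>Introduction into XML retrieval</title>"
    "<paragraph>...</paragraph></chapter></lecture>")

print(dtd.validate(doc))   # True: the document conforms to the grammar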



XML: eXtensible Mark-up Language

Use of XPath notation to refer to the XML structure

chapter/title: title is a direct sub-component of chapter

//title: any title

chapter//title: title is a direct or indirect sub-component of chapter

chapter/paragraph[2]: the second direct paragraph of any chapter

chapter/*: all direct sub-components of a chapter


<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> John </fnm> <snm> Smith </snm> </author>
  <chapter>
    <title> Introduction into SDR </title>
    <paragraph> …. </paragraph>
  </chapter>
</lecture>
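A minimal sketch of how these path expressions behave, again assuming
Python with lxml (which supports full XPath 1.0), run against the
lecture example above:

# Sketch: evaluating the XPath expressions from the slide with lxml.
from lxml import etree

doc = etree.XML(
    "<lecture><title>Structured Document Retrieval</title>"
    "<author><fnm>John</fnm><snm>Smith</snm></author>"
    "<chapter><title>Introduction into SDR</title>"
    "<paragraph>p1</paragraph><paragraph>p2</paragraph></chapter></lecture>")

for path in ("chapter/title",         # direct title of a chapter
             "//title",               # any title, at any depth
             "chapter//title",        # direct or indirect title under chapter
             "chapter/paragraph[2]",  # second direct paragraph of a chapter
             "chapter/*"):            # all direct sub-components of a chapter
    print(path, "->", [e.tag for e in doc.xpath(path)])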

Querying XML documents

Content-only (CO) queries

'open standards for digital video in distance learning'

Content-and-structure (CAS) queries

//article[about(., 'formal methods verify correctness aviation systems')]
  /body//section[about(., 'case study application model checking theorem proving')]

Structure-only (SA) queries

/article//*section/paragraph[2]
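The about() predicate is not standard XPath: an XML retrieval engine
has to rank the selected elements by how well they discuss the given
keywords. A toy sketch of that idea (my own simplification of about(),
not INEX's definition), assuming Python with lxml:

# Toy sketch: treat about(., 'keywords') as "element text shares terms
# with the query", then rank candidate elements by term overlap.
from lxml import etree

def about_score(element, query):
    """Fraction of query terms occurring in the element's text."""
    text = " ".join(element.itertext()).lower().split()
    return sum(t in text for t in query.lower().split()) / len(query.split())

doc = etree.XML("<article><body><section>case study of model checking"
                "</section><section>unrelated text</section></body></article>")

query = "case study model checking"
hits = sorted(doc.xpath("/article/body//section"),
              key=lambda e: about_score(e, query), reverse=True)
for e in hits:
    print(about_score(e, query), e.text)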

Content-oriented XML retrieval

Return document components of varying granularity (e.g. a book, a
chapter, a section, a paragraph, a table, a figure, etc.), relevant
to the user's information need with regard to both content and
structure.

Content-oriented XML retrieval

Retrieve the best components according to content and structure criteria:

INEX: the most specific component that satisfies the query, while
being exhaustive to the query

Shakespeare study: best entry points, which are components from which
many relevant components can be reached through browsing

???

[Figure: an Article element with sub-elements annotated with term
weights, e.g. 0.9 XML, 0.5 XML, 0.2 XML, 0.4 retrieval, 0.7 authoring,
and candidate result elements marked ?XML, ?retrieval, ?authoring]



Challenges

no fixed retrieval unit + nested elements + element types

how to obtain document and collection statistics?

which component is a good retrieval unit?

which components contribute best to the content of the Article?

how to estimate?

how to aggregate? (see the sketch after this slide)

[Figure: an Article tree (Title, Section 1, Section 2) with element
scores 0.4, 0.5, 0.2, 0.6, 0.4, 0.4, 0.2]
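One common answer to the aggregation question is augmentation-style
weighted propagation: a parent's score mixes its own text evidence
with down-weighted child scores. A minimal sketch of that idea (the
scores and the 0.6 decay factor are illustrative assumptions, not
values prescribed by INEX):

# Sketch: propagating element scores up the XML tree by weighted
# ("augmented") aggregation.

def aggregate(own_score, child_scores, decay=0.6):
    """Parent score = own text evidence + decayed child evidence."""
    return own_score + decay * sum(child_scores)

title    = aggregate(0.5, [])   # leaves carry only their own evidence
section1 = aggregate(0.4, [])
section2 = aggregate(0.2, [])
article  = aggregate(0.0, [title, section1, section2])
print(article)   # 0.66: the article inherits discounted child evidence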

Approaches …

vector space model, probabilistic model, Bayesian network, language
model, extended DB model, Boolean model, natural language processing,
cognitive model, ontology

parameter estimation, tuning, smoothing, fusion, phrase, term
statistics, collection statistics, component statistics, proximity
search, logistic regression, belief model, relevance feedback

Vector space model (IBM Haifa, INEX 2003)

Separate indices per element type: article, abstract, section,
sub-section and paragraph

Each index produces an RSV; the normalised RSVs are merged into a
single ranking (sketched below)

tf and idf as for fixed and non-nested retrieval units
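A minimal sketch of that merge step (the max-normalisation scheme is
my assumption; the slide does not give IBM Haifa's exact formula):

# Sketch: per-index retrieval status values (RSVs), normalised within
# each index, then merged into one element ranking.

indices = {   # illustrative RSVs per element-type index
    "article":   {"a1": 3.2, "a2": 1.1},
    "section":   {"a1/s1": 2.0, "a2/s1": 1.6},
    "paragraph": {"a1/s1/p1": 0.9},
}

merged = {}
for name, rsvs in indices.items():
    top = max(rsvs.values())             # normalise within each index
    for element, rsv in rsvs.items():
        merged[element] = rsv / top      # normalised RSV

print(sorted(merged, key=merged.get, reverse=True))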

Language model (University of Amsterdam, INEX 2003)

Mixture of an element language model and a collection language model,
with smoothing parameter \lambda:

P(t_1 \ldots t_n \mid e) = \prod_{i=1}^{n} \left( \lambda P(t_i \mid e) + (1 - \lambda) P(t_i \mid C) \right)

The element score is combined with the element size and the score of
the containing article to rank elements

query expansion with blind feedback

elements with fewer than 20 terms are ignored

a high value of \lambda leads to an increase in the size of the
retrieved elements

results with \lambda = 0.9, 0.5 and 0.2 are similar
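A minimal sketch of the smoothed scoring step (maximum-likelihood
estimates; the log-length prior standing in for the "element size"
component is my assumption, since the slide does not give the exact
combination):

# Sketch: mixture language model score for an XML element.
# log P(q|e) = sum over query terms of log(lam*P(t|e) + (1-lam)*P(t|C))
import math

def lm_score(query, elem_tokens, coll_tokens, lam=0.5):
    score = 0.0
    for t in query:
        p_elem = elem_tokens.count(t) / len(elem_tokens)
        p_coll = coll_tokens.count(t) / len(coll_tokens)
        score += math.log(lam * p_elem + (1 - lam) * p_coll)
    return score + math.log(len(elem_tokens))   # assumed length prior

coll = "xml retrieval evaluation inex structured document retrieval".split()
elem = "xml retrieval of structured documents".split()
print(lm_score(["xml", "retrieval"], elem, coll, lam=0.9))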

Evaluation of XML retrieval: INEX

Evaluating the effectiveness of content-oriented XML retrieval
approaches

Collaborative effort: participants contribute to the development of
the collection, the queries and the relevance assessments

Similar methodology to TREC, but adapted to XML retrieval

40+ participants worldwide

Workshop in Schloss Dagstuhl in December (20+ institutions)


INEX Test Collection

Documents (~500MB): 12,107 articles in XML format from the IEEE
Computer Society; 8 million elements

INEX 2002: 30 CO and 30 CAS queries; inex2002 metric

INEX 2003: 36 CO and 30 CAS queries; CAS queries are defined
according to an enhanced subset of XPath; inex2002 and inex2003
metrics

INEX 2004 is just starting

Tasks

CO: aim is to decrease user effort by pointing the user to the most
specific relevant portions of documents.

SCAS: retrieve relevant nodes that match the structure specified in
the query.

VCAS: retrieve relevant nodes that may not be the same as the target
elements, but are structurally similar.

Relevance in XML

An element is relevant if it "has significant and demonstrable
bearing on the matter at hand"

Common assumptions in IR:

Objectivity

Topicality

Binary nature

Independence

[Figure: an article containing two sections and three paragraphs]

Relevance in INEX

Exhaustivity: how exhaustively a document component discusses the
query: 0, 1, 2, 3

Specificity: how focused the component is on the query: 0, 1, 2, 3

Relevance: (3,3), (2,3), (1,1), (0,0), …

[Figure: article/section examples]

all sections relevant: the article is very relevant, and the article
is a better retrieval unit than its sections

one section relevant: the article is less relevant, and the section
is a better retrieval unit than the article



Relevance assessment task

Completeness: for each assessed element, its parent and children
elements must also be assessed

Consistency: the parent of a relevant element must also be relevant,
although to a different extent

Exhaustivity increases going up the tree

Specificity decreases going up the tree

Use of an online interface

Assessing a query takes a week!

Average of 2 topics per participant

Interface

[Screenshot: the online assessment interface, showing the current
assessments]

Assessments

With respect to the elements to assess:

26% of assessments on elements in the pool (66% in INEX 2002)

68% highly specific elements not in the pool

7% of elements automatically assessed

INEX 2002: 23 inconsistent assessments per query for one rule

Metrics

Need to consider:

Two dimensions of relevance

Independence assumption does not hold

No predefined retrieval unit

Overlap (e.g. a section and the article containing it)

Linear vs. clustered ranking

INEX 2002 metric

Quantization: strict and generalised

f_{strict}(exh, spec) = \begin{cases} 1 & \text{if } exh = 3 \text{ and } spec = 3 \\ 0 & \text{otherwise} \end{cases}

f_{gen}(exh, spec) = \begin{cases} 1.00 & \text{if } (exh, spec) = (3,3) \\ 0.75 & \text{if } (exh, spec) \in \{(2,3), (3,2), (3,1)\} \\ 0.50 & \text{if } (exh, spec) \in \{(1,3), (2,2), (2,1)\} \\ 0.25 & \text{if } (exh, spec) \in \{(1,1), (1,2)\} \\ 0.00 & \text{if } (exh, spec) = (0,0) \end{cases}
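The same quantisations as code, transcribed directly from the table
above:

# INEX 2002 quantization functions.

def f_strict(exh, spec):
    return 1.0 if (exh, spec) == (3, 3) else 0.0

def f_gen(exh, spec):
    if (exh, spec) == (3, 3):                   return 1.00
    if (exh, spec) in {(2, 3), (3, 2), (3, 1)}: return 0.75
    if (exh, spec) in {(1, 3), (2, 2), (2, 1)}: return 0.50
    if (exh, spec) in {(1, 1), (1, 2)}:         return 0.25
    return 0.00                                 # includes (0, 0)

print(f_gen(2, 3))   # 0.75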









INEX 2002 metric

Precision as defined by Raghavan '89 (based on expected search length):

P(rel \mid retr)(x) = \frac{x \cdot n}{x \cdot n + j + \frac{s \cdot i}{r + 1}}

where n, the total number of relevant units, is estimated; j is the
number of non-relevant units retrieved before the final rank; and the
final rank contains r relevant and i non-relevant units, of which s
relevant units are still needed.
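A direct computation of that quantity (variable names follow the
formula; the example numbers are invented):

# Expected precision at recall x (Raghavan '89), as used by inex_eval.

def expected_precision(x, n, j, s, i, r):
    need = x * n    # number of relevant units required at recall x
    return need / (need + j + (s * i) / (r + 1))

# Recall 0.5 with n=10 relevant units, 4 non-relevant already seen,
# and s=2 more relevant needed from a rank with r=3 relevant and
# i=2 non-relevant units:
print(expected_precision(0.5, 10, 4, 2, 2, 3))   # 5 / (5 + 4 + 1) = 0.5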
Overlap problem

[Figure: nested, overlapping result elements]


INEX 2003 metric

Ideal concept space (Wong & Yao '95): a component c and a topic t are
viewed as sets of concepts

spec(c) = \frac{|c \cap t|}{|c|} \qquad exh(c) = \frac{|c \cap t|}{|t|}
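In set terms these are just overlap ratios; a two-line illustration
(the concept sets are invented for the example):

# Ideal concept space: component c and topic t as sets of concepts.
c = {"xml", "retrieval", "indexing", "compression"}   # component
t = {"xml", "retrieval", "evaluation"}                # topic

spec = len(c & t) / len(c)   # how focused c is on t: 2/4 = 0.5
exh  = len(c & t) / len(t)   # how much of t is covered: 2/3 = 0.67
print(spec, exh)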

INEX 2003 metric

Quantization: strict

exh_{strict}(exh) = \begin{cases} 1 & \text{if } exh = 3 \\ 0 & \text{otherwise} \end{cases} \qquad spec_{strict}(spec) = \begin{cases} 1 & \text{if } spec = 3 \\ 0 & \text{otherwise} \end{cases}

generalised

exh_{gen}(exh) = exh / 3 \qquad spec_{gen}(spec) = spec / 3
INEX 2003 metric

Ignoring overlap (c_i^U is the concept set of the i-th retrieved
component, c_i^T its text, k the rank cut-off and N the total number
of components):

recall_s = \frac{\sum_{i=1}^{k} |t \cap c_i^U|}{\sum_{i=1}^{N} |t \cap c_i^U|} = \frac{\sum_{i=1}^{k} exh(c_i^U)}{\sum_{i=1}^{N} exh(c_i^U)}

precision_s = \frac{\sum_{i=1}^{k} \frac{|t \cap c_i^U|}{|c_i^U|} \cdot |c_i^T|}{\sum_{i=1}^{k} |c_i^T|} = \frac{\sum_{i=1}^{k} spec(c_i^U) \cdot |c_i^T|}{\sum_{i=1}^{k} |c_i^T|}
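A sketch of these sums over a ranked list, with each component
represented as a (concept set, text length) pair (the data is
invented for illustration):

# Sketch: overlap-ignoring recall/precision of the INEX 2003 metric.

def recall_precision_s(ranked, all_components, t, k):
    exh  = lambda concepts: len(t & concepts) / len(t)
    spec = lambda concepts: len(t & concepts) / len(concepts)
    top = ranked[:k]
    recall = (sum(exh(c) for c, _ in top) /
              sum(exh(c) for c, _ in all_components))
    precision = (sum(spec(c) * size for c, size in top) /
                 sum(size for _, size in top))
    return recall, precision

t = {"xml", "retrieval"}
run = [({"xml", "retrieval"}, 120), ({"xml", "indexing"}, 300)]
print(recall_precision_s(run, run, t, k=1))   # (0.67, 1.0)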

INEX 2003 metric

Considering overlap: each component's text only counts the part not
already covered by components ranked before it:

recall_o = \frac{\sum_{i=1}^{k} exh(c_i^U) \cdot \frac{|c_i^T \setminus \bigcup_{j=1}^{i-1} c_j^T|}{|c_i^T|}}{\sum_{i=1}^{N} exh(c_i^U)}

precision_o = \frac{\sum_{i=1}^{k} spec(c_i^U) \cdot |c_i^T \setminus \bigcup_{j=1}^{i-1} c_j^T|}{\sum_{i=1}^{k} |c_i^T \setminus \bigcup_{j=1}^{i-1} c_j^T|}
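The same computation with the novelty factor, treating each
component's text as a set of text units (e.g. term positions) so that
set difference captures "not yet seen" text (a sketch under that
representation):

# Sketch: overlap-aware recall/precision of the INEX 2003 metric.

def recall_precision_o(ranked, all_components, t, k):
    exh  = lambda concepts: len(t & concepts) / len(t)
    spec = lambda concepts: len(t & concepts) / len(concepts)
    seen, rec_num, prec_num, prec_den = set(), 0.0, 0.0, 0.0
    for concepts, text in ranked[:k]:
        novel = len(text - seen)          # text not covered earlier
        rec_num  += exh(concepts) * novel / len(text)
        prec_num += spec(concepts) * novel
        prec_den += novel
        seen |= text
    recall = rec_num / sum(exh(c) for c, _ in all_components)
    return recall, (prec_num / prec_den if prec_den else 0.0)

t = {"xml", "retrieval"}
run = [({"xml", "retrieval"}, {1, 2, 3, 4}),   # covers text units 1-4
       ({"xml"}, {3, 4, 5, 6})]                # units 3-4 already seen
print(recall_precision_o(run, run, t, k=2))    # (0.83, 1.0)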

INEX 2003 metric

Penalises overlap by only scoring novel information in overlapping
results

Assumes a uniform distribution of relevant information within a
component

Issue of stability

Size considered directly in precision (is it intuitive that large is
good?)

Recall defined using exh only

Precision defined using spec only



Alternative metrics

User-effort oriented measures:

Expected Relevant Ratio

Tolerance to Irrelevance

Discounted Cumulated Gain

Lessons learnt


Good definition of relevance



Expressing CAS queries was not easy



Relevance assessment process must be “improved”



Further development on metrics needed



User studies required

Conclusion

XML retrieval is not just about the effective retrieval of XML
documents, but also about how to evaluate effectiveness

INEX 2004 tracks:

Relevance feedback

Interactive

Heterogeneous collection

Natural language query

http://inex.is.informatik.uni-duisburg.de:2004/

INEX: Evaluating content-oriented XML retrieval

Mounia Lalmas

Queen Mary University of London

http://qmir.dcs.qmul.ac.uk