RI: iOPENER: A Flexible Framework to Support Rapid Learning in Unfamiliar Research Domains

Project Description

1. Project Overview

In today's rapidly expanding disciplines, scientists and scholars are constantly faced with the daunting task of keeping up with knowledge in their field. In addition, the increasingly interconnected nature of real-world tasks often requires experts in one discipline to rapidly learn about other areas in a short amount of time:

(1) Cross-disciplinary research requires scientists in areas such as linguistics, biology, and sociology to learn about computational approaches and applications, e.g., computational linguistics, biological modeling, social networks.

(2) Authors of journal articles and books must write accurate surveys of previous work, ranging from short summaries of related research to in-depth historical notes.

(3) Government decision-makers must learn about different scientific fields to determine funding priorities; e.g., the National Academy of Sciences¹ has charged committee members with writing major publications on a wide range of disciplines to assist the government in making policy decisions.

(4) Interdisciplinary review panels are often called upon to review proposals in a wide range of areas, some of which may be unfamiliar to panelists. Thus, they must learn about a new discipline "on the fly" in order to relate their own expertise to the proposal.

(5) Homeland security experts must rapidly learn about new and emerging threats to the health and safety of our nation, e.g., the microbiology of, and effective cures for, anthrax or avian flu. The impacts of pandemics involve public health, medicine, and national security, requiring critical interdisciplinary decisions.

The original contributions of this research are in linking three currently available technologies: (1) bibliometric lexical link mining that exploits the structure of citations and relations among citations; (2) summarization techniques that exploit the content of the material in both the citing and cited papers; and (3) visualization tools for displaying both structure and content.

To our knowledge, this is the first attempt to link these three technologies and also to evaluate different forms of presentation for rapid learning in unfamiliar research domains. Specifically, we will extend current summarization algorithms (e.g., Multi-Document Trimmer [Zajic et al., 2006] and MEAD [Radev et al., 2004]) to detect redundancy, contradictions, and temporal ordering using citation analyses provided by existing techniques [Elkiss et al., 2007], and we will display the resulting surveys, timelines, and bibliometric links in an existing visualization environment [Plaisant et al., 1998]. We will test hypotheses concerning the integration of bibliometric link mining with summarization techniques and also investigate the presentation of this information in different modalities to different types of users.

The lack of a framework for filling gaps such as those illustrated above is clearly evident, despite the fact that papers are easier than ever to obtain. In recent online discussions, research areas are described as "complicated to outsiders or newcomers," and every paper appears to be "a small piece of an unfinished jigsaw puzzle" (cf. the Machine Learning blog topic "What is Missing for Online Collaborative Research," 9/18/2006, at http://hunch.net/?p=214). The problem has not yet been solved even with collaborative online resources such as Wikipedia, because they are typically non-authoritative collections of disparate and (often) contradictory pieces of text, written by many different authors. Those who desire to learn quickly about new areas have expressed a great need for automatic methods to organize and summarize the morass of information that is so readily available on different topics.

We envision a system, iOPENER (Information Organization for PENning Expositions on Research), that addresses these issues by generating readily consumable surveys of different scientific domains and topics, targeted to different audiences and levels, e.g., expert specialists; lay people or scientists from related disciplines; educators; government decision makers; and citizens, including minorities and underrepresented groups.

Surveyed material will be presented in different modalities, e.g., an enumerated list of articles, a bulleted list of key facts, a textual summary, a historical/bibliographic timeline, or a visual presentation with zoom and filter capabilities. We will leverage existing publicly available resources such as the ACL Anthology, ACM Digital Library, CiteSeer, and others for our analysis, retrieval, selection, and survey/timeline creation and visualization.

A broad-reaching implication is that citizens who may need access to complex information that is otherwise inaccessible due to its technical nature will have direct access to the information, in a form they can understand and utilize. For example, the environmental impact of building different types of septic tanks under different percolation properties of the soil is available on government websites, but these reports are either very general and simplified, or highly technical. The availability of a tool like iOPENER would allow a common citizen to gain accurate technical information and make informed decisions. Areas such as the management of public health threats, weather-related catastrophes, environmental impact, and other such issues are increasingly complex, and citizen awareness can only be improved with better access to comprehensible, accurate, and up-to-date information.

¹ http://www.nationalacademies.org/publications/#reports

The iOPENER software and resulting surveys and timelines would be made publicly available, and research results would be presented at conferences such as ACL, SIGIR, and ASIST, as well as to broader audiences, e.g., analysts who prepare briefings for government policy-makers. A natural avenue for this will be through the University of Maryland's School of Public Policy, which has close ties to government agencies. Another avenue is through the Center for Advanced Study of Language, where one of the co-PIs holds a joint appointment. Our location in the Washington, D.C. area provides opportunities to link directly to potential government clients.

All five investigators have solid track records for their work on NSF-funded projects and have demonstrated excellence in natural language processing, information retrieval, digital libraries, information science, and human-computer interaction at both the University of Maryland and the University of Michigan.

2. Project Objectives and Expected Significance

We focus on producing surveys on a large number of topics within two selected domains, starting with the areas of Computer Science and the Social Sciences, e.g., computational linguistics, in order to bootstrap the evaluation process. As the project develops, we will pay special attention to cross-disciplinary areas and techniques that span a number of applications. For example, the topics of social network analysis and Bayesian nets appear frequently in the literature of Political Science, Psychology, and Machine Learning, yet experts in any one of these fields may not know about these cross-disciplinary applications, or they may only discover them serendipitously by reading many papers outside their area of expertise. It is crucial to bring these cross-disciplinary applications to light so that scientists can discover new ways to apply techniques based on what can be learned in other domains. Decision-makers will be better informed in placing priorities on fundamental techniques that contribute to and draw on many fields, as well as on specifically targeted methods critical to individual fields.

The specific objectives of our research are to provide the following:

1. Within-Discipline Information: Educational enhancement and improved access to technical information within a discipline for expert and lay users:
   a. To present a discipline at an understandable yet accurate level to non-expert audiences.
   b. To update the knowledge of educators at regular intervals so that they can keep their curricula up-to-date.

2. Interdisciplinary Information: Cross-disciplinary instruction:
   a. To allow scientists in one area to find the relevant papers when they begin to investigate a new area.
   b. To enable the reviewing of proposals outside of the reviewers' area of expertise.
   c. To assist panels in writing reports for the government on a wide range of scientific areas.

3. Tools and Resources for Public Use: Dissemination of tools and resources:
   a. To provide a tool for authors to produce focused historical notes for textbooks in different sub-fields.
   b. To assist government decision makers in executing policy decisions about particular scientific areas.
   c. To augment current online resources (e.g., Wikipedia) that focus only on a small set of research areas (e.g., machine translation, but not phrase-based MT).

We aim to test the following two scientific hypotheses:

H1: By integrating the two technologies, bibliometric lexical link mining [Elkiss et al., 2007] and summarization techniques [Radev et al., 2004; Zajic et al., 2006], we achieve better results than we would with either approach alone, as measured by automatic distance metrics, human tests of adequacy, and task-based evaluations.

H2: By integrating textual surveys into visualization environments, we can capture the more subtle aspects of topic-focused surveys that a user would not otherwise find (e.g., contradictions and paraphrase), as measured by task-based utility studies [Shneiderman and Plaisant, 2006; Klavans and Boyack, 2006].


If our hypotheses are correct, we will achieve a survey-producing capability that surpasses what is currently available. We propose a set of evaluations, outlined in Section 5, to verify each of these hypotheses in the context of our tests of the component and functional effectiveness of iOPENER.

An example of a capability to be provided in iOPENER is a generator of "historical notes" of the type available in the latest textbook on Natural Language Processing [Jurafsky and Martin, 2007]. An excerpt from one such survey on the topic of Statistical Machine Translation is shown here:

Modern statistical methods began to be applied in the early 1990s, enabled by the development of large bilingual corpora and the growth of the web. Early on, a number of researchers showed that it was possible to extract pairs of aligned sentences from bilingual corpora \cite{KayRos88, WarwickRussell90, BrownLaiMercer91, Gale&Church91, Gale&Church93, Kay&Röscheisen93}. The earliest algorithms made use of the words of the sentence as part of the alignment model, while others relied solely on other cues like sentence length in words or characters. … At the same time, the IBM group, drawing directly on algorithms for speech recognition … proposed the Candide system, based on the IBM statistical models we have described \cite{Brown&al90, Brown&al93}. … Progress was made hugely easier by the development of publicly-available toolkits…. Initially most research implementations focused on IBM Model 3, but very quickly researchers moved to phrase-based models. While the earliest phrase-based translation model was IBM Model 4 \cite{Brown&al93}, modern models derive from Och's \cite{Och98} work on alignment templates.

Note that our examples in this proposal are taken from the area of computational linguistics purely for illustrative purposes; our goal is to explore social science and digital government as well as computer science, once we develop and evaluate our techniques.

3. Relation of the Project to the Field and to Other Work

Existing work on systems that assist scientists in assimilating knowledge about a particular domain focuses on integration across published work and the presentation of search results [Shiffrin and Börner, 2004]. Summarization researchers have focused on the generation of short descriptions of a paper or a set of papers [Carbonell and Goldstein, 1998; Dorr et al., 2003; Goldstein et al., 2000; Radev et al., 2004; Zajic et al., 2006] and on identifying similarities and differences [McKeown and Radev, 1995; Hatzivassiloglou et al., 2001]. We focus instead on the development of a methodology for producing topic-specific surveys using techniques from citation analysis, information retrieval, and summarization. We will also explore language-oriented extensions to existing visualization frameworks (e.g., LifeLines [Plaisant, 1996]), e.g., to show the evolution of a discussion or the roles in a debate.²


3.1. Relation of the Project to the State of Knowledge in the Field

The field of bibliometrics involves co-citation analysis and the identification of relationships between cited documents and their citing counterparts [Osareh, 1996; Potter, 1988]. More recent additions to the bibliometric literature involve multiple visualizations of these relationships, e.g., the Places & Spaces exhibit at the New York Public Library, curated by Katy Börner (http://www.scimaps.org). Although these techniques are highly sophisticated in terms of analysis, links, and mappings, they have not yet covered in-depth full-text analyses of input documents. For example, concept maps use simple keywords with stemming to determine relatedness. The computational linguistics community contributes more complex and subtle notions of similarity, including paraphrasing and synonymy, as well as determining contrasting, conflicting, or contradicting information. Thus, a visualization system might represent two articles with the keywords "Bayesian Networks" in the title without capturing the fact that one might be "The Failure of Bayesian Networks, and the Success of Linear Standoff Techniques, in Determining Pandemic Threats," whereas the other might be "The Use of Bayesian Networks to Predict Public Health Threats." The iOPENER project aims to fill this gap by bringing together language-focused capabilities with visualization tools to evaluate different presentation modalities for rapid learning of scientific topics.

3.2. Relation to Work in Progress by the PI under Other Support

In a project called "Citabs" [Elkiss et al., 2007], where relationships among citing sentences are being studied, it has been demonstrated that a "citation summary" (i.e., the set of all sentences that cite a particular paper) can present information about a cited paper that the authors themselves did not consider important enough to include in the abstract. However, the citation summaries produced in this work are not coherent or even chronologically accurate. This proposal aims to focus on the problem of producing summaries that take into account event timeline reconstruction and the detection of different forms of paraphrase, redundancy, and contradiction. We leverage the Citabs citation analysis approach to build an initial structure for these summaries before producing a coherent topic-specific survey.

In the area of multi-document summarization, a common approach is to rank candidate sentences according to a set of factors, iteratively re-ranking to avoid redundancy within the summary. MEAD [Radev et al., 2004] ranks documents according to a linear combination of features including centroid, position, and first-sentence overlap. Once a set of sentences has been chosen as the summary, all sentences are rescored with a redundancy penalty based on word overlap with the chosen sentences. A new set of summary sentences is chosen based on the re-ranking. A more recent version of MEAD also incorporates lexical centrality [Erkan and Radev, 2004], a measure of the importance of individual parts of a document in a collection. In iOPENER, we adopt an adaptation of this approach that includes a mechanism for optimizing feature weights, with the added feature of "Trimming" [Dorr et al., 2003; Zajic et al., 2006], which allows us to eliminate components of the summary that would otherwise be irrelevant or redundant with information that has already been provided.
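To make the re-ranking scheme concrete, the following minimal sketch (ours, not MEAD's actual code; the feature weights, centroid size, and redundancy threshold are illustrative assumptions) scores sentences by a linear combination of centroid, position, and first-sentence overlap, then applies a word-overlap redundancy penalty as the summary grows:

```python
from collections import Counter

def mead_style_summary(sentences, weights=(1.0, 1.0, 1.0), max_sents=5):
    """sentences: list of token lists in document order. Scores each sentence with
    a linear combination of centroid, position, and first-sentence overlap, then
    greedily adds sentences, skipping any that overlap heavily with the summary."""
    w_c, w_p, w_f = weights
    freq = Counter(tok for sent in sentences for tok in sent)
    centroid = {tok for tok, _ in freq.most_common(20)}  # top document words
    first = set(sentences[0])
    n = len(sentences)

    def score(i):
        s = set(sentences[i])
        c = len(s & centroid) / max(len(s), 1)   # centroid feature
        p = (n - i) / n                          # position feature (earlier is better)
        f = len(s & first) / max(len(first), 1)  # first-sentence overlap feature
        return w_c * c + w_p * p + w_f * f

    chosen, used = [], set()
    for i in sorted(range(n), key=score, reverse=True):
        s = set(sentences[i])
        if len(s & used) / max(len(s), 1) < 0.5:  # redundancy penalty on word overlap
            chosen.append(i)
            used |= s
        if len(chosen) == max_sents:
            break
    return [sentences[i] for i in sorted(chosen)]  # restore document order
```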

² LifeLines (http://www.cs.umd.edu/hcil/lifelines/) provides a time-oriented visual overview of a personal, medical, legal, or organizational history. Time lines running from left to right indicate intervals (e.g., hospitalization for two weeks), and points indicate events (e.g., a lab test, sonogram, or vaccination). Such time lines can show thousands of intervals and events, color and size coded.

3.3. Relation of the Project to Work in Progress Elsewhere

Current work on the use of citations for creating summaries (e.g., [Nanba et al., 2000]) focuses primarily on detecting "citation areas" of other articles in order to extract information about a cited article. This differs from our iOPENER approach in that we focus not just on the detection of citation areas for other articles, but also on the relevant key passages in the original article. Moreover, our surveys will bring together key passages from multiple sources pertaining to a particular topic, not just a single article discussing one aspect of that topic.

Syntactic shortening has been used by other researchers, e.g., the Cut-and-Paste approach of [Jing and McKeown, 2000], the SC approach [Blair-Goldensohn et al., 2004], and CLASSY [Conroy et al., 2006]. The SC system pre-processes the input to remove appositives and relative clauses. CLASSY uses a statistical sentence selection approach combined with a conservative sentence compression method based on shallow parsing to detect lexical cues that trigger phrase eliminations.³ Our Trimming approach is most similar to that of [Jing and McKeown, 2000], in which the removal of syntactic constituents from sentences was guided by observation of human summarizers. Trimmer differs from these systems in that multiple trimmed candidates of each sentence are used to enhance the overall quality of the summarized output.

Simone Teufel has worked very actively over the years to parse scientific papers based on their rhetorical content. Her method, called "argumentative zoning" [Teufel and Moens, 2000; Teufel, 2005], is used to automatically identify the portions of a paper that contain new results, comparisons to earlier work, etc., in a way complementary to the structure of the document as determined by its section headings. This zoning information is an important component of annotation that may be used by iOPENER.

The PERSIVAL medical digital library system [McKeown et al., 2001; Elhadad et al., 2005] is an example of a system that provides quick and easy personalized access to patient-specific medical records, including personalized summaries of medical information pertaining to an individual. Summaries are tailored to either the medical specialist or to the lay reader, a capability we will provide in iOPENER, but in areas outside the domain of medicine.

Several visualization tools are effective in showing bibliometric information (e.g., citations or co-author patterns over time or across disciplines). However, standard network node-link diagrams are often a poor strategy because the number of nodes and links can easily overwhelm users. Alternate strategies and interactive exploration show promise [Aris and Shneiderman, 2006; Bilgic et al., 2006; Plaisant et al., 2006; Shneiderman et al., 1998]. In iOPENER, we will extend these tools to display topic-focused surveys and historical timelines with zoom-in/zoom-out capability.

4. Architecture

The overall architecture for iOPENER is shown in Figure 1. The starting point for our system will be existing digital repositories of scientific articles, e.g., the ACL Anthology, ACM Digital Library, and CiteSeer. However, in later phases of the project, we will also explore the use of focused Web crawlers to augment static collections with up-to-date information (e.g., Google Scholar), and we will explore repositories from other research areas, e.g., government, social sciences, and crosscutting fields such as Language Evolution (http://www.isrl.uiuc.edu/amag/langev/) and social network analysis (http://www.faculty.ucr.edu/~hanneman/nettext/Bibliography.html).

Our architecture includes modules for citation analysis, identification of important articles on a particular topic (retrieval), identification of key passages in these articles (selection), and annotation of passages with the results of detailed linguistic analysis. The intermediate product will be a common representation, an XML-based stand-off annotation, that can be transformed into different output modalities, each of which may emphasize a different aspect of the surveyed material.⁴ This representation will be used for different presentation methods, including fluent summaries, historical timelines, and visual zoom and filter capabilities:

Summaries
, or coherent and fluent texts that discuss the significant ideas and advances in a p
articular topic. They may
range in length, from a few paragraphs (the related work section in a paper or the historical notes in a textbook chapter)
to an entire article.



Timelines,
which emphasize the chronological development of ideas in a particular top
ic. We view timelines as a
strict, temporally
-
ordered sequence of self
-
contained segments that outline important intellectual milestones.




3

Other related areas of summarization include redundancy detection [Carbonell and Goldstein, 1998; Goldstein et al., 2000] and

summarization in informal domains [Muresan et al., 2001; Clarke and Lapata, 2006]. In add
ition, the summarization community
has recently begun to examine “change over time” [Dang and Harman, 2006].

4

We plan on using U
niform
R
etrieval
A
rchitecture (URA)
, an existing open source a Java framework for

manipulating stand
-
off
annotations developed
at University of

Maryland, a
vailable for download at

http://www.umiacs.umd.edu/~jimmylin/downloads




5



Other modalities

specifically visualization tools
, where well
-
designed compact displays present large volumes of
rele
vant information that reduces the burden on short term and working memory, and shifts the load from cognitive to
perceptual systems, thereby amplifying human capability.
5

[Figure 1: iOPENER Architecture. Datasets (scientific articles, government documents, biology documents) feed a citation analysis module that produces a bibliometric graph; retrieval, selection, and linguistic analysis modules build a common representation, which drives the summary, timeline, and visualization generators for presentation to the user.]

The input to our system will be a specific topic (e.g., "statistical paraphrase induction"), as broadly or narrowly defined as the user wishes. Topics can be "dynamic" in the sense that they need not conform to a list of recognized labels (e.g., "statistical machine translation"), but can draw concepts from many different topics (e.g., "discriminative machine learning techniques for semantic role labeling"); the only limitation is the number of available topically relevant articles. In addition, the user must specify the output modality, along with other desired characteristics (for example, level of detail, temporal scope of articles, etc.).

Consider the topic area within computational linguistics of "paraphrase induction." For this topic, the candidate list of articles might include the following two articles:

• Lin, Dekang and Patrick Pantel. DIRT: Discovery of Inference Rules from Text. Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2001.

• Barzilay, Regina and Kathleen McKeown. Extracting Paraphrases from a Parallel Corpus. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics. 2001.

The description of iOPENER will build on this working example. We will discuss each component of Figure 1 below, in turn. In the sections below, we take a segment to be a linguistic unit of information, e.g., a sentence or a phrase corresponding to a key passage.

4.1. Citation Analysis

When not already available, iOPENER must induce a citation graph (i.e., a representation of the links between citing and cited papers) for each collection of interest, which includes additional information about the relationship between individual articles. Consider, for example, the ACL Anthology, a digital repository for research in computational linguistics: the citation analysis module must extract citation information from papers that are only available in PDF format, reconcile the different ways that a given paper can be cited, and build a citation graph that includes not only information about which papers cite other papers, but also the texts of the actual citations. We leverage our existing Citabs system [Elkiss et al., 2007] to produce this graph.

We will further continue our work on lexical centrality in LexRank [Erkan and Radev, 2004], which defines a graph structure based on lexical similarity among articles in a collection. That structure, in conjunction with a random-walk-based ranking algorithm similar to Google's PageRank [Brin and Page, 1998], has been shown to extract high-quality passages from a text, even in the absence of explicit citation linkage information.
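As a rough illustration of the lexical-centrality idea, here is a minimal sketch of a LexRank-style computation as we understand it from [Erkan and Radev, 2004] (this is not the clairlib implementation; the similarity threshold and damping factor are illustrative):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def lexrank(texts, threshold=0.1, damping=0.85, iters=50):
    """Rank passages by lexical centrality: build a similarity graph over the
    passages, then run a PageRank-style random-walk iteration over it."""
    bags = [Counter(t.lower().split()) for t in texts]
    n = len(bags)
    # Adjacency: an edge wherever similarity exceeds the threshold.
    adj = [[1 if i != j and cosine(bags[i], bags[j]) > threshold else 0
            for j in range(n)] for i in range(n)]
    degree = [max(sum(row), 1) for row in adj]
    rank = [1.0 / n] * n
    for _ in range(iters):  # power iteration, as in PageRank
        rank = [(1 - damping) / n
                + damping * sum(rank[j] * adj[j][i] / degree[j] for j in range(n))
                for i in range(n)]
    return sorted(range(n), key=lambda i: rank[i], reverse=True)
```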

The specific tasks associated with this component of the iOPENER architecture are:

• Paper collection: we will mostly use articles in our collections (CiteSeer, ACL Anthology, ACM Digital Library), but also articles that we will crawl from the web using our clairlib software (http://tangra.si.umich.edu/clair/clairlib).

• Text extraction: we will convert PDF to plain text using a tool developed at the University of Michigan and extract sentences from the output.

• Citation extraction: we will build a tool for reliably extracting citing information from arbitrary papers and identifying the targets of the citations.

• Citation graph creation: we will build a tool for creating topic-specific citation graphs.

• LexRank creation: we will use an existing tool that is part of clairlib that ranks papers according to lexical centrality.

• Labeling extracted text with time and author information: we will build a component that reliably associates each citation with the metadata of the target paper.

In our "paraphrase induction" example, consider the set of citing sources for the second of the two articles, i.e., [Barzilay and McKeown, 2001]. The output of an intermediate step of the process described above is a set of citing articles, two of which are shown in Figure 2, where the bold-faced text indicates the sentences citing this article. This intermediate output will be further transformed during citation analysis to include a normalized form of each reference; e.g., "Barzilay and McKeown's 2001 conference paper" and "Barzilay & McKeown (2001)" will both be mapped to a unique paper ID, possibly based on the DOI (digital object identifier) standard (http://www.doi.org).


Automatic Paraphrase Acquisition from News Articles - Shinyama, Sekine, Kiyoshi (2002): ....or paraphrases from corpora by using NEs. So far we have applied our method to two domains in Japanese newspapers and obtained some notable examples. There are a few approaches for obtaining paraphrases automatically. **Barzilay et al. used parallel translations derived from one original document [1].** They targeted literary works and used word alignment techniques developed for MT. However the syntactic variety of the resultant expressions is limited since they used only part of speech tags to identify the syntactic properties. In addition, compared with our method using newspapers, their ....

Extracting Paraphrases from Aligned Corpora - Ibrahim (2002): ....which is evaluated according to a scoring function. We envision use of such paraphrases in many natural language applications, however, the primary goal is to produce paraphrases useful in information retrieval. **This thesis approach combines ideas from Dekang Lin [12] and Regina Barzilay [3].** Lin's path representation and basic premise is combined with Barzilay's idea of using aligned monolingual corpora. In this way, the computational complexity and antonymy problems in Lin's approach can be mediated by using a more restricted data set. On the other hand, Lin's path representation .... multiple word paraphrases are beyond the scope of single word synonyms. In the paraphrase ("play", "have fun") we need the phrase instead of a single word to construct the paraphrase (this paraphrase is closer to an inference rule). Multiple word paraphrases can be useful in many ways. **Barzilay [3] uses paraphrases to help eliminate redundancies in sentences constructed by summarizing several other sentences.** Lin [12] extracts paraphrases as a way of extracting inference rules. 2.2.1 Using Aligned Corpora Regina Barzilay published a paper in 2001 [3] which is one of the bases of this ....

Figure 2: Two Citing Sources for [Barzilay and McKeown, 2001]

Citation analysis will additionally provide temporal information about a cited document as stand-off annotation associated with the citing document: if we take the cited documents X and Y to be [Barzilay and McKeown, 2001] and [Lin and Pantel, 2001], respectively, and the citing document Z to be [Ibrahim, 2002], the citation analysis will produce temporal annotations associated with Z in the following format:

<precede_relationship offset=35946 length=12 earlier_work=X later_work=Z/>
<precede_relationship offset=35975 length=8 earlier_work=Y later_work=Z/>

where X, Y, and Z represent unique document identifiers, and offset and length denote character spans of text (i.e., a segment) in Z that refer to each of X and Y. For example, the 12-character segment representing X inside document Z is "Barzilay [3]". This information can be used for producing timelines, as we will see shortly.⁶

⁶ Note that the ordering of the citations and cited works forms a directed acyclic graph and not a linear sequence. This structure reflects two important properties of citation networks: the existence of sub-areas and research topics outside the mainstream, and the fact that a paper cannot (typically) cite a paper in the future. However, in certain cases authors are allowed to edit a paper that has already been published electronically and include anachronistic references or back-references.
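A minimal sketch of how such annotations could be emitted from the citation graph (the input data shapes here are our illustrative assumptions, not the actual Citabs output format):

```python
def precede_annotations(citing_doc_id, citation_spans, pub_year):
    """citation_spans: list of (offset, length, cited_doc_id, cited_year) tuples
    found in the citing document. Emits one stand-off annotation per citation
    whose target was published no later than the citing document."""
    lines = []
    for offset, length, cited_id, cited_year in citation_spans:
        if cited_year <= pub_year:  # citations normally point backward in time
            lines.append(
                f'<precede_relationship offset={offset} length={length} '
                f'earlier_work={cited_id} later_work={citing_doc_id}/>')
    return lines

# e.g., the two annotations above for Z=[Ibrahim, 2002]:
print("\n".join(precede_annotations(
    "Z", [(35946, 12, "X", 2001), (35975, 8, "Y", 2001)], 2002)))
```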

4.2. Retrieval

Within a large digital repository, only a small fraction of articles will be relevant to a user's topic. Of those, only a fraction will be suitable for inclusion in a survey. It is the responsibility of the retrieval module to sift through the collections of interest and identify a set of candidate articles, which will be further narrowed down by the selection module. Three complementary factors enter into consideration: relevance, pertinence, and centrality.

• Relevance: Is the article about the user's topic? Relevance is a continuous property, primarily of the article's content, and has been extensively studied in the information science literature.






• Pertinence: Should the article be included in the survey? Relevant articles may not be suitable for inclusion in a survey for a variety of reasons, one example of which is importance. Users are often only interested in those papers that make substantial intellectual contributions and advance the field in some way.

• Centrality: How similar is the current article to all other articles about the same topic? For example, an article that employs the dominant methodology or technique will be considered "central" to the literature on that topic.

Well-known information retrieval techniques can serve as the basis for assessing relevance [Saracevic, 1975; Harter, 1992; Barry and Schamber, 1998; Mizzaro, 1998]. We propose to employ the Lemur toolkit, a freely available language-modeling framework for IR applications (http://www.lemurproject.org/). Topics can be treated as queries posed to indexes built from various digital repositories. However, off-the-shelf tools will not be sufficient for our purposes. Controlled-vocabulary index terms (e.g., the ACM Computing Classification System) as well as author-supplied keywords provide valuable information that should be integrated with evidence derived from content. In addition, certain sections of an article may be more or less important in determining topicality. For example, the related-work section of a paper might contrast its approach with that of others; such text should be excluded from relevance determination. Similarly, abstracts should be highly regarded in assessing an article's topical content. We propose to build on previous work on argumentative zoning (e.g., [Teufel and Moens, 2000]) and develop approaches that are capable of automatically segmenting articles based on explicit section headings, discourse markers, etc. Evidence from different sources can be integrated seamlessly within the Bayesian Inference Network approach to IR implemented in the Lemur toolkit [Metzler and Croft, 2004].

We view pertinence as a property primarily derived from relationships between articles, as opposed to the article contents themselves. Although authors often claim certain advances and breakthroughs, a more objective assessment of a work's contribution can be derived from citation information, which is reflective of how peer researchers view the work. Pertinence will be derived from the citation graph and associated features generated by the analysis module, and combined with relevance information (e.g., via a simple linear combination) to generate a candidate list of documents to be analyzed in greater detail.

Centrality is independent from both relevance and pertinence, since it attempts to quantify the relationship between a particular article and other articles about the same topic. Individual research communities usually adopt a particular paradigm or norm in their methodology or general approach: articles that fit within established practices will be central. To give a concrete example, most methods for paraphrase induction involve statistical machine learning techniques on large corpora: papers that employ this general approach will be central, while the relatively few papers that discuss human-assisted processes for paraphrase acquisition would be less central. Note that, unlike relevance or pertinence, the user may be interested in central ("what most people are doing") as well as peripheral ("unique approaches") articles. For each paper and citation we will compute several centrality metrics (e.g., citation centrality and lexical centrality, both generically and in a query-based way).

Returning to our running example, the result of retrieval is a full set of all citing sources, including the ones from Figure 2 as well as other citing sources (e.g., [Lin and Pantel, 2001]), along with their relevance, pertinence, and centrality scores:

<relevance value="0.423" pertinence="0.942" centrality="0.526"/>

These are then passed along to the selection module.
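A sketch of how the three factors might be combined into a single ranking (the weights and the linear form are illustrative; the proposal leaves the exact combination to experimentation):

```python
def rank_candidates(scored_docs, w_rel=0.4, w_pert=0.4, w_cent=0.2):
    """scored_docs: dict mapping doc_id -> (relevance, pertinence, centrality),
    each in [0, 1]. Returns doc_ids sorted by a linear combination of the three."""
    def combined(doc_id):
        rel, pert, cent = scored_docs[doc_id]
        return w_rel * rel + w_pert * pert + w_cent * cent
    return sorted(scored_docs, key=combined, reverse=True)

# Using the example scores above for one citing source, plus a hypothetical second:
docs = {"shinyama02": (0.423, 0.942, 0.526), "ibrahim02": (0.610, 0.580, 0.300)}
print(rank_candidates(docs))  # highest combined score first
```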

4.3. Selection

The iOPENER selection module identifies key passages (i.e., segments) in documents selected by the retrieval module. The goal of this module is to pinpoint exact spans of text that state the contributions of an article; the three properties discussed above (relevance, pertinence, centrality) are all applicable at the segment level as well. We propose two general approaches to this challenge, one based solely on article content (which serves as a baseline), and one that incorporates evidence from the citation structure. Both techniques propagate document-level evidence from the retrieval module.

In the same way that lead sentences in news articles serve as good summaries [Dorr et al., 2003], well-written abstracts highlight important facets of the article. However, abstracts are often indicative (suggestive of the contributions) rather than informative (actually describing what the contributions are). Thus, it may be necessary to examine passages in the body of the article. This task can be recast as a passage retrieval problem, which has been explored in the context of information retrieval, e.g., [Salton et al., 1993; Mittendorf and Schäuble, 1994; Clarke et al., 2000]. The abstract can be treated as a "query" to "retrieve" segments of the article. This technique allows the system to identify passages that elaborate on information conveyed in the abstract. One major advantage of this approach is its ease of implementation, which makes for a suitable baseline.
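A minimal sketch of this abstract-as-query baseline (plain cosine over bags of words stands in here for whatever similarity function the Lemur-based implementation would actually use):

```python
import math
from collections import Counter

def top_passages(abstract, passages, k=3):
    """Treat the abstract as a query; return the k body passages most similar to it."""
    def bag(text):
        return Counter(text.lower().split())

    def cos(a, b):
        dot = sum(a[t] * b[t] for t in a if t in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = bag(abstract)
    return sorted(passages, key=lambda p: cos(q, bag(p)), reverse=True)[:k]
```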

Nevertheless, citations provide a better assessment of an article's contribution, and we propose to incorporate this important source of evidence in the selection module. Consider [Barzilay and McKeown, 2001] from our working example. As we saw earlier, Figure 2 shows two citing documents and the passages containing the citations themselves. We can view passages surrounding the citations as highly compressed summaries of the cited article. Although individual citations are narrow in scope and coverage, accumulated support from many citations will yield an accurate characterization of the cited work's contributions.

Citation text can be leveraged as "queries" to identify key passages in the cited article. As an example, the [Shinyama et al., 2002] article refers to one specific aspect of the [Barzilay and McKeown, 2001] article: "parallel translations derived from one original document." Using IR techniques with this citing phrase as a query on the original Barzilay and McKeown article, the selection module would identify the key passage (written by Barzilay and McKeown) shown in Figure 3, where many of the shared keywords are highlighted. Key passages identified through citation analysis may or may not overlap with those identified with the baseline approach. However, both sources provide complementary evidence, representing the authors' own opinions and community consensus.


This paper presents a corpus-based method for **automatic extraction of paraphrases**. We use a large collection of multiple **parallel** English **translations** of novels. This corpus provides many instances of paraphrasing, because **translations preserve** the meaning of the **original** source, but may use different words to convey the meaning. An example of **parallel translations** is shown in Figure 1. It contains two pairs of paraphrases: ("burst into tears", "cried") and ("comfort", "console").

Figure 3: Survey-Like Material in [Barzilay and McKeown, 2001]

If we take X and Z to be the documents defined above, the stand-off annotation, associated with Z, that corresponds to this passage is:

<cited_segment offset="35946" length="12" cited_offset="2172" cited_length="406" cited_source=X score="0.522"/>

This is interpreted as follows: the 12-character span of text starting at position 35946 inside document Z (i.e., the citation to [Barzilay and McKeown, 2001] inside [Ibrahim, 2002]) refers to the 406-character segment starting at position 2172 in the cited source X (i.e., the passage above from [Barzilay and McKeown, 2001]), with a probability of 0.522. Note that a citing document may include more than one stand-off annotation for a cited source, as there may be more than one corresponding passage (i.e., multiple offsets) for the associated citation.

Providing a mechanism for specifying the span of a passage in a cited document also enables the combination of both temporal and citation information in the stand-off annotation for Z:

<precede_relationship offset=35946 length=12 earlier_work=X later_work=Z cited_offset="2172" cited_length="406" cited_source=X/>

Note that detecting the span of the passage (in the cited source) relies on a spelled-out reference in the citing document, e.g., "Building on X's use of aligned monolingual corpora." This information allows the selection module to home in on the relevant passage. If the reference is vague, e.g., "We used techniques from X," then the cited_offset and cited_length may be omitted.

We hypothesize that the integration of citation information will provide great value beyond simple content analysis. Presumably, pertinent articles are also well-cited: thus, we can leverage a type of redundancy effect [Brill et al., 2001; Dumais et al., 2002] to induce a "popularity distribution" over the various facets of a cited work. For example, two contributions of the Barzilay and McKeown work are their use of aligned monolingual corpora and their use of a co-training algorithm to extract paraphrases. Based on how the paper is cited, our system will be able to automatically determine the relative importance of different contributions and thereby control the survey process.
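One simple way to approximate such a popularity distribution (our illustrative sketch, not the proposed system's mechanism: facets are hand-specified keyword sets, and each citing sentence votes for the facet it overlaps most):

```python
from collections import Counter

def facet_popularity(citing_sentences, facets):
    """facets: dict mapping facet name -> set of characteristic keywords.
    Each citing sentence votes for its best-matching facet; the normalized
    counts estimate the relative importance of each contribution."""
    votes = Counter()
    for sent in citing_sentences:
        words = set(sent.lower().split())
        best = max(facets, key=lambda f: len(words & facets[f]))
        if words & facets[best]:  # only count sentences that match something
            votes[best] += 1
    total = sum(votes.values()) or 1
    return {f: votes[f] / total for f in facets}

# Hypothetical facets for the Barzilay and McKeown example:
facets = {"aligned monolingual corpora": {"parallel", "aligned", "translations", "corpora"},
          "co-training extraction": {"co-training", "classifier", "extraction"}}
```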

Discourse cues present in passages surrounding the citations also provide information that is often difficult to extract from the content of the cited document itself. Rhetorical cues such as "therefore," "in contrast," and "as with" highlight relationships among different articles. Meta-commentary such as "first attempt to," "failed to," and "followed up on" provides valuable information often absent in the original article.

4.4. Linguistic Analysis

The iOPENER system includes a linguistic analysis module that applies parsing and trimming, paraphrase induction, named entity detection, and anaphora resolution to provide the information necessary for performing temporal reference resolution, paraphrase detection, contradiction detection, and elimination of redundancy.

4.4.1. Parse Tree (Trimmed) Variants and Scores for Each Segment

The iOPENER approach to producing survey-like text will require parse-tree variants of the source material, produced by first applying a Penn Treebank-style parser (e.g., Collins [1997]) and then applying trimming operations to yield multiple reduced outputs. For example, the third bold-faced sentence of Figure 2 would give rise to the following parse-tree variants, represented in the stand-off annotation (inside the citing document Z), where the first variant corresponds to a parse of the original sentence and the other two variants are trimmed versions of this sentence.

<trimmed_segment offset="35934" length="25" cited_source=X trimmed_variants="
(S1 (S (NP (JJ Barzilay) (NN [3])) (VP (VBZ uses) (NP (NNS paraphrases)) (S (VP (TO to) (VP (VB help) (VP (VB eliminate) (NP (NP (NNS redundancies)) (PP (IN in) (NP (NP (NNS sentences)) (VP (VBN constructed) (PP (IN by) (S (VP (VBG summarizing) (NP (JJ several) (JJ other) (NNS sentences))))))))))))))))
(S1 (S (NP (JJ Barzilay) (NN [3])) (VP (VBZ uses) (NP (NNS paraphrases)) (S (VP (TO to) (VP (VB eliminate) (NP (NP (NNS redundancies)) (PP (IN in) (NP (NP (NNS sentences)) (VP (VBN constructed) (PP (IN by) (S (VP (VBG summarizing) (NP (NNS sentences))))))))))))))
(S1 (S (NP (JJ Barzilay) (NN [3])) (VP (VBZ uses) (NP (NNS paraphrases)) (S (VP (TO to) (VP (VB eliminate) (NP (NP (NNS redundancies)) (PP (IN in) (NP (NP (NNS sentences)))))))))))"/>

During the generation of surveys and historical timelines, each parsed variant is scored with a linear combination of relevance, pertinence, and centrality features. This information augments the stand-off annotation above. For example, the stand-off annotations for the three variants above contain ir_score="0.372", ir_score="0.212", and ir_score="0.135", respectively. These scores are used for ranking parsed variants for inclusion in the summary.
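To make the trimming step concrete, here is a minimal sketch (ours, built on NLTK's Tree type; the deletable-label list is an illustrative stand-in for Trimmer's linguistically informed rules) that generates reduced variants by deleting one constituent at a time:

```python
import copy
from nltk.tree import Tree

def trimmed_variants(parse_str, deletable=("PP", "SBAR", "ADVP")):
    """Generate reduced parse variants by deleting one deletable constituent at a
    time (a toy version of trimming; the real Trimmer rules are far richer)."""
    original = Tree.fromstring(parse_str)
    variants = [original]
    positions = [p for p in original.treepositions()
                 if isinstance(original[p], Tree) and original[p].label() in deletable]
    for pos in positions:
        variant = copy.deepcopy(original)
        del variant[pos]  # excise the constituent at this tree position
        variants.append(variant)
    return variants

for v in trimmed_variants(
        "(S (NP (NNP Barzilay)) (VP (VBZ uses) (NP (NNS paraphrases)) "
        "(PP (IN in) (NP (NN summarization)))))"):
    print(" ".join(v.leaves()))
```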

4.4.2. Paraphrase, Synonymy, Derivational Variations, and Antonymy

We propose to extend the iOPENER stand-off annotation to accommodate the inclusion of three types of equivalence (paraphrase, synonymy, and derivational variations) as well as one form of non-equivalence (antonymy). Paraphrases will be detected using techniques from our work on paraphrase alignment and decoding [Ibrahim et al., 2003; Shen et al., 2006; Madnani et al., 2007b].⁷ Synonymy and antonymy are detected using the publicly available WordNet (henceforth WN) at http://wordnet.princeton.edu/. Derivational variations are detected using techniques from our work on building categorial variations into the CATVAR database [Habash and Dorr, 2003] and our work on detection of multiword morphological derivations [Jacquemin et al., 1997], as described in Section 4.5.1. The stand-off annotation for equivalence and non-equivalence for each of these components is included at the segment level, along with the related parse-tree constituent. For example, the stand-off annotation that relates the segment "automatic extraction of paraphrases" in the cited document X=[Barzilay and McKeown, 2001] to the segment "obtain paraphrases automatically" in the citing document Q=[Shinyama et al., 2002] is:

<equivalence_relation
    citing_offset="5467" citing_length="35" citing_source=Q
    citing_segment_parse="(VP (V obtaining) (NP (NNS paraphrases)) (JJ automatically))"
    cited_offset="2118" cited_length="35" cited_source=X
    cited_segment_parse="(NP (NP (JJ automatic) (NN extraction)) (PP (IN of) (NP (NNS paraphrases))))"/>
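A sketch of how the WN-based checks could be realized with NLTK's WordNet interface (a simplified stand-in: the derivational variants here come from WordNet's derivational links, whereas the system itself uses CATVAR, and paraphrase detection is not shown):

```python
from nltk.corpus import wordnet as wn  # requires the WordNet data package

def synonyms(word):
    """All WN lemma names that share a synset with the word."""
    return {l.name() for s in wn.synsets(word) for l in s.lemmas()}

def antonyms(word):
    return {a.name() for s in wn.synsets(word)
            for l in s.lemmas() for a in l.antonyms()}

def derivational(word):
    """Derivationally related forms, a WN stand-in for CATVAR variants."""
    return {d.name() for s in wn.synsets(word)
            for l in s.lemmas() for d in l.derivationally_related_forms()}

# Per the example above: do 'obtain' and 'extract' share any WN senses?
print(synonyms("obtain") & synonyms("extract"))
```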


4.4.3. Named Entity Detection: Namex and Timex Labeling

Referring expressions such as names, dates, locations, and times will be annotated using standard named-entity detection software, e.g., BBN's IdentiFinder [Bikel et al., 1999] (which is already licensed to both universities). Below is a named-entity annotation for our running paraphrase induction example (from the Ibrahim publication):⁸

<ENAMEX ID="32144" TYPE="PERSON">Regina Barzilay</ENAMEX> published a paper in <TIMEX ID="32063" TYPE="DATE">2001</TIMEX> which is one of the bases of this thesis. <ENAMEX ID="32144" TYPE="PERSON">Barzilay</ENAMEX> develops an automatic paraphrase extraction tool …

4.4.4. Anaphora Resolution

We propose to apply standard anaphora resolution, e.g., LingPipe's publicly available software (http://www.alias-i.com/lingpipe/index.html), to annotate co-referring pronominal references using the same named-entity format described above. For example, two consecutive sentences of the Ibrahim publication would be annotated as follows:

<ENAMEX ID="32145" TYPE="PERSON">Barzilay et al.</ENAMEX> used parallel translations derived from one original document … <ENAMEX ID="32145" TYPE="PERSON">They</ENAMEX> targeted literary works and used word alignment techniques developed for MT.

where the identifier 32145 refers to the same person for both "Barzilay et al." and "They."




⁷ Other paraphrase techniques have been developed by a number of different researchers, including those used in our examples [Barzilay and McKeown, 2001; Ibrahim et al., 2002; Lin and Pantel, 2001; Shinyama et al., 2002], as well as others [Bannard and Callison-Burch, 2005; Barzilay and Lee, 2003; Hatzivassiloglou et al., 1999; Lapata, 2001; Pang et al., 2003; Quirk et al., 2004]. In addition, several researchers have examined the use of paraphrase for applications other than summarization, e.g., question-answering [Duclaye et al., 2003] and machine translation [Callison-Burch et al., 2006]. However, none of these approaches has focused on the use of paraphrase for redundancy removal and contradiction identification for survey creation.

⁸ Timex notation conforms to existing standards for temporal markup to enable reasoning, such as those proposed in [Mani et al., 2005]. (See also http://www.timeml.org/.)

4.5. Presentation: Survey Generation, Timeline Creation, and Visualization

Our presentation module aims to generate summaries at different levels of expertise, using the annotations provided above. Our focus is on producing surveys and historical timelines that take into account redundancy and contradictions. In addition, our annotations also serve as an enabling technology for temporal visualization that will show the changes of arguments and emphasis over time, plus reveal the impact of one event on another. Another important feature of a successful visualization is to reveal gaps or missing aspects of an argument, as was shown in the LifeLines visualization of personal, legal, and medical histories [Plaisant et al., 1996]. Finally, search capability across multiple time lines will enable rapid identification of similar patterns across related data sets. We discuss each of these aspects, in turn, below.

4.5.1. Survey Generation

Previous work on paraphrasing and redundancy in summarization has focused on (1) using redundancy to generate paraphrases [Barzilay and McKeown, 2005] and (2) generating novel paraphrases to describe a particular fact in different ways [Shen et al., 2006].⁹ In iOPENER, we propose a new approach that uses prior knowledge about paraphrases to augment a current parse-and-trim technique, called Trimmer [Dorr et al., 2003], to enable redundancy removal and contradiction identification during the generation of surveys. Our previous sentence-trimming approach generated a summary (from a collection of documents relevant to a user's query) by producing a set of trimmed variants for each sentence in the documents. These variants were ranked using a linear combination of IR-based features. The summary was then produced through a process of adding trimmed sentences, one at a time, to a "working summary" until a user-defined length threshold was reached.

There were several shortcomings of this approach: (1) the existing redundancy-checking mechanism did not have knowledge of paraphrases/synonymy; (2) trimming occurred before redundancy detection, so there was no way to bias the trimming operations toward the elimination of redundant material; (3) there was no mechanism for determining which syntactic constituents could be appropriately deemed "redundant"; (4) there were no "splicing" operations to bring together the non-redundant pieces of two parse trees; and (5) there was no contradiction-detection component.

In iOPENER we propose to bias our trimming operations toward the excision of redundant constituents. We focus on the identification of redundant VPs and PPs, leaving redundant NPs to be resolved via anaphora resolution, and we retain important constituents identified by Timex and Namex expressions. The summary will be produced by iterative pairwise "collapsing" of sentence variants produced by trimming, taking into account redundancy and contradictions during the trimming process. Redundant phrases will be identified (using paraphrase, categorial variations, and synonymy) if they correspond to syntactic constituents that are "deletable," e.g., they cannot be NPs and they cannot span more than one VP or PP. If the portion to be trimmed provides definitional content for a technical term (e.g., a relative clause defining the nature of a named entity), this information is stored away for possible access later, even if temporarily "masked out" for the current presentation, so that it might later be revealed to the end user, depending on whether that user is an expert or a lay person. This "masking" applies to cases where named-entity recognition has identified a token that is not a simple quantity or time (e.g., a substance, product, organization, person, disease, work_of_art). Once deleted (or masked out), the remaining constituents must contain enough non-redundant, relevant material to be considered a viable candidate for the summary.


A pair of sentences, s1 and s2, that exhibit redundancy prior to trimming may be collapsed in one of three ways:

(1) If one of s1 or s2 is trimmed and none of the remaining material is relevant, pertinent, or central, that trimmed candidate must be deleted.

(2) If one of s1 or s2 is trimmed and none of the remaining NPs exhibit any overlap between the two trimmed candidates, that trimmed candidate may be included in the summary.

(3) If one of s1 or s2 is trimmed and there are NP constituents that overlap inside of s1 and s2, the trimmed candidate may be merged with its paired counterpart by conjoining VP and PP material associated with the overlapping NP.

For the merging step, we rely on an efficient (dynamic programming) approach to aligning syntactic phrases [Shen et al., 2006] that allows us to determine how to merge overlapping nodes and join the non-overlapping pieces. The goal is to continue the trimming, deletion, addition, and merging operations, recursively, until the "Working Summary" reaches a threshold (e.g., 500 words).
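The three collapsing cases above might be skeletonized as follows (a sketch over toy data structures; `is_salient`, the NP extractor, and the merge operation are placeholders for the mechanisms described in this section):

```python
def collapse(s1, s2, is_salient, nps, merge):
    """s1, s2: a pair of trimmed sentence candidates (s2 the newly trimmed one).
    is_salient(s): True if s retains relevant/pertinent/central material.
    nps(s): set of normalized NP heads in s.
    merge(a, b, shared): conjoins VP/PP material around the shared NPs."""
    if not is_salient(s2):
        return [s1]                  # case (1): drop the vacuous candidate
    shared = nps(s1) & nps(s2)
    if not shared:
        return [s1, s2]              # case (2): no NP overlap, keep both
    return [merge(s1, s2, shared)]   # case (3): conjoin around the shared NPs
```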




⁹ Related issues on paraphrase have been examined by participants in the "Recognising Textual Entailment" track of the PASCAL challenge ([Dagan et al., 2006]; http://www.pascal-network.org/Challenges/RTE/), but none has used paraphrase for redundancy removal and contradiction detection in the production of surveys of scientific material.

Consider our example (from Figure 2) with the cited document X=[Barzilay and McKeown, 2001] and the citing document Q=[Shinyama et al., 2002]. The cited document contains the segment "This paper presents a corpus-based method for automatic extraction of paraphrases," and the citing document states the following: "There are a few approaches for obtaining paraphrases automatically. Barzilay et al. used parallel translations derived from one original document [1]."
[1]
.

Equating the segments "obtaining paraphrases automatically" and "automatic extraction of paraphrases" is achieved using three mechanisms: (1) extracting CATVAR variants: obtain/obtaining, extract/extracting, automatic/automatically; (2) determining that obtain and extract are WN synonyms (via the sense of acquire); and (3) inducing a paraphrastic relation between obtain and extract using our paraphrase decoder [Madnani et al., 2007b]. Taken together, a survey article might include the following single sentence built from these two segments: "Barzilay et al. 2001 used parallel translations derived from one original document for automatic extraction of paraphrases." A paragraph surveying the cited articles above might look like the following:
To accomplish the task of paraphrase induction, Barzilay et al. 2001 used parallel translations derived from one original document for automatic extraction of paraphrases. The approach of Lin and Pantel 2001 borrows Barzilay's idea of using aligned monolingual corpora, but Lin's path representation is more sophisticated and allows for extraction of longer paraphrases. In addition, Barzilay uses a large collection of multiple parallel English translations whereas Lin uses a more restricted data set.

The above example is an illustration of the use of equivalence in survey creation. With the addition of antonymy as an indicator of non-equivalence, iOPENER will also include identification of six different types of contradictions:

• Simple negation: use of not or didn't in front of the original word(s).

• Negated synonymy: use of not or didn't in front of WN synonym(s) of the original word(s).

• Simple antonymy: use of WN antonym(s) of the original word(s).

• Negated paraphrase: use of not or didn't in front of a paraphrase of the original phrase.

• Negated synonymous paraphrase: use of not or didn't in front of WN synonym(s) of a paraphrase of the original phrase.

• Antonymous paraphrase: use of WN antonym(s) of a paraphrase of the original phrase.


A pair of sentences, s1 and s2, that exhibit one of these contradiction types may be handled in two different ways:

(1) If s1 and s2 are statements made across two different temporal contexts, we would put them into historical context, e.g., "On/in/at [date/day/time/year], X was claimed to be true, and on/in/at [date/day/time/year], X was refuted."

(2) If s1 and s2 are coming from two different viewpoints, we would state them as opposing views, e.g., "The status/impact of X is unclear: according to Y, X (happens) and, according to Z, X doesn't (happen)."

As an example of contradiction identification, consider the topic of "word sense disambiguation in information retrieval." Sanderson's 1994 statement cites [Voorhees, 1993]: "Voorhees built a sense disambiguator based on the WordNet thesaurus [18] … Unfortunately retrieval experiments run on these disambiguated collections resulted in a *drop* in IR performance." This is refuted by [Schütze and Pedersen, 1995]: "Results show that this sense disambiguation algorithm *improves* performance by between 7% and 14% on average." The (bold-faced) antonyms are indicative of a contradiction that might be summarized as: "The impact of sense disambiguation on IR is unclear; according to Voorhees 1993 (as reported by Sanderson 1994), sense disambiguation resulted in a drop in performance but, according to Schütze and Pedersen 1995, sense disambiguation improves performance."
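Below is a minimal sketch of how the first and third contradiction types (simple negation and simple antonymy) and the opposing-views template might be operationalized, assuming NLTK's WordNet and sentence pairs pre-aligned on a shared claim word; the actual iOPENER detectors would build on the paraphrase machinery described above.

```python
# Sketch of two of the six contradiction tests plus the "opposing views"
# template. Sentence pairs are assumed to be pre-aligned on a claim word.
from nltk.corpus import wordnet as wn

NEGATORS = {"not", "didn't"}

def wn_antonyms(word):
    """All WordNet antonym lemma names for any sense of `word`."""
    return {ant.name() for synset in wn.synsets(word)
            for lemma in synset.lemmas() for ant in lemma.antonyms()}

def simple_negation(tokens1, tokens2, claim_word):
    """Contradiction if exactly one sentence negates the shared claim word."""
    def negated(tokens):
        i = tokens.index(claim_word) if claim_word in tokens else -1
        return i > 0 and tokens[i - 1] in NEGATORS
    return negated(tokens1) != negated(tokens2)

def simple_antonymy(word1, word2):
    """Contradiction if the aligned words are WordNet antonyms."""
    return word2 in wn_antonyms(word1)

def opposing_views(topic, src1, claim1, src2, claim2):
    """Fill the 'opposing views' template from the text above."""
    return (f"The impact of {topic} is unclear; according to {src1}, "
            f"{claim1} but, according to {src2}, {claim2}.")

# improve/worsen are antonyms in standard WordNet (via the lemma pair
# better/worsen), so this is expected to print True.
print(simple_antonymy("improve", "worsen"))
```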

4.5.2. Timeline Generation

Note that, in addition to presenting different points of view, this notion of tracking changes over time is one important step toward the creation of timelines (e.g., to be used for the creation of historical chapter notes by textbook authors). Specifically, timeline construction relies both on the actual publication date of an article and on the temporal referring expressions used by other authors to cite the publication. Both types of information are encoded in relationships identified through the citation analysis. An example of the use of this information is a timeline associated with our sense disambiguation topic above:



• 1993: Voorhees built a WordNet-based word sense disambiguator and demonstrated a drop in IR performance.
• 1995: Schütze and Pedersen demonstrated that their sense disambiguation algorithm improves IR performance by between 7% and 14% on average.


This information is useful for presentation of a "thumbnail sketch" for rapid learning of a topic. It also provides the basis for visualization of key historical facts, where zoom-in/zoom-out capabilities would allow drill-down to the corresponding text from the survey creation module.
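The sketch below shows timeline assembly under the assumption that the citation analysis yields (year, source, statement) records; the record format is hypothetical, but the two entries are the findings from the example above.

```python
# Sketch of timeline assembly from citation-analysis records. Each record
# pairs a publication date with a summarized finding drawn from the cited
# and citing text (a hypothetical intermediate representation).
from dataclasses import dataclass

@dataclass
class Finding:
    year: int        # actual publication date of the cited article
    source: str      # cited authors
    statement: str   # summarized finding

def build_timeline(findings):
    """Order findings chronologically and render timeline entries."""
    return [f"{f.year}: {f.source} {f.statement}"
            for f in sorted(findings, key=lambda f: f.year)]

findings = [
    Finding(1995, "Schütze and Pedersen", "demonstrated that their sense "
            "disambiguation algorithm improves IR performance by 7% to 14%."),
    Finding(1993, "Voorhees", "built a WordNet-based word sense disambiguator "
            "and demonstrated a drop in IR performance."),
]
for entry in build_timeline(findings):
    print(entry)  # prints the 1993 entry, then the 1995 entry
```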



4.5.3. Visualization

A well-designed visualization tool allows users to understand patterns, identify clusters, and see relationships. Further benefits accrue in that users are more effective in discovering anomalies, finding outliers, and spotting gaps. These payoffs apply to individual items, as well as to user-selected subsets and computed aggregates. We will extend a current visualization tool [Plaisant, 1998] to allow users to view the essence of the different language-related notions above, e.g., equivalences (overlap between two people's work), non-equivalences (contradictions), and timelined/historical information. For example, conflicting findings such as the two results presented in the timeline example above might be associated with a color-coded link that indicates contradictory segments.

Building on earlier work on discourse and argument visualization such as used in discussion list thread visualization
[Kerr, 2003], we will enrich these ideas with appropriate coding mechanisms (color, size, glyphs) and enable user
controlle
d presentation of textual labels appropriate to the users task. The aim would be to highlight key contradictory
statements made about controversial topics (e.g., whether word sense disambiguation helps information retrieval) and
to present groups of relate
d work in the form of zoom
-
in/zoom
-
out clusters of information. The central tenet of our
information visualization approach is to allow the user to control complexity. Enabling users to selectively display
only the items of current interest they can rapidl
y understand the relevant and pertinent aspects of complex data sets.

5. Evaluation Plan

To evaluate the effectiveness of our approach, we will develop a set of methodologies, some of which build on techniques from the natural language processing community and some of which are tailored specifically to the iOPENER task. For each of these methodologies, our aim is to validate our two hypotheses:

H1: By integrating both technologies (bibliometric lexical link mining [Elkiss et al., 2007] and summarization techniques [Radev et al., 2004; Zajic et al., 2006]), we achieve better results than we would with either approach alone, as measured by automatic distance metrics, human tests of adequacy, and task-based evaluations.

H2: By integrating textual surveys into visualization environments, we can capture the more subtle aspects of topic-focused surveys that a user would not otherwise find (e.g., contradictions and paraphrase), as measured by task-based utility studies [Shneiderman and Plaisant, 2006; Klavans and Boyack, 2006].


Regarding H1, we will use five different measures of success: (1) Automatic distance comparisons among a baseline output, iOPENER output, and expert-edited output; (2) Tests of adequacy of human understanding of iOPENER output; (3) Task-based evaluations of iOPENER output based on human experiments of sentence ordering and other extrinsic evaluations; (4) Evaluation of iOPENER output against primary sources associated with pre-existing surveys, using automatic distance comparisons; (5) Evaluation of the ability of iOPENER to survey technical material at a level that a lay person can understand. Regarding H2, we will conduct user studies to measure the impact of visualization of the more subtle language-oriented relationships identified by iOPENER, such as contradiction and paraphrase, that are not captured in typical visualization environments that utilize keyword displays [Börner, 2006; Shiffrin and Börner, 2004]. All five PIs have a solid track record of obtaining Institutional Review Board approval for conducting user studies.

In order to set up a controlled evaluation, we will select two focused user groups for development and evaluation. The rationale for starting with computational linguistics was to have access to a core group for user testing. This group consists of our graduate students and our colleagues in related areas of computer science and information studies, who may be knowledgeable about their own areas of expertise but need additional information in computational linguistics. For the less informed user group in computational linguistics, we have access to many undergraduates, many of whom have expertise in other areas but are returning for an additional degree, and most of whom are young learners. This variety in age, level of education, and exposure will provide a rich user group for developing iOPENER. Since all the PIs are involved in interdisciplinary research, we have a natural set of colleagues and students in this category. All of us have handed our introductory texts to incoming experts in areas such as databases, art history, statistics, or medicine.

Our larger goal is to be able to apply these techniques very widely. As such, we will seek a community-based group at the local or state government level and a domain of interest for that group. That is, once we have tested our techniques with the initial core group, we will identify and select a second group based on the results in Years One and Two. Returning to the example of land use and development (one which is of urgent importance in digital government), we will use our existing digital government experience to identify a community of users who will be able to work with us on use and evaluation. Most importantly, these groups must satisfy the criterion of having complaints and gaps in how they access necessary information outside of their specific domain, and how they integrate this information into their own decision-making. Potential topics for these groups could include zoning, waste management, communications, educational and medical resources, emergency planning and response, etc. Each of the PIs brings experience in working with at least one of these units in local or state government, and has simulated the functionality of iOPENER in obtaining survey-type information on a topic for community-based action. This experience will be used to identify and select the digital government user sample.

5.1. Intrinsic and Extrinsic Formative and Summative Evaluation of iOPENER

We propose two types of evaluations in two stages.

The first stage of evaluation will occur at the start of the project and will set a baseline standard for the iOPENER system (for testing H1). Our goal in these evaluations will be to see if an existing multi-document summarization system can be used over a wide range of more loosely connected pre-selected texts relevant to a given topic to identify and select key information. This baseline will not contain the range of linguistic information described in Section 4.4. These formative evaluations will thus be intrinsic, i.e., evaluating in a standalone mode the quality (akin to precision) and completeness (akin to recall) of the output. Once a baseline is established, such evaluations will be scheduled every four to six months while iOPENER undergoes iterative rounds of development and improvement.

A complete verification of H1 will require that, in addition to producing iOPENER output, we solicit human editing of the iOPENER output to see if the number of edits is reduced on successive iterations. To determine this, we will apply automatic measures to determine how far each of the outputs is from the human-edited output, using standard automatic measures of distance used in the summarization community, e.g., ROUGE [Lin, 2004], POURPRE [Lin and Demner-Fushman, 2005], and TER [Snover et al., 2006].

The second formative evaluation (for testing H1) will also occur at the beginning of the project. Our goal will be to determine how an existing multi-document summarization system might help a user gain an understanding of a new field. "Understanding" is typically elusive to measure. We will develop a battery of experimental conditions, including the generation of a paragraph demonstrating information integration and the ability to answer a set of questions after reading the output of the multi-document summarizer. Framing these questions as multiple-choice will allow structured measurement and comparison of results.

These two formative evaluations will establish starting-point baselines upon which to build progress as iOPENER is developed. As seen in the project timeline, we will be adding capability according to the following schedule:

1. Condition One: Test basic system with first user group
2. Condition Two: Test basic system with visualization
3. Condition Three: Test basic system with linguistic information
4. Condition Four: Test basic system with linguistic information and visualization

Under each of these major conditions will be sub-conditions, e.g., different types of visualizations, linguistic information, and combinations. However, careful combinatoric control is required for experimental reliability computation. Results of the tests will be used to design system improvements.

The next set of evaluations (for testing H1) is summative. We will conduct a number of extrinsic evaluations to determine the effectiveness of summaries, measured in a number of different ways, e.g., human experiments on ordering of sentences [Madnani et al., 2007a] and standard extrinsic evaluation techniques, e.g., application of both LDC Agreement and Relevance Prediction measures to the output of multi-document summarization [President et al., 2007]. One challenge in evaluating iOPENER is that each human subject can only be tested one time with a given topic, since it is not possible to "unlearn" information on a given topic. Thus, it will be necessary to run many subjects on different topics in order to obtain enough results for statistical significance.

In the first year of the project, we will develop a test corpus that we will use throughout the project. We will also employ annotators to create ground-truth surveys. A selected set of core examples will serve as ground truth, marked up by human annotators. This set will be used throughout the project, with additional annotations added as informed by our progress. Human annotations will establish a ground truth; as we move from the CL domain to digital government, we will need another set of human annotations to establish ground truth over the new domain. When possible, we will select corpora from public-domain resources, so that our annotations and experimental enhancements can then be donated back to the LDC as part of the ongoing construction of language resources.



5.2. Evaluation of iOPENER Output using Back-mapped Primary Sources from a Pre-Existing Survey

As part of developing new and creative methodologies for evaluation, we propose to explore the following additional technique for further verification of H1 beyond the standard approach. Under the usual methodology, a gold standard is established, and system performance is measured against that standard, be it intrinsic or extrinsic. In addition to this, we also propose to identify three key survey articles in the two areas we have chosen. Next, we obtain digital copies of the primary sources cited in the bibliographies of these articles. We can then use these sources to:

1. Manually trace a citation back to all citing source articles
2. Run the iOPENER summarizer over these articles
3. Use the survey as the "gold standard" and compare the number of matching facts (a sketch of this comparison follows)

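Below is a minimal sketch of step 3, under the assumption that an upstream component has already reduced both the gold survey and the iOPENER output to normalized fact strings; the fact representation is hypothetical.

```python
# Sketch of step 3: compare facts extracted from the pre-existing ("gold")
# survey against facts in the iOPENER-generated survey. Fact extraction
# itself is out of scope here; facts are assumed to be normalized strings
# produced by an upstream component (a hypothetical representation).

def matching_facts(gold_facts, system_facts):
    """Count and list gold-survey facts recovered by the generated survey."""
    matched = set(gold_facts) & set(system_facts)
    return len(matched), sorted(matched)

gold = {"barzilay-2001 extracts paraphrases from parallel translations",
        "lin-2001 extracts paraphrases from aligned monolingual corpora"}
system = {"barzilay-2001 extracts paraphrases from parallel translations"}
count, matched = matching_facts(gold, system)
print(f"{count}/{len(gold)} gold facts recovered")  # 1/2
```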



An example of this type of language is shown in Figure 4, from the survey article on the "Thesaurus" authored by Karen Sparck Jones in the Encyclopedia of Artificial Intelligence, 2nd edition, S. Shapiro, ed., Wiley, 1992:

"…. The thesaurus terms, viewed as descriptive labels, thus had an essential normalizing purpose, namely to ensure conceptual matching regardless of the surface variety or ambiguity of natural language (Lancaster, 1972)….. It was soon found, however, that the view of thesaurus terms as a band of brothers on an equal footing was too simple, especially for a large document collection. It was necessary to allow for hierarchical relations between terms… A modern thesaurus (Foskett, 1980) will, therefore, consist of a term vocabulary supported by a prescriptive apparatus and amplified by a relational one." (p. 1607)

Figure 4: Example Illustrating Citation Tracing for Compare and Contrast with Temporal Information

Note that functions such as "compare" and "contrast" will not be counted as part of the evaluation at the start (e.g., early approaches were single-term oriented whereas later ones were hierarchical). As the project progresses, and as annotation occurs, we will incorporate this kind of logic into this back-mapping evaluation approach.

We will start this process with relevant articles from the recently published Oxford Encyclopedia of Artificial Intelligence. The section editor for computational linguistics was one of the co-PIs, Judith Klavans. Klavans also has a relationship with OUP as part of the Mellon-funded Computational Linguistics for Metadata Building (CLiMB) project (www.umiacs.umd.edu/~climb). Thus, electronic copies of selected encyclopedia entries can be obtained and used to collect cited full-text articles. In the area of computational linguistics and computer science, most references are available in electronic form. Other survey articles that could be used include introductory text articles.
We have established relationships with Citeseer and PRESRI (see enclosed letters of endorsement), which will permit us to obtain abstracts as a proxy for full text in order to rapidly ramp up the development of the new bidirectional evaluation metric. For the second test area in digital government, we will use publicly available information as well as pre-screened government information from resources such as the American National Corpus (ANC) (http://www.americannationalcorpus.org), which provides added metadata.

5.3. iOPENER and Visualization

The University of Maryland has an excellent Human-Computer Interaction research program, with specialization in visualization and information access [Kules et al., 2006; Shneiderman, 1996; 2002]. As part of the formative evaluation for this research, the language and information retrieval groups will coordinate with this group to explore potential visualization tools for the iOPENER output and also to apply evaluation techniques that are well-established in the visualization community [Shneiderman and Plaisant, 2006]. Evaluation of our results will involve testing the ability of humans to retain information using visualization approaches versus textual presentations. We will compare different approaches to presenting surveyed information and evaluate the ability of the human to answer questions about the domain. Ultimately, we would want to produce input to surveys that are of a high enough quality for review and publication in a scientific journal relevant to each of the domains we are studying.

Our work on iOPENER will be guided by a progressively more rigorous set of empirical studies with users conducting realistic tasks. We will begin with informal usability studies during the first year. Four to eight graduate student users will be asked to use our prototype and answer a set of questions about our test collection of data. Logs of their usage, our observations of their activity, and their responses to structured questions will guide our revision of the design. Then, in the second year, we will conduct 4-8 case studies with graduate students who bring their own tasks to the project on a dataset of relevance to them. These evaluations will not only shape our prototype but enable us to refine our evaluation methods. In the third year, we will conduct 4-8 in-depth case studies with professional users who bring their own tasks on a database of relevance to them. Their comments will help us refine the system still further. At the same time, we will be developing a cognitive theory of discovery in textual databases. We will build on recent work on exploratory search and visualization of search results.

6. Deliverables, Rights, and Permissions

The deliverables of the iOPENER Project include: (1) a dataset of survey articles, based on the publicly available paper databases that we will process; (2) software for temporal ordering of research contributions within a field; (3) software for survey generation; (4) a user-accessible system; and (5) new methodologies for evaluating such technologies.

We have already begun the process of collecting different resources for our experiments, e.g.: the ACM Digital Library (http://portal.acm.org/dl.cfm; copyright policy: http://www.acm.org/pubs/copyright_policy/), which contains publications in many sub-disciplines within computer science; the Encyclopedia of Artificial Intelligence; the PRESRI Database [Nanba et al., 2004] (http://www.presri.com/), see enclosed letter of support from Hidetsugu Nanba at Hiroshima City University;
the source text for the new edition of Speech and Language Processing [Jurafsky and Martin, 2007], see enclosed letter of support from Dan Jurafsky at Stanford University; the ACL Anthology (http://acl.ldc.upenn.edu/), which contains all papers published in computational linguistics under the auspices of the Association for Computational Linguistics; Citeseer (http://citeseer.ist.psu.edu), see enclosed letter of support from Lee Giles at Penn State University; the Scholarly Database of papers/patents/grants, see enclosed letter of support from Katy Börner at Indiana University; the Citabs collection [Elkiss et al., 2007], developed as part of co-PI Radev's research; and collections of papers on Social Network Analysis, Language Evolution, and Bayesian Nets in the social sciences. Together, these include a total of more than 1 million full-text papers and 17 million paper abstracts. We will further investigate the acquisition of similar paper collections in several areas of the social sciences.

We are well aware that we must address the issue of rights and permissions in order to disseminate the results of our research. As Director of a digital library research center, Dr. Klavans has extensive experience both in negotiating and addressing rights and permissions when using published and copyrighted material. Initial contacts (see, e.g., enclosed letters of endorsement) indicate that we will be able to conduct research using these resources.

7. Results of Prior NSF Support

Dr. Dorr was awarded both the NSF NYI (Award #IRI-9357731, 1993-1997) and the NSF PFF (Award #IRI-9629108, 1997-1999), where she investigated large-scale interlingual machine translation, tutoring, and information filtering. More details (including related publications) can be found at: http://www.umiacs.umd.edu/~bonnie/PFF-PECASE.pdf. Dr. Dorr was also a co-principal investigator on an NSF-funded grant entitled Collaborative Research: Interlingual Annotation of Multilingual Text Corpora (for the University of Maryland portion: Award #IIS-0324942; amount $268,750; period of support 9/01/2003-08/31/2004, plus no-cost extension to 08/31/2005). More details (including related publications) can be found at: http://aitc.aitcnet.org/nsf/iamtc/

Dr. Klavans has been a PI or co-PI on many NSF awards since she joined Columbia University in 1992. Her primary research focus has been on text data mining, creation and use of lexical resources, and summarization across the domains of digital government, news, and medicine. Major awards include: (1) "STIMULATE: Generating Coherent Summaries of On-Line Documents: Combining Statistical and Symbolic Techniques"; (2) "A Patient-Care Digital Library - PERSIVAL: Personalized Retrieval and Summarization for Medical Digital Libraries", Columbia University, Department of Computer Science; (3) "Automatic Identification of Significant Topics in Domain Independent Full Text Analysis", Columbia University, Center for Research on Information Access; (4) "Digital Government Research Center: Energy Data Consortium: New Technologies for Access to Heterogeneous Statistical Databases"; (5) "Digital Government Research Center: Bringing Complex Data to Users".

Dr. Lin has recently received an NSF grant entitled "DHB: A Computational Approach to Understanding the Dynamics of the Judicial System" (BCS-0624067) with collaborators in the Government and Politics Department at the University of Maryland (9/2006-9/2009). The aim of this project is to apply automated content analysis techniques to the study of legal texts, in particular, briefs and opinions of the U.S. Supreme Court.

Dr. Radev has received many grants from NSF and NIH. The NSF grants include "ITR: Information Fusion Across Multiple Text Sources: A Common Theory" (IIS-0082884, 2000-2003), "Probabilistic and Link-based Methods for Exploiting Very Large Textual Repositories" (IDM-0329043, 2003-2006), "Collaborative Research: Semantic Entity and Relation Extraction from Web-Scale Text Document Collections" (HLC-032904, 2003-2006), "DHB: The Dynamics of Political Representation and Political Rhetoric" (BCS-0527513, 2005-2008), and "Collaborative Research: BlogoCenter - Infrastructure for Collecting, Mining and Accessing Blogs" (IIS-0534323, 2006-2009). A number of shared infrastructure artifacts such as MEAD (a general framework for text summarization), NSIR (a publicly available question answering system), NewsInEssence (a news aggregation and summarization system), and clairlib (an open source framework for natural language processing and information retrieval) have been generated under these grants, in addition to more than 20 papers in top journals and conferences. Dr. Radev's recent work on the BCS-0527513 grant was recognized by the Gosnell Prize for the best paper in political methodology.

Dr. Shneiderman has received an NSF Digital Government grant entitled "Integration of Data and Interfaces to Enhance Human Understanding of Government Statistics: Toward a Statistical Knowledge Network" ($492,000, EIA-0129978), with a Supplement for AudioMap ($50,000) (7/1/2002-6/30/2006). This project involved the development of techniques for improving public access to government statistical data; specifically, it extended the open source model to the notion of citizen access and participation in statistical data presentation, annotation, and analysis (jointly with partners at the Univ. of North Carolina). Funding from a major NSF grant in collaboration with colleagues in the Dept. of Sociology has additionally supported a student to work on web navigation issues. The project web site includes published papers, technical reports, presentations, software, etc. (http://www.ils.unc.edu/govstat).




Coordination Plan

As a mid-size NSF project with PIs who have a history of working together, we envision a smooth transition to a new project team. However, the new challenges of the iOPENER project suggest a set of activities, each of which will be led by a different team member. Details of our coordination plan are given below.

Roles of Investigators

All PIs will be directly involved in the operation of the project. The Project Director will be Dr. Bonnie Dorr, a faculty member in the Computer Science Department at the University of Maryland. Drs. Judith Klavans, Jimmy Lin, Dragomir Radev, and Ben Shneiderman are key personnel who will be instrumental in achieving the project goals and objectives. Details of specific roles are given below.


Dr. Dorr will provide leadership and management of the project, monitor project progress, and complete annual reports. She will also lead the survey creation and timeline generation team. Dorr's results with the Trimmer algorithm achieved first place (out of 40 summarization systems) in the DUC-2004 evaluation; since then, Dorr has extended the approach to the problem of multi-document summarization, which is a natural fit for this project.


Dr. Klavans will lead two aspects of the project. She will be in charge of designing and running the evaluations, along with Shneiderman, a leader in the field of user interactions with computational systems. Having successfully led the NewsBlaster multi-document summarization effort at Columbia University, Klavans will also co-lead the effort on survey creation and timeline generation with Dorr and Radev.


Dr. Lin will be responsible for developing the retrieval and selection modules. This involves coordinating with Dr. Radev regarding the output of the citation analysis and with Dr. Dorr and Dr. Klavans regarding expectations of the linguistic analysis components.


Dr. Radev will be in charge of the data acquisition and analysis tasks (collection acquisition, metadata extraction, summarization, lexical centrality analysis, bibliometrics, and web interfaces) and will collaborate with the rest of the team on the remaining parts of the system.


Dr. Ben Shneiderman will provide guidance during the early requirements analysis to make explicit the range of user goals, thereby providing the intellectual framework to guide the phased implementation from simple to more complex analysis tasks. The user requirements will guide early designs of the user interface and visualization components, so that prompt prototype studies can be conducted. Guided by these usability studies, we will shape subsequent versions and refine our empirical testing. Dr. Shneiderman will guide the work from more controlled usability studies in which we provide the tasks, to more realistic open-ended studies in which users provide the tasks. Observational studies, interviews, and logging will provide evidence to refine our design and validate its effectiveness. Throughout the work, Dr. Shneiderman will develop theoretical frameworks of information seeking, exploratory search strategies, and sense making in unfamiliar research domains. These models will have cognitive components with predictive power for measurable user performance.


Project Management across Institutions

The faculty at the University of Maryland Department of Computer Science and College of Information Studies and the University of Michigan Department of Information Science will work together across institutions to meet project goals and facilitate the success of this project. Table 1 presents the management plan that will be followed in order to attain our objectives in a timely manner. Both universities are leaders in natural language processing, information retrieval, bibliometrics, and human-computer interaction, and both have specific expertise in summarization, retrieval and question-answering techniques, and conducting user studies.


Table 1: Overall Project Coordination Timetable

Year  Activity                                                        Person(s) responsible
1     Adaptation of Trimmer to incorporate stand-off annotation       Dorr
1     Implementation of enhanced redundancy detection                 Dorr
1     Develop baseline formative intrinsic evaluation                 Klavans
1     Develop baseline formative extrinsic evaluation                 Klavans
1     Formulate annotation criteria                                   Klavans
1     Obtain rights to all corpora                                    Klavans
1     Acquisition of corpora                                          Radev
1     Development of software for citation passage retrieval          Radev
1     Development of software for building citation chains            Radev
1     Development of baselines for Retrieval                          Lin
1     Development of baselines for Selection                          Lin
1     Requirements analysis for simple tasks                          Shneiderman
1     Integration of simple surveys into visualization tool           Shneiderman
1     Initial case study with graduate students                       Shneiderman
2     Implementation of contradiction detection                       Dorr
2     Implementation of timelining                                    Dorr
2     Run initial evaluations on contradiction and timelining         Klavans
2     Identify user group for digital government application          Klavans
2     Graph analysis of citation networks                             Radev
2     Building temporal lattices for given research topics            Radev
2     Integration of features from citation analysis into Retrieval   Lin
2     Integration of features from citation analysis into Selection   Lin
2     Requirements analysis for more complex tasks                    Shneiderman
2     Case studies with graduate students                             Shneiderman
3     Implementation of merging operation for survey creation         Dorr
3     Incorporate feedback from users                                 Dorr
3     Provide input to system development from user studies           Klavans
3     Run studies with digital government group                       Klavans
3     Integrating the parts of the system into a shareable library    Radev
3     Evaluation of the summaries produced                            Radev
3     Optimization of Retrieval based on output and user feedback     Lin
3     Optimization of Selection based on output and user feedback     Lin
3     Case studies for professional users                             Shneiderman



Coordination Mechanisms

The PIs have a proven history of working together, as evidenced by co-authoring articles, being co-PIs on grants and contracts, being members of common centers, and attending similar meetings. This commonality of purpose and place will enable frequent and smooth interaction and coordination. The following chart expresses a selected set of these overlaps:



                                                   Dorr  Klavans  Lin  Radev  Shneiderman
UMD Institute for Advanced Computer Studies        x     x        x
UMD Computer Science                               x              x            x
UMD College of Library and Information Sciences          x        x
UMD Computational Linguistics and Information
  Processing Laboratory                            x     x        x
UMD Human-Computer Interaction Laboratory                x        x            x
Association for Computational Linguistics          x     x        x    x
DARPA Global Autonomous Language Exploitation      x     x        x    x



As part of the
iOPENER project, o
ur
team will have bi
-
monthly teleconferences and also meet at major conferences,
where our work will be presented. Calls will use VOIP, and thus there is no budget line for these calls. We will
arrange for an in person meeting between a
ll team members twice a year, to coincide with major conferences in order
to limit travel time and expense. Domestic and foreign travel is budgeted for this.


We will make use of a password
-
protected collaborative wiki for posting our results and distribut
ing software among
our team members for joint use. Shared software will include the University of Maryland’s Trimmer algorithm
(
http://www.umiacs.umd.edu/~dmzajic/software/Su
mmarization.tar.gz
), the University of Michigan’s MEAD
summarization algorithm (
http://www.summarization.com/mead
), the University of Michigan’s Clairlib architecture
for NLP and IR tasks (
http://tangra.si.umich.edu/clair/clairlib
), the University of Michigan’s citabs citation network
analysis software

[Elkiss et al. 2007]
. In addition, a new ACL initiative was recently launched by one of our team

members (Radev), the Computational Linguistics Wiki (
www.aclweb.org/aclwiki
) which we can use as a starting
point for discussion about the proposed iOpener shared resource.


Our students will work together as part of an exchange program over the summers. Maryland students will work at the Michigan campus during one of the two summers, and then the exchange will be reversed in the subsequent year. This placement enables a closer working relationship among the PIs; we will have the option of holding working meetings in the summer that include the entire team of students and faculty. In Years One and Two, we will include six invited guests who can contribute to the research planning and evaluation. In Years Two and Three, we will include representatives of our selected digital government user groups, so we can share customer needs as well as system design and development progress and plans. At each meeting, we will prepare presentations of current research which will be posted on the public website so the larger community can track our progress.


We will also coordinate efforts across our team members in the area of visualization. We will participate in the exploration of bibliographic information as part of an ongoing "Visualization Contest" run by members of the UMD HCIL laboratory [Fekete et al., 2004]. Past participants of this contest were required to use visualization tools to perform a number of language-oriented tasks by hand. These manual tasks would be facilitated by the automatic capabilities we propose in iOPENER. Specifically, users were asked to: (1) create a static overview of 10 years of the InfoVis conference; (2) characterize research areas and their evolution; (3) specify how a particular author/researcher fits within certain research areas; and (4) determine the relationships between two or more researchers. This competition, which was chaired by members of the HCI Laboratory at the University of Maryland in 2004, has made the InfoVis data set of 614 papers (published between 1974 and 2004) available to the iOPENER project. We will use this data, alongside the human-generated output of the four tasks above (for comparison with our output), to evaluate the effectiveness of iOPENER. Moreover, through the InfoVis contest, HCIL has provided the infrastructure necessary for integration of our language technology into a visualization environment.


Lin is in charge of the seminar series at the UMD CLIP lab, where Dorr is a co-director. As such, he will invite two to three speakers specifically relevant to the iOPENER project each semester. These talks will be advertised to the entire UMIACS (UM Institute for Advanced Computer Studies), HCIL (Human-Computer Interaction Laboratory), CS, CLIP (Computational Linguistics and Information Processing), and CLIS (College of Information Studies) communities. Often HCIL and CLIP co-sponsor guest lecturers, which creates an added opportunity for coordination and enrichment. Since the iOPENER project leverages expertise within each of these University entities, we will reach out to our related communities through these seminars. At the same time, we will make the link between these seminars and iOPENER a topic for discussion. This will serve the purpose of in-group communication as well as informing others of our progress. It will also serve as a recruiting mechanism for students who might want to propose special projects.


We will also create an iOPENER website when the project is underway, where the public can read about progress in building our survey-creation and timeline-generation software and producing surveys that may be distributed publicly. Our team will also make publicly available all of our survey-producing software. Our research results will be presented at conferences such as the ACL, SIGIR, and ASIS, as well as to broader audiences, such as analysts who prepare briefings for government policy-makers.


Coordination and iOPENER Budget

Deleted for privacy purposes.