
ENABLE Knowledge Base Introduction

Laboratory of Applied Informatics Research

Indiana University



1. ENABLE Knowledge Base Architecture

The current version of the ENABLE Knowledge Base consists of two major components: bioinformatics education resource collection and bioinformatics education resource management. The resource collection component automatically collects and refines information about online bioinformatics education resources. Based on the collected resources, the resource management component provides learning services to users through a query interface that incorporates visualization techniques.

Online bioinformatics education resources are collected in the form of HTML documents. In the resource collection component, a bioinformatics resource data processor automatically parses the HTML documents, extracting attributes such as the title, description, and URL of each document. It also sends SOAP requests to LUCAS, a concept extraction web service, to extract keywords for each document. The refined resource data are then stored in the database for later use.
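
The attribute extraction step described above can be illustrated with a minimal Java sketch. It is not the ENABLE data processor: the class name `ResourceRecord`, the regular expressions, and the choice of the `description` meta tag as the source of the description are all assumptions made for illustration. The SOAP call to LUCAS is left out because its interface is not described here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical record type (not part of ENABLE) holding the attributes
// the text says are extracted from each HTML document.
final class ResourceRecord {
    // Matches the contents of the <title> element, across line breaks.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Assumption: the description lives in a meta tag written with
    // double quotes and with the name attribute before content.
    private static final Pattern DESCRIPTION = Pattern.compile(
            "<meta\\s+name=\"description\"\\s+content=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    final String url;
    final String title;
    final String description;

    ResourceRecord(String url, String title, String description) {
        this.url = url;
        this.title = title;
        this.description = description;
    }

    // Builds a record from the URL a page was fetched from and its HTML.
    static ResourceRecord parse(String url, String html) {
        return new ResourceRecord(
                url, firstGroup(TITLE, html), firstGroup(DESCRIPTION, html));
    }

    // Returns the first capture group, or "" when the pattern is absent.
    private static String firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        // Collapse whitespace so a multi-line title becomes one line.
        return m.find() ? m.group(1).trim().replaceAll("\\s+", " ") : "";
    }
}
```

Regular expressions are enough for a sketch like this, but real-world HTML is irregular, so a production processor would more likely rely on a tolerant HTML parser.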

The resource management component has a 3-tier architecture. The front end is a Java-enabled web browser that runs a Java applet-based GUI. The intermediate tier is a Java application server running a servlet that receives client-side requests, processes them, and returns the responses to the client in HTML format. The back end is an Oracle database, which holds all the resource data, executes queries issued by the servlet, and sends the results back.

The applet runs on the JVM inside the web browser, providing rich UI capabilities in the browser. Currently, a visualization technique called force-directed layout is used to visualize the relations between the keywords of all the resources stored in the database. Accordingly, a clustering algorithm called scatter-gather is implemented to help generate the clusters needed in the visualization.
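
To give a feel for how a force-directed layout works, here is a minimal Java sketch of a single relaxation step. The text does not say which force model ENABLE uses, so the sketch assumes the Fruchterman-Reingold force form (repulsion k²/d between every pair of nodes, attraction d²/k along edges). The class name `ForceLayout` and the parameters `k` and `maxMove` are hypothetical.

```java
// Hypothetical sketch of one force-directed relaxation step; the force
// model is an assumption, not taken from the ENABLE implementation.
final class ForceLayout {
    // pos[i] = {x, y} of node i; edge[i][j] > 0 means i and j are linked.
    // k is the ideal edge length; maxMove caps how far a node moves.
    static void step(double[][] pos, int[][] edge, double k, double maxMove) {
        int n = pos.length;
        double[][] disp = new double[n][2];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == j) continue;
                double dx = pos[i][0] - pos[j][0];
                double dy = pos[i][1] - pos[j][1];
                double dist = Math.max(Math.sqrt(dx * dx + dy * dy), 1e-9);
                // Every pair repels: positive force pushes i away from j.
                double force = k * k / dist;
                // Linked pairs also attract: subtracting pulls i toward j.
                if (edge[i][j] > 0) force -= dist * dist / k;
                disp[i][0] += dx / dist * force;
                disp[i][1] += dy / dist * force;
            }
        }
        for (int i = 0; i < n; i++) {
            double len = Math.sqrt(disp[i][0] * disp[i][0] + disp[i][1] * disp[i][1]);
            if (len == 0) continue;
            // Move along the net force, but never farther than maxMove.
            double move = Math.min(len, maxMove);
            pos[i][0] += disp[i][0] / len * move;
            pos[i][1] += disp[i][1] / len * move;
        }
    }
}
```

Calling `step` repeatedly, usually with a shrinking `maxMove`, lets linked keywords drift together while unrelated ones spread apart.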

The most compute-intensive tasks, such as the keyword-keyword matrix computation, are handled at the intermediate tier by the servlet to keep the client side lightweight.
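
The text does not define the keyword-keyword matrix. The following Java sketch assumes a simple co-occurrence reading: entry (i, j) counts the resources whose keyword sets contain both keyword i and keyword j. The class `KeywordMatrix` and its methods are hypothetical names, and the code targets a current JDK rather than JDK 1.4.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical co-occurrence matrix; the counting rule is an assumption.
final class KeywordMatrix {
    final List<String> keywords;  // sorted vocabulary; position = row/column
    final int[][] counts;         // counts[i][j] = resources containing both
    private final Map<String, Integer> index = new HashMap<>();

    private KeywordMatrix(List<String> keywords, int[][] counts) {
        this.keywords = keywords;
        this.counts = counts;
        for (int i = 0; i < keywords.size(); i++) {
            index.put(keywords.get(i), i);
        }
    }

    // Each element of resources is the keyword set of one resource.
    static KeywordMatrix fromResources(List<Set<String>> resources) {
        Set<String> vocabulary = new TreeSet<>();
        for (Set<String> resource : resources) {
            vocabulary.addAll(resource);
        }
        List<String> keywords = new ArrayList<>(vocabulary);
        int n = keywords.size();
        KeywordMatrix m = new KeywordMatrix(keywords, new int[n][n]);
        for (Set<String> resource : resources) {
            // Every ordered pair is counted, so the matrix is symmetric and
            // the diagonal holds the number of resources per keyword.
            for (String a : resource) {
                for (String b : resource) {
                    m.counts[m.index.get(a)][m.index.get(b)]++;
                }
            }
        }
        return m;
    }

    // Number of resources that contain both keywords (0 if either is unknown).
    int count(String a, String b) {
        Integer i = index.get(a);
        Integer j = index.get(b);
        return (i == null || j == null) ? 0 : counts[i][j];
    }
}
```

The work grows with the square of the number of keywords per resource, which is consistent with running this step on the server rather than in the applet.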



Figure 1 below shows the overall architecture of the current version of the ENABLE Knowledge Base.


2. ENABLE Knowledge Base Query Interface

The ENABLE Knowledge Base provides a client-side GUI to help users retrieve bioinformatics education resources of interest. The GUI has two layers. The back layer consists of control and display widgets, which enable users to execute search and browse operations and receive feedback. The front layer is the visualization area. Both layers are implemented using Java Swing components [1].

Figure 2 shows the initial query interface with a semi-transparent visualization layer. Users can specify a resource category for a search or browse operation. Currently, the resource categories include learning tool, bioinformatics application, literature, protein sequence, and gene sequence. After a category is selected and, for a search operation, a keyword is submitted, the search/browse results appear in the results area.

Figure 1. ENABLE Knowledge Base Architecture




Figure 2. ENABLE Knowledge Base GUI

As Figure 3 shows, the search/browse results are organized as a tree in the results area. When the mouse pointer is moved over a resource node, a tooltip appears with the description of that resource. When a resource node is selected (by double-clicking), detailed information about the resource, including its title, URL, description, and keywords, appears in the detailed information area at the bottom. When the URL of the resource is clicked, the default browser of the client system is launched and directed to the corresponding website. This browser launch function works on most Windows, Linux, UNIX, and Mac systems.
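
The text does not describe how the browser is launched. One common cross-platform approach is to pick a launcher command based on the operating system name, as in the Java sketch below. The class `BrowserLauncher` is hypothetical, and the commands (`rundll32 url.dll,FileProtocolHandler`, `open`, `xdg-open`) are assumptions about what the client machine has installed, not ENABLE's actual mechanism.

```java
import java.io.IOException;
import java.util.Locale;

// Hypothetical helper: chooses an OS-specific command that opens a URL
// in the default browser. Not taken from the ENABLE source.
final class BrowserLauncher {
    // Returns the command line for the given os.name value and URL.
    static String[] commandFor(String osName, String url) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("mac")) {
            return new String[] {"open", url};
        }
        if (os.contains("windows")) {
            return new String[] {"rundll32", "url.dll,FileProtocolHandler", url};
        }
        // Assumption: other systems (Linux, UNIX) provide xdg-open.
        return new String[] {"xdg-open", url};
    }

    // Launches the default browser on the current system.
    static void open(String url) throws IOException {
        String[] command = commandFor(System.getProperty("os.name"), url);
        new ProcessBuilder(command).start();
    }
}
```

On a current JDK, `java.awt.Desktop.getDesktop().browse(URI)` serves the same purpose without OS-specific commands, although it was not available in JDK 1.4.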

The visualization part of the GUI shows the relations between the keywords of various resources. When a resource node is selected, the visualization area changes accordingly: only the keywords of the currently selected resources and their related keywords (co-citation) appear in the visualization area. A slide bar on the GUI lets users change the alpha value of the semi-transparent visualization layer. To make the visualization layer clearer, slide the bar towards the right-hand side.




Figure 3. ENABLE Knowledge Base GUI Search Operation

Since the visualization layer and the control/display layer overlap, the ESC key is used to switch control between the two layers. Initially, the control focus is on the back layer. To switch the focus to the visualization layer, users need to press the ESC key once.

After the focus is switched, when the mouse pointer is moved over a keyword node in the visualization layout, the node is highlighted in red and its related keywords are highlighted in orange. When the mouse pointer leaves a keyword node, the node and its related nodes return to their normal color. In Figure 4, the node for the keyword “sequence” and its neighbors are highlighted. Edges between these nodes are highlighted too.
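
The hover behavior can be read as a lookup in the keyword-keyword matrix: the hovered node turns red, and every node with a non-zero entry in its row turns orange. The Java sketch below illustrates that reading. The class `HoverHighlight`, the string color labels, and the convention that `-1` means "no node hovered" are assumptions for illustration only.

```java
// Hypothetical sketch of the highlight rule described above.
final class HoverHighlight {
    static final String NORMAL = "normal";
    static final String HOVERED = "red";
    static final String RELATED = "orange";

    // matrix[i][j] > 0 means keywords i and j are related.
    // hovered is the index of the node under the pointer, or -1 for none.
    static String[] colorsFor(int[][] matrix, int hovered) {
        String[] colors = new String[matrix.length];
        for (int j = 0; j < matrix.length; j++) {
            if (j == hovered) {
                colors[j] = HOVERED;
            } else if (hovered >= 0 && matrix[hovered][j] > 0) {
                colors[j] = RELATED;
            } else {
                colors[j] = NORMAL;
            }
        }
        return colors;
    }
}
```

Calling `colorsFor(matrix, -1)` when the pointer leaves a node returns every node to its normal color.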


In addition, users can zoom the visualization layout in or out by pressing the right mouse button when the control focus is on the visualization layer. Users can also move the layout around by pressing the left mouse button.





Figure 4. ENABLE Knowledge Base GUI Visualization

Scatter-gather [2] is a clustering algorithm that helps users refine their retrieval results. In the GUI, users can select keywords of interest by clicking the corresponding nodes and then right-clicking in the visualization area. As shown in Figure 5, a popup menu then appears with two options: “gather & scatter again” and “back to initial state”. When the first option is selected, the scatter-gather algorithm generates a new keyword-keyword matrix based on the selected keywords. The force-directed algorithm then generates a new layout in the visualization area accordingly. If the second option is selected, the visualization area returns to its initial state, in which keywords from all resources are used to generate the keyword-keyword matrix. Users can deselect any node by clicking it again. Executing a scatter-gather operation may lead to separated graphs, as shown in Figure 6, since there might be no relation between two previously selected keywords.
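
The following Java sketch shows one plausible reading of the gather step and why it can produce separated graphs. It assumes that gathering restricts the existing keyword-keyword matrix to the selected keywords, which may differ from what ENABLE actually computes. It then finds the connected components of the resulting graph. The class `GatherSketch` and both method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; the restriction rule is an assumption.
final class GatherSketch {
    // Keeps only the rows and columns of the selected keyword indices.
    static int[][] gather(int[][] matrix, int[] selected) {
        int n = selected.length;
        int[][] sub = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                sub[i][j] = matrix[selected[i]][selected[j]];
            }
        }
        return sub;
    }

    // Groups nodes into connected components by breadth-first search,
    // treating matrix[u][v] > 0 (u != v) as an edge.
    static List<List<Integer>> components(int[][] matrix) {
        int n = matrix.length;
        boolean[] seen = new boolean[n];
        List<List<Integer>> result = new ArrayList<>();
        for (int start = 0; start < n; start++) {
            if (seen[start]) continue;
            List<Integer> component = new ArrayList<>();
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(start);
            seen[start] = true;
            while (!queue.isEmpty()) {
                int u = queue.poll();
                component.add(u);
                for (int v = 0; v < n; v++) {
                    if (v != u && !seen[v] && matrix[u][v] > 0) {
                        seen[v] = true;
                        queue.add(v);
                    }
                }
            }
            result.add(component);
        }
        return result;
    }
}
```

If keywords 0, 1, 2, and 3 form a chain and the user selects only 0, 1, and 3, the gathered matrix has two components, because keyword 3 was connected to the others only through the unselected keyword 2.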






Figure 5. ENABLE Knowledge Base GUI Visualization Gather Operation


3. ENABLE Knowledge Base Client Platforms

Currently, the ENABLE Knowledge Base client has been tested on multiple operating systems, including Windows XP, Red Hat Linux 9, and Mac OS X. It supports various web browsers, such as IE, Mozilla Firefox, and Netscape 7.0, provided a JVM is properly installed. The supported JDK version is 1.4.0 and above. It has not been tested on earlier JDK versions. Some Swing features may fail when running the GUI on an earlier JDK version.






Figure 6. ENABLE Knowledge Base GUI Visualization Scatter Operation


4. References

[1] Java Foundation Classes (JFC/Swing). http://java.sun.com/products/jfc/

[2] Douglass R. Cutting et al. Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections. In Proceedings of the Fifteenth Annual International ACM SIGIR Conference, pages 318-329, June 1992.

[3] Yueyu Fu and Javed Mostafa. Toward Information Retrieval Web Services for Digital Libraries. IEEE/ACM Joint Conference on Digital Libraries 2004, Tucson, Arizona, 2004.

[4] Jeffrey Heer, Stuart Card, and James Landay. Prefuse: A Toolkit for Interactive Information Visualization. Submitted paper draft, April 2004.