Design of a Search Engine for Metadata Search Based on Metalogy

Ing-Xiang Chen, Che-Min Chen, and Cheng-Zen Yang



Dept. of Computer Engineering and Science

Yuan-Ze University

http://syslab.cse.yzu.edu.tw/

ICADL 2001 - 2001/12/11

czyang@acm.org

YZU, Taiwan - ICADL2001

Outline


Introduction


Related Technologies


System Architecture


An Experimental Prototype


Conclusions


Future Work


Introduction


Metadata management is not an easy
task:



It requires specific domain knowledge for
appropriate data categorization.


It needs to deal with the complicated
relationships between the metadata items.


A good management tool for easing metadata
construction and manipulation is necessary.


Introduction


Metalogy


Metalogy is a metadata management system developed
by the ROSS project group in Taiwan.


It can be used to manipulate various digitized
items and export/import XML records.


It is mainly designed for managing metadata
within an individual digital library.



Introduction


Search across digital libraries:


Metalogy does not consider how to search
information across digital libraries.


As digital libraries are widely deployed,
searching information across several digital
libraries becomes important.


We design a search engine that helps users find
resources without connecting to each digital library
separately and re-entering the same query terms.


Introduction


We base this search engine on
the XML data exported from Metalogy
for two reasons:


XML/Metalogy provides comprehensive
metadata descriptions and DTD information for
metadata search.


The quality of the distributed service highly
depends on the quality of the data resource.


Related Technologies


Z39.50


It was proposed for searching and retrieving information from
heterogeneous databases over networks.


It provides an abstract search capability.


It is difficult to implement because of its
extensive functionality.


OAI


Arc


Arc is developed for cross-archive searching.


It adopts the OAI protocol to harvest digital archives.


Related Technologies


Harp


Harp provides a uniform query interface across legacy
public libraries through HarpSQL.


A HarpSQL server acts as a query agent that stores and
handles the intermediate query results, not as a search
engine that collects and stores all metadata.


METALICA


It adopts a meta-search engine like MetaCrawler to
provide a uniform user interface for supporting
cross-archive search.



System Architecture


[Figure: system architecture diagram. Labeled components: Digital Library 1, Digital Library 2, ..., Digital Library n (XML metadata, DTD); XML Parser (Java application); Index Database; Search Engine (Java servlet) with User Interface; DTD Manager (Java servlet) with Manager Interface; Browser (query request).]


System Architecture


The search engine is constructed with
three modules:


Search engine module


Provide an integrated user interface


Adopt Java servlets to provide search services


Index database module


Provide a metadata repository for digital library
sources.


Adopt the simple Dublin Core element set as the default metadata.


Store DTD mapping relationships.
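
The role of the stored DTD mapping relationships can be illustrated with a minimal sketch, assuming each digital library's mapping is a simple lookup from its own element names to simple Dublin Core elements. The class name, method names, and source element names (WorkName, Artist, Dynasty, ShelfCode) are hypothetical and are not taken from Metalogy or from the prototype.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: element names are illustrative, not taken from Metalogy.
final class DtdMappingSketch {

    // Maps one library's source element names to simple Dublin Core elements.
    static Map<String, String> exampleMapping() {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("WorkName", "title");    // assumed source element
        mapping.put("Artist", "creator");    // assumed source element
        mapping.put("Dynasty", "coverage");  // assumed source element
        return mapping;
    }

    // Translates a source record into Dublin Core fields.
    // Source fields without a mapping entry are dropped.
    static Map<String, String> toDublinCore(Map<String, String> sourceRecord,
                                            Map<String, String> mapping) {
        Map<String, String> dublinCore = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : sourceRecord.entrySet()) {
            String dcElement = mapping.get(field.getKey());
            if (dcElement != null) {
                dublinCore.put(dcElement, field.getValue());
            }
        }
        return dublinCore;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("WorkName", "Sample scroll");
        record.put("Artist", "Sample artist");
        record.put("ShelfCode", "A-17");  // no mapping entry, so it is not indexed
        System.out.println(toDublinCore(record, exampleMapping()));
    }
}
```

In this sketch, two source elements mapped to the same Dublin Core element would overwrite each other; a real mapping scheme would have to decide how to merge or repeat such values.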



System Architecture


Metadata/DTD manager


Provide an administration interface to manage
XML/DTD mapping relationships.


Parse and translate the XML/DTD documents
provided by remote digital libraries.


Gather information from remote digital libraries and
update the index database periodically.


An Experimental Prototype


Development tool:


Implement this search engine in Java to achieve
platform independence.


Parse XML information with the JAXP (Java API
for XML Parsing) package.


The database is built on the open-source
database MySQL.
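
As an illustration of the JAXP parsing step, the following is a minimal sketch, assuming an exported record is a flat XML element whose children each hold one metadata field. The record layout, class name, and method name are assumptions made for this example; the actual Metalogy export format is not shown in these slides, and storing the fields in MySQL is omitted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch: the flat record layout is an assumption, not the Metalogy format.
final class JaxpRecordSketch {

    // Parses one XML record and returns the trimmed text of each child element of the root.
    static Map<String, String> parseRecord(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes and comments between elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            // Parser configuration, I/O, and malformed-XML errors all end up here.
            throw new IllegalArgumentException("Could not parse record", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<record><title>Sample scroll</title>"
                + "<creator>Sample artist</creator></record>";
        System.out.println(parseRecord(xml));
    }
}
```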


An Experimental Prototype


XML/DTD manager

[Screenshot of the XML/DTD manager; label: management functionality]


An Experimental Prototype


A mapping example

[Screenshot of a mapping example; label: mapping information]


An Experimental Prototype


A search example

A famous calligrapher, Hsi-Chih Wang (303-361 AD)


An Experimental Prototype


Search results

[Screenshot of the search results; labels: matched metadata, link to the resource file]
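
The shape of such a result, matched metadata paired with a link to the resource file, can be sketched as follows. This is a minimal in-memory illustration, assuming a case-insensitive substring match on title and creator; the prototype stores its index in MySQL, and its actual query logic is not described in these slides. The class names, fields, and example.org URLs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: an in-memory stand-in for a query against the index database.
final class IndexSearchSketch {

    // One indexed entry: two Dublin Core fields plus the link to the resource file.
    static final class Entry {
        final String title;
        final String creator;
        final String resourceUrl;

        Entry(String title, String creator, String resourceUrl) {
            this.title = title;
            this.creator = creator;
            this.resourceUrl = resourceUrl;
        }
    }

    // Returns entries whose title or creator contains the keyword, ignoring case.
    static List<Entry> search(List<Entry> index, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry entry : index) {
            boolean inTitle = entry.title.toLowerCase(Locale.ROOT).contains(needle);
            boolean inCreator = entry.creator.toLowerCase(Locale.ROOT).contains(needle);
            if (inTitle || inCreator) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Entry> index = new ArrayList<>();
        index.add(new Entry("Sample scroll", "Sample artist", "http://example.org/lib1/1"));
        index.add(new Entry("Other work", "Another artist", "http://example.org/lib2/7"));
        for (Entry hit : search(index, "scroll")) {
            System.out.println(hit.title + " -> " + hit.resourceUrl);
        }
    }
}
```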


Conclusions


We presented the design of a search engine
for searching information across digital
libraries based on metadata/XML.


The design of the search engine has
three advantages:


First, the system architecture is simple and the
cost is low.


Conclusions


Second, the system extensibility is high for
newly required services.


Third, with this uniform user interface, users
need not know how and where to search
for information.


Future Work


Quality control of the metadata
provided by the original digital library
sources needs to be addressed.


The mapping scheme to support more
heterogeneous digital archives should
be further discussed.



Future Work


Performance should be studied
further for large-scale
environments.


Updating information from the remote
digital libraries efficiently is
another important task.