Adapting Fusionplex for XQuery and XML: A Case Study of Music Industry Recording Data

Arthur A. Figueiredo

IT803 Fall 2007

Research and Project Paper


Abstract. My goal is to gain practical knowledge in the realm of data integration and interoperation. I will use recording industry data as a test case, since so much of this information already exists on the web. Within this domain I will prototype queries with XQuery across both relational and XML (hierarchical) data instantiations. A prototype capability will be developed with the listed technologies, following the Fusionplex architecture with some accommodations. This will necessarily lead to a new integration architecture being proposed.


1 Introduction and Motivation


Use of the Extensible Markup Language (XML) has become a prevalent approach to representing a variety of data types. There is a large and growing family of languages which share the base syntax of XML and which define their own meaningful tags (or schema). This adoption is primarily due to several desirable features embodied by XML (which will be addressed in due course below). Because of these features (well-formedness, etc.), documents in this family allow for automated processing using a large set of tools that have been developed over a period of more than 10 years. XML provides the base syntax for the Extensible HyperText Markup Language (XHTML).
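
To make the link between well-formedness and automated processing concrete, the following is a minimal sketch that assumes only the XML parser bundled with the Java standard library. The class name `WellFormedCheck`, the method `isWellFormed`, and the `release`, `artist`, and `title` tags are hypothetical names chosen for illustration; they are not taken from any schema used in this paper.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: reports whether a string is well-formed XML.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the parser log them.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, so it is not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative data only; the tag names are assumptions.
        String good = "<release><artist>Miles Davis</artist><title>Kind of Blue</title></release>";
        String bad = "<release><artist>Miles Davis</title></release>";
        System.out.println(isWellFormed(good)); // prints true
        System.out.println(isWellFormed(bad));  // prints false
    }
}
```

Because any conforming parser rejects the second string, a generic tool can decide whether a document is processable without knowing anything about the tag vocabulary it uses.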


2 XML Features


3 Tools for XML



The Extensible Markup Language (XML) is a general-purpose markup language.[1] It is classified as an extensible language because it allows its users to define their own tags. Its primary purpose is to facilitate the sharing of structured data across different information systems, particularly via the Internet.[2] It is used both to encode documents and to serialize data. In the latter context, it is comparable with other text-based serialization languages such as JSON and YAML.[3]

It started as a simplified subset of the Standard Generalized Markup Language (SGML), and is designed to be relatively human-legible. By adding semantic constraints, application languages can be implemented in XML. These include XHTML,[4] RSS, MathML, GraphML, Scalable Vector Graphics, MusicXML, and thousands of others. Moreover, XML is sometimes used as the specification language for such application languages.

XML is recommended by the World Wide Web Consortium. It is a fee-free open standard. The W3C recommendation specifies both the lexical grammar and the requirements for parsing.
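
As an illustration of user-defined tags carrying structured data, the sketch below parses a small document in a hypothetical `recordings` vocabulary and selects titles by artist. It uses the XPath support in the Java standard library as a stand-in; it is not XQuery and not the XQJ API, and the class `RecordingQuery`, the method `titlesBy`, and the sample tags are all assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example: querying recording data held in user-defined tags.
final class RecordingQuery {

    // Illustrative sample data; the vocabulary is an assumption.
    static final String SAMPLE =
            "<recordings>"
          + "<recording year=\"1959\"><artist>Miles Davis</artist><title>Kind of Blue</title></recording>"
          + "<recording year=\"1960\"><artist>John Coltrane</artist><title>Giant Steps</title></recording>"
          + "<recording year=\"1960\"><artist>Miles Davis</artist><title>Sketches of Spain</title></recording>"
          + "</recordings>";

    static List<String> titlesBy(String xml, String artist) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $artist as a variable rather than concatenating it into the path.
            xpath.setXPathVariableResolver(
                    name -> "artist".equals(name.getLocalPart()) ? artist : null);
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/recordings/recording[artist = $artist]/title",
                    doc, XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("could not query sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titlesBy(SAMPLE, "Miles Davis"));
    }
}
```

The path expression depends only on the tag names, so the same query pattern would apply to any document that follows the same vocabulary, whichever source produced it.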



Simplified parsing

Data Typing

Metadata



Enabling Technologies


Screen Scraping

XML

XHTML

XSLT


Software Components

PostgreSQL


General purpose DB


Java

XQJ


XQuery for Java


Data Sources

Musicbrainz

All Music Guide

Gracenote CDDB

Wikipedia


Schema Mapping




References


Alexiev, V.: Information Integration with Ontologies. Wiley, 2005.


Ehrig, M.: Ontology Alignment: Bridging the Semantic Gap. Springer, 2007.

Katz, H., et al.: XQuery from the Experts. Addison-Wesley, 2004.

Ling, L., Lee, M., Hsu, W.: Rewriting Queries for XML Integration Systems. In: DEXA 2006, LNCS 4080, pp. 138-148, 2006.

Motro, A., Anokhin, P.: Fusionplex: resolution of data inconsistencies in the integration of heterogeneous information sources. Elsevier, 2004.