Overview of Database Federation


31 Oct 2013


1

Overview of Database Federation and IBM Garlic Project

Presented by Xiaofen He

2

References

Data Integration through Database Federation, L.M. Haas, E.T. Lin, M.A. Roth

Towards Heterogeneous Multimedia Information Systems: The Garlic Approach, IBM Almaden Research Center

3

Outline


Approaches to data integration


Database Federation in IBM DB2


IBM Garlic Project

4

Various Approaches to Data Integration (1)


Application-specific solutions


Always works


Expensive, fragile and hard to extend


Application-integration frameworks


Protection from changes of data source


Do not address data integration issues


Workflow frameworks


Limited support for comparing and manipulating data

5

Various Approaches to Data Integration (2)


Digital libraries


Meta search engine


No combination of data


Data warehousing


Powerful, high-level query language


May not be possible or cost-effective; loss of functionality


Database federation


Virtual data warehouse


Performance tradeoff (query rewrite & cost-based optimization)


6

Database Federation


Basics of Database Federation


DB2 styles of database federation


Determining the style of database federation to use

7

Basics of Database Federation


What is database federation (DF)?

Also known as mediation

An architecture in which middleware, consisting of a relational database management system, provides uniform access to a number of heterogeneous data sources


8

Common Mediation Architecture


Data Source


Wrapper


Mediator


Figure 1. Common Mediator Architecture
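
The three roles named on this slide can be sketched in a few lines of Python. This is only an illustration of the general mediator pattern, not the DB2 or Garlic implementation: the class names (`CsvLikeWrapper`, `DictWrapper`, `Mediator`), the `scan` method, and the sample data are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Protocol

Row = Dict[str, object]


class Wrapper(Protocol):
    """Hypothetical wrapper interface: exposes one data source as rows."""

    def scan(self) -> Iterable[Row]: ...


class CsvLikeWrapper:
    """Wraps a source that stores records as comma-separated strings."""

    def __init__(self, lines: List[str], columns: List[str]) -> None:
        self._lines = lines
        self._columns = columns

    def scan(self) -> Iterator[Row]:
        for line in self._lines:
            yield dict(zip(self._columns, line.split(",")))


class DictWrapper:
    """Wraps a source that already holds records as dictionaries."""

    def __init__(self, records: List[Row]) -> None:
        self._records = records

    def scan(self) -> Iterator[Row]:
        yield from self._records


class Mediator:
    """Gives callers one access point over all registered wrappers."""

    def __init__(self) -> None:
        self._wrappers: Dict[str, Wrapper] = {}

    def register(self, name: str, wrapper: Wrapper) -> None:
        self._wrappers[name] = wrapper

    def join(self, left: str, right: str, key: str) -> List[Row]:
        # Hash join done inside the mediator, across two different sources.
        index: Dict[object, List[Row]] = {}
        for row in self._wrappers[right].scan():
            index.setdefault(row[key], []).append(row)
        result: List[Row] = []
        for row in self._wrappers[left].scan():
            for match in index.get(row[key], []):
                result.append({**match, **row})
        return result


mediator = Mediator()
mediator.register("customers", CsvLikeWrapper(["1,Ada", "2,Grace"], ["id", "name"]))
mediator.register(
    "orders",
    DictWrapper([{"id": "1", "item": "disk"}, {"id": "1", "item": "tape"}]),
)
joined = mediator.join("customers", "orders", "id")
```

The caller only talks to the mediator; the fact that one source holds strings and the other holds dictionaries is hidden behind the wrappers.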

9

Goals of IBM DF


Transparency


Support heterogeneity


A high degree of function


Extensibility


Openness


Autonomy of individual data sources


Query optimization

10

DB2 architecture for DF

Figure 2. DB2 architecture for database federation

11

DB2 Styles of federation


Scalar UDFs: Federating function

Table UDFs: Federating data

Wrappers: Federating function and data

Figure 3. Different styles of federation
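
The difference between federating function and federating data can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for DB2, which is an assumption made purely for illustration. SQLite's Python binding has no table UDFs, so the table-function style is only emulated by copying the external rows into a temporary table. The function names, the exchange rate, and the sample data are made up.

```python
import sqlite3


def to_usd(amount_eur: float) -> float:
    """Stand-in for a function that lives in an external system."""
    return amount_eur * 1.25  # fixed, made-up exchange rate


def external_prices():
    """Stand-in for an external source that returns a set of rows."""
    yield ("disk", 100.0)
    yield ("tape", 20.0)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("disk", 2), ("tape", 5)])

# Scalar UDF style: the external *function* becomes callable from SQL.
conn.create_function("to_usd", 1, to_usd)

# Table UDF style (emulated): the external *data* becomes queryable as a table.
conn.execute("CREATE TEMP TABLE prices (item TEXT, eur REAL)")
conn.executemany("INSERT INTO prices VALUES (?, ?)", external_prices())

# One SQL statement now combines local data, external data and an external function.
rows = conn.execute(
    "SELECT o.item, o.qty * to_usd(p.eur) AS total_usd "
    "FROM orders o JOIN prices p ON p.item = o.item "
    "ORDER BY o.item"
).fetchall()
conn.close()
```

A wrapper, the third style on the slide, would cover both cases at once and also give the optimizer information about the source; that part is not modelled here.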

12

Wrapper Architecture


Multi-server integration

Multi-dataset integration and multi-operation integration


Optimization


Transactional integration

13

Determining the style of DF to use

Figure 4. Determine the style of federation to use

14

IBM Garlic Project


Introduction


Overview


Architecture


Repositories and Databases


The Garlic Data Model


Queries in Garlic


Interface and Application


Conclusion

15

Introduction


Need


Goal


Object-Oriented Model

16

Garlic Overview

Figure 5. Garlic System Architecture (clients: C++ Application, Query/Browser; middle layer: Query Services & Runtime System, Metadata Repository; one Repository Wrapper per repository; repositories: Complex Object Repository and three Data Repositories)

17

Garlic Overview


Repositories


Repository type


Repository instance


Repository manager


Databases


Global schema


Wrapper schemas (local schemas)

18

Garlic Data Model (1)


ODMG-93 object model


Objects and values


Inheritance


Object identity


Weak identity


Unique, not necessarily immutable


Legacy references


Implementation-constrained reference

19

Garlic Data Model (2)


Extensions


Degree of support for alternative implementations of interfaces


Type system flexibility: conformity


Object-appropriate view definition facility


Object-Centered Views


Enhance objects by adding or hiding some of their attributes/methods.
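
The idea of a view that adds or hides attributes of an existing object can be sketched as follows. This is a hypothetical Python analogy, not Garlic's view definition facility: the `ObjectView` class, its `hidden` and `added` parameters, and the `Employee` example are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Optional


class ObjectView:
    """Hypothetical view over a base object: hides some attributes, adds derived ones."""

    def __init__(
        self,
        base: Any,
        hidden: Iterable[str] = (),
        added: Optional[Dict[str, Callable[[Any], Any]]] = None,
    ) -> None:
        self._base = base
        self._hidden = frozenset(hidden)
        self._added = dict(added or {})

    def __getattr__(self, name: str) -> Any:
        # Only called when normal lookup fails, so _base, _hidden and _added
        # are found directly; refuse other private names to avoid recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._added:
            return self._added[name](self._base)  # attribute added by the view
        if name in self._hidden:
            raise AttributeError(f"{name!r} is hidden by this view")
        return getattr(self._base, name)  # everything else passes through


@dataclass
class Employee:
    name: str
    salary: int


employee = Employee("Ada Lovelace", 5000)
view = ObjectView(
    employee,
    hidden=["salary"],
    added={"initials": lambda e: "".join(part[0] for part in e.name.split())},
)
```

Here `view.name` passes through to the base object, `view.initials` exists only in the view, and `view.salary` raises `AttributeError` even though the underlying object still has it.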


20

Queries in Garlic


Query language


Object-oriented extension of SQL


Integrating approximate match query semantics with traditional exact match query semantics.


Query Processing


Decomposition


Interesting Question


How to characterize the query power of a repository, in terms of the language subset that its wrapper is capable of processing directly?
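
One simple way to picture decomposition driven by repository query power is to let each wrapper declare which comparison operators it can evaluate itself, and have the middleware apply the rest. The sketch below is an assumption-laden illustration, not Garlic's actual mechanism: `Predicate`, `Repository`, `decompose`, `run`, and the operator-set notion of "query power" are all hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


@dataclass(frozen=True)
class Predicate:
    column: str
    op: str  # one of "=", "<", ">"
    value: object

    def holds(self, row: Row) -> bool:
        return OPS[self.op](row[self.column], self.value)


class Repository:
    """Hypothetical wrapped repository that declares its own query power."""

    def __init__(self, rows: List[Row], supported_ops: Iterable[str]) -> None:
        self.rows = rows
        self.supported_ops = frozenset(supported_ops)

    def execute(self, predicates: List[Predicate]) -> List[Row]:
        for p in predicates:
            if p.op not in self.supported_ops:
                raise ValueError(f"repository cannot evaluate {p.op!r}")
        return [r for r in self.rows if all(p.holds(r) for p in predicates)]


def decompose(
    predicates: List[Predicate], repo: Repository
) -> Tuple[List[Predicate], List[Predicate]]:
    """Split a query into the part pushed to the repository and the residual."""
    pushed = [p for p in predicates if p.op in repo.supported_ops]
    residual = [p for p in predicates if p.op not in repo.supported_ops]
    return pushed, residual


def run(predicates: List[Predicate], repo: Repository) -> List[Row]:
    pushed, residual = decompose(predicates, repo)
    rows = repo.execute(pushed)  # evaluated by the repository
    return [r for r in rows if all(p.holds(r) for p in residual)]  # middleware


# A repository that can only do equality lookups.
repo = Repository(
    [
        {"type": "image", "size": 5},
        {"type": "image", "size": 50},
        {"type": "text", "size": 99},
    ],
    supported_ops={"="},
)
query = [Predicate("type", "=", "image"), Predicate("size", ">", 10)]
pushed, residual = decompose(query, repo)
answer = run(query, repo)
```

In this example the equality test is pushed down to the repository, while the range test is applied afterwards by the middleware. Describing query power as a set of operators is a deliberate simplification of the open question posed on the slide.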

21

Interfaces and Applications


C++ API


Compiled applications


Dynamic applications


Query/Browser


A dynamic application


Moving back and forth between querying and browsing activities

22

Summary


Database Federation


A powerful tool for integrating data


Future work



Improve the ease of use


Enhance the performance


Garlic Project


New research in many dimensions