Object-Oriented Query Languages and Views

processroguishΛογισμικό & κατασκευή λογ/κού

18 Νοε 2013 (πριν από 3 χρόνια και 1 μήνα)

149 εμφανίσεις

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
1

Sept. 2000



Tutorial:


Object
-
Oriented Query Languages

and Views


Part 1: Basic concepts and issues

Lecturer:

Kazimierz Subieta



Polish
-
Japanese Institute of Information Technology,

Warsaw, Poland


Institute of Computer Science

Polish Academy of Sciences, Warsaw, Poland


subieta@ipipan.waw.pl


http://www.ipipan.waw.pl/~subieta

ADBIS
-

DASFAA'2000


Fourth International Symposium on Advances in Databases and Information Systems

September 5
-
8, 2000, Prague, Czech Republic

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
2

Sept. 2000

A

lot

of

views
...


User
-
friendly
:

A

language

for

a

person

who

does

not

want

to

knows

anything

about

databases,

but

wants

to

operate

with

them
.


Theoretical

achievement
:

A

syntactic

variant

of

some

famous

mathematical

theory,

e
.
g
.

logic
.

Used

mainly

to

produce

a

next

paper

to

a

next

conference

(
:
-
))
.

Ad

hoc

interactive

database

language
:

A

facility

for

quick

retrieval

and

simple

updates

through

formalized

commands

(
select
...
from
...
,

update
...
set
...
,

...
)

or

through

some

simple

visual

interfaces
:

forms,

graphs,

menus,

etc
.

A

very
-
high
-
level

programming

construct

to

be

embedded

into

a

popular

programming

language,

e
.
g
.

SQL

embedded

into

C
.

A

very
-
high
-
level

programming

construct

integrated

with

a

new

programming

language,

e
.
g
.

PL/SQL,

many

4
GLs,

and

(recently)

SQL
3
.

What Is a Query Language?

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
3

Sept. 2000

Properties of Query Languages

Abstract,

conceptual,

data

independent
:

no

concepts

related

to

physical

level

of

data

(file

organizations,

indices,

moving

data

between

disk

and

main

memory,

etc
.


Declarative

(non
-
procedural)
:

determines

what

is

to

be

done

rather

than

how
.


Macroscopic
:

from

the

point

of

the

user

a

query

determines

simultaneous

(parallel)

actions

on

many

data
.


Natural
:

supporting

a

natural

way

of

thinking

of

the

user,

supporting

conceptual

modeling,

easy

to

learn

and

use
.


Efficient
:

acceptable

response

time

(performance)

through

automatic

query

optimization
.


Universal
:

giving

the

possibility

to

express

every

(reasonable)

request
.


Independent

of

an

application

domain
:

supporting

all

applications

of

the

given

DBMS
.


Lately

bound,

interpreted
:

supporting

ad

hoc

queries

and

various

abstractions

stored

in

a

database

(views,

stored

procedures,

active

rules,

etc
.
)
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
4

Sept. 2000

A Query Language in a Database Environment

A

tool

for

an

end

user

enabling

him/her

interactive

(
ad

hoc
)

querying,

generating

reports

and

updating

data

stored

in

a

database
.

Very
-
high
-
level

conceptual

language

constructs

for

database

programming
.


Defining

integrity

constraints

preventing

illegal

operations/states
.

Defining

subschemas

and

access

restrictions
.

Defining

virtual

views,

materialized

views,

derived

(replicated)

data,

and

procedures

stored

in

a

database
.


Components

of

scripts

in

4
GL
-
s

and

RAD

tools
.

Defining

active

rules

(triggers)

and

deductive

rules
.

Determining

data

to

be

selected

and

transmitted

in

distributed

databases
;

interoperability

between

heterogeneous/remote

databases

(ODBC,

JDBC,

...
)
.

......

several

other

applications

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
5

Sept. 2000

Query Optimization

A query language is unacceptable without automatic optimization of queries.

Typical optimization methods
:

Rewriting.
A query q
1

is substituted with a semantically equivalent query q
2

promising better performance. E.g. performing selections before joins or
removing dead (not used) parts of queries. The methods have no negative side
effects. They require, however, regularity and homogeneity of the language
definition and deep understanding of formal semantics
;


Auxiliary data structures and special data organizations:
indices, tables of
pointers, has
h

coding, etc
;

Caching results of queries and then reusing them
(materialized views)
;

Simultaneous optimization of many queries
;

Selecting an optimal query evaluation plan
.

Optimization of object
-
oriented QLs is in infancy. There is a lot of wishful thinking
and poorly justified hopes concerning the optimization potential of particular theories.
This is one of the reasons of slow adoption of object QLs in the commercial world.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
6

Sept. 2000

Optimizable and Non
-
optimizable Queries

An entire query language

Efficient (optimizable) queries

Most useful queries

In any query language there are non
-
optimizable queries
.

The non
-
optimizable part of a QL should be as small as possible.

In real QLs not all useful queries are optimizable. Poorly defined, informal or
irregular features of a QL decrease the optimization potential.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
7

Sept. 2000

OQL and SQL3 on One Slide

OQL
is a part of the ODMG standard. It is claimed to be a compatible extension
of SQL, but actually OQL retains some syntactic patterns of SQL only.
Semantically OQL is very different from SQL, because it follows an object model,
which is incompatible with the relational model. OQL does not deal with updating
and does not define SQL
-
like facilities such as views, triggers and stored
procedures. OQL statements can be embedded into Java, C++ and Smalltalk, with
a lot of impedance mismatch. The semantics of OQL is defined poorly and
inconsistently, thus probably it is not fully implementable.

SQL3

is a new SQL standard developed by ANSI and ISO. In contrast to its
predecessors SQL3 is assumed to be a programming language with full
computational power. The main data structure is a table, equipped however with a
lot of options (thus using the term “relational” makes no sense). SQL3 supports
user
-
defined abstract data types, including methods, object identifiers, subtypes,
inheritance and polymorphism. Further facilities include control statements and
parameterized types. Together with an extremely rich collection of various
features, SQL3 is claimed to be downward compatible with SQL
-
92 and follows
the sweet
select...from...where
... syntax. The standard is eclectic and extremely
huge (more than 1000 pages), thus probably it is not fully implementable.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
8

Sept. 2000

Is Java an Alternative to Query Languages?

There is a lot of discussion around the role of Java in database programming.

Java is a very important language.

But...

Java offers rather classical low
-
level (object
-
oriented) programming.

The portability of Java bytecode is low
-
level. There are many details of database
interfaces outside the Java bytecode standard.

The Java object model is not powerful enough to be a database model.

The Java database interfaces (JDBC, SQLJ,...) are
not

object
-
oriented. They
present SQL interfaces to
relational

databases, wrapped into the Java syntax.

Java does not solve the problem of storing important abstractions within a
database (views, database procedures, triggers, etc.)


J
ava itself is not an answer to database problems. It presents much lover
level of database programming than programming via query languages.
Java + any query language
do not avoid

the impedance mismatch.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
9

Sept. 2000

Requirements to Object Query Languages

Conceptual

simplicity,

generality

and

minimality
:

clean

and

precise

semantics,

a

small

set

of

semantic

primitives,

no

redundant

constructs
.

Pragmatic

universality
:

the

possibility

to

formulate

any

request
.

Universality

of

an

approach

to

semantics
:

no

black

holes

in

the

semantics,

treating

every

semantic

inconsistency

or

irregularity

as

a

very

big

problem
.

Compositionality

and

orthogonality
:

no

big

syntactic

constructs,

every

reasonable

combination

of

constructs

should

be

allowed
.

Modularity
:

the

possibility

to

create

complex

encapsulated

reusable

units
.

Homogeneous

and

consistent

approach

to

all

concepts

of

the

underlying

object

model
:

complex

objects,

classes,

types,

interfaces,

ADTs,

inheritance,

methods,

encapsulation,

polymorphism,

etc
.

Homogeneous

and

consistent

approach

to

integration

of

the

query

language

with

programming

constructs

(updating,

etc
.
)

and

abstractions

(views,

methods,

stored

procedures,

parameters

of

procedures,

etc
.
)
.

High

potential

for

query

optimization
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
10

Sept. 2000

Coupling a QL with a PL

Loose

coupling

(“embedding”)
:

A

QL

is

developed

independently

of

a

PL
.

An

additional

interface

(“glue”)

is

implemented,

enabling

the

use

of

QL

within

the

underlying

PL
.

The

complexity

of

the

interface

presents

a

problem
.

The

incompatibility

between

QL

and

PL

is

referred

to

as

impedance

mismatch
.

Typical

cases
:

SQL

+

C,

JDBC+Java,

OQL

+

C++,

OQL

+

Java

Advantages
:

the

programmers

can

deal

with

their

favorite

PL,

the

API

to

a

database

is

independent

of

PLs
.

Disadvantages
:

aesthetically

ugly,

a

lot

of

limitations,

programs

are

longer

than

necessary,

tricky

programming,

poor

maintainability
.

Tight

coupling

(“seamless

integration”)
:

Queries

are

building

blocks

for

programming

constructs
.

No

special

interface

between

QL

and

programming

constructs

is

provided
.

Typical

cases
:

Oracle

PL/SQL,

many

4
GLs,

DBPL,

LOQIS,

SQL
3

Advantages
:

a

consistent

homogeneous

solution,

no

impedance

mismatch
.

Disadvantages
:

implies

a

new

programming

language,

which

is

difficult

to

promote

in

the

current

commercial

world
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
11

Sept. 2000

What is Impedance Mismatch?

Incompatibility between a PL and a QL to be embedded. It concerns:

Syntax:
Two different grammars within one programming interface;

Type systems:
different types, lack of bulk types in PL, no static typing of QL;


Semantics and pragmatics:
declarative QL vs. procedural PL;

Abstraction levels:
data independence of QL vs. deep data dependence of PL;

Binding phases and mechanisms:
late binding of QL vs. early binding of PL;


Name spaces and scoping rules:
two incompatible name spaces the programmer
deals with, stack based scoping rules of PL are not respected by QL;

Null values:
they are ignored in PL, special mapping tricks are required;

Iteration mechanisms
: implicit in QL (
selection, projection
,...), explicit in PL;

Persistence:
QL
-

only persistent data, PL
-

only volatile data; in PL special
mechanisms are required to copy persistent data into volatile memory and v/v.

Generic programming:
reflection in QL (see dynamic SQL), other techniques
(e.g. casting, templates) in PL.

Looking at the above, claims that ODMG Java + OQL avoid the impedance
mismatch are at least dubious.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
12

Sept. 2000

Object
-
Orientedness in Databases

The

commercial

world
:

manifestos,

standards

and

products

-

no

agreement
.


OODB

Manifesto,

3
rd

Generation

DB

Manifesto,

Third

Manifesto,

ODMG

standard,

SQL
3

standard

(SQL

1999
,

SQL

2000
),

persistent

Java,

XML

as

a

database

model,

CORBA

as

a

database

model,

Gemstone,

Versant,

O
2
,

ObjectStore,

Poet,

Objectivity/DB,

Uni

SQL,

Oracle

8
,

Informix

Dynamic

Server,

and

others
.

Useful
.

But
...

Eclectic

solutions,

legacy

burden,

design

monsters,

underspecified

semantics,

non
-
universal

solutions,

redundant

constructs,

many

inconsistencies
.


The

academic

world
:

theories

and

prototypes

-

no

agreement
.

Nested

relational

algebras,

F
-
logic,

comprehensions,

monoid

calculus,

object

algebras,

object

calculi,

structural

recursion,

functional

approaches,

and

others
.

Theories

neglect

vital

aspects

of

object
-
orientedness,

present

wishful

thinking

(e
.
g
.

concerning

a

mapping

from

OQL),

are

limited,

are

conceptually

or

mathematically

invalid

(e
.
g
.

object

algebras)
.

False

stereotypes

inherited

from

the

relational

model

(e
.
g
.

concerning

the

role

of

an

algebra

in

query

optimization)
.

The today’s state
-
of
-
the
-
art is PREMATURE
(despite thousands of papers).
The stability
-

not sooner than after 5
-
10 years. Thus in this tutorial we will
discuss concepts without referring to a particular standard, product or theory.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
13

Sept. 2000

Complex Objects

EMPLOYEE

ENO
E127


PHOTO

NAME
Smith


WORKS_IN

PREV_JOB

COMPANY
ICL


WHEN
1977
-
90


PREV_JOB

COMPANY
IBM


WHEN
1975
-
77


JOB
designer


JOB
analyst




Many data hierarchy levels (no limitations);



Nested repeating attributes (collections);



Large objects (BLOBs) as regular values;



References (pointers) to other objects.

Conceptual

modeling

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
14

Sept. 2000

Classes

Correct definition:

A class is a conceptual (imaginary) entity storing
invariant properties

of objects.

(Invariant properties are factored out from objects to their classes.)


Typical invariant properties include:



Names and types of objects’ attributes (i.e. a type of an object);



Methods that can be applied to an object.


But invariant properties can be of various kind:



A name of an object (ODMG);



Rules for events or exceptions that can occur on objects (CORBA, ODMG);



Links (pointers, references relationships) to other objects (ODMG);



Active rules (triggers) and integrity constraints;



Default values for attributes;



...

Sometimes a class is a regular run
-
time object (Self).

Bad definitions:

A class is a
collection

of objects (
wrong:
what about methods, inheritance, ...?);

A class is a
blueprint

for objects (
wrong:
only creation of objects is regarded).

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
15

Sept. 2000

Encapsulation, Interfaces

Interface
-

general definition:
It consists of everything that the programmer can
use or has to know in order to correctly process the object. An interface should not
include unnecessary (physical) details of objects’ construction or operation.

Interface
-

particular definition
(CORBA, ODMG, Java): An interface a
specification of all public properties (attributes and methods) of an object that the
programmer can use in a particular context.

Interface

is a concept different from
class

(
e.g. classes can be sold, interfaces
-

not
);

Interface

is a concept different from
type

(
e.g. exceptions are not relevant to types
).

Encapsulation

and

information

hiding


are

basic

principles

of

any

engineering,

including

software

engineering
.


A

TV

set

encapsulates

a

lot

details,

which

the

user

does

not

need

to

know
.


De
-
encapsulation

of

these

details

may

result

in

an

electric

shock

of

the

user
.

A

similar

threat

concerns

a

programmer

working

on

non
-
encapsulated

software
.

Bad understanding of encapsulation (all attributes are private, only some methods
are public) has led to the
absurd thesis on contradiction between encapsulation
and query languages
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
16

Sept. 2000

Inheritance

If two or more classes have common invariants, then they can be factored out to
another class. Hence
classes are organized in a hierarchy
.

An object inherits invariants from its class and from all its superclasses.

Multi
-
inheritance


many superclases are allowed.

Employee

name

date of birth

salary

age

salary net

Student

name

date of birth

faculty

grades

age

average grade

Person

name

date of birth

age

Employee

salary

salary net

Student

faculty

grades

average grade

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
17

Sept. 2000

Methods and Messages

A method is a procedure stored within a class.

It acts on an environment consisting of:

internal environment (attributes) of the currently processed object;

private and public properties of the same class and public properties of all its
superclasess;

base environment, which includes database, volatile variables/objects of the user
session and global environment (environment variables, libraries);

public properties of currently active program modules.

A message is a call of a method
.

Message passing does not mean parallel asynchronous communication between
autonomous agents (this was false association made by OO pioneers).

The usual syntax for messages: object . methodName [( parameters )]

e.g.




(Person
where

name = “Smith”). age

Query languages can introduce other syntax for messages, e.g.

e.g.




(Person
where

age > 30) . name

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
18

Sept. 2000

Types

A type is an expression, which constraints the content of objects or value.
Types determine input/output properties of operators, functions, procedures
and methods.
Specification of types is obligatory in strongly typed languages.

Types formally restrict
a context of the use

of objects, operators, methods, etc.

Strong (static) type checking

(strong typing): every use of objects, operators,
functions, methods, etc. is checked at compilation time.

Typing safety
: more than 80% of programmer’s errors are detected at compilation.

Dynamic type checking
(dynamic typing): types are checked at run
-
time. Less
efficient w.r.t. detecting errors.

ODMG OQL is strongly typed
, but the typing system is inconsistent (S.Alagic).

SQL and SQL3 are dynamically typed
.

Strong typing decreases the power of a language (generic programming) and
presents a difficulty for developers of DBMS. Thus the attitude of the commercial
world to strong typing is undetermined (officially approved, practically neglected).

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
19

Sept. 2000

Links (references, pointers)

Objects can be connected by explicit pointer links. Links support conceptual
modeling (see OMT, UML, etc.). Links can be directly used in queries (through path
expressions). They much simplify queries and are more efficient than joins.

NAME
Jones

SALARY
2500

WORKS_IN

EMPLOYEE

NAME
Brown

SALARY
3500

WORKS_IN

EMPLOYEE

NAME
Smith

SALARY
2000

WORKS_IN

EMPLOYEE

EMPLOYS

EMPLOYS

EMPLOYS

NAME
Syntex

LOC
London

BOSS

COMPANY

(EMPLOEE where NAME = “Smith”).WORKS_IN.COMPANY.BOSS.EMPLOYEE.NAME

A path expression in SBQL:

Name of the Smith’s boss:

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
20

Sept. 2000

Class Extents

An extent is a named collection of objects being current members of a class.


The concept has roots in the relational model, where a
declaration

of a table (a
protoplast of the class concept) is sticked with
creation

of the table (i.e. extent).

In PLs declarations of types/classes are separated from declaration/creation of
corresponding variables/objects.

An extent is a bit doubtful concept. For example, having a class Person and its
subclass Employee the extent for Person and the extent for Employee has a non
-
empty intersection: some parts of objects Employee are “virtual” parts of the extent
for Person. This can be the source of inconsistencies and programmers’ errors.

It is also easy to imagine the situation, where a class must have not one but many
extents. E.g. the class EmployeePhotoAlbum has an extent for each employee.

Probably, the extent concept gives very little for the database designers and requires
additional attention of programmers. In my opinion, it should be dropped.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
21

Sept. 2000

Collections

The

relational

model

deals

only

with

one

collection

-

a

table

(relation)
.

Object

models

(ODMG)

assume

more

collections,

in

particular
:

sets

(no

duplicates,

no

order)
;

bags

(duplicates

are

allowed,

no

order)
;

sequences

(duplicates

are

allowed,

the

order

is

informative)
;

arrays

-

as

sequences,

no

inserting/deleting

elements

except

top,

access

through

order

numbers
.

Collections

can

be

nested,

e
.
g
.

collection
-
valued

attributes

are

allowed
.

Moreover,

collection
-
valued

pointer

links

are

allowed

too
.


Collections

have

a

big

meaning

for

conceptual

modeling
.


Nested

collections

simplify

queries

(navigations

instead

of

joins)
.


Typical

object
-
oriented

programming

languages

have

no

explicit

collection

types
;

collections

must

be

modelled

by

some

tricks
.


K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
22

Sept. 2000

The OODBMS Ideals

Orthogonal

persistence
:

the

same

types

can

be

applied

to

persistent

and

volatile

objects
.

Ergo
:

a

query

language

should

process

persistent

and

volatile

data

uniformly
.

The

ideal

much

reduces

the

complexity

of

a

QL
.


Object

relativism
:

each

object

consists

of

objects
;

there

are

no

other

concepts

that

are

used

for

description

of

objects
.

Ergo
:

(atomic)

attributes

are

objects,

links

are

objects,

each

object

can

be

a

component

of

a

higher
-
level

object,

etc
.

The

ideal

much

reduces

the

complexity

of

a

QL
.


Total

internal

identification
:

each

run
-
time

program

entity,

which

can

be

separately

retrieved,

bound,

updated,

inserted,

indexed,

protected,

locked,

etc
.
,

must

possess

an

unique

internal

identifier
.

Internal

identifiers

can

be

used

as

references

(l
-
values)

by

various

language

constructs

(e
.
g
.

updating)
.

The

ideal

much

simplifies

semantics

of

a

QL

and

makes

it

consistent
.



Unfortunately,

the

above

ideals

are

not

respected

by

commercial

OODBMS
-
s

and

standards
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
23

Sept. 2000

Syntax, Semantics and Pragmatics of Languages

Each

language,

including

query

languages,

has

three

aspects
:

Syntax
:

describes

how

to

build

correct

expressions

of

the

language

from

the

alphabet

(basic

symbols)
.

Semantics
:

describes

what

expressions

of

the

language

denote
.

Semantics

is

the

basis

for

implementation,

in

particular

for

query

optimization
.

Semantics

is

usually

syntax
-
driven
:

semantic

rules

are

built

on

top

of

syntax

rules
.

Pragmatics
:

describes

how

to

use

the

language

to

accomplish

practical

needs
.

It

deals

with

mapping

concrete

problems

or

tasks

onto

expressions

of

the

language
.

Pragmatics

is

informal,

important

for

teaching,

frequently

the

most

difficult

for

users
.

(Even

the

developers

of

SQL
3

have

problems

how

to

use

their

own

creature!)


Many

languages

are

explained

through

syntax

and

pragmatics
.

Few

languages

are

specified

through

precise

formal

semantics
.



If

semantics

is

underspecified,

then

each

implementation

of

the

language

augments

the

specification

on

its

own

way
.

This

is

the

reason

of

low

portability

of

languages
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
24

Sept. 2000

Semantics of Query Languages

Semantics of a query is a function which maps a state into a result.

For any query language, what we have to define?

Query

-

a syntactic domain of all queries;

Result

-

a domain containing all possible results of queries;

State

-

a domain containing all possible states.

sem : Query


⡓瑡t攠


剥獵汴R

獥洠㨠兵敲礠


⡓瑡t攠


⡒敳畬琠


却慴攩)

獥洠㨠兵敲礠


⡓瑡t攠


却慴攩e

For queries with
no

side effects;

For queries
with

side effects;

For imperative queries (e.g. the
update

clause of SQL).

Formal semantics can be different from implementation (see SQL).

Stateless approaches to query languages (logic, algebra) have severe limitations.

General definitions of semantics:

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
25

Sept. 2000

Four sides of a query language

As follows from the previous slide, description of any query language must involve
four sides:


Description of data stored in a database that are to be queried, i.e. the description
of data/object model (formal definition of the set
State
);


Description of syntax (formal definition of the set
Query
);


Description of results returned by queries (formal definition of the set
Result
);


Description of the mapping from queries into results (formal definition of the
mapping
sem
).

Unfortunately, it is not easy to present all sides in detail during 180 minutes

of the tutorial.

Usually in practical languages, these definitions are incomplete, inconsistent or
even sloppy. If one has to be serious with implementation and query optimization
all these definitions must be as clean and precise as possible.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
26

Sept. 2000

If

the

subquery

select

*

from

Employee

as

y

where

y
.
salary

>

x
.
salary


has

to

be

evaluated

independently

of

the

context,

then

the

“free”

variable

x

must

be

stored

on

an

internal

structure

of

the

query

processing

machine
.


This

internal

structure

(stack)

augments

the

concept

of

“state”
.

What is “state”?

For real QLs
state

is more than
database state
. A state includes:

database:

all data, objects, classes, methods, etc. in the database;

volatile objects/variables/...

of the run
-
time environment of a user session;

local objects/variables/...

of all currently executed procedures, functions, methods;

global environment:

environment variables, libraries, files, etc
.

select

*
from

Employee
as

x

where

count(
select

*
from

Employee
as

y
where

y.salary > x.salary ) < 10

Get 10 best
-
paid employees:

A state includes temporary internal structures of the query processing machine:

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
27

Sept. 2000

The Closure Property

It

is

a

property

of

a

query

language

(or

a

theoretical

framework)

saying

that

the

input

and

output

of

queries

should

belong

to

the

same

formal

domain
.


For

relational

QLs,

the

input

consists

of

tables

and

the

output

is

a

table
.

For

object
-
oriented

QLs

the

input

is

a

set

of

objects

and

the

output

is

a

set

of

objects

too
.

According

to

its

advocates,

the

closure

property

is

a

condition

for

nesting

queries
.

I

disagree
.

Essentially,

the

closure

property

is

a

false

inconsistent

stereotype

inherited

from

the

relational

model
.

The

closure

property

does

not

hold

even

for

SQL
.

E
.
g
.

input

tables

are

named

and

output

tables

are

unnamed
;

hence

there

is

a

big

semantic

difference

between

input

and

output

tables
.

For

object
-
oriented

QLs

the

closure

property

is

a

conceptual

nonsense
.

In

particular,

it

leads

to

subdivision

of

queries

onto


object

preserving


and


object

generating
”,

which

is

a

nonsense

too
.



We

will

formulate

QLs

semantics

in

consistent

and

formally

correct

terms,


without

subdividing

queries

onto

“object

preserving”

and

“object

generating”
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
28

Sept. 2000

Results of Queries

In

ODMG

terms
,

queries

return

literals
.

We

generalize

this

concept
.

The

recursive

definition

below

defines

the

domain

Result
:

Each

atomic

value



Result
.

Each

reference

(to

an

object,

attribute,

link,

method,

view,

etc
.
)



Result
.

If

v



Result
,

n

is

a

name,

then

n(v)



Result
.

Such

results

will

be

called

binders
.

If

v
1
,

v
2
,

v
3
,

...



Result
,

then

row
{

v
1
,

v
2
,

v
3
,

...
}



Result
.

In

general,

the

order

of

elements

in

the

row

is

essential
.

This

construct

generalizes

a

tuple

known

from

relational

systems
.

If

v
1
,

v
2
,

v
3
,

...



Result
,

then

set
{

v
1
,

v
2
,

v
3
,

...

},

bag
{

v
1
,

v
2
,

v
3
,

...

},

sequence
{

v
1
,

v
2
,

v
3
,

...

},

...



Result
.


There

is

no

other

results
.

In

our

terms

queries

never

return

objects,

but

can

return

references

to

objects,

or

more

precisely,

some

structures

built

upon

references,

values

and

names
.


K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
29

Sept. 2000

Example results of queries

Atomic
:

25, "Smith", i
11,
i
18

Complex
:

row
{i
1
, i
21
}

bag
{
row
{i
1
, i
21
},
row
{i
7
, i
26
} }

bag
{
row
{ 2, Lecture(i
26
), Stud(
bag
{




row
{ n("Russel"), y(i
36
) },



row
{ n("Black"), y(i
30
) }})}


i
1
,

i
2
,...
-

references

We present bags of rows as rectangular tables (similarly to relational tables),

for example:

i
2

52
lect
(
i
21
)

i
8

44
lect
(
i
26
)

i
33

i
40

i
47

4

“Russell”

“Jones”

“Black”

i
15

p
(
i
1
)

p
(
i
7
)

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
30

Sept. 2000



Tutorial:


Object
-
Oriented Query Languages and Views


Part 2: The Stack
-
Based Approach

Lecturer:

Kazimierz Subieta



Polish
-
Japanese Institute of Information Technology,

Warsaw, Poland


Institute of Computer Science

Polish Academy of Sciences, Warsaw, Poland


subieta@ipipan.waw.pl


http://www.ipipan.waw.pl/~subieta

ADBIS
-

DASFAA'2000


Fourth International Symposium on Advances in Databases and Information Systems

September 5
-
8, 2000, Prague, Czech Republic

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
31

Sept. 2000

Why the stack
-
based approach (SBA) to QLs?

We would like to achieve:

Universality

concerning both data structures an QL/PL functionalities;

Modularity

and
compositionality

(the hierarchy of conceptual abstractions);

Regularity
, full
orthogonality

of the concepts;

Minimality

of semantic primitives;

Clean and precise
(formal)
semantics
;

Uniform approach and full integration with procedural capabilities:
updating, procedures, views, methods, etc.;

Precise treatment of object
-
oriented concepts
(classes, encapsulation, ...);

High potential for query optimization.

The motto of SBA (frequently neglected by database researchers):

Each, even apparently small semantic problem is a big problem.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
32

Sept. 2000

The Environment Stack (ES)

In PLs the environment stack is a basic mechanism to accomplish:

Abstraction:
the programmer can abstract from internal details of procedures.

Semantic independence and program reuse:
the meaning and the behaviour of
procedural abstractions is independent on the context of its use.

Recursion:
A procedure (function, method, view) can call other procedures, in
particular, can call itself. Encapsulation of local environments is preserved.

Consistent binding:
Name x is bound to the most local definition or declaration
of x. Other definitions or declarations of the name x should be allowed.

Parameter passing:
The stack makes it possible to store and manage parameters
of procedures and to accomplish consistently parameter passing methods.

Proper scoping
: a program entity should act only on the data environment and
name space that the programmer who has programmed the entity was aware of.

In SBA the environment stack has a new role: consistent semantic
mechanism for definition and implementation of query operators.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
33

Sept. 2000

Assumptions of the SBA: syntax

Unification of PL expressions and queries:

2+2

(x + y) * z

(x + (EMP
where

(NAME = “Smith”)).SAL) * z

EMP, NAME, SAL

persistent data

x, y, z

volatile data

2, “Smith”, 1000,...




x, y, z, EMP, NAME, SAL, ...

+,
-

,*, /, =, >,
where
, ., ...


binary

operators

sin, sqrt, sum, count, distinct,...

unary

operators

q1, q2 are queries,


楳i愠扩湡特r潰敲慴潲









楳i愠煵敲a

焠楳i愠煵敲aⰠ

††


楳i愠畮a特r潰敲慴潲





⡱⤠

楳i愠煵敲a

}

atomic queries

Abstract syntax and compositionality
: no big syntactic/semantic constructs,

e.g. no famous
select ... from ... where ... group by ... having ... order by ...

Big constructs decrease orthogonality, maintainability, reusability and optimization
potential, are more difficult in implementation, are the reason of irregularities.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
34

Sept. 2000

Assumptions of the SBA: semantics

The naming
-
scoping
-
binding principle:

Each
name

occurring in a query is
bound

to run
-
time database/program entities

(persistent objects, volatile objects, attributes, procedures, parameters, views,
methods,...) according to the actual
scope

for the name.



Scopes are organized in ES with the “
search
-
from
-
the
-
top
” rule.



ES is
separated

from the
object store.



Binding

of a name can be
multi
-
valued
(macroscopic binding).

This concerns:



names of persistent objects;



names of objects’ attributes, sub
-
attributes, ...;



auxiliary names (“variables”) defined within a query;



names of transient objects, programming variables and their attributes;



names of procedures, methods, operators, ...;



names of parameters of procedures, methods,...;



.....
any other name
.

In

SBA

the

domain

State

consists

of
:

Object

Store

+

Environment

Stack
.


K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
35

Sept. 2000

An Abstract Store Model

I
-

a set of
internal

identifiers

N
-

a set of
external

names

V
-

a set of
atomic
values, blobs, compiled
procedures, ...

< i, n, v >

< i
1
, n, i
2

>

< i, n, T >


T is a set of objects

atomic

object

link

object

complex

object

Store:

A set of objects +

A set of identifiers (roots)

+

some obvious constraints

(uniqueness of identifiers,


referential integrities)

No record, tuple, array, set, and bag constructors in the model:
essentially all
of them are collections of objects (“environments”).

No uniqueness of external names on any level of data hierarchy:
modeling
bulk data.

Uniform treatment of
relational, object
-
relational and pure object
databases.

Classes, inheritance and encapsulation

require extension.

The component of a ‘state”.

later

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
36

Sept. 2000

Tiny Database

i
1

EMP

i
9

EMP

i
5

EMP

i
13

DEPT

i
17

DEPT

i
2

NAME Brown

i
10

NAME Jones

i
6

NAME Smith

i
3

SAL 2500

i
7

SAL 2000

i
11

SAL 1500

i
4

WORKS_IN i
13

i
12

WORKS_IN i
17

i
8

WORKS_IN i
17

i
14

DNAME Toys

i
18

DNAME Sales

i
15

LOC Paris

i
16

LOC London

i
19

LOC Berlin

EMP
[0..*]

NAME

SAL

JOB[0..1]

age

DEPT
[0..*]

DNAME

LOC[1..*]


WORKS_IN

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
37

Sept. 2000

Universality of the Store Model

Complex hierarchical objects

can be defined (no limits of levels);
programming variables are treated as objects;

Orthogonal persistence
: we abstract from the
persistence status

of objects and
variables, i.e. we define in the same way persistent and transient objects;

Object relativity
: no difference in treatment of objects on any hierarchy level
-

big advantage for the universality, minimality and simplicity of semantics.

Total internal identification:
each entity stored in the store model, including
attributes, links, BLOBs, methods, views, etc. has a unique internal identifier;

Binary relationships

(associations) can be defined via link objects; as in the
ODMG model we do not deal with ternary and higher order relationships and
attributes of relationships (they must be reduced to binary ones);

Bulk data
: we deal with sets/bags. They are modeled by the same name assigned
to many objects on the same hierarchy level;

Relational structures
: each tuple is understood as an object with subobjects;


K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
38

Sept. 2000

What is binding?

For

example
:




procedure

name

occurring

in

a

program

is

substituted

by

a

call

of

a

machine

code
;



variable

name

is

substituted

by

an

address

of

a

main

memory



attribute

name

is

substituted

by

an

offset

relatively

to

the

beginning

of

a

structure
;



object

name

is

substituted

by

an

object

identifier
.

Binding is substituting a name occurring in a query or a program

by a run
-
time program entity
(entities).


Binding is
early

or
static
, if the substitution is made before the program is
executed (i.e. during compilation and linking).


Binding is
late

or
dynamic
, if the substitution is made during run time.

In query languages binding is usually dynamic, because of dynamic database features
(inserting new data, removing data, creating/removing views, etc.)

Static binding is sometimes used for optimization.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
39

Sept. 2000

What is binder?

Binder is an internal structure to determine bindings.

A binder consists of two parts:

General definition of binders:

n






剥獵汴R


渨 r )

is a binder

An
external

name defined by the
application designer or
programmer

An
internal

run
-
time program entity,
e.g. an object identifier, a value, a
procedure code.

This is an abstract view. In implementation binders may be not so explicit.


Binding
: for each external name occurring in a query a proper binder is found;

then the name is substituted by the corresponding internal entity.


Binders are written as
n(x)
, where
n

is an external name,
x

is an internal entity.

For a binder n( i ) name n may be different from the name of the object identified by i.


In query languages binders have additional roles.

Binders can be nested, i.e.
x

may consist of binders.

In general, a binder will be understood as a
query result

equipped with a
name
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
40

Sept. 2000

The Environment Stack

It consists of
sections
. Each section is a set of
binders
.

The stack is growing and shrinking according to program/query nesting.

The most
local
data

are at the top.

The most
global

data

are at the bottom.

......


Binders to local entities of

currently executed method

.....


NAME(i
2
) SAL(i
3
)

WORKS_IN(i
4
)


Binders to global entities

of the user session


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)


Binders to entities of

the global environment

The section of

the


“Tiny database”

The section of the
currently processed object

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
41

Sept. 2000

Binding through the environment stack

Binding ( G ) = “Mary”

Binding ( X ) = i
221

Binding( SAL ) = i
3

Binding( EMP ) = {i
1
, i
5
, i
9
}

Binding( DEPT ) = {i
13
, i
17
}



First the top section is visited, then lower sections are visited.



The search is finished after a binder
with the proper name

is found.



All
binders with the proper name form the result of the search.

Binding a name
-

search from the top:

......


G(“Mary”) X(i
221
)

.....


NAME(i
2
) SAL(i
3
)

WORKS_IN(i
4
)


...


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)


...

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
42

Sept. 2000

Opening a new scope by a query operator

In

PLs

opening

a

new

scope

(
an

activation

record
)

at

the

top

of

an

environment

stack

is

associated

with

an

activation

of

a

block

or

a

procedure
.


In

SBA

a

new

scope

at

the

top

of

the

environment

stack

is

opened

to

evaluate

a

query

component

in

the

context

determined

by

another

component
.

EMP
where

SAL > 1000

A context

EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

The ES state:

NAME(i
2
) SAL(i
3
)
WORKS_IN(i
4
)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

Operator

The ES state (in one iteration):

EMP is bound to {i
1
, i
5
, i
9
}

SAL is bound to i
3

A subquery evaluated in the context

The new scope

opened

by
where

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
43

Sept. 2000

Function
“nested”

Given identifier

i

of a complex object, the function
nested

returns binders

to
direct sub
-
objects

of the object identified by
i
.

nested
( i
1

) = { NAME( i
2

), SAL( i
3

), WORKS_IN( i
4

)}

nested
( i
5

) = { NAME( i
6

), SAL( i
7

), WORKS_IN( i
8

)}

Function
nested

determines
a new scope

-

the environment in which the subquery
will be evaluated. The scope is
pushed

at the top of the environment stack.


Function

nested

is naturally generalized for any r


剥獵汴R

If
l

is a link, then
nested
(
l
) returns the binder to the entity that the link points to.

If b is a binder, then
nested

(
b
) returns
b

(no change).

For rows
, nested

( row{v1, v2, ...} ) =
nested
(v1)


nested
(v1)

...

EMP
where

SAL > 1000

yields {
i
1
,
i
5
,

i
9
}

A context

A subquery evaluated in the context

For “Tiny database”:

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
44

Sept. 2000

The Language SBQL

A

formalized

variant

of

SQL
-
like

languages,

including

ODMG

OQL

and

SQL
3
.

It

is

relevant

to

relational,

object
-
relational

and

object
-
oriented

models
.

Abstract

syntax,

free

of

sugar
.

Literals, names, unary or binary operators, parentheses.

Relativity:

Syntax:

Orthogonality:

Each query is evaluated
relatively

to the state of the environment stack.

Thus queries such as
SAL > 1000

have semantics independent from the
context
(providing ES contains a SAL binder).

1000

EMP

SAL

2+2

SAL > 1000

EMP
where

(SAL > 1000)

((( EMP
where

(SAL > 1000)) . WORKS_IN ) . DEPT ) . (DNAME, LOC)

Examples queries:

Query operators are subdivided into
algebraic

and
non
-
algebraic
.

Algebraic operators

do not deal with the environment stack.

Non
-
algebraic operators

operate on the environment stack.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
45

Sept. 2000

SBQL
-

Algebraic Operators



Numerical and string comparisons, operators and functions:


=, <, +, *,
concatenation
, sqrt, sin, log, ...



Boolean
and, or,

and
not



Aggregate arithmetic functions
sum, max, min, avg



Function
count
, function for removing duplicates, function
exists



Equality of complex query results

(shallow, deep)



Dereferencing operator
(usually implicit)




Coercion operators
(changing representation and types);




Operators for bags
(union, intersections, difference, equality, ...)



Operators for sets
(union, intersections, difference, equality, containment)



Operators for sequences
(concatenation, i
-
th element, sorting,...)



Cartesian product



... many other operators

Examples of use:

2+2

SAL > 1000

NAME = “Smith”
and

SAL > 1000

implicit dereferencing

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
46

Sept. 2000

SBQL
-

Declaration of an Auxiliary Name

Unary algebraic operator

Syntax:

q

as

n

n

-

auxiliary name,

q

-

a query returning a single
-
column table,


e.g. bag{
x
1
, x
2
, ... x
n
}

Applications
:

SQL, OQL correlation variables (“synonyms”):

Variables bound by quantifiers:

Cursors in “for each” statements:

Virtual names in SQL
-
like views.

EMP
as

e



EMP
as

e ( ... )

x
1

x
2

...

x
k

Let return a bag:

q

DEPT
as

d

for each

EMP
as

x
do

...

Semantics:

Then returns the bag of binders:


q

as

n

n(x
1
)

n(x
2
)

...

n(x
k
)

Each value returned by q is
equipped with the name n.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
47

Sept. 2000

SBQL
-

Non
-
algebraic Operators

Syntax:


q
1

where

q
2

q
1

. q
2

q
1

closed by

q
2

q
1

order by

q
2

q
1

join

q
2



q
1

( q
2

)



q
1

( q
2

)

Semantics
-

the uniform homogeneous idea:


For each
element

r

of the
collection

returned by
q
1


the query
q
2

is evaluated

with the environment stack augmented by
nested
(
r

)
.


A
partial result

is a combination of
r

and the result returned by
q
2
.

All partial results are merged into the
final result
.

dependent join (plus possibly other operators)

(ordering)

(transitive closure)

All these operators are implemented in SBQL.

If


is a non
-
algebraic operator, then in q
1



q
2

the evaluation order of q
1

and q
2

is
essential. Hence non
-
algebraic operators do not follow basic propert
ies

of algebraic
expressions. In contrast to relational/object algebras, our non
-
algebraic
operators
are
not indexed by (informal) meta
-
language expressions. No informal treatment of
names: each name in a query, including names of attributes, links, etc. precisely
follows the same scoping
-
binding discipline.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
48

Sept. 2000

SBQL: Selection

The row
r
belongs to the final result, iff

q
2

returns

TRUE
for it.

q
1

where

q
2

For each row r returned by q
1
query q
2

is evaluated

with the stack ES augmented by
nested
( r ).

EMP

where
SAL > 1800



i
1




i
5




i
9



EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
2

) SAL( i
3

)

WORKS_IN( i
4

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
6

) SAL( i
7

)

WORKS_IN( i
8

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
10

) SAL( i
11

)

WORKS_IN( i
12

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

ES

i
3

i
7

i
11

1800

1800

1800

TRUE

TRUE

FALSE

(2500)

(2000)

(1500)

The

final

result

i
1




i
5

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
49

Sept. 2000

SBQL: Projection, navigation

The final result is the union of tables returned by

q
2

.

q
1

.

q
2

For each row r returned by q
1
query q
2

is evaluated

with the stack ES augmented by
nested
( r ).

EMP

.
SAL


i
1




i
5




i
9



EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
2

) SAL( i
3

)

WORKS_IN( i
4

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
6

) SAL( i
7

)

WORKS_IN( i
8

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
10

) SAL( i
11

)

WORKS_IN( i
12

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

ES

i
3

i
7

i
11

i
3


i
7


i
11

The

final

result

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
50

Sept. 2000

SBQL: Navigational (Dependent) Join

A partial result is a concatenation of
r

with each row returned
by q
2
. The final result is the union of partial results.

q
1

join

q
2

EMP


join


SAL


i
1




i
5




i
9



EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
2

) SAL( i
3

)

WORKS_IN( i
4

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
6

) SAL( i
7

)

WORKS_IN( i
8

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

NAME( i
10

) SAL( i
11

)

WORKS_IN( i
12

)


EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

ES

i
3

i
7

i
11

i
1
i
3


i
5
i
7


i
9
i
11

The

final

result

For each row
r

returned by q
1
query q
2

is evaluated

with the stack
ES

augmented by
nested
( r )
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
51

Sept. 2000

Examples in OQL and SBQL

select

e
.
NAME
,
d
.
DNAME
, count(
select

x

from

d
.
EMPLOYS

as

x
)

from

EMP

as

e
,
e
.
WORKS_IN

as

d

where

e
.
JOB

= "
manager
"

((((
EMP

as

e
)

join


(((
e
.
WORKS_IN
).
DEPT
)
as

d
))

where

((
e
.
JOB
) = "
manager
"))) .

(((
e
.
NAME
), (
d
.
DNAME
)),
count
((((
d
.
EMPLOYS
).
EMP
)
as

x
) .
x
))

OQL:

SBQL:

Syntactic differences, but striking similarity
of

the idea
s
.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
52

Sept. 2000

Classes, inheritance and encapsulation on ES

The state of the environment stack

The state of the object store

EMP

....

PERSON

....

DEPT

....

i
1

EMP

...

...

...

i
17

DEPT

...

...

...

i
13

DEPT

...

...

...

i
6

EMP

...

...

...

i
1

EMP

...

...

...

The order of search during binding name
X

Binders to attributes of the tested EMP object

Binders to properties of the EMP class

Binders to properties of the PERSON class

Some invisible sections (for lexical scoping)

Binders to properties of the current session

Binders to database objects, views, ...

Binders to global procedures, variables,...

The query:
EMP

where

...
X

...

If some operator opens a section on ES for an object,

then sections of its classes and super
-
classes are inserted onto ES, in proper order
(shown on the picture below).

Encapsulation
:

some sections inserted on ES contain only binders to public properties.

Polymorphism

-

a a side effect of the above ES rules.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
53

Sept. 2000

Object views

Customization,

conceptualization,

encapsulation
.

The

user

works

with

a

part

of

a

database

that

is

relevant

to

his/her

area

of

interest

in

a

way

that

is

convenient

for

his/her

everyday

processing

routines

and

concepts
.

Views

have

the

potential

of

significant

improvement

of

programmer’s

productivity
.

Security,

privacy,

and

autonomy
.

Views

restrict

possible

accesses

only

to

a

relevant

part

of

a

database
.

In

federated

databases

this

restriction

is

required

to

enable

the

autonomy

of

local

databases

Interoperability,

heterogeneity,

schema

integration,

legacy

applications
.

Views

enable

integration

of

heterogeneous

databases,

allowing

understanding

and

processing

foreign,

legacy

or

remote

databases

according

to

a

common,

unified

schema
.


Data

independence,

schema

evolution
.

Views

enable

the

user

to

change

database

organization

and

schema

without

affecting

already

written

applications
.

Expectations:

The current state
-
of
-
the
-
art is immature, the above expectations are actually wishes.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
54

Sept. 2000

Views are Stored Functional Procedures

function

RichEmp


begin return

EMP

where

SAL

> 1800
end
;

(
RichEmp

where

JOB

= "
designer
").
NAME

The name
RichEmp

is bound and the function is invoked. A new ES section contains
only a return address. Then, the query within the body of this function is executed.

It returns
bag
{
i
1
,
i
5
}, containing identifiers of
Brown

and
Smith

objects. This bag is a
function output.

Function

declaration:

Function

invocation:

EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

The initial ES state



EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

After invocation of

the
RichEmp

function

NAME(i
2
) JOB(..)

SAL(i
3
) WORKS_IN(i
4
)


age(...)






EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

After
where

within

the
RichEmp

function

EMP(i
1
) EMP(i
5
) EMP(i
9
)

DEPT(i
13
) DEPT(i
17
)

After return from

the function

.....

The section for

view invocation

The section for

EMP class

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
55

Sept. 2000

Two points of view on the database content

DEPT

DEPT

EMP

EMP

EMP

RichEmp

Objects

Stored

functional

procedure

a) The database seen

by the view definer

b) The database seen

by a view user

DEPT

DEPT

EMP

EMP

EMP

Objects

RichEmp

Imaginary

objects

RichEmp

The imaginary
RichEmp

objects adopt the entire semantics of
EMP

objects:

(
RichEmp

where

age

< 30).


WORKS_IN.DEPT.DNAME


RichEmp
as

r

(
r
.
age

> 50



and

"
Rome
"


r
.
WORKS_IN.DEPT.LOC
)

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
56

Sept. 2000

Generalized functions

The stack
-
based semantics makes no restriction in calling functions from inside of
functions, i.e. to define a view through other views. Loqis makes it possible to define
arbitrary stored functional procedures, including procedures with parameters (being
arbitrary queries), with local objects, recursive, with side effects, and higher
-
order
(procedures with parameters being procedures).


For example, the
WellPaid

function has a parameter
JobPar

and a local variable
a
:

function

WellPaid
(
JobPar

)


begin


local
a
:
real
;


a

:=
avg
(
EMP.SAL
);


return

EMP

where

JOB

=
JobPar
and

SAL

> 2*
a
;


end
;

Get names of departments of well paid clerks:

LowPaid
( "
clerk

").
WORKS_IN. DEPT. DNAME

The algorithm can be

complex.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
57

Sept. 2000

Query modification to process views

Macro
-
substitution of view invocation by the view body.

function

RichEmp


begin return

EMP

where

SAL

> 1800
end
;

(
RichEmp

where

JOB

= "
designer
").
NAME

Function

declaration:

Function

invocation:

After

macro
-
substitution:

((
EMP

where

SAL

> 1800)
where

JOB

= "
designer
").
NAME

Because of uniformity and universality of the stack
-
based semantics

query modification in SBA is very easy to implement.


Query modification makes it possible to use other optimization methods:

rewriting, access through indices.

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
58

Sept. 2000

Performance of views

Various approaches to object views do not consider this issue.

If this issue is unsolved, there will be
no chance

for object views.


What can be done?


Caching view results

(materialized views): there is a side problem of keeping
materialized views consistent with stored data (incremental updating algorithms).

Query modification:
macro
-
substitution of invocation of views by bodies of view
definitions. Applicable only to views defined by single queries. The method can be
difficult for more complex views.

Predicate
-
move
-
around
:
a predicate occurring in a query invoking a view modifies
the view definition. The method is not
tested practically
(probably).

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
59

Sept. 2000

Updatable views

A challenging theme.

Typical theoretical approaches (algebras, calculi, logic, functional...) are stateless,
hence updating operations are non
-
expressible. There are naive believes that
translations of view updates can be done in some auto
-
magical way, e.g. through
employing some constraints.

Oracle, with „do instead” triggers, is perhaps on the right way.

Updating through methods
encapsulated in a class of virtual objects

(no generic
updating operations). An idealistic view on object
-
orientedness and bad
understanding of encapsulation.

Updating through references
returned by view invocations

(
undesirable side
effects, no universality
).

Overriding generic updating operations
by codes written by the view definer
(the mentioned Oracle way).

K.Subieta.
Object
-
Oriented Query Languages and Views
, slide
60

Sept. 2000

Conclusion
s

Object
-
oriented and object
-

relational query languages are a very important feature
of future DBMS.

Query languages have the potential to increase significantly the efficiency of
programming, as well as the maintainability and reusability of programs.

The current state
-
of
-
the
-
art of object query languages is immature. For us,
researchers, this is optimistic opportunity


we have a lot of work to do.

Commercial achievements, in particular OQL and SQL3, are a big progress, but
suffer from inconsistencies, underspecified semantics and eclecticism.

Theoretical proposals have many limitations, are mathematically and/or
conceptually inconsistent (object algebras), present wishful thinking.

Object views are still not developed to a satisfactory degree, both theoretically and
practically.

Optimization of object queries and views is not developed to a satisfactory degree

The stack
-
based approach to object query languages and views is a right
paradigm, which is able to make significant progress in the area.