What is a Data Model?

obtainablerabbiΔιαχείριση Δεδομένων

31 Ιαν 2013 (πριν από 4 χρόνια και 11 μήνες)

245 εμφανίσεις

©Silberschatz, Korth and Sudarshan

1.
1

Chapter 1: Introduction


Purpose of Database Systems


View of Data


Data Models


Data Definition Language


Data Manipulation Language


Transaction Management


Storage Management


Database Administrator


Database Users


Database Management System Structure

©Silberschatz, Korth and Sudarshan

1.
2

What is a Database?


According to the book:


Collection of interrelated data


Set of programs to access the data


A DBMS contains information about a particular enterprise


DBMS provides an environment that is both
convenient

and
efficient

to use.



Another definition:


A
database

is a collection of organized, interrelated data, typically
relating to a particular enterprise


A
Database Management System

(DBMS) is a set of programs for
managing and accessing databases

©Silberschatz, Korth and Sudarshan

1.
3

Some Popular

Database Management Systems


Commercial “off
-
the
-
shelf” (COTS):


Oracle


IBM DB2 (IBM)


SQL Server (Microsoft)


Sybase


Informix (IBM)


Access (Microsoft)



Open Source:


MySQL


PostgreSQL


Note: This is
not

a course on any particular DBMS!

©Silberschatz, Korth and Sudarshan

1.
4

Some Database Applications



Databases touch all aspects of our lives:


Banking


accounts, loans, customers


Airlines
-

reservations, schedules


Universities
-

registration, grades


Sales
-

customers, products, purchases


Manufacturing
-

production, inventory, orders, supply chain


Human resources
-

employee records, salaries, tax deductions



Anywhere there is data, there could be a database.



Course context is an “enterprise” that has requirements for:


Storage and management of 100’s of gigabytes or terabytes of data


Support for 100’s or more of concurrent users and transactions


Traditional supporting platform, e.g, Sun Enterprise server, 2GB RAM, 10TB
of disk space

©Silberschatz, Korth and Sudarshan

1.
5

Purpose of Database System


In the early days, database applications were built on top of
file systems


coded from the ground up. Sometimes this
approach is still advocated.



Drawbacks of this approach:


Data redundancy and inconsistency


Multiple files and formats


Difficulty accessing data


A new program is required to carry out each new task


Data integrity problems


Integrity constraints (e.g. account balance > 0) become
embedded throughout program code


Hard to add new constraints or change existing ones


Plus others…


©Silberschatz, Korth and Sudarshan

1.
6

Purpose of Database Systems (Cont.)


Database systems offer solutions for the above problems.



Database systems also support:


Atomicity of updates


Failures may leave database in an inconsistent state with partial updates


E.g., transfer of funds from one account to another should either
complete or not happen at all


Concurrent access by multiple users


Concurrent accessed is needed for performance


Uncontrolled concurrent accesses can lead to inconsistencies


E.g. two people reading a balance and updating it at the same time


Data security



Recoding this functionality from scratch is not easy!

©Silberschatz, Korth and Sudarshan

1.
7

Levels of Abstraction


Physical level: defines low
-
level details about how a data item
(e.g., customer) is stored on disk.



Logical level: describes data stored in a database, and the
relationships among the data.



type

customer =
record




name

: string;




street

: string;




city

: integer;



end
;



View level: defines how information is presented to users. Views
can also hide details of data types, and information (e.g., salary)
for security purposes.

©Silberschatz, Korth and Sudarshan

1.
8

View of Data

An “architecture” for a database system:

©Silberschatz, Korth and Sudarshan

1.
9

View of Data, Cont.


In general, the interfaces between the three levels should be
defined so that changes in some parts do not seriously influence
others.



Carefully defining the interfaces helps enhance

Physical Data
Independence
, which is the ability to modify the physical schema
without changing the logical schema.

©Silberschatz, Korth and Sudarshan

1.
10

Instances vs. Schemas


Similar to types and variables in programming languages



Schema



defines the structure or design of a database


Analogous to type information of a variable in a program


E.g., a database might consists of information about a set of
customers and accounts and the relationship between them


More precisely:


Logical schema
: database design at the logical level


Physical schema
: database design at the physical level



Instance



a database and its’ contents at one point in time


Analogous to a variable and its’ value

©Silberschatz, Korth and Sudarshan

1.
11

What is a Data Model?


The phrase “data model” is used in a couple of different ways.



The phrase (use #1) is frequently used to refer to an overall
approach or philosophy for database design and development.



For those individuals, groups and corporations that subscribe to
a specific data model, that model permeates all aspects of
database design, development, implementation, etc.



Current data models:


Entity
-
Relationship model


Relational model


Object
-
oriented model


Object
-
relational model


Semi, and non
-
structured data models



Legacy models:


Network


Hierarchical

©Silberschatz, Korth and Sudarshan

1.
12

What is a Data Model, Cont?


During the early phases of database design and development, a
“data model” is frequently developed (use #2).



The purpose of developing the data model is to define:


Data


Relationships between data items


Semantics of data items


Constraints on data items


In other words, a data model defines the logical schema, i.e., the
logical level of design of a database.



A data model is typically conveyed as one or more diagrams.



The type of diagrams used depends on the overall approach or
philosophy (i.e., the data model, as defined in the first sense).



This early phase is referred to as
data modeling
.

©Silberschatz, Korth and Sudarshan

1.
13

Entity
-
Relationship Model

Example of an entity
-
relationship diagram:










Widely used for database modelling.


An ER model is converted to tables in a relational
database.

©Silberschatz, Korth and Sudarshan

1.
14

Relational Model


Example of tabular data in the relational model













From a data modeling perspective, which approach is preferable?
The ER model, or the relational model?

Attributes

customer
-
name

customer
-
id

customer
-
street

customer
-
city

account
-
number

Johnson


Smith


Johnson


Jones


Smith

192
-
83
-
7465


019
-
28
-
3746


192
-
83
-
7465


321
-
12
-
3123


019
-
28
-
3746

Alma


North


Alma


Main


North

Palo Alto


Rye


Palo Alto


Harrison


Rye

A
-
101


A
-
215


A
-
201


A
-
217


A
-
201

©Silberschatz, Korth and Sudarshan

1.
15

A Sample Relational Database

©Silberschatz, Korth and Sudarshan

1.
16

Data Definition Language (DDL)


Notation and program for defining a (physical) database schema (see
page 129 for a more complete example):



create table

account

(


account
-
number

char
(10),


balance


integer
)



Given a DDL file, the DDL compiler generates a set of tables



A description of those tables is stored in a
data dictionary:


Contains information from the database schema


Frequently referred to as metadata (i.e., data about data)



Data
storage and definition

language:


Language in which the storage structure and access methods used by the
database system are specified


Usually an extension of the data definition language

©Silberschatz, Korth and Sudarshan

1.
17

Data Manipulation Language (DML)


Language for accessing and manipulating the data


DML is also known as query language



Two classes of DML languages:


Procedural


user specifies what data is required and how to get
that data


Non
-
procedural


user specifies what data is required without
specifying how to get that data



SQL is the most widely used query language


Usually referred to as a non
-
procedural query language

©Silberschatz, Korth and Sudarshan

1.
18

SQL Examples


Find the name of the customer with customer
-
id 192
-
83
-
7465:



select

customer.customer
-
name


from

customer


where

customer.customer
-
id

= ‘192
-
83
-
7465’



Find the balances of all accounts held by the customer with customer
-
id
192
-
83
-
7465:



select

account.balance


from

depositor
,
account


where

depositor.customer
-
id

= ‘192
-
83
-
7465’
and



depositor.account
-
number = account.account
-
number



Databases are typically accessed by:


Users through a command line interface


Application programs that (generally) access them through:


Language extensions to allow embedded SQL


An application program interface (e.g. ODBC/JDBC) which allow SQL
queries to be sent to a database


©Silberschatz, Korth and Sudarshan

1.
19

Database Users


Users are differentiated by the way they interact with the system



Naïve users


invoke application programs that have been written
previously


E.g. people accessing a database over the web, bank tellers, clerical staff,
ATM users



Application programmers


interact with the system by making DML
calls through an API, e.g., ODBC or JDBC from within a computer
program



Sophisticated users


form requests in a database query language,
typically submitted at the command
-
line



Specialized users


write specialized database applications that do not
fit into the traditional data processing framework


©Silberschatz, Korth and Sudarshan

1.
20

Database Administrator (DBA)


Coordinates all the activities of the database system; the DBA has a good
understanding of the enterprise’s information resources and needs.



DBA duties include:


Granting user authority to access the database


Acting as liaison with users


Installing and maintaining DBMS software


Monitoring performance and performance tuning


Backup and recovery



According to the book, the DBA is also responsible for:


Schema definition


Storage structure and access method definition


Schema and physical organization modification


Specifying integrity constraints


Responding to changes in requirements



These latter tasks are typically performed by a DB designer.

©Silberschatz, Korth and Sudarshan

1.
21

Transaction Management



A
transaction

is a collection of operations that performs a single
logical function in a database application



The
backup and recovery

components of a DBMS ensure that the
database remains in a consistent (correct) state despite failures:


system, power, network failures


operating system crashes


transaction failures.



The
concurrency
-
control manager

in a DBMS controls the
interaction among the concurrent transactions, to ensure the
consistency of the database.

©Silberschatz, Korth and Sudarshan

1.
22

Storage Management


The
storage manager

in a DBMS provides the interface between
the low
-
level data stored in the database and the application
programs and queries submitted to the system.



The storage manager is responsible to the following tasks:


interaction with the file manager


efficient storing, retrieving and updating of data



Note that the DBMS may or may not make use of the facilities of
the operating systems’ file management facilities.

©Silberschatz, Korth and Sudarshan

1.
23

Overall System Structure

Query Optimizer

©Silberschatz, Korth and Sudarshan

1.
24

Application Architectures


Architectures:



Mainframe



client programs and DBMS reside on one platform.


Two Tier



client programs and DBMS reside on different platforms;
clients connect to DBMS via an API such as ODBC/JDBC.


Three Tier



client programs, application server (or other “middleware”),
and DBMS; clients connect to DBMS indirectly through the application
server (also via an API). Typically used in web
-
based applications.


N Tier



recent generalization of 2 and 3 tier architectures.