CS 430 Database Theory

Winter 2005

Lecture 1: Introduction


What’s a Database


“Collection” of “related” “data”


Contains data about some aspect of the “real world”


Refers to a Universe of Discourse (UoD)


A “logically coherent” collection of data


Has a “specific purpose”


Typical Characteristics of Databases

(1 of 3)


“Large”


Typically bigger than a spreadsheet


May be very large


Example, IRS Tax return database:


About 200M returns per year, 5 year retention


About 1K-10K bytes per return (guess)


About 1-10 Terabytes (without overhead)


Shared


More than a single user and a single application
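
The IRS size estimate above can be reproduced with a little arithmetic. The sketch below only restates the slide's own guesses (200M returns per year, 5 year retention, 1K-10K bytes per return); the function and constant names are made up for illustration, and a terabyte is assumed to be 10^12 bytes.

```python
# Back-of-envelope size estimate using the figures guessed on the slide.
RETURNS_PER_YEAR = 200_000_000   # "about 200M returns per year"
RETENTION_YEARS = 5              # "5 year retention"
BYTES_PER_RETURN_LOW = 1_000     # "about 1K bytes per return"
BYTES_PER_RETURN_HIGH = 10_000   # "about 10K bytes per return"
TERABYTE = 10**12                # assumption: decimal terabyte


def estimate_bytes(returns_per_year, years, bytes_per_return):
    """Raw data volume in bytes, ignoring indexes and other overhead."""
    return returns_per_year * years * bytes_per_return


low_tb = estimate_bytes(RETURNS_PER_YEAR, RETENTION_YEARS, BYTES_PER_RETURN_LOW) / TERABYTE
high_tb = estimate_bytes(RETURNS_PER_YEAR, RETENTION_YEARS, BYTES_PER_RETURN_HIGH) / TERABYTE

print(f"{low_tb:g} to {high_tb:g} TB")  # prints "1 to 10 TB"
```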


Typical Characteristics (2 of 3)


Structured


More than a simple flat table


Self describing


Contains Metadata (data about data) describing the data contained in the database


Metadata is maintained separately from the applications that use and manipulate the data


Has a Catalog, which is a “database” of the Metadata
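
One way to see "self describing" in practice is to ask a DBMS about its own catalog. The sketch below uses SQLite through Python's standard sqlite3 module purely because it needs no server; it is not the course's MySQL setup, and the `student` table and helper names are hypothetical. Other systems expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def list_tables(conn):
    """Read table names from SQLite's catalog table, sqlite_master."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def describe(conn, table):
    """Return (column name, declared type) pairs for one table.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so pass only trusted names.
    """
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]


print(list_tables(conn))
print(describe(conn, "student"))
```

The application never hard-codes the schema here: it discovers the table and its columns by querying the metadata, which is the point of keeping a catalog inside the database.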


Typical Characteristics (3 of 3)


Supports multiple views of the data


Different users and applications can view the data differently


ACID properties


Atomicity



Atomic transactions (updates are all or nothing)


Consistency



Enforces integrity constraints


Isolation



Transactions are isolated from each other


Durability



Data from completed transactions is never lost
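
Two of the characteristics on this slide, views and atomic transactions, can be sketched in a few lines. This is a minimal illustration using SQLite via Python's sqlite3 module, not the course's MySQL setup; the `account` table, the `account_directory` view, and all function names are hypothetical. The transfer deliberately credits before it debits so that a failed debit shows the earlier update being undone.

```python
import sqlite3


def open_db():
    """Create an in-memory database with one table and one view over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE account (
               id INTEGER PRIMARY KEY,
               owner TEXT NOT NULL,
               balance INTEGER NOT NULL CHECK (balance >= 0))"""
    )
    # A view for users who may see who holds an account but not the balance.
    conn.execute("CREATE VIEW account_directory AS SELECT id, owner FROM account")
    with conn:
        conn.executemany(
            "INSERT INTO account VALUES (?, ?, ?)",
            [(1, "alice", 100), (2, "bob", 50)],
        )
    return conn


def directory(conn):
    return conn.execute("SELECT id, owner FROM account_directory ORDER BY id").fetchall()


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False
```

With the sample data, `transfer(conn, 1, 2, 500)` returns `False` and leaves both balances unchanged, while `transfer(conn, 1, 2, 30)` returns `True` and moves the money.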


A Little History of Databases (1 of 3)


Mid to late 1960s - first databases


Applications


Maintain parts data for Lunar Lander


Airline reservations


Multiple data models


Hierarchical, Network, Inverted File System


Early to mid 1970s - Relational data model


Edgar Codd


Father of the relational database


Basis for SQL (Structured Query Language)


History (2 of 3)


1979


Oracle Version 2


First released version (numbered 2 as a marketing decision)


Incomplete and slow


Early 1980s


IBM DB2 Version 1


Used to define the SQL standard


Late 1980s


Object Oriented databases


Created to manage data for “non-traditional” applications


History (3 of 3)


1990s


Object Relational Databases


Pioneered by Michael Stonebraker


Today


Dominant technology: Relational DBMS (RDBMS)


Oracle, MS SQL Server, IBM DB2, …


MySQL, PostgreSQL, …


OO capabilities being added to RDBMS


New: Object-Relational Mapping Software


Tries to handle the “impedance mismatch” between RDBMS and OO programming languages
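
The “impedance mismatch” is easiest to see when an object holds a nested collection while the relational side holds flat rows. The sketch below maps by hand what an object-relational mapper would automate; it assumes SQLite via Python's sqlite3 module, and the `Student` class, table names, and functions are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class Student:
    """Object side: one object owning a nested list of course codes."""
    id: int
    name: str
    courses: List[str] = field(default_factory=list)


def open_db():
    """Relational side: the same information spread over two flat tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE enrollment (student_id INTEGER NOT NULL, course TEXT NOT NULL)"
    )
    return conn


def save(conn, student):
    """Flatten one object into rows in both tables, in a single transaction."""
    with conn:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student.id, student.name))
        conn.executemany(
            "INSERT INTO enrollment VALUES (?, ?)",
            [(student.id, course) for course in student.courses],
        )


def load(conn, student_id):
    """Reassemble the object from its rows, or return None if it is absent."""
    row = conn.execute(
        "SELECT id, name FROM student WHERE id = ?", (student_id,)
    ).fetchone()
    if row is None:
        return None
    courses = [
        course
        for (course,) in conn.execute(
            "SELECT course FROM enrollment WHERE student_id = ? ORDER BY course",
            (student_id,),
        )
    ]
    return Student(row[0], row[1], courses)
```

Every class needs its own `save` and `load` pair written this way, which is the repetitive translation work that mapping software tries to take over.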


Database Applications (1 of 2)


Traditional


Business applications


Personnel, accounting, ...


Student and Course data


Traditional data types


Numbers, strings, dates


Data warehousing


Large “historical” databases for analytic support


Manufacturing Control


Real-time issues


Database Applications (2 of 2)


Non-traditional


Image and Video


GIS (Geographic Information Systems)


Engineering


CAD (Computer Aided Drafting or Design)


Time Series


Stock market data


Full text search


Environmental and Remote Sensing


Database Management System (DBMS)


Software that manages and/or facilitates


Data definition


E.g. creating and maintaining the catalog


Data construction


E.g. loading data into the database


Data manipulation


Applications retrieving and updating the database


Data sharing


ACID properties
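
The first three DBMS functions listed above (definition, construction, manipulation) can be sketched as three small steps. This assumes SQLite via Python's sqlite3 module rather than the course's MySQL; the `course` table, the inline CSV data, and the function names are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical load file, inlined so the sketch is self-contained.
COURSE_CSV = """code,title,credits
C101,Intro to Databases,4
C102,Data Modeling,3
"""


def define(conn):
    """Data definition: the schema is recorded in the DBMS catalog."""
    conn.execute(
        """CREATE TABLE course (
               code TEXT PRIMARY KEY,
               title TEXT NOT NULL,
               credits INTEGER NOT NULL)"""
    )


def construct(conn, csv_text):
    """Data construction: bulk-load rows from an external source."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["code"], r["title"], int(r["credits"])) for r in reader]
    with conn:
        conn.executemany("INSERT INTO course VALUES (?, ?, ?)", rows)
    return len(rows)


def manipulate(conn):
    """Data manipulation: an application updates and then retrieves data."""
    with conn:
        conn.execute(
            "UPDATE course SET credits = credits + 1 WHERE code = ?", ("C102",)
        )
    return conn.execute("SELECT code, credits FROM course ORDER BY code").fetchall()
```

Calling `define`, then `construct(conn, COURSE_CSV)`, then `manipulate(conn)` on a fresh connection walks through the three functions in order.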


DBMS In Context

[Figure: Database System. Users/Programmers submit Application Programs and External Queries to the DBMS Software (Query Processing, Application Program Interface, Access/Update Stored Data), which reads and writes the Metadata (Catalog) and The Data.]

Elmasri and Navathe, Figure 1.1, Page 6


Database People (Actors) (1 of 2)


Data Administrator


Responsible for correctness of the data


Database Administrator


Configure DBMS, manage data storage, DBMS performance tuning


Database Designer


Design the database


All three of these may be the same person or group of people


Database People (2 of 2)


Application Analysts and Developers


Responsible for analyzing, designing, building, and maintaining database applications


End Users


Use the database to accomplish useful work


Why use a DBMS? (1 of 2)


Manage redundancy


If the same data is stored in multiple places without periodic reconciliation, the copies will eventually become inconsistent


Access Control


Not all users can view and/or update all the data


Persistent storage of program data


Rather than having to implement your own DBMS internal to your application


Why use a DBMS? (2 of 2)


Efficiency


DBMS vendors have done a lot of work to make their products work efficiently


Mixed blessing (see “Why not to use a DBMS?”)


Enforce integrity constraints


Defined and enforced once


Share data


Among multiple applications, GUIs, users


ACID Properties


Difficult to implement correctly
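
“Defined and enforced once” means the rule lives in the schema, so no application has to re-implement it. The sketch below assumes SQLite via Python's sqlite3 module (where foreign keys must be switched on per connection); the tables and the `enroll` function are hypothetical, and `enroll` stands in for any application that writes to the database.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this is enabled on the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """CREATE TABLE enrollment (
               student_id INTEGER NOT NULL REFERENCES student(id),
               course TEXT NOT NULL,
               UNIQUE (student_id, course))"""
    )
    with conn:
        conn.execute("INSERT INTO student VALUES (1, 'Ada')")
    return conn


def enroll(conn, student_id, course):
    """Any application's insert; note it contains no validation logic."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO enrollment VALUES (?, ?)", (student_id, course)
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        # The DBMS, not the application, refused the bad row.
        return f"rejected: {exc}"
```

An enrollment for a student who does not exist, or a duplicate enrollment, comes back as `rejected: ...` regardless of which program attempted it.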


Why not to use a DBMS?


Learning curve


“It takes four years to learn to be an Oracle DBA”


Overhead costs (time and space)


Generality


Concurrency and transactions


Multiple application and user access


Complex data structures


Rule of thumb: Using an RDBMS doubles the space required for the data (e.g. versus a text file)


Course Administration


Course web site


http://faculty.cs.wwu.edu/reedyc/CS_430_Winter_2005


Email: Chris.Reedy@wwu.edu


Textbook


Elmasri, Navathe, Fundamentals of Database Systems, Fourth Edition


Assignments


Use MySQL


Most convenient form of access?


Get hands dirty:


Design a database


Create database and load the data


Write a database application


Course Outline (1 of 2)


Introduction to Databases


Chapters 1 and 2


Introduction to Data Modeling


Chapter 3 (partial)


Relational Data Model, Algebra, and Calculus


Chapters 5, 6


Functional Dependencies and Normalization


Chapters 10 and 11 (partial)


Course Outline (2 of 2)


SQL Database Programming


Chapters 8 and 9


Entity-Relationship Modeling


More of chapters 3, 4, and 7


Overview: What’s inside a DBMS?


CS530, Chapters 13-19


Overview of additional topics


Object-Oriented and Object-Relational DBMSs


XML in Databases