DBS Lecture#1 - Course Overview - Computer Science ...

Jan 31, 2013

Database Systems:
Course Overview

Professor Navneet Goyal & K Hari Babu

Department of Computer Science & Information Systems

BITS, Pilani

© Prof. Navneet Goyal, BITS, Pilani

Text Book


Hector Garcia-Molina, Jeffrey D. Ullman & Jennifer Widom. Database Systems: The Complete Book, Pearson Education, 2002.

Home Page: http://www-db.stanford.edu/~ullman/dscb.html


Reference Books


Ramakrishnan R. & Gehrke J.
Database Management Systems, 3e, McGraw-Hill, 2003.
http://www.cs.wisc.edu/~dbbook

Silberschatz A., Korth H. F. & Sudarshan S.
Database System Concepts, 5e, TMH, 2005.
http://www.db-book.com
http://www.mhhe.com/silberschatz

Elmasri R. & Navathe S. B.
Fundamentals of Database Systems, 4e, Pearson Education, 2004.
http://www.aw.com/cssupport


Course Website & Email


http://csis/faculty/goel/Database Systems

csc352dbs@yahoo.co.in



Topics


Evolution of Databases


Data, Database, DBMS, & DBS


Data Modeling


Relational Databases


Schema Design & Normalization


Query Languages


Storage & Indexing


Query Processing & Optimization


Concurrency


Crash Recovery


Advanced Topics

Tsunami of Data


Telecom data (4 bn mobile subscribers)


WWW


Weblog data (160 mn websites)


Email data


Satellite imaging data


Social networking sites data


Genome data


CERN’s LHC (15 petabytes/year)


Basic Definitions


Database: A collection of related data.

Data: Known facts that can be recorded and have an implicit meaning.

Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.

Database Management System (DBMS): A software package/system to facilitate the creation and maintenance of a computerized database.

Database System: The DBMS software together with the data itself. Sometimes, the applications are also included.


DBMS Functionalities


Define a database: in terms of data types, structures and constraints

Construct or load the database on a secondary storage medium

Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content

Concurrent processing and sharing by a set of users and programs, while keeping all data valid and consistent


Crash Recovery
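
The define, load, and manipulate functions listed above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the `student` table, its columns, and the sample rows are assumptions made for the example and are not part of the course material.

```python
import sqlite3

# In-memory database; the table, columns, and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute(
    """
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        grade REAL CHECK (grade BETWEEN 0 AND 10)
    )
    """
)

# Construct/load: populate the database.
conn.executemany(
    "INSERT INTO student (id, name, grade) VALUES (?, ?, ?)",
    [(1, "Asha", 8.5), (2, "Ravi", 6.0), (3, "Meena", 9.1)],
)

# Manipulate: modify, delete, and query the content.
conn.execute("UPDATE student SET grade = 7.0 WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 3")
conn.commit()

rows = conn.execute("SELECT name, grade FROM student ORDER BY id").fetchall()
print(rows)
```

A real deployment would keep the database on secondary storage (a file path instead of `":memory:"`); the in-memory form just keeps the sketch self-contained.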


File System vs. DBMS


A company has 500 GB of data on employees, departments, products, sales, and so on.

Data is accessed concurrently by several employees

Questions about the data must be answered quickly

Changes made to the data by different users must be applied consistently

Access to certain parts of the data must be restricted


File System vs. DBMS


Data stored in operating system files


Many drawbacks!!!


500 GB of main memory is not available to hold all the data. Data must be stored on secondary storage devices

Even if 500 GB of main memory is available, with 32-bit addressing we cannot refer directly to more than 4 GB of data

Data redundancy and inconsistency

Multiple file formats, duplication of information in different files

A special program is needed to answer each question a user may ask
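
The 4 GB limit in the addressing point above follows from simple arithmetic, sketched here. The example assumes byte-addressable memory and treats 500 GB as 500 × 2^30 bytes for simplicity.

```python
# A 32-bit address can name 2**32 distinct bytes (assuming byte addressing).
ADDRESS_BITS = 32
addressable_bytes = 2 ** ADDRESS_BITS
addressable_gib = addressable_bytes / 2 ** 30

# Size of the company's data set, taking 1 GB as 2**30 bytes (an assumption).
data_bytes = 500 * 2 ** 30

# Fraction of the data that can be addressed directly at any one time.
fraction = addressable_bytes / data_bytes

print(addressable_bytes)  # number of directly addressable bytes
print(addressable_gib)    # the same limit expressed in GB
print(fraction)           # share of the 500 GB that fits in the address space
```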


File System vs. DBMS


Many drawbacks!!!


Integrity problems

Integrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitly

Hard to add new constraints or change existing ones

We must protect the data from inconsistent changes made by different users. If application programs need to address concurrency, their complexity increases manifold

A consistent state of the data must be restored if the system crashes while changes are being made

The OS provides only a password mechanism for security. This is not flexible enough if users have permission to access only subsets of the data
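
As a contrast to constraints buried in program code, the sketch below states the account balance > 0 rule once, in the schema, and lets the DBMS enforce it. It uses Python's built-in sqlite3 module; the `account` table and the `withdraw` helper are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The integrity constraint is declared explicitly in the schema,
# not re-implemented inside every application program.
conn.execute(
    """
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance > 0)
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()


def withdraw(conn, account_id, amount):
    """Hypothetical helper: returns False if the DBMS rejects the update."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by sqlite3 when the CHECK constraint is violated.
        return False


ok = withdraw(conn, 1, 30.0)         # leaves a positive balance
rejected = withdraw(conn, 1, 500.0)  # would make the balance negative
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(ok, rejected, balance)
```

Changing the rule later means altering one schema definition rather than hunting through application code.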


File System vs. DBMS


These drawbacks have prompted the development of database systems

Database systems offer solutions to all the above problems?



Advantages of a DBMS


Program-Data Independence

Insulation between programs and data: allows changing data storage structures and operations without having to change the DBMS access programs.

Efficient Data Access

DBMS uses a variety of techniques to store & retrieve data efficiently

Data Integrity & Security

Before inserting the salary of an employee, the DBMS can check that the dept. budget is not exceeded

Enforces access controls that govern what data is visible to different classes of users


Advantages of a DBMS


Data Administration


When several users share data, centralizing its administration offers significant improvements


Concurrent Access & Crash Recovery


DBMS schedules concurrent access to the data in such a manner that users think of the data as being accessed by only one user at a time


DBMS protects users from the ill-effects of system failures


Reduced Application Development Time


Many important tasks are handled by the DBMS
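
One of the tasks the DBMS takes over is keeping the data consistent when a multi-step change fails part-way. The sketch below, using Python's built-in sqlite3 module, simulates such a failure inside a transaction. The `account` table, the `transfer` helper, and the `fail_midway` flag are assumptions for the illustration, and a raised exception stands in for a real system crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Hypothetical helper: move money between two accounts as one transaction."""
    with conn:  # commits if the block finishes, rolls back if an exception escapes
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        if fail_midway:
            # Simulated failure after the debit but before the credit.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


try:
    transfer(conn, 1, 2, 40.0, fail_midway=True)
except RuntimeError:
    pass

# The half-finished debit was undone; both balances are unchanged.
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(balances)
```

The application code contains no undo logic of its own; the rollback is done by the database engine.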



Databases Everywhere!!!


DBMS contains information about a particular enterprise


Collection of interrelated data


Set of programs to access the data


An environment that is both convenient and efficient to use


Database Applications:


Banking: all transactions


Airlines: reservations, schedules


Universities: registration, grades


Sales: customers, products, purchases


Online retailers: order tracking, customized recommendations


Manufacturing: production, inventory, orders, supply chain


Human resources: employee records, salaries, tax deductions


Databases touch all aspects of our lives


Levels of Abstraction


Databases provide users with an abstract view of data
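
One concrete form of an abstract view is an SQL view, which shows a class of users only part of the stored data. The sketch below uses Python's built-in sqlite3 module; the `employee` table and the `employee_directory` view are hypothetical names, and this illustrates only the view level, not the full hierarchy of abstraction levels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stored table, including a column that most users should not see.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000.0), (2, "Ravi", "HR", 48000.0)],
)

# View: the same data, presented without ids or salaries.
conn.execute("CREATE VIEW employee_directory AS SELECT name, dept FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
columns = [d[0] for d in cursor.description]
directory = cursor.fetchall()
print(columns, directory)
```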



Major Players


Oracle: 9i, 10g, 11g


Microsoft: SQL Server 200x


IBM DB2


MySQL AB


PostgreSQL


Oracle 44.3% vs. 21% for IBM vs. 18.5% for Microsoft


Heard of Sun Microsystems?


On 16 January 2008, MySQL AB announced that it had agreed to be acquired by Sun Microsystems for approximately US$1 billion


Oracle Corp. agreed to acquire Sun in 2009; the deal closed in January 2010

DBMS: A Microcosm of CS!


The area of DBMS is a microcosm of computer science in general


The issues addressed and the techniques used span a wide spectrum

DBMS: A Microcosm of CS!


Languages


Object-orientation & other programming paradigms


Compilation


Operating systems


Concurrent programming


Data structures


Algorithms


Parallel & distributed computing


User interfaces (Human Computer Interaction)


Expert systems & AI


Statistical techniques & Dynamic programming

Reference: Database Management Systems by Ramakrishnan & Gehrke, 3e


Exercise/Suggestion


During the course, keep an eye on the role of the listed sub-areas of computer science!


Benchmarking DBs


The term transaction is often applied to a wide variety of business and computer functions. Looked at as a computer function, a transaction could refer to a set of operations including disk reads/writes, operating system calls, or some form of data transfer from one subsystem to another.


Benchmarking DBs


While TPC benchmarks certainly involve the measurement and evaluation of computer functions and operations, the TPC regards a transaction as it is commonly understood in the business world: a commercial exchange of goods, services, or money. A typical transaction, as defined by the TPC, would include updating a database system for such things as inventory control (goods), airline reservations (services), or banking (money).



Benchmarking DBs


In these environments, a number of customers or service representatives input and manage their transactions via a terminal or desktop computer connected to a database. Typically, the TPC produces benchmarks that measure transaction processing (TP) and database (DB) performance in terms of how many transactions a given system and database can perform per unit of time, e.g., transactions per second (tpsC) or transactions per minute (tpmC).
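
The "transactions per unit of time" idea above reduces to a rate calculation, sketched below. The `throughput` function and the sample numbers are assumptions for illustration only; official TPC metrics are defined by the benchmark specifications, which fix the workload and which transactions are counted.

```python
def throughput(transactions_completed, elapsed_seconds):
    """Return (transactions per second, transactions per minute)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    per_second = transactions_completed / elapsed_seconds
    return per_second, per_second * 60


# Hypothetical measurement: 18,000 transactions completed in a 120-second run.
tps, tpm = throughput(18_000, 120.0)
print(tps, tpm)
```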


Benchmarking DBs


Results

Q & A

Thank You