DATABASE
SYSTEMS
Lecture 21
Dr. Adam P. Anthony
Overview
Database Fundamentals
System components
Schemas
Database Models
The Relational Model
Survey of Database Products
Data Mining
Social Issues
What is a Database?
Non
-
Technical: A computer program that stores
data
Examples of Databases using this definition?
Technical: A computer program that stores
structured
data
in an organized fashion, and provides the
ability to grant
limited, or restricted access
to
different
users
based on their
data requirements.
Examples of Databases using this definition?
Anything ruled out that we came up with above?
Database Systems
Initially: each department
maintained their own
“database”
Sometimes just a text or
Excel file
Problems with duplication?
Problems with sharing
information?
Now: One database holds
everything, and each
department is given access
only to the data they need
Schemas
Schema
full description of all the database,
including what information is provided and how it
interacts
Used to maintain and program the database
Provides a nice overview of what data a business collects
and (perhaps) how they use that data
Subschema
Partial description of only a few relevant
items
Anything in a Subschema will also appear in the database’s
Schema
Based on providing only what the user needs to do their job
Provides basic security: payroll record example
The Database Management System
Software applications can be enhanced by working with a
database
Especially on the web! (Google, Amazon, etc.)
Database Management System (DBMS): has all the
necessary code for organizing, manipulating data
When application needs data, it just asks the DBMS to provide it
More on Database Management
Systems
Database Management System
(DBMS): A software
layer that manipulates a database in response to requests
from applications
Schemas and
subschemas
are created, managed by the
DBMS
Can have a different subschema for each application that
accesses the DBMS
Distributed Database:
A database stored on multiple
machines
Data is shared between each machine over a network
DBMS will mask this organizational detail from its users
Data independence:
The ability to change the
organization of a database without changing the
application software that uses it
Database Models
Bits and bytes are hard to talk about!
Want to reflect the organization of our data without
thinking about the complex programming techniques
needed to keep our data organized
A
Database Model
provides a conceptual view of the
database that gives us a ‘general idea’ of how the
data is organized
Flat File Model (not really a DB)
Hierarchical Model
Object
-
Oriented Model
Relational Model
The Relational Database Model
Pretty Simple, on the surface
A
Relation
is just a rectangular table of data
An
Attribute
is any
column
in that table
A
Tuple
is any
row
in the table
There is complexity not in the table, but in how two
different tables can
interact
If two
relations
have a common
Attribute
, then we can
combine
tuples
from the two tables
to get a more
complex piece of data!
9
-
10
A
relation containing employee
information
Designing a Good Relational
Database
When we set up the DBMS, we could just put all our
data in one relation, like a big Excel file!
But that’s no fun!
And, it can cause problems
Redundancy:
When the same data is repeated
unnecessarily
Deletion Anomalies:
When deleting a
tuple
results in
the loss of important information
We can prevent some of these problems using a multi
-
relation design!
9
-
12
A
relation containing redundancy
9
-
13
An
employee database consisting
of three relations
The Employee Database Schema
Employee
Empl
ID
Name
Address
SSN
Job
Job ID
Job Title
Skill Code
Dept
Assignment
Empl
ID
Job ID
Start Date
Term Date
Combining Data From Multiple Tables
In which departments has G. Jerry Smith worked?
Designing a Relational Database
Given a data problem:
What are the entities/objects in this problem?
How do those entities interact?
Make one relational table for each entity
Make one relational table for each interaction
Identifying/Avoiding Redundancy
Sometimes it’s obvious in the design
Two columns with same/similar names
Sometimes you need to focus on how the data will be
used
Common perpetrators:
Addresses
—
more than one person can live at same
address
Organizations
—
more than one person belongs to a single
organization
Owners
—
a person/company can own more than one asset
Creators
—
a creator may have many creations
In
-
Class Example (from book)
Chapter Review, question 18 (similar to assigned
problem 20)
Design a relational database containing information
about music performers, their recordings, and the
composers of the music they recorded (Avoid
redundancies similar to those in Figure 9.4)
SQL (Structured
Query Language)
Developed at IBM as a part of their Relational DBMS
project
Basic language humans can use to express, with
precision, the data that they want from each (separate)
relational table, and how it should be combined:
select
EmplId
, Dept
from ASSIGNMENT, JOB
where
ASSIGNMENT.JobId
=
JOB.JobId
and
ASSIGNMENT.TermData
=
“
*
”
Survey of Database Products
MySQL
Microsoft Access
Microsoft SQLServer
Oracle DB
MySQL
Free, Open
-
Source database product
Little support from creators, minimal software for managing the system
Must know a great deal about database design fundamentals in order to
use successfully on a large scale
Can be done, but requires expert knowledge
Good Choice for:
temporary data solutions
Zero
-
budget scenarios (but only if you have expert users to run it!)
Hiring people to run the system costs money!
Internet applications
Facebook
??
Poor Choice for:
Non
-
Expert users
MASSIVE data problems (but
Facebook
??)
Microsoft Access
Cheap, business
-
oriented database product
Emphasis on user
-
friendliness, ease of development
Packaged with ‘data templates’ that might automatically fit your data needs, no
design required!
very nice application
-
development features that are popular in the business
world
Limits on amount of data that can be stored in a single database
Data may be as complex as on any other DBMS, just limited in size
Good choice for:
Small business settings
Non
-
Expert Users
Temporary data solutions
Poor choice for:
Large, enterprise
-
level data solutions
Sharing data with a large number of people
Microsoft SQLServer and Oracle DB
Large, expensive enterprise
-
level DBMS’s
Distributed DBMS support
Provides both expert
-
level and non
-
technical tools
for interaction with DBMS
SQLServer more
-
so
than Oracle
Good For:
Everything (if you can afford them!)
Large jobs
Bad for:
Low
-
budget scenarios (except for SQLServer Express!)
Data Mining
Data Mining = Buzz word that covers any action that seeks to detect
a previously unknown pattern from a large data set
Encompasses many areas of Computer Science, Mathematics, Statistics
Class Description:
Divide data set into pre
-
determined groups (e.g., Mexican food lovers
and Italian food lovers)
Characterize each group by identifying common patterns
“What type of person enjoys Mexican food?”
Class Discrimination:
Start with an initial data set, generate a class description for each
group
Given a new (incomplete) item, predict which group it will fall into:
“Will this new customer prefer Mexican or Italian food?”
Data Mining, Continued
Cluster Analysis:
Similar to Class Description but now the groups aren’t
predetermined!
Must find the most distinct groups AND describe them at the
same time!
Much more difficult
Association Analysis
Find patters
between
groups!
Big in market research
“Are Mexican food lovers also lovers of imported beers?”
Social Impact of Databases
Problems
Massive amounts of personal data are being collected
Often without knowledge or meaningful consent of affected people
Data merging produces new, more invasive information
Errors are widely disseminated and hard to correct
Remedies
Existing legal remedies often difficult to apply
Negative publicity may be more effective
Enter the password to open this PDF file:
File name:
-
File size:
-
Title:
-
Author:
-
Subject:
-
Keywords:
-
Creation Date:
-
Modification Date:
-
Creator:
-
PDF Producer:
-
PDF Version:
-
Page Count:
-
Preparing document for printing…
0%
Σχόλια 0
Συνδεθείτε για να κοινοποιήσετε σχόλιο