db_project_report.docx

southdakotascrawnyData Management

Nov 29, 2012 (4 years and 8 months ago)

243 views






Database Project Report

Scott Hand

sah071000@utdallas.edu

CS 4347


Database Systems

Professor
Balaji Raghavachari

Professional Database Project for Spring 2010




Contents

Entities and Relationship Descriptions

................................
................................
................................
.........

1

Entities

................................
................................
................................
................................
......................

1

Relations

................................
................................
................................
................................
...................

2

Utility
Entities

................................
................................
................................
................................
............

3

Database Schema

................................
................................
................................
................................
..........

4

Functional Dependencies

................................
................................
................................
..............................

5

Table Creation and Data Entry

................................
................................
................................
......................

6

SQL Queries for Table Creation:

................................
................................
................................
................

6

Importing the Data

................................
................................
................................
................................
....

9

Profession
al Tasks

................................
................................
................................
................................
.......

10

Audit Trails

................................
................................
................................
................................
..............

10

Encryption of Passwords and Data

................................
................................
................................
.........

11

System Security and Firewall Management
................................
................................
............................

11

Intrusion Detection

................................
................................
................................
................................
.

11

Applic
ation Screenshots

................................
................................
................................
..............................

13

Client Application

................................
................................
................................
................................
....

13

Login

................................
................................
................................
................................
....................

13

Main

................................
................................
................................
................................
....................

13

Administrator Application

................................
................................
................................
.......................

13

Lo
gin

................................
................................
................................
................................
....................

13

Start

................................
................................
................................
................................
.....................

14

New Course

................................
................................
................................
................................
.........

14

User Manager
................................
................................
................................
................................
......

14

New Section

................................
................................
................................
................................
........

15

Edit CLO

................................
................................
................................
................................
...............

15

Concluding Remarks
................................
................................
................................
................................
....

16

Design Strengths and Accomplishments

................................
................................
................................
.

16

Design Weaknesses and Incomplete Tasks

................................
................................
.............................

16

Final Words

................................
................................
................................
................................
.............

16


Hand
1


Entities and Relationship
Descriptions

E
ntities


Primary keys are indicated in bold, foreign keys are indicated with an underline.

Course



Pkeys
: Prefix, Number

Description:

The course entity stores the basic information about a course such as its name, credits, and
active
status. It is indexed by both its prefix and number.



Prefix



Number



Assume that numbers like 4v40 will be required



Name



Credits


Must be from 0 to 4



Active


Section



Pkeys
: C_Prefix, C_Number, S_Number, S_Semester

Description:

The section entity is a f
requently used part of the database, as it stores all the relevant
information about a section being taught. It is indexed by a course prefix and number linked to
a course entity as well as a section number and a semester (such as Spring 2010 or Fall 2009
).



C_Prefix

(Course)



C_Number

(Course)



S_Number



Section number must be an integer



S_Semester



Semester must be a one character season descriptor and a two
-
digit year



Instructor Email

(Instructor)



Room



Begin time


Must be before end time



End time


Mus
t be after begin time



Days


Max of 5 characters with each one being a one character abbreviation for the
respective calendar day


Instructor



Pkeys
: Email

Description:

The instructor entity stores information about a professor such as name, designation,
office, etc.
All of these are indexed by a unique email address.



Email



must be of form *@*.com



Last Name



First Name



Office



Designation

Hand
2




Phone Extension


CLO



Pkeys
: Course_Prefix, Course_Number

Description:

The CLO entity is just an objective (usually a

sentence or two) linked with the information for its
associated course. The course information points to a course entity.



Course Prefix

(Course)



Course Number

(Course)



Objective


Feedback form



Pkeys: Course_num, Course_prefix, Sect_num, Sect_semester

D
escription:

The Feedback form entity stores the text feedback that professors give for a c
ourse as well as a
timestamp that

records when the feedback was submitted.

It contains references to the section
entity in the form of course prefix and number as we
ll as section number and semester.



Feedback



Course_num

(Section)



Course_prefix

(Section)



Sect_num

(Section)



Sect_semester

(Section)



Timestamp

Relations

Student performance



Pkeys: SP_CourseNumber, SP_CoursePrefix, SP_SectionNumber,
SP_SectionSemester, CLO_Objective

Description:

The student performance relation links a section with the associated CLO and records
information about that CLO for that particular section such as

performance percentages

and
basis for evaluation
. It links to a section entity through the section number and semester as well
as the objective and course information of a CLO.



SP_CourseNumber

(
CLO
)



SP_CoursePrefix

(
CLO
)



SP_SectionNumber

(Section)



SP_SectionSemester

(Section)



CLO_Objective

(CLO)



Percent Below Expectations



Percent Met Expectation



Percent Exceeded Expectations



Basis

Hand
3


Utility E
ntities

Users



Pkeys: User Email

Description:

The users table stores usernames and salted password hashes. Th
ey are indexed by the
username, which references an email from the instructor table.



User Email

(Instructor)



must be of form *@*.com



User Password


Stored as salted hash


Admins



Pkeys: Admin Email

Description:

The admins table is the same as the
users table except that its usernames are not linked to
another table.



Admin Email



must be of form *@*.com



Admin Password


Stored as salted hash


Logins



Pkeys: Time

Description:

Logins stores information about login attempts, whether successful or un
successful.

They are
indexed by the time at which they occur, and they log username, IP address, as well as whether
the login was successful.



Time



Username



ValidPass


Boolean, true for successful, false for unsuccessful



IP Address


Must be 255.255.255.2
55 IPv4 format



Hand
4


Database Schema

Primary keys are underlined, foreign keys are indicated with arrows
.

Course

Prefix

Cnum

Cname

Credits

Active


Section

C_prefix

C_no

S_no

S_year

I_email

S_room

S_begin

S_end

S_days


CLO

Course_prefix

Course_number

Objective


Feedback_form

Feedback

Course_num

Course_prefix

Sect_num

Sect_semyear

Form_timestamp


Student_performance

Sp_cnum

Sp_cprefix

Sp_snum

Sp_syear

Percent_me

Percent_ee

Percent_be

Clo_objective

Basis


Instructor

Email

Lname

Fname

Phoneext

Office

Designation


Users

User_email

User_password


Admins

Admin_email

Admin_password


Logins

Time

Username

Valid_pass

Ip_addr


Hand
5


Functional Dependencies

Course(
Prefix
,
Cnum
,Cname,Credits,Active)


{Prefix,Cnum}


{Cname, Credits, Active}

Section(
C_prefix
,
C_no
,
S_no
,
S_year
,I_email,S_room,S_begin,S_end,S_days)

{C_prefix, C_no, S_no, S_year}


{I_email, S_room,S_begin,S_end,S_days}

CLO(
Course_prefix
,
Course_number
,
Objective
)


None

Feedback_form(Feedback,
Course_num
,
Course_prefix
,
Sect_num
,
Sect_semyear
,Form_timestamp)


{Course_num,Course_prefix,Sect_num,Sect_semyear}


{Form_timestamp,
Feedback
}

Student_performance(
Sp_cnum
,
Sp_cprefix
,
Sp_snum
,
Sp_syear
,Percent_me,Percent_ee,Percent_be,
Clo_objective
,
Basis)

{Sp_
cnum,Sp_cprefix,Sp_snum,Sp_syear, Clo_objective}


{Percent_me,Percent_ee,Percent_be
,Basis}

Instructor(
Email
,Lname,Fname,Phoneext,Office,Designation)


Email


{
Lname,Fname,Phoneext,Office,Designation
}

Users(
User_email
,User_password)


User_email


User_pass
word

Admins(
Admin_email
,Admin_password)


Admin_email


Admin_password

Logins(
Time
,Username,Valid_pass,Ip_addr)


Time


{Username, Valid_pass, Ip_addr}


Each functional dependency has the primary key in its left side
, satisfying 2NF
, and each relation only
has one
functional dependency
, satisfying BCNF and thereby 3NF
. Therefore, 2NF, 3NF, and BCNF are satisfied.



Hand
6


Table Creation and Data Entry

SQL Queries for Table Creation:

create

table

admins (


admin_email character varying(30)
not null
,


admin_password character varying

);

create

table

clo (


course_prefix character varying(8)
not null
,


course_number character varying(4)
not null
,


objective text
not null

);

create

table

course (


prefix character varying(8)
not null
,


cnum character varying(4)
not null
,


cname character varying(50),


credits integer,


active boolean
default

true

);

create

table

feedback_form (


feedback text,


course_num character varying(4)
not null
,


course_prefix character varying(8)
not null
,


sect_num integer
not null
,


sect_semyear character varying(3)
not null
,


form_timestamp timestamp with time zone

);

create

table

instructor (


lname character varying,


fname character varyin
g,


email character varying(30)
not null
,


office character varying,


designation character varying,


phoneext integer

);

Hand
7


create

table

logins (


"time" timestamp without time zone
not null
,


username character varying(30),


valid_pass
boolean,


ip_addr inet

);

create

table

section (


c_prefix character varying(8)
not null
,


c_no character varying(4)
not null
,


s_no integer
not null
,


i_email character varying(30),


s_room character varying,


s_year character varying
not null
,


s_begin time without time zone,


s_end time without time zone,


s_days character varying

);

create

table

student_performance (


sp_cnum character varying(4)
not null
,


sp_cprefix character varying(8)
not null
,


sp_snum integer
not null
,


percent_be real
default

0.0,


percent_me real
default

0.0,


percent_ee real
default

0.0,


basis text
default

'Blank'::text,


clo_objective text
not null
,


sp_syear character varying(3)

);

create

ta
ble

users (


user_email character varying(30)
not null
,


user_password character varying

);



Hand
8


After table creation, primary and foreign keys were set up with separate commands. The commands used were:

alter

table

only

admins


add constraint prima
ry key

(admin_email);

alter

table

only

course


add constraint primary key

(cnum, prefix);

alter

table

only

student_performance


add constraint primary key

(sp_cnum, sp_cprefix, sp_snum, clo_objective);

alter

table

only

feedback_form


add
constraint primary key

(course_num, course_prefix, sect_num, sect_semyear);

alter

table

only

users


add constraint primary key

(user_email);

alter

table

only

instructor


add constraint primary key

(email);

alter

table

only

clo


add constraint prim
ary key

(course_prefix, course_number, objective);

alter

table

only

section


add constraint primary key

(c_prefix, c_no, s_no, s_year);

alter

table

only

logins


add constraint primary key

("time");

alter

table

only

section


add constraint foreign
key

(c_prefix, c_no)
references

course(prefix, cnum)
on delete
cascade
;

alter

table

only

clo


add constraint foreign key

(course_prefix, course_number)
references

course(prefix, cnum)
on
delete cascade
;

alter

table

only

users


add constraint foreign
key

(user_email)
references

instructor(email)
on delete cascade
;

alter

table

only

section


add constraint foreign key

(i_email)
references

instructor(email)
on delete cascade
;

alter

table

only

student_performance


add constraint foreign key

(sp_cprefix, sp_cnum, sp_snum, sp_syear)
references

section(c_prefix, c_no, s_no, s_year)
on delete cascade
;

alter

table

only

feedback_form


add constraint foreign key

(course_prefix, course_num, sect_num, sect_semyear)
references

section(c_prefix, c_n
o, s_no, s_year)
on delete cascade
;


Hand
9


Importing the Data

The data was imported into the database using PostgreSQL’s COPY command. This command reads in a CSV file
and dumps it into a specified table. An example of its usage:

copy instructor from
'/var/lib/pgsql/instructor.csv' using delimiters ',' csv header;

One problem with PostgreSQL’s COPY command is that it lacks flexibility when defining the structure of the CSV
files. Delimiters can be set, but the columns of the CSV file may not be specif
ied. This was addressed by copying
data into temporary tables matching the CSV files’ column order and then using SQL queries to copy the data over
to the real tables.

Other problems with the data as given included needing to make custom delimiters, impor
ting times, and data
consistency. The custom delimiters were required in the CLO file because many of the objective text contained
relevant commas. To solve this problem, I used a macro in a text editor to replace every delimiter comma with an
underscore
. The copy command could then be used as:

copy clo from '/var/lib/pgsql/clos.csv' using delimiters '|' csv header;

The problem with the time data was that it was given in 12
-
hour time, and PostgreSQL contains no string
-
to
-
time
function that accepts time v
alues in that format. However, there are functions to convert such strings into
timestamp
s and functions to convert time
stamps into times. All of the tokenizing and conversion was done with
the following SQL query that loads them into a tempory table:

in
sert into temp_times

select to_timestamp(trim(substring(trim(split_part(split_part(s_time, ' ',2), '
-
', 1)) from 0 for 3)


|| ':' ||substring(trim(split_part(split_part(s_time, ' ',2), '
-
', 1)) from 3 for 2)


|| ' ' || substring(trim(split_part(split_part
(s_time, ' ',2), '
-
', 1)),5,4)),'HH:MI')::time as t_begin,


to_timestamp(trim(substring(trim(split_part(split_part(s_time, ' ',2), '
-
', 2)) from 0 for 3)


|| ':' || substring(trim(split_part(split_part(s_time, ' ',2), '
-
', 2)) from 3 for 2)


|| ' ' || s
ubstring(trim(split_part(split_part(s_time, ' ',2), '
-
', 2)),5,4)),'HH:MI')::time as t_end,


trim(split_part(s_time, ' ',1)) as t_days, s_time


from section

The last problem that occurred involved certain data not meeting consistency constraints. Many of
the course
learning objectives had no associated courses.

I made the decision that course learning objectives without an
associated course table entry were meaningless, so I pruned out those CLOs with no equivalent course.

To
populate

the user table, I filled it with emails from the instructor table. I then used an MD5 salted hash to make
default passwords, with the password being the first part of their email. For example, “user@utdallas.edu” would
have a default password of “user.”

The SQL to set a password is:

insert into users values (
EMAIL, crypt(PASS, gen_salt('bf
')));

The SQL

query

to check if an input password matches a stored hash is:

select user_password = crypt(PASS, users.user_password)


from users


where user_email=USE
R

Hand
10


This query returns a boolean value.

The admins table is similar, it just has no tie to any other table.

Professional Ta
sks

Audit Trails

Audit trails are stored at multiple levels. At the operating system level, the auditd service logs every
suspicious login or user activity. A description of its operation is located in System Security and Firewall
Management. The next level of auditing i
s at the DBMS
level. PostgreSQL logs all failed login attempts
with information such as remote host and user name to a log file. This log file can be examined to spot
intrusion attempts. At the highest level, an audit system has been set up for the data
base itself.
Whenever a user or admin logs in to the database using one of the client applications, an entry is made
to the logins table containing information about remote host, user, and login success or failure, and
these are indexed by the time of the

login. Queries can be executed to prune out repeated failure
attempts. One such query is:

select ip_addr as "Source IP", count(ip_addr) as "Number of failed attempts"




from logins




where valid_pass = false




group by ip_addr;

This gives an easy
-
to
-
read list of failed attempts by IP address along with a number representing the
number of times the intruder attempted to access the database.
Other queries can then be executed to
narrow down results. Examples include queries like the above query tha
t filter results by only showing
those whose failed attempt count is above a certain threshold, queries that view attempts from one
address grouped by username, and a query that groups by address and then time and shows the
standard deviation of the time.

A small standard deviation would indicate clustered login attempts,
suggesting intrusion attempts.

One further level of auditing could be implemented, and this is table level auditing. This would read all
changes made to a table and store them to another

table that is regularly backed up.

PostgreSQL does
not provide built
-
in table auditing functionality, but it is possible to implement table auditing through
triggers. An easy way to set it up is through the Table Log project on the PgFoundry website. I

did not
choose to implement such auditing, as
any queries are controlled by the client applications. If I were
deploying these applications
I might implement such functionality

to satisfy any over
-
zealous security
requirements
, but it seemed unnecessary
for this project.


Hand
11


Encryption of Passwords and Data

The only data
fields
stored in this database that require encryption are the user and administrator
passwords. The database was designed so that successful intrusion or acquisition of database contents
w
ould not compromise the passwords of instructors or administrators. The first measure I took to
ensure password security was to store passwords as hashes rather than plain
-
text records.

Because
MD5 hashes are weak enough to be considered a security issue
, I chose to use a Blowfish
-
based hashing
function included with PostgreSQL’s pgcrypto library.

The second measure I took for password security
was to salt the hashes, thereby making it nearly impossible for an intruder to decrypt the passwords.

The SQL
commands needed to hash the passwords are given above in the Importing the Data section of
this report.


System Security and Firewall Management

Several steps were taken to ensure that the CentOS server met the necessary security standards.
The
following
specific security measures were taken:



C
ommon precautions were taken such as disab
ling root logi
ns, making sure that SSH used the
SSH2 protocol, and user accounts were created to use rather than root for serve
r
administration.



A firewall was configured usi
ng the iptables service in CentOS, and rules were setup to allow
access only through authorized ports. To strengthen the firewall, I installed the Advanced Policy
Firewall (APF) application. It allows easy configuration of allowed ports, and it connects
to a
repository
containing addresses that are associated with frequent intrusion attempts.
APF

blocks all traffic from such addresses.




Applications for rootkit detection and removal (chkrootkit and rootkit
-
hunter)
were installed
and configured.



A UNIX au
dit program was configured to record any login failures or suspicious activity on the
CentOS server. Logs can be ac
cessed though the aureport tool, which allows the administrator
to specify certain events, such as failed user logins or suspicious system c
alls, to view.



User password policies were updated to force users to change passwords with a specified
regularity.



Syslog was configured so that, given a remote server, remote logging is possible.



A security auditing tool called Lynis was installed. Lynis

executes a wide range of tests that
evaluate the system’
s security hardening level by looking for security issues in the system and its
installed software.
A
well
-
hardened security rating

was obtained from Lynis.


Intrusion Detection

For this report, I will assume that the topic of intrusion detection involves the automated response to
unusual activity that might constitute an intrusion. While the implementation of intelligent intrusion
detection currently receives considerable resear
ch, smaller scale detection methods can be
Hand
12


implemented that would provide excellent results in the scope of this project’s database. I have not
implemented an such intelligent intrusion detection tools, as effective tools would require a semester of
resea
rch and development to create. However, I have developed several ideas for intrusion detection
that are realistic to implement.



Unusual queries should be flagged
,

and the administrator should be notified. Unusual queries
include queries executed by non s
uper
-
user database users that are not part of th
e client
applications. No user

should ever have access to any query not built into the applications.



Spikes in activity for administrators should be logged. This means that a query frequency over a
certain
threshold should be logged regularly.



Activities such as user deletions should be logged, and any spike in deletion activity should be
logged and notice sent to administrators immediately.


Hand
13


Application Screenshots

Screenshots are taken from the programs ru
nning on the Windows 7 64
-
bit operating system.

Client Application

Login


Main


Administrator Application

Login


Hand
14


Start


New Course


User Manager


Hand
15


New Section


Edit CLO




Hand
16


Concluding Remarks

Design Strengths and Accomplishments

The database that I designed for this project succeeded at
implementing

the important
aspects

of
relational database design.




This design

allows for multiple views that fit the demands of different user types, from everyday
clients to database administrat
ors.

These views were used in creating two types of applications
(user and administrator applications) with the same database.



Data consistency is maintained through a series of explicit relationships between different tables
in the database. Inconsisten
cies in the given input data were identified, as detailed in the
Importing the Data section, because of the constraints that had been set on the database prior
to data entry.



The task of securing the CentOS distribution installation was an excellent opport
unity to learn
the basics of Linux
-
based server administration, and my list of security measures in the System
Security and Firewall Management section shows the variety of administration tasks that I
explored.

The end result is a secure, robust database
server.



The data entry and application develop allowed me to become acquainted with PostgreSQL and
its C# .NET development tools. The addition of PostgreSQL tasks to my skills was necessary, as I
had only worked with MySQL previously, and PostgreSQL is a
popular database management
system.


Design Weaknesses and Incomplete Tasks

Certain areas of any database design can always be improved, and there are a few such areas for this
project.



There were not many steps taken to optimize the database transactions.

Denormalization and
query tuning could be done to improve the database’s efficiency.



This design could benefit from more readability, as the many similar and shared attribute names
get confusing and require constant reference to the database schema in wr
iting queries.



Advanced PostgreSQL tasks such as automatic table audit triggers and scheduled remote
backups need to be implemented.

Final Words

In doing this project, I completed all of the tasks that I set out to accomplish. The opportunity to gain
work
ing experience in database design and server administration was exciting, and I took as many
chances as I could find to expand my
knowledge in these two fields
.

I am certain that
this new
knowledge will be valuable in future endeavors.