Sept 15 2009

bubblemessengerSecurity

Nov 5, 2013 (3 years and 1 month ago)

65 views

ARI E D.
J ONES

(
AJ
)

PRI NCI PAL TECHNOLOGY MANAGER

Database Normalization


Sept 15
th

2009

Why you should listen to AJ



Graduate degree: Computational Physics


20+ years of programming experience



10+ years experience in database
platforms



SQL Server



Sybase



Oracle



MySQL




Postgres



DB2



4 time author


What Is Normalization?


Eliminating redundant data within the database to
make it as efficient as possible


Efficiency is usually considered lowered complexity


** Mulligan: Efficiency does not always equal better
performance, nor does it necessarily equate to
efficient query processing.

Normal Forms


Basis of normalization strategy


Created by E.F.
Codd

waaaaay

back when….


1
st

Normal Form (1NF)


2
nd

Normal Form (2NF)


3
RD

Normal Form (3NF)


Nth Normal Form



First Normal Form (1NF)


Every entity has an attribute or set of attributes that
constitute a primary key


Natural versus Surrogate Keys


Each attribute can have only one value


NO single instance may have multiple values for a
given attribute

Example Not 1NF

Author

Title

Title2

Genre

Release

Date

Herbert
Jefferies

Entity
Framework

4.0

NULL

IT

2/1/2009

John Jones

ASP.NET

NULL

IT

1/1/2008

Arie D. Jones

Learn SQL in 1
Hour a Day

Live

Lessons:
SQL

IT

1/1/2009

Shakespeare

A Midsummer

Night’s Dream

乕䱌

佬搠却畦O

ㄯㄯㄶ㠶

R数敡e敤⁄ 瑡tC潬畭us

1NF

Book_ID

Author

Title

Genre

Release

Date

1

Herbert
Jefferies

Entity
Framework

4.0

IT

2/1/2009

2

John Jones

ASP.NET

IT

1/1/2008

3

Arie D. Jones

Learn SQL in
1 Hour a Day

IT

1/1/2009

4

Arie D. Jones

Live Lessons
:
SQL

IT

1/1/2009

5

Shakespeare

A Midsummer

Night’s Dream

佬搠却畦O

ㄯㄯㄶ㠶

W攠桡v攠湯e
-
畮楱i攠v慬略猠a漠o攠慰灬礠愠
灲業慲礠ke礠

1NF


So now each row is uniquely distinct


No repeating attributes



Second Normal Form (2NF)


Builds on 1NF


Plus, a primary key uniquely identifies the non
-
key
attributes in the row.


So basically we are trying to put attributes that are not
directly reliant on the primary key are moved out of
the table.


These types of relationships are referred to as
Partial
Dependencies


1NF


2NF

Book_ID

Author

Title

Genre

Release

Date

1

Herbert
Jefferies

Entity
Framework

4.0

IT

2/1/2009

2

John Jones

ASP.NET

IT

1/1/2008

3

Arie D. Jones

Learn SQL in
1 Hour a Day

IT

1/1/2009

4

Arie D. Jones

Live Lessons
:
SQL

IT

1/1/2009

5

Shakespeare

A Midsummer

Night’s Dream

佬搠却畦O

ㄯㄯㄶ㠶

乯N 摥灥湤慮琠䍯n畭us

Fix = 2NF

Author
ID

Author

Genre

1

Herbert
Jefferies

IT

2

John
Jones

IT

3

Arie D.
Jones

IT

4

Shakesp
eare

Old Stuff

BookID

Title

Author
ID

Release

Date

1

Entity
Framework

4.0

1

2/1/2005

2

ASP.NET

2

1/1/2008

3

Learn SQL in 1
Hour a Day

3

1/1/2009

4

Live

Lessons:
SQL

3

1/1/2009

5

A Midsummer

Night’s Dream



ㄯㄯㄶ㠶

Third Normal Form (3NF)


3NF is very much like 2NF


You want to now remove
Transitive Dependencies


The are similar to partial dependencies but rely on
another non
-
key attribute that is dependant on the
primary key


So removing the attribute and putting it in a separate
entity solves the issue…just like 2NF


3NF?

Author
ID

Author

Genre

1

Herbert
Jefferies

IT

2

John
Jones

IT

3

Arie D.
Jones

IT

4

Shakesp
eare

Old Stuff

BookID

Title

Author
ID

Release

Date

1

Entity
Framework

4.0

1

2/1/2005

2

ASP.NET

2

1/1/2008

3

Learn SQL in 1
Hour a Day

3

1/1/2009

4

Live

Lessons:
SQL

3

1/1/2009

5

A Midsummer

Night’s Dream



ㄯㄯㄶ㠶

NORMALIZATION VS. DENORMALIZATION

What’s the big deal?


Normalization
means reducing duplicate data by using


Denormalization

means the opposite, which is
deliberately duplicating data in one or more
structures.


Each has specific effects on your databases
performance

NORMALIZATION VS. DENORMALIZATION

What’s the big deal?


Normalization
improves entering data into the tables
because it reduces the amount of data that needs to be
updated.


Denormalization

improves the selecting of tables
because it reduces the number of tables needed to be
accessed by any particular query


Normalization versus
Denormalization


When do you use them?


Dependent upon the system


Normalization


Highly transactional systems


Space is a consideration


Width of rows a consideration


Denormalization



Reporting systems or rarely updated


Data warehousing
-
> throw these rules out the window



Does this matter?


Does this equate with current technologies


Cloud computing


Entity Framework


SQL Server 2011




Answer: YES!

Conclusion

Slides can be found on my blog

http://www.programmersedge.com


Email: arie.jones@perptech.com

Questions & Open Discussion