Cơ sở dữ liệu - Normalization for relational databases
Uses the concepts of FDs, primary key
Definitions:
◦ Prime attribute - attribute that is member
of the primary key K
◦ Full functional dependency - a FD Y Z
where removal of any attribute from Y means
the FD does not hold any more
27 trang |
Chia sẻ: huyhoang44 | Lượt xem: 1012 | Lượt tải: 0
Bạn đang xem trước 20 trang tài liệu Cơ sở dữ liệu - Normalization for relational databases, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
Normalization for
Relational Databases
Informal Design Guidelines for
Relational Databases
Relational database design: The grouping of
attributes to form "good" relation schemas
Two levels of relation schemas:
◦ The logical "user view" level
◦ The storage "base relation" level
Design is concerned mainly with base relations
Criteria for "good" base relations:
◦ Discuss informal guidelines for good relational design
◦ Discuss formal concepts of functional dependencies and
normal forms 1NF 2NF 3NF BCNF
Semantics of the Relation
Attributes
Each tuple in a relation should represent one
entity or relationship instance
◦ Only foreign keys should be used to refer to other
entities
◦ Entity and relationship attributes should be kept apart
as much as possible
◦ Design a schema that can be explained easily relation
by relation. The semantics of attributes should be easy
to interpret.
Redundant Information in
Tuples and Update Anomalies
Mixing attributes of multiple entities may
cause problems
◦ Information is stored redundantly wasting
storage
◦ Problems with update anomalies:
Insertion anomalies
Deletion anomalies
Modification anomalies
EXAMPLE OF AN UPDATE
ANOMALY
Consider the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
◦ Update Anomaly
Changing the name of project number P1 from “Billing” to
“Customer-Accounting” may cause this update to be made for
all 100 employees working on project P1
◦ Insert Anomaly
Cannot insert a project unless an employee is assigned to .
Inversely- Cannot insert an employee unless he/she is assigned
to a project.
EXAMPLE OF AN UPDATE
ANOMALY (2)
◦ Delete Anomaly
When a project is deleted, it will result in deleting all the
employees who work on that project. Alternately, if an employee is
the sole employee on a project, deleting that employee would
result in deleting the corresponding project.
Design a schema that does not suffer from the
insertion, deletion and update anomalies. If there
are any present, then note them so that
applications can be made to take them into
account
Null Values in Tuples
Relations should be designed such that their
tuples will have as few NULL values as possible
◦ Attributes that are NULL frequently could be placed
in separate relations (with the primary key)
◦ Reasons for nulls:
◦ a. attribute not applicable or invalid
◦ b. attribute value unkown (may exist)
◦ c. value known to exist, but unavailable
Introduction to Normalization
Normalization: Process of decomposing
unsatisfactory "bad" relations by breaking up
their attributes into smaller relations
Normal form: Condition using keys and FDs
of a relation to certify whether a relation
schema is in a particular normal form
◦ 2NF, 3NF, BCNF based on keys and FDs of a relation
schema
◦ 4NF based on keys, multi-valued dependencies
First Normal Form
Disallows composite attributes,
multivalued attributes, and nested
relations; attributes whose values for an
individual tuple are non-atomic
Considered to be part of the definition of
relation
First Normal Form
Multi valued attributes
◦ Create more tuples in the relation.
◦ Create a new relation with foreign keys to
this relation
Composite attributes
◦ Create an attribute for each sub-attribute.
Non-atomic
Second Normal Form
Uses the concepts of FDs, primary key
Definitions:
◦ Prime attribute - attribute that is member
of the primary key K
◦ Full functional dependency - a FD Y Z
where removal of any attribute from Y means
the FD does not hold any more
Examples
Second Normal Form
1NF + no partial dependency
Partial dependency
◦ An attribute is dependent of a PART of the
key.
◦ Only interesting if the key is composite.
What to do? Divide the relation
◦ Partial dependent attributes must have their
own relation.
Examples
Second Normal Form
clientNo, propertyNo pAddress, rentStart,
rentFinish, rent, ownerNo, oName IS NOT FULL FD
since
propertyNo pAddress, rent, ownerNo, oName
1NF
propertyNo → pAdress
PK = {clientNo, propertyNo)
1NF
2NF
Third Normal Form
A relation schema R is in third normal
form (3NF) if it is in 2NF and no non-
prime attribute A in R is transitively
dependent on the primary key
3rd Normal Form
:
1. propertyNo pAddress,
rent, ownerNo, oName
propertyNo ownerNo
2. ownerNo oName
: propertyNo oName
PropertyOwner không
đạt 3NF
BCNF (Boyce-Codd Normal
Form)
A relation schema R is in Boyce-Codd
Normal Form (BCNF) if whenever an FD X
A holds in R, then X is a superkey of R
◦ Each normal form is strictly stronger than the
previous one:
Every 2NF relation is in 1NF
Every 3NF relation is in 2NF
Every BCNF relation is in 3NF
◦ There exist relations that are in 3NF but not in
BCNF
◦ The goal is to have each relation in BCNF (or 3NF)
3NF
Primary key
Primary key
BCNF
{Student,course} Instructor
Instructor Course
Decomposing into 2 schemas
◦ {Student,Instructor} {Student,Course}
◦ {Course,Instructor} {Student,Course}
◦ {Course,Instructor} {Instructor,Student}
Example
Given the relation
Book(Book_title, Authorname, Book_type,
Listprice, Author_affil, Publisher)
The FDs are
Book_title Publisher, Book_type
Book_type Listprice
Authorname Author_affil
Các file đính kèm theo tài liệu này:
- chapter_7_normalization_6918.pdf