Cơ sở dữ liệu - Functional dependencies

For a set X of attributes, we call the closure of X (with respect to a set of functional dependencies F), noted X+, the maximum set of attributes such that XX+ (as a consequence of F) Consider the relation scheme R(A,B,C,D) with functional dependencies {A}{C} and {B}{D}.  {A}+ = {A,C}  {B}+ = {B,D}  {C}+={C}  {D}+={D}  {A,B}+ = {A,B,C,D}

22 trang | Chia sẻ: huyhoang44 | Lượt xem: 887 | Lượt tải: 0

Bạn đang xem trước 20 trang tài liệu Cơ sở dữ liệu - Functional dependencies, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên

Functional Dependencies 2 Functional Dependencies (FD) - Definition Let R be a relation scheme and X, Y be sets of attributes in R. A functional dependency from X to Y exists if and only if:  For every instance of |R| of R, if two tuples in |R| agree on the values of the attributes in X, then they agree on the values of the attributes in Y We write X  Y and say that X determines Y Example on Student (sid, name, supervisor_id, specialization):  {supervisor_id}  {specialization} means  If two student records have the same supervisor (e.g., Dimitris), then their specialization (e.g., Databases) must be the same  On the other hand, if the supervisors of 2 students are different, we do not care about their specializations (they may be the same or different). Sometimes, I omit the brackets for simplicity:  supervisor_id  specialization 3 Trivial FDs A functional dependency X  Y is trivial if Y is a subset of X  {name, supervisor_id}  {name}  If two records have the same values on both the name and supervisor_id attributes, then they obviously have the same name.  Trivial dependencies hold for all relation instances A functional dependency X  Y is non-trivial if YX =   {supervisor_id}  {specialization}  Non-trivial FDs are given implicitly in the form of constraints when designing a database.  For instance, the specialization of a students must be the same as that of the supervisor.  They constrain the set of legal relation instances. For instance, if I try to insert two students under the same supervisor with different specializations, the insertion will be rejected by the DBMS 4 Functional Dependencies and Keys A FD is a generalization of the notion of a key. For Student (sid, name, supervisor_id, specialization), we write: {sid}  {name, supervisor_id, specialization}  The sid determines all attributes (i.e., the entire record)  If two tuples in the relation student have the same sid, then they must have the same values on all attributes.  In other words they must be the same tuple (since the relational model does not allow duplicate records) 5 Superkeys and Candidate Keys A set of attributes that determine the entire tuple is a superkey  {sid, name} is a superkey for the student table.  Also {sid, name, supervisor_id} etc. A minimal set of attributes that determines the entire tuple is a candidate key  {sid, name} is not a candidate key because I can remove the name.  sid is a candidate key – so is HKID (provided that it is stored in the table). If there are multiple candidate keys, the DB designer chooses designates one as the primary key. 6 Reasoning about Functional Dependencies It is sometimes possible to infer new functional dependencies from a set of given functional dependencies  independently from any particular instance of the relation scheme or of any additional knowledge Example: From {sid}  {first_name} and {sid} {last_name} We can infer {sid}  {first_name, last_name} 7 Armstrong’s Axioms Be X, Y, Z be subset of the relation scheme of a relation R Reflexivity: If YX, then XY (trivial FDs)  {name, supervisor_id}{name} Augmentation: If XY , then XZYZ  if {supervisor_id} {spesialization} ,  then {supervisor_id, name}{spesialization, name} Transitivity: If XY and YZ, then XZ  if {supervisor_id} {spesialization} and {spesialization} {lab}, then {supervisor_id}{lab} 8 Properties of Armstrong’s Axioms Armstrong’s axioms are sound (i.e., correct) and complete (i.e., they can produce all possible FDs) Example: Transitivity Let X, Y, Z be subsets of the relation R If XY and YZ, then XZ Proof of soundness: Assume two tuples T1 and T2 of |R| are such that, for all attributes in X, T1.X = T2.X. We want to prove that if transitivity holds and T1.X = T2.X, then indeed T1.Z = T2.Z  since XY and T1.X = T2.X then, T1.Y = T2.Y  since YZ and T1.Y = T2.Y then T1.Z = T2.Z 9 Additional Rules based on Armstrong’s axioms Armstrong’s axioms can be used to produce additional rules that are not basic, but useful: Weak Augmentation rule: Let X, Y, Z be subsets of the relation R If XY , then XZY Proof of soundness for Weak Augmentation If XY (1) Then by Augmentation XZYZ (2) And by Reflexivity YZ Y because Y  YZ (3) Then by Transitivity of (1) and (2) we have XZ  Y Other useful rules: If X  Y and X  Z, then X  YZ (union) If X  YZ, then X  Y and X  Z (decomposition) If X  Y and ZY  W, then ZX  W (pseudotransitivity) 10 Closure of a Set of Functional Dependencies For a set F of functional dependencies, we call the closure of F, noted F+, the set of all the functional dependencies that can be derived from F (by the application of Armstrong’s axioms).  Intuitively, F+ is equivalent to F, but it contains some additional FDs that are only implicit in F. Consider the relation scheme R(A,B,C,D) with F = {{A} {B},{B,C} {D}} F+ = { {A} {A}, {B}{B}, {C}{C}, {D}{D}, {A,B}{A,B}, [], {A}{B}, {A,B}{B}, {A,D}{B,D}, {A,C}{B,C}, {A,C,D}{B,C,D}, {A} {A,B}, {A,D}{A,B,D}, {A,C}{A,B,C}, {A,C,D}{A,B,C,D}, {B,C} {D}, [], {A,C} {D}, []} 11 Finding Keys Example: Consider the relation scheme R(A,B,C,D) with functional dependencies {A}{C} and {B}{D}. Is {A,B} a candidate key? For {A,B} to be a candidate key, it must  determine all attributes (i.e., be a superkey)  be minimal {A,B} is a superkey because:  {A}{C}  {A,B}{A,B,C} (augmentation by AB)  {B}{D}  {A,B,C}{A,B,C,D} (augmentation by A,B,C)  We obtain {A,B}{A,B,C,D} (transitivity) {A,B} is minimal because neither {A} nor {B} alone are candidate keys 12 Closure of a Set of Attributes For a set X of attributes, we call the closure of X (with respect to a set of functional dependencies F), noted X+, the maximum set of attributes such that XX+ (as a consequence of F) Consider the relation scheme R(A,B,C,D) with functional dependencies {A}{C} and {B}{D}.  {A}+ = {A,C}  {B}+ = {B,D}  {C}+={C}  {D}+={D}  {A,B}+ = {A,B,C,D} 13 Algorithm for Computing the Closure of a Set of Attributes Input:  R a relation scheme  F a set of functional dependencies  X  R (the set of attributes for which we want to compute the closure) Output:  X+ the closure of X w.r.t. F X(0) := X Repeat X(i+1) := X(i)  Z, where Z is the set of attributes such that  there exists YZ in F, and  Y  X(i) Until X(i+1) := X(i) Return X(i+1) 14 Closure of a Set of Attributes: Example R = {A,B,C,D,E,G} F = { {A,B}{C}, {C}{A}, {B,C}{D}, {A,C,D}{B}, {D}{E,G}, {B,E}{C}, {C,G}{B,D}, {C,E}{A,G}} X = {B,D} X(0) = {B,D}  {D}{E,G}, X(1) = {B,D,E,G},  {B,E}{C} X(2) = {B,C,D,E,G},  {C,E}{A,G} X(3) = {A,B,C,D,E,G} X(4) = X(3) 15 Closure of a Set of Attributes: Example • R = {A,B,C,D,E,G,H} • F = { {B}{A}, {D,A}{C,E}, {D}{H}, {G,H}{C}, {A,C}{D}} • X = {A,C} ? • R = {A,B,C,D,E,F} • F = { {A}{B}, {A}{C}, {C,D}{E}, {C,D}{F}, {B}{E}} • X = {A,C} • Which of the following functional dependencies does not hold: • {A}{E} • CD -> EF • AD -> F • B -> CD 16 Uses of Attribute Closure There are several uses of the attribute closure algorithm: Testing for superkey  To test if X is a superkey, we compute X+, and check if X+ contains all attributes of R. X is a candidate key if none of its subsets is a key. Testing functional dependencies  To check if a functional dependency X  Y holds (or, in other words, is in F+), just check if Y  X+. Computing the closure of F  For each subset X  R, we find the closure X+, and for each Y  X+, we output a functional dependency X  Y. Computing if two sets of functional dependencies F and G are equivalent, i.e., F+ = G+  For each functional dependency YZ in F  Compute Y+ with respect to G  If Z  Y+ then YZ is in G+  And vice versa 17 Redundancy of FDs Sets of functional dependencies may have redundant dependencies that can be inferred from the others  {A}{C} is redundant in: {{A}{B}, {B}{C},{A} {C}} Parts of a functional dependency may be redundant  Example of extraneous/redundant attribute on RHS: {{A}{B}, {B}{C}, {A}{C,D}} can be simplified to {{A}{B}, {B}{C}, {A}{D}} (because {A}{C} is inferred from {A}  {B}, {B}{C})  Example of extraneous/redundant attribute on LHS: {{A}{B}, {B}{C}, {A,C}{D}} can be simplified to {{A}{B}, {B}{C}, {A}{D}} (because of {A}{C}) 18 Canonical Cover A canonical cover for F is a set of dependencies Fc such that  F and Fc,are equivalent  Fc contains no redundancy  Each left side of functional dependency in Fc is unique.  For instance, if we have two FD XY, XZ, we convert them to XYZ. Algorithm for canonical cover of F: repeat Use the union rule to replace any dependencies in F X1  Y1 and X1  Y2 with X1  Y1 Y2 Find a functional dependency X  Y with an extraneous attribute either in X or in Y If an extraneous attribute is found, delete it from X  Y until F does not change Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied 19 Example of Computing a Canonical Cover R = (A, B, C) F = {A  BC B  C A  B AB  C} Combine A  BC and A  B into A  BC  Set is now {A  BC, B  C, AB  C} A is extraneous in AB  C because of B  C.  Set is now {A  BC, B  C} C is extraneous in A  BC because of A  B and B  C. The canonical cover is: A  B B  C 20 Pitfalls in Relational Database Design Functional dependencies can be used to refine ER diagrams or independently (i.e., by performing repetitive decompositions on a "universal" relation that contains all attributes). Relational database design requires that we find a “good” collection of relation schemas. A bad design may lead to  Repetition of Information.  Inability to represent certain information. Design Goals:  Avoid redundant data  Ensure that relationships among attributes are represented  Facilitate the checking of updates for violation of database integrity constraints. 21 Example of Bad Design Consider the relation schema: Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount) where: {branch-name}{branch-city, assets} Bad Design  Wastes space. Data for branch-name, branch-city, assets are repeated for each loan that a branch makes  Complicates updating, introducing possibility of inconsistency of assets value  Difficult to store information about a branch if no loans exist. Can use null values, but they are difficult to handle. 22 Usefulness of FDs Use functional dependencies to decide whether a particular relation R is in “good” form. In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that  each relation is in good form  the decomposition is a lossless-join decomposition  if possible, preserve dependencies In our example the problem occurs because there FDs ({branch-name}{branch-city, assets}) where the LHS is not a key Solution: decompose the relation schema Lending-schema into:  Branch-schema = (branch-name, branch-city,assets)  Loan-info-schema = (customer-name, loan-number, branch-name, amount)

Các file đính kèm theo tài liệu này:

chapter_6_funtional_dependencies_1162.pdf