Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Database Management Systems
Database Normalization
Malay Bhattacharyya
Assistant Professor
Machine Intelligence Unit
Indian Statistical Institute, Kolkata
January, 2019
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
1 Overview
2 1NF
3 2NF
4 3NF
5 BCNF
6 EKNF
7 4NF
8 5NF
9 DKNF
10 6NF
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Why normalization?
Redundancy in a database denotes the repetition of stored data
Redundancy might cause various anomalies and problems in
storage requirements:
Update anomalies: If one copy of such repeated data is
updated, all copies need to be updated to prevent
inconsistency.
Insertion anomalies: It may be impossible to store certain
information without storing some other, unrelated information.
Deletion anomalies: It may be impossible to delete certain
information without losing some other, unrelated information.
Increasing storage requirements: The storage requirements
may increase over time.
These issues can be addressed by decomposing the database
normalization forces this!!!
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Why normalization?
Redundancy in a database denotes the repetition of stored data
Redundancy might cause various anomalies and problems in
storage requirements:
Update anomalies: If one copy of such repeated data is
updated, all copies need to be updated to prevent
inconsistency.
Insertion anomalies: It may be impossible to store certain
information without storing some other, unrelated information.
Deletion anomalies: It may be impossible to delete certain
information without losing some other, unrelated information.
Increasing storage requirements: The storage requirements
may increase over time.
These issues can be addressed by decomposing the database
normalization forces this!!!
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
An overview of different normal forms in the literature
Normal Form Details Reference
1NF (Codd (1970),
Date (2006))
Domains should be atomic/At least one can-
didate key
[1, 9]
2NF (Co dd (1971)) No non-prime attribute is functionally depen-
dent on a proper subset of any candidate key
[2]
3NF (Codd (1971),
Zaniolo (1982))
Every non-prime attribute is non-transitively
dependent on every candidate key
[2, 7]
BCNF (Codd
(1974))
Every non-trivial functional dependency is a
dependency on a superkey
[3]
EKNF (Zaniolo
(1982))
Every non-trivial functional dependency is ei-
ther the dependency of an elementary key at-
tribute or a dependency on a superkey
[7]
4NF (Fagin (1977)) Every non-trivial multi-valued dependency is
a dependency on a superkey
[4]
5NF (Fagin (1979)) Every non-trivial join dependency is implied
by the superkeys
[5]
DKNF (Fagin
(1981))
Every constraint on the table is a logical con-
sequence of the domain and key constraints
[6]
6NF (Date et al.
(2002))
No non-trivial join dependencies at all (w.r.t
generalized join)
[8]
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Motivations behind normalization
Normal Form Basic Motivation
1NF Removing non-atomicity
2NF Removing partial dependency (Part of key attribute
Non-key attribute)
3NF Removing transitive dependency (Non-key attribute
Non-key attribute)
BCNF Removing any kind of redundancy
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Denormalization
Denormalization is the process of converting a normalized
schema to a non-normalized one
Note: Designers use denormalization to tune performance of
systems to support time-critical operations.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Denormalization
Denormalization is the process of converting a normalized
schema to a non-normalized one
Note: Designers use denormalization to tune performance of
systems to support time-critical operations.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
1
st
Normal Form
The domain (or value set) of an attribute defines the set of values
it might contain.
A domain is atomic if elements of the domain are considered to be
indivisible units.
Company Make
Maruti WagonR, Ertiga
Honda City
Tesla RAV4
Toyota RAV4
BMW X1
Company Make
Maruti WagonR, Ertiga
Honda City
Tesla, Toyota RAV4
BMW X1
Only Company has atomic domain None of the attributes have atomic domains
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
1
st
Normal Form
Definition (1
st
Normal Form (1NF))
A relational schema R is in 1NF iff the domains of all attributes in
R are atomic.
The advantages of 1NF are as follows:
It eliminates redundancy
It eliminates repeating groups.
Note: In practice, 1NF includes a few more practical constraints
like each attribute must be unique, no tuples are duplicated, and
no columns are duplicated.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
1
st
Normal Form
The following relation is not in 1NF because the attribute Model is
not atomic.
Company Country Make Model Distributor
Maruti India WagonR LXI, VXI Carwala
Maruti India WagonR LXI Bhalla
Maruti India Ertiga VXI Bhalla
Honda Japan City SV Bhalla
Tesla USA RAV4 EV CarTrade
Toyota Japan RAV4 EV CarTrade
BMW Germany X1 Expedition CarTrade
We can convert this relation into 1NF in two ways!!!
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
1
st
Normal Form
Approach 1: Break the tuples containing non-atomic values into
multiple tuples.
Company Country Make Model Distributor
Maruti India WagonR LXI Carwala
Maruti India WagonR VXI Carwala
Maruti India WagonR LXI Bhalla
Maruti India Ertiga VXI Bhalla
Honda Japan City SV Bhalla
Tesla USA RAV4 EV CarTrade
Toyota Japan RAV4 EV CarTrade
BMW Germany X1 Expedition CarTrade
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
1
st
Normal Form
Approach 2: Decompose the relation into multiple relations.
Company Country Make
Maruti India WagonR
Maruti India Ertiga
Honda Japan City
Tesla USA RAV4
Toyota Japan RAV4
BMW Germany X1
Make Model Distributor
WagonR LXI Carwala
WagonR VXI Carwala
WagonR LXI Bhalla
Ertiga VXI Bhalla
City SV Bhalla
RAV4 EV CarTrade
RAV4 EV CarTrade
X1 Exp edition CarTrade
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Partial dependency
The partial dependency X Y holds in schema R if there is a
Z X such that Z Y .
We say Y is partially dependent on X if and only if there is a
proper subset of X that satisfies the dependency.
Note: The dependency A B implies if the A values are same,
then the B values are also same.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
2
nd
Normal Form
Definition (2
nd
Normal Form (2NF))
A relational schema R is in 2NF if each attribute A in R satisfies
one of the following criteria:
1 A is part of a candidate key.
2 A is not partially dependent on a candidate key.
In other words, no non-prime attribute (not a part of any candidate
key) is dependent on a proper subset of any candidate key.
Note: A candidate key is a superkey for which no proper subset is
a superkey, i.e. a minimal superkey.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
2
nd
Normal Form
The following relation is in 1NF but not in 2NF because Country
is a non-prime attribute that partially depends on Company, which
is a proper subset of the candidate key {Company, Make, Model,
Distributor}.
Company Country Make Model Distributor
Maruti India WagonR LXI Carwala
Maruti India WagonR VXI Carwala
Maruti India WagonR LXI Bhalla
Maruti India Ertiga VXI Bhalla
Honda Japan City SV Bhalla
Tesla USA RAV4 EV CarTrade
Toyota Japan RAV4 EV CarTrade
BMW Germany X1 Expedition CarTrade
We can convert this relation into 2NF!!!
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
2
nd
Normal Form
Approach: Decompose the relation into multiple relations.
Company Country
Maruti India
Honda Japan
Tesla USA
Toyota Japan
BMW Germany
Company Make Model Distributor
Maruti WagonR LXI Carwala
Maruti WagonR VXI Carwala
Maruti WagonR LXI Bhalla
Maruti Ertiga VXI Bhalla
Honda City SV Bhalla
Tesla RAV4 EV CarTrade
Toyota RAV4 EV CarTrade
BMW X1 Expedition CarTrade
Note: Each attribute in the left relation is a part of the candidate
key {Company, Country} and in the right relation is a part of the
candidate key {Company, Make, Model, Distributor}.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Functional dependency
The notion of functional dependency generalizes the notion of
superkey. Consider a relation schema R, and let X R and
Y R. The functional dependency X Y holds on schema R if
t1[X ] = t2[X ],
in any legal relation r(R), for all pairs of tuples t1 and t2 in r, then
t1[Y ] = t2[Y ].
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Functional dependency
Armstrong’s axioms:
Reflexivity property: If X is a set of attributes and Y X ,
then X Y holds. (known as trivial functional dependency)
Augmentation property: If X Y holds and γ is a set of
attributes, then γX γY holds.
Transitivity property: If both X Y and Y Z holds,
then X Z holds.
Other properties:
Union property: If X Y holds and X Z holds, then
X YZ holds.
Decomposition property: If X YZ holds, then both
X Y and X Z holds.
Pseudotransitivity property: If X Y and γY Z holds,
then X γ Z holds.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Functional dependency
Armstrong’s axioms:
Reflexivity property: If X is a set of attributes and Y X ,
then X Y holds. (known as trivial functional dependency)
Augmentation property: If X Y holds and γ is a set of
attributes, then γX γY holds.
Transitivity property: If both X Y and Y Z holds,
then X Z holds.
Other properties:
Union property: If X Y holds and X Z holds, then
X YZ holds.
Decomposition property: If X YZ holds, then both
X Y and X Z holds.
Pseudotransitivity property: If X Y and γY Z holds,
then X γ Z holds.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Closure of functional dependencies (FDs)
We can find F
+
, the closure of a set of FDs F , as follows:
Initialize F
+
with F
repeat
for each functional dependency f = X Y F
+
do
Apply reflexivity and augmentation properties on f and
include the resulting functional dependencies in F
+
end for
for each pair of functional dependencies f
1
, f
2
F
+
do
if f
1
and f
2
can be combined together using the transitivity
property then
Include the resulting functional dependency in F
+
end if
end for
until A
+
does not further change
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Closure of functional dependencies (FDs) An example
Consider a relation R = <UVWXYZ> and the set of FDs = {U
V, U W, WX Y, WX Z, V Y}. Let us compute some
non-trivial FDs that can be obtained from this.
By applying the augmentation property, we obtain
1 UX WX (from U W)
2 WX WXZ (from WX Z)
3 WXZ YZ (from WX Y)
By applying the transitivity property, we obtain
1 U Y (from U V and V Y)
2 UX Z (from UX WX and WX Z)
3 WX YZ (from WX WXZ and WXZ YZ)
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Closure of attribute sets
We can find A
+
, the closure of a set of attributes A, as follows:
Initialize A
+
with A
repeat
for each functional dependency f = X Y F
+
do
if X A
+
then
A
+
A
+
Y
end if
end for
until A
+
does not further change
Note: The closure is defined as the set of attributes that are
functionally determined by A under a set of FDs F .
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Closure of attribute sets
The usefulness of finding attribute closure is as follows:
Testing for superkey
Compute A
+
and check if R A
+
Testing functional dependencies
To check if an FD X Y holds, just check if Y X
+
Same for checking if X Y is in F
+
for a given F
Computing closure of F
For each A A(R), we find the closure A
+
, and for each
S A
+
, we output a functional dependency A S
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Closure of attribute sets An example
Consider a relation R = <UVWXYZ> and the set of FDs = {U
V, U W, WX Y, WX Z, V Y}. Let us compute UX
+
,
i.e., the closure of UX.
Initially UX
+
= UX
Then we have UX
+
= UVX (as U V and U UX)
Then we have UX
+
= UVWX (as U W and U UVX)
Then we have UX
+
= UVWXY (as WX Y and WX
UVWX)
Finally, we have UX
+
= UVWXYZ (as WX Z and WX
UVWXY)
Note: The closure of UX covers all the attributes in R.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition of a relation
If a relation is not in a desired normal form, it can be decomposed
into multiple relations such that each decomposed relation satisfies
the required normal form.
Suppose a relation R consists of a set of attributes
A(R) = {A
1
, A
2
, . . . , A
n
}. A decomposition of R replaces R by a
set of (two or more) relations {R
1
, . . . , R
m
} such that both the
following conditions hold:
i : A(R
i
) A(R)
A(R
1
) · · · A(R
m
) = A(R)
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition criteria
The decomposition of a relation might aim to satisfy different
criteria as listed below:
Preservation of the same relation through join (lossless-join)
Dependency preservation
Repetition of information
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Preservation of the same relation through join
.
X Y Z
x
1
y
1
z
1
x
1
y
2
z
2
&
X Y
x
1
y
1
x
1
y
2
X Z
x
1
z
1
x
1
z
2
X Z
x
1
z
1
x
1
z
2
Y Z
y
1
z
1
y
2
z
2
& . & .
X Y Z
x
1
y
1
z
1
x
1
y
1
z
2
x
1
y
2
z
1
x
1
y
2
z
2
X Y Z
x
1
y
1
z
1
x
1
y
2
z
2
Lossy-join decomposition Lossless-join decomposition
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Testing for lossless-join decomposition
A decomposition of R into {R
1
, R
2
} is lossless-join, iff
A(R
1
) A(R
2
) A(R
1
) or A(R
1
) A(R
2
) A(R
2
) in F
+
.
Consider the example of a relation R = <UVWXY> and the set of
FDs = {U VW, WX Y, V X, Y U}.
Note that, the decomposition R
1
= <UVW> and R
2
= <WXY>
is not lossless-join because R
1
R
2
= W, and W is neither a key
for R
1
nor for R
2
.
However, the decomposition R
1
= <UVW> and R
2
= <UXY> is
lossless-join because R
1
R
2
= U, and U is a key for R
1
.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Dependency preservation
The decomposition of a relation R with respect to a set of FDs F
replaces R with a set of (two or more) relations {R
1
, . . . , R
m
} with
FDs {F
1
, . . . , F
m
} such that F
i
is the subset of dependencies in F
+
(the closure of F) that include only the attributes in R
i
.
The decomposition is dependency preserving iff (
i
F
i
)
+
= F
+
.
Note: Through dependency preserving decomposition, we want to
minimize the cost of global integrity constraints based on FDs’
(i.e., avoid big joins in assertions).
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Testing for dependency preserving decomposition
Consider the example of a relation R = <XYZ>, having the key
X, and the set of FDs = {X Y, Y Z, X Z}.
Note that, the decomposition R
1
= <XY> and R
2
= <XZ> is
lossless-join but not dependency preserving because F
1
= {X
Y} and F
2
= {X Z} incur the loss of the FD {Y Z},
resulting into (F
1
F
2
)
+
6= F
+
.
However, the decomposition R
1
= <XY> and R
2
= <YZ> is
lossless-join and also dependency preserving because
F
1
= {X Y } and F
2
= {Y Z}, satisfying (F
1
F
2
)
+
= F
+
.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
3
rd
Normal Form
Definition (3
rd
Normal Form (3NF))
A relational schema R is in 3NF if for every non-trivial functional
dependency X A, one of the following statements is true:
1 X is a superkey of R.
2 A is a part of some key for R.
Note: A superkey is a set of one or more attributes that can
uniquely identify an entity in the entity set.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
3
rd
Normal Form
The following relation is in 2NF but not in 3NF because Country
is a non-prime attribute that depends on Company, which is again
a non-prime attribute. Notably, the key in this relation is {PID}.
PID Company Country Make Model Distributor
P01 Maruti India WagonR LXI Carwala
P02 Maruti India WagonR VXI Carwala
P03 Maruti India WagonR LXI Bhalla
P04 Maruti India Ertiga VXI Bhalla
P05 Honda Japan City SV Bhalla
P06 Tesla USA RAV4 EV CarTrade
P07 Toyota Japan RAV4 EV CarTrade
P08 BMW Germany X1 Expedition CarTrade
We can convert this relation into 3NF!!!
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
3
rd
Normal Form
Approach: Decompose the relation into multiple relations.
Company Country
Maruti India
Honda Japan
Tesla USA
Toyota Japan
BMW Germany
PID Company Make Model Distributor
P01 Maruti WagonR LXI Carwala
P02 Maruti WagonR VXI Carwala
P03 Maruti WagonR LXI Bhalla
P04 Maruti Ertiga VXI Bhalla
P05 Honda City SV Bhalla
P06 Tesla RAV4 EV CarTrade
P07 Toyota RAV4 EV CarTrade
P08 BMW X1 Expedition CarTrade
Note: Each attribute in the left relation is a part of the superkey
{Company, Country} and in the right relation is a part of the
candidate key {ProductID}.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Boyce-Codd Normal Form
Definition (Boyce-Codd Normal Form (BCNF))
A relational schema R is in BCNF if for every non-trivial functional
dependency X A, X is a superkey of R.
Note: A superkey is a set of one or more attributes that can
uniquely identify an entity in the entity set.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Boyce-Codd Normal Form
The following relation is in 3NF but not in BCNF because the
attribute Price depends on non-superkey attributes.
PID Company Make Model Distributor Price
P01 Maruti WagonR LXI Carwala 415K
P02 Maruti WagonR VXI Carwala 470K
P03 Maruti WagonR LXI Bhalla 410K
P04 Maruti Ertiga VXI Bhalla 820K
P05 Honda City SV Bhalla 990K
P06 Tesla RAV4 EV CarTrade 1700K
P07 Toyota RAV4 EV CarTrade 1700K
P08 BMW X1 Expedition CarTrade 3520K
We can convert this relation into BCNF!!!
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An algorithm
Result := {R} and flag := FALSE
Compute F
+
while NOT flag do
if There is a schema R
i
Result that is not in BCNF then
Let X Y be a non-trivial functional dependency that
holds on R
i
such that (X R
i
) / F
+
and X Y = φ.
Result := (Result R
i
) (R
i
Y ) (X , Y ) // This is
simply decomposing R into R Y and XY provided
X Y in R violates BCNF
else
flag := TRUE
end if
end while
Note: This decomposition process ensures lossless property
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An algorithm
Result := {R} and flag := FALSE
Compute F
+
while NOT flag do
if There is a schema R
i
Result that is not in BCNF then
Let X Y be a non-trivial functional dependency that
holds on R
i
such that (X R
i
) / F
+
and X Y = φ.
Result := (Result R
i
) (R
i
Y ) (X , Y ) // This is
simply decomposing R into R Y and XY provided
X Y in R violates BCNF
else
flag := TRUE
end if
end while
Note: This decomposition process ensures lossless property
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An example
Given a relation R = <ABCDPQVZ>, which is not in BCNF,
having the key A and functional dependencies {CP A, BD
P, C B}.
Solution: Let us start with BD P. Based on this, we
decompose R and obtain <ABCDQVZ> and <BDP>. Now
<BDP> is in BCNF (BD is the key).
For C B, <ABCDQVZ> is not in BCNF. Therefore, we have
further decomposition into <ACDQVZ> and <CB>.
Thus, the decomposition <ACDQVZ>, <CB> and <BDP> is a
lossless-join decomposition of R into BCNF.
Alternate solution: Suppose, we start with C B. Then the
relation R would be decomposed into <ACDPQVZ> and <CB>.
The only dependencies that hold over <ACDPQVZ> are CP A
and the key dependency A ACDPQVZ. CP is a key. Hence the
decomposed relations are in BCNF.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An example
Given a relation R = <ABCDPQVZ>, which is not in BCNF,
having the key A and functional dependencies {CP A, BD
P, C B}.
Solution: Let us start with BD P. Based on this, we
decompose R and obtain <ABCDQVZ> and <BDP>. Now
<BDP> is in BCNF (BD is the key).
For C B, <ABCDQVZ> is not in BCNF. Therefore, we have
further decomposition into <ACDQVZ> and <CB>.
Thus, the decomposition <ACDQVZ>, <CB> and <BDP> is a
lossless-join decomposition of R into BCNF.
Alternate solution: Suppose, we start with C B. Then the
relation R would be decomposed into <ACDPQVZ> and <CB>.
The only dependencies that hold over <ACDPQVZ> are CP A
and the key dependency A ACDPQVZ. CP is a key. Hence the
decomposed relations are in BCNF.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An example
Given a relation R = <ABCDPQVZ>, which is not in BCNF,
having the key A and functional dependencies {CP A, BD
P, C B}.
Solution: Let us start with BD P. Based on this, we
decompose R and obtain <ABCDQVZ> and <BDP>. Now
<BDP> is in BCNF (BD is the key).
For C B, <ABCDQVZ> is not in BCNF. Therefore, we have
further decomposition into <ACDQVZ> and <CB>.
Thus, the decomposition <ACDQVZ>, <CB> and <BDP> is a
lossless-join decomposition of R into BCNF.
Alternate solution: Suppose, we start with C B. Then the
relation R would be decomposed into <ACDPQVZ> and <CB>.
The only dependencies that hold over <ACDPQVZ> are CP A
and the key dependency A ACDPQVZ. CP is a key. Hence the
decomposed relations are in BCNF.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An example
Given a relation R = <ABCDPQVZ>, which is not in BCNF,
having the key A and functional dependencies {CP A, BD
P, C B}.
Solution: Let us start with BD P. Based on this, we
decompose R and obtain <ABCDQVZ> and <BDP>. Now
<BDP> is in BCNF (BD is the key).
For C B, <ABCDQVZ> is not in BCNF. Therefore, we have
further decomposition into <ACDQVZ> and <CB>.
Thus, the decomposition <ACDQVZ>, <CB> and <BDP> is a
lossless-join decomposition of R into BCNF.
Alternate solution: Suppose, we start with C B. Then the
relation R would be decomposed into <ACDPQVZ> and <CB>.
The only dependencies that hold over <ACDPQVZ> are CP A
and the key dependency A ACDPQVZ. CP is a key. Hence the
decomposed relations are in BCNF.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into BCNF An example
Given a relation R = <ABCDPQVZ>, which is not in BCNF,
having the key A and functional dependencies {CP A, BD
P, C B}.
Solution: Let us start with BD P. Based on this, we
decompose R and obtain <ABCDQVZ> and <BDP>. Now
<BDP> is in BCNF (BD is the key).
For C B, <ABCDQVZ> is not in BCNF. Therefore, we have
further decomposition into <ACDQVZ> and <CB>.
Thus, the decomposition <ACDQVZ>, <CB> and <BDP> is a
lossless-join decomposition of R into BCNF.
Alternate solution: Suppose, we start with C B. Then the
relation R would be decomposed into <ACDPQVZ> and <CB>.
The only dependencies that hold over <ACDPQVZ> are CP A
and the key dependency A ACDPQVZ. CP is a key. Hence the
decomposed relations are in BCNF.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Comments
Note that
BCNF is stronger than 3NF if a schema R is in BCNF then
it is also in 3NF.
3NF is stronger than 2NF if a schema R is in 3NF then it is
also in 2NF.
2NF is stronger than 1NF if a schema R is in 2NF then it is
also in 1NF.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Elementary Key Normal Form
Definition (Elementary Key Normal Form (EKNF))
A relational schema R is in EKNF if for every non-trivial functional
dependency X A, one of the following statements is true:
1 X is a superkey of R.
2 X is an elementary key attribute
Note: A non-trivial functional dependency X Y is an
elementary dependency if there exist no partial dependency. A key
K is elementary key if K Y is an elementary dependency.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Multi-valued dependency
Consider a relation schema R, and let X R and Y R. The
functional dependency X Y holds on schema R if
t1[X ] = t2[X ],
in any legal relation r(R), for all pairs of tuples t1 and t2 in r ,
implies
t1[X ] = t2[X ] = t3[X ] = t4[X ]
t1[Y ] = t3[Y ] and t2[Y ] = t4[Y ]
t1[Z ] = t4[Z ] and t2[Z ] = t3[Z ]
where the two tuples t3 and t4 are also in r and Z denotes
R (X Y ).
Note: The tuples t1, t2, t3 and t4 are not necessarily distinct.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Multi-valued dependency
Consider a relation schema R, and let X R and Y R. The
functional dependency X Y holds on schema R if
t1[X ] = t2[X ],
in any legal relation r(R), for all pairs of tuples t1 and t2 in r ,
implies
t1[X ] = t2[X ] = t3[X ] = t4[X ]
t1[Y ] = t3[Y ] and t2[Y ] = t4[Y ]
t1[Z ] = t4[Z ] and t2[Z ] = t3[Z ]
where the two tuples t3 and t4 are also in r and Z denotes
R (X Y ).
Note: The tuples t1, t2, t3 and t4 are not necessarily distinct.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Visualizing multi-valued dependency
X Y R (X Y )
t1 m
1
...m
i
m
i+1
...m
j
m
j+1
...m
k
t2 m
1
...m
i
n
i+1
...n
i
n
j+1
...n
k
t3 m
1
...m
i
m
i+1
...m
j
n
j+1
...n
k
t4 m
1
...m
i
n
i+1
...n
i
m
j+1
...m
k
An example of X Y
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Visualizing multi-valued dependency
X Y R (X Y )
t1 m
1
...m
i
m
i+1
...m
j
m
j+1
...m
k
t2 m
1
...m
i
n
i+1
...n
i
n
j+1
...n
k
t3 m
1
...m
i
m
i+1
...m
j
n
j+1
...n
k
t4 m
1
...m
i
n
i+1
...n
i
m
j+1
...m
k
An example of X Y
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Visualizing multi-valued dependency
X Y R (X Y )
t1 m
1
...m
i
m
i+1
...m
j
m
j+1
...m
k
t2 m
1
...m
i
n
i+1
...n
i
n
j+1
...n
k
t3 m
1
...m
i
m
i+1
...m
j
n
j+1
...n
k
t4 m
1
...m
i
n
i+1
...n
i
m
j+1
...m
k
An example of X Y
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Inference rules for multi-valued dependency
If X Y holds, then X (R (X Y )) holds.
If X Y holds and W Z, then WX YZ holds.
If X Y and Y Z both holds, then X (Z Y ) holds.
If X Y holds, then X Y holds.
If X Y holds and there exists W such that (a)
W Y = φ, (b) W Z and (c) Y Z, then X Z holds.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
4
th
Normal Form
Definition (4
th
Normal Form (4NF))
A relational schema R is in 4NF if for every non-trivial
multi-valued dependency X A, X is a superkey of R.
Note: A superkey is a set of one or more attributes that can
uniquely identify an entity in the entity set.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
4
th
Normal Form
The following relation is in 4NF.
Name Age Codeword Media
Irfan 28 abc News
Irfan 40 xyz News
Irfan 40 abc Radio
Irfan 28 xyz Radio
Imran 42 abc News
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into 4NF An algorithm
Result := {R} and flag := FALSE
Compute D
+
// Given schema R
i
, let D
i
denote the restriction
of D
+
to R
i
while NOT flag do
if There is a schema R
i
Result that is not in 4NF w.r.t. D
i
then
Let X Y be a non-trivial functional dependency that
holds on R
i
such that (X R
i
) / D
i
and X Y = φ.
Result := (Result R
i
) (R
i
Y ) (X , Y ) // Decompose
R into R Y and XY provided X Y in R violates 4NF
else
flag := TRUE
end if
end while
Note: The decomposition process ensures lossless property
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Decomposition into 4NF An algorithm
Result := {R} and flag := FALSE
Compute D
+
// Given schema R
i
, let D
i
denote the restriction
of D
+
to R
i
while NOT flag do
if There is a schema R
i
Result that is not in 4NF w.r.t. D
i
then
Let X Y be a non-trivial functional dependency that
holds on R
i
such that (X R
i
) / D
i
and X Y = φ.
Result := (Result R
i
) (R
i
Y ) (X , Y ) // Decompose
R into R Y and XY provided X Y in R violates 4NF
else
flag := TRUE
end if
end while
Note: The decomposition process ensures lossless property
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Join dependency
Given a relation schema R, a join dependency JD(R
1
, R
2
, . . . , R
n
)
is defined by the constraint that every legal relation r(R) should
have a non-additive join decomposition into R
1
, R
2
, . . . , R
n
, i.e. for
every such r we have
(π
R
1
(r), π
R
2
(r), . . . , π
R
n
(r)) = r .
Note: Multi-valued dependency is a special case of join
dependency where n = 2.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Join dependency
Given a relation schema R, a join dependency JD(R
1
, R
2
, . . . , R
n
)
is defined by the constraint that every legal relation r(R) should
have a non-additive join decomposition into R
1
, R
2
, . . . , R
n
, i.e. for
every such r we have
(π
R
1
(r), π
R
2
(r), . . . , π
R
n
(r)) = r .
Note: Multi-valued dependency is a special case of join
dependency where n = 2.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
5
th
Normal Form
Definition (5
th
Normal Form (5NF))
A relational schema R is in 5NF if for every non-trivial join
dependency JD(R
1
, R
2
, . . . , R
n
) in F
+
, every R
i
is a superkey of R.
Note: 5NF is also known as project-join normal form.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
5
th
Normal Form
Definition (5
th
Normal Form (5NF))
A relational schema R is in 5NF if for every non-trivial join
dependency JD(R
1
, R
2
, . . . , R
n
) in F
+
, every R
i
is a superkey of R.
Note: 5NF is also known as project-join normal form.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
Domain Key Normal Form
Definition (Domain Key Normal Form (DKNF))
A relational schema R is in DKNF if all the constraints and
dependencies that should hold on the valid relation states is a
logical consequence of the domain and key constraints on the
relation.
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
6
th
Normal Form
Definition (6
th
Normal Form (6NF))
A relational schema R is in 6NF if there exists no non-trivial join
dependencies at all (with reference to generalized join operator).
Malay Bhattacharyya Database Management Systems
Outline Overview 1NF 2NF 3NF BCNF EKNF 4NF 5NF DKNF 6NF
References
E. F. Codd (1970) CACM, 13(6):377-387.
E. F. Codd (1971) IBM Research Report, RJ909.
E. F. Codd (1974) IBM Research Report, RJ1385.
R. Fagin (1977) ACM TDS, 2(3), 262-278.
R. Fagin (1979) IBM Research Report, RJ2471.
R. Fagin (1981) CACM, 6, 387-415.
C. Zaniolo (1982) ACM TDS, 7(3), 489-499.
C. J. Date (2002) Temporal Data and the Relational Model,
Morgan Kaufmann.
C. J. Date (2006) Date on Database: Writings 2000-2006,
Springer-Verlag.
Malay Bhattacharyya Database Management Systems