Sabtu, 25 April 2009

NORMALIZATION

DATABASE DESIGNING PROCESS
• Collect the user / business requirement
• Develop the E-R Model based on user / business requirement
• Convert the E-R Model to a relation (table) community
• Normalize the relation to get rid of the anomaly
• Implement them into database by making table for each relation that has been normalized

DATABASE NORMALIZATION
• Normalization is a database structure making process so most of the ambiguity can be cleared
• Normalization phase start at the lowest phase (1NF) till the highest phase (5NF)
• Usually only reach 3NF level or BCNF because it has been adequate to produce high quality tables
• Why Do We Need Normalization ?
- To optimize the table structures
- To increase speed
- To restrict the insertion of same data
- More efficient in using storage media
- Reduce redundancy
- To avoid anomaly (insertion anomalies, deletion anomalies, update anomalies)
- To increase data integrity
• A table can be a good (efficient) one or normal if meet 3 criteria below :
- If there's decompositioning table, so the decompository must be ensured safe (Lossless-Join Decomposition). Which means, after that table being decomposed into new tables, that new tables could also produce the former tables similarly
- Being maintained the functional dependency when occur the data changes (Dependency Preservation)
- Not defying the Boyce-Codd Normal Form (BCNF)
• If the third criteria cannot be fulfilled, so at least that table not defying the Normal Form Third Phase (3rd Normal Form / 3NF)

FUNCTIONAL DEPENDENCY
• Functional Dependency describes the attributes relationship in a relation
• An attribute can be functional dependant with others if we use that attribute value to determine the value of other attribute
• The symbol that was used is --> to represent functional dependency. --> was read : functionally determine
• Notation : A --> B
- A and B are attributes from a tabel. Which means functionally A determines B or B depends on A, if and only if there're 2 data row with same A value, so B value also the same
• Notation :
- Is a reverse from previous notation
• Functional Dependency Example
- Functional Dependency :
• NRP --> Nama
• Mata Kuliah, NRP --> Nilai
- Non Functional Dependency :
• Mata Kuliah --> NRP
• NRP --> Nilai
• Functional Dependency from table nilai :
- NRP --> Nama
• Because for every same NRP value, so Nama value also the same
- {Mata_Kuliah, NRP} --> Nilai
• Because attribute Nilai depends on Mata_Kuliah and NRP all together. In other meaning, for every same value of Mata_Kuliah and NRP, so Nilai is also the same, because Mata_Kuliah and NRP are the key (unique)

FIRST NORMAL FORM (1NF)
• A table is in a normal 1 form if it's not in an unnormalized table form, where there're duplication of homogenous field and may exist null fields
• They're not permitted to be :
- Multivalued attribute
- Composite attribute or combination of both
So :
- Domain attribute value must be atomic
• Example
- If there're tables as follows :
Or
- Both of tables above doesn't meet 1NF requirement
- It should be decomposed to be :
• Table Mahasiswa
• Table Hobi

SECOND NORMAL FORM (2NF)
• Second Normal Form 2NF is satisfied in a table if it has satisfies the 1NF, and all attribute except primary key, integratedly have functional dependency in primary key
• A table doesn't meet 2NF, if there're attributes which its dependency (Functional Dependency) only being partial (only dependant in half of the primary key)
• If there're attributes which doesn't have dependency with primary key, so that attribute must be removed
• Functional Dependency X --> Y was said to be complete if deleting some attribute A from X means Y not anymore functional dependant
• Functional Dependency X --> Y was said to be partial if deleting some attribute A from X means Y still functional dependant
• Relation schema R in form of 2NF if every attribute non primary key A є R full dependant functionally in primary key R
• This table below meet 1NF, but doesn't meet 2NF :
• Doesn't meet 2NF, because {NIM, KodeMk} which assume to be the primary key, while :
{NIM, KodeMk} --> NamaMhs
{NIM, KodeMk} --> Alamat
{NIM, KodeMk} --> Matakuliah
{NIM, KodeMk} --> Sks
{NIM, KodeMk} --> NilaiHuruf
• That table must be decomposed first into some table which meet the 2NF requirement
• The functional dependency :
- {NIM, KodeMk} --> NilaiHuruf (fd1)
- NIM --> {NamaMhs, Alamat} (fd2)
- KodeMk --> {Matakuliah, Sks} (fd3)
• So :

THIRD NORMAL FORM (3NF)
• Third Normal Form 3NF was met if it has met the form 2NF, and if there're no non primary key attribute which have dependency with other non primary key attribute (transitive dependency)
• Example :
- Table mahasiswa as follows, met the 2NF requirement but not met the 3NF requirement
- Because there's still non primary key attribute (Kota and Provinsi) which has dependency with other non primary key attribute (KodePos)
KodePos --> {Kota, Provinsi}
- So that table must be docomposed into :

BOYCE-CODD NORMAL FORM (BCNF)
• Boyce-Codd Normal Form has a stronger constraint from third normal form. To become a BCNF, relation must be in First Normal Form and every attribute is induced to be dependant at function in super key attribute
• In example below, there's relation Seminar, where the primary key is NPM + Seminar :
- Siswa could take one or two seminar. Every seminar needs 2 pembimbing and every siswa is guided with either 2 of those two pembimbing seminar. Every pembimbing must only have one seminar. In this example, NPM and Seminar show the Pembimbing.
- The Seminar relation form is a third normal form, but not a BCNF because KodeSeminar still functional dependant at Pembimbing, if every Pembimbing could teach only one seminar. Seminar dependant at one non super key attribute like the one that conditioned by BCNF. So, relation Seminar must be splitted into two part, that is :

FOURTH AND FIFTH NORMAL FORM
• A relation is in Fourth Normal Form (4NF), if a relation is in BCNF and doesn't have multivalued dependency. To erase the multivalued dependency from one relation, we should divide the relation into two new relation. Each relation then has two attribute which has a multivalued relation
• Relation in Fifth Normal Form (5NF) has to deal with property which called join without losing the information (lossless join). This fifth normal form (5NF) also called PJNF / Projection Join Normal Form. This case is really rare appear in reality and hard to detect practically.

Below is summary of the normal form mentioned above :

Tidak ada komentar:

Posting Komentar