Sabtu, 25 April 2009

NORMALIZATION

DATABASE DESIGNING PROCESS
• Collect the user / business requirement
• Develop the E-R Model based on user / business requirement
• Convert the E-R Model to a relation (table) community
• Normalize the relation to get rid of the anomaly
• Implement them into database by making table for each relation that has been normalized

DATABASE NORMALIZATION
• Normalization is a database structure making process so most of the ambiguity can be cleared
• Normalization phase start at the lowest phase (1NF) till the highest phase (5NF)
• Usually only reach 3NF level or BCNF because it has been adequate to produce high quality tables
• Why Do We Need Normalization ?
- To optimize the table structures
- To increase speed
- To restrict the insertion of same data
- More efficient in using storage media
- Reduce redundancy
- To avoid anomaly (insertion anomalies, deletion anomalies, update anomalies)
- To increase data integrity
• A table can be a good (efficient) one or normal if meet 3 criteria below :
- If there's decompositioning table, so the decompository must be ensured safe (Lossless-Join Decomposition). Which means, after that table being decomposed into new tables, that new tables could also produce the former tables similarly
- Being maintained the functional dependency when occur the data changes (Dependency Preservation)
- Not defying the Boyce-Codd Normal Form (BCNF)
• If the third criteria cannot be fulfilled, so at least that table not defying the Normal Form Third Phase (3rd Normal Form / 3NF)

FUNCTIONAL DEPENDENCY
• Functional Dependency describes the attributes relationship in a relation
• An attribute can be functional dependant with others if we use that attribute value to determine the value of other attribute
• The symbol that was used is --> to represent functional dependency. --> was read : functionally determine
• Notation : A --> B
- A and B are attributes from a tabel. Which means functionally A determines B or B depends on A, if and only if there're 2 data row with same A value, so B value also the same
• Notation :
- Is a reverse from previous notation
• Functional Dependency Example
- Functional Dependency :
• NRP --> Nama
• Mata Kuliah, NRP --> Nilai
- Non Functional Dependency :
• Mata Kuliah --> NRP
• NRP --> Nilai
• Functional Dependency from table nilai :
- NRP --> Nama
• Because for every same NRP value, so Nama value also the same
- {Mata_Kuliah, NRP} --> Nilai
• Because attribute Nilai depends on Mata_Kuliah and NRP all together. In other meaning, for every same value of Mata_Kuliah and NRP, so Nilai is also the same, because Mata_Kuliah and NRP are the key (unique)

FIRST NORMAL FORM (1NF)
• A table is in a normal 1 form if it's not in an unnormalized table form, where there're duplication of homogenous field and may exist null fields
• They're not permitted to be :
- Multivalued attribute
- Composite attribute or combination of both
So :
- Domain attribute value must be atomic
• Example
- If there're tables as follows :
Or
- Both of tables above doesn't meet 1NF requirement
- It should be decomposed to be :
• Table Mahasiswa
• Table Hobi

SECOND NORMAL FORM (2NF)
• Second Normal Form 2NF is satisfied in a table if it has satisfies the 1NF, and all attribute except primary key, integratedly have functional dependency in primary key
• A table doesn't meet 2NF, if there're attributes which its dependency (Functional Dependency) only being partial (only dependant in half of the primary key)
• If there're attributes which doesn't have dependency with primary key, so that attribute must be removed
• Functional Dependency X --> Y was said to be complete if deleting some attribute A from X means Y not anymore functional dependant
• Functional Dependency X --> Y was said to be partial if deleting some attribute A from X means Y still functional dependant
• Relation schema R in form of 2NF if every attribute non primary key A є R full dependant functionally in primary key R
• This table below meet 1NF, but doesn't meet 2NF :
• Doesn't meet 2NF, because {NIM, KodeMk} which assume to be the primary key, while :
{NIM, KodeMk} --> NamaMhs
{NIM, KodeMk} --> Alamat
{NIM, KodeMk} --> Matakuliah
{NIM, KodeMk} --> Sks
{NIM, KodeMk} --> NilaiHuruf
• That table must be decomposed first into some table which meet the 2NF requirement
• The functional dependency :
- {NIM, KodeMk} --> NilaiHuruf (fd1)
- NIM --> {NamaMhs, Alamat} (fd2)
- KodeMk --> {Matakuliah, Sks} (fd3)
• So :

THIRD NORMAL FORM (3NF)
• Third Normal Form 3NF was met if it has met the form 2NF, and if there're no non primary key attribute which have dependency with other non primary key attribute (transitive dependency)
• Example :
- Table mahasiswa as follows, met the 2NF requirement but not met the 3NF requirement
- Because there's still non primary key attribute (Kota and Provinsi) which has dependency with other non primary key attribute (KodePos)
KodePos --> {Kota, Provinsi}
- So that table must be docomposed into :

BOYCE-CODD NORMAL FORM (BCNF)
• Boyce-Codd Normal Form has a stronger constraint from third normal form. To become a BCNF, relation must be in First Normal Form and every attribute is induced to be dependant at function in super key attribute
• In example below, there's relation Seminar, where the primary key is NPM + Seminar :
- Siswa could take one or two seminar. Every seminar needs 2 pembimbing and every siswa is guided with either 2 of those two pembimbing seminar. Every pembimbing must only have one seminar. In this example, NPM and Seminar show the Pembimbing.
- The Seminar relation form is a third normal form, but not a BCNF because KodeSeminar still functional dependant at Pembimbing, if every Pembimbing could teach only one seminar. Seminar dependant at one non super key attribute like the one that conditioned by BCNF. So, relation Seminar must be splitted into two part, that is :

FOURTH AND FIFTH NORMAL FORM
• A relation is in Fourth Normal Form (4NF), if a relation is in BCNF and doesn't have multivalued dependency. To erase the multivalued dependency from one relation, we should divide the relation into two new relation. Each relation then has two attribute which has a multivalued relation
• Relation in Fifth Normal Form (5NF) has to deal with property which called join without losing the information (lossless join). This fifth normal form (5NF) also called PJNF / Projection Join Normal Form. This case is really rare appear in reality and hard to detect practically.

Below is summary of the normal form mentioned above :

Minggu, 19 April 2009

Database and Entity Relationship Diagram

DATABASE DEFINITION
• A set of data, stored in magnetic disc, optical disc, or other secondary storage
• A collection of interconnected data from some enterprise (companies, governments institute, or private)
- Manufacture company --> production planning data, actual production data, material ordering data, etc
- Hospital --> patient data, doctor, nurse, etc
• A structured collection of records or data that is stored in a computer system. The structure itself is achieved by organizing the data according to a database model. Where the model in most common use today is the relational model.

DATABASE MANAGEMENT SYSTEM
• A software to organize the storage of data
• A collection / combination of database with a data based application software
• These application program was used to accessing and maintaining database
• Main objectives of DBMS are to provide an easy and efficient environment for usage, retrieval and storing data and information

BIT, BYTE, and FIELD
• Bit is the smallest part of data contains a 0 or 1 value
• Byte --> a collection of one kind bits
• Field --> a collection of one kind bytes where in database often called attribute
ATTRIBUTE / FIELD
• Attribute / field is a characteristic from one entity which provide a detailed explanation about that entity
• A relation can also have attributes
• Example of attribute :
- MAHASISWA : NIM, NAMA, ALAMAT
- MOBIL : NOMOR_PLAT, WARNA, JENIS, CC

ATTRIBUTE TYPES
• Single Vs Multivalue
- Single --> can also be stored with at most one value
- Multivalue --> can also be stored with more than one value but still in one type
• Atomic Vs Composition
- Atomic --> cannot be divided into a smaller attribute
- Composition --> a combination of some smaller attribute
- Derived Aatribute
• An attribute which its value can be produced from value of other attribute
• Ex : age can be produced from attribute birthdate
- Null value attribute
• An attribute which doesn't have a value for certain record
- Mandatory Value Attribute
• An attribute which must have a value

RECORD / TUPPLE
• Record / tupple is a row of data in certain relation
• Consist of a collection of atribute which its attribute interrelated each other to inform the entity / relation completely

Entity / File
• File is a collection of one types record and have a same element, same attribute but different data value
• File Types
- In application processing, file can be categorize by :
• Main file
• Transaction file
• Report file
• History file
• Protection file
• Working file

DOMAIN
• Domain is a set of value which permitted to be in one or more attribute
• Every attribute in certain relational database was defined as a domain

ELEMENT DATA KEY
• Key is a record element which used to find that record in access time or can also be used to identify every entity / record / row

KEY TYPES
• Superkey is one or more attribute from certain table which can be used to identify entity / record from table in unique way (not all attribute can be a superkey)
• Candidate key is a superkey with minimal attribute. Candidate key cannot be filled with attribute from other table so a candidate key is definitely a superkey, but not in the contrary
• Primary key
- One of attribute from candidate key can be choosen / specified into primary key with these three criteria :
• That key must be more natural to be used as a reference
• That key must be simpler
• That key must be ensured its uniqueness
• Alternate key is an attribute from unchosen candidate key which was chosen to be primary key
• Foreign key is a random attribute which points to primary key in other table. Foreign key can exist in certain relation which have a one to many cardinality or many to many. Foreign key usually placed in a table that connects to many
• External key is a lexical attribute (lexical attribute compilation) which its value often identifies one instance object

ERD (ENTITY RELATIONSHIP DIAGRAM)
• ERD is a networking model which used a structured word which stored in the system abstractly
• The differences between DFD and ERD :
- DFD is a functional networking model which will be applied by the system
- ERD is a data networking model which emphasizes in structured and data relationship

ERD ELEMENTS
• Entity
In ER Diagram, entity was denoted with rectangle shape. Entity is something that exist in real system as well as abstract where lies the storage of the data or the data being stored
• Relationship
This relationship ER Diagram was denoted with a rhombus shape. Relationship is a natural relation that exists between entity. Generally named with a basic verb in order to easy to do the reading of the relation
• Relationship Degree
Denoting a number of entity which participated in one relationship. A degree which often used in ERD.
• Attribute
Denoting a characteristic from every entity or relationship
• Cardinality
Shows a maximum number of tupple that can be related with entity, in other entity

RELATIONSHIP DEGREE
• Unary Relationship
Denoting a relationship model which exist between entity that come from same entity set
• Binary Relationship
Denoting a relationship model which exist between 2 entity
• Ternary Relationship
Denoting a relationship between instance from 3 entity type unilaterally

CARDINALITY
There're 3 cardinality relation, those are :
• One To One
Relation One To One denoted with one event in the first entity, only have one relation with one event in the second entity, and vice versa
• One To Many or Many To One
Relation One To Many is the same with Many To One, depends from which direction that relation is seen. For one event in the first entity can have many relation with event in the second entity. If otherwise, one event in the second entity only can have relation with one event in the first entity
• Many To Many
This relation exists, if every event in certain entity have many relation with event in other entity

CARDINALITY EXAMPLE

NOTATION (ER DIAGRAM)
Simbolic notation in ER Diagram are :
• Rectangle denoting an entity community
• Circle round shape denoting attribute
• Rhombus denoting a relation community
• Line as a connector between relation community with entity community and Entity Community with its attribute

Minggu, 05 April 2009

DATA FLOW DIAGRAM

DATA FLOW DIAGRAM

Data Flow Diagram
• Describing the system classification into a smaller modul
• Facilitate the user which less understand about computer to understand better about the systems under way
• Also called, a tools for making the systems model that allows the expert to describe the systems process as a functional network that connect as one with the data flow, either it is manually or computerized

Context Diagram
• Consist of one process and describe the scope of the system
• The highest level of DFD which describe all input into the system and output from the system
• System was limited by boundary (shown by ruptured line)
• Doesn't have a storage
• Steps for making context diagram :
- Specify the name of the system
- Determine the limits of the system
- Determine the terminator that exist in the system
- Determine what is to be received/given for the terminator from/to the system
- Draw the context diagram

Nil Diagram
• Describing the process from DFD
• Giving a comprehensive sight of view for the systems under way, to show the existing function or main process, data flow and external entity
• In this level there's possibility of data storage
• For process which didn't explained in detail for the next level, so it was added by simbol "*" or "P" in the end of the process number
• The balance of input and output (balancing) between nil diagram and context diagram must be well preserved

Detailed Diagram
• A diagram that explain which process should exist in the nil diagram or in the upper level
• Level numbering in DFD :

Level name Diagram name Process number
----------------------------------------------------------------
0 Context
1 Diagram 0 1.0, 2.0, 3.0, ...
2 Diagram 1.0 1.1, 1.2, 1.3, ...
3 Diagram 1.1 1.1.1, 1.1.2, ...

• On one level, it should be better if there exist not more than 7 process and it should've max 9 process, if there're more than we must do the decomposition

Process Spesification
• Every process in the DFD must've a process spesification
• On the top level, the method that was used to figure the process can be in mean of descriptive sentence
• On more detail level, that was in the lowest level (functional primitive) need a more structured spesification
• Process spesification will be a guide for the programmer in making program (coding)
• The method that was used in process spesification : process explanation in the form of narrative, decision table, decision tree

External Unit
• Something that was exist outside of the system, but it gives data to the system or gives data from the system
• Simbolize by notation box
• External entity not included into the system
• Denomination :
- Terminal name used the any noun word
- Terminal mustn't have same name except the object is also the same

Data Flow
• Show the flow of the information
• Figured by a straight line that connecting component from the system
• Data flow shown by an arrow line given the name by the streaming data flow
• Data flow streams between yhe process, data storage and show the data flow from the data in form of the input for the system
• Guide for denomination :
- Naming for data flow consist of some word that connect to the connecting line
- There mustn't any data flow which has a same name and denomination must reflect the content
- Data flow consist of some element which denominated by element group
- Avoid using the word "data" and "information" for denomination in data flow
- The name of data flow must be written fully
• Another prescription :
- Data flow name which get into a process mustn't identic with the data flow name which come out from that system
- Data flow that get in or out from data storage may not given name if :
• The data flow is simple and easy to understand
• The data flow figures the entire of data item
- There mustn't any data flow from the terminal goes to the data storage or vice versa because terminal is not a part from the system, terminal relationship with data storage must go through the process

Process
• Process explained what should be done by the system
• Process may processed the data or incoming data flow into an outcoming data flow
• Process has a function to transformed one or more input data into one or more output data according to the desired spesification
• Every process has one or more input and produce one or more output
• Process also often called "bubble"
• Guide in process denomination :
- Process name consist of verb and noun which reflect the function of that process
- Never use a word "process" as a part of name for one bubble
- There mustn't any process which have a same name
- Process must given numbers. Number sequence considerably follows the flow or process sequence, but the number sequence doesn't mean the process sequence cronologically

Data Storage
• Data storage is a place to store tha data that exist in the system
• Simbolize by a parallel line or two line with one of the side spread wide open
• Process may take data from the database or give data to the database
• Guide for denomination :
- The name must reflect data storage
- If the name consist of more than one word so we must give a conjunction sign

DFD Symbol



































Data Dictionary
• Has a function to help the systems subject to interpret the application in detail and organize all data element that was used by the system exactly so that the user and system analyst have a same basic understanding about the input, output, storage, and process
• In analyzing phase, the data dictionary was used to communicate between system analyst and the user
• In system development phase, the data dictionary was used to develop the input, report, and database
• Data flow in DAD has a global characteristic, more detail explanation can be seen in data dictionary
• Data dictionary contains all these things :
- Data flow name must be taken note in order for the reader who needs detailed explanation about one data flow can search through it more easily
- Alias or other name for the data must be written if exist
- Data shape was used in categorizing data dictionary into its usage when developing systems
- Data flow shows from where the data flows and where the data headed
- Explanation, give the explanation about the meaning of the data flow

Balancing in DFD
• Data flow that goes in and out from one process must be equal with the data flow that goes in and out from the detailed process in the under level
• Data flow name that goes in and out from one process must be equal with the name of data flow that goes in and out from that detailed process
• The number and name from external entity from one process must be equal with the number and name of the external entity from that detailed process
• Things that should be taken note for in DFD that has more than one level :
- There must be a balance in input and output between one level and other level
- Balance between level 0 and level 1 can be seen in input/output from the data flow to/from the terminal in level 0, while the balance in level 1 and level 2 can be seen in input/output from data flow to/from the relevant process.
- Data flow name, data storage and terminal in every level must be equal if the object is also the same

Prohibition In DFD
• Data flow mustn't be from external entity directly go to other external entity without going through some process
• Data flow mustn't be from data storage directly go to external entity without going through some process
• Data flow mustn't be from data storage directly go to other data storage without going through some process
• Data flow from one process directly go to other process without going through some data storage should likely be avoided