NORMALIZATION : NORMALIZATION Normalization is the process of efficiently organizing data in database.
A database is (technically) normalized if it has no repeating groups---all data in fields are atomic
Further normalization---bringing the database into the right form---this is the “scientific” part of database design. Benefits of normalized database : Benefits of normalized database Avoiding repetitive entries.
Reducing required storage space.
Preventing the need to restructure existing tables to accommodate new data.
Increased speed and flexibility of queries, sorts and summaries.
The database community has developed a series of guidelines for ensuring that databases are normalized .These are referred to as normal forms. Principal normal forms : Principal normal forms - First (i.e. normalized)--1NF
- BC (BOYCE/CODD)--BCNF
- Fourth The First Normal Form : The First Normal Form First normal form(1NF) sets the very basic rules for an organized database:
Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key) Example: : Example: For a table to be in first normal form, data must be broken up into the smallest units possible. The following table is not in first normal form. Slide 6: To conform to first normal form, this table would require additional fields.
The name field should be divided into first and last name and the address should be divided by street, city, state and zip like this. Slide 7: In addition to breaking data up into the smaller meaningful values, tables in first normal form should not contain repetitions groups of fields such as in the following table. Slide 8: The problem is that each representative can have multiply clients not all will have three.
The solution to this is to add a record for each new piece of information.
Notice the splitting of the first and last name fields again. Primary key: Rep id &Foreign key: Client id : Primary key: Rep id &Foreign key: Client id Second normal form(2NF) : Second normal form(2NF) Second normal form(2NF) further addresses the concept of removing duplicative data:
Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows of a table and them in separate tables.
Create relationships between these new tables and their predecessors through the use of foreign keys. Slide 11: The second normal form states that each field in a multiple field primary key table must be directly related to the entire primary key.
Or in other words , each non-key field should be a fact about all the fields in the primary key.
Only fields that are absolutely necessary should show up in our table, all other fields should reside in different tables. Orderitem table : orderno + itemno : Orderitem table : orderno + itemno Slide 13: All keys does not depend on the entire primary key , but some of them depend on a primary key.
The principle of second normal form is violated.
Move columns that do not depend on the entire primary key to another table.
Anomalies in the above table:
Delete problem The columns that do not depend on the primary key are itemname and unitprice.They depend only on the itemno.So these two columns have to be removed from oderitem table. : The columns that do not depend on the primary key are itemname and unitprice.They depend only on the itemno.So these two columns have to be removed from oderitem table. Slide 15: Insert: it is possible to add a new item in item table without worrying about order table.
Update: modifications can be made easily in the item table.
Delete: even if an order is deleted, the information about the item is not lost. Third normal form : Third normal form A table is in third normal form (3NF)
if it is in second normal form and if all non- key columns in the table depend non-key columns in the table depend non-transitively on the entire primary key.
B C so A B C Functional dependencies : Functional dependencies Orders Table: Orderdate and cusno depends functionally on orderno.
Orderitem table: qty and itemno depends functionally on orderno.But billamount would be calculated as bill amount=qty * unitprice.
Qty is available in orderitem table, whereas unitprice is available in item table. Thus the column billamount does not depend directly on the primary keys of orderitem table.
This is the case of transitive dependency.
Item table: itemname and unitprice functionally depends on the primary key itemno. So that there is no transitive dependency in this table. Slide 18: Anamolies in the table:
Insert problem: Whenever a new order is placed in orderitem table, billamount would be blank, since it is not known beforehand.
Update problem: if the orders were to be modified, changes would be made to qty or unitprice and billamount have to be recalculated every time and updated to all rows containing that order.
Delete problem: if an item has to be deleted for some reason from the order, billamount have to be recalculated every time and updated to all rows containing that order. Slide 19: Thank you