DBMS-normalization

Views:
 
Category: Education
     
 

Presentation Description

This PPT -describes the types of Normal Form in full detail

Comments

By: hellomathews (6 month(s) ago)

This is a useful presentation for me but it cant be downloadable .please give permission to download tis.

Presentation Transcript

PowerPoint Presentation:

normalization

CONTENT:

CONTENT 1. ATOMIC DOMAINS AND NORMAL FORMS 2. DATABASE-DESIGN PROCESS 3. DESIGN ISSUES 4. MODELING TEMPORAL DATA

1. ATOMIC DOMAINS AND NORMAL FORMS :

1. ATOMIC DOMAINS AND NORMAL FORMS ATOMIC: A domain is “ATOMIC”, if elements of the domain are considered to be indivisible units. NORMALIZATION: Normalization is the step-by-step decomposition (dividing (or) splitting) of complex records into simple records. (or) Organizing the data into more than one table is called “NORMALIZATION”.

BENEFITS OF NORMALIZATION:

BENEFITS OF NORMALIZATION (1) Faster Sorting (2) Index Creation (3) Simplify the structure of tables (4) Reduce redundancy (repetition) using the principles of “Non-loss decomposition”. - that’s records are decomposed without any loss of data

NORMAL FORMS (OR) TYPES OF NORMALIZATIONS:

NORMAL FORMS (OR) TYPES OF NORMALIZATIONS

PowerPoint Presentation:

Based on dependency the Normalization forms are classified as follows: 1. First Normal Form (1NF) Normalization using Functional dependency 2. Second Normal Form (2NF) 3. Third Normal Form (3NF) 4. Boyce Codd Normal Form (BCNF) Normalization using Multi-valued dependency 5. Fourth Normal Form (4NF) Normalization using Join dependency 6. Fifth Normal Form (5NF) (or) Project-Join Normal Form (PJNF) 7. Domain Key Normal Form (DKNF)

First Normal Form (1NF) :

First Normal Form (1NF)

First Normal Form (1NF) :

First Normal Form (1NF) 1. RULE: 2. For example- consider the following table “project”. Emp_code Dept Project_code Hours E101 System P27 p51 90 150 E102 Sales P27 p22 100 60 A table is said to be in First Normal Form, if each cell of the table contains only one value.

PowerPoint Presentation:

3. The data in the above table isn’t normalized, because as per rule each cell of the table contains only one value. But project_code , Hours –these two attributes contains more than one value. By applying 1NF-rule to the “project” table, we will get the following normalized 1NF-table Emp_code Dept Project_code Hours E101 System P27 90 E101 System p51 150 E102 Sales P27 100 E102 Sales p22 60

Normalization using Functional dependency :

Normalization using Functional dependency Functional Dependency (FD) (or) Equality-generating dependency Given two sets of attributes ‘X’ and ‘Y’, Y-is said to be functionally dependent on X. Functional Dependency - is denoted as Example consider the table “ Employee” X Y CODE NAME CITY E1 Henry Delhi E2 John Bangalore E3 Teddy Chennai

PowerPoint Presentation:

In the above table the attribute “code” has only one value for the attribute “Name”. Example: - for Code - E1 there’s exactly one value of Name - Henry - for E2- one value of Name-John - so “ Name ” is functionally dependent on “code” “ City “- is also functionally dependent on “Code” Because for each “ City “ there are one “ Code” Code Name Code City

SECOND NORMAL FORM (2NF):

SECOND NORMAL FORM (2NF)

SECOND NORMAL FORM (2NF):

SECOND NORMAL FORM (2NF) RULE Example Let us take the “project” table from 1NF A table is said to be in Second Normal Form (2NF), when it’s in 1NF and every attribute in the row is functionally dependent on whole key (fully), not just part of the key. Emp_code Dept Project_code Hours E101 System P27 90 E101 System p51 150 E102 Sales P27 100 E102 Sales p22 60

PowerPoint Presentation:

1. The “ Project” table is in 1NF and check whether the attribute is functionally dependent on whole key (or) not. 2. Each “ Emp_code “ has more than one value of “ Hours ” E101 - has two values of “Hours” 90,150 E102 - has two values of “ Hours “ 100,60 so, “Hours” is not functionally dependent on “ Emp_code ” 3. Similarly, project-code also have more than one value of Hours p27 - has two values of “Hours” (90,100) so, “project-code” is not functionally dependent on “Hours” Emp_code Hours project_code Hours

PowerPoint Presentation:

4. The combination of “ Emp_code ” and “ project_code ” have exactly one value of “ Hours ”. E101 and p27 - has Hours ’90’ E101 and p51 - has Hours ‘150’ E102 and p27 - has Hours ‘100’ E102 and p22 - has Hours ’60’ So, Emp_code and project_code is Functionally dependent on Whole key “ Hours “. Emp_code Project_code Hours

PowerPoint Presentation:

“Dept” is functionally dependent on whole key of “Emp_code”. (or) The “department” system have only one value of “Emp_code” (E-101) The “department” sales have only one value of “Emp_code” (E-102) Emp_code Dept E101 System E101 System E102 Sales E102 Sales Emp_code Dept Emp_code Dept E101 System E102 Sales

Guidelines for converting a Table into – 2NF:

Guidelines for converting a Table into – 2NF Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key. And place them in a separate table. Ignore the remaining attributes. ------------------------------------- According to “guidelines the table “project” is normalized into 2Nf-as follows: (1) “ Dept ” is functionally dependent on whole key of “ Emp_code ”. So place these two attributes in a separate table as follow, and name the table as “Emp_dept”. Emp_code Dept E101 System E101 System E102 Sales E102 Sales

PowerPoint Presentation:

The combination of Emp_code and project_code is functionally dependent on the whole key of “ Hours”. So, place these attributes in a separate table. project Emp_code Project_code Hours E101 P27 90 E101 p51 150 E102 P27 100 E102 p22 60

THIRD NORMAL FORM- (3NF):

THIRD NORMAL FORM- (3NF)

THIRD NORMAL FORM- (3NF):

THIRD NORMAL FORM- (3NF) RULE: Example: Let us consider the following table “Employee” A table is said to be in Third Normal Form (3NF), when it’s in 2NF and primary key is functionally dependent on every non-key attribute. Emp_code Dept Dept_Head E101 System Jones E102 Sales Smith E103 Finance Kelly E104 Finance Kelly E105 Sales Smith

PowerPoint Presentation:

First check whether the table “Employee” is in 2NF (1) In the Employee table the attribute Emp_code is Primary key. (2) Dept and Dept_Head are non-key attributes (3) Each Emp_code has exactly only one value of Dept. (4) So, Emp_code is functionally dependent on the non-key attribute Dept . (5) Emp_code is also functionally dependent on the non-key attribute Dept_Head . Dept Emp_code Dept_Head Emp_code

Guidelines for converting a Table into – 3NF:

Guidelines for converting a Table into – 3NF 1. Find and remove non-key attributes that aren’t functionally dependent on “primary key”. 2. place them in a separate table. 3. Group the remaining attributes. ------------------------------ According to Guidelines the table “Employee” is normalized to 3NF as follows: (A) Emp_code ---- Primary key (B) Dept, Dept_Head ---- Non-key attribute Dept_Head is functionally dependent on Dept. Dept Dept_Head System Jones Sales Smith Finance Kelly

PowerPoint Presentation:

So places these two non-key attributes (Dept, Dept_Head) in separate table & name the table. Department Dept Dept_Head System Jones Sales Smith Finance Kelly Finance Kelly Sales Smith

PowerPoint Presentation:

And place the Emp_code and Dept in separate table and name the table Emp Emp_code Dept E101 System E102 Sales E103 Finance E104 Finance E105 Sales

PowerPoint Presentation:

BOYCE-CODD NORMAL FORM (BCNF)

BOYCE-CODD NORMAL FORM (BCNF):

BOYCE-CODD NORMAL FORM (BCNF) The definition (or) rule of 3NF was insufficient to some situations. And it wasn’t satisfactory for the following table: -> the table that had multiple candidate keys. -> where the multiple candidate keys are Composite keys. -> where the multiple candidate keys overlapped (has at least one attribute in common). Composite key: in some table two (or) more attributes are used as a primary key to identify rows uniquely is called “Composite key”.

PowerPoint Presentation:

Example: Let us take the “ purchase “ table In this table cust_code & purchase_code are used as primary key, and both these attributes together called “Composite key”. cust_code product_code purchase_date C11 P1 02/Jan/01 C15 P18 15/Aug/99

PowerPoint Presentation:

RULE –for BCNF: CANDIDATE KEY : An attribute (or) set of attributes that uniquely identifies each row is called “ Candidate key ”. Example (for BCNF): Let us consider the following table “ project”. A table is said to be in Boyce Codd Normal Form (BCNF), if and only if every determinant (attribute) is a candidate key. Emp_code Name Project_code Hours E1 Veronica P2 100 E2 Anthony P5 100 E3 Mac P6 15 E4 Susan P3 250 E5 Veronica P7 40

PowerPoint Presentation:

Emp_code Name Project_code Hours E1 Veronica P2 100 E2 Anthony P5 100 E3 Mac P6 15 E4 Susan P3 250 E5 Veronica P7 40 Emp_code Project_code Primary key Emp_code Project_code Candidate key Name Project_code Candidate key

PowerPoint Presentation:

Hours is functionally dependent on Emp_code + project_code Name is functionally dependent on Emp_code Hours is also functionally dependent on Emp_code Emp_code Project_code Hours Emp_code Name Name Project_code Hours

PowerPoint Presentation:

Notice that this “project” table has - multiple candidate keys, that’s - the candidate keys are composite - the candidate keys overlap since the attribute – project_code is common, but overlapping candidate key is Emp_code, Name Emp_code Project_code Candidate key Name Project_code Candidate key Emp_code Project_code Primary key Emp_code Project_code Candidate key Name Project_code Candidate key

Guidelines for converting a Table into – BCNF:

(1) Find and remove the overlapping candidate keys, and place the attributes in separate table and the name the table. (2) Group the remaining attributes and place it in a separate table. ------------------------------ According to Guidelines the table “Employee” is normalized to BCNF as follows: * the overlapping attributes are “Name” and “Emp_code” place it in a table and name the table as Employee Guidelines for converting a Table into – BCNF Emp_code Name E1 Veronica E2 Anthony E3 Mac E4 Susan E5 Veronica

PowerPoint Presentation:

Group the remaining attributes Project_code and Hours with one primary key. And the name the table as “ project”. Emp_code Project_code Hours E1 P2 100 E2 P5 100 E3 P6 15 E4 P3 250 E5 P7 40

PowerPoint Presentation:

Normalization using Multi-valued dependency Fourth Normal Form (4NF)

Multi-valued Dependency:

Multi-valued Dependency Multi-valued dependency (or) Tuple-generating dependency (1) It is a type of Functional dependency, where the determinant (attribute) can determine more than one value. (2) Multi-valued Dependency (MVD) is written as like this Where A,B – attributes (3) Usually, Multi-valued dependency contain 3-criterias: (a) There must be at least 3-attributes in the relation, call them A,B and C (b) A – determines multi-values of B A – determines multi-values of C . (c) B and C – are independent of one another. A B

PowerPoint Presentation:

Example: Let us consider the following table “ Sports_event ” student participates in one or more activities Stud_id (100) have two major (MCA,MBA) ( one stud_id with many majors) Stud_id Major Activities 100 MCA Baseball 100 MCA Volleyball 100 MBA Cricket 100 MBA Volleyball 200 ITC Swimming Stud_id Major

PowerPoint Presentation:

Stud_id (100) have three activities (Volleyball, cricket, Baseball) ( one stud_id with many Activities) Stud_id Activities

Fourth Normal Form (4NF) :

Fourth Normal Form (4NF) RULE: (or) A table is said to be in Fourth Normal Form (4NF), when it is in BCNF and it has one independent Multi-valued dependency (or) one independent multi-valued dependency with a functional dependency. Reduce BCNF entities to 4NF by removing any independent multi-valued components of the primary key to two new parent entities.

PowerPoint Presentation:

Let us consider the following table “sports_event” Stud_id Major Activities C100 MCA MCA Baseball Volleyball M100 MBA MBA Cricket Volleyball T200 ITC Swimming Usually, Multi-valued dependency contain 3-criterias: (a) There must be at least 3-attributes in the relation, call them A,B and C (b) A – determines multi-values of B A – determines multi-values of C . (c) B and C – are independent of one another .

PowerPoint Presentation:

The sports_event relation contains three attributes consider stud_id has - A consider Major has - B consider Activities has - C So --- stud_id determines multi-values of Major. stud_id determines multi-values of Activities. Major and Activities are independent of one another. Make the above table into First Normal Form and check whether it’s in BCNF. Stud_id Major Activities C100 MCA MCA Baseball Volleyball M100 MBA MBA Cricket Volleyball T200 ITC Swimming

PowerPoint Presentation:

Stud_id Major Activities C100 MCA Baseball C100 MCA Volleyball M100 MBA Cricket M100 MBA Volleyball T200 ITC Swimming Now the above table become First Normal Form And check whether the “sport_event” relation is in 2NF: A table is said to be in Second Normal Form (2NF), when it’s in 1NF and every attribute in the row is functionally dependent on whole key (fully), not just part of the key.

PowerPoint Presentation:

In sport_event Major is functionally dependent on the primary key stud_id. In sport_event Activities is functionally dependent on the primary key stud_id. And check whether the “sport_event” relation is in 3NF: Stud_id Major A table is said to be in Third Normal Form (3NF), when it’s in 2NF and primary key is functionally dependent on every non-key attribute. Stud_id Activities

PowerPoint Presentation:

In “sport_event”relation stud_id - Primary key Major Activities - Non-key So, the primary key “stud_id” is functionally dependent on non-key attributes “ Major & Activities”. Now check whether the relation in – BCNF Major Activities Stud_id A table is said to be in Boyce Codd Normal For m (BCNF), if and only if every determinant (attribute) is a candidate key.

PowerPoint Presentation:

In sport_event relation stud_id - Primary key stud_id Activities - Candidate key stud_id Major - Candidate key To convert the sport_event relation to 4NF : Reduce BCNF entities to 4NF by removing any independent multi-valued components of the primary key to two new parent entities.

PowerPoint Presentation:

stud_id determines multi-values of Major. stud_id determines multi-values of Activities. Major and Activities are independent of one another. Major and Activities are independent of one another, so remove These two independent attributes and place them in a separate table and name each tables. stud_major (or) Stud_id Major C100 MCA C100 MCA M100 MBA M100 MBA T200 ITC Stud_id Major C100 MCA M100 MBA T200 ITC

PowerPoint Presentation:

Stud_id Activities C100 Baseball C100 Volleyball M100 Cricket M100 Volleyball T200 Swimming Stud_activity Combining two new tables (stud_major & stud_activity) the sport_event table will be formed

Other Normal Forms:

Other Normal Forms

PowerPoint Presentation:

Normalization using Join dependency 6. Fifth Normal Form (5NF) (or) Project-Join Normal Form (PJNF) 7. Domain Key Normal Form (DKNF)

2.Database Design Process:

2.Database Design Process 1. E-R Model and Normalization 2. Naming of Attributes and Relationships 3. De-normalization for performance

1. E-R Model and Normalization :

1. E-R Model and Normalization When an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization. However, in an imperfect design there can be FDs from non- key attributes of an entity to other attributes of the entity Example: Employee relation with attributes department-number and department-address , and an Functional Dependency of department_number → department_ address This Employee-relation requires Normalization because Employee relation can be normalized to “ Dept” relation. Functional dependency is helpful for detecting poor ER-design

Universal Relation Approach:

Universal Relation Approach The relation r 1 r 2 … r n is called a universal relation since it involves all the attributes in the “universe” defined by R 1 U R 2 U … U R n

2. Naming of Attributes and Relationships :

2. Naming of Attributes and Relationships One of the important features of a database design is “ unique-role assumption”. Each attribute name has a unique meaning in the database. This prevent from using the same attribute with different meaning in different schemas. Example: The attribute “ number “ for loan_number in the “Loan” schema and account_number in “Account” schema. * In large database schemas, relationship sets are named (via) a concatenation of the names of related entity sets.

3. De-normalization for Performance:

3. De-normalization for Performance De-normalization: The process of taking a normalized scheme and making it non- normalized is called “ De-normalization”. Example: displaying customer-name along with account-number and balance requires join of account relation with depositor relation Account_number Branch_name Balance A-101 A-102 A-201 A-215 A-217 A-222 Downtown Perryridge Brighton Mianus Brighton Redwood 500 400 900 700 750 700 Account

PowerPoint Presentation:

Alternative 1: Use de-normalized relation containing attributes of account as well as depositor with all above attributes Depositor Customer_id Account_number C1 C2 C3 C4 C5 C6 A-101 A-201 A-215 A-102 A-305 A-217 Account_number Branch_name Balance Customer_id Account_number A-101 A-102 A-201 A-215 A-217 A-222 Downtown Perryridge Brighton Mianus Brighton Redwood 500 400 900 700 750 700 C1 C2 C3 C4 C5 C6 A-101 A-201 A-215 A-102 A-305 A-217

PowerPoint Presentation:

Advantages & disadvantages faster lookup Extra space and extra execution time for updates extra coding work for programmer and possibility of error in extra code Alternative 2: use a materialized view defined as follow account depositor Benefits and drawbacks same as above, except no extra coding work Account_number Customer_id Branch_name Balance A-101 A-102 A-201 A-215 A-217 A-222 C1 C2 C3 C4 C5 C6 Downtown Perryridge Brighton Mianus Brighton Redwood 500 400 900 700 750 700

3. Design Issues:

3. Design Issues

Other Design Issues:

Other Design Issues The aspects in database design aren’t addressed by Normalization can lead to bad database design. Examples of bad database design: Alternative:1 Let us take company database that stores earning of companies in different years. com_earnings ( company-id, company_year, amount ) * The amount is functional dependent on company_id and company _ year Company_id Company_year amount

PowerPoint Presentation:

Above are in BCNF, but make querying across years difficult and needs new table each year earnings-2000, earnings-2001, earnings-2002 , etc., all on the schema ( company-id, earnings ) Alternative:2 CROSS-TAP values for one attribute become column names are called “CROSS_TAP”. company-year ( company-id, earnings-2000, earnings-2001, earnings-2002 ) Use: Spreadsheets Data analysis Tools

4. MODELING TEMPORAL DATA :

4. MODELING TEMPORAL DATA

PowerPoint Presentation:

Temporal Data Temporal data are data that have an associated time interval during which they are valid. Snap shot the value of the data at a particular point in time. Example: Snap shot of “customer” database gives the values if all attributes, such as cust_id, cust_name, cus_address, etc., Temporal Functional Dependency Functional dependency hold at a particular point in time are called “Temporal Functional Dependency”. Example: Where - Temporal Functional Dependency X Y T T

THANK YOU:

THANK YOU