Explicit and Implicit LIST Aggregate Function for Relational Databases: Explicit and Implicit LIST Aggregate Function for Relational Databases Witold Litwin Université Paris 9 Dauphine
mailto:Witold.litwin@dauphine.fr
Summary: Summary New Aggregate Function
Transforms a set of values into single one
Char type
A basic long time need
Should be highly useful
Plan: Plan Motivating Examples
Explicit LIST
Implicit LIST
Conlusion
Further Work
Motivating Example 1: Motivating Example 1 The Supplier-Part (SP) table of the best-known S-P database
Motivating Example 1: Motivating Example 1 The classical query :
select SP.[S#], Sum(SP.Qty) AS [Total Qty]
from SP
group By SP.[S#];
S# Total Qty
S1 1300
S2 700
S3 200
S4 900 How to get also the individual quantities ?
Motivating Example 2: Motivating Example 2 A database of persons having:
Multiple Hobbies
Multiple preferred Restaurants
Many Friends
Best design:
four 4-NF tables
P (SSÂ #, Name), H (SS#, Hobby), R (SS#, Rest), F (SS#, Friend)
Database: Database
Fragment: Fragment
Query: Query select P.[SS#], P.Name, F.Friend, R.Rest, H.Hobby
from ((P INNER JOIN F ON P.[SS#] = F.[SS#])
INNER JOIN H ON P.[SS#] = H.[SS#])
INNER JOIN R ON P.[SS#] = R.[SS#]
where P.[SS#] ="ss1" ; Select Name, Friends, Restaurants, Hobbies, of Person ‘SS1’ SQL :
Result: Result Usable ???
General Problem: General Problem Current RDBs manage tables in 1NF
All attributes are single-valued (atomic values)
Example 1 ; We wished
Single-valued attribute :
SUM(QTY)
Multi-valued attribute
Individual quantities
The result would not be 1NF
General Problem: General Problem RDB manages tables in 1NF
All attributes are single-valued
Example 2 ; We wished :
Single-valued attributes :
S#, Name
Multi-valued attributes (multi-sets):
Hobby, Rest, Friend
The result is normalized to 1NF
{(ss1,Witold, x, y, z) : x  Hobby, y  Rest, z  Friend }
The table is not in 4NF
Subject to well-known anomalies
Solutions: Solutions Design RDBS for 0NF tables
A revolution
0NF RDBS will not be here for years
Aggregate set or multi-set values into atomic values
An evolution
All RDBS already do it using:
SUM, AVG, COUNT…
perhaps with GROUP BY
We need a new aggregate leaving the entire set visible
E.g: (multi)-set of values X => (single) list of values X
Local Culinary Example: Local Culinary Example The set-valued attribute:
(Schwarz, Wälder; Kirchen, Chocoladen, Torte)
The aggregated attribute:
Schwarzwälderkirchenchocoladentorte
Local specialty, try it !
Explicit LIST function: Explicit LIST function Select S#, sum (Qty) AS [Total Qty], LIST (Qty) AS Histogram from SP group by S#;
Explicit LIST function: Explicit LIST function select P.SS#, Name, LIST (DISTINCTÂ (Friend)), LIST (DISTINCTÂ (Rest)), LISTÂ (DISTINCT (Hobby)) from P, F, R, H
where P.SS# = F.SS# and F.SS# = R.SS# and R.SS# = H.SS# and P.SS# ="ss1" group by P.SS#, Name ;
Explicit LIST function: Explicit LIST function Simulated actual output using MsAccess forms with list boxes
Form with three subforms
No SQL query used
Explicit LIST function: Explicit LIST function select P#, SUM (Qty) as [Total Qty], LIST (S#, Qty) as [Per supplier]
from SP
group by P#;
Implicit LIST function: Implicit LIST function For any single-valued A :
A = LIST (A)
Any non-aggregated attribute in an SQL query has to be in the GROUP BY clause
Now, any non-aggregated perhaps composite attribute A from a single table and not in GROUP BY clause is implicitly under
LIST (DISTINCT (A))
Queries may become less procedural
Implicit LIST function: Implicit LIST function select P#, SUM (Qty) as [Total Qty], S#, Qty
from SP
group by P#
having ‘S# QTY’ like ‘*s4*’;
Implicit LIST is LIST (S#, QTY)
Implicit LIST function: Implicit LIST function Query
Select S.*, P#, Qty From S, SP Where S.S# = SP.S#
Repeats all the data of the supplier S in every resulting tuple
6 times for S1: its Name, City, Status
Query
Select S.*, P#, Qty From S, SP Where S.S# = SP.S#
Group By S.S#
Does it only once per supplier
Less redundancy
Implicit Joins and From: Implicit Joins and From Equijoins following the referential semantic links or integrity may be implicit
MsAccess, SQL Server…
FROM clause content can be inferred from the attribute names
Even less procedural formulation may result:
select P.SS#, Name, Friend, Rest, Hobby
group by P.SS#, Name ;
Implementation Issues: Implementation Issues Should be easy for the RDBS owner
Any RDB already processes the aggregates Already done hiding the list Should also be shown
Implementation Issues: Implementation Issues For explicit LIST, foreign function interface may suffice
Oracle, DB2, Yukon…
See related work in the paper for current (limited) proposals
Oracle & iAnywhere (core code)
Not for the implicit LIST
Access to core code is necessary
Conclusion: Conclusion LIST is a new aggregate function
Aggregates a multi-valued attribute into a single value
Responds to a long-standing fundamental RDBS user need
- 30 years ?
Should be rather easy to implement
Future work should start with the implementation
Using foreign functions for explicit LIST
Research Support : Research Support European Commission ICONS Project
no. IST-2001-32429.
Microsoft Research
Thank You for Your Attention: Thank You for Your Attention Witold Litwin Université Paris 9 Dauphine
mailto:Witold.litwin@dauphine.fr