logging in or signing up pods03 Roxie Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 24 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: October 29, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Concise Description of Structured Sets: Concise Description of Structured Sets Alberto O. Mendelzon Ken Q. Pu University of TorontoMotivation: A true story: Motivation: A true story An OLAP report writing software: GUI: Range Selection SQL Translation RDBMS Report Writer / PlottingThe GUI: The GUI User selects the region of interest via GUI: CTRL-A selects ALL, Click’n-drag selects LOTS.An Engineering Problem and Solution: An Engineering Problem and Solution The Problem: SQL statement has a length limit. User selection is LARGE, especially when the visualization is plotting. The Solution: Break a large selection to a collection of SQL queries. The Complaint: Kind of slow!A formalization: A formalization A tree-structured categorical set: (dimensions) A selected subset: (user selection) Find the shortest possible description of the selected subset: (translated SQL query)Outline: Outline Formal definition of structured sets, expressions and the Minimal Descriptive Length (MDL) problem Partition structures Hierarchical structures Multidimensional product structuresThe structured set: The structured set A finite universe: U A finite alphabet: An interpretation function: [ ] : P(U) E.g. [ a ] = { 1, 2, 3 } [ 1 ] = { 1 } U/ is the set cover naturally induced by on U. Expressions: Expressions A language L An evaluation function: [ ] : L P(U) The Propositional Language: L (U, ) L = | L + L | L – L | L ·L The L+ (U, ) :L+ = | L+ + L+ “+” is union, “ – ” is difference, “ · ” is intersection.The L -MDL Problem: The L -MDL Problem Let V U, its descriptive length: ||V||L = min { ||s|| : [ s ] = V and s L } The L -MDL decision problem: ||V||L k ? The L -MDL problem: Find a (non-unique) compact (shortest) expression for V in L.Complexity Issues: Complexity Issues The L (U, )-MDL problem is NP-complete. The L+-MDL problem is NP-complete. [reduction from set-cover] Is there anything good we can do? Back to EngineeringTractable MDL-Problems: Tractable MDL-Problems Strategy: reduce the complexity of the MDL problem by restricting to a class of structures. Two tractable cases: Partition: = 1 U, U/1 is a partition Hierarchy: = N N-1 … 1 U, and U/i U/i+1 for all i > 0.Partition: Partition Toronto Ottawa Windsor San Jose San Francisco San Diego U = { Ottawa, Toronto, Windsor, Kingston San Jose, San Francisco, San Diego } 1 = { Ontario, California } KingstonEfficient Symbols: Efficient Symbols V = { Ottawa, Toronto, Kingston, San Francisco } #(V) = { Ontario } California is not efficient. Efficient symbols for V: #(V) = { 1 : | [ ] V | | [ ] – V | + 1 }Normal Form for Partitions: Normal Form for Partitions N is a sub-language of L : t = (1 + 2 + 3 + …) + (a1 + a2 + a3 + …) – (b1 + b2 + b3 + …) t is N -compact for V if S = #(V) A = V – [ S ] B = [ S ] – V S A BL-MDL for Partitions: L-MDL for Partitions Every expression in L can be reduced to an equivalent expression in N ! To compute a compact expression for V: (V) = #(V) (V) = (V – [ (V)] , [ (V)] – V )An Example: An Example Toronto Ottawa Windsor San Jose San Francisco San Diego Kingston V = { Ottawa, Toronto, Kingston, San Francisco } (V) = { Ontario }, (V) = ({San Francisco}, {Windsor}) t = Ontario + San Francisco – WindsorA Normal Form for Hierarchy: A Normal Form for Hierarchy Levels: (0=U) 1 2 3 … N A recursive definition for N : The expression t is normal w.r.t level i if t = t’ + A – B, with A, B i and t’ is normal w.r.t. level (i+1). Example: t = (( Canada + California – Ontario ) + Toronto – San Jose) t’ A B L-MDL for Hierarchies: L-MDL for Hierarchies Every expression in L can be reduced to an equivalent expression in N ! t = t’ + A – B is a N -compact expression for V if [ t’ ]1 = 1#(V) and is N -compact (A, B) = (V) An iterative algorithm: An iterative algorithm V ( VN + AN - BN ) ( s3 + A3 - B3 ) ( s1 + A1 - B1 ) is compactBack to Engineering: Back to EngineeringMultiple dimensions: What about multi-dimensional sets: V U1 U2 U1 and U2 are hierarchically structured sets: U1 has = 0 1 2 … U2 has = 0 1 2 … Form a new (product) structured set: U = U1 U2 = Interpretation: Multiple dimensions .Example:: Example: U1 -- Time = { Jan, Feb, Mar } U2 -- Geography = { Ontario, California } [Jan Ontario] = {(Jan, Ontario)} L-MDL for product structure is NP-complete.Reduction by example: Reduction by example Reduction from 3-set coverRelated Work: Related Work “The generalized MDL approach for summarization”, Lakshmanan, Ng, Wang, Zhou, Johnson, VLDB, 1999 limited expressiveness for multidimensional data sets; optimal expression is computed in polynomial time Multidimensional MDL problem resembles rectilinear polygon packing and covering problem “Covering rectilinear polygons with axis-parallel rectangles”, Kumar, Ramesh, ACM STOC, 1999What’s next: What’s next More complex structures -- ordering More complex operators -- range operator Conjectured and known NP-hardness of the MDL problems: L (, +) is NP-complete , polynomial if is total. ?Slide26: The End. You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
pods03 Roxie Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 24 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: October 29, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Concise Description of Structured Sets: Concise Description of Structured Sets Alberto O. Mendelzon Ken Q. Pu University of TorontoMotivation: A true story: Motivation: A true story An OLAP report writing software: GUI: Range Selection SQL Translation RDBMS Report Writer / PlottingThe GUI: The GUI User selects the region of interest via GUI: CTRL-A selects ALL, Click’n-drag selects LOTS.An Engineering Problem and Solution: An Engineering Problem and Solution The Problem: SQL statement has a length limit. User selection is LARGE, especially when the visualization is plotting. The Solution: Break a large selection to a collection of SQL queries. The Complaint: Kind of slow!A formalization: A formalization A tree-structured categorical set: (dimensions) A selected subset: (user selection) Find the shortest possible description of the selected subset: (translated SQL query)Outline: Outline Formal definition of structured sets, expressions and the Minimal Descriptive Length (MDL) problem Partition structures Hierarchical structures Multidimensional product structuresThe structured set: The structured set A finite universe: U A finite alphabet: An interpretation function: [ ] : P(U) E.g. [ a ] = { 1, 2, 3 } [ 1 ] = { 1 } U/ is the set cover naturally induced by on U. Expressions: Expressions A language L An evaluation function: [ ] : L P(U) The Propositional Language: L (U, ) L = | L + L | L – L | L ·L The L+ (U, ) :L+ = | L+ + L+ “+” is union, “ – ” is difference, “ · ” is intersection.The L -MDL Problem: The L -MDL Problem Let V U, its descriptive length: ||V||L = min { ||s|| : [ s ] = V and s L } The L -MDL decision problem: ||V||L k ? The L -MDL problem: Find a (non-unique) compact (shortest) expression for V in L.Complexity Issues: Complexity Issues The L (U, )-MDL problem is NP-complete. The L+-MDL problem is NP-complete. [reduction from set-cover] Is there anything good we can do? Back to EngineeringTractable MDL-Problems: Tractable MDL-Problems Strategy: reduce the complexity of the MDL problem by restricting to a class of structures. Two tractable cases: Partition: = 1 U, U/1 is a partition Hierarchy: = N N-1 … 1 U, and U/i U/i+1 for all i > 0.Partition: Partition Toronto Ottawa Windsor San Jose San Francisco San Diego U = { Ottawa, Toronto, Windsor, Kingston San Jose, San Francisco, San Diego } 1 = { Ontario, California } KingstonEfficient Symbols: Efficient Symbols V = { Ottawa, Toronto, Kingston, San Francisco } #(V) = { Ontario } California is not efficient. Efficient symbols for V: #(V) = { 1 : | [ ] V | | [ ] – V | + 1 }Normal Form for Partitions: Normal Form for Partitions N is a sub-language of L : t = (1 + 2 + 3 + …) + (a1 + a2 + a3 + …) – (b1 + b2 + b3 + …) t is N -compact for V if S = #(V) A = V – [ S ] B = [ S ] – V S A BL-MDL for Partitions: L-MDL for Partitions Every expression in L can be reduced to an equivalent expression in N ! To compute a compact expression for V: (V) = #(V) (V) = (V – [ (V)] , [ (V)] – V )An Example: An Example Toronto Ottawa Windsor San Jose San Francisco San Diego Kingston V = { Ottawa, Toronto, Kingston, San Francisco } (V) = { Ontario }, (V) = ({San Francisco}, {Windsor}) t = Ontario + San Francisco – WindsorA Normal Form for Hierarchy: A Normal Form for Hierarchy Levels: (0=U) 1 2 3 … N A recursive definition for N : The expression t is normal w.r.t level i if t = t’ + A – B, with A, B i and t’ is normal w.r.t. level (i+1). Example: t = (( Canada + California – Ontario ) + Toronto – San Jose) t’ A B L-MDL for Hierarchies: L-MDL for Hierarchies Every expression in L can be reduced to an equivalent expression in N ! t = t’ + A – B is a N -compact expression for V if [ t’ ]1 = 1#(V) and is N -compact (A, B) = (V) An iterative algorithm: An iterative algorithm V ( VN + AN - BN ) ( s3 + A3 - B3 ) ( s1 + A1 - B1 ) is compactBack to Engineering: Back to EngineeringMultiple dimensions: What about multi-dimensional sets: V U1 U2 U1 and U2 are hierarchically structured sets: U1 has = 0 1 2 … U2 has = 0 1 2 … Form a new (product) structured set: U = U1 U2 = Interpretation: Multiple dimensions .Example:: Example: U1 -- Time = { Jan, Feb, Mar } U2 -- Geography = { Ontario, California } [Jan Ontario] = {(Jan, Ontario)} L-MDL for product structure is NP-complete.Reduction by example: Reduction by example Reduction from 3-set coverRelated Work: Related Work “The generalized MDL approach for summarization”, Lakshmanan, Ng, Wang, Zhou, Johnson, VLDB, 1999 limited expressiveness for multidimensional data sets; optimal expression is computed in polynomial time Multidimensional MDL problem resembles rectilinear polygon packing and covering problem “Covering rectilinear polygons with axis-parallel rectangles”, Kumar, Ramesh, ACM STOC, 1999What’s next: What’s next More complex structures -- ordering More complex operators -- range operator Conjectured and known NP-hardness of the MDL problems: L (, +) is NP-complete , polynomial if is total. ?Slide26: The End.