logging in or signing up pldi04 Aric85 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 211 Category: News & Reports.. License: All Rights Reserved Like it (0) Dislike it (0) Added: June 20, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs: Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs John Whaley Monica Lam Stanford University June 10, 2004 Unification vs. Inclusion: Unification vs. Inclusion Earlier scalable pointer analysis was context-insensitive unification-based [Steensgaard ’96] Pointers are either unaliased or point to the same set of objects. Near-linear, but VERY imprecise. Inclusion-based pointer analysis Can point to overlapping sets of objects. Closure calculation is O(n3) Various optimizations [Fahndrich,Su,Heintze,…] BDD formulation, simple, scalable [Berndl,Zhu] Context Sensitivity: Context Sensitivity Context sensitivity is important for precision. Unrealizable paths. Object id(Object x) { return x; } a = id(b); c = id(d); Context Sensitivity: Object id(Object x) { return x; } Object id(Object x) { return x; } Context Sensitivity Context sensitivity is important for precision. Unrealizable paths. Conceptually give each caller its own copy. a = id(b); c = id(d); Summary-Based Analysis: Summary-Based Analysis Popular method for context sensitivity. Two phases: Bottom-up: Summarize effects of methods. Top-down: Propagate information down. Problems: Difficult to summarize pointer analysis. Summary-based analysis using BDD: not shown to scale [Zhu’02] Queries (e.g. which context points to x) require expanding an exponential number of contexts. Cloning-Based Analysis: Cloning-Based Analysis Simple brute force technique. Clone every path through the call graph. Run context-insensitive algorithm on expanded call graph. The catch: exponential blowup Cloning is exponential!: Cloning is exponential! Recursion: Recursion Actually, cloning is unbounded in the presence of recursive cycles. Technique: We treat all methods within a strongly-connected component as a single node. Recursion: Recursion A G B C D E F Top 20 Sourceforge Java Apps: Top 20 Sourceforge Java Apps 1016 1012 108 104 100 Cloning is infeasible (?): Cloning is infeasible (?) Typical large program has ~1014 paths If you need 1 byte to represent a clone: Would require 256 terabytes of storage Registered ECC 1GB DIMMs: $98.6 million Power: 96.4 kilowatts = Power for 128 homes 300 GB hard disks: 939 x $250 = $234,750 Time to read (sequential): 70.8 days Seems unreasonable! BDD comes to the rescue: BDD comes to the rescue There are many similarities across contexts. Many copies of nearly-identical results. BDDs can represent large sets of redundant data efficiently. Need a BDD encoding that exploits the similarities. Contribution (1): Contribution (1) Can represent context-sensitive call graph efficiently with BDDs and a clever context numbering scheme Inclusion-based pointer analysis 1014 contexts, 19 minutes Generates all answers Contribution (2) : Contribution (2) BDD hacking is complicated bddbddb (BDD-based deductive database) Pointer algorithm in 6 lines of Datalog Automatic translate into efficient BDD implementation 10x performance over hand-tuned solver (2164 lines of Java) Contribution (3): Contribution (3) bddbddb: General Datalog solver Supports simple declarative queries Easy use of context-sensitive pointer results Simple context-sensitive analyses: Escape analysis Type refinement Side effect analysis Many more presented in the paper Context-sensitive call graphs in BDD: Context-sensitive call graphs in BDD Call graph relation: Call graph relation Call graph expressed as a relation. Five edges: Calls(A,B) Calls(A,C) Calls(A,D) Calls(B,D) Calls(C,D) B D C A Call graph relation: Call graph relation Relation expressed as a binary function. A=00, B=01, C=10, D=11 B D C A 00 10 01 11 Binary Decision Diagrams: Binary Decision Diagrams Graphical encoding of a truth table. x2 x4 x3 x3 x4 x4 x4 0 0 0 1 0 0 0 0 x2 x4 x3 x3 x4 x4 x4 0 1 1 1 0 0 0 1 x1 0 edge 1 edge Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x4 x4 x4 0 0 0 0 0 0 0 x2 x4 x3 x3 x4 x4 x4 0 0 0 0 x1 1 1 1 1 1 Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x4 x4 x4 x2 x4 x3 x3 x4 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x2 x3 x3 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x2 x3 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Eliminate unnecessary nodes. x2 x4 x3 x3 x2 x3 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Eliminate unnecessary nodes. x2 x3 x2 x3 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Size is correlated to amount of redundancy, NOT size of relation. As the set gets larger, the number of don’t-care bits increases, leading to fewer necessary nodes. Expanded Call Graph: Expanded Call Graph A D B C E F G H 0 1 2 0 1 2 2 1 0 Numbering Clones: Numbering Clones A D B C E F G H A D B C E F G H E E F F G G H H H H H 0 0 0 0 1 2 0-2 0-2 0-2 3-5 0 0 0 0 1 2 0 1 2 2 1 0 0 1 2 3 4 5 0 Pointer Analysis: Pointer Analysis Pointer Analysis Example: Pointer Analysis Example h1: v1 = new Object(); h2: v2 = new Object(); v1.f = v2; v3 = v1.f; Input Relations vPointsTo(v1,h1) vPointsTo(v2,h2) Store(v1,f,v2) Load(v1,f,v3) Output Relations hPointsTo(h1,f,h2) vPointsTo(v3,h2) v1 h1 v2 h2 f v3 Inference Rule in Datalog: hPointsTo(h1, f, h2) :- Store(v1, f, v2), vPointsTo(v1, h1), vPointsTo(v2, h2). v1 h1 v2 h2 f Inference Rule in Datalog v1.f = v2; Stores: Context-sensitive pointer analysis: Context-sensitive pointer analysis Compute call graph with context-insensitive pointer analysis. Datalog rules for: assignments, loads, stores discover call targets, bind parameters type filtering Apply rules until fix-point reached. Compute expanded call graph relation. Apply context-insensitive algorithm to expanded call graph. bddbddb:BDD-Based Deductive DataBase: bddbddb: BDD-Based Deductive DataBase Datalog: Datalog Declarative logic programming language designed for databases Horn clauses Operates on relations Datalog is expressive Relational algebra: Explicitly specify relational join, project, rename. Relational calculus: Specify relations between variables; operations are implicit. Datalog: Allows recursively-defined relations. Datalog BDD: Datalog BDD Join, project, rename are directly mapped to built-in BDD operations Automatically optimizes: Rule application order Incrementalization Variable ordering BDD parameter tuning Many more… Experimental Results: Experimental Results Experimental Results: Experimental Results Top 20 Java projects on SourceForge Real programs with 100K+ users each Using automatic bddbddb solver Each analysis only a few lines of code Easy to try new algorithms, new queries Test system: Pentium 4 2.2GHz, 1GB RAM RedHat Fedora Core 1, JDK 1.4.2_04, javabdd library, Joeq compiler Slide38: Slide39: Multi-type variables: Multi-type variables A variable is multi-type if it can point to objects of different types. Measure of analysis precision One line in Datalog Two ways of handling context sensitivity: Projected: Merge all contexts together Full: Keep each context separate Slide41: Related Work: Related Work Context-insensitive pointer analysis Steensgaard: Unification-based (POPL’96) Andersen: Inclusion-based (’94) Optimizations: too many to list Berndl: formulate in BDD (PLDI’03) Das: one-level-flow (PLDI’00) Hybrid unification/inclusion Related Work: Related Work Scalable context-sensitive pointer analysis Fähndrich etal, instantiation constraints (PLDI’00) CFL-reachability Unification-based: Imprecise. Handles recursion well. Computes on-demand. GOLF: Das etal. (SAS’01) One level of context sensitivity. Foster, Fahndrich, Aiken (SAS’00) Throws away information. Wilson andamp; Lam: PTF (PLDI’95) Doesn't really scale (especially complexity) Related Work: Related Work Whaley andamp; Rinard: Escape analysis (OOPSLA’99) Compositional summaries: only weak updates. Achieves scalability by collapsing escaped nodes. Emami andamp; Hendren: Invocation graphs (PLDI’94) Only shown to scale to 8K lines. Zhu andamp; Calman: (PLDI’04) To be presented next in this session. More complete coverage in the paper. Conclusion: Conclusion The first scalable context-sensitive inclusion-based pointer analysis. Achieves context sensitivity by cloning. bddbddb: Datalog efficient BDD Easy to query results, develop new analyses. Very efficient! andlt;19 minutes, andlt;600mb on largest benchmark. Complete system is publically available at: http://suif.stanford.edu/bddbddb You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
pldi04 Aric85 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 211 Category: News & Reports.. License: All Rights Reserved Like it (0) Dislike it (0) Added: June 20, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs: Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs John Whaley Monica Lam Stanford University June 10, 2004 Unification vs. Inclusion: Unification vs. Inclusion Earlier scalable pointer analysis was context-insensitive unification-based [Steensgaard ’96] Pointers are either unaliased or point to the same set of objects. Near-linear, but VERY imprecise. Inclusion-based pointer analysis Can point to overlapping sets of objects. Closure calculation is O(n3) Various optimizations [Fahndrich,Su,Heintze,…] BDD formulation, simple, scalable [Berndl,Zhu] Context Sensitivity: Context Sensitivity Context sensitivity is important for precision. Unrealizable paths. Object id(Object x) { return x; } a = id(b); c = id(d); Context Sensitivity: Object id(Object x) { return x; } Object id(Object x) { return x; } Context Sensitivity Context sensitivity is important for precision. Unrealizable paths. Conceptually give each caller its own copy. a = id(b); c = id(d); Summary-Based Analysis: Summary-Based Analysis Popular method for context sensitivity. Two phases: Bottom-up: Summarize effects of methods. Top-down: Propagate information down. Problems: Difficult to summarize pointer analysis. Summary-based analysis using BDD: not shown to scale [Zhu’02] Queries (e.g. which context points to x) require expanding an exponential number of contexts. Cloning-Based Analysis: Cloning-Based Analysis Simple brute force technique. Clone every path through the call graph. Run context-insensitive algorithm on expanded call graph. The catch: exponential blowup Cloning is exponential!: Cloning is exponential! Recursion: Recursion Actually, cloning is unbounded in the presence of recursive cycles. Technique: We treat all methods within a strongly-connected component as a single node. Recursion: Recursion A G B C D E F Top 20 Sourceforge Java Apps: Top 20 Sourceforge Java Apps 1016 1012 108 104 100 Cloning is infeasible (?): Cloning is infeasible (?) Typical large program has ~1014 paths If you need 1 byte to represent a clone: Would require 256 terabytes of storage Registered ECC 1GB DIMMs: $98.6 million Power: 96.4 kilowatts = Power for 128 homes 300 GB hard disks: 939 x $250 = $234,750 Time to read (sequential): 70.8 days Seems unreasonable! BDD comes to the rescue: BDD comes to the rescue There are many similarities across contexts. Many copies of nearly-identical results. BDDs can represent large sets of redundant data efficiently. Need a BDD encoding that exploits the similarities. Contribution (1): Contribution (1) Can represent context-sensitive call graph efficiently with BDDs and a clever context numbering scheme Inclusion-based pointer analysis 1014 contexts, 19 minutes Generates all answers Contribution (2) : Contribution (2) BDD hacking is complicated bddbddb (BDD-based deductive database) Pointer algorithm in 6 lines of Datalog Automatic translate into efficient BDD implementation 10x performance over hand-tuned solver (2164 lines of Java) Contribution (3): Contribution (3) bddbddb: General Datalog solver Supports simple declarative queries Easy use of context-sensitive pointer results Simple context-sensitive analyses: Escape analysis Type refinement Side effect analysis Many more presented in the paper Context-sensitive call graphs in BDD: Context-sensitive call graphs in BDD Call graph relation: Call graph relation Call graph expressed as a relation. Five edges: Calls(A,B) Calls(A,C) Calls(A,D) Calls(B,D) Calls(C,D) B D C A Call graph relation: Call graph relation Relation expressed as a binary function. A=00, B=01, C=10, D=11 B D C A 00 10 01 11 Binary Decision Diagrams: Binary Decision Diagrams Graphical encoding of a truth table. x2 x4 x3 x3 x4 x4 x4 0 0 0 1 0 0 0 0 x2 x4 x3 x3 x4 x4 x4 0 1 1 1 0 0 0 1 x1 0 edge 1 edge Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x4 x4 x4 0 0 0 0 0 0 0 x2 x4 x3 x3 x4 x4 x4 0 0 0 0 x1 1 1 1 1 1 Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x4 x4 x4 x2 x4 x3 x3 x4 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x2 x3 x3 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x2 x3 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Eliminate unnecessary nodes. x2 x4 x3 x3 x2 x3 x4 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Eliminate unnecessary nodes. x2 x3 x2 x3 x4 0 x1 1 Binary Decision Diagrams: Binary Decision Diagrams Size is correlated to amount of redundancy, NOT size of relation. As the set gets larger, the number of don’t-care bits increases, leading to fewer necessary nodes. Expanded Call Graph: Expanded Call Graph A D B C E F G H 0 1 2 0 1 2 2 1 0 Numbering Clones: Numbering Clones A D B C E F G H A D B C E F G H E E F F G G H H H H H 0 0 0 0 1 2 0-2 0-2 0-2 3-5 0 0 0 0 1 2 0 1 2 2 1 0 0 1 2 3 4 5 0 Pointer Analysis: Pointer Analysis Pointer Analysis Example: Pointer Analysis Example h1: v1 = new Object(); h2: v2 = new Object(); v1.f = v2; v3 = v1.f; Input Relations vPointsTo(v1,h1) vPointsTo(v2,h2) Store(v1,f,v2) Load(v1,f,v3) Output Relations hPointsTo(h1,f,h2) vPointsTo(v3,h2) v1 h1 v2 h2 f v3 Inference Rule in Datalog: hPointsTo(h1, f, h2) :- Store(v1, f, v2), vPointsTo(v1, h1), vPointsTo(v2, h2). v1 h1 v2 h2 f Inference Rule in Datalog v1.f = v2; Stores: Context-sensitive pointer analysis: Context-sensitive pointer analysis Compute call graph with context-insensitive pointer analysis. Datalog rules for: assignments, loads, stores discover call targets, bind parameters type filtering Apply rules until fix-point reached. Compute expanded call graph relation. Apply context-insensitive algorithm to expanded call graph. bddbddb:BDD-Based Deductive DataBase: bddbddb: BDD-Based Deductive DataBase Datalog: Datalog Declarative logic programming language designed for databases Horn clauses Operates on relations Datalog is expressive Relational algebra: Explicitly specify relational join, project, rename. Relational calculus: Specify relations between variables; operations are implicit. Datalog: Allows recursively-defined relations. Datalog BDD: Datalog BDD Join, project, rename are directly mapped to built-in BDD operations Automatically optimizes: Rule application order Incrementalization Variable ordering BDD parameter tuning Many more… Experimental Results: Experimental Results Experimental Results: Experimental Results Top 20 Java projects on SourceForge Real programs with 100K+ users each Using automatic bddbddb solver Each analysis only a few lines of code Easy to try new algorithms, new queries Test system: Pentium 4 2.2GHz, 1GB RAM RedHat Fedora Core 1, JDK 1.4.2_04, javabdd library, Joeq compiler Slide38: Slide39: Multi-type variables: Multi-type variables A variable is multi-type if it can point to objects of different types. Measure of analysis precision One line in Datalog Two ways of handling context sensitivity: Projected: Merge all contexts together Full: Keep each context separate Slide41: Related Work: Related Work Context-insensitive pointer analysis Steensgaard: Unification-based (POPL’96) Andersen: Inclusion-based (’94) Optimizations: too many to list Berndl: formulate in BDD (PLDI’03) Das: one-level-flow (PLDI’00) Hybrid unification/inclusion Related Work: Related Work Scalable context-sensitive pointer analysis Fähndrich etal, instantiation constraints (PLDI’00) CFL-reachability Unification-based: Imprecise. Handles recursion well. Computes on-demand. GOLF: Das etal. (SAS’01) One level of context sensitivity. Foster, Fahndrich, Aiken (SAS’00) Throws away information. Wilson andamp; Lam: PTF (PLDI’95) Doesn't really scale (especially complexity) Related Work: Related Work Whaley andamp; Rinard: Escape analysis (OOPSLA’99) Compositional summaries: only weak updates. Achieves scalability by collapsing escaped nodes. Emami andamp; Hendren: Invocation graphs (PLDI’94) Only shown to scale to 8K lines. Zhu andamp; Calman: (PLDI’04) To be presented next in this session. More complete coverage in the paper. Conclusion: Conclusion The first scalable context-sensitive inclusion-based pointer analysis. Achieves context sensitivity by cloning. bddbddb: Datalog efficient BDD Easy to query results, develop new analyses. Very efficient! andlt;19 minutes, andlt;600mb on largest benchmark. Complete system is publically available at: http://suif.stanford.edu/bddbddb