Share PowerPoint. Anywhere!

pldi04

Featured Animated Featured Animated
Uploaded from authorPOINT
Download as Download Not Available PPT
Presentation Description

No description available

Like authorSTREAM?


You can vote once a day till December
10th, Vote Now!
Views: 125
Like it  ( Likes) Dislike it  ( Dislikes)
Added: June 20, 2007 This presentation is Public
Presentation Category :News & Reports
Presentation StatisticsNew!
Views on authorSTREAM: 125
Presentation Transcript

Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs : Cloning-Based Context-Sensitive Pointer Alias Analysis using BDDs John Whaley Monica Lam Stanford University June 10, 2004


Unification vs. Inclusion : Unification vs. Inclusion Earlier scalable pointer analysis was context-insensitive unification-based [Steensgaard ’96] Pointers are either unaliased or point to the same set of objects. Near-linear, but VERY imprecise. Inclusion-based pointer analysis Can point to overlapping sets of objects. Closure calculation is O(n3) Various optimizations [Fahndrich,Su,Heintze,…] BDD formulation, simple, scalable [Berndl,Zhu]


Context Sensitivity : Context Sensitivity Context sensitivity is important for precision. Unrealizable paths. Object id(Object x) { return x; } a = id(b); c = id(d);


Context Sensitivity : Object id(Object x) { return x; } Object id(Object x) { return x; } Context Sensitivity Context sensitivity is important for precision. Unrealizable paths. Conceptually give each caller its own copy. a = id(b); c = id(d);


Summary-Based Analysis : Summary-Based Analysis Popular method for context sensitivity. Two phases: Bottom-up: Summarize effects of methods. Top-down: Propagate information down. Problems: Difficult to summarize pointer analysis. Summary-based analysis using BDD: not shown to scale [Zhu’02] Queries (e.g. which context points to x) require expanding an exponential number of contexts.


Cloning-Based Analysis : Cloning-Based Analysis Simple brute force technique. Clone every path through the call graph. Run context-insensitive algorithm on expanded call graph. The catch: exponential blowup


Cloning is exponential! : Cloning is exponential!


Recursion : Recursion Actually, cloning is unbounded in the presence of recursive cycles. Technique: We treat all methods within a strongly-connected component as a single node.


Recursion : Recursion A G B C D E F


Top 20 Sourceforge Java Apps : Top 20 Sourceforge Java Apps 1016 1012 108 104 100


Cloning is infeasible (?) : Cloning is infeasible (?) Typical large program has ~1014 paths If you need 1 byte to represent a clone: Would require 256 terabytes of storage Registered ECC 1GB DIMMs: $98.6 million Power: 96.4 kilowatts = Power for 128 homes 300 GB hard disks: 939 x $250 = $234,750 Time to read (sequential): 70.8 days Seems unreasonable!


BDD comes to the rescue : BDD comes to the rescue There are many similarities across contexts. Many copies of nearly-identical results. BDDs can represent large sets of redundant data efficiently. Need a BDD encoding that exploits the similarities.


Contribution (1) : Contribution (1) Can represent context-sensitive call graph efficiently with BDDs and a clever context numbering scheme Inclusion-based pointer analysis 1014 contexts, 19 minutes Generates all answers


Contribution (2) : Contribution (2) BDD hacking is complicated  bddbddb (BDD-based deductive database) Pointer algorithm in 6 lines of Datalog Automatic translate into efficient BDD implementation 10x performance over hand-tuned solver (2164 lines of Java)


Contribution (3) : Contribution (3) bddbddb: General Datalog solver Supports simple declarative queries Easy use of context-sensitive pointer results Simple context-sensitive analyses: Escape analysis Type refinement Side effect analysis Many more presented in the paper


Context-sensitive call graphs in BDD : Context-sensitive call graphs in BDD


Call graph relation : Call graph relation Call graph expressed as a relation. Five edges: Calls(A,B) Calls(A,C) Calls(A,D) Calls(B,D) Calls(C,D) B D C A


Call graph relation : Call graph relation Relation expressed as a binary function. A=00, B=01, C=10, D=11 B D C A 00 10 01 11


Binary Decision Diagrams : Binary Decision Diagrams Graphical encoding of a truth table. x2 x4 x3 x3 x4 x4 x4 0 0 0 1 0 0 0 0 x2 x4 x3 x3 x4 x4 x4 0 1 1 1 0 0 0 1 x1 0 edge 1 edge


Binary Decision Diagrams : Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x4 x4 x4 0 0 0 0 0 0 0 x2 x4 x3 x3 x4 x4 x4 0 0 0 0 x1 1 1 1 1 1


Binary Decision Diagrams : Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x4 x4 x4 x2 x4 x3 x3 x4 x4 x4 0 x1 1


Binary Decision Diagrams : Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x2 x3 x3 x4 x4 0 x1 1


Binary Decision Diagrams : Binary Decision Diagrams Collapse redundant nodes. x2 x4 x3 x3 x2 x3 x4 x4 0 x1 1


Binary Decision Diagrams : Binary Decision Diagrams Eliminate unnecessary nodes. x2 x4 x3 x3 x2 x3 x4 x4 0 x1 1


Binary Decision Diagrams : Binary Decision Diagrams Eliminate unnecessary nodes. x2 x3 x2 x3 x4 0 x1 1


Binary Decision Diagrams : Binary Decision Diagrams Size is correlated to amount of redundancy, NOT size of relation. As the set gets larger, the number of don’t-care bits increases, leading to fewer necessary nodes.


Expanded Call Graph : Expanded Call Graph A D B C E F G H 0 1 2 0 1 2 2 1 0


Numbering Clones : Numbering Clones A D B C E F G H A D B C E F G H E E F F G G H H H H H 0 0 0 0 1 2 0-2 0-2 0-2 3-5 0 0 0 0 1 2 0 1 2 2 1 0 0 1 2 3 4 5 0


Pointer Analysis : Pointer Analysis


Pointer Analysis Example : Pointer Analysis Example h1: v1 = new Object(); h2: v2 = new Object(); v1.f = v2; v3 = v1.f; Input Relations vPointsTo(v1,h1) vPointsTo(v2,h2) Store(v1,f,v2) Load(v1,f,v3) Output Relations hPointsTo(h1,f,h2) vPointsTo(v3,h2) v1 h1 v2 h2 f v3


Inference Rule in Datalog : hPointsTo(h1, f, h2) :- Store(v1, f, v2), vPointsTo(v1, h1), vPointsTo(v2, h2). v1 h1 v2 h2 f Inference Rule in Datalog v1.f = v2; Stores:


Context-sensitive pointer analysis : Context-sensitive pointer analysis Compute call graph with context-insensitive pointer analysis. Datalog rules for: assignments, loads, stores discover call targets, bind parameters type filtering Apply rules until fix-point reached. Compute expanded call graph relation. Apply context-insensitive algorithm to expanded call graph.


bddbddb:BDD-Based Deductive DataBase : bddbddb: BDD-Based Deductive DataBase


Datalog : Datalog Declarative logic programming language designed for databases Horn clauses Operates on relations Datalog is expressive Relational algebra: Explicitly specify relational join, project, rename. Relational calculus: Specify relations between variables; operations are implicit. Datalog: Allows recursively-defined relations.


Datalog  BDD : Datalog  BDD Join, project, rename are directly mapped to built-in BDD operations Automatically optimizes: Rule application order Incrementalization Variable ordering BDD parameter tuning Many more…


Experimental Results : Experimental Results


Experimental Results : Experimental Results Top 20 Java projects on SourceForge Real programs with 100K+ users each Using automatic bddbddb solver Each analysis only a few lines of code Easy to try new algorithms, new queries Test system: Pentium 4 2.2GHz, 1GB RAM RedHat Fedora Core 1, JDK 1.4.2_04, javabdd library, Joeq compiler


Slide38 :


Slide39 :


Multi-type variables : Multi-type variables A variable is multi-type if it can point to objects of different types. Measure of analysis precision One line in Datalog Two ways of handling context sensitivity: Projected: Merge all contexts together Full: Keep each context separate


Slide41 :


Related Work : Related Work Context-insensitive pointer analysis Steensgaard: Unification-based (POPL’96) Andersen: Inclusion-based (’94) Optimizations: too many to list Berndl: formulate in BDD (PLDI’03) Das: one-level-flow (PLDI’00) Hybrid unification/inclusion


Related Work : Related Work Scalable context-sensitive pointer analysis Fähndrich etal, instantiation constraints (PLDI’00) CFL-reachability Unification-based: Imprecise. Handles recursion well. Computes on-demand. GOLF: Das etal. (SAS’01) One level of context sensitivity. Foster, Fahndrich, Aiken (SAS’00) Throws away information. Wilson andamp; Lam: PTF (PLDI’95) Doesn't really scale (especially complexity)


Related Work : Related Work Whaley andamp; Rinard: Escape analysis (OOPSLA’99) Compositional summaries: only weak updates. Achieves scalability by collapsing escaped nodes. Emami andamp; Hendren: Invocation graphs (PLDI’94) Only shown to scale to 8K lines. Zhu andamp; Calman: (PLDI’04) To be presented next in this session. More complete coverage in the paper.


Conclusion : Conclusion The first scalable context-sensitive inclusion-based pointer analysis. Achieves context sensitivity by cloning. bddbddb: Datalog  efficient BDD Easy to query results, develop new analyses. Very efficient! andlt;19 minutes, andlt;600mb on largest benchmark. Complete system is publically available at: http://suif.stanford.edu/bddbddb