vpl05

Uploaded from authorPOINTLite
Views:
 
Category: Entertainment
     
 

Presentation Description

No description available.

Comments

Presentation Transcript

The Benefits of Exposing Calls and Returns: 

The Benefits of Exposing Calls and Returns Rajeev Alur University of Pennsylvania CONCUR/SPIN, August 2005

Software Model Checking: 

Software Model Checking Code Abstractor Model Verifier Specification Yes Counter-example Predicate abstraction Control flow graph + Boolean vars (Pushdown automata) On-the-fly explicit state Symbolic fixpoint evaluation Temporal logics/Automata Regular! Observables

Regular specifications not expressive enough: 

Regular specifications not expressive enough Classical Hoare-style pre/post conditions If p holds when procedure A is invoked, q holds upon return Total correctness: every invocation of A terminates Integral part of emerging standard JML Stack inspection properties: security/access control If a setuuid bit is being set, process root must be in the call stack Inter-procedural data-flow analysis An expression e is very busy at a control point p if on all paths from p, e will be used before any of its variables (possibly local in current procedure) are modified Need matching of calls with returns, or finding pending calls, or local paths --- Context-free properties!

Checking Context-free Specifications: 

Checking Context-free Specifications Obstcales Context-free languages are not closed under intersection Checking context-free properties against context-free models is undecidable However, many such properties are verifiable Existing work in security that handles some stack inspection properties [JMT99,JKS03] Adding assert statements in the program (with additional local variables, if needed), and then checking regular properties (e.g. reachability) amounts to checking context-free properties Inter-procedural data-flow analysis algorithms [RHS95]

Exposing Calls and Returns: 

Exposing Calls and Returns What’s common to the checkable properties? Both model and property have their own stack, but the two stacks are synchronized and grow/shrink together! As a generator, program exposes its calls and returns, and as an acceptor, property pushes on calls and pops on returns Formalization of this intuition: Visibly Pushdown Languages A surprisingly robust class of languages with properties like the regular languages and potentially many applications

Talk Outline: 

Talk Outline Visibly Pushdown Languages Temporal Logic CaRet and Model Checking Ongoing Work References: Visibly pushdown languages Alur, Madhusudan; STOC’04 A temporal logic of nested calls and returns Alur, Etessami, Madhusudan; TACAS’04 Congruences for visibly pushdown languages Alur, Kumar, Madhusudan, Viswanthan; ICALP’05

Context-free Languages: Recap (1/2): 

Context-free Languages: Recap (1/2) Given an alphabet S, a language L is a set of finite words over S A pushdown automaton (PDA) has a finite control and a stack, and while reading a word, it can push/pop stack symbols while updating control state Configuration of a PDA: control state + a string of stack symbols Acceptance defined by empty stack or final state A language L is a context-free language (CFL) if there is a pushdown automaton that accepts it Sample CFLs All regular languages Set of words of the form an bn, for some n Set of words with equal number of a and b symbols Non-CFL: Set of words of the form an bn cn

Context-free Languages: Recap (2/2): 

Context-free Languages: Recap (2/2) Alternative characterization: Context-free grammars Natural and popular for defining syntax Nondeterministic PDAs are more expressive than deterministic ones Emptiness of a PDA solvable in polynomial-time Closed under union, but not closed under intersection or complementation Language inclusion, emptiness of intersection undecidable Applications: Parsing, Natural language processing, Program analysis…

Exposing Calls and Returns: 

Exposing Calls and Returns Pushdown alphabet: partitioned into 3 disjoint sets Σ = Spush  Spop  Slocal Pushdown words: finite words over pushdown Σ A visibly pushdown automaton over a pushdown alphabet Σ is a pushdown automaton that pushes a symbol onto the stack on a symbol in Spush pops the stack on a symbol in Spop cannot change the stack on a symbol in Slocal Key: Stack size at any time is determined by the input word but not control state or stack content

Visibly pushdown languages (VPL): 

A language L is a VPL over a pushdown alphabet Σ, if there is a visibly pushdown automaton that accepts it (acceptance by final state) The language {an bn | for some n} VPL if a is in Spush and b is in Spop Not a VPL for other partitions The language of words with equal number of a and b symbols is not a VPL (independent of partition) Every regular language L is a VPL independent of partitioning Dyck language (words with well-balanced parantheses) is a VPL provided left/right parantheses are in Spush/Spop resp Visibly pushdown languages (VPL)

VPLs in Program Analysis: 

VPLs in Program Analysis Program bool P(u:int) { global int x; local int y; … a: if Q { x = (x+y) }; … } bool Q { local int y; if { …. y++; return 1;} else return P(x) } To figure out whether the expression e=(x+y) is very busy at program point a, Spush = {call-p, call-q} Spop = {ret-p, ret-q} Slocal = {used-e, mod-x, mod-y, skip} Executions are pushdown words, e.g. call-q, skip, mod-y, ret-q, used-e, mod-x, skip, ret-p Set of executions starting at a location a is a VPL: La Set of executions in which e is very-busy is also a VPL: Le e is very busy at a if La is included in Le Analysis

VPLs for Document Processing: 

VPLs for Document Processing XML Document <conference> <name> CONCUR 2005 </name> <location> <city> San Francisco </city> <hotel> Stanford Court </hotel> </location> <sponsor> CISCO </sponsor> <sponsor> Microsoft </sponsor> … </conference> Pushdown alphabet Spush = {<name>, <location>, …} Spop = {</name>, </location>, …} Slocal = {San Francisco, Microsoft, …} A document d is a pushdown word Sample Query: Find documents related to conferences sponsored by Microsoft in San Francisco Specify query as a VPL: L Analysis: Membership question Does document d satisfy query L ? Use VPAs instead of tree automata! (typically, no recursion, but only hierarchy) Query Processing

Closure Properties: 

Note: can’t combine languages wrt different partitions Closed under intersection: Given two VPAs A and B, build a product C accepting intersection of L(A) and L(B) State of C: (state of A, state of B) Stack symbol of C: (stack symbol of A, stack symbol of B) C can simulate the stacks of A and B together Closed under union Closed under complementation Closed under concatenation and Kleene-* Closed under partition-preserving homomorphisms Closure Properties

Determinization: 

Given a nondeterministic VPA A, we can construct a deterministic VPA B that accepts the same language and has size exponential in A Potentially useful for building runtime monitors for checking program executions, and online algorithms for XML query processing VPLs are a subclass of DCFLs (languages defined by deterministic PDAs) DCFLs not closed under union Equivalence problem for DCFLs decidable, but complex Determinization

Determinization: Sketch of the construction: 

Determinization of nondeterministic automata uses subset construction: a state R of B is a set of states of A (the states that A can be, having read the word w so far) Subset construction does not apply to stack But we can do subsets of summaries: if w is a well-matched word, (q,q’) is a summary of A on w, if A can go from (q,$) to (q’,$), where $ is stack bottom More precisely, if w=w1c1w2c2…cnwn+1, where ci’s are calls and wi’s are well-matched words, then after reading w, determinized automaton B has Stack is (Sn,Rn,cn),….(S1,R1,c1)$ Control state is (Sn+1,Rn+1) Ri = Set of all states A can be in after w1c1…wi Si= Set of all summaries of A on the segment wi Determinization: Sketch of the construction

Decision Problems: 

Emptiness: Given a VPA A, is its language empty? Same as for PDAs: Polynomial-time complete (cubic) Language inclusion (or equivalence): Given VPAs A and B, is language of A contained in that of B? Determinize B, take its complement, take product with A, and test for emptiness Exponential-time complete Recall: Inclusion is PSPACE-complete for (nondeterministic) finite automata, and undecidable for PDAs Decision Problems

VPL Properties Summary: 

VPL Properties Summary Regular CFL DCFL VPL L Emptiness Inclusion Yes Yes Yes Yes No No No Undec PTIME No Yes PTIME Undec Yes Yes Yes PTIME Exptime NLOG Pspace

Pushdown Words as Binary Trees: 

Pushdown Words as Binary Trees Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3 i5 c1 r1 i1 c2 i4 i3 i3 i5 i3 r2 Stack trees r1 c1 i1

VPL: Connection to tree languages: 

VPL: Connection to tree languages Tree-language characterization: Let L be a set of pushdown words and let ST(L) be the set of stack trees that correspond to L. Theorem: L is a VPL iff ST(L) is a regular tree language Note: It is well-known that the set of parse trees corresponding to a context-free grammar is a regular tree language

Finite word automata that can jump: 

Finite word automata that can jump Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3 Summary Automata Finite-state automaton that reads pushdown word While reading a call, can send a copy to matching return d(q,a) is a set of pairs of states if a is in Spush Nondeterministic summary automata are expressively equivalent to VPAs Deterministic VPA (= VPL) > Deterministic summary automata > Deterministic tree automata (on stack trees)

Robustness: Alternative Characterizations: 

Robustness: Alternative Characterizations Monadic second order logic with matching predicate m(x,y) means x is a call and y is matching return Sample formula: forall x. if p(x) then exists y,z. ( q(y) and x<y<z and m(x,z) ) Thm. MSO + matching predicate interpreted over pushdown words is expressively equivalent to VPLs Thm: Every CFL is a homomorphic image of a VPL Context-free grammar based characterization Two types of non-terminals V0 (matched words) and V1 All productions are of the form X  a if X is in V0 then a must be local X  a Y b Z a is a call, b is a return, Y is in V0 if X is in V0 then Z must be in V0

“Regular-like” properties continue..: 

“Regular-like” properties continue.. Congruences and minimization (Myhill-Nerode Theorem) central to theory of regular languages Given a language L, for well-matched words u and v, define u ~L v iff for all words x and y, xuy in L iff xvy in L Theorem: A language L of well-matched words is a VPL iff the congruence ~L is of finite index Minimization No unique minimal deterministic VPA in general, but… Minimization of (single-entry) RSMs (i.e. procedural boolean programs) possible. Partitioning into k procedures/modules is adequate to get canonicity!

ω-VPL - Extension to Infinite Words: 

ω-VPL - Extension to Infinite Words A Büchi VPA: VPA over infinite pushdown words A word is accepted if along a run, the set F is seen infinitely often ω-VPL – class of languages accepted by Büchi VPAs ω-VPL is closed under all Boolean operations Characterization using regular trees and MSO characterization hold. However, ω-VPLs are not determinizable! Let L be set of all words such that the stack is repeatedly bounded i.e. for some n, the stack depth is n infinitely often. L is an ω-VPL but there is no deterministic (Muller) VPA for it Language inclusion and equivalence are still decidable

Talk Outline: 

Talk Outline Visibly Pushdown Languages Temporal Logic CaRet and Model Checking Ongoing Work

Software Model Checking: 

Software Model Checking Code Abstractor Model Verifier Specification Yes Counter-example Predicate abstraction Control flow graph + Boolean vars (Pushdown automata) CaRet/VPAs Observables

Abstracting Software: 

Abstracting Software int x, y; if x>0 { ……. y=x+1 .…… } else { …… y=x+1 …… } bx: x>0 by: y>0 Program

Abstracting Modular Programs: 

Abstracting Modular Programs main() { bool y; … x = P(y); … z = P(x); … } bool P(u: bool) { … return Q(u); } bool Q(w: bool) { if … else return P(~w) } A2 A1 A3 A2 A2 A3 A3 A1 Entry/Inputs Exit/outputs Box (function-calls) Program Recursive State Machine (RSM)/ Pushdown automaton

Linear-time Propositional Temporal Logic: 

Linear-time Propositional Temporal Logic Q ::- p | not Q | Q or Q’ | Next Q | Always Q | Eventually Q | Q Until Q’ Interpreted over (infinite) sequences. Models of an LTL formula is a w-regular language. Useful for stating sequencing properties: If req happens, then req holds until it is granted: Always ( req → (req Until grant) ) An exception is never raised: Always ( not Exception )

CARET: 

CARET CARET: A temporal logic for Calls and Returns Expresses context-free properties A B C A Global successor used by LTL ………….

CARET: 

CARET CARET: A temporal logic for Calls and Returns Expresses context-free properties A B C D Global successor used by LTL …………. Local successor: Jump from calls to returns Otherwise global successor at the same level

CARET: 

CARET CARET: A temporal logic for Calls and Returns Expresses context-free properties A B C A Global successor used by LTL …………. Local successor: Jump from calls to returns Otherwise global successor at the same level

CARET: 

CARET CARET: A temporal logic for Calls and Returns Expresses context-free properties A B C A Global successor used by LTL …………. Local successor: Jump from calls to returns Otherwise global successor at the same level Local path

CARET: 

CARET CARET: A temporal logic for Calls and Returns Expresses context-free properties A B C A Global successor used by LTL …………. Local successor: Jump from calls to returns Otherwise global successor at the same level Caller modality: Jump to the caller of the current module Defined for every position except top-level ones

CARET: 

CARET CARET: A temporal logic for Calls and Returns Expresses context-free properties A B C A Global successor used by LTL …………. Abstract successor: Jump from calls to returns Otherwise global successor at the same level Caller modality: Jump to the caller of the current module Defined for every position except top-level ones Caller path gives the stack content!

CARET Definition: 

CARET Definition Syntax: Q ::- p | not Q | Q or Q’ | Next Q | Always Q | Eventually Q | Q Until Q’ Local-Next Q | Local-Always Q Local-Eventually Q | Q Local-Until Q’ Caller Q | CallerPath-Always Q CallerPath-Eventually Q | Q CallerPath-Until Q’ Local- and Caller- versions of all temporal operators All these operators can be nested

Expressing properties in Caret: 

Expressing properties in Caret Pre-post conditions: If P holds when A is called, then Q must hold when the call returns Always ( (P and call-to-A) Local-Next Q ) A P Q Integrating Manna/Pnueli-style reasoning for reactive computations with Hoare-style reasoning for structured programs

Expressing properties in Caret: 

Expressing properties in Caret If A is called with low priority, then it cannot access the file Always ( call-to-A and low-priority Local-Always ( not access-file ) ) A low-priority A high-priority access-file

Expressing properties in Caret: 

Expressing properties in Caret Stack inspection properties If a variable x is accessed, then A must be on the call stack Always ( access-to-x CallerPath-Eventually call-to-A ) access-to-x A

Model checking CARET: 

Model checking CARET Given: A (boolean) recursive state machine/ visibly pushdown automaton M A CARET formula Q Model-checking: Do all runs of M satisfy the specification Q? CARET can be model-checked in time that is polynomial in M and exponential in Q. |M|3 . 2O(|Q|) Complexity class same as that for LTL ! Generalization of Vardi-Wolper construction

Model-checking CARET: Intuition: 

Model-checking CARET: Intuition The specification matches calls and returns of the program, so the runs of the program and models of the formula are both visibly pushdown languages Given M and formula Q, Build a Buchi pushdown automaton that accepts words exhibited by M that satisfy (not Q) Check this pushdown automaton for emptiness Construction builds on the classical tableaux for LTL s, Q1 s Push s and Q1 Local-Next Q1 Pop s and Q1 ; Check Q1

Conclusions and Ongoing Work: 

Conclusions and Ongoing Work Exposing calls and returns lets you hide the stack! VPLs seem robust and adequate to model software analysis problems VPL-triggered research Dynamic logic with VPL (Loding,Serre) Visibly pushdown games (Loding,Madhusudan,Serre) XML query processing (Pitcher) Third-order Algol with iteration (Murawski,Walukiewicz) Active area of current research DTDs, XML, and query languages Branching-time logics, Fixpoint calculus, and visibly pushdown tree automata (Alur, Chaudhuri, Madhusudan) Expressive completeness of temporal operators Implementing a model checker for VPL monitors