seminal ml2006

Uploaded from authorPOINT
Views:
 
Category: Education
     
 

Presentation Description

No description available.

Comments

Presentation Transcript

SEMINAL: Searching for ML Type-Error Messages: 

SEMINAL: Searching for ML Type-Error Messages Benjamin Lerner, Dan Grossman, Craig Chambers University of Washington

Example: Curried functions : 

Example: Curried functions # let map2 f aList bList = List.map (fun (a, b) -andgt; f a b) (List.combine aList bList);; val map2 : ('a -andgt; 'b -andgt; 'c) -andgt; 'a list -andgt; 'b list -andgt; 'c list = andlt;funandgt;

Example: Nested matches: 

Example: Nested matches # type a = A1 | A2 | A3;; # type b = B1 | B2;; # let x : a = ...;; # let y : b = ...;; # match x with A1 -andgt; 1 | A2 -andgt; match y with B1 -andgt; 11 | B2 -andgt; 21 | A3 -andgt; 5;; This pattern matches values of type a but is here used to match values of type b Try replacing match x with A1 -andgt; 1 | A2 -andgt; match y with B1 -andgt; 11 | B2 -andgt; 21 | A3 -andgt; 5;; with match x with A1 -andgt; 1 | A2 -andgt; (match y with B1 -andgt; 11 | B2 -andgt; 21) | A3 -andgt; 5;;

What went wrong?: 

What went wrong? Sometimes, existing type-error messages… …are not local Symptoms andlt;andgt; Problems …are not intuitive Very lengthy types are hard to read …are not descriptive Location + types andlt;andgt; Solution …have a steep learning curve

Related work: 

Related work Instrument the type-checker Change the order of unification Explanation systems Program slicing Interactive systems Reparation systems But that leads to… See paper for citations

Tight coupling with type-checker: 

Tight coupling with type-checker Implementing a Hindley-Milner TC is easy Implementing a production TC is hard Adding good error messages makes it even harder Interferes with easy revision or extension of TC Error messages in TC adds to compiler’s trusted computing base

Our Approach, in one slide: 

Our Approach, in one slide Treats type checker as oracle Makes no assumptions about the type system Note: no dependence on unification Tries many variations on program, see which ones work Must do this carefully – there are too many! Note: 'Variant works' andlt;andgt; 'Variant is right' Ranks successful suggestions, presents results to programmer

Outline: 

Outline Examples of confusing messages Related work Our approach Running example Architecture overview Preliminary results Ongoing work Conclusions

Example: Curried functions : 

Example: Curried functions # let map2 f xs ys = List.map (fun (x, y) -andgt; f x y) (List.combine xs ys);; val map2 : ('a -andgt; 'b -andgt; 'c) -andgt; 'a list -andgt; 'b list -andgt; 'c list = andlt;funandgt;

Finding the changes, part 0: 

Finding the changes, part 0 Change let map2 f aList bList = … ;; map2 (fun (x, y) -andgt; x+y) [1;2;3] [4;5;6] Into…  map2 (fun(x,y)-andgt;x+y) [1;2;3] [4;5;6] let map2 f aList bList = … ;; 

What’s that ?: 

What’s that ? Seminal examines the given AST 'Replace with ' = 'Replace in AST' Any particular  means Expressions: raise Foo ...

Finding the changes, part 1: 

Finding the changes, part 1 Change map2 (fun (x, y) -andgt; x+y) [1;2;3] [4;5;6] Into… map2 ((fun(x,y)-andgt;x+y), [1;2;3], [4;5;6]) map2 ((fun(x,y)-andgt;x+y) [1;2;3] [4;5;6]) …  (fun (x,y)-andgt;x+y) [1;2;3] [4;5;6] map2  [1;2;3] [4;5;6] map2 (fun (x,y)-andgt;x+y)  [4;5;6] map2 (fun (x,y)-andgt;x+y) [1;2;3] 

Finding the changes, part 2: 

Finding the changes, part 2 Change (fun (x, y) -andgt; x + y) Into… fun (x, y)  -andgt; x + y fun  (x, y) -andgt; x + y fun (y, x) -andgt; x + y … fun x y -andgt; x + y

Ranking the suggestions: 

Ranking the suggestions Replace map2 (fun (x,y)-andgt;x+y) [1;2;3] [4;5;6] with  Replace map2 with  Replace (fun (x,y)-andgt;x+y) with  Replace (fun (x,y)-andgt;x+y) with (fun x y -andgt; x+y) Prefer smaller changes over larger ones Prefer non-deleting changes over others

Tidying up: 

Tidying up Find type of replacement Get this for free from TC Maintain surrounding context Help user locate the replacement Suggestions: Try replacing fun (x, y) -andgt; x + y with fun x y -andgt; x + y of type int -andgt; int -andgt; int within context (map2 (fun x y -andgt; x + y) [1; 2; 3] [4; 5; 6])

Behind the scenes: 

Behind the scenes

Searcher: 

Searcher Defines the strategy for looking for fixes: Look for single AST subtree to remove that will make problem 'go away' Replace subtree with 'wildcards'  Interesting subtrees guide the search If removing a subtree worked, try its children … Often improves on existing messages’ locations Rely on Enumerator’s suggestions for more detail

Enumerator: 

Enumerator Defines the fixes to be tried: Try custom attempts for each type of AST node E.g. Function applications break differently than if-then expressions Enumeration is term-directed A function of AST nodes only More enumerated attempts  better messages Called by Searcher when needed

Ranker: 

Ranker Defines the relative merit of successful fixes: Search can produce too many results Not all results are helpful to user E.g. 'Replace whole program with '! Use heuristics to filter and sort messages 'Smallest fixes are best' 'Currying a function is better than deleting it' Simple heuristics seem sufficient

Preliminary results: 

Preliminary results Prototype built on ocaml-3.08.4 Reasonable performance Can fully examine most files in under ~2sec Compare with human speed… Bottleneck is time spent in TC Many calls + repetitive data andgt; 90% of time TC can be improved independently from SEMINAL 

Current analysis: 

Current analysis Ongoing analysis of ~2K files collected from students Methods: Group messages as 'good', 'misleading', 'bad' Check if message precisely locates problem Check if message approximates student’s actual fix Results: Very good precision on small test cases Good precision on real, large problems Most poor results stem from common cause

Ongoing work: 

Ongoing work Dealing with multiple errors: Errors may be widely separated Least common parent is too big Idea: ignore most code, focus on one error Triage: Trade 'complete' for 'small' Not in workshop paper!

Example: Multiple errors: 

Example: Multiple errors # val x : int;; # val y : 'a list;; # match (x, y) with 0, [] -andgt; [] | _, [] -andgt; x | _, 5 -andgt; 5 + 'hi' This pattern matches values of type int * int but is here used to match values of type int * 'a list Problems: The last two patterns don’t match the same types The first two cases don’t match The last case doesn’t type-check Try replacing match (x, y) with 0, [] -andgt; [] | _, [] -andgt; x | _, 5 -andgt; 5 + 'hi' with  Not an effective suggestion! 1) Try replacing _, 5 with _,  2) Try replacing x with  3) Try replacing 5 + 'hi' with 5 + 

Example: Multiple errors: 

Example: Multiple errors # val x : int;; # val y : 'a list;; # match (x, y) with 0, [] -andgt; [] | _, [] -andgt; x | _, 5 -andgt; 5 + 'hi' First try just the scrutinee match (x, y) with  -andgt;  Then try the patterns match (x, y) with 0, [] -andgt;  | _, [] -andgt;  | _, 5 -andgt;  Finally try the whole expression match (x, y) with 0, [] -andgt; [] | _, [] -andgt; x | _, 5 -andgt; 5 + 'hi'

Conclusions: 

Conclusions Searching for repairs yields intuitively helpful messages SEMINAL decouples type-checker from error-message generator Simpler TC architecture Smaller trusted computing base It’s a win-win scenario! Version available soon – happy hacking!