Artificial Intelligence Unit III - Adversarial Search

Artificial Intelligence Adversarial search:

BE Comp Engineering Artificial Intelligence Adversarial search

Introduction:

Introduction Games as search problems. Idealization and simplification: two players who alternate moves, a MAX player and a MIN player. Available information may be perfect, as in chess, checkers and tic-tac-toe (no chance, and both players have the same knowledge), or imperfect, as in poker.

Game representation:

Game representation In the general case of a game with two players we need: a general state representation, a definition of the initial state, a representation of winning states (their structure and properties), a utility function, and a definition of a set of operators.

Game Tree :

Game Tree The root node represents the configuration of the board at which a decision must be made. The root is labeled a "MAX" node if it is my turn to move; otherwise it is labeled a "MIN" node (your turn). Each level of the tree has nodes that are all MAX or all MIN.

Search with an opponent:

Search with an opponent Trivial approximation: generate the tree of all moves. Terminal nodes are tagged with a utility value, for example "+1" or "-1", depending on whether the winner is MAX or MIN. The goal is to find a path to a winning state. Although a depth-first search would minimize memory use, in complex games this kind of exhaustive search cannot be carried out; even a simple game like tic-tac-toe is too complex to draw the entire game tree.

Search with an opponent:

Search with an opponent Heuristic approximation: define an evaluation function that indicates how close a state is to a winning (or losing) move; a single evaluation function describes the goodness of a board with respect to BOTH players. Example of an evaluation function for tic-tac-toe: f(n) = [number of 3-lengths open for me] - [number of 3-lengths open for you], where a 3-length is a complete row, column, or diagonal. Adversary method: a winning move is represented by the value "+∞" and a losing move by the value "-∞". The algorithm searches with limited depth, and each new decision implies repeating part of the search.

Example: tic-tac-toe:

Example: tic-tac-toe e (evaluation function → integer) = number of rows, columns and diagonals available for MAX - number of rows, columns and diagonals available for MIN. MAX plays with "X" and wants to maximize e; MIN plays with "O" and wants to minimize e.
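
The evaluation function e can be sketched in runnable form; this Python illustration is not from the slides, and the 9-cell board representation and the name evaluate_board are assumptions made for the example.

```python
# Sketch of the tic-tac-toe evaluation e(n), assuming a board stored as a
# 9-element list of 'X', 'O' or ' ' (this representation is our assumption).

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def evaluate_board(board):
    # A line is still "open" for a player if the opponent has no mark in it.
    open_for_max = sum(1 for line in LINES if all(board[i] != 'O' for i in line))
    open_for_min = sum(1 for line in LINES if all(board[i] != 'X' for i in line))
    return open_for_max - open_for_min
```

For the empty board e = 8 - 8 = 0; with a single X in the center, 4 of the 8 lines are closed to MIN, so e = 8 - 4 = 4.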

MinMax – First Example:

MinMax – First Example It is Max's turn. Max would like the 9 points (the maximum), but choosing the left branch lets Min move to get 3, so the left branch has a value of 3. Choosing the right branch lets Min choose any one of 5, 6 or 7 (Min will choose 5, the minimum), so the right branch has a value of 5. The right branch's value is the larger, so Max chooses that move. (Figure: a Max/Min/Max game tree with leaves 3, 9, 4 on the left and 6, 7, 5 on the right; backed-up values 3 and 5; root value 5.)

MinMax – Second Example:

MinMax – Second Example It is Max's turn. Circles represent Max and squares represent Min; the values inside the nodes are the values computed by the MinMax algorithm. Red arrows represent the chosen moves, the numbers on the left represent tree depth, and the blue arrow is the chosen move.

MinMax - Algorithm:

MinMax - Algorithm The algorithm is named MinMax after its alternating maximizing and minimizing steps. Assign points to the outcome of a game; for example, in tic-tac-toe, X wins has value 1 and O wins has value -1. Max (X) tries to maximize the point value, while Min (O) tries to minimize it. Assume both players play to the best of their ability and always move to maximize or minimize the points. So, in choosing, Max will choose the best move to get the highest points, assuming Min will choose the best move to get the lowest points.

Searching Game Trees using the Minimax Algorithm:

Searching Game Trees using the Minimax Algorithm Create the start node as a MAX node (since it is my turn to move) with the current board configuration. Expand nodes down to some depth (i.e., ply) of lookahead in the game. Apply the evaluation function at each of the leaf nodes. "Back up" values for each of the non-leaf nodes until a value is computed for the root node: at MIN nodes, the backed-up value is the minimum of the values of the children; at MAX nodes, it is the maximum. Pick the operator associated with the child node whose backed-up value determined the value at the root.

The minimax algorithm:

The minimax algorithm

    minimax(player, board)
        if game over in current board position
            return winner
        children = all legal moves for player from this board
        if max's turn
            return maximal score of calling minimax on all the children
        else (min's turn)
            return minimal score of calling minimax on all the children

The minimax algorithm:

The minimax algorithm The algorithm first recurses down to the bottom-left leaf nodes and applies the utility function to them, discovering that their values are 3, 12 and 8.

The minimax algorithm:

The minimax algorithm Then it takes the minimum of these values, 3, and returns it as the backed-up value of node B. The process is similar for the other nodes.

The minimax algorithm:

The minimax algorithm If the maximum depth of the tree is m and there are b legal moves at each point, then the time complexity is O(b^m). The space complexity is O(bm) for an algorithm that generates all successors at once, or O(m) if it generates successors one at a time.

The minimax algorithm: problems:

The minimax algorithm: problems Problem with minimax search: the number of game states it has to examine is exponential in the number of moves. Unfortunately, the exponent cannot be eliminated, but it can be cut in half.

Alpha-beta pruning:

Alpha-beta pruning It is possible to compute the correct minimax decision without looking at every node in the game tree. Alpha-beta pruning makes it possible to eliminate large parts of the tree from consideration without influencing the final decision.

Alpha-beta pruning:

Alpha-beta pruning The leaves below B have the values 3, 12 and 8. The value of B is exactly 3. It can be inferred that the value at the root is at least 3, because MAX has a choice worth 3.

Alpha-beta pruning:

Alpha-beta pruning C, which is a MIN node, has a value of at most 2. But B is worth 3, so MAX would never choose C. Therefore, there is no point in looking at the other successors of C.

Alpha-beta pruning:

Alpha-beta pruning D, which is a MIN node, is worth at most 14. This is still higher than MAX's best alternative (i.e., 3), so D's other successors are explored.

Alpha-beta pruning:

Alpha-beta pruning The second successor of D is worth 5, so the exploration continues.

Alpha-beta pruning:

Alpha-beta pruning The third successor is worth 2, so now D is worth exactly 2. MAX's decision at the root is to move to B, giving a value of 3.

Alpha-beta pruning:

Alpha-beta pruning Alpha-beta pruning gets its name from two parameters that describe bounds on the values appearing anywhere along the path under consideration: α = the value of the best (i.e., highest-value) choice found so far along the path for MAX; β = the value of the best (i.e., lowest-value) choice found so far along the path for MIN.

Alpha-beta pruning:

Alpha-beta pruning Alpha-beta search updates the values of α and β as it goes along. It prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.

The alpha-beta search algorithm:

The alpha-beta search algorithm

    evaluate(node, alpha, beta)
        if node is a leaf
            return the heuristic value of node
        if node is a minimizing node
            for each child of node
                beta = min(beta, evaluate(child, alpha, beta))
                if beta <= alpha
                    return beta
            return beta
        if node is a maximizing node
            for each child of node
                alpha = max(alpha, evaluate(child, alpha, beta))
                if beta <= alpha
                    return alpha
            return alpha
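
A runnable version of the evaluate() pseudocode above can be sketched in Python on the same nested-list tree representation used earlier; the explicit `maximizing` flag replaces the node-type test and is our assumption.

```python
# Alpha-beta search on a tree where a node is either a number (leaf value)
# or a list of child subtrees. Returns the backed-up value of `node`.

def evaluate(node, alpha, beta, maximizing):
    if isinstance(node, (int, float)):
        return node                                   # leaf: heuristic value
    if maximizing:
        for child in node:
            alpha = max(alpha, evaluate(child, alpha, beta, False))
            if beta <= alpha:
                break                                 # prune: MIN won't allow more
        return alpha
    else:
        for child in node:
            beta = min(beta, evaluate(child, alpha, beta, True))
            if beta <= alpha:
                break                                 # prune: MAX already has better
        return beta
```

On the walkthrough tree with MIN nodes B = [3, 12, 8], C = [2, 4, 6] and D = [14, 5, 2], the root value is 3, and the remaining successors of C are never examined.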

Final comments about alpha-beta pruning:

Final comments about alpha-beta pruning Pruning does not affect final results. Entire subtrees can be pruned, not just leaves. Good move ordering improves the effectiveness of pruning: with perfect ordering, the time complexity is O(b^(m/2)), an effective branching factor of sqrt(b). Consequence: alpha-beta pruning can look twice as deep as minimax in the same amount of time.

MinMax – AlphaBeta Pruning:

MinMax – AlphaBeta Pruning MinMax searches the entire tree, even when in some cases the rest can be ignored. Example: an enemy lost a bet and owes you one thing from a bag. You choose the bag, but he chooses the thing, and you go through the bags one item at a time. First bag: Sox tickets, a sandwich, $20 – he'll choose the sandwich. Second bag: a dead fish, … – he'll choose the fish, and it doesn't matter whether the rest is a car, $500, or Yankees tickets: you don't need to look further and can prune. In general, stop evaluating a move as soon as it is found to be worse than a previously examined move: it does not benefit the player to play that move, so it need not be evaluated any further. This saves processing time without affecting the final result.

MinMax – AlphaBeta Pruning Example:

MinMax – AlphaBeta Pruning Example From Max's point of view, 1 is already lower than 4 or 5, so there is no need to evaluate 2 and 3 (bottom right) – prune.

MinMax – AlphaBeta Pruning Idea:

MinMax – AlphaBeta Pruning Idea Two scores are passed around in the search. Alpha is the best score achievable by some means: anything less than this is of no use (can be pruned), since we can already get alpha; it is the minimum score Max will get, initially negative infinity. Beta is the worst-case scenario for the opponent: anything higher than this won't be used by the opponent; it is the maximum score Min will get, initially infinity. As the recursion progresses, the Alpha-Beta "window" becomes smaller. When Beta < Alpha, the current position cannot result from best play and can be pruned.

Games with Imperfect Information:

Games with Imperfect Information

Games with Imperfect Information:

Games with Imperfect Information Minimax and alpha-beta pruning require too many leaf-node evaluations, which may be impractical within a reasonable amount of time. SHANNON (1950): cut off search earlier (replace TERMINAL-TEST by CUTOFF-TEST) and apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta).

Cutting-off Search:

Cutting-off Search Change: if TERMINAL-TEST(state) then return UTILITY(state) into: if CUTOFF-TEST(state, depth) then return EVAL(state). This introduces a fixed depth limit, selected so that the amount of time used will not exceed what the rules of the game allow. When the cutoff occurs, the evaluation function is applied.
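
The change can be sketched in code form under stated assumptions: `children` and `eval_fn` are caller-supplied placeholders, and a simple depth check stands in for CUTOFF-TEST.

```python
# Depth-limited minimax: CUTOFF-TEST(state, depth) becomes a depth check,
# and EVAL(state) replaces UTILITY(state) at the cutoff frontier.

def minimax_cutoff(state, depth, maximizing, children, eval_fn, cutoff_depth):
    kids = children(state)
    if depth >= cutoff_depth or not kids:     # CUTOFF-TEST(state, depth)
        return eval_fn(state)                 # EVAL(state) instead of UTILITY
    values = [minimax_cutoff(k, depth + 1, not maximizing,
                             children, eval_fn, cutoff_depth)
              for k in kids]
    return max(values) if maximizing else min(values)
```

With a large cutoff depth this reduces to ordinary minimax; with cutoff_depth = 0 it simply evaluates the current state.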

Heuristic EVAL:

Heuristic EVAL Idea: produce an estimate of the expected utility of the game from a given position. Performance depends on the quality of EVAL. Requirements: the computation must not take too long, and for non-terminal states EVAL should be strongly correlated with the actual chance of winning.

Constraint Satisfaction Problems:

Constraint Satisfaction Problems

Constraint satisfaction problems (CSPs):

Constraint satisfaction problems (CSPs) are mathematical problems defined as a set of objects whose state must satisfy a number of constraints or limitations. CSPs are the subject of intense research in both artificial intelligence and operations research. CSPs often exhibit high complexity, requiring a combination of heuristics and combinatorial search methods to be solved in a reasonable time.

Constraint satisfaction problems (CSPs):

Constraint satisfaction problems (CSPs) CSP: a constraint satisfaction problem is defined as a triple <X, D, C> of variables, domains and constraints, and the goal test is a set of constraints. As a standard search problem: the state is defined by an assignment of values to some or all variables, and an assignment that does not violate any constraint is called a consistent assignment.

Example: Map-Coloring:

Example: Map-Coloring Variables: WA, NT, Q, NSW, V, SA, T. Domains: Di = {red, green, blue}. Constraints: adjacent regions must have different colors, e.g., WA ≠ NT, or (WA,NT) in {(red,green), (red,blue), (green,red), (green,blue), (blue,red), (blue,green)}.

Example: Map-Coloring:

Example: Map-Coloring Solutions are complete and consistent assignments, e.g., WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = green.

Constraint graph:

Constraint graph Constraint graph: nodes are variables, arcs are constraints Unary CSP: each constraint relates one variable Binary CSP: each constraint relates two variables

Varieties of constraints:

Varieties of constraints Unary constraints involve a single variable, e.g., SA ≠ green Binary constraints involve pairs of variables, e.g., SA ≠ WA Higher-order constraints involve 3 or more variables, e.g., cryptarithmetic column constraints

Example: Cryptarithmetic:

Example: Cryptarithmetic Variables: F, T, U, W, R, O, X1, X2, X3. Domains: {0,1,2,3,4,5,6,7,8,9}. Constraints: Alldiff(F,T,U,W,R,O); O + O = R + 10·X1; X1 + W + W = U + 10·X2; X2 + T + T = O + 10·X3; X3 = F; T ≠ 0; F ≠ 0.
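
These constraints can be checked directly by brute force; this small Python sketch (ours, not part of the slides) enumerates assignments to the six letters and tests TWO + TWO = FOUR, which is equivalent to the column equations with the carries X1, X2, X3 summed up.

```python
# Brute-force solver for TWO + TWO = FOUR: try all injective assignments
# of digits to F, T, U, W, R, O (Alldiff), with T != 0 and F != 0.
from itertools import permutations

def solve_two_two_four():
    for F, T, U, W, R, O in permutations(range(10), 6):
        if T == 0 or F == 0:
            continue
        two = 100 * T + 10 * W + O
        four = 1000 * F + 100 * O + 10 * U + R
        if two + two == four:          # the column constraints, summed up
            yield {'F': F, 'T': T, 'U': U, 'W': W, 'R': R, 'O': O}
```

Solutions such as 734 + 734 = 1468 exist, so the generator yields at least one assignment.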

Real-world CSPs:

Real-world CSPs Timetabling problems e.g., which class is offered when and where? Transportation scheduling Factory scheduling Notice that many real-world problems involve real-valued variables

Standard search formulation:

Standard search formulation Generate & test: let's start with the straightforward approach, then fix it. States are defined by the values assigned so far. Initial state: the empty assignment { }. Successor function: assign a value to an unassigned variable that does not conflict with the current assignment; fail if there are no legal assignments. Goal test: the current assignment is complete. Path cost: a constant cost for every step.

Backtracking search:

Backtracking search Variable assignments are commutative, i.e., [WA = red then NT = green] is the same as [NT = green then WA = red]. We only need to consider assignments to a single variable at each node. Depth-first search for CSPs with single-variable assignments is called backtracking search; it is the basic uninformed algorithm for CSPs.
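
A minimal backtracking solver for the map-coloring example can be sketched in Python; the adjacency data is transcribed from the example, and the fixed variable order is a simplification (no ordering heuristics yet).

```python
# Backtracking search for the Australia map-coloring CSP: depth-first
# search that assigns one variable per node and backtracks on conflict.

NEIGHBORS = {
    'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'],
    'SA': ['WA', 'NT', 'Q', 'NSW', 'V'], 'Q': ['NT', 'SA', 'NSW'],
    'NSW': ['Q', 'SA', 'V'], 'V': ['SA', 'NSW'], 'T': [],
}
COLORS = ['red', 'green', 'blue']

def backtrack(assignment):
    if len(assignment) == len(NEIGHBORS):
        return assignment                    # complete, consistent assignment
    var = next(v for v in NEIGHBORS if v not in assignment)
    for color in COLORS:
        # consistent: no already-assigned neighbor has the same color
        if all(assignment.get(n) != color for n in NEIGHBORS[var]):
            result = backtrack({**assignment, var: color})
            if result is not None:
                return result
    return None                              # no legal value: backtrack
```

Calling backtrack({}) returns a complete coloring in which every pair of adjacent regions differs.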

Backtracking example:

Backtracking example

Improving backtracking efficiency:

Improving backtracking efficiency General-purpose methods can give huge gains in speed: Which variable should be assigned next? In what order should its values be tried? Can we detect inevitable failure early?

Most constrained variable:

Most constrained variable Most constrained variable: choose the variable with the fewest legal values a.k.a. minimum remaining values (MRV) heuristic

Most constraining variable:

Most constraining variable Most constraining variable: choose the variable with the most constraints on remaining variables

Least constraining value:

Least constraining value Given a variable, choose the least constraining value: the one that rules out the fewest values in the remaining variables Combining these heuristics makes 1000 queens feasible

Forward checking:

Forward checking Idea: keep track of remaining legal values for unassigned variables, and terminate search when any variable has no legal values.
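
The idea can be sketched as a helper that prunes domains after an assignment; the dict-based domain representation is our assumption for this illustration.

```python
# Forward checking: after tentatively assigning var = value, remove `value`
# from the domains of unassigned neighboring variables; report failure
# (None) as soon as some variable is left with no legal values.

def forward_check(domains, neighbors, var, value, assignment):
    pruned = {v: list(d) for v, d in domains.items()}   # copy, don't mutate
    pruned[var] = [value]
    for n in neighbors[var]:
        if n not in assignment and value in pruned[n]:
            pruned[n].remove(value)
            if not pruned[n]:
                return None                             # domain wipe-out
    return pruned
```

A wipe-out (None) tells the search to abandon the current assignment immediately instead of discovering the failure deeper in the tree.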

Constraint propagation:

Constraint propagation Forward checking propagates information from assigned to unassigned variables, but it does not provide early detection for all failures: NT and SA cannot both be blue! Constraint propagation repeatedly enforces constraints locally.

Arc consistency:

Arc consistency The simplest form of propagation makes each arc consistent. X → Y is consistent iff for every value x of X there is some allowed value y of Y.

Arc consistency:

Arc consistency If X loses a value, the neighbors of X need to be rechecked. Arc consistency detects failure earlier than forward checking. It can be run as a preprocessor or after each assignment.

Arc consistency algorithm AC-3:

Arc consistency algorithm AC-3 Time complexity: O(n^2 d^3), where n is the number of variables and d is the maximum variable domain size, because: there are at most O(n^2) arcs; each arc can be inserted into the agenda (TDA set) at most d times; and checking the consistency of each arc can be done in O(d^2) time.
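
For concreteness, here is a sketch of AC-3 specialized to binary "not equal" constraints (as in map coloring); a general version would carry a constraint test per arc, and this specialization is our simplifying assumption.

```python
# AC-3 for "not equal" constraints: repeatedly revise arcs (Xi, Xj),
# deleting values of Xi with no supporting value in Xj, until no change.
from collections import deque

def ac3(domains, neighbors):
    queue = deque((xi, xj) for xi in neighbors for xj in neighbors[xi])
    while queue:
        xi, xj = queue.popleft()
        # keep x in D(Xi) only if some supporting y != x survives in D(Xj)
        revised = [x for x in domains[xi] if any(y != x for y in domains[xj])]
        if len(revised) < len(domains[xi]):
            domains[xi] = revised
            if not revised:
                return False                 # inconsistency detected early
            for xk in neighbors[xi]:
                if xk != xj:
                    queue.append((xk, xi))   # Xi lost a value: recheck neighbors
    return True
```

With X = {red} and Y = {red, green} connected by X ≠ Y, AC-3 prunes Y down to {green}; with both domains equal to {red} it reports inconsistency.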

Local search for CSPs:

Local search for CSPs Hill-climbing and simulated annealing typically work with "complete" states, i.e., all variables assigned. To apply them to CSPs: allow states with unsatisfied constraints, and let operators reassign variable values. Variable selection: randomly select any conflicted variable. Value selection by the min-conflicts heuristic: choose the value that violates the fewest constraints, i.e., hill-climb with h(n) = total number of violated constraints.

Local search for CSPs:

Local search for CSPs The local search methodology uses the following terms. State (configuration): one possible assignment of all variables; the number of states equals the product of the domains' sizes. Evaluation value: the number of constraint violations of the state. Neighbor: a state obtained from the current state by changing one variable's value. Local minimum: a state that is not a solution and whose neighbors all have evaluation values larger than or equal to its own. Strict local minimum: a state that is not a solution and whose neighbors all have evaluation values strictly larger than its own. Non-strict local minimum: a local minimum that is not a strict local minimum.

Local search for CSP:

Local search for CSP

    function MIN-CONFLICTS(csp, max_steps) returns solution or failure
        inputs: csp, a constraint satisfaction problem
                max_steps, the number of steps allowed before giving up
        current ← an initial complete assignment for csp
        for i = 1 to max_steps do
            if current is a solution for csp then return current
            var ← a randomly chosen, conflicted variable from VARIABLES[csp]
            value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)
            set var = value in current
        return failure

Example: 4-Queens:

Example: 4-Queens States: 4 queens in 4 columns (4^4 = 256 states). Actions: move a queen within its column. Goal test: no attacks. Evaluation: h(n) = number of attacks.
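
The MIN-CONFLICTS pseudocode can be instantiated for n-queens; this Python sketch, with state = row of the queen in each column, is an illustration under that representation rather than the slides' exact implementation.

```python
# Min-conflicts local search for n-queens: repeatedly pick a conflicted
# column and move its queen to the row with the fewest attacks.
import random

def conflicts(state, col, row):
    # number of other queens attacking square (col, row): same row or diagonal
    return sum(1 for c, r in enumerate(state)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts(n=4, max_steps=10000, seed=0):
    rng = random.Random(seed)
    state = [rng.randrange(n) for _ in range(n)]        # initial complete assignment
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(state, c, state[c])]
        if not conflicted:
            return state                                # solution: no attacks
        col = rng.choice(conflicted)                    # random conflicted variable
        state[col] = min(range(n), key=lambda r: conflicts(state, col, r))
    return None                                         # give up (failure)
```

The deterministic tie-breaking in the value choice can cause plateaus, so the random choice of conflicted column is what lets the search escape them.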

Min-conflicts example 2:

Min-conflicts example 2 Use of the min-conflicts heuristic in hill-climbing. (Figure: successive board states with h = 5, h = 3 and h = 1.)

Advantages of local search:

Advantages of local search The runtime of min-conflicts is roughly independent of problem size: it solves even the million-queens problem in roughly 50 steps. Local search can be used in an online setting, whereas backtracking search requires more time.

Summary:

Summary CSPs are a special kind of problem: states defined by values of a fixed set of variables goal test defined by constraints on variable values Backtracking = depth-first search with one variable assigned per node Variable ordering and value selection heuristics help significantly Forward checking prevents assignments that guarantee later failure Constraint propagation (e.g., arc consistency) does additional work to constrain values and detect inconsistencies Iterative min-conflicts is usually effective in practice
