# lecture_note_8

Views:

Category: Entertainment

## Presentation Description

No description available.

## Presentation Transcript

### Genetic Algorithms:

1 Genetic Algorithms Genetic Algorithms (GA) & Genetic Programming are two of the more popular approaches in a field called “Evolutionary Computation”. Characteristics of GA Learning Hypotheses are often described by bit strings whose interpretation depends on the application The search for an appropriate hypothesis begins with a population, or collection, of initial hypotheses. At each step, the hypotheses in the current population are evaluated relative to a given measure of fitness.

### Genetic Algorithms:

2 Genetic Algorithms Provide a learning method motivated by an analogy to biological evolution . GAs generate successor hypotheses by repeated mutating and recombining parts of the best currently known hypotheses Rather than search from general-to-specific hypotheses, or from simple-to-complex.

### A Prototypical Genetic Algorithm:

3 A Prototypical Genetic Algorithm GA( Fitness , Fitness_threshold , p , r , m ) /* Fitness : A function that assigns an evaluation score as given a hypothesis. Fitness_threshold : A threshold specifying the termination criterion. p : The number of hypotheses to be included in the population. r : The fraction of the population to be replaced by Crossover at each step. m : The mutation rate. */

### A Prototypical Genetic Algorithm (Cont’d):

4 A Prototypical Genetic Algorithm (Cont’d) Initialize : P  Generate p hypotheses at random Evaluate : for each h in P, compute Fitness(h) While [max h , Fitness(h)] < Fitness_threshold Select: Probabilistically select (1-r)p members of P to add to P S . Crossover: Probabilistically select (r  p)/2 pairs of hypotheses from P. For each pair, h 1 ,h 2  , produce two offspring by applying the Crossover operator. Add all offspring to P s . Mutate: Invert a randomly selected bit in m  p random members of P s Update: P  P s Evaluate: for each h in P, compute Fitness(h) Return the hypothesis from P that has the highest fitness

### Representing Hypotheses:

5 Representing Hypotheses Represent (Outlook = Overcast  Rain)  (Wind = Strong) by Outlook Wind 011 10 Represent IF Wind = Strong THEN PlayTennis = yes by Outlook Wind PlayTennis 111 10 10

### Operators for Genetic Algorithms:

6 Operators for Genetic Algorithms

### Fitness Function and Selection:

7 Fitness Function and Selection Fitness proportionate selection: Tournament selection: Pick h 1 , h 2 at random with uniform probability With some predefined probability p, select the more fit of these two. Rank selection: Sort all hypotheses by fitness Probability of selection is proportional to its rank in this sorted list, rather than its fitness.

### GABIL System [DeJong et al. 1993] :

8 GABIL System [DeJong et al. 1993] A GA can be reviewed as a general optimization method that searches a large space of candidate objects seeking one that performs best according to the fitness function. Although not guaranteed to find an optimal object, Gas often succeed in finding an object with high fitness. GABIL uses a GA to learn boolean concepts represented by a disjunctive set of propositional rules. Its generalization accuracy is comparable to other learning system like: C4.5 and rule learning sys. AQ14.

### GABIL [DeJong et al. 1993] :

9 GABIL [DeJong et al. 1993] Fitness: Fitness(h) = (correct(h)) 2 Representation: IF a 1 = T  a 2 = F THEN c = T; IF a 2 = T THEN c = F represented by a 1 a 2 c a 1 a 2 c 10 01 1 11 10 0 Genetic operators: ??? want variable length rule sets

### Crossover with Variable-Length Bit Strings:

10 Crossover with Variable-Length Bit Strings Start with a 1 a 2 c a 1 a 2 c h 1 : 10 01 1 11 10 0 h 2 : 01 11 0 10 01 0 choose crossover points for h 1 , e.g., after bits 1, 8 now restrict points in h 2 to those that produce bitstrings with well-defined semantics, e.g., 1,3, 1,8, 6,8. if we choose 1,3, result is a 1 a 2 c h 3 : 11 10 0 a 1 a 2 c a 1 a 2 c a 1 a 2 c h 4 : 11 10 0 11 11 0 10 01 0

### GABIL Results:

11 GABIL Results Performance of GABIL comparable to symbolic rule/tree learning methods C4.5, ID5R, AQ14 Average performance on a set of 12 synthetic problems: 92.1% accuracy

### Hypothesis Space Search:

12 Hypothesis Space Search Gas employ a randomized beam search method to seek a maximally fit hypothesis. Quite different from that of other learning methods; E.g. the gradient descent search in Back-Propagation moves smoothly from one hypothesis to a new one that is very similar. The GA search can move much more abruptly, replacing a parent hypothesis by an offspring that may be radically different from the parent. Advantage: less likely to fall into the same kind of local minima that can plague gradient descent methods.

### Hypothesis Space Search (Cont’d):

13 Hypothesis Space Search (Cont’d) One practical difficulty in some GA applications is the problem of Crowding Crowding is a phenomenon in which some individuals that are more highly fit than others in the population quickly reproduce, so that copies of these individuals and very similar individuals take over a large fraction of the population. The negative impact of crowding is that it reduces the diversity of the population, thereby slowing further progress by the GA.

### Three Solutions of Avoiding Crowding:

14 Three Solutions of Avoiding Crowding Alter the Selection Function Using criteria such as tournament selection or rank selection in place of fitness proportionate selection (also called roulette wheel selection ) Fitness Sharing The measured fitness of an individual is reduced by the presence of other similar individuals in the population Restrict the kinds of individuals allowed to recombine to form offspring

### Population Evolution and Schema Theorem:

15 Population Evolution and Schema Theorem How to characterize evolution of population in GA? Schema = string containing 0, 1, * (“don’t care”) Typical schema: 10**0* Instances of above schema: 101101, 100000, … Characterize population by number of instances representing each possible schema m(s, t) = number of instances of schema s in population at time t

### Consider Selection Step Only:

16 Consider Selection Step Only (t) = average fitness of population at time t m(s, t) = instances of schema s in population at time t (s, t) = average fitness of instances of s at time t Probability of selecting h in one selection step Probability of selecting an instance of s in one step Expected number of instances of s after n selections f(h) : fitness of the individual bit string h .

### Consider Selection Step Only (Cont’d):

17 Consider Selection Step Only (Cont’d) Conclusion The expected number of instances of schema s at generation t+1 is proportional to the average fitness of instances of this schema at time t; inversely proportional to the average fitness of all members of the population at time t.

### Schema Theorem:

18 Schema Theorem m(s, t) = instances of schema s in pop at time t (t) = average fitness of pop. at time t (s, t) = ave. fitness of instances of s at time t p c = probability of single point crossover operator p m = probability of mutation operator l = length of single bit strings o(s) number of defined (non “*”) bits in s d(s) = distance between leftmost rightmost defined bits in s

### Genetic Programming:

19 Genetic Programming Genetic Programming (GP) is a form of evolutionary computation in which the individuals in the evolving population are computer programs rather than bit strings. Population of programs represented by trees The fitness of a given individual program in the population is typically determined by executing the program on a set of training data.

20 Crossover

### Genetic Programming:

21 Genetic Programming More interesting example design electronic filter circuits Individuals are programs that transform beginning circuit to final circuit, by adding/subtracting components and connections Use population of 640,000, run on 64 node parallel processor Discovers circuits competitive with best human designs

### Biological Evolution:

22 Biological Evolution Lamark (19th century) The experiences of a single organism diretly affected the gentic makeup of their offspring. E.g., If an individual learned during its lifetime to avoid some toxic food, it could pass this trait on genetically to its offspring, which therefore would not need to learn the trait. But current evidence contradicts this view Genetic makeup of an individual is, in fact, unaffected by the lifetime experience of one’s biological parents.

### Baldwin Effect:

23 Baldwin Effect Baldwin Effect:  individual learning (indirectly) increases rate of evolution. Plausible example: New predator appears in environment; Individual species who can learn (to avoid it) will be selected; Increase in learning individuals will support more diverse gene pool; Resulting in faster evolution; Possibly resulting in new non-learned traits such as an instinctive fear of predator.

### Computer Experiments on Baldwin Effect:

24 Computer Experiments on Baldwin Effect Evolve simple neural networks [Hinton and Nowlan, 1987]: Some network weights fixed during lifetime, others trainable Genetic makeup determines which are fixed, and their weight values Results: With no individual learning, population failed to improve over time When individual learning allowed Early generations: population contained many individuals with many trainable weights Later generations: higher fitness, while number of trainable weights decreased 