logging in or signing up Genetic Algorithm Introduction and use in Phylogenetics aSGuest126742 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT lite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 31 Category: Education License: All Rights Reserved Like it (0) Dislike it (0) Added: February 14, 2012 This Presentation is Public Favorites: 0 Presentation Description Genetic Algorithm Introduction and use in Phylogenetics Comments Posting comment... Premium member Presentation Transcript Genetic Algorithm: Genetic AlgorithmContents: Contents Introduction Basic Outline Genetic Operators Methods of Selection Termination GA in PhylogeneticsIntroduction: Introduction Genetic algorithms are a group of search techniques used in computer science developed in the 1970s. These search techniques are based on the process of biological evolution used to provide useful solutions to optimization and search problems. based on biological evolution, they use techniques that are emulated from the concepts of inheritance , mutation and selectionBasic Principle: : Basic Principle: Genetic algorithms generate a population of solutions for a given problem. The ‘fittest’ solutions are selected and these solutions ‘reproduce Resulting in a new population of possible solutions, with a different composition than the previous population. This process ends when a best solution is found.Basic Description : Basic Description Algorithm is started with a set of solutions called population . Solutions from one population are taken and used to form a new population Solutions which are selected to form new solutions ( offspring ) are selected according to their fitness – The more suitable they are the more chances they have to reproduce.Encoding: Encoding Introduction: Encoding of chromosomes is one of the problems, when you are starting to solve problem with GA. Encoding very depends on the problem . Following are some encodings, which have been already used with some success. Binary Encoding . Permutation Encoding Value Encoding Tree EncodingBinary Encoding . : Binary Encoding . In binary encoding every chromosome is a string of bits , 0 or 1 .Permutation Encoding : Permutation Encoding In permutation encoding , every chromosome is a string of numbers, which represents number in a sequence .Value Encoding : Value Encoding In value encoding , every chromosome is a string of some values. Values can be anything connected to problem, form numbers, real numbers or chars to some complicated objects.Tree Encoding : Tree Encoding Tree encoding is used mainly for evolving programs or expressions, for genetic programming . In tree encoding every chromosome is a tree of some objects, such as functions or commands in programming language.Example of chromosomes with tree encoding : Example of chromosomes with tree encodingPowerPoint Presentation: Tree encoding is good for evolving programs. Programing language LISP is often used to this, because programs in it are represented in this form and can be easily parsed as a tree, The crossover and mutation can be done relatively easily.Operators of Genetic Algorithm: : Operators of Genetic Algorithm: Crossover : Crossover selects genes from parent chromosomes and creates a new offspring from them. Randomly some crossover point are chosen from both parents Crossover points are then exchangedCrossover can then look like this ( | is the crossover point): : Crossover can then look like this ( | is the crossover point):Crossover techniques : Crossover techniques One-point crossover A single crossover point on both parents' organism strings is selected. All data beyond that point in either organism string is swapped between the two parent organisms. The resulting organisms are the children:Two-point crossover : Two-point crossover Two-point crossover calls for two points to be selected on the parent organism strings. Everything between the two points is swapped between the parent organisms. Resulting two child organisms:"Cut and splice" : "Cut and splice" The "cut and splice" approach, results in a change in length of the children strings. The reason for this difference is that each parent string has a separate choice of crossover point.Crossover biases : Crossover biases For crossover operators which exchange contiguous sections of the chromosomes (e.g. k-point) the ordering of the variables may become important. This is particularly true when good solutions contain building blocks which might be disrupted by a non-respectful crossover operator.2.Mutation : 2.Mutation Mutation is a genetic operator that alters one ore more gene values in a chromosome from its initial statetypes of mutation: : types of mutation: Flip Bit -A mutation operator that simply inverts the value of the chosen gene (0 goes to 1 and 1 goes to 0). This mutation operator can only be used for binary genes. Boundary - A mutation operator that replaces the value of the chosen gene with either the upper or lower bound for that gene (chosen randomly). This mutation operator can only be used for integer and float genes.Outline of the Basic Genetic Algorithm : [Start] Generate random population [Fitness] Evaluate the fitness [New population] Create a new population by repeating following steps [Selection] Select two parent chromosomes from a population according to their fitness [Crossover] [Mutation] [Accepting] Place new offspring in a new population [Replace] Use new generated population for a further run of algorithm [Test] If the end condition is satisfied, stop , and return the best solution in current population [Loop] Go to step 2 Outline of the Basic Genetic AlgorithmMethods of Selection: Methods of SelectionMethods of Selection: Methods of Selection Different techniques which a genetic algorithm can use to select the individuals to be copied over into the next generationGenerational selection: Generational selection offspring of the individuals selected from each generation become the entire next generation No individuals are retained between generationsRoulette Wheel Selection: Roulette Wheel Selection Parents are selected according to their fitness. The better the chromosomes are, the more chances to be selected they have. Chromosome with bigger fitness will be selected more timesRoulette Wheel Selection: Roulette Wheel SelectionRank Selection : Rank Selection first ranks the population and then every chromosome receives fitness from this ranking worst will have fitness 1, second worst 2 and the best will have fitness NSituation before ranking (graph of fitnesses): Situation before ranking (graph of fitnesses )Situation after ranking (graph of order numbers): Situation after ranking (graph of order numbers)Rank Selection: Rank Selection all the chromosomes have a chance to be selected. but can lead to slower convergence, because the best chromosomes do not differ so much from other onesElitism: Elitism by crossover and mutation, we have a big chance, that we will loose the best chromosome Elitism - first copies atleast one best chromosome without changes to new population. can very rapidly increase performance of GA, because it prevents losing the best found solutionTournament selection: Tournament selection Subgroups of individuals are chosen from the larger population members of each subgroup compete against each other Only one individual from each subgroup is chosen to reproduceParameters of Genetic Algorithm: Parameters of Genetic AlgorithmCrossover probability: Crossover probability how often will be crossover performed If there is no crossover, offspring is exact copy of parents If there is a crossover, offspring is made from parts of parents' chromosome If crossover probability is 100% , then all offspring is made by crossover If it is 0% , whole new generation is made from exact copies of chromosomes from old population Crossover - new chromosomes will have good parts of old chromosomes and maybe the new chromosomes will be betterMutation probability: Mutation probability how often will be parts of chromosome mutated If no mutation, offspring is taken after crossover without any change If mutation probability is 100% , whole chromosome is changed if it is 0% , nothing is changed. Mutation is made to prevent falling GA into local extremePopulation size: Population size how many chromosomes are in population (in one generation) If there are too few chromosomes, GA have a few possibilities to perform crossover and only a small part of search space is explored On the other hand, if there are too many chromosomes, GA slows down after some limit it is not useful to increase population size, because it does not make solving the problem faster.Termination: Termination Generations — The algorithm stops when the Fixed number of generations is reached. Time limit — The algorithm stops after running for allocated amount of time Fitness limit — The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better resultsTermination: Termination Stall time limit — The algorithm stops if there is no improvement in the objective function during an interval of time solution is found that satisfies minimum criteria Combinations of the aboveGenetic Algorithm in Phylogenetics: Genetic Algorithm in PhylogeneticsGA’s in Phylogenetics: GA’s in Phylogenetics GA’s are used in field of phylogenetics in multiple sequence alignmentPrograms that perform multiple alignment using a GA: Programs that perform multiple alignment using a GA SAGA (sequence alignment by genetic algorithm)-breaks possible alignments into pieces and then introduces gaps at various positions MAGA (multiple alignment by genetic algorithm) for protein sequences implemented a genetic algorithm with basic scoring function, random introduction of gaps, and linear propagation to the next generation RAGA (RNA alignment by genetic algorithm) for RNA alignmentChoosing an Initial Population: Choosing an Initial Population To begin with an alignment produced by another program and copy it multiple times to create an initial population To copy the initial alignment multiple times and then randomly distribute the gaps within each alignment To start with sequences without gaps and find a way to introduce gaps into the alignment. Alignment #1 Alignment #4 Alignment #3 Alignment #2 Alignment #1 Alignment #2 Alignment #3 Alignment #4 Alignment #1 Alignment #3 Alignment #2 Alignment #4Evaluating Fitness: Evaluating Fitness Naïve Scoring Function One point for each match per column No penalty for gaps A gap does not match a gap A A A G G +1 +1 +1 +1 - Total Score for this column = 4 Score(column) = nc2(#A) + nc2(#C) + nc2(#G) + nc2(#T) where nc2(x) is the number of ways you can choose 2 items from a group of x FormulaMutation: Mutation C A A T G A - C - C G T A A A G C A C T A - - - A A A A Step 1: Determine whether or not to mutate a gap using a random number generator Gap selected for mutation Step 2: Determine how much that gap will be mutated using a random number generator - 2 spaces Step 3: Move gap C A A T G A - C - C G T A A A G C A C T A - - - A A A ACrossover: Crossover C A A T G A - C - C G T A A A G C A C T A - - - A A A A C A A T G A - C - C G T A A A G C A C T A - - - A A A A A A A A A A A C C - - T G A A A A A - - C C C - T T G G A crossover alignment is built by randomly selecting a first sequence from one of the two alignments, a second sequence from one of the two alignments, etc. Example: Alignment #1 Alignment #2 Crossover Alignment You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
Genetic Algorithm Introduction and use in Phylogenetics aSGuest126742 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINT lite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 31 Category: Education License: All Rights Reserved Like it (0) Dislike it (0) Added: February 14, 2012 This Presentation is Public Favorites: 0 Presentation Description Genetic Algorithm Introduction and use in Phylogenetics Comments Posting comment... Premium member Presentation Transcript Genetic Algorithm: Genetic AlgorithmContents: Contents Introduction Basic Outline Genetic Operators Methods of Selection Termination GA in PhylogeneticsIntroduction: Introduction Genetic algorithms are a group of search techniques used in computer science developed in the 1970s. These search techniques are based on the process of biological evolution used to provide useful solutions to optimization and search problems. based on biological evolution, they use techniques that are emulated from the concepts of inheritance , mutation and selectionBasic Principle: : Basic Principle: Genetic algorithms generate a population of solutions for a given problem. The ‘fittest’ solutions are selected and these solutions ‘reproduce Resulting in a new population of possible solutions, with a different composition than the previous population. This process ends when a best solution is found.Basic Description : Basic Description Algorithm is started with a set of solutions called population . Solutions from one population are taken and used to form a new population Solutions which are selected to form new solutions ( offspring ) are selected according to their fitness – The more suitable they are the more chances they have to reproduce.Encoding: Encoding Introduction: Encoding of chromosomes is one of the problems, when you are starting to solve problem with GA. Encoding very depends on the problem . Following are some encodings, which have been already used with some success. Binary Encoding . Permutation Encoding Value Encoding Tree EncodingBinary Encoding . : Binary Encoding . In binary encoding every chromosome is a string of bits , 0 or 1 .Permutation Encoding : Permutation Encoding In permutation encoding , every chromosome is a string of numbers, which represents number in a sequence .Value Encoding : Value Encoding In value encoding , every chromosome is a string of some values. Values can be anything connected to problem, form numbers, real numbers or chars to some complicated objects.Tree Encoding : Tree Encoding Tree encoding is used mainly for evolving programs or expressions, for genetic programming . In tree encoding every chromosome is a tree of some objects, such as functions or commands in programming language.Example of chromosomes with tree encoding : Example of chromosomes with tree encodingPowerPoint Presentation: Tree encoding is good for evolving programs. Programing language LISP is often used to this, because programs in it are represented in this form and can be easily parsed as a tree, The crossover and mutation can be done relatively easily.Operators of Genetic Algorithm: : Operators of Genetic Algorithm: Crossover : Crossover selects genes from parent chromosomes and creates a new offspring from them. Randomly some crossover point are chosen from both parents Crossover points are then exchangedCrossover can then look like this ( | is the crossover point): : Crossover can then look like this ( | is the crossover point):Crossover techniques : Crossover techniques One-point crossover A single crossover point on both parents' organism strings is selected. All data beyond that point in either organism string is swapped between the two parent organisms. The resulting organisms are the children:Two-point crossover : Two-point crossover Two-point crossover calls for two points to be selected on the parent organism strings. Everything between the two points is swapped between the parent organisms. Resulting two child organisms:"Cut and splice" : "Cut and splice" The "cut and splice" approach, results in a change in length of the children strings. The reason for this difference is that each parent string has a separate choice of crossover point.Crossover biases : Crossover biases For crossover operators which exchange contiguous sections of the chromosomes (e.g. k-point) the ordering of the variables may become important. This is particularly true when good solutions contain building blocks which might be disrupted by a non-respectful crossover operator.2.Mutation : 2.Mutation Mutation is a genetic operator that alters one ore more gene values in a chromosome from its initial statetypes of mutation: : types of mutation: Flip Bit -A mutation operator that simply inverts the value of the chosen gene (0 goes to 1 and 1 goes to 0). This mutation operator can only be used for binary genes. Boundary - A mutation operator that replaces the value of the chosen gene with either the upper or lower bound for that gene (chosen randomly). This mutation operator can only be used for integer and float genes.Outline of the Basic Genetic Algorithm : [Start] Generate random population [Fitness] Evaluate the fitness [New population] Create a new population by repeating following steps [Selection] Select two parent chromosomes from a population according to their fitness [Crossover] [Mutation] [Accepting] Place new offspring in a new population [Replace] Use new generated population for a further run of algorithm [Test] If the end condition is satisfied, stop , and return the best solution in current population [Loop] Go to step 2 Outline of the Basic Genetic AlgorithmMethods of Selection: Methods of SelectionMethods of Selection: Methods of Selection Different techniques which a genetic algorithm can use to select the individuals to be copied over into the next generationGenerational selection: Generational selection offspring of the individuals selected from each generation become the entire next generation No individuals are retained between generationsRoulette Wheel Selection: Roulette Wheel Selection Parents are selected according to their fitness. The better the chromosomes are, the more chances to be selected they have. Chromosome with bigger fitness will be selected more timesRoulette Wheel Selection: Roulette Wheel SelectionRank Selection : Rank Selection first ranks the population and then every chromosome receives fitness from this ranking worst will have fitness 1, second worst 2 and the best will have fitness NSituation before ranking (graph of fitnesses): Situation before ranking (graph of fitnesses )Situation after ranking (graph of order numbers): Situation after ranking (graph of order numbers)Rank Selection: Rank Selection all the chromosomes have a chance to be selected. but can lead to slower convergence, because the best chromosomes do not differ so much from other onesElitism: Elitism by crossover and mutation, we have a big chance, that we will loose the best chromosome Elitism - first copies atleast one best chromosome without changes to new population. can very rapidly increase performance of GA, because it prevents losing the best found solutionTournament selection: Tournament selection Subgroups of individuals are chosen from the larger population members of each subgroup compete against each other Only one individual from each subgroup is chosen to reproduceParameters of Genetic Algorithm: Parameters of Genetic AlgorithmCrossover probability: Crossover probability how often will be crossover performed If there is no crossover, offspring is exact copy of parents If there is a crossover, offspring is made from parts of parents' chromosome If crossover probability is 100% , then all offspring is made by crossover If it is 0% , whole new generation is made from exact copies of chromosomes from old population Crossover - new chromosomes will have good parts of old chromosomes and maybe the new chromosomes will be betterMutation probability: Mutation probability how often will be parts of chromosome mutated If no mutation, offspring is taken after crossover without any change If mutation probability is 100% , whole chromosome is changed if it is 0% , nothing is changed. Mutation is made to prevent falling GA into local extremePopulation size: Population size how many chromosomes are in population (in one generation) If there are too few chromosomes, GA have a few possibilities to perform crossover and only a small part of search space is explored On the other hand, if there are too many chromosomes, GA slows down after some limit it is not useful to increase population size, because it does not make solving the problem faster.Termination: Termination Generations — The algorithm stops when the Fixed number of generations is reached. Time limit — The algorithm stops after running for allocated amount of time Fitness limit — The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better resultsTermination: Termination Stall time limit — The algorithm stops if there is no improvement in the objective function during an interval of time solution is found that satisfies minimum criteria Combinations of the aboveGenetic Algorithm in Phylogenetics: Genetic Algorithm in PhylogeneticsGA’s in Phylogenetics: GA’s in Phylogenetics GA’s are used in field of phylogenetics in multiple sequence alignmentPrograms that perform multiple alignment using a GA: Programs that perform multiple alignment using a GA SAGA (sequence alignment by genetic algorithm)-breaks possible alignments into pieces and then introduces gaps at various positions MAGA (multiple alignment by genetic algorithm) for protein sequences implemented a genetic algorithm with basic scoring function, random introduction of gaps, and linear propagation to the next generation RAGA (RNA alignment by genetic algorithm) for RNA alignmentChoosing an Initial Population: Choosing an Initial Population To begin with an alignment produced by another program and copy it multiple times to create an initial population To copy the initial alignment multiple times and then randomly distribute the gaps within each alignment To start with sequences without gaps and find a way to introduce gaps into the alignment. Alignment #1 Alignment #4 Alignment #3 Alignment #2 Alignment #1 Alignment #2 Alignment #3 Alignment #4 Alignment #1 Alignment #3 Alignment #2 Alignment #4Evaluating Fitness: Evaluating Fitness Naïve Scoring Function One point for each match per column No penalty for gaps A gap does not match a gap A A A G G +1 +1 +1 +1 - Total Score for this column = 4 Score(column) = nc2(#A) + nc2(#C) + nc2(#G) + nc2(#T) where nc2(x) is the number of ways you can choose 2 items from a group of x FormulaMutation: Mutation C A A T G A - C - C G T A A A G C A C T A - - - A A A A Step 1: Determine whether or not to mutate a gap using a random number generator Gap selected for mutation Step 2: Determine how much that gap will be mutated using a random number generator - 2 spaces Step 3: Move gap C A A T G A - C - C G T A A A G C A C T A - - - A A A ACrossover: Crossover C A A T G A - C - C G T A A A G C A C T A - - - A A A A C A A T G A - C - C G T A A A G C A C T A - - - A A A A A A A A A A A C C - - T G A A A A A - - C C C - T T G G A crossover alignment is built by randomly selecting a first sequence from one of the two alignments, a second sequence from one of the two alignments, etc. Example: Alignment #1 Alignment #2 Crossover Alignment