Rensselaer Polytechnic Institute

Views:
 
Category: Education
     
 

Presentation Description

No description available.

Comments

Presentation Transcript

Slide 1: 

1 November 19, 2009 1 Rensselaer Polytechnic Institute Department of Computer Science PhD Defense November 19, 2009 Troy, New York Travis Desell Advisors: Carlos Varela and Boleslaw Szymanski Committee: Malik Magdon-Ismail, Heidi Newberg and Dave Anderson

Overview : 

2 November 19, 2009 Motivation Asynchronous Optimization Optimization Framework Simulating Massive-Scale Environments MilkyWay@Home and Validation Conclusions Overview 2

Slide 3: 

3 November 19, 2009 Why is optimization important? 3

Slide 4: 

4 November 19, 2009 What is the structure and origin of the Milky Way galaxy? 4

Slide 5: 

5 November 19, 2009 5

Slide 6: 

6 November 19, 2009 Massive-scale Computing? 6

Slide 7: 

7 November 19, 2009 Courtesy of http://www.top500.org RPI CCNI BlueGene/L 7 Supercomputer Cores by Top 500 Rank

Slide 8: 

8 November 19, 2009 8 SETI@Home

MilkyWay@Home : 

9 November 19, 2009 MilkyWay@Home 9

Slide 10: 

10 November 19, 2009 How can we use these environments for optimization? 10

Slide 11: 

11 November 19, 2009 Asynchronous Optimization 11

Traditional Optimization Strategies : 

12 November 19, 2009 Traditional Optimization Strategies Individual-based Evolution: Differential Evolution Particle Swarm Optimization 12 Population-based Evolution: Genetic Search Traditional continuous optimization strategies are evolutionary, imitating biology. Populations of potential solutions evolve through recombination.

Issues With Traditional Optimization : 

13 November 19, 2009 13 Issues With Traditional Optimization Traditional global optimization techniques are dependent and iterative Current population (or individual) is used to generate the next population (or individual) Dependencies and iterations limit scalability and impact performance With volatile hosts, what if an individual in the next generation is lost? Redundancy is expensive Scalability limited by population size

Asynchronous Optimization Strategy : 

14 November 19, 2009 14 Asynchronous Optimization Strategy Use an asynchronous methodology No dependencies on unknown results No iterations Continuously updated population N individuals are generated randomly for the initial population Fulfill work requests by applying recombination operators to the population Update population with reported results

Generic Asynchronous Optimization : 

15 November 19, 2009 15 Generic Asynchronous Optimization Population Fitness (1) Fitness (2) Fitness (n) . . . . . . . . Individual (1) Individual (2) Individual (n) . . . . . . . . Unevaluated Individuals Unevaluated Individual (1) Unevaluated Individual (2) Unevaluated Individual (n) . . . . . . . . Workers (Fitness Evaluation) Report results and update population Request Work Send Work Generate individuals when queue is low

Genetic Search : 

16 November 19, 2009 Genetic Search Generate initial random population Iteratively generate new populations: N + M + O = population size N best individuals survive through ‘selection’ M individuals mutated O individuals generated through ‘recombination’ 16

Genetic Search Example : 

17 November 19, 2009 Genetic Search Example 17 iteration optimize sphere: f(pi) = pi[0]2 + pi[1]2 + pi[2]2

Alternate Recombination : 

18 November 19, 2009 18 Alternate Recombination Double Shot - two parents generate three children Average of the parents Outside the less fit parent, equidistant to parent and average Outside the more fit parent, equidistant to parent and average worst parent: pworst childlower = 2pworst - childaverage best parent: pbest childaverage = (pbest + pworst)/2 childlower = 2pbest - childaverage

Alternate Recombination (2) : 

19 November 19, 2009 19 Alternate Recombination (2) Randomized Simplex N parents generate one or more children Points randomly along the line created by the worst parent, and the centroid (average) of the remaining parents child = pworst + random(l1,l2) * (pworst - a) worst parent: pworst parent: p1 parent: p2 parent: p3 parent: p4 a = average(parents) pworst + l1(pworst - a) pworst + l2(pworst - a)

Steady State and Asynchronous GS : 

20 November 19, 2009 Steady State and Asynchronous GS Steady State is sequential: Generate initial random population Randomly choose mutation or recombination to generate new individual If new individual improves population, insert it and remove worst member We modify this approach for Asynchronous GS: Generate initial random population Randomly choose mutation or recombination to generate new individuals for work requests When fitness reported, insert members if they improve the population 20

Asynchronous Genetic Search : 

21 November 19, 2009 21 Asynchronous Genetic Search Population Fitness (1) Fitness (2) Fitness (n) . . . . . . . . Individual (1) Individual (2) Individual (n) . . . . . . . . Unevaluated Individuals Unevaluated Individual (1) Unevaluated Individual (2) Unevaluated Individual (n) . . . . . . . . Workers (Fitness Evaluation) Report results and update population Request Work Send Work Generate individuals when queue is low Remove worst individual and add new individual in order Select random parents from population to generate new individual

Asynchronous vs Iterative Genetic Search : 

22 November 19, 2009 22 Asynchronous vs Iterative Genetic Search

Particle Swarm Optimization : 

23 November 19, 2009 Particle Swarm Optimization Particles ‘fly’ around the search space. Move according to their previous velocity and are pulled towards the global best found position and their locally best found position. Analogies: cognitive intelligence (local best knowledge) social intelligence (global best knowledge) 23

Particle Swarm Optimization (Example) : 

24 November 19, 2009 Particle Swarm Optimization (Example) 24 previous: pi(t-1) current: pi(t) local best global best c1 * (li - pi(t)) c2 * (g - pi(t)) w * vi(t) velocity: vi(t) possible new positions

Particle Swarm Optimization (Example) : 

25 November 19, 2009 Particle Swarm Optimization (Example) 25 previous: pi(t-1) current: pi(t) local best global best c1 * (li - pi(t)) c2 * (g - pi(t)) w * vi(t) velocity: vi(t) possible new positions

Particle Swarm Optimization (Example) : 

26 November 19, 2009 Particle Swarm Optimization (Example) 26 previous: pi(t-1) current: pi(t) local best global best c2 * (g - pi(t)) w * vi(t) velocity: vi(t) possible new positions Particle finds a new local best position

Particle Swarm Optimization (Example) : 

27 November 19, 2009 Particle Swarm Optimization (Example) 27 previous: pi(t-1) current: pi(t) local best global best c2 * (g - pi(t)) w * vi(t) velocity: vi(t) possible new positions c1 * (li - pi(t))

Particle Swarm Optimization (Example) : 

28 November 19, 2009 Particle Swarm Optimization (Example) 28 previous: pi(t-1) current: pi(t) local best global best velocity: vi(t) new position Particle finds the global best position

Particle Swarm Optimization (Example) : 

29 November 19, 2009 Particle Swarm Optimization (Example) 29 c2 * (g - pi(t)) w * vi(t) possible new positions c1 * (li - pi(t)) previous: pi(t-1) current: pi(t) local best global best velocity: vi(t) Another particle finds the global best position

Particle Swarm Optimization : 

30 November 19, 2009 Particle Swarm Optimization PSO: vi(t+1) = w * vi(t) + c1 * r1 * (li - pi(t)) + c2 * r2 * (g - pi(t)) pi(t+1) = pi(t) + vi(t+1) w, c1, c2 = constants r1, r2 = random float between 0 and 1 vi(t) = velocity of particle i at iteration t pi(t) = position of particle i at iteration t li = best position found by particle i g = global best position found by all particles 30

Differential Evolution (In Brief) : 

31 November 19, 2009 Differential Evolution (In Brief) In general: Perform binary or exponential recombination between the current individual and another individual modified by a scaled difference between n pairs of other individuals 31 best/n/bin best/n/exp random/n/bin random/n/exp current/n/bin current/n/exp Parent Selection Number of Pairs Recombination / / Many Variations:

Differential Evolution (Example) : 

32 November 19, 2009 Differential Evolution (Example) 32 current: pi(t) pair1: r1 parent: r0 recombine(current, target) target: r0 + c(r1 - r2) scaled differential: c(r1 - r2) scaled differential: c(r1 - r2) pair2: r2

Differential Evolution (Details) : 

33 November 19, 2009 Differential Evolution (Details) pi,j(t) = jth parameter of ith member of population at iteration t gj = jth parameter of global best member at iteration t c = scaling factor r1, r2 = random int between 0 and population size, r1 != r2 r3 = random int between 0 and number of parameters r4 = random float between 0 and 1 cr = crossover rate 33 = gj(t) + c * (pr1,j(t) - pr2,j(t)) = pi,j(t) pi,j(t+1) DE (best/1/bin): if r3 == j or r4 < cr otherwise if f(p(t+1)) < f(p(t)) then p(t+1) = p(t)

Asynchronous DE & PSO : 

34 November 19, 2009 Asynchronous DE & PSO Note that generating new positions does not necessarily require the fitness of the previous position 1. Generate new particle or individual positions to fill work queue 2. Update local and global best on results DE: If result improves individual, update individual’s position PSO: If result improves particles local best, update local best, particle’s position and velocity of the result 34

Asynchronous DE & PSO : 

35 November 19, 2009 35 Asynchronous DE & PSO Population Fitness (1) Fitness (2) Fitness (n) . . . . . . . . Individual (1) Individual (2) Individual (n) . . . . . . . . Unevaluated Individuals Unevaluated Individual (1) Unevaluated Individual (2) Unevaluated Individual (n) . . . . . . . . Workers (Fitness Evaluation) Report results and update population Request Work Send Work Generate individuals when queue is low Individuals replaced if new individual has better fitness Select individual to generate new individual from in round-robin manner

Slide 36: 

36 November 19, 2009 A Generic Framework for Distributed Optimization 36

Approach : 

37 November 19, 2009 Separation of research concerns Scientific models Optimization methods Distributed computing 37 Approach

Synchronous Architecture : 

38 November 19, 2009 38 Synchronous Architecture Gradient Descent Newton Method Simplex . . . distribute parameters and data worker1 worker2 workern request evaluation block until result compose fitness Distributed Computing Scientific Models Synchronous Search Methods Optimization Methods

Asynchronous Architecture : 

39 November 19, 2009 39 Asynchronous Architecture Differential Evolution Genetic Search Particle Swarm . . . get parameters and data worker1 worker2 workern report results Distributed Computing Scientific Models Asynchronous Search Methods Optimization Methods

Asynchronous Architecture : 

40 November 19, 2009 40 Asynchronous Architecture … Initial Parameters Optimized Parameters

Slide 41: 

41 November 19, 2009 Simulating Massive-Scale Environments 41

Simulation Implementation : 

42 November 19, 2009 42 Simulation Implementation Population Fitness (1) Fitness (2) Fitness (n) . . . Individual (1) Individual (2) Individual (n) . . . Unevaluated Individuals Unevaluated Individual (1) Unevaluated Individual (2) Unevaluated Individual (n) . . . Workers (Fitness Evaluation) Report results and update population Request Work Send Work Generate individuals when queue is low Min Heap (Report Times) Remove result with minimum time and report Generate report time and insert into heap Request new work for each reported individual Initialize heap with results equal to number of workers 1 5 35 20 8 45 21 36 70 18 203 105 19 7 15

Slide 43: 

43 November 19, 2009 Optimization Problems 43

Sphere Function : 

44 November 19, 2009 Sphere Function Courtsey of http://www-optima.amp.i.kyoto-u.ac.jp/member/student/hedar/Hedar_files/TestGO.htm 44

Ackley Function : 

45 November 19, 2009 Ackley Function 45 Courtsey of http://www-optima.amp.i.kyoto-u.ac.jp/member/student/hedar/Hedar_files/TestGO.htm

Griewank Function : 

46 November 19, 2009 Griewank Function 46 Courtsey of http://www-optima.amp.i.kyoto-u.ac.jp/member/student/hedar/Hedar_files/TestGO.htm

Rastrigin Function : 

47 November 19, 2009 Rastrigin Function 47 Courtsey of http://www-optima.amp.i.kyoto-u.ac.jp/member/student/hedar/Hedar_files/TestGO.htm

Rosenbrock Function : 

48 November 19, 2009 Rosenbrock Function 48 Courtesy of http://en.wikipedia.org/wiki/File:Rosenbrock_function.svg

Slide 49: 

49 November 19, 2009 Exploration vs Exploitation? 49

Simulating Homogeneous Environments : 

50 November 19, 2009 Simulating Homogeneous Environments Homogeneous Report Times: 1 Synchronous vs Asynchronous Search 50

Simulated Homogeneous Environments : 

51 November 19, 2009 Simulated Homogeneous Environments 51

Simulated Homogeneous Environments : 

52 November 19, 2009 Simulated Homogeneous Environments 52

Simulated Homogeneous Environments : 

53 November 19, 2009 Simulated Homogeneous Environments 53

Simulating Homogeneous Environments Results : 

54 November 19, 2009 Simulating Homogeneous Environments Results Genetic search is computationally slow at large population sizes Asynchronous search scales better Parallel search can take more evaluations/time PSO scaled the best (when it could be used) Not all searches can solve all problems 54

Simulating Heterogeneous Environments : 

55 November 19, 2009 Simulating Heterogeneous Environments Heterogeneous Report Times: 1 + random(0 .. 1) * range Evaluations to solution Time to solution 55

Simulated Heterogeneous Environments : 

56 November 19, 2009 Simulated Heterogeneous Environments 56

Simulated Heterogeneous Environments : 

57 November 19, 2009 Simulated Heterogeneous Environments 57

Simulated Heterogeneous Environments : 

58 November 19, 2009 Simulated Heterogeneous Environments 58

Simulating Heterogeneous Environments Results : 

59 November 19, 2009 Simulating Heterogeneous Environments Results Asynchronous search requires less evaluations to solution with increased heterogeneity 59

Slide 60: 

60 November 19, 2009 Simulating MilkyWay@Home 60

Simulating MilkyWay@Home : 

61 November 19, 2009 Simulating MilkyWay@Home 61

Gamma Distribution : 

62 November 19, 2009 Gamma Distribution Probability model for waiting times random variable: x shape: k scale: sigma 62

Simulating MilkyWay@Home : 

63 November 19, 2009 Simulating MilkyWay@Home 63

Simulating MilkyWay@Home : 

64 November 19, 2009 Simulating MilkyWay@Home Heterogeneous Report Times: 40% GPU - randomly distributed as gamma(3, 1000) 60% CPU - randomly distributed as gamma(2, 12000) Evaluations to solution Time to solution 64

Simulated MilkyWay@Home : 

65 November 19, 2009 Simulated MilkyWay@Home 65

Simulated MilkyWay@Home : 

66 November 19, 2009 Simulated MilkyWay@Home 66 AGS didn’t work! ADE/best worked?

Simulated MilkyWay@Home : 

67 November 19, 2009 Simulated MilkyWay@Home 67

Simulating MilkyWay@Home Results : 

68 November 19, 2009 Simulating MilkyWay@Home Results Weird things happen! AGS stops working ADE starts working Asynchronous optimization still scales well 68

Slide 69: 

69 November 19, 2009 Optimizing Models of the Milky Way Galaxy 69

Slide 70: 

70 November 19, 2009 What if hosts return bad results? 70

Partial Verification : 

71 November 19, 2009 Partial Verification BOINC verifies every work unit Only results that will be inserted into the population need to be verified Partial Verification: Ignore false-negatives (results that won’t be inserted) Verify results which potentially improve the search 71

Slide 72: 

72 November 19, 2009 72 Unevaluated Individuals Unevaluated Individual (1) Unevaluated Individual (2) Unevaluated Individual (n) . . . . . . . . Workers (Fitness Evaluation) Remove verified results from queue and insert them into the population Request Work Send Work Generate new individuals when queue is low, otherwise Partial Verification Strategy Population Fitness (1) Fitness (2) Fitness (n) . . . Individual (1) Individual (2) Individual (n) . . . Verification Queue Fitness (1) Fitness (2) Fitness (n) . . . Individual (1) Individual (2) Individual (n) . . . Insert result if it could improve population Resend individuals at verification rate

Optimization Method Comparison : 

73 November 19, 2009 Optimization Method Comparison Tracked best fitness across 5 separate searches for each combination of search parameters Used different verification rates (v) Used Sagittarius stripe 22: 100,789 observed stars 3 streams 20 optimization parameters 73

Slide 74: 

74 November 19, 2009 74

Limiting Redundancy (Genetic Search) : 

75 November 19, 2009 Limiting Redundancy (Genetic Search) 75 60% verification found best solutions Increased verification reduces reliability Reliability and convergence by number of parents seems dependent on verification rate

Limiting Redundancy (PSO) : 

76 November 19, 2009 Limiting Redundancy (PSO) 76 30% verification found best solutions Increased verification reduces reliability Not as dramatically as AGS Lower inertia weights give better results

Limiting Redundancy (DE best/n/bin) : 

77 November 19, 2009 Limiting Redundancy (DE best/n/bin) 77 30% verification found best solutions Increased verification slightly reduces reliability 1 pair gives best results More pairs slightly improves reliability

Limiting Redundancy (DE rand/p/bin) : 

78 November 19, 2009 Limiting Redundancy (DE rand/p/bin) 78 30% verification found best solutions Increased verification reduces reliability 1 pair gives best results More pairs reduces reliability

Optimization Method Comparison : 

79 November 19, 2009 Optimization Method Comparison 79

Slide 80: 

80 November 19, 2009 Conclusions 80

Key Contributions : 

81 November 19, 2009 Scalable and resilience optimization methods Generic/Unified optimization framework: Supercomputers Grids Internet Computing Simulated environments and benchmarks 81 Key Contributions

Key Contributions : 

82 November 19, 2009 MilkyWay@Home 439 teraflops 25,000 active users 50,000 active hosts Asynchronous optimization being used to optimize models of the Milky Way galaxy Validation work improves time to solution 82 Key Contributions

What did we learn? : 

83 November 19, 2009 Asynchronous optimization scales to large environments Heterogeneity can cause unexpected results Verification can be limited for asynchronous optimization 83 What did we learn?

Slide 84: 

84 November 19, 2009 Future Work 84

Improve Asynchronous Optimization : 

85 November 19, 2009 Why does AGS fail in a simulated environment? Why does ADE succeed in a large simulated environment? Improve scalability further Improve convergence rates and reliability 85 Improve Asynchronous Optimization

Metaheuristics : 

86 November 19, 2009 Tuning parameters is time consuming and tedious Can this process be automated? 86 Metaheuristics

Improve Verification Strategies : 

87 November 19, 2009 Can we reduce resources dedicated to verification further? MilkyWay@Home has ~1% fault rate What if we assume results to be successful and remove them from the population if found faulty? 87 Improve Verification Strategies

Collaboration : 

88 November 19, 2009 DNA@Home Malaria@Home 88 Collaboration

Slide 89: 

89 November 19, 2009 Questions? 89

Slide 90: 

90 November 19, 2009 90 Thanks! http://wcl.cs.rpi.edu http://milkyway.cs.rpi.edu Work partially supported by: NSF AST No. 0607618 NSF IIS No. 0612213 NSF MRI No. 0420703 NSF CAREER CNS Award No. 0448407

Slide 91: 

91 November 19, 2009 Extra Slides 91

Related Work - Distributed Genetic Search : 

92 November 19, 2009 Enrique Alba and Jose M. Troya. Analyzing synchronous and asynchronous parallel distributed genetic algorithms. Future Generation Computer Systems, 17:451–465, January 2001. Enrique Alba and Bernabe Dorronsoro. The exploration/exploitation tradeoff in dynamic cellular genetic algorithms. IEEE Transactions on Evolutionary Computation, 9:126–142, April 2005. Johan Berntsson and Maolin Tang. A convergence model for asynchronous parallel genetic algorithms. In IEEE Congress on Evolutionary Computation (CEC2003), volume 4, pages 2627–2634, December 2003. Erick Cantu-Paz. A survey of parallel genetic algorithms. Calculateurs Paralleles, Reseaux et Systems Repartis, 10(2):141–171, 1998. Bernabe Dorronsoro and Enrique Alba. A simple cellular genetic algorithm for continuous optimization. IEEE Congress on Evolutionary Computation (CEC2006), pages 2838–2844, July 2006. Bernabe Dorronsoro, Enrique Alba, Mario Giacobini, and Marco Tomassini. The influence of grid shape and asynchronicity on cellular evolutionary algorithms. In IEEE Congress on Evolutionary Computation (CEC2004), volume 2, pages 2152–2158, June 2004. Gianluigi Folino, Agostino Forestiero, and Giandomenico Spezzano. A JXTA based asynchronous peer-to-peer implementation of genetic programming. Journal of Software, 1:12–23, August 2006. Hiroaki Imade, Ryohei Morishita, Isao Ono, Norihiko Ono, and Masahiro Okamoto. A grid-oriented genetic algorithm framework for bioinformatics. New Generation Computing: Grid Systems for Life Sciences, 22:177–186, January 2004. Dudy Lim, Yew-Soon Ong, Yaochu Jin, Bernhard Sendhoff, and Bu-Sung Lee. Efficient hierarchical parallel genetic algorithms using grid computing. Future Generation Computer Systems, 23:658–670, May 2007. Z. G. Wang, M. Rahman, Y. S. Wong, and J. Sun. Optimization of multi-pass milling using parallel genetic algorithm and parallel genetic simulated annealing. International Journal of Machine Tools and Manufacture, 45(15):1726–1734, December 2005. 92 Related Work - Distributed Genetic Search

Related Work - Distributed Particle Swarm : 

93 November 19, 2009 S. Baskar, A. Alphones, and P. N. Suganthan. Concurrent pso and fdr-pso based reconfigurable phase-differentiated antenna array design. In Congress on Evolutionary Computation, volume 2, pages 2173–2179, June 2004. S. Baskar and P. N. Suganthan. A novel concurrent particle swarm optimization. In Congress on Evolutionary Computation, volume 1, pages 792–796, June 2004. Xiaohui Cui and Thomas E. Potok. Distributed adaptive particle swarm optimizer in dynamic environment. In International Parallel and Distributed Processing Symposium, pages 1–7, March 2007. Byung-Il Koh, Alan D. George, and Raphael T. Haftka. Parallel asynchronous particle swarm optimization. International Journal of Numerical Methods in Engineering, 67(4):578–595, July 2006. J. F. Schutte, J. A. Reinbolt, B. J. Fregly, R. T. Haftka, and A. D. George. Parallel global optimization with the particle swarm algorithm. International Journal for Numerical Methods in Engineering, 61(13):2296–2315, December 2004. Gerhard Venter and Jaroslaw Sobieszczanski-Sobieski. A parallel particle swarm optimization algorithm accelerated by asynchronous evaluations. In Sixth World Congresses of Structural and Multidisciplinary Optimization, pages 1–10, May 2005. Xue Wang, Jun-Jie Ma, Sheng Wang, and Dao-Wei Bi. Distributed particle swarm optimization and simulated annealing for energy-efficient coverage in wireless sensor networks. Sensors Special Issue on Energy Efficiency and Intelligent Signal Processing for Wireless Sensing, 7(5):628–648, May 2007. 93 Related Work - Distributed Particle Swarm

Related Work - Distributed Differential Evolution : 

94 November 19, 2009 D. K. Tasoulis, N. G. Pavlidis, V. P. Plagianakos, and M. N. Vrahatis. Parallel differential evolution. In Congress on Evolutionary Computation 2004 (CEC2004), volume 2, pages 2023–2029, June 2004. 94 Related Work - Distributed Differential Evolution

Optimization Parameters for Simulation : 

95 November 19, 2009 Optimization Parameters for Simulation Population Size: 100 Genetic Search: mutation rate: 0.3 recombination: simplex 2 parents line search min: -1.5 line search max: 1.5 95 Particle Swarm Optimization: inertia weight (w) = 0.5 cognitive scale (c1) = 2 social scale (c2) = 2 Differential Evolution: binomial recombination number pairs = 1 parent = best or random pair scale = 0.5 crossover rate = 0.5

Latency Effects : 

96 November 19, 2009 Latency Effects Is BOINC a good platform for optimization? Fast turnaround required to keep populations evolving Many slow clients -- are these resources wasted? 96

Operator Examination (1) - BlueGene : 

97 November 19, 2009 97 Operator Examination (1) - BlueGene

Operator Examination (2) - BOINC : 

98 November 19, 2009 98 Operator Examination (2) - BOINC

Operator Examination (3) - BOINC : 

99 November 19, 2009 99 Operator Examination (3) - BOINC

Operator Examination (4) - BOINC : 

100 November 19, 2009 100 Operator Examination (4) - BOINC

Operator Examination (5) - BOINC : 

101 November 19, 2009 101 Operator Examination (5) - BOINC

Operator Examination (6) - BOINC : 

102 November 19, 2009 102 Operator Examination (6) - BOINC

Operator Examination (7) - BOINC : 

103 November 19, 2009 103 Operator Examination (7) - BOINC