Slide1 : Electronic Bayanihan
Parallel Processing and Volunteer Computing
Luis F. G. Sarmenta
The Ever-Growing Need for Speed : The Ever-Growing Need for Speed Machines are getting faster …
Moore’s Law (going strong since 1960’s):
for the same cost, you get a 2x faster proc. every 18 mos.
Want more speed? Just wait 18 mos.
… but not fast enough
Many new computationally-intensive apps, which still take days/weeks, even on latest processors
Examples of compute-intensive apps
Physics: CFD (auto/aero industry), weather, data analysis
Bio and Chem: modelling of DNA, proteins, etc.
Business: data mining, simulation and prediction
Multimedia: CGI rendering
Math: finding large primes, cryptography, etc.
We can do a lot if we have very fast computers!
Parallel Processing : Parallel Processing Want 2x speedup now? Use 2 processors!
Simple Example: adding 8 numbers 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8
3 + 7 + 5 + 6 + 7 + 8
10 + 11 + 7 + 8
21 + 15
36 Parallel (2 Procs.) 4 steps = 1.75x faster
(approaches 2x
with more numbers) Idea: give diff. parts of problem to diff. procs.
In general: up to N times faster with N procs.
Parallel Processing : Parallel Processing The Idea
partition problem into many parts
have processors work on different parts at the same time
“Embarassingly Parallel” apps
simplest form of parallelism
each part is independent
procs. can work without communicating with each other
very efficient because they don’t need to wait for each other
Master-Worker model
master assigns work
workers get and do work One BIG
Problem
Example: Code Cracking : Example: Code Cracking RSA RC5-56 challenge: crack the code and win $10,000
secret 56-bit key
first few letters are known, so we can know if we find right key
Strategy: search the whole key space by brute force
It works!
In 1997, distributed.net cracked the code
using thousands of computers around the world
Solved problem 26,000 times faster than a single Pentium
Workers search in parallel Eureka!
The key is 3124120797
Parallel Image Rendering : Parallel Image Rendering Another embarassingly parallel application
Ray tracing
trace light ray for each pixel
looong computation per pixel
but, pixels are independent
Idea: divide pixels and give to different computers
(or, even simpler, give differentframes)
Probably used in Toy Story, and Star Wars! One BIG
Picture divide into
independent blocks
Monte-Carlo Simulations : Monte-Carlo Simulations Simulation is based on randomized variables
Examples
Business
stock market and economy
simulating processing queues
Traffic Simulation
random traffic conditions, and driver behavior (!)
Input: random number seed
gives diff. sets of random #’s
Output: values of certain variables
e.g., stock price, travel time Traffic
Simulator
Parallel Monte-Carlo Simulations : Parallel Monte-Carlo Simulations Run many independent Monte-Carlo simulations with different random inputs in parallel
Collect statistical info, e.g., average travel times
Embarassingly parallel
each simulation is completely independent of the others Ave. Travel Time:
to Makati = 1.5 hrs.
to Manila = 70 mins. Do many runs
using actual
probability
distributions
What-If Simulations : What-If Simulations If we can compute averages fast enough, then we can do “what-if” experiments
First, choose (non-random) configuration (e.g., w/ or w/o color coding, or w/ or w/o flyover)
Then, run parallel Monte Carlo to determine averages
Repeat with other configurations you want to try
Select best
Ave. Travel Time:
to Makati = 1.5 hrs.
to Manila = 70 mins. without
flyover
What-If Simulations : If we can compute averages fast enough, then we can do “what-if” experiments
First, choose (non-random) configuration (e.g., w/ or w/o color coding, or w/ or w/o flyover)
Then, run parallel Monte Carlo to determine averages
Repeat with other configurations you want to try
Select best
Ave. Travel Time:
to Makati = 45 mins.
to Manila = 70 mins. What-If Simulations with
flyover
Master-Worker Computations : Master-Worker Computations Generalization of embarassingly parallel comp.
several embarassingly parallel computations in sequence
each “step” is composed of independent parts
but, each step can depend on previous step
In each step:
divide work into parts (aka “work packets”)
distribute to workers
collect results
create new work for next step (possibly based on results) Perform
Embarassingly Parallel
Computation (Next Parallel Step)
Other Master-Worker Examples : Other Master-Worker Examples Data Mining: finding patterns in data
collect lots of data (e.g., sales)
find associations
e.g., people who buy product A will also buy product B with probability 0.7
each work packet can be a subset of data
Optimizing multi-variable funcs.
start with m random points
choose k best ones and then generate new candidates from them
repeat until convergence
“genetic algorithms”
etc., etc.,
Perform
Embarassingly Parallel
Computation (Next Parallel Step)
Grid-Based Computations : Simulated
Space Grid-Based Computations The Idea
Divide space into 2D or 3D grid
Assign each cell to a different processor/computer
Procs. perform computation on their respective data in parallel
Exchange data between neighboring cells
Repeat step
Not “embarassingly parallel”
cells are not totally independent
requires communication b/w cells
make sure that communication time is less than computation time
Grid-Based Computations : Useful for modeling and simulation of physical systems
Computational Fluid Dynamics (CFD)
Heat Transfer/Diffusion
Weather forecasting
Computationally-Intensive
More accuracy -> more computation
finer grids
shorter simulated time intervals
Thus, we need parallel processing!
Grid-Based Computations
Other Forms of Parallel Processing : Other Forms of Parallel Processing N-Body Problem
compute movement of N bodies based on gravity
each proc gets a different subset of the N bodies
Matrix Multiplication
Fast-Fourier Transform
for image and sound processing
Pipelined Computations
etc., etc.,
The Main Idea
divide into parts
maximize computation-to-communication ratio (granularity) 1 2 3 4 8 12 16 15 14 9 5 6 7 11 10 13 One BIG
Problem divide into parts
How do we do Parallel Processing? : How do we do Parallel Processing? Supercomputers: VERY fast computers with MANY procs
e.g., Cray, Connection Machine, IBM Deep Thought, etc.
Different types
SIMD: single-instruction, multiple data
e.g., vector machines, array machines, CM2, etc.
MIMD: multiple-instruction, multiple data
More common today
Centralized Shared Memory
sometimes aka SMPs, e.g.. dual-Pentium PC’s
simpler memory model, but less scalable (only up to ~32)
Distributed Memory
each proc. has own memory; communicate via msg passing
much more scalable (up to thousands of procs)
Drawback: They’re EXPENSIVE!
Millions of $ each, plus not easily upgradeable
NOWs and Metacomputing : NOWs and Metacomputing Use existing networks of workstations (NOWs) as one big virtual metacomputer
“poor man’s supercomputer”
make use of idle time
Successful Projects
ad hoc systems: e.g., RSA-129 factoring (1994), GIMPS
general purpose systems: e.g., Condor, PVM, MPI
recent/ongoing projects:e.g. Legion, Globus, Beowulf, etc.
Problems
difficult to set up
requires giving out accounts
can take days, weeks, months to install
cannot be used by normal user
Volunteer Computing : Volunteer Computing The Idea: Make it very easy for even non-expert users to join a NOW by themselves
Minimal setup required Maximum participation
Can form very large NOWs very quickly!
just invite people
The Dream: Electronic Bayanihan
achieving the impossible through cooperation “Bayanihan” Mural by Carlos “Botong” Francisco, commissioned by Unilab, Philippines. Used with permission.
Examples of Volunteer Computing : Examples of Volunteer Computing distributed.net
cracked RC5 on October 19, 1997
average aggregate power of 26,000 Pentium-class PCs
involved 500,000 unique IP addresses over 250 days
SETI@home
data mining of radio telescope data to find ET
Ease of volunteering makes it possible to attract huge numbers of volunteers
volunteers download pre-compiled binaries from the web
Some Problems:
still need different versions for different platforms
still need a little technical knowledge from volunteers
download, install, and run
security: software can read/write user’s data
Project Bayanihan:Volunteer Computing with Java : Project Bayanihan: Volunteer Computing with Java Very easy to use
“one-click computing”: go to web site; applet runs automatically
minimal setup & distribution effort
even NC and WebTV users can join
Platform-independent
write once, run anywhere
easier for both programmers and volunteers
Secure
Java sandbox prevents applets from touching users’ data
minimal security risk Server
Potentials : Potentials True (Altruistic) Volunteer Computing
potentially tap millions of computers around the world
cool problems (e.g., RC5, Chess, computer “olympics”)
worthy causes
local (e.g., traffic), or global (e.g., medical, environmental)
Forced (Institutional) Volunteer Computing
easy & cheap supercomputing for companies and universities
collaboratories: enable institutions to share computing power
Paid (Commercial) Volunteer Computing
market systems: buy, sell, trade processor cycles as commodity
contract systems: use cycles to pay for online services
NOIA (Network of Information Appliances)
use computational power of set-top boxes (e.g., WebTV)
can be commercialized and/or contract-based
tens of millions of processors (note: Greek noia means “mind”)
Current Results : Current Results Bayanihan software framework
my ongoing Ph.D. thesis at MIT
Currently good for Master-Worker style apps
RSA RC5 challenge decryption
Distributed web-crawling (parallel communications)
Mandelbrot set (rendering)
Genetic algorithms (e.g., function optimization)
Research Issues
User Interface Design
Adaptive Parallelism
Fault and Sabotage Tolerance
Performance and Scalability
Implementing other types of Apps
Experimental Results : Experimental Results 200 Mhz Pentiums w/ Netscape 4.0, NT4.0, 10Mbit Ethernet
Mandelbrot
shows good performance of embarassingly parallel apps
Matrix Multiply
Broadcast B
distribute rows of A
Performance limited by
sequential broadcast of Matrix B
memory limits (can’t do 1000x1000 matrix)
Jacobi Iteration (Grid Computation)
shows that we can now write general scientific code (e.g., CFD)
BUT, performance is still BAD because we are just simulating a grid computation using master-worker
ongoing research
Experimental Results : 200 Mhz Pentiums w/ Netscape 4.0, NT4.0, 10Mbit Ethernet
Mandelbrot
shows good performance of embarassingly parallel apps
Matrix Multiply
Broadcast B
distribute rows of A
Performance limited by
sequential broadcast of Matrix B
memory limits (can’t do 1000x1000 matrix)
Jacobi Iteration (Grid Computation)
shows that we can now write general scientific code (e.g., CFD)
BUT, performance is still BAD because we are just simulating a grid computation using master-worker
ongoing research
Experimental Results Conclusion
At present, Bayanihan is best for master-worker and embarassingly parallel computations
BUT, still a lot of useful apps
especially Monte Carlo-like apps
Feel free to suggest some!
Related Work : Related Work Using Java applications
ATLAS (Cilk), ParaWeb, JPVM, WWWinda, Jada
IceT, Ninflet, Java//
DOGMA
Using Java applets
Univ. of Wash. Factoring, DAMPP (JavaWorld)
Charlotte: eager scheduling, shared memory
Javelin: brokers, servlets, Linda, Cilk-like scheduling
Bayanihan
ease-of-use and accessibility
minimal setup requirements maximum participation
flexibility
use OOP to make general application programming and volunteer computing research easy
For More Information : For More Information Papers & Presentations
Sarmenta. “Bayanihan: Web-Based Volunteer Computing Using Java.” WWCA’98. (Springer LNCS 1368)
Sarmenta, Hirano, Ward. “Towards Bayanihan: Building an Extensible Framework for Volunteer Computing Using Java.” ACM Java ‘98. (Concurrency, Oct? ‘98)
Sarmenta, Hirano. “Bayanihan: Building and Studying Volunteer Computing Systems Using Java.” FGCS ’99
Sarmenta, “An Adaptive, Fault-Tolerant Implementation of BSP for Java-based Volunteer Computing Systems.” IPPS Workshop Java for Par. and Dist. Comp. Apr. ‘99
URL
http://bayanihan.ateneo.net/
http://www.cag.lcs.mit.edu/bayanihan
Email: lfgs@pusit.admu.edu.ph, lfgs@mit.edu