Simon Does Arch Matter CyberPanel

Presentation Transcript

Slide1: 

Does Architecture Matter? Horst D. Simon, Associate Laboratory Director and Director of NERSC, Lawrence Berkeley National Laboratory. Presentation to NSF Cyberinfrastructure Council, July 27, 2005 (with contributions by L. Oliker, W. Saphir, E. Strohmaier, D. Skinner, J. Shalf, and K. Yelick at LBNL and UC Berkeley).

Does Architecture Matter?: 

Does Architecture Matter? YES! High End Computing (HEC) has settled into a relatively stable state in 2005 (architecture/software/systems). However, there will be significant changes in the next five years that could completely change the HEC ecosystem by 2015. Supercomputing centers must be ready to adapt to this change by fostering diversity in architecture, software, and systems. NSF can foster this change by maintaining a diverse center environment and by re-investing in basic research in architecture, software, and tools for HEC.

Future Challenge: Developing a New Ecosystem for HPC: 

Future Challenge: Developing a New Ecosystem for HPC. From the NRC Report on “The Future of Supercomputing”: Platforms, software, institutions, applications, and people who solve supercomputing applications can be thought of collectively as an ecosystem. Research investment in HPC should be informed by the ecosystem point of view: progress must come on a broad front of interrelated technologies, rather than in the form of individual breakthroughs. Pond ecosystem image from http://www.tpwd.state.tx.us/expltx/eft/txwild/pond.htm

Supercomputing Ecosystem (1988): 

Supercomputing Ecosystem (1988): Cold War and Big Oil spending in the 1980s; powerful vector supercomputers; 20 years of Fortran applications base in physics codes and third-party apps.

Supercomputing Ecosystem (until about 1988): 

Supercomputing Ecosystem (until about 1988): Cold War and Big Oil spending in the 1980s; powerful vector supercomputers; 20 years of Fortran applications base in physics codes and third-party apps.

Supercomputing Ecosystem (2005): 

Supercomputing Ecosystem (2005): Commercial Off The Shelf (COTS) technology; “clusters”; 12 years of legacy MPI applications base.

How Did We Make the Change?: 

How Did We Make the Change? Massive R&D investment: HPCC in the US; vigorous computer science experimentation in languages, tools, and system software; development of Grand Challenge applications. External driver: industry transition to CMOS micros. All changes happened virtually at once: ecosystem change.

Observations on the 2005 Ecosystem: 

Observations on the 2005 Ecosystem. It is very stable: attempts at re-introducing old species failed (X1); attempts at introducing new species failed (mutation of Blue Gene 1999 to BG/L 2005). It works well: just look around the room. So why isn’t everybody happy and content?

HPTC 2004 Market Highlights: 

HPTC 2004 Market Highlights. Strong market performance: revenue = $7.25B (growth 30.2%); units = 155,596 (growth 70.2%); ASP (average selling price) = $46.6K (decline 23.5%). Drivers: the market rebounds for the second year with growing budgets and new customers; potential development of a new low-end market based on a tipping point in price/performance. Slides courtesy of Earl Joseph Jr., IDC.
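The three headline figures on this slide are mutually consistent, which a short calculation confirms. This is only an illustrative check: the variable names are ours, and the inputs are the rounded values quoted on the slide.

```python
# Rounded values quoted on the slide (IDC, HPTC market, 2004).
revenue_usd = 7.25e9
units = 155_596

# Average selling price is revenue divided by units shipped.
asp_usd = revenue_usd / units

# If revenue grows 30.2% while units grow 70.2%, the price per unit
# changes by the ratio of the two growth factors.
revenue_growth = 0.302
unit_growth = 0.702
asp_change = (1 + revenue_growth) / (1 + unit_growth) - 1

print(f"ASP ~ ${asp_usd / 1e3:.1f}K")       # ASP ~ $46.6K
print(f"ASP change ~ {asp_change:.1%}")     # ASP change ~ -23.5%
```

Units growing much faster than revenue is what the slide means by growth favoring the low end.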

HPTC 2004 Market Highlights: 

HPTC 2004 Market Highlights. Revenue growth favors the low end: Capability declined 4.0%; Enterprise grew 44.5%; Divisional grew 24.6%; Departmental grew 22.7%; Workgroup grew 64.6%. Everyone suffered in 2001 and 2002; 2003 and 2004 were strong growth years. Slides courtesy of Earl Joseph Jr., IDC.

Developing a New Ecosystem for HPC: 

Developing a New Ecosystem for HPC. The supercomputing community is aware that the current situation is suboptimal for HPC: the “divergence problem” and Blue Planet at NERSC; concern about the “right” benchmarks. The current ecosystem will become untenable after about 2010 in the face of the architectural and software challenges. How are we going to change the ecosystem? What are we going to change it into? DARPA HPCS is on the right track, combining requirements for new architecture, new languages, and insistence on commercialization by vendors. But will a $150M program be enough to change a $6 - 8B industry?

Does Architecture Matter?: 

Does Architecture Matter? YES! High End Computing (HEC) has settled into a relatively stable state in 2005 (architecture/software/systems). There will be significant changes in the next five years that could completely change the HEC ecosystem by 2015. Supercomputing centers must be ready to adapt to this change by fostering diversity in architecture, software, and systems. NSF must foster this change by maintaining a diverse center environment and by re-investing in basic research in architecture, software, and tools for HEC.

HPC Technology Will Evolve and Supercomputer Centers Must Adapt: 

HPC Technology Will Evolve and Supercomputer Centers Must Adapt. 1995-2005: stabilization/standardization; gradual increase in parallelization. 2005-2010: potential for disruptive technology changes; performance limits for applications that cannot adapt to changes. The HPC community and supercomputer centers must: analyze and drive technology changes; explore all technology alternatives; investigate how applications can take advantage of the changes; work with users to adapt and optimize applications.

Custom vs. Commodity:

Custom vs. Commodity. A spectrum from commodity (Linux clusters) to custom (X1). Commodity: inexpensive, performance not tuned for HPC. Custom: expensive, excellent performance. Most systems are a hybrid: T3E: commodity processors, custom interconnect; SP: commodity processors, custom interconnect; XT3: commodity processors, custom interconnect; Altix: commodity processors, custom interconnect. See a pattern? 2005-2010 theme: “accelerators” for commodity processors and interconnects will increase hybridization. The systems likely to be best suited to the general purpose workload are such hybrid systems. Challenge: centers need to determine the best mix of custom and commodity components for effective supercomputer architectures. Custom vs. commodity is a false dichotomy.

Historical Processor Performance: 

Historical Processor Performance Source: “Getting up to speed: The Future of Supercomputing”, NRC, 2004

Processor Trends 2005-2010: 

Processor Trends 2005-2010. Peak HPC performance has increased by 1.8x/year for 10 years: peak processor performance has increased 1.4x/year and the number of processors has increased 1.3x/year. 2005-2010: Moore’s Law (exponential feature size decrease) will continue to hold, but chips will have multiple processors (cores) (will this drive parallel computing from the low end?) and the performance increase of individual processors will be smaller (clock rate, exploiting ILP). Challenge: greatly increased parallelism visible to the user (scaling to 10K processors and more).
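The 1.8x/year system figure is roughly the product of the two component growth rates quoted on the slide. A small sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Annual growth factors quoted on the slide.
processor_growth = 1.4   # peak performance per processor
count_growth = 1.3       # number of processors per system

# Peak system performance scales as (per-processor peak) x (processor count).
system_growth = processor_growth * count_growth
print(f"combined annual growth ~ {system_growth:.2f}x")   # ~1.82x

# Compounded over the ten years the slide refers to.
years = 10
decade_factor = 1.8 ** years
print(f"growth over {years} years at 1.8x/year ~ {decade_factor:.0f}x")
```

If per-processor gains shrink in 2005-2010 as the slide predicts, holding the overall rate would require the processor-count factor to grow instead, which is the parallelism challenge named above.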

Application Status in 2005: 

Application Status in 2005. A few Teraflop/s sustained performance; scaled to 512 - 1024 processors. (Figure: parallel job size at NERSC.)

Parallelism has Stagnated for a Decade: 

Parallelism has Stagnated for a Decade. (Figure: number of processors in the most highly parallel system in the TOP500; labeled systems: ASCI Red, Intel Paragon XP, IBM BG/L.)

Integrated Performance Monitoring (IPM): 

Integrated Performance Monitoring (IPM): brings together multiple sources of performance metrics into a single profile that characterizes the overall performance and resource usage of the application; maintains low overhead by using a unique hashing approach which allows a fixed memory footprint and minimal CPU usage; is open source, relies on portable software technologies, and is scalable to thousands of tasks; developed by David Skinner at NERSC (see http://www.nersc.gov/projects/ipm/).
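The slide does not describe IPM's internals, so the following is only a generic sketch of how a hashed, fixed-footprint profile can work, not IPM's actual implementation. The class `FixedProfile`, the key layout (call name, message size, partner rank), and the table size are all assumptions made for illustration. Each event is hashed into a preallocated table and folded into running statistics, so memory use does not grow with the number of events.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    key: tuple
    count: int = 0
    total: float = 0.0
    t_min: float = float("inf")
    t_max: float = 0.0


class FixedProfile:
    """Hypothetical fixed-size, open-addressing table of timing statistics."""

    def __init__(self, n_slots=1024):
        self.slots = [None] * n_slots  # allocated once, never grows
        self.dropped = 0               # events that found no free slot

    def record(self, call, nbytes, partner, seconds):
        key = (call, nbytes, partner)
        n = len(self.slots)
        start = hash(key) % n
        for probe in range(n):         # linear probing from the hashed slot
            i = (start + probe) % n
            if self.slots[i] is None:
                self.slots[i] = Slot(key)
            slot = self.slots[i]
            if slot.key == key:        # fold the event into running stats
                slot.count += 1
                slot.total += seconds
                slot.t_min = min(slot.t_min, seconds)
                slot.t_max = max(slot.t_max, seconds)
                return
        self.dropped += 1              # table full: count it, do not grow

    def summary(self):
        return {s.key: (s.count, s.total, s.t_min, s.t_max)
                for s in self.slots if s is not None}


# Tiny table so the "full" case is visible.
profile = FixedProfile(n_slots=2)
profile.record("MPI_Send", 1024, 1, 0.5)
profile.record("MPI_Send", 1024, 1, 1.5)
profile.record("MPI_Recv", 1024, 0, 0.25)
profile.record("MPI_Allreduce", 8, -1, 0.1)   # third distinct key: dropped
print(profile.summary())
print("dropped events:", profile.dropped)
```

The design trade-off is that a full table loses detail (counted in `dropped`) rather than allocating more memory, which keeps the footprint bounded regardless of run length.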

Scaling Portability: Profoundly Interesting: 

Scaling Portability: Profoundly Interesting A high level description of the performance of cosmology code MADCAP on four well known architectures. Source: David Skinner, NERSC, IPM project http://www.nersc.gov/projects/ipm/

16 Way for 4 seconds: 

16 Way for 4 seconds (About 20 timestamps per second per task) *( 1…4 contextual variables)

64 way for 12 seconds: 

64 way for 12 seconds

256 way for 36 seconds: 

256 way for 36 seconds

Processor Accelerators: 

Processor Accelerators. Most HPC vendors are looking at ways to improve single-processor performance: memory accelerators (e.g. software-controlled prefetch, cache management); FPGAs; vector processors; game processors/graphics processors; other novel processor architectures. Challenges: the programming model/techniques for accelerators may be different; not all applications/algorithms may benefit; machines will contain both conventional processors and processor accelerators, requiring a hybrid approach to achieve maximum performance.

The Memory Wall: 

The Memory Wall Source: “Getting up to speed: The Future of Supercomputing”, NRC, 2004

Memory Trends: 

Memory Trends. Memory latency: memory latency will continue to improve more slowly than processor speed. Memory bandwidth: memory bandwidth is limited by concurrency/latency; bandwidth problems are magnified by multi-core. Challenges: latency-hiding techniques will become even more important; bandwidth-minimization techniques will become even more important; evolution of codes, languages, compilers, and algorithms will be required just to stay even.
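The statement that bandwidth is limited by concurrency/latency follows from Little's law: sustained bandwidth is roughly the number of bytes in flight divided by the latency of each request. The sketch below works through that relation; the function names and all numbers (8 outstanding misses, 64-byte lines, 100 ns latency, a 20 GB/s target) are illustrative assumptions, not figures from the talk.

```python
def sustained_bandwidth(outstanding_requests, bytes_per_request, latency_s):
    """Little's law: throughput = bytes in flight / latency (bytes per second)."""
    return outstanding_requests * bytes_per_request / latency_s


def required_concurrency(target_bytes_per_s, bytes_per_request, latency_s):
    """Outstanding requests needed to sustain a target bandwidth."""
    return target_bytes_per_s * latency_s / bytes_per_request


# Illustrative numbers only.
bw = sustained_bandwidth(outstanding_requests=8,
                         bytes_per_request=64,
                         latency_s=100e-9)
print(f"sustained bandwidth ~ {bw / 1e9:.2f} GB/s")

needed = required_concurrency(target_bytes_per_s=20e9,
                              bytes_per_request=64,
                              latency_s=100e-9)
print(f"outstanding requests needed for 20 GB/s ~ {needed:.1f}")
```

Under these assumptions, if latency stays flat while more cores share the memory system, the only way to raise bandwidth is more requests in flight, which is why latency hiding appears among the challenges.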

Characterizing Memory Access: 

Characterizing Memory Access Memory Access Patterns/Locality Source: David Koester, MITRE

Apex-MAP characterizes architectures through a synthetic benchmark: 

Apex-MAP characterizes architectures through a synthetic benchmark Source: Erich Strohmaier, NERSC, LBNL
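The slide gives no detail on the benchmark, so what follows is a loose, hypothetical sketch of the general idea behind a locality-parameterized synthetic access stream, based on our understanding of the published Apex-MAP description: block starting addresses are drawn from a power-law distribution controlled by a temporal-locality parameter, and each access touches a contiguous block whose length controls spatial locality. The function name, parameter names, and values are assumptions, and the real benchmark differs in many details.

```python
import random


def locality_indices(mem_size, block_len, alpha, n_blocks, seed=0):
    """Generate a synthetic index stream (sketch, not the real benchmark).

    alpha in (0, 1]: 1.0 gives uniformly random block starts (low temporal
    locality); values near 0 concentrate starts near address 0 (high reuse).
    block_len: contiguous elements touched per access (spatial locality).
    """
    rng = random.Random(seed)
    indices = []
    for _ in range(n_blocks):
        r = rng.random()                               # uniform in [0, 1)
        start = int((mem_size - block_len) * r ** (1.0 / alpha))
        indices.extend(range(start, start + block_len))
    return indices


uniform = locality_indices(mem_size=1 << 20, block_len=16, alpha=1.0, n_blocks=1000)
reused = locality_indices(mem_size=1 << 20, block_len=16, alpha=0.01, n_blocks=1000)

print("mean index, alpha=1.0 :", sum(uniform) / len(uniform))
print("mean index, alpha=0.01:", sum(reused) / len(reused))
```

Timing a loop that reads an array through such index streams, while sweeping the two parameters, yields a performance surface that can be compared across machines.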

Apex-MAP Sequential:

Apex-MAP Sequential. Source: Erich Strohmaier, NERSC, LBNL

Interconnect Trends: 

Interconnect Trends. 1995-2005 marked by: proliferation of interconnects, most based on fat trees or similar; bandwidth improvement slightly worse than processor improvement; latency improvement significantly worse than processor performance improvement. 2005-2010 trends: point-to-point bandwidth will keep up with processor performance, while latency will be relatively worse; full bisection networks may become much more expensive; increasing interest in mesh-based interconnects; optical transceivers remain expensive, and length limits on copper cables will constrain network choices. Challenges: non-fat-tree networks present many challenges and are unproven for a general purpose workload. Which ones will work?

Future Interconnects: 

Future Interconnects. Currently most applications work with a flat MPI model; this is already a simplification. More processors mean more complex interconnects and topology sensitivity. Example: BG/L has five different interconnection networks, with latency dependent on distance.

Slide36: 

Even today’s machines are interconnect topology sensitive. (Figure: four (16 processor) IBM Power 3 nodes with Colony switch.)

Application Topology: 

Application Topology: 1024 way MILC; 1024 way MADCAP; 336 way FVCAM. If the interconnect is topology sensitive, mapping will become an issue (again). “Characterizing Ultra-Scale Applications Communications Requirements”, by John Shalf et al., to be presented at SC05.
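To make the mapping issue concrete, here is a minimal sketch of distance on a 3D torus, the kind of network BG/L uses for point-to-point traffic. The 8 x 8 x 8 dimensions, the row-major rank placement, and the function names are illustrative assumptions, not details from the talk.

```python
def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b on a torus with wraparound."""
    hops = 0
    for ai, bi, n in zip(a, b, dims):
        d = abs(ai - bi)
        hops += min(d, n - d)      # go the short way around each ring
    return hops


def rank_to_coord(rank, dims):
    """Hypothetical row-major placement of MPI ranks onto torus coordinates."""
    x = rank % dims[0]
    y = (rank // dims[0]) % dims[1]
    z = rank // (dims[0] * dims[1])
    return (x, y, z)


dims = (8, 8, 8)                   # illustrative 512-node torus

near = torus_hops(rank_to_coord(0, dims), rank_to_coord(7, dims), dims)
far = torus_hops(rank_to_coord(0, dims), rank_to_coord(292, dims), dims)

print("rank 0 -> rank 7  :", near, "hop(s)")   # wraparound link: 1 hop
print("rank 0 -> rank 292:", far, "hops")      # (4,4,4) is the farthest node: 12 hops
```

Two ranks that exchange messages frequently can therefore be one hop or twelve hops apart depending only on placement, which is why a flat MPI view hides a cost that mapping has to manage.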

Slide38: 

Interconnect Topology BG/L

Interconnect Accelerators: 

Interconnect Accelerators. Most current interconnects are used for MPI send/recv. Manufacturers are investigating interconnect accelerators: RDMA operations (already becoming common); collective operations (reductions) in hardware; protocol offload; support for global shared memory. Challenges/Opportunities: understand which accelerators will most benefit the general purpose workload and work with vendors to implement those features; optimize codes to make use of the features that are delivered.

Operating System Software: 

Operating System Software. The role of Linux has been increasing; by 2010 most HPC systems will be based on Linux. Linux is fragmenting: RedHat, SuSE, Fedora, kernel.org, lab-based distributions, plus microkernels for compute nodes; kernel compatibility and the test matrix are becoming big issues. Vendors are offering horizontally integrated solutions. Challenges/opportunities: a leading-edge computing environment requires integration of software from diverse sources, which may not be thoroughly tested/supported by vendors; assess tradeoffs between local integration and different degrees of vendor integration; GPL enables de-fragmentation.

Custom vs. Commodity is a false dichotomy: 

Custom vs. Commodity is a false dichotomy. Commodity clusters are commodity only in hardware: each “commodity” cluster is unique in its software configuration. Clusters with custom software have been a major drain on resources in supercomputing centers (1 FTE per 256 processors). Market opportunity for big league software players: watch the entry of Microsoft and Apple in 2005/2006. Prediction: the cluster market in 2010 will be dominated by large players that leverage desktop knowledge, e.g. Dell, Lenovo, HP.

Languages: 

Languages. Use of languages with special HPC support has declined since 1995. MPI with C/C++/Fortran has become ubiquitous but has limited ability to adapt to possible architectural changes: ability to hide latency; ability to exploit global shared memory; ability to adapt to hybrid machines. New languages provide the ability to exploit architectural accelerators. Longstanding programmability and maintainability problems are becoming critical path issues. Challenges/Opportunities: To what extent can new languages allow the general purpose workload to take advantage of architectural improvements? To what extent will new languages improve the ability of scientists to innovate? Creating an ecosystem: new languages will require strong institutional support and commitment to succeed.

Power/Cooling: 

Power/Cooling. Most systems in 2000-2005 have been based on commodity packaging/racking/cooling. Power density is increasing rapidly: 75 kW/rack by 2010. A variety of cooling options will be available (air, liquid, liquid assist), potentially with performance impact. Processors will have power-saving features (e.g. turning off part of the chip). Blue Gene approach: a larger number of smaller processors gives similar peak performance for less power. Challenges: understand the implications of power-saving designs for the general purpose workload.

Data: 

Data. 2000-2005: data management issues becoming increasingly important. 2005-2010: technology (primarily software) will not be able to meet needs unless it evolves. Challenges: lack of standardization and performance portability; locating and accessing data is becoming more and more difficult; hierarchical storage management (many different types) will be important to reduce complexity; scientific workflows copy data multiple times (how can technology help?); increasing latency/bandwidth gap between memory and disk storage.

Networking: 

Networking. 1995-2005: WAN: TCP limitations have become a major bottleneck; specialized UDP-based applications are used instead of TCP. LAN: Ethernet everywhere; no more FDDI, HIPPI, etc.; the NERSC LAN is now 10 Gb/s jumbo frame Ethernet. 2005-2010: WAN: long-haul high performance networks are moving to dynamic provisioning of high-speed circuits to avoid packet loss. LAN: Infiniband- or Ethernet-based accelerators may be effective for some LAN functions and provide opportunities to unify system/storage interconnects with the LAN. Challenges/opportunities: WAN: close integration between ISPs (ESnet/Abilene) and end sites is required to take advantage of WAN/MAN infrastructure. LAN: what is the wisest path forward? 40 Gb/s Ethernet? IB? Accelerated Ethernet?
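One common way to see why standard TCP becomes a WAN bottleneck, and why avoiding packet loss matters, is the widely cited Mathis et al. approximation for a single long-lived flow: throughput is at most about (MSS / RTT) * (C / sqrt(p)), with C close to sqrt(3/2). The sketch below evaluates it; the 70 ms round-trip time, the loss rate, and the two segment sizes are illustrative assumptions, not measurements from the talk.

```python
import math


def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis approximation of steady-state TCP throughput, in bits/s."""
    c = math.sqrt(1.5)                       # constant for periodic loss
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


rtt = 0.070          # assumed cross-country round-trip time, seconds
loss = 1e-5          # assumed packet loss probability

standard = tcp_throughput_bps(1460, rtt, loss)   # typical 1500-byte MTU
jumbo = tcp_throughput_bps(8960, rtt, loss)      # typical 9000-byte MTU

print(f"standard frames: ~{standard / 1e6:.0f} Mb/s")
print(f"jumbo frames   : ~{jumbo / 1e6:.0f} Mb/s")
```

Even at a loss rate of one packet in 100,000, a single flow under these assumptions stays far below a 10 Gb/s link, which is consistent with the slide's move toward UDP-based tools and loss-free provisioned circuits.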

Computer Security: 

Computer Security. 1995-2005: large increase in the number and sophistication of attacks; password sniffing a major problem; many computer centers compromised with significant downtime; NERSC has had account compromises but no major system root compromises or significant downtime. 2005-2010: intrusion detection systems that work at 10 Gb/s+ speeds; one-time passwords (OTP) and Personal Identity Verification (PIV). Challenges: OTP is not a panacea: it will improve security, but session hijacking can still occur, and it will decrease user productivity if not properly implemented. As hackers grow more sophisticated, centers must grow more sophisticated as well.

NERSC Response to Technology Challenges: 

NERSC Response to Technology Challenges. Science-Driven Systems: analyze the effectiveness of new technologies several years before the technology becomes available for high-end systems; work with computer vendors to develop new systems and software that address the needs of scientific applications in general and the NERSC workload in particular. Science-Driven Services: provide robust and effective support to users to enable them to tackle the challenges of new technology; develop quantitative models of the NERSC workload. Science-Driven Analytics: deploy new data management and networking technologies in support of growing scientific needs to collect, manage, and analyze data. Collaboration and Partnerships: establish multi-lab, multi-agency government-vendor-university partnerships to facilitate and enhance the approaches above; help to establish ecosystems that facilitate and sustain adoption of new technologies.

NERSC: Science-Driven Computing Strategy 2006 -2010: 

NERSC: Science-Driven Computing Strategy 2006 -2010

Does Architecture Matter?: 

Does Architecture Matter? YES! High End Computing (HEC) has settled into a relatively stable state in 2005 (architecture/software/systems). However, there will be significant changes in the next five years that could completely change the HEC ecosystem by 2015. Supercomputing centers must be ready to adapt to this change by fostering diversity in architecture, software, and systems. NSF can foster this change by maintaining a diverse center environment and by re-investing in basic research in architecture, software, and tools for HEC.

HPC Technology Will Evolve and Supercomputer Centers Must Adapt: 

HPC Technology Will Evolve and Supercomputer Centers Must Adapt. The HPC community and supercomputer centers must: analyze and drive technology changes; explore all technology alternatives; investigate how applications can take advantage of the changes; work with users to adapt and optimize applications. There are too many architectural and systems innovations in development for one solution (one center) to capture them all: architecture diversity; systems diversity; vendor diversity.

Science Driven Architecture Process: 

Science Driven Architecture Process. 50 Tflop/s - 100 Tflop/s sustained performance on applications of national importance. Process: identify applications; identify computational methods used in these applications; identify architectural features most important for performance of these computational methods; work with vendors to introduce these features in future architectures. Reference: Creating Science-Driven Computer Architecture: A New Path to Scientific Leadership (Horst D. Simon, C. William McCurdy, William T.C. Kramer, Rick Stevens, Mike McCoy, Mark Seager, Thomas Zacharia, Jeff Nichols, Ray Bair, Scott Studham, William Camp, Robert Leland, John Morrison, Bill Feiereisen), Report LBNL-52713, May 2003 (see www.nersc.gov/news/reports/HECRTF-V4-2003.pdf).

Slide52: 

Capability Computing Applications in DOE/SC: accelerator modeling; astrophysics; biology; chemistry; climate and earth science; combustion; materials and nanoscience; plasma science/fusion; QCD; subsurface transport.

Slide53: 

Capability Computing Applications in DOE/SC (cont.). These applications and their computing needs have been well studied in the past years: “A Science-Based Case for Large-scale Simulation”, David Keyes, Sept. 2004 (http://www.pnl.gov/scales); “Validating DOE’s Office of Science ‘Capability’ Computing Needs”, E. Barsis, P. Mattern, W. Camp, R. Leland, SAND2004-3244, July 2004; NERSC Greenbook, 2005 (to appear).

Science Breakthroughs Enabled by Leadership Computing Capability:

Science Breakthroughs Enabled by Leadership Computing Capability

How Science Drives Architecture: 

How Science Drives Architecture. State-of-the-art computational science requires increasingly diverse and complex algorithms. Only balanced systems that can perform well on a variety of problems will meet future scientists’ needs! Data-parallel and scalar performance are both important.

Phil Colella’s “Seven Dwarfs”: 

Phil Colella’s “Seven Dwarfs”. Algorithms that consume the bulk of the cycles of current high-end systems: structured grids; unstructured grids; Fast Fourier Transform; dense linear algebra; sparse linear algebra; particles; Monte Carlo.

Does Architecture Matter?: 

Does Architecture Matter? YES! High End Computing (HEC) has settled into a relatively stable state in 2005 (architecture/software/systems). However, there will be significant changes in the next five years that could completely change the HEC ecosystem by 2015. Supercomputing centers must be ready to adapt to this change by fostering diversity in architecture, software, and systems. NSF can foster this change by maintaining a diverse center environment and by re-investing in basic research in architecture, software, and tools for HEC.

Opinion Slide: 

Opinion Slide. One reason why we have failed so far to make a good case for increased funding in supercomputing is that we have not yet made a compelling science case. A better example: “The Quantum Universe”: “It describes a revolution in particle physics and a quantum leap in our understanding of the mystery and beauty of the universe.” http://interactions.org/quantumuniverse/