ACAT 2002 LCG Status

Presentation Transcript

LHC Computing Grid Project:
Creating a Global Virtual Computing Centre for Particle Physics
ACAT’2002, 27 June 2002
Les Robertson, IT Division, CERN
les.robertson@cern.ch

Summary:
LCG – The LHC Computing Grid Project
  requirements, funding, creating a Grid
areas of work
  grid technology
  computing fabrics
  deployment
  operating a grid
Plan for the LCG Global Grid Service
A few remarks

Slide 3:
Funding dictates – Worldwide distributed computing system
Small fraction of the analysis at CERN
Batch analysis – using 12-20 large regional centres
  how to use the resources efficiently
  establishing and maintaining a uniform physics environment
Data exchange and interactive analysis involving tens of smaller regional centres, universities, labs

Summary - Project Goals:
applications - tools, frameworks, environment, persistency
[Diagram labels: computing system; global grid service; cluster; automated fabric; collaborating computer centres; grid; CERN-centric analysis; global analysis environment]
Goal – Prepare and deploy the LHC computing environment
This is not another grid technology project – it is a grid deployment project

Two Phases:
The first phase of the project – 2002-2005
  preparing the prototype computing environment, including support for
    applications – libraries, tools, frameworks, common developments, …
    global grid computing service
  funded by Regional Centres, CERN, special contributions to CERN by member and observer states, middleware developments by national and regional Grid projects
  manpower OK
  hardware at CERN – ~40% funded
Phase 2 – construction and operation of the initial LHC Computing Service – 2005-2007
  at CERN – missing funding of ~80M CHF

Funding:
Funding agencies have little enthusiasm for investing more in particle physics
HEP seen as a ground-breaker in computing
  initiator of the Web
  track record of exploiting leading edge computing
  effective global collaborations
  real need – for data as well as computation
  one of the few application areas with real cross-border data needs
LHC in sync with
  -- emergence of Grid technology
  -- explosion of network bandwidth
We must deliver on Phase 1 for LHC - and show the relevance for other sciences

Building a Grid:
[Diagram labels: Computing Centre; Cluster]
Cluster Fabric: automated management
  installation, configuration, maintenance, monitoring, error recovery, …
  reliability
  cost containment
Cluster Fabric – autonomic computing

The MONARC Multi-Tier Model (1999):
Tier 0 - recording, reconstruction

Building a Grid:
Collaborating Computer Centres

Building a Grid:
Collaborating Computer Centres
The virtual LHC Computing Centre
[Diagram labels: Grid; Alice VO; CMS VO]

Virtual Computing Centre:
The user ---
  sees the image of a single cluster
  does not need to know
    - where the data is
    - where the processing capacity is
    - how things are interconnected
    - the details of the different hardware
  and is not concerned by the conflicting policies of the equipment owners and managers

Project Implementation Organisation:
Four areas
  Applications (see Matthias Kasemann’s presentation)
  Grid Technology
  Fabrics
  Grid deployment

Grid Technology Area – Leveraging Grid R&D Projects:
US projects
European projects
Many national, regional Grid projects -- GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, …
significant R&D funding for Grid middleware
risk of divergence
  and is that good or bad?
global grids need standards
useful grids need stability
hard to do this in the current state of maturity
will we recognise and be willing to migrate to the winning solutions?

Grid Technology Area:
Ensuring that the appropriate middleware is available
Supplied and maintained by the 'Grid projects'
It is proving hard to get the first 'production' data intensive grids going as user services
Can the grid projects provide long-term support and maintenance?
Trade-off between new functionality and stability

The Trans-Atlantic Issue:
Bridging the ATLANTIC is essential for the project
HICB – High Energy and Nuclear Physics Intergrid Collaboration Board
GLUE – Grid Laboratory Uniform Environment
  compatible middleware and infrastructure
  Funded by DataTAG and iVDGL
  Certificates - OK
  Schemas – under way, working with the wider Globus world, getting complicated – probably OK
  Middleware components – not yet clear – but close collaboration on
    File replication
    Job scheduling

Collaboration with Grid Projects:
LCG must deploy a GLOBAL GRID
essential to have compatible middleware & grid infrastructure
  better – have identical middleware
We are banking on GLUE
But we have to make some choices towards the end of the year
Services are about stability, support, maintenance
Can the R&D grid projects take commitments for long term maintenance of their middleware?

Scope of Fabric Area:
Tier 1,2 centre collaboration
Grid-Fabric integration middleware (DataGrid WP4)
Automated systems management package
Technology assessment (PASTA III) started
CERN Tier 0+1 centre

Grid Deployment Area:
The aim is to build a general computing service for a very large user population of independently-minded scientists using a large number of independently managed sites
This is NOT a collection of sites providing pre-defined services
  it is the user’s job that defines the service
  it is current research interests that define the workload
  it is the workload that defines the data distribution
DEMAND - Unpredictable & Chaotic
But the SERVICE had better be Available & Reliable

Grid Deployment – current status:
Experiments can do (and are doing) their event production using distributed resources with a variety of solutions
  classic distributed production – send jobs to specific sites, simple bookkeeping
  some use of Globus, and some of the HEP Grid tools
  other integrated solutions (ALIEN)
The hard problem for distributed computing is data analysis – ESD and AOD
  chaotic workload
  unpredictable data access patterns
this is where new Grid technology is needed
  resource broker, replica management, ..
this is the problem that the LCG has to solve

Grid Operation:
[Diagram labels: User; Network Operations Centre; Local operation; Local user support; Grid Operations Centre; Call Centre; Local site; Grid operations; Grid information service; Virtual Organisation; Grid logging & bookkeeping; queries; monitoring & alarms; corrective actions]

Grid Operation:
We do not know how to do this
Probably nobody knows
  – looks like network operation, but there are many more variables to be watched and adjusted;
  looks like multi-national commercial systems, but we have no central ownership, control
A 24 hour service is needed – round the clock and round the world

Setting up the LHC Global Grid Service:
First data is in 2007
LCG must learn from current solutions, leverage the tools coming from the grid projects, show that grids are useful but set realistic targets
short term (this year):
  use current solutions for physics data challenges (event productions)
  consolidate (stabilise, maintain) middleware
  learn what a 'production grid' really means by working with DataGrid and VDT
medium term (next year):
  Set up a reliable global grid service – initially only a few larger centres, but on three continents
  Stabilise it
  Several times the capacity of the CERN facility and as easy to use

Slide 24:
Having stabilised this base service – showing that we can run a solid service for the experiments
then – progressive evolution –
  integrate all of the Regional Centre resources provided for LHC
  improve quality, reliability, predictability
  integrate new middleware functionality – possibly once per year
  migrate to de facto standards as soon as they emerge

Final comments:
It is not just about distributing computation, it is also about managing distributed data (lots of it!) and maintaining a single view of the environment
All these parallel developments, rapidly changing technology .. may be good in the long term, but we must deploy a global grid service next year
A dependable, reliable 24 X 7 service is essential and not so easy to do with all these sites and all that data
Service Quality is the Key to Acceptance of Grids
Reliable OPERATION will be the factor that limits the size of practical Grids
We are getting funding because of the relevance for other sciences, engineering, business -- keeping things general, main-line must remain a high priority
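
Illustration (not part of the talk): the slides name "resource broker, replica management" as the new Grid technology needed for data analysis, and describe a user who "does not need to know where the data is". A minimal Python sketch of that idea follows. Everything in it is an assumption made for illustration: the `Site` type, the `REPLICA_CATALOGUE` mapping, the site and dataset names, and the "most free CPUs" rule are hypothetical and do not come from the talk or from any grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A hypothetical regional centre with some free batch slots."""
    name: str
    free_cpus: int


# Hypothetical replica catalogue: dataset name -> names of sites holding a copy.
REPLICA_CATALOGUE = {
    "aod-sample": {"site-a", "site-b"},
    "esd-sample": {"site-b"},
}


def choose_site(dataset, sites, catalogue=REPLICA_CATALOGUE):
    """Pick the site with the most free CPUs among those holding the dataset.

    Returns None when no site both holds a replica and has a free CPU.
    """
    holders = catalogue.get(dataset, set())
    candidates = [s for s in sites if s.name in holders and s.free_cpus > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.free_cpus)


sites = [Site("site-a", 10), Site("site-b", 40), Site("site-c", 500)]
job_site = choose_site("aod-sample", sites)
print(job_site.name if job_site else "no site available")
```

The caller names only the dataset; where the job runs is decided from the catalogue and the current load, which is one way to read the "image of a single cluster". A real broker would presumably also have to weigh site policies, network cost and queue lengths, which this sketch ignores.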