servicechallenges

Presentation Transcript

LHC Computing Grid Project - LCG: 

Service Challenges: Why, What, How
27 May 04
Les Robertson, LCG Project Leader
CERN, European Organization for Nuclear Research
Geneva, Switzerland
les.robertson@cern.ch
Adapted from presentation by Ian Bird

Service Challenges: 

Purpose
- Understand what it takes to operate a real grid service: run for days or weeks at a time (outside of experiment Data Challenges)
- Trigger and encourage Tier-1 and large Tier-2 planning: move towards real resource planning based on realistic usage patterns
- Get the essential grid services ramped up to target levels of reliability, availability, scalability, and end-to-end performance
- Set out the milestones needed to achieve these goals during the service challenges

NB: This is focussed on Tier-0 to Tier-1/large Tier-2
- Data management, batch production and analysis

Short-term goal, by end 2004: have in place a robust and reliable data management service and support infrastructure, and robust batch job submission

Ian Bird – ian.bird@cern.ch

Service challenges – examples:

- Data management
  - Networking, file transfer, data management
  - Storage management and interoperability
  - Fully functional storage element (SE)
- Continuous job probes
  - Understand limits
- Operations centres
  - Accounting, assume levels of service responsibility, etc.
  - Hand-off of responsibility (RAL-Taipei-US/Canada)
- "Security incident"
  - Detection, incident response, dissemination and resolution
- User support
  - Assumption of responsibility, demonstrate staff in place, etc.
- VO management
  - Robust and flexible registration, management interfaces, etc.
- Etc.

Data Management – example:

Data management builds on a stack of underlying services:
- Network
- Robust file transfer
- Storage interfaces and functionality
- Replica location service
- Data management tools

Data management – 2 : 

Network layer:
- Proposed set of network milestones already in draft (Peter Clark, David Foster, Harvey Newman)

File transfer service layer:
- Move a file from A to B, with high reliability and target performance
- This service would normally only be visible via the data movement service
- It is the only application that can access, schedule, or control this network
- Examples: gridftp, bbftp, etc.
- Reliability: the service must detect failure, retry, etc.
- Interfaces to storage systems (SRM)

The US-CMS/CERN "Edge Computing" project might be an instance of this layer (network + file transfer)
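
The reliability requirement on the file transfer layer (detect failure, retry) could be sketched roughly as follows. This is only an illustration under stated assumptions: `reliable_transfer`, `TransferError`, and the `copy` and `verify` callables are hypothetical names, not part of gridftp, bbftp, or any LCG software. In practice `copy` would wrap whatever transfer tool is in use and `verify` might compare sizes or checksums.

```python
import time
from typing import Callable, Optional


class TransferError(Exception):
    """Raised when a file could not be moved after all attempts."""


def reliable_transfer(
    copy: Callable[[str, str], None],     # hypothetical: wraps the real transfer tool
    verify: Callable[[str, str], bool],   # hypothetical: e.g. size or checksum comparison
    src: str,
    dst: str,
    max_attempts: int = 3,
    backoff_s: float = 0.0,
) -> int:
    """Move src to dst, retrying on failure; return the number of attempts used."""
    last_error: Optional[BaseException] = None
    for attempt in range(1, max_attempts + 1):
        try:
            copy(src, dst)
            if verify(src, dst):
                return attempt
            # The copy returned normally but the result is wrong: treat as a failure.
            last_error = TransferError("verification failed")
        except OSError as exc:
            # Failure detected by the transfer itself (network down, disk full, ...).
            last_error = exc
        if attempt < max_attempts:
            # Linear backoff before the next try.
            time.sleep(backoff_s * attempt)
    raise TransferError(
        f"{src} -> {dst} failed after {max_attempts} attempts"
    ) from last_error
```

Passing the copy and verify steps in as callables keeps the retry logic independent of the transfer tool, which matches the slide's point that this layer is normally hidden behind the data movement service.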

Data management – 3 : 

Data movement service layer:
- Builds on top of file transfer and network layers
- Provides an absolutely reliable and dependable service with good performance
- Implements queuing, priorities, etc.
- Initiates file transfers using the file transfer service
- Acts on the application's behalf: a file handed to the service will be guaranteed to arrive

Replica Management Service:
- Makes use of data movement
- Maintains distributed grid catalogue
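
The queuing and priority behaviour of a data movement layer might look something like the sketch below. It is an assumption-laden illustration, not the LCG design: `MovementQueue`, `submit`, and `drain` are hypothetical names, the queue lives in memory only, and `transfer` stands in for the file transfer service underneath. A service that really guarantees arrival would need a persistent queue and would escalate to an operator rather than give up.

```python
import heapq
import itertools
from typing import Callable, Dict, List, Tuple


class MovementQueue:
    """In-memory priority queue of pending file movements (illustrative only)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, str]] = []
        # Tie-breaker so equal-priority requests are served in submission order.
        self._order = itertools.count()

    def submit(self, src: str, dst: str, priority: int = 10) -> None:
        """Queue a movement request; a lower number means higher priority."""
        heapq.heappush(self._heap, (priority, next(self._order), src, dst))

    def drain(
        self,
        transfer: Callable[[str, str], bool],  # hypothetical: returns True on success
        max_attempts: int = 3,
    ) -> Tuple[List[str], List[str]]:
        """Process every queued request; return (delivered, abandoned) sources."""
        attempts: Dict[Tuple[str, str], int] = {}
        delivered: List[str] = []
        abandoned: List[str] = []
        while self._heap:
            priority, _, src, dst = heapq.heappop(self._heap)
            key = (src, dst)
            attempts[key] = attempts.get(key, 0) + 1
            if transfer(src, dst):
                delivered.append(src)
            elif attempts[key] < max_attempts:
                # Re-queue at the same priority, behind requests already waiting.
                self.submit(src, dst, priority)
            else:
                abandoned.append(src)
        return delivered, abandoned
```

The application only calls `submit`; initiating transfers and retrying them is the service's job, which is the "acts on the application's behalf" role described above.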

Job probes – example:

- Continuous flood of jobs
  - Fill all resources
  - Use as probes: test if they can use the resources (data access, CPU, etc.)
- Understand limitations, bottlenecks of the system
  - Baseline measurement, find limits, build and improve
- This might be a function of the GOC
  - Overseen by RAL-Taipei-+ collaboration?
- A challenge might run for a week
  - Outside of experiment data challenges
  - In parallel with (or part of) data management or other challenges
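
One way to picture the probe idea is a small harness that runs the same set of checks against every site and records what failed and how long each check took. The sketch below is hypothetical throughout: `run_probes`, `failing`, the check names, and the site names in the usage note are invented for illustration, and real probes would be grid jobs submitted to each site rather than local function calls.

```python
import time
from typing import Callable, Dict, List, Tuple


def run_probes(
    sites: List[str],
    checks: Dict[str, Callable[[str], bool]],  # hypothetical: check name -> probe function
) -> Dict[str, Dict[str, dict]]:
    """Run every check against every site and record outcome and duration."""
    report: Dict[str, Dict[str, dict]] = {}
    for site in sites:
        report[site] = {}
        for name, check in checks.items():
            start = time.perf_counter()
            try:
                ok, error = bool(check(site)), None
            except Exception as exc:
                # A probe that crashes counts as a failed probe, not a failed run.
                ok, error = False, repr(exc)
            report[site][name] = {
                "ok": ok,
                "seconds": time.perf_counter() - start,
                "error": error,
            }
    return report


def failing(report: Dict[str, Dict[str, dict]]) -> List[Tuple[str, str]]:
    """List (site, check) pairs that did not pass, sorted for stable output."""
    return sorted(
        (site, name)
        for site, results in report.items()
        for name, result in results.items()
        if not result["ok"]
    )
```

For example, calling `run_probes(["site-a", "site-b"], {"cpu": ..., "data_access": ...})` at regular intervals over a week-long challenge would give a baseline, and the `seconds` field gives a first handle on where the bottlenecks are.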

Summary: 

Service challenges:
- Understand what it really takes to operate reliable and performant services
- Put in place underpinnings of a reliable infrastructure by the end of the year

Requires:
- Agreed milestones
- Commitment of resources and people