ATLAS ppdg short george
Added: December 03, 2007

Presentation Transcript

GRID Tools In ATLAS Production
- Production
- Features
- First steps
- Plans

ATLAS Data Challenge
- ATLAS DC1 phase 1 starts this May
- 10^8 generator events (all produced at CERN)
- 10^7 Geant3 detector response events (atlsim framework)
- 10^7 reconstructed events (Athena framework)
- Data production is for physics purposes (Trigger TDR)
- 2/3 of data produced outside of CERN
- Production on a global scale: Asia, Australia, Europe and North America; 19 countries

ATLAS jobs
Four distinct classes:
- Event generation: CPU 100 SI95, output 20 kB, no database needed
- Detector simulation: CPU 16,000 SI95, output 2 MB, geometry parameter database needed (MySQL)
- Reconstruction: CPU 2000 SI95, output 500 kB; geometry, alignment and calibration databases
- Analysis: CPU 100 SI95, output 50 kB (Ntuple); many different event components (HES…), DB

GRID Tool Evaluation
Gradual implementation plan:
- Use whatever is available now: Magda, Pacman, Condor-G
- Suppose the tools are there (imitate them when necessary)
- Collect as many components as possible and try to classify them by their functionality
- Start with event generation and detector simulation (Atlsim+Dice)

Atlsim-Dice Production Status
- Objectivity, ROOT, MySQL interfaces implemented
- Multi-processor run with common input
- Typical input may contain many thousands of similar physics events
- Atlsim jobs are able to maintain a local DB to process common input coherently
- Optimal processing time per job is about 24 hours
- Typical output file size for 170 – 320 events (with hits and digits) is 200 – 300 Mbytes

Pre-VDC Experience
- Recipes for producing the data (jobOptions, kumacs) have to be fully tested
- Preparing production recipes takes time and effort; the recipes encapsulate considerable knowledge and infrastructure dependencies
- Once you have the recipes, data production is straightforward
- After the data have been produced, what do we do with the developed recipes?
- Data are primary, recipes are secondary

Virtual Data Perspective
- Recipes are as valuable as the data
- Production recipes are virtual data
- Recipes are primary, data are secondary: if you have the recipes you can reproduce the data
- Do not throw away the recipes, save them (in VDC)
- Recipes should be encapsulated in VD Objects

VDC Status in DC1
- Approved as an R&D activity (parallel to the production scripts not using VDC)
- Templated jobOptions approach was used for generator events production
- USA site (BNL) will use VDC in simulation production transformation
- Participants from Canada and UK expressed interest in using VDC-based scripts

Production Policies in VDC
- Allocation of unique event IDs, implementing the event ID allocation policy
- Allocation of random number seeds, providing a unified random number seed allocation policy
- Support for automatic generation of jobs
- Unique partition numbering
- Encapsulation of environment variables

VDC database backend
VDC guarantees uniqueness of:
- event ID
- output PFN
- random number seeds
This was difficult with a non-VDC “perl script” approach in a massively parallel production environment.

VDC Integration in Production
The production system is extended in DC1 with features provided by a few “orthogonal” VDC components:
- Data reproducibility: SIGNATURE (application software version)
- Grid dimension: LOCATION (site)
- Application complexity: application CONFIGURATION

VDC Integration
VDC-based automatic “garbage collection”:
- Agents (jobs) get the next derivation from VDC
- After the data has been materialized, agents register “success” in VDC
- If some previous invocation has not been completed within the specified timeout period, it is invoked again
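The agent loop on the "VDC Integration" slide (get the next derivation, register success, re-invoke after a timeout) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the slides do not describe the VDC interface, so `ToyCatalog`, `Derivation`, `next_derivation`, `register_success` and the derivation names are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Derivation:
    """One recipe invocation tracked by the catalog (hypothetical record)."""
    name: str
    claimed_at: Optional[float] = None  # time an agent last took it, if any
    done: bool = False


@dataclass
class ToyCatalog:
    """In-memory stand-in for the VDC; not the real VDC interface."""
    timeout: float
    derivations: Dict[str, Derivation] = field(default_factory=dict)

    def add(self, name: str) -> None:
        self.derivations[name] = Derivation(name)

    def next_derivation(self, now: float) -> Optional[str]:
        """Hand out an unclaimed derivation, or one whose claim has timed out."""
        for d in self.derivations.values():
            if d.done:
                continue
            expired = d.claimed_at is not None and now - d.claimed_at >= self.timeout
            if d.claimed_at is None or expired:
                d.claimed_at = now
                return d.name
        return None

    def register_success(self, name: str) -> None:
        """Called by an agent after the data has been materialized."""
        self.derivations[name].done = True


catalog = ToyCatalog(timeout=10.0)
catalog.add("sim-0001")
catalog.add("sim-0002")

first = catalog.next_derivation(now=0.0)    # agent A takes sim-0001
second = catalog.next_derivation(now=1.0)   # agent B takes sim-0002
catalog.register_success(second)            # agent B finishes; agent A never reports
idle = catalog.next_derivation(now=5.0)     # nothing to hand out before the timeout
retry = catalog.next_derivation(now=11.0)   # sim-0001 has timed out, handed out again
```

In this model a lost job needs no separate cleanup step: its claim simply expires and the next agent that asks picks the derivation up again, which is one way to read the "garbage collection" wording on the slide.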
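The uniqueness guarantees listed under "VDC database backend" (event IDs, random number seeds, partition numbers) are the kind of thing a database can enforce centrally. The sketch below shows the idea with Python's built-in `sqlite3` module and an in-memory database. The table layout, the column names, the `allocate` helper and the sequential seed policy are assumptions made for illustration; the slides do not give the actual VDC schema or allocation policy.

```python
import sqlite3
from typing import Tuple


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory toy catalog; table and column names are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE partitions ("
        " partition_no INTEGER PRIMARY KEY AUTOINCREMENT,"
        " first_event INTEGER NOT NULL UNIQUE,"
        " n_events INTEGER NOT NULL,"
        " seed INTEGER NOT NULL UNIQUE)"
    )
    return conn


def allocate(conn: sqlite3.Connection, n_events: int) -> Tuple[int, int, int]:
    """Reserve the next partition; returns (partition_no, first_event, seed)."""
    with conn:  # commit on success, roll back if the insert fails
        first_event, seed = conn.execute(
            "SELECT COALESCE(MAX(first_event + n_events), 1),"
            " COALESCE(MAX(seed), 0) + 1 FROM partitions"
        ).fetchone()
        cur = conn.execute(
            "INSERT INTO partitions (first_event, n_events, seed) VALUES (?, ?, ?)",
            (first_event, n_events, seed),
        )
        return cur.lastrowid, first_event, seed


conn = open_catalog()
p1 = allocate(conn, 200)  # first partition: events 1..200
p2 = allocate(conn, 300)  # second partition starts right after the first
```

The `UNIQUE` constraints are what carry the guarantee here: an attempt to record a duplicate starting event ID or seed is rejected by the database instead of being silently accepted, which is hard to ensure when many independent scripts each compute their own numbers.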