Experiences of Submitting UKDMC and LISA GEANT4 Jobs
FIRST, I should say thanks for the opportunity to use the growing resource of the GRID, particularly for Particle Astrophysics (LISA and UKDMC), which has no direct funding to (or from) GRIDPP
It is increasingly clear that we NEED the GRID in order to carry out accurate simulations for both LISA and Dark Matter, whether investigating signals or possible backgrounds at micro- and macroscopic levels. This is specifically processor-intensive work
I am a novice: only 5 weeks' usage of the GRID
Experiences of Submitting UKDMC and LISA GEANT4 Jobs: Experimental configuration
Dark Matter
LISA
Output of >100 Jobs
Benefits
Comments on Functionality and UI
Slide3: UKDMC Experiment
Two-Phase Liquid Xenon, ZEPLIN III: ZEPLIN III, our near-future detector, will offer extreme levels of signal-to-background discrimination. Interpretation will only be possible with full Monte Carlo studies (a novelty for DM)
Prototype Simulation: Full Lab Geometry
GEANT 4: Due to the need to develop Monte Carlo simulations for Dark Matter experiments, I have become involved in the development of Geant4, particularly exploiting the low-energy, radioactive-decay and neutron extensions of the toolkit (see the advanced example DMX within the release package)
From basic simulations of our prototype system it is clear that greater computing power is required in order to produce high statistics and accurately model our detectors.
In addition it is envisaged to develop a simulation of the underground environment, UNEX, to give accurate spectra and particle types within the experimental area.
Furthermore, Imperial is also involved in LISA, a gravitational-wave experiment, where charging of the proof masses becomes critical. Simulating the charging rate requires a large number of cosmic-ray events, with rare hadronic showers resulting in residual charge
One High Energy Event: [figure: event display, with labels LXe, GXe, PMT, mirror, source]
Neutrons: [figure: neutron interactions, with labels room, elastic, inelastic, outside]
LISA/STEP: Gravitational wave experiment
STEP = test of the equivalence principle
Both rely on floating proof masses with no electrical connections
prone to charging effects from cosmic rays
However, the charging rate is relatively low (~1 in 5000 particles)
LISA Geometry and Geant4 Images
OUTPUT from Grid Running: Over 100 jobs have been run on the grid to estimate the charging rate in the proof mass and to test different cuts and processes within Geant4.
1.9 million events have been run in ~300 hours of CPU time
The preliminary outcome is as follows:
6secs and Convergence of Charging Rate: Initial indications of charging rate
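To illustrate the statistics behind the convergence of the charging rate, a minimal sketch in Python. The numbers are the approximate figures quoted in these slides (~1.9 million primaries, charging at roughly 1 in 5000 particles) and are illustrative only, not the final result:

```python
import math

# Approximate figures from the slides (illustrative, not the final result):
# ~1.9 million primaries simulated, charging events at roughly 1 in 5000.
n_events = 1_900_000
n_charged = n_events // 5000   # hypothetical count consistent with ~1-in-5000

rate = n_charged / n_events                 # charging rate per primary
rate_err = math.sqrt(n_charged) / n_events  # Poisson counting error on the rate

# Relative precision improves only as 1/sqrt(N_charged), which is why
# the estimate converges slowly and needs very large numbers of primaries.
rel_precision = 1.0 / math.sqrt(n_charged)

print(f"rate = {rate:.2e} +/- {rate_err:.2e} per primary "
      f"({100 * rel_precision:.1f}% relative)")
```

With only a few hundred charging events among millions of primaries, the relative error is still at the several-percent level, which is what makes this processor-intensive work well suited to the GRID.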
GRID JOB SUBMISSION – My Experience: With the auspicious title of DataGrid User…
After running >100 jobs I have had some experience of running jobs on the DataGrid – mostly good
However, there are a few things that, if implemented, would be useful; their apparent unavailability may simply reflect the youthful nature of the GRID, so they may already be present, at least in the design… (and some of these gaps may be due to my own ignorance)
Things Missing, apparently (1): Status of job – events run, near completion?
Run-time partial grab of output files to check on a job: RB release 1.4 (Dave’s talk)
Length of identifier – cumbersome
Saving identifiers to file, to ease management of many jobs
Request output to be saved to file automatically when job completed
Proxy expiration and file loss – can protect against it, but can occur
File back up – prevent losses when things crash, and therefore reduce number of repeat jobs
Job clearing and file clearing – particularly if job crashes/disappears
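Much of the list above concerns bookkeeping for many jobs at once. As a sketch of the workflow being described, assuming the EDG Job Description Language of the period, a job description for one of these Geant4 runs might look something like the following (the file names and wrapper script are hypothetical, for illustration only):

```
# geant4-lisa.jdl (hypothetical) -- EDG Job Description Language sketch
Executable    = "run_geant4.sh";           # assumed wrapper script
Arguments     = "macro_run42.mac";         # assumed Geant4 macro file
StdOutput     = "run42.out";
StdError      = "run42.err";
InputSandbox  = {"run_geant4.sh", "macro_run42.mac"};
OutputSandbox = {"run42.out", "run42.err", "charging_rate.dat"};
```

Job identifiers returned at submission could then be collected in a file and fed back to the status and output-retrieval commands in bulk, which is the kind of many-job management these bullets are asking for.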
Things Missing, apparently (2): Things Missing, apparently (2) Diagnostics
memory usage (single event leaks) – max/average
CPU Time – can access at runtime with Globus
Disc access = efficiency of local staging, etc…
Forced killing of jobs? Clearing of files or keeping of partial files – cancelling jobs loses everything
Run time limit/Disk Usage/Memory Usage
in case of problems/diagnostic?
Node limit
Batch script to run jobs sequentially without clogging up the farm – from proxy request?
Shared Disc for data? – Input files are ~500 Mbytes and copied 32 times…
Things Missing, apparently (3): What decides resource management? Queued at IC or at RAL – processor speed? Disc transfer time?
Jobs cleared before output is retrieved (RB dies)
Housekeeping/cleaning of tmp files
Script to save output to your account without 3rd party access?
Prone to abuse?
Tidy up – clear up dangling jobs and tmp files for a given user
Things Missing, apparently (4): Reliability?
Compilers and inter-site homogeneity
IC = egcs 1.1.2 whereas RAL = gcc 2.95.2
Level of resource available and average usage/users/CPU power – to stage requests, think about optimising the problem, or look elsewhere
More nodes with my VO would naturally be helpful
IC = 16 nodes (user limit 14)
RAL = 8 nodes
elsewhere?
Apologies if some of these comments are due to me…
Conclusions: The GRID is clearly a very powerful resource that has enabled me to run a lot of jobs in a very short space of time
It is clear that Dark Matter and Spacecraft charging studies at this time NEED the GRID - particularly for accurate Monte Carlo simulations of future detectors (ZEPLIN III) and Spacecraft charging rates (LISA/STEP)
In running jobs, some things could perhaps be more elegant/convenient to use, but on the whole it is not too difficult