ALICE Computing Model : F. Carminati
BNL Seminar
March 21, 2005
Offline framework : AliRoot in development since 1998
Entirely based on ROOT
Used since the detector TDR’s for all ALICE studies
Two packages to install (ROOT and AliRoot)
Plus MC’s
Ported on most common architectures
Linux IA32, IA64 and AMD, Mac OS X, Digital Tru64, SunOS…
Distributed development
Over 50 developers and a single CVS repository
2/3 of the code developed outside CERN
Tight integration with DAQ (data recorder) and HLT (same code-base)
Wide use of abstract interfaces for modularity
“Restricted” subset of C++ used for maximum portability
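The "abstract interfaces for modularity" point can be illustrated with a small sketch: user code depends on one abstract interface, and concrete engines are swapped behind it. This is written in Python for brevity and is not AliRoot's C++ API; every name here (`TransportEngine`, `ToyEngineA`, `ToyEngineB`, `run_simulation`) is hypothetical.

```python
from abc import ABC, abstractmethod


class TransportEngine(ABC):
    """Hypothetical abstract interface; not an AliRoot class."""

    @abstractmethod
    def name(self) -> str:
        """Return a short label for the engine."""

    @abstractmethod
    def transport(self, n_particles: int) -> int:
        """Pretend to transport particles; return the number processed."""


class ToyEngineA(TransportEngine):
    def name(self) -> str:
        return "toy-A"

    def transport(self, n_particles: int) -> int:
        return n_particles


class ToyEngineB(TransportEngine):
    def name(self) -> str:
        return "toy-B"

    def transport(self, n_particles: int) -> int:
        return n_particles


def run_simulation(engine: TransportEngine, n_particles: int) -> str:
    # User code sees only the abstract interface, so engines are interchangeable.
    done = engine.transport(n_particles)
    return f"{engine.name()} transported {done} particles"


print(run_simulation(ToyEngineA(), 3))
print(run_simulation(ToyEngineB(), 5))
```

Swapping the engine changes one constructor call and nothing else in the calling code, which is the modularity benefit the slide refers to.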
AliRoot layout : [Diagram labels] ROOT, AliRoot, STEER, Virtual MC, G3, G4, FLUKA, HIJING, MEVSIM, PYTHIA6, PDF, EVGEN, HBTP, HBTAN, ISAJET, AliEn/gLite, EMCAL, ZDC, ITS, PHOS, TRD, TOF, RICH, ESD, AliAnalysis, AliReconstruction, PMD, CRT, FMD, MUON, TPC, START, RALICE, STRUCT, AliSimulation
Software management : Regular release schedule
Major release every six months, minor release (tag) every month
Emphasis on delivering production code
Corrections, protections, code cleaning, geometry
Nightly produced UML diagrams, code listings, coding rule violations, builds and tests; single repository with all the code
No version management software (we have only two packages!)
Advanced code tools under development (collaboration with IRST/Trento)
Smell detection (already under testing)
Aspect oriented programming tools
Automated genetic testing
ALICE Detector Construction Database (DCDB) : Specifically designed to aid detector construction in a distributed environment:
Sub-detector groups around the world work independently
All data are collected in a central repository and used to move components from one sub-detector group to another, and during the integration and operation phases at CERN
Multitude of user interfaces:
WEB-based for humans
LabView, XML for laboratory equipment and other sources
ROOT for visualisation
In production since 2002
A very ambitious project with important spin-offs
Cable Database
Calibration Database
The Virtual MC :
TGeo modeller :
Results : [Plot labels] Geant3, FLUKA; HMPID; 5 GeV pions
ITS – SPD: Cluster Size : PRELIMINARY!
Reconstruction strategy : Main challenge - reconstruction in the high flux environment (occupancy in the TPC up to 40%) requires a new approach to tracking
Basic principle – Maximum information approach
Use everything you can and you will get the best result
Algorithms and data structures optimized for fast access and usage of all relevant information
Localize relevant information
Keep this information until it is needed
Tracking strategy – Primary tracks : Incremental process
Forward propagation towards the vertex: TPC → ITS
Back propagation: ITS → TPC → TRD → TOF
Refit inward: TOF → TRD → TPC → ITS
Continuous seeding
Track segment finding in all detectors
Combinatorial tracking in ITS
Weighted two-track χ² calculated
Effective probability of cluster sharing
Probability not to cross given layer for secondary particles
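The order in which the three tracking passes visit the detectors can be written down compactly. This is only a bookkeeping sketch of the pass order named on this slide; it does no actual track fitting, and the function name `tracking_passes` is hypothetical.

```python
# Detector order from the innermost (ITS) to the outermost (TOF).
INNER_TO_OUTER = ["ITS", "TPC", "TRD", "TOF"]


def tracking_passes():
    """Return the three passes as (label, detector sequence) pairs."""
    forward = ["TPC", "ITS"]                  # first pass, towards the vertex
    back = list(INNER_TO_OUTER)               # second pass, outward
    refit = list(reversed(INNER_TO_OUTER))    # third pass, inward refit
    return [
        ("forward propagation", forward),
        ("back propagation", back),
        ("refit inward", refit),
    ]


for label, detectors in tracking_passes():
    print(f"{label}: {' -> '.join(detectors)}")
```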
Tracking & PID : PIV 3 GHz (dN/dy – 6000)
TPC tracking ~ 40 s
TPC kink finder ~ 10 s
ITS tracking ~ 40 s
TRD tracking ~ 200 s
[Plot labels] TPC; ITS+TPC+TOF+TRD
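As a back-of-envelope check, the quoted per-step times can be added up. The sketch assumes the four steps run one after another on the same machine and that the approximate values on the slide can simply be summed; neither assumption is stated in the source.

```python
# Per-event times in seconds, copied from the slide (approximate values).
step_seconds = {
    "TPC tracking": 40,
    "TPC kink finder": 10,
    "ITS tracking": 40,
    "TRD tracking": 200,
}

total = sum(step_seconds.values())
trd_share = step_seconds["TRD tracking"] / total

print(f"total ~ {total} s per event, TRD share ~ {trd_share:.0%}")
```

Under these assumptions the TRD step accounts for roughly two thirds of the quoted tracking time.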
Condition and alignment : Heterogeneous information sources are periodically polled
ROOT files with condition information are created
These files are published on the Grid and distributed as needed by the Grid DMS
Files contain validity information and are identified via DMS metadata
No need for a distributed DBMS
Reuse of the existing Grid services
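The idea of finding condition files through validity metadata rather than a distributed DBMS can be sketched as a lookup over catalogue records. The record layout, the file names, and the "highest version wins" rule are all assumptions made for illustration; the source does not describe the actual metadata schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ConditionFile:
    """Hypothetical metadata record for one condition file."""
    lfn: str          # logical file name (illustrative)
    detector: str
    first_run: int    # validity range, inclusive
    last_run: int
    version: int


def find_condition(catalogue: List[ConditionFile], detector: str,
                   run: int) -> Optional[ConditionFile]:
    """Return the highest-version file valid for this detector and run."""
    valid = [c for c in catalogue
             if c.detector == detector and c.first_run <= run <= c.last_run]
    return max(valid, key=lambda c: c.version) if valid else None


# Illustrative catalogue content; these paths are made up.
catalogue = [
    ConditionFile("/cond/TPC/gain_v1.root", "TPC", 1, 100, 1),
    ConditionFile("/cond/TPC/gain_v2.root", "TPC", 50, 100, 2),
    ConditionFile("/cond/ITS/align_v1.root", "ITS", 1, 200, 1),
]

print(find_condition(catalogue, "TPC", 60))
```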
External relations and DB connectivity : [Diagram labels] DAQ, Trigger, DCS, ECS, HLT, DCDB, Physics data, APIs, calibration procedures, calibration files, AliRoot Calibration classes, files
AliEn → gLite: metadata, file store
From URs: source, volume, granularity, update frequency, access pattern, runtime environment and dependencies
API – Application Program Interface
Relations between DBs not final, not all shown
Call for UR sent to subdetectors
Metadata : MetaData are essential for the selection of events
We hope to be able to use the Grid file catalogue for one part of the MetaData
During the Data Challenge we used the AliEn file catalogue for storing part of the MetaData
However these are file-level MetaData
We will need additional event-level MetaData
This can be simply the TAG catalogue with externalisable references
We are discussing this subject with STAR
We will take a decision soon
We would prefer that the Grid scenario be clearer
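The difference between file-level and event-level MetaData can be made concrete with a toy tag catalogue: each tag carries a few summary values plus a reference (file, event index) that can be resolved later. The field names and the multiplicity cut are hypothetical and are not taken from any ALICE or STAR design.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class EventTag:
    """Hypothetical event-level tag: summary values plus a reference."""
    lfn: str            # file holding the event (file-level catalogue key)
    event_index: int    # position of the event inside that file
    multiplicity: int   # illustrative selection variable


def select_events(tags: Iterable[EventTag],
                  min_multiplicity: int) -> List[Tuple[str, int]]:
    """Return (file, event index) references for events passing the cut."""
    return [(t.lfn, t.event_index)
            for t in tags if t.multiplicity >= min_multiplicity]


# Made-up tags for two files.
tags = [
    EventTag("/prod/run1/file_a.root", 0, 1200),
    EventTag("/prod/run1/file_a.root", 1, 5400),
    EventTag("/prod/run1/file_b.root", 0, 6100),
]

print(select_events(tags, min_multiplicity=5000))
```

Only the files that contain selected events need to be opened, which is the point of keeping event-level tags next to the file catalogue.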
ALICE CDC’s :
Use of HLT for monitoring in CDC’s : [Diagram labels] AliRoot Simulation, Digits, Raw Data, LDC (×4), GDC, Event builder, alimdc, ROOT file, CASTOR, AliEn, Monitoring, HLT Algorithms, ESD, Histograms
ALICE Physics Data Challenges :
PDC04 schema : [Diagram labels] CERN, Tier1 (×2), Tier2 (×2), AliEn job control, Data transfer
Production of RAW
Shipment of RAW to CERN
Reconstruction of RAW in all T1’s
Analysis
Phase 2 principle : Mixed signal
Simplified view of the ALICE Grid with AliEn : [Diagram labels] ALICE VO – central services, Central Task Queue, Job submission, File Catalogue, Configuration, Accounting, User authentication, Workload management, Job Monitoring, Storage volume manager, Data Transfer, AliEn Site services, Computing Element, Storage Element, Cluster Monitor, Existing site components, Local scheduler, Disk and MSS, ALICE VO – Site services integration
Site services : Unobtrusive – entirely in user space:
Single user account
All authentication already assured by central services
Tuned to the existing site configuration – supports various schedulers and storage solutions
Running on many Linux flavours and platforms (IA32, IA64, Opteron)
Automatic software installation and updates (both service and application)
Scalable and modular – different services can be run on different nodes (in front/behind firewalls) to preserve site security and integrity:
Load balanced file transfer nodes (on HTAR)
CERN firewall solution for large volume file transfers
[Diagram labels] Firewall; ONLY high ports (50K-55K) for parallel file transport; CERN Intranet; AliEn Data Transfer; AliEn other services
Slide23 : Log files, application software storage; 1TB SATA disk server
Phase 2 job structure : Task - simulate the event reconstruction and remote event storage
[Diagram labels] Central servers: master job submission, Job Optimizer (N sub-jobs), RB, File catalogue, processes monitoring and control, SE…
Sub-jobs, job processing, CEs, AliEn-LCG interface, RB
Storage: CERN CASTOR (underlying events), local SEs (primary copy), CERN CASTOR (backup copy)
Underlying event input files, output files, zip archive of output files
Register in AliEn FC: LCG SE: LCG LFN = AliEn PFN; edg(lcg) copy&register; File catalogue
Completed Sep. 2004
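The master-job pattern (one submission that an optimizer splits into N sub-jobs, each returning one archive of output files) can be sketched as below. This is not the AliEn Job Optimizer; the splitting rule (fixed number of input files per sub-job), the dictionary layout, and the archive naming are assumptions made for illustration.

```python
from typing import Dict, List


def split_master_job(event_files: List[str], files_per_subjob: int) -> List[Dict]:
    """Split the input list of a master job into sub-job descriptions."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    subjobs: List[Dict] = []
    for start in range(0, len(event_files), files_per_subjob):
        subjob_id = len(subjobs) + 1
        subjobs.append({
            "subjob_id": subjob_id,
            "inputs": event_files[start:start + files_per_subjob],
            # One zip archive of output files per sub-job (hypothetical name).
            "output_archive": f"subjob_{subjob_id:04d}.zip",
        })
    return subjobs


# Made-up input file names.
inputs = [f"underlying_{i}.root" for i in range(5)]
for job in split_master_job(inputs, files_per_subjob=2):
    print(job)
```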
Production history : ALICE repository – history of the entire DC
~ 1 000 monitored parameters:
Running, completed processes
Job status and error conditions
Network traffic
Site status, central services monitoring
….
7 GB data
24 million records with 1 minute granularity – analysed to improve GRID performance
Statistics
400 000 jobs, 6 hours/job, 750 MSi2K hours
9M entries in the AliEn file catalogue
4M physical files at 20 AliEn SEs in centres world-wide
30 TB stored at CERN CASTOR
10 TB stored at remote AliEn SEs + 10 TB backup at CERN
200 TB network transfer CERN –> remote computing centres
AliEn efficiency observed >90%
LCG observed efficiency 60% (see GAG document)
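A few derived numbers follow from the statistics above by simple arithmetic. The sketch assumes that "6 hours/job" is an average per job and that "M" means 10^6; both are readings of the slide, not statements from it.

```python
jobs = 400_000
hours_per_job = 6            # assumed to be an average per job
msi2k_hours = 750            # MSi2K hours, as quoted

job_hours = jobs * hours_per_job                      # total job hours
si2k_per_job = msi2k_hours * 1_000_000 / job_hours    # implied average SI2K rating per running job

catalogue_entries = 9_000_000
physical_files = 4_000_000
entries_per_file = catalogue_entries / physical_files

print(f"{job_hours:,} job hours")
print(f"~{si2k_per_job} SI2K per running job")
print(f"~{entries_per_file} catalogue entries per physical file")
```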
Job repartition : Jobs (AliEn/LCG): Phase 1 – 75/25%, Phase 2 – 89/11%
More operation sites added to the ALICE GRID as PDC progressed [chart labels: Phase 1, Phase 2]
17 permanent sites (33 total) under AliEn direct control and additional resources through GRID federation (LCG)
Summary of PDC’04 : Computing resources
It took some effort to ‘tune’ the resources at the remote computing centres
The centres’ response was very positive – more CPU and storage capacity was made available during the PDC
Middleware
AliEn proved to be fully capable of executing high-complexity jobs and controlling large amounts of resources
Functionality for Phase 3 has been demonstrated, but cannot be used
LCG MW proved adequate for Phase 1, but not for Phase 2 and in a competitive environment
It cannot provide the additional functionality needed for Phase 3
ALICE computing model validation:
AliRoot – all parts of the code successfully tested
Computing elements configuration
Need for a high-functionality MSS shown
Phase 2 distributed data storage schema proved robust and fast
Data Analysis could not be tested
Development of Analysis : Analysis Object Data designed for efficiency
Contain only data needed for a particular analysis
Analysis à la PAW
ROOT + at most a small library
Work on the distributed infrastructure has been done by the ARDA project
Batch analysis infrastructure
Prototype published at the end of 2004 with AliEn
Interactive analysis infrastructure
Demonstration performed at the end of 2004 with AliEn → gLite
Physics working groups are just starting now, so timing is right to receive requirements and feedback
Slide29 : [Diagram] Grid-middleware independent PROOF setup
Labels: Client, Master, PROOF slave servers (Site A, Site B), LCG, optional site gateway, forward proxy, rootd, proofd, Grid/ROOT authentication, Grid access control service, TGrid UI/Queue UI, proofd startup, slave registration/booking DB, master setup, Grid service interfaces, Grid file/metadata catalogue, new elements
Client retrieves list of logical files (LFN + MSN)
Booking request with logical file names
“Standard” PROOF session
Slave ports mirrored on master host
Only outgoing connectivity
Grid situation : History
Jan ‘04: AliEn developers are hired by EGEE and start working on new MW
May ‘04: A prototype derived from AliEn is offered to pilot users (ARDA, Biomed..) under the gLite name
Dec ‘04: The four experiments ask for this prototype to be deployed on a larger preproduction service and to be part of the EGEE release
Jan ‘05: This is vetoed at management level -- AliEn will not be common software
Current situation
EGEE has vaguely promised to provide the same functionality as the AliEn-derived MW
But with a delay of at least 2-4 months on top of the one already accumulated
But even this will be just the beginning of the story: the different components will have to be field tested in a real environment, which took four years for AliEn
All experiments have their own middleware
Ours is not maintained because our developers have been hired by EGEE
EGEE has formally vetoed any further work on AliEn or AliEn-derived software
LCG has allowed some support for ALICE but the situation is far from being clear
ALICE computing model : For pp similar to the other experiments
Quasi-online data distribution and first reconstruction at T0
Further reconstruction passes at T1’s
For AA different model
Calibration, alignment and pilot reconstructions during data taking
Data distribution and first reconstruction at T0 during the four months after AA run (shutdown)
Second and third pass distributed at T1’s
For safety one copy of RAW at T0 and a second one distributed among all T1’s
T0: First pass reconstruction, storage of one copy of RAW, calibration data and first-pass ESD’s
T1: Subsequent reconstructions and scheduled analysis, storage of the second collective copy of RAW and one copy of all data to be safely kept (including simulation), disk replicas of ESD’s and AOD’s
T2: Simulation and end-user analysis, disk replicas of ESD’s and AOD’s
Very difficult to estimate network load
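The division of work between the tiers can be captured in a small lookup table, which makes it easy to ask which tiers carry a given responsibility. The task strings paraphrase the slide; the table layout and the helper `tiers_doing` are hypothetical.

```python
from typing import Dict, List

# Responsibilities per tier, paraphrased from the slide.
TIER_TASKS: Dict[str, List[str]] = {
    "T0": [
        "first-pass reconstruction",
        "RAW storage (one copy)",
        "calibration data storage",
        "first-pass ESD storage",
    ],
    "T1": [
        "subsequent reconstructions",
        "scheduled analysis",
        "RAW storage (second, collective copy)",
        "safe-keeping copy of all data (including simulation output)",
        "ESD/AOD disk replicas",
    ],
    "T2": [
        "simulation",
        "end-user analysis",
        "ESD/AOD disk replicas",
    ],
}


def tiers_doing(keyword: str) -> List[str]:
    """Return the tiers whose task list mentions the keyword."""
    return [tier for tier, tasks in TIER_TASKS.items()
            if any(keyword in task for task in tasks)]


print(tiers_doing("RAW storage"))
print(tiers_doing("analysis"))
```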
ALICE requirements on MiddleWare : One of the main uncertainties of the ALICE computing model comes from the Grid component
ALICE has been developing its computing model assuming that MW with the same quality and functionality that AliEn would have had two years from now will be deployable on the LCG computing infrastructure
If not, we will still analyse the data (!), but
Less efficiency → more computers → more time and money
More people for production → more money
To elaborate an alternative model we should know what will be
The functionality of the MW developed by EGEE
The support we can count on from LCG
Our “political” “margin of manoeuvre”
Possible strategy : If
a) Basic services from LCG/EGEE MW can be trusted at some level
b) We can get some support to port the “higher functionality” MW onto these services
We have a solution
If a) above is not true but if
We have support for deploying the ARDA-tested AliEn-derived gLite
We do not have a political “veto”
We still have a solution
Otherwise we are in trouble
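The conditions on this slide amount to a small decision table, sketched below. The slide does not say what happens when the basic services can be trusted but no porting support is available; the sketch lets that case fall through to the second option, which is an assumption. The function and parameter names are hypothetical.

```python
def possible_strategy(basic_services_trusted: bool,
                      porting_support: bool,
                      glite_deploy_support: bool,
                      political_veto: bool) -> str:
    """Return the outcome implied by the slide for a given situation."""
    # First option: build on LCG/EGEE basic services.
    if basic_services_trusted and porting_support:
        return "solution: port higher-functionality MW onto LCG/EGEE services"
    # Second option: deploy the ARDA-tested AliEn-derived gLite.
    # Assumption: also tried when the first option fails for lack of support.
    if glite_deploy_support and not political_veto:
        return "solution: deploy the ARDA-tested AliEn-derived gLite"
    return "in trouble"


print(possible_strategy(True, True, False, True))
print(possible_strategy(False, False, True, False))
print(possible_strategy(False, False, True, True))
```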
ALICE Offline Timeline :
Main parameters :
Processing pattern :
Conclusions : ALICE has made a number of technical choices for the Computing framework since 1998 that have been validated by experience
The Offline development is on schedule, although contingency is scarce
Collaboration between physicists and computer scientists is excellent
Tight integration with ROOT allows fast prototyping and development cycle
AliEn goes a long way in providing a GRID solution adapted to HEP needs
However its evolution into a common project has been “stopped”
This is probably the largest single “risk factor” for ALICE computing
Some ALICE-developed solutions have a high potential to be adopted by other experiments and indeed are becoming “common solutions”