ATLAS Italia Computing Overview: ATLAS software and computing; computing done and planned, and the INFN share; milestones report.
Talk outline: ATLAS Computing organization (areas being reworked or newly created); status and development of software and tools (simulation, reconstruction, production environment); Data Challenges: computing done (DC1 etc.) in Italy and planned (DC2 etc.), and how it fits into global ATLAS; report on the 2003 milestones; proposal for the 2004 milestones.
Computing Organization (slides by Dario Barberis, LHCC Review of Computing Manpower, 2 Sep 2003). The ATLAS Computing Organization was revised at the beginning of 2003 in order to adapt it to current needs. Basic principles: a Management Team consisting of the Computing Coordinator (Dario Barberis) and the Software Project Leader (David Quarrie); small(er) executive bodies; shorter, but more frequent, meetings; good information flow, both horizontal and vertical; interactions at all levels with the LCG project. The new structure is now in place and working well; a couple of areas still need some thought (this month).
New computing organization: internal organization being defined this month.
Main positions in the computing organization. Computing Coordinator: leads and coordinates the development of ATLAS computing in all its aspects: software, infrastructure, planning, resources. Coordinates development activities with the TDAQ Project Leader(s), the Physics Coordinator and the Technical Coordinator through the Executive Board and its subcommittees (COB and TTCC). Represents ATLAS computing in the LCG management structure (SC2 and other committees) and at LHC level (LHCC and LHC-4). Chairs the Computing Management Board. Software Project Leader: leads the development of ATLAS software, as the Chief Architect of the Software Project. Is a member of the ATLAS Executive Board, COB and TTCC. Participates in the LCG Architects Forum and other LCG activities. Chairs the Software Project Management Board and the Architecture Team.
Main boards in the computing organization. Computing Management Board (CMB): Computing Coordinator (chair), Software Project Leader, TDAQ Liaison, Physics Coordinator, International Computing Board Chair, GRID, Data Challenge & Operations Coordinator, Planning & Resources Coordinator, Data Management Coordinator. Responsibilities: coordinate and manage computing activities; set priorities and take executive decisions. Meetings: bi-weekly.
Software Project Management Board (SPMB): Software Project Leader (chair), Computing Coordinator (ex officio), Simulation Coordinator, Event Selection, Reconstruction & Analysis Tools Coordinator, Core Services Coordinator, Software Infrastructure Team Coordinator, LCG Applications Liaison, Calibration/Alignment Coordinator, Sub-detector Software Coordinators, Physics Liaison, TDAQ Software Liaison. Responsibilities: coordinate the coherent development of software (both infrastructure and applications). Meetings: bi-weekly.
ATLAS-LCG Team: includes all ATLAS representatives in the many LCG committees, presently 9 people. SC2: Dario Barberis (Computing Coordinator), Daniel Froidevaux (from Physics Coordination). PEB: Gilbert Poulard (DC Coordinator). GDB: Dario Barberis (Computing Coordinator), Gilbert Poulard (DC Coordinator), Laura Perini (Grid Coordinator). GAG: Laura Perini (Grid Coordinator), Craig Tull (Framework-Grid integration). AF: David Quarrie (Chief Architect & SPL). POB: Peter Jenni (Spokesperson), Torsten Åkesson (Deputy Spokesperson). LHC4: Peter Jenni (Spokesperson), Torsten Åkesson (Deputy Spokesperson), Dario Barberis (Computing Coordinator), Roger Jones (ICB Chair). Responsibilities: coordinate the ATLAS-LCG interactions; improve information flow between "software development", "computing organization" and "management". Meetings: weekly.
Architecture Team (A-Team): composition: experts appointed by the Software Project Leader; responsibilities: design and set guidelines for the implementation of the software architecture; meetings: weekly. Software Infrastructure Team (SIT): composition: experts appointed by the Software Project Leader; responsibilities: provide the infrastructure for software development and distribution; meetings: bi-weekly. International Computing Board (ICB): composition: representatives of the ATLAS funding agencies; responsibilities: discuss the allocation of resources to the Software & Computing project; meetings: 4 times/year (in software weeks).
Organization: work in progress (1). Data Challenge, Grid and Operations: the terms of office of key people are coming to an end about now; DC1 operation is finished and we need to put in place an effective organization for DC2; Grid projects are moving from the R&D phase to implementation and eventually production systems. We are discussing how to coordinate at a high level all activities: Data Challenge organization and execution; "continuous" productions for physics and detector performance studies; contacts with Grid middleware providers; Grid application interfaces; Grid distributed analysis. We plan to put a new organization in place by September 2003, before the start of DC2 operations.
Organization: work in progress (2). Event Selection, Reconstruction and Analysis Tools: here we aim to achieve a closer integration of people working on high-level trigger algorithms, detector reconstruction, combined reconstruction, the event data model and software tools for analysis. "Effective" integration in this area was already achieved with the HLT TDR work; now we have to set up a structure to maintain constant contacts and information flow. The organization of this area will have to be agreed with the TDAQ and Physics Coordinators (discussions on-going); most of the people involved will have dual reporting lines (as for detector software people). We plan to put the new organization in place by the September 2003 ATLAS Week.
Computing Model Working Group (1). Work on the Computing Model was done in several different contexts: online-to-offline data flow, world-wide distributed reconstruction and analysis, computing resource estimates. The time has come to bring all these inputs together coherently. A small group of people has been put together to start collecting all existing information and defining further work in view of the Computing TDR, covering the following areas: resources; networks; data management; Grid applications; computing farms; distributed physics analysis; distributed productions; alignment and calibration procedures; Data Challenges and tests of the computing model.
Computing Model Working Group (2). This group will: first assemble existing information and digest it; act as the contact point for input into the Computing Model from all ATLAS members; prepare a "running" Computing Model document with up-to-date information to be used for resource bids etc.; prepare the Computing Model Report for the LHCC/LCG by end 2004; contribute the Computing Model section of the Computing TDR (mid-2005). The goal is to come up with a coherent model for: the physical hardware configuration (e.g. how much disk should be located at the experiment hall between the Event Filter and the prompt reconstruction farm); data flows; processing stages; latencies; resources needed at CERN and in the Tier-1 and Tier-2 facilities.
Computing Model Working Group (3). Group composition: Roger Jones (ICB chair, Resources), chairman; Bob Dobinson (Networks); David Malon (Data Management); Torre Wenaus (Grid applications); Sverre Jarp (Computing farms); Paula Eerola (Distributed physics analysis); XXX (Distributed productions); Richard Hawkings (Alignment and Calibration procedures); Gilbert Poulard (Data Challenges and Computing Model tests); Dario Barberis & David Quarrie (Computing management, ex officio). First report expected in October 2003. Tests of the Computing Model will be the main part of DC2 operation (2Q 2004).
Data Management issues: ATLAS Database Coordination Group. Recently set up to coordinate the Production & Installation DBs (TCn), the Configuration DB (online) and the Conditions DB (online and offline), with respect to data transfer, synchronization and data transformation algorithms (i.e. from survey measurements of reference marks on muon chambers to wire positions in space, usable online and offline). Members: Richard Hawkings (Alignment & Calibration Coordinator), chair; David Malon (Offline Data Management Coordinator); Igor Soloviev (Online DB contact person); Kathy Pommes (TCn Production & Installation DB).
Conditions Data Working Group. The Database Workshop in early February brought the (so far) separate communities together; the convergence of interest between the online and offline sides was obvious, but the terminology used was rather different, as were the data-flow assumptions. A small group of people started addressing these items: definition of configuration and conditions data; data flow and rates for configuration and conditions data; relation between online and offline calibrations; input from the Detector Control Systems; input from the "static" production database, pre-run calibration and survey data. Members of the group so far: Richard Hawkings (chair), David Adams, Steve Armstrong, Mihai Caprini, David Malon, David Quarrie, RD Schaffer, Igor Soloviev.
Simulation in ATLAS (A. Rimoldi). A demanding environment (people vs. things): the biggest collaboration ever gathered in HEP, handling the most complete and challenging physics ever. The present simulation in a nutshell. Fast simulation: Atlfast. Detailed simulation in Geant3: in production for 10 years, but frozen since 1995 and used for DC productions until now. Detailed simulation in Geant4: growing up (and evolving fast) from the subdetector side; detailed test-beam studies (treating the test beam as an 'old times' experiment) for all the technologies represented; physics studies extensively addressed since 2001 for validation purposes. Under development: fast/semi-fast simulation and shower parameterizations; a staged detector environment for early studies; optimizations, FLUKA integration…
DC2: the different domains attach different meanings to DC2. For the Geant4 simulation people, the DC2 target is a way to state that: Geant4 is the main simulation engine for ATLAS from now on; we have concluded a first physics-validation cycle and found that Geant4 is now better than, or at least comparable to, Geant3; we have written enough C++ code to say that the geometry description of ATLAS is at the same level of detail as the one in Geant3. The application must still be optimized in terms of memory, CPU-time performance and robustness.
DC2 is close. We have a functional simulation program based on Geant4, available now for the complete detector, with all detector components already collected. After three years of physics validation, the emphasis is shifting from subdetector physics simulations to ATLAS physics simulations. Studies under way: memory usage minimization; performance optimization; initialization time monitoring/minimization; calorimeter parameterization; a new approach to the detector description through the GeoModel. We are fully integrated within the Athena framework.
Complete Simulation Chain Events can be generated online or read in Geometry layout can be chosen Hits are defined for all detectors Hits can now be written out (and read back in) together with the HepMC information Digitization being worked out right now Pileup strategy to be developed in the near future
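To make the chain above concrete, here is a minimal sketch of the stages it describes (generation, simulation producing hits plus MC truth, digitization). This is not Athena code: the function names, record fields and geometry tag are hypothetical illustrations of the data flow only.

```python
# Hypothetical sketch of the simulation chain described above (not ATLAS/Athena code).
# Stage names and record fields are illustrative assumptions only.

def generate_events(n_events):
    """Generate or read in events; each event carries its HepMC-like truth record."""
    return [{"event_id": i, "truth": f"HepMC record {i}"} for i in range(n_events)]

def simulate(event, geometry="ATLAS-DC2"):
    """Geant4-style simulation step: produce hits for all detectors, keep the MC truth."""
    event["hits"] = {"inner_detector": [], "calorimeters": [], "muon_system": []}
    event["geometry"] = geometry
    return event

def digitize(event):
    """Digitization step; pile-up mixing would be added here once the strategy exists."""
    event["digits"] = {det: [f"digit({h})" for h in hits]
                       for det, hits in event["hits"].items()}
    return event

if __name__ == "__main__":
    for evt in generate_events(3):
        evt = simulate(evt)      # hits + MC truth, which can be written out and read back
        evt = digitize(evt)      # digits; the pile-up strategy is still to be developed
        print(evt["event_id"], list(evt["digits"]))
```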
The plan (short term), in view of DC2:
– geometry of all subdetectors: shieldings in place (2 weeks, Oct–Nov); cables & services (4 weeks, Oct–Dec)
– performance tests at different conditions (1 week, Jul–Feb)
– robustness tests for selected event samples (2 weeks, Aug–Feb)
– robustness tests for selected regions: barrel (2 weeks, Sep–Dec); endcap (2 weeks, Sep–Dec); transition region (2 weeks, Sep–Dec)
– hits for all subdetectors, check and test (2 weeks, Sep–Dec)
– persistency (2 weeks, Sep–Nov)
– performance tests with all the detector components in place (1 week, Sep–Nov)
– performance tests vs. different conditions (2 weeks, Sep–Nov)
– robustness tests for all the detector components (2 weeks, Sep–Dec)
– package restructuring where inconsistent with old structures (3 weeks, Oct–Dec)
– cleaning of the package area (to attic) (1 week, Nov)
– revising writing rights (obsolete, new) (1 week, Nov)
– documentation (4 weeks, Sep–Dec 03)
Emphasis on: refinement of the geometry (missing pieces, combined test-beam setup); performance and robustness tests; hits & digits; persistency; pile-up. Early tests starting from September with single-particle beams, in order to evaluate the global performance well before DC2 start-up.
Reconstruction: algorithms in Athena (slides by Alexander Solodkov, ATLAS week, Prague, Sep 2003). Two pattern-recognition algorithms are available for the Inner Detector: iPatRec and xKalman. Two different packages are used to reconstruct tracks in the Muon Spectrometer: MuonBox and MOORE. The initial reconstruction of cell energy is done separately in LAr and TileCal; after that, the reconstruction algorithms do not see any difference between LArCell and TileCell and use generic CaloCells as input. Jet reconstruction, missing E_T. Several algorithms combine information from the tracking detectors and the calorimeters in order to achieve a good rejection factor or identification efficiency: e/γ identification, e/π rejection, τ identification, back-tracking to the Inner Detector through the calorimeters, …
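The "generic CaloCells" point can be pictured with a minimal sketch. The class names LArCell, TileCell and CaloCell come from the slide; the methods, fields and the toy cone algorithm are assumptions for illustration, not the actual ATLAS EDM.

```python
# Minimal sketch of the "generic CaloCell" idea: downstream algorithms see only the
# common interface, not whether a cell came from LAr or TileCal. Names of methods and
# the toy jet algorithm are illustrative assumptions, not the real ATLAS classes.
from abc import ABC, abstractmethod

class CaloCell(ABC):
    @abstractmethod
    def energy(self) -> float: ...
    @abstractmethod
    def eta(self) -> float: ...

class LArCell(CaloCell):
    def __init__(self, e, eta): self._e, self._eta = e, eta
    def energy(self): return self._e
    def eta(self): return self._eta

class TileCell(CaloCell):
    def __init__(self, e, eta): self._e, self._eta = e, eta
    def energy(self): return self._e
    def eta(self): return self._eta

def cone_energy(cells, eta0, dr=0.4):
    """A toy cone sum that only uses the generic CaloCell interface."""
    return sum(c.energy() for c in cells if abs(c.eta() - eta0) < dr)

cells = [LArCell(10.0, 0.1), TileCell(5.0, 0.2), LArCell(2.0, 1.5)]
print(cone_energy(cells, eta0=0.15))   # 15.0: LAr and Tile cells treated alike
```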
High Level Trigger algorithm strategy. Offline model: the Event Loop Manager directs an algorithm: "here is an event, see what you can do with it". High Level Trigger model: the Steering directs an algorithm: "here is a seed; access only the relevant event data; only validate a given hypothesis; you may be called multiple times per this one event; do it all within the LVL2 [EF] latency of O(10 ms) [O(1 s)]". Issues, Level 2 vs. Event Filter: data access: restricted to Regions-of-Interest vs. full access to the event if necessary; performance: fast and rough treatment vs. slow and refined approaches; calibration & alignment database access: no event-to-event access vs. possible event-to-event access.
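The contrast between the two models can be sketched as follows. This is a hypothetical illustration of the control flow, not the ATLAS steering API: all names and the Region-of-Interest representation are assumptions.

```python
# Contrast between the offline event loop and the seeded HLT model described above.
# Function names and data layout are hypothetical, not the ATLAS steering framework.

def offline_event_loop(events, algorithm):
    """Offline: the event loop manager hands each full event to the algorithm."""
    return [algorithm(event) for event in events]

def hlt_steering(event, seeds, hypo_algorithm):
    """HLT: the steering calls the algorithm once per seed (Region-of-Interest),
    asking it only to validate a hypothesis on the data inside that region."""
    accepted = []
    for seed in seeds:
        roi_data = event["regions"].get(seed)       # access only the relevant data
        if roi_data is not None and hypo_algorithm(roi_data):
            accepted.append(seed)
    return accepted                                  # may be called many times per event

event = {"regions": {"RoI_1": {"et": 25.0}, "RoI_2": {"et": 3.0}}}
print(offline_event_loop([event], lambda ev: len(ev["regions"])))              # [2]
print(hlt_steering(event, ["RoI_1", "RoI_2"], lambda roi: roi["et"] > 10.0))   # ['RoI_1']
```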
New test beam reconstruction in Athena. The Inner Detector (Pixel, SCT), the calorimeter (TileCal) and the whole Muon System are using the latest TDAQ software at the test beam; ByteStream files are produced by the DataFlow libraries, and the format of the ROD fragment in the output ByteStream file is very close to the one used for HLT performance studies. ByteStream with test beam data is available in Athena now: ByteStreamCnvSvc has been able to read test beam ByteStream since July 2003; ROD data decoding is implemented in the same way as in the HLT converters for MDT and RPC (July 2003) and TileCal (September 2003); the converters fill the new Muon/TileCal EDM, so the RDO => RIO conversion already available in Athena is reused at no cost. Reconstruction of Muon test-beam data is therefore possible in Athena: muon reconstruction is done by the MOORE package and ntuples are produced for the analysis. Combined test beam (8–13 Sep 2003): both MDT and TileCal data are reconstructed in Athena.
MOORE MDT segment reconstruction on test beam data (180 GeV beam) [plots: chamber misalignments, barrel sagitta]. Comparisons with Muonbox are possible; for the full 3-D reconstruction the standard MOORE ntuple can be used.
Reconstruction Task Force. Who: Véronique Boisvert, Paolo Calafiura, Simon George (chair), Giacomo Polesello, Srini Rajagopalan, David Rousseau. Mandate: formed in February 2003 to perform a high-level re-design and decomposition of the reconstruction and event data model; cover everything between raw data and analysis; look for common solutions for HLT and offline. Deliverables: interim reports published in April and May, with significant constructive feedback; final report any day now. Interaction: several well-attended open meetings to kick off and present reports; meetings focused on specific design issues to get input and feedback; feedback incorporated into the second interim report.
RTF recommendations (very brief overview… please read the report). Modularity, granularity, baseline reconstruction. Reconstruction top-down design (dataflow): domains (sub-systems, combined reconstruction and analysis preparation); analysis of algorithmic components, identifying common tools; integration of fast simulation; steering. EDM: common interfaces between algorithms, e.g. common classes for the tracking subsystems; design patterns to give uniformity to data classes in the combined reconstruction domain; approach to units and transformations; separation of event and non-event data; navigation.
Implementation of RTF recommendations. The RTF ends with the final report. Goals: incorporate the first feedback into release …; substantial implementation by …. Ambitious! How: planned in the subsystems, coordinated in the SPMB; requires cross-subsystem cooperation, because of the nature of the recommendations (common EDM classes, shared tools and patterns). This is already happening: joint meetings, e.g. the recent muon + InDet tracking and jet reconstruction meetings; the subdetectors' WBS already include implementations of the RTF recommendations; some are already implemented, e.g. Calo cluster event/non-event data separation and common InDet RIOs are in 7.0.0.
Reconstruction summary. A complete spectrum of reconstruction algorithms is available in the Athena framework; they are used both for HLT and offline reconstruction, and the same algorithms are being tried for test beam analysis. Ongoing developments: cleaner modularization (toolbox); robustness (noisy/dead channels, misalignments); extending the reach of the algorithms (e.g. low p_T, very high p_T); new algorithms. The implementation of the RTF recommendations in the next releases will greatly improve the quality of the reconstruction software. Next challenge: summer 2004, a complete ATLAS barrel wedge in the test beam, with reconstruction and analysis using (almost) only the ATLAS offline reconstruction.
Development of the new ATLAS production environment. Several tools have been developed so far, especially in the US Grid context, and productions were run with different tools in different places: a lot of manpower, little automation, and checks and corrections after the fact. The decision was taken to develop a new, coherent system; slides by Alessandro De Salvo follow. Meetings in July-August, with a final restricted one on 12 August including De Salvo for INFN: system architecture (with reuse of existing components), sharing of work between CERN (+ Nordic countries), INFN and the US. For INFN, participation from Milano-CNAF (2 people from EDT, Guido), Napoli (2 people) and Roma1 (Alessandro).
ATLAS Production System. Design of an automatic production system to be deployed ATLAS-wide on the time scale of DC2 (spring 2004): automatic, robust, with support for several flavours of Grid and for legacy resources (LCG, US-GRID, NorduGrid, local batch queues). Components: production DB; supervisor/executors (master-slave system); data management system (to be finalized); production tools. Still to be defined: security & authorization; continuous parallel QA; system monitoring tools; the exact schema of the production DB.
ATLAS Production System details (I). Components. Production DB: a single (logical) DB holding datasets, tasks, task transformations, logical files, job definitions, jobs, job transformations and job executions; a production request is turned into a production definition. Supervisor: all the initiatives come from it; it communicates with the DB and uses several executors to perform Grid- or resource-specific tasks; supervisors and executors are logically and physically separated, allowing maximum flexibility and crash-safe operation. Data Management: a single (logical) DMS for all ATLAS data, with registration of all files at all facilities, the ability to move data between any of the facilities, and replica management tools.
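A minimal sketch of how the entities listed above could hang together is given below. The entity names (dataset, task, transformation, job definition, logical file) come from the slide; the field names and relations are assumptions, since the slide itself notes that the exact schema of the production DB is still to be defined.

```python
# Illustrative sketch of the production DB entities named above. Field names and
# relations are assumptions; the real schema was still "to be defined" at this time.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Transformation:
    executable: str          # e.g. a simulation or reconstruction job transformation
    release: str             # software release version it belongs to

@dataclass
class JobDefinition:
    job_id: int
    transformation: Transformation
    parameters: dict
    status: str = "defined"  # defined -> submitted -> done/failed (assumed states)

@dataclass
class Task:
    task_id: int
    transformation: Transformation
    jobs: List[JobDefinition] = field(default_factory=list)      # Task = [job]*

@dataclass
class LogicalFile:
    lfn: str
    dataset: str

@dataclass
class Dataset:
    name: str
    partitions: List[LogicalFile] = field(default_factory=list)  # Dataset = [partition]*

# Hypothetical usage: one task of one job, producing one partition of a dataset.
trf = Transformation(executable="simulation", release="8.x")
task = Task(task_id=1, transformation=trf,
            jobs=[JobDefinition(job_id=1, transformation=trf, parameters={"events": 100})])
print(task)
```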
ATLAS Production System details (II) [system diagram]. The production DB holds the job descriptions: Task = [job]*, Dataset = [partition]*; a task is defined by a transformation definition plus a physics signature, an executable name and a release version; tasks and jobs (partitions) carry location hints. A production request becomes a production definition through human intervention. Supervisors 1-4 dispatch jobs through dedicated executors to the US Grid (Chimera), LCG (resource broker), NorduGrid (resource broker) and local batch (LSF); job run information is recorded and files are handled by the Data Management System. People named on the diagram: Rob Gardner, Kaushik De, Luc Goossens, Oxana Smirnova, Alessandro De Salvo.
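The supervisor/executor split sketched in the diagram (one supervisor taking all initiatives and delegating submission to flavour-specific executors) could look roughly as follows. This is a hypothetical sketch of the pattern, not the DC2 production system code; all class and method names are assumptions.

```python
# Rough sketch of the supervisor/executor pattern described above: the supervisor
# talks to the production DB and delegates submission to flavour-specific executors
# (LCG, NorduGrid, US Grid, local batch). Hypothetical API, for illustration only.

class Executor:
    """Base class: one concrete executor per Grid flavour or batch system."""
    def submit(self, job):
        raise NotImplementedError

class LCGExecutor(Executor):
    def submit(self, job):
        return f"submitted job {job['job_id']} via the LCG resource broker"

class LocalBatchExecutor(Executor):
    def submit(self, job):
        return f"submitted job {job['job_id']} to the local batch queue"

class Supervisor:
    def __init__(self, production_db, executors):
        self.db = production_db      # list of pending job dicts, standing in for the DB
        self.executors = executors   # mapping from target resource to executor

    def run_once(self):
        """Pull pending jobs from the DB and hand each one to a suitable executor."""
        for job in self.db:
            if job["status"] == "defined":
                print(self.executors[job["target"]].submit(job))
                job["status"] = "submitted"

db = [{"job_id": 1, "target": "lcg", "status": "defined"},
      {"job_id": 2, "target": "local", "status": "defined"}]
Supervisor(db, {"lcg": LCGExecutor(), "local": LocalBatchExecutor()}).run_once()
```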
ATLAS Computing Timeline (D. Barberis; "NOW" = September 2003): POOL/SEAL release; ATLAS release 7 (with POOL persistency); LCG-1 deployment; ATLAS complete Geant4 validation; ATLAS release 8; DC2 Phase 1: simulation production; DC2 Phase 2: intensive reconstruction (the real challenge!); combined test beams (barrel wedge); Computing Model paper; ATLAS Computing TDR and LCG TDR; DC3: produce data for the PRR and test LCG-n; Computing Memorandum of Understanding; Physics Readiness Report; start of the commissioning run; GO!
High-level milestones:
– Sept. 2003: Software Release 7 (POOL integration)
– 31 Dec. 2003: Geant4 validation for DC2 complete
– 27 Feb. 2004: Software Release 8 (ready for DC2/1)
– 1 April 2004: DC2 Phase 1 starts
– 1 May 2004: ready for the combined test beam
– 1 June 2004: DC2 Phase 2 starts
– 31 July 2004: DC2 ends
– 30 Nov. 2004: Computing Model paper
– 30 June 2005: Computing TDR
– 30 Nov. 2005: Computing MoU
– 30 June 2006: Physics Readiness Report
– 2 October 2006: ready for the cosmic ray run
DC1 and the INFN share. DC1-1, done in 1.5 months and finished in September 2002: 10^7 events + 3×10^7 single particles, 39 sites, 30 TB, 500 kSI2k·months, about 3000 CPUs used at the peak. INFN CPUs: 132 = Roma1 46, CNAF 40, Milano 20, Napoli 16, LNF 10 (SI95 = 2×… = 5400). INFN provided about 5% of the resources and a 5% share (while INFN is 10% of ATLAS). DC1-2 (pile-up), done in 1 month and finished at the end of 2002: 1.2 M events from DC1-1, … TB and 40 kSI2k·months, same sites "proportionally"; INFN resources and share as in DC1-1 (by construction).
Reconstruction for the HLT TDR. Done on 1.3 M events in 15 days, finished in May 2003: 10 sites (Tier1s or similar), 30 kSI2k·months; possibly more CPU went into the various tests than into the final production… The CNAF fraction was close to 10%. Repeated in July and early August on 20 CNAF CPUs; reconstruction for physics (A0) then continued (see the August CNAF-ATLAS monitoring).
DC2 in Italy. Start in April 2004, end in November; the new ATLAS "production environment" will be used, with INFN researchers involved in its development. The global ATLAS effort for simulation + reconstruction, in SI2k·months, is about twice DC1, assuming Geant4 CPU = Geant3 CPU; the INFN CPU request is therefore between 4× and 6× DC1 (Geant4 uncertainty), as checked in the sketch below. Besides DC2 there is computing for physics and detectors (as in DC1); see the August activity at Milano, Napoli and Roma1. DC2 will be the first time with massive, distributed analysis (Tier3s). The needs foreseen for 2004 (table of requests to the Referees): 18 kSI95 (5k existing + 13k new) in the Tier2s (disk: 10.5 TB now + 11 TB new), with the new capacity to be brought forward to 2003; at Milano-LCG 120 CPUs (70 new = 6k), at Roma1 100 (45 new = 4k), at Napoli (45 new = 4k); LNF starts with 0.2k … new and 0.9 TB of disk; from 7k to 15k in the Tier1 (a buffer against Geant4 performance); and the addition of 1.5k SI95 plus disk to the Tier3 system (now only 700 SI95 and about 1 TB across 8 sections!).
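As a back-of-the-envelope check of the 4×-6× range quoted above: the inputs (DC1 ≈ 500 kSI2k·months, an INFN share of about 5% in DC1, DC2 ≈ 2× DC1 if Geant4 costs the same as Geant3, and a target INFN share of 10%) are all taken from these slides; the Geant4 slowdown factor of 1.5 used for the upper end is an illustrative assumption, not an ATLAS number.

```python
# Back-of-the-envelope check of the DC2 CPU request range quoted above.

dc1_total_ksi2k_months = 500.0   # DC1-1 simulation, from the DC1 slide
infn_share_dc1 = 0.05            # INFN delivered about 5% of DC1
infn_dc1 = infn_share_dc1 * dc1_total_ksi2k_months        # ~25 kSI2k.months

dc2_total = 2.0 * dc1_total_ksi2k_months                  # DC2 ~ 2x DC1 if Geant4 = Geant3
infn_target_share = 0.10                                   # INFN is 10% of ATLAS
infn_dc2_low = infn_target_share * dc2_total               # ~100 kSI2k.months = 4x DC1 effort

geant4_slowdown = 1.5                                      # assumed uncertainty factor
infn_dc2_high = infn_dc2_low * geant4_slowdown             # ~150 kSI2k.months = 6x DC1 effort

print(infn_dc1, infn_dc2_low / infn_dc1, infn_dc2_high / infn_dc1)   # 25.0 4.0 6.0
```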
DC2 in Italy (continued). It is important that the INFN share does not fall below 10% again, and important to take part with all the local expertise: the set-up and the decisions on the computing and analysis model are being made now. For 2005 the plan is a contained increase over the 2004 requests in the Tier2s and a doubling of the CPU in the Tier3s: for the Tier2s, 3 kSI95 and 2 TB of disk (nothing at Milano and Roma1); for the Tier3s, 2 kSI95 and 3 TB of disk. Slides by G. Poulard on the DC situation and the global ATLAS planning follow, to illustrate the various points.
DC1 in numbers (columns: process; no. of events; CPU time in kSI2k·months; CPU-days at 400 SI2k; data volume in TB):
– Simulation, physics events: 10^7 events; –; –; –
– Simulation, single particles: 3×10^7 particles; –; –; –
– Lumi02 pile-up: 4×10^6 events; –; –; –
– Lumi10 pile-up: 2.8×10^6 events; –; –; –
– Reconstruction: 4×10^6 events; –; –; –
– Reconstruction + Lvl1/2: 2.5×10^6 events; (84); (6300); –
– Total: 690 (+84) kSI2k·months; 51000 (+6300) CPU-days; 60 TB
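The two CPU columns of the table are related by a simple conversion: kSI2k·months divided by the power of the reference machine (400 SI2k, from the column header) gives machine-months, and multiplying by days per month gives CPU-days. The 30 days/month used below is an assumption, but it reproduces the surviving pair of numbers in the Lvl1/2 row.

```python
# Relation between the two CPU columns: kSI2k.months vs. CPU-days on a 400 SI2k machine.
# 30 days/month is an assumption; it reproduces (84) <-> (6300) for the Lvl1/2 row.

def ksi2k_months_to_cpu_days(ksi2k_months, machine_si2k=400, days_per_month=30):
    machine_months = ksi2k_months * 1000.0 / machine_si2k    # months on one such CPU
    return machine_months * days_per_month

print(ksi2k_months_to_cpu_days(84))    # 6300.0, as in the table
print(ksi2k_months_to_cpu_days(690))   # ~51750, close to the quoted 51000 total
```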
ATLAS DC1 Phase 1 (July-August 2002): … CPUs, 110 kSI95, … CPU-days; 5×10^7 events generated, 10^7 events simulated, 3×10^7 single particles; 30 TB of files; 39 institutes in 18 countries (Australia, Austria, Canada, CERN, Czech Republic, France, Germany, Israel, Italy, Japan, the Nordic countries, Russia, Spain, Taiwan, UK, USA); grid tools used at 11 sites.
Primary data (at 8 sites), volumes in TB: simulation 23.7 (40%); pile-up 35.4 (60%), of which Lumi02 14.5 and Lumi10 20.9. Pile-up: low luminosity ~4×10^6 events (~4×10^3 NCU-days); high luminosity ~3×10^6 events (~12×10^3 NCU-days). Data replication using Grid tools (Magda).
DC2 resources (based on Geant3 numbers); columns: process; no. of events; time span (months); CPU power (kSI2k); CPU time (kSI2k·months); data volume (TB), at CERN / off-site:
– Simulation: …
– Pile-up (*): …
– Digitization: … (75) (25) (50)
– Byte-stream: …
– Reconstruction: …
– Total: … (+57) 26 (+57) 28 (+38)
(*) To be kept if no "0" suppression
DC2: July 2003 – July 2004. At this stage the goals include: full use of Geant4, POOL and the LCG applications; pile-up and digitization in Athena; deployment of the complete Event Data Model and the Detector Description; simulation of the full ATLAS detector and of the 2004 combined test beam; testing the calibration and alignment procedures; wide use of the Grid middleware and tools; large-scale physics analysis; computing model studies (document at the end of 2004); running as much of the production as possible on LCG-1.
Task flow for DC2 data [diagram]: event generation (e.g. Pythia 6, H → 4 mu) in Athena, with the HepMC truth stored via Athena-POOL (Athena-ROOT); detector simulation with Geant4 in Athena, producing hits + MC truth persistified via Athena-POOL; digitization (with pile-up) in Athena, producing digits via Athena-POOL and byte-stream; reconstruction in Athena, producing ESD and AOD via Athena-POOL.
DC2: scenario & time scale. Dates: end of July 03, Release 7; mid-November 03, pre-production release; 1 February 04, Release 8 (production); 1 April 04; 1 June 04, "DC2"; 15 July 04. Activities across these phases: put in place, understand and validate Geant4, POOL, the LCG applications, the Event Data Model, digitization, pile-up and byte-stream; conversion of DC1 data to POOL, large-scale persistency tests and reconstruction; testing and validation; run test production; start final validation; start simulation; pile-up & digitization; event mixing; transfer of data to CERN; intensive reconstruction on the "Tier0"; distribution of ESD & AOD; calibration and alignment; start of physics analysis; reprocessing.
ATLAS Data Challenges: DC2. We are building an ATLAS Grid production & analysis system, and we intend to put in place a "continuous" production system: if we continue to produce simulated data during summer 2004, we want to keep open the possibility to run another "DC" later (November 2004?) with more statistics. We plan to use LCG-1, but we will have to live with other Grid flavours and with "conventional" batch systems. Combined test-beam operation is foreseen as part of DC2.
2003 Milestones. 1 – Completion of the 10% INFN share of the Geant3 simulation for the HLT TDR within DC1 (April 2003): completed as per the slides presented, already by February, but at a 5% share (consistent with the available CPU). 2 – Completion of the reconstruction and analysis of the simulated data from the previous point (June 2003): the data were reconstructed by May without the trigger code, transferred to CERN and used for a quick analysis before the publication of the HLT TDR; in July and up to early August they were re-reconstructed with the trigger code added. Completed at 90%, because the first realistic test of distributed analysis, which we had planned to carry out for the HLT TDR, was postponed to a date still to be defined.
2003 Milestones (2). 3 – Simulation of 10^6 muon events with GEANT4 and the same layout used for the HLT TDR (June 2003): the Pavia simulation group processed 4.5 M single-muon events at 20 GeV (with an extra subsample at 200 GeV) in 2002 test-beam mode, with an estimated time per event of 0.1 s/event on a Pentium III 1.26 GHz machine. The simulated data were then processed by the muon-system reconstruction programs (Calib and Moore) and compared with the real test-beam data. Based on this event production an analysis was carried out, and an ATLAS internal note was submitted for publication (ATLAS-COM-MUON-…), with four authors: two from Pavia, one from Cosenza and one from CERN. In addition, again at Pavia, 1 M single-muon events and about 2×10^4 Z→mu mu events plus as many W→mu nu events were produced with an updated muon system (version P03 of the muon database Amdb_SimRec) in the central region of the muon spectrometer, for robustness tests. Completed at 100% (or more, if that were possible).
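A quick check of the CPU cost implied by the numbers in milestone 3 (4.5 M events at 0.1 s/event on one Pentium III 1.26 GHz); the arithmetic below only uses the figures quoted in the milestone.

```python
# Quick check of the simulation time quoted in milestone 3.
n_events = 4.5e6                 # single muons simulated at Pavia
seconds_per_event = 0.1          # estimated Geant4 time per event on a PIII 1.26 GHz
total_cpu_seconds = n_events * seconds_per_event
print(total_cpu_seconds / 3600.0)    # ~125 CPU-hours
print(total_cpu_seconds / 86400.0)   # ~5.2 CPU-days on a single such machine
```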
2003 Milestones (3). 4 – Repetition of one of the HLT TDR analyses on the muon data generated with GEANT4 (December 2003): as reported in the previous point, the GEANT4 muon data have already been validated by analysis and comparison with real data; the comparison with the GEANT3 results is still foreseen. 5 – Inclusion of the ATLAS TierX sites in the LCG production system, and test of this inclusion with the first ATLAS DC2 productions (December 2003): ATLAS DC2 has been shifted forward by 7 months with respect to the date foreseen in July 2002, and the experiments' access to LCG-1 is only happening now (early September 2003) against a forecast of April-May (about 4 months late). The Tier2s already active in ATLAS Italia (Milano, Roma1, Napoli) nevertheless intend to install LCG-1 and gain experience with it by the end of 2003: Milano, as a Tier2 already committed to LCG, will install LCG-1 by September and take part in the activities agreed between ATLAS and LCG; Roma1 and Napoli will take part in the LCG-1 tests in a purely ATLAS framework. Once this first phase of tests has been completed successfully, we will propose the official inclusion of Roma1 and Napoli in LCG (spring 2004?).
2004 Milestones. 1 – By May 2004: production-quality software ready for the start of DC2 (Geant4, Athena release 8, LCG production environment). GEANT4: performance optimization with respect to the current factor of 2 relative to GEANT3 (but there is no fixed target); geometry refinement (cables, services, etc.); finalization of digitization and persistency. Data Management (negligible INFN contribution): integration with POOL and with the SEAL dictionary (representation of the ATLAS event model); persistency for ESD, AOD and Tag Data in POOL; a common geometry model for reconstruction and simulation; support for event collections and filtering (though this may slip to July). Production Environment: a new, automated production system for ATLAS that accesses in a coherent way the production metadata DB (now AMI), the file catalogue and the virtual data catalogue; a uniform user interface for all of ATLAS, interfaced to LCG (an INFN responsibility), US-GRID (Chimera), NorduGrid and plain batch.
2004 Milestones (2). 2 – By October 2004: DC2 completed (simulation, reconstruction and possible reprocessing). Participation of the Tier1 and the Tier2s (CNAF, Milano, Napoli, Roma1) in the simulation, pile-up and reconstruction phases, running 10% of global ATLAS in Italy. From April all Tier2 sites are LCG-capable, i.e. the software is installed at all of them and tested in an Italian mini-production (Milano already part of LCG since before 2004). Analysis in the Tier3s in collaboration with the Tier1/Tier2s: report by the end of 2004. INFN contribution to the Computing TDR.
Table of requests (to be brought forward!). Columns: section; requested HW resources (year …); requested funding (kEuro); total (kEuro).
– Milano (revised …): a) compute nodes for 6 kSI95 of CPU; b) controller + disks for a total of 5 TB; c) switch. Funding: a) 73.5, b) 24, c) …; total: ….5
– Roma1: a) compute nodes for 4 kSI95 of CPU; b) controller + disks for a total of 3 TB. Funding: a) 60, b) …; total: …
– Napoli: a) one 42U rack; b) 15 dual-processor PIV 2.5 GHz compute nodes, 3 kSI95 in total; c) 10/100… Ethernet switch, … ports; d) server + RAID 5 controller + disks for a total of 2 TB. Funding: a) 1.2, b) 45, c) 1.5, d) 12; total: 59.7
– LNF: a) 600 SI95 of CPU; b) disks for a total of 0.8 TB. Funding: a) 9, b) 4.8; total: 13.8
Table of requests (to be brought forward?), continued.
– Cosenza: a) 5 PCs of 100 SI95 each + monitors; b) disks for a total of 0.3 TB. Funding: a) 10, b) 2; total: 12
– Genova: NAS file servers + disks for a total of 1.5 TB. Funding: 9; total: 9
– Lecce: a) 1 compute node (dual-processor PIV 2.5 GHz); b) disks for a total of 1 TB. Funding: a) 3, b) 6; total: 9
– Pavia: a) 300 SI95 of CPU; b) disks for a total of 0.6 TB (2003); c) disks for a total of 0.6 TB (2004). Funding: a) 4.5, b) 4.2, c) 3.6; total: 12.3
– Pisa: a) 500 SI95 of CPU; b) disks for a total of 1 TB. Funding: a) 7.5, b) 12; total: 19.5
– Roma2: disks for a total of 1 TB. Funding: 6; total: 6
– Roma3: a) 100 SI95 of CPU; b) disks for a total of 1 TB. Funding: a) 1.5, b) 6; total: 7.5
– Udine: …
– TOTAL: 334.3
Main ATLAS Roma1 farm activities (June-August 2003): DC1 "conventional" reconstruction of … pile-up events (QCD di-jets, E_T > 17 GeV); DC1 EDG reconstruction (CNAF, Cambridge, Lyon, MI, RM) of low-luminosity pile-up events (QCD di-jets, E_T < 560 GeV); muon trigger studies; H8 test-beam data analysis; production of EDG/DC1 ATLAS software packages (RPMs).
[Monitoring plots: CPU load on the Roma1 farm, with 20 available CPUs and with 44 available CPUs.]
Roma1 ATLAS farm usage statistics; farm info/description: …
[Monitoring plots, updated every 120 minutes: PBS server status for the ATLAS farm in Milan (total, queued, running, exiting and waiting jobs); CPU status on atlcluster-mi (% user, % system, % nice, % free); PBS server status at CNAF (same job categories).]