Published by Franca Magnani. Modified 8 years ago.
1
ALICE computing: status and funding requests
Domenico Elia
Meeting with the LHC Computing Referees (Riunione Referee Calcolo LHC), Padova, 25 May 2016
2
Outline
ALICE Computing status:
- resource usage in 2015, Run2 computing activities
- performance of the Italian sites, R&D activities
Funding requests:
- CPU and storage status at the Tier-2s, decommissioning
- supplementary requests for 2016 (Tier-1)
- ordinary requests for 2017 (Tier-1 and Tier-2)
3
ALICE Computing status: First year of Run2 data taking
- pp @ 13 TeV
- PbPb @ 5.02 TeV
4
ALICE Computing status: First year of Run2 data taking
- 2010-2013: 7.3 PB (one replica)
- All data processed in final reconstruction pass
- 2015: 7.2 PB (one replica)
5
ALICE Computing status: Resource usage in 2015
Overall CPU/DISK/TAPE usage (CERN-RRB-2016-049):
- CPU at T1 and T2 above pledge (opportunistic, extra-WLCG)
- DISK usage below request (delay in 2015 data reconstruction)
- high TAPE usage (unexpectedly high pile-up in pp at 13 TeV with 25 ns bunch spacing)
6
ALICE Computing status: Resource usage in 2015
ALICE Grid: new entries in 2015-2016
7
ALICE Computing status: Resource usage in 2015
- ALICE Grid: new entries in 2015-2016
- HLT farm used for offline activities (when not in run): included in the Grid as a fully virtual site
8
ALICE Computing status: Resource usage in 2015
- ALICE Grid: new entries in 2015-2016
- HLT farm used for offline activities (when not in run)
- usual share of the activities:
  - ~150 MC cycles (papers + first physics analysis of 2015 data)
  - Run1 raw data re-processing, Run2 data processing (bulk of raw and MC production for Run2, both pp and PbPb, still to be done)
  - organized and user (chaotic) analysis
9
ALICE Computing status: Resource usage in 2015
- ALICE Grid: new entries in 2015-2016
- HLT farm used for offline activities
- usual share of the activities, with 61K parallel jobs on average:
  - MC productions: 71%
  - RAW data processing: 9%
  - User analysis: 6%
  - Organized analysis: 14%
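As a rough illustration, the percentage shares above can be converted into approximate job counts. This is a minimal sketch that assumes (the slide does not say so explicitly) that each share applies directly to the 61K average of parallel jobs; the variable names are illustrative.

```python
# Shares of Grid activity quoted on the slide, in percent.
shares_pct = {
    "MC productions": 71,
    "RAW data processing": 9,
    "User analysis": 6,
    "Organized analysis": 14,
}
AVERAGE_PARALLEL_JOBS = 61_000  # "61K parallel jobs on average"

# Assumption: each share applies directly to the average job count.
jobs_by_activity = {
    name: AVERAGE_PARALLEL_JOBS * pct // 100 for name, pct in shares_pct.items()
}

for name, jobs in jobs_by_activity.items():
    print(f"{name:22s} ~{jobs} jobs")
```

Under that assumption, MC productions alone would account for roughly 43K of the concurrently running jobs.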
10
ALICE Computing status: Resource usage in 2015
Current activities:
- MC productions (papers, first physics 2015, upgrade)
- Raw data processing:
  - code improved to reduce memory consumption (now 2 GB/job)
  - 2015 data only partially reconstructed:
    - distortions in the TPC occur in runs with high interaction rate
    - specific corrections need to be developed; they are currently being validated
    - plan to complete the fully calibrated reconstruction within the next ~1-1.5 months
11
ALICE Computing status: Resource usage in 2015
Current activities:
- MC productions (papers, first physics 2015, upgrade)
- Raw data processing
- Changing replication policy: needed to cope with the available storage
  - single ESD replica
  - global disk space needed for 2015 processing: 5-6 PB (RAW + MC), barely feasible with the expected resources
12
ALICE Computing status: Resource usage in 2015
Current activities:
- MC productions (papers, first physics 2015, upgrade)
- Raw data processing
- Changing replication policy
- Popularity and cleanup:
  - removed very old MC productions
  - removed second ESD replica for low-access productions
[Plot: volume of data vs number of accesses in X = 3, 6, 12 months. First bin: data created before period X began and not accessed during that period]
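The popularity binning described for the plot can be sketched as follows. This is only an illustration, not the ALICE popularity service: the record layout (`size_tb`, `created`, `accesses`), the function name, the sample datasets, and the approximation of a month as 30 days are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def volume_by_access_count(datasets, today, months):
    """Sum dataset volume (TB) per number of accesses in the last `months` months."""
    window_start = today - timedelta(days=30 * months)  # month approximated as 30 days
    volume = defaultdict(float)
    for ds in datasets:
        n_accesses = sum(1 for d in ds["accesses"] if window_start <= d <= today)
        # Zero-access bin: count only data that already existed when the window began.
        if n_accesses == 0 and ds["created"] >= window_start:
            continue
        volume[n_accesses] += ds["size_tb"]
    return dict(volume)


# Hypothetical records, for illustration only.
datasets = [
    {"size_tb": 120.0, "created": date(2014, 3, 1), "accesses": []},
    {"size_tb": 80.0, "created": date(2015, 11, 1),
     "accesses": [date(2016, 4, 2), date(2016, 5, 1)]},
    {"size_tb": 40.0, "created": date(2016, 5, 10), "accesses": []},
]
today = date(2016, 5, 25)
for months in (3, 6, 12):
    print(months, volume_by_access_count(datasets, today, months))
```

Datasets that land in the zero-access bin for the longest window are the natural candidates for the cleanup actions listed above.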
13
ALICE Computing status: Performance of the Italian sites
[Plot labels: TO, LNL, CNAF, BA, CT; ~14% INFN]
- Problems with the LUSTRE FS at the old Bari site (BC2S)
- Fully migrated to the new ReCaS datacenter
14
ALICE Computing status: Performance of the Italian sites
Resource usage @ T2:
- following the usual internal coordination plan
- monthly meetings (performance recording) + annual workshop
- overall ~50% increase in total wall-clock time (WCT) from 2014 to 2015
15
ALICE Computing status: Performance of the Italian sites
Resource usage @ T2:
- following the usual internal coordination plan
- monthly meetings (performance recording) + annual workshop
- overall ~50% increase in total wall-clock time (WCT) from 2014 to 2015
- large upgrade in 2 sites (ReCaS) within 2015:
  - CATANIA (in production since April): ~1500 cores, 1 PB (Catania-VF)
  - BARI (in production for ALICE since mid-August):
    - ~300 servers, 105 kHS06 (~10000 cores)
      - 25 kHS06 CMS pledge + 10 kHS06 ALICE pledge
    - ~4 PB disk storage + 2.75 PB tape library
      - 900 TB CMS pledge + 900 TB ALICE pledge
    - 20 Gbit/s network connection (ready for 40 Gbit/s)
16
ALICE Computing status: New ReCaS BA infrastructure
Official opening July 9, 2015: https://agenda.infn.it/conferenceDisplay.py?confId=9856
BARI Tier-2 from BC2S to ReCaS:
- migration from LUSTRE to pure XRootD
- large opportunistic use of CPU (up to ~6000 slots)
[Plot labels: BC2S, ReCaS, Pledge 2015]
17
ALICE Computing status: Performance of the Italian sites
[Plots, one panel per site: Bari, Torino, PD-LNL, Catania; legend: Pledge. Annotations: new ReCaS center in Bari; new ReCaS center in Catania (Catania-VF)]
18
ALICE Computing status: Performance of the Italian sites
Monitoring T2 data from APEL: https://faust01.to.infn.it/#/dashboard/script/pledge_mc_sum.js
[Plot panels: BA, LNL, T1, CT, TO]
19
ALICE Computing status: R&D activity and s/w for Run3
Virtual Analysis Facility (STOA-LHC PRIN):
- Cloud-based VAF deployed in BA, CA, LNL, TO and TS
- XRootD-based Data Federation (DF) set up and populated: local redirectors at each site + national redirector in BA
- system fully tested, final PRIN report completed by end of April ’16
Software development for Run3:
- ITS standalone tracking based on cellular automaton (TO)
- ITS geometry (AL)
- response simulation for the pixel (pAlpide) chip (TS, BS-PV)
First experience with EOS at TS
20
ALICE Computing status: R&D activity on the Dashboard
The project: a Dashboard that gathers in a single graphical interface all the information concerning the ALICE activity at each site (MonALISA, local batch system, local monitoring system metrics).
- Running at the Bari T2 site for ~2 years
- Recently exported to the Torino site as well
Next steps:
- export to all ALICE T2 and other WLCG sites
- global dashboard for the Italian computing in ALICE
Abstract submitted to CHEP’16
21
ALICE Italy computing website
https://web2.infn.it/ALICE-Italia-computing/index.php/it/
22
ALICE Italy computing website
https://web2.infn.it/ALICE-Italia-computing/index.php/it/
Site sections: Activities, Contacts, Documents, Links, Events
23
Resource status and funding requests
24
Funding requests: CPU/storage status in Italy
In production at the Tier-1:
- CPU: 29000 HS06 (pledge 2016)
- DISK: 3900 TB (pledge 2016)
- TAPE: 5500 TB (pledge 2016)
In production at the Tier-2s (+ Cagliari). Available in May 2016 (including obsolete resources not yet decommissioned):
Unit | Bari | Catania | Padova-LNL | Torino | Cagliari | Total
HS06 | 12080 | 13147 | 16881 | 10373 | 1120 | 53601
TB | 984 | 1204 | 1152 | 1123 | 70 | 4533
Pledge share 2015-2016 as of February 2016 (x2)
Pledge 2016: 43845 HS06 + 4829 TB
25
Funding requests: CPU/storage status at the Tier-2s
Purchases in the second half of 2015:
- CPU: 1720 HS06 at LNL (bonus ~450 HS06) + 1400 HS06 at TO
- storage: 4x180 TB expansions at BA and LNL (bonus ~50 TB)
- outcome optimized by combining tenders (BA) with site-level purchases
26
Funding requests: CPU/storage status at the Tier-2s
2016 funding from CSN3:
- requests: 435 k€ (387 growth and replacements + 48 overhead)
- allocations: 332 k€ (308 growth and replacements + 24 overhead)
  - storage decommissioning postponed at CT/CA, and half of the decommissioning postponed at PD-LNL and TO
  - overhead request granted at 50%
  - 2016 pledges guaranteed in line with the CRSG/RRB outcome
27
Funding requests: CPU/storage status at the Tier-2s
2016 funding from CSN3:
- requests: 435 k€ (387 growth and replacements + 48 overhead)
- allocations: 332 k€ (308 growth and replacements + 24 overhead)
Split among the sites:
CPU: ~14400 HS06
- BA: 10200 HS06 (1950 growth + 1568 replacements = 3518 HS06)
- LNL: 10200 HS06 (2500 growth + 5496 replacements = 7996 HS06)
- TO: 10300 HS06 (1300 growth + 1584 replacements = 2884 HS06)
DISK: ~620 TB
- BA: 1184 TB (260 growth = 260 TB)
- LNL: 1202 TB (50 growth + 130 replacements = 180 TB)
- TO: 1223 TB (100 growth + 80 replacements = 180 TB)
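The per-site sums and the rounded totals (~14400 HS06 and ~620 TB) can be re-derived from the growth and replacement figures on the slide. The sketch below is only a consistency check; the dictionary layout and the helper name `totals` are illustrative choices.

```python
# (growth, replacements) per site, as quoted on the slide.
cpu_hs06 = {"BA": (1950, 1568), "LNL": (2500, 5496), "TO": (1300, 1584)}
disk_tb = {"BA": (260, 0), "LNL": (50, 130), "TO": (100, 80)}


def totals(plan):
    """Return the per-site purchase (growth + replacements) and the overall sum."""
    per_site = {site: growth + repl for site, (growth, repl) in plan.items()}
    return per_site, sum(per_site.values())


cpu_per_site, cpu_total = totals(cpu_hs06)
disk_per_site, disk_total = totals(disk_tb)

print("CPU  (HS06):", cpu_per_site, "total", cpu_total)
print("DISK (TB):  ", disk_per_site, "total", disk_total)
```

The CPU total comes out at 14398 HS06, which the slide rounds to ~14400.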
28
Funding requests: CPU/storage status at the Tier-2s
2016 funding from CSN3:
- requests: 435 k€ (387 growth and replacements + 48 overhead)
- allocations: 332 k€ (308 growth and replacements + 24 overhead)
Status of the 2016 purchases:
- completed:
  - BA: 3840 HS06 (BA) + 180 TB (expansion for LNL)
  - LNL: 8600 HS06 (LNL) + licence for storage expansion
- to be finalized:
  - BA: 260 TB (joint tender with CMS, ~200 k€ in total)
  - TO: 2880 HS06 + 180 TB (synergies with purchases by other experiments and C3S)
  - overhead (survey of needs completed and budget transfers made)
29
Funding requests: CPU/storage status at the Tier-2s
Updated situation including the 2016 resources:
- CPU: 45333 HS06, exceeding the pledge by 1488 HS06
- DISK: 4876 TB, exceeding the pledge by 47 TB
Available at the end of 2016 (after decommissioning and completion of the 2016 purchases*):
Unit | Bari | Catania | Padova-LNL | Torino | Cagliari | Total
HS06 | 10512 | 13147 | 11385 | 10289 | 0 | 45333
TB | 1244 | 1204 | 1202 | 1226 | 0 | 4876
Pledge 2016: 43845 HS06 + 4829 TB
* Assuming the remaining purchases at BA and TO are completed successfully
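The totals and the excess over the 2016 pledge can be re-derived from the per-site figures. The following is a small consistency check using the values read from the slide's table; the variable names are illustrative.

```python
# Resources expected to be available at the end of 2016, per site (from the slide).
hs06 = {"Bari": 10512, "Catania": 13147, "Padova-LNL": 11385, "Torino": 10289, "Cagliari": 0}
tb = {"Bari": 1244, "Catania": 1204, "Padova-LNL": 1202, "Torino": 1226, "Cagliari": 0}

PLEDGE_2016_HS06 = 43845
PLEDGE_2016_TB = 4829

total_hs06 = sum(hs06.values())
total_tb = sum(tb.values())

# Excess = available resources minus pledged resources.
excess_hs06 = total_hs06 - PLEDGE_2016_HS06
excess_tb = total_tb - PLEDGE_2016_TB

print(f"CPU:  {total_hs06} HS06, excess over pledge {excess_hs06} HS06")
print(f"DISK: {total_tb} TB, excess over pledge {excess_tb} TB")
```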
30
Funding requests: Decommissioning 2016-17
Year of decommissioning | Bari | Catania | LNL-Padova | Torino | Cagliari | Total
2016, HS06 | 1568 | 0 | 5496 | 1584 | 1120 | 9768
2016, TB | 0 | 130 | 260 | 157 | 20 | 567
2017, HS06 | 0 | 0 | 0 | 3840 | 0 | 3840
2017, TB | 0 | 114 | 0 | 117 | 0 | 231
Storage decommissioning postponed from the second half of 2016 to 2017: 130 TB (CT) + 130 TB (LNL) + 80 TB (TO) + 20 TB (CA) = 360 TB
31
Funding requests: Decommissioning 2016-18
Year of decommissioning | Bari | Catania | LNL-Padova | Torino | Cagliari | Total
2016, HS06 | 1568 | 0 | 5496 | 1584 | 1120 | 9768
2016, TB | 0 | 0 | 130 | 77 | 0 | 207
2017, HS06 | 0 | 0 | 0 | 3840 | 0 | 3840
2017, TB | 0 | 244 | 130 | 197 | 20 | 591
2018, HS06 | 6672 | 13147 | 0 | 2149 | 0 | 21968
2018, TB | 0 | 0 | 0 | 205 | 0 | 205
Storage decommissioning postponed from the second half of 2016 to 2017: 130 TB (CT) + 130 TB (LNL) + 80 TB (TO) + 20 TB (CA) = 360 TB
ReCaS decommissioning (BA and CT) expected in 2018 = 20000 HS06
32
Funding requests: Decommissioning 2016-18
Year of decommissioning | Bari | Catania | LNL-Padova | Torino | Cagliari | Total
2016, HS06 | 1568 | 0 | 5496 | 1584 | 1120 | 9768
2016, TB | 0 | 0 | 130 | 77 | 0 | 207
2017, HS06 | 0 | 0 | 0 | 3840 | 0 | 3840
2017, TB | 0 | 244 | 130 | 197 | 20 | 591
2018, HS06 | 6672 | 13147 | 0 | 2149 | 0 | 21968
2018, TB | 0 | 0 | 0 | 205 | 0 | 205
Overall Tier-2 situation at the start of 2017:
- CPU: 45333 - 3840 = 41493 HS06
- DISK: 4876 - 591 = 4285 TB
33
Funding requests: RRB April 2016
INFN share for 2017:
- CPU and DISK for Tier-1 and Tier-2: 18.9% (18.5% for 2016)
- TAPE for Tier-1: 34.8% (35.2% for 2016, 41.1% for 2015)
[Table label: RRB October 2015]
34
Funding requests: RRB April 2016
- +7% (4%) CPU at Tier-1 (0): increased processing time for high pile-up pp events (x2) + TPC calibration issues
- +30% (22%) TAPE at Tier-1 (0): increased raw data volume for pp events (x3.5), as observed in the 2015 sample
35
Funding requests: RRB April 2016
- +7% (4%) CPU at Tier-1 (0): increased processing time for high pile-up pp events (x2) + TPC calibration issues
- +30% (22%) TAPE at Tier-1 (0): increased raw data volume for pp events (x3.5), as observed in the 2015 sample
Supplementary request for 2016 at the Tier-1:
- CPU: 2700 HS06, 35 k€ (revised 2016 pledge: 31752 HS06)
- TAPE: 1.6 PB, 40 k€ (revised 2016 pledge: 7.1 PB)
36
Funding requests: RRB April 2016
Increases from the revised 2016 figures to 2017:
Resource | T0 | T1 | T2
CPU | 13.8% | 31.5% | 17%
DISK | 27.4% | 16.8% | 19.9%
TAPE | 30.8% | 39.9% |
[Table label: RRB October 2015]
37
Funding requests: 2017 requests for Tier-1 and Tier-2
Item | CPU Tier-1 (HS06) | DISK Tier-1 (TB) | TAPE Tier-1 (TB) | CPU Tier-2 (HS06) | DISK Tier-2 (TB)
Pledged (T1) / available minus decommissioned (T2) | 31752 | 3885 | 7064 | 41493 | 4285
Scrutinized ALICE 2017 | 41769 | 4725 | 9883 | 51975 | 5916
Delta | 10017 | 840 | 2819 | 10482 | 1631
Estimated cost (k€) | 130.2 | 176.4 | 70.5 | 115.3 | 326.2
Total (k€): 377.1 for the Tier-1, 441.5 for the Tier-2s
Overhead T2 (k€): 54.1
Estimated unit costs for T2 (T1): 11 (13) €/HS06 and 200 (210) €/TB
Tier-1 decommissioning: not included
Tier-2 overhead: 6% CPU + 5% DISK (network) + 7% total (additional servers)
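The cost estimates can be reproduced from the deltas and the quoted unit costs. The sketch below covers CPU and DISK only, since the slide gives no unit cost for tape. Reading the overhead rule as 6% of the Tier-2 CPU cost plus 5% of the Tier-2 disk cost plus 7% of the Tier-2 total is an interpretation, adopted here because it reproduces the quoted 54.1 k€; the key names are illustrative.

```python
# Figures from the slide: current resources and scrutinized 2017 requirements.
available = {"CPU T1": 31752, "DISK T1": 3885, "TAPE T1": 7064,
             "CPU T2": 41493, "DISK T2": 4285}
scrutinized_2017 = {"CPU T1": 41769, "DISK T1": 4725, "TAPE T1": 9883,
                    "CPU T2": 51975, "DISK T2": 5916}

# Unit costs quoted on the slide (EUR per HS06 or per TB); none is given for tape.
unit_cost_eur = {"CPU T1": 13, "DISK T1": 210, "CPU T2": 11, "DISK T2": 200}

delta = {k: scrutinized_2017[k] - available[k] for k in available}
cost_keur = {k: delta[k] * unit_cost_eur[k] / 1000 for k in unit_cost_eur}

t2_total_keur = cost_keur["CPU T2"] + cost_keur["DISK T2"]

# Assumed reading of "6% CPU + 5% DISK (network) + 7% total (additional servers)".
t2_overhead_keur = (0.06 * cost_keur["CPU T2"]
                    + 0.05 * cost_keur["DISK T2"]
                    + 0.07 * t2_total_keur)

for item, value in cost_keur.items():
    print(f"{item}: delta {delta[item]}, cost {value:.1f} k EUR")
print(f"Tier-2 total: {t2_total_keur:.1f} k EUR, overhead: {t2_overhead_keur:.1f} k EUR")
```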
38
Funding requests: 2017 requests per Tier-2 site
Decommissioning | CPU (HS06) | CPU cost (k€) | DISK (TB) | DISK cost (k€) | Total (k€)
Bari | 0 | 0.0 | 0 | |
Catania | 0 | 0.0 | 244 | 48.8 |
LNL-Padova | 0 | 0.0 | 130 | 26.0 |
Torino | 3840 | 42.2 | 197 | 39.4 | 81.6
Cagliari | 0 | 0.0 | 20 | 4.0 |
Total decommissioning | 3840 | 42.2 | 591 | 118.2 | 160.4
Net growth | 6642 | 73.1 | 1040 | 207.9 | 281.0
Decommissioning + growth | 10482 | 115.3 | 1631 | 326.1 | 441.4
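The replacement costs for the decommissioned resources follow from the Tier-2 unit costs of 11 €/HS06 and 200 €/TB quoted in the 2017 request summary. The sketch below recomputes them per site from the figures in the table; the function and variable names are illustrative.

```python
EUR_PER_HS06 = 11   # Tier-2 CPU unit cost quoted in the 2017 request summary
EUR_PER_TB = 200    # Tier-2 disk unit cost quoted in the 2017 request summary

# Resources to be decommissioned in 2017 per site: (HS06, TB), from the table.
decommissioning = {
    "Bari": (0, 0),
    "Catania": (0, 244),
    "LNL-Padova": (0, 130),
    "Torino": (3840, 197),
    "Cagliari": (0, 20),
}


def replacement_cost_keur(hs06, tb):
    """Cost in k EUR of replacing the given CPU and disk capacity."""
    return (hs06 * EUR_PER_HS06 + tb * EUR_PER_TB) / 1000


per_site_keur = {site: replacement_cost_keur(*res) for site, res in decommissioning.items()}
total_hs06 = sum(hs06 for hs06, _ in decommissioning.values())
total_tb = sum(tb for _, tb in decommissioning.values())
total_keur = replacement_cost_keur(total_hs06, total_tb)

for site, cost in per_site_keur.items():
    print(f"{site:11s} {cost:6.1f} k EUR")
print(f"Total: {total_hs06} HS06 + {total_tb} TB -> {total_keur:.1f} k EUR")
```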
39
Backup
40
ALICE Computing status: Resource usage in 2015
CPU resource evolution:
- steady growth of the number of active jobs
- system scaled from 500 to 100,000 concurrently running jobs
- scheduled analysis now prevailing over chaotic analysis
  - organized analysis +60% in 2015 with respect to 2014
  - better efficiency
41
ALICE Computing status: Run2 overview
42
ALICE Computing status: Status of 2015 data processing
- Substantial IR-induced distortions in the TPC
- They affect both p-p and Pb-Pb data
- Sophisticated correction algorithms developed over the past 6 months
- Data partially reconstructed (first physics, lower-IR runs)
- Bulk of the reconstruction still pending