Published by Cipriano Antonucci. Modified 8 years ago.
1
ALICE Computing: status and requests
Domenico Elia
Meeting with the LHC Computing Referees, Bologna, 25 May 2015
2
Outline
ALICE Computing status:
- resource usage in 2014
- performance of the Italian sites, R&D activity on the VAF
- evolution of the computing model (CM) for Run2
Funding requests:
- CPU and storage situation at the Tier-2s, decommissioning
- 2016 requests for Tier-1 and Tier-2
3
ALICE Computing status: Resource usage in 2014
Overall CPU/DISK/TAPE usage:
- CPU @ T1, T2 over pledge (opportunistic, extra-WLCG)
- DISK usage ~70% (87% full wherever there is a good network connection)
- TAPE usage ~90% (but 50% @ T1, will improve with Run2)
Source: CERN-RRB-2015-014
4
ALICE Computing status: Resource usage in 2014
Main activity 2014 / May 2015:
- Run1 data reprocessing and associated MC:
  - pp 2010 (pass4), pp and pPb 2012, pPb 2013 (pass3)
  - full detector recalibration + improved software, all with the same code
  - pp 2011 reprocessing being evaluated (overlap with Run2)
- further MC productions (~120 cycles):
  - requests from PWGs (68% pp, 18% pPb, 14% PbPb)
  - first large-scale production for Run2 (new detector setup)
  - ~4% of generations dedicated to upgrade studies (Run3)
- analysis (user and organized trains)
- ALICE recommissioning for Run2:
  - test of upgraded detector readout, trigger, DAQ, recording chain
  - cosmics trigger data taking with Offline processing
5
ALICE Computing status: Resource usage in 2014
Main activity 2014 / May 2015:
- Run1 data reprocessing and associated MC
- further MC productions (~120 cycles)
- analysis (user and organized trains)
Average:
- ~45K concurrent jobs
- ~99.5% availability
- 85% CPU efficiency @ T0, T1
- 79% CPU efficiency @ T2
6
ALICE Computing status: Resource usage in 2014
Main activity 2014 / May 2015:
- Run1 data reprocessing and associated MC
- further MC productions (~120 cycles)
- analysis (user and organized trains)
Share by job type:
- MC productions: 71% @ all centres
- RAW data processing: 6% @ T0/T1 only
- User analysis: 11% @ all centres
- Organized analysis: 12% @ all centres
Remarks:
- individual analysis decreased by 50% in the period 2012-2014
- still ample room to increase the share of organized analysis
- reducing individual analysis by a factor of 2 could still give a ~2-5% gain in efficiency
7
ALICE Computing status: Resource usage in 2014
Main activity 2014 / May 2015:
- Run1 data reprocessing and associated MC
- further MC productions (~120 cycles)
- analysis (user and organized trains)
- ALICE recommissioning for Run2
New and upcoming WLCG sites:
- KR-KISTI (Korea), T1: in production in 2014
- NRC-KI (Russia), T1: in production in 2014
- UNAM (Mexico), T2: ramping up, MoU signed in Nov 2014
- COMSATS (Pakistan), T2: MoU signed in March 2015
- CHPC (South Africa), T2: MoU signed in April 2015
8
ALICE Computing status: Performance of the Italian sites
[Plot not preserved; annotation: ~15%]
9
ALICE Computing status: Performance of the Italian sites
Storage availability: [plot not preserved]
10
ALICE Computing status: Performance of the Italian sites
Resource usage @ T1:
- CPU ~150% of pledge (2015)
- DISK ~85% (still @ pledge 2014)
- TAPE largely underused (~700 TB)
11
ALICE Computing status: Performance of the Italian sites
Resource usage @ T2:
- largely benefits from strict internal coordination: monthly meetings (performance recording) + annual workshop
- overall ~35% increase in total WCT from 2013 to 2014
- large upgrade at 2 sites (ReCaS):
  - BARI (almost ready, in production at the beginning of June)
  - CATANIA (in production since April, ~1500 cores, 1 PB: Catania-VF)
12
ALICE Computing status: Performance of the Italian sites
[Plots not preserved; labels: Bari, Torino, PD-LNL, Catania, Pledge, Catania-VF]
13
ALICE Computing status: Performance of the Italian sites
Resource usage @ T2:
- largely benefits from strict internal coordination: monthly meetings (performance recording) + annual workshop
- overall ~35% increase in total WCT from 2013 to 2014
- large upgrade at 2 sites (ReCaS):
  - BARI (almost ready, in production at the beginning of June)
  - CATANIA (in production since April, ~1500 cores, 1 PB: Catania-VF)
- monitoring of T2 data from APEL (Andrea Guarise):
  - 2014 and 2015 pledge values in place
  - HS06/SI2K conversion factor from sites (BDII) checked/updated
  - WCT cross-checked against experiment (MonALISA) and local monitoring
14
ALICE Computing status: Performance of the Italian sites
[Table or plot not preserved]
(*) In April Catania-VF was monitored only through MonALISA; on EGI since May
15
ALICE Computing status: Performance of the Italian sites
Tier-2 coordination
[Table or plot not preserved]
Conversion of the EGI data from hours to Wall_h_kSi2k
(*) In April Catania-VF was monitored only through MonALISA; on EGI since May
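The conversion of accounting hours into Wall_h_kSi2k mentioned on this slide can be sketched as follows. This is only an illustration under stated assumptions: it uses the conventional WLCG factor of 4 HS06 per kSI2K and a per-core HS06 rating, neither of which is given on the slide, and the function name and sample values are hypothetical.

```python
# Assumption: conventional WLCG factor (4 HS06 = 1 kSI2K); not stated on the slide.
HS06_PER_KSI2K = 4.0


def wall_hours_to_ksi2k_hours(wall_hours, hs06_per_core):
    """Convert wall-clock core-hours into kSI2K-hours (hypothetical helper).

    wall_hours    -- wall-clock hours accumulated on one core
    hs06_per_core -- HS06 rating of that core, e.g. as published by the site
    """
    if wall_hours < 0 or hs06_per_core <= 0:
        raise ValueError("wall_hours must be >= 0 and hs06_per_core > 0")
    return wall_hours * hs06_per_core / HS06_PER_KSI2K


if __name__ == "__main__":
    # Sample numbers are made up for illustration only.
    print(wall_hours_to_ksi2k_hours(100.0, 10.0))
```

In practice the per-core rating differs from site to site, which is why the conversion factors published by the sites need to be checked before comparing sites.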
16
ALICE Computing status: Performance of the Italian sites
Tier-2 coordination
[Table or plot not preserved; per-site annotations:]
- Bari (-16%)
- Torino (-15%)
- PD-LNL (-7‰)
- Catania (+2%)
(*) In April Catania-VF was monitored only through MonALISA; on EGI since May
17
ALICE Computing status: R&D activity on the VAF
In the framework of the STOA-LHC PRIN:
- Torino VAF going to be accounted for ALICE
- similar (test) cloud infrastructures deployed in 2014:
  - Bari, Cagliari, Legnaro, Trieste (OpenStack)
  - Catania could join (new T2 infrastructure)
- many activities ongoing, reported at CHEP'15 (April):
  - Interoperating Cloud-based Virtual Farms for the ALICE experiment at the LHC (Trieste)
  - Monitoring of IaaS and scientific applications on the Cloud using Elasticsearch ecosystem (Torino)
  - Managing competing elastic grid and cloud computing applications using OpenNebula (Torino)
  - Local storage federation through XRootD architecture for interactive distributed analysis (Bari)
- white paper to be finalized by the end of 2015
- further (connected) activities: parallel computing (TS, BA), dashboard for the Italian sites (BA)
18
ALICE Computing status: Evolution of CM for Run2
- Targeting an integrated luminosity of 1 nb^-1 for PbPb:
  - by combination of Run1 and Run2 statistics
  - consistent with the ALICE approved programme
- 4-fold increase in instantaneous luminosity for PbPb
- Detector upgrades: complete TRD/PHOS, new DCAL
- Double event rate of TPC/TRD: consolidation of TPC and TRD readout electronics
- Increased capacity of HLT and DAQ: rate up to 8 GB/sec to T0 (for Heavy-Ion data taking)
19
ALICE Computing status: Evolution of CM for Run2
- ALICE Grid model largely unchanged in Run2:
  - integration of every new computing centre
  - average of 2 replicas of analysis objects: dependency on resource stability, 1 copy for the least popular data
- low differentiation of tasks:
  - T0/T1 still raw data keepers/producers
  - all other tasks (MC + analysis) performed everywhere
- tasks generally sent to data, but data can go to tasks if needed:
  - jobs go to data; in case of failure, read from the closest replica (<5%)
  - ALICE global data distribution by exclusive use of the xrootd protocol
- analysis input mostly on AODs (limited use of ESDs)
- push analyzers to organized trains (LEGO framework)
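The "jobs go to data, otherwise read from the closest replica" policy can be illustrated with a small sketch. It is not the ALICE Grid implementation: the `Replica` type, the `distance` metric and the sample site names used as data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str        # site hosting this copy of the data
    distance: int    # hypothetical network-distance metric; smaller is closer
    available: bool  # whether the storage element currently answers


def choose_replica(job_site, replicas):
    """Prefer a working local replica; otherwise fall back to the closest one."""
    usable = [r for r in replicas if r.available]
    if not usable:
        raise LookupError("no replica is currently available")
    for replica in usable:
        if replica.site == job_site:
            return replica  # normal case: the job runs where the data is
    # Fallback case: remote read from the closest working replica.
    return min(usable, key=lambda r: r.distance)


if __name__ == "__main__":
    copies = [Replica("Bari", 0, False), Replica("Torino", 3, True),
              Replica("CNAF", 1, True)]
    print(choose_replica("Bari", copies).site)
```

With an average of 2 replicas per analysis object, the fallback branch only has something to choose from when the second copy is on a stable storage element, which is the dependency on resource stability noted above.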
20
ALICE Computing status: Evolution of CM for Run2
Main software and process improvements:
- new version of the software framework (AliRoot 5.x):
  - effort to improve the performance of the ALICE reconstruction software
  - use TRD points in the fit (improve high-momentum resolution)
  - reduce memory requirements during calibration and reconstruction
- use of HLT for online Raw data compression (factor 4): already tested in Run1, implies a reduction of tape storage @ Tier-0/1
- use of HLT for calibration: move the first calibration iteration online
- use of HLT track seeds for offline reconstruction
- improve the performance of GEANT4 simulation for ALICE
- further development of fast and parametrized simulation
21
ALICE Computing status: Evolution of CM for Run2
Main software and process improvements
Additional improvements:
- start adapting ALICE distributed computing to the Cloud
- use of the HLT farm for offline processing: corresponds to an additional 3% of CPU resources
- improving the performance of the organized analysis trains: speeding up and improving the efficiency of the analysis activity by active data management
- explore contributed resources, i.e. spare CPU cycles on supercomputers: collaborating with other experiments on this issue
22
ALICE Computing status: Evolution of CM for Run2
Basic assumptions for the Run2 resource estimate:
- same CPU power needed for reconstruction
- 25% larger raw event size:
  - additional detectors, detector coverage
  - higher track multiplicity with increased beam energy and pileup
- MC productions: 100% pp, pPb + 30-40% PbPb events
[Plots not preserved; annotations: T1/2 2016: +25%; T1/2 2016: +17%]
23
Summary
Run2:
- data volume in the period 2015-2018 expected to be ~3x Run1
- focus of the Grid development will be on improving the analysis efficiency and decreasing the turnaround time of organized trains
- several other software and process improvements
- site performance and stability will continue to be a key factor for the success of ALICE offline computing
- planned resource increase expected to meet the demands; working on data popularity monitoring and replica limitation
Run3:
- TDR submitted to LHCC, final discussion in the first week of June
24
Funding requests: CPU/Storage situation in Italy
In production at the Tier-1:
- CPU: 22800 HS06 (pledge 2015)
- DISK: 1920 TB (pledge 2014*)
(*) At pledge 2015 (3380 TB) in October

In production at the Tier-2s (+ Cagliari), available in May 2015 (including obsolete resources not yet decommissioned):

        Bari  Catania  Padova-LNL  Torino  Cagliari  Total
HS06    8240    13147        8923    9954      1120  41384
TB       360     1204         677     844        70   3155

2015 resources: ReCaS CT in production, the rest between June and September (see next slide)
Pledge 2015: 38600 HS06 + 4381 TB
25
Funding requests: CPU/Storage situation at the Tier-2s
2015 allocation, ReCaS part:
- Bari (in production in June):
  - BA replacements: 1568 HS06 + 360 TB
- Catania (in production since the beginning of April):
  - CT replacements: 1075 HS06, CA replacements: 840 HS06
  - share of the total ALICE net growth: 1550 HS06 + 500 TB
26
Funding requests: CPU/Storage situation at the Tier-2s
2015 allocation, ReCaS part:
- Bari (in production in June): 1568 HS06 + 360 TB
- Catania (already in production): 3465 HS06 + 500 TB
2015 allocation, CSN3 part (224.5 k€):
- CPU share (32.5 k€), transferred to PD-LNL and TO:
  - PD-LNL replacements: 1280 HS06, TO replacements: 1400 HS06
- Storage share (192 k€):
  - completion of the total net growth + replacements at PD-LNL and TO
  - tender managed in Bari (specifications almost ready, to the GE in June)
  - expected: 6 x 180 TB (1 BA, 2 TO, 3 PD-LNL) = 1.08 PB
  - substantial savings with expansions only (BA and PD-LNL)
  - ~15 k€ more might be needed (2 servers at TO)
27
Funding requests: CPU/Storage situation at the Tier-2s
Updated with the 2015 resources:
- CPU: 39216 HS06, in excess of the pledge by 616 HS06
- DISK: 4423 TB, in excess of the pledge by 42 TB

Available at the end of 2015 (excluding all 2015 decommissioning, including the 2015 resources):

        Bari  Catania  Padova-LNL  Torino  Cagliari  Total
HS06    8240    13147        7736    8973      1120  39216
TB       924     1204        1152    1123        20   4423

Pledge 2015: 38600 HS06 + 4381 TB
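The excess over pledge quoted on this slide can be re-derived from the per-site figures. The sketch below is only an illustration: the numbers are copied from the slide, while the dictionary and function names are hypothetical.

```python
# Per-site resources available at the end of 2015, copied from the slide.
CPU_HS06 = {"Bari": 8240, "Catania": 13147, "Padova-LNL": 7736,
            "Torino": 8973, "Cagliari": 1120}
DISK_TB = {"Bari": 924, "Catania": 1204, "Padova-LNL": 1152,
           "Torino": 1123, "Cagliari": 20}

# 2015 pledge, copied from the slide.
PLEDGE_CPU_HS06 = 38600
PLEDGE_DISK_TB = 4381


def excess_over_pledge(per_site, pledge):
    """Return the summed per-site capacity minus the pledge (hypothetical helper)."""
    return sum(per_site.values()) - pledge


if __name__ == "__main__":
    print("CPU excess (HS06):", excess_over_pledge(CPU_HS06, PLEDGE_CPU_HS06))
    print("Disk excess (TB):", excess_over_pledge(DISK_TB, PLEDGE_DISK_TB))
```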
28
Funding requests: Decommissioning 2015-17

Year  Unit  Bari  Catania  LNL-Padova  Torino  Cagliari  Total
2015  HS06  1568     1075        1280    1400       840   6163
2015  TB     360        0          65      81        50    556
2016  HS06  1568        0        5496    1584      1120   9768
2016  TB       0      130         260     157        20    567
2017  HS06     0        0           0       0         0      0
2017  TB       0      114           0     117         0    231
29
Funding requests: Decommissioning 2015-17
(Decommissioning table as on the previous slide.)
Annotations on the table: second half of 2016; 2017
30
Funding requests: Decommissioning 2015-17
(Decommissioning table as on the previous slide.)
Overall Tier-2 situation in 2016:
- CPU: 39216 - 9768 = 29448 HS06
- DISK: 4423 - 567 = 3856 TB
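The 2016 totals follow from subtracting the 2016 decommissioning from the end-of-2015 availability, site by site. A minimal sketch, with values copied from the slides and hypothetical names; the per-site results it produces are derived here and do not appear on the slides.

```python
# (HS06, TB) available at the end of 2015, copied from the slides.
AVAILABLE_END_2015 = {
    "Bari": (8240, 924),
    "Catania": (13147, 1204),
    "Padova-LNL": (7736, 1152),
    "Torino": (8973, 1123),
    "Cagliari": (1120, 20),
}

# (HS06, TB) to be decommissioned in 2016, copied from the slides.
DECOMMISSIONED_2016 = {
    "Bari": (1568, 0),
    "Catania": (0, 130),
    "Padova-LNL": (5496, 260),
    "Torino": (1584, 157),
    "Cagliari": (1120, 20),
}


def remaining_2016():
    """Per-site (HS06, TB) left once the 2016 decommissioning is applied."""
    return {
        site: (cpu - DECOMMISSIONED_2016[site][0],
               disk - DECOMMISSIONED_2016[site][1])
        for site, (cpu, disk) in AVAILABLE_END_2015.items()
    }


if __name__ == "__main__":
    left = remaining_2016()
    for site, (cpu, disk) in left.items():
        print(f"{site:12s} {cpu:6d} HS06 {disk:5d} TB")
    print("Total CPU :", sum(cpu for cpu, _ in left.values()), "HS06")
    print("Total disk:", sum(disk for _, disk in left.values()), "TB")
```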
31
Funding requests: Outcome of the April 2015 RRB
INFN share for 2016:
- CPU and DISK for Tier-1 and Tier-2: 18.5% (19.3% in 2015)
- TAPE for Tier-1: 35.2% (41.1% in 2015)
32
Funding requests: 2016 requests, Tier-1

                                                   CPU Tier-1 (HS06)  DISK Tier-1 (TBn)
Pledged (T1) / available minus decommissioned (T2)             22800               3382
Scrutinized ALICE 2016                                          29045               3885
Delta                                                            6245                503
Estimated cost (k€)                                              87.4              120.7
Total (k€)                                                      208.1

- Estimated unit costs, T2 (T1): 12 (14) € / HS06 and 220 (240) € / TBn
- Tape @ T1: pledge share from 4192 TB (2015) to 5491 TB (2016)
- Tier-1 decommissioning: not included
33
Funding requests: 2016 requests, Tier-1 and Tier-2

                                                   CPU Tier-1  DISK Tier-1  CPU Tier-2  DISK Tier-2
                                                       (HS06)        (TBn)      (HS06)        (TBn)
Pledged (T1) / available minus decommissioned (T2)      22800         3382       29448         3856
Scrutinized ALICE 2016                                   29045         3885       43845         4829
Delta                                                     6245          503       14397          973
Estimated cost (k€)                                       87.4        120.7       172.7        214.0
Total (k€)                                               208.1                    386.7
Overhead T2 (k€)                                                                   48.1

- Estimated unit costs, T2 (T1): 12 (14) € / HS06 and 220 (240) € / TBn
- Tape @ T1: pledge share from 4192 TB (2015) to 5491 TB (2016)
- Tier-1 decommissioning: not included
- Tier-2 overhead: 6% of CPU + 5% of DISK (network) + 7% of total (additional servers)
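The cost figures in the table can be reproduced from the deltas and the unit costs. The sketch below is an illustration with hypothetical names; it assumes that the overhead percentages apply to the Tier-2 CPU cost, disk cost and total cost respectively, an interpretation that reproduces the quoted 48.1 k€. Computed this way the Tier-2 CPU cost comes out at about 172.8 k€, against 172.7 k€ in the table, so agreement is only within rounding.

```python
# Unit costs in euro per HS06 and per TBn, copied from the slide.
UNIT_COST_EUR = {"T1": {"cpu": 14, "disk": 240},
                 "T2": {"cpu": 12, "disk": 220}}

# Requested deltas (HS06, TBn), copied from the slide.
DELTA = {"T1": (6245, 503), "T2": (14397, 973)}


def cost_keur(tier):
    """Return (CPU cost, disk cost) in k€ for the given tier."""
    cpu, disk = DELTA[tier]
    return (cpu * UNIT_COST_EUR[tier]["cpu"] / 1000,
            disk * UNIT_COST_EUR[tier]["disk"] / 1000)


def t2_overhead_keur():
    """Tier-2 overhead in k€: 6% of CPU + 5% of disk + 7% of total (assumed reading)."""
    cpu_cost, disk_cost = cost_keur("T2")
    return 0.06 * cpu_cost + 0.05 * disk_cost + 0.07 * (cpu_cost + disk_cost)


if __name__ == "__main__":
    for tier in ("T1", "T2"):
        cpu_cost, disk_cost = cost_keur(tier)
        print(f"{tier}: CPU {cpu_cost:.1f} k€, disk {disk_cost:.1f} k€, "
              f"total {cpu_cost + disk_cost:.1f} k€")
    print(f"T2 overhead: {t2_overhead_keur():.1f} k€")
```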
34
Funding requests: 2016 requests, per Tier-2 site
Full request:

                          CPU (HS06)  CPU (k€)  Disk (TB)  Disk (k€)  Total (k€)
Decommissioning
  Bari                          1568      18.8          0        0.0        18.8
  Catania                          0       0.0        130       28.6           -
  LNL-Padova                    5496      66.0        260       57.2       123.2
  Torino                        1584      19.0        157       34.5        53.5
  Cagliari                      1120      13.4         20        4.4        17.8
Decommissioning, total          9768     117.2        567      124.7       242.0
Net growth                      4629      55.5        406       89.2       144.8
Decommissioning + growth       14397     172.8        973      214.0       386.7

(-: cell empty in the source)
35
Funding requests: 2016 requests, per Tier-2 site
Full request:

                          CPU (HS06)  CPU (k€)  Disk (TB)  Disk (k€)  Total (k€)
Decommissioning
  Bari                          1568      18.8          0        0.0        18.8
  Catania                          0       0.0        130       28.6           -
  LNL-Padova                    5496      66.0        260       57.2       123.2
  Torino                        1584      19.0        157       34.5        53.5
  Cagliari                      1120      13.4         20        4.4        17.8
Decommissioning, total          9768     117.2        567      124.7       242.0
Net growth                      4629      55.5        406       89.2       144.8
Decommissioning + growth       14397     172.8        973      214.0       386.7

With storage decommissioning deferred:

                          CPU (HS06)  CPU (k€)  Disk (TB)  Disk (k€)  Total (k€)
Decommissioning
  Bari                          1568      18.8          0        0.0        18.8
  Catania                          0       0.0          0          -           -
  LNL-Padova                    5496      66.0          0        0.0        66.0
  Torino                        1584      19.0          0        0.0        19.0
  Cagliari                      1120      13.4         20        4.4        17.8
Decommissioning, total          9768     117.2         20        4.4       121.6
Net growth                      4629      55.5        406       89.2       144.8
Decommissioning + growth       14397     172.8        426       93.6       266.4

(-: cell empty in the source)
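The two scenarios can be compared with a short calculation based on the Tier-2 unit costs quoted earlier in the talk (12 € / HS06 and 220 € / TB). This is a sketch with hypothetical names, not the procedure used to build the tables; its totals agree with the slide to within 0.1-0.2 k€ because the slide sums values that were already rounded. In the deferred scenario only the Cagliari storage is still replaced, as in the second table.

```python
# Tier-2 unit costs in euro, as quoted in the talk.
EUR_PER_HS06 = 12
EUR_PER_TB = 220

# 2016 decommissioning per site (HS06, TB), copied from the slide.
DECOMMISSIONED_2016 = {
    "Bari": (1568, 0),
    "Catania": (0, 130),
    "LNL-Padova": (5496, 260),
    "Torino": (1584, 157),
    "Cagliari": (1120, 20),
}
NET_GROWTH = (4629, 406)  # (HS06, TB), copied from the slide

# Sites whose storage replacement is kept even when the rest is deferred.
STORAGE_NOT_DEFERRED = {"Cagliari"}


def cost_keur(hs06, tb):
    """Cost in k€ of buying the given CPU and disk capacity."""
    return (hs06 * EUR_PER_HS06 + tb * EUR_PER_TB) / 1000


def request_keur(defer_storage):
    """Total 2016 Tier-2 request in k€, with or without storage deferral."""
    total = cost_keur(*NET_GROWTH)
    for site, (hs06, tb) in DECOMMISSIONED_2016.items():
        if defer_storage and site not in STORAGE_NOT_DEFERRED:
            tb = 0  # storage replacement postponed for this site
        total += cost_keur(hs06, tb)
    return total


if __name__ == "__main__":
    full = request_keur(defer_storage=False)
    deferred = request_keur(defer_storage=True)
    print(f"Full request:     {full:.1f} k€")
    print(f"Storage deferred: {deferred:.1f} k€")
    print(f"Difference:       {full - deferred:.1f} k€")
```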
36
BACKUP
37
ALICE CM: Focus on Run2
Physics programme, upgrades
- Targeting an integrated luminosity of 1 nb^-1 for PbPb:
  - by combination of Run1 and Run2 statistics
  - consistent with the ALICE approved programme
- 4-fold increase in instantaneous luminosity for PbPb
38
ALICE CM: Focus on Run2
Infrastructure improvements
- Focus on SE stability:
  - major factor for successful analysis and high CPU efficiency
  - goal for all SEs: > 98% availability
39
ALICE CM: Focus on Run2
Infrastructure improvements
- Focus on SE stability
- LHCone programme
- Network use will increase
- IPv6 adoption
- Refurbishment of SAM/SUM tests:
  - WLCG monitoring consolidation project, at an advanced stage
  - site tests will increasingly reflect the VO tests, in the ALICE case provided by MonALISA
40
ALICE CM: Focus on Run2
Infrastructure improvements
- Focus on SE stability
- LHCone programme:
  - brings substantial improvement in inter-site connectivity
  - allows for further dilution of the boundaries between sites and tasks
  - Europe largely covered, focus on South America and Asia
- Network use will increase:
  - large data volumes, more to transfer between sites
  - remote access to storage
41
ALICE CM: Focus on Run2
Infrastructure improvements
- Focus on SE stability
- LHCone programme
- Network use will increase
- IPv6 adoption:
  - IPv4 address depletion is already a fact for new sites
  - ALICE services are IPv6 ready
  - xrootd v.4 should be IPv6 ready (release at the end of May)
  - other services are being brought into compliance
42
ALICE Computing status
New virtual infrastructure at CT (Catania)
[Figure not preserved]