
1. ALICE Computing: status and requests
Domenico Elia
Meeting with the LHC Computing Referees (Riunione con Referee Calcolo LHC), Bologna, 25 May 2015

2. Outline
- ALICE Computing status:
  - resource usage in 2014
  - performance of the Italian sites, R&D activity on the VAF
  - evolution of the computing model (CM) for Run2
- Financial requests:
  - CPU and storage situation at the Tier-2s, decommissioning
  - 2016 requests for Tier-1 and Tier-2

3. ALICE Computing status: Resource usage in 2014
- Overall CPU/DISK/TAPE usage:
  - CPU @ T1, T2 over pledge (opportunistic, extra-WLCG)
  - DISK usage ~70% (87% full wherever there is a good network connection)
  - TAPE usage ~90% (but 50% @ T1, will improve with Run2)
Reference: CERN-RRB-2015-014

4. ALICE Computing status: Resource usage in 2014
- Main activity 2014 / May 2015:
  - Run1 data reprocessing and associated MC:
    - pp 2010 (pass4), pp and pPb 2012, pPb 2013 (pass3)
    - full detector recalibration + improved software, all with the same code
    - pp 2011 reprocessing being evaluated (overlap with Run2)
  - further MC productions (~120 cycles):
    - requests from PWGs (68% pp, 18% pPb, 14% PbPb)
    - first large-scale production for Run2 (new detector setup)
    - ~4% of generations dedicated to upgrade studies (Run3)
  - analysis (user and organized trains)
  - ALICE recommissioning for Run2:
    - test of upgraded detector readout, trigger, DAQ, recording chain
    - cosmics trigger data taking with Offline processing

5. ALICE Computing status: Resource usage in 2014
- Main activity 2014 / May 2015:
  - Run1 data reprocessing and associated MC
  - further MC productions (~120 cycles)
  - analysis (user and organized trains)
Averages: ~45K concurrent jobs, ~99.5% availability, 85% CPU efficiency @ T0 and T1, 79% CPU efficiency @ T2

6. ALICE Computing status: Resource usage in 2014
- Main activity 2014 / May 2015:
  - Run1 data reprocessing and associated MC
  - further MC productions (~120 cycles)
  - analysis (user and organized trains)
Job shares:
  - MC productions: 71% @ all centres
  - RAW data processing: 6% @ T0/T1 only
  - User analysis: 11% @ all centres
  - Organized analysis: 12% @ all centres
- individual analysis decreased by 50% in the period 2012-2014
- still ample room to increase the share of organized analysis
- reducing individual analysis by a factor of 2 could still give a ~2-5% gain in efficiency

7. ALICE Computing status: Resource usage in 2014
- Main activity 2014 / May 2015:
  - Run1 data reprocessing and associated MC
  - further MC productions (~120 cycles)
  - analysis (user and organized trains)
  - ALICE recommissioning for Run2
- New and upcoming WLCG sites:
  - KR-KISTI (Korea), T1: in production in 2014
  - NRC-KI (Russia), T1: in production in 2014
  - UNAM (Mexico), T2: ramping up, MoU signed in Nov 2014
  - COMSATS (Pakistan), T2: MoU signed in March 2015
  - CHPC (South Africa), T2: MoU signed in April 2015

8. ALICE Computing status: Performance of the Italian sites
(figure; annotation on the plot: ~15%)

9. ALICE Computing status: Performance of the Italian sites
- Storage availability: (figure)

10. ALICE Computing status: Performance of the Italian sites
- Resource usage @ T1:
  - CPU ~150% of pledge (2015)
  - DISK ~85% (still @ pledge 2014)
  - TAPE largely underused (~700 TB)

11. ALICE Computing status: Performance of the Italian sites
- Resource usage @ T2:
  - largely benefits from strict internal coordination
  - monthly meetings (performance recording) + annual workshop
  - overall ~35% increase in total WCT from 2013 to 2014
  - large upgrade in 2 sites (ReCaS):
    - BARI (almost ready, in production at the beginning of June)
    - CATANIA (in production since April, ~1500 cores, 1 PB: Catania-VF)

12. ALICE Computing status: Performance of the Italian sites
(figure; plot labels: Bari, Torino, PD-LNL, Catania, Catania-VF, Pledge)

13. ALICE Computing status: Performance of the Italian sites
- Resource usage @ T2:
  - largely benefits from strict internal coordination
  - monthly meetings (performance recording) + annual workshop
  - overall ~35% increase in total WCT from 2013 to 2014
  - large upgrade in 2 sites (ReCaS):
    - BARI (almost ready, in production at the beginning of June)
    - CATANIA (in production since April, ~1500 cores, 1 PB: Catania-VF)
  - monitoring of T2 data from APEL (Andrea Guarise):
    - 2014 and 2015 pledge values in place
    - HS06 → SI2K conversion factor from sites (BDII) checked/updated
    - WCT cross-checked against the experiment (MonALISA) and local monitoring

14. ALICE Computing status: Performance of the Italian sites
(figure)
(*) In April Catania-VF was monitored only through MonALISA; on EGI from May

15. ALICE Computing status: Performance of the Italian sites
(figure; labels: Tier-2 coordination; conversion of the EGI data from hours to Wall_h_kSi2k)
(*) In April Catania-VF was monitored only through MonALISA; on EGI from May
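The conversion of EGI wall-clock hours into Wall_h_kSi2k mentioned on this slide can be sketched as below. This is only an illustration, not the script used by the Tier-2 coordination: the function name and the example benchmark value are hypothetical, and the factor of 4 HS06 per kSI2k is the commonly quoted WLCG convention, assumed here. The slides say only that the per-site conversion factors are taken from the BDII.

```python
# Assumed convention (not stated on the slides): 1 kSI2k = 4 HS06.
HS06_PER_KSI2K = 4.0


def wall_hours_to_ksi2k_hours(wall_hours: float, hs06_per_core: float) -> float:
    """Normalise wall-clock hours on one core to kSI2k hours."""
    return wall_hours * hs06_per_core / HS06_PER_KSI2K


# Hypothetical site: 1000 wall-clock hours on cores benchmarked at 10 HS06 each.
example = wall_hours_to_ksi2k_hours(1000.0, 10.0)
print(f"{example:.0f} Wall_h_kSi2k")
```

Normalising every site with its own published benchmark is what makes the accounting comparable across sites with different hardware.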

16. ALICE Computing status: Performance of the Italian sites
(figure; label: Tier-2 coordination)
Bari (-16%), Torino (-15%), PD-LNL (-7‰), Catania (+2%)
(*) In April Catania-VF was monitored only through MonALISA; on EGI from May

17. ALICE Computing status: R&D activity on the VAF
- In the framework of the STOA-LHC PRIN:
  - Torino VAF will be accounted for ALICE
  - similar (test) cloud infrastructures deployed in 2014:
    - Bari, Cagliari, Legnaro, Trieste (OpenStack)
    - Catania could join (new T2 infrastructure)
  - many activities ongoing, reported at CHEP'15 (April):
    - "Interoperating Cloud-based Virtual Farms for the ALICE experiment at the LHC" (Trieste)
    - "Monitoring of IaaS and scientific applications on the Cloud using Elasticsearch ecosystem" (Torino)
    - "Managing competing elastic grid and cloud computing applications using OpenNebula" (Torino)
    - "Local storage federation through XRootD architecture for interactive distributed analysis" (Bari)
  - white paper to be finalized by the end of 2015
  - further (connected) activities:
    - parallel computing (TS, BA), dashboard for the Italian sites (BA)

18. ALICE Computing status: Evolution of CM for Run2
- Targeting an integrated luminosity of 1 nb⁻¹ for PbPb:
  - by combination of Run1 and Run2 statistics
  - consistent with the ALICE approved programme
  - 4-fold increase in instantaneous luminosity for PbPb
- Detector upgrades:
  - complete TRD/PHOS, new DCAL
- Double event rate of TPC/TRD:
  - consolidation of TPC and TRD readout electronics
- Increased capacity of HLT and DAQ:
  - rate up to 8 GB/sec to T0 (for Heavy-Ion data taking)

19. ALICE Computing status: Evolution of CM for Run2
- ALICE Grid model largely unchanged in Run2:
  - integration of every new computing centre
  - average of 2 replicas of analysis objects:
    - dependency on resource stability, 1 copy for least popular data
  - low differentiation of tasks:
    - T0/T1 still raw data keepers/producers
    - all other tasks (MC + analysis) performed everywhere
  - tasks generally sent to data, but data can go to tasks if needed:
    - jobs go to data, in case of failure read from closest replica (<5%)
  - ALICE global data distribution by exclusive use of the xrootd protocol
  - analysis input mostly on AODs (limited use of ESDs)
  - push analyzers to organized trains (LEGO framework)

20. ALICE Computing status: Evolution of CM for Run2
- Main software and process improvements:
  - new version of the software framework (AliRoot 5.x):
    - effort to improve performance of ALICE reconstruction software
    - use TRD points in the fit (improve high-momentum resolution)
    - reduce memory requirements during calibration and reconstruction
  - use of HLT for online Raw data compression (factor 4):
    - already tested in Run1, implies reduction of tape storage @ Tier-0/1
  - use of HLT for calibration:
    - move first calibration iteration to online
  - use of HLT track seeds for offline reconstruction
  - improve performance of GEANT4 simulation for ALICE
  - further development of fast and parametrized simulation

21. ALICE Computing status: Evolution of CM for Run2
- Main software and process improvements
- Additional improvements:
  - start adapting ALICE distributed computing to the Cloud, using the HLT farm for offline processing:
    - corresponds to an additional 3% of CPU resources
  - improving performance of the organized analysis trains
  - speeding up and improving the efficiency of the analysis activity by active data management
  - explore contributed resources:
    - i.e. spare CPU cycles on supercomputers
    - collaborating with other experiments on this issue

22. ALICE Computing status: Evolution of CM for Run2
- Basic assumptions for Run2 resource estimate:
  - same CPU power needed for reconstruction
  - 25% larger raw event size:
    - additional detectors, detector coverage
    - higher track multiplicity with increased beam energy and pileup
  - MC productions: 100% pp, pPb + 30-40% PbPb events
(figure; plot annotations: T1/2 2016: +25%; T1/2 2016: +17%)

23. Summary
- Run2:
  - data volume in the period 2015-2018 expected to be ~3x Run1
  - focus of the Grid development will be on improving the analysis efficiency and decreasing the turnaround time of organized trains
  - several other software and process improvements
  - site performance and stability will continue to be a key factor for the success of the ALICE offline computing
  - planned resource increase expected to meet the demands, working on data popularity monitoring and replica limitation
- Run3:
  - TDR submitted to LHCC, final discussion in the first week of June

24. Financial requests: CPU/Storage situation in Italy
- In production at the Tier-1:
  - CPU: 22800 HS06 (pledge 2015)
  - DISK: 1920 TB (pledge 2014*)
- In production at the Tier-2s (+ Cagliari):

         Bari    Catania   Padova-LNL   Torino   Cagliari   Total
  HS06   8240    13147     8923         9954     1120       41384
  TB     360     1204      677          844      70         3155

  Available (including obsolete resources not yet decommissioned), May 2015
(*) At pledge 2015 (3380 TB) in October
2015 resources: ReCaS CT in production, the rest between June and September (next slide)
Pledge 2015: 38600 HS06 + 4381 TB

25. Financial requests: CPU/Storage situation at the Tier-2s
- 2015 allocation, ReCaS part:
  - Bari (in production in June):
    - replacements BA: 1568 HS06 + 360 TB
  - Catania (in production since the beginning of April):
    - replacements CT: 1075 HS06, replacements CA: 840 HS06
    - part of the total ALICE net growth: 1550 HS06 + 500 TB

26. Financial requests: CPU/Storage situation at the Tier-2s
- 2015 allocation, ReCaS part:
  - Bari (in production in June): 1568 HS06 + 360 TB
  - Catania (already in production): 3465 HS06 + 500 TB
- 2015 allocation, CSN3 part (224.5 k€):
  - CPU share (32.5 k€), transferred to PD-LNL and TO:
    - replacements PD-LNL: 1280 HS06, replacements TO: 1400 HS06
  - Storage share (192 k€):
    - completion of the total net growth + replacements at PD-LNL and TO
    - tender managed in Bari (specifications almost ready, to the GE in June)
    - expected: 6 x 180 TB (1 BA, 2 TO, 3 PD-LNL) = 1.08 PB
    - substantial savings with expansions only (BA and PD-LNL)
    - an additional ~15 k€ might be needed (2 servers for TO)

27. Financial requests: CPU/Storage situation at the Tier-2s
- Updated with 2015 resources:
  - CPU: 39216 HS06
    - in excess of the pledge: 616 HS06
  - DISK: 4423 TB
    - in excess of the pledge: 42 TB

         Bari    Catania   Padova-LNL   Torino   Cagliari   Total
  HS06   8240    13147     7736         8973     1120       39216
  TB     924     1204      1152         1123     20         4423

  Available at the end of 2015 (excluding all 2015 decommissions, plus 2015 resources)
Pledge 2015: 38600 HS06 + 4381 TB
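The totals and the excess over the 2015 pledge quoted on this slide can be re-derived from the per-site figures. A minimal sketch; the variable names are illustrative and the per-site split is as read from the slide's table:

```python
# Tier-2 resources expected at the end of 2015, per site (values from the slide).
hs06 = {"Bari": 8240, "Catania": 13147, "Padova-LNL": 7736, "Torino": 8973, "Cagliari": 1120}
disk_tb = {"Bari": 924, "Catania": 1204, "Padova-LNL": 1152, "Torino": 1123, "Cagliari": 20}

# 2015 pledge quoted on the slide.
pledge_hs06, pledge_tb = 38600, 4381

total_hs06 = sum(hs06.values())
total_tb = sum(disk_tb.values())
excess_hs06 = total_hs06 - pledge_hs06
excess_tb = total_tb - pledge_tb

print(f"CPU:  {total_hs06} HS06, excess over pledge {excess_hs06} HS06")
print(f"DISK: {total_tb} TB, excess over pledge {excess_tb} TB")
```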

28. Financial requests: Decommissioning 2015-17

  Year of decommissioning   Bari    Catania   LNL-Padova   Torino   Cagliari   Total
  HS06 2015                 1568    1075      1280         1400     840        6163
  TB 2015                   360     0         65           81       50         556
  HS06 2016                 1568    0         5496         1584     1120       9768
  TB 2016                   0       130       260          157      20         567
  HS06 2017                 0       0         0            0        0          0
  TB 2017                   0       114       0            117      0          231

29. Financial requests: Decommissioning 2015-17
(decommissioning table as on slide 28)
Annotation: second half of 2016 → 2017

30. Financial requests: Decommissioning 2015-17
(decommissioning table as on slide 28)
- Overall Tier-2 situation in 2016:
  - CPU: 39216 - 9768 = 29448 HS06
  - DISK: 4423 - 567 = 3856 TB
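The yearly totals of the decommissioning table and the resulting 2016 Tier-2 situation follow from simple sums. The sketch below re-derives them; the data layout is illustrative, with the values taken from the table on slides 28-30 (sites in the order Bari, Catania, LNL-Padova, Torino, Cagliari):

```python
# Resources to be decommissioned per (unit, year), one value per site.
decommissions = {
    ("HS06", 2015): [1568, 1075, 1280, 1400, 840],
    ("TB", 2015): [360, 0, 65, 81, 50],
    ("HS06", 2016): [1568, 0, 5496, 1584, 1120],
    ("TB", 2016): [0, 130, 260, 157, 20],
    ("HS06", 2017): [0, 0, 0, 0, 0],
    ("TB", 2017): [0, 114, 0, 117, 0],
}
totals = {key: sum(values) for key, values in decommissions.items()}

# Resources available at the end of 2015 (slide 27).
available_end_2015 = {"HS06": 39216, "TB": 4423}

# What is left in 2016 once the 2016 decommissions are subtracted.
remaining_2016 = {
    unit: available_end_2015[unit] - totals[(unit, 2016)]
    for unit in available_end_2015
}
print(totals)
print(remaining_2016)
```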

31. Financial requests: Outcome of the April 2015 RRB
- INFN share for 2016:
  - CPU, DISK for Tier-1 and Tier-2: 18.5% (19.3% in 2015)
  - TAPE for Tier-1: 35.2% (41.1% in 2015)

32. Financial requests: 2016 requests, Tier-1

                                                      CPU Tier-1 (HS06)   DISK Tier-1 (TBn)
  Pledged (T1) / Available - decommissions (T2)       22800               3382
  Scrutinized ALICE 2016                              29045               3885
  Delta                                               6245                503
  Estimated cost (k€)                                 87.4                120.7
  Total (k€)                                          208.1

Estimated unit costs, T2 (T1): 12 (14) € / HS06 and 220 (240) € / TBn
Tape @ T1: pledge share from 4192 TB (2015) to 5491 TB (2016)
Tier-1 decommissions: not included

33. Financial requests: 2016 requests, Tier-1 and Tier-2

                                                      CPU Tier-1   DISK Tier-1   CPU Tier-2   DISK Tier-2
                                                      (HS06)       (TBn)         (HS06)       (TBn)
  Pledged (T1) / Available - decommissions (T2)       22800        3382          29448        3856
  Scrutinized ALICE 2016                              29045        3885          43845        4829
  Delta                                               6245         503           14397        973
  Estimated cost (k€)                                 87.4         120.7         172.7        214.0
  Total (k€)                                          208.1                      386.7
  Overhead T2 (k€)                                                               48.1

Estimated unit costs, T2 (T1): 12 (14) € / HS06 and 220 (240) € / TBn
Tape @ T1: pledge share from 4192 TB (2015) to 5491 TB (2016)
Tier-1 decommissions: not included
Tier-2 overhead: 6% CPU + 5% DISK (network) + 7% total (additional servers)
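The cost figures in this table follow from the deltas and the quoted unit costs. The sketch below re-derives them; the names are illustrative, and the overhead formula (6% of the CPU cost, 5% of the disk cost, 7% of their sum) is an interpretation of the slide's footnote that happens to reproduce the quoted 48.1 k€. The results agree with the slide to within rounding.

```python
# Unit costs in EUR per HS06 (cpu) and per TBn (disk), as quoted on the slide.
UNIT_COST = {"T1": {"cpu": 14, "disk": 240}, "T2": {"cpu": 12, "disk": 220}}

# Starting point: T1 pledge; T2 availability after the 2016 decommissions.
baseline = {"T1": {"cpu": 22800, "disk": 3382}, "T2": {"cpu": 29448, "disk": 3856}}
scrutinized_2016 = {"T1": {"cpu": 29045, "disk": 3885}, "T2": {"cpu": 43845, "disk": 4829}}


def request_keur(tier: str) -> dict:
    """Cost in k€ of filling the gap between baseline and scrutinized needs."""
    return {
        res: (scrutinized_2016[tier][res] - baseline[tier][res]) * UNIT_COST[tier][res] / 1000
        for res in ("cpu", "disk")
    }


t1 = request_keur("T1")
t2 = request_keur("T2")
t1_total = t1["cpu"] + t1["disk"]
t2_total = t2["cpu"] + t2["disk"]

# Assumed reading of "6% CPU + 5% DISK (network) + 7% total (additional servers)".
overhead_t2 = 0.06 * t2["cpu"] + 0.05 * t2["disk"] + 0.07 * t2_total

print(f"T1: {t1_total:.1f} k€, T2: {t2_total:.1f} k€, T2 overhead: {overhead_t2:.1f} k€")
```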

34. Financial requests: 2016 requests, per Tier-2 site
Full request:

                            HS06     k€       TB     k€       Total k€
  Decommissions:
    Bari                    1568     18.8     0      0.0      18.8
    Catania                 0        0.0      130    28.6     28.6
    LNL-Padova              5496     66.0     260    57.2     123.2
    Torino                  1584     19.0     157    34.5     53.5
    Cagliari                1120     13.4     20     4.4      17.8
  Decommissions, total      9768     117.2    567    124.7    242.0
  Net growth                4629     55.5     406    89.2     144.8
  Decommissions + growth    14397    172.8    973    214.0    386.7

35. Financial requests: 2016 requests, per Tier-2 site
Full request: table as on slide 34 (total 386.7 k€)
With storage decommissions deferred:

                            HS06     k€       TB     k€       Total k€
  Decommissions:
    Bari                    1568     18.8     0      0.0      18.8
    Catania                 0        0.0      0      0.0      0.0
    LNL-Padova              5496     66.0     0      0.0      66.0
    Torino                  1584     19.0     0      0.0      19.0
    Cagliari                1120     13.4     20     4.4      17.8
  Decommissions, total      9768     117.2    20     4.4      121.6
  Net growth                4629     55.5     406    89.2     144.8
  Decommissions + growth    14397    172.8    426    93.6     266.4
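The two request scenarios can be re-derived from the per-site decommissions, the net growth and the Tier-2 unit costs quoted earlier (12 € / HS06, 220 € / TB). A minimal sketch with illustrative names; the deferral scenario is modelled as read from the second table, where only Cagliari's 20 TB of storage is still replaced in 2016. Because the slide sums rounded cells, the unrounded results differ from it by about 0.1 k€.

```python
EUR_PER_HS06 = 12   # Tier-2 CPU unit cost quoted on the slides
EUR_PER_TB = 220    # Tier-2 disk unit cost quoted on the slides

# (HS06, TB) to be decommissioned in 2016, per site.
decommissions = {
    "Bari": (1568, 0),
    "Catania": (0, 130),
    "LNL-Padova": (5496, 260),
    "Torino": (1584, 157),
    "Cagliari": (1120, 20),
}
net_growth = (4629, 406)  # (HS06, TB)


def cost_keur(hs06: int, tb: int) -> float:
    """Cost in k€ of buying the given CPU and disk capacity."""
    return (hs06 * EUR_PER_HS06 + tb * EUR_PER_TB) / 1000


def request_keur(decomm: dict) -> float:
    """Total request: replacements for all decommissions plus the net growth."""
    replacements = sum(cost_keur(hs06, tb) for hs06, tb in decomm.values())
    return replacements + cost_keur(*net_growth)


# Deferral scenario: disk replacements postponed everywhere except Cagliari.
deferred = {
    site: (hs06, tb if site == "Cagliari" else 0)
    for site, (hs06, tb) in decommissions.items()
}

full_request = request_keur(decommissions)
deferred_request = request_keur(deferred)
print(f"full: {full_request:.1f} k€, deferred: {deferred_request:.1f} k€")
```

Deferring the storage decommissions lowers the request by the cost of the postponed disk replacements, roughly 120 k€.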

36. BACKUP

37. ALICE CM, focus on Run2: Physics programme, upgrades
- Targeting an integrated luminosity of 1 nb⁻¹ for PbPb:
  - by combination of Run1 and Run2 statistics
  - consistent with the ALICE approved programme
  - 4-fold increase in instantaneous luminosity for PbPb

38. ALICE CM, focus on Run2: Infrastructure improvements
- Focus on SE stability:
  - major factor for successful analysis and high CPU efficiency
  - goal for all SEs: > 98% availability

39. ALICE CM, focus on Run2: Infrastructure improvements
- Focus on SE stability
- LHCone programme
- Network use will increase
- IPv6 adoption
- Refurbishment of SAM/SUM tests:
  - WLCG monitoring consolidation project, at an advanced stage
- Site tests will increasingly reflect the VO tests:
  - in the ALICE case provided by MonALISA

40. ALICE CM, focus on Run2: Infrastructure improvements
- Focus on SE stability
- LHCone programme:
  - brings substantial improvement in inter-site connectivity
  - allows for further dilution of boundaries between sites and tasks
  - Europe largely covered, focus on South America and Asia
- Network use will increase:
  - large data volumes, more to transfer between sites
  - remote access to storage

41. ALICE CM, focus on Run2: Infrastructure improvements
- Focus on SE stability
- LHCone programme
- Network use will increase
- IPv6 adoption:
  - IPv4 address depletion is already a fact for new sites
  - ALICE services are IPv6 ready
  - xrootd v.4 should be IPv6 ready (release at the end of May)
  - other services are being brought into compliance

42. ALICE Computing status: New virtual infrastructure at CT (Catania)
(figure)

