Tiziana Ferrari (INFN CNAF), Luciano Gaido (INFN TO)


1 INFN Grid Operations
Tiziana Ferrari (INFN CNAF), Luciano Gaido (INFN TO)

2 Outline
- Statistics:
  - Availability and reliability of the Italian region
  - LHC job submission via WMS
- EGEE III SA1 ongoing activities
- Grid core services: plan of upgrade
- Testbeds (status)
- 2009 requests:
  - Inventory equipment
  - Travel and consumables

3 EGEE Availability/Reliability: June 08 (1/3)
- 36 certified sites
- The availability/reliability of the Italian region has improved, but a lot of progress is still to be made
- 20-23 June: top-level BDII down (an electrical power outage in the T1 computing room affected the DNS server), which prevented the top-level BDII failover mechanism from working
  - Entire IT region affected
- Actions:
  - Configuration of secondary DNS servers for the cnaf.infn.it domain and related sub-domains (GARR): DONE
  - IT ROC: direct weekly monitoring of the SAM statistics of every site (from July 08)
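The transcript does not define the availability and reliability figures discussed on this slide. As a minimal sketch, assuming the definitions commonly used in EGEE/WLCG reporting (availability is time up over total known time; reliability additionally excludes scheduled downtime), the two metrics could be computed as follows. The function names and the downtime hours are hypothetical and only illustrate the arithmetic; they are not measured values for the Italian region.

```python
def availability(up_hours, total_hours, unknown_hours=0.0):
    """Fraction of the known time in which the service was up (assumed definition)."""
    return up_hours / (total_hours - unknown_hours)


def reliability(up_hours, total_hours, scheduled_down_hours=0.0, unknown_hours=0.0):
    """Like availability, but scheduled downtime is not counted against the site."""
    return up_hours / (total_hours - scheduled_down_hours - unknown_hours)


# Hypothetical example for a 30-day month (NOT data from the slides).
total = 30 * 24          # hours in the month
unscheduled_down = 72    # hypothetical unscheduled outage, in hours
scheduled_down = 24      # hypothetical scheduled maintenance, in hours
up = total - unscheduled_down - scheduled_down

print(f"availability = {availability(up, total):.1%}")
print(f"reliability  = {reliability(up, total, scheduled_down):.1%}")
```

Under these assumed definitions, reliability is never lower than availability, because scheduled downtime is removed from the denominator.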

4 EGEE Availability/Reliability: June 08 (2/3)
ITALY

5 EGEE Availability/Reliability: June 08 (3/3)
ITALY

6 Availability/Reliability: Jan-May 08
- SAM tests affected by a gLite middleware bug (Classic SE)
- An automatic mechanism for amending the statistics in case of middleware and SAM test problems is STILL MISSING
- Alarms automatically raised on SAM test failures need to be put into production
- Re-engineering of SAM tests in progress (Nagios, regionalization of probes)

7 LHC Job submission via WMS
- WMS Monitor: 01.cnaf.infn.it:8443/wmsmon/main/main.php
- The statistics shown here are collected for the WMS production servers at CNAF (RB statistics and a few additional WMS instances outside CNAF are not included)
- Submitted jobs (not including test activities):
  - ALICE: 0.38 Mjob (migration to WMS since mid June)
  - ATLAS: 0.2 Mjob
  - CMS: 3.3 Mjob
  - LHCb: 0.33 Mjob
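To put the per-experiment figures in proportion, the short sketch below sums the submitted jobs quoted on this slide and derives each VO's share of the total. Only the four Mjob values come from the slide; the variable names are illustrative.

```python
# Submitted jobs per VO in millions of jobs (Mjob), as quoted on the slide.
submitted_mjob = {"ALICE": 0.38, "ATLAS": 0.2, "CMS": 3.3, "LHCb": 0.33}

total_mjob = sum(submitted_mjob.values())

# Fraction of the total contributed by each VO.
shares = {vo: jobs / total_mjob for vo, jobs in submitted_mjob.items()}

# Print the VOs from the largest to the smallest contributor.
for vo, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{vo:<6} {submitted_mjob[vo]:>5.2f} Mjob  {share:6.1%}")
print(f"Total  {total_mjob:>5.2f} Mjob")
```

The total comes to about 4.21 Mjob, of which CMS accounts for roughly 78%.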

8 ALICE ATLAS CMS LHCb

9 Main EGEE III SA1 ongoing activities (1/2)
- Improvement of failover solutions: procurement of new fully redundant hardware and migration of the most critical core services (servers, network switches), at CNAF and at other sites hosting core services
- Improvement of monitoring and alarms (regionalization of SAM via Nagios, SMS alarms, …)
- WMS load balancing testing
- DNS
- Improvement of site availability/reliability
- CREAM pilot services: functional and scalability tests (PD, CNAF, Bari, Catania)

10 Main EGEE III SA1 ongoing activities (2/2)
- Restructuring of Grid oversight activities (monitoring shifts)
- Grid security
- Replacement of classicSE instances with StoRM, and StoRM support (currently installed at 9 Italian sites: ESA-ESRIN, INFN-BOLOGNA, INFN-CNAF-LHCB, INFN-FERRARA, INFN-GENOVA, INFN-PARMA, INFN-PISA, INFN-ROMA3, INFN-T1)
- Integration of new resources (PON)
- Accounting
  - DGAS: planning of development activities to adopt the RUS standards across EGEE domains
  - Storage accounting (SAGE, INFN CT): preliminary testing, porting, integration with

11 Grid core services: plan of upgrade
Major hardware upgrades in the coming month at CNAF:
- VOMS (two servers) + new VOMS replica of the CERN instance
- LFC: upgrade and Oracle backend
- 10 WMS/LB servers (dedicated to LHC VOs): 10 blades (T1 tender), installation expected by the end of the month
- Virtualization: UI, site BDII (one instance to be added for failover), myproxy server, …
- Use of the funds assigned for 2008 and supplemented in Apr 08 (20 kEuro in total) against an expenditure of 30 kEuro

12 Testbeds
- Development testbed: no major changes since last April for:
  - EGEE certification testbed (SA3)
  - INFN Grid certification testbed
- Pre-production testbed (mostly virtual machines, expected to reduce in size in the coming months)
- Experimental services:
  - WMS (a few instances, mainly for testing of SL4 WMS features, CMS)
  - CREAM:
    - Functional tests on existing hardware
    - Scalability tests:
      - PD: existing hardware funded in Sep 2007
      - CNAF: about 10 servers currently hosting WMS/LB production servers and waiting to be migrated to new fully redundant hardware (16 blades, installation next week, funding: 6 kEuro (CNAF funding for 2008) kEuro (referee meeting Apr 08) + CNAF structural funds (Grid Operations service and UF T1))
      - Bari/Catania: hardware available on site

13 Inventory equipment requests (INV): 94 kEuro + 100 kEuro (reserve, "tasca") (1/2)
OBJECTIVE 1: INFN Grid CORE SERVICES
- Bari: request for 1 WMS and 1 LB (replacement of obsolete hardware, ATLAS/CMS backup). INV: 8 kEuro
- Catania: request for 1 WMS and 1 LB (ALICE/ATLAS/LHCb backup), 1 multi-site HLR for accounting (for the sites of the southern Grid), replacement of obsolete hardware. INV: 12 kEuro
- Ferrara: top-level BDII (backup of the central service for the whole Grid): 3 kEuro
- Padova: 3 servers for service virtualization: VOMS (backup of the central server), HLR, WMS, LB (CMS and ALICE backup), top-level BDII (backup): 12 kEuro

14 Inventory equipment requests (INV): 94 kEuro + 100 kEuro (reserve, "tasca") (2/2)
OBJECTIVE 2: SITE UPGRADE / REPLACEMENT OF OBSOLETE HARDWARE
- Genova: 3 machines for SRM (with FC card), CE and BDII: 9 kEuro
- Lecce: 3 servers for CE, BDII and UI, SE with FC card: 9 kEuro
- Pisa: general-purpose UI: 2.5 kEuro
- Roma2: CE+BDII, SE (with 2 TB of disk) + switch (obsolete hardware): 7 kEuro
- Trieste: 16.5 kEuro
  - Priority I: StoRM server + 3 TB of disk (5.5 kEuro); switch to connect the new nodes in the farm (3 kEuro)
  - Priority II: 2 twin boxes for WNs (8 kEuro)
- CNAF: INV 115 kEuro
  - Upgrade of the SE of the INFN CNAF site and WMS tests with GPFS (high disk space consumption): 9 kEuro
  - The funds assigned for 2008 and supplemented in Apr 08 (20 kEuro in total) are insufficient for the upgrades needed to date (30 kEuro): 6 kEuro supplement
  - Reserve ("tasca") for the replacement of obsolete Grid site services (CE, SE, WN, …) at further INFN sites: 100 kEuro (assigned to CNAF)
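The per-site amounts on the two request slides can be cross-checked against the totals in the slide title (94 + 100, in thousands of euro). The sketch below is one reading of the figures, assuming that the 94 is the sum of the Objective 1 requests, the Objective 2 site requests, and the two CNAF items other than the "tasca"; the variable names are illustrative.

```python
import math

# All amounts are in thousands of euro, as quoted on the two request slides.

# Objective 1 (core services).
objective_1 = {"Bari": 8, "Catania": 12, "Ferrara": 3, "Padova": 12}

# Objective 2 (site upgrades), excluding CNAF.
objective_2 = {"Genova": 9, "Lecce": 9, "Pisa": 2.5, "Roma2": 7, "Trieste": 16.5}

# Breakdown of the Trieste request: StoRM server and disk, switch, twin boxes.
trieste_items = [5.5, 3, 8]

# CNAF: SE upgrade and WMS/GPFS tests (9) plus supplement to the 2008 funds (6).
cnaf_own = 9 + 6
tasca = 100

# Assumed composition of the INV total (not stated explicitly on the slides).
inv_total = sum(objective_1.values()) + sum(objective_2.values()) + cnaf_own

assert math.isclose(sum(trieste_items), objective_2["Trieste"])  # 16.5
assert math.isclose(cnaf_own + tasca, 115)                       # CNAF line
assert math.isclose(inv_total, 94)                               # slide title
print(f"INV requests: {inv_total:g} + tasca {tasca:g}")
```

Under this reading, all three checks pass, so the quoted subtotals are mutually consistent.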

15 Travel and consumables (1/2)
- Travel within Italy (INFN Grid workshops, coordination meetings at the Italian level): 1.5 kEuro for small sites, 3 kEuro for T2s, more at sites with SA1 personnel
- Travel abroad: only sites with SA1 activity, or T2s (participation of one person in the EGEE conference)
- Consumables: 4 kEuro for T2s, 2 kEuro for other sites
- Total requests:
  - Travel within Italy: 66.5 kEuro
  - Travel abroad: 97.5 kEuro
  - Consumables: 61 kEuro

16 Travel and consumables (2/2)
BA BO CA CT FE FI GE LE LNL MI NA PD PG PR PI
Site: BA BO CA CT CNAF FE FI GE LE LNL
People/FTE: 3/1.1 - 1/0.5 15/9.45 1/0.2 1/0.25
Travel within Italy: 5.0 1.5 2.0 3.0 14
Travel abroad: 46.5 6.0
Consumables: 4 2
Site: MI NA PD PG PR PI RM1 RM2 RM3 TO TS
People/FTE: 1/0.25 4/2.7 - 6/4.4
Travel within Italy: 1.5 6.0 3.0 9.0
Travel abroad: 8.0 22
Consumables: 4 2 3

