Milano Tier2: Components and Monitoring. Luca Vaccarossa, Milano, 14 December 2007
User Interface (UI) The machine that provides the commands for submitting jobs to the Grid: voms-proxy-init / grid-proxy-init, edg-job-submit, edg-job-status, edg-job-get-output
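A typical session on the UI chains these commands: create a proxy, submit a JDL file, poll the status, fetch the output. The sketch below only builds the JDL text and the command lines and does not run anything. The VO name `atlas`, the file name `hello.jdl`, the `<jobId>` placeholder (standing for the identifier printed at submission) and the helper names are assumptions made for illustration.

```python
def make_jdl(executable="/bin/hostname"):
    """Return a minimal JDL description that runs a single executable."""
    return "\n".join([
        f'Executable = "{executable}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        'OutputSandbox = {"std.out", "std.err"};',
    ]) + "\n"


def session_commands(jdl_file="hello.jdl", job_id="<jobId>", vo="atlas"):
    """Return the command lines of a submit / status / get-output cycle.

    job_id is a placeholder for the identifier printed by edg-job-submit.
    """
    return [
        ["voms-proxy-init", "-voms", vo],   # create a VOMS proxy (VO assumed)
        ["edg-job-submit", jdl_file],       # submit the job described in the JDL
        ["edg-job-status", job_id],         # poll the job status
        ["edg-job-get-output", job_id],     # retrieve the output sandbox
    ]


if __name__ == "__main__":
    print(make_jdl(), end="")
    for cmd in session_commands():
        print(" ".join(cmd))
```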
User Interface (UI) atlfarm008.mi.infn.it atlfarm010.mi.infn.it grid008.mi.infn.it
Computing Element t2-ce-01.mi.infn.it Grid gateway PBS server (TORQUE) MAUI scheduler
Computing Element The batch system of the farm is Torque + Maui. The queues enabled for local users are: local (max CPU time 48 h, max walltime 72 h); short (short queue with reserved CPUs, max CPU time 40 min, max walltime 2 h)
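The two sets of limits can be used to decide where a job fits. A minimal sketch, assuming the limits quoted above are the only criteria; the function name `pick_queue` is hypothetical and the real scheduler may apply further policies.

```python
from datetime import timedelta

# (max CPU time, max walltime) per queue, as quoted on the slide.
# "short" is listed first so that small jobs prefer the reserved CPUs.
QUEUE_LIMITS = {
    "short": (timedelta(minutes=40), timedelta(hours=2)),
    "local": (timedelta(hours=48), timedelta(hours=72)),
}


def pick_queue(cpu_time, walltime):
    """Return the first queue whose limits cover the request, or None."""
    for name, (max_cpu, max_wall) in QUEUE_LIMITS.items():
        if cpu_time <= max_cpu and walltime <= max_wall:
            return name
    return None


if __name__ == "__main__":
    print(pick_queue(timedelta(minutes=30), timedelta(hours=1)))
    print(pick_queue(timedelta(hours=10), timedelta(hours=12)))
    print(pick_queue(timedelta(hours=60), timedelta(hours=70)))
```

The chosen name would then be passed to the batch system, for example with `qsub -q short`.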
Worker Nodes (WN) grid009.mi.infn.it grid012.mi.infn.it grid016.mi.infn.it grid017.mi.infn.it grid018.mi.infn.it grid019.mi.infn.it grid021.mi.infn.it grid022.mi.infn.it grid023.mi.infn.it grid024.mi.infn.it grid025.mi.infn.it grid026.mi.infn.it t2-wn-02.mi.infn.it t2-wn-03.mi.infn.it t2-wn-04.mi.infn.it t2-wn-05.mi.infn.it
Worker Nodes (WN) t2-wn-06.mi.infn.it t2-wn-07.mi.infn.it t2-wn-08.mi.infn.it t2-wn-09.mi.infn.it t2-wn-13.mi.infn.it t2-wn-14.mi.infn.it t2-wn-15.mi.infn.it t2-wn-16.mi.infn.it t2-wn-17.mi.infn.it t2-wn-18.mi.infn.it t2-wn-19.mi.infn.it t2-wn-21.mi.infn.it t2-wn-22.mi.infn.it t2-wn-23.mi.infn.it t2-wn-24.mi.infn.it
PBS commands
showq: show job status and some job info
showbf [-v]: check for immediately available CPUs and nodes
checkjob [-v] or qstat -f: check job status
canceljob: cancel a job, essentially sending a qdel to the pbs_server
showstart [-h]: show when a job is scheduled to start
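The output of `qstat -f` is easy to post-process in a script. A minimal sketch, assuming the usual Torque layout ("Job Id:" header, attributes indented as `key = value`, long values wrapped onto tab-indented lines). The sample text, including the job number, is invented for illustration.

```python
SAMPLE = (
    "Job Id: 1234.t2-ce-01.mi.infn.it\n"
    "    Job_Name = test.sh\n"
    "    job_state = R\n"
    "    queue = local\n"
    "    Variable_List = PBS_O_HOME=/home/lari,\n"
    "\tPBS_O_QUEUE=local\n"
    "\n"
    "Job Id: 1235.t2-ce-01.mi.infn.it\n"
    "    Job_Name = other.sh\n"
    "    job_state = Q\n"
    "    queue = short\n"
)


def parse_qstat_full(text):
    """Parse `qstat -f` style text into {job_id: {attribute: value}}."""
    jobs = {}
    current = None   # attribute dict of the job being read
    last_key = None  # last attribute seen, for wrapped values
    for line in text.splitlines():
        if line.startswith("Job Id:"):
            job_id = line.split(":", 1)[1].strip()
            current = jobs.setdefault(job_id, {})
            last_key = None
        elif current is None or not line.strip():
            continue
        elif line.startswith("\t") and last_key is not None:
            # Continuation of a long value wrapped onto the next line.
            current[last_key] += line.strip()
        elif " = " in line:
            key, value = line.strip().split(" = ", 1)
            current[key] = value
            last_key = key
    return jobs


if __name__ == "__main__":
    for job_id, attrs in parse_qstat_full(SAMPLE).items():
        print(job_id, attrs["job_state"], attrs["queue"])
```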
PBS commands pbsnodes -a | less Shows which WNs have no jobs. Report to
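Scanning the `pbsnodes -a` listing by eye can be replaced by a small filter. A minimal sketch, assuming the usual Torque layout (node name on an unindented line, indented `key = value` attributes, and a `jobs` attribute present only when the node is running something). The sample states, `np` values and job numbers are invented.

```python
SAMPLE = """\
t2-wn-02.mi.infn.it
     state = free
     np = 4
     jobs = 0/1234.t2-ce-01.mi.infn.it, 1/1235.t2-ce-01.mi.infn.it

t2-wn-03.mi.infn.it
     state = free
     np = 4

grid009.mi.infn.it
     state = down
     np = 2
"""


def parse_pbsnodes(text):
    """Parse `pbsnodes -a` style text into {node: {attribute: value}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None                      # blank line ends a node block
        elif not line[0].isspace():
            current = nodes.setdefault(line.strip(), {})
        elif current is not None and " = " in line:
            key, value = line.strip().split(" = ", 1)
            current[key] = value
    return nodes


def nodes_without_jobs(nodes):
    """Return {node: state} for every node that reports no jobs."""
    return {name: attrs.get("state", "unknown")
            for name, attrs in nodes.items() if "jobs" not in attrs}


if __name__ == "__main__":
    for name, state in nodes_without_jobs(parse_pbsnodes(SAMPLE)).items():
        print(name, state)
```

Keeping the state next to the node name separates nodes that are simply idle (`free`) from nodes that are empty because they are `down`.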
Priority and FairShare Priority: diagnose -p; FairShare (FS): diagnose -f
Who am I?
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Silvia resconi
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Tommaso Lari" lari
Who am I?
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Attilio Andreazza" andreazz
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Clara Troncon" troncon
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Leonardo Carminati" lcarmina
Who am I?
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Donatella Cavalli" cavalli
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Caterina Pizio" pizio
Who am I?
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Umberto De Sanctis" atlas012
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Simone Montesano" atlas020
Who am I?
"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Chiara Tamarindi" atlas033
"/C=IT/O=INFN/OU=Personal Certificate/L=Genova/CN=Fabrizio Parodi" parodi
"/C=IT/O=INFN/OU=Personal Certificate/L=Genova/CN=Bianca Osculati" osculati
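The entries above pair a certificate subject (DN) with a local account, in what looks like the grid-mapfile layout: a quoted DN followed by the account name. A minimal sketch of a lookup, assuming that layout; the function name is hypothetical, and lines that do not parse as exactly two fields are set aside rather than guessed at.

```python
import shlex


def parse_mapping(lines):
    """Parse '"<DN>" <account>' lines into ({dn: account}, skipped_lines)."""
    mapping = {}
    skipped = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            fields = shlex.split(line)   # honours the double quotes around the DN
        except ValueError:               # e.g. a missing closing quote
            skipped.append(line)
            continue
        if len(fields) != 2:
            skipped.append(line)
            continue
        dn, account = fields
        mapping[dn] = account
    return mapping, skipped


if __name__ == "__main__":
    entries = [
        '"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Tommaso Lari" lari',
        '"/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Umberto De Sanctis" atlas012',
    ]
    mapping, skipped = parse_mapping(entries)
    me = "/C=IT/O=INFN/OU=Personal Certificate/L=Milano/CN=Tommaso Lari"
    print(mapping.get(me), len(skipped))
```

The DN to look up is the subject of the user's own certificate, so the answer to "who am I?" is the account paired with it.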
GridView: Monitoring and Visualization Tool for LCG. Data Transfer, Job Status, Service Availability
SAM Tests Certificate in the browser. Automatic tests. SAM on demand? https://cic.gridops.org/index.php?section=rc&page=samadmin
Ganglia “Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.”
Ganglia It relies on a multicast-based listen/announce protocol to monitor state within clusters and uses a tree of point-to-point connections amongst representative cluster nodes to federate clusters and aggregate their state. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.
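Because the cluster state is represented as XML, it can be read with any XML parser. A minimal sketch, assuming the usual `GANGLIA_XML` / `CLUSTER` / `HOST` / `METRIC` nesting of a gmond dump (by default served on TCP port 8649). The sample document, cluster name, IP addresses and metric values are invented for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
  <CLUSTER NAME="example-cluster">
    <HOST NAME="t2-wn-02.mi.infn.it" IP="192.0.2.2">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="4" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
    <HOST NAME="t2-wn-03.mi.infn.it" IP="192.0.2.3">
      <METRIC NAME="load_one" VAL="1.95" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""


def metric_by_host(xml_text, metric_name):
    """Return {host name: value string} for one metric in a Ganglia XML dump."""
    root = ET.fromstring(xml_text)
    values = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == metric_name:
                values[host.get("NAME")] = metric.get("VAL")
    return values


if __name__ == "__main__":
    for host, load in metric_by_host(SAMPLE_XML, "load_one").items():
        print(host, load)
```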