1
21/02/2008 — Tutorial analisi distribuita
Distributed analysis in ATLAS
Leonardo Carminati, Università degli Studi di Milano and INFN Milano
2
Distributed analysis: the user dream
[diagram: AOD → distributed-analysis black box → DPD]
Send jobs to the data, optimizing the resources.
3
What are we supposed to do with distributed analysis?
Here we are on slippery ground and the question still has poorly defined contours (and I in particular am not sure I have understood everything): this is my guess as of today.
- Selection and reduction of the interesting data: TAG-based selection plus thinning and slimming
- Analysis either directly on the AODs or on some intermediate DnPD created by a working group, to produce the final DPD, for example using some AnalysisSkeleton or some EventView-like code; the final DPD can still be AOD-like or a simple ROOT ntuple
- Analysis on the final DPD with ROOT or AthenaROOTAccess on the T3s
[diagram: AOD → D1PD → DnPD → PLOT]
4
Distributed analysis in ATLAS
Currently there are two 'software' tools that allow the user to access the computing resources for distributed analysis in a (more or less) transparent way:
GANGA: Gaudi Athena aNd Grid Alliance, a joint ATLAS/LHCb project developed in Python.
- A user-friendly application able to configure, submit and monitor jobs
- Initially developed to enable analysis on 'European' sites; it allows truly 'distributed analysis'
Pathena: a package integrated in Athena, used exactly like athena itself.
- Jobs are submitted to the Grid by replacing the `athena` command with `pathena`
- So far it has been fully exploited only at BNL; some European sites (Lyon) are starting to use it as a complement to GANGA
- For the moment it sends jobs to one site at a time
The situation is actually much more fluid, because cross-contamination between the two submission systems is ongoing.
5
Distributed analysis with GANGA: some comments
A fair assessment of the GANGA performance is complicated because it strongly depends on:
- Data distribution: AODs are not completely replicated to all T1s as they should be (incomplete datasets)
- Site configuration: jobs fail at some sites for several (local) reasons
If a dataset is COMPLETE somewhere, GANGA works perfectly: the jobs are sent correctly to sites which hold a complete replica of the dataset.
The problem of 'bad sites' (= sites on which my jobs fail) is clearly an issue for the users:
- Often the jobs fail because the site on which they are executed is not properly configured: this does not (directly) depend on GANGA
- Restricting the submission to a minimal list of GOOD sites on which I am sure my jobs will run (automatic procedure?) greatly reduces the failure rate
- Defining a 'black list' excludes non-optimal sites
If a dataset is incomplete, GANGA takes care of maximizing the output.
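The white-list/black-list idea above can be sketched in a few lines of plain Python (this is not the GANGA API, just an illustration of the filtering logic; the site names are taken from the slides):

```python
def select_sites(candidate_sites, good_sites=None, black_list=()):
    """Keep only sites we trust: drop anything on the black list,
    and, if an explicit white list is given, keep only those sites."""
    sites = [s for s in candidate_sites if s not in set(black_list)]
    if good_sites is not None:
        sites = [s for s in sites if s in set(good_sites)]
    return sites

candidates = ["MILANO", "ROMA1", "NAPOLI", "CNAF", "BADSITE"]
print(select_sites(candidates, black_list=["BADSITE"]))
```

In GANGA itself the same effect is obtained through the backend requirements, shown later in the tutorial.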
6
Distributed analysis with ganga: incomplete datasets
GANGA sends to each site a subjob running only on the files which are actually present at that site:
- Site A (incomplete dataset, files 5-15): subjob 1 runs on files 5-15
- Site B (incomplete dataset, files 15-30): subjob 2 runs on files 21-30
- Site C (incomplete dataset, files 15-20): subjob 3 runs on files 16-20
In this configuration you will get 25 output files: minimal user intervention and maximal result.
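One simple greedy strategy for this kind of splitting can be sketched as follows (this is an illustration, not the actual GANGA splitter, so the exact assignment differs in detail from the one on the slide):

```python
def split_by_site(site_files):
    """Greedily build one subjob per site out of the files
    that are not yet covered by an earlier subjob."""
    covered = set()
    subjobs = {}
    for site, files in site_files.items():
        todo = sorted(set(files) - covered)
        if todo:
            subjobs[site] = todo
            covered.update(todo)
    return subjobs

# Site contents from the slide (file ranges are inclusive).
sites = {
    "Site A": range(5, 16),   # files 5-15
    "Site B": range(15, 31),  # files 15-30
    "Site C": range(15, 21),  # files 15-20
}
print(split_by_site(sites))
```

Every available file ends up in exactly one subjob, which is the property the slide illustrates.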
7
Distributed analysis with ganga: current situation
For the moment it is not safe to let GANGA decide for you: it is better to select the site with the maximum number of files and send the jobs there. Here things start to get complicated (at least from a common user's point of view) and users tend to become nervous. A user has to perform some operations which are not always simple:
- Find out where the files are and select the site with the largest number of files
- Make sure that the selected site is a good one
- Send the jobs there
8
Distributed analysis with Pathena
Not much to comment here. Pathena is the closest thing to the user dream: you just choose the input dataset name and the output dataset name, and on average your jobs will succeed with a very high efficiency (no additional worries!):

pathena --inDS trig1_misal1_csc11.005310.PythiaH120gamgam.recon.AOD.v12000601_tid005860 --outDS user.LeonardoCarminati.trig1_misal1_csc11.005310.PythiaH120gamgam.recon.AOD --split 10 HggAnalysis_jobOptions.py

Many users love Pathena for this! OK, so where is the problem? I would still have some questions about Pathena:
- Pathena benefits from the fact that (almost) all AODs are copied to BNL: what happens if data are not collected at BNL?
- When I run Pathena it seems to me that the jobs go to BNL only: is that correct?
- Is this model scalable with the increase of distributed-analysis clients?
9
Distributed analysis: a closer look
What we will do: run an Athena-based analysis code (HiggsAnalysisUtils) on fully simulated H→γγ events, at the sites where these data are replicated, producing ntuples for the final plot.
Goal: produce an invariant-mass plot of the Higgs decaying into two photons.
Disclaimer: I am not a software developer. No room for elegance, only what is really needed (and what works...).
Why this example: because it is one I know well and which has been widely used, and it is easy to adapt it to different use cases.
I hope to convince you that distributed analysis works and that it is the only means we have in ATLAS to do analysis.
10
Step I: setting up the Athena environment
Create a cmthome directory and enter it:

mkdir cmthome
cd cmthome

Create a requirements file named simply 'requirements'. A basic requirements file at CERN for the following examples would be:

#---------------------------------------------------------------
set CMTSITE CERN
set SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA ${SITEROOT}/atlas/software/dist
macro ATLAS_TEST_AREA "${HOME}/testarea/13.0.40"
apply_tag oneTest
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
set PATHENA_GRID_SETUP_SH /afs/usatlas.bnl.gov/lcg/current/etc/profile.d/grid_env.sh
#---------------------------------------------------------------
11
Step I: setting up the Athena environment
Set up your CMT environment:

source /afs/cern.ch/sw/contrib/CMT/v1r20p20070720/mgr/setup.sh
cmt config

Create an Athena working area and set up the 13.0.40 Athena release:

mkdir -p $HOME/testarea/13.0.40
cd $HOME/testarea/13.0.40
source $HOME/cmthome/setup.sh -tag=13.0.40,gcc323

Now your Athena setup is ready for release 13.0.40.
12
An analysis package example: HggAnalysis
A common effort of the HG1 CSC working group: share the analysis code in a common CVS area. As soon as a new piece of code is found to be mature enough, it is included in the common analysis package /offline/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils, which collects several tools from several contributors:
- HggSelector: Hgg1JetSelector, Hgg2JetSelector, HggEtmissSelector, ...
- ConversionsFlagTool: look for different types of conversions
- HggFitter: photon direction reconstruction
- PrimaryVertexFinder: find the best primary vertex
- TrackIsolationTool: apply the track-isolation cut
- LRPhotonIdentification: likelihood-ratio-based photon identification
The output is basically a custom ntuple: in particular the HggUserData tree contains the 'best' outcome of the analysis.
13
Step II: setting up the analysis package
Check out the package from CVS and compile it:

cmt co -r HiggsAnalysisUtils-00-03-21 PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils
cd PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/cmt
cmt config
source setup.sh
gmake

After compilation, go to your run directory and copy some useful files there:

cd ../run
cp ../share/*.* .

Now your analysis package is ready and you can simply run it locally. We are going to use the same analysis package with both Pathena and GANGA.
14
Before starting: where is my dataset?
The tools to find a dataset were discussed yesterday. If we are only interested in the AODs: there is one complete copy at the T1 and one copy split (more or less) evenly among the Tier-2s belonging to a cloud.
At present the problem does not arise with pathena: the jobs run at BNL, and BNL is usually well supplied with data.
With ganga it should not matter either, but as already mentioned it is safer to restrict the analysis to reliable sites.
We will try to run on the Italian cloud: user feedback is essential to make the Italian cloud reliable.
Requesting that a copy of the data (a 'subscription') be made at a given site is a simple operation that we should get used to doing:
http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query?mode=reqsubs0
15
Setting up ganga
Log in to your lxplus account:

ssh lxplus.cern.ch

Setting up ganga is very easy; the latest version is 4.4.7:

source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh 4.4.7

Ganga saves the outputs in a directory named gangadir. As the outputs might be big, one can just create a gangadir under scratch0 and make a link to it:

ln -sf /afs/cern.ch/user/l/lcarmina/scratch0/gangadir gangadir

The best way of running Ganga is through a CLIP (Command Line Interface in Python) session (a graphical user interface exists, but is not covered here):

ganga

To quit Ganga simply type Ctrl-D.
16
Distributed analysis with ganga: locate a dataset

d = DQ2Dataset()
d.dataset = 'trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421'

Then the following command returns the list of sites which hold the dataset:

In [3]: d.list_locations()
Dataset trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421
Complete: BU_DDM AGLT2_SRM ASGCDISK_V2 BNLDISK PICDISK ROMA1 NDGFT1DISK NAPOLI CERNCAF WUP LYONDISK FZKDISK
Incomplete: LPNHE TOKYO NIKHEF RALDISK MWT2_UC UTA_SWT2 DESY-HH CNAFDISK MILANO WISC

while the following command returns the number of files at each site (WARNING: note that some sites listed as complete report 0 files):

In [4]: d.list_locations_num_files()
Out[4]: {'DESY-HH': 29, 'UTA_SWT2': 0, 'CNAFDISK': 40, 'AGLT2_SRM': 0, 'NAPOLI': 40, 'TOKYO': 0, 'NIKHEF': 2, 'ASGCDISK_V2': 40, 'RALDISK': 1, 'NDGFT1DISK': 0, 'BNLDISK': 0, 'PICDISK': 40, 'WISC': 0, 'WUP': 40, 'MWT2_UC': 0, 'MILANO': 40, 'ROMA1': 40, 'LYONDISK': 40, 'FZKDISK': 40, 'BU_DDM': 0, 'LPNHE': 0, 'CERNCAF': 40}
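The "select the site with the most files" step recommended earlier is easily scripted from the dictionary that list_locations_num_files() returns. A minimal sketch (plain Python, outside ganga; the file counts are the ones from the output above, and the trusted list is an assumption standing in for a user-chosen white list):

```python
# Subset of the list_locations_num_files() output shown on the slide.
num_files = {'DESY-HH': 29, 'CNAFDISK': 40, 'NAPOLI': 40, 'NIKHEF': 2,
             'RALDISK': 1, 'MILANO': 40, 'ROMA1': 40, 'LYONDISK': 40}

# Hypothetical user white list (Italian cloud sites).
trusted = ['CNAFDISK', 'MILANO', 'ROMA1', 'NAPOLI']

# Pick the trusted site holding the most files (0 if the site is unknown).
best = max(trusted, key=lambda s: num_files.get(s, 0))
print(best, num_files[best])
```

Note that Python's max() returns the first maximal element, so ties between equally good sites are broken by the order of the trusted list.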
17
Prepare a script to submit your job
The simplest and quickest way to use ganga is through Python scripts that define all the characteristics of the job.

1) Define your job:

j = Job()
j.name = 'Hgg120-7905'

2) Define the application and its attributes:

j.application = Athena()
j.application.prepare(athena_compile=False)
j.application.option_file = '$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/HggAnalysis_jobOption.py'

In this case we are using Athena; the application could also be, for example, j.application=Root() or j.application=Executable(). The 'prepare' attribute states whether we want the code to be compiled on the worker node; in this case our already-compiled code is sent to the worker node. The 'option_file' attribute specifies the jobOptions we need.
18
Prepare a script to submit your job
3) Define a splitter and the number of subjobs:

j.splitter = AthenaSplitterJob()
j.splitter.numsubjobs = 10

4) Define the input data and its attributes:

j.inputdata = DQ2Dataset()
j.inputdata.dataset = "trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421"
j.inputdata.match_ce_all = True
j.inputdata.min_num_files = 10

The input data can be a DQ2 dataset (DQ2Dataset) or local files (ATLASLocalDataset). match_ce_all=True allows the job to go to sites with an incomplete replica of the dataset of interest. min_num_files is usually set for safety: sometimes a dataset is registered at a site as incomplete although it effectively holds no files at all.
19
Prepare a script to submit your job (II)
5) Define the output data and its attributes:

j.outputdata = DQ2OutputDataset()
j.outputdata.outputdata = ['AnalysisSkeleton.aan.root']
j.outputdata.datasetname = 'Hgg120cloudtest-1-MILANO-rel13040'

outputdata is the name of the output ntuple. The output can be set to DQ2OutputDataset or ATLASOutputDataset: in the first case the output is registered in DQ2 (better for close-to-final results), while in the second it is saved locally (better for tests).
In the case of a DQ2OutputDataset, datasetname specifies the name of the output dataset; here it will be users.LeonardoCarminati.ganga.Hgg120cloudtest-1-MILANO-rel13040.
In the case of an ATLASOutputDataset, the output ntuples have to be retrieved with

jobs(jobid).outputdata.retrieve()
jobs(jobid).subjobs(n).outputdata.retrieve()

and will be saved in your gangadir/workspace/Local/'jobid'/'n'/output.
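The naming convention above is worth spelling out: ganga prepends users.&lt;grid name&gt;.ganga. to whatever you put in datasetname. A one-line sketch of the pattern, as shown on the slide (the helper function is purely illustrative, not part of ganga):

```python
def dq2_name(user, datasetname):
    """Registered DQ2 dataset name, following the slide's pattern:
    ganga prepends 'users.<user>.ganga.' to the datasetname attribute."""
    return f"users.{user}.ganga.{datasetname}"

print(dq2_name("LeonardoCarminati", "Hgg120cloudtest-1-MILANO-rel13040"))
# users.LeonardoCarminati.ganga.Hgg120cloudtest-1-MILANO-rel13040
```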
20
Prepare a script to submit your job (II)
6) Define a merger for the output ntuples:

j.merger = AthenaOutputMerger()

Provided that you have a proper ROOT environment set up, the output ntuples can then be merged into a single one (once the job is finished) just by typing

jobs(jobid).merger.merge()

The merged ntuple will be stored in gangadir/workspace/Local/'jobid'/output.

7) Add the required auxiliary files to the input sandbox:

j.inputsandbox = ['$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzps.dat',
                  '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzstr.dat',
                  '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzmid.dat',
                  '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/PT2dist.root',
                  '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/photonLikelihoodPdf.root',
                  '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/LVL1Config.xml',
                  ]
21
Prepare a script to submit your job (II)
8) Define where you want to send your jobs:

j.backend = LCG()
j.backend.requirements = AtlasLCGRequirements()
j.backend.requirements.sites = ['CNAF', 'LNF', 'MILANO', 'NAPOLI', 'ROMA1']

j.backend.requirements.sites=[...] allows you to restrict the submission to a user-defined list of sites (the safer procedure), while j.backend.requirements.excluded_sites=[...] allows you to ban sites. The list of available sites can be obtained from the ganga prompt:

d = AtlasLCGRequirements()
d.list_sites()

In principle you can also force the jobs to a site directly:

j.backend.CE = 'ce05-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-atlas'

The list of CEs can be obtained from the shell prompt:

lcg-infosites --vo atlas ce

9) Submit the job:

j.submit()
22
Prepare a script to submit your job (II)
Complete example: analysis jobs on the Italian cloud, registering the output in DQ2:

j = Job()
j.name = 'Hgg-120GeV'
j.application = Athena()
j.application.prepare(athena_compile=False)
j.application.option_file = '$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/HggAnalysis_jobOption.py'
j.splitter = AthenaSplitterJob()
j.splitter.numsubjobs = 10
j.inputdata = DQ2Dataset()
j.inputdata.dataset = "trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421"
j.inputdata.match_ce_all = True
j.outputdata = DQ2OutputDataset()
j.outputdata.outputdata = ['AnalysisSkeleton.aan.root']
j.outputdata.datasetname = 'Hgg120-tutorialRoma3-rel13040'
j.inputsandbox = ['$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzps.dat',
                  '$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzstr.dat',
                  ...]
j.backend = LCG()
j.backend.requirements = AtlasLCGRequirements()
j.backend.requirements.sites = ['ROMA1', 'NAPOLI', 'LNF', 'MILANO', 'CNAF']
j.submit()
23
Submit and monitor a job with ganga
To submit your job:

ganga
execfile('$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/gangaTest-tutorialRoma3.py')

Ganga provides tools to follow the life of our job. If all goes well the job passes from 'submitted' to 'running' and then 'completed':

In [5]: jobs
Out[5]: Statistics: 1 jobs
--------------
# id status name subjobs application backend backend.actualCE
# 1 running Hgg120 10 Athena LCG

More detailed information is also available:

In [5]: jobs("1.0")
24
Getting the output
If we selected DQ2OutputDataset, when the job reaches the 'completed' state the outputs are registered by ganga in DQ2 and are therefore easily accessible with the DQ2 tools.
In the case of ATLASOutputDataset, the ntuples have to be retrieved with the retrieve() command and possibly merged.
The job logs (stderr and stdout) are saved in the directory gangadir/workspace/Local/'jobid'/'subjob'/output.
You can also have a look at the logs from within ganga, simply with:

jobs(jobid).subjobs[n].peek('stdout', 'cat')
25
Documentation and user support
Official ganga web site: http://ganga.web.cern.ch/ganga
ATLAS wiki page, with tutorials and useful information: https://twiki.cern.ch/twiki/bin/view/Atlas/DistributedAnalysisUsingGanga
A hypernews forum, very rich in ideas and very responsive (please subscribe!): hn-atlas-GANGAUserDeveloper@cern.ch
User feedback is essential: the developers are generally very attentive to the users' needs, and the development of ganga is truly user-driven.
26
Setting up pathena
pathena is an Athena package: you need to check it out like any other package.
Move to your test area:

cd ${HOME}/testarea/13.0.40

Check out the package and compile it:

cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools
cd PhysicsAnalysis/DistributedAnalysis/PandaTools/*/cmt
source setup.sh
make
27
Submitting a job with pathena
Submitting a job with pathena is very simple.
Move to the run area of the HiggsAnalysisUtils package:

cd ${HOME}/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run

Submit the job from the shell prompt:

pathena --inDS --outDS --split 10 --extFile HggAnalysis_jobOption.py

NOTE: the output dataset must be of the form user.LeonardoCarminati.something.
NOTE: at present all jobs go to BNL, so there is no need to specify sites.
28
Monitoring your jobs
Start pathena_util:

pathena_util

Ask for the summary of your job (remember the jobID):

status(jobID)
======================================
JobID : 1
time : 2008-02-16 15:12:05
inDS : trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421
outDS : user.LeonardoCarminati.Hgg120-tutorialRoma3-rel13040
libDS : user.LeonardoCarminati.lxplus225_76.lib._000001
build : 7659827
run : 7659828-7659837
jobO : HggAnalysis_jobOption.py
site : ANALY_BNL_ATLAS_1
----------------------
buildJob : succeeded
----------------------
runAthena :
total : 10
succeeded : 0
failed : 0
running : 10
unknown : 0
----------------------
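If you want to script your own monitoring on top of this, the "key : value" lines of the summary are easy to parse. A minimal sketch in plain Python (not part of pathena; the abbreviated summary string below stands in for the real status(jobID) output):

```python
def parse_status(text):
    """Turn the 'key : value' lines of a pathena status summary
    into a dict, skipping separator lines."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info

# Abbreviated stand-in for the status(jobID) output shown above.
summary = """JobID : 1
site : ANALY_BNL_ATLAS_1
succeeded : 0
running : 10"""
print(parse_status(summary)['running'])  # prints 10
```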
29
Getting your output
By default the output dataset is registered in DQ2. The status of the job and the corresponding outputs can be checked on the Panda monitoring page.
30
Conclusions
Distributed analysis is an indispensable tool for doing analysis in ATLAS (I do not think there are alternatives).
The distributed-analysis tools available on the market are essentially GANGA and pathena. It is not clear what will happen: probably in the end there will be a single interface for the final user.
We would like to convince you that the tools, however fragile, work and allow you to do analysis. The larger the user community, the faster the tools evolve. In particular, please support the development of the T1 and of the Italian cloud.