Distributed analysis in ATLAS (distributed analysis tutorial, 21/02/2008). Leonardo Carminati, Università degli Studi and INFN, Sezione di Milano.

1 Distributed analysis in ATLAS
Leonardo Carminati, Università degli Studi and INFN, Sezione di Milano
Distributed analysis tutorial, 21/02/2008

2 Distributed analysis: the user dream
[Diagram: AOD -> distributed analysis black box -> DPD]
Send jobs to the data, optimizing the resources.

3 What are we supposed to do with distributed analysis?
This is slippery ground and the picture is still not well defined (and I in particular am not sure I have understood it): this is my guess as of today.
- Selection and reduction of the interesting data:
  - selection based on the TAG, plus thinning and slimming
- Analysis either directly on the AODs or on some intermediate DnPD created by a working group, to produce the final DPD:
  - for example using some AnalysisSkeleton or some code of the EventView-like family
  - the final DPD can still be AOD-like or a simple ROOT ntuple
- Analysis of the final DPD with ROOT or AthenaROOTAccess on the T3s
[Diagram: AOD -> D1PD -> DnPD -> PLOT ...]

4 Distributed analysis in ATLAS
At present there are two software tools that let the user access, transparently (??), the computing resources for distributed analysis:
- GANGA: Gaudi Athena aNd Grid Alliance, a joint ATLAS / LHCb project.
  - Developed in Python
  - "User friendly" application able to configure, submit and monitor jobs
  - Initially developed to allow analysis on 'European' sites
  - Allows truly 'distributed' analysis
- Pathena: a package integrated in Athena
  - Used in a way very similar to athena: athena -> pathena
  - Jobs are submitted to the GRID by replacing the command `athena` with `pathena`
  - It has been fully used only at BNL. Some European sites (LYON) are using it
  - Complementary to GANGA
  - For the moment it sends jobs to one site at a time.
- The situation is actually much more fluid, because the two submission systems are borrowing from each other

5 Distributed analysis with GANGA: some comments
- A fair report on ganga performance is complicated because it strongly depends on:
  - Data distribution: AODs are not completely replicated to all T1s as they should be (incomplete datasets)
  - Site configuration: jobs fail at some sites for several (local) reasons
- If a dataset is COMPLETE somewhere then ganga works perfectly: the jobs are sent correctly to sites which have a complete replica of the dataset.
- The problem of 'bad sites' (= sites on which my jobs fail) is clearly an issue for users: often the jobs fail because the site on which they run is not properly configured, which does not depend (directly) on ganga
  - Restricting access to a minimal list of GOOD sites on which I am sure my jobs will run (automatic procedure?) greatly reduces the failure rate
  - Define a 'black list' to exclude non-optimal sites
- If a dataset is incomplete then ganga takes care of maximizing the output

6 Distributed analysis with ganga: incomplete datasets
Send to each site a subjob running on the files which are really present at that site.
    Site A: incomplete dataset, files 5-15     Subjob 1: files 5-15
    Site B: incomplete dataset, files 15-30    Subjob 2: files 21-30
    Site C: incomplete dataset, files 15-20    Subjob 3: files 16-20
In this configuration you will get 25 output files: minimal user intervention and maximal result.
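The bookkeeping behind this example can be sketched in plain Python. This is only an illustration of the idea, not ganga's actual algorithm: the function name `assign_files` and the site priority order (A, then C, then B) are assumptions chosen here so that the result matches the subjobs on the slide.

```python
def assign_files(site_files, priority):
    """Give each file to the first site (in priority order) that holds it.

    Hypothetical helper: every file is processed exactly once, and each
    site only receives files that it really holds.
    """
    assigned = set()
    plan = {}
    for site in priority:
        todo = sorted(set(site_files[site]) - assigned)
        if todo:
            plan[site] = todo
            assigned.update(todo)
    return plan


# File numbers held by each site, as on the slide (ranges are inclusive there).
site_files = {
    "A": range(5, 16),   # files 5-15
    "B": range(15, 31),  # files 15-30
    "C": range(15, 21),  # files 15-20
}

plan = assign_files(site_files, priority=["A", "C", "B"])
for site, files in plan.items():
    print("Site %s: files %d-%d" % (site, files[0], files[-1]))
```

With a different priority order the overlapping files (15-20) would simply end up in a different subjob; the point is that no file is requested from a site that does not hold it.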

7 Distributed analysis with ganga: current situation
- For the moment it is not safe to let ganga decide for you: better to select the site with the largest number of files and send the jobs there.
- So here things start to get complicated (at least from a common user's point of view) and users tend to become nervous..
- A user has to carry out some operations which are not always simple:
  - Find out where the files are and select the site with the largest number of files
  - Be sure that the selected site is a good one
  - Send the jobs there
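The manual procedure above (find where the files are, keep only good sites, pick the one with most files) can be sketched as a small Python helper. The function name `best_site`, the site names and the file counts are all hypothetical; in practice the counts would come from the dataset catalogue.

```python
def best_site(num_files, good_sites):
    """Return the trusted site holding the most files, or None if none holds any."""
    candidates = {s: n for s, n in num_files.items() if s in good_sites and n > 0}
    if not candidates:
        return None
    # Largest file count first; ties are broken alphabetically for reproducibility.
    return min(candidates, key=lambda s: (-candidates[s], s))


# Illustrative numbers only: files per site and the user's list of good sites.
num_files = {"SITE_A": 40, "SITE_B": 29, "SITE_C": 0, "SITE_D": 40}
good_sites = {"SITE_B", "SITE_C", "SITE_D"}

print(best_site(num_files, good_sites))
```

Here SITE_A holds as many files as SITE_D but is not in the good-sites list, so it is never considered.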

8 Distributed analysis with Pathena
- Not much to comment here. Pathena is the closest thing to the user dream: you just choose the input dataset name and the output dataset name, and your jobs will on average succeed with very high efficiency (no additional worries!).
    pathena --inDS trig1_misal1_csc11.005310.PythiaH120gamgam.recon.AOD.v12000601_tid005860 --outDS user.LeonardoCarminati.trig1_misal1_csc11.005310.PythiaH120gamgam.recon.AOD --split 10 HggAnalysis_jobOptions.py
- Many users love Pathena for this!
- OK, so where is the problem? I would still have some questions about Pathena:
  - Pathena benefits from the fact that (almost) all AODs are copied to BNL: what happens if the data are not collected at BNL?
  - When I run Pathena it seems to me that the jobs go to BNL only: is that correct?
  - Is this model scalable as the number of distributed analysis clients increases?
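When many datasets have to be processed, the command line above can be assembled programmatically. The following is a minimal sketch that uses only the options shown on the slide (`--inDS`, `--outDS`, `--split`); the helper name `pathena_command` is an assumption, and the sketch only prints the command instead of running it.

```python
import shlex


def pathena_command(job_options, in_ds, out_ds, split):
    """Assemble a pathena command line as an argument list (hypothetical helper)."""
    return [
        "pathena",
        "--inDS", in_ds,
        "--outDS", out_ds,
        "--split", str(split),
        job_options,
    ]


cmd = pathena_command(
    job_options="HggAnalysis_jobOptions.py",
    in_ds="trig1_misal1_csc11.005310.PythiaH120gamgam.recon.AOD.v12000601_tid005860",
    out_ds="user.LeonardoCarminati.trig1_misal1_csc11.005310.PythiaH120gamgam.recon.AOD",
    split=10,
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list avoids quoting problems if it is later handed to `subprocess.run(cmd)` from a shell where pathena is set up.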

9 Distributed analysis: a closer look
What we will do: run an athena-based analysis code (HiggsAnalysisUtils) on fully simulated H->gam gam events at the sites where these data are replicated, producing ntuples for the final plot.
Goal:
- produce an invariant mass plot of the Higgs decaying into two photons
Disclaimer: I am not a software developer. No room for elegance, only what is really needed (and what works..)
Why this example: because it is an example I know well and that has been widely used. It is easy to adapt the proposed example to different use cases.
I hope to convince you that distributed analysis works and that it is the only means we have in ATLAS to do analysis.

10 Step I: setting up the Athena environment
- Create a cmthome directory and enter it:
    mkdir cmthome
    cd cmthome
- Create a requirements file named simply 'requirements'. A basic requirements file at CERN for the following examples would be:
    #---------------------------------------------------------------
    set CMTSITE CERN
    set SITEROOT /afs/cern.ch
    macro ATLAS_DIST_AREA ${SITEROOT}/atlas/software/dist
    macro ATLAS_TEST_AREA "${HOME}/testarea/13.0.40"
    apply_tag oneTest
    use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
    set PATHENA_GRID_SETUP_SH /afs/usatlas.bnl.gov/lcg/current/etc/profile.d/grid_env.sh
    #---------------------------------------------------------------

11 Step I: setting up the Athena environment
- Set up your CMT environment:
    source /afs/cern.ch/sw/contrib/CMT/v1r20p20070720/mgr/setup.sh
    cmt config
- Create an athena working area and set up the 13.0.40 Athena release:
    mkdir -p $HOME/testarea/13.0.40
    cd $HOME/testarea/13.0.40
    source $HOME/cmthome/setup.sh -tag=13.0.40,gcc323
- Now your athena setup is ready for release 13.0.40

12 An analysis package example: HggAnalysis
- Common effort from the HG1 CSC working group: share the analysis code in a common CVS area. As soon as a new piece of code is found to be mature enough, it is included in the common analysis
- /offline/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils: several tools from several contributors
  - HggSelector: Hgg1JetSelector, Hgg2JetSelector, HggEtmissSelector ...
  - ConversionsFlagTool: looks for different types of conversions
  - HggFitter: photon direction reconstruction
  - PrimaryVertexFinder: finds the best primary vertex
  - TrackIsolationTool: applies the track isolation cut
  - LRPhotonIdentification: likelihood-ratio-based photon identification
- The output is basically a custom ntuple: in particular the HggUserData tree contains the 'best' outcome of the analysis.

13 Step II: setting up the analysis package
- Check out the package from CVS and compile it:
    cmt co -r HiggsAnalysisUtils-00-03-21 PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils
    cd PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/cmt
    cmt config
    source setup.sh
    gmake
- After compilation go to your run directory and copy some useful files there:
    cd ../run
    cp ../share/*.* .
- Now your analysis package is ready: you can simply run it locally...
- We are going to use the same analysis package with both Pathena and ganga

14 Before starting: where's my dataset?
- The tools for finding a dataset were discussed yesterday
- But if we are only interested in the AODs: there is one complete copy at the T1 and one copy split (more or less) evenly among the Tier2s belonging to a cloud.
- At present the problem does not arise with pathena: the jobs run at BNL, and BNL is generally well supplied with data.
- With ganga it should not matter either but, as already mentioned:
  - It is safer to restrict the analysis to reliable sites
  - We will try to run on the Italian cloud: user feedback is essential for the Italian cloud to become reliable.
- Requesting that a copy of the data be made at a given site ('subscription') is a simple task that we will have to get used to doing:
    http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query?mode=reqsubs0

15 Setting up ganga
- Log in to your lxplus account:
    ssh lxplus.cern.ch
- Setting up ganga is very easy (the latest version is 4.4.7):
    source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh 4.4.7
- Ganga saves its outputs in a directory named gangadir. As the outputs might be big, one can create a gangadir under scratch0 and simply link to it:
    ln -sf /afs/cern.ch/user/l/lcarmina/scratch0/gangadir gangadir
- The best way of running Ganga is through a CLIP (Command Line Interface in Python) session (a Graphical User Interface exists, not covered here):
    ganga
- To quit Ganga simply type:
    ^D

16 Distributed analysis with ganga: locate a dataset
Define the dataset:
    d=DQ2Dataset()
    d.dataset='trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421'
then the following command will return the list of sites which hold the dataset:
    In [3]:d.list_locations()
    Dataset trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421
    Complete: BU_DDM AGLT2_SRM ASGCDISK_V2 BNLDISK PICDISK ROMA1 NDGFT1DISK NAPOLI CERNCAF WUP LYONDISK FZKDISK
    Incomplete: LPNHE TOKYO NIKHEF RALDISK MWT2_UC UTA_SWT2 DESY-HH CNAFDISK MILANO WISC
while the following command will return the number of files at each site:
    In [4]:d.list_locations_num_files()
    Out[4]: {'DESY-HH': 29, 'UTA_SWT2': 0, 'CNAFDISK': 40, 'AGLT2_SRM': 0, 'NAPOLI': 40, 'TOKYO': 0, 'NIKHEF': 2, 'ASGCDISK_V2': 40, 'RALDISK': 1, 'NDGFT1DISK': 0, 'BNLDISK': 0, 'PICDISK': 40, 'WISC': 0, 'WUP': 40, 'MWT2_UC': 0, 'MILANO': 40, 'ROMA1': 40, 'LYONDISK': 40, 'FZKDISK': 40, 'BU_DDM': 0, 'LPNHE': 0, 'CERNCAF': 40}
(Callout on the slide: WARNING)
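The two listings above do not always agree, so it can help to cross-check them before choosing a site. The sketch below runs outside ganga on the values copied from this slide; the helper names `suspicious_complete_sites` and `sites_with_at_least` are assumptions made for this illustration.

```python
# Values copied from the list_locations() and list_locations_num_files() output.
complete = ["BU_DDM", "AGLT2_SRM", "ASGCDISK_V2", "BNLDISK", "PICDISK", "ROMA1",
            "NDGFT1DISK", "NAPOLI", "CERNCAF", "WUP", "LYONDISK", "FZKDISK"]
num_files = {'DESY-HH': 29, 'UTA_SWT2': 0, 'CNAFDISK': 40, 'AGLT2_SRM': 0,
             'NAPOLI': 40, 'TOKYO': 0, 'NIKHEF': 2, 'ASGCDISK_V2': 40,
             'RALDISK': 1, 'NDGFT1DISK': 0, 'BNLDISK': 0, 'PICDISK': 40,
             'WISC': 0, 'WUP': 40, 'MWT2_UC': 0, 'MILANO': 40, 'ROMA1': 40,
             'LYONDISK': 40, 'FZKDISK': 40, 'BU_DDM': 0, 'LPNHE': 0,
             'CERNCAF': 40}


def suspicious_complete_sites(complete, num_files):
    """Sites listed as holding a complete replica but reporting zero files."""
    return sorted(s for s in complete if num_files.get(s, 0) == 0)


def sites_with_at_least(num_files, min_num_files):
    """Sites reporting at least min_num_files files."""
    return sorted(s for s, n in num_files.items() if n >= min_num_files)


print(suspicious_complete_sites(complete, num_files))
print(sites_with_at_least(num_files, 10))
```

On these numbers, four sites flagged as complete report no files at all, which is a good reason to check the file counts rather than trust the complete/incomplete flag alone.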

17 Prepare a script to submit your job
The simplest and quickest way to use ganga is to write Python scripts that define all the characteristics of the job:
1) Define your job
    j = Job()
    j.name='Hgg120-7905'
2) Define the application and its attributes
    j.application=Athena()
    j.application.prepare(athena_compile=False)
    j.application.option_file='$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/HggAnalysis_jobOption.py'
- In this case we are using Athena. The application can also be, for example, j.application=Root() or j.application=Executable()
- The 'prepare' method can be used to state whether we want the code to be compiled on the worker node (WN). In this case our already compiled code is sent to the WN
- In the 'option_file' attribute specify the jobOptions we need

18 Prepare a script to submit your job
3) Define a splitter and the number of subjobs
    j.splitter=AthenaSplitterJob()
    j.splitter.numsubjobs=10
4) Define input data and its attributes
    j.inputdata=DQ2Dataset()
    j.inputdata.dataset="trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421"
    j.inputdata.match_ce_all=True
    j.inputdata.min_num_files=10
- inputdata can be a dq2 dataset (DQ2Dataset) or local files (ATLASLocalDataset)
- match_ce_all=True allows the job to go to sites with an incomplete replica of the dataset of interest
- min_num_files is usually set for safety: sometimes a dataset is registered at a site as incomplete although the site effectively holds no files at all
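To get a feeling for what numsubjobs means, here is a minimal sketch of how a list of input files could be divided among subjobs. It is not the algorithm used by AthenaSplitterJob, just an even split written for illustration; `split_files` and the file names are hypothetical.

```python
def split_files(files, numsubjobs):
    """Split a list of files into at most numsubjobs chunks of near-equal size."""
    numsubjobs = min(numsubjobs, len(files))
    if numsubjobs <= 0:
        return []
    base, extra = divmod(len(files), numsubjobs)
    chunks, start = [], 0
    for i in range(numsubjobs):
        # The first `extra` chunks take one additional file each.
        size = base + (1 if i < extra else 0)
        chunks.append(files[start:start + size])
        start += size
    return chunks


# Hypothetical file names for a 40-file dataset.
files = ["AOD._%05d.pool.root" % i for i in range(1, 41)]
chunks = split_files(files, 10)
print([len(c) for c in chunks])
```

With 40 files and 10 subjobs each subjob reads 4 files; when the division is not exact the remainder is spread over the first chunks.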

19 Prepare a script to submit your job (II)
5) Define output data and its attributes
    j.outputdata=DQ2OutputDataset()
    j.outputdata.outputdata=['AnalysisSkeleton.aan.root']
    j.outputdata.datasetname='Hgg120cloudtest-1-MILANO-rel13040'
- outputdata is the name of the output ntuple
- The output can be set to DQ2OutputDataset or ATLASOutputDataset. In the first case the output is registered in dq2 (better for close-to-final results), while in the second it is saved locally (better for tests).
- In the case of DQ2OutputDataset, datasetname specifies the name of the output dataset. In this case it will be users.LeonardoCarminati.ganga.Hgg120cloudtest-1-MILANO-rel13040
- In the case of ATLASOutputDataset the output ntuple(s) have to be retrieved with
    jobs(jobid).outputdata.retrieve()
    jobs(jobid).subjobs(n).outputdata.retrieve()
  Outputs will be saved in your gangadir/workspace/Local/'jobid'/'n'/output

20 Prepare a script to submit your job (II)
6) Define a merger for the output ntuples
    j.merger=AthenaOutputMerger()
- Provided that you have a proper ROOT environment set up, the output ntuples can then be merged into a single one just by typing (once the job is finished)
    jobs(jobid).merger.merge()
  The merged ntuples will be stored in gangadir/workspace/Local/'jobid'/output
7) Add the required auxiliary files to the input sandbox
    j.inputsandbox=['$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzps.dat',
                    '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzstr.dat',
                    '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzmid.dat',
                    '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/PT2dist.root',
                    '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/photonLikelihoodPdf.root',
                    '$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/LVL1Config.xml',
                    ]
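Since all the auxiliary files live in the same run directory, the sandbox list can be built from the directory and the file names instead of repeating the full path six times. This is a small convenience sketch in plain Python; `sandbox_paths` is a hypothetical helper, and the resulting list would be assigned to j.inputsandbox inside the ganga script.

```python
import posixpath


def sandbox_paths(run_dir, filenames):
    """Prefix every auxiliary file name with the run directory."""
    return [posixpath.join(run_dir, name) for name in filenames]


run_dir = "$HOME/scratch0/13.0.30.2/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run"
aux_files = ["dzps.dat", "dzstr.dat", "dzmid.dat", "PT2dist.root",
             "photonLikelihoodPdf.root", "LVL1Config.xml"]

inputsandbox = sandbox_paths(run_dir, aux_files)
for path in inputsandbox:
    print(path)

# In the ganga script one would then write: j.inputsandbox = inputsandbox
```

Changing release or working area then only requires editing run_dir in one place.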

21 Prepare a script to submit your job (II)
8) Define where you want to send your jobs
    j.backend=LCG()
    j.backend.requirements=AtlasLCGRequirements()
    j.backend.requirements.sites=['CNAF','LNF','MILANO','NAPOLI','ROMA1']
- j.backend.requirements.sites=[...] allows you to restrict the submission to a user-defined list of sites (safer procedure)
- j.backend.requirements.excluded_sites=[...] allows you to ban sites
- The list of available sites can be obtained from the ganga prompt:
    d=AtlasLCGRequirements()
    d.list_sites()
- In principle you can also force the jobs to go to a specific site directly:
    j.backend.CE='ce05-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-atlas'
- The list of CEs can be obtained from the shell prompt:
    lcg-infosites --vo atlas ce
9) Submit the job
    j.submit()

22 Prepare a script to submit your job (II)
Complete script: run analysis jobs on the Italian cloud and register the output in dq2.
    j = Job()
    j.name='Hgg-120GeV'
    j.application=Athena()
    j.application.prepare(athena_compile=False)
    j.application.option_file='$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/HggAnalysis_jobOption.py'
    j.splitter=AthenaSplitterJob()
    j.splitter.numsubjobs=10
    j.inputdata=DQ2Dataset()
    j.inputdata.dataset="trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421"
    j.inputdata.match_ce_all=True
    j.outputdata=DQ2OutputDataset()
    j.outputdata.outputdata=['AnalysisSkeleton.aan.root']
    j.outputdata.datasetname='Hgg120-tutorialRoma3-rel13040'
    j.inputsandbox=['$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzps.dat',
                    '$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/dzstr.dat',
                    … ]
    j.backend=LCG()
    j.backend.requirements=AtlasLCGRequirements()
    j.backend.requirements.sites=['ROMA1','NAPOLI','LNF','MILANO','CNAF']
    j.submit()

23 Submit and monitor a job with ganga
- To submit your job, start ganga and execute the script:
    ganga
    execfile('$HOME/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run/gangaTest-tutorialRoma3.py')
- Ganga provides tools to follow the life of our job:
    In [5]:jobs
    Out[5]: Statistics: 1 jobs
    --------------
    # id   status    name     subjobs   application   backend   backend.actualCE
    # 1    running   Hgg120   10        Athena        LCG
- If everything goes well the job goes from 'submitted' to 'running' and then to 'completed'
- More detailed information is also available:
    In [5]:jobs("1.0")
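With ten or more subjobs it is convenient to reduce their states to a short summary. The sketch below works on a plain list of status strings and uses only the three states named on this slide; how the list is obtained from ganga is left out, and `summarize` is a hypothetical helper.

```python
from collections import Counter


def summarize(statuses):
    """Count subjobs per status and report whether all of them completed."""
    counts = Counter(statuses)
    all_done = bool(statuses) and counts["completed"] == len(statuses)
    return dict(counts), all_done


# Illustrative snapshot of ten subjobs.
statuses = ["completed"] * 7 + ["running"] * 2 + ["submitted"]
counts, all_done = summarize(statuses)
print(counts, all_done)
```

The output should only be retrieved or merged once `all_done` is true.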

24 Getting the output
- If we selected 'DQ2OutputDataset', when the job reaches the 'completed' state the outputs are registered in dq2 by ganga and are therefore easily accessible with the dq2 tools
- In the case of 'ATLASOutputDataset' the ntuples have to be retrieved with the retrieve() command and, if needed, merged.
- The job logs (stderr and stdout) are saved in the directory gangadir/workspace/Local/'jobid'/'subjob'/output
- The logs can also be inspected from within ganga, simply with:
    jobs(jobid).subjobs[n].peek('stdout', 'cat')

25 Documentation and users support
- Official ganga web site: http://ganga.web.cern.ch/ganga
- ATLAS wiki page, with tutorials and useful information: https://twiki.cern.ch/twiki/bin/view/Atlas/DistributedAnalysisUsingGanga
- Hypernews forum, full of useful hints and very responsive (please subscribe!): hn-atlas-GANGAUserDeveloper@cern.ch
- User feedback is essential: the developers are generally very attentive to users' needs, and the development of ganga is truly user-driven

26 Setting up pathena
- pathena is an Athena package: it has to be checked out like any other package
- Move to the testarea:
    cd ${HOME}/testarea/13.0.40
- Check out the package and compile it:
    cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools
    cd PhysicsAnalysis/DistributedAnalysis/PandaTools/cmt
    source setup.sh
    make

27 Submitting a job with pathena
Submitting a job with pathena is very simple:
- Move to the run area of the HiggsAnalysisUtils package:
    cd ${HOME}/testarea/13.0.40/PhysicsAnalysis/HiggsPhys/HiggsAnalysisUtils/run
- Submit the job from the shell prompt:
    pathena --inDS --outDS --split 10 --extFile HggAnalysis_jobOption.py
- NOTE: the output dataset name must be of the form user.LeonardoCarminati.something
- Note: at present all jobs go to BNL, so there is no need to specify sites
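The naming rule for the output dataset is easy to get wrong, so it can be checked before submitting. This is a minimal sketch of such a check in Python, based only on the user.Name.something pattern stated above; `valid_out_ds` is a hypothetical helper and the real system may enforce further rules.

```python
def valid_out_ds(out_ds, user):
    """Check that out_ds looks like 'user.<user>.<something>'."""
    prefix = "user.%s." % user
    # The name must start with the prefix and have a non-empty remainder.
    return out_ds.startswith(prefix) and len(out_ds) > len(prefix)


print(valid_out_ds("user.LeonardoCarminati.Hgg120-tutorialRoma3-rel13040",
                   "LeonardoCarminati"))
print(valid_out_ds("Hgg120-tutorialRoma3-rel13040", "LeonardoCarminati"))
```

The first name follows the pattern and is accepted; the second lacks the user prefix and is rejected.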

28 Monitoring your jobs
- Start pathena_util:
    pathena_util
- Ask for the summary of your job (remember the jobID):
    status(jobID)
    ======================================
    JobID : 1
    time : 2008-02-16 15:12:05
    inDS : trig1_misal1_mc12.006384.PythiaH120gamgam.recon.AOD.v13003002_tid016421
    outDS : user.LeonardoCarminati.Hgg120-tutorialRoma3-rel13040
    libDS : user.LeonardoCarminati.lxplus225_76.lib._000001
    build : 7659827
    run : 7659828-7659837
    jobO : HggAnalysis_jobOption.py
    site : ANALY_BNL_ATLAS_1
    ----------------------
    buildJob : succeeded
    ----------------------
    runAthena :
    total : 10
    succeeded : 0
    failed : 0
    running : 10
    unknown : 0
    ----------------------
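If the summary has to be checked repeatedly, the "key : value" lines can be turned into a dictionary. The sketch below parses a copy of part of the text shown above; the one-pair-per-line layout is an assumption (the transcript flattens the original output), and `parse_status` is a hypothetical helper, not part of pathena_util.

```python
def parse_status(text):
    """Turn 'key : value' lines into a dict, skipping separators and bare headers."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            info[key.strip()] = value.strip()
    return info


status_text = """\
JobID : 1
time : 2008-02-16 15:12:05
site : ANALY_BNL_ATLAS_1
----------------------
buildJob : succeeded
----------------------
runAthena :
total : 10
succeeded : 0
failed : 0
running : 10
unknown : 0
"""

info = parse_status(status_text)
finished = int(info["succeeded"]) == int(info["total"])
print(info["site"], info["running"], finished)
```

Splitting on " : " (with surrounding spaces) keeps the colons inside the timestamp intact.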

29 Getting your output
- By default the output dataset is registered in dq2
- The status of the job and its outputs can be checked on the panda web page

30 Conclusions
- Distributed analysis is an indispensable tool for doing analysis in ATLAS (I do not think there are alternatives)
- The distributed analysis tools currently available are essentially Ganga and pathena. It is not clear what will happen: probably in the end there will be a single interface for the end user
- We would like to convince you that the tools, however fragile, work and allow you to do analysis. The larger the user community, the faster the tools evolve.
- In particular, support the development of the Italian T1 and cloud

