
BioVeL is funded by the European Commission 7th Framework Programme (FP7). It is part of its e-Infrastructures activity. Biodiversity Virtual e-Laboratory.


1 Biodiversity Virtual e-Laboratory (BioVeL)
BioVeL is funded by the European Commission 7th Framework Programme (FP7). It is part of its e-Infrastructures activity.
Under FP7, the e-Infrastructures activity is part of the Research Infrastructures programme, funded under the FP7 'Capacities' Specific Programme. It focuses on the further development and evolution of the high-capacity and high-performance communication network (GÉANT), distributed computing infrastructures (grids and clouds), supercomputer infrastructures, simulation software, scientific data infrastructures and e-Science services, as well as on the adoption of e-Infrastructures by user communities.
BioVeL is free and available via the internet: www.biovel.eu. Contact: Alex Hardisty, HardistyAR@cardiff.ac.uk

2 Biodiversity Virtual e-Laboratory
BioVeL is one of a number of projects across Europe contributing to LifeWatch, and it will make a key contribution towards implementation of these initiatives by enabling better sharing of skills and data, and faster production of outputs in biodiversity science.
LifeWatch is the research infrastructure programme, initiated in the European Strategy Forum on Research Infrastructures (ESFRI), to construct and operate facilities, including new virtual laboratories for biodiversity research. The ultimate objective of the LifeWatch project is to explore patterns and processes of biodiversity across temporal and spatial scales.
LifeWatch anticipates and encourages projects like BioVeL, which act in a local, bottom-up manner to contribute towards its construction. BioVeL will eventually hand over the results of its work (an e-Infrastructure of robust, discoverable tools for use in workflows, and for sharing workflows more widely in the community) to LifeWatch for continued operation.

3 BioVeL is a consortium of 15 partners from 9 countries
1. Cardiff University, UK (Coordinator)
2. Centro de Referência em Informação Ambiental, Brazil
3. Foundation for Research on Biodiversity, France
4. Fraunhofer-Gesellschaft, Institute IAIS, Germany
5. Free University of Berlin, Botanical Gardens and Botanical Museum, Germany
6. Hungarian Academy of Sciences, Institute of Ecology and Botany, Hungary
7. Max Planck Society, MPI for Marine Microbiology, Germany
8. National Institute of Nuclear Physics, Italy
9. National Research Council: Institute for Biomedical Technologies and Institute of Biomembrane and Bioenergetics, Italy
10. Netherlands Centre for Biodiversity (NCB Naturalis), The Netherlands
11. Stichting European Grid Initiative, The Netherlands
12. University of Amsterdam, Institute of Biodiversity and Ecosystem Dynamics, The Netherlands
13. University of Eastern Finland, Finland
14. University of Gothenburg, Sweden
15. University of Manchester, UK

4 Biodiversity Virtual e-Laboratory
Three-year project, started on 1 September 2011

5 BioVeL is a powerful data processing tool
- Make it simpler to import data from one's own libraries or from those of other researchers
- "Workflows" (series of data-analysis steps) that make it possible to process large amounts of data
- Build your own workflow by selecting and applying successive data-processing "services"
- Access workflow libraries and reuse existing workflows
- Reduce research time and the overhead of learning to use the tools
- Contribute to the LifeWatch and GEO BON projects
Figure: part of a workflow to study the ecological niche of the horseshoe crab

6 BioVeL and INFN
€118k of funding for INFN. Effort on two work packages:
- WP4 (Outreach, dissemination and support), 1.5 person-months (PM): outreach activities to the communities, international initiatives and other projects; dissemination of tools and results through participation in conferences and biodiversity-related events
- WP7 (Services access, operation and management), 14 PM: deploy, commission and operate services on behalf of the biodiversity science community; gain access (where necessary) to underlying e-infrastructures (e.g., NGIs/EGI) for the execution of computationally intensive services
The Italian National Institute of Nuclear Physics (INFN) brings Grid and other distributed computing expertise to the provision of services, notably but not exclusively for the phylogenetics service set. INFN also brings software engineering effort and expertise, and the capability to host services.

7 Services used
- Amazon's cloud platform for the core services
- Where a substantial amount of computing or storage is required, partners specialised in scientific computing are preferred
- INFN-Bari provides this kind of service at least for the period covered by the project. The BioVeL/LifeWatch community (and in particular its Italian component) is interested in continuing this relationship within LifeWatch after the end of BioVeL. Bari's interest is also tied to the possibility of exploiting the computing resources of PON ReCaS and the expertise associated with the PRISMA project
Other examples of resource providers:
- The Centro de Referência em Informação Ambiental, Brazil, has applied to the EGI Cloud Task Force to try out that infrastructure for hosting their software
- SARA, the Netherlands, is one of the "scientific cloud providers" approaching the project to offer support even though they are not BioVeL partners

8 Web service Logical Design
- >1000 jobs and >1 month of CPU; response time: a few days
- A few hundred to a thousand jobs; response time: from a few minutes to a few hours
- Single fast execution for real-time analysis, ~10 concurrent executions; response time: ~5-10 seconds

9 Web service Logical Design (same diagram as slide 8)
Two workflow engines are supported: Taverna and LONI

10 Workflow steps
- Upload the user's inputs
- Run MrBayes: an MPI application that could run for several hours
- Pass the output to the next services
- Check the convergence of the model
- Retrieve the output and parse the XML
- Calculate the consensus tree of the posterior distribution of the MrBayes output
- Graphical view of the tree

11 Web service Logical Design (same diagram as slide 8)
The web services populate the task queue

12 REST web service example
Insert Jobs:
http://localhost:8080/RestService/services/QueryJob/InsertJobs?NAME={blast}&arguments={http://webtest.ba.infn.it/vicario/FinalFusariumDB_2.nex ArgOne; http://webtest.ba.infn.it/vicario/FinalFusariumDB_1.nex ArgTwo;}
Select Jobs:
http://localhost:8080/RestService/services/QueryJob/SelectJobs?FLAG={20b3cbf8-6805-47b4-ad7c-7b40bc706741}
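The two REST calls on this slide can be assembled programmatically. The following is a minimal Python sketch, assuming the service accepts percent-encoded query strings and that each job argument is a file URL followed by a label and a semicolon, as the slide's example suggests. The function names `insert_jobs_url` and `select_jobs_url` are hypothetical helpers, not part of the service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL copied from the slide; a real deployment would use its own host.
BASE = "http://localhost:8080/RestService/services/QueryJob"


def insert_jobs_url(name, inputs):
    """Build an InsertJobs URL.

    `inputs` is a list of (file_url, label) pairs. The brace-wrapped,
    semicolon-terminated layout mirrors the slide's example; that the
    service expects exactly this layout is an assumption.
    """
    arguments = " ".join(f"{url} {label};" for url, label in inputs)
    query = urlencode({
        "NAME": "{" + name + "}",
        "arguments": "{" + arguments + "}",
    })
    return f"{BASE}/InsertJobs?{query}"


def select_jobs_url(flag):
    """Build a SelectJobs URL for the identifier passed as FLAG."""
    query = urlencode({"FLAG": "{" + flag + "}"})
    return f"{BASE}/SelectJobs?{query}"


if __name__ == "__main__":
    url = insert_jobs_url("blast", [
        ("http://webtest.ba.infn.it/vicario/FinalFusariumDB_2.nex", "ArgOne"),
        ("http://webtest.ba.infn.it/vicario/FinalFusariumDB_1.nex", "ArgTwo"),
    ])
    print(url)
    # Decoding the query string recovers the values shown on the slide.
    print(parse_qs(urlsplit(url).query))
    print(select_jobs_url("20b3cbf8-6805-47b4-ad7c-7b40bc706741"))
```

Encoding the query with `urlencode` keeps the embedded file URLs, spaces and semicolons from being misread as part of the outer URL.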

13 Test & Results
Performance and reliability stress tests already passed:
- 100,000 consecutive insertions with no memory leaks or other problems
- Up to 100 concurrent clients without problems
- 1000 tasks inserted in a single REST call
- ~1M tasks handled by the DB + backend
Extensive experience in porting bioinformatics applications to the EGI distributed computing infrastructure: Hmmer, MrBayes, Blast, PAML, MUSCLE, EMBOSS, Biopython, AmpliconNoise, ABCtool, Bowtie, BayeSSC, GeoKS, hyphy, raxmlHPC, phylocom, consensus_xml, Matlab, etc.
25 different services already provided to user communities

14 SOAP web service example
wsdlpull 'http://localhost:8080/INFN.Grid.SoapFrontEnd/SoapServiceMethodsPort?wsdl' InsertJobs admin admin test_loni 'MatLabRUN1 input_test 12; MatLabRUN2 input_test2 24' pasq.notra@ba.infn.it
wsdlpull 'http://localhost:8080/INFN.Grid.SoapFrontEnd/SoapServiceMethodsPort?wsdl' SelectJobs admin admin 20b3cbf8-6805-47b4-ad7c-7b40bc706741

15 Web service Logical Design (same diagram as slide 8)
JST:
- Evolution of a tool used in Bari since the FIRB LIBI project
- Uses the pilot-job technique and a central task queue
- Used for submission both to the grid and to local resources
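The pilot-job technique with a central task queue can be illustrated with a small Python sketch. This is not JST's code: threads stand in for pilots started on grid or local resources, an in-process `queue.Queue` stands in for the central queue, and the names `pilot` and `run_pilots` are hypothetical. The point is only that each pilot, once running, pulls tasks itself instead of having tasks pushed to it one submission at a time.

```python
import queue
import threading


def pilot(task_queue, results, pilot_id):
    """A pilot: once started on a resource, it keeps pulling tasks
    from the central queue until none are left."""
    while True:
        try:
            task_id, func, args = task_queue.get_nowait()
        except queue.Empty:
            return  # no work left: the pilot exits and frees its slot
        # Record which pilot ran the task together with its result.
        results[task_id] = (pilot_id, func(*args))
        task_queue.task_done()


def run_pilots(tasks, n_pilots=3):
    """Fill the central queue, start the pilots, and wait for them."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)
    results = {}
    pilots = [
        threading.Thread(target=pilot, args=(task_queue, results, i))
        for i in range(n_pilots)
    ]
    for p in pilots:
        p.start()
    for p in pilots:
        p.join()
    return results


if __name__ == "__main__":
    # Ten toy tasks (squaring a number) shared by three pilots.
    tasks = [(i, pow, (i, 2)) for i in range(10)]
    for task_id, (pilot_id, value) in sorted(run_pilots(tasks).items()):
        print(f"task {task_id} ran on pilot {pilot_id}: {value}")
```

Because the pilots pull work, the same queue can feed both grid and local resources, and a slow or failed resource simply takes fewer tasks.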

16 Web service Logical Design (same diagram as slide 8)
INPUT file management: WebDAV

17 Input files: problems
- Input files are often of the order of a gigabyte, so uploading them through a standard web service can be difficult
- Bioinformatics users usually have no experience with grid storage elements
- A user-friendly interface is needed to transfer large amounts of data and files from the user's PC to the grid storage elements
This service must:
- Have a client on all software platforms (Windows/MacOS/Linux)
- Provide several authentication systems, including at least "username/password"
- Provide very good performance even on high-latency networks
- Make it possible to reduce transfers between the computing services and the users' PCs (temporary files must already be available to the computing infrastructure without ...)
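WebDAV stores a file with an ordinary HTTP PUT, so a large input file can be streamed in chunks instead of being loaded into memory. The Python sketch below only builds such a request with the standard library. The endpoint URL, the credentials and the helper names `iter_chunks` and `build_put_request` are illustrative assumptions, and HTTP Basic authentication is used here only as one possible form of the "username/password" requirement.

```python
import base64
import urllib.request


def iter_chunks(fileobj, chunk_size=1024 * 1024):
    """Yield the file in fixed-size chunks so a multi-gigabyte input
    never has to fit in memory."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        yield chunk


def build_put_request(url, fileobj, size, username, password):
    """Build (but do not send) a streaming PUT request.

    `size` is the file length in bytes; sending Content-Length up
    front lets the body be streamed as-is.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode("ascii")
    request = urllib.request.Request(url, data=iter_chunks(fileobj), method="PUT")
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Content-Length", str(size))
    return request


if __name__ == "__main__":
    import io

    payload = io.BytesIO(b"#NEXUS\n")
    # Hypothetical endpoint; pass the request to urllib.request.urlopen
    # to perform the upload against a real WebDAV server.
    req = build_put_request("http://example.org/dav/input.nex",
                            payload, 7, "user", "secret")
    print(req.get_method(), req.full_url)
```

Credentials sent with Basic authentication are only encoded, not encrypted, so a real deployment would use HTTPS.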

18 Screenshots: WebDav DataManagement Service

19 Screenshots: WebDav DataManagement Service
- File access through a browser
- Easy file sharing with colleagues

20 People involved in the development
- Giacinto Donvito (INFN-ReCaS)
- Pasquale Notarangelo (INFN)
- Domenico Diacono (INFN)
- Saverio Vicario (CNR)
- Bachir Balech (CNR)

21 Conferences & Proceedings
- Workshops on e-Science Workflows in Budapest, 9-10 February 2012: http://indico.egi.eu/indico/conferenceDisplay.py?ovw=True&confId=656
- EGI Community Forum 2012: https://indico.egi.eu/indico/conferenceDisplay.py?confId=679 (proceedings: http://pos.sissa.it/cgi-bin/reader/conf.cgi?confid=162)
- BITS 2012: http://bits2012.dmi.unict.it/program.html
- NETTAB 2012: http://www.nettab.org/2012/progr.html

22 Conclusions & To-do
- We have a highly scalable and solid service that can be used to support the execution of applications over different computing infrastructures. This is a classic example of SaaS
- We also have a high-performance data transfer and sharing service
- It is quite easy to add new applications as users require them
- The technical solution has already been presented successfully at a few national and international conferences
- We have already used the same framework to support different applications coming from different communities. This highlights the generality of the solution and the possibility of exploiting synergies with other projects
- We are already providing CPU and storage facilities to the project via user-friendly "cloud" interfaces

