Bioinformatics: BioPerl
Dr. Giuseppe Pigola (pigola@dmi.unict.it)

Useful Links

- http://www.bioperl.org
- On Windows, install with the Perl Package Manager tool: http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
- Other packages: http://biojava.org, http://biopython.org, http://www.biophp.org

BioPerl

- BioPerl is a collection of Perl modules that support the development of scripts for bioinformatics applications.
- Perl is an excellent language for text manipulation, which makes it very effective for bioinformatics work.
- BioPerl is object-oriented.

BioPerl Namespaces

- Bio::Seq: sequence object (DNA, RNA, protein)
- Bio::SeqIO: reading and writing sequences (in many formats)
- Bio::SeqFeature: features (gene, exon, promoter, etc.)
- Bio::Annotation: stores links to databases, literature, and comments
- Bio::AlignIO
- Bio::SimpleAlign
- Bio::DB
- Bio::SearchIO
- ...

Manipulating Sequences

Create a sequence object with given attributes, then query and transform it:

    use Bio::Seq;

    # Current BioPerl releases call the molecule-type attribute -alphabet;
    # older releases called it -moltype.
    $seq = Bio::Seq->new(-seq              => 'actgtggcgtcaact',
                         -desc             => 'Sample Bio::Seq object',
                         -display_id       => 'something',
                         -accession_number => 'accnum',
                         -alphabet         => 'dna');

    $seq->display_id();         # common name
    $seq->seq();                # the sequence as a string
    $seq->length();
    $seq->subseq(5, 10);        # returns a string
    $seq->accession_number();
    $seq->alphabet();           # moltype() in older releases
    $seq->primary_id();         # independent of the IDs used in the various DBs
    $seq->trunc(5, 10);         # subsequence (new object)
    $seq->revcom;               # reverse complement (new object)
    $seq->translate;            # translation of the sequence (new object)
    $seq->translate('*', 'X', 0);  # args: stop-codon symbol, unknown amino acid symbol, frame
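To make the semantics of subseq, revcom, and translate concrete, here is a minimal plain-Perl sketch that needs no BioPerl installation. The helper functions are illustrative stand-ins written for this example, not BioPerl code; the translation assumes the standard genetic code and a 0-based frame, and it works on plain strings rather than sequence objects.

```perl
use strict;
use warnings;

my $dna = 'actgtggcgtcaact';

# Like subseq(5,10): coordinates are 1-based and inclusive.
sub subseq {
    my ($s, $start, $end) = @_;
    return substr($s, $start - 1, $end - $start + 1);
}

# Like revcom: reverse the string, then complement each base.
sub revcom {
    my ($s) = @_;
    my $rc = reverse $s;
    $rc =~ tr/acgtACGT/tgcaTGCA/;
    return $rc;
}

# Like translate: read codons from the given frame (0, 1, or 2) and map
# them through the standard genetic code (bases ordered T, C, A, G).
sub translate {
    my ($s, $frame) = @_;
    my $code = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
    my %idx  = (T => 0, C => 1, A => 2, G => 3);
    my $rest = uc substr($s, $frame);
    my $protein = '';
    while ($rest =~ /(...)/g) {
        my @b = map { $idx{$_} } split //, $1;
        # Codons containing an unrecognized base become 'X'.
        $protein .= (grep { !defined $_ } @b)
            ? 'X'
            : substr($code, 16 * $b[0] + 4 * $b[1] + $b[2], 1);
    }
    return $protein;
}

print subseq($dna, 5, 10), "\n";   # tggcgt
print revcom($dna), "\n";          # agttgacgccacagt
print translate($dna, 0), "\n";    # TVAST
```

Trailing bases that do not fill a complete codon are ignored by this sketch.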

Simple Statistics

Statistics on a sequence:

    use Bio::Seq;
    use Bio::Tools::SeqStats;

    $seq = Bio::Seq->new(-seq              => 'actgtggcgtcaact',
                         -desc             => 'Sample Bio::Seq object',
                         -display_id       => 'something',
                         -accession_number => 'accnum',
                         -alphabet         => 'dna');

    $seq_stats   = Bio::Tools::SeqStats->new(-seq => $seq);
    $weight      = $seq_stats->get_mol_wt();      # lower and upper bound (array reference)
    $monomer_ref = $seq_stats->count_monomers();  # frequencies (hash reference)
    $codon_ref   = $seq_stats->count_codons();    # nucleic acid sequences only (hash reference)
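As a rough illustration of what the monomer and codon counts contain, the following plain-Perl sketch tallies them for the same sample sequence. The two functions are hypothetical helpers written for this example; they do not reproduce the SeqStats implementation and assume codons are read from the first base.

```perl
use strict;
use warnings;

# Count each residue; returns a hash reference (residue => occurrences).
sub count_monomers {
    my ($s) = @_;
    my %count;
    $count{$_}++ for split //, uc $s;
    return \%count;
}

# Count non-overlapping codons starting at the first base;
# a trailing partial codon is ignored.
sub count_codons {
    my ($s) = @_;
    my %count;
    $s = uc $s;
    $count{$1}++ while $s =~ /(...)/g;
    return \%count;
}

my $dna      = 'actgtggcgtcaact';
my $monomers = count_monomers($dna);
my $codons   = count_codons($dna);

print "$_: $monomers->{$_}\n" for sort keys %$monomers;
print "$_: $codons->{$_}\n"   for sort keys %$codons;
```

For the sample sequence this reports three A and four each of C, G, and T; the codon ACT occurs twice.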

Local BLAST

Search for similar sequences in the ecoli.nt database:

    use Bio::Seq;
    # In current releases the module lives in the Bio::Tools::Run namespace
    # (older releases: Bio::Tools::StandAloneBlast).
    use Bio::Tools::Run::StandAloneBlast;

    @params  = (program => 'blastn', database => 'ecoli.nt');
    $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
    $input   = Bio::Seq->new(-id => 'test query', -seq => 'ACTAAGTGGGGG');
    $blast_report = $factory->blastall($input);

Smith-Waterman or Blast2Seq

Must be installed (bioperl-ext):

    use Bio::Seq;
    use Bio::Tools::pSW;
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::AlignIO;

    $seq1 = Bio::Seq->new(-seq              => 'actgtggcgtcaact',
                          -desc             => 'Sample Bio::Seq object',
                          -display_id       => 'something',
                          -accession_number => 'accnum',
                          -alphabet         => 'dna');
    $seq2 = Bio::Seq->new(-seq              => 'actgtggcgtcaact',
                          -desc             => 'Sample Bio::Seq object',
                          -display_id       => 'something',
                          -accession_number => 'accnum',
                          -alphabet         => 'dna');

    # Smith-Waterman
    $factory1 = Bio::Tools::pSW->new(-matrix => 'blosum62.bla', -gap => 12, -ext => 2);
    $factory1->align_and_show($seq1, $seq2, STDOUT);       # align and print
    $aln = $factory1->pairwise_alignment($seq1, $seq2);    # align and return an object

    # Blast2Seq
    $factory2 = Bio::Tools::Run::StandAloneBlast->new(outfile => 'bl2seq.out');
    $bl2seq_report = $factory2->bl2seq($seq1, $seq2);

    # Use AlignIO to create a SimpleAlign object from the blast2seq report
    $str = Bio::AlignIO->new(-file => 'bl2seq.out', -format => 'bl2seq');
    $aln = $str->next_aln();
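For readers who want to see what a Smith-Waterman local alignment computes, here is a deliberately simplified plain-Perl sketch of the scoring recurrence. It is not what Bio::Tools::pSW does internally: it assumes a flat match/mismatch score instead of a BLOSUM62 matrix and a single linear gap penalty instead of separate gap-open and gap-extension costs, and it returns only the best score, not the alignment. The scoring values in the example call are arbitrary.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Best local alignment score between $s and $t (linear gap penalty).
sub sw_score {
    my ($s, $t, $match, $mismatch, $gap) = @_;
    my @a    = split //, $s;
    my @b    = split //, $t;
    my @prev = (0) x (scalar(@b) + 1);   # previous row of the score matrix
    my $best = 0;
    for my $i (1 .. scalar @a) {
        my @cur = (0);                   # first column is always 0
        for my $j (1 .. scalar @b) {
            my $sub = $a[$i - 1] eq $b[$j - 1] ? $match : $mismatch;
            # A cell never drops below 0: that is what makes the alignment local.
            $cur[$j] = max(0,
                           $prev[$j - 1] + $sub,   # align the two residues
                           $prev[$j]     + $gap,   # gap in $t
                           $cur[$j - 1]  + $gap);  # gap in $s
            $best = $cur[$j] if $cur[$j] > $best;
        }
        @prev = @cur;
    }
    return $best;
}

# Two identical 15-base sequences, match = 2, mismatch = -1, gap = -2.
print sw_score('actgtggcgtcaact', 'actgtggcgtcaact', 2, -1, -2), "\n";   # 30
```

Keeping only two rows of the matrix is enough for the score; recovering the alignment itself would require storing the full matrix and tracing back from the best cell.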

ClustalW and TCoffee

Must be installed (bioperl-ext):

    use Bio::Seq;
    use Bio::Tools::Run::Alignment::Clustalw;

    @params  = (ktuple => 2, matrix => 'BLOSUM');
    $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

    $ktuple = 3;
    $factory->ktuple($ktuple);         # change the parameter before execution

    $seq_array_ref = \@seq_array;      # @seq_array is an array of sequence objects
    $aln = $factory->align($seq_array_ref);

GenScan

Must be installed (bioperl-ext):

    use Bio::Seq;
    use Bio::Tools::Genscan;

    $genscan = Bio::Tools::Genscan->new(-file => 'result.genscan');

    # $gene is an instance of Bio::Tools::Prediction::Gene
    # $gene->exons() returns an array of Bio::Tools::Prediction::Exon objects
    while ($gene = $genscan->next_prediction()) {
        @exon_arr = $gene->exons();
    }
    $genscan->close();

Example: Reformatting a Sequence

Read a sequence in FASTA format from a file and rewrite it to another file in EMBL format.

Formats: Fasta, EMBL, GenBank, Swissprot, PIR, GCG, SCF, phd/phred, Ace, or raw (plain sequence).

    use Bio::SeqIO;

    $in  = Bio::SeqIO->new('-file' => "inputfilename",   '-format' => 'Fasta');
    $out = Bio::SeqIO->new('-file' => ">outputfilename", '-format' => 'EMBL');

    while ( my $seq = $in->next_seq() ) {
        $out->write_seq($seq);
    }
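The read-one-record, write-one-record loop above can be sketched without BioPerl for the simplest case. The plain-Perl example below parses FASTA text into records and prints them in a tab-separated layout. The read_fasta helper and the output layout are assumptions made for illustration; this is not how Bio::SeqIO is implemented, and it handles none of the other formats listed.

```perl
use strict;
use warnings;

# Parse FASTA text into a list of records: { id, desc, seq }.
sub read_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;                 # skip blank lines
        if ($line =~ /^>(\S+)\s*(.*)$/) {         # header: >id description
            push @records, { id => $1, desc => $2, seq => '' };
        } elsif (@records) {                      # sequence line of the current record
            $line =~ s/\s+//g;
            $records[-1]{seq} .= $line;
        }
    }
    return @records;
}

my $fasta = ">seq1 first example\nACTG\nTGGC\n>seq2\nGGGTTT\n";

# "Convert" each record to a tab-separated line: id, length, sequence.
for my $rec (read_fasta($fasta)) {
    print join("\t", $rec->{id}, length($rec->{seq}), $rec->{seq}), "\n";
}
```

Sequence lines that are wrapped across several lines are joined back into a single string per record.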

Example: Reformatting an Alignment

Read an alignment in FASTA format from a file and rewrite it to another file in PFAM format:

    use Bio::AlignIO;

    $in  = Bio::AlignIO->new(-file => "inputfilename",   -format => 'fasta');
    $out = Bio::AlignIO->new(-file => ">outputfilename", -format => 'pfam');

    while ( my $aln = $in->next_aln() ) {
        $out->write_aln($aln);
    }

Example: Accessing a DB (1)

Look up the sequence ROA1_HUMAN in the GenBank DB and print its accession number, description, and sequence (in FASTA format).

Formats: Fasta, EMBL, GenBank, Swissprot, PIR, GCG, SCF, phd/phred, Ace, or raw (plain sequence).

    #!/usr/bin/perl
    use strict;
    use Bio::DB::GenBank;
    use Bio::Seq;
    use Bio::SeqIO;

    my $database = Bio::DB::GenBank->new;
    my $seq = $database->get_Seq_by_id('ROA1_HUMAN');

    print "Seq: ", $seq->accession_number(), " -- ", $seq->desc(), "\n\n";

    my $out = Bio::SeqIO->newFh(-fh => \*STDOUT, -format => 'fasta');
    print $out $seq;

Example: Accessing a DB (2)

Look up the sequence ROA1_HUMAN in the GenBank DB and write it in FASTA format to the file roa1.fasta.txt, using the simplified Bio::Perl interface:

    #!/usr/bin/perl
    use Bio::Perl;

    $seq_object = get_sequence("genbank", "ROA1_HUMAN");
    write_sequence(">roa1.fasta.txt", 'fasta', $seq_object);

Example: Accessing a DB (3)

Look up the sequence AB077698 in the GenPept DB and print it to STDOUT:

    #!/usr/bin/perl -w
    use strict;
    use Bio::DB::GenPept;
    use Bio::DB::GenBank;
    use Bio::SeqIO;

    my $db  = Bio::DB::GenPept->new();
    my $out = Bio::SeqIO->new(-format => 'fasta');   # no -file given: writes to STDOUT
    my $acc = 'AB077698';
    my $seq = $db->get_Seq_by_acc($acc);

    if ( $seq ) {
        $out->write_seq($seq);
    } else {
        print STDERR "cannot find seq for acc $acc\n";
    }
    $out->close();

Example: Accessing a DB (4)

Search the NCBI Taxonomy DB (XML::Twig must be installed):

    #!/usr/bin/perl -w
    use Bio::DB::Taxonomy;

    my $db = Bio::DB::Taxonomy->new(-source => 'entrez');

    # Look up the same node by taxon ID or by name
    $node1 = $db->get_Taxonomy_Node(-taxonid => '9606');
    $node2 = $db->get_Taxonomy_Node(-name => 'Homo sapiens');

    $pnode    = $node1->get_Parent_Node();
    $parentid = $node1->parent_id;
    my @class = $node1->classification;
    $node1->name;
    $node1->scientific_name;
