Published by Guerrino Macori. Modified 9 years ago.
1
Guerrino Macori
National Reference Laboratory for Coagulase Positive Staphylococci including S. aureus, Torino
VI Workshop of the National Reference Laboratory (NRL) for Coagulase Positive Staphylococci including S. aureus, 12-13 December 2013
In silico analysis and relation between staphylococcal enterotoxins (SEs) and hypothetical proteins (HPs)
IZSTO Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d’Aosta
2
In silico analysis and relation between SEs and HPs
Summary
- Definition of bioinformatics
- What is done, units of information, scale overview
- Databases
- Some practices: reverse vaccinology; hypothetical proteins and SEs
- Conclusion
3
What is bioinformatics / computational biology?
A marriage between biology and informatics
VI Workshop NRL - CPS including S. aureus - Torino, 12-13 December 2013
4
What is done in bioinformatics?
- R&D on nucleotide and amino acid sequences, protein domains, protein structures and models
- Development of new algorithms for large data sets
- Development and implementation of tools that enable efficient access to, and management of, different types of information
5
“A prerequisite to understanding the complete biology of an organism is the determination of its entire genome sequence” (Fleischmann et al. 1995)
- Whole genome sequencing: the linear sequence of DNA base units (A, T, C, G)
- Human genome: 3.12 × 10^9 bp (2000-2001)
- Whole genomes → exponentially growing data → bioinformatics to organize and collect it
6
Post-genomic era
- Bioinformatics is used to analyze genomic data in a rational manner
- Is the sequence sufficient to understand the biological function of an organism?
7
What “units of information” do we deal with in bioinformatics?
DNA, RNA, protein
- Sequence, structure, evolution
- Pathways, interactions, mutations
Biological data used:
- DNA: genome
- RNA: transcriptome
- Protein: proteome
8
What “units of information” do we deal with in bioinformatics?
DNA, RNA, protein
- Simple sequence analysis
- Database searching
- Pairwise analysis
- Regulatory regions
- Gene finding
- Whole genome annotations
- Comparative genomics (species and strains; e.g. older methods such as PFGE)
DNA sequences:
>gi|8886401|gb|AF162269.1|
CCCACTCCTCCATCTCACAAACACTTCTCTATACCCAACAATCCCTTTTACAATCCCTGCTCATTTAGTCAAA
ATGGTCAAGATTGCTGCTATCATCCTCCTCATGGGCATTCTCGCCAATGCTGCCGCCATCCCTGTCATTTCA
ACACCCAAATTACAGAGCCAACCGGCGAGGGCGACCGTGGGGACGTGGCCGAC
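As an illustration of "simple sequence analysis" on a record like the one on this slide, the following is a minimal Python sketch that reads FASTA-formatted text and computes GC content. The function names `parse_fasta` and `gc_content` are hypothetical helpers introduced here, not tools named in the presentation, and the example record is only the first 24 bases of the slide's sequence.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are concatenated and upper-cased.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Truncated copy of the slide's record, for illustration only.
    example = ">gi|8886401|gb|AF162269.1|\nCCCACTCCTCCA\nTCTCACAAACAC\n"
    for header, seq in parse_fasta(example):
        print(header, len(seq), round(gc_content(seq), 3))
```

Real analyses would normally rely on an established library rather than a hand-written parser; the sketch only shows what the header line and the sequence lines of such a record carry.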
9
What “units of information” do we deal with in bioinformatics?
DNA, RNA, protein
- Splice variants
- Tissue-specific expression
- Structure
- Single-gene analysis
- Experimental data on thousands of genes simultaneously (DNA chips, microarrays, expression arrays)
10
What “units of information” do we deal with in bioinformatics?
DNA, RNA, protein
- Proteome of an organism
- 2D gels
- Mass spectrometry
- Structure: 2D/3D/4D
11
Protein analysis: scale overview
Organism                  Genome (Mb)   Genes
E. coli                   4.64          4300
S. cerevisiae             13.5          6000
Drosophila melanogaster   165           13600
Arabidopsis thaliana      119           25500
Homo sapiens              3300          30000-40000
S. aureus                 2.84          2700
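To make the scale comparison concrete, gene density (genes per megabase) can be derived from a few rows of this table. This is a small sketch using only figures shown on the slide; the helper name `genes_per_mb` is introduced here for illustration.

```python
# (genome size in Mb, approximate gene count), as listed on the slide
GENOMES = {
    "S. cerevisiae": (13.5, 6000),
    "Drosophila melanogaster": (165, 13600),
    "Arabidopsis thaliana": (119, 25500),
    "S. aureus": (2.84, 2700),
}


def genes_per_mb(size_mb, genes):
    """Average number of genes per megabase of genome."""
    return genes / size_mb


if __name__ == "__main__":
    # Print organisms from the most to the least gene-dense genome.
    ranked = sorted(GENOMES.items(), key=lambda kv: genes_per_mb(*kv[1]), reverse=True)
    for name, (size_mb, genes) in ranked:
        print(f"{name}: {genes_per_mb(size_mb, genes):.0f} genes/Mb")
```

With these figures the compact bacterial genome of S. aureus carries roughly ten times more genes per megabase than that of Drosophila melanogaster.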
13
Protein analysis: scale overview and databases
Transcription and translation, folding
14
ORF Finder
- Nucleotide sequence → translation in any reading frame → ORF (Open Reading Frame) discovery
- ORF: a protein sequence of the right length for an average protein (> 70-100 aa)
- Genomes are scanned by software for hypothetical proteins (HPs): possible but not verified
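The idea behind ORF discovery can be sketched in a few lines of Python: translate the sequence in all six reading frames and keep stretches that start with a methionine, end at a stop codon and reach a minimum length. This is a simplified illustration under stated assumptions (standard genetic code, ATG as the only start codon, A/C/G/T only), not the algorithm used by NCBI ORF Finder; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_aa=70):
    """Return (strand, frame, protein) for ORFs of at least min_aa residues."""
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            # Drop the last piece: it is not terminated by a stop codon.
            for segment in protein.split("*")[:-1]:
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_aa:
                    orfs.append((strand, frame, segment[start:]))
    return orfs
```

The default `min_aa=70` mirrors the lower bound of the length threshold quoted on the slide; every protein returned this way is still only a hypothetical protein until it is verified experimentally.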
15
Protein analysis: scale overview and databases
- Highlight the similarities and differences of functionally important sites
- Derive a structural alignment
- Detect evolutionary relationships that cannot be perceived from the sequence alone
HPs and functional SE domains
16
Protein analysis: databases
Database     URL                    Content
GenBank      www.ncbi.nlm.nih.gov   nucleotide sequences
Ensembl      www.ensembl.org        human/mouse genome (and others)
PubMed       www.ncbi.nlm.nih.gov   literature references
NR           www.ncbi.nlm.nih.gov   protein sequences
SWISS-PROT   www.expasy.ch          protein sequences
InterPro     www.ebi.ac.uk          protein domains
OMIM         www.ncbi.nlm.nih.gov   genetic diseases
Enzymes      www.chem.qmul.ac.uk    enzymes
PDB          www.rcsb.org/pdb       protein structures
KEGG         www.genome.ad.jp       metabolic pathways
17
NCBI databases
18
Protein sequence databases
- Contain less data than nucleotide sequence databases
- Protein sequences rarely come from direct sequencing
- Most are obtained by translating nucleotide sequences
www.expasy.org
19
Practice: reverse vaccinology
In silico analysis and relation between staphylococcal enterotoxins and hypothetical toxins: a prediction study for Staphylococcus aureus
20
Practice: reverse vaccinology
First genomic approach to the development of a vaccine: reverse vaccinology applied to Neisseria meningitidis
- Start from the whole genome sequence
- Computer prediction
- In silico vaccine candidates
- Express recombinant proteins
- Immunogenicity testing in animal models
- Vaccine development
Vaccine: 1-2 years
21
Practice: reverse vaccinology
First genomic approach to the development of a vaccine: reverse vaccinology applied to Neisseria meningitidis
- ORF prediction on the partial genomic sequence (ORF Finder)
- Homology searches for all the predicted ORFs (PSI-BLAST, FASTA)
  - Hits found (function assigned)
  - No hits found (hypothetical proteins)
- Localization prediction (PSORT, SignalP, TMPRED)
- SELECTED: secreted, outer membrane, inner membrane, periplasmic, lipoproteins; homology to bacterial surface-associated proteins
- DISCARDED: enzymes and cytoplasmic localization; already known Neisseria antigens
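The selection step of this workflow can be read as a simple filter over annotated ORFs. The sketch below is one possible reading of the slide, assuming that surface-associated or secreted proteins are selected while cytoplasmic proteins and already known antigens are discarded. The `OrfAnnotation` fields and the `triage` function are hypothetical names, and the localization labels stand in for the output of predictors such as PSORT, SignalP or TMPRED.

```python
from dataclasses import dataclass
from typing import Optional

# Predicted localizations treated as candidate-worthy (assumption from the slide).
SURFACE_OR_SECRETED = {
    "secreted", "outer membrane", "inner membrane", "periplasmic", "lipoprotein",
}


@dataclass
class OrfAnnotation:
    name: str
    localization: str                        # label from a localization predictor
    homolog_function: Optional[str] = None   # None means no hit: hypothetical protein
    known_antigen: bool = False


def triage(orf):
    """Return 'selected' or 'discarded' for one annotated ORF."""
    if orf.known_antigen:
        return "discarded"          # already characterized, nothing new to test
    if orf.localization in SURFACE_OR_SECRETED:
        return "selected"           # potentially exposed to the immune system
    return "discarded"              # e.g. cytoplasmic enzymes


if __name__ == "__main__":
    candidates = [
        OrfAnnotation("orf1", "outer membrane"),
        OrfAnnotation("orf2", "cytoplasmic", homolog_function="enzyme"),
        OrfAnnotation("orf3", "secreted", known_antigen=True),
    ]
    for orf in candidates:
        print(orf.name, triage(orf))
```

Note that a hypothetical protein (no homology hit) is not discarded for that reason alone; in this reading only its predicted localization decides.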
22
Practice: in silico analysis and relation between staphylococcal enterotoxins and hypothetical toxins: a prediction study for Staphylococcus aureus
Background
Staphylococcus aureus carries a large repertoire of virulence factors, including over 40 secreted proteins and enzymes that it uses to establish and maintain infections:
- toxic shock syndrome toxin (TSST)
- Panton-Valentine leukocidin (PVL)
- the exfoliative toxins A and B (ETA and ETB)
- the family of staphylococcal enterotoxins, such as A and B (SEA and SEB), involved in food poisoning
S. aureus may produce 21 different SEs, excluding variants
23
Background
S. aureus may produce 21 different SEs
24
Background: “hypothetical proteins”
- A hypothetical protein is a protein that is predicted to be expressed from an Open Reading Frame, but for which there is no experimental evidence of translation
- Hypothetical proteins make up a substantial fraction of proteomes
- There is so far no classification for proteins that are predicted from nucleic acid sequences and have not been shown to exist by experimental protein-chemical evidence
Similarity between 13 well-known deposited S. aureus SEs and 50 HPs was assessed through the following databases:
SEA - SEB - SEC - SED - SEG - SEI - SEH - SEK - SEL - SEM - SEN - SEO - SEQ
1. ExPASy ProtParam: computation of various physical and chemical parameters for a given protein sequence - http://web.expasy.org/protparam/
2. NCBI Conserved Domains: search for conserved domains within a coding nucleotide sequence - http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi
3. Protein Data Bank (PDB): the PDB archive contains information about experimentally determined structures of proteins and allows the most similar known structures to be visualized and aligned - http://www.rcsb.org/pdb/home/home.do
25
- 47/50 HPs have at least one conserved domain
- The instability index (I.I.), which provides an estimate of the stability of a protein in a test tube, classified 32 proteins as stable
- Within the stable HPs, 6 show conserved-domain homologies with SEs:
  - Staphylococcal/Streptococcal toxin, oligonucleotide-binding (OB)-fold domain
  - Staphylococcal/Streptococcal toxin, β-grasp domain
- 6 HPs have unknown function and belong to a family of uncharacterized S. aureus proteins: 4 sequences match well-known proteins with a high E-value (the E-value relates the score of an alignment between a user-supplied sequence and a database sequence to the number of such matches expected by chance)
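For orientation on how an E-value relates to an alignment score, BLAST-style statistics estimate the number of chance hits from the normalized bit score S' as E = m × n × 2^(−S'), where m is the query length and n the database length. The sketch below applies that formula directly; it ignores the effective-length corrections used by real search tools, and the function name `expected_hits` and the example numbers are assumptions for illustration, not values from the study.

```python
def expected_hits(bit_score, query_len, db_len):
    """Approximate E-value: chance alignments expected with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical 250-residue query against a database of 10 million residues.
    for bits in (20, 40, 80):
        print(bits, expected_hits(bits, 250, 10_000_000))
```

Because the score enters as a negative exponent, each additional bit halves the expected number of chance matches, so smaller E-values correspond to more significant alignments.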
26
Background
Experimentally determined structures of the 4 sequences matched with a high E-value (NCBI accession and protein code are shown):
- gi446958339 (1Q1L)
- gi446958341 (1TS2)
- gi501167136 (1I4G)
- gi446958340 (1TS5)
The in silico analysis of functionally important domains and protein families shows that 6 of the 50 HPs are related to the SE family. This could provide a useful way to identify many hypothetical proteins in databases and to predict their possible involvement in the mechanisms of foodborne illness.
27
Two examples: biosequence alignment and algorithmic solutions
But we must always remember that:
- The methods used (for example algorithms and modeling) find the “best” alignment efficiently, but do not guarantee that the result is biologically true
- We must check whether the biological sense matches the function
The gene sequence of a protein is less conserved in the course of evolution than its secondary, tertiary and quaternary structure. This has two effects:
- Homologous proteins can have very different sequences and therefore produce alignments with a low similarity score
- If the similarity between two protein sequences is high (statistically significant), it is quite reasonable to assume that there is a relationship of functional homology between them
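As a concrete instance of an algorithm that finds the "best" alignment, here is a minimal sketch of Needleman-Wunsch global alignment by dynamic programming. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration; real protein alignments use substitution matrices and more elaborate gap penalties, and the presentation does not say which method was used.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: best of substitution, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))
```

The result is optimal only with respect to the chosen scoring scheme, which is exactly the caveat on the slide: a mathematically best alignment is not automatically a biologically true one.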
28
Thank you for your attention