Data and bioinformatic challenges: a case study
Guerrino Macori, Angelo Romano
National Reference Laboratory (NRL) for Coagulase Positive Staphylococci, including S. aureus
S.C. Controllo Alimenti e Igiene delle Produzioni, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta
Main characterization methods
Biological characterization: identifies and compares the symptoms that an infectious agent produces in indicator biological systems.
Serological characterization: makes use of antibodies (immunodiffusion, ELISA and other methods based on the same principle).
Molecular characterization: molecular identification by PCR and PFGE.
VIII Workshop NRL – CPS including S. aureus - Torino, November 2015. IZSTO, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta
Staphylococcus aureus Protein A typing (spa typing)
Sequencing of a polymorphic VNTR segment in the coding region of the S. aureus-specific staphylococcal protein A.
The region contains many repeated sequences that vary from strain to strain but are highly conserved within the species.
The repeated sequences are called "repeats" (usually about 24 bp long).
Staphylococcus aureus Protein A typing (spa typing)
A "single-locus" typing technique that offers subtyping resolution comparable to more expensive and/or laborious techniques such as MLST and PFGE.
Online based: 4 spa types submitted in 2014.
The protocol is not part of the quality system! Drafting and distribution (2016?)
Multi Locus Sequence Typing (MLST): more levels of typing
Unlike spa typing, MLST is a "multi-locus" typing technique and offers deeper subtyping resolution.
Additional genes sequenced:
arc (Carbamate kinase)
aro (Shikimate dehydrogenase)
glp (Glycerol kinase)
gmk (Guanylate kinase)
pta (Phosphate acetyltransferase)
tpi (Triosephosphate isomerase)
yqi (Acetyl coenzyme A acetyltransferase)
2015 activity: expansion planned.
RS-PCR "Genotypes" (Graber et al.)
Typing of the Pvl-phage
Complex clonal analysis
Other polymorphisms
Molecular combination/biotyping and virulence factors
Ribotyping
MALDI-TOF
Whole genome sequencing
"A prerequisite to understanding the complete biology of an organism is the determination of its entire genome sequence" (Fleischmann et al.)
The linear sequence of the entire genome: A, T, C, G.
Is the sequence sufficient to understand the biological function of the organism?
ProNaCC project: Production of "Natural Contaminated" Cheeses
NRL for Coagulase Positive Staphylococci including S. aureus
ProNaCC project: Production of "Natural Contaminated" Cheeses
Batch 1 (strain SED) -> VIDAS +
Batch 2 (strain SEE) -> VIDAS +
Batch 3 (strain SEC) -> VIDAS +
Batch 4 (strain SEA) -> VIDAS +
Batch 5 (strain SEG and SEI) -> VIDAS -
Batch 6 (strain SEH) -> VIDAS -
Batch 7 (strain SEA/SED/SEJ/SER) -> VIDAS +
Batch 8 (mix of batches) -> VIDAS +
Panel of analyses for typing these strains
S. aureus Whole Genome Sequencing
Whole Genome Sequencing of 3 S. aureus strains
6.16 gigabytes of data - 3 FASTQ files per read
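To give a feel for what the FASTQ files mentioned above contain, here is a minimal sketch that reads four-line FASTQ records and counts reads and bases. It assumes plain, uncompressed, unwrapped FASTQ; the function names and the toy records are illustrative only and are not taken from the sequencing run described in these slides.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a trailing blank line)
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], sequence, quality


def summarize(handle: TextIO) -> Tuple[int, int]:
    """Return (number of reads, total number of bases)."""
    n_reads = 0
    n_bases = 0
    for _read_id, sequence, _quality in read_fastq(handle):
        n_reads += 1
        n_bases += len(sequence)
    return n_reads, n_bases


# Toy input with two made-up reads (4 and 5 bases long).
toy = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCCA\n+\nIIIII\n")
print(summarize(toy))  # (2, 9)
```

For real data the same generator can be fed an open file handle instead of the in-memory toy stream.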
Data analysis: commercial software
DATA ANALYSIS: online resources, "web-based software"
Secondary analysis of sequencing data: "Illumina BaseSpace"
Secondary analysis of sequencing data: "Galaxy"
DATA ANALYSIS: online resources, "web-based software"
Dedicated genomic analysis pipelines
Very active community of users and developers
Web-based
Research centres
Development of dedicated genomic analysis tools
Resource available precompiled
Source code can be downloaded and implemented locally
SpeciesFinder 1.2
SpeciesFinder predicts the prokaryotic species based solely on the 16S rRNA gene. The concept of using the 16S rRNA gene for taxonomic purposes goes back to 1977.
KmerFinder 2.0
WGS-based prokaryotic species identification against a data set of complete genomes; it examines all regions of the genomes, not only core genes. An alternative and faster approach is to look at k-mers (substrings of k nucleotides in DNA sequence data) and use the number of co-occurring k-mers in two bacterial genomes as a measure of evolutionary relatedness.
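The k-mer idea can be sketched in a few lines: split each sequence into its set of k-mers and count how many are shared. This is only an illustration of the principle, not the KmerFinder implementation; the sequences, the value of k and the function names are made up for the example.

```python
from typing import Dict, Set


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in the sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmer_count(a: str, b: str, k: int) -> int:
    """Number of distinct k-mers that occur in both sequences."""
    return len(kmers(a, k) & kmers(b, k))


def closest_reference(query: str, references: Dict[str, str], k: int) -> str:
    """Name of the reference sharing the most k-mers with the query."""
    return max(references, key=lambda name: shared_kmer_count(query, references[name], k))


# Toy sequences (hypothetical, far shorter than a real genome).
query = "ATGACCGTA"
references = {"ref_a": "ATGACCGTT", "ref_b": "TTTTGGGCC"}

print(shared_kmer_count(query, references["ref_a"], k=4))  # 5
print(shared_kmer_count(query, references["ref_b"], k=4))  # 0
print(closest_reference(query, references, k=4))           # ref_a
```

On real genomes a much larger k would be used so that shared k-mers are unlikely to arise by chance.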
PathogenFinder 1.1
Web server for predicting bacterial pathogenicity by analysing the input proteome, genome, or raw reads provided by the user. The method relies on groups of proteins created without regard to their annotated function or known involvement in pathogenicity.
ResFinder 2.1
Short sequence reads can be assembled into draft genomes by the server. It is also possible to input a complete or partial preassembled genome. ResFinder gives the option to run the input against one or several antimicrobial classes simultaneously, and it uses BLAST to identify the acquired resistance genes. It is possible to search for genes at a specified similarity threshold, from 80% to 100% identity, and the best-matching genes are given as output.
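The thresholding step described above can be illustrated with a small sketch: keep only hits at or above a chosen percent identity and report the best hit per gene. The `Hit` type, the gene names and the identity values are hypothetical placeholders; this is not ResFinder code and it does not run BLAST.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    gene: str        # name of the matched database gene (placeholder names below)
    identity: float  # percent identity of the alignment, 0 to 100


def best_hits(hits: Iterable[Hit], min_identity: float = 80.0) -> Dict[str, Hit]:
    """Keep the best-matching hit per gene among hits passing the identity threshold."""
    if not 0.0 <= min_identity <= 100.0:
        raise ValueError("min_identity must be between 0 and 100")
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.identity < min_identity:
            continue  # below the user-selected similarity threshold
        current = best.get(hit.gene)
        if current is None or hit.identity > current.identity:
            best[hit.gene] = hit
    return best


# Made-up hits for illustration only.
hits = [Hit("gene_A", 99.2), Hit("gene_A", 91.0), Hit("gene_B", 78.5), Hit("gene_C", 85.0)]

print(sorted(best_hits(hits, min_identity=80.0)))  # ['gene_A', 'gene_C']
print(sorted(best_hits(hits, min_identity=90.0)))  # ['gene_A']
```

Raising the threshold towards 100% makes the search stricter, at the cost of missing divergent variants of a gene.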
spaTyper 1.0
Web server offering a fully automated workflow, from import of raw sequencer trace files to assignment of repeat codes and spa types. The spa typing technique uses the sequence of a polymorphic VNTR in the 3' coding region of the S. aureus-specific staphylococcal protein A (spa). Each new base composition of the polymorphic repeat found in a strain is assigned a unique repeat code. The repeat succession for a given strain determines its spa type. The individual repeat length for the spa VNTR is usually 24 bp, but exceptions of 21 to 30 bp exist.
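The two lookups described above (repeat sequence to repeat code, then repeat succession to spa type) can be sketched as follows. The repeat sequences, repeat codes and type name are artificial placeholders, not entries from any real spa database, and the greedy left-to-right scan is an assumption made for the example rather than the spaTyper algorithm.

```python
from typing import Dict, Optional, Tuple

# Hypothetical repeat table: two 24 bp repeats and one 21 bp repeat.
REPEAT_CODES: Dict[str, str] = {
    "ACGT" * 6: "r01",
    "GGGCCC" * 4: "r02",
    "TTTAAA" * 3 + "TTT": "r03",
}

# Hypothetical table mapping a repeat succession to a type name.
SPA_TYPES: Dict[Tuple[str, ...], str] = {
    ("r01", "r02", "r02", "r03"): "toy-type-A",
}


def repeat_succession(vntr: str, repeat_codes: Dict[str, str]) -> Tuple[str, ...]:
    """Split the VNTR region into known repeats, left to right, and return their codes."""
    codes = []
    pos = 0
    while pos < len(vntr):
        for repeat, code in repeat_codes.items():
            if vntr.startswith(repeat, pos):
                codes.append(code)
                pos += len(repeat)
                break
        else:  # no known repeat starts at this position
            raise ValueError(f"unknown repeat at position {pos}")
    return tuple(codes)


def spa_type(vntr: str) -> Optional[str]:
    """Return the type for the repeat succession, or None if it is not in the table."""
    return SPA_TYPES.get(repeat_succession(vntr, REPEAT_CODES))


vntr = "ACGT" * 6 + "GGGCCC" * 4 + "GGGCCC" * 4 + "TTTAAA" * 3 + "TTT"
print(repeat_succession(vntr, REPEAT_CODES))  # ('r01', 'r02', 'r02', 'r03')
print(spa_type(vntr))                         # toy-type-A
```

A succession that is missing from the table returns None, which corresponds to a new type that would need to be registered.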
MLST 1.8 (MultiLocus Sequence Typing)
MLST predictor based on WGS data. The user can upload either a preassembled complete or partial bacterial genome or short sequence reads from one of four sequencing platforms. The MLST web server was specifically designed for ease of use: the first step is to upload the preassembled genome or short sequence reads. In the case of short sequence reads, the sequencing platform also needs to be specified. After the MLST scheme is selected, the job can be submitted.
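Once an allele number has been assigned at each locus, the sequence type is a simple table lookup on the allelic profile. The sketch below uses the seven locus names listed earlier in these slides; the profiles and sequence type names are invented placeholders, not entries from a real MLST database, and the function is not part of the MLST 1.8 server.

```python
from typing import Dict, Optional, Tuple

# Locus order used to build the profile (names as listed in the MLST slide).
LOCI = ("arc", "aro", "glp", "gmk", "pta", "tpi", "yqi")

# Hypothetical profile table: allele numbers in LOCI order -> sequence type name.
PROFILES: Dict[Tuple[int, ...], str] = {
    (1, 1, 1, 1, 1, 1, 1): "ST-toy-1",
    (2, 3, 1, 1, 4, 4, 3): "ST-toy-2",
}


def sequence_type(alleles: Dict[str, int],
                  profiles: Dict[Tuple[int, ...], str] = PROFILES) -> Optional[str]:
    """Return the sequence type for an allelic profile, or None for an unknown profile."""
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise KeyError(f"no allele call for loci: {', '.join(missing)}")
    profile = tuple(alleles[locus] for locus in LOCI)
    return profiles.get(profile)


calls = {"arc": 2, "aro": 3, "glp": 1, "gmk": 1, "pta": 4, "tpi": 4, "yqi": 3}
print(sequence_type(calls))                  # ST-toy-2
print(sequence_type({**calls, "yqi": 99}))   # None (profile not in the table)
```

A change at a single locus yields a different profile, which is why a multi-locus scheme resolves strains that a single-locus method may group together.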
CSI Phylogeny 1.1 (Call SNPs & Infer Phylogeny)
Common practice in SNP calling is to use a closely related reference genome, often a reference genome that has been sequenced and finished with respect to the study in question.
Reads were mapped to reference sequences using BWA [20]. The depth at each mapped position was calculated using genomeCoverageBed, which is part of BEDTools [21]. Single nucleotide polymorphisms (SNPs) were called using mpileup, part of SAMTools [22].
SNPs were filtered out if the depth at the SNP position was not at least 10x or at least 10% of the average depth for the particular genome mapping. The reason for applying a relative depth filter is to set different thresholds for sequencing runs that yield very different amounts of output data (total bases sequenced). SNPs were filtered out if the mapping quality was below 25 or the SNP quality was below 30. The quality scores were calculated by BWA and SAMTools, respectively. The scores are phred-based but can be converted to probabilistic scores with the formula 10^(-Q/10), where Q is the respective quality score. The probabilistic scores represent the probability of a wrong alignment or an incorrect SNP call, respectively.
In each mapping, SNPs were filtered out if they were called within the vicinity of 10 bp of another SNP (pruning). A Z-score was calculated for each SNP as described above for NDtree. The depth requirements ensure that all positions considered are covered by a minimum number of reads. The SNP quality and Z-score requirements ensure that all positions considered are also called with significant confidence with respect to the bases called at each position.
All genome mappings were then compared, and all positions where SNPs were called in at least one mapping were validated in all mappings. The validation includes both the depth check and the Z-score check, as for the SNP filtering. Any position that fails validation is ignored in all mappings.
Maximum Likelihood trees were created using FastTree.
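The depth, quality and pruning filters described above can be sketched in plain Python. This is an illustration under stated assumptions, not the CSI Phylogeny code: it assumes that both the absolute (10x) and the relative (10% of average depth) thresholds must be met, that pruning is applied after the quality filters, that "within 10 bp" means a distance of less than 10 positions, and that every SNP in a too-close pair is dropped. The Z-score check and the cross-mapping validation are left out. The `Snp` type and the example values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snp:
    position: int           # coordinate on the reference genome
    depth: int              # read depth at the SNP position
    mapping_quality: float  # phred-scaled mapping quality
    snp_quality: float      # phred-scaled SNP call quality


def phred_to_error_probability(q: float) -> float:
    """Convert a phred score Q to an error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


def passes_quality(snp: Snp, average_depth: float, min_depth: int = 10,
                   min_relative_depth: float = 0.10,
                   min_mapping_quality: float = 25, min_snp_quality: float = 30) -> bool:
    """Depth and quality filters (assumption: all thresholds must be met)."""
    return (snp.depth >= min_depth
            and snp.depth >= min_relative_depth * average_depth
            and snp.mapping_quality >= min_mapping_quality
            and snp.snp_quality >= min_snp_quality)


def prune(snps: List[Snp], min_distance: int = 10) -> List[Snp]:
    """Drop every SNP that lies closer than min_distance bp to a neighbouring SNP."""
    ordered = sorted(snps, key=lambda s: s.position)
    kept = []
    for i, snp in enumerate(ordered):
        near_previous = i > 0 and snp.position - ordered[i - 1].position < min_distance
        near_next = (i + 1 < len(ordered)
                     and ordered[i + 1].position - snp.position < min_distance)
        if not (near_previous or near_next):
            kept.append(snp)
    return kept


def filter_snps(snps: List[Snp], average_depth: float) -> List[Snp]:
    """Apply the quality filters first, then prune clustered SNPs."""
    return prune([s for s in snps if passes_quality(s, average_depth)])


# Made-up SNP calls for one mapping with an average depth of 40x.
calls = [
    Snp(100, 40, 60, 200),   # passes quality, but only 5 bp from the next SNP
    Snp(105, 35, 60, 180),   # passes quality, but only 5 bp from the previous SNP
    Snp(500, 8, 60, 150),    # depth below 10x
    Snp(900, 50, 20, 120),   # mapping quality below 25
    Snp(1500, 45, 60, 90),   # passes all filters
]
print([s.position for s in filter_snps(calls, average_depth=40)])  # [1500]
```

With these thresholds, a SNP quality of 30 corresponds to an error probability of 10^(-3), that is one wrong call in a thousand.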
Our WGS results
SpeciesFinder 1.2
KmerFinder 2.0
PathogenFinder 1.1
ResFinder 2.1
spaTyper 1.0
MLST 1.8 (MultiLocus Sequence Typing)
CSI Phylogeny 1.1 (Call SNPs & Infer Phylogeny)