Acinetobacter baumannii GC1 recombination analysis
Dataset posted on 12.01.2016 by Kathryn Holt
Analysis of recombination within the genomes of Acinetobacter baumannii Global Clone 1 (GC1).
These data accompany the paper:
Holt KE, Kenyon JJ, Hamidian M, Schultz MB, Pickard DJ, Dougan G, Hall RM. "Five decades of genome evolution in the globally distributed, extensively antibiotic resistant pathogen Acinetobacter baumannii GC1", Microbial Genomics, 2016.
The files provided here are in the formats required for viewing in JScandy, available at http://jameshadfield.github.io/JScandy/. Simply drag and drop the GFF, CSV and tree files onto the browser window to view the data in interactive format.
The reference genome for Acinetobacter baumannii GC1 strain A1 was sequenced using PacBio and Illumina HiSeq platforms, and is available in NCBI under accession NZ_CP010781.1 (provided here in GFF format, "A1.gff").
Additional genomes were sequenced using Illumina HiSeq (accessions for individual genome data are given in Supp Table 1) or sourced from public data sets (accessions for individual genome data are given in Supp Table 2).
Recombination analysis was performed using Gubbins (available at https://github.com/sanger-pathogens/gubbins); see the paper cited above (Holt et al., Microbial Genomics, 2016) for details. Recombination blocks detected by Gubbins are in the file GubbinsOutput.gff.
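For readers who want to work with the recombination blocks outside JScandy, the following is a minimal Python sketch of reading block coordinates from a GFF file. It assumes only the generic GFF layout (nine tab-separated columns, with start and end in columns 4 and 5); the function name read_blocks and the toy records are hypothetical, and the exact attributes Gubbins writes are not relied on here.

```python
import io
from typing import Iterable, List, Tuple


def read_blocks(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs (1-based, inclusive) from GFF feature lines."""
    blocks = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF feature line has nine tab-separated columns; skip anything else.
        if len(fields) < 9:
            continue
        blocks.append((int(fields[3]), int(fields[4])))
    return blocks


# Toy input for illustration only; these are NOT records from GubbinsOutput.gff.
sample = io.StringIO(
    "##gff-version 3\n"
    "seq1\ttoy\tregion\t100\t250\t.\t.\t.\tID=b1\n"
    "seq1\ttoy\tregion\t200\t400\t.\t.\t.\tID=b2\n"
)
blocks = read_blocks(sample)
print(blocks)
```

To apply the same function to the file provided here, open it with open("GubbinsOutput.gff") and pass the file handle in place of the toy input.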
Gubbins also outputs a maximum likelihood phylogeny for the GC1 genomes, which excludes these recombinant blocks and therefore captures the vertical inheritance relationships between the genomes. It is given in the Newick-format tree file GC1.tree, and is the tree shown in Figure 1 of the paper.
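As a quick check of which genomes a Newick tree contains, the tip labels can be pulled out with a short Python sketch such as the one below. It assumes unquoted labels without spaces or parentheses (a dedicated tree library would be more robust), and the function name leaf_names and the toy tree are hypothetical rather than taken from GC1.tree.

```python
import re
from typing import List


def leaf_names(newick: str) -> List[str]:
    """Return tip labels from a Newick string with simple, unquoted labels."""
    # A tip label directly follows "(" or ","; internal node labels and
    # support values follow ")" and are therefore not matched.
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Toy tree for illustration only; not the contents of GC1.tree.
toy_tree = "((A:0.1,B:0.2):0.05,C:0.3);"
print(leaf_names(toy_tree))
```

For the real tree, read the contents of GC1.tree into a string and pass it to the same function.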
For strains with known isolation dates, BEAST (http://beast.bio.ed.ac.uk) was used to generate a dated phylogeny from the recombination-free alignment (file GC1_BEAST.tree); this is the tree shown in Figure 2 of the paper.
Genotypes for the capsular synthesis (K) locus and the outer core polysaccharide (OC) locus were determined from the genome assemblies and are listed in K_OC.csv.
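A genotype table of this kind can be summarised with Python's standard csv module, as in the sketch below. The column headers ("strain", "K", "OC") and the values are placeholders chosen for illustration; the actual headers in K_OC.csv may differ and should be checked before use.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable


def tally(rows: Iterable[Dict[str, str]], column: str) -> Counter:
    """Count how many rows carry each value in the given column."""
    return Counter(row[column] for row in rows)


# Toy table for illustration only; headers and values are hypothetical.
toy_csv = io.StringIO(
    "strain,K,OC\n"
    "s1,typeA,typeX\n"
    "s2,typeA,typeY\n"
    "s3,typeB,typeX\n"
)
rows = list(csv.DictReader(toy_csv))
k_counts = tally(rows, "K")
oc_counts = tally(rows, "OC")
print(k_counts.most_common())
print(oc_counts.most_common())
```

With the real file, replace the toy input with open("K_OC.csv", newline="") and substitute the column names found in its header row.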
Screenshots from the visualisation of these files (ML tree only) using JScandy are also included. The plot at the bottom of each screenshot shows the number of recombination events per base; in the whole-genome view, 4 peaks (recombination hotspots) can be identified. These correspond to the K locus (coordinates 88 kbp - 110 kbp, near the start of the genome); the gene encoding outer membrane protein CarO (coordinate 2.934 Mbp); the gene encoding the AmpC beta-lactamase, which is associated with antibiotic resistance (coordinate 2.733 Mbp); and the OC locus (coordinates 3.366 Mbp - 3.375 Mbp, near the end of the genome). Zoomed-in views of these regions are also given in the static screenshots.
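One plausible way to reproduce a per-base count of overlapping recombination blocks is a difference array, sketched below in Python. This is an illustration under stated assumptions, not necessarily how JScandy computes its plot: blocks are taken as 1-based inclusive (start, end) pairs, each block counts once, and the function name per_base_counts and the toy coordinates are hypothetical.

```python
from typing import List, Tuple


def per_base_counts(blocks: List[Tuple[int, int]], genome_length: int) -> List[int]:
    """Return the number of blocks covering each position 1..genome_length."""
    # diff[p] holds the change in coverage when moving onto position p.
    diff = [0] * (genome_length + 2)
    for start, end in blocks:
        diff[start] += 1      # coverage rises at the block start
        diff[end + 1] -= 1    # and falls just after the block end
    counts = []
    running = 0
    for pos in range(1, genome_length + 1):
        running += diff[pos]
        counts.append(running)
    return counts  # counts[i] is the coverage at position i + 1


# Toy blocks on a 10-base "genome", for illustration only.
toy_blocks = [(2, 5), (4, 8), (4, 6)]
counts = per_base_counts(toy_blocks, 10)
print(counts)
```

Peaks in such a profile mark positions covered by many blocks. On the real data, the maximum within a coordinate range, for example the 88 kbp - 110 kbp range given for the K locus, could then be compared with the genome-wide background.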