The University of Melbourne

Supporting data and files for, "Distinct evolutionary dynamics of horizontal gene transfer in drug resistant and virulent clones of Klebsiella pneumoniae."

Version 2 2018-09-03, 04:38
Version 1 2018-09-03, 04:33
Posted on 2018-09-03 - 04:38 authored by KELLY WYRES
This collection contains supporting data files and code for the comparative analysis of 28 distinct K. pneumoniae clones described in "Distinct evolutionary dynamics of horizontal gene transfer in drug resistant and virulent clones of Klebsiella pneumoniae."

CONTENTS:
1. Assemblies directory: 1092 genome assemblies in fasta format for K. pneumoniae isolates included in the comparative analyses
2. Annotations directory: gff formatted annotation files for each of the 1092 genomes (can be used as input for Roary pan-genome analysis)
3. Reference_chromosomes directory: 28 completed chromosomal reference sequences, one for each of the clones
4. Gubbins recombination analysis files directory:
i) 28 pseudo-chromosomal alignments, one for each of the clones (can be used as input for recombination detection and/or phylogenetic analyses).
ii) 28 Gubbins output recombination predictions (.gff, one per clone).
iii) 28 Gubbins per branch statistics files (one per clone).
iv) parseGubbins2counts.py - Python script for parsing the recombination prediction files and calculating mean numbers of recombination events per window of the chromosome.
5. Pan-genome directory:
i) pan_genome_gene_content_matrix.tsv - tab delimited matrix with genes in rows and genomes in columns. 1 = gene present. 0 = gene absent.
ii) pan_genome_roary_presence_absence_output.csv - direct output from Roary pan-genome analysis
iii) pan_genome_PCA_coords.tsv - coordinates for first 40 principal components in tab delimited format, genomes in rows, coordinates in columns marked Axis1-40.
iv) pan_genome_PCA_coords_clone_centroids.tsv - coordinates for clone centroids in tab delimited format as above
v) summarisePanGenomeDistancesFromCentroids.py - Python script for calculating individual Euclidean distances to clone centroids. Takes the genome coordinates and clone centroid coordinates files as inputs.
vi) accessory_gene_ancestry_by_clone.tsv - tab delimited table with clones in rows and genera in columns. Values indicate the proportion of accessory genes from each clone assigned to each genus by Kraken.
6. Phage directory:
i) phage_gene_content_matrix.tsv - phage gene content matrix in tab delimited format with genomes in rows and phage genes in columns. 1 = gene present. 0 = gene absent.
ii) phage_gene_reference_sequences.fasta - reference fasta sequences for the phage genes reported in the gene content matrix above.
iii) phagePCA_coords.tsv - coordinates for first 25 principal components in tab delimited format, genomes in rows, coordinates in columns marked Axis1-25.
iv) phagePCA_coords_clone_centroids.tsv - coordinates for clone centroids in tab delimited format as above
7. Defence mechanisms directory:
i) NTUH-K2044_cas_genes.fasta - strain NTUH-K2044 cas gene nucleotide sequences in fasta format.
ii) INF256_cas_genes.fasta - strain INF256 cas gene nucleotide sequences in fasta format.
iii) REase_reference_sequences.fasta - references for types I, II, III and IV REase sequences in fasta format.
8. CG15 subclades analyses directory:
i) CG15_KL2_subclade_pseudo_whole_genome_aln.fasta - pseudo-chromosomal alignment for CG15-KL2 subclade (input for Gubbins).
ii) CG15_other_subclade_pseudo_whole_genome_aln.fasta - pseudo-chromosomal alignment for CG15-other subclade (input for Gubbins).
iii) CG15_KL2_subclade.recombination_predictions.gff - Gubbins recombination predictions for CG15-KL2 subclade.
iv) CG15_other_subclade.recombination_predictions.gff - Gubbins recombination predictions for CG15-other subclade.
v) CG15_KL2_subclade.per_branch_statistics.csv - Gubbins branch statistics for CG15-KL2 subclade.
vi) CG15_other_subclade.per_branch_statistics.csv - Gubbins branch statistics for CG15-other subclade.
vii) CG15_subclades_PCA_coords.tsv - coordinates for first 40 principal components in tab delimited format, genomes in rows, coordinates in columns marked Axis1-40. CG15 split into CG15-KL2 and CG15-other.
viii) CG15_subclades_PCA_coords_clone_centroids.tsv - coordinates for clone centroids in tab delimited format as above.
ix) CG15_KL2_dated_genomes_beast.xml - XML input file for BEAST2 analysis for CG15-KL2 subclade (note only genomes for which isolate collection dates are known are included).
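
EXAMPLE (illustrative only): The distance-to-centroid calculation described for summarisePanGenomeDistancesFromCentroids.py can be sketched as below. This is not the script from the collection; it is a minimal sketch that assumes both coordinate files have a label in the first column followed by numeric Axis columns. The function names and the genome_to_clone mapping are hypothetical, since the collection description does not state how genomes are assigned to clones. The toy tables stand in for the real files.

```python
import csv
import io
import math


def read_coords(handle):
    """Read a tab-delimited table: first column is a row label,
    remaining columns are numeric axes (e.g. Axis1, Axis2, ...)."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    coords = {}
    for row in reader:
        if row:
            coords[row[0]] = [float(value) for value in row[1:]]
    return coords


def distances_to_centroids(genome_coords, centroid_coords, genome_to_clone):
    """Euclidean distance from each genome to the centroid of its clone.

    genome_to_clone is a hypothetical dict {genome label: clone label};
    how this mapping is stored in the real data is an assumption.
    """
    return {
        genome: math.dist(genome_coords[genome], centroid_coords[clone])
        for genome, clone in genome_to_clone.items()
    }


# Toy stand-ins for the genome and clone centroid coordinate files.
genomes_tsv = "genome\tAxis1\tAxis2\nG1\t0.0\t0.0\nG2\t3.0\t4.0\n"
centroids_tsv = "clone\tAxis1\tAxis2\nCloneA\t0.0\t0.0\n"

genome_coords = read_coords(io.StringIO(genomes_tsv))
centroid_coords = read_coords(io.StringIO(centroids_tsv))
result = distances_to_centroids(
    genome_coords, centroid_coords, {"G1": "CloneA", "G2": "CloneA"}
)
print(result)  # {'G1': 0.0, 'G2': 5.0}
```

To try this on the real files, replace the io.StringIO objects with open file handles and supply a genome-to-clone mapping from whatever metadata is available.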

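EXAMPLE (illustrative only): The per-window recombination counting described for parseGubbins2counts.py can be approximated as below. This is not the script from the collection, and the description does not state how that script defines its windows or its mean. The sketch assumes standard nine-column GFF lines with 1-based start and end in columns 4 and 5, and counts every event that overlaps each fixed-size window. The function names, window size and chromosome length are hypothetical.

```python
import io


def parse_gff_intervals(handle):
    """Return (start, end) pairs from GFF feature lines, skipping comments."""
    intervals = []
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        intervals.append((int(fields[3]), int(fields[4])))
    return intervals


def count_events_per_window(intervals, chrom_length, window_size):
    """Count events overlapping each fixed-size window of the chromosome."""
    n_windows = -(-chrom_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for start, end in intervals:
        first = max((start - 1) // window_size, 0)
        last = min((end - 1) // window_size, n_windows - 1)
        for window in range(first, last + 1):
            counts[window] += 1
    return counts


# Toy stand-in for a recombination predictions file.
toy_gff = (
    "##gff-version 3\n"
    "SEQ\tGUBBINS\tCDS\t100\t300\t0.000\t.\t0\tnode=toy1\n"
    "SEQ\tGUBBINS\tCDS\t600\t700\t0.000\t.\t0\tnode=toy2\n"
    "SEQ\tGUBBINS\tCDS\t240\t260\t0.000\t.\t0\tnode=toy3\n"
)

intervals = parse_gff_intervals(io.StringIO(toy_gff))
counts = count_events_per_window(intervals, chrom_length=1000, window_size=250)
print(counts)  # [2, 2, 1, 0]
```

A mean per window could then be taken across clones or across genomes, depending on the question; that choice is left open here because the source does not specify it.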

CITATION:
Kelly L Wyres, Ryan R Wick, Louise M Judd, Roni Froumine, Alex Tokolyi, Claire Gorrie, Margaret M C Lam, Sebastián Duchêne, Adam Jenney and Kathryn E Holt. 2018. Distinct evolutionary dynamics of horizontal gene transfer in drug resistant and virulent clones of Klebsiella pneumoniae.
