Draft Genome Sequence of Mycobacterium arupense Strain GUC1

2015 
Mycobacterium arupense is a rapidly growing nonchromogenic mycobacterium that is closely related to the Mycobacterium terrae complex and has been isolated from clinical samples, most commonly sputum, as well as from environmental water sources (1–3). Multiple reports of tenosynovitis and osteoarticular infections with M. arupense have also been published, including infections caused by the type strain AR30097 (4–8). Although M. arupense has generally been identified by sequence analysis, the phenotypic properties that led to its classification as a species include its inability to grow at 42°C, rapid growth at 30°C, variable pyrazinamidase activity, and mycolic acid patterns that distinguish it from M. terrae (1). Rapidly growing mycobacteria are commonly isolated acid-fast bacilli in the clinical microbiology laboratory and vary in clinical importance (9, 10).
We sequenced the first draft genome of M. arupense from an isolate recovered from a sputum sample of a patient diagnosed with bronchiectasis. The isolate was originally typed as M. terrae complex by high-performance liquid chromatography; however, genome sequencing and analysis of the 16S and rpoB sequences revealed its identity as M. arupense.
DNA from M. arupense strain GUC1 was extracted using the Qiagen EZ1 kit, and paired-end libraries were prepared using the Nextera XT DNA library kit, followed by sequencing on the Illumina MiSeq. Reads were adapter trimmed and quality trimmed (Q20) using cutadapt, assembled de novo using SPAdes v3.5, screened metagenomically for contaminating sequences with SURPI, and annotated with prokka v1.1 (11–14). A total of 6,386,174 paired-end reads with an average length of 117 nucleotides were recovered after trimming. De novo assembly yielded 173 contigs for a total assembly size of 4,441,412 bp, with an N50 of 56,189 bp, an average coverage of 115×, and a total of 4,182 coding sequences.
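For readers less familiar with the N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths: contigs are sorted from longest to shortest, and the N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions, not the GUC1 contigs or the code used in this study.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) for a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until at least half
    # of the total assembly size has been accumulated.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not the GUC1 data): total size is 300 bp,
# and the two longest contigs (100 + 80) are the first to reach 150 bp.
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # prints 80
```

Applied to the lengths of all contigs in an assembly, the same calculation would give the kind of N50 value quoted in the text.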
Contiguity was most likely disrupted by the high G+C content (67%) along with several high-copy-number integrases, transposases, and recombinases that were longer than the sequence read length. Other high-copy-number contigs included those containing genes for the ESX/type VII secretion system, a distantly related 3-methyladenine glycosylase, and a copper-transporting ATPase. The assembly also includes 44 kb across two contigs that aligns with 99 to 100% nucleotide identity to the pMK12478 plasmid from Mycobacterium kansasii strain ATCC 12478 (15). Otherwise, the closest aligning sequenced genomes were Mycobacterium sp. JDM601 and Mycobacterium avium strains E1/E93, at approximately 80% nucleotide identity. By Comprehensive Antibiotic Resistance Database analysis, the GUC1 strain includes an ampC beta-lactamase and two metallo-beta-lactamases, which show 80%, 90%, and 77% amino acid identity, respectively, to those of M. avium strain Env 77 (16, 17).
Nucleotide sequence accession numbers. This whole-genome shotgun project has been deposited at DDBJ/ENA/GenBank under the accession no. LASW00000000. The assembly described in this paper is the second version, LASW02000000.