Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens
Date
2020-05-12
Publisher
BMC (part of Springer Nature)
Abstract
Background: Careful consideration of experimental artefacts is required in order to successfully apply high-throughput
16S ribosomal ribonucleic acid (rRNA) gene sequencing technology. Here we introduce experimental
design, quality control and “denoising” approaches for sequencing low biomass specimens.
Results: We found that bacterial biomass is a key driver of 16S rRNA gene sequencing profiles generated from
bacterial mock communities and that the use of different deoxyribonucleic acid (DNA) extraction methods [DSP
Virus/Pathogen Mini Kit® (Kit-QS) and ZymoBIOMICS DNA Miniprep Kit (Kit-ZB)] and storage buffers [PrimeStore®
Molecular Transport medium (Primestore) and Skim-milk, Tryptone, Glucose and Glycerol (STGG)] further influence
these profiles. Kit-QS better represented hard-to-lyse bacteria from bacterial mock communities compared to Kit-ZB.
Primestore storage buffer yielded lower levels of background operational taxonomic units (OTUs) from low
biomass bacterial mock community controls compared to STGG. In addition to bacterial mock community controls,
we used technical repeats (nasopharyngeal and induced sputum processed in duplicate, triplicate or quadruplicate)
to further evaluate the effect of specimen biomass and participant age at specimen collection on resultant
sequencing profiles. We observed a positive correlation (r = 0.16) between specimen biomass and participant age at
specimen collection: low biomass technical repeats (represented by < 500 16S rRNA gene copies/μl) were primarily
collected at < 14 days of age. We found that low biomass technical repeats also produced higher alpha diversities
(r = −0.28); 16S rRNA gene profiles similar to no template controls (Primestore); and reduced sequencing
reproducibility. Finally, we show that the use of statistical tools for in silico contaminant identification, as
implemented through the decontam package in R, provides better representations of indigenous bacteria following
decontamination.
Conclusions: We provide insight into experimental design, quality control steps and “denoising” approaches for
16S rRNA gene high-throughput sequencing of low biomass specimens. We highlight the need for careful
assessment of DNA extraction methods and storage buffers; sequence quality and reproducibility; and in silico
identification of contaminant profiles in order to avoid spurious results.
Description
CITATION: Claassen-Weitz, S., et al. 2020. Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens. BMC Microbiology, 20:113, doi:10.1186/s12866-020-01795-7.
The original publication is available at https://bmcmicrobiol.biomedcentral.com
Keywords
Low biomass nasopharyngeal -- Analysis, Gene sequencing, Bacteria -- Identification, Biomass -- Bacteriology, Ribonucleic acid
Citation
Claassen-Weitz, S., et al. 2020. Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens. BMC Microbiology, 20:113, doi:10.1186/s12866-020-01795-7