16S Sequencing vs Shotgun Metagenomic Sequencing
Which to Choose For Microbiome Studies
16S sequencing or shotgun sequencing? Almost all microbiome researchers ask themselves this question when planning a new study, because the vast majority of microbiome publications use either 16S rRNA gene sequencing or shotgun metagenomic sequencing to generate raw data for subsequent microbial profiling or metagenomic analyses. Each method has its pros and cons, so which method should you choose?
What is 16S rRNA Gene Sequencing?
16S rRNA gene sequencing, or simply 16S sequencing, uses PCR to target and amplify portions of the hypervariable regions (V1-V9) of the bacterial 16S rRNA gene [1]. Amplicons from separate samples are then given molecular barcodes, pooled together, and sequenced. After sequencing, the raw data are analyzed with a bioinformatics pipeline that includes trimming, error correction, and comparison to a 16S reference database. Once the reads are assigned to a phylogenetic rank, a taxonomy profile can be generated. ITS sequencing follows the same strategy but targets the ITS (internal transcribed spacer) region found in fungal genomes.
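The barcoding and pooling step described above can be illustrated with a minimal sketch. It assumes each pooled read starts with an inline barcode that must match exactly; the barcode sequences, sample names, and the `demultiplex` function are hypothetical, and real pipelines often read barcodes from a separate index read and tolerate mismatches.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group pooled reads by sample, using a barcode at the start of each read.

    Assumes all barcodes have the same length and must match exactly.
    Reads with an unknown barcode are collected under "undetermined".
    """
    barcode_len = len(next(iter(barcode_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        barcode, amplicon = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(amplicon)
    return dict(by_sample)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = [
    "ACGTGGATTAGATACCC",
    "TGCAGGATTAGATACCC",
    "ACGTCCTACGGGAGGCA",
    "NNNNCCTACGGGAGGCA",
]
result = demultiplex(pooled, barcodes)
for sample, amplicons in sorted(result.items()):
    print(sample, len(amplicons))
```

After this step, each sample's reads can be trimmed, error-corrected, and compared to a reference database independently.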
What is Shotgun Metagenomic Sequencing?
Unlike 16S sequencing, which targets only 16S rRNA genes, shotgun metagenomic sequencing sequences all of the genomic DNA in a sample. The library preparation workflow is similar to that of regular whole genome sequencing, including random fragmentation and adapter ligation. A typical workflow for taxonomy analysis of shotgun metagenomic data includes quality trimming and comparison to a reference database comprising whole genomes (e.g. Kraken [2] and Centrifuge [3]) or selected marker genes (e.g. MetaPhlAn [4] and mOTU [5]) to generate a taxonomy profile. Because shotgun metagenomic sequencing covers all genetic information in a sample, the data can be used for additional analyses, e.g. metagenomic assembly and binning, metabolic function profiling, and antibiotic resistance gene profiling.
16S/ITS sequencing vs. shotgun metagenomic sequencing
If your study requires genomic analyses beyond taxonomy profiling, such as metabolic pathway analysis, you should consider shotgun metagenomic sequencing due to its greater genomic coverage and data output. If composition profiling is the main purpose of the study, both techniques have pros and cons to be considered (Table 1).
The taxonomy resolution of 16S/ITS sequencing depends on the variable regions targeted, the organism itself, and the sequence analysis algorithm. In recent years, error-correction methods such as DADA2 [6] have dramatically improved the accuracy and taxonomy resolution of this technique. With DADA2, species-level resolution for many organisms using regular 16S sequencing is now a reality. In theory, shotgun metagenomic sequencing can achieve strain-level resolution because it can cover all genetic variation, although in practice the accuracy of strain-level resolution still faces technical challenges. Even so, shotgun metagenomic sequencing achieves higher resolution than 16S/ITS sequencing.
If metabolic function analysis is a goal, most researchers will quickly rule out 16S and ITS sequencing. There are, however, some tools that can infer metabolic function from taxonomy data, e.g. PICRUSt [7]. In general, shotgun metagenomic sequencing is still the usual choice when functional profiling is required, because of its additional gene coverage.
Microbial coverage and recommended sample type
Shotgun sequencing examines all metagenomic DNA, while 16S sequencing examines only 16S rRNA genes and also suffers from incomplete primer coverage. Consequently, the former has greater cross-domain coverage. Why, then, does Table 1 denote 16S/ITS sequencing as better in bacterial and fungal coverage? This stems from the species coverage of the available reference databases, because the taxonomy prediction of both sequencing approaches depends heavily on the reference database used. Currently, the coverage of 16S/ITS databases is much better than that of whole-genome databases. However, the whole genomes of microbes associated with the human microbiome are much better studied than genomes of microbes associated with other environments. This is why shotgun metagenomic sequencing is recommended for human-microbiome-related samples, such as feces and saliva, if taxonomy profiling is the main purpose.
Moreover, shotgun metagenomic sequencing has a higher dependence on the reference database. For example, if a bacterium has no closely related representative in the 16S reference database, you might still be able to identify it at a higher phylogenetic rank or as an unknown bacterium. With shotgun metagenomic sequencing, however, if a bacterium does not have a close relative (a genome from the same genus) in the reference genome database, you are likely to miss it completely. For example, the ZymoBIOMICS Spike-in Control I contains two microbes alien to the human microbiome (Imtechella halotolerans and Allobacillus halotolerans), whose genomes were previously not available. If you spike it into a fecal sample and sequence the sample with shotgun sequencing, most bioinformatics pipelines will miss both microbes completely unless you manually add their genomes to the reference database. If the sample is analyzed with 16S sequencing, on the other hand, they will be identified because their 16S sequences are present in reference databases.
Error-correction tools, such as DADA2, improve not only the taxonomy resolution of 16S/ITS sequencing but also its accuracy. This is demonstrated by sequencing DNA from a mock microbial community (e.g. ZymoBIOMICS Microbial Community Standard): all 16S sequences are recovered with no errors in the sequence, i.e. with no false positives. With shotgun metagenomic sequencing, however, unless the reference database contains a perfect representative genome for a sequenced microbe, the bioinformatics analysis is likely to predict the presence of multiple closely related genomes. These closely related genomes can come from different species of the same genus or even from different genera. For example, assume there are three closely related microbes, A, B, and C. A shares some sequences only with B and some other sequences only with C. If the reference database contains genomes from B and C but not A, then when A is sequenced, the analysis will predict that both B and C are present. For instance, A and B could be strains of Escherichia coli and C could be Salmonella enterica; the sequences uniquely shared by A and C may stem from horizontal gene transfer, which is common between closely related microbes. Because of this, 16S/ITS sequencing performs better with regard to false positives.
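The A, B, and C scenario above can be reproduced with a toy sketch. It assumes a deliberately simplified classifier that reports every reference genome sharing at least one exact k-mer with a read; the sequences, the `classify` function, and the choice of k = 8 are all hypothetical and do not represent how any particular tool works.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(read, reference, k=8):
    """Return the names of reference genomes sharing an exact k-mer with the read."""
    read_kmers = kmers(read, k)
    return {name for name, genome in reference.items()
            if read_kmers & kmers(genome, k)}


# Hypothetical toy sequences, for illustration only.
shared_with_b = "ATGGCTAGCTAGGATCCGATCGTA"  # region A shares only with B
shared_with_c = "TTGACCGGTAACGTTAGCCATGCA"  # region A shares only with C
genome_a = shared_with_b + shared_with_c
genome_b = shared_with_b + "GGGGCCCCAAAATTTTGGCCAATT"
genome_c = "CACACAGAGAGTGTGTCTCTCACA" + shared_with_c

# The reference database contains B and C, but not A.
reference = {"B": genome_b, "C": genome_c}

# Two reads drawn from different regions of genome A.
reads_from_a = [genome_a[0:20], genome_a[26:46]]

detected = set()
for read in reads_from_a:
    detected |= classify(read, reference)

print(sorted(detected))  # ['B', 'C']
```

Only A was sequenced, yet the profile reports both B and C. Adding A's genome to the reference would not by itself remove the ambiguity in this toy version; a real classifier would also need to weigh how well each read matches each candidate genome.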
Host DNA interference
The presence of too much host DNA can cause non-specific amplification during library preparation for 16S and ITS sequencing, but the impact can be controlled by adjusting the number of PCR cycles and changing primers. Host DNA interference is a much more difficult problem for shotgun metagenomic sequencing, even though the cost of sequencing has decreased dramatically. Depending on the sample type, a sample can contain >99% human host DNA, which not only increases sequencing cost but also introduces uncertainty into the measurement. This is why many researchers look into host DNA depletion, e.g. with the HostZERO Microbial DNA Kit, before library preparation for shotgun sequencing. However, shotgun sequencing typically requires a minimum input of 1 ng of DNA, and there may not be enough microbial genomic DNA left after host DNA depletion. Host DNA interference is why shallow shotgun sequencing is recommended only for human fecal samples.
While shotgun metagenomic sequencing requires a minimum DNA input of 1 ng, 16S/ITS sequencing is much more sensitive, with input minima in the femtogram range or even as low as 10 copies of the 16S rRNA gene.
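The effect of host DNA on sequencing cost can be made concrete with a small calculation. The sketch below assumes that reads are drawn from host and microbial DNA in proportion to their share of the sample, so the expected number of microbial reads is the total multiplied by (1 - host fraction). The `total_reads_needed` function and the target of one million microbial reads are hypothetical, chosen only for illustration.

```python
import math
from fractions import Fraction


def total_reads_needed(target_microbial_reads, host_fraction):
    """Total reads to sequence so the expected microbial reads reach the target.

    Assumes reads are sampled in proportion to the DNA composition, so
    expected microbial reads = total reads * (1 - host_fraction).
    """
    # Fraction avoids floating-point rounding artifacts for inputs like 0.99.
    host = Fraction(str(host_fraction))
    if not 0 <= host < 1:
        raise ValueError("host_fraction must be at least 0 and less than 1")
    return math.ceil(target_microbial_reads / (1 - host))


# Hypothetical target: one million microbial reads per sample.
for host_fraction in (0.0, 0.5, 0.99):
    print(host_fraction, total_reads_needed(1_000_000, host_fraction))
```

Under these assumptions, a sample with 99% host DNA needs 100 times as many total reads as a host-free sample to yield the same number of microbial reads.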
Here to Help
The choice between 16S sequencing and shotgun metagenomic sequencing is a critical step for all microbiome studies. In addition to budget, many aspects of the project need to be considered including sample type, desired analyses, taxonomy resolution, and target organisms. Taking these considerations into account will help you choose the right sequencing method for your next microbiome project. The ZymoBIOMICS microbiome sequencing services offer 16S, ITS, and shotgun sequencing as complete services from DNA extraction through sequencing and bioinformatics. The sequencing and bioinformatics experts at Zymo Research are happy to help you choose the right sequencing method for your study.
1. Laudadio I, Fulci V, Stronati L, Carissimi C. Next-Generation Metagenomics: Methodological Challenges and Opportunities. OMICS: A Journal of Integrative Biology 2019 23(7): 327-333.
2. Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biology 2014 15(3): R46.
3. Kim D, Song L, Breitwieser FP, Salzberg SL. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Research 2016 26(12): 1721-1729.
4. Segata N, Waldron L, Ballarini A, Narasimhan V, Jousson O, Huttenhower C. Metagenomic microbial community profiling using unique clade-specific marker genes. Nature Methods 2012 9(8): 811-814.
5. Sunagawa S, et al. Metagenomic species profiling using universal phylogenetic marker genes. Nature Methods 2013 10(12): 1196-1199.
6. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High resolution sample inference from Illumina amplicon data. Nature Methods 2016 13(7): 581-583.
7. Langille MGI, Zaneveld J, Caporaso JG, McDonald D, Knights D, Reyes JA, Clemente JC, Burkepile DE, Vega Thurber RL, Knight R, Beiko RG, Huttenhower C. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology 2013 31(9): 814-821.