The microbiomics challenge, organized as part of the sbv IMPROVER project, aims to address questions related to microbiome composition and function. As a first step, the “Microbiota composition prediction” challenge evaluates computational pipelines for their ability to predict the microbial composition of samples from sequencing data. We expect future challenges to address changes in the microbiome in diseases such as Inflammatory Bowel Disease, and/or under the influence of specific lifestyle factors. The diagnostic power of metagenomics data, alone and complemented with other assay data such as metabolomics or host transcriptomics, may also be investigated in order to understand the value such data add to metagenomics for disease diagnosis or toxicological assessment.

Introduction to the microbiota composition prediction challenge

The biological interpretation of changes to the microbiome relies on accurate qualitative and quantitative measurement and inference of microbial community composition and function, using advanced sequencing technologies and computational analysis approaches. Choosing the most suitable tool is challenging, as there is a large and ever-increasing variety of computational methods, and the question of how to benchmark them objectively is still being explored. A few crowdsourced initiatives have been conducted to evaluate the performance of metagenomics data analysis methods and to provide guidance to the scientific community. The two Assemblathon efforts, run in 2010 and 2012 (Earl et al. 2011, Bradnam et al. 2013), focused on evaluating the performance of genome assembly methods. The CAMI (Critical Assessment of Metagenome Interpretation) team, in collaboration with the metagenomics community, organized a challenge in 2015 that aimed at evaluating metagenomics methods for assembly, binning, and taxonomic profiling. CAMI provided an extensive benchmarking dataset to participants. Among the many results collected, CAMI observed that (i) a good assembly step is crucial for subsequent binning; and (ii) taxonomic profiling tools accurately predict higher-level taxa (e.g., family level) but give poor predictions for lower-level taxa (e.g., species level). In a spirit of continuity with CAMI and Assemblathon, the “microbiota composition prediction” challenge, which is the first phase of a series in the field of microbiomics, aims at objectively assessing the performance of microbiomics computational analysis pipelines as a whole, i.e. from quality control to taxonomic profiling, for the recovery of the relative abundance and taxonomic assignment of bacterial communities, rather than assessing the individual steps of the process as CAMI already did.
The participants are provided with shotgun DNA sequencing data for several microbiome samples and are asked to predict, at the phylum, genus, and species level, the composition and relative abundance of bacterial communities present in each sample. The participants will have the freedom to use any private/public datasets to set up and test their approach.
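
As an illustration of the expected kind of output, the following minimal Python sketch computes relative abundances at the phylum, genus, and species levels from per-read taxonomic assignments. The input format, the example lineages, and the function name relative_abundance are assumptions made for this illustration; they are not the challenge's submission format. Real pipelines typically also correct for factors such as genome length, which this sketch ignores.

```python
from collections import defaultdict

# Ranks at which the challenge asks for predictions.
RANKS = ("phylum", "genus", "species")

# Hypothetical per-read assignments: one (phylum, genus, species) tuple per read.
assignments = [
    ("Firmicutes", "Lactobacillus", "Lactobacillus casei"),
    ("Firmicutes", "Lactobacillus", "Lactobacillus casei"),
    ("Firmicutes", "Clostridium", "Clostridium perfringens"),
    ("Bacteroidetes", "Bacteroides", "Bacteroides fragilis"),
]


def relative_abundance(lineages, rank):
    """Return {taxon: fraction of assigned reads} at the requested rank."""
    idx = RANKS.index(rank)  # raises ValueError for an unknown rank
    counts = defaultdict(int)
    for lineage in lineages:
        counts[lineage[idx]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


if __name__ == "__main__":
    for rank in RANKS:
        print(rank, relative_abundance(assignments, rank))
```

Counting reads per taxon and dividing by the total is the simplest possible estimator; it is shown here only to make the notion of relative abundance at a given rank concrete.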


Background on Microbiomics


Microbiology is the study of microscopic organisms called microbes or microorganisms. Diverse types of microorganisms exist, including bacteria, archaea, and viruses. Microorganisms populate most of the earth and can be found in every part of the biosphere, including soil and oceans. They are present on the epithelial surfaces and in the digestive tract of higher organisms such as humans (Figure below), and the analysis of the composition of these populations is a rapidly expanding field of research.

The microbiome

In higher organisms, the microbiome comprises a complex collection of microorganisms colonizing different body niches, such as the gut, mouth, genitals, skin, or airways. The composition of this microorganism population varies depending on the body part and the health status of the individual (Figure below). The human microbiome is known to have a beneficial role in homeostasis, assisting for example in the bioconversion of nutrients and detoxification, supporting immunity, protecting against pathogenic microbes, and maintaining host development, metabolism, and physiology (Lloyd-Price et al. 2016, Koppel et al. 2017). It is now understood that a well-balanced interaction between microbes and the host is essential to health. Moreover, growing evidence suggests that the function of the indigenous microbiota can be influenced by many factors, including genetics, diet, age, and toxins. The disruption of this balance, called dysbiosis, is associated with a plethora of diseases, including cancers, immune-related diseases, metabolic diseases, inflammatory bowel disease, pulmonary pathologies, oral diseases, skin problems, and neurological disorders (Turnbaugh et al. 2007, Schuppan et al. 2009, Benson et al. 2010, Koren et al. 2012, Sommer and Backhed 2013, Galipeau et al. 2015, Riiser 2015, Caminero et al. 2016, Scher et al. 2016, Vatanen et al. 2016, Vogtmann and Goedert 2016, Blázquez and Berin 2017, Roy and Trinchieri 2017, Shukla et al. 2017). A common feature of these unhealthy conditions is the loss of microbiota diversity, defined as a decrease in the number and abundance of distinct types of microorganisms (Huttenhower et al. 2012, Mosca et al. 2016). Lower microbiome richness has been associated with metabolic dysfunctions, skin disorders, gastrointestinal disorders, and low-grade inflammation (Alekseyenko et al. 2013, Cotillard et al. 2013, Le Chatelier et al. 2013).
Therefore, interrogating the composition of the microbiome can shed light on the etiology of diseases and, in the future, microbial abundances could potentially be used as markers for disease diagnosis.
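
To make the notions of richness and diversity concrete, the sketch below computes two simple summaries from a table of taxon abundances: richness (the number of taxa observed) and the Shannon index. The Shannon index is a commonly used diversity measure chosen here for illustration; the text above does not prescribe a particular metric, and the function names and example profiles are assumptions.

```python
import math


def richness(abundances):
    """Number of taxa with a non-zero abundance."""
    return sum(1 for a in abundances.values() if a > 0)


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over taxa with p > 0.

    Abundances may be counts or fractions; they are renormalized here.
    """
    total = sum(abundances.values())
    if total <= 0:
        return 0.0
    h = 0.0
    for a in abundances.values():
        if a > 0:
            p = a / total
            h -= p * math.log(p)
    return h


# Hypothetical profiles: an even community and one dominated by a single taxon.
even_profile = {"taxon_A": 0.25, "taxon_B": 0.25, "taxon_C": 0.25, "taxon_D": 0.25}
skewed_profile = {"taxon_A": 0.97, "taxon_B": 0.01, "taxon_C": 0.01, "taxon_D": 0.01}

if __name__ == "__main__":
    for name, profile in (("even", even_profile), ("skewed", skewed_profile)):
        print(name, richness(profile), round(shannon_index(profile), 3))
```

Both example profiles have the same richness, but the skewed one has a lower Shannon index, which shows why abundance as well as the number of taxa matters when describing a loss of diversity.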

Technologies and tools for microbiome analysis

Advances in genome sequencing technologies have enabled progress in the characterization of microbial diversity, leading to a rapid expansion of the field known as microbiomics: the study of the DNA of a microbial community. An accurate analysis of microbiome sequencing data (e.g. accurate taxonomic assignment and relative abundance estimates) relies on computational methods. A plethora of analysis tools has been developed and published. However, limited information on the performance of computational methods and their context of applicability makes it difficult for scientists to select the most appropriate software. Initially, the evaluation of computational methods in microbiome analysis was limited to authors benchmarking their own method against other existing methods when publishing novel or improved methods. Such evaluation remains restricted and difficult because of the limited number of methods generally compared in a publication, the risk of falling into the “self-assessment trap” that leads to biased results (Norel et al. 2011), and the low consensus about benchmarking datasets and evaluation metrics in microbiomics. For this reason, new initiatives such as the one presented here (see also Assemblathon (Earl et al. 2011, Bradnam et al. 2013) and CAMI) are undertaken to evaluate computational methods in microbiomics independently, comprehensively, and objectively.

Introduction to microbiome.
Microorganisms are found in many environments on earth, including soil, the sea floor, and the human body, which are among the most studied environments. The figure shows the relative abundances of four dominant bacterial phyla at different body sites: mouth (Bik et al. 2010), distal esophagus (Pei et al. 2004), lung (Beck et al. 2012), and gut (Costello et al. 2009).

Microbiomics analysis pipeline in a nutshell

The figure below shows a typical pipeline for the analysis of shotgun sequencing data.

Analysis pipeline steps. Sample pipeline for the analysis of shotgun data

Quality control of reads: QC tools applied at this step check that the raw data are of good quality and provide insights for filtering/trimming.
Trimming/Filtering of low quality reads: Trimming refers to shortening sequencing reads by removing bases with poor quality base calls and bases from sequencing adapters. Filtering refers to removing sequencing reads completely, for instance when the average quality of the read is below a certain threshold, or when the trimmed read becomes too short.
Host genome contamination removal: filtering out all unwanted reads that belong to the host genome.
Taxonomic assignment: identification of the microbiome profile, i.e. the genomes represented in the sample and their abundances.
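
The trimming and filtering step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a replacement for a dedicated tool: reads are given as (sequence, quality string) pairs with Phred+33 encoded qualities, only the 3' end is trimmed, adapter removal is not covered, and the thresholds and function names are hypothetical defaults chosen for the example.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in quality_string]


def trim_3prime(seq, quals, min_q=20):
    """Trimming: drop trailing bases whose quality is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def keep_read(quals, min_mean_q=20, min_len=30):
    """Filtering: reject reads that are too short or of low mean quality."""
    if len(quals) < max(min_len, 1):
        return False
    return sum(quals) / len(quals) >= min_mean_q


def clean_reads(reads, min_q=20, min_mean_q=20, min_len=30):
    """Trim each read, then keep only those passing the filters."""
    kept = []
    for seq, quality_string in reads:
        quals = phred_scores(quality_string)
        seq, quals = trim_3prime(seq, quals, min_q)
        if keep_read(quals, min_mean_q, min_len):
            kept.append((seq, quals))
    return kept


# Hypothetical toy reads: 'I' encodes Phred 40 and '#' encodes Phred 2.
toy_reads = [
    ("ACGTACGT", "IIIIII##"),  # two poor bases at the 3' end are trimmed
    ("ACGT", "####"),          # entirely low quality, removed by filtering
]

if __name__ == "__main__":
    for seq, quals in clean_reads(toy_reads, min_len=4):
        print(seq, quals)
```

Trimming is applied before filtering so that a read is judged on the bases that would actually be passed on to host-read removal and taxonomic assignment.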


  • Alekseyenko, A. V., et al. (2013). "Community differentiation of the cutaneous microbiota in psoriasis." Microbiome 1(1): 31.
  • Beck, J. M., et al. (2012). "The microbiome of the lung." Transl Res 160(4): 258-266.
  • Benson, A. K., et al. (2010). "Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors." Proc Natl Acad Sci U S A 107(44): 18933-18938.
  • Bik, E. M., et al. (2010). "Bacterial diversity in the oral cavity of 10 healthy individuals." ISME J 4(8): 962-974.
  • Blázquez, A. B. and M. C. Berin (2017). "Microbiome and food allergy." Translational Research 179: 199-203.
  • Bradnam, K. R., et al. (2013). "Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species." Gigascience 2(1): 10.
  • Caminero, A., et al. (2016). "Duodenal Bacteria From Patients With Celiac Disease and Healthy Subjects Distinctly Affect Gluten Breakdown and Immunogenicity." Gastroenterology 151(4): 670-683.
  • Costello, E. K., et al. (2009). "Bacterial community variation in human body habitats across space and time." Science 326(5960): 1694-1697.
  • Cotillard, A., et al. (2013). "Dietary intervention impact on gut microbial gene richness." Nature 500(7464): 585-588.
  • Earl, D., et al. (2011). "Assemblathon 1: a competitive assessment of de novo short read assembly methods." Genome Res 21(12): 2224-2241.
  • Galipeau, H. J., et al. (2015). "Intestinal Microbiota Modulates Gluten-Induced Immunopathology in Humanized Mice." The American Journal of Pathology 185(11): 2969-2982.
  • Huttenhower, C., et al. (2012). "Structure, function and diversity of the healthy human microbiome." Nature 486.
  • Koppel, N., et al. (2017). "Chemical transformation of xenobiotics by the human gut microbiota." Science 356(6344).
  • Koren, O., et al. (2012). "Host remodeling of the gut microbiome and metabolic changes during pregnancy." Cell 150(3): 470-480.
  • Le Chatelier, E., et al. (2013). "Richness of human gut microbiome correlates with metabolic markers." Nature 500(7464): 541-546.
  • Lloyd-Price, J., et al. (2016). "The healthy human microbiome." Genome Med 8(1): 51.
  • Mosca, A., et al. (2016). "Gut Microbiota Diversity and Human Diseases: Should We Reintroduce Key Predators in Our Ecosystem?" Front Microbiol 7: 455.
  • Norel, R., et al. (2011). "The self-assessment trap: can we all be better than average?" Mol Syst Biol 7: 537.
  • Pei, Z., et al. (2004). "Bacterial biota in the human distal esophagus." Proc Natl Acad Sci U S A 101(12): 4250-4255.
  • Riiser, A. (2015). "The human microbiome, asthma, and allergy." Allergy, Asthma, and Clinical Immunology : Official Journal of the Canadian Society of Allergy and Clinical Immunology 11: 35.
  • Roy, S. and G. Trinchieri (2017). "Microbiota: a key orchestrator of cancer therapy." Nat Rev Cancer 17(5): 271-285.
  • Scher, J. U., et al. (2016). "Microbiome in Inflammatory Arthritis and Human Rheumatic Diseases." Arthritis & rheumatology (Hoboken, N.J.) 68(1): 35-45.
  • Schuppan, D., et al. (2009). "Celiac Disease: From Pathogenesis to Novel Therapies." Gastroenterology 137(6): 1912-1933.
  • Shukla, S. D., et al. (2017). "Microbiome effects on immunity, health and disease in the lung." Clin Trans Immunol 6: e133.
  • Sommer, F. and F. Backhed (2013). "The gut microbiota--masters of host development and physiology." Nat Rev Microbiol 11(4): 227-238.
  • Turnbaugh, P. J., et al. (2007). "The human microbiome project." Nature 449(7164): 804-810.
  • Vatanen, T., et al. (2016). "Variation in Microbiome LPS Immunogenicity Contributes to Autoimmunity in Humans." Cell 165(6): 1551.
  • Vogtmann, E. and J. J. Goedert (2016). "Epidemiologic studies of the human microbiome and cancer." Br J Cancer 114(3): 237-242.
