Gold standard

Each submission will be scored by comparing the predictions to the “gold standard”, which corresponds to the true relative abundances (percentages of reads per species, genus, and phylum) of bacterial communities in the 19 sample datasets.
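As an illustration of what comparing a prediction to the gold standard can look like, the sketch below converts read counts into relative abundances (percentages of reads per taxon) and measures how far a predicted profile is from the true one. The taxon names, the read counts, and the L1 distance are all assumptions chosen for this example. The actual scoring metrics are not disclosed until scoring is completed, so this should not be read as the method used in the challenge.

```python
def relative_abundance(read_counts):
    """Convert raw read counts per taxon into percentages of total reads."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}


def l1_distance(predicted, gold):
    """Sum of absolute differences (in percentage points) over all taxa
    present in either profile. Illustrative metric only, not the
    challenge's actual scoring metric."""
    taxa = set(predicted) | set(gold)
    return sum(abs(predicted.get(t, 0.0) - gold.get(t, 0.0)) for t in taxa)


# Hypothetical genus-level data for a single sample.
gold = relative_abundance(
    {"Escherichia": 600, "Bacteroides": 300, "Lactobacillus": 100}
)
predicted = {"Escherichia": 55.0, "Bacteroides": 35.0, "Prevotella": 10.0}

print(l1_distance(predicted, gold))
```

A taxon that appears in only one of the two profiles is treated as having 0% abundance in the other, so both missed taxa and false positives increase the distance.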

Scoring methodology

Participants’ predictions will be automatically blinded during the submission process. To ensure anonymity during scoring, please make sure that the write-up document does not contain names or affiliations. Multiple metrics will be applied to score the anonymized predictions, to establish a fair and meaningful score that is not biased by any particular performance measure. These complementary metrics will evaluate different aspects of the submitted predictions. Scores will be aggregated to provide a final ranking of participants. To avoid optimization toward specific scoring metrics, the scoring methods and metrics will be disclosed only once the scoring is completed, in accordance with the Challenge Rules. Once all submissions are scored and the ranking is established, teams will be de-anonymized and the winners announced.

Tie resolution

If several teams obtain the same aggregated score, incentives will be allocated according to the Challenge Rules.

Scorers and Scoring Review Panel

A team of researchers from Philip Morris R&D in Neuchâtel (Switzerland) will establish a scoring methodology and perform the scoring on the blinded submissions under the review of an independent Scoring Review Panel. The sbv IMPROVER Scoring Review Panel:

  • will consist of experts in the field of metagenomics and systems biology, and their names will be disclosed during the open phase of the challenge;
  • will review the scoring strategy and procedure for the Microbiota composition prediction challenge to ensure fairness and transparency.


Blinded scoring. Submissions will be anonymized before scoring, so that the scoring team does not have access to the identity of the participating teams or their members. To help us maintain this, submissions (e.g. prediction files and write-up) should not include any information regarding the identity or affiliations of the team or its members.

Submissions and significance. Each submission must consist of one file (zip archive) containing all prediction files in the requested format, with predicted taxonomic profiles at species level, genus level, and phylum level. At least one of the submissions must be statistically significant in at least one metric, at a level of significance given by a false discovery rate of 0.05. If these requirements are not met, the challenge organizers retain the right not to declare a best performer, in accordance with the Challenge Rules.
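To make the false discovery rate criterion concrete, here is a minimal sketch of the Benjamini-Hochberg procedure applied at an FDR of 0.05. The text above does not say which FDR-controlling procedure is used or how p-values are derived for each metric, so both the choice of Benjamini-Hochberg and the example p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if it is declared significant
    by the Benjamini-Hochberg procedure at the given FDR."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Every p-value ranked at or below the cutoff is significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values, one per submission, for a single metric.
p_values = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(p_values))
```

With these hypothetical values only the first submission passes: 0.03 and 0.04 would each be below 0.05 on their own, but they exceed their rank-adjusted thresholds once the correction for multiple comparisons is applied.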

Timelines. The scoring process will start as soon as the challenge has been closed (timings are given in the Challenge Rules). If all conditions are met and the open period of the challenge is not extended, the anonymized ranking of the participating teams will be disclosed and the best performers informed by email.
