How to submit prediction files?

Prediction files

Challenge participants should upload their prediction files using the following naming convention:

MERI_<team_name>_sc2_<svsncs | fsvsns>_pred_subset<1|2>.txt (e.g. MERI_myTeam_sc2_svsncs_pred_subset1.txt)

  • team_name is the name of the team used for the registration to the challenge
  • sc is the acronym for sub-challenge
  • svsncs and fsvsns are the acronyms for smokers versus non-current smokers (smoking), and former smokers versus never smokers (cessation), respectively.
  • subset1 or subset2 indicates which of the two test dataset subsets, released at different dates, the predictions refer to (see details).
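
The naming convention above can be checked before upload. The following is a minimal sketch rather than an official tool: the function name prediction_filename and the restriction of team_name to letters and digits are assumptions made for illustration, since this page does not state which characters a team name may contain.

```python
import re

# Pattern for the naming convention described above. The character set
# allowed in the team name is an ASSUMPTION (letters and digits only).
FILENAME_RE = re.compile(
    r"^MERI_(?P<team>[A-Za-z0-9]+)_sc2_(?P<task>svsncs|fsvsns)"
    r"_pred_subset(?P<subset>[12])\.txt$"
)


def prediction_filename(team_name: str, task: str, subset: int) -> str:
    """Build a prediction file name and check it against the pattern."""
    name = f"MERI_{team_name}_sc2_{task}_pred_subset{subset}.txt"
    if not FILENAME_RE.match(name):
        raise ValueError(f"file name does not follow the convention: {name}")
    return name
```

For example, prediction_filename("myTeam", "svsncs", 1) returns the example name given above, while an unknown acronym or a subset other than 1 or 2 raises a ValueError.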

Participants are asked to provide a tab-separated (\t) text file, including the header line, as shown in the following example:

Sample ID    Smoker    Non-current smoker
Sample1      P1        P2
Sample2      0.95      0.05
Sample3      0.94      0.06
SampleM      0.85      0.15

For each subject or sample ID, participants should provide confidence values that the sample belongs to the smokers or non-current smokers class. The confidence value Px (x = 1 or 2 for class 1 and class 2, respectively) must lie between 0 and 1, with 1 being the most confident and 0 the least confident. Participants must provide confidence values for both classes, and the two values must differ (P1 ≠ P2, i.e., they cannot be 0.5 and 0.5): for each sample, the absolute difference between the two class confidence values must be ≥ 0.0001. A sample with strictly equal class confidence values, or with a missing class confidence value, will be counted as an incorrect prediction, thereby reducing the classifier performance. The confidence values across predicted classes must sum to 1 for each sample (P1 + P2 = 1).
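
The rules in the paragraph above can be verified per sample before submission. The sketch below is only an illustration: the function name check_confidences and the tolerance sum_tol used when comparing P1 + P2 with 1 are assumptions, as this page requires the sum to equal 1 without stating a numerical tolerance.

```python
import math

MIN_GAP = 0.0001  # minimum |P1 - P2| required by the rules above


def check_confidences(p1, p2, sum_tol=1e-9):
    """Return a list of rule violations for one sample (empty if none).

    sum_tol is an ASSUMED floating-point tolerance for the P1 + P2 = 1 check.
    """
    problems = []
    for label, p in (("P1", p1), ("P2", p2)):
        if p is None or math.isnan(p):
            problems.append(f"{label} is missing")
        elif not 0.0 <= p <= 1.0:
            problems.append(f"{label} is outside [0, 1]")
    if problems:
        return problems
    if abs(p1 - p2) < MIN_GAP:
        problems.append("P1 and P2 differ by less than 0.0001")
    if not math.isclose(p1 + p2, 1.0, abs_tol=sum_tol):
        problems.append("P1 + P2 is not equal to 1")
    return problems
```

A pair such as (0.95, 0.05) yields an empty list, whereas (0.5, 0.5) is reported because the two values do not differ.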

Samples predicted to belong to the non-current smoker class will then be further classified as former smokers or never smokers with a second signature model (see “Stepwise class predictions”).
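
As a rough illustration of this stepwise selection, the sketch below keeps the samples to pass on to the second model. It assumes that a sample is "predicted to belong" to the class with the higher confidence value; the function name and the dictionary layout are hypothetical and not part of the challenge material.

```python
def non_current_smoker_ids(predictions):
    """Return sample IDs whose non-current smoker confidence is the higher one.

    predictions maps sample ID -> (P_smoker, P_non_current_smoker).
    ASSUMPTION: the predicted class is the one with the larger confidence.
    """
    return [
        sample_id
        for sample_id, (p_smoker, p_non_current) in predictions.items()
        if p_non_current > p_smoker
    ]
```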

For this second classification, participants are asked to provide a tab-separated (\t) text file, including the header line, as shown in the following example:

Sample ID    Former smoker    Never smoker
Sample1      P1               P2
Sample2      0.95             0.05
Sample3      0.94             0.06
SampleM      0.85             0.15

Note: participants must provide class confidence predictions for all test samples.
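
One way to produce such a file, sketched here with Python's standard csv module, is to write the header line followed by one row per test sample and to refuse to write anything if a test sample has no prediction. The function name write_predictions and its arguments are assumptions chosen for illustration.

```python
import csv


def write_predictions(path, header, predictions, test_sample_ids):
    """Write a tab-separated prediction file with a header line.

    predictions maps sample ID -> (P1, P2); test_sample_ids is the full
    list of test samples, used to check that none is left out.
    """
    missing = [s for s in test_sample_ids if s not in predictions]
    if missing:
        raise ValueError(f"no prediction for samples: {missing}")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        for sample_id in test_sample_ids:
            p1, p2 = predictions[sample_id]
            writer.writerow([sample_id, p1, p2])
```

The header argument would be ["Sample ID", "Smoker", "Non-current smoker"] for the smoking file and ["Sample ID", "Former smoker", "Never smoker"] for the cessation file.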

 

Gene signature list

The list of genes that constitutes the predictive signature should be submitted in a tab-separated (\t) text file, as shown in the following example:

Gene symbol 1
Gene symbol 2
Gene symbol 3
Gene symbol X

If participants train two separate two-class prediction models, one predictive of smoking exposure and one of cessation status, the two respective gene signatures should be submitted in two separate files formatted as described above. If participants train a three-class prediction model (one that directly predicts smoker, former smoker and never smoker labels), a single gene signature should be submitted.

 

Write-up files

Complete details of the method(s) should be provided including:

  1. Gene expression processing (when relevant)
  2. Feature selection (when relevant)
  3. Classification algorithm(s) with pseudo-code or scripts
  4. Signatures with model parameters applied for the predictions

In addition, participants must provide any other details that would allow easy reproduction and assessment of the results.

The write-up document should be submitted as a Word or PDF file for each sub-challenge, using the following naming convention:

<team name>_writeup_sc2.doc (.docx or .pdf)

Template files for the submissions, as well as templates for the gene lists, are available for download here.

 

Submit your files

 
