Diagnostic Signature Challenge - MS Diagnostic Sub-Challenge

Team BCM CCEM AUPR_avg Rank-sum Rank
Team050 0.10603 0.106082 0.298526 117 39
Team055 0.323237 0.304667 0.355752 104 35
Team056 0.322229 0.310587 0.359433 101 34
Team063 0.385636 0.339222 0.350221 97 31
Team071 0.297115 0.244562 0.317236 113 38
Team080 0.480689 0.474007 0.471878 58 19
Team081 0.620215 0.656182 0.684175 14 5
Team091 0.654097 0.646118 0.696319 12 4
Team106 0.461831 0.446173 0.424731 76 26
Team112 0.522304 0.482 0.586734 43 13
Team114 0.344866 0.325732 0.357622 99 33
Team115 0.495953 0.523975 0.466839 50 16
Team120 0.568904 0.604263 0.812707 17 6
Team122 0.377252 0.290358 0.328354 106 36
Team132 0.565893 0.554858 0.786753 23 8
Team140 0.500357 0.511273 0.516593 47 15
Team149 0.554509 0.56775 0.665176 25 9
Team158 0.476595 0.504891 0.426535 61 20
Team161 0.244281 0.252618 0.336361 111 37
Team164 0.46692 0.451917 0.425468 73 24
Team170 0.47029 0.439917 0.501394 65 23
Team171 0.382431 0.307258 0.373777 97 31
Team177 0.474039 0.467419 0.460714 63 21
Team187 0.5448 0.552892 0.626978 31 10
Team202 0.513529 0.548605 0.52873 40 12
Team208 0.576029 0.639186 0.820426 11 2
Team212 0.444565 0.428008 0.452087 79 27
Team221 0.628777 0.625199 0.819485 11 2
Team227 0.883929 0.883333 0.874283 3 1
Team241 0.390625 0.373611 0.424154 85 29
Team251 0.454031 0.480133 0.421276 73 24
Team253 0.50326 0.518294 0.529608 43 13
Team261 0.5 0.466667 0.589573 50 16
Team269 0.54062 0.556518 0.604154 31 10
Team273 0.469705 0.45558 0.481379 64 22
Team276 0.482281 0.519083 0.454234 54 18
Team284 0.453123 0.436408 0.387187 84 28
Team290 0.574545 0.561492 0.721315 20 7
Team291 0.386161 0.366667 0.392801 89 30

The aim of this sub-challenge was to verify that a robust diagnostic signature for different types of multiple sclerosis (MS) patients can be extracted from gene expression data.

Participants were asked to develop and submit a classifier that can stratify subjects into one of two phenotype groups, relapsing-remitting multiple sclerosis (RRMS) or Control, based on the peripheral blood mononuclear cell (PBMC) transcriptome. The classifier was built using publicly available gene expression data with clinical, demographic, and batch information, and was tested on an independent dataset.


Overview of the Multiple Sclerosis Diagnostic Sub-Challenge

as communicated to the Participants



The aim of this sub-challenge is to verify that a robust diagnostic signature for different types of multiple sclerosis (MS) patients can be extracted from gene expression data.

Participants are asked to develop and submit a classifier that can stratify subjects into one of two phenotype groups, relapsing-remitting multiple sclerosis (RRMS) or Control, based on the peripheral blood mononuclear cell (PBMC) transcriptome. The classifier will be built using publicly available gene expression data with clinical, demographic, and batch information, and will be tested on an independent dataset.



Multiple sclerosis (MS) is an autoimmune disease that affects the central nervous system. The trigger of the autoimmune process in MS is unknown. MS is believed to occur as a result of some combination of genetic, environmental and infectious factors [1], and possibly other factors such as vascular problems [2]. Previous studies of identical twins have demonstrated a concordance of 30% to develop MS [3], suggesting that the genetic background has a relatively limited but significant role in triggering MS.

The symptoms of the disease result from inflammation, swelling, and lesions on the myelin.  

There are several MS progression subtypes (see Figure 1): relapsing-remitting MS (RRMS), primary progressive MS (PPMS), and secondary progressive MS (SPMS). In 85% of patients, the disease follows a relapsing-remitting (RR) course, which is characterized by the onset or deterioration of neurological symptoms (relapses), followed by partial or complete recovery (remissions).

Like most other autoimmune diseases, MS is significantly more common (at least 2-3 times) in women than in men. The disease is most commonly diagnosed between the ages of 20 and 50. The risk of developing MS in the general population is 1 in 750, and over 2.5 million people are living with the disease worldwide. The disease can be managed and its symptoms controlled, with varying degrees of success, through an individualized, multifaceted approach that includes medications and other therapies. However, there is no cure for multiple sclerosis [4]. MS has a significant socioeconomic impact; in 2007 the cost was roughly 50,000 dollars per person affected by MS [5].

Diagnosis by a neurologist usually involves ruling out other nervous system disorders with invasive/expensive tests such as lumbar puncture, Magnetic Resonance Imaging (MRI) scan of the brain and nerve function study [4].  


Figure 1: Progression of the disease for clinically isolated syndromes and multiple sclerosis types

A clinically isolated syndrome (CIS) is an individual's first neurological episode, caused by inflammation or demyelination of nerve tissue. The diagnosis of multiple sclerosis is only possible after an MRI confirms lesions in the brain, which typically happens after multiple sites are affected (usually over the course of multiple events). The main forms of MS are distinguished by their different courses over time. RRMS is the most common form of MS. It describes patients who have relapses followed by periods of remission. For RRMS, a diagnosis of multiple sclerosis is made after a minimum of 2 relapses. PPMS patients have constant symptoms without remission. SPMS starts the same way as RRMS, but at some point remissions no longer occur.


The Sub-Challenge

The sub-challenge is to identify a classifier that can distinguish between RRMS and Control (healthy) subjects from PBMC gene expression data. While gene signatures are the typical components of classifiers built from gene expression data, we believe that there is room for exploration of other biologically interpretable signatures that go beyond over- or under-expressed genes.


Figure 2: The multiple sclerosis diagnostic subchallenge


The Data

Each participant can find any suitable training data from publicly available repositories. For convenience, we include a list of third party publicly available datasets that participants may be able to use for training purposes:

[Table 1 column headers: Dataset ID, Relapsing RRMS, Remitting RRMS; cell values not recoverable]

Table 1: Composition of possible training datasets.

Each cell displays the number of samples available for the corresponding phenotype.

*Please note that controls from this dataset come from other neurological disorders of non-inflammatory nature.

Gene expression files can be downloaded from the Gene Expression Omnibus (GEO) Database or ArrayExpress by searching for the corresponding Dataset IDs.  We note that we do not control these sites and that the use of the data available on those sites may be subject to restrictions.

For testing (including preparation of your submission), we provide participants with gene expression data from 60 PBMC samples without revealing their diagnosis, together with the following clinical information: gender, race/ethnicity, age, weight, height, and body mass index (BMI). An Excel file “MS Diagnostic Clinical Info.xls” containing this information is available for download together with the test data. We note that your use of this data is subject to the restrictions described in the Challenge Rules. You must accept these Challenge Rules to participate in the Challenge.

Data for testing was generated using the Affymetrix® GeneChip Human Genome U133 Plus 2.0 platform and is available for download both as “raw” data in the manufacturer’s CEL file format and as a table of quantified gene expression values. Raw CEL files were converted to gene expression values using the MAS5 algorithm implemented in Expression Console™, which is available for download if you choose to register on the third-party site. In addition, participants are encouraged to use their preferred normalization method (RMA, GCRMA, etc.) if they so choose. Additional third-party methods for data normalization are available in the Bioconductor package for the R statistical computing environment. Again, please note that we do not control these sites and that the use of the materials available on these sites may be subject to restrictions.


Format for Submission of Predictions

Challenge participants should upload their prediction with the following naming convention: MSDiagnostic _<Team name>_predictions.txt
For each sample ID, participants should provide a confidence score for the prediction that the sample belongs to the RRMS class and one for the prediction that it belongs to the Control class. Each confidence should have a value between 0 and 1, with 1 being the most confident and 0 the least confident.
The sum of the confidence scores across predicted classes for each sample must be equal to 1.

Please provide a tab-separated (\t) text file, including the header line, as indicated in the following example:

[Example table; first column header: Sample ID; remaining columns and rows not recoverable]

Note: participants must provide class confidence predictions for all test samples.
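
As a minimal sketch of the format described above, the following Python snippet builds the text of a tab-separated predictions file and checks that each sample's confidences lie between 0 and 1 and sum to 1. The column names `RRMS` and `Control`, the sample IDs, and the helper name `format_predictions` are assumptions made for illustration; an actual submission should follow the header of the official example.

```python
import math


def format_predictions(predictions, classes=("RRMS", "Control")):
    """Return the text of a tab-separated predictions file.

    `predictions` maps a sample ID to a dict of per-class confidences.
    The column names in `classes` are assumed, not taken from the challenge.
    """
    lines = ["\t".join(("Sample ID",) + tuple(classes))]
    for sample_id, conf in predictions.items():
        scores = [conf[c] for c in classes]
        # Each confidence must lie in [0, 1].
        if any(s < 0.0 or s > 1.0 for s in scores):
            raise ValueError(f"{sample_id}: confidence outside [0, 1]")
        # Confidences across the predicted classes must sum to 1.
        if not math.isclose(sum(scores), 1.0, abs_tol=1e-9):
            raise ValueError(f"{sample_id}: confidences do not sum to 1")
        lines.append("\t".join([sample_id] + [str(s) for s in scores]))
    return "\n".join(lines) + "\n"


# Hypothetical sample IDs and confidences, for illustration only.
example = {
    "sample_01": {"RRMS": 0.8, "Control": 0.2},
    "sample_02": {"RRMS": 0.35, "Control": 0.65},
}
text = format_predictions(example)
```

The resulting string could then be written to a file named according to the convention given above.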


Submission of Write-up

The complete details of the method should be provided including:

  1. Raw gene expression processing (when relevant)
  2. Batch effect correction (when relevant)
  3. Feature selection (when relevant)
  4. Classification algorithm(s) with pseudo-code or scripts

In addition, participants must provide any other details that would allow easy reproduction and assessment of the results.

The write-up file should be submitted for each sub-challenge, with the following naming convention:

MSDiagnostic _<Team name>_writeup.txt

In addition to plain text, the write-up can also be submitted as a Word document.

The Submission must include all details listed in this Section as well as those set forth in the Challenge Rules document.

Please note that by agreeing to the Challenge Rules document, you have granted certain rights and permissions in your submission and method.



The identity of the provider of the test data sets will be disclosed following the submission deadline.



  1. Compston A and Coles A: Multiple sclerosis. Lancet 2008, 372(9648):1502-1517.
  2. Minagar A, Jy W, Jimenez JJ, Alexander JS: Multiple sclerosis as a vascular disease. Neurol Res. 2006, 28(3):230-235.
  3. Compston A and Coles A: Multiple sclerosis. Lancet 2002, 359(9313):1221-1231.
  4. http://www.ninds.nih.gov/disorders/multiple_sclerosis/multiple_sclerosis.htm
  5. Trisolini M, Honeycutt A, Wiener J, and Lesesne S: Global economic impact of multiple sclerosis. Multiple Sclerosis International Federation. 2010, London, United Kingdom.


Scoring Legend:

Belief Confusion Matrix: A matrix whose element {i,j} is the average confidence that a subject belonging to class i is in class j. Each prediction has its own belief confusion matrix. The perfect belief confusion matrix is the identity matrix.
BCM (Belief Confusion Metric): This metric measures the trace of the difference between a prediction's Belief Confusion Matrix and the perfect Belief Confusion Matrix. The final value is normalized to be between 0 and 1.
CCEM (Correct Class Enrichment Metric): To compute this metric we add the confidence of the subjects whose classes were correctly predicted and subtract the confidence of the subjects whose classes were incorrectly predicted. In other words, this is a measure of enrichment of the correctly classified subjects. The final value is normalized to be between 0 and 1.
AUPR: For each class in a sub-challenge, a list of subjects is created, ordered according to the confidence that the subject belongs to that class. Using this list we compute the Precision-Recall curve for each class, from which the area under the Precision-Recall curve is extracted. Precision is a measure of exactness (the fraction of subjects predicted to be in a class that truly belong to it), whereas recall is a measure of completeness (the fraction of subjects in a class that are retrieved).
AUPR_avg: There are as many AUPRs as classes in a sub-challenge. The AUPR_avg metric is the arithmetic mean of the AUPR across the classes.
Rank-sum: For each team, the rank-sum is the sum of that team's ranks in the three metrics BCM, CCEM and AUPR_avg.
Rank: The rank of the sum of the ranks over the 3 computed metrics.
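
The definitions above can be illustrated with a short Python sketch. It builds a belief confusion matrix as described (element {i,j} is the average confidence that a subject of class i is in class j) and aggregates per-metric ranks into a rank-sum and a final rank. The function names and the toy data are hypothetical. The tie handling, in which tied values share the best rank, agrees with the tied final ranks in the results table but is an assumption for the per-metric ranks. The normalization applied to the official BCM and CCEM values is not specified in this legend and is not reproduced here.

```python
def belief_confusion_matrix(true_classes, confidences, classes):
    """Element [i][j] is the mean confidence that a class-i subject is in class j.

    Assumes every class in `classes` has at least one subject.
    """
    matrix = []
    for ci in classes:
        rows = [conf for t, conf in zip(true_classes, confidences) if t == ci]
        matrix.append([sum(r[cj] for r in rows) / len(rows) for cj in classes])
    return matrix


def competition_ranks(values, higher_is_better=True):
    """Rank values starting at 1; tied values share the best (smallest) rank."""
    ranks = []
    for v in values:
        if higher_is_better:
            better = sum(1 for other in values if other > v)
        else:
            better = sum(1 for other in values if other < v)
        ranks.append(better + 1)
    return ranks


def rank_sum_ranking(scores):
    """`scores` maps team -> (BCM, CCEM, AUPR_avg); returns team -> (rank-sum, rank)."""
    teams = list(scores)
    per_metric = [competition_ranks([scores[t][m] for t in teams]) for m in range(3)]
    rank_sums = [sum(r[i] for r in per_metric) for i in range(len(teams))]
    # A smaller rank-sum is better, so the final rank orders rank-sums ascending.
    final = competition_ranks(rank_sums, higher_is_better=False)
    return {t: (rank_sums[i], final[i]) for i, t in enumerate(teams)}


# Toy predictions for four subjects (hypothetical values).
true_classes = ["RRMS", "RRMS", "Control", "Control"]
confidences = [
    {"RRMS": 0.9, "Control": 0.1},
    {"RRMS": 0.7, "Control": 0.3},
    {"RRMS": 0.2, "Control": 0.8},
    {"RRMS": 0.4, "Control": 0.6},
]
matrix = belief_confusion_matrix(true_classes, confidences, ["RRMS", "Control"])

# Toy metric values for three hypothetical teams.
ranking = rank_sum_ranking({
    "A": (0.9, 0.9, 0.9),
    "B": (0.6, 0.7, 0.8),
    "C": (0.7, 0.6, 0.8),
})
```

In the toy ranking, teams B and C end up with the same rank-sum and therefore share a final rank, mirroring the tied entries in the results table.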

Click here to open the Scoring Metrics document.
