Diagnostic Signature Challenge - Psoriasis Sub-Challenge

Team BCM CCEM AUPR_avg Rank-sum Rank
Team050 0.979805 0.980985 1 13 3
Team055 0.95573 0.962177 0.975535 66 20
Team056 0.975824 0.980655 0.998924 27 7
Team063 0.938089 0.953639 0.997883 62 19
Team071 0.963643 0.95687 0.997883 46 14
Team080 0.917985 0.943588 0.994923 82 29
Team081 0.942654 0.953097 1 45 13
Team091 0.959137 0.978078 1 22 4
Team106 0.843233 0.84009 1 81 28
Team112 0.941772 0.936855 0.938327 100 36
Team114 0.914841 0.920661 0.995888 93 34
Team115 0.866053 0.922129 0.998924 84 31
Team120 0.753544 0.805868 0.961953 123 45
Team122 0.929345 0.958339 0.996388 70 23
Team132 0.819496 0.86871 0.986818 111 41
Team140 0.530718 0.54252 0.548515 144 48
Team149 0.932916 0.938528 0.944845 100 36
Team158 0.961257 0.962698 1 27 7
Team161 0.988422 0.993527 1 5 2
Team163 0.945476 0.937823 0.976128 84 31
Team164 0.439419 0.449282 0.454625 150 50
Team170 0.956042 0.968387 1 25 6
Team171 0.789605 0.888646 0.996873 101 39
Team181 0.95291 0.951613 0.939213 82 29
Team187 0.943673 0.946431 0.994755 76 25
Team201 0.823206 0.907903 0.996873 97 35
Team202 0.971429 0.967742 0.959296 58 17
Team203 0.967196 0.967742 0.959405 58 17
Team208 0.944295 0.96196 0.998924 50 15
Team210 0.484831 0.512664 0.508775 147 49
Team212 0.95327 0.970024 0.998924 38 9
Team220 0.6 0.548387 0.658358 141 47
Team221 0.980258 0.983332 0.979429 43 12
Team226 0.649974 0.685403 0.846008 138 46
Team227 0.985714 0.983871 0.979032 40 11
Team235 0.943112 0.941452 0.996758 71 24
Team241 0.981481 0.983871 0.980884 39 10
Team242 0.868149 0.924565 1 69 22
Team245 0.807768 0.803374 0.994923 111 41
Team251 0.965961 0.965068 1 24 5
Team253 0.746348 0.864524 0.99566 111 41
Team261 0.952931 0.949624 0.957185 80 27
Team269 0.752273 0.715405 0.997883 106 40
Team273 0.944513 0.938955 0.994808 77 26
Team276 0.935029 0.939129 1 56 16
Team284 0.935434 0.948306 0.957667 90 33
Team290 0.860698 0.913484 0.994923 100 36
Team291 0.916512 0.92332 0.92781 111 41
Team294 0.99995 0.999978 1 3 1
Team297 0.881583 0.927021 1 67 21

The aim of this sub-challenge was to verify that a robust diagnostic signature for Psoriasis can be extracted from gene expression data.

Participants were asked to develop and then submit a classifier that can stratify skin samples into one of two phenotype groups - Psoriasis or Control. The classifier was built by using any publicly available gene expression data with their related clinical, demographic and batch information, and was tested on an independent dataset.

 

Overview of the Psoriasis Sub-Challenge

as communicated to the Participants

 

Synopsis

The aim of this sub-challenge is to verify that a robust diagnostic signature for Psoriasis can be extracted from gene expression data.

Participants are asked to develop and then submit a classifier that can stratify skin samples into one of two phenotype groups — Psoriasis or Control. The classifier will be built by using any publicly available gene expression data with their related clinical, demographic and batch information, and will be tested on an independent dataset.

 

Background

Psoriasis is the most prevalent autoimmune disease in the U.S.; according to current studies, as many as 7.5 million Americans (approximately 2.2 percent of the population) have psoriasis. It is a chronic inflammatory and hyperproliferative skin disease which, in addition to its cutaneous manifestations, is accompanied by inflammatory arthritis in up to 40% of cases.

The disease is diagnosed following physical examination of the skin lesions. Microscopic analysis of a psoriatic skin biopsy shows thick, red, flaky cells with no sign of inflammation, and blood tests can differentiate psoriasis from rheumatoid arthritis (see Figure 1).

 


Figure 1: Clinical manifestation of psoriasis. (A) The red boxes show the most prevalent sites where psoriasis affects the skin.
(B) Schematic view of the skin structure of a healthy and a psoriatic patient. Psoriatic skin shows signs of inflammation and scales (dead skin).

Psoriasis is typically treated with topical treatments (both steroidal and non-steroidal) and with phototherapy (UVB, or UVA with light-sensitizing medication). There are also systemic medications and new drugs that target the autoimmune response and specific parts of the immune system (T cells, TNF, interleukin).

The economic burden of psoriasis is estimated to be approximately $11.2 billion in the US, and CHF 314–458 million in Switzerland [1].

 

The Sub-Challenge

The sub-challenge is to identify a classifier that can distinguish lesional psoriatic skin from a healthy skin sample. At least 8 gene-signature-based classifiers have already been described in the literature [2-9], with very little overlap in the genes derived from these studies. While gene signatures are the typical components of classifiers built from gene expression data, we believe that there is room for exploration of other biologically interpretable signatures that go beyond over- or under-expressed genes.

 

The Data

Each participant can find any suitable training data from publicly available repositories. For convenience, we include a list of third party publicly available datasets that participants may be able to use for training purposes:

Dataset ID   Normal skin   Lesional skin
GSE13355     58            64
GSE14905     21            28
Total        79            92

Table 1: Composition of possible training datasets.  Each cell displays the number of samples available for the corresponding phenotype.

Gene expression files can be downloaded from the Gene Expression Omnibus (GEO) Database or ArrayExpress by searching for the appropriate dataset ID.  We note that we do not control these sites and that the use of the data available on those sites may be subject to restrictions.

For testing (including preparation of your submission), we provide participants with gene expression data from 62 skin samples without revealing their diagnosis, together with the following clinical information: gender, race/ethnicity, age, weight, height, and body mass index (BMI). An Excel file “Psoriasis Clinical Info.xls” containing this information is available for download together with the test data. We note that your use of this data is subject to the restrictions described in the Challenge Rules. You must accept these Challenge Rules to participate in the Challenge.

Data for testing was generated using the Affymetrix® GeneChip Human Genome U133 Plus 2.0 platform and is available for download both as “raw” data in the manufacturer’s CEL file format and as a table of quantified gene expression values. Raw CEL files were converted to gene expression values using the MAS5 algorithm implemented in Expression Console™, which is available for download if you choose to register on the third party site. In addition, participants are encouraged to use their preferred normalization method (RMA, GCRMA, etc.) if they so choose. Additional third party methods for data normalization are available in the Bioconductor package for the R statistical computing environment. Again, please note that we do not control these sites and that the use of the materials available on these sites may be subject to restrictions.

 

Format for Submission of Predictions

Participants should upload their prediction using the following naming convention:

Psoriasis_<Team name>_predictions.txt

For each sample ID, participants should provide the confidence score of the prediction that the sample belongs to the Psoriasis or the Control class. The confidence of the classification should have a value between 0 and 1, with 1 being the most confident and 0 the least confident. The confidence score that the sample belongs to a psoriasis-affected patient and the confidence score that the sample belongs to a control subject must add up to 1.

Please provide a tab-separated (\t) text file, including the header line, as indicated in the following example:

Sample ID      Psoriasis_confidence   Control_confidence
psoriasis_1    0.20                   0.80
psoriasis_2    0.95                   0.05
psoriasis_3    0.94                   0.06
psoriasis_61   0                      1
psoriasis_62   0.85                   0.15

Note: You must provide class confidence predictions for all test samples.
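
The submission format described above can be produced with a few lines of code. The following is a minimal sketch in Python, not an official tool of the challenge: the function name `write_predictions`, the `decimals` parameter and the team name in the comment are assumptions made for illustration. It takes the Psoriasis confidence for each sample, derives the Control confidence so that each row adds up to 1, and writes the tab-separated table with the header line.

```python
import csv
import io


def write_predictions(psoriasis_conf, out, decimals=2):
    """Write the tab-separated prediction table to the open text file `out`.

    psoriasis_conf maps each sample ID to the confidence (0 to 1) that the
    sample belongs to the Psoriasis class.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample ID", "Psoriasis_confidence", "Control_confidence"])
    for sample_id, p in psoriasis_conf.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"confidence for {sample_id} is outside [0, 1]: {p}")
        # Round first, then derive the Control confidence from the rounded
        # value, so that the two printed numbers still add up to 1.
        p_rounded = round(p, decimals)
        control = 1.0 - p_rounded
        writer.writerow([sample_id, f"{p_rounded:.{decimals}f}", f"{control:.{decimals}f}"])


# Example with two of the rows shown in the table above, written to an
# in-memory buffer. For a real submission one would instead open a file named
# according to the convention, e.g. (hypothetical team name)
# open("Psoriasis_TeamXYZ_predictions.txt", "w", newline="").
buffer = io.StringIO()
write_predictions({"psoriasis_1": 0.20, "psoriasis_2": 0.95}, buffer)
print(buffer.getvalue())
```

A real submission would pass all 62 test sample IDs, since predictions are required for every test sample.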

 

Submission of Write-up

The complete details of the method should be provided including:

  1. Raw gene expression processing (when relevant)
  2. Batch effect correction (when relevant)
  3. Feature selection (when relevant)
  4. Classification algorithm(s) with pseudo-code or scripts

In addition, participants must provide any other details that would allow easy reproduction and assessment of the results.

The write-up file should be submitted with the following naming convention:

Psoriasis_<Team name>_writeup.txt

In addition to plain text, the write-up can also be submitted as a Word document.

The Submission must include all details listed in this Section as well as those set forth in the Challenge Rules document.
Please note that by agreeing to the Challenge Rules document, you have granted certain rights and permissions in your submission and method.

 

Credits

The identity of the provider of the test data sets will be disclosed following the submission deadline.

 

References

  1. National Psoriasis Foundation. http://www.psoriasis.org/
  2. Gudjonsson JE, Ding J, Li X, Nair RP, Tejasvi T, Qin ZS et al.: Global gene expression analysis reveals evidence for decreased lipid biosynthesis and increased innate immunity in uninvolved psoriatic skin. J Invest Dermatol 2009, 129: 2795-2804.
  3. Guttman-Yassky E, Suarez-Farinas M, Chiricozzi A, Nograles KE, Shemer A, Fuentes-Duculan J et al.: Broad defects in epidermal cornification in atopic dermatitis identified through genomic analysis. J Allergy Clin Immunol 2009, 124: 1235-1244.
  4. Kulski JK, Kenworthy W, Bellgard M, Taplin R, Okamoto K, Oka A et al.: Gene expression profiling of Japanese psoriatic skin reveals an increased activity in molecular stress and immune response signals. J Mol Med (Berl) 2005, 83: 964-975.
  5. Reischl J, Schwenke S, Beekman JM, Mrowietz U, Sturzebecher S, Heubach JF: Increased expression of Wnt5a in psoriatic plaques. J Invest Dermatol 2007, 127:163-169.
  6. Suarez-Farinas M, Shah KR, Haider AS, Krueger JG, Lowes MA: Personalized medicine in psoriasis: developing a genomic classifier to predict histological response to Alefacept. BMC Dermatol 2010, 10.
  7. Yao Y, Richman L, Morehouse C, de los RM, Higgs BW, Boutrin A et al.: Type I interferon: potential therapeutic target for psoriasis? PLoS One 2008, 3: e2737.
  8. Zaba LC, Suarez-Farinas M, Fuentes-Duculan J, Nograles KE, Guttman-Yassky E, Cardinale I et al.: Effective treatment of psoriasis with etanercept is linked to suppression of IL-17 signaling, not immediate response TNF genes. J Allergy Clin Immunol 2009, 124: 1022-10.
  9. Suarez-Farinas M, Lowes MA, Zaba LC, Krueger JG: Evaluation of the psoriasis transcriptome across different studies by gene set enrichment analysis (GSEA). PLoS One 2010, 5: e10247.

 

Scoring Legend:

Belief Confusion Matrix: A matrix whose element {i,j} is the average confidence that a subject belonging to class i is in class j. Each prediction has its own belief confusion matrix. The perfect belief confusion matrix is the identity matrix.
BCM (Belief Confusion Metric): This metric measures the trace of the difference between a prediction's Belief Confusion Matrix and the Perfect Confusion Matrix. The final value is normalized to be between 0 and 1.
CCEM (Correct Class Enrichment Metric): To compute this metric we add the confidence of the subjects whose classes were correctly predicted and subtract the confidence of the subjects whose classes were incorrectly predicted. In other words, this is a measure of enrichment of the correctly classified subjects. The final value is normalized to be between 0 and 1.
AUPR: For each class in a sub-challenge, a list of subjects is created, ordered according to the confidence that the subject belongs to that class. Using this list we compute the Precision-Recall curve for each class, from which the area under the Precision-Recall curve is extracted. Precision is a measure of specificity whereas recall is a measure of completeness.
AUPR_avg: There are as many AUPRs as classes in a sub-challenge. The AUPR_avg metric is the arithmetic mean of the AUPR across the classes.
Rank-sum: For each team, the rank-sum is the sum of the rankings of that team in the three metrics BCM, CCEM and AUPR_avg.
Rank: The rank of the sum of the ranks over the 3 computed metrics.
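
The belief confusion matrix, BCM and CCEM entries of the legend can be illustrated with a short Python sketch. The matrix follows the definition given in the legend. The two normalizations are assumptions, because the legend only states that the final values lie between 0 and 1: here BCM is taken as the mean of the diagonal (1 for the identity matrix), and CCEM as the signed mean confidence rescaled from [-1, 1] to [0, 1], with the predicted class taken to be the one with the highest confidence. All function names and the true labels in the example are hypothetical.

```python
CLASSES = ("Psoriasis", "Control")


def belief_confusion_matrix(true_labels, confidences, classes=CLASSES):
    """Element [i][j] is the mean confidence in class j over subjects of true class i."""
    matrix = []
    for true_class in classes:
        rows = [c for label, c in zip(true_labels, confidences) if label == true_class]
        if not rows:
            raise ValueError(f"no subjects with true class {true_class!r}")
        matrix.append([sum(r[j] for r in rows) / len(rows) for j in classes])
    return matrix


def bcm_score(matrix):
    """ASSUMED normalization: mean of the diagonal, i.e. 1 - trace(I - M) / n_classes."""
    n = len(matrix)
    return sum(matrix[i][i] for i in range(n)) / n


def ccem_score(true_labels, confidences):
    """ASSUMED normalization: signed mean confidence mapped from [-1, 1] to [0, 1]."""
    signed = 0.0
    for label, conf in zip(true_labels, confidences):
        predicted = max(conf, key=conf.get)  # class with the highest confidence
        # Add the confidence of a correct call, subtract that of a wrong one.
        signed += conf[predicted] if predicted == label else -conf[predicted]
    return (signed / len(true_labels) + 1.0) / 2.0


# Hypothetical true labels paired with confidences in the submission format.
labels = ["Psoriasis", "Psoriasis", "Control", "Control"]
confs = [
    {"Psoriasis": 0.95, "Control": 0.05},
    {"Psoriasis": 0.85, "Control": 0.15},
    {"Psoriasis": 0.20, "Control": 0.80},
    {"Psoriasis": 0.00, "Control": 1.00},
]
matrix = belief_confusion_matrix(labels, confs)
print(matrix)                     # close to [[0.9, 0.1], [0.1, 0.9]]
print(bcm_score(matrix))          # close to 0.9 under the assumed normalization
print(ccem_score(labels, confs))  # close to 0.95 under the assumed normalization
```

The values produced this way are not guaranteed to match the published BCM and CCEM columns, since the official normalization is not spelled out in the legend.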

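
The AUPR and AUPR_avg definitions in the legend can be sketched in Python as follows. The legend does not say how the area under the Precision-Recall curve is integrated, so this sketch assumes the step-wise rule (the mean of the precision values at each correctly retrieved subject, often called average precision); ties in confidence are kept in input order. The function names and the example labels are hypothetical.

```python
def average_precision(is_positive, scores):
    """Step-wise area under the Precision-Recall curve for one class (assumed rule)."""
    n_pos = sum(is_positive)
    if n_pos == 0:
        raise ValueError("the class has no positive subjects")
    # Order subjects by decreasing confidence that they belong to the class.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if is_positive[i]:
            hits += 1
            total += hits / rank  # precision at the point where recall increases
    return total / n_pos


def aupr_avg(true_labels, confidences, classes=("Psoriasis", "Control")):
    """Arithmetic mean of the per-class AUPR values."""
    areas = []
    for c in classes:
        is_positive = [label == c for label in true_labels]
        scores = [conf[c] for conf in confidences]
        areas.append(average_precision(is_positive, scores))
    return sum(areas) / len(areas)


# Hypothetical example: one Control sample is scored above one Psoriasis sample.
labels = ["Psoriasis", "Control", "Psoriasis", "Control"]
confs = [
    {"Psoriasis": 0.9, "Control": 0.1},
    {"Psoriasis": 0.8, "Control": 0.2},
    {"Psoriasis": 0.7, "Control": 0.3},
    {"Psoriasis": 0.1, "Control": 0.9},
]
print(aupr_avg(labels, confs))  # about 0.833 (5/6 for each class)
```

A prediction that ranks every Psoriasis sample above every Control sample gives an AUPR_avg of 1 under this rule, whatever the exact confidence values are, which is consistent with the many entries equal to 1 in the AUPR_avg column.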

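
The Rank-sum and Rank columns can be reproduced from the three metric columns with the Python sketch below. It assumes that tied values share the lowest rank, which is how the results table appears to behave (for example, Team202 and Team203 both have rank 17 and the next rank is 19). The function names are hypothetical, and only four rows of the table are used, so the rank-sums are computed within this subset rather than over all teams.

```python
def competition_ranks(values, higher_is_better=True):
    """Rank 1 is best; tied values share the lowest rank (1, 2, 2, 4 style)."""
    ranks = []
    for v in values:
        if higher_is_better:
            better = sum(1 for other in values if other > v)
        else:
            better = sum(1 for other in values if other < v)
        ranks.append(1 + better)
    return ranks


def rank_teams(scores):
    """scores maps team -> (BCM, CCEM, AUPR_avg); returns team -> (rank_sum, rank)."""
    teams = list(scores)
    # One list of ranks per metric, in the same order as `teams`.
    per_metric = [competition_ranks([scores[t][m] for t in teams]) for m in range(3)]
    rank_sums = [sum(per_metric[m][k] for m in range(3)) for k in range(len(teams))]
    # A lower rank-sum is better, so the final rank orders rank-sums ascending.
    final = competition_ranks(rank_sums, higher_is_better=False)
    return {t: (rank_sums[k], final[k]) for k, t in enumerate(teams)}


# Four rows taken from the results table: (BCM, CCEM, AUPR_avg).
subset = {
    "Team294": (0.99995, 0.999978, 1.0),
    "Team161": (0.988422, 0.993527, 1.0),
    "Team050": (0.979805, 0.980985, 1.0),
    "Team091": (0.959137, 0.978078, 1.0),
}
for team, (rank_sum, rank) in rank_teams(subset).items():
    print(team, rank_sum, rank)
```

Within this subset the four teams tie on AUPR_avg and all receive rank 1 for that metric, so the ordering is decided by BCM and CCEM. The rank-sums of Team050 and Team091 are smaller here than in the published table because the other teams are left out.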