PrognoScan: A new database for meta-analysis of the prognostic value of genes.


PrognoScan searches for relationships between gene expression and patient prognosis, such as overall survival (OS) and disease-free survival (DFS), across a large collection of publicly available cancer microarray datasets.
Background
In cancer research, the association between a gene and clinical outcome suggests the underlying etiology of the disease and consequently can motivate further studies. The recent availability of published cancer microarray datasets with clinical annotation provides the opportunity for linking gene expression to prognosis. However, the data are not easy to access and analyze without an effective analysis platform.
Description
To take full advantage of these public resources, a database named "PrognoScan" has been developed. It is 1) a large collection of publicly available cancer microarray datasets with clinical annotation and 2) a tool for assessing the biological relationship between gene expression and prognosis. To group patients for survival analysis, PrognoScan employs the minimum P-value approach, which finds the optimal cutpoint in continuous gene expression measurements without prior biological knowledge or assumptions. As a result, it enables systematic meta-analysis of multiple datasets.
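The minimum P-value approach described above can be illustrated with a minimal sketch. This is not PrognoScan's own code: the function names (`logrank_p`, `min_p_cutpoint`) and the `min_group` parameter are hypothetical, and the log-rank test is a plain textbook version written with the standard library only. The sketch sorts patients by expression, tries each split into low and high groups, and keeps the split with the smallest log-rank P-value.

```python
import math


def logrank_p(times, events, in_high):
    """Two-sided log-rank P-value for high vs. low groups (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Visit each distinct time at which at least one event occurred.
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        dying = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dying)
        d_high = sum(1 for i in dying if in_high[i])
        # Observed minus expected events in the high group at this time.
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            # Hypergeometric variance of the high-group event count.
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def min_p_cutpoint(expression, times, events, min_group=2):
    """Return (smallest P-value, cutpoint), or None if no split is possible.

    min_group is a hypothetical guard that keeps at least that many
    patients on each side of the cutpoint.
    """
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    best = None
    for k in range(min_group, len(order) - min_group + 1):
        lo, hi = expression[order[k - 1]], expression[order[k]]
        if lo == hi:
            continue  # tied values cannot be separated by a cutpoint
        high = set(order[k:])
        in_high = [i in high for i in range(len(expression))]
        p = logrank_p(times, events, in_high)
        if best is None or p < best[0]:
            best = (p, (lo + hi) / 2)  # cut midway between neighbours
    return best
```

Because the smallest P-value is selected after scanning many candidate cutpoints, it is optimistic when read on its own. The sketch does not apply any correction for this; the summary table described under Screenshots reports a corrected value (Pcor) alongside Pmin.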
Conclusion
PrognoScan provides a powerful platform for evaluating potential tumor markers and therapeutic targets and should help accelerate cancer research. The database is publicly accessible at http://www.prognoscan.org/.
Screenshots

PrognoScan screenshot and sample search results (part 1).
(A) The top page is simple and requires only the gene identifier(s) to be entered. (B) Summary table. Column headings include dataset, cancer type, subtype, endpoint, cohort, contributor, array type, probe ID, number of patients, optimal cutpoint, Pmin and Pcor. A statistically significant Pcor value is shown in red. Each dataset links to the public repository where the raw data are archived. Clicking a probe ID in the summary table displays a detailed report for that test. The table can be downloaded as a tab-delimited file using the button at the bottom.
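A downloaded summary table can be filtered offline. The following is a minimal sketch, assuming the tab-delimited file has a header row and a column holding the corrected P-value. The header string `"Pcor"`, the function name `significant_rows` and the 0.05 threshold are assumptions made for illustration; the actual header names in the downloaded file may differ.

```python
import csv
import io


def significant_rows(tsv_text, p_column="Pcor", alpha=0.05):
    """Return the rows whose corrected P-value is below alpha.

    p_column is an assumed header name; adjust it to match the
    header row of the file that was actually downloaded.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        try:
            p = float(row[p_column])
        except (ValueError, TypeError):
            continue  # skip rows with a missing or non-numeric P-value
        if p < alpha:
            hits.append(row)
    return hits
```

To use it on a saved file, read the file contents into a string first and pass that string as `tsv_text`.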



PrognoScan screenshot and sample search results (part 2).
(A) Annotation table. Row headings are color-coded. For example, headings of details such as therapy history, sample type and pathological parameters are highlighted in yellow and basic attributes in blue. (B) Expression plot. Patients are ordered by the expression values of the given gene. The X-axis represents the cumulative number of patients and the Y-axis represents the expression value. Straight lines (cyan) show the optimal cutpoints that dichotomize patients into high (red) and low (blue) expression groups. (C) Expression histogram. The distribution of the expression value is presented, where the X-axis represents the number of patients and the Y-axis represents the expression value on the same scale as the expression plot. The line of the optimal cutpoint is also shown (cyan). (D) P-value plot. For each potential cutpoint of the expression measurement, patients are dichotomized and the survival difference between the high and low expression groups is calculated by the log-rank test. The X-axis represents the cumulative number of patients on the same scale as the expression plot and the Y-axis represents raw P-values on a log scale. The cutpoint that minimizes the P-value is determined and indicated by the cyan line. The gray line indicates the 5% significance level. (E) Kaplan-Meier plot. Survival curves for the high (red) and low (blue) expression groups dichotomized at the optimal cutpoint are plotted. The X-axis represents time and the Y-axis represents survival rate. 95% confidence intervals for each group are also indicated by dotted lines.
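The survival curves in panel (E) are Kaplan-Meier estimates. As a rough illustration of how such a curve is computed for one expression group, here is a minimal sketch of the standard product-limit estimator. The function name `kaplan_meier` is hypothetical, this is not PrognoScan's own code, and the 95% confidence intervals shown in the plot are not computed here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] with one step per distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Events observed exactly at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve
```

Calling the function once for the high group and once for the low group gives the two step functions that such a plot draws. Censored patients lower the number at risk at later times without producing a step of their own.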
Newsletter
2012/09/17 The stable address http://www.prognoscan.org has been set.
2011/11/16 The Okayama et al. dataset of lung adenocarcinomas has been added.
2011/11/16 The Gobble et al. dataset of liposarcoma has been added.
2011/08/27 The Laurent et al. dataset of uveal melanomas has been added.
2011/08/22 The Bonome et al. dataset of ovarian cancers has been added.
2011/06/22 The Wilkerson et al. dataset of lung squamous cell carcinomas has been added.
2011/04/13 The Zhu et al. dataset of non-small cell lung cancers has been added.
2011/04/13 The Lee et al. dataset of meningiomas has been added.
2011/01/29 The Bos et al. dataset of breast cancers has been added.
2011/01/10 The Denkert et al. dataset of ovarian cancers has been added.
2010/06/28 P-values and HRs from Cox univariate analysis have been added.
2010/06/28 PrognoScan has adopted original processed expression values as default setting.
2010/06/28 The Yoshihara et al. dataset of ovarian cancers has been added.
2010/06/28 The Smith et al. dataset of colorectal cancers has been added.
2010/06/28 The Li et al. dataset of breast cancers has been added.
2010/06/28 The Nutt et al. dataset of gliomas has been added.
2010/06/28 The Freije et al. dataset of gliomas has been added.
2010/06/28 The Sboner et al. dataset of prostate cancers has been added.
2010/02/10 The Jorissen et al. dataset of colorectal cancers has been added.
2009/12/23 The Shedden et al. dataset of lung adenocarcinomas has been added.
2009/12/23 The Bogunovic et al. dataset of metastatic melanomas has been added.
2009/12/05 An option for reporting test results based on original processed data has been added.
2009/11/13 A column for hazard ratio (HR) between high and low expression groups has been added.
2009/10/10 The Dave et al. dataset of follicular lymphomas has been added.
2009/08/09 The Tomida et al. dataset of lung adenocarcinomas has been added.
2009/06/13 The Staub et al. dataset of colorectal cancers has been added.
2009/04/24 PrognoScan has been published in BMC Med Genomics. 2009 2:18.
Bibliography
The PrognoScan database has been referred to in
Notes
Please cite the use of this database as:

PrognoScan: A new database for meta-analysis of the prognostic value of genes. Mizuno H, Kitada K, Nakai K, Sarai A. BMC Med Genomics. 2009 2:18.

Dynamic links to the PrognoScan search results can be created from Entrez Gene IDs as follows:

http://dna00.bio.kyutech.ac.jp/PrognoScan-cgi/PrognoScan.cgi?MODE=CAL_PROGNOSIS_GID&QUERY=Entrez gene ID
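Such links can also be generated programmatically. The following is a minimal sketch that fills the URL pattern above with a given Entrez Gene ID. The host, path and the `MODE` and `QUERY` parameters are taken from that pattern; the function name `prognoscan_link` and the input validation are assumptions added for illustration.

```python
from urllib.parse import urlencode

# Host and path copied from the URL pattern given in the text above.
BASE_URL = "http://dna00.bio.kyutech.ac.jp/PrognoScan-cgi/PrognoScan.cgi"


def prognoscan_link(entrez_gene_id):
    """Build a dynamic link to the search results for one Entrez Gene ID."""
    gene_id = int(entrez_gene_id)  # accepts 7157 or "7157"
    if gene_id <= 0:
        raise ValueError("Entrez Gene IDs are positive integers")
    query = urlencode({"MODE": "CAL_PROGNOSIS_GID", "QUERY": gene_id})
    return BASE_URL + "?" + query
```

For example, `prognoscan_link(7157)` builds the link for TP53, whose Entrez Gene ID is 7157.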

Feedback and bug reports are highly appreciated. Please use the following address for contact:

Hideaki Mizuno

Links
AE: ArrayExpress
GEO: Gene Expression Omnibus
caArray: Array Data Management System
SurvExpress: Biomarker validation for cancer gene expression
LSDB: Life Science DataBase
