Person: Haibe-Kains, Benjamin
Email Address
AA Acceptance Date
Birth Date
Research Projects
Organizational Units
Job Title
Last Name
First Name
Name
Search Results
Publication A Three-Gene Model to Robustly Identify Breast Cancer Molecular Subtypes
(Oxford University Press (OUP), 2012) Haibe-Kains, Benjamin; Desmedt, C.; Loi, S.; Culhane, Aedin; Bontempi, G.; Quackenbush, John; Sotiriou, C.Background - Single sample predictors (SSPs) and Subtype classification models (SCMs) are gene expression–based classifiers used to identify the four primary molecular subtypes of breast cancer (basal-like, HER2-enriched, luminal A, and luminal B). SSPs use hierarchical clustering, followed by nearest centroid classification, based on large sets of tumor-intrinsic genes. SCMs use a mixture of Gaussian distributions based on sets of genes with expression specifically correlated with three key breast cancer genes (estrogen receptor [ER], HER2, and aurora kinase A [AURKA]). The aim of this study was to compare the robustness, classification concordance, and prognostic value of these classifiers with those of a simplified three-gene SCM in a large compendium of microarray datasets. Methods - Thirty-six publicly available breast cancer datasets (n = 5715) were subjected to molecular subtyping using five published classifiers (three SSPs and two SCMs) and SCMGENE, the new three-gene (ER, HER2, and AURKA) SCM. We used the prediction strength statistic to estimate robustness of the classification models, defined as the capacity of a classifier to assign the same tumors to the same subtypes independently of the dataset used to fit it. We used Cohen k and Cramer V coefficients to assess concordance between the subtype classifiers and association with clinical variables, respectively. We used Kaplan–Meier survival curves and cross-validated partial likelihood to compare prognostic value of the resulting classifications. All statistical tests were two-sided. Results - SCMs were statistically significantly more robust than SSPs, with SCMGENE being the most robust because of its simplicity. SCMGENE was statistically significantly concordant with published SCMs (k = 0.65–0.70) and SSPs (k = 0.34–0.59), statistically significantly associated with ER (V = 0.64), HER2 (V = 0.52) status, and histological grade (V = 0.55), and yielded similar strong prognostic value. Conclusion - Our results suggest that adequate classification of the major and clinically relevant molecular subtypes of breast cancer can be robustly achieved with quantitative measurements of three key genes.
Publication Significance Analysis of Prognostic Signatures
(Public Library of Science (PLoS), 2013) Beck, Andrew; Knoblauch, Nicholas W.; Hefti, Marco; Kaplan, Jennifer; Schnitt, Stuart; Culhane, Aedin; Schroeder, Markus S.; Risch, Thomas; Quackenbush, John; Haibe-Kains, BenjaminA major goal in translational cancer research is to identify biological signatures driving cancer progression and metastasis. A common technique applied in genomics research is to cluster patients using gene expression data from a candidate prognostic gene set, and if the resulting clusters show statistically significant outcome stratification, to associate the gene set with prognosis, suggesting its biological and clinical importance. Recent work has questioned the validity of this approach by showing in several breast cancer data sets that ‘‘random’’ gene sets tend to cluster patients into prognostically variable subgroups. This work suggests that new rigorous statistical methods are needed to identify biologically informative prognostic gene sets. To address this problem, we developed Significance Analysis of Prognostic Signatures (SAPS) which integrates standard prognostic tests with a new prognostic significance test based on stratifying patients into prognostic subtypes with random gene sets. SAPS ensures that a significant gene set is not only able to stratify patients into prognostically variable groups, but is also enriched for genes showing strong univariate associations with patient prognosis, and performs significantly better than random gene sets. We use SAPS to perform a large meta-analysis (the largest completed to date) of prognostic pathways in breast and ovarian cancer and their molecular subtypes. Our analyses show that only a small subset of the gene sets found statistically significant using standard measures achieve significance by SAPS. We identify new prognostic signatures in breast and ovarian cancer and their corresponding molecular subtypes, and we show that prognostic signatures in ER negative breast cancer are more similar to prognostic signatures in ovarian cancer than to prognostic signatures in ER positive breast cancer. SAPS is a powerful new method for deriving robust prognostic biological signatures from clinically annotated genomic datasets.