SMAD4-dependent barrier constrains prostate cancer growth and metastatic progression

Effective clinical management of prostate cancer (PCA) has been challenged by significant intratumoural heterogeneity on the genomic and pathological levels and limited understanding of the genetic elements governing disease progression. Here, we exploited the experimental merits of the mouse to test the hypothesis that pathways constraining progression might be activated in indolent Pten-null mouse prostate tumours and that inactivation of such progression barriers in mice would engender a metastasis-prone condition. Comparative transcriptomic and canonical pathway analyses, followed by biochemical confirmation, of normal prostate epithelium versus poorly progressive Pten-null prostate cancers revealed robust activation of the TGFβ/BMP–SMAD4 signalling axis. The functional relevance of SMAD4 was further supported by emergence of invasive, metastatic and lethal prostate cancers with 100% penetrance upon genetic deletion of Smad4 in the Pten-null mouse prostate. Pathological and molecular analysis as well as transcriptomic knowledge-based pathway profiling of emerging tumours identified cell proliferation and invasion as two cardinal tumour biological features in the metastatic Smad4/Pten-null PCA model. Follow-on pathological and functional assessment confirmed cyclin D1 and SPP1 as key mediators of these biological processes, which together with PTEN and SMAD4, form a four-gene signature that is prognostic of prostate-specific antigen (PSA) biochemical recurrence and lethal metastasis in human PCA. This model-informed progression analysis, together with genetic, functional and translational studies, establishes SMAD4 as a key regulator of PCA progression in mice and humans.

Adenocarcinoma of the prostate (PCA) is the most common form of cancer and the second leading cause of cancer death in American men 2 . Current methods of stratifying tumours to predict outcome are based on clinical-pathological factors including Gleason grade, PSA and tumour stage 3 . These parameters are widely considered inadequate, which has motivated the genetic and biological study of PCA progression with the goal of identifying progression risk biomarkers capable of improving patient management 4 .
Genetic studies of human PCA have identified signature pathogenetic events 5 , a number of which have been validated and mechanistically defined in genetically engineered mouse models of PCA 6 . Prostate-specific Pten deletion (Pten pc−/− ) results in prostate intraepithelial neoplasia (PIN) which, following a long latency, can progress to high-grade adenocarcinoma, albeit with minimally invasive and metastatic features [7][8][9][10] . To understand this feeble progression phenotype, we conducted a transcriptome comparison of Pten pc−/− PIN relative to wild-type prostate epithelium (Supplementary Data 1). In addition to the expected PI3K and p53 (also known as TRP53) pathway representation 8 , knowledge-based pathway analysis revealed prominent TGFβ/BMP signalling in Pten pc−/− PIN (Supplementary Fig. 1).
Immunohistochemical and western blotting analyses of Smad4 expression confirmed a robust increase in Pten pc−/− PIN compared to wild-type prostate epithelium (Fig. 1a, b). In line with the reported down-regulation of SMAD4 in a subset of human primary prostate tumours 11 , Oncomine expression analysis showed consistent SMAD4 downregulation in human PCA metastasis (Fig. 1c and Supplementary Fig. 2). Loss of SMAD4 in advanced PCA is further supported by a recent report of frequent epigenetic silencing of the SMAD4 promoter in advanced disease 12 . On the functional level, SMAD4 knockdown in PC3 cells significantly enhanced the frequency of metastases to the lung from renal capsule implantation (Fig. 1d and Supplementary Fig. 3). These observations prompted speculation that a SMAD4-dependent barrier constrains PCA progression.
Molecular pathological analysis of PCA-bearing Pten pc−/− Smad4 pc−/− mice showed metastatic spread of Krt8 and androgen receptor-positive (Krt8 + , Ar + ) tumour nodules to draining lumbar lymph nodes in 25/25 cases and lung metastases in 3/25 cases (0.3-3 mm diameter metastatic nodules) (Fig. 2d, Supplementary Fig. 8 and Supplementary Table 1). The histological features of these metastases resembled those of the primary prostate tumour (Fig. 2d). These observations are in contrast to the Pten pc−/− PCA-bearing mice, which never developed metastatic lesions when examined at 1 year of age (n = 10); only two mice (2/8) older than 1.5 years of age contained a solitary lumbar lymph node metastasis, and one of these mice also possessed a solitary lung micrometastasis (Supplementary Table 1), a constrained progression phenotype that aligns with previous reports [7][8][9] . Similarly, 0/20 Pten pc−/− p53 pc−/− PCA-bearing mice developed metastasis during the same observation period (data not shown).
Having demonstrated the distinctly different metastatic potential of the Pten pc−/− , Pten pc−/− Smad4 pc−/− , and Pten pc−/− p53 pc−/− models, we then compared transcriptomes of primary PCAs from each to gain insight into the molecular determinants of their phenotypic differences. First, primary anterior prostate tumours of comparable sizes were harvested from 15-week-old animals from each model for mRNA profiling. Comparisons of Pten pc−/− Smad4 pc−/− (n = 5) versus Pten pc−/− (n = 5) or Pten pc−/− p53 pc−/− (n = 3) versus Pten pc−/− (n = 5) prostate tumour transcriptomes defined the Pten pc−/− Smad4 pc−/− or Pten pc−/− p53 pc−/− signatures (Supplementary Data 2, 3). Ingenuity Pathway Analysis (IPA) was used to generate hypotheses on the biological processes that underlie the metastatic phenotype in the Pten pc−/− Smad4 pc−/− PCAs. In contrast to the Pten pc−/− p53 pc−/− signature, we found that the two most significantly enriched gene categories in the Pten pc−/− Smad4 pc−/− signature are 'cellular movement' and 'cellular growth and proliferation' (Supplementary Fig. 9). Enrichment of cell growth and proliferation genes in Pten pc−/− Smad4 pc−/− PCA concurs with histopathological observations of a markedly increased proliferation index relative to Pten pc−/− tumours (Fig. 3a, b). The increased proliferation index was not associated with changes in apoptosis (Supplementary Fig. 10), but rather with neutralization of oncogene-induced senescence (OIS) as reflected by loss of senescence-associated β-galactosidase staining (Fig. 3a, b). A survey of key regulators of G1/S transition and OIS revealed significant induction of cyclin D1 protein but without significant changes in p53, p21 (also known as Cdkn1a) and p27 (also known as Cdkn1b) in Pten pc−/− Smad4 pc−/− relative to Pten pc−/− tumours (Fig. 3c and Supplementary Fig. 11). Complementing this hypothesis-driven survey, cyclin D1 was computationally identified as the only cell cycle regulator in the Pten pc−/− Smad4 pc−/− signature that both exhibits human PCA progression-correlated expression in Oncomine and harbours putative SMAD-binding elements (SBEs) in its promoter (Supplementary Data 2). Indeed, chromatin immunoprecipitation (ChIP) assays confirmed that SMAD4 can bind to one of the SBEs in the cyclin D1 gene promoter (Supplementary Figs 12 and 13). Correspondingly, TGFβ1 (also known as TGFB1)-treated SMAD4-transduced Pten pc−/− Smad4 pc−/− prostate tumour cells show down-regulated cyclin D1 expression (Supplementary Fig. 14a). Finally, enforced cyclin D1 expression significantly enhanced xenograft tumour growth in vivo (Fig. 3d). Together, these data support the thesis that cyclin D1 is a key mediator of the cardinal tumour biological feature of increased proliferation in the metastatic Pten pc−/− Smad4 pc−/− model.
We next obtained available ORFs corresponding to 21 of the 84 'Cellular Movement' genes (Supplementary Table 2) and assayed their ability to enhance invasion of human prostate cancer cells. In the modified Boyden chamber assay, 10/21 ORFs enhanced invasion of prostate cancer cells including PC3 (Supplementary Table 2). Among these validated invasion genes, SPP1 was selected for deeper analysis given its PCA progression-correlated expression in Oncomine, its prognostic potential for BCR in univariate Cox proportional hazard analysis in a data set comprising transcriptome and outcome data on 79 PCA patients (Supplementary Tables 3 and 4) 13 , and its known link to TGFβ signalling under different cellular contexts [1][2][3][4][5][6] . Western blotting and immunohistochemical analyses confirmed increased Spp1 expression in Pten pc−/− Smad4 pc−/− compared to Pten pc−/− tumours (Fig. 3c and Supplementary Fig. 11), and promoter analysis 17 identified a conserved SBE in the Spp1 promoter, which was confirmed by ChIP assay in cells treated with TGFβ1 (Supplementary Fig. 15). In contrast to previous studies showing Smad4 as an inducer of Spp1 expression through displacement of transcription repressors from the Spp1 promoter in a mink lung epithelial cell line and a preosteoblastic cell line 14,16 , loss of Smad4 in the Pten pc−/− Smad4 pc−/− prostate tumour cells results in markedly increased Spp1 expression (Fig. 3c and Supplementary Data 2). TGFβ1 treatment correspondingly suppressed Spp1 expression in a SMAD4-dependent manner in Pten pc−/− Smad4 pc−/− prostate tumour cells (Supplementary Fig. 14b). These observations underscore the context-specific actions of TGFβ-SMAD4 signalling on its downstream targets 18 . Next, to verify that Spp1 functionally contributes to the metastatic phenotype in our model, we showed significant inhibition of invasive activity in vitro upon knockdown of Spp1 in Pten pc−/− Smad4 pc−/− mouse PCA cells (Supplementary Fig. 16). Conversely, enforced SPP1 expression enhanced invasion in vitro of several human lines (Supplementary Fig. 17). Finally, orthotopic implantation of SPP1-transduced PC3 cells in the prostate resulted in increased lumbar lymph node metastasis and enhanced metastasis to lung (Fig. 3e-f and Supplementary Fig. 18). These results strongly indicated that SPP1 is a pro-metastasis invasion gene in human PCA and in the Pten pc−/− Smad4 pc−/− PCA model.
The in vivo genetic modelling studies, the in silico transcriptomic and pathway analyses, along with the tumour biological and functional characterizations collectively point to the inactivation of Pten and Smad4 as well as activation of cyclin D1 (also known as Ccnd1) and Spp1 as drivers of PCA progression. As such, we posited that these four genes relevant to PCA metastatic progression may carry prognostic value for metastasis risk in human PCA (see Supplementary Fig. 19). To this end, we assessed how robustly these four genes can stratify risk of BCR (> 0.2 ng ml −1 ) in the data set from ref. 13. Although only SPP1 was significantly correlated with BCR in univariate analysis, an overall risk score integrating the four-gene signature by multivariate Cox regression showed significant association with BCR as well (P-value = 0.0025, and overall C-index = 0.66, see Supplementary Tables 4 and 5). Furthermore, the four-gene model robustly stratified the ref. 13 cohort by K-means clustering into two groups that exhibited a significant difference in risk for BCR by Kaplan-Meier analysis (Fig. 4a; hazard ratio = 2.6, log-rank test P = 0.012). Importantly, by C-statistics, this four-gene signature carries independent prognostic information, as it can enhance the prognostic accuracy of Gleason score from a C-index of 0.77 to 0.8 (Fig. 4b), even though by itself the four-gene signature (C-index of 0.75) performs only as well as Gleason score alone (Fig. 4b).
We repeated this analysis in an independent extreme-case-control cohort derived from the Physicians' Health Study (PHS) (Supplementary Table 6; see Methods for study design), where we showed that the four-gene model was also capable of enhancing the prognostic accuracy of Gleason score in predicting metastatic lethal outcome (Fig. 4c; C = 0.716 by four-gene signature). Although exclusion of non-informative cases may have biased towards a positive association, the prognostic performance of this four-gene signature is unlikely to be a chance occurrence because, by gene-set-enrichment testing, it outperforms 243 other bidirectional signatures curated in the Molecular Signature Databases of the Broad Institute (MSigDB, version 2.5) in predicting metastatic lethal outcome in this PHS extreme-case-control cohort (Supplementary Fig. 20).
Encouraged by the prognostic value in two independent cohorts using RNA expression, yet mindful of the inherent intra-tumoural heterogeneity of PCA which may obscure expression differences in whole-tumour transcriptome profiles, we next performed immunohistochemical staining with validated antibodies against PTEN, SMAD4, cyclin D1 and SPP1 on a tumour tissue microarray (TMA) comprising a cohort of 405 tumour specimens randomly selected from men diagnosed with prostate cancer who underwent radical prostatectomy in the PHS cohort. Staining results were quantified by expert pathologists (R.L. and M.L.) blinded to the outcome of the cases. Indeed, not only does the four-protein model improve the prognostic accuracy of Gleason score in combination, it performs significantly better than Gleason score alone (Fig. 4d; C = 0.774 for Gleason only, C = 0.829 for four-protein model alone, and C = 0.882 for Gleason + four-protein model; P = 0.015 for improvement). Moreover, the addition of the four-protein model to the clinical parameters (Gleason, age at diagnosis, TNM stage; C = 0.842) leads to a significant seven point increase in the C-statistic (C = 0.913; P = 0.047 for the difference between the full clinical model and the clinical model + four-protein signature; Supplementary Table 7). The enhanced prognostic value of 'Gleason + four-protein model' was similarly validated in yet another independent cohort, the Directors Challenge TMA containing 40 prostate cancer patients with recurrence as outcome (Supplementary Table 8, Fig. 4e and Supplementary Fig. 19c; C = 0.704 for Gleason alone versus C = 0.740 for Gleason + four-protein model).
In summary, concomitant Pten and Smad4 inactivation in the prostate epithelium can bypass OIS, enhance tumour cell proliferation and drive invasion to produce a fully penetrant invasive and metastatic PCA phenotype in the mouse (Supplementary Fig. 21). The human relevance of this Pten pc−/− Smad4 pc−/− model of metastatic PCA is credentialed by the prognostic significance of a four-marker signature derived from this mouse model in predicting biochemical recurrence or lethal metastasis in human PCAs. Thus, this study will facilitate the development of a molecularly based prognostic assay that may complement the current standard of care to improve evidence-based management of PCA patients, a current major unmet need.
Representative sections from at least three mice were counted for each genotype.

Establishment of inducible Pten pc−/− Smad4 pc−/− SMAD4-TetOn cell lines
Pten pc−/− Smad4 pc−/− prostate tumour cells (see above) were used as parental cells for establishment of inducible SMAD4 TetOn cells using the TetOn Advanced Inducible Gene Expression System (Clontech). The human SMAD4 coding region was inserted into the pTRE-Tight vector, and a TetOn SMAD4 expression system was generated according to the manufacturer's protocol. Stable clones were induced to express SMAD4 using 1 μg ml −1 doxycycline (dox), and expression was verified to be comparable to the SMAD4 level in Pten pc−/− prostate tumours by western blot analysis of whole-cell extracts, using anti-SMAD4 antibody (sc-21742, Santa Cruz) (Supplementary Fig. 12).

RNA isolation and real-time PCR
Total RNA was extracted using TRIzol followed by RNeasy Mini kit (Qiagen) cleanup and RQ1 RNase-free DNase Set treatment (Promega) according to the manufacturer's instructions. First strand cDNA was synthesized using 1 μg of total RNA and Superscript II (Invitrogen). Real-time quantitative PCR was performed in triplicate with a MxPro3000 and SYBR GreenER qPCR mix (Invitrogen). The relative amount of specific mRNA was normalized to Gapdh. Primer sequences are available upon request.
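The text does not state how the Gapdh normalization was computed. As an illustration only, the following Python sketch assumes the widely used 2^-ΔΔCt calculation with near-100% amplification efficiency; the function name and the Ct values are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_expression(target_ct, gapdh_ct, control_target_ct, control_gapdh_ct):
    """Fold change of a target mRNA versus a control sample by 2^-ddCt.

    Each argument is a list of Ct values from technical replicates
    (triplicates in the text). Assumption: ~100% amplification efficiency,
    so one cycle corresponds to a twofold difference in template.
    """
    # Normalize the target to the Gapdh reference within each sample.
    d_ct_sample = mean(target_ct) - mean(gapdh_ct)
    d_ct_control = mean(control_target_ct) - mean(control_gapdh_ct)
    # Compare the normalized sample with the normalized control.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values: the target crosses threshold two cycles
# earlier in the sample than in the control, at equal Gapdh levels.
fold = relative_expression(
    target_ct=[24.0, 24.5, 23.5],
    gapdh_ct=[18.0, 18.0, 18.0],
    control_target_ct=[26.0, 26.0, 26.0],
    control_gapdh_ct=[18.0, 18.0, 18.0],
)
print(f"fold change vs control: {fold:.2f}")
```

A two-cycle shift at constant Gapdh corresponds to a fourfold difference under this assumption; primer pairs with lower efficiency would need an efficiency-corrected formula.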

Transcriptomic and pathway analyses
For transcriptomic analyses, anterior prostates from mice at 15 weeks of age were isolated and total mRNA was extracted, labelled and hybridized to Affymetrix GeneChip Mouse Genome 430 2.0 Arrays by the Dana-Farber Cancer Institute Microarray Core Facility according to the manufacturer's protocol. Affymetrix mouse MOE430 raw data (CEL files) were preprocessed using robust multi-array analysis (RMA) of the Affy package of Bioconductor. The background-corrected, normalized and summarized probe set intensity data were then analysed using significance analysis of microarrays (SAM) to identify differentially expressed genes. Using a twofold, FDR 5% cut-off, we identified 3,532 probe sets that distinguish differentially expressed genes in anterior prostate samples from Pten pc−/− (five mice) versus WT (PB-Cre4) (three mice), 397 probe sets that distinguish differentially expressed genes in anterior prostate samples from Pten pc−/− Smad4 pc−/− (five mice) versus Pten pc−/− (five mice), and 370 probe sets that distinguish differentially expressed genes in Pten pc−/− p53 pc−/− (three mice) versus Pten pc−/− (five mice). Gene information for all probes was annotated based on 'Mouse430_2.na28.annot.csv' downloaded from the Affymetrix website. Probes with multiple genes in the Affymetrix annotation file were mapped against the latest mouse genome build (UCSC mm9) for the single matching gene. Probes mapped to more than one position on mm9 were ignored. Human orthologues of mouse genes were extracted from HomoloGene build 64 (ftp://ftp.ncbi.nih.gov/pub/HomoloGene/). Intersection of the murine list with the human orthologous genes produced an orthologous set of genes.
All differentially expressed gene lists generated as described above were further analysed with the Ingenuity Pathways Analysis program (http://www.ingenuity.com/index.html) to identify canonical pathways, and molecular and cellular functions enriched in the related gene lists.
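The twofold, FDR 5% selection step can be sketched in Python as follows. This is a simplified illustration rather than the pipeline used here: SAM itself is not reimplemented, the q-values are assumed to be supplied by an upstream SAM run, and the probe identifiers and intensities are made up. Because RMA reports log2-scale intensities, a twofold change corresponds to an absolute difference of 1 between group means.

```python
import math
from statistics import mean


def select_probes(probes, min_fold=2.0, max_q=0.05):
    """Return {probe_id: log2 fold change} for probes passing both cut-offs.

    `probes` maps probe_id -> (case_log2, control_log2, q_value), where the
    intensity lists are RMA (log2-scale) values and q_value is an FDR
    estimate assumed to come from a separate SAM analysis.
    """
    min_log2 = math.log2(min_fold)
    selected = {}
    for probe_id, (case, control, q_value) in probes.items():
        log2_fc = mean(case) - mean(control)  # difference of log2 means
        if abs(log2_fc) >= min_log2 and q_value <= max_q:
            selected[probe_id] = log2_fc
    return selected


# Hypothetical probe sets (identifiers and values are illustrative only).
example = {
    "probe_up": ([9.0, 9.5, 8.5], [7.0, 7.0, 7.0], 0.01),     # +2.0, q ok
    "probe_down": ([5.0, 5.0, 5.0], [6.5, 6.5, 6.5], 0.03),   # -1.5, q ok
    "probe_small": ([7.5, 7.5, 7.5], [7.0, 7.0, 7.0], 0.01),  # +0.5, too small
    "probe_noisy": ([9.0, 9.0, 9.0], [7.0, 7.0, 7.0], 0.20),  # q too high
}
signature = select_probes(example)
print(sorted(signature))
```

Applying the fold-change and FDR filters together, as in the text, keeps only probe sets that are both large in effect and statistically supported.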

Viral production and transduction
Approximately 2 × 10 6 293T cells were seeded in 100 mm plates 15 h before transfection (∼30% confluent) in 10% FBS/DMEM with antibiotics. For MSCV viral production, 3 μg viral backbone, 2.7 μg gag/pol expression vectors, and 0.3 μg VSV-G expression vector were diluted to 20 μl using Opti-MEM (Invitrogen) and combined with 180 μl Opti-MEM containing 12 μl FuGENE-6 (Roche). This mixture was incubated at room temperature (RT) for 20 min and added to the 10 ml media covering the 293T cells. For pLKO shRNA lentivirus production, 10 μg of viral backbone and 10 μg of lentiviral packaging vectors were diluted to 1,000 μl using Opti-MEM (Invitrogen). The resulting mix was combined with 1,000 μl Opti-MEM containing 30 μl Lipofectamine 2000 (Invitrogen), incubated at room temperature for 20 min and added to 8 ml media covering the 293T cells. The media was replaced with 10% FBS/DMEM approximately 10 h post-transfection, and viral supernatants were collected at 36 h and 60 h after transfection and combined. Viral supernatants (5 ml) containing 8 μg ml −1 polybrene were added to target cells that were seeded 24 h before infection at 70-80% confluence. Cells were infected twice and allowed to recover in 10% FBS/RPMI 1640 with antibiotics for 12 h following the second infection, after which cells were selected with 2 μg ml −1 puromycin for 4 days and allowed to recover in normal medium for 24 h before further experiments.

Transwell invasion assay
Standard 24-well Boyden invasion chambers (BD Biosciences) were used to assess cell invasiveness following the manufacturer's suggestions. Briefly, cells were trypsinized, rinsed twice with PBS, resuspended in serum-free media, and seeded at 2 × 10 5 cells per well for PC3 cells and Pten pc−/− Smad4 pc−/− cells, and at 4 × 10 5 cells per well for BPH1 cells.
Chambers in triplicate were placed in 10% serum-containing media as a chemo-attractant, and an equal number of cells were seeded in cell culture plates in triplicate as input controls. Following 22 h incubation, chambers were fixed in 10% formalin and stained with crystal violet, and invaded cells were quantified by manual counting or by pixel quantification with Adobe Photoshop. Data were normalized to input cells to control for differences in cell number (loading control).
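The input normalization can be illustrated with a short Python sketch. The exact arithmetic is not given in the text, so dividing the mean invaded count by the mean input-control count is an assumption, and all counts and names below are hypothetical.

```python
from statistics import mean


def invasion_index(invaded_counts, input_counts):
    """Mean invaded-cell count divided by mean input-control count.

    Both lists hold replicate measurements (triplicates in the text).
    """
    if not invaded_counts or not input_counts:
        raise ValueError("need at least one replicate of each kind")
    return mean(invaded_counts) / mean(input_counts)


def fold_over_control(test_index, control_index):
    """Invasion of a test line expressed relative to its vector control."""
    return test_index / control_index


# Hypothetical triplicate counts for a vector control and an ORF-expressing line.
vector = invasion_index([40, 44, 36], [400, 410, 390])
orf = invasion_index([90, 100, 110], [500, 490, 510])
fold = fold_over_control(orf, vector)
print(f"vector: {vector:.3f}  ORF: {orf:.3f}  fold: {fold:.2f}")
```

Normalizing to input keeps a line that simply plated or proliferated better from being scored as more invasive.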

Orthotopic and renal capsule implantation
Male SCID mice (6 weeks old) were obtained from Taconic. Orthotopic and renal capsule implantations were performed as described previously 25,26 . Briefly, a suspension of 1 × 10 6 cells in 50 μl of a 1:1 mixture of PBS and Matrigel (BD Biosciences) was injected into the anterior prostate lobe. For renal capsule implantation, 5 × 10 5 cells were suspended in 50 μl of neutralized type I rat tail collagen (BD Biosciences), allowed to gel at 37 °C for 15 min, covered with growth medium, followed by grafting beneath the renal capsule of mice.

Identification of putative SMAD binding sites (SBEs)
The Smad binding elements (SBEs) in the promoters of the Pten pc−/− Smad4 pc−/− signature of 267 genes were identified computationally by established methods 16 . Briefly, the conserved nucleotides in the 4 kb promoter regions were isolated and scanned for enrichment of the SMAD binding motifs in TRANSFAC. Enrichment was assessed by comparing the target regions to matched control regions at the same distance from the transcription start sites of random genes. Promoter analysis on these gene sets for SBEs used the CisGenome software (http://www.biostat.jhsph.edu/∼hji/cisgenome/).
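The idea of scanning promoters for SBEs and comparing against matched control regions can be sketched in Python. This is a deliberately simplified stand-in: it searches for the minimal SBE consensus 5'-GTCT-3' and its reverse complement 5'-AGAC-3' by exact string matching, rather than the TRANSFAC matrices and CisGenome procedure used here, and the sequences are invented.

```python
def find_sbe_sites(sequence, motifs=("GTCT", "AGAC")):
    """Return sorted (position, motif) hits, allowing overlapping matches.

    Assumption: exact matches to the minimal SBE consensus and its reverse
    complement stand in for position-weight-matrix scoring.
    """
    seq = sequence.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)
    return sorted(hits)


def enrichment_ratio(target_seqs, control_seqs):
    """Site density (hits per base) in targets divided by that in controls."""
    def density(seqs):
        total_length = sum(len(s) for s in seqs)
        return sum(len(find_sbe_sites(s)) for s in seqs) / total_length

    control_density = density(control_seqs)
    if control_density == 0:
        return float("inf")
    return density(target_seqs) / control_density


# Invented toy sequences standing in for conserved promoter and control regions.
target = ["ttGTCTagacAGACcc"]
control = ["ACGTGTCTACGTACGT"]
sites = find_sbe_sites(target[0])
ratio = enrichment_ratio(target, control)
print(sites, ratio)
```

A real analysis would restrict the scan to conserved nucleotides and test the ratio for statistical significance against many matched control sets.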
TMA slides were scanned using the CRi Nuance v2.8 (Woburn) slide scanner following the standard bright field TMA protocol. The system acquires images at 20 nm intervals and combines them into a stack file which represents one image. This was done automatically to create one image for each core on the TMA. The maximum likelihood method was used to extract the spectra of DAB and haematoxylin, which represent the different elements of IHC. inForm v0.4.2 software (CRi) was used to analyse the spectral images of each core. Initially, a training set comprising two classes of tissue was created: 'tumor' and 'other'. Representative areas for each class were marked on 12-16 images from each TMA. The software was trained on these areas using the spectra of both the counterstain (haematoxylin) and the immunostain (DAB) and tested to determine how accurately it could differentiate between the two classes. This process was repeated until further iterations no longer improved accuracy.
Histological images were then analysed using the 'nuclear or cytoplasmic' algorithm. The multispectral imaging capabilities of the Nuance slide scanner allow the software to isolate or segment the nuclei using the unmixed spectra of the nuclear counterstain and, for a nuclear biomarker, the DAB immunohistochemical stain in addition. In turn, cytoplasm is found based on the non-nuclear tumour area. Threshold settings approximated: scale 1, offset subtraction 0, minimum blob size 30, maximum blob size 10,000, circularity threshold 0, edge sharpness 0, fill hole enabled (nuclear parameters); algorithm 4, area 200, compactness 0.5, Wht threshold 225 (cytoplasmic parameters). The final score was based on the percentage of the cytoplasmic or nuclear tumour area that was positively stained, and this was represented as a ten-bin histogram. This involved each pixel being placed into one of ten bins based on the intensity of the DAB spectra, with an adjustment of the threshold for the 9th bin by the user in order to create a desirable distribution. By reviewing images and their scores, a threshold level of these bins was determined that represented real staining, and the values from the bins above this threshold were added together to create a final score which represented the percentage of cytoplasmic or nuclear area that was positively stained. All samples were also reviewed by pathologists (R.L. and M.L.) to ensure that assigned scores were appropriate. TMA cores that were difficult to classify (due to technical artefacts such as folds in the tissue, air bubbles or overlapping cores, or due to difficulty in morphological classification) were eliminated from the analysis so that only appropriately categorized tissue was scored. The Directors Challenge TMA originally contained 52 patient samples 27 . However, as is typical of most heavily used TMAs, some of the samples became exhausted over time from extensive use by the M.L. lab and the community. After careful quality control of each core on the TMA by R.L. in the M.L. lab, only 40 high quality core samples were considered usable (Supplementary Table 7). After careful quality control of each core on the PHS TMA by R.L. in the M.L. lab, 405 high quality core samples were considered usable (Supplementary Table 5).
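The ten-bin scoring described above can be illustrated with a small Python sketch. It does not reproduce the inForm implementation: equal-width bins over 8-bit intensities, the chosen threshold bin and the pixel values are all assumptions made for the example.

```python
def ten_bin_histogram(intensities, levels=256, n_bins=10):
    """Percentage of pixels falling in each of n_bins equal-width bins.

    `intensities` are integer DAB intensities in [0, levels - 1] for the
    pixels of the segmented nuclear or cytoplasmic tumour area.
    """
    if not intensities:
        raise ValueError("no pixels in the segmented area")
    counts = [0] * n_bins
    for value in intensities:
        counts[value * n_bins // levels] += 1  # integer bin index, 0..n_bins-1
    return [100.0 * count / len(intensities) for count in counts]


def positive_area_score(intensities, first_positive_bin):
    """Sum of histogram bins at or above the bin judged to be real staining."""
    histogram = ten_bin_histogram(intensities)
    return sum(histogram[first_positive_bin:])


# Hypothetical pixel intensities; bins 5-9 are treated as positive staining.
pixels = [0, 30, 100, 200, 250, 255, 128, 10]
score = positive_area_score(pixels, first_positive_bin=5)
print(f"positively stained area: {score:.1f}%")
```

In practice the threshold bin would be fixed once per marker after reviewing images, then applied unchanged to every core so that scores remain comparable.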

Clinical outcome analysis
The raw Affymetrix HG-U133A expression profiles and clinical information of 79 prostate cancer patients from the ref. 13 cohort (Supplementary Table 2) 12 were generously provided by W. Gerald. The raw data set was analysed with the MAS5 algorithm. Low-expression probesets with less than 20% present calls across the 79 samples were excluded from the data. The remaining 13,027 probesets map to 8,763 genes with unique symbols, and the mean log-transformed probeset levels were used as the gene expression profiles.
A univariate Cox proportional hazard analysis was conducted using the R 'survival' package for invasion assay positive genes to identify those whose expression in PCA tumours was positively associated with biochemical recurrence (BCR, defined by post-op PSA > 0.2 ng ml −1 ) in the ref. 13 data set 12 . The K-means clustering algorithm was used with the PTEN/SMAD4/CCND1/SPP1 four-gene model to identify two cancer sample clusters. The initial centres for the K-means clustering were set at the two cases with the longest Euclidean distance. Kaplan-Meier analysis for the survival difference of the two cancer patient clusters was conducted using the R 'survival' package. C-statistics analysis was conducted using the R 'survcomp' package. The statistical procedures used in the analyses include a bootstrapping step that estimates the distribution of C-statistics of all models across 10,000 random bootstrapping instances, and a comparative step that uses the paired t-test to compare the C-statistics of models and evaluate the statistical significance 28 . Multivariate Cox proportional hazards model analysis with the four-gene signature was used to estimate the coefficients of individual genes, which combined the expression levels of the four genes into an integrated risk score.
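The two-cluster K-means step, seeded at the two cases separated by the longest Euclidean distance, can be sketched in plain Python. The original analysis was run in R; this version is only an illustration, and the four-dimensional expression vectors (ordered PTEN, SMAD4, CCND1, SPP1) are hypothetical.

```python
from itertools import combinations
from math import dist


def farthest_pair(points):
    """Indices of the two cases with the longest Euclidean distance."""
    return max(combinations(range(len(points)), 2),
               key=lambda pair: dist(points[pair[0]], points[pair[1]]))


def two_means(points, max_iter=100):
    """K-means with k = 2, initial centres placed on the farthest pair."""
    i, j = farthest_pair(points)
    centres = [list(points[i]), list(points[j])]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each case joins its nearest centre.
        new_labels = [0 if dist(p, centres[0]) <= dist(p, centres[1]) else 1
                      for p in points]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Hypothetical standardized expression vectors (PTEN, SMAD4, CCND1, SPP1).
cases = [
    (0.1, 0.2, 0.0, 0.1),
    (0.0, 0.1, 0.2, 0.0),
    (0.2, 0.0, 0.1, 0.1),
    (2.0, 2.1, 1.9, 2.2),
    (2.2, 1.9, 2.0, 2.1),
]
labels, centres = two_means(cases)
print(labels)
```

Seeding on the farthest pair makes the result deterministic, which avoids the run-to-run variation of random initialization when the clusters are later compared by Kaplan-Meier analysis.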
To validate further the prognostic significance of this four-gene model, we repeated this analysis in independent cohorts derived from the Directors Challenge cohort 27 (Supplementary Table 7) and the Physicians' Health Study (PHS) cohort. PHS cohort (Supplementary Table 5): the men with prostate cancer included in this study were participants in the Physicians' Health Study (PHS), an ongoing randomized trial among US male physicians 29,30 . The men were diagnosed with histologically confirmed prostate cancer after randomization, between January 1983 and December 2004. We obtained archival formalin-fixed, paraffin-embedded tissue specimens, either radical prostatectomy (95%) or TURP (5%), and constructed tumour tissue microarrays for immunohistochemical analyses; 405 had sufficient tumour tissue available for this project. All men in the trial were followed for mortality, and cause of death was confirmed by a study endpoints committee. In addition, we retrieved medical records and questionnaire data on the men with prostate cancer to collect information on treatments, clinical characteristics, as well as progression of the cancer. Through March 2010, 38 men of 405 had developed a lethal metastatic phenotype, defined by bony metastases or cancer-specific death.
We undertook gene expression profiling as part of a previous project to define molecular signatures in prostate cancer 31 on a subset of the PHS cases included on the TMAs. As part of the sampling, we sought to maximize efficiency for studies of lethal prostate cancer by devising a study design that included men who either died from prostate cancer or developed metastases during follow-up ('lethal prostate cancer' cases) or who survived at least 10 years after their diagnosis without any evidence of metastases (men with 'indolent prostate cancer'). We sought to include all lethal cancers, based on follow-up through March 2007, and took a random sample of indolent cancers for a total sample size of 116 cases. In this design, we exclude men with non-informative outcomes, namely those who died from other causes within 10 years of their prostate cancer diagnosis or had been followed for less than 10 years with no disease progression. The natural history of prostate cancer is quite long, with men dying of prostate cancer even 15 or more years after cancer diagnosis 32 . Thus, we excluded prostate cancer cases with less than 10 years of follow-up to increase confidence in the outcome annotation, since we are not seeking to estimate survival time. By focusing on long-follow-up cases, an extreme-case-control study design allows us to maximally distinguish lethal from indolent prostate cancer. In addition, to minimize the potential that C-statistics estimation might be biased towards a higher lethal composition by such an extreme-case-control study design, we have chosen a logistic regression analysis rather than a survival analysis.
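For a binary lethal-versus-indolent outcome, the C-statistic of a model equals the fraction of lethal-indolent pairs in which the lethal case receives the higher predicted score, with ties counted as one half. The Python sketch below illustrates that definition only; the scores stand in for fitted logistic regression probabilities and are hypothetical, and the bootstrap comparison described earlier is not included.

```python
def c_statistic(scores, outcomes):
    """Concordance for a binary outcome (equivalent to the ROC AUC).

    `scores` are predicted risks; `outcomes` are 1 for lethal and 0 for
    indolent cases. Ties between a lethal and an indolent case count 0.5.
    """
    lethal = [s for s, y in zip(scores, outcomes) if y == 1]
    indolent = [s for s, y in zip(scores, outcomes) if y == 0]
    if not lethal or not indolent:
        raise ValueError("need at least one lethal and one indolent case")
    concordant = 0.0
    for risk_lethal in lethal:
        for risk_indolent in indolent:
            if risk_lethal > risk_indolent:
                concordant += 1.0
            elif risk_lethal == risk_indolent:
                concordant += 0.5
    return concordant / (len(lethal) * len(indolent))


# Hypothetical predicted risks for two lethal and three indolent cases.
risk = [0.9, 0.8, 0.3, 0.4, 0.2]
lethal_flag = [1, 0, 1, 0, 0]
c = c_statistic(risk, lethal_flag)
print(f"C = {c:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the Gleason-only and combined models are compared.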
The tissue based studies were approved by the Institutional Review Boards of Harvard School of Public Health and Partners Healthcare.
We compared the enrichment of the four-gene signature with that of 244 bidirectional signatures curated in the Molecular Signature Databases of the Broad Institute (MSigDB, version 2.5) by computing an enrichment statistic 33 .
NIH-PA Author Manuscript

Figure 1. SMAD4 is a putative suppressor of prostate tumour progression. a, b, Immunohistochemical (a) and western blot (b) analysis of wild-type (WT) and Pten pc−/− mouse prostate tissues. Scale bar, 50 μm. c, Oncomine boxed plot of SMAD4 expression levels between human PCA and metastasis in multiple data sets including those from ref. 19 and ref. 20. d, SMAD4 knockdown enhanced metastatic potential to lung from PC3 cells implanted in renal capsule of immunocompromised nude mice.

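Figure 2. Smad4 deletion drives progression of Pten-deficient prostate tumour to highly aggressive prostate cancer metastatic to lymph node and lung. a, Haematoxylin and eosin (H&E) stained sections of representative anterior prostates (AP) at 7, 11 and 15 weeks. Scale bar, 200 μm. b, Kaplan-Meier cumulative survival analysis showing significant (P < 0.0001) decrease in lifespan in the Pten pc−/− Smad4 pc−/− compared with the Pten pc−/− cohort. c, Gross anatomy of representative prostates at 22 weeks of age. Scale bar, 10 mm. d, H&E-stained sections and immunohistochemical analyses of primary PCA, lumbar lymph nodes and lung of Pten pc−/− Smad4 pc−/− . The tumour context is depicted in low-magnification insets. Scale bar, 50 μm.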

Figure 3. Ccnd1 and Spp1 are mediators of prostate tumour cell proliferation and metastasis. a, BrdU pulse-labelling and SA-β-galactosidase (β-Gal) staining of 15-week-old APs. b, Quantification of BrdU pulse labelling and β-Gal staining. Error bars represent s.d. for a representative experiment performed in triplicate. c, Western blot analysis demonstrating elevated Ccnd1 and Spp1 levels in Pten pc−/− Smad4 pc−/− compared to Pten pc−/− prostate tumours. d, Enforced CCND1 expression significantly enhanced prostate xenograft tumour growth of PC3 cells. e, f, Enforced SPP1 expression significantly increases metastatic activity of PC3 cells from prostate xenograft to lumbar lymph nodes (e) and to lung (f).

Figure 4. Prognostic potential of a four-gene signature in human PCA. a, The four-gene set of PTEN/SMAD4/CCND1/SPP1 can dichotomize PCA cases for BCR in the ref. 13 data set. b, c, C-statistic analysis revealed that this four-gene set can enhance the prognostic accuracy of Gleason score in the ref. 13 data set (b) and in an independent PHS cohort (c). d, The TMA-based four-protein model also significantly improves the prognostic ability of Gleason score (P = 0.015) in the PHS cohort. e, Representative immunohistochemical staining with specific antibodies against PTEN, SMAD4, CCND1 and SPP1 in the Directors Challenge TMA. Scale bar, 200 μm.