ORIGINAL REPORTS
Genitourinary Cancer
DOI: 10.1200/JCO.2009.25.0977 Journal of Clinical Oncology - published online before print April 26, 2010
PMID: 20421545
Expression Signature of E2F1 and Its Associated Genes Predict Superficial to Invasive Progression of Bladder Tumors
J.-S.L. and S.-H.L. contributed equally to this work.
In approximately 20% of patients with superficial bladder tumors, the tumors progress to invasive tumors after treatment. Current methods of predicting the clinical behavior of these tumors prospectively are unreliable. We aim to identify a molecular signature that can reliably identify patients with high-risk superficial tumors that are likely to progress to invasive tumors.
Gene expression data were collected from tumor specimens from 165 patients with bladder cancer. Various statistical methods, including leave-one-out cross-validation, were applied to identify a gene expression signature that could predict the likelihood of progression to invasive tumors. The robustness of the signature was then validated in an independent cohort (n = 353).
Bladder cancer is the seventh most prevalent type of cancer worldwide and accounts for an estimated 150,000 deaths annually.1,2 This cancer presents as a heterogeneous disease that consists of two main phenotypic groups with distinct biologic behavior and prognoses.3,4 Approximately 80% of patients have tumors that are superficial or non–muscle-invasive (Ta, T1, and Tis). The remaining 20% of patients have muscle-invasive tumors (T2, T3, and T4) with dismal prognosis, and these patients usually have no previous history of low-grade noninvasive tumors.5,6 Although considerable effort has been devoted to establishing a prognostic model of superficial tumors that uses clinical information and pathologic classification to provide information at diagnosis about both survival and treatment options,7–11 predicting the clinical behavior of these tumors prospectively remains challenging. Thus, there is a great need for robust methods capable of identifying patients with high-risk superficial tumors that are likely to develop into invasive tumors.
On the basis of the success of recent genome-wide gene expression profile studies,12–14 we characterized tumor transcriptome at the systems level to address the heterogeneity of superficial bladder cancer and identify potential markers that could be used to divide patients into distinct subclasses with different progression rates. The results revealed two subclasses of patients characterized by a significant difference in progression-free survival for the same stage of disease after curative treatment.
One hundred sixty-five primary bladder cancer tissue samples from patients with histologically diagnosed transitional-cell carcinoma were obtained from the Chungbuk National University Hospital, Cheongju, South Korea. Tumors were staged and graded according to standard criteria of the American Joint Committee on Cancer. Clinical data were also obtained retrospectively, and Table 1 lists the pathologic and clinical characteristics of the patients. Fifty-eight samples of histologically normal-looking surrounding tissues from the patients with transitional-cell carcinoma and 10 normal bladder mucosae from patients with benign diseases were also included for study. The collection and use of specimens was approved by the Institutional Review Board of the Chungbuk National University College of Medicine, and informed consent was obtained from each patient. Samples were then frozen in liquid nitrogen and stored at −80°C until RNA extraction.
Table 1. Pathologic and Clinical Characteristics of the Patients
| Characteristic | All (N = 165), No. of Patients | All, % | Superficial (n = 103), No. of Patients | Superficial, % | Invasive (n = 62), No. of Patients | Invasive, % |
|---|---|---|---|---|---|---|
| Age, years | | | | | | |
| Median | 66 | | 66 | | 66 | |
| Range | 24-88 | | 24-88 | | 38-87 | |
| Sex | | | | | | |
| Male | 135 | 82 | 87 | 53 | 48 | 29 |
| Female | 30 | 18 | 16 | 10 | 14 | 8 |
| Cancer stage, TNM class | | | | | | |
| Ta | 23 | 14 | 23 | 14 | | |
| T1 | 80 | 48 | 80 | 48 | | |
| T2 | 32 | 19 | | | 32 | 19 |
| T3 | 19 | 12 | | | 19 | 12 |
| T4 | 11 | 7 | | | 11 | 7 |
Complete transurethral resection of the tumor was performed on all patients with superficial tumors, and patients with superficial tumors with intermediate- or high-risk bladder cancer received one cycle of intravesical treatment (Bacillus Calmette-Guérin [BCG] or mitomycin).15 Response to treatment was assessed by cystoscopy and urinary cytology. Patients who were free of disease 3 months after treatment were assessed every 3 months for the first 2 years and every 6 months thereafter. The median follow-up time for patients with superficial bladder cancer was 58 months (range, 3 to 137 months).
Total RNA was isolated from the tissues with TRIzol (Life Technologies, Carlsbad, CA) reagent according to the manufacturer's protocol. Five hundred nanograms of total RNA were used for labeling and hybridization, according to the manufacturer's protocols (Illumina, San Diego, CA). After the bead chips were scanned with an Illumina BeadArray Reader, the microarray data were normalized using the quantile normalization method in the Linear Models for Microarray Data (LIMMA) package in the R language environment.16 Measured gene expression values were log2 transformed and median centered across genes and samples. We next selected genes for further analysis by removing genes whose expression levels varied by a factor of less than 2 across the samples. Primary microarray data are available in NCBI's Gene Expression Omnibus public database (microarray platform, GPL6102; microarray data, GSE13507).
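The preprocessing steps described above can be sketched as follows. This is an illustrative, standard-library Python sketch and not the LIMMA code used in the study. The function names are hypothetical, ties in quantile normalization are broken by sort order rather than averaged, and the variation filter is read here as requiring a log2 range (maximum minus minimum) of at least 1 across samples, which is an assumption about how the two-fold criterion was applied.

```python
import math
from statistics import median


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    # Sort each sample, then average the sorted values rank by rank.
    sorted_cols = [sorted(row[j] for row in matrix) for j in range(n_samples)]
    rank_means = [sum(col[r] for col in sorted_cols) / n_samples
                  for r in range(n_genes)]
    out = [[0.0] * n_samples for _ in range(n_genes)]
    for j in range(n_samples):
        # Gene indices ordered by their value in sample j (ties: sort order).
        order = sorted(range(n_genes), key=lambda i: matrix[i][j])
        for rank, i in enumerate(order):
            out[i][j] = rank_means[rank]
    return out


def log2_median_center(matrix):
    """Log2-transform, then median-center each gene and then each sample."""
    logged = [[math.log2(v) for v in row] for row in matrix]
    by_gene = [[v - median(row) for v in row] for row in logged]
    col_medians = [median(row[j] for row in by_gene)
                   for j in range(len(by_gene[0]))]
    return [[v - col_medians[j] for j, v in enumerate(row)] for row in by_gene]


def variable_gene_indices(log_matrix, min_fold=2.0):
    """Indices of genes whose log2 range across samples reaches min_fold."""
    cutoff = math.log2(min_fold)
    return [i for i, row in enumerate(log_matrix)
            if max(row) - min(row) >= cutoff]
```

In practice these steps would be applied in the order given in the text: normalization, log2 transformation with median centering, then the variation filter.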
We identified genes that were differentially expressed between the two classes using a random-variance t test.18 Genes were considered to have statistically significant differences in expression if P < .001. We also performed a global test of whether the expression profiles differed between the classes by permutation. For each permutation, the P values were recomputed, and the number of genes significant at the P = .001 level was noted. The proportion of the permutations that resulted in at least as many significant genes as the actual data was the significance level of the global test.
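The logic of the global test can be sketched as below. This is a simplified illustration under stated assumptions, not the BRB-ArrayTools implementation: a Welch t statistic with a fixed cutoff (`t_cutoff`, a hypothetical parameter) stands in for the random-variance t test at P < .001, and the permutations are assumed to shuffle the class labels while keeping group sizes fixed.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic (stand-in for the random-variance t test)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def count_significant(expr, labels, t_cutoff):
    """Number of genes (rows of expr) whose |t| reaches the cutoff."""
    count = 0
    for row in expr:
        a = [v for v, lab in zip(row, labels) if lab == 0]
        b = [v for v, lab in zip(row, labels) if lab == 1]
        if abs(welch_t(a, b)) >= t_cutoff:
            count += 1
    return count


def global_permutation_test(expr, labels, t_cutoff=3.5, n_perm=1000, seed=0):
    """Return (observed count, proportion of permutations with >= that count)."""
    rng = random.Random(seed)
    observed = count_significant(expr, labels, t_cutoff)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute class labels, group sizes unchanged
        if count_significant(expr, shuffled, t_cutoff) >= observed:
            hits += 1
    return observed, hits / n_perm
```

The returned proportion corresponds to the significance level of the global test as defined in the text.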
Transcription factor (TF) binding sites enriched in invasive tumors were identified using a functional class scoring analysis, as described by Pavlidis et al.19 For each gene associated with a given TF binding site, the P value comparing its expression in superficial versus invasive tumors was computed, and the P values of the genes sharing that binding site were combined into a summary statistic. The statistical significance of a TF binding site set containing n genes represented on the array was evaluated by computing the empirical distribution of this summary statistic in random samples of n genes. The functional class scoring analysis for TF binding sites was performed using BRB-ArrayTools.17
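A minimal sketch of this resampling scheme is given below. The text does not name the summary statistic, so the mean of the negative natural log of the per-gene P values is assumed here; the function names and the gene identifiers in any usage are hypothetical.

```python
import math
import random


def summary_statistic(p_values):
    """Assumed summary statistic: mean of -ln(P) over the genes in a set."""
    return sum(-math.log(p) for p in p_values) / len(p_values)


def gene_set_significance(all_p, gene_set, n_resamples=10000, seed=0):
    """Empirical P value for one TF binding site gene set.

    all_p    : dict mapping every gene on the array to its per-gene P value
    gene_set : genes (keys of all_p) sharing the TF binding site
    """
    rng = random.Random(seed)
    observed = summary_statistic([all_p[g] for g in gene_set])
    genes = list(all_p)
    n = len(gene_set)
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(genes, n)  # random set of n genes from the array
        if summary_statistic([all_p[g] for g in sample]) >= observed:
            hits += 1
    return observed, hits / n_resamples
```

A set whose genes carry unusually small P values will rarely be matched by random sets of the same size, which yields a small empirical P value.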
To test the ability of the gene expression profile to predict the class of patients in an independent cohort, we developed models based on the compound covariate predictor,20 linear discriminant analysis,21 nearest neighbor classification,21 and support vector machines with linear kernel.22 Before analysis, gene expression data of the Korean and European cohorts were independently centered and then combined. We estimated the prediction error of each model using leave-one-out cross-validation (LOOCV), as described by Simon et al.23 For each LOOCV training set, the entire model-building process was repeated, including the gene selection process. We also evaluated whether the cross-validated error rate estimate for a model was significantly less than one would expect from random prediction. The class labels were randomly permuted (100 permutations), and the entire LOOCV process was repeated. The significance level is the proportion of the random permutations that gave a cross-validated error rate no greater than the cross-validated error rate obtained with the real data.
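The key point of this procedure, that gene selection is repeated inside every LOOCV fold, can be sketched as follows. A simple nearest-centroid classifier and a mean-difference gene ranking are used purely as stand-ins for the classifiers and selection statistic named above; all function names are hypothetical.

```python
import random
from statistics import mean


def select_genes(X, y, k):
    """Top-k gene indices ranked by absolute difference in class means."""
    scores = []
    for g in range(len(X[0])):
        m0 = mean(x[g] for x, c in zip(X, y) if c == 0)
        m1 = mean(x[g] for x, c in zip(X, y) if c == 1)
        scores.append((abs(m1 - m0), g))
    return [g for _, g in sorted(scores, reverse=True)[:k]]


def fit_centroids(X, y, genes):
    """Per-class mean expression over the selected genes."""
    return {c: [mean(x[g] for x, lab in zip(X, y) if lab == c) for g in genes]
            for c in (0, 1)}


def predict(centroids, x, genes):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((x[g] - m) ** 2 for g, m in zip(genes, centroids[c]))
    return min((0, 1), key=dist)


def loocv_error(X, y, k=2):
    """LOOCV error; gene selection is redone on every training set."""
    errors = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = select_genes(X_train, y_train, k)
        centroids = fit_centroids(X_train, y_train, genes)
        if predict(centroids, X[i], genes) != y[i]:
            errors += 1
    return errors / len(X)


def permutation_significance(X, y, k=2, n_perm=100, seed=0):
    """Proportion of label permutations with LOOCV error <= the real error."""
    rng = random.Random(seed)
    real = loocv_error(X, y, k)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if loocv_error(X, shuffled, k) <= real:
            hits += 1
    return real, hits / n_perm
```

Here `X` is a list of samples (each a list of gene values) and `y` holds 0/1 class labels. Selecting genes on the full data before cross-validation would leak information from the held-out sample and bias the error estimate downward, which is why the selection sits inside the loop.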
Gene expression profile data were collected from surgically removed tumors from 165 patients with bladder cancer (Table 1). Hierarchical clustering analysis of gene expression data from all tissues yielded two major clusters, one representing bladder tumors and the other representing surrounding tissues and normal urothelium, with a few exceptions (Appendix Fig A1, online only). Thus, gene expression patterns reflecting molecular configuration are readily distinguishable between bladder tumors and nontumor tissues, as previously observed.10,24,25
We first sought to find gene sets that are differentially expressed in three tissue groups (normal surrounding tissues and superficial and invasive tumors). We applied a Venn diagram comparison of two gene lists to select gene expression patterns unique for invasive tumors. First, we generated two different gene lists by applying the two-sample t test (Fig 1A; P < .001). Gene list X represents the genes that were differentially expressed between normal tissues and superficial tumors, whereas gene list Y represents the genes that were differentially expressed between superficial and invasive tumors. When the two gene lists were compared, the following three different patterns were observed: X not Y (1,546 genes), X and Y (508 genes), and Y not X (394 genes; Fig 1B). Genes in the X not Y category largely reflect common characteristics of both superficial and invasive bladder tumors. Genes in the X and Y category display superficial tumor-specific expression patterns. Genes in the Y not X category show expression patterns characteristic of invasive tumors.
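The Venn diagram comparison amounts to three set operations on the two gene lists. The sketch below uses placeholder gene identifiers rather than genes from the study; the function name is hypothetical.

```python
def venn_partition(list_x, list_y):
    """Split two gene lists into the three Venn diagram categories."""
    x, y = set(list_x), set(list_y)
    return {
        "X not Y": x - y,   # shared by superficial and invasive tumors
        "X and Y": x & y,   # superficial tumor-specific patterns
        "Y not X": y - x,   # patterns characteristic of invasive tumors
    }


# Placeholder identifiers, for illustration only.
parts = venn_partition(["G1", "G2", "G3", "G4"], ["G3", "G4", "G5"])
sizes = {name: len(genes) for name, genes in parts.items()}
```

With the study's gene lists, the three categories would contain 1,546, 508, and 394 genes, respectively.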

Fig 1. Comparison of gene lists from two independent statistical tests. (A) Venn diagram of genes selected by univariate testing (two-sample t test) with multivariate permutation testing (10,000 random permutations). The blue circle (gene list X) represents genes differentially expressed between nontumor tissues and superficial tumors. The red circle (gene list Y) represents genes differentially expressed between superficial and invasive tumors. We applied a cutoff of P < .001 to retain genes whose expression was significantly different between the two groups of tissues examined. In each comparison, the maximum allowed number of genes with false-positive results was 10, and the probability of getting the selected number of genes significant by chance (at the P = .001 level) if there were no real differences between the groups was 0. In addition, only genes with at least a 1.5-fold difference between the two groups compared were considered for further analysis. The expression of 394 genes (Y not X category) was found to differ significantly between superficial and invasive tumors. (B) Expression patterns of selected genes in the Venn diagram. The data are presented in matrix format in which rows represent individual genes and columns represent each tissue. Each cell in the matrix represents the expression level of a gene feature in an individual tissue. The red and green colors in cells reflect high and low expression levels, respectively, as indicated in the scale bar (log2-transformed scale). Red and blue bars on the left side of the heat map represent the respective genes in the Venn diagram. Colored bars at the top of the heat map represent tissues.
The oncogene E2F1 was one of 162 genes we identified as being up-modulated in invasive tumors. Given the statistical significance of E2F1 expression (P < .001 by the two-sample t test) and its roles in tumorigenesis (Appendix Fig A2A, online only), we further tested its association with invasion in bladder tumors. Expression of E2F1 was examined in an independent patient group from a previous study that collected expression data from 105 Spanish patients with bladder tumors (33 superficial tumors and 72 invasive tumors).24 Expression of E2F1 was significantly up-modulated in the invasive tumors in this cohort (P < .001 and P < .001 by two-sample t test; Appendix Figs A2B and A2C), strongly indicating that activation of E2F1 might be a critical genetic event in the development or progression of invasive bladder tumors.
Because E2F1 is a TF, we tested whether E2F binding sites are enriched in the promoters of genes whose expression is significantly modulated in invasive tumors. Applying a robust analysis strategy, gene set enrichment analysis,26,27 we found that E2F binding sites were significantly enriched in promoters of genes whose expression is strongly associated with invasive tumors (Appendix Table A1, online only), supporting the association of E2F1 activation with invasive bladder tumors. Notably, enrichment of binding sites in promoters also indicates that many of these genes are direct targets of E2F1. We next examined whether the E2F TF-specific gene expression signature is enriched in invasive tumor by cross comparing the expression signatures of five oncogenes (CTNNB1, SRC, E2F3, RAS, and MYC)28 with gene expression data from the 165 bladder tumors (Appendix Fig A3 and Table A2, online only). The results of analysis showed that only E2F3-specific gene expression signatures are associated with invasive tumor, supporting our previous observation in promoter sequence analysis.
During data analyses, we noticed that expression of E2F1 was not uniformly absent in superficial tumors. This observation led us to explore the possibility that expression and/or activation of E2F1 in superficial tumors is indicative of progression of superficial tumors to invasive tumors. We first re-examined expression of E2F1 in superficial tumors and divided the 103 superficial tumors into two groups according to the expression level of E2F1. Progression of superficial tumors to invasive tumors was significantly more frequent in the E2F1-high (EH) group (upper 50th percentile) than in the E2F1-low (EL) group (lower 50th percentile; P = .012 by log-rank test; Appendix Fig A4, online only). In addition, receiver operating characteristic analysis for predicting progression within 3 years showed a significant area under the curve (0.706; Appendix Fig A5, online only). Both results point to the involvement of E2F1 in the progression to invasive tumor.
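The median split and the two-group log-rank comparison can be sketched as follows. This is a textbook log-rank test written for illustration, not the software used in the study; samples exactly at the median are assigned to the low group here, which is an assumption.

```python
import math
from statistics import median


def median_split(values):
    """1 for values above the median (high group), 0 otherwise (low group)."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square, P value) with 1 df.

    times  : follow-up time per patient
    events : 1 if progression was observed, 0 if censored
    groups : 1 (e.g. E2F1 high) or 0 (e.g. E2F1 low)
    """
    obs_minus_exp = 0.0
    var = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        # Patients still at risk just before time t.
        at_risk = [(g, ti == t and bool(e))
                   for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for g, _ in at_risk if g == 1)
        d = sum(1 for _, ev in at_risk if ev)
        d1 = sum(1 for g, ev in at_risk if ev and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var if var > 0 else 0.0
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

Calling `logrank_test(times, events, median_split(e2f1_expression))`, where `e2f1_expression` is a hypothetical list of per-tumor expression values, would compare progression between the two groups.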
Expression of E2F1 is not necessarily the only indicator of the activation of E2F1 because many different mechanisms regulate E2F1 activity.29,30 Many studies have demonstrated that gene expression signatures of the direct downstream targets of a TF reflect the activation of that TF. We therefore sought to identify a gene expression signature under the direct influence of E2F1 activation during progression and to use the signature to predict the likelihood of tumor progression. We identified 1,516 gene features whose expression correlated in trans with that of E2F1 (P < .001 by Pearson correlation; r < –0.4 or r > 0.4). On the basis of hierarchical clustering analysis of the expression patterns of these genes, we divided patients with superficial tumors into two groups—EH and EL patients. Consistent with our previous observation, the progression rate of EH patients was significantly higher than that of the EL patients (P = .002 by log-rank test; Fig 2).
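The correlation-based selection can be illustrated with the sketch below, which keeps genes whose Pearson correlation with a reference gene exceeds 0.4 in absolute value. The additional P < .001 filter is omitted for brevity, and the function names and toy expression values are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / (sxx * syy) ** 0.5


def correlated_genes(expr, reference, r_cutoff=0.4):
    """Genes with |r| > r_cutoff against the reference gene's profile.

    expr : dict mapping gene name to a list of expression values per tumor
    """
    ref = expr[reference]
    selected = {}
    for gene, profile in expr.items():
        if gene == reference:
            continue
        r = pearson_r(profile, ref)
        if abs(r) > r_cutoff:
            selected[gene] = r
    return selected


# Toy data: A tracks E2F1, B is anticorrelated, C is uncorrelated.
toy = {
    "E2F1": [1, 2, 3, 4, 5],
    "A": [2, 4, 6, 8, 10],
    "B": [5, 4, 3, 2, 1],
    "C": [1, -1, 0, -1, 1],
}
signature = correlated_genes(toy, "E2F1")
```

Both positively and negatively correlated genes are retained, matching the two-sided cutoff (r < −0.4 or r > 0.4) in the text.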

Fig 2. Correlation of gene expression with E2F1 expression in superficial bladder cancer. (A) Genes with expression patterns highly correlated with that of E2F1 were selected for cluster analysis (P < .001, r < −0.4 or r > 0.4). One thousand five hundred sixteen gene features were selected for analysis. Patients were divided into the following two groups: E2F1-high (EH) cluster and E2F1-low (EL) cluster. (B) Kaplan-Meier plots of progression-free survival of patients with superficial bladder cancer. The presence of the E2F1 signature in superficial tumors was strongly indicative of progression of superficial tumors to invasive tumors (P = .002, log-rank test).
We next sought to validate our findings by using gene expression data from an independent cohort of European patients with bladder tumors.31 To overcome the idiosyncrasies of any one particular prediction algorithm, we adopted a previous strategy using five different statistical methods to test the robustness of our signature-based prediction of the risk of progression to invasive tumors (Fig 3A).14,32,33 Briefly, we identified the genes most differentially expressed between the EH and EL subgroups in the Korean cohort (the training set). These genes were combined to form a series of classifiers that estimate the probability that a particular bladder tumor belongs to subgroup EH or EL. The number of genes in the classifiers was optimized to minimize misclassification during the LOOCV of the tumors in the training set (Appendix Table A3, online only). When applied to the European cohort of 353 patients with superficial tumors (the test set), all five models produced consistent prediction patterns. Kaplan-Meier plots in the test set showed significant differences in risk of progression between patients in the EH and EL subgroups (Fig 3B). These results not only demonstrate a strong association between gene expression patterns and progression, but also provide strong evidence of the reliability of the prediction.
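As one example of such a classifier, a compound covariate predictor can be sketched as below: each selected gene is weighted by its two-sample t statistic in the training set, and a test tumor is assigned to EH or EL by comparing its weighted sum with the midpoint of the two class means. This is a simplified illustration and not the implementation used in the study; function names are hypothetical, and the number of genes is passed in rather than optimized by LOOCV.

```python
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if sp2 == 0:
        return 0.0
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


def train_ccp(X, y, n_genes):
    """Train on samples X (lists of gene values) with labels 'EH' or 'EL'."""
    t = []
    for g in range(len(X[0])):
        eh = [x[g] for x, lab in zip(X, y) if lab == "EH"]
        el = [x[g] for x, lab in zip(X, y) if lab == "EL"]
        t.append(t_statistic(eh, el))
    # Keep the genes most differentially expressed between EH and EL.
    top = sorted(range(len(t)), key=lambda g: abs(t[g]), reverse=True)[:n_genes]
    weights = {g: t[g] for g in top}
    scores = [sum(w * x[g] for g, w in weights.items()) for x in X]
    eh_mean = mean(s for s, lab in zip(scores, y) if lab == "EH")
    el_mean = mean(s for s, lab in zip(scores, y) if lab == "EL")
    return weights, (eh_mean + el_mean) / 2


def predict_ccp(model, x):
    """Classify one test sample using the trained weights and threshold."""
    weights, threshold = model
    score = sum(w * x[g] for g, w in weights.items())
    return "EH" if score > threshold else "EL"
```

For the threshold learned in the training set to be meaningful in the test set, the two data sets need to be on a comparable scale, which is consistent with the independent centering of the two cohorts described in the methods.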

Fig 3. Construction of prediction models and evaluation of predicted outcomes. (A) A schematic overview of the strategy used for the construction of prediction models and evaluation of predicted outcomes based on gene expression signatures. (B) Kaplan-Meier plots of progression of patients with bladder cancer in validation sets predicted by compound covariate predictor (CCP), Bayesian compound covariate predictor (BCCP), linear discriminant analysis (LDA), nearest centroid (NC), and support vector machines (SVM). In addition to the five prediction methods, hierarchical clustering analysis (HCA) was applied to gene expression data from a European cohort, and it stratified patients with bladder cancer into two groups. The differences between groups were significant as indicated (log-rank test). (+) indicates censored data. EH, E2F1 high; EL, E2F1 low.
Because tumor stage represents distinctly diverse disease characteristics including prognosis, we next tested whether the newly identified signature is independent of tumor stage (Ta and T1) in superficial tumor. As expected, prognosis of stage Ta was significantly better than that of stage T1 in the validation cohort (Fig 4A). When the signature-based stratification was applied to stage Ta and T1 separately, the signature successfully identified a population of high-risk patients in both stages (Figs 4B and 4C). In fact, when all of the stratifications were combined together, the signature identified patients in stage Ta whose risk of progression was higher than that of T1 (Fig 4D). Together, these data indicate that the signature provides information on the risk of progression independent of current staging systems. Because grade is another known risk factor of progression, we assessed the utility of the signature in patients with superficial tumors who differed only in grade (low and high grade). The grade was well associated with progression-free survival (Fig 4E). Within these groups, the signature clearly identified a high-risk population of both low- and high-grade patients with bladder cancer (Figs 4F and 4G). Moreover, the signature also identified patients in the low-grade subgroup whose risk of progression was higher than that of high-grade patients (Fig 4H). Taken together, these results demonstrate that the signature captures biologic differences among bladder cancers that are not included in the current staging criteria.
In the European validation cohort, the prognostic association between the signature and other known clinical and pathologic risk factors for progression-free survival of superficial bladder cancer was also assessed by univariate and multivariate analyses (Table 2). In addition to stage, grade, and Bacillus Calmette-Guérin/mitomycin treatment, which are already well-known risk factors, the progression signature was a significant risk factor for progression-free survival in univariate analysis. Multivariate Cox analysis that included all relevant pathologic variables revealed that the gene signature remained an independent risk factor for progression-free survival. In addition, we used a decrease in concordance index approach to estimate how much the new signature can improve the predictive accuracy of progression.34,35 Briefly, using the six variables in Table 2, six prediction models each lacking one variable were generated and compared with the full model containing all variables. In each comparison, the degree of decrease in predictive accuracy was estimated by measuring the decrease in concordance index after omitting one variable. The biggest decrease in concordance index was observed when the new gene expression signature was omitted from the prediction model (Appendix Table A4, online only). Taken together, these findings suggest that the signature retains its prognostic relevance even after the classical pathologic prognostic features have been taken into account and is the most significant contributor in predicting progression of superficial bladder tumors.
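The decrease in concordance index comparison can be sketched as follows. Fitting the Cox models is outside the scope of this sketch; the risk scores from the full model and from each reduced model are assumed to be available already, and the function names are hypothetical. The concordance index is computed here in Harrell's form, with tied risk scores counted as half concordant.

```python
def concordance_index(times, events, risk):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is usable when patient i has an observed event before
    the follow-up time of patient j; it is concordant when i also has
    the higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def decrease_in_concordance(times, events, full_risk, reduced_risks):
    """Drop in concordance index when each variable is omitted.

    reduced_risks : dict mapping the omitted variable's name to the risk
                    scores of the model fitted without that variable
    """
    c_full = concordance_index(times, events, full_risk)
    return {name: c_full - concordance_index(times, events, scores)
            for name, scores in reduced_risks.items()}
```

The variable whose omission produces the largest drop contributes the most to predictive accuracy, which is the comparison summarized in Appendix Table A4.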
Table 2. Univariate and Multivariate Analyses of Risk Factors for Progression-Free Survival in the European Validation Cohort
| Variable | Univariate Hazard Ratio | Univariate 95% CI | Univariate P | Multivariate Hazard Ratio | Multivariate 95% CI | Multivariate P |
|---|---|---|---|---|---|---|
| Progression signature (SVM)* | 3.19 | 2.02 to 5.05 | < .001 | 2.34 | 1.38 to 3.96 | .0016 |
| Stage (T1 or Ta) | 1.84 | 1.19 to 2.83 | .006 | 0.56 | 0.31 to 1.02 | .06 |
| Grade (high or low) | 2.54 | 1.54 to 4.17 | < .001 | 2.41 | 1.28 to 4.55 | .006 |
| Age | 1.05 | 1.03 to 1.08 | < .001 | 1.04 | 1.02 to 1.07 | .001 |
| Sex (male or female) | 0.81 | 0.47 to 1.39 | .45 | 0.89 | 0.51 to 1.55 | .68 |
| BCG/MMC treatment (yes or no) | 0.51 | 0.29 to 0.89 | .017 | 0.57 | 0.32 to 1.02 | .06 |
Abbreviations: SVM, support vector machines; BCG, Bacillus Calmette-Guérin; MMC, mitomycin.
*Predicted outcome from SVM in Figure 3 was used for analysis (E2F1 low or E2F1 high).
By systematic comparison of gene expression data from invasive and superficial bladder tumors, we have developed a method to predict the likelihood of recurrence-associated progression from superficial to invasive tumor after curative therapy.
Several independent but complementary approaches have been devised and implemented to ensure the reproducibility of our data analyses and to test the robustness of our prediction methods. We started with a Venn diagram approach to identify genes whose expression is unique to invasive tumors. Expression of E2F1 was strongly associated with invasive tumors in our Korean cohort (Fig 1). Several lines of evidence strongly support this association. First, the association of E2F1 expression with invasive tumors remained strong in an independent Spanish cohort (Appendix Fig A2). Second, when five gene expression signatures reflecting activation of different oncogenes were applied to the gene expression data from the Korean cohort, the gene expression signature reflecting activation of E2F3 (closest member to E2F1 among the E2F family) was the only one able to discriminate invasive tumors from superficial tumors (Appendix Fig A3 and Table A2). Third, gene set enrichment analysis revealed that E2F binding sites are significantly enriched in promoters of genes whose expression is strongly associated with invasive tumors (Appendix Table A1). These data strongly indicate that E2F1 expression is functionally associated with invasive tumor development.
Strikingly, both expression of E2F1 and the gene expression signature reflecting activation of E2F1 are strong predictors of the progression of superficial tumors to invasive tumors (Appendix Fig A4 and Fig 2). The robustness of predictive gene expression signatures was tested using five independent algorithms in gene expression data from a larger, independent European cohort (Fig 3).
Because E2F1 is the best known direct downstream target of RB1,36 our results strongly support the notion of involvement of RB1 in tumor progression. Previous studies showed a strong association between RB1 dysregulation and invasive bladder tumors,37,38 although the mechanism of its involvement during progression is not clearly understood. Our gene network-based pathway analyses have provided key insights into the pathogenesis of tumor progression possibly involving inactivation of RB1. The higher biologic activity of E2F1 suggests that it might be the major driving force during progression.
In conclusion, our findings show that a prognostic molecular signature that can predict the likelihood of progression of superficial bladder tumors exists at disease presentation. Furthermore, unequal distribution of expression patterns reflecting activation of E2F1 in subtypes (EL and EH) with different progression rates supports the notion that distinct molecular features of tumor revealed in gene expression govern the clinical phenotypes of superficial bladder cancer. However, identification of patients with high-risk superficial bladder tumors would not necessarily indicate improvement of patient treatment unless appropriate treatments for patients with different risk are developed. Future study should focus on identification of a limited number of markers that still harbor the robustness of our new gene expression signatures but are small enough that their expression can be measured using simple technology like real-time quantitative reverse transcriptase polymerase chain reaction.
Supported by the Basic Science Research Program through the National Research Foundation funded by the Korean Ministry of Education, Science and Technology (MEST) Grants No. 2009-0063260 (S.-H.L.) and 2009-0083310 (I.-S.C.); 21C Frontier Functional Human Genome Project Grants No. FG06-11-06 (S.-H.L.) and FG-4-2 (J.-S.L.) of MEST; the intramural program of the Korea Research Institute of Bioscience and Biotechnology (I.-S.C.), and the intramural faculty fund from the University of Texas M. D. Anderson Cancer Center (J.-S.L.).
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
The author(s) indicated no potential conflicts of interest.
Conception and design: Ju-Seog Lee, Sun-Hee Leem, In-Sun Chu
Provision of study materials or patients: Yong-June Kim, Wun-Jae Kim
Collection and assembly of data: Sun-Hee Leem, Sang-Yeop Lee, Seon-Kyu Kim
Data analysis and interpretation: Ju-Seog Lee, Sun-Hee Leem, Sang-Cheol Kim, Eun-Sung Park, Sang-Bae Kim, In-Sun Chu
Manuscript writing: Ju-Seog Lee, Sun-Hee Leem, In-Sun Chu
Final approval of manuscript: Ju-Seog Lee, Sun-Hee Leem, Sang-Yeop Lee, Sang-Cheol Kim, Eun-Sung Park, Sang-Bae Kim, Seon-Kyu Kim, Yong-June Kim, Wun-Jae Kim, In-Sun Chu
1. DM Parkin, F Bray, J Ferlay, et al: Global cancer statistics, 2002. CA Cancer J Clin 55:74–108, 2005
2. A Jemal, R Siegel, E Ward, et al: Cancer statistics, 2009. CA Cancer J Clin 59:225–249, 2009
3. CP Dinney, DJ McConkey, RE Millikan, et al: Focus on bladder cancer. Cancer Cell 6:111–116, 2004
4. XR Wu: Urothelial tumorigenesis: A tale of divergent pathways. Nat Rev Cancer 5:713–725, 2005
5. GD Steinberg, DL Trump, KB Cummings: Metastatic bladder cancer: Natural history, clinical course, and consideration for treatment. Urol Clin North Am 19:735–746, 1992
6. M Liebert, J Seigne: Characteristics of invasive bladder cancers: Histological and molecular markers. Semin Urol Oncol 14:62–72, 1996
7. JM Pow-Sang, JD Seigne: Contemporary management of superficial bladder cancer. Cancer Control 7:335–339, 2000
8. HW Herr: Tumour progression and survival in patients with T1G3 bladder tumours: 15-year outcome. Br J Urol 80:762–765, 1997
9. S Holmäng, H Hedelin, C Anderstrom, et al: Recurrence and progression in low grade papillary urothelial tumors. J Urol 162:702–707, 1999
10. SE Lee, IG Jeong, JH Ku, et al: Impact of transurethral resection of bladder tumor: Analysis of cystectomy specimens to evaluate for residual tumor. Urology 63:873–877, 2004
11. S Holmang, SL Johansson: Stage Ta-T1 bladder cancer: The relationship between findings at first followup cystoscopy and subsequent recurrence and progression. J Urol 167:1634–1637, 2002
12. AA Alizadeh, MB Eisen, RE Davis, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403:503–511, 2000
13. MJ van de Vijver, YD He, LJ van't Veer, et al: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347:1999–2009, 2002
14. JS Lee, IS Chu, J Heo, et al: Classification and prediction of survival in hepatocellular carcinoma by gene expression profiling. Hepatology 40:667–676, 2004
15. MC Hall, SS Chang, G Dalbagni, et al: Guideline for the management of nonmuscle invasive bladder cancer (stages Ta, T1, and Tis): 2007 update. J Urol 178:2314–2330, 2007
16. BM Bolstad, RA Irizarry, M Astrand, et al: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19:185–193, 2003
17. R Simon, A Lam, M-C Li, et al: Analysis of gene expression data using BRB-Array Tools. Cancer Inform 3:11–17, 2006
18. GW Wright, RM Simon: A random variance model for detection of differential gene expression in small microarray experiments. Bioinformatics 19:2448–2455, 2003
19. P Pavlidis, J Qin, V Arango, et al: Using the gene ontology for microarray data mining: A comparison of methods and application to age effects in human prefrontal cortex. Neurochem Res 29:1213–1222, 2004
20. MD Radmacher, LM McShane, R Simon: A paradigm for class prediction using gene expression profiles. J Comput Biol 9:505–511, 2002
21. S Dudoit, F Fridlyand, TP Speed: Comparison of discrimination methods for classification of tumors using DNA microarrays. J Am Stat Assoc 97:77–87, 2002
22. S Ramaswamy, P Tamayo, R Rifkin, et al: Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci U S A 98:15149–15154, 2001
23. R Simon, MD Radmacher, K Dobbin, et al: Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 95:14–18, 2003
24. M Sanchez-Carbayo, ND Socci, J Lozano, et al: Defining molecular profiles of poor outcome in patients with invasive bladder cancer using oligonucleotide microarrays. J Clin Oncol 24:778–789, 2006
25. L Dyrskjøt, T Thykjaer, M Kruhoffer, et al: Identifying distinct classes of bladder carcinoma using microarrays. Nat Genet 33:90–96, 2003
26. SB Kim, S Yang, SK Kim, et al: GAzer: Gene set analyzer. Bioinformatics 23:1697–1699, 2007
27. A Subramanian, P Tamayo, VK Mootha, et al: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545–15550, 2005
28. AH Bild, G Yao, JT Chang, et al: Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 439:353–357, 2006
29. JR Nevins: The Rb/E2F pathway and cancer. Hum Mol Genet 10:699–703, 2001
30. O Stevaux, NJ Dyson: A revised picture of the E2F transcriptional network and RB function. Curr Opin Cell Biol 14:684–691, 2002
31. L Dyrskjøt, K Zieger, FX Real, et al: Gene expression signatures predict outcome in non-muscle-invasive bladder carcinoma: A multicenter validation study. Clin Cancer Res 13:3545–3551, 2007
32. JS Lee, IS Chu, A Mikaelyan, et al: Application of comparative functional genomics to identify best-fit mouse models to study human cancer. Nat Genet 36:1306–1311, 2004
33. JS Lee, J Heo, L Libbrecht, et al: A novel prognostic subtype of human hepatocellular carcinoma derived from hepatic progenitor cells. Nat Med 12:410–416, 2006
34. MW Kattan: Evaluating a new marker's predictive contribution. Clin Cancer Res 10:822–824, 2004
35. MW Kattan: Judging new markers by their ability to improve predictive accuracy. J Natl Cancer Inst 95:634–635, 2003
36. N Dyson: The regulation of E2F by pRB-family proteins. Genes Dev 12:2245–2262, 1998
37. RJ Cote, MD Dunn, SJ Chatterjee, et al: Elevated and absent pRb expression is associated with bladder cancer progression and has cooperative effects with p53. Cancer Res 58:1090–1094, 1998
38. P Cairns, AJ Proctor, MA Knowles: Loss of heterozygosity at the RB locus is frequent and correlates with muscle invasion in bladder carcinoma. Oncogene 6:2305–2309, 1991
Acknowledgment
We thank C.P.N. Dinney and D. McConkey for critical reading of the manuscript.

Fig A1. Hierarchical clustering of gene expression data of human bladder tissues. Gene expression data were collected from 233 bladder tissues (165 tumors, 58 surrounding tissues, and 10 normal tissues). Genes with an expression ratio that has at least a two-fold difference relative to the median gene expression level across all tissues in at least 25 tissues were selected for hierarchical analysis (5,751 gene features). The data are presented in matrix format in which rows represent individual genes and columns represent each tissue. Each cell in the matrix represents the expression level of a gene feature in an individual tissue. The red and green colors in cells reflect high and low expression levels, respectively, as indicated in the scale bar (log2-transformed scale). The dendrogram lists tissues examined.
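
The gene filter described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the data are log2-transformed ratios and that the median is taken per gene across all tissues. The function name, parameter names, and toy values are hypothetical.

```python
import math
from statistics import median


def filter_variable_genes(log2_expr, min_fold=2.0, min_tissues=25):
    """Keep genes that differ from their own median by >= min_fold in >= min_tissues tissues.

    log2_expr maps a gene name to a list of log2 expression ratios, one per tissue.
    """
    threshold = math.log2(min_fold)  # a two-fold difference is 1.0 on the log2 scale
    kept = []
    for gene, values in log2_expr.items():
        center = median(values)
        n_outlying = sum(1 for v in values if abs(v - center) >= threshold)
        if n_outlying >= min_tissues:
            kept.append(gene)
    return kept


# Toy data (hypothetical): only 6 tissues, so the tissue cutoff is lowered to 2.
toy = {
    "GENE_A": [0.0, 0.1, -0.1, 2.0, -1.5, 0.0],  # two tissues deviate by >= 1 log2 unit
    "GENE_B": [0.2, 0.1, 0.0, -0.1, 0.3, 0.2],   # nearly flat across tissues
}
selected = filter_variable_genes(toy, min_fold=2.0, min_tissues=2)
print(selected)
```

With the defaults (`min_fold=2.0`, `min_tissues=25`) the function mirrors the thresholds stated in the caption.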

Fig A2. Expression of E2F1 in superficial and invasive bladder tumors. (A) Expression of E2F1 in Korean patients with bladder cancer. Gene expression ratios on the x-axis indicate relative expression of E2F1 in tumors compared with normal bladder tissues. When identifying genes that might best represent the association with invasive tumors, both the statistical significance and the biologic activity of genes were considered. For biologic activity, transcriptional activity (ie, transcription factors) was given the most weight because the biologic activity of a transcription factor is best reflected in gene expression patterns. E2F1 was the only transcription factor with a statistically significant difference in expression when invasive tumors were compared with superficial tumors (P < .001, two-sample t test) and surrounding tissues (P < .001, two-sample t test). (B and C) Expression of E2F1 in Spanish patients with bladder cancer (105 patients; 33 superficial tumors and 72 invasive tumors).24 Gene expression data from the Spanish cohort were generated by using the Affymetrix (Santa Clara, CA) U133A microarray platform. Two E2F1 probes are present on the U133A array, and their probe identification numbers are shown as indicated.

Fig A3. A dendrogram and heat map overview of the two-way hierarchical cluster analysis of gene expression data from 165 bladder tumors and E2F3-specific gene expression data. The data are presented in matrix format in which columns represent individual samples and rows represent each gene. Each cell in the matrix represents the expression level of a gene feature in an individual sample. The red and green colors in cells reflect high and low expression levels, respectively, as indicated in the scale bar (log2-transformed scale). Colored bars between the dendrogram and heat map represent samples, as indicated at the end of each row. To estimate statistical significance, two-way contingency table analysis with the χ2 test was applied. The data analysis is described as follows. Oncogene-specific gene expression data (CTNNB1, SRC, E2F3, RAS, and MYC) and bladder tumor gene expression data were independently centralized across the samples and combined, and hierarchical clustering was applied to the combined data. We first asked whether the activated E2F3 signature could distinguish invasive tumors from superficial tumors. We considered tumors coclustered with activated E2F3 in human cells to have activation of the E2F3 oncogene. Fifty-four of 63 invasive tumors were coclustered with E2F3-specific gene expression signatures, whereas 64 of 103 superficial tumors were excluded from the E2F3 signature cluster. The sensitivity and specificity for predicting invasive tumors were 0.85 and 0.62, respectively. In contrast, the other oncogene-specific gene expression signatures could not distinguish invasive tumors from superficial tumors (Appendix Table A2), suggesting that activation of E2F transcription factors in bladder tumors is more strongly associated with invasive tumors than activation of any other oncogene examined in the analysis. However, because of the limited scope of the analysis (small number of oncogenes and limitations of clustering methods in estimating signature similarity) and the low specificity (0.62) of prediction, we cannot rule out the possibility that other oncogenes (not examined in this analysis) might also be associated with invasive tumors.
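
The centering-and-combining step in this caption can be illustrated with a short sketch. It assumes that "centralized across the samples" means subtracting each gene's mean across the samples of one data set (median centering would be an equally plausible reading) and that only genes shared by both data sets are kept. The function names and toy values are hypothetical, and the hierarchical clustering that follows is not shown.

```python
from statistics import mean


def center_across_samples(expr):
    """Subtract each gene's mean across samples (assumed meaning of 'centralized')."""
    centered = {}
    for gene, values in expr.items():
        gene_mean = mean(values)
        centered[gene] = [v - gene_mean for v in values]
    return centered


def combine_datasets(expr_a, expr_b):
    """Center each data set independently, then join the samples of the shared genes."""
    a = center_across_samples(expr_a)
    b = center_across_samples(expr_b)
    shared = sorted(set(a) & set(b))
    return {gene: a[gene] + b[gene] for gene in shared}


# Toy values (hypothetical): two tumor samples and three oncogene-signature samples.
tumors = {"G1": [1.0, 3.0], "G2": [0.0, 2.0]}
signature = {"G1": [10.0, 12.0, 14.0], "G3": [5.0, 5.0, 5.0]}

combined = combine_datasets(tumors, signature)
print(combined)
```

Centering each data set on its own removes platform-level offsets before the samples are placed in one matrix, which is why the two sources can then be clustered together.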

Fig A4. Expression of E2F1 in superficial tumors and its association with progression to invasive tumor. (A) Relative expression of E2F1 in superficial tumors. Superficial tumors were ranked according to relative expression level of E2F1, and tumors were subdivided into two groups as indicated in different colors. (B) Kaplan-Meier plots of progression of patients with superficial bladder cancer. Higher expression of E2F1 in superficial tumors is significantly associated with progression to invasive tumors. PROG, progression.
|
| Rank* | Gene Set | TRANSFACid | No. of Genes† | P‡ (LS permutation) | P‡ (KS permutation) |
|---|---|---|---|---|---|
| 1 | MYC | T00140 | 688 | < .001 | < .001 |
| 2 | CEBPA | T00105 | 148 | < .001 | < .001 |
| 3 | CEBPB | T00581 | 61 | < .001 | < .001 |
| 4 | CREB1 | T00163 | 177 | < .001 | < .001 |
| 5 | E2F1 | T01542 | 436 | < .001 | < .001 |
| 6 | E2F2 | T01544 | 138 | < .001 | < .001 |
| 7 | E2F4 | T01546 | 223 | < .001 | < .001 |
| 8 | EGR1 | T00241 | 99 | < .001 | < .001 |
| 9 | ERG | T00265 | 21 | < .001 | < .001 |
| 10 | ESR1 | T00261 | 102 | < .001 | < .001 |
| 11 | ETS1 | T00112 | 161 | < .001 | < .001 |
| 12 | FLI1 | T02066 | 36 | < .001 | < .001 |
| 13 | JUN | T00029 | 212 | < .001 | < .001 |
| 14 | MYB | T00137 | 160 | < .001 | < .001 |
| 15 | NFIC | T00176 | 109 | < .001 | < .001 |
| 16 | NFKB1 | T00591 | 196 | < .001 | < .001 |
| 17 | POU2F1 | T00641 | 102 | < .001 | < .001 |
| 18 | RARA | T00719 | 93 | < .001 | < .001 |
| 19 | REL | T00168 | 24 | < .001 | < .001 |
| 20 | RELA | T00594 | 83 | < .001 | < .001 |
| 21 | SP1 | T00759 | 306 | < .001 | < .001 |
| 22 | SPI1 | T02068 | 83 | < .001 | < .001 |
| 23 | TFAP2A | T00035 | 305 | < .001 | < .001 |
| 24 | TP53 | T00671 | 267 | < .001 | < .001 |
| 25 | USF1 | T00874 | 103 | < .001 | < .001 |
Abbreviations: LS, logarithmic signature; KS, Kolmogorov-Smirnov.
*Gene list of transcription factors is ranked by P values of the LS permutation test.
†We used curated transcription factor binding sites available from the Transcriptional Regulatory Element Database (http://rulai.cshl.edu/TRED) to eliminate binding sites without experimental verification.
‡To avoid potential false-positive results, only transcription factor binding sites with P < .001 in both the LS and KS permutation tests are included in the list.
|
| Cluster | No. of Invasive Tumors | No. of Superficial Tumors | Sensitivity* | Specificity* |
|---|---|---|---|---|
| E2F3† | 54 | 39 | 0.85 | 0.62 |
| Control | 9 | 64 | | |
| MYC | 31 | 90 | 0.49 | 0.12 |
| Control | 32 | 13 | | |
| RAS | 27 | 67 | 0.42 | 0.34 |
| Control | 36 | 36 | | |
| SRC | 14 | 71 | 0.22 | 0.31 |
| Control | 49 | 32 | | |
| CTNNB1 | 26 | 82 | 0.41 | 0.20 |
| Control | 37 | 21 | | |
*Sensitivity and specificity for predicting invasive tumors with each defined oncogene-specific gene expression signature.
†By χ2 test, P < .001.
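
The sensitivity, specificity, and χ2 result for the E2F3 cluster can be recomputed from the counts in the table. The sketch below is an illustration under assumptions: it uses the Pearson χ2 statistic without continuity correction (the source does not say which variant was applied), and the function names are hypothetical.

```python
import math


def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for predicting the positive class."""
    return tp / (tp + fn), tn / (fp + tn)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts from the E2F3 and Control rows of the table.
invasive_in, superficial_in = 54, 39    # tumors inside the E2F3 cluster
invasive_out, superficial_out = 9, 64   # tumors outside the cluster (Control)

sens, spec = sensitivity_specificity(
    tp=invasive_in, fn=invasive_out, fp=superficial_in, tn=superficial_out
)
stat, p = chi_square_2x2(invasive_in, superficial_in, invasive_out, superficial_out)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} chi2={stat:.2f} p={p:.1e}")
```

Here an invasive tumor inside the cluster counts as a true positive and a superficial tumor outside it as a true negative, which matches the definition in the first footnote.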
|
| Classifier | Class | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| CCP | EH | 0.83 | 0.929 | 0.907 | 0.867 |
| CCP | EL | 0.929 | 0.83 | 0.867 | 0.907 |
| LDA | EH | 0.872 | 0.964 | 0.953 | 0.9 |
| LDA | EL | 0.964 | 0.872 | 0.9 | 0.953 |
| NC | EH | 0.787 | 0.911 | 0.881 | 0.836 |
| NC | EL | 0.911 | 0.787 | 0.836 | 0.881 |
| SVM | EH | 0.915 | 0.857 | 0.843 | 0.923 |
| SVM | EL | 0.857 | 0.915 | 0.923 | 0.843 |
| BCCP | EH | 0.702 | 0.875 | 0.825 | 0.778 |
| BCCP | EL | 0.875 | 0.702 | 0.778 | 0.825 |
NOTE. All classifiers have P < .01. The probability of obtaining a small cross-validated misclassification rate by chance was obtained by repeating the entire cross-validation procedure using 100 random permutations of the class labels for the clinical criteria being evaluated; this gave rise to a classifier P value. Sensitivity is the probability of a class A sample being correctly predicted as class A. Specificity is the probability of a non–class A sample being correctly predicted as non–class A. PPV is the probability that a sample predicted as class A actually belongs to class A. NPV is the probability that a sample predicted as non–class A actually does not belong to class A. For class EH, if n11 = No. of class EH samples predicted as EH, n12 = No. of class EH samples predicted as EL, n21 = No. of EL samples predicted as EH, and n22 = No. of EL samples predicted as EL, then sensitivity = n11/(n11 + n12), specificity = n22/(n21 + n22), PPV = n11/(n11 + n21), and NPV = n22/(n12 + n22).
Abbreviations: PPV, positive predictive value; NPV, negative predictive value; CCP, compound covariate predictor; EH, E2F1 high; EL, E2F1 low; LDA, linear discriminant analysis; NC, nearest centroid; SVM, support vector machine; BCCP, Bayesian compound covariate predictor.
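
The four formulas in the note translate directly into code. In the sketch below the counts are hypothetical: the source reports only the resulting proportions, and these values were chosen so that the output rounds to the LDA rows of the table. The function name is also an assumption.

```python
def class_metrics(n11, n12, n21, n22):
    """Metrics for class A, with counts named as in the table note.

    n11: class A predicted as A        n12: class A predicted as non-A
    n21: non-A predicted as A          n22: non-A predicted as non-A
    """
    return {
        "sensitivity": n11 / (n11 + n12),
        "specificity": n22 / (n21 + n22),
        "ppv": n11 / (n11 + n21),
        "npv": n22 / (n12 + n22),
    }


# Hypothetical counts (not reported in the source), with EH as class A.
eh = class_metrics(n11=41, n12=6, n21=2, n22=54)

# Treating EL as class A swaps the roles of the two classes.
el = class_metrics(n11=54, n12=2, n21=6, n22=41)

for name, metrics in (("EH", eh), ("EL", el)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Because there are only two classes, the sensitivity of EH equals the specificity of EL and the PPV of EH equals the NPV of EL, which explains the mirrored pairs of rows in the table.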
|
| Variable | Decrease in Concordance Index | 95% CI | P |
|---|---|---|---|
| Progression signature | 0.035 | 0.004 to 0.075 | .01 |
| Stage | −0.002 | −0.015 to 0.0094 | .68 |
| Grade | 0.01 | −0.007 to 0.034 | .15 |
| Age | 0.008 | −0.016 to 0.037 | .27 |
| Sex | 0.003 | −0.006 to 0.021 | .28 |
| BCG/MMC treatment | 0.018 | −0.007 to 0.048 | .02 |
Abbreviations: BCG, Bacillus Calmette-Guérin; MMC, mitomycin.
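
A decrease in concordance index can be illustrated as the difference between the index of a full model and that of a model with one variable removed. The sketch below assumes Harrell's concordance index for right-censored data and uses hypothetical follow-up times, event indicators, and risk scores. It does not fit a survival model or reproduce the confidence intervals and P values in the table.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair is comparable when the subject with the shorter follow-up had an event.
    It is concordant when that subject also has the higher risk score; ties count 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical data: months to progression or last follow-up, and event flags.
times = [5, 8, 12, 20, 30, 40]
events = [1, 1, 0, 1, 0, 0]

# Hypothetical risk scores from a full model and from a model missing one variable.
full_model = [0.9, 0.7, 0.4, 0.15, 0.2, 0.1]
reduced_model = [0.9, 0.3, 0.4, 0.15, 0.2, 0.1]

c_full = concordance_index(times, events, full_model)
c_reduced = concordance_index(times, events, reduced_model)
decrease = c_full - c_reduced
print(f"full={c_full:.3f} reduced={c_reduced:.3f} decrease={decrease:.3f}")
```

A larger decrease means the removed variable contributed more to the model's ability to rank patients by risk.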


