Gene Expression across Normal and Tumor tissue

Recent examples of successful cancer therapeutics such as Gleevec, Herceptin, and Iressa suggest that the concept of ‘molecular targeted therapy’ is applicable to human cancers of diverse tissue and genetic origin (Stuart and Sellers, 2009). ‘Oncogene addiction’ describes the phenomenon in which the growth and survival of a tumor are impaired by the inactivation of a single oncogene (Weinstein and Joe, 2008). Several relationships between genetic alterations and corresponding targeted therapies are already established, and efforts to identify further alterations are underway. Known mechanisms of genetic alteration include mutation (EGFR in lung cancer), translocation (BCR-ABL in chronic myeloid leukemia), and gene amplification (ERBB2 in breast cancer) (Stuart and Sellers, 2009).

Interestingly, some oncogenes to which tumors are addicted are altered in only a subset of cancer patients. For example, ERBB2 is amplified and over-expressed in about 25-30% of breast cancer patients, whereas EGFR is mutated in about 20% of lung cancer patients. Cancer Outlier Profile Analysis (COPA) is a computational method that identifies genes that are pathogenically over-expressed in only a subset of patients. AGTR1 is an example of a potential target gene identified by applying the COPA method to the Oncomine database (Rhodes, et al., 2009).
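As a rough illustration of the outlier idea behind COPA, the sketch below median-centers one gene's expression values, scales them by the median absolute deviation (MAD), and reads off an upper percentile of the transformed values. This follows the commonly described form of COPA and is an assumption here, not the procedure documented for this database. The function names, the nearest-rank percentile rule, and the toy values are all hypothetical.

```python
from statistics import median

def copa_transform(values):
    """Median-center the values and scale them by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; the values cannot be scaled")
    return [(v - med) / mad for v in values]

def copa_percentile(values, q):
    """Return the q-th percentile (nearest-rank) of the transformed values."""
    scores = sorted(copa_transform(values))
    rank = (q * len(scores) + 99) // 100  # integer ceiling of q * n / 100
    return scores[max(rank, 1) - 1]

# Hypothetical expression values for one gene: nine typical samples, one outlier.
toy_gene = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0]
for q in (75, 90, 95):
    print(q, copa_percentile(toy_gene, q))
```

A gene whose upper-percentile score is far larger than its 75th-percentile score is over-expressed in only a few samples, which is the pattern of interest. Because the median and MAD are robust statistics, the outlier samples do not inflate the scale they are measured against.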

A database with a large sample size is a great advantage when searching for genes over-expressed in only a subset of patients. For example, identifying genes over-expressed in 50 out of 1000 patients is easier and more reliable than identifying genes over-expressed in 2 out of 40 patients, even though the fraction is 5% in both cases. Although the sample size of individual gene expression studies rarely exceeds one thousand, a data set of nearly ten thousand samples (the GeneSapiens database) has been created by a combined analysis of multiple data sets (Kilpinen, et al., 2008). Recent work has shown that analysis of a large microarray data set compiled from many data sets can reveal novel findings that are difficult to observe in the individual studies (Lukk, et al., 2010). For a combined analysis, data sets created on the Affymetrix platforms (i.e., U133A and U133Plus2) offer several advantages. First, a majority of gene expression data sets have been created using the Affymetrix platforms. Second, many data sets are accompanied by raw CEL files, so users can pre-process them as they wish. We have collected human tissue gene expression data sets produced using the Affymetrix U133A and U133Plus2 platforms from public resources, and constructed a large-scale gene expression database of more than 40,000 samples.
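To make the sample-size argument at the start of this paragraph concrete, the sketch below compares the uncertainty of the same observed fraction (5%) in the two cohorts, using a 95% Wilson score interval for a binomial proportion. The Wilson interval is simply one convenient choice and is an assumption here, not a method used to build the database. The function name is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# The two cohorts from the text: the same 5% fraction, different sample sizes.
small = wilson_interval(2, 40)
large = wilson_interval(50, 1000)
print("2 of 40:    %.3f to %.3f" % small)
print("50 of 1000: %.3f to %.3f" % large)
```

The interval for 2 of 40 is several times wider than the interval for 50 of 1000, so a subset of that size is much harder to distinguish from noise in the smaller cohort.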

Data Summary
 • Data sets
  - U133Plus2 : 306
  - U133A : 241
 • Samples
  - U133Plus2 : 24,300
  - U133A : 16,400
 • Probes
  - U133Plus2 : 54,613
  - U133A : 22,215
111 Gwahangno, Yuseong-gu, Daejeon, Korea TEL. +82-42-879-8116, FAX. +82-42-861-1759
Copyright © 2010 by Korea Research Institute of Bioscience and Biotechnology (KRIBB) All rights reserved.