Gene Expression across Normal and Tumor tissue

  - Data preprocessing and normalization

Whenever CEL files were available (288/306 for U133Plus2 and 191/242 for U133A), we pre-processed them with the MAS5 algorithm as implemented in the affy package (Gautier, et al., 2004). We chose MAS5 because it is a single-array algorithm: the expression values computed for one array do not depend on any other array in the data set. We then normalized each sample to a target density of 500. For data sets without CEL files that had already been pre-processed with MAS5 (18/306 for U133Plus2 and 51/242 for U133A), we used the expression measures downloaded from the web source and normalized them to the same target density of 500.
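
The scaling step can be illustrated with a minimal sketch. It assumes a MAS5-style rescaling in which each array is multiplied by a single factor so that its trimmed mean equals the target of 500. The function names, the 2% trim fraction, and the sample values are illustrative assumptions, not details taken from this database's pipeline, which used the affy package in R.

```python
from typing import List, Sequence


def trimmed_mean(values: Sequence[float], trim: float = 0.02) -> float:
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of values dropped from each tail
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)


def scale_to_target(
    values: Sequence[float], target: float = 500.0, trim: float = 0.02
) -> List[float]:
    """Rescale one array so that its trimmed mean equals `target`.

    Only the array itself is used, so the result for one sample
    does not change when other samples are added or removed.
    """
    factor = target / trimmed_mean(values, trim)
    return [v * factor for v in values]


# Two hypothetical arrays with the same profile but different overall brightness.
sample_a = [100.0, 200.0, 300.0, 400.0]
sample_b = [1000.0, 2000.0, 3000.0, 4000.0]

scaled_a = scale_to_target(sample_a)
scaled_b = scale_to_target(sample_b)
```

Because the factor is computed per array, values that were already pre-processed elsewhere can be brought to the same target without access to the other arrays, which is what makes a single-array approach convenient when combining many independent data sets.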

  - Data integration

We then classified each sample by tissue and disease type. Most samples (~80%) were classified as either cancer or normal; the remaining ~20% fell into other disease categories, including neurodegenerative, immune-related, and organ-specific diseases. We also collected expression data for more than 2500 samples comprising nearly 1000 different cancer cell lines across tissues and processed them with the same method. The system is implemented with an Apache web server, PHP scripts for data processing, R scripts for image production, and MySQL as the backend database.

Data Summary
 • Data sets
  - U133Plus2 : 306
  - U133A : 241
 • Samples
  - U133Plus2 : 24,300
  - U133A : 16,400
 • Probes
  - U133Plus2 : 54,613
  - U133A : 22,215
111 Gwahangno, Yuseong-gu, Daejeon, Korea. TEL. +82-42-879-8116, FAX. +82-42-861-1759
Copyright © 2010 by Korea Research Institute of Bioscience and Biotechnology (KRIBB) All rights reserved.