The data section lets users explore all 54613 entries of the CellLineNavigator database by tissue site (default) or disease state. By default, 50 entries are displayed per page; this value can be changed to 100, 250, or 500 per page. Entries can be sorted by Symbol, Transcript, Entrez Gene Id, or Probe ID. An expression filter is also provided, with the options 1.5fold, 2fold, 2.5fold, 3fold, 4fold, and 5fold. The default filter lists all genes whose expression level differs at least 2fold from the control (see Material / Data Sources). Alternatively, the user may set the filter to none (no filter) or no regulation (list all genes whose M-values are in the range of -1 to +1).
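The filter options above can be read as thresholds on the M-value (the log2 expression ratio against the control). The following is a minimal sketch of that reading, not the code behind the website. The function name keep_entry and the mode labels are hypothetical. It also assumes that an N-fold filter keeps both up- and down-regulated genes, i.e. entries with |M| >= log2(N), and that the boundaries are inclusive.

```python
import math

# Fold-change options listed in the text.
FOLD_OPTIONS = (1.5, 2, 2.5, 3, 4, 5)


def keep_entry(m_value, filter_mode="fold", fold=2.0):
    """Decide whether an entry with the given M-value passes the filter.

    filter_mode (hypothetical labels):
      "none"          - no filter, every entry is kept
      "no_regulation" - keep entries with M-values between -1 and +1
      "fold"          - keep entries with |M| >= log2(fold) (assumed reading)
    """
    if filter_mode == "none":
        return True
    if filter_mode == "no_regulation":
        return -1.0 <= m_value <= 1.0
    if filter_mode == "fold":
        return abs(m_value) >= math.log2(fold)
    raise ValueError("unknown filter mode: %r" % filter_mode)


# The default 2fold filter corresponds to |M| >= 1 under this assumption.
m_values = [-1.2, 0.5, 1.0, 2.3]
print([m for m in m_values if keep_entry(m)])
```

Under this reading, the no regulation option is the complement of the default 2fold filter, apart from the boundary values -1 and +1.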
To give users a high degree of flexibility in accessing CellLineNavigator, we implemented an advanced search section that offers Fulltext search and Explore profile options.
Download CellLineNavigator data in tab-separated text file format.
Genome-wide expression data, freely available as ArrayExpress experiment E-MTAB-37, were kindly provided by J. Greshock et al., Laboratory of Cancer Metabolism Drug Discovery, GlaxoSmithKline, USA. The transcript abundance of 317 cancer cell lines was analyzed using the Affymetrix HG-U133 Plus2 GeneChip technology. This chip covers the complete human genome, allowing analysis of over 45,000 transcripts, corresponding to more than 19,000 genes. All data were available in technical triplicates. Corresponding information on tissue site and disease state was provided for each cell line.
The analysis was implemented in R using the Bioconductor packages affy, hgu133plus2.db and frma (McCall et al. 2011, McCall et al. 2010).
Processing Affymetrix U133 Plus2:
After quality control, two microarray experiments (SNU398 - Replicate 1 and SNU423 - Replicate 2) were excluded from further analysis due to insufficient RNA level detection. All data were normalized using the expresso function of the affy package with the following settings: background adjustment method: mas; normalization method: quantiles; PM adjustment method: mas; method used for the computation of expression values: medianpolish. Next, we calculated the median expression for each probe set over all cell lines. These values were subsequently used as the control to calculate log2-transformed expression ratios (M-values), after the median expression had been calculated for each cancer cell line. M-values representing the expression levels of tissue sites and disease states were calculated accordingly. Official gene symbols and NCBI Entrez GeneIDs were assigned to the data using the hgu133plus2.db package.
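The M-value calculation described above can be illustrated with a short sketch. It is not the original R code. The function name m_values, the input layout and the example values are hypothetical. It assumes linear-scale expression values and assumes that the control is the median of the per-cell-line medians; if the values are already on a log2 scale, the ratio would become a difference.

```python
import math
from statistics import median


def m_values(intensities):
    """Compute M-values (log2 ratios against the control) per cell line.

    intensities: {cell_line: {probe_set: [replicate values]}}, linear scale
    (assumption). Returns {cell_line: {probe_set: M-value}}.
    """
    # Step 1: median expression per probe set within each cell line
    # (collapses the technical replicates).
    per_line = {
        line: {probe: median(reps) for probe, reps in probes.items()}
        for line, probes in intensities.items()
    }

    # Step 2: control = median per probe set over all cell lines.
    probe_ids = set()
    for probes in per_line.values():
        probe_ids.update(probes)
    control = {
        probe: median(v[probe] for v in per_line.values() if probe in v)
        for probe in probe_ids
    }

    # Step 3: M-value = log2(cell line median / control).
    return {
        line: {p: math.log2(val / control[p]) for p, val in probes.items()}
        for line, probes in per_line.items()
    }


# Made-up example with three cell lines and one probe set.
example = {
    "line_A": {"probe_1": [100, 110, 90]},
    "line_B": {"probe_1": [200, 210, 190]},
    "line_C": {"probe_1": [400, 390, 410]},
}
result = m_values(example)
print(result["line_A"]["probe_1"], result["line_C"]["probe_1"])
```

In this example the control for probe_1 is 200, so line_A has half the control expression and line_C has twice the control expression.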
Gene barcode generation / Z-Score Transformation:
Gene expression barcodes were generated using the frma (default options) and barcode (output: z-score) functions implemented in the frma package (McCall et al. 2011, McCall et al. 2010). A FRMA Z-Score of > 5 suggests that a gene is expressed in a particular tissue. The FRMA Z-Score was generated to allow comparison of the expression profiles with data already present at medicalgenomics.org and with other microarray datasets processed with the FRMA method. Finally, the Z-Scores were summarized by their mean for each cell line, tissue site and disease state. Official gene symbols and NCBI Entrez GeneIDs were assigned to the data using the hgu133plus2.db package.
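The summarization step can be sketched as follows, assuming the Z-Scores have already been computed per array. This is an illustration only: the function names summarize_z and is_expressed, the input layout and the example values are hypothetical, and the same grouping function is assumed to serve for cell lines, tissue sites and disease states.

```python
from statistics import mean

# Threshold from the text: a Z-Score > 5 suggests that a gene is expressed.
Z_EXPRESSED = 5.0


def summarize_z(z_scores, groups):
    """Average Z-Scores per probe set within each group.

    z_scores: {array_id: {probe_set: z_score}}
    groups:   {array_id: group label}, e.g. cell line, tissue site or
              disease state.
    Returns {group: {probe_set: mean z_score}}.
    """
    collected = {}
    for array_id, probes in z_scores.items():
        group = groups[array_id]
        for probe, z in probes.items():
            collected.setdefault(group, {}).setdefault(probe, []).append(z)
    return {
        group: {probe: mean(zs) for probe, zs in probes.items()}
        for group, probes in collected.items()
    }


def is_expressed(z_score):
    """Apply the Z-Score > 5 rule of thumb from the text."""
    return z_score > Z_EXPRESSED


# Made-up example: two arrays assigned to the same tissue site.
z = {
    "array_1": {"probe_1": 6.0, "probe_2": 1.0},
    "array_2": {"probe_1": 8.0, "probe_2": 3.0},
}
site = {"array_1": "lung", "array_2": "lung"}
summary = summarize_z(z, site)
print({p: is_expressed(v) for p, v in summary["lung"].items()})
```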