Exploratory Analysis of Cancer SAGE Data


Using several analysis techniques for the hierarchical clustering of a SAGE expression dataset of 822 tags from 74 tissue samples (normal and cancer), we show that cleaning the dataset (both tags and experiments) is critical and that attributing a tag to a gene is not straightforward. Comparing cancers from different tissues is difficult because tissue samples cluster by tissue of origin rather than by cancer or normal status.

9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'05), Discovery Challenge