Additional file 13.

A figure showing the effect of the common reference design on principal component analysis. Data that exhibit no variation in gene expression correspond to an expression matrix in which each gene on each array has exactly the same expression level. In a slightly more realistic case, each gene has a different expression level, but the expression is just random noise (left panel). The principal components each explain a similar, small amount of the total variation in the data. At the opposite extreme from the random noise example are perfectly correlated data with no noise, as might be imagined from ideal replicate arrays (middle panel). The variability in the data arises from each gene having a different level of expression; however, that expression is identical across arrays. Only one principal component is necessary to capture all of the variation in the data. The third and most realistic case consists of correlated data with random noise (right panel). This closely resembles what is observed in the normal tissue dataset with a common reference design. The arrays are highly correlated, so the first principal component explains the majority of the observed variation, and the rest is distributed amongst the remaining components.
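
The three cases described in this caption can be reproduced on simulated data. The following is a minimal sketch, not the authors' analysis: the matrix size (1000 genes by 10 arrays), the noise scale of 0.3, the random seed, and the choice to treat arrays as variables and genes as observations are all assumptions made for illustration, and the helper name explained_variance_ratio is hypothetical.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component.

    X is a genes-by-arrays matrix; arrays are treated as the variables
    (an assumption of this sketch, not taken from the source).
    """
    Xc = X - X.mean(axis=0, keepdims=True)    # centre each array (column)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2                              # proportional to PC variances
    return var / var.sum()

rng = np.random.default_rng(0)                # fixed seed for repeatability
n_genes, n_arrays = 1000, 10                  # illustrative sizes only

# Case 1: every measurement is independent random noise.
noise_only = rng.normal(size=(n_genes, n_arrays))

# Case 2: each gene has its own level, identical on every array
# (ideal replicate arrays, no noise).
gene_levels = rng.normal(size=(n_genes, 1))
replicates = np.repeat(gene_levels, n_arrays, axis=1)

# Case 3: the same correlated structure plus random noise on each array.
noisy_replicates = replicates + 0.3 * rng.normal(size=(n_genes, n_arrays))

ratios = {
    "noise only": explained_variance_ratio(noise_only),
    "perfectly correlated": explained_variance_ratio(replicates),
    "correlated + noise": explained_variance_ratio(noisy_replicates),
}

for name, r in ratios.items():
    print(f"{name:>22}: first PC = {r[0]:.3f}, second PC = {r[1]:.3f}")
```

Under these assumptions, the noise-only matrix spreads its variance roughly evenly over the ten components, the noise-free replicate matrix puts essentially all of it on the first component, and the noisy replicate matrix puts most of it on the first component with the remainder shared among the others.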

Format: PDF. Size: 36 KB.


Finak et al. Breast Cancer Research 2006 8:R58   doi:10.1186/bcr1608