A schematic outlining the gene set comparisons and filtering operations performed using the normal tissue signature and gene sets from published expression profiles. Circles denote gene sets, each labeled with its name and size. Numbers in brackets denote the size of a gene set after filtering for high variance genes (Var > 1) in normal tissue; 7.36% of genes in the normal dataset have variance greater than 1. Intersections between gene sets are labeled with p values denoting the significance of the overlap (hypergeometric test), and filtered gene set sizes are labeled with p values denoting the significance of the overrepresentation of high variance genes (χ² goodness of fit test). The data were derived from the following sources: SFT/DTF (Additional file 6a,b) [36]; SAGE [33]; CSR (Additional file 6c,d) [44].
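
The two tests named in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the one-sided (upper tail) form of the hypergeometric test, and the two-category (high variance versus not) form of the χ² goodness of fit test are assumptions made for illustration, and all counts in the usage lines are hypothetical. Only the 7.36% background fraction comes from the caption.

```python
from math import comb, erfc, sqrt


def hypergeom_overlap_pvalue(universe, size_a, size_b, overlap):
    """Upper-tail hypergeometric p value, P(X >= overlap).

    Models drawing size_b genes without replacement from a universe
    of `universe` genes, of which size_a belong to the other gene set.
    """
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def chi2_gof_pvalue(observed_high, set_size, expected_fraction):
    """Chi-square goodness of fit with two categories (1 degree of freedom).

    Compares the observed number of high variance genes in a gene set
    with the number expected from the background fraction.
    Returns (statistic, p value).
    """
    expected_high = set_size * expected_fraction
    expected_low = set_size - expected_high
    observed_low = set_size - observed_high
    stat = ((observed_high - expected_high) ** 2 / expected_high
            + (observed_low - expected_low) ** 2 / expected_low)
    # For 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the figure.
    print(hypergeom_overlap_pvalue(universe=20000, size_a=500,
                                   size_b=300, overlap=25))
    # 0.0736 is the background fraction of high variance genes
    # stated in the caption; the other two numbers are made up.
    print(chi2_gof_pvalue(observed_high=60, set_size=300,
                          expected_fraction=0.0736))
```

The sketch uses only the standard library so that it runs without extra dependencies. With SciPy installed, `scipy.stats.hypergeom.sf` and `scipy.stats.chisquare` compute the same quantities.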

Finak et al. Breast Cancer Research 2006 8:R58   doi:10.1186/bcr1608