To identify genes that were significantly enriched by mRNA tagging, we first normalized the total Cy3 and Cy5 signals to each other in each hybridization. We measured the ratio of the signal from the co-immunoprecipitated RNA (Cy5) to the signal from total RNA in the cell extract (Cy3), and then calculated the percentile rank of each gene relative to all genes in that hybridization. The mRNA tagging experiment was repeated six times, and the mean percentile rank across all repeats was determined. A Student's t test was then used to determine which genes showed a mean enrichment that was significantly greater than the average enrichment for all genes. Mock mRNA tagging was done using four repeats with wild-type (N2) worms.
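The ranking and testing steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the percentile rank is taken as the share of genes with a ratio at or below the gene's own ratio, and the "average enrichment for all genes" is assumed to correspond to a percentile rank of 50 in a one-sample, one-sided t test.

```python
import math
from bisect import bisect_right
from statistics import mean, stdev


def percentile_ranks(cy5, cy3):
    """Percentile rank (0-100) of each gene's Cy5/Cy3 ratio in one hybridization.

    cy5, cy3: per-gene signal lists in the same gene order (hypothetical inputs).
    """
    # Rescale Cy5 so that both channels have the same total signal.
    scale = sum(cy3) / sum(cy5)
    ratios = [c5 * scale / c3 for c5, c3 in zip(cy5, cy3)]
    ordered = sorted(ratios)
    n = len(ratios)
    # Assumed definition: share of genes whose ratio is <= this gene's ratio.
    return [100.0 * bisect_right(ordered, r) / n for r in ratios]


def one_sample_t(ranks, mu0=50.0):
    """t statistic for the mean of `ranks` against mu0 (assumed null value)."""
    n = len(ranks)
    standard_error = stdev(ranks) / math.sqrt(n)
    if standard_error == 0:
        # All repeats identical: the statistic is undefined, so report the sign only.
        diff = mean(ranks) - mu0
        return math.copysign(math.inf, diff) if diff else 0.0
    return (mean(ranks) - mu0) / standard_error


# One-sided critical value of Student's t for alpha = 0.05 and 5 degrees of
# freedom (six repeats); the significance threshold here is an assumption.
T_CRIT_DF5 = 2.015


def is_enriched(ranks, t_crit=T_CRIT_DF5):
    """True if the mean percentile rank is significantly greater than mu0."""
    return one_sample_t(ranks) > t_crit
```

Here `percentile_ranks` would be applied once per hybridization, and `is_enriched` to the six percentile ranks collected for each gene.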
A gene from the muscle gene list was counted as clustered if its start position was within 10 kb of the start position of another muscle gene. We also varied the distance criterion between 1 kb and 1 Mb and observed significant clustering (p < 0.001) from 1 to 25 kb.
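The clustering criterion can be illustrated with a short sketch. It assumes that genes are compared only with other genes on the same chromosome and that "within 10 kb" means a start-to-start distance of at most 10,000 bp; the function name and the input layout are hypothetical.

```python
def count_clustered(genes, max_dist=10_000):
    """Count genes whose start lies within max_dist bp of another listed gene.

    genes: iterable of (chromosome, start_position) pairs (hypothetical layout).
    """
    starts_by_chrom = {}
    for chrom, start in genes:
        starts_by_chrom.setdefault(chrom, []).append(start)

    clustered = 0
    for starts in starts_by_chrom.values():
        starts.sort()
        for i, pos in enumerate(starts):
            # After sorting, the nearest other gene is always an adjacent entry.
            near_prev = i > 0 and pos - starts[i - 1] <= max_dist
            near_next = i + 1 < len(starts) and starts[i + 1] - pos <= max_dist
            if near_prev or near_next:
                clustered += 1
    return clustered


# Varying the distance criterion, as described in the text.
example = [("I", 1_000), ("I", 9_000), ("I", 50_000), ("X", 1_000)]
counts = {d: count_clustered(example, d) for d in (1_000, 10_000, 25_000, 1_000_000)}
```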
A detailed explanation of the calculations used to measure gene clustering is available at the supplemental web site. Briefly, the calculations included only those genes that were present on the microarray and for which we could determine a chromosomal position (1). The calculations used only the first gene in an operon, and only one gene from each set of tandem repeats. The number of clusters expected due to random chance was calculated separately for each chromosome, and then summed to give the total number. This was done so that the number expected due to random chance reflected any chromosomal bias in the experimental list. For example, sperm genes are nearly absent from the X chromosome (2), and so we generated lists of randomly selected genes from each of the autosomes and the X chromosome in proportion to the observed numbers. Finally, since the germ line data were obtained from experiments using microarrays containing 11,917 genes (2), lists of genes were randomly selected from only these genes to avoid bias.
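One way to approximate the per-chromosome random expectation is a simple Monte Carlo simulation, sketched below. This is an assumption made for illustration: the exact calculation is described only at the supplemental web site, and the function names, the input layout, and the number of trials are hypothetical. For each trial, the sketch draws from each chromosome as many background genes as the experimental list contains on that chromosome, counts clustered genes per chromosome, and sums the counts.

```python
import random


def clustered_on_chromosome(starts, max_dist=10_000):
    """Number of genes on one chromosome within max_dist bp of another gene."""
    s = sorted(starts)
    return sum(
        (i > 0 and pos - s[i - 1] <= max_dist)
        or (i + 1 < len(s) and s[i + 1] - pos <= max_dist)
        for i, pos in enumerate(s)
    )


def simulate_expected(background, observed_counts, max_dist=10_000,
                      n_trials=1000, seed=0):
    """Simulated totals of clustered genes for random, chromosome-matched lists.

    background: {chromosome: [start positions of all eligible genes]}, i.e.
        genes on the microarray with a known position (hypothetical layout).
    observed_counts: {chromosome: number of experimental-list genes on it}.
    Returns (mean total, list of per-trial totals).
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = 0
        for chrom, k in observed_counts.items():
            # Sample without replacement, matching the observed per-chromosome count.
            sample = rng.sample(background[chrom], k)
            total += clustered_on_chromosome(sample, max_dist)
        totals.append(total)
    return sum(totals) / n_trials, totals


def empirical_p(observed_total, totals):
    """Share of trials with at least as many clustered genes as observed."""
    hits = sum(t >= observed_total for t in totals)
    return (hits + 1) / (len(totals) + 1)
```

Restricting `background` to the 11,917 genes on the germ line microarrays would correspond to the final step described above.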
1. Stein, L., Sternberg, P., Durbin, R., Thierry-Mieg, J. & Spieth, J. WormBase: network access to the genome and biology of Caenorhabditis elegans. Nucleic Acids Res 29, 82-6 (2001).
2. Reinke, V. et al. A global profile of germline gene expression in C. elegans. Mol Cell 6, 605-16 (2000).