Figure 2.
Statistical validations and comparisons to single-species expression
networks. A. Meta-gene
permutations. Shown is the number of
meta-gene interactions (y-axis) exceeding a P-value cutoff (x-axis) in networks
constructed from: real meta-genes (blue line), a random distribution (red
line), and randomly permuted meta-genes (green line). P-values are shown on a log scale.
The red arrow marks the P < 0.05 cutoff used for the gene co-expression
network. B. Random Halves. We randomly divided the databases of each
species into two equally sized sets, and then generated new networks derived
from each half of the data for a series of P-values. Shown is the percent of meta-gene pairs with P < p in the first
half that have P < 0.05 in the second half, for each P-value p. P-values are shown on a log scale. Three additional randomizations gave
identical results. The red arrow marks the P < 0.05 cutoff used for the
gene co-expression network. C-F. Comparison of multiple-species to single-species
expression networks. We constructed a
co-expression network from each species by selecting a Pearson correlation
cutoff k and linking every pair of genes with a correlation of k or higher.
We repeated this procedure at various settings of k to generate expression
networks with differing degrees of coverage and predictive power. We also constructed co-expression networks
from multiple species as described above, using not only the cutoff of P <
0.05 used for the network discussed above but also a range of other P-values. Shown is a comparison of all methods in
terms of their ability to predict functional categories from KEGG. For each functional category, we combined
the neighbors in the network of all genes from the category, and plotted the
percent of genes from the category that were included (x-axis; coverage) versus
the percent of interactions that were between two genes in that category
(y-axis; accuracy). We varied the
Pearson threshold used to construct the network in each case to obtain
networks with different coverage and accuracy.
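
The single-species procedure in panels C-F can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (pearson_network, coverage_accuracy), the input layout (rows are genes, columns are experiments), and the exact reading of "coverage" and "accuracy" are assumptions. Here coverage is taken as the fraction of category genes that appear among the neighbors of some category gene, and accuracy as the fraction of links touching a category gene that join two category genes.

```python
import numpy as np

def pearson_network(expr, k):
    """Boolean adjacency matrix linking gene pairs with Pearson r >= k.

    expr: 2-D array, rows = genes, columns = experiments (assumed layout).
    """
    corr = np.corrcoef(expr)
    adj = corr >= k
    np.fill_diagonal(adj, False)  # drop self-links
    return adj

def coverage_accuracy(adj, category):
    """Coverage and accuracy for one category (one reading of the caption).

    category: iterable of row indices of the genes in the category.
    """
    n = adj.shape[0]
    in_cat = np.zeros(n, dtype=bool)
    in_cat[list(category)] = True
    # Count each undirected link once via the upper triangle.
    upper = np.triu(adj, k=1)
    touching = upper & (in_cat[:, None] | in_cat[None, :])
    within = upper & (in_cat[:, None] & in_cat[None, :])
    # Category genes found among the neighbors of any category gene.
    covered = adj[in_cat].any(axis=0) & in_cat
    coverage = covered.sum() / in_cat.sum()
    n_touching = touching.sum()
    accuracy = within.sum() / n_touching if n_touching else float("nan")
    return coverage, accuracy
```

Calling pearson_network at several values of k and passing each result to coverage_accuracy gives one (coverage, accuracy) point per cutoff, which is how a curve like those in panels C-F could be traced under these assumptions.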