Interaction networks from discrete event data by Poisson multivariate mutual information estimation and information flow with applications from gene expression data

In this work, we introduce a new methodology for inferring the interaction structure of discrete-valued time series which are Poisson distributed. While most related methods are premised on continuous-state stochastic processes, discrete and counting event-oriented stochastic processes, so-called time-point processes, are in fact natural and common. An important application that we focus on here is gene expression, where it is often assumed that the data are generated from a multivariate Poisson distribution. Nonparametric methods such as the popular k-nearest neighbors converge slowly for discrete processes, and are thus data hungry. With the new multivariate Poisson estimator developed here as the core computational engine, the causation entropy (CSE) principle, together with the associated greedy search algorithm optimal CSE (oCSE), allows us to efficiently infer the true network structure for this class of stochastic processes, which was previously not practical. We illustrate the power of our method, first by benchmarking with synthetic data, and then by inferring the genetic factors network from a breast cancer micro-ribonucleic acid (miRNA) sequence count data set. We show that Poisson oCSE gives the best performance among the tested methods and discovers previously known interactions on the breast cancer data set.


I. INTRODUCTION
Understanding the behavior of a complex system requires knowledge of its underlying structure. However, prior knowledge of the network of interactions is often unavailable, necessitating estimation from data. Data-driven analysis of gene expression is a complex-systems problem that is especially important to our health and well being. However, these data are generally time-point processes (TPP), and discretely distributed, rather than continuous valued as most mutual information inference methods presume. Specifically, we assume a jointly distributed Poisson process. While TPP are relatively common, as far as we know, no efficient joint entropy estimator exists for them. The main goal of this paper is to fill that gap.
Granger causality [12] has been used for network inference, interpreted as a causation inference concept for linear stochastic processes, as has transfer entropy (TE) [13], based on information theory, for nonlinear processes. However, when applied to a system with more than two factors, neither of these concepts can distinguish direct effects from indirect effects or confounders, and therefore they tend to yield false positive connections. To this end, we developed causation entropy (CSE) as a generalization of transfer entropy [14,15] that explicitly defines the information flow between two factors, conditioned on tertiary intermediary factors. This, together with a greedy search algorithm to construct the network of interactions of the complex stochastic process, provably reveals the network structure of certain stochastic processes [15]. In past studies, TE as well as CSE were computed nonparametrically, by the Kraskov-Stögbauer-Grassberger (KSG) [16] mutual information estimator, which is a k-nearest neighbors (KNN) method. However, specific knowledge of the joint distribution of the process allows considerable computational efficiencies, as in our previous work on jointly Gaussian variables [15] and on jointly Laplace distributed variables [17].
Here, we focus on gene expression networks, an application of considerable scientific importance due to their foundational relevance as a building block for understanding the details of life science. It is well understood that many diseases associate with variations in the expression of a single gene [1][2][3], famous examples being sickle cell disease and cystic fibrosis. However, it remains a difficult problem with considerable health implications to explain and to infer the complex interactions and associations when many genes may be involved in a common and even deadly disease. Such diseases are called polygenetic, and they include the breast cancer example that we study here. According to the Centers for Disease Control (CDC), breast cancer is considered to be the second most common form of cancer among women in the USA [4], with 268,600 cases and 42,260 deaths forecast for 2019. We advance here a new methodology to probe variations in expression of a group (network) of genes that may lead to disease. Understanding the gene interaction network structure may be crucial to the development of future treatments. Network inference itself has many applications beyond cancer research, including fMRI network inference [5][6][7][8], drug-target interaction networks [9], earthquake network inference [10], and economics [11], to name a few.
With this motivation, the main technical premise of this paper is to develop a computationally efficient approach to estimate joint entropy and related information theoretic measures for multivariate Poisson processes. Data derived from these processes are discrete valued, and typically consist of a significant fraction of zeros punctuated with nonzero values describing event counts in a given epoch. From the multivariate Poisson model, we derive an analytical series representation of the joint entropy and the mutual information. A practical finite partial-sum estimator then allows estimation of the mutual information, and from it the transfer entropy and causation entropy.
This paper is structured as follows: first, we provide a brief introduction to the mathematical background, including a multivariate Poisson model and the relevant information theoretic quantities necessary to define information flow. Then, we derive our multivariate Poisson joint entropy estimator, which we relate to network inference. Finally, in the Results section we demonstrate our method and its performance on benchmark synthetic data, and then we study the breast cancer gene expression data sets.

II. BACKGROUND

A. Multivariate Poisson Model
First let us recall the single-variate Poisson model [18,19]: a random variable X is Poisson distributed with rate λ > 0 if

P(X = k) = e^{−λ} λ^k / k!, k ∈ N_0, (1)

where N_0 = N ∪ {0}. The Poisson model has a multivariate generalization as follows [20]:

X = BY, X ∈ N_0^{n×t}, (2)

where B is the linear map given in Eq. 40. This model is based on assuming that the x_i are linearly transformed from a set of independently drawn Poisson variables. We begin with Y ∈ N_0^{m×t}, whose rows are indexed as

Y = (y_11, y_22, ..., y_nn, y_12, y_13, ..., y_(n−1)n)^T. (3)
Here each y_ij is independent Poisson, that is

y_ij ∼ Poisson(λ_ij), (4)

and note that in this case λ_ij = λ_ji. The rows of X thus represent Poisson random variables with t observations each. Although the number of parameters needed to specify this model grows quickly, it has some nice properties. For instance, the model allows a simple estimate of each λ_ij, since the sum of independent Poisson variables yields the following covariance matrix structure:

cov(x_i, x_j) = λ_ij (i ≠ j), cov(x_i, x_i) = λ_ii + Σ_{j≠i} λ_ij, (5)

with λ_ij = λ_ji. The (i, j) entries of the covariance matrix represent cov(x_i, x_j), the covariance between the two random variables x_i and x_j. A proof of Eq. 5 may be found in the appendix. This model is a multivariate extension of the Poisson model that does not assume the random variables are independent. However, the model has some limitations. First, the number of states and parameters grows rapidly with the number of variables, making calculation of the joint distribution computationally unwieldy and expensive. Second, the model cannot handle negative covariance [20]. These difficulties are particularly complicating in the forthcoming entropy computations, and so they must be handled in later sections.
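As an illustration of the latent-variable construction above, the following sketch samples from the multivariate Poisson model by adding a shared Poisson stream y_ij to both x_i and x_j, then checks the covariance structure of Eq. 5 empirically. The rate values and sample size are hypothetical, chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_multivariate_poisson(lam, t, rng):
    """Draw t samples of an n-variate Poisson vector from the latent-variable
    model x_i = y_ii + sum_{j != i} y_ij, with independent y_ij ~ Poisson(lam[i, j]).
    `lam` is a symmetric n x n matrix of rates (lam[i, j] = lam[j, i])."""
    n = lam.shape[0]
    X = np.zeros((n, t))
    # Private latent variables y_ii.
    for i in range(n):
        X[i] += rng.poisson(lam[i, i], size=t)
    # Shared latent variables y_ij (i < j) are added to both x_i and x_j,
    # which is what induces the covariance cov(x_i, x_j) = lam[i, j].
    for i in range(n):
        for j in range(i + 1, n):
            y = rng.poisson(lam[i, j], size=t)
            X[i] += y
            X[j] += y
    return X

# Hypothetical small example: two coupled variables.
lam = np.array([[1.0, 0.3],
                [0.3, 2.0]])
X = sample_multivariate_poisson(lam, t=200_000, rng=rng)
C = np.cov(X)
# Eq. 5: off-diagonal covariance equals lam_12; variance is lam_ii + sum_j lam_ij.
print(C[0, 1])  # close to 0.3
print(C[0, 0])  # close to 1.0 + 0.3 = 1.3
```

Note how the variance of each variable exceeds its private rate by the sum of its coupling rates, exactly as Eq. 5 states.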

B. Transfer Entropy and Causation Entropy
We briefly review certain Shannon entropies, building toward the concepts of transfer entropy and causation entropy. These are the fundamental concepts of information flow we use for network inference. The Shannon entropy of a (discrete) random variable X is given by [22,23]:

H(X) = −Σ_x P(x) log P(x), (6)

where P(x) is the probability that X = x, and 0 log(0) = 0 is the usual interpretation in this context. For the remainder of this paper we choose the natural log, and thus all entropies are measured in nats. Entropy can be thought of as a measure of how uncertain we are about a particular outcome. As an example, imagine two scenarios: in one case a random variable X_1 with P(X_1 = 0) = 1, and in the other a random variable X_2 with P(X_2 = 0) = 0.5 and P(X_2 = 1) = 0.5. Here H(X_1) = 0 nats, while H(X_2) = ln 2 nats, which happens to be the maximum for this case [23]. It is easy to see that Shannon entropy attains its greatest value when we are most uncertain about the outcome, and its minimal value (0) when we are completely certain about the outcome. We can now examine the case of two random variables X and Y. The joint entropy of discrete random variables is defined [23]:

H(X, Y) = −Σ_x Σ_y P(x, y) log P(x, y). (7)

When the two random variables X and Y are independent, H(X, Y) = H(X) + H(Y). There are comparable definitions of differential entropies for continuous random variables in terms of integration. The conditional entropy is defined:

H(X|Y) = −Σ_x Σ_y P(x, y) log P(x|y). (8)

The conditional entropy gives us a way to describe the relationship between variables, which is the key to network inference. If knowledge of the variable Y gives us complete knowledge of the variable X, then the conditional entropy will be H(X|Y) = 0 nats. Another important Shannon quantity is the mutual information, which is defined as [23]:

I(X; Y) = Σ_x Σ_y P(x, y) log [ P(x, y) / (P(x)P(y)) ]. (9)

Finally, the Kullback-Leibler (KL) divergence (D_KL) [23] is stated:

D_KL(P||Q) = Σ_x P(x) log [ P(x) / Q(x) ]. (10)

The KL divergence describes a distance-like quantity between two probability distributions, though it is not a metric since, for
one, it is not symmetric (that is, in general D_KL(P||Q) ≠ D_KL(Q||P)), and it also does not satisfy the triangle inequality. The mutual information Eq. 9 can be written in terms of the KL divergence as [23]:

I(X; Y) = D_KL( P(x, y) || P(x)P(y) ), (11)

describing a deviation from independence of a joint random variable (x, y). For a stationary stochastic process {X_t}, the entropy rate is defined as [37,38]:

H({X_t}) = lim_{t→∞} H(X_t | X_{t−1}, ..., X_1). (12)

If the process is Markov (memoryless) then [23]:

H({X_t}) = H(X_t | X_{t−1}). (13)

The transfer entropy from X_2 to X_1 is defined [13,14]:

T_{X_2→X_1} = H(X_1^{(t)} | X_1^{(t−1)}) − H(X_1^{(t)} | X_1^{(t−1)}, X_2^{(t−1)}). (14)

Causation entropy is a generalization of the transfer entropy, where [14,15]:

C_{Q→P|S} = H(P^{(t)} | S^{(t−1)}) − H(P^{(t)} | S^{(t−1)}, Q^{(t−1)}). (15)

C_{Q→P|S} is designed to describe the remaining information flow from processes Q to processes P that is not accounted for by (conditioned on) processes S. An example of causation entropy is shown in Fig. 2 (a). In theory, if a process Z has no influence over another process Y, the causation entropy after conditioning on the remaining processes would be identically 0, allowing us to reject a connection from Z to Y. In practice however, when estimating these quantities from finite samples of noisy data, they will not compute to be identically 0, making a threshold necessary; this is the purpose of the shuffle test discussed in [15]. Network inference can be developed based on Eq. (15). However, considering the power set of all possible subsets P, Q, S is clearly NP-hard and so not practical. This led to the development of a greedy search algorithm, which we refer to as optimal causation entropy (oCSE) [15,17], that finds a minimal network explaining the data in terms of minimal causation entropy. It proceeds in two stages: aggregative discovery of statistically significant links, those that are maximally informative influencers conditioned on the already significant links, followed by removal of statistically irrelevant links introduced while growing the network, with significance decided by a null hypothesis built from multiple random shuffles of the data. We were able to prove, under mild hypotheses on the stochastic process, that this procedure discovers the true network, assuming good statistical estimation of the entropies. It is precisely this problem of good data-driven statistical estimation of entropies, specialized to the scenario of a multivariate Poisson process, that we handle in this paper.
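The elementary quantities reviewed above can be computed directly from a probability table. The following minimal sketch implements the Shannon entropy (Eq. 6) and the mutual information via the identity I(X; Y) = H(X) + H(Y) − H(X, Y); the probability values are illustrative.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats of a probability vector, with 0 log 0 = 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(pxy):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for a joint pmf given as a 2-D array."""
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    return entropy(px) + entropy(py) - entropy(pxy.ravel())

# The fair-coin example from the text: H = ln 2 nats.
print(entropy([0.5, 0.5]))            # ln 2, about 0.6931
# An independent joint pmf carries zero mutual information.
pxy = np.outer([0.5, 0.5], [0.25, 0.75])
print(mutual_information(pxy))        # about 0.0
```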
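The two-stage greedy search can be sketched as follows. This is a minimal illustration of the aggregative-discovery stage with a shuffle test, not the authors' implementation: the Gaussian conditional-mutual-information estimator, the toy data, and all parameter values (number of shuffles, significance level) are assumptions made for the example, and the backward pruning stage is omitted.

```python
import numpy as np

rng = np.random.default_rng(7)

def gaussian_cmi(x, y, cond=None):
    """Conditional mutual information under a jointly Gaussian assumption,
    from the (partial) correlation; used here only as a stand-in estimator."""
    if cond is None:
        r = np.corrcoef(x, y)[0, 1]
    else:
        # Residualize x and y on the conditioning set via least squares.
        A = np.vstack([np.atleast_2d(cond), np.ones(len(x))]).T
        rx = x - A @ np.linalg.lstsq(A, x, rcond=None)[0]
        ry = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
        r = np.corrcoef(rx, ry)[0, 1]
    return -0.5 * np.log(max(1.0 - r * r, 1e-12))

def shuffle_threshold(cmi, xf, src, cond, n_shuffles=100, alpha=0.05):
    """(1 - alpha) quantile of the CMI under random shuffles of the candidate
    source -- the significance threshold of the shuffle test."""
    null = [cmi(xf, rng.permutation(src), cond) for _ in range(n_shuffles)]
    return np.quantile(null, 1.0 - alpha)

def ocse_parents(cmi, X_future, X_past, target):
    """Aggregative-discovery stage of the greedy oCSE search (sketch): keep
    adding the candidate source with maximal CMI conditioned on the parents
    found so far, stopping when the best candidate fails the shuffle test."""
    parents, remaining = [], set(range(X_past.shape[0]))
    while remaining:
        cond = X_past[parents] if parents else None
        scores = {j: cmi(X_future[target], X_past[j], cond) for j in remaining}
        best = max(scores, key=scores.get)
        if scores[best] <= shuffle_threshold(cmi, X_future[target], X_past[best], cond):
            break
        parents.append(best)
        remaining.discard(best)
    return sorted(parents)

# Toy linear example: node 1 is driven by node 0 only.
t = 4000
X_past = rng.normal(size=(3, t))
X_future = rng.normal(size=(3, t))
X_future[1] += 0.8 * X_past[0]
parents = ocse_parents(gaussian_cmi, X_future, X_past, target=1)
print(parents)  # should contain 0
```

Any estimator with the signature `cmi(x, y, cond)` can be plugged in, which is exactly where the Poisson entropy estimator developed below enters.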

III. ENTROPY ESTIMATION FROM MULTIVARIATE POISSON DATA

A. Estimating Joint Entropy of Poisson Systems
Here we develop an estimator of entropies for the multivariate Poisson distribution, Eq. (2). To this end, we truncate series representations to finite partial sums.

B. Poisson Entropy
We begin with the Poisson entropy:

H(X) = λ(1 − log λ) + e^{−λ} Σ_{k=0}^{∞} λ^k log(k!) / k!. (16)

This expression for the entropy of a Poisson random variable is in terms of an infinite series, which is well approximated by a finite truncation to a partial sum.
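The truncated series can be checked against a direct summation of −Σ p_k log p_k. The sketch below does this for a few rates; the truncation depth of 100 terms is an illustrative choice, not a prescription.

```python
import math

def poisson_entropy_series(lam, terms=100):
    """Truncated partial sum of the series for the Poisson entropy (nats):
    H = lam * (1 - log(lam)) + exp(-lam) * sum_k lam^k * log(k!) / k!."""
    s = 0.0
    log_fact = 0.0  # accumulates log(k!)
    for k in range(terms):
        if k > 0:
            log_fact += math.log(k)
        s += math.exp(k * math.log(lam) - log_fact) * log_fact
    return lam * (1.0 - math.log(lam)) + math.exp(-lam) * s

def poisson_entropy_direct(lam, terms=100):
    """Reference value: -sum_k p_k log p_k with p_k = exp(-lam) lam^k / k!."""
    h = 0.0
    for k in range(terms):
        logp = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(logp) * logp
    return h

for lam in (0.3, 1.0, 4.0):
    print(lam, poisson_entropy_series(lam), poisson_entropy_direct(lam))
```

For rates of order one, the two agree to near machine precision well before 100 terms, since the terms decay factorially.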

C. Bivariate Poisson Entropy
The bivariate Poisson case is instructive for the n-variate Poisson case. Consider the bivariate Poisson probability mass function [20]:

P(x_1, x_2) = e^{−(λ_11 + λ_22 + λ_12)} (λ_11^{x_1} / x_1!) (λ_22^{x_2} / x_2!) Σ_{k=0}^{min(x_1, x_2)} C(x_1, k) C(x_2, k) k! (λ_12 / (λ_11 λ_22))^k, (17)

where C(·, ·) denotes the binomial coefficient. Let

s = λ_11 + λ_22 + λ_12, (18)

and

d_12 = λ_12 / (λ_11 λ_22), (19)

and write D(x_1, x_2) for the inner sum over k. Then Eq. 17 becomes:

P(x_1, x_2) = e^{−s} (λ_11^{x_1} / x_1!) (λ_22^{x_2} / x_2!) D(x_1, x_2). (20)

Now to get the joint entropy of the bivariate Poisson we have:

H(x_1, x_2) = −Σ_{x_1} Σ_{x_2} P(x_1, x_2) log P(x_1, x_2). (21)

A scenario of interest arises when λ_11, λ_22, and λ_12 are all small and λ_12 << λ_11 λ_22, so that d_12 << 1 and the leading term of D(x_1, x_2) dominates. Small λ_11 and λ_22 also ensure that the large-x_1 and large-x_2 terms become insignificant in Eq. 21. Grouping terms, recalling (the middle part of) Eq. 16, and estimating D(x_1, x_2) ≈ 1, a finite partial sum of Eq. 21 can be written in closed form; remembering the assumption λ_12 << 1, that expression reduces further. As Fig. 3 shows, this approximation works well when d_12 << 1, i.e. λ_12 << λ_11 λ_22, and in this regime the error will be small. Similar analysis can be carried out for the larger multivariate cases, which allows us to arrive at a general formula for our approximation:

H(x_1, ..., x_n) ≈ Σ_{i=1}^{n} Σ_{j≥i} H(λ_ij), (25)

where H(λ) denotes the univariate Poisson entropy of Eq. 16 and we assume that the λ_ij are small for all (i, j) pairs. Fortunately, as we can see in Eq. 25, all of the quantities on the right hand side are computationally efficient to compute. This greatly reduces the computational time necessary for estimation of the joint entropy. This formulation requires asymptotic assumptions that may not be valid in general in nature. However, we find empirically in simulations that by scaling the rates λ_ij to be in [0, 1] the estimate performs well, as described by Fig. 3 and verified in the network simulations; regardless of the true underlying rates, this scaling produces similar results. As a note of caution, when calculating the mutual information in the Poisson model, care must be taken over how the marginals of a joint Poisson process are drawn. For example, from Eq. 9 it may be tempting to assert:

I(X_1; X_2) = H(X_1) + H(X_2) − H(X_1, X_2), (26)

with X_1 ∼ Poisson(λ_11) and X_2 ∼ Poisson(λ_22). However, this is not exactly correct, though the error here is subtle. In fact we must make a small change to Eq. 26:

I(X_1; X_2) = H(X̂_1) + H(X̂_2) − H(X_1, X_2), (27)

where X̂_1 ∼ Poisson(λ_11 + λ_12) and X̂_2 ∼ Poisson(λ_22 + λ_12).
This subtle difference is important because, without recognizing it, the calculated mutual information becomes negative, which violates the well established condition that mutual information be non-negative. The need for X̂_1 and X̂_2 is apparent from Eq. 5: when two Poisson random variables are summed, their marginals are drawn from the sum of the underlying rate (i.e., λ_ii) and the coupling rate (i.e., λ_ij). This also transfers to computing the conditional mutual information. To better illuminate this calculation it is helpful to refer to Fig. 2 (b).
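The marginal correction can be demonstrated numerically. The sketch below computes the exact bivariate Poisson joint entropy by direct summation and compares the naive mutual information of Eq. 26 against the corrected Eq. 27; the rate values are illustrative, and the truncation depths are chosen so the neglected tails are negligible at these small rates.

```python
import math

def poisson_entropy(lam, terms=80):
    """Entropy (nats) of Poisson(lam) by direct summation."""
    h = 0.0
    for k in range(terms):
        logp = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(logp) * logp
    return h

def bivariate_pmf(x1, x2, l11, l22, l12):
    """P(x1, x2) for x1 = y11 + y12, x2 = y22 + y12, with independent
    y11 ~ Poisson(l11), y22 ~ Poisson(l22), shared y12 ~ Poisson(l12)."""
    total = 0.0
    for k in range(min(x1, x2) + 1):
        total += (l11 ** (x1 - k) / math.factorial(x1 - k)
                  * l22 ** (x2 - k) / math.factorial(x2 - k)
                  * l12 ** k / math.factorial(k))
    return math.exp(-(l11 + l22 + l12)) * total

def joint_entropy(l11, l22, l12, kmax=40):
    h = 0.0
    for a in range(kmax):
        for b in range(kmax):
            p = bivariate_pmf(a, b, l11, l22, l12)
            if p > 0.0:
                h -= p * math.log(p)
    return h

l11, l22, l12 = 0.8, 0.6, 0.2
hj = joint_entropy(l11, l22, l12)
# Naive marginals Poisson(l11), Poisson(l22) produce a negative "MI" ...
mi_naive = poisson_entropy(l11) + poisson_entropy(l22) - hj
# ... while the correct marginals Poisson(l_ii + l_12), per Eq. 5, do not.
mi_hat = poisson_entropy(l11 + l12) + poisson_entropy(l22 + l12) - hj
print(mi_naive, mi_hat)
```

The naive version comes out negative, while the corrected version is a small positive number, consistent with the discussion above.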
In the special case presented in Fig. 2 (b), since Z is independent of both X and Y, we have H(Z|X, Y) = H(Z), H(X|Y, Z) = H(X|Y), and H(Y|X, Z) = H(Y|X). Applying these facts together with Eq. 27 shows that the Poisson marginals must also be used in the computation of the conditional mutual information. That is, in the Poisson case we must have:

I(X; Y|Z) = H(X̂|Z) + H(Ŷ|Z) − H(X, Y|Z). (28)

Note the use of X̂ and Ŷ in this case. This distinction is important in the Poisson case because, without using the proper marginals, the computation results in a negative conditional mutual information, which is clearly not correct since conditional mutual information must be non-negative [23].
Importantly, the new definition given in Eq. 25 is more computationally efficient than computing the Poisson joint entropy directly from the joint probability, since it requires calculating only separate univariate entropies. This naturally raises the question of the accuracy of the approximation. As can be seen in Fig. 4, the new definition of entropy still leads to accurate identification of network structure. This new definition also fits into the general framework of entropy developed above, allowing us to apply the oMII algorithm to the data.
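The efficiency claim can be checked in the bivariate case, where the exact joint entropy is still tractable by direct summation. The sketch below (self-contained, so it re-declares the helpers) compares the sum-of-univariate-entropies approximation, our reading of Eq. 25, against the exact value for small, illustrative rates.

```python
import math

def h_poisson(lam, terms=60):
    """Univariate Poisson entropy in nats by direct summation."""
    h = 0.0
    for k in range(terms):
        logp = -lam + k * math.log(lam) - math.lgamma(k + 1)
        h -= math.exp(logp) * logp
    return h

def h_joint_approx(lam):
    """Approximate joint entropy: sum of univariate Poisson entropies of all
    latent rates lam[i][j] (i <= j) -- the Eq. 25 style estimator."""
    n = len(lam)
    return sum(h_poisson(lam[i][j]) for i in range(n) for j in range(i, n))

def h_joint_exact_bivariate(l11, l22, l12, kmax=40):
    """Exact bivariate joint entropy by summing the bivariate Poisson pmf."""
    h = 0.0
    for a in range(kmax):
        for b in range(kmax):
            p = 0.0
            for k in range(min(a, b) + 1):
                p += (l11 ** (a - k) / math.factorial(a - k)
                      * l22 ** (b - k) / math.factorial(b - k)
                      * l12 ** k / math.factorial(k))
            p *= math.exp(-(l11 + l22 + l12))
            h -= p * math.log(p)
    return h

lam = [[0.3, 0.01], [0.01, 0.3]]
approx = h_joint_approx(lam)                     # three univariate entropies
exact = h_joint_exact_bivariate(0.3, 0.3, 0.01)  # full double summation
print(approx, exact)
```

Since (x_1, x_2) is a function of the independent latent variables, the approximation is an upper bound on the exact joint entropy, and in the small-rate regime the gap is small.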

D. Network Structure and Inference
In a gene interaction network, understanding how genes interact may help in designing targeted therapies, especially in cases where more than a single gene is implicated in a disease. Determining how genes interact to produce an outcome such as disease reduces to a network inference problem. We do not assume a priori knowledge of the underlying network structure; instead we have data describing time series of evolving stochastic processes at each of the states, related to each individual gene. The network is stated as a graph G = {V, E}, defined by a set of vertices V ⊂ N and edges E ⊂ V × V. Note that |V| = n denotes that there are n vertices (or nodes) in G, by the cardinality, | · |, of a set. The adjacency matrix A ∈ N_0^{n×n} is a convenient way to encode a graph:

A_ij = 1 if (i, j) ∈ E, and A_ij = 0 otherwise. (38)

When a system has a graph structure it is often referred to as a network, and the adjacency matrix then encodes the network structure of the system. Our goal is to estimate a network structure Â as close as possible to the true network structure A; that is, we want Σ_{i,j} |A_ij − Â_ij| to be as small as possible (ideally 0). We would also like this to be accomplished with as little data (t) as possible, since we are often limited in the amount of real world data we receive. Our estimation of the network structure relies on nodes sharing information with one another. Thus Â may be thought of as describing which nodes are directly communicating with one another, rather than strictly the physical structure. In our previous work [14,15], we proved that, under mild hypotheses, the multivariate stochastic process evolving by coupling on a complex network can be recovered exactly by optimal causation entropy (oCSE); errors arise from estimation issues, such as model entropies of observations from various distributions and finite data effects, but the inferred information network structure aligns accurately in most situations.

Figure 3. The relative error in the joint entropy calculation between the joint entropy calculated through truncation and the joint entropy calculated by our approximation. It is clear that when both λ12 and λ11λ22 are small, the relative error is small. Thus we expect this approximation to work well when all of the estimated rates are small. In practice we find that when scaling the rates to be in [0, 1] we get good results, regardless of how high the true rates were.
In the first example demonstration of our methods, we benchmark with synthetic data simulated by the multivariate Poisson model, Eqs. 2-4. To explicitly incorporate the adjacency matrix A and noise E as shown in [24,26], consider:

X = BY + E, (39)

with B given in Eq. 40. We have established in previous discussion that there is no analytical solution for the entropy of the multivariate Poisson, so an approximation has been made. Since the Poisson distribution resembles the Gaussian distribution, the latter is often assumed for estimates; we thus compare the performance of oMII assuming both distribution types. Fig. 4 shows that the oMII method, even using the rough Gaussian estimates of entropies, nonetheless does reasonably well at finding the true edges, with a high true positive rate (TPR). This is contrasted with network inference based on other entropy estimators, including the nonparametric KNN method and GLASSO, both of which are discussed below, as well as the Poisson estimator developed here. However, the Gaussian oMII finds the edges at the expense of a much larger false positive rate (FPR). Specifically, define TPR and FPR as follows: let G = {V, E} be the true network structure and Ĝ = {V̂, Ê} be the estimated network structure. Then:

TPR = |E ∩ Ê| / |E|, (41)

and

FPR = |Ê \ E| / |E|. (42)

In this case \ represents set subtraction. Note that from this definition 0 ≤ TPR ≤ 1 while FPR ≥ 0.
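The TPR and FPR definitions above can be written directly in terms of edge sets. The edge sets in this sketch are hypothetical, chosen to show that the FPR, being normalized by the number of true edges, can exceed 1.

```python
def rates(true_edges, est_edges):
    """TPR = |E ∩ Ê| / |E| and FPR = |Ê \\ E| / |E|, as defined in the text.
    Note the FPR is normalized by the number of true edges, so it can exceed 1."""
    true_edges, est_edges = set(true_edges), set(est_edges)
    tpr = len(true_edges & est_edges) / len(true_edges)
    fpr = len(est_edges - true_edges) / len(true_edges)
    return tpr, fpr

# Hypothetical directed edge sets: 4 true edges, 7 estimated edges.
true_e = {(1, 2), (2, 3), (3, 4), (4, 1)}
est_e = {(1, 2), (2, 3), (1, 3), (2, 4), (4, 2), (3, 1), (1, 4)}
print(rates(true_e, est_e))  # (0.5, 1.25)
```

Here 2 of the 4 true edges are recovered (TPR = 0.5), while 5 spurious edges against 4 true edges give FPR = 1.25, mirroring the GLASSO behavior reported below.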

IV. RESULTS
We compare the performance of several methods on simulated data sets, including various types of oMII, as well as GLASSO [25]. Unlike oMII, which uses conditional mutual information as its engine, GLASSO maximizes the penalized log-likelihood of Eq. 43 over values of a regularization parameter ρ,

log det Θ − tr(SΘ) − ρ ||Θ||_1, (43)

where S is the sample covariance and Θ the estimated precision matrix. A common method for the choice of ρ is maximization of the Bayesian information criterion (BIC). We utilize 1000 log-spaced values of ρ in [10^{−2}, 1], which varies Â between a complete network and a completely disconnected network with zero edges. Following [26], we first use a Box-Cox transformation of the Poisson distributed data, bc(z|γ) = (z^γ − 1)/γ for γ ≠ 0 and bc(z|0) = log(z), to make the data more Gaussian-like prior to using GLASSO. GLASSO results are shown in Fig. 4. The Poisson oMII method is tested on data simulated as described in the section above. In Fig. 4 each data point is averaged over 50 realizations of the network dynamics. Two different Erdős-Rényi (ER) graph types are used, one with p = 0.04 and one with p = 0.1. The parameter p in an ER graph controls the sparsity of the graph; thus the graphs with p = 0.1 will have considerably more edges on average than graphs with p = 0.04. For these simulations n = 50 was chosen. The rates were chosen to be λ_ij = 1 (∀ i, j) and E_i ∼ Poisson(0.5) (∀ i), where E_i ∈ N_0^{t×1} are the columns of E. This is the high SNR scenario from [24]. To estimate the rates, we simply use the correlation between all pairs in the data. We note that this differs from above, where we utilized the covariance matrix. Using correlation rather than covariance guarantees that the calculated rates will be relatively small, since correlations do not exceed 1 in absolute value; this keeps the estimated rates in the small relative error regime shown in Fig. 3. The correlation matrix then gives us all of the off-diagonal rates λ_ij (i ≠ j), and to obtain the diagonal rates λ_ii we can see from Eq.
5 that we simply need to subtract the sum of the off-diagonal elements from the diagonal elements. That is, letting e_ii denote the diagonal entries of the estimated matrix, λ_ii = e_ii − Σ_{j≠i} λ_ij. In Fig. 4 it can be seen that in terms of TPR all of the methods perform quite well, with the exception of the KNN version of oMII, which exhibits poor performance across all examined sample sizes, likely due to slow convergence.
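The rate-estimation step just described can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the clipping floor and the toy data are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

def estimate_rates(X):
    """Estimate multivariate-Poisson rates from an (n, t) data array using the
    correlation matrix, as described in the text: off-diagonal entries serve as
    lam_ij (kept small since |corr| <= 1), and lam_ii is the diagonal entry
    minus the row's off-diagonal sum, mirroring the covariance structure of
    Eq. 5. Negative estimates are floored at a small positive value, since the
    model cannot represent negative covariance."""
    C = np.corrcoef(X)
    lam = np.clip(C, 1e-6, None)       # off-diagonal rates from correlations
    np.fill_diagonal(lam, 0.0)
    diag = np.diag(C) - lam.sum(axis=1)  # lam_ii = e_ii - sum_{j != i} lam_ij
    np.fill_diagonal(lam, np.clip(diag, 1e-6, None))
    return lam

# Toy data: nodes 0 and 1 share a latent Poisson stream; node 2 is independent.
t = 50_000
shared = rng.poisson(0.5, t)
X = np.vstack([rng.poisson(1.0, t) + shared,
               rng.poisson(1.0, t) + shared,
               rng.poisson(1.5, t)])
lam = estimate_rates(X)
print(lam[0, 1], lam[0, 2])  # the coupled pair gets the larger rate
```

All estimated rates land in [0, 1] by construction, which is the regime where the joint entropy approximation of Fig. 3 has small relative error.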
In fact, for networks with few connections the poorest performing method in terms of TPR is the Poisson oMII method, with the best performing method being GLASSO. However, GLASSO produces a very high FPR; in fact GLASSO finds more false positives than there are total edges in the true network, thus producing an FPR greater than 1. By contrast, both the Gaussian and Poisson versions of oMII produce significantly lower FPR, and the Poisson oMII produces the lowest FPR across all sample sizes. It should be noted as well that the FPR of the Poisson version of oMII maintains an approximately constant level across all sample sizes, while the Gaussian version of oMII has an FPR that increases with sample size. For the denser networks, which had an expected average degree of 5, all methods had a decreased TPR for low sample size, as expected. The FPR also fell for all methods due to the larger denominator (more edges). The conclusions remain the same for both network densities.

We now examine data derived from breast cancer patients who have been screened for occurrence counts of different micro RNA's (miRNA's), analyzed by the Poisson oMII method featured in this paper. These data sets are publicly available at https://portal.gdc.cancer.gov, described as TCGA-BRCA sequencing miRNA. In this case, t = 1207 samples of n = 1881 different miRNA's are available. Of these 1881 miRNA's, ≈ 1000 pass the two sample Kolmogorov-Smirnov (KS) [30] test of consistency with the Poisson distribution at significance level α = 0.05. The remaining ≈ 900 miRNA data were then scaled componentwise using the mean, denoted < · >, and the floor to integers, denoted ⌊·⌋. The scaled data fit well, again by KS test, to a negative binomial distribution, with only ≈ 200 failing as both Poisson and negative binomial. Recall that the Poisson distribution is a special case of the negative binomial distribution, since:

P(X = k) = C(k + r − 1, k) (1 − p)^r p^k, with p = λ / (r + λ). (47)

In the limit r → ∞ in Eq. 47, it is easy to see that the term (1 − p)^r → e^{−λ}, while C(k + r − 1, k) p^k = [(k + r − 1)! / (k! (r − 1)!)] p^k → λ^k / k!. Combining these facts, as r → ∞ the negative binomial distribution limits to a Poisson distribution. Given that the majority of this miRNA data is distributed as scaled negative binomial (the Poisson data can also be fit as negative binomial), we must interpret the results with caution, especially in light of the results shown in Fig. 4. The results of applying the Poisson oMII are still interesting, especially since the negative binomial distribution can be viewed as a compound Poisson distribution [27,28]. To obtain the networks shown in Fig. 5 we first restricted the data to miRNA's having a minimum of > 100 total counts, to avoid including data with zero or near-zero variation. This restriction left us with 1072 miRNA's; oMII was then used to analyze the remaining miRNA data without any further pre-processing, resulting in the network shown in Fig. 5. The network has many miRNA's which are non-interacting; however, there is a large weakly connected component. Focusing on the nodes which are members of the largest weakly connected component (LWCC), we found that many miRNA's previously identified as up- or down-regulated in breast cancer end up in this component, which includes most of the miRNA's listed in Table 1 of [29]. For brevity, the miRNA's which land in the LWCC will be labeled interesting miRNA's.
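The negative binomial limit above can be verified numerically. The sketch below parameterizes the negative binomial so its mean is λ for every r (p = λ/(r + λ), an assumption matching the limit argument) and measures the largest pointwise pmf discrepancy from Poisson(λ).

```python
import math

def nb_pmf(k, r, lam):
    """Negative binomial pmf parameterized so the mean is lam for every r:
    success probability p = lam / (r + lam)."""
    p = lam / (r + lam)
    return math.comb(k + r - 1, k) * (1.0 - p) ** r * p ** k

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_discrepancy(r, lam, kmax=25):
    """Largest pointwise pmf difference between NB(r, .) and Poisson(lam)."""
    return max(abs(nb_pmf(k, r, lam) - poisson_pmf(k, lam)) for k in range(kmax))

for r in (5, 50, 5000):
    print(r, max_discrepancy(r, 2.0))  # shrinks toward 0 as r grows
```

The discrepancy shrinks roughly like 1/r, consistent with the r → ∞ limit and with the caveat that moderate-r negative binomial data is only approximately Poisson.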
Focusing on this set of 656 miRNA's, the plot of Fig. 5 highlights this component by sizing the nodes relative to their out degree. The nodes with no out degree are so small that they are difficult to see in the figure, while the nodes with the largest out degree are prominent. A feature of this network is that some miRNA's are "drivers" of the network, in that they have much larger out degree than the majority of other nodes. We list the top 20 miRNA's in order of their centrality based on out degree, betweenness centrality, and eigenvector centrality in Table IV. For all three measures the top 4 miRNA's are identically ordered; all 4 have been noted for a prominent role in breast cancer [29,[31][32][33][34][35] and they appear to be the main drivers. This suggests that it may be possible to target a small number of miRNA's to obtain some desired behavior of the system of miRNA's in drug development.

V. CONCLUSION
In this paper we have given an approximation to the mutual information of a multivariate Poisson system. We have shown through numerical experiments that this approximation is efficient, and the results of network estimation indicate that the approximation is justified. We have also developed the oMII (and by extension the oCSE) algorithm for computation of the causation entropy of a Poisson system based on the joint entropy approximation discussed above. We have shown that this model is superior to simply assuming the data is Gaussian, which is likely related to the behavior of the marginals in a Poisson system, as we have outlined above. The Poisson oMII algorithm also significantly outperforms the nonparametric KNN version of oMII. Finally, we have applied the Poisson oMII algorithm to a breast cancer miRNA expression count dataset, which has produced potentially interesting insights into the network of miRNA's as it relates to breast cancer. Our network inference on the breast cancer miRNA network shows relationships among the miRNA's with the highest variance in expression values. There appear to be unidirectional connections between these miRNA's, with certain miRNA's taking on the role of drivers in the network. This may suggest a course of action for future drug development.

VII. APPENDIX
Below we offer a proof of Eq. 5. By construction,

x_i = y_ii + Σ_{j≠i} y_ij, e.g., x_n = y_1n + y_2n + ... + y_nn.

Without loss of generality, consider the pair (i, j) = (1, 2). Since the y_kl are mutually independent, every cross term in the covariance vanishes except the one involving the shared variable y_12, so

cov(x_1, x_2) = cov(y_12, y_12) = var(y_12) = λ_12,

and similarly var(x_i) = Σ_j var(y_ij) = λ_ii + Σ_{j≠i} λ_ij, which is Eq. 5.

Figure 1 .
Figure 1.Work flow of our computationally efficient approach to estimate the joint entropy of multi-variate Poisson distributed variables.From data, we proceed to distribution parameter estimation to approximate joint entropy.

Figure 2 .
Figure 2. (a) The causation entropy between two processes Z and Y is shown. In this case, since we are only conditioning on a process X, C_{Z→Y|X} = T_{Z→Y}. Of course X may be replaced with a set of variables. (b) Here we show a special case where Z is independent of both X and Y (Z in this case may represent the history of X). In this case it becomes clear that H(Z|X, Y) = H(Z), H(X|Y, Z) = H(X|Y), and H(Y|X, Z) = H(Y|X). As explained in the text, this special case helps us to discern the proper variables to use in the Poisson case.

B = [I_n ; P ⊙ (1_n tri(A)^T)], (40)

where I_n is the (n × n) identity matrix, P ∈ N_0^{n×(m−n)} is a permutation matrix with exactly n ones per row, ⊙ represents the Hadamard product (componentwise multiplication of same-sized arrays), 1_n ∈ N^{n×1} is the vector of all ones, tri(A) ∈ N_0^{n(n−1)/2 × 1} denotes the vectorized upper triangular portion of the adjacency matrix, and E ∈ N_0^{n×t} is the noise term.

Figure 4 .
Figure 4. True positive and false positive rates for several test methods on ER graphs of two different levels of sparsity: Erdős-Rényi (ER) graphs with triangles for 50-node graphs with strong sparsity (p = 0.04), and x's for 50-node ER graphs that are denser (p = 0.1). The magenta lines represent GLASSO, the blue lines the Poisson oMII, the red lines the Gaussian oMII, and the green lines the KNN oMII. In (a) the true positive rate (TPR) is shown for different sample sizes; each point is averaged over 50 realizations of the network dynamics. In (b) the false positive rate (FPR) is shown. Clearly GLASSO finds more true edges, but at the expense of significantly more false positives. In fact, for the highly sparse ER network, GLASSO finds 3 times as many edges as actually exist in the network with 1000 data points, and its FPR increases with data set size. As can be seen, the Gaussian oMII performs as well as the Poisson oMII in TPR, with the KNN performing poorly, but the Poisson oMII significantly outperforms all other methods in terms of FPR. It appears that the Poisson oMII is the only method that converges to the true network structure with increasing sample size. (c) Comparing TPR between GLASSO, the hybrid method, and Poisson oCSE. The hybrid method has an increased TPR relative to Poisson oCSE. (d) The FPR increases slightly for the hybrid method, but is substantially lower than GLASSO's.

Figure 5 .
Figure 5. Example network generated by the hybrid oMII algorithm.Nodes and text are sized relative to the out degree of the node.The nodes with largest out degree have previously been connected with breast cancer.
. was supported by the Army Research Office (N68164-EG) and, J.F. and E.B. were supported by DARPA.