A decentralized control approach in hypergraph distributed optimization decomposition cases
Applied Network Science volume 9, Article number: 58 (2024)
Abstract
We study the main decomposition approaches (primal, dual and primal–dual) for a distributed optimization problem from a dynamical system perspective, where the couplings among the variables of the optimization problem are described by an undirected, unweighted hypergraph. We conduct stability analysis for the respective dynamical systems of the decomposition cases by using nonlinear decentralized control-theoretic techniques and spectral properties of the respective communication matrices, i.e., the incidence and the Laplacian matrices of the hypergraph. Finally, we provide numerical simulations under a specific coalitional setting that demonstrate the superiority of the hypergraph over its graph analogue, the clique expansion graph, for the given decomposition algorithms in terms of convergence rate and information transmission efficiency.
Introduction
With the extensive study of multi-agent systems in almost all aspects of applied mathematics, many optimization algorithms grounded in standard optimization theory (Bertsekas 2009) have been developed that exploit the graphical structure of these systems, as in Bertsekas (1991), Cerquides et al. (2014) and Shakkottai and Srikant (2008). One of these approaches, called distributed optimization, has its roots in Tsitsiklis (1984). The main trait of distributed optimization is the presence of coupling (common) variables among the objective functions of the agents of the system; a review of distributed optimization is provided in Yang (2019). The couplings of these variables are usually described by a graph, but in this work we instead use a hypergraph, in a similar fashion to Samar et al. (2007) and Boyd (2007).
Hypergraphs (Berge 1973) are a generalization of graphs in the sense that they allow multiple nodes, and not just two as in the case of a graph, to be attached to the same edge (hyperedge). As a result, this type of graphical structure can depict more complex communication structures than a graph. More details on the properties of hypergraphs are provided in Dai and Gao (2023). Besides describing a communication structure more efficiently than a graph, hypergraphs can also accelerate the underlying optimization algorithms. This comparison is most meaningful in the coalition or clustering setting, where we usually have unions of complete subgraphs that can be viewed as the clique expansions of hypergraphs (Agarwal et al. 2006). It is also important to note that in the clustering setting the notion of consensus is central, and in the hypergraph setting each hyperedge can be interpreted as a consensus variable among the attached agents. For more details on consensus theory we refer the reader to Kia et al. (2019). In the examples of this work we provide numerical simulations under a specific coalitional setting, where we show that the hypergraph is more efficient in terms of information transmission and faster in terms of convergence rate than its respective clique expansion graph for the dual decomposition and primal–dual algorithms, in which we utilize the respective Laplacian matrices.
In most distributed optimization algorithms, stability analysis is conducted by utilizing spectral properties and matrix-theoretic techniques (Zhang 2011) of the respective communication matrix (Kosaraju et al. 2017; Kvaternik and Pavel 2011), usually the graph Laplacian matrix. Our approach is similar, relying on the hypergraph Laplacian matrix proposed by Bolla (1993), which satisfies all the properties of a Laplacian matrix. In this paper we focus on algorithms that were presented in Palomar and Chiang (2006) and Boyd (2007). We study the main decomposition cases (primal, dual and primal–dual) for the hypergraph distributed optimization problem from a dynamical system viewpoint. We conduct the stability analysis of these dynamical systems by utilizing tools from nonlinear control theory (Nijmeijer and Van der Schaft 1990), mostly passivity and Lyapunov techniques. We also show that the equilibrium points of the dynamical systems coincide with the optimal solution of the hypergraph distributed optimization problem. This work is an extended version of Papastaikoudis and Lestas (2023), which was presented in the 12th International Conference on Complex Networks and their Applications. Here we additionally study the primal and dual decomposition cases by expressing the problem with the help of the vector of common values and the hypergraph incidence matrix. We also show how the hypergraph Laplacian matrix, which we use directly in the primal–dual algorithm, arises naturally from the dual decomposition algorithm.
The paper is organized as follows: in “Preliminaries” section we present the main mathematical preliminaries used in this work. In “Hypergraph distributed optimization” section we introduce the hypergraph distributed optimization problem and in “Decomposition cases” section we present the decomposition cases along with the respective stability analysis proofs. Examples that highlight the advantage of the hypergraph over its graph analogue, the clique expansion graph, are presented throughout the paper.
Preliminaries
Basic notions
By \({\mathbb {R}}\) we denote the set of real numbers and by \({\mathbb {R}}^n\) the set of n-dimensional real vectors. For \(x\in {\mathbb {R}}^n\), by \(x\ge 0\) (\(x>0\)) we mean that the components of x are nonnegative (positive). By \(||\cdot ||_2\) we denote the \(l_2\) (Euclidean) norm. Function \(|\cdot |\) denotes the cardinality function. For a matrix \(M\in {\mathbb {R}}^{n\times n}\), by \(M^T\) and S(M) we denote its transpose and its spectrum respectively. A matrix \(P\in {\mathbb {R}}^{n\times n}\) is an orthogonal projection matrix when \(P^2=P=P^T\), in which case \(S(P)\subseteq \{0,1\}\). A symmetric matrix M is called positive definite (positive semidefinite), denoted \(M\succ 0\) (\(M\succeq 0\)), if and only if \(x^TMx>0\) (\(x^TMx\ge 0\)) for every nonzero \(x\in {\mathbb {R}}^n\).
Non-linear control theory
We study the following continuous-time, time-invariant system,
$$\dot{x}(t)=f(x(t)), \qquad (1)$$
with \(x(t)\in {\mathbb {R}}^n\) and \(f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) being continuous. A function \(V:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) satisfying \(||x||\rightarrow \infty \Rightarrow V(x)\rightarrow \infty\) is called radially unbounded. By \({\dot{V}}:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) we denote the Lie derivative of V along the trajectories of (1), which is given by the formula:
$$\dot{V}(x)=\nabla V(x)^T\dot{x}=\nabla V(x)^Tf(x).$$
Theorem 1
Let \(x^*\) be an equilibrium point of (1). If \(V:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is radially unbounded and \({\dot{V}}(x)<0, \forall \ x \ne x^*\) then \(x^*\) is globally asymptotically stable and V is a valid Lyapunov function for (1).
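For instance (an illustrative example of ours, not used later in the paper), consider the scalar system \(\dot{x}=-x^3\) with equilibrium \(x^*=0\). The function \(V(x)=\frac{1}{2}x^2\) is radially unbounded and \({\dot{V}}(x)=x\dot{x}=-x^4<0\) for all \(x\ne 0\), so by Theorem 1 the origin is globally asymptotically stable.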
Definition 1
Consider a system \(\Sigma\) with the following state-space representation,
$$\dot{x}(t)=f(x(t))+u(t), \qquad y(t)=h(x(t)),$$
where \(x(t), u(t), y(t)\in {\mathbb {R}}^n\) are the state, the input and the output of \(\Sigma\) respectively, and \(f,h:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\). The system \(\Sigma\) is passive if there exists a positive semidefinite function \(S:{\mathbb {R}}^n\rightarrow {\mathbb {R}}_{+}:=[0,\infty )\), called a storage function, such that
$$\dot{S}(t)\le u^T(t)y(t)$$
holds \(\forall \ x(t), u(t), y(t)\in {\mathbb {R}}^n\). If, in addition,
$$\dot{S}(t)\le u^T(t)y(t)-\psi (x(t))$$
for some positive definite function \(\psi\), then \(\Sigma\) is strictly passive.
Theorem 2
The negative feedback interconnection of two passive systems is a stable system (Fig. 1).
Definition 2
A domain \({\mathcal {D}}\subseteq {\mathbb {R}}^n\) is called invariant for (1) if
$$x(0)\in {\mathcal {D}}\Rightarrow x(t)\in {\mathcal {D}}, \quad \forall \ t\in {\mathbb {R}}.$$
Definition 3
A domain \({\mathcal {D}}\subseteq {\mathbb {R}}^n\) is called positively invariant for (1) if
$$x(0)\in {\mathcal {D}}\Rightarrow x(t)\in {\mathcal {D}}, \quad \forall \ t\ge 0.$$
Theorem 3
(LaSalle’s Invariance Principle) Let \(\Omega \subset D\) be a compact positively invariant set with respect to (1). Let \(V:D\rightarrow {\mathbb {R}}\) be a continuously differentiable function such that \({\dot{V}}(x)\le 0\) in \(\Omega\). Let \({\mathcal {X}}\) be the set of all points in \(\Omega\) where \({\dot{V}}(x)=0\) and let M be the largest invariant set in \({\mathcal {X}}\). Then every solution starting in \(\Omega\) approaches M as \(t\rightarrow \infty\).
Lemma 1
Consider a convex function \(f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\). Its gradient \(\nabla f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) is incrementally passive, i.e., the following inequality holds,
$$(\nabla f(x)-\nabla f(y))^T(x-y)\ge 0, \quad \forall \ x,y\in {\mathbb {R}}^n. \qquad (4)$$
If f is strictly convex, inequality (4) holds strictly \(\forall \ x\ne y\), and \(\nabla f\) is then strictly incrementally passive.
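As a quick numerical illustration of Lemma 1 (a sketch of ours; the quadratic below is an assumed example, not one from the paper), the incremental passivity of the gradient of a strictly convex quadratic can be checked directly:

```python
import numpy as np

# Illustrative check of Lemma 1 (assumed example): for f(x) = 0.5 * x^T P x
# with P positive definite, the gradient grad f(x) = P x satisfies
# (grad f(x) - grad f(y))^T (x - y) >= 0, strictly for x != y.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = A @ A.T + 4 * np.eye(4)              # positive definite Hessian
grad = lambda x: P @ x                   # gradient of the quadratic

for _ in range(1000):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    assert (grad(x) - grad(y)) @ (x - y) >= 0.0
```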
Lemma 2
A function \(f:A\subset {\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) is locally Lipschitz if for every \(x_0\in A\) there exist constants \(M>0\) and \(\delta _0>0\) such that \(||x-x_0||<\delta _0\Rightarrow ||f(x)-f(x_0)||\le M||x-x_0||\).
Graph theory
Graphs
A graph \({\mathcal {G}}=({\mathcal {V}},{\mathcal {E}})\) is an ordered pair, where \({\mathcal {V}}=\{v_1, \ldots ,v_n\}\) is the node set and \({\mathcal {E}}=\{e_1,\ldots ,e_m\}\) is the edge set. The degree of a node \(v_i\), denoted by \(|v_i|\), is the total number of edges that are adjacent to this node. The total number of nodes in the graph is called the order of the graph and is denoted by \(|{\mathcal {V}}|\). We define by \(D_V\) the diagonal \(|{\mathcal {V}}|\times |{\mathcal {V}}|\) matrix whose entries are the degrees of each node, i.e., \(D_V=\text {diag}\{|v_1|,\ldots ,|v_n|\}\).
The adjacency matrix of graph \({\mathcal {G}}\), denoted by A, is a \(|{\mathcal {V}}|\times |{\mathcal {V}}|\) matrix whose (i, j)-th entry is given by
$$A_{ij}=\begin{cases} 1, & \text {if nodes } v_i \text { and } v_j \text { are connected by an edge},\\ 0, & \text {otherwise}. \end{cases}$$
The Laplacian matrix of graph \({\mathcal {G}}\), denoted by L, is a \(|{\mathcal {V}}|\times |{\mathcal {V}}|\) matrix given by the following formula,
$$L=D_V-A. \qquad (5)$$
Hypergraphs
A hypergraph \({\mathcal {H}}=({\mathcal {V}},{\mathcal {E}})\), where \({\mathcal {V}}=\{v_1,\ldots ,v_n\}\) is the finite set of nodes and \({\mathcal {E}}=\{{\mathcal {E}}_1,\ldots ,{\mathcal {E}}_m\}\) is the corresponding set of hyperedges, is a generalization of a graph in the sense that each hyperedge can join any number of nodes and not just two. In the case that a hyperedge joins two nodes, it is called a “standard” hyperedge. The degree of a node \(v_i\), denoted by \(|v_i|\), is the total number of hyperedges adjacent to this node, while the degree of a hyperedge \({\mathcal {E}}_j\), denoted by \(|{\mathcal {E}}_j|\), is the total number of nodes adjacent to this hyperedge. We define by \(D_V\) the diagonal \(|{\mathcal {V}}|\times |{\mathcal {V}}|\) matrix whose entries are the degrees of each node, i.e., \(D_V=\text {diag}\{|v_1|,\ldots ,|v_n|\}\), and by \(D_E\) the diagonal \(|{\mathcal {E}}|\times |{\mathcal {E}}|\) matrix whose entries are the degrees of each hyperedge, i.e., \(D_E=\text {diag}\{|{\mathcal {E}}_1|,\ldots ,|{\mathcal {E}}_m|\}\). For a hypergraph \({\mathcal {H}}\), the incidence matrix, denoted by E, is a \(|{\mathcal {V}}|\times |{\mathcal {E}}|\) matrix whose (i, j)-th entry is defined as:
$$E_{ij}=\begin{cases} 1, & \text {if } v_i\in {\mathcal {E}}_j,\\ 0, & \text {otherwise}. \end{cases}$$
The hypergraph Laplacian (Bolla’s Laplacian) Q is a \(|{\mathcal {V}}|\times |{\mathcal {V}}|\) matrix given by the formula:
$$Q=D_V-ED_E^{-1}E^T. \qquad (6)$$
The clique of a hyperedge \({\mathcal {E}}_j\) is a complete subgraph on the hyperedge’s adjacent nodes with \(|{\mathcal {E}}_j|(|{\mathcal {E}}_j|-1)/2\) pairwise interactions. We call a graph the clique expansion of a hypergraph if we substitute all the hyperedges with their respective cliques.
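The construction of these matrices can be sketched in a few lines of code; the small hypergraph below is an illustrative choice of ours, not one used in the paper, and the formulas are those stated above.

```python
import numpy as np

# Sketch: incidence matrix E, degree matrices D_V, D_E, Bolla's Laplacian
# Q = D_V - E D_E^{-1} E^T, and the clique-expansion Laplacian L = D - A
# for an illustrative hypergraph with 4 nodes and 2 hyperedges.
hyperedges = [{0, 1, 2}, {2, 3}]
n, m = 4, len(hyperedges)

E = np.zeros((n, m))
for j, e in enumerate(hyperedges):
    E[list(e), j] = 1.0

D_V = np.diag(E.sum(axis=1))                  # node degrees
D_E = np.diag(E.sum(axis=0))                  # hyperedge degrees
Q = D_V - E @ np.linalg.inv(D_E) @ E.T        # hypergraph (Bolla) Laplacian (6)

# Clique expansion: replace every hyperedge by a complete subgraph.
A = np.zeros((n, n))
for e in hyperedges:
    for i in e:
        for k in e:
            if i != k:
                A[i, k] = 1.0
L = np.diag(A.sum(axis=1)) - A                # clique-expansion graph Laplacian (5)

print(np.round(Q, 2))
print(L)
```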
Hypergraph distributed optimization
We have the following distributed optimization problem for K different subsystems,
$$\begin{aligned} &\min _{x_i,y_i,z} \ \sum _{i=1}^K F_i(x_i,y_i)\\ &\text {subject to} \quad (x_i,y_i)\in {\mathcal {C}}_i, \quad x_i=E_iz, \quad i=1,\ldots ,K, \end{aligned} \qquad (7)$$
where
- \(F_i:{\mathbb {R}}^{(p_i+m_i)}\rightarrow {\mathbb {R}}\) is the objective function of the ith subsystem and is considered to be strictly convex and continuously differentiable, with its gradient \(\nabla F_i\) being locally Lipschitz.
- Vectors \(x_i\in {\mathbb {R}}^{p_i}, \forall \ 1\le i\le K\), denote the coupling variables of the subsystems (i.e., their components appear in the variables of other subsystems as well).
- Vectors \(y_i\in {\mathbb {R}}^{m_i}, \forall \ 1\le i\le K\), denote the local variables of the subsystems (i.e., they appear in only one subsystem).
- \({\mathcal {C}}_i\) is the feasible set of subsystem i, described by linear equalities and convex inequalities.
- Vector \(z \in {\mathbb {R}}^N\) gives the respective common values of the N different groups of coupling variable components.
- The relationship \(x_i=E_iz\) allocates the variable components of the ith subsystem to their respective common values, where \(E_i\) is a \({p_i\times N}\) matrix whose (l, j)-th entry is given by
$$E_{i}^{lj}=\begin{cases} 1, & \text {if } x_{i}^{l}=z_{j}, \ \forall \ 1\le l\le p_{i},\\ 0, & \text {otherwise}, \end{cases} \qquad (8)$$
with \(x_i^l\) denoting the lth component of variable \(x_i\).
We consider a hypergraph \({\mathcal {H}}=({\mathcal {V}},{\mathcal {E}})\) to represent the coupling variables of the different subsystems.
- The set of nodes \({\mathcal {V}}\) is partitioned into \({\mathcal {V}}=\{{\mathcal {V}}_1,\ldots ,{\mathcal {V}}_K\}\), where each node in subset \({\mathcal {V}}_i\) is associated with a component of variable \(x_i, \ \forall \ 1\le i\le K\).
- Each hyperedge \({\mathcal {E}}_j\) is associated with the jth component of vector z. Therefore the hyperedge set \({\mathcal {E}}\) is associated with the couplings of the different variable components.
For the hypergraph \({\mathcal {H}}\) we have,
Hence, by \(x_i^j\) we denote the coupling variable of the ith subsystem that belongs to the jth group of coupling variables for \(i=1,\ldots ,K\) and \(j=1,\ldots ,N\). The relationship \(x_i=E_iz, \forall \ 1\le i\le K\) can also be written as \(x=Ez\) where \(x=(x_1,\ldots ,x_K)\in {\mathbb {R}}^p\).
We let \(f_i(x_i)\) denote the optimal value of the ith subproblem over the local variable \(y_i\), as a function of \(x_i\),
$$f_i(x_i)=\min _{y_i}\ \{F_i(x_i,y_i): (x_i,y_i)\in {\mathcal {C}}_i\}.$$
The functions \(f_i, i=1,\ldots ,K\), are strictly convex as well, since partial minimization of a strictly convex function over one of its variables preserves strict convexity. The Lagrangian of (7) is
$$L(x,z,v)=\sum _{i=1}^K\left( f_i(x_i)-v_i^T(x_i-E_iz)\right) ,$$
which can also be written as
$$L(x,z,v)=\sum _{i=1}^K f_i(x_i)-v^T(x-Ez),$$
where \(v_i\) corresponds to the ith subsystem and is the subvector of the Lagrange multiplier v associated with the constraint \(x=Ez\). The optimality conditions are:
$$v_i=\nabla f_i(x_i), \quad i=1,\ldots ,K, \qquad (11a)$$
$$x=Ez, \qquad (11b)$$
$$E^Tv=\sum _{i=1}^K E_i^Tv_i=0. \qquad (11c)$$
Example 1
Figure 2 presents the information structure of a coalitional hypergraph \({\mathcal {H}}\) among three coalitions, where each agent from each coalition is allowed to engage in exactly one consensus relationship. Variables \(\{x_1^1,x_2^1,x_3^1\}\) are attached to hyperedge \({\mathcal {E}}_1\) and describe the agents of the three coalitions engaging in the first consensus relationship. Variables \(\{x_1^2,x_2^2\}\) are attached to hyperedge \({\mathcal {E}}_2\), while the variables \(\{x_1^3,x_3^3\}\) are attached to hyperedge \({\mathcal {E}}_3\), and similarly describe the engagement of the respective agents in the consensus relationships. Hypergraph \({\mathcal {H}}\) has the following characteristics: \(|{\mathcal {V}}|=7, |{\mathcal {E}}|=3, D_V=I_7\) and \(D_E=\text {diag}\{3,2,2\}.\)
The respective clique expansion graph of the hypergraph in Fig. 2 is given in Fig. 3. We notice that the only difference between the hypergraph and its respective clique expansion graph concerns hyperedge \({\mathcal {E}}_1\) and its respective clique \(C_1\), since the other hyperedges are standard and their cliques remain the same. We also notice that the clique \(C_1\) requires three edges in order to substitute the information transmission of hyperedge \({\mathcal {E}}_1\), and for this reason the hypergraph is more efficient than its respective clique expansion graph. The clique expansion graph therefore has \(|{\mathcal {V}}|=7\) nodes and five edges in total.
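The incidence matrix of this example and the comparison with the clique expansion can be reproduced with the following sketch (the node ordering \((x_1^1,x_1^2,x_1^3,x_2^1,x_2^2,x_3^1,x_3^3)\) is our own choice):

```python
import numpy as np

# Example 1 (sketch): 7 nodes, 3 hyperedges.
# Assumed node ordering: x_1^1, x_1^2, x_1^3, x_2^1, x_2^2, x_3^1, x_3^3.
# Hyperedges: E_1 = {x_1^1, x_2^1, x_3^1}, E_2 = {x_1^2, x_2^2}, E_3 = {x_1^3, x_3^3}.
hyperedges = [{0, 3, 5}, {1, 4}, {2, 6}]
n = 7

E = np.zeros((n, len(hyperedges)))
for j, e in enumerate(hyperedges):
    E[list(e), j] = 1.0

D_E = np.diag(E.sum(axis=0))                    # diag(3, 2, 2)
assert np.allclose(E.T @ E, D_E)                # see Remark 1 below: E^T E = D_E

# Each node belongs to exactly one hyperedge, so D_V = I and (6) becomes
# Q = I - E D_E^{-1} E^T.
Q = np.eye(n) - E @ np.linalg.inv(D_E) @ E.T

# Clique expansion: E_1 needs 3 pairwise edges, E_2 and E_3 one each,
# i.e., 5 graph edges replace the 3 hyperedges.
edges = [(i, k) for e in hyperedges for i in e for k in e if i < k]
print(len(edges))                               # 5
```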
Remark 1
It is important to note that in our setting \(E^TE=D_E\).
Decomposition cases
In this section we introduce the continuous-time dynamical system interpretation of the primal, dual and primal–dual decomposition cases. We prove that the equilibrium points of the dynamical systems of the decomposition cases are the optimal solutions of the hypergraph distributed optimization problem, and we prove the convergence of these dynamical systems using passivity techniques and Lyapunov theory.
Primal decomposition
In primal decomposition, at each iteration the vector z of common values is fixed and we set the coupling variables to \(x_i=E_iz\). The problem is then separable and the original problem is equivalent to the primal problem:
$$\min _{z}\ f(z)=\sum _{i=1}^K f_i(E_iz). \qquad (12)$$
To find the gradient of f, we calculate \(\nabla f_i, i=1,\ldots ,K\), which are strictly incrementally passive functions due to the strict convexity of \(f_i, i=1,\ldots ,K\). We then have
$$\nabla f(z)=\sum _{i=1}^K E_i^T\nabla f_i(E_iz).$$
By solving (12) using a gradient method, we have the following algorithm.
Dynamical system
$$\begin{aligned} x_i(t)&=E_iz(t), \quad i=1,\ldots ,K,&(13a)\\ v_i(t)&=\nabla f_i(x_i(t)), \quad i=1,\ldots ,K,&(13b)\\ w(t)&=\sum _{i=1}^K E_i^Tv_i(t),&(13c)\\ \dot{z}(t)&=-w(t).&(13d) \end{aligned}$$
Theorem 4
Let \(x^*\) be an equilibrium point of the dynamical system (13a)–(13d). Then \(x^*\) is a solution to the optimization problem (7).
Proof
We find the equilibrium point of the dynamical system (13a)–(13d) from \(\dot{z}(t)=0\Rightarrow -w(t)=0\Rightarrow \sum \nolimits _{i=1}^K E_i^Tv_i(t)=\sum \nolimits _{i=1}^K E_i^T\nabla f_i(x_i(t))=0\), which coincides with the last of the optimality conditions, (11c). The other optimality conditions are satisfied by the equations of the dynamical system and, as a result, the equilibrium point of (13a)–(13d) solves the optimization problem (7). \(\square\)
Theorem 5
The dynamical system (13a)–(13d) is a negative feedback interconnection of a passive and a strictly passive system.
Proof
The dynamical system (13a)–(13d) can be seen as a negative feedback interconnection of a passive and a strictly passive system, as illustrated in Fig. 4. The system of non-linearities is \(\nabla f(x(t))-\nabla f(x^*)=[\nabla f_1(x_1(t))-\nabla f_1(x_1^*),\ldots ,\nabla f_K(x_K(t))-\nabla f_K(x_K^*)]\), which has input \(x(t)-x^*=(x_1(t)-x_1^*,\ldots ,x_K(t)-x_K^*)\) and output \(v(t)-v^*=(v_1(t)-v_1^*,\ldots ,v_K(t)-v_K^*)\). Each \(\nabla f_i(x_i(t)), i=1,\ldots ,K\), is strictly incrementally passive and, as a result, the system of non-linearities is strictly incrementally passive with zero storage function \(\forall \ x(t)\in {\mathbb {R}}^n\). The pre/post multiplication of a passive system by \(E^T\) and E respectively preserves passivity, since matrix \(E^TE=D_E\) is positive definite. Regarding the integrator system, it has input \(-E^T(v(t)-v^*)\) and output \(z(t)-z^*=(z_1(t)-z_1^*,\ldots ,z_N(t)-z_N^*)\). In order for this system to be passive there must exist a storage function \(S(t):{\mathbb {R}}^N\rightarrow {\mathbb {R}}\) such that \({\dot{S}}(t)\le -[E^T(v(t)-v^*)]^T(z(t)-z^*)=-(v(t)-v^*)^T[E(z(t)-z^*)]=-(v(t)-v^*)^T(x(t)-x^*)\). An appropriate choice of storage function that satisfies the passivity inequality is \(S(t)=\frac{1}{2}||z(t)-z^*||_2^2\), with Lie derivative
$$\begin{aligned} \dot{S}(t)&=(z(t)-z^*)^T\dot{z}(t)=-(z(t)-z^*)^TE^Tv(t)\\ &=-(z(t)-z^*)^TE^T(v(t)-v^*)\\ &=-(v(t)-v^*)^T(x(t)-x^*), \end{aligned}$$
where we have used in the second line that \(E^Tv^*=0\). As a result, the integrator system is passive and we have a negative feedback interconnection of a passive and a strictly passive system. \(\square\)
Theorem 6
The equilibrium point of the dynamical system (13a)–(13d) is globally asymptotically stable.
Proof
The negative feedback interconnection of a passive and a strictly passive system is a stable system and a candidate Lyapunov function for the interconnected system is the sum of the storage functions of the individual systems. As a result, we choose for the dynamical system (13a)–(13d) the Lyapunov candidate \(V(t)= \frac{1}{2}||z(t)-z^*||_2^2\), where \(V(t):{\mathbb {R}}^N\rightarrow {\mathbb {R}}\) is radially unbounded. The Lie derivative of the Lyapunov function is
$$\begin{aligned} \dot{V}(t)&=(z(t)-z^*)^T\dot{z}(t)=-(z(t)-z^*)^TE^T\nabla f(x(t))\\ &=-(z(t)-z^*)^TE^T\left( \nabla f(x(t))-\nabla f(x^*)\right) \\ &=-(x(t)-x^*)^T\left( \nabla f(x(t))-\nabla f(x^*)\right) <0, \quad \forall \ x\ne x^*, \end{aligned}$$
since \(\nabla f\) is strictly incrementally passive; in the second line we made use of \(E^T\nabla f(x^*)=0\). From the above we conclude that the equilibrium point \(z=z^*\) is globally asymptotically stable (G.A.S.) for the dynamics (13a)–(13d). \(\square\)
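A forward-Euler discretization of these primal decomposition dynamics can be sketched as follows; the quadratic objectives \(f_i\), the step size and the allocation matrices (taken from the Example 1 topology under our assumed node ordering) are illustrative choices of ours.

```python
import numpy as np

# Sketch of the primal decomposition dynamics, forward-Euler discretized:
#   x_i = E_i z,  v_i = grad f_i(x_i),  z_dot = -sum_i E_i^T v_i.
# Assumed quadratic objectives f_i(x_i) = 0.5 * ||x_i - c_i||^2 (illustrative).
rng = np.random.default_rng(1)

# Allocation matrices E_i for the Example 1 topology (our node ordering):
# subsystem 1 couples to z_1, z_2, z_3; subsystem 2 to z_1, z_2; subsystem 3 to z_1, z_3.
E_list = [np.eye(3),
          np.array([[1., 0., 0.], [0., 1., 0.]]),
          np.array([[1., 0., 0.], [0., 0., 1.]])]
c_list = [rng.standard_normal(Ei.shape[0]) for Ei in E_list]

z = np.zeros(3)
dt = 0.05
for _ in range(2000):
    w = sum(Ei.T @ ((Ei @ z) - ci) for Ei, ci in zip(E_list, c_list))   # sum_i E_i^T grad f_i(E_i z)
    z = z - dt * w                                                      # Euler step of z_dot = -w

print(z)   # approximate minimizer of sum_i f_i(E_i z)
```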
Dual decomposition
We now use dual decomposition to solve the optimization problem (7). To form the dual problem, we first minimize the Lagrangian over z, which results in the optimality condition \(E^Tv=0\), and we then solve the following subproblems,
$$\min _{x_i}\ \{f_i(x_i)-v_i^Tx_i: x_i \in {\mathcal {C}}_i\}, \quad i=1,\ldots ,K.$$
Since the objective functions \(f_i(x_i)\) are strictly convex, the solution of each subproblem is unique. The dual of the original problem (7) is
$$\max _{v}\ \sum _{i=1}^K h_i(v_i) \quad \text {subject to} \quad E^Tv=0, \qquad (17)$$
where \(h_i(v_i)=\min \limits _{x_{i}}\{f_i(x_i)-v_i^Tx_i: x_i \in {\mathcal {C}}_i\}\). We refer to (17) as the dual decomposition master problem. Assuming that strong duality holds, the primal problem (7) can be equivalently solved by solving the dual problem. Below we present the dynamical system of the dual decomposition algorithm; a detailed derivation of the dual algorithm is given in Samar et al. (2007).
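As a brief illustration (an example of ours, not taken from the paper), for an unconstrained quadratic subproblem \(f_i(x_i)=\frac{1}{2}||x_i-c_i||_2^2\) the minimizer of \(f_i(x_i)-v_i^Tx_i\) is \(x_i=c_i+v_i=\nabla f_i^{-1}(v_i)\), and hence
$$h_i(v_i)=f_i(c_i+v_i)-v_i^T(c_i+v_i)=-\frac{1}{2}||v_i||_2^2-v_i^Tc_i,$$
which is the concave function that is maximized in the master problem (17).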
Dynamical system
We initialize v so that \(E^Tv=0\).
$$\begin{aligned} x_i(t)&=\nabla f^{-1}_i(v_i(t)), \quad i=1,\ldots ,K,&(18a)\\ z(t)&=(E^TE)^{-1}E^Tx(t),&(18b)\\ \dot{v}(t)&=-(x(t)-Ez(t)),&(18c) \end{aligned}$$
where \(\nabla f^{-1}_i\) is strictly incrementally passive due to the strict convexity of \(f_i\), and we also assume that \(\nabla f^{-1}\) has an analytical expression. If we substitute (18b) into (18c), we obtain \({\dot{v}}(t)=-(x(t)-E(E^TE)^{-1}E^Tx(t)) \Rightarrow\)
$$\dot{v}(t)=-Qx(t), \qquad (19)$$
where
$$Q=I-E(E^TE)^{-1}E^T. \qquad (20)$$
Lemma 3
Matrix Q in (20) is an orthogonal projection matrix.
Proof
We have that
$$Q^2=\left( I-E(E^TE)^{-1}E^T\right) ^2=I-2E(E^TE)^{-1}E^T+E(E^TE)^{-1}E^TE(E^TE)^{-1}E^T=I-E(E^TE)^{-1}E^T=Q,$$
and also \(Q=Q^T\). As a result, Q is an orthogonal projection matrix. Matrix Q is also positive semidefinite with spectrum \(S(Q)=\{0,1\}.\) \(\square\)
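The projection property can also be verified numerically (a small sketch using the Example 1 incidence pattern under our assumed node ordering):

```python
import numpy as np

# Numerical check of Lemma 3 (sketch): Q = I - E (E^T E)^{-1} E^T satisfies
# Q^2 = Q = Q^T and its eigenvalues lie in {0, 1}.
E = np.zeros((7, 3))
for j, e in enumerate([{0, 3, 5}, {1, 4}, {2, 6}]):   # Example 1 incidence pattern (assumed ordering)
    E[list(e), j] = 1.0

Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T

assert np.allclose(Q @ Q, Q) and np.allclose(Q, Q.T)
print(np.round(np.linalg.eigvalsh(Q), 6))             # eigenvalues are zeros and ones
```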
Example 2
Continuing the previous example we have that the Laplacian matrices for the hypergraph Q and the clique expansion graph L are given below,
Remark 2
An interesting observation is that the communication matrix Q used in the dual decomposition algorithm, apart from being a projection matrix, is also Bolla’s Laplacian for hypergraphs (Bolla 1993), since in this case \(D_V=I\) and \(D_E=E^TE\).
Theorem 7
Let \(v^*\) be an equilibrium point of the dynamical system (18a)–(18c). Then \(v^*\) is a solution to the optimization problem (7) under the initial conditions \(E^T v_0 = 0, v_0 \in T\) and \({\dot{v}}\in T\), where \(T={\mathcal {R}}[Q]\) is the range of matrix Q.
Proof
We find the equilibrium point of the dynamical system (18a)–(18c) from \(\dot{v}(t)=0\Rightarrow -(x(t)-Ez(t))=0\Rightarrow x(t)=Ez(t)\) which is the same as the optimality condition (11b). The rest of the optimality conditions are satisfied from the equations of the dynamical system at steady state and as a result, the equilibrium solution of (18a)–(18c) solves the optimization problem (7). \(\square\)
Theorem 8
The dynamical system (18a)–(18c) is a negative feedback interconnection of a passive and a strictly passive system.
Proof
The dynamical system (18a)–(18c) can be seen as a negative feedback interconnection of a passive and a strictly passive system, as illustrated in Fig. 5. The system of non-linearities is \(\nabla f^{-1}(v(t))-\nabla f^{-1}(v^*)=[\nabla f^{-1}_1(v_1(t))-\nabla f^{-1}_1(v_1^*),\ldots ,\nabla f^{-1}_K(v_K(t))-\nabla f^{-1}_K(v_K^*)]\), which has input \(v(t)-v^*=(v_1(t)-v_1^*,\ldots ,v_K(t)-v_K^*)\) and output \(x(t)-x^*=(x_1(t)-x_1^*,\ldots ,x_K(t)-x_K^*)\). Each \(\nabla f^{-1}_i(v_i(t)), i=1,\ldots ,K\), is strictly incrementally passive and, as a result, the system of non-linearities is strictly passive. The pre/post multiplication of a passive system’s input and output respectively by Q preserves passivity, since matrix \(Q^2=Q\) is positive semidefinite. Regarding the integrator system, it has input \(-(x(t)-x^*)\) and output \(v(t)-v^*\). In order for this system to be passive there must exist a storage function \(S(t):{\mathbb {R}}^p\rightarrow {\mathbb {R}}\) such that \({\dot{S}}(t)\le -[Q(x(t)-x^*)]^T(v(t)-v^*)\). A storage function that satisfies the passivity inequality is \(S(t)= \frac{1}{2}||v(t)-v^*||_2^2\), with Lie derivative
$$\dot{S}(t)=(v(t)-v^*)^T\dot{v}(t)=-(v(t)-v^*)^TQx(t)=-(v(t)-v^*)^TQ(x(t)-x^*),$$
where we have used that \(Qx^*=0\). As a result, the integrator system is passive and we have a negative feedback interconnection of a passive and a strictly passive system. \(\square\)
Theorem 9
The trajectories of the dynamical system (18a)–(18c) converge asymptotically to the equilibrium point identified in Theorem 7, given initial conditions \(v_0\) that satisfy \(E^Tv_0=0\).
Proof
By defining the space \(T={\mathcal {R}}[Q]\) as the range of matrix Q, we have that the space \(T^{\perp }\) is the orthogonal complement of T. We notice that \(v\in T\) since \(v_0\in T\) and \({\dot{v}}\in T\), and so we can write \(v=Qk\) for some k. By premultiplying v by Q we get \(Qv=Q^2k=Qk=v\), since Q is a projection matrix, and as a result we have \(Qv=v\). The negative feedback interconnection of passive systems is a stable system and a candidate Lyapunov function for the interconnected system is the sum of the storage functions of the individual systems. As a result, we choose for the dynamical system (18a)–(18c) the Lyapunov candidate \(V(t)= \frac{1}{2}||v(t)-v^*||_2^2\), where \(V(t):{\mathbb {R}}^p\rightarrow {\mathbb {R}}\) is radially unbounded. The Lie derivative of the candidate Lyapunov function is
$$\begin{aligned} \dot{V}(t)&=(v(t)-v^*)^T\dot{v}(t)=-(v(t)-v^*)^TQ(x(t)-x^*)\\ &=-\left( Q(v(t)-v^*)\right) ^T(x(t)-x^*)\\ &=-(v(t)-v^*)^T\left( \nabla f^{-1}(v(t))-\nabla f^{-1}(v^*)\right) \le 0, \end{aligned}$$
since \(\nabla f^{-1}\) is strictly incrementally passive, \(Qx^*=0\) and \(Qv=v\). In order to prove convergence we make use of LaSalle’s invariance principle. The set \(\Omega =\{v \in T:V(v)\le l\}\) is a compact set for every constant l, due to the fact that V is radially unbounded. From \(\dot{V}(t)=0\) we get \(v(t)=v^*\) and, since \(x=\nabla f^{-1}(v)\), the largest invariant set is the equilibrium point. As a result, the solution converges to the optimal solution of the distributed optimization problem for the given initial conditions. \(\square\)
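An Euler-discretized sketch of the dual decomposition flow (19) is given below for quadratic objectives; the quadratic data, the step size and the zero initialization (which satisfies \(E^Tv_0=0\)) are illustrative assumptions of ours, and the incidence pattern is that of Example 1 under our assumed node ordering.

```python
import numpy as np

# Sketch of the dual decomposition dynamics (19): v_dot = -Q x, x = grad f^{-1}(v).
# Assumed quadratic objective f(x) = 0.5 x^T P x + q^T x, so grad f^{-1}(v) = P^{-1}(v - q).
rng = np.random.default_rng(2)

E = np.zeros((7, 3))
for j, e in enumerate([{0, 3, 5}, {1, 4}, {2, 6}]):        # Example 1 incidence pattern
    E[list(e), j] = 1.0
Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T           # hypergraph Laplacian / projection (20)

P = np.diag(rng.uniform(1.0, 3.0, size=7))                 # diagonal Hessian, strictly convex
q = rng.standard_normal(7)

v = np.zeros(7)                                            # E^T v_0 = 0
dt = 0.05
for _ in range(3000):
    x = np.linalg.solve(P, v - q)                          # x = grad f^{-1}(v)
    v = v - dt * (Q @ x)                                   # Euler step of v_dot = -Q x

x = np.linalg.solve(P, v - q)
print(float(np.linalg.norm(Q @ x)))                        # consensus residual Q x, tends to zero
```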
Primal dual algorithm
For the primal–dual algorithm we will not make use of the vector z of common values but will directly utilize the hypergraph Laplacian matrix in the formulation of the distributed optimization problem in the following way:
$$\min _{x}\ \sum _{i=1}^K f_i(x_i) \quad \text {subject to} \quad Qx=0. \qquad (24)$$
We define the Lagrangian of (24) to be
$$L(x,v)=f(x)-v^TQx,$$
where \(f(x)=\sum \limits _{i=1}^Kf_i(x_i)\). The primal–dual dynamics of problem (24) are given below:
$$\begin{aligned} \dot{x}(t)&=-\left( \nabla f(x(t))-Qv(t)\right) ,&(25a)\\ \dot{v}(t)&=-Qx(t).&(25b) \end{aligned}$$
Theorem 10
Let \((x^{*},v^{*})\) be an equilibrium point of the dynamical system (25a)–(25b). Then \((x^{*},v^{*})\) satisfies the optimality conditions of the optimization problem (24).
Proof
We find the equilibrium point of the dynamical system from \({\dot{x}}(t)=0\Rightarrow \nabla f(x^*)=Qv^*\) and \({\dot{v}}(t)=0\Rightarrow Qx^{*}=0\). We notice that the equilibrium point satisfies the optimality conditions in (11a)–(11c) and as a result, the equilibrium point of the dynamical system (25a)–(25b) solves the optimization problem (24).
\(\square\)
Theorem 11
The dynamical system (25a)–(25b) is a negative feedback interconnection of a passive and a strictly passive system.
Proof
The dynamical system (25a)–(25b) can be seen as a negative feedback interconnection of a passive and a strictly passive system, as illustrated in Fig. 6.
The first system, which is enclosed by the blue parallelogram, refers to the primal dynamics and is a negative feedback interconnection of two systems. The first of these is the system of non-linearities, \(\nabla f(x(t))-\nabla f(x^*)=[\nabla f_1(x_1(t))-\nabla f_1(x_1^*),\ldots ,\nabla f_K(x_K(t))-\nabla f_K(x_K^*)]\). Each \(\nabla f_i(x_i(t)), i=1,\ldots ,K\), is strictly incrementally passive and, as a result, the system of non-linearities is strictly passive. The other system enclosed by the blue parallelogram is an integrator system; in order for it to be passive there must exist a storage function \(S_1(t):{\mathbb {R}}^p\rightarrow {\mathbb {R}}\) such that \({\dot{S}}_1(t)\le u_1^T(t)y_1(t)\), where \(u_1(t)=-(\nabla f(x(t))-\nabla f(x^*))+Q(v(t)-v^*)\) and \(y_1(t)=x(t)-x^*\). A storage function that satisfies the passivity inequality is \(S_1(t)=\frac{1}{2}||x(t)-x^*||^2\), where \(S_1(t):{\mathbb {R}}^{p}\rightarrow {\mathbb {R}}\), with Lie derivative
$$\dot{S}_1(t)=(x(t)-x^*)^T\dot{x}(t)=(x(t)-x^*)^T\left( -(\nabla f(x(t))-\nabla f(x^*))+Q(v(t)-v^*)\right) =u_1^T(t)y_1(t),$$
where we have used that \(\nabla f(x^*)=Qv^*\) and as a result, the primal dynamics integrator system is passive. The overall system enclosed by the blue parallelogram is strictly passive.
The second system, which is enclosed by the red parallelogram, refers to the dual dynamics. The system is an integrator that is premultiplied and postmultiplied by Q. This pre/post multiplication preserves passivity since matrix \(Q^2=Q\) is positive semidefinite. For the integrator system we have \(u_2(t)=-Q(x(t)-x^*)\) and \(y_2(t)=Q(v(t)-v^*)\). In order for this system to be passive there must exist a storage function \(S_2(t):{\mathbb {R}}^p\rightarrow {\mathbb {R}}\) such that \({\dot{S}}_2(t)\le u_2^T(t)y_2(t)=-(x(t)-x^*)^TQ^TQ(v(t)-v^*)=-(x(t)-x^*)^TQ(v(t)-v^*)\), since \(Q^TQ=Q^2=Q\). A storage function that satisfies the passivity inequality is \(S_2(t)=\frac{1}{2}||v(t)-v^*||^2\), where \(S_2(t):{\mathbb {R}}^{p}\rightarrow {\mathbb {R}}\), with Lie derivative
$$\dot{S}_2(t)=(v(t)-v^*)^T\dot{v}(t)=-(v(t)-v^*)^TQx(t)=-(v(t)-v^*)^TQ(x(t)-x^*)=u_2^T(t)y_2(t),$$
where we have used that \(Qx^*=0\). As a result, the second system is passive. In conclusion we have a passive and a strictly passive system interconnected in negative feedback. \(\square\)
Theorem 12
The trajectories of the dynamical system (25a)–(25b) converge asymptotically to the equilibrium point identified in Theorem 10.
Proof
The Lyapunov function of the dynamical system (25a)–(25b) is constructed as the sum of the respective storage functions. We choose as candidate Lyapunov function \(V(t):{\mathbb {R}}^p\rightarrow {\mathbb {R}}\) with \(V(t)= \frac{1}{2}||x(t)-x^{*}||_2^2+\frac{1}{2}||v(t)-v^{*}||_2^2\). The respective Lie derivative is
$$\begin{aligned} \dot{V}(t)&=(x(t)-x^{*})^T\dot{x}(t)+(v(t)-v^{*})^T\dot{v}(t)\\ &=-(x(t)-x^{*})^T\nabla f(x(t))+(x(t)-x^{*})^TQv(t)-(v(t)-v^{*})^TQx(t)\\ &=-(x(t)-x^{*})^T\nabla f(x(t))-x^{*T}Qv(t)+v^{*T}Qx(t)\\ &=-(x(t)-x^{*})^T\nabla f(x(t))+\nabla f(x^{*})^Tx(t)\\ &=-(x(t)-x^{*})^T\left( \nabla f(x(t))-\nabla f(x^{*})\right) <0, \quad \forall \ x(t)\ne x^{*}, \end{aligned}$$
since \(\nabla f(x)\) is strictly incrementally passive. In the third line we have used that \(x^T(t)Qv(t)=v^T(t)Qx(t)\), in the fourth line that \(x^{*T}Qv(t)=(Qx^{*})^Tv(t)=0\) and \(v^{*T}Qx(t)=(Qv^{*})^Tx(t)=\nabla f(x^{*})^Tx(t)\), and in the last line that \(\nabla f(x^{*})^Tx^{*}=v^{*T}Qx^{*}=0\) from the equilibrium properties. From LaSalle’s invariance principle we have convergence to the largest invariant set for which \(\dot{V}=0\), i.e., \(x=x^*\). This set includes only the equilibrium point \((x^*,v^*)\). As a result, the dynamical system (25a)–(25b) converges asymptotically to the solution \((x^*,v^*)\). \(\square\)
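The primal–dual dynamics (25a)–(25b) can likewise be simulated with a forward-Euler step, as in the following sketch; the quadratic objective, the step size and the initialization are illustrative assumptions of ours.

```python
import numpy as np

# Sketch of the primal-dual dynamics (25a)-(25b), forward-Euler discretized:
#   x_dot = -(grad f(x) - Q v),   v_dot = -Q x.
# Assumed quadratic objective f(x) = 0.5 x^T P x + q^T x (illustrative).
rng = np.random.default_rng(3)

E = np.zeros((7, 3))
for j, e in enumerate([{0, 3, 5}, {1, 4}, {2, 6}]):    # Example 1 incidence pattern
    E[list(e), j] = 1.0
Q = np.eye(7) - E @ np.linalg.inv(E.T @ E) @ E.T

P = np.diag(rng.uniform(1.0, 3.0, size=7))
q = rng.standard_normal(7)

x = rng.standard_normal(7)
v = np.zeros(7)
dt = 0.02
for _ in range(20000):
    grad = P @ x + q
    x, v = x - dt * (grad - Q @ v), v - dt * (Q @ x)   # explicit Euler step

print(float(np.linalg.norm(Q @ x)))                    # primal feasibility residual Q x
print(float(np.linalg.norm(P @ x + q - Q @ v)))        # stationarity residual grad f(x) - Q v
```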
Remark 3
From the control theoretic interpretation of the decomposition cases it is possible to design controllers that improve the time response of the system. Since the dynamical systems satisfy the passivity property, an appropriate choice would be passivity-based controllers (Chopra 2006; Ortega et al. 1997), such as PID controllers. A drawback of this type of controller is that there is no standard rule for selecting its parameters (Somefun et al. 2021), so the tuning can be viewed as a rather ad hoc procedure. An alternative way of achieving acceleration is the use of proximal algorithms, as presented in Parikh and Boyd (2014).
Remark 4
In Papastaikoudis and Lestas (2023) we have also presented a Laplacian matrix formula for a directed weighted hypergraph and we have shown how this matrix is decomposed for the stability analysis setting that we studied throughout this paper.
Example 3
Continuing the previous example, we set the objective functions of the coalitions to be
where
where
where
with the constants \(c_1, c_2, c_3\) chosen randomly. The objective function can be written more compactly as
where
The dual decomposition algorithm (19) with quadratic objective function takes the following form
where \(L=\{L^H, L^C\}\), with \(L^H\) corresponding to the hypergraph Laplacian matrix and \(L^C\) to the clique expansion graph Laplacian matrix. The convergence of the primal variables, the dual variables and the objective function is presented in Figs. 7, 8 and 9 respectively; the hypergraph provides a better convergence rate in all cases.
For the primal dual algorithm the dynamical system (25a)–(25b) takes the following form for quadratic objective functions
where \(L=\{L^H, L^C\}\), with \(L^H\) corresponding to the hypergraph Laplacian matrix and \(L^C\) to the clique expansion graph Laplacian matrix. The convergence of the primal variables, the dual variables and the objective function is presented in Figs. 10, 11 and 12 respectively; the hypergraph provides a better convergence rate in all cases.
Remark 5
It is important to note that for continuous-time optimization algorithms the convergence rate can be made arbitrarily fast by choosing the gains in the passive components to be arbitrarily large (as passivity is preserved for arbitrary positive gains). For this reason we also present an example for the discretized version of the dual decomposition algorithm with a slightly modified information structure.
Example 4
In this example we have a similar coalition structure, i.e., a union of hyperedges connected by standard edges, where a node can be attached to at most two hyperedges rather than exactly one as previously. In Figs. 13 and 14 we present the hypergraph communication structure and the respective clique expansion graph communication structure for the case of two coalitions, with the respective objective functions being
where
where
The objective function can be written more compactly as
where
The discretization of the dual decomposition algorithm (19) with quadratic objective function takes the following form
where \(L=\{L^H, L^C\}\), with \(L^H\) corresponding to the hypergraph Laplacian matrix and \(L^C\) to the clique expansion graph Laplacian matrix. By \(\rho\) we denote the stepsize of the algorithm, chosen as \(\rho =\frac{1}{\lambda _{\min }+\lambda _{\max }}\), where \(\lambda _{\min }\) and \(\lambda _{\max }\) are the smallest nonzero and the largest eigenvalue of matrix L respectively. From Fig. 13 we have for the hypergraph,
and consequently the hypergraph Laplacian matrix is,
From Fig. 14 we have for the clique expansion graph that
As a result the clique expansion graph Laplacian matrix is
The convergence of the primal variables, the dual variables and the objective function is presented in Figs. 15, 16 and 17 respectively; the hypergraph provides a better convergence rate in all cases.
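A sketch of the discretized dual update and of the step-size rule above is given below. Since the matrices of Figs. 13 and 14 are only available in the figures, the small two-hyperedge hypergraph and the quadratic objective are illustrative choices of ours, and we assume the discrete update to be the explicit-Euler version of (19), \(v_{k+1}=v_k-\rho L\nabla f^{-1}(v_k)\).

```python
import numpy as np

# Sketch of a discretized dual decomposition step (assumed form:
# v_{k+1} = v_k - rho * L * grad f^{-1}(v_k)) with the step-size rule
# rho = 1 / (lambda_min + lambda_max) of Example 4.
rng = np.random.default_rng(5)

hyperedges = [{0, 1, 2}, {2, 3, 4}]                 # node 2 belongs to two hyperedges (illustrative)
n = 5
E = np.zeros((n, len(hyperedges)))
for j, e in enumerate(hyperedges):
    E[list(e), j] = 1.0
D_V = np.diag(E.sum(axis=1))
D_E = np.diag(E.sum(axis=0))
L = D_V - E @ np.linalg.inv(D_E) @ E.T              # hypergraph Laplacian (6)

eig = np.linalg.eigvalsh(L)
lam_min = min(e for e in eig if e > 1e-9)           # smallest nonzero eigenvalue
lam_max = eig.max()
rho = 1.0 / (lam_min + lam_max)                     # step-size rule

P = np.diag(rng.uniform(1.0, 3.0, size=n))          # assumed quadratic f(x) = 0.5 x^T P x + q^T x
q = rng.standard_normal(n)

v = np.zeros(n)
for _ in range(500):
    x = np.linalg.solve(P, v - q)                   # x_k = grad f^{-1}(v_k)
    v = v - rho * (L @ x)                           # discretized dual update

print(rho, float(np.linalg.norm(L @ np.linalg.solve(P, v - q))))
```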
Remark 6
We would like to point out that the hypergraph Laplacian matrix (6) has a greater computational cost than the respective clique expansion graph Laplacian matrix (5). However, the tradeoff between this computational cost and the convergence rate depends on the nature of the problem: the improved convergence rate may be important enough to outweigh the cost of computing the Laplacian, or vice versa. This tradeoff has to be assessed based on the nature of the problem and the desired goals.
Conclusion
We have studied, from a dynamical system viewpoint, various distributed algorithms that solve a network optimization problem whose communication structure is an undirected, unweighted hypergraph. We proved the stability of these dynamical systems with the use of nonlinear control theory and an appropriate decomposition of the respective incidence and Laplacian matrices of the hypergraph. We also highlighted numerically the superiority of the hypergraph over its respective clique expansion graph in terms of information transmission efficiency and convergence rate for the given algorithms.
Availability of data and materials
No datasets were generated or analysed during the current study.
References
Agarwal S, Branson K, Belongie S (2006) Higher order learning with graphs. In: Proceedings of the 23rd international conference on machine learning
Berge C (1973) Graphs and hypergraphs. North-Holland Publishing Company
Bertsekas D (1991) Linear network optimization: algorithms and codes. MIT Press
Bertsekas D (2009) Convex optimization theory, vol 1. Athena Scientific
Bolla M (1993) Spectra, Euclidean representations and clusterings of hypergraphs. Discrete Math 117:19
Boyd S et al (2007) Notes on decomposition methods. Notes for EE364B. Stanford University 635:1–36
Cerquides J et al (2014) A tutorial on optimization for multi-agent systems. Comput J 57(6):799–824
Chopra N, Spong MW (2006) Passivity-based control of multi-agent systems. In: Advances in robot control: from everyday physics to human-like movements, pp 107–134
Dai Q, Gao Y (2023) Hypergraph computation. Springer
Kia SS, Van Scoy B, Cortes J, Freeman RA, Lynch KM, Martinez S (2019) Tutorial on dynamic average consensus: the problem, its applications, and the algorithms. IEEE Control Syst Mag 39(3):40–72
Kosaraju KC et al (2017) Stability analysis of constrained optimization dynamics via passivity techniques. IEEE Control Syst Lett 2(1):91–96
Kvaternik K, Pavel L (2011) Lyapunov analysis of a distributed optimization scheme. In: International conference on NETwork games, control and optimization (NetGCooP 2011). IEEE
Nijmeijer H, Van der Schaft A (1990) Nonlinear dynamical control systems, vol 464, No. 2. Springer
Ortega R, Jiang ZP, Hill DJ (1997) Passivity-based control of nonlinear systems: a tutorial. In: Proceedings of the 1997 American control conference (Cat. No. 97CH36041), vol 5. IEEE
Palomar DP, Chiang M (2006) A tutorial on decomposition methods for network utility maximization. IEEE J Sel Areas Commun 24(8):1439–1451
Papastaikoudis I, Lestas I (2023) Decentralized control methods in hypergraph distributed optimization. In: International conference on complex networks and their applications. Springer
Parikh N, Boyd S (2014) Proximal algorithms. Found Trends® Optim 1(3):127–239
Samar S, Boyd S, Gorinevsky D (2007) Distributed estimation via dual decomposition. In: 2007 European control conference (ECC). IEEE
Shakkottai S, Srikant R (2008) Network optimization and control. Found Trends® Netw 2(3):271–379
Somefun OA, Akingbade K, Dahunsi F (2021) The dilemma of PID tuning. Ann Rev Control 52:65–74
Tsitsiklis J (1984) Problems in decentralized decision making and computation. Doctoral dissertation. Massachusetts Institute of Technology, Laboratory for Information and Decision Systems
Yang T et al (2019) A survey of distributed optimization. Ann Rev Control 47:278–305
Zhang F (2011) Matrix theory: basic results and techniques. Springer
Acknowledgements
Not applicable.
Funding
Not applicable.
Author information
Authors and Affiliations
Contributions
Ioannis Papastaikoudis conceptualized, designed and wrote the main manuscript. Dr. Jeremy Watson provided the simulations and made important comments. Professor Ioannis Lestas supervised the writing process of the paper and made important comments.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Yes.
Consent for publication
Yes.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Papastaikoudis, I., Watson, J. & Lestas, I. A decentralized control approach in hypergraph distributed optimization decomposition cases. Appl Netw Sci 9, 58 (2024). https://doi.org/10.1007/s41109-024-00662-y