In this section we describe our proposed class of Friend-Foe Dynamic Linear Threshold (F2DLT) models, which is comprised of: the Non-Competitive F2DLT (nC-F2DLT), the Semi-Progressive Competitive F2DLT (spC-F2DLT), and the Non-Progressive Competitive F2DLT (npC-F2DLT). We first provide an overview of the framework based on F2DLT. Next, we introduce key features common to all models, then we elaborate on each of them.
Overview
Figure 1 illustrates the conceptual architecture of a framework for information diffusion and influence propagation based on our proposed models. Given a population of OSN users, the framework requires three main inputs: (i) a trust network, which is inferred from the social network of those users to model their trust/distrust relationships; (ii) user behavioral characteristics that are intrinsic to each user (i.e., exogenous to an information diffusion scenario) and oriented to express two aspects: activation-threshold, i.e., the effort needed to activate a user through cumulative influence from her/his neighbors, and quiescence, i.e., the user’s hesitation in being actively committed with the propagation process; and, (iii) one or multiple competing campaigns, i.e., information cascades generated from the agent(s) having viral marketing purposes. Moreover, the information diffusion process has a time horizon, and its temporal unfolding is reflected in the evolution of the information diffusion graph: this also depends on the dynamics of the users’ behaviors in response to the influence chains started by the campaign(s), which admit that users may switch from the adoption of a campaign’s item to that of another one. Putting it all together, our F2DLT based framework embeds all previously discussed aspects that are required to explain complex propagation phenomena, i.e., competitive diffusion, non-progressivity, time-aware activation, delayed propagation, and trust/distrust relations.
Please note that inferring a trust network from a social network is not an objective of this work; rather, we assume that trust relationships between users of an OSN are available and, as we shall describe next in this section, they are exploited as key information to develop our proposed models. Several heuristics have been proposed to infer a trust network from social relations and interactions among users in an OSN. A common approach is to infer trust relationships based on the social influence exerted by users over the network and propagation of trust ratings (Golbeck and Hendler 2006; Overgoor et al. 2012; Hamdi et al. 2013; Jiang et al. 2014). Other studies utilize users’ activities in social media (Gilbert and Karahalios 2009), or users’ attributes and interactions (Liu et al. 2008), or combinations of aspects concerning user affinity, familiarity and reputation (Yin et al. 2012), social influence, social cohesion and the affective valence expressed by the users in the textual contents they produce (Vedula et al. 2017). The interested reader may refer to (Tang and Liu 2015; Sherchan et al. 2013) for an exhaustive overview on the topic.
Basic definitions
We are given a trust network represented by a directed graph G=〈V,E,w〉, with set of nodes V, set of edges E, and weighting function w:E↦[−1,1] such that, for every edge (u,v)∈E,wuv:=w(u,v) expresses how much v trusts its in-neighbor u. Positive, resp. negative, value of wuv corresponds to a trust, resp. distrust, relation.
For every v∈V, we denote with \(N^{in}_{+}(v)\) and \(N^{in}_{-}(v)\) the set of neighbors trusted by v (i.e., friends of v) and the set of neighbors distrusted by v (i.e., foes of v), respectively. Moreover, as required in linear threshold models, the constraints \(\sum \nolimits _{u \in N^{in}_{+}(v)} w_{uv} \leq 1\) and \(\sum \nolimits _{u \in N^{in}_{-}(v)} |w_{uv}| \leq 1\) must be fulfilled.
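As a rough illustration, the two weight constraints above can be checked per node as follows. This is a minimal sketch: the function name `violating_nodes` and the dictionary-based edge representation are assumptions made here for illustration, not part of the model definition.

```python
from collections import defaultdict

def violating_nodes(weights, tol=1e-9):
    """Return the nodes whose incoming trust or distrust weights sum to more than 1.

    weights: dict mapping an edge (u, v) to w_uv in [-1, 1]
             (hypothetical representation of the trust network).
    """
    pos = defaultdict(float)  # sum of w_uv over friends of v
    neg = defaultdict(float)  # sum of |w_uv| over foes of v
    for (u, v), w in weights.items():
        if not -1.0 <= w <= 1.0:
            raise ValueError(f"weight of edge ({u}, {v}) is outside [-1, 1]")
        if w > 0:
            pos[v] += w
        elif w < 0:
            neg[v] += -w
    targets = set(pos) | set(neg)
    return sorted(v for v in targets if pos[v] > 1 + tol or neg[v] > 1 + tol)

ok = {("u", "v"): 0.3, ("z", "v"): 0.5, ("y", "v"): -0.4}
bad = {("u", "v"): 0.7, ("z", "v"): 0.5}
```

Here `violating_nodes(ok)` is empty, while `violating_nodes(bad)` reports `v`, whose trusted in-weights sum to 1.2.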
Let \({\mathcal {G}} = G(g, q, T) = \langle V, E, w, g, q, T \rangle \) be a directed weighted graph representing the LT-based information diffusion graph associated with trust network G, where T denotes a time interval for the diffusion process, g and q denote time-dependent activation-threshold and quiescence functions. These are introduced in \({\mathcal {G}}\) to model the aspects of time-aware activation and delayed propagation, respectively. We use symbol St to denote the set of active nodes at time t, and symbol \({\widetilde {S}}_{t}\) to denote the set of active nodes for which, at t, the quiescence time is not expired yet, i.e., the quiescent nodes.
Activation-threshold function. According to the LT model, every node v∈V is associated with an exogenous activation-threshold, θv∈(0,1], which corresponds to the a-priori effort needed in terms of cumulative influence to activate the node. We enhance this concept by defining an activation-threshold function, \(g: V, T \mapsto \mathbb {R}^{+}\), such that for every v∈V and t∈T:
$$g(v,t) = \theta_{v} + \vartheta(\theta_{v},t), $$
i.e., the activation of v at time t depends both on the user’s pre-assigned threshold, θv, and on a time-evolving activation term, 𝜗(·,·), which models the dynamic response of a user towards the activation attempt exerted by her/his neighbors.
To specify 𝜗(·,·), we devise two main scenarios for g(·,·):
- A biased scenario, modeled as a non-decreasing monotone function, to capture the tendency of a user to consolidate her/his belief, according to the confirmation-bias principle (Anagnostopoulos et al. 2015).
- An unbiased scenario, modeled as a non-monotone function, whereby we assume that a user could revise her/his uncertainty to activate over time, thus becoming more or less inclined to change her/his opinion on an information item. This is particularly meaningful in applications such as customer retention, or churn prediction (i.e., a decrease in the activation-threshold would correspond to the tendency of a user to churn in favor of another service).
Both variants 𝜗(·,·) range within the interval [0,1], for any v∈V.
Let us first consider the biased scenario, which is focused on the confirmation bias principle. We choose the following form for the activation-threshold function, by which the value increases by increasing the time a node keeps staying in the same active state:
$$ g(v,t) = \theta_{v} + \vartheta(\theta_{v},t) = \theta_{v} + \delta \times \min\left \{\frac{1-\theta_{v}}{\delta}, t-t_{v}^{last}\right \}, $$
(1)
where \(t_{v}^{last}\) denotes the last (i.e., most recent) time v was activated and δ≥0 (see Footnote 1) represents the increment in the value of g(v,t) for consecutive time steps. Thus, the longer a node has kept its active state for the same information cascade (campaign), the higher its activation value and, as a consequence, the harder it will be to make the node change its state; this may even become impossible (i.e., g(v,t) saturates to 1, as the difference \(\left (t - t_{v}^{last}\right)\) exceeds (1−θv)/δ).
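Equation 1 can be sketched in a few lines. The function name `g_biased` and the handling of δ=0 (treated here as "no increment", to avoid a division by zero) are assumptions of this sketch, not statements from the model definition.

```python
def g_biased(theta_v, t, t_last, delta):
    """Biased activation threshold of Eq. 1.

    theta_v: exogenous threshold in (0, 1]
    t, t_last: current time and last activation time of the node
    delta: per-step increment (delta >= 0)
    """
    if delta == 0:
        # Assumption of this sketch: with no increment the threshold stays at theta_v.
        return theta_v
    elapsed = t - t_last
    # The min(...) caps the growth so that the threshold never exceeds 1.
    return theta_v + delta * min((1.0 - theta_v) / delta, elapsed)
```

For instance, with θv=0.6 and δ=0.1, the threshold grows by 0.1 per step and saturates at 1 after four steps.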
In the unbiased scenario, we define the activation-threshold function such that, for each v, the value of the function is maximum (i.e., 1) just after the activation, i.e., at time \(t=t_{v}^{last}+1\), then for subsequent time steps, the function exponentially decreases towards θv:
$$ g(v,t) = \theta_{v} + \vartheta(\theta_{v},t) = \theta_{v} + \exp\left(-\delta \left(t-t_{v}^{last}-1\right)\right) - \theta_{v} \mathbb{I}\left[t-t_{v}^{last}=1\right], $$
(2)
where \(\mathbb {I}[\cdot ]\) denotes the indicator function, i.e., it equals 1 if \(t-t_{v}^{last}=1\), 0 otherwise. Note that δ is used differently w.r.t. the previous scenario, as it acts as a coefficient that controls the decrease of the activation-threshold function over time.
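A minimal sketch of Eq. 2 follows; the function name `g_unbiased` is an assumption, and the code implements the formula exactly as written, without any clipping.

```python
import math

def g_unbiased(theta_v, t, t_last, delta):
    """Unbiased activation threshold of Eq. 2, defined for t > t_last."""
    elapsed = t - t_last
    if elapsed < 1:
        raise ValueError("Eq. 2 is evaluated only after the activation time")
    indicator = 1.0 if elapsed == 1 else 0.0
    # At elapsed == 1 the exponential equals 1 and the indicator removes theta_v,
    # so the value is 1; afterwards the exponential term decays towards 0.
    return theta_v + math.exp(-delta * (elapsed - 1)) - theta_v * indicator
```

One consequence of taking the formula literally: at the second step the value is θv+exp(−δ), which stays at or below 1 only when exp(−δ)≤1−θv. For example, θv=0.6 and δ=0.1 give about 1.50, so small values of δ may call for an explicit cap, which this sketch does not apply.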
Quiescence function. Each node in \({\mathcal {G}}\) is also associated with a quiescence value, which quantifies the latency in propagation through that node. We define a quiescence function, q:V,T↦T, non-decreasing and monotone, such that for every v∈V,t∈T, with v activated at time t:
$$q(v,t) = \tau_{v} + \psi\left(N^{in}_{-}(v),t\right), $$
where τv∈T represents an exogenous term modeling the user’s hesitation in being fully committed to the propagation process, and \(\psi (N^{in}_{-}(v),t)\) provides an additional delay that grows with the number of v’s neighbors that are distrusted and active by the time the activation attempt is performed by v’s trusted neighbors:
$$ q(v,t) = \tau_{v} + \psi\left(N^{in}_{-}(v),t\right) = \tau_{v} + \exp\left({\lambda \times {\underset{u \in S_{t-1}}{\sum}} \vert w_{uv} \vert }\right), $$
(3)
where λ≥0 is a coefficient modeling the average user sensitivity to the perceived negative influence. Intuitively, this coefficient would give more weight to the negative influence as the diffusing information item is more “worthy of suspicion”. Note also that, in Eq. 3, wuv is a negative value, since u is a distrusted neighbor of v, i.e., \(u \in N^{in}_{-}(v)\).
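The quiescence function of Eq. 3 can be sketched as below. The function name `quiescence` and the input layout are assumptions; following the remark above that u is a distrusted neighbor of v, the sum is restricted to the foes of v that are active at time t−1.

```python
import math

def quiescence(tau_v, foe_weights, active_prev, lam):
    """Quiescence value of Eq. 3 for a node v activated at time t.

    tau_v: exogenous delay of v
    foe_weights: dict u -> w_uv (negative values) for the foes of v
    active_prev: set of nodes active at time t-1
    lam: sensitivity coefficient (lam >= 0)
    """
    negative_influence = sum(abs(w) for u, w in foe_weights.items()
                             if u in active_prev)
    return tau_v + math.exp(lam * negative_influence)
```

Note that the exponential equals 1 when no foe is active (or when λ=0), so the formula as written always adds at least one time unit to τv.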
Rationale for activation and propagation. Our choice of using, on the one hand, friends for the activation of a user and, on the other hand, foes to impact on delayed propagation, represents a key distinction from related work (Litou et al. 2016; Talluri et al. 2015; Weng et al. 2016). Therefore, in our models, the trusted connections and distrusted connections play different roles: only friends can exert a degree of (positive) influence, whereas foes can only contribute to increasing the user’s hesitation to commit to the propagation process. It should be noted that both activation and delayed propagation terms also include exogenous factors. We indeed take into consideration the existence of both environmental and personal factors of influence on an individual’s behavior. Several studies in information diffusion and influence maximization have reported evidence that, apart from influence coming from social contacts, an individual may be affected by some external event(s) and/or personal reasons to adopt a piece of information (Goyal et al. 2010) as well as to delay its adoption (Iniguez et al. 2018). In our setting, we tend to reject as generally true the principle “I agree with my friends’ idea and disagree with my foes’ idea” (which is also close to the adage “the enemy of my enemy is my friend”), since this would imply that the behavior of a user should be completely determined by the stimuli coming from her/his neighbors. Rather, according to most conceptual models developed in social science and human-computer interaction fields (see, e.g., (Tedjamulia et al. 2005; Bishop 2007)), we believe that the individual’s influenceability has a component based on personal characteristics.
Non-competitive model
We introduce the first of the three proposed models, which refers to a single-item propagation scenario. Figure 2 shows the life-cycle of a node in the diffusion graph under this model.
Definition 1
Non-Competitive Friend-Foe Dynamic Linear Threshold Model (nC-F2DLT). Let \({\mathcal {G}} = \langle V, E, w, g, q, T \rangle \) be the diffusion graph of the Non-Competitive Friend-Foe Dynamic Linear Threshold Model (nC-F2DLT). The diffusion process under the nC-F2DLT model unfolds in discrete time steps. At time t=0, an initial set of nodes S0 is activated. At time t≥1, the following rule applies: for any inactive node \(v \in V \setminus \left (S_{t-1} \cup \widetilde {S}_{t-1}\right)\), if \(\sum \nolimits _{u \in N^{in}_{+}(v) \cap S_{t-1}} w_{uv} \geq g(v,t)\), then v will be added to the set of quiescent nodes \(\widetilde {S}_{t}\), with quiescence time equal to t∗=q(v,t). Once the quiescence time is expired, v will be removed from \(\widetilde {S}_{t}\) and added to the set of active nodes \(S_{t^{*}}\). The process continues until T is expired or no more activation attempts can be performed. □
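The diffusion rule of Definition 1 can be sketched as a small simulation. Several choices below are assumptions of this sketch rather than parts of the definition: the function name `simulate_nc_f2dlt`, the use of g(v,t)=θv for nodes that were never activated, the rounding of q(v,t) up to an integer number of steps, and the reading of the quiescence time as a delay counted from the step in which the activation attempt succeeds.

```python
import math

def simulate_nc_f2dlt(weights, theta, tau, seeds, horizon, lam=0.0):
    """Return a dict node -> time step at which the node became active.

    weights: dict (u, v) -> w_uv in [-1, 1]; theta, tau: dicts over all nodes.
    """
    in_edges = {}
    for (u, v), w in weights.items():
        in_edges.setdefault(v, []).append((u, w))

    active = set(seeds)                   # S_t
    quiescent = {}                        # node -> step at which quiescence expires
    activated_at = {s: 0 for s in seeds}

    for t in range(1, horizon + 1):
        prev_active = set(active)         # S_{t-1}
        # Nodes whose quiescence time has expired become active.
        for v in [v for v, t_star in quiescent.items() if t_star <= t]:
            del quiescent[v]
            active.add(v)
            activated_at[v] = t
        # Activation attempts on inactive, non-quiescent nodes (friends only).
        for v in theta:
            if v in active or v in quiescent:
                continue
            friends = sum(w for u, w in in_edges.get(v, [])
                          if w > 0 and u in prev_active)
            if friends >= theta[v]:
                # Active foes only lengthen the delay (Eq. 3).
                foes = sum(-w for u, w in in_edges.get(v, [])
                           if w < 0 and u in prev_active)
                delay = tau[v] + math.exp(lam * foes)
                quiescent[v] = t + math.ceil(delay)
    return activated_at

weights = {("u", "v"): 0.7, ("v", "x"): 0.6, ("y", "v"): -0.5}
theta = {"u": 1.0, "v": 0.5, "x": 0.5, "y": 1.0}
tau = {"u": 1, "v": 1, "x": 1, "y": 1}
result = simulate_nc_f2dlt(weights, theta, tau, seeds={"u"}, horizon=6)
```

In this toy run, v is triggered at step 1 and x at step 4; each waits two steps in the quiescent state before becoming active, while the foe y never activates and therefore adds no extra delay.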
Competitive models
Here we introduce the two competitive F2DLT models. Let us first provide our motivation for developing two different competitive models: through the following example, we illustrate a particular situation that may occur when dealing with two campaigns competitively propagating through a network. Please note that, throughout the rest of this paper, we will consider only two competing campaigns for the sake of simplicity; nevertheless, our proposed models are generalizable to more than two competing campaigns.
Example 1
Figure 3 shows an example activation sequence in a competitive scenario between two information cascades, distinguished by colors red and green. At time t=0, nodes u and z are green-active, and their joint influence causes green-activation of node v as well (since 0.3+0.5≥0.6). At time t=1, as fully influenced by node x, node z has switched its activation in favor of the red campaign. After this switch, at time t=2, it happens that v’s activation state is no longer consistent with the (joint or individual) influence exerted by u and z. In particular, two mutually exclusive events might in principle happen at t=2: either v is deactivated or v maintains its green-activation state. ■
The uncertainty situation depicted in the above example prompted us to the definition of two models, namely semi-progressive and non-progressive F2DLT: the former corresponds to the case of v keeping its current (i.e., green) activation state, whereas the latter corresponds to v returning to the inactive state. Clearly, the two models’ semantics are different from each other: the semi-progressive model assumes that a user, once activated, cannot step aside, unlike the non-progressive one, which instead requires a user to have always the support of her/his in-neighbors to keep activation.
Given two information cascades, or campaigns C′,C′′, for every time step t∈T we will use symbols \({S^{\prime }_{t}}\) and \({S^{\prime \prime }_{t}}\) to denote the sets of active nodes, such that \({S^{\prime }_{t}} \cap {S^{\prime \prime }_{t}} = \emptyset \), and analogously symbols \({\widetilde {S}^{\prime }_{t}}\) and \({\widetilde {S}^{\prime \prime }_{t}}\) as the sets of quiescent nodes, for C′ and C′′, respectively. Also, \(S_{t} = {S^{\prime }_{t}} \cup {S^{\prime \prime }_{t}}\) and \({\widetilde {S}_{t}} = {\widetilde {S^{\prime }}_{t}} \cup {\widetilde {S^{\prime \prime }}_{t}}\).
It should also be noted that, while sharing the time interval (T) of diffusion, C′ and C′′ are not constrained to start at the same time t0. Nevertheless, for the sake of simplicity, we hereinafter assume that \(t_{0}=t_{0}^{\prime }=t_{0}^{\prime \prime }\) (with t0∈T), unless otherwise specified (cf. “Results” section).
Definition 2
Semi-Progressive Competitive Friend-Foe Dynamic Linear Threshold Model (spC-F2DLT). Let \({\mathcal {G}} = \langle V, E, w, g, q, T \rangle \) be the diffusion graph of the Semi-Progressive Competitive Friend-Foe Dynamic Linear Threshold Model (spC-F2DLT), and C′,C′′ be two campaigns on \({\mathcal {G}}\). The diffusion process under the spC-F2DLT model unfolds in discrete time steps. At time t=0, two initial sets of nodes, \({S^{\prime }_{0}}\) and \({S^{\prime \prime }_{0}}\), are activated, one for each campaign. At every time step t≥1, the following state-transition rules apply:
R1. For any inactive node \(v \in V \setminus \left (S_{t-1} \cup {\widetilde {S}_{t-1}}\right)\), if \(\sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime }_{t-1}}} w_{uv} \geq g(v,t)\), then v will be added to \({\widetilde {S^{\prime }}_{t}}\); analogously, if \(\sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime \prime }_{t-1}}} w_{uv} \geq g(v,t)\), then v will be added to \({\widetilde {S^{\prime \prime }}_{t}}\). If both conditions hold, i.e., v can be simultaneously activated by both campaigns, a tie-breaking rule will apply, in order to decide which campaign actually determines the node’s transition in the quiescent state.
R2. When a node v enters the quiescent state corresponding to C′ (resp. C′′) for the first time, it will stay in the quiescent node-set \({\widetilde {S^{\prime }}_{t}}\) (resp. \({\widetilde {S^{\prime \prime }}_{t}}\)) until the quiescence time is expired. After that, v will be moved to \({S^{\prime }_{t}}\) (resp. \({S^{\prime \prime }_{t}}\)), i.e., it will become active for C′ (resp. C′′).
R3. Given a node v active for C′′, i.e., \(v \in {S^{\prime \prime }_{t-1}}\), if \(\sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime }_{t-1}}} w_{uv} \geq g(v,t)\) and \(\sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime }_{t-1}}} w_{uv} > \sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime \prime }_{t-1}}} w_{uv}\), then v will be removed from \({S^{\prime \prime }_{t}}\) and added to \({S^{\prime }_{t}}\); an analogous rule holds for any node active for the first campaign.
Every node for which none of the above state-transition rules is triggered at time t will keep its current state at time t+1. □
The life-cycle of a node in spC-F2DLT is shown in Fig. 4. Note that, once a node becomes active, it cannot turn back to the inactive state, but it can only change the activation campaign. Moreover, switch transitions occur instantly.
Definition 3
Non-Progressive Competitive Friend-Foe Dynamic Linear Threshold Model (npC-F2DLT). Let \({\mathcal {G}} = \langle V, E, w, g, q, T \rangle \) be the diffusion graph of the Non-Progressive Competitive Friend-Foe Dynamic Linear Threshold Model (npC-F2DLT), and C′,C′′ be two campaigns on \({\mathcal {G}}\). The diffusion process in npC-F2DLT evolves according to the same rules as in spC-F2DLT plus the following rule concerning the deactivation process of an active node:
R4. For any active node v at time t−1, if \(\sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime }_{t-1}}} w_{uv} < \theta _{v}\) and \(\sum \nolimits _{N^{in}_{+}(v) \cap {S^{\prime \prime }_{t-1}}} w_{uv} < \theta _{v}\), then v will turn back to the inactive state at time t.
Every node for which none of the state-transition rules is triggered at time t (including the ones defined for spC-F2DLT) will keep its current state at time t+1. □
It should be noted that a node’s deactivation rule depends on θv only (rather than on the whole function g(v,t)); otherwise, every node activated at a given time could deactivate itself in the next time step, due to the increase in its activation threshold. This would eventually lead to a configuration in which all nodes in the network, except the initially activated ones, are in the inactive state. The life-cycle of a node in the npC-F2DLT is illustrated in Fig. 4. Note that, unlike in spC-F2DLT, transitions to inactive state are allowed.
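To make the difference between the two competitive models concrete, the sketch below replays the situation of Example 1 with one synchronous transition function covering rules R1, R3 and, optionally, R4. It is an illustration under several assumptions: no quiescence and constant thresholds (so R2 is immediate), weights wuv=0.3, wzv=0.5 and wxz=1.0 (the example does not say which edge carries which weight), hypothetical thresholds for the seeds, a tie-breaking rule that picks the campaign with the larger influence, and seeds exempted from R4.

```python
CAMPAIGNS = ("green", "red")

def step(state, weights, theta, non_progressive=False, seeds=frozenset()):
    """One synchronous step; state maps node -> 'green', 'red' or None (inactive)."""
    def influence(v, campaign):
        # Only trusted in-neighbors (positive weights) that are active for `campaign`.
        return sum(w for (u, x), w in weights.items()
                   if x == v and w > 0 and state.get(u) == campaign)

    nxt = dict(state)
    for v, cur in state.items():
        inf = {c: influence(v, c) for c in CAMPAIGNS}
        if cur is None:
            # R1: activation of an inactive node (assumed tie-break: larger influence).
            able = [c for c in CAMPAIGNS if inf[c] >= theta[v]]
            if able:
                nxt[v] = max(able, key=lambda c: inf[c])
        else:
            other = "red" if cur == "green" else "green"
            if inf[other] >= theta[v] and inf[other] > inf[cur]:
                nxt[v] = other                       # R3: campaign switch
            elif (non_progressive and v not in seeds
                  and all(inf[c] < theta[v] for c in CAMPAIGNS)):
                nxt[v] = None                        # R4: deactivation (npC only)
    return nxt

weights = {("u", "v"): 0.3, ("z", "v"): 0.5, ("x", "z"): 1.0}
theta = {"u": 1.0, "z": 0.8, "v": 0.6, "x": 1.0}
start = {"u": "green", "z": "green", "v": None, "x": "red"}
seeds = frozenset({"u", "z", "x"})

s1 = step(start, weights, theta)                     # v turns green, z switches to red
sp2 = step(s1, weights, theta)                       # semi-progressive: v stays green
np2 = step(s1, weights, theta, non_progressive=True, seeds=seeds)  # v deactivates
```

After z has switched, v receives 0.3 from the green campaign and 0.5 from the red one, both below θv=0.6: no switch is possible, so v keeps its state under the semi-progressive rules and returns to the inactive state when R4 is enabled.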
Theoretical properties of the models
In this section we provide insights into the proposed models. Our main goal is to understand how the features introduced in each of our LT-based models impact on the models’ spread behavior, particularly on monotonicity and submodularity properties. We organize our analysis into two parts: the first corresponding to non-competitive diffusion, and the second to competitive diffusion.
Non-competitive diffusion
We show that nC-F2DLT can be reduced to LT with quiescence time, hereinafter denoted as LTqt. By proving the equivalence between the two models, we hence claim that both the monotonicity and submodularity properties hold for nC-F2DLT. Note that since we deal with a progressive model, we assume without loss of generality that, for every node v, the activation-threshold function has a constant value for the whole duration of the diffusion process, i.e., g(v,t)=θv.
Definition 4
Reduction of nC-F2DLT to LTqt. Given \({\mathcal {G}} = \langle V, E, w, g, q, T\rangle \) for nC-F2DLT, a diffusion graph \({\mathcal {G}}_{LT}=\langle V_{LT},E_{LT} \rangle \) can be derived, under LTqt, such that VLT=V and ELT={(u,v)|(u,v)∈E,wuv>0}. Every node v∈VLT is assigned a quiescence time equal to the maximum value of the quiescence function qv(·), i.e., \(\tau _{v}^{max} = \tau _{v} + \psi (N_{-}^{in}(v))\). □
Definition 4 exploits the fact that the distrust connections are not involved in the activation process, but only in the calculation of the quiescence time. Therefore, we can assume this time to be the maximum possible value, and hence we can study the propagation under LTqt. The reduction of nC-F2DLT to LTqt is meaningful since the two models are proved to be equivalent, as we report in the following theoretical result.
Proposition 1
The Non-Competitive Friend-Foe Dynamic Linear Threshold Model (nC-F2DLT) and the Linear Threshold Model with quiescence time (LTqt) are equivalent. \(\blacktriangleleft \)
Proof
According to the definition of equivalence of two diffusion models in (Kempe et al. 2003; Chen et al. 2013), in order to prove the equivalence of nC-F2DLT and LTqt we need to prove that the distribution of the active sets for any given seed set S0 is the same under the two models. We provide a proof by induction, hence we consider the evolution of the active sets during the diffusion rounds.
For the LTqt model, the probability of a node to be activated exactly at time t+1 (with t≥1) is given by:
$$ \begin{aligned} \Pr(v \in \widetilde{S_{t+1}} \mid v \notin S_{t}) &= \frac{\Pr\left(v \in \widetilde{S_{t+1}}, v \notin S_{t}\right)}{\Pr(v \notin S_{t})} \\ &= \frac{\Pr\left(\sum\nolimits_{u \in S_{t-1}}w_{uv} < \theta_{v} \leq \sum\nolimits_{u \in S_{t}}w_{uv}\right)}{\Pr\left(\sum\nolimits_{u \in S_{t-1}} w_{uv} < \theta_{v}\right)} \\ &= \frac{\sum\nolimits_{u \in S_{t} \setminus S_{t-1} }w_{uv}}{1-\sum\nolimits_{u \in S_{t-1}}w_{uv}} \end{aligned} $$
(4)
Above, it should be noted that the joint probability \(\Pr \left (v \in \widetilde {S_{t+1}}, v \notin S_{t}\right)\) corresponds to the probability that the threshold associated with node v falls into the interval denoted by the influence received by v until the previous time step and the one received at the current time step. Moreover, Pr(v∉St) is just the probability that, at time (t−1), the influence received by v is still below its threshold. Finally, we derive the last equality in Eq. 4, which intuitively denotes that the influence exerted by the nodes in St∖St−1, i.e., the nodes turning into the active state exactly in the current time step, is decisive to exceed the threshold θv.
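The last equality in Eq. 4 can be checked numerically. The sketch below relies on the standard LT assumption that θv is drawn uniformly at random from [0,1], so that the probability of θv falling in an interval (a,b] is b−a; the function name `activation_probability` is hypothetical.

```python
from fractions import Fraction

def activation_probability(w_prev, w_new):
    """Conditional activation probability of Eq. 4.

    w_prev: weights w_uv of in-neighbors already in S_{t-1}
    w_new:  weights w_uv of in-neighbors in S_t but not in S_{t-1}
    """
    a = sum(w_prev, Fraction(0))   # influence received up to time t-1
    b = sum(w_new, Fraction(0))    # influence added at time t
    if a >= 1:
        raise ValueError("the conditioning event has probability zero")
    # Numerator: Pr(a < theta_v <= a + b) = b.  Denominator: Pr(theta_v > a) = 1 - a.
    return b / (1 - a)

p = activation_probability([Fraction(3, 10)], [Fraction(1, 2)])
```

With an earlier influence of 0.3 and a newly added influence of 0.5, the conditional probability is 0.5/0.7, i.e., 5/7.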
For the nC-F2DLT model, the conditional probability \(\Pr \left (v \in \widetilde {S_{t+1}} \mid v \notin S_{t}\right)\) can be derived starting from Eq. 4 by constraining wuv such that \(u \in N^{in}_{+}(v)\), i.e., only trusted relations are considered. This leads to an equivalent definition of conditional probability, which holds for every time step t and seed set S0. Therefore, we can conclude that the final active sets will be the same for both models. □
It should be noted that, due to the quiescence times, the sets of active nodes in the two models may not be the same at every time step, but the two final active sets will match each other.
Since the introduction of quiescence time in LT does not have effect on the distribution of the final active nodes (Chen et al. 2013), we obtain the following equivalence: LT ≡LTqt ≡nC-F2DLT. Therefore, the activation function is still monotone and submodular under nC-F2DLT.
Example 2
Consider Fig. 5, where the propagation process unfolds according to the LTqt dynamics. Nodes u and z are chosen as initial seeds. Thresholds and weights are set such that θv≤wuv and max{wvx,wzx}<θx≤wvx+wzx, therefore the combined influence of v and z is required for the activation of node x. The dashed edge denotes a distrust connection removed as a result of the reduction defined in Definition 4. In the initial time step (t=0), u activates v causing its transition from the inactive state to the quiescent state (in yellow). When \(t=\tau _{v}^{max}\), v turns to the active state, and together with z it becomes able to trigger the activation of node x (which will eventually become active by the time-horizon T).
It should be noted that the same dynamics holds for the nC-F2DLT model, apart from the difference that concerns the quiescence time of node v: this would be less than \(\tau _{v}^{max}\) since y, a foe of v, is not involved in the propagation process. ■
Competitive diffusion
We focus here on spC-F2DLT and npC-F2DLT, and show that both models can be reduced to the Homogeneous Competitive Linear Threshold (H-CLT) with Majority Vote as tie-breaking rule (Chen et al. 2013). This is a competitive, progressive model based on LT, for which it is known that its activation function is monotone but not submodular regardless of the particular tie-breaking rule.
To begin with, we might recall that the non-progressive LT-based diffusion can be reduced to the progressive case, using a particular form of layered graph (Kempe et al. 2003). Given a time interval T and a diffusion graph G=〈V,E〉 for non-progressive LT, a new graph GT can be derived such that every node v∈V will have a replica vt in every layer at time t∈T, and for every edge (u,v)∈E there will be an edge (ut−1,vt) in GT.
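The layered-graph construction recalled above can be sketched as follows. The function name `layered_graph` and the encoding of a replica vt as a pair (v, t) are assumptions made for illustration.

```python
def layered_graph(edges, horizon):
    """Edges of the layered graph: (u, t-1) -> (v, t) for every (u, v) and t = 1..horizon."""
    return {((u, t - 1), (v, t))
            for (u, v) in edges
            for t in range(1, horizon + 1)}

layers = layered_graph([("u", "v")], horizon=2)
```

For the single edge (u,v) and two time steps, the result contains the edges (u0,v1) and (u1,v2).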
Unfortunately, this serialization technique cannot be directly applied to our models, since it is not designed to deal with competitive or non-progressive diffusion and it discards activation or delayed propagation aspects. In the following, we define serialization techniques that are suitable for our competitive models and treat one particular configuration at a time. One general requirement is related to the time horizon to bound the unfolding of the diffusion process. In fact, when dealing with competitive models, the termination guarantee is lost. A simple example is provided next to depict such a non-termination scenario.
Example 3
In Fig. 6, nodes u and z are chosen as seed for the green campaign and the red one, respectively. Nodes v and x become green-active and red-active, respectively, at time t=1. Next, they will constantly switch their activation campaign, causing non-termination of the diffusion process. ■
CONFIGURATION 1: No quiescence time, constant activation-threshold.
We assume that q(v)=0 and g(v,t)=θv, for all v∈V,t∈T. For both spC-F2DLT and npC-F2DLT, we claim their reduction to the H-CLT model with majority voting as tie-breaking rule.
Definition 5
spC-F2DLT graph serialization for reduction to H-CLT. Given a time interval T, we define a layered graph GT=〈VT,ET〉 such that, for each layer at time t∈T, every node v∈V will be represented in VT as a tuple \(\left \langle v^{1}_{t},v^{2}_{t},v^{3}_{t} \right \rangle \). Instances \(v^{1}_{t}\) and \(v^{2}_{t}\) have activation-threshold equal to 0, while \(v^{3}_{t}\) has the same threshold as the original node v∈V. The set of edges is defined as \(E^{T} = \left \{\left (u_{t}^{1},v_{t+1}^{3}\right) \mid (u,v) \in E, t, t+1 \in T \right \} \cup \left \{\left (v_{t}^{3},v_{t}^{2}\right) \mid v \in V, t \in T\right \} \cup \left \{\left (v_{t}^{2},v_{t}^{1}\right) \mid v \in V, t \in T\right \} \cup \left \{\left (v_{t}^{1},v_{t+1}^{2}\right) \mid v \in V, t \in T\right \}\), and the following constraint on edge weights must hold: \( \forall v_{t}^{2} \in V^{T}, \ w\left (v_{t-1}^{1},v_{t}^{2}\right) < w\left (v_{t}^{3},v_{t}^{2}\right).\) □
In the above definition, triples act as connectors between two consecutive time-layers. The role of any connector component is as a sort of “switch” to enable a node choosing between its activation state in a layer and the one in the subsequent layer. In other words, node \(v^{1}_{t}\) is the main instance of node v, since the activation state of \(v^{1}_{t}\) reflects the state of v in the original graph, under spC-F2DLT at time t; node \(v^{3}_{t}\) is the instance of v connected with other nodes from layer at t−1, therefore it reflects the influence received by v in the original graph, at time t−1; if the activation attempt to \(v^{3}_{t}\) fails, node \(v^{2}_{t}\) will be activated with the same state of v; otherwise, according to the edge weight constraint (cf. Definition 5), \(v^{2}_{t}\) will switch to the other campaign, and then will propagate to instance \(v^{1}_{t}\). Recall that \(v^{1}_{t}, v^{2}_{t}\) have zero activation-threshold. Figure 18 in Appendix A shows an example of serialization for a spC-F2DLT diffusion graph with time horizon set to 2.
It should be emphasized that, compared to the serialization method in (Kempe et al. 2003), we require replication of each node in each layer, and additional edges connecting the replica-instances, in order to allow the maintenance of the activation state when no activation event occurs between two time-consecutive layers.
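The node and edge sets of Definition 5 can be assembled as in the sketch below. The function name `serialize_spc`, the encoding of an instance \(v^{i}_{t}\) as the triple (v, i, t), and the choice T={0,…,horizon} are assumptions; the carry-over edges \((v^{1}_{t},v^{2}_{t+1})\) are created only when layer t+1 exists, and edge weights (with their constraint) are left unassigned.

```python
def serialize_spc(nodes, edges, theta, horizon):
    """Build instance thresholds and the edge set E^T of Definition 5."""
    thresholds = {}
    ET = set()
    for t in range(horizon + 1):
        for v in nodes:
            thresholds[(v, 1, t)] = 0          # main instance
            thresholds[(v, 2, t)] = 0          # "switch" instance
            thresholds[(v, 3, t)] = theta[v]   # receives influence from layer t-1
            ET.add(((v, 3, t), (v, 2, t)))
            ET.add(((v, 2, t), (v, 1, t)))
            if t + 1 <= horizon:
                # Lets v keep its state when no activation event occurs.
                ET.add(((v, 1, t), (v, 2, t + 1)))
        if t + 1 <= horizon:
            for (u, v) in edges:
                ET.add(((u, 1, t), (v, 3, t + 1)))
    return thresholds, ET

thresholds, ET = serialize_spc(["u", "v"], [("u", "v")],
                               {"u": 0.4, "v": 0.6}, horizon=2)
```

For two nodes, one edge and three layers, this yields 18 instances and 18 edges: 12 inside the connectors, 4 carry-over edges and 2 copies of the original edge.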
An analogous reduction technique can be defined for the npC-F2DLT model.
Definition 6
npC-F2DLT graph serialization for reduction to H-CLT. Given a time interval T, we define a layered graph GT=〈VT,ET〉 such that, for each layer at time t∈T, every node v∈V will be represented in VT as a tuple \(\left \langle v^{1}_{t},v^{2}_{t},v^{3}_{t} \right \rangle \). Instances \(v^{1}_{t}\) and \(v^{2}_{t}\) have activation-threshold equal to 1 and 0, respectively, while \(v^{3}_{t}\) has the same threshold as the original node v∈V. The set of edges is defined as \( E^{T} = \left \{\left (u_{t}^{1},v_{t+1}^{3}\right) \mid (u,v) \in E, t,t+1 \in T \right \} \cup \left \{\left (v_{t}^{3},v_{t}^{2}\right) \mid v \in V, t \in T\right \} \cup \left \{\left (v_{t}^{2},v_{t}^{1}\right) \mid v \in V, t \in T\right \} \cup \left \{\left (v_{t}^{3},v_{t}^{1}\right) \mid v \in V, t \in T\right \} \cup \left \{\left (v_{t}^{1},v_{t+1}^{2}\right) \mid v \in V, t, t+1 \in T \right \}\), and the following constraints on edge weights must hold: \( \forall v_{t}^{2} \in V^{T}, \ w\left (v_{t-1}^{1},v_{t}^{2}\right) < w\left (v_{t}^{3},v_{t}^{2}\right)\), and \( \forall v_{t}^{1} \in V^{T}, \ w\left (v_{t}^{2},v_{t}^{1}\right) + w\left (v_{t}^{3},v_{t}^{1}\right) = 1.\) □
It should be noted that the last condition in Definition 6 requires nodes \(v_{t}^{2}\) and \(v_{t}^{3}\) to hold the same activation state in order to activate \(v_{t}^{1}\), whose activation-threshold is equal to 1.
Analogously to the reduction of spC-F2DLT to H-CLT, we can conveniently devise a notion of “connector” component between any two consecutive layers, which however in this case should also account for node deactivations. Figure 19 in Appendix A shows an example of connector for the npC-F2DLT model.
Claim 1
For any given diffusion graph \({\mathcal {G}}\) under spC-F2DLT (resp. npC-F2DLT), assuming constant activation-threshold and no quiescence time, every node v in \({\mathcal {G}}\) is active at time t∈T if and only if its corresponding instance \(v^{1}_{t}\) is active in the serialized graph GT (resp. npC-F2DLT). \(\blacktriangleleft \)
CONFIGURATION 2: Constant quiescence time, constant activation-threshold.
We assume that q(v)=τv and g(v,t)=θv, for all v∈V. For both spC-F2DLT and npC-F2DLT, we claim their reduction to H-CLT with majority voting as tie-breaking rule.
In this case, we need to consider that, whenever a node is activated, its quiescence time may not expire before the time horizon; for this reason, we will consider only nodes reachable from \(S_{0} = S^{\prime }_{0} \cup S^{\prime \prime }_{0}\) within T, for any two given seed sets \(S^{\prime }_{0}\) and \(S^{\prime \prime }_{0}\). To identify such nodes, we define a quiescence-aware distance measure that accounts for the quiescence times along the path connecting any two nodes. Given nodes u,v, and the set P(u,v) of all paths between u and v, the distance from u to v will be measured as \( d(u,v) = \min _{p \in P(u,v)} \sum \nolimits _{x \in p} \tau _{x} \). Moreover, we denote with d(S0,v) the minimum distance between nodes u∈S0 and v. By exploiting this distance, we will discard all nodes that cannot be “contagious” before the end of T, say tmax. Therefore, the node set VT of the layered graph is defined as:
$$V^{T} = \left\{ \left\langle v_{t}^{1},v_{t}^{2},v_{t}^{3} \right\rangle \mid \forall v \in V, \ t \in T, \ d(S_{0},v) < t_{max}\right\}. $$
Each node v∈V with quiescence time τv will have connections from the previous layers according to the following rule: for any layer at time t, if t<d(S0,v) then v will not have any incoming edges, otherwise all incoming edges of v will be from the layer at time t−τv−1.
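The quiescence-aware distance d(S0,v) can be computed with a multi-source variant of Dijkstra's algorithm in which costs sit on nodes rather than on edges. This is a sketch under stated assumptions: the function name `quiescence_distance` is hypothetical, and the sum over x∈p is read literally, so the quiescence times of both endpoints of a path are counted.

```python
import heapq

def quiescence_distance(edges, tau, seeds):
    """Return a dict v -> d(S0, v) for every node reachable from the seeds.

    edges: iterable of directed edges (u, v); tau: dict node -> quiescence time.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)

    dist = {s: tau[s] for s in seeds}          # a path made of the seed alone
    heap = [(d, s) for s, d in dist.items()]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                           # stale heap entry
        for v in adj.get(u, []):
            nd = d + tau[v]                    # entering v costs tau_v
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

edges = [("u", "v"), ("v", "x"), ("u", "x")]
tau = {"u": 1, "v": 2, "x": 1}
dist = quiescence_distance(edges, tau, seeds={"u"})
t_max = 3
kept = sorted(v for v, d in dist.items() if d < t_max)   # nodes retained in V^T
```

In this toy graph the direct path from u to x costs 2, whereas the path through v costs 4; with tmax=3, node v (at distance 3) is discarded from the layered graph.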
Using the above settings in the serialization method previously presented, it can easily be demonstrated that both spC-F2DLT and npC-F2DLT can be reduced to an equivalent H-CLT model.
Claim 2
For any given diffusion graph \({\mathcal {G}}\) under spC-F2DLT (resp. npC-F2DLT), assuming constant activation-threshold and constant quiescence time, every node v in \({\mathcal {G}}\) is active at time t∈T if and only if its corresponding instance \(v^{1}_{t}\) is active in the serialized graph GT (resp. npC-F2DLT). \(\blacktriangleleft \)
CONFIGURATION 3: Variable quiescence time, constant activation-threshold.
We assume that q(v,t) is variable, while g(v,t)=θv, for all v∈V,t∈T.
As in the previous case, we need to specify the seed sets \(S^{\prime }_{0}, S^{\prime \prime }_{0}\). However, note that the quiescence time of a node now depends on the actual activation state of its in-neighborhood (cf. Eq. 3), which makes a direct serialization of the whole diffusion graph infeasible.
Starting from the original diffusion graph \({\mathcal {G}}\), we derive an “intermediate” graph \(\widehat {{\mathcal {G}}}\), which is identical to \({\mathcal {G}}\) except that each node v∈V is associated with a quiescence time interval \([\tau _{v},\tau _{v}^{max}]\), where \(\tau _{v}^{max}=\tau _{v} + \psi (N_{-}^{in}(v))\). Let us denote with \({\mathcal {G}}^{min}\) the instance of \(\widehat {{\mathcal {G}}}\) such that the quiescence time of every \(v \in \widehat {{\mathcal {G}}}\) is τv, and with \({\mathcal {G}}^{max}\) the instance of \(\widehat {{\mathcal {G}}}\) such that the quiescence time of every \(v \in \widehat {{\mathcal {G}}}\) is \(\tau _{v}^{max}\).
Although we cannot assert that spC-F2DLT and npC-F2DLT are equivalent to H-CLT under the layered graph obtained by applying the previously described serialization techniques, an important theoretical result can nonetheless be provided, as reported next.
Claim 3
For any diffusion graph \({\mathcal {G}}\) under spC-F2DLT (resp. npC-F2DLT), with campaigns C′,C′′, assuming constant activation-threshold and variable quiescence time, for any seed sets \(S^{\prime }_{0}\) and \(S^{\prime \prime }_{0}\), it holds that:
$$ \sigma^{\prime}_{H\text{-}CLTmax}(S^{\prime}_{0},S^{\prime\prime}_{0}) \leq \sigma^{\prime}(S^{\prime}_{0},S^{\prime\prime}_{0}) \leq \sigma^{\prime}_{H\text{-}CLTmin}(S^{\prime}_{0},S^{\prime\prime}_{0}), $$
(5)
where σ′ is the number of nodes activated by C′ under spC-F2DLT (resp. npC-F2DLT), \(\sigma ^{\prime }_{H\text {-}CLTmax}(S^{\prime }_{0},S^{\prime \prime }_{0})\) and \(\sigma ^{\prime }_{H\text {-}CLTmin}(S^{\prime }_{0},S^{\prime \prime }_{0})\) are the number of nodes activated by C′ under H-CLT in the layered graph obtained by serialization of spC-F2DLT (resp. npC-F2DLT) on \({\mathcal {G}}^{max}\) and \({\mathcal {G}}^{min}\), respectively. \(\blacktriangleleft \)
Enabling variable quiescence time, i.e., ψ(·), means that the exact time each node needs to move from the quiescent state to the active one cannot be established at the beginning of the propagation process. Since for any node v the quiescence time ranges within \([ \tau _{v}, \tau _{v}^{max}]\), we consider two opposite scenarios. In the first scenario, represented by the rightmost side of Eq. 5, each node is assumed to wait the minimum amount of time, i.e., τv, before its activation; this leads to a higher fraction of nodes that could be activated before the time horizon T is reached. The second scenario, represented by the leftmost side of Eq. 5, assumes that each node has to wait the maximum possible quiescence time, i.e., \(\tau _{v}^{max}\); as a consequence, a smaller fraction of nodes will be able to complete the activation process before the time limit, thus leading to a lower spread.
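The two extreme instances that bracket the spread in Eq. 5 differ only in the quiescence time assigned to each node. The sketch below builds both assignments under stated assumptions. The function name `quiescence_bounds` is hypothetical, and ψ is passed in as an arbitrary callable on the set of distrusted in-neighbors, since its actual definition is given elsewhere in the paper (cf. Eq. 3).

```python
from typing import Callable, Collection, Dict, Hashable, Tuple

Node = Hashable


def quiescence_bounds(
    tau: Dict[Node, float],
    distrusted_in: Dict[Node, Collection[Node]],
    psi: Callable[[Collection[Node]], float],
) -> Tuple[Dict[Node, float], Dict[Node, float]]:
    """Return the per-node quiescence times of G^min and G^max.

    G^min uses tau_v for every node; G^max uses tau_v + psi(N_-^in(v)).
    psi is a placeholder for the paper's function, not its definition.
    """
    tau_min = dict(tau)
    tau_max = {v: tau[v] + psi(distrusted_in.get(v, ())) for v in tau}
    return tau_min, tau_max
```

Serializing the diffusion graph once with each assignment then yields the two H-CLT instances whose spreads appear on the two sides of Eq. 5.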
CONFIGURATION 4: No quiescence time, variable activation-threshold.
We assume that q(v)=0 and g(v,t)=θv+𝜗(θv,t), for all v∈V,t∈T. For both spC-F2DLT and npC-F2DLT, we claim their reduction to H-CLT with majority voting as tie-breaking rule. In the following, we refer to the biased activation-threshold function, although analogous considerations hold for the non-biased activation-threshold function.
Because of the dynamic behavior of the activation-threshold function, we cannot predict its value at any particular time step of the diffusion process; nevertheless, by specifying the value of coefficient δ in Eq. 1, we can derive the value of \(t_{v}^{max}\), which indicates how many time-layers we have to look back in order to know the actual threshold value of v at a particular time t. In order to capture this dynamic aspect in H-CLT, we define a further serialization technique, built on top of the previously defined one. We first restrict our attention to a particular case, and afterwards we provide some rules that apply to the general case.
Let us focus on a particular node v and assume that, at any two consecutive time steps of activation for the same campaign, its threshold increases by δ. Again, node v will have replicas for any time-layer t, i.e., \(\langle v^{1}_{t},v^{2}_{t},v^{3}_{t} \rangle \), with the first replica, \(v^{1}_{t}\), holding the actual state of v in the corresponding serialized graph for the competitive model. In addition, we introduce further replicas, whose number equals the value \(t_{v}^{max}\). Suppose, for the sake of simplicity, that \(t_{v}^{max} = 3\); we then derive replica nodes \(\left \langle v_{t}^{3,r1},v_{t}^{3,r2},v_{t}^{3,r3} \right \rangle \), such that each of them has a threshold value in [θv,1], with increments of δ. Figure 7 illustrates this new component in the serialized graph.
Because this component is introduced as an extension of the previous techniques, the meaning of the nodes \(v^{1}_{t-1},v^{1}_{t-2},v^{1}_{t-3}\) remains the same as in the previous cases. On the right side of Fig. 7, each of the additional replicas has a different value of threshold and it is connected with nodes coming from the previous layers. Clearly, the overall behavior of this component depends on the weights attached to every edge in the structure. In this regard, we define the following constraints on the edge weights:
$$ \left\{\begin{array}{ll} w^{3,r1} > w_{1}^{13} & \quad (a) \\ \forall i>1 \quad w_{i}^{13}=w^{3,ri} & \quad (b) \\ \forall i \geq 1 \quad w_{i}^{13} > \sum\nolimits_{j>i}^{n} w_{j}^{13} & \quad (c)\\ \forall i \geq 1 \quad w^{3,ri} > \sum\nolimits_{j>i}^{n} w^{3,rj} & \quad (d) \\ w^{3,r1} - w_{1}^{13} < w_{n}^{13} & \quad (e) \end{array}\right. $$
(6)
It should be noted that the activation attempts are performed directly on the replicas. Therefore, the above constraints on the edge weights control whether a node assumes the state derived as the outcome of the most recent activation attempts, or the one consistent with its personal history. Each of the aforementioned constraints contributes to this decision process, with a different purpose. Eq. 6(a) ensures that the state derived from the last activation attempt is always preferred to the one derived from the previous time step. Eq. 6(b) ensures that the information coming from the previous time steps is given the same importance as the one derived from the current replicas. Eq. 6(c-d) ensure that the most recent information, i.e., the closest previous time steps, has higher priority than the earliest one. Eq. 6(e) ensures consistency between the state assumed in the closest previous time step and the one assumed in the farthest involved time step (e.g., the third previous time step in the addressed scenario).
Moreover, the threshold of the “central” node in the component (\(v^{3}_{t}\)) is set to \(w^{3,r1}\), to ensure sequentiality of the diffusion. By setting \(\theta _{v_{t}^{3}}\) equal to \(w^{3,r1}\), we prevent \(v_{t}^{3}\) from being activated by its own replicas belonging to layers preceding the (t−1)-th layer.
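The constraints in Eq. 6 can be checked mechanically for a candidate weight assignment. The following sketch is only an illustration under stated assumptions. The function name `check_connector_weights` and the list-based encoding are hypothetical. The quantifier "for all i ≥ 1" is read as including i = n, where the sum on the right-hand side is empty, so the last weight only needs to be positive.

```python
from typing import Dict, Sequence


def check_connector_weights(
    w13: Sequence[float], w3r: Sequence[float]
) -> Dict[str, bool]:
    """Evaluate constraints (a)-(e) of Eq. 6 for one connector component.

    Encoding (an assumption of this sketch): w13[i-1] stands for w_i^{13}
    and w3r[i-1] stands for w^{3,ri}, with i = 1..n.
    """
    n = len(w13)
    if n == 0 or len(w3r) != n:
        raise ValueError("w13 and w3r must be non-empty and of equal length")
    return {
        # (a) the current replica outweighs the previous time step
        "a": w3r[0] > w13[0],
        # (b) equal weights for i > 1
        "b": all(w13[i] == w3r[i] for i in range(1, n)),
        # (c) each w_i^{13} dominates the sum of all later ones
        "c": all(w13[i] > sum(w13[i + 1:]) for i in range(n)),
        # (d) same dominance condition on the replica weights
        "d": all(w3r[i] > sum(w3r[i + 1:]) for i in range(n)),
        # (e) the gap in (a) stays below the smallest history weight
        "e": w3r[0] - w13[0] < w13[-1],
    }
```

For the three-replica scenario discussed above, the hypothetical assignment `w13 = [4.0, 2.0, 1.0]` and `w3r = [4.5, 2.0, 1.0]` satisfies all five constraints, whereas raising the first replica weight to 5.0 violates (e).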
Figure 8 shows how the above defined connector is integrated into a serialization technique. In the figure, only the connections incident on vertex v are expanded. The red edges are the ones connecting consecutive layers: the replica \(v_{t}^{3,r1}\) is connected with the previous layer, the replica \(v_{t}^{3,r2}\) is connected with the second previous layer, and so on. Blue edges represent the new connections due to the introduction of this new component.
Claim 4
For any given diffusion graph \({\mathcal {G}}\) under spC-F2DLT (resp. npC-F2DLT), assuming variable activation-threshold and no quiescence time, every node v in \({\mathcal {G}}\) is active at time t∈T if and only if its corresponding instance \(v^{1}_{t}\) is active in the serialized graph GT (resp. npC-F2DLT). \(\blacktriangleleft \)