Comparative Analysis of Box-Covering Algorithms for Fractal Networks

Research on fractal networks is a dynamically growing field of network science. A central issue is to analyze fractality with the so-called box-covering method. As this problem is known to be NP-hard, a plethora of approximating algorithms have been proposed throughout the years. This study aims to establish a unified framework for comparing approximating box-covering algorithms by collecting, implementing, and evaluating these methods in various aspects including running time and approximation ability. This work might also serve as a reference for both researchers and practitioners, allowing fast selection from a rich collection of box-covering algorithms with a publicly available codebase.

objects: using the box-covering method, also called box-counting method. The method aims to determine the minimum number of boxes needed to cover the entire network. Although this problem belongs to the family of NP-hard problems ), numerous algorithms have been proposed to obtain an approximate solution.
In this contribution, we provide a systematic review and comparative analysis on a large collection of algorithms that have been proposed throughout the years to perform the box-covering of the network. The high number of approximating algorithms introduced recently calls for an impartial, systematic comparison of the algorithms. There are a few recent works that describe the most important box-covering algorithms (Wen and Cheong 2021;Rosenberg 2020;Huang et al. 2019;Deng et al. 2016), however, to the best of our knowledge, this is the first comprehensive comparative study in this field. After collecting and implementing various methods, we compared their performance in terms of approximation ability and running time on a number of real-world networks. In addition, we made all our codes publicly available on GitHub (https:// github. com/ Peter TKova cs/ boxes) together with detailed documentation and a tutorial Python notebook. It makes our results reproducible and allows for future contributions from interested community members.

Fractality in geometry
The term fractal was coined by Benoit B. Mandelbrot and the concept was originally developed for sets in a Euclidean space (Mandelbrot 1982;Falconer 2004). In fractal geometry, the box-covering algorithm is one possible way to estimate the fractal dimension of a set S in a d-dimensional space ( R d ). The set is lying on an evenly spaced d-dimensional grid and the hypercubes of this grid are boxes of size l. We consider the N(l) minimal number of boxes of size l that are needed to cover the set and see how it scales with the box size. The box-counting or Minkowski dimension is defined as The exact mathematical definition of fractals goes beyond the scope of this paper. Roughly speaking, a set with a non-integer box-counting dimension is considered to have fractal geometry, since it suggests that the fractal scales differently from the space it resides in.

Fractality of networks
The notion of fractal networks is motivated by the concepts from fractal geometry and measuring the fractal dimension of a network is analogous to the geometric case (Song et al. 2005). The box-counting method can be easily generalized to networks since the vertex set of an undirected graph together with the graph distance function (geodesic distance) form a metric space, and the box-counting algorithm generalizes to any metric space. For networks, the box-covering also called box-counting method works as follows: • We say that a group of nodes fit into a box of size l B if, for any pair of these nodes, their shortest distance is less than l B . dim B (S) = lim l→0 log N (l) − log l • A network is covered by boxes of size l B if its nodes are partitioned such that every group fits into one box of size l B . • As the number of possible coverings is finite, there is a minimal number of boxes to cover the network with l B sized boxes. This is what we denote by N B (l B ). • If the minimal number of boxes scales as a power of the box size, i.e., N B (l B ) ∝ l −d B B , the network is informally defined to be fractal with a finite fractal dimension or box dimension d B .
Although the " ∝ " scaling relation is not well-defined and cannot strictly hold for finite networks, the box-covering-based fractal dimension can be defined mathematically rigorously for infinite networks and graph sequences (Komjáthy et al. 2019). However, in practice, one usually verifies the fractality of real-world networks by approximating the N B (l B ) values on a feasible range of l B box size values and then inspects them on a log-log plot to see if the relationship follows a power-law to a good approximation (see Fig. 1). The left side of Fig. 1 shows an optimal box-covering of a small graph with three different box sizes. The right side of the figure illustrates the scaling of the number of boxes N B and the box size l B for a fractal and a non-fractal network.
Although fractality is also defined for weighted networks (Wei et al. 2013), in this work we only focus on unweighted and undirected graphs. We also note that besides the boxcovering-based fractal dimension, several other network dimension concepts have been proposed throughout the years, some of them are equivalent under some regularity conditions on the network . For a survey on fractal dimensions of networks, we refer to Rosenberg (2018Rosenberg ( , 2020.

Algorithms
The optimal box-covering of some mathematically tractable network models (e.g. (u, v)flower, hierarchical scale-free graph, Song-Havlin-Makse model, Sierpinski network) can be determined rigorously (Rozenfeld et al. 2007;Komjáthy et al. 2019;Niu and Li 2020), however, the box-covering of real networks cannot be studied analytically hence it must be calculated algorithmically. The box-covering of a network is known to be NPhard ). Hence to have an algorithm of practically acceptable time complexity, one must use approximating methods. As it is usual for semi-empirical methods, there are a good number of different proposals for the task, see Table 1. To give some intuitive summary, we can say that most algorithms follow a greedy strategy where the Fig. 1 On the left: the illustration of the box-covering problem on a small graph. On the right: a schematic plot depicting the scaling of the number of boxes for a fractal and a non-fractal network. The figure was motivated by Gallos et al. (2007) distinction between algorithms is drawn by the actual greedy decision method. However, we present a number of approaches beyond the greedy paradigm.
It is also important to mention that there are algorithms that use diameter-based boxes, while other methods use radius-based boxes, also called balls of radius r B around a center node c. The two approaches sometimes cause some ambiguity in the terminology.
Using radius-based boxes require the notion of "centers" or "seeds", which means that each box is assigned to a special, central node. While this idea is natural in R n , it may be misleading for a graph since the edges between vertices generally cannot be embedded into a Euclidean space. For example, in a box that contains a cyclic subgraph, there is no vertex that would be special by any means. The reason, however, why these nodes are called seeds or center nodes is that these nodes are the first elements of the boxes and the boxes are built around them.

General remarks
In the following part, we describe and evaluate the implemented algorithms in detail. Throughout this paper, we denote the box size by l B , the box radius (if applies) by r B ,  Kuang et al. (2014) where l B = 2r B + 1 . However, let us note that in the literature the definition of an l Bsized box can be inconsistent. In the implementation, we define boxes such that for boxsize l B the distance between two nodes of the same box can be less than or equal to l B . In the evaluation part, we followed the widely used convention, which means that in an l b -box the distance between any two nodes is strictly less than l B . This means that the "equivalent box-size" is l B =l B + 1 = 2r B + 1.
In this section, we use the less than or equal ("implementation") convention for the algorithms. We will also provide a sketch of the pseudocode for the discussed algorithms. The aim is to foster understanding, there may be minor differences compared to the actual Python code.
As it is apparent from the definition, the box-covering process of a network requires the ability to determine the shortest path between two nodes to be able to decide if they can be in the same box. This could be done in multiple ways, for example, by performing a breadth-first search on the fly. However, for convenience and to avoid calculating the pairwise distances multiple times, we implemented the following approach: in the beginning, we calculate and store the pairwise shortest distances d(v i , v j ) for all v i , v j ∈ V , where V is the vertex set of the graph. This data is then used in the subsequent calculations. With this notion, we define the ball of radius r B around center c , which is B(c, r B ) = {d(v i , c) ≤ r B | v i ∈ V } . This term will be used extensively throughout this paper since many algorithms operate on these balls. Note that in this work, the terms graph and network are used interchangeably.

Greedy coloring (GC)
The problem of box-covering can be mapped to the famous problem of graph coloring ). Therefore, a graph coloring algorithm can be utilized to solve the boxcovering problem. The idea is the following: let us consider the auxiliary graph of our original network, which consists of the same vertices and "auxiliary edges". This means that two vertices are connected by an edge in the auxiliary graph if and only if their distance in the original network is greater than l B , which means that they cannot be in the same box. The idea is illustrated in Fig. 2, moreover, the pseudocode is detailed in Algorithm 1. Once we construct the auxiliary graph for a given l B , we perform a graph (vertex) coloring on the auxiliary graph, i.e., we assign colors to the vertices of the auxiliary graph such that (1) adjacent vertices cannot have the same color, (2) and we seek to use the least number of colors. This problem is equivalent to the box-covering of the original network since the assigned colors in the auxiliary graph correspond to the assigned boxes in the original network. The vertices of the same color in the auxiliary graph are at most l B far away from each other, hence they form a box in the original network. Moreover, the minimum number of colors required to color the vertices (the chromatic number) of the auxiliary graph equals the minimum number of boxes needed to cover the original graph.
Unfortunately, the complexity of the graph coloring problem is NP-hard. A wellknown approximating algorithm is greedy coloring. This algorithm consists of two main steps (Kosowski and Manuszewski 2004): 1. Order the nodes by some method 2. Iterate over the above sequence: assign the smallest possible color ID to every node The schematic way the greedy coloring box-covering algorithm works is presented in Algorithm 2.
The algorithm is completely well-defined if the strategy, i.e., the ordering method is specified. In this work, we used a random sequence. Even though more advanced methods exist, we think that this naive approach serves as a great baseline for the other algorithms.
Our implementation relies on the greedy coloring implementation of the networkx package (Hagberg et al. 2008) and in our Python package, we made it possible to select any other built-in strategy. Li et al. (2017) introduced a related box-covering algorithm, which uses a so-called maxmin ant colony heuristic optimization method to color the vertices of the auxiliary graph.
While in this work we focus on unweighted networks, to be comprehensive, let us note that there are a few box-covering methods, that aim to solve the box-covering problem by assigning weights to the edges of an originally unweighted network. For example, Zhang et al. (2016) proposed a greedy coloring method where the construction of the auxiliary graph is different. They adopted Coulomb's Law to assign repulsive force, i.e., weights to the edges, which was motivated by the observation that in real fractal networks hubs are less likely to be connected (Song et al. 2006), however, there are some counterexamples as well Nagy 2018). The construction of the auxiliary graph hence is based on the weighted shortest paths, called smallest repulsive force paths, and not on the original shortest paths. Recently, Gong et al. (2020) introduced a deterministic box-covering algorithm, which is a modification of the algorithm of Zhang et al., namely it starts the coloring with the high-degree nodes.

Random sequential
The random sequential algorithm was introduced by Kim et al. (2007) as one of the first approximating box-covering algorithms and similarly to the algorithms that were proposed by Song el al. (2007), it also uses the idea of burning (breadth-first search). Burning means that the boxes are generated by growing them from one randomly selected center (or seed) vertex towards its neighborhood, see Fig. 3. Moreover, once some nodes have been assigned to a box, they are "burned out".
In every step, an unburned center node c is randomly chosen, and then the ball of radius r B is formed around c, more precisely, the algorithm selects those unburned nodes that are at most r B far from the center node c. These newly burned nodes together form a new box. Note that in the original paper, the center node is selected from the whole set of nodes V, yet we modified it to select it from the uncovered set of nodes. The authors argue that in some cases, that was necessary to obtain the desired behavior, namely, they stated that if they disallow disconnected boxes, they get different results for the scaling of N B on the graph of WWW. In a disconnected box, the vertices are not connected, but there is a path between them and the center node, which is however not a member of the box because it has been burned already.
One year later, Gao et al. (2008) proposed a so-called rank-driven box-covering algorithm, which is a computationally more efficient version of the random sequential algorithm. Zhang et al. (2017) proposed a modified version of the random sequential algorithm, where the seeds of the firstly selected boxes are the nodes with the highest degrees, and it forces that the largest hubs are in different boxes. The authors also compare their algorithm to the original random sequential method, and a so-called P-BC variant of the RS, which selects always the highest degree nodes as seeds.

Compact box burning (CBB)
As the name suggests, similarly to the random sequential algorithm the compact box burning algorithm also uses the concept of burning. The algorithm was introduced by Song et al. (2007) together with the greedy and the MEMB algorithms. The main point is to grow boxes by picking new nodes randomly from a candidate set C that contains all nodes that are not farther away than l B from any node that is already in the candidate set. In the end, this set C is going to form a box, and the process guarantees that the box is compact. 1 Kitsak et al. (2007) introduced a simplified version of the CBB algorithm which estimates the number of boxes rather fast and it has been also used and detailed in a more recent work (Yuan et al. 2017). The modified algorithm does not construct the actual boxes, it only approximates N B for a given l B as follows: first initializes N B and sets its value to zero. Then it selects a random center node c, marks the nodes of the ball of radius l B centered on c, and finally increases the value of N B by one. Then randomly selects another unmarked center node and repeats this process until the whole network is covered. The authors stress that this estimation of N B is always less than the actual minimum number of boxes needed to cover the network, hence to improve the estimation they suggest performing the computation many times and then taking the maximum of the estimations which is claimed to be a better approximation of N B and that still has a small time complexity.
Note that this algorithm could be also considered as a simplified sampling-based box-covering method (see "Sampling-based method" section), which is a more general framework that selects the "best" box-covering from many independent box covers generated by a simple covering algorithm.

Maximal excluded mass burning (MEMB)
The Maximal Excluded Mass Burning algorithm is the third box-covering algorithm that was presented in the influential paper of Song et al. (2007). Instead of using l B , this method also uses the notion of radius r B and centered boxes: every box has a special node, a center. Boxes are constructed such that every member node of the box is not farther away than r B from the center node. The algorithm guarantees that all the boxes are connected, i.e., two nodes of the same box can always be reached with a path that is inside of the box.
The algorithm could be interpreted as an improvement of the random sequential method in some aspects but the analogy fails because the way burning is implemented is different from random sequential. It works as follows: • First, the next center is chosen on the basis of maximal excluded mass, that is the number of uncovered nodes not farther away than r B from the center node. 2 • Once the next center is chosen, all uncovered nodes within r B are covered, but not yet assigned to boxes. • Finally, after every node is covered, non-center nodes are assigned to centers in a way that the resulting boxes are connected.
The pseudocode of the MEMB algorithm is presented in Algorithm 4.
Figure 4 helps to understand the reason why boxes are only formed in the end and not burned on the fly. The authors' motivation was that in scale-free networks, where there are a few hubs, the hubs should be selected as the center of the boxes, otherwise the tiling of the network is going to be far from the optimal case. That is why the algorithm first selects centers and then assigns the remaining nodes to them.

Ratio of excluded mass to closeness centrality (REMCC)
The REMCC box-covering algorithm has been introduced by Zheng et al. (2016), and it can be regarded as a modification of the MEMB algorithm. Both algorithms rely on excluded mass, but it also considers the ratio of excluded mass to closeness centrality (REMCC) to select the center nodes.
In every step, a new center is chosen from the uncovered set of nodes such that the chosen node has the maximal f score, which is a novel metric in the paper. The f score is defined as the ratio of the excluded mass and closeness centrality or in other words, it is the product of the excluded mass and the average length of the shortest paths to all other nodes. The authors argue that the reason why they consider the closeness centrality in the selection of the center nodes is that if we choose a central (important) node of the network as a center (seed) of a box, then eventually more boxes will be needed to cover the network. After the center node is selected, all nodes in the r B ball of the center are covered. In this procedure-in contrast to the MEMB algorithm-centers are selected from the uncovered set of nodes.
The steps of the REMCC box-covering algorithm are detailed in Algorithm 6. Note that in the current implementation, the algorithm does not return the formed boxes, only the centers are determined, however, after the center nodes are given, the boxbuilding method of the MEMB algorithm can be applied.

MCWR
Recently, Liao et al. (2019) proposed an algorithm, called MCWR, which is a combination of the MEMB and the random sequential (RS) algorithms. Their goal was to create an algorithm that has the accuracy of the MEMB and the fast speed of the random sequential algorithm. In addition, to reduce the time complexity of the MEMB algorithm, a new way of keeping track of the excluded mass is proposed.
The authors argue that due to the excluded-mass-based selection of the central nodes, the MEMB algorithm has high accuracy but also high time complexity. Since the random sequential algorithm selects the center nodes randomly, it is one of the fastest algorithms but it gives a very rough approximation of the optimal number of boxes.
The main idea of the MCWR algorithm is that it mixes the two ways of center node selection. More specifically, the authors introduced a parameter p ∈ [0, 1] , which represents the mixing portion of the MEMB method in the approach of choosing a center node. Hence, before choosing a new center, a biased coin is tossed and with probability p the algorithm performs the usual MEMB steps and with probability 1 − p it chooses a center uniformly at random. 3 After finding a new center, the excluded mass of the vertices is updated, but in contrast to the original MEMB algorithm, to reduce the time complexity, it only updates the mass for the nodes that are inside of the newly covered ball of radius r B . The assignment of the center nodes to the non-center nodes, i.e., the construction of the actual boxes, is the same as in the MEMB algorithm.
Note that the setting of the mixing parameter p is indeterminate. The authors apply the algorithm with different p settings to real networks without a suggested default setting. However, it is clear that if p = 0 , the MCWR is the same as the RS, and similarly if p = 1 , it works like the MEMB algorithm. Therefore, the smaller the p value is, the "closer" the algorithm is to the random sequential, hence the faster and less accurate the MCWR is.

Merge algorithm (MA)
The merge algorithm (MA) has been introduced by Locci et al. (2010), who investigated the fractal dimension of a software network using the merge algorithm, the simulated annealing, and the greedy coloring box covering algorithms.
The merge algorithm is based on the following simple idea: first, every node itself is a box (the authors refer to the boxes as clusters), and then for l B > 0 the boxes are formed using a successive aggregation approach, i.e., two boxes are merged if their distance is at most l B , more precisely, if the size of the union of the boxes is still at most l B . Thus, to perform the boxcovering of a network with box size l B = k + 1 , the Merge algorithm uses the previous results, i.e., the cover of the network with l B = k sized boxes. The pseudocode of the merge algorithm is detailed in Algorithm 8. Note that Locci et al. refers to the boxes as clusters, that is why in the code they are denoted by c.

Simulated annealing (SA)
The Simulated Annealing (SA) box-covering algorithm has also been introduced by Locci et al. (2010). Generally, simulated annealing is a meta-heuristic technique that was inspired by the annealing in metallurgy and it is used in optimization problems for approximating the global optimum in a large (typically discrete) search space, hence it can also be applied to approximate the minimum number of boxes needed to cover a network. In the simulated annealing context, the state of the physical system is the box covering of the network and the "internal energy" of the state is the number of boxes N B .
In the simulated annealing process, several states, i.e., coverings are checked, and a new neighbor state S ′ of the current state S is created using the following three operations: (1) moving a single node from one box (with at least two nodes) to another if the size of the extended box is still less than l B , (2) creating new box by excluding a node from a box, and (3) merging boxes using the merge algorithm (see Algorithm 8).
After performing these steps in the above sequence, the internal energy E(S ′ ) of the new state S ′ , i.e., the new approximation of the number of boxes is checked, and if E(S ′ ) ≤ E(S) then the algorithm continues with the new configuration. However, if the new boxing is worse than the previous one, i.e., E(S ′ ) > E(S) then with probability it is still accepted. Note that at each iteration, the system is cooled down, which means that the probability p of accepting a worse covering is reduced: T ′ = T · c , where c < 1 is the cooling constant parameter.
We introduced several modifications in the implementation compared to the original paper. First, the operations (1) and (2) are rather vaguely defined in the paper. To be able to perform these operations, it must be ensured, for example that after the removal of a given node, the original box would remain nonempty. Thus, the changes compared to the paper of Locci et al. (2010) are as follows: • Moving nodes we only try to move nodes into neighboring boxes to spare iterating over all nodes when checking the distance ( l B ) condition. Moreover, instead of ensuring that k 1 movements are performed, we perform 2 · k 1 trials, which may fail if for a randomly chosen box we cannot move any additional node from the neighbor boxes.
If k 1 movements are successfully performed, the process is terminated. • Creating new boxes we proceed in an analogous manner, we perform at most k 2 trials on creating new boxes: a random box is chosen and if it has at least two nodes then a random node of this box is removed and that node forms a new box on its own. If k 2 creations are successfully performed, the process is terminated. • Merging boxes as opposed to the original paper (Locci et al. 2010), in our implementation this step is performed before evaluating the energy of the new state, i.e., the number of boxes are counted after the merge procedure is also done.
Our implementation of the simulated annealing box-covering algorithm is detailed in Algorithm 9. Zhou et al. (2007) proposed an edge-covering method, where the covering is performed with the simulated annealing optimization. Clearly, an edge-covering approach automatically covers the nodes as well, but the estimated number of boxes is greater than or equal to the number of boxes returned by a node-covering method since in the edge-covering method some nodes may be covered multiple times. Thus, the edge-cover method also estimates a different box-dimension of the network.

Overlapping-box-covering algorithm (OBCA)
The overlapping-box-covering algorithm was introduced by Sun and Zhao (2014). As the name of the algorithm suggests, the novelty of this method is that instead of partitioning the nodes, it uses overlapping boxes. The authors claim that this technique yields a more robust estimation of the number of boxes required to cover the network because separated boxes are prone to randomness. The authors suggest that instead of burning boxes on the fly, one should only mark possible boxes while processing the data and then choose the final boxing in the end.
The algorithm proceeds by first only creating box proposals, during which possible center nodes are iterated in ascending order with respect to the degree.
• A node is a possible center if it is uncovered yet. The algorithm starts with the smalldegree nodes and large-degree nodes are checked at last. • In a "proposed box", those nodes are included whose distance from each other is at most l B . They are chosen from nodes whose distance from the center is at most l B . • Once a proposed box is formed, the "covered frequency" of all nodes belonging to the newly proposed box is increased by one. The covered frequency of a node denotes how many overlapping proposed boxes contain it.
The final step of the algorithm is the revision of the proposed boxes. A box is called "redundant" if it only contains nodes that are also contained by other boxes, i.e., the covered frequency of all the nodes in a redundant box is larger than 1. In the last step, the redundant boxes are deleted and the covered frequency of the nodes of the recently removed box is decreased by 1. After the iteration is over, only non-redundant boxes remain. The detailed pseudo-code of the overlapping-box-covering method is described in Algorithm 10.
In a recent work, Zheng et al. (2020) proposed a modification of this algorithm, which they refer to as the improved overlapping box covering algorithm (IOB). The authors argue that the algorithm could achieve better results if it would rank and sort the nodes according to the excluded mass of closeness centrality instead of the degree.

Differential evolution (DE)
The differential evolution (DE) box-covering algorithm has been introduced by Kuang et al. (2014). In general, differential evolution is also a metaheuristic optimization method that belongs to the class of evolutionary algorithms, and it does not require the optimization problem to be differentiable since it optimizes a problem by iteratively trying to improve a candidate solution.
The proposed differential evolution box-covering algorithm uses the deterministic sequential greedy coloring algorithm such that the coloring sequence of nodes is represented by an N-dimensional vector, where N is the size of the network. The vector encoding of the different sequential greedy coloring procedures transforms the optimization problem into an N-dimensional space, which makes it possible to use the differential evolution paradigm.
Similarly to the simulated annealing, the DE algorithm also performs some operations on the population of these vectors to generate new possible solutions. Namely, there are two operations: mutation and crossover. In the mutation process, new vectors are created using randomly chosen three existing vectors. Moreover, to further increase the diversity of the set of examined solutions, the crossover operation generates new random vectors by randomly mixing the coordinates of the existing vectors and the newcomer vectors that were created in the mutation process. At the end of each iteration, the best solution is saved. The algorithm is detailed in Algorithm 11.

Particle swarm optimization (PSO)
A box-covering method using a heuristic optimization method called discrete Particle Swarm Optimization was presented by . The authors argue that the differential evolution algorithm (DE), introduced in their previous work (Kuang et al. 2014), has two main drawbacks that can be improved by the PSO algorithm. First, the DE algorithm has a continuous search space which increases the computational complexity. Moreover, it extensively uses the greedy algorithm which increases the time complexity.
To be able to use the PSO optimization method, the box-covering problem has to be encoded in a discrete form using the position and velocity of so-called "particles". A particle represents a valid boxing of the network, more precisely, if the size of the network is N, then the position of the particle is an N-dimensional vector X = (x 1 , . . . , x N ) where the ith coordinate x i denotes the ID of the box where the ith node belongs to, moreover x i ∈ 1, 2, . . . , N . The velocity of a particle is an where v i indicates whether the ith node is ready to change its box, i.e., the value of x i can be changed.
The algorithm operates with a set (swarm) of particles and in every step, each vector (particle) may be updated depending on its velocity, its most optimal boxing, the whole swarm's best boxing, and on random variables as well. In the end, the best overall boxing is returned. The PSO algorithm is detailed in Algorithm 12. Note that we defined some auxiliary functions and operators before the actual algorithm. Unfortunately, the ⊕ operator is vaguely defined in the original paper, but we believe that the one below is the most meaningful interpretation of it.
In a follow-up paper, the authors propose a modification of the PSO algorithm (Wu et al. 2017), called multiobjective discrete particle swarm optimization (MOPSO) box-covering algorithm, which aims to solve two optimization problems simultaneously. Besides minimizing the number of boxes required to cover the network, the algorithm maximizes the fractal modularity ) defined by the boxing, which was motivated by the observation that fractal networks have a highly modular structure (Song et al. 2006). The pseudocode of the MOPSO boxcovering algorithm is almost the same as the PSO. The only difference is that in the last part of the code, where it checks whether the current solution is better than the population and global best, it is not enough to check the number of used boxes but a better solution is also required to have a higher modularity (Wu et al. 2017).

Fuzzy algorithm
The fuzzy box-covering algorithm and the corresponding concept of fuzzy fractal dimension were introduced by Zhang et al. (2014). The authors propose a novel scheme for estimating the fractal dimension of a network. Instead of assigning boxes, they introduce a measure to estimate the fraction of the network one box covers on average. For each node, a box of radius r B around a central node is constructed and the contributions of the nodes of the box are summed. This contribution is a so-called "membership function" that exponentially decays with the distance from the central node. After aggregating and normalizing these contributions, we get an estimation of the proportion of the network that a box covers on average. Taking its inverse gives the approximating box number.
It must be noted, however, that in our experience this approximating number of boxes is not meaningful, only the regressed d B fractal dimension may be of practical interest. Pseudo-code of the fuzzy box-covering method is presented in Algorithm 13. Note that our pseudo-code is slightly different from the code presented in Zhang et al. (2014) because we assumed that the networks are undirected, hence it is enough to calculate the distance between v j and v k only once.

Sampling-based method
Recently, Wei et al. (2019) proposed a novel way to tile a network, that could be considered as an "advanced version" and extension of the overlapping-box-covering algorithm, however, the authors did not compare it to the OBCA algorithm. Note that this proposal of Wei et al. is rather a framework than a particular algorithm.
There are two main stages in the covering process: the first step is the generation of many box proposals, for example by running the random sequential or CBB algorithms-n times. In the second phase, the authors suggest picking some of the proposed boxes in a greedy manner to tile the network. Obviously, a particular realization of the covering is defined by the employed algorithm, since the final cover consists of the boxes that the greedy strategy selected from the box proposals.
Besides CBB and random sequential, the authors proposed a so-called maximal box sampling strategy for sampling box proposals which is described in Algorithm 14. Wei et al. also proposed two greedy strategies: big-box first and small-box removal. These procedures are detailed in Algorithm 15. The authors show that the small-box-removal sampling considerably outperforms the big-box-first strategy and classical algorithms such as MEMB and CBB.

Further methods
Note that there are a few algorithms that are not implemented in our Python package yet, but for the sake of completeness, we briefly introduce these methods in this section. Schneider et al. (2012) introduced a box-covering algorithm that is usually referred to as minimal-value burning (MVB) algorithm in the related works. Their algorithm first creates a box around every node, and then it removes the unnecessary boxes. Although their algorithm is said to be nearly optimal, it comes at a price: its computational complexity is high, i.e., for regular networks it scales exponentially with the number of nodes. Akiba et al. (2016) introduced a sketch-based box-covering algorithm, which uses bottom-k min-hash sketch representation of the boxes and it is based on the reduction of the box-covering problem to a set-covering problem. The authors compare their algorithm to the GC ), CBB ), MEMB , and the MVB algorithm (Schneider et al. 2012). Akiba et al. also made their implementations of these algorithms publicly available, which can be found at http:// git. io/ fract ality.
Recently, Giudicianni et al. (2021) proposed a community-structure-based algorithm that can be used for networks with uniform degree distribution, i.e., where the nodes have roughly the same amount of connections such as in infrastructure networks (e.g., road, electrical grid, and water distribution networks).
Note that in this work we do not touch upon the concept of multifractality, but for the sake of completeness we mention that the sandbox algorithm, introduced by Tél et al. (1989), can be used to estimate the multifractal dimension of networks (Liu et al. 2015).

Network data
We have tested the implemented algorithms on numerous real-world networks, many of which are already known from the related literature (see e.g., Song et al. 2005, Wu et al. 2017. The analyzed networks are listed in Table 2.
Some of the considered networks are directed or contain more than one connected component. Since box-covering algorithms are developed for undirected and connected graphs, our policy was to connect nodes with an undirected edge if there was a link between them in any direction, moreover, we worked with the largest connected component of unconnected networks. We also remark that there were networks containing loops but we believe that they do not have any effect since any node has 0 distance from itself.
We summarize some simple structural features of these networks in Table 3. Besides the usual basic metrics, the value calling for more explanation is the GINI-score. It was introduced by the Italian statistician and sociologist Corrado Gini to quantify wealth inequality in society. This motivated the use of the metric since we wanted to account for the 'inequality of degree distribution' among the vertices of the network. In our work, the GINI-score is defined as follows: where k i stands for the degree of the ith node, N is the number of nodes and the sum in the numerator runs over the nodes sorted by their degree in ascending order. (1) Note that formula (1) gives 0 for "total equality" (all vertices have the same degree) and bounded by 1 + 1/N , that is practically 1 for large networks.
We note here that the clustering coefficient was calculated with the built-in networkx.average_clustering method with default arguments. The modularity was calculated from a partition with the Girvan-Newman method (see girvan_newman in networkx with default arguments). To calculate the modularity, we used quality. modularity from networkx with weight=None.

Evaluation
In this section, we turn to the evaluation and comparison of the implemented algorithms on the networks. Before starting the investigations, the reader is reminded that this is the point where we return to the evaluation convention for box sizes, see the introduction of "Algorithms" section. The difference between the two conventions might seem very minor and some authors argue that the particular definition has no effect on the fractal scaling, however, Rosenberg (2020) emphasizes that it can indeed make a large difference. Thus, it makes it essential for the evaluation part-especially for the box dimension approximation-to return to the more frequently used convention on the box size.

Preliminaries
Here we describe the boxing process we applied in our experiments. It is presented in a pseudocode format (see Algorithm 16), however, this pseudocode does not necessarily follow the Python code strictly in all cases. We note that for the merge algorithm, the actual process of measuring the running time is a bit more complicated since it computes N B values for all box sizes up to a maximal value (see Algorithm 8). We recorded the execution time plus the preprocessing time for all box sizes, including l B = 1 . Furthermore, to avoid counting the preprocessing time multiple times, we retained the total execution time by subtracting the preprocessing time (that is the running time of the algorithm with l B = 1 ) from the execution time of the l B ≥ 2 coverings, and finally, we summed up these "corrected" runtimes.
In the current analysis, we performed the box covering n = 15 times for each box size, for each network, for each algorithm. The algorithms and their parameter settingsalong with their respective abbreviations-are listed in Table 4. We intended to choose the shorthand notations to be as intuitive as possible.

Comparison of the number of boxes
The most obvious way to analyze our results is to prepare N B (l B ) plots and inspect them. In these plots, we show the mean N B value (denoted by N B ) against the box size l B for each network. We remind the reader that here, box sizes are to be understood in the evaluation convention, as detailed previously, i.e., it doesn't matter if the algorithm is radius-based or diameter-based, the compared boxes have the same equivalent size. To retain comprehensibility, we only plot around 5 algorithms in the same figure, where the algorithms are denoted with different colors and markers. For comparison, we included the greedy algorithm in each plot. Moreover, note that we only plotted for box sizes appearing in the greedy algorithm's output. Figure 5 illustrates the N B (l B ) plot for the Tokyo metro network with five algorithms, similar plots for all the other networks and algorithms are available in the supplementary material. 4 Similarly to the related works (Wu et al. 2017;Schneider et al. 2012), we also plot the difference of the N B values and the number of boxes returned by a baseline algorithm, which is denoted by N base B . In our work, the baseline N base B was chosen to be the minimal (best) value returned by the greedy algorithm (remember that we ran it 15 times for each box size). With this, we plot the difference of the mean N B and the baseline N base B , which is denoted by N B . Figure 5 shows the �N B (l B ) plot for the Tokyo metro network with five different algorithms, similar plots for all the other networks and algorithms are available in the supplementary material.

Comparison with performance scores
In the following step, we assess the performance of the algorithms with a performance score, defined as the normalized deviation from the baseline-that is the best output by the greedy algorithm-formally: Behold that the lower this score is, the better the coverage of the algorithm is. Also note that the score is defined for every boxing output, meaning that we have n = 15 scores for each algorithm, box size, and network. The reason behind the definition of this P performance score is that these values-expressing relative performance-can be aggregated over box sizes since they are "dimensionless". This is a very desirable property since we want to draw some conclusions from an intimidating amount of data. There is an additional caveat though, we shall define an appropriate interval of box sizes on which we investigate performance scores. The motivation behind limiting the analysis on certain box sizes is that the performance score is not so informative in those cases when the whole network is covered with very few boxes. In practice, the tail distribution of the N B is usually noisy, thus it is pruned from the fitting when the fractal dimension is estimated. Our criterion to consider a box size was that the baseline shall have at least ten boxes. Otherwise even if the difference between N B and N base B is only one, the relative change compared to the value of N base B would be misleadingly large. Following this line of thought, we also normalized the runtimes to assess the average speed of our algorithms. This is done by dividing the runtime of an algorithm by the time that was needed to calculate the N base B baseline number of boxes (coming from the output of the best greedy run). By doing this, we can get an overall speed measure for the algorithms.
At last, we are in the position to concisely define the way we compare the algorithms: we plot the mean performance scores versus the mean normalized runtimes with averaging over the accepted l B values (defined earlier). With this, we get one plot per network where the algorithms are represented by one marker as Fig. 6 shows. To increase information content, the area of the markers is proportional to the empirical variance of the performance score (so the radius of the dots is proportional to the standard deviation). Note that the sizes of markers on the plots for different networks are not to be compared due to varying scaling. Due to the great differences in  runtimes, we divided the plots into two sections, with the left having linear and the right having logarithmic horizontal axis (the plots join at 1.1)-see Fig. 6. The abbreviated name of the algorithms is written next to their corresponding marker. As a general rule, if the marker is barely visible due to the small standard deviation, the labels give guidance about the markers' location. There are some cases, where this would lead to ambiguity, which is avoided by arrows pointing to the position of the marker. Due to the extremely long running time, we had to terminate the simulated annealing and PSO algorithms on the E. coli network. Hence, these data points are missing from the set but we can confidently say that their absence does not alter our conclusions. The results on the Minnesota road network and dolphin social network are depicted in Fig. 6, similar figures for all the remaining networks are available in the supplementary material. 5 Finally, we conclude this section by comparing performance on multiple networks on the same plot. We chose the following real networks: Minnesota road network (min), Tokyo metro (tok), computer science Ph.D. 's collaboration network (phd), A. fulgidus network (ful), and E. coli cellular network (eco). The reason why we chose these networks is that the number of accepted box sizes was more than one. To make the figures clean, we formed three groups of the algorithms and plotted these groups in separate graphs, the results are shown in Fig. 7. For comparison, the results of the greedy algorithm are always shown. Although we stress that this investigation is insufficient for deciding which algorithms are "the best", we can observe some patterns that will allow us to propose general conclusions in "Discussion and conclusion" section.

Remark about variances
In the analysis so far, we used the variance of the P performance score on the acceptance region, which is determined by the number of boxes in the corresponding baseline value. One shall notice that in general, this deviation can come from two sources. First, the intrinsic variance of the algorithm, which is the uncertainty of the outcome for a fixed box size (on a given network). The second possible source is that in the acceptance range the average performance of the algorithm varies with respect to the baseline value.
In the plots so far, we accounted for both types of variances, since we considered the variance of all the P scores in the acceptance region. However, it would be also informative to assess the intrinsic variance too, since our baseline is just another approximating algorithm, even if it is an established one. This comparison is done in Table 5 where the mean intrinsic standard deviations and the total standard deviation of the P performance scores are reported (considering scores only for the accepted box sizes). It is interesting to see that the "volatile" algorithms depicted in the third subplot of Fig. 7 behave differently through this lens: for example, the intrinsic deviation is often much smaller than the total deviation for the merge algorithm and especially for the REMCC, however, in the case of the random sequential, they are much closer to each another. Hence the readers should behold the difference between the notions of intrinsic and total deviations and use the appropriate one for their purpose.

Approximating the box dimension
As a final step in the evaluation, we aim to estimate the fractal dimension of the investigated networks. Here we returned to the mean N B values for each algorithm, and we did not impose the acceptance criterion for the l B box sizes for this analysis.
We plotted the logarithm of the average N B versus the logarithm for l B and tried to fit a linear function to it. This seemingly easy task involved a lot of subjective judgments on whether the relationship could be called linear and on what range of box sizes shall we perform the regression and if there are enough points on the plot to be able to make a confident statement. Behold that fitting linear functions by consulting the plot and adjusting the setup if necessary is a prerequisite for having meaningful results. In practice, the log(N B ) − log(l B ) plot is mostly not linear on the whole range. Generally speaking, the characterization of power laws in empirical data is cumbersome, since the tail of the distribution is usually unreliable due to large fluctuations, furthermore, the identification of the range, where the power-law relation holds is difficult (Clauset et al. 2009).
In this experiment, there may be three outcomes: 1) the fit is performed and the relationship is found to be linear, 2) it is realized that the plot is not "linear enough", or 3) the fit was not possible to carry out (missing or too few points: mouse brain (mou), and E. coli (eco) networks). In addition to the estimated fractal dimension, we also accounted for the error of the fit (sum of squared errors) with the following expression: where x and y stand for the log(l B ) and log(N B ) values, respectively. The numerator is the sum of squared residuals of the fit in the denominator, x is the mean of x values, n is the number of data points in the fit. Due to the subjective nature of the experiment, the first two authors have performed this fitting procedure independently and then compared the results. Numerical values are reported in Tables 6 and 7.

Discussion and conclusion
In this final section, we summarize our observations and point out some directions our analysis could be extended.

Comparison of algorithms
We compared the algorithms primarily based on their P performance scores that measure relative performance to the baseline values (the best result of the greedy coloring algorithm). The main tool of comparison is to plot the mean P scores against the average relative runtime with respect to the baseline for networks with sufficiently large diameters. The radius of the markers is proportional to the standard deviations of the P performance score. Firstly, the P scores and runtimes shall be considered and the standard deviations should be considered only as a secondary feature due to the multiple sources of variance.
The results shown in Figs. 6 and 7 might allow us drawing the following conclusions. First, it should be noted that there are many algorithms that turned out to be way too slow on the analyzed networks: differential evolution (DE), particle swarm optimization (PSO), sampling with maximal box sampling, and simulated annealing (SA). We must also note that these methods yield relatively low box numbers (which is favorable) but the price for this seems to be too heavy. It is also interesting to note that three of these methods (DE, PSO, SA) are metaheuristic algorithms with several hyperparameters. We followed the instructions of the original papers to set the parameters but it seems that tuning the hyperparameters has a high effect on the results and it might be troublesome.
Our results are congruent with that of the related works, for example, Locci et al. (2010) reported the running time of the greedy coloring, merge, and simulated annealing algorithms, and their results also show that the order of magnitude of the runtime of the SA algorithm (1177 s) is twice as much as that of the GC (13 s) and the merge algorithm (21 s) on the E. coli network. Furthermore, Akiba et al. (2016) reported that the MEMB algorithm is faster than the greedy coloring and the CBB algorithms, however, in our experiments the difference between the time complexity of the CBB and MEMB algorithm was smaller.
Secondly, there is a group of relatively inaccurate methods as random sequential, REMCC, merge, and sampling with random sequential. Although these algorithms have acceptable runtimes, they fail to have good performance consistently.
The third group is which may be called "the most desirable" according to our criteria, that consists of the MCWR, OBCA, MEMB, and CBB algorithms. These algorithms are Table 4 The analyzed algorithms with their respective parameter values The leftmost column stands for the abbreviation. In some plots, the expanded version of these abbreviations appears but it is clear what they mean quite decent both in terms of speed and accuracy. Note that OBCA is the most accurate in this group but it has sometimes a relatively long running time, and in its current implementation it is not applicable if one is also interested in the created boxes, however, a valid node-box assignment could be done easily. In some sense, the greedy coloring algorithm also belongs to this third group but it is usually slightly slower than the other members of this group. From Figs. 6 and 7 we can observe that the performance of the MEMB and the greedy algorithms are very similar, which has been also shown by Wu et al. (2017). From Fig. 7 and Table 5, it is apparent that the random sequential and the REMCC algorithms have the highest variance in the performance scores, Moreover, the RS algorithm has also high intrinsic variance that has been also pointed out by Song et al. (2007).
One may be tempted to draw some general conclusions about the relation between the structure of the network and the performance of the algorithms, but it is a very challenging task considering the limited number of analyzed networks. However, without going into details about the qualitative behavior of the algorithms, we can notice that the "fast but inaccurate" random sequential and merge algorithms achieve one magnitude lower performance scores on the Minnesota road network than on the C. elegans network. We speculate that this may occur because the road network has concentrated degree distribution (most of the nodes have 2-4 neighbors), which is also shown by the low GINI  Table 3), whereas in the C. elegans network, the degree distribution is much more uneven (higher GINI score).

Comparing the estimated fractal dimensions
As estimating the fractal dimension is one of the main points of box-covering algorithms, we also investigated what fractal dimensions the various algorithms yield on the investigated networks. Tables 6 and 8 suggest, that the dimension of the fractal networks is typically between 1.5 and 3.5. For example, the dimension of the Tokyo metro network is around 1.5, which makes sense, since the metro network is a composition of path graphs, however, at the transfer stations, these one-dimensional graphs are joined, which increases the fractal dimension of the network. On the other hand, clearly, a metro network does not contain as many junctions as the Minnesota road network, whose fractal dimension is around 2.0, as one would expect from a grid-like network.
We can observe that the estimated fractal dimension varies with the applied boxcounting algorithm significantly. For example, on the phd network it ranges from 1.8 (REMCC) to 2.5 (merge) at the first experimenter and from 2.0 (REMCC) to 2.4 (merge) at the second experimenter. Similarly, the estimated fractal dimension of the Minnesota road network ranges from 1.6 (REMCC and SM) to 2.1 (merge) at the first experimenter and from 1.6 (SM) to 2.2 (DE and OBCA) at the second experimenter. Interestingly, the dimension estimated using the merge algorithm is generally much higher than the dimension resulted from the other algorithms. It is due to the fact that for small l B values the merge algorithm performs quite poorly but for larger l b values its performance increases, see Fig. 8.
In related works, the reported dimensions are close to our approximations. However, we found that the estimated fractal dimension not only depends on the applied algorithm but also on the way the regression is carried out, especially when the diameter of the network is small. For example, our first experimenter found that the d B dimension of the dolphin network is around 1.8, which is in alignment with the Fig. 6 Plot showing the mean P performance scores against the mean normalized runtime on the Minnesota road network and dolphins social network. The size of markers are proportional to the the variance of the performance score. Behold that the right side of the plots has logarithmic scale for the running time findings of Zheng et al. (2016) who found that the dimension is 1.86 using the MEMB algorithm. Deng et al. (2016) also reported similar values for dolphin network estimated by the MEMB, CBB, and greedy coloring algorithms: d B = 1.59 , d B = 1.90 , and d B = 1.83 , respectively. On the other hand, the estimated d B of the dolphin network according to our second experimenter is between 2.3 and 2.4, which is congruent with the result of Rosenberg (2020), who reported a dimension of 2.38. Hence, we can conclude that in the literature the reported fractal dimensions of the networks, especially of the small ones, are subject to the experimenters' subjective views, more specifically, what range of l b values is used for fitting.
In the case of the other networks, the difference between the two experimenters is not so significant, however, the first experimenter's estimated values are usually below that of the second experimenter, which is probably due to the fact that the second experimenter fitted the line on a smaller range of l B values. This is also the reason why there are many missing data in the first experimenter's estimations.
According to our experimenters, most of the box-covering algorithms' results suggested that the fractal dimension of the E. coli network is around 3.5, which is close to   (2014) also estimated the fractal dimension of the C. elegans network with the OBCA and the CBB algorithms: d B = 2.99 and d B = 2.95 , which are also close to the estimations of our experimenters, which is around 3.0.     While the first experimenter did not find the estimation eligible for the Polbooks network, according to our second experimenter, its box dimension is around 2.4, which is also in alignment with the findings of Deng et al. (2016), who reported the following fractal dimension values for the Polbooks network approximated by MEMB, CBB, and the greedy coloring algorithm: d B = 1.85 , d B = 2.29 , and d B = 2.27 , respectively.
Note that the Facebook social network and the brain network could not be considered fractal networks, since the N B decays rather exponentially with the size of the boxes, and the estimated dimension would be above 5 in these networks.
The related works do not report the error of the regression analysis, however, we believe that to gain a better understanding of the goodness-of-fit of the line, it is also important to measure the SSE values, which are shown in Tables 7 and 9. We can observe that these SSE values are relatively large that is due to the small number of observations, especially in the case of the second experimenter (Table 9). These results also support the fact that the exact choice of the range of the l B values where the linear regression is carried out, significantly influences both the value and the error of the estimated fractal dimension. We argue that the impact of the range on estimating the dimension might be greater than the impact of the applied box-covering algorithm itself, especially in the case of small networks. Unfortunately, there is no canonical method for determining the appropriate range where the fractal scaling holds, which calls for further research.

Limitations and future work
In this review, we attempted to be as comprehensive as possible, but the dynamic improvements in the field of fractal networks and box-covering algorithms make it hardly possible to be fully comprehensive. However, since our implementation is open source, one can easily integrate novel algorithms into our package. It is also fair to acknowledge that our analysis contains some arbitrary decisions. For example, the number of repetitions at each l B or the way we defined the acceptance region for box sizes might seem arbitrary. We believe, however, that for a meaningful comparison with a comprehensible performance measure, the aggregation of the results based on some arbitrary decisions is inevitable.
We only reported on the total variance of the G performance scores of the algorithm, but one can argue that in some situations the intrinsic variation is more informative to use. Furthermore, we assessed the runtime of the algorithms by averaging the normalized runtimes. In this setting, the relative runtimes have the same 'weight' for each box size what we believe is desirable but one could be more interested in the cases where the running time is quite long and not so much in the "easy and fast" cases.
Due to the shortcomings of the evaluation framework, this study alone is by no means authorized to give a final verdict on the algorithms. We believe that the proposed framework together with the open-source code base is an important contribution to the community and it can serve as a starting point for future investigations. To gain a better understanding and make a more confident judgment on the performance of the algorithms, a higher number of networks from various domains could be used, perhaps together with more illuminating performance measures.
Moreover, another interesting further direction of research is to study box-covering algorithms on mathematical network models.