Coarse graining is an important method for studying large-scale complex networks and is a current focus of network science. This paper develops a new coarse-graining method for complex networks based on a node-similarity index. Starting from the similarity structure of the network's nodes, the coarse-grained network is extracted by defining local and global similarity indices of nodes. Extensive simulation experiments show that the proposed method can effectively reduce the size of a network while preserving some statistical properties of the original network. Moreover, the method has low computational complexity and allows the size of the reduced network to be chosen freely.

Many complex systems in reality can be abstracted into complex networks [

Given that real-world complex networks share similar topological properties, studying the commonalities of networks and universal methods for dealing with them is a hot topic [

In the past decade, some well-known coarse-graining methods have been proposed [

In general, degree is the simplest and most important concept for describing the attributes of a single node. In an undirected network, the degree of node i is defined as the number of edges connected to i, i.e., the number of neighbors of node i. However, the nature of a node is related not only to its own degree but also to the degrees of its neighbors. From the perspective of information transfer, the more common neighbors two nodes have, the more similar the information they receive and their ability to receive it. In this paper, we introduce a new coarse-graining technique based on a node-similarity index. Using the number of common neighbors of nodes in the network, the algorithm quantifies the similarity between nodes and extracts the reduced network by merging similar nodes. The method is computationally simple, and, more importantly, the size of the reduced network can be controlled exactly. Numerical simulations on three typical networks (ER random networks, WS small-world networks, and SF scale-free networks) reveal that the proposed algorithm effectively preserves some topological properties of the original networks.

Consider a complex network G = (V, E) consisting of N nodes, where V is the set of nodes and E is the set of edges. The adjacency matrix A = (a_{ij}) describes the topology of the network: a_{ij} = 1 indicates the presence of an edge between nodes i and j, while a_{ij} = 0 indicates its absence. For an undirected, unweighted network, A is symmetric, i.e., a_{ij} = a_{ji}, and the sum of the i-th row (or i-th column) of A equals the degree k_i of node i. Here we use the Jaccard index to measure the similarity between node pairs. The similarity between any two nodes i and j in the network is defined as:

s_{ij} = \frac{| \Gamma(i) \cap \Gamma(j) |}{| \Gamma(i) \cup \Gamma(j) |}, (1)

Here Γ(i) denotes the set of neighbors of node i, and |Γ(i)| is the cardinality of Γ(i); mathematically, |Γ(i)| = k_i. Γ(i) ∩ Γ(j) is the set of common neighbors of nodes i and j, and Γ(i) ∪ Γ(j) is the union of their neighbor sets. By Equation (1), s_{ij} = s_{ji}, and the similarity of node i with itself is 1. If nodes i and j have no common neighbors, their similarity is zero, i.e., s_{ij} = 0; hence 0 ≤ s_{ij} ≤ 1. Since s_{ij} describes the degree of local structural similarity between nodes i and j, we call s_{ij} the local similarity index.
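Equation (1) can be sketched directly from an adjacency matrix; the function name and the list-of-lists representation below are illustrative choices, not part of the paper:

```python
def jaccard_similarity(adj, i, j):
    """Local similarity s_ij of Eq. (1): common neighbours over all neighbours."""
    ni = {k for k, a in enumerate(adj[i]) if a}  # Γ(i)
    nj = {k for k, a in enumerate(adj[j]) if a}  # Γ(j)
    union = ni | nj
    return len(ni & nj) / len(union) if union else 0.0

# Path graph 0-1-2-3: nodes 0 and 2 share the single neighbour 1.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
print(jaccard_similarity(adj, 0, 2))  # Γ(0)={1}, Γ(2)={1,3} → 0.5
```

Note that the convention `s_ij = 0` when both neighbor sets are empty avoids division by zero for isolated nodes.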

The similarity between node i and the other nodes in the network can be expressed as an N-dimensional vector s_i = (s_{i1}, s_{i2}, ⋯, s_{iN})^T. The larger the value of ∑_{j=1}^{N} s_{ij}, the more nodes in the network are locally similar to node i. We therefore extend Equation (1) and define the global similarity index of node i as:

gs_i = \sum_{j=1}^{N} s_{ij}. (2)

The larger gs_i is, the more likely node i is to be the cluster center of a group of similar nodes.
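Equations (1) and (2) together give the full similarity matrix and the global scores; a minimal sketch (helper name and data layout are assumptions of this illustration):

```python
def similarity_matrix(adj):
    """Pairwise local similarities s_ij (Eq. 1) and global scores gs_i (Eq. 2)."""
    n = len(adj)
    nbrs = [{k for k in range(n) if adj[i][k]} for i in range(n)]
    s = [[len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
          if nbrs[i] | nbrs[j] else 0.0
          for j in range(n)] for i in range(n)]
    gs = [sum(row) for row in s]  # gs_i = Σ_j s_ij
    return s, gs

# Star graph: node 0 is the hub; leaves 1-3 have identical neighbourhoods {0}.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
s, gs = similarity_matrix(star)
print(gs)  # each leaf is fully similar to every leaf (incl. itself): [1.0, 3.0, 3.0, 3.0]
```

The leaves score higher than the hub here, matching the intuition that a node resembling many others is a better cluster-center candidate.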

Note that coarse-graining methods must solve two main problems: the first is the merging of nodes, that is, determining which nodes should be merged; the second is how to update the edges during coarse graining. In the following, the node-similarity coarse-graining scheme is introduced from these two sides.

Suppose we are going to coarse-grain a network containing N nodes into a smaller one with Ñ (Ñ < N) nodes. First, we select Ñ cluster centers, run the clustering algorithm to obtain the corresponding Ñ clusters, and then merge the nodes within each cluster.

To select Ñ suitable nodes as cluster centers, we must ensure that the extracted centers have global similarity as high as possible (i.e., that they are similar to as many nodes in the network as possible). It is also required that the local similarity between any two cluster centers not be too high (otherwise they may belong to the same cluster, and only one of them could be the cluster center).

The detailed steps for selecting the Ñ cluster centers are as follows.

Step 1: Compute the local similarity and the global similarity of each node in the network. Sort the global similarities of the N nodes into a sequence gs_{v_1}, gs_{v_2}, ⋯, gs_{v_N} in decreasing order.

Step 2: Let V_S be the set of cluster centers. First, put the node v_1 corresponding to the maximum global similarity gs_{v_1} into V_S, denoting V_S = {v_1}. Second, pick the node v_2 corresponding to the second-largest global similarity gs_{v_2}. If s_{v_1 v_2} < (Ñ/N) × β (where N and Ñ are the sizes of the original and coarse-grained networks, respectively, and β is an adjustable parameter), then v_2 and v_1 are not in the same cluster, so v_2 can be the second cluster center; push v_2 into V_S, denoting V_S = {v_1, v_2}. Otherwise, if s_{v_1 v_2} ≥ (Ñ/N) × β, the local similarity between v_2 and v_1 is too high and the two nodes may belong to the same cluster, so v_2 cannot be added to V_S as a new cluster center. Continue selecting cluster centers in the order gs_{v_1}, gs_{v_2}, ⋯, gs_{v_N}; each new cluster center v_i must satisfy s_{v_i v_j} < (Ñ/N) × β for all v_j ∈ V_S. Stop when the size of V_S reaches Ñ, denoting V_S = {v_1, v_2, ⋯, v_Ñ}.
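Step 2 amounts to a greedy scan over nodes sorted by global similarity. The sketch below assumes the matrix/score layout of Eqs. (1)-(2) as plain Python lists; the function name and default β are illustrative:

```python
def select_centers(s, gs, n_tilde, beta=0.7):
    """Greedy selection of Step 2: visit nodes in decreasing order of gs_i and
    accept v as a new centre only if s[v][c] < (Ñ/N)·β for every chosen centre c."""
    n = len(gs)
    threshold = n_tilde / n * beta
    centers = []
    for v in sorted(range(n), key=lambda v: -gs[v]):
        if all(s[v][c] < threshold for c in centers):
            centers.append(v)
            if len(centers) == n_tilde:
                break  # Ñ centres found
    return centers
```

If fewer than Ñ nodes pass the threshold the scan simply runs out of candidates; the paper does not specify a fallback, so none is assumed here.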

Step 3: Take v_1, v_2, ⋯, v_Ñ as the cluster centers, with corresponding cluster sets M_1, M_2, ⋯, M_Ñ. Then cluster the remaining N − Ñ nodes of the network (the set of remaining nodes is V̄_S = V − V_S). To find the cluster set M_j to which a node v_i of V̄_S belongs, the objective is:

\min_{v_j \in V_S} \| s_{v_i} - s_{v_j} \|_2, \quad v_i \in \bar{V}_S, (3)

where s_{v_i} (s_{v_j}) is the local similarity vector of node v_i (v_j) with the other nodes in the network, and ‖s_{v_i} − s_{v_j}‖_2 is the L_2 norm of the difference between the vectors, also called the Euclidean distance. Repeat this operation until every node in V̄_S has been merged with one of the Ñ cluster centers. Finally, we obtain Ñ cluster sets.
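Step 3 can be sketched as a nearest-centre assignment on the similarity vectors; the dictionary representation of clusters below is an assumption of this illustration:

```python
def assign_clusters(s, centers):
    """Step 3: attach every non-centre node to the centre whose similarity
    vector is closest in Euclidean distance (Eq. 3)."""
    n = len(s)
    clusters = {c: [c] for c in centers}  # each centre starts its own cluster
    for v in range(n):
        if v in clusters:
            continue  # centres stay where they are
        nearest = min(centers,
                      key=lambda c: sum((s[v][k] - s[c][k]) ** 2 for k in range(n)))
        clusters[nearest].append(v)
    return clusters
```

Minimizing the squared distance gives the same argmin as the L_2 norm itself, so the square root is omitted.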

After the Ñ cluster sets have been obtained in Section 3.1, the nodes in each cluster are merged to give Ñ coarse-grained nodes. To keep the reduced network connected, the next step is to update the edges, with details as follows:

Definition of weight. The set of nodes in the i-th cluster is denoted M_i (M_i is also the i-th node of the coarse-grained network). We define the weight as:

W_{M_i M_j} = \sum_{u \in M_i, v \in M_j} a_{uv}, \quad i, j = 1, 2, \cdots, \tilde{N}, \ i \neq j, (4)

where a_{uv} is an element of the adjacency matrix A = (a_{ij}) of the original network, and W_{M_i M_j} is the weight of the edge between nodes M_i and M_j.

Definition of edge. The edge e_{M_i M_j} between nodes M_i and M_j is defined by:

e_{M_i M_j} = \begin{cases} 0, & W_{M_i M_j} < \max(|M_i|, |M_j|) \cdot \tilde{N}/N, \\ 1, & \text{otherwise}, \end{cases} (5)

where |M_i| and |M_j| denote the numbers of nodes in the i-th and j-th clusters, respectively. As presented above, this framework preserves the edges between clusters (each cluster corresponds to a coarse-grained node) that are closely related in the original network, and it prevents the reduced network from collapsing into a fully connected one. Dropping the edge weights and keeping only the topology of the coarse-grained network helps preserve some statistical properties of the original network. In particular, if the network becomes disconnected after deleting an edge e_{M_i M_j}, that edge is reconnected to ensure connectivity. After the two steps described above, we obtain an undirected, unweighted network.
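The two-step edge update of Eqs. (4)-(5) can be sketched as follows; the connectivity-repair step mentioned in the text is deliberately omitted from this sketch, and the cluster dictionary layout is an assumption:

```python
def coarse_grain_edges(adj, clusters):
    """Aggregate inter-cluster weights (Eq. 4), then keep an edge only when
    the weight reaches max(|Mi|, |Mj|) * Ñ/N (Eq. 5)."""
    n = len(adj)
    centers = list(clusters)          # one coarse-grained node per cluster
    n_tilde = len(centers)
    reduced = [[0] * n_tilde for _ in range(n_tilde)]
    for a in range(n_tilde):
        for b in range(a + 1, n_tilde):
            # Eq. (4): W = number of original edges between the two clusters
            w = sum(adj[u][v]
                    for u in clusters[centers[a]]
                    for v in clusters[centers[b]])
            # Eq. (5): threshold against max(|Mi|, |Mj|) * Ñ/N
            if w >= max(len(clusters[centers[a]]),
                        len(clusters[centers[b]])) * n_tilde / n:
                reduced[a][b] = reduced[b][a] = 1
    return reduced
```

On the star-graph example (hub cluster {0}, leaf cluster {1, 2, 3}) the three original hub-leaf edges give W = 3, which clears the threshold of 1.5, so the two coarse-grained nodes stay connected.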

To better illustrate the proposed algorithm, this section applies the node-similarity coarse-graining scheme to a small toy network, as shown in

A 9-node toy example is shown in

gs_3, gs_4, gs_8, gs_9, gs_5, gs_6, gs_1, gs_7, gs_2. It can be found that gs_3 = gs_4 = 3.1 and gs_8 = gs_9 = 2.65. Intuitively, the two yellow circular nodes, like the two green diamond nodes, play exactly the same topological role. According to Step 2 of Section 3.1, put the node with the largest global similarity, node 3, into the set V_S; that is, node 3 is the first cluster center, corresponding to the cluster set M_1. Then compare node 4 with the cluster center node 3. From Equation (1),

Γ(3) = Γ(4) = {2, 5}, so s_{34} = 1 > (7/9) × 0.7. Hence node 4 cannot be placed into the set V_S as a cluster center. Continuing in this way, we obtain the set of cluster centers V_S = {3, 8, 5, 6, 1, 7, 2} and the set of remaining nodes V̄_S = {4, 9}, where each element of V_S corresponds to a cluster set:

node 4 and the cluster center node 3 is the smallest, so node 4 should be merged with cluster center node 3; they belong to the same cluster set

definition in Equation (3),

This section is devoted to an extensive numerical study of several properties of the node-similarity coarse-grained networks, namely the average path length, average degree, and clustering coefficient. These are currently the three most studied topological properties in complex-network research, and they describe various aspects of a network explicitly; precise definitions are given in the following sections. For simplicity, we mainly consider three typical networks (ER random networks, WS small-world networks, and SF scale-free networks). To better illustrate the effect of the proposed method on these topological properties, we investigate our method with different values of

Additionally, for each type of the artificial complex networks, we fix the size of these networks as

The average path length L of a network is the mean shortest-path distance over all node pairs:

L = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij},

where N is the size of the network and d_{ij} is the shortest-path distance between nodes i and j,
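For an unweighted network the average path length can be computed with one breadth-first search per source node; this sketch assumes a connected network represented by an adjacency matrix:

```python
from collections import deque

def average_path_length(adj):
    """Average shortest-path distance over all ordered node pairs,
    computed with one BFS per source; the network is assumed connected."""
    n = len(adj)
    total = 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())  # Σ_j d(src, j)
    return total / (n * (n - 1))
```

For the path graph 0-1-2 this gives (1 + 2 + 1) × 2 / 6 = 4/3.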

different rewiring probability p and coordination number K. For the SF networks under optimal

Average degree

The evolutions of average degree

As displayed in

The clustering coefficient measures the probability that the neighbors of a node are themselves connected. The clustering coefficient

average degree

where

The result shows that the optimal parameter

coincide with each other especially

Coarse-graining techniques are promising tools for studying large-scale complex networks. In this paper, we have developed a new algorithm to reduce the size of complex networks. The method is based on the local and global similarity of nodes, which fits the original intention of coarse graining well. In particular, we introduce a tuning parameter

This project is supported by the National Natural Science Foundation of China (61563013, 61663006) and the Natural Science Foundation of Guangxi (No. 2018GXNSFAA138095).

Wang, Y.Y., Jia, Z. and Zeng, L. (2018) Coarse Graining Method Based on Noded Similarity in Complex Network. Communications and Network, 10, 51-64. https://doi.org/10.4236/cn.2018.103005