J. Software Engineering & Applications, 2010, 3: 185-189
doi:10.4236/jsea.2010.32023 Published Online February 2010 (http://www.SciRP.org/journal/jsea)
Copyright © 2010 SciRes JSEA
185
Research on LFS Algorithm in Software Network
Wei Wang, Hai Zhao, Hui Li, Jun Zhang, Peng Li, Zheng Liu, Naiming Guo, Jian Zhu, Bo Li,
Shuang Yu, Hong Liu, Kunzhan Yang
Information Science and E n g i n e e ri n g Northeastern University, Shenyang, China.
Email: wangweiwin1@163.com, zhhai@neuera.com
Received October 11th, 2009; revised October 26th, 2009; accepted November 5th, 2009.
ABSTRACT
Betweenness centrality helps researcher to master the changes of the system from the overall perspective in software
network. The existing betweenness centrality algorithm has high time complexity but low accuracy. Therefore, Layer
First Searching (LFS) algorithm is proposed that is low in time complexity and high in accuracy. LFS algorithm
searches the nodes with the shortest to the designated node, then travels all paths and calculates the nodes on the paths,
at last get the times of ea ch node being traveled which is betweenness centrality. Th e time complexity of LFS algorithm
is O(V2).
Keywords: LFS, Software Network, the Shortest Path, Betweenness Centrality
1. Introduction
It is over fifteen years since Norman Fenton outlined the
need for a scientific basis of software measurement. Such
a theory is a prerequisite for any useful quantitative ap-
proach to software engineering, although little attention
has been received from both practitioners and researchers.
Measurement is the process that assigns numbers or
symbols to attributes of real-world entities. Unfortunately,
many empirical studies of software measurements lack a
forecast system that combines measurements and pa-
rameters in order to make quantitative predictions. How
can we overcome these limitations?
Here we present a new approach to software engineer-
ing based on recent advances in complex networks. As a
typical parameter and an important global statistics, be-
tweenness centrality can meet the need s of researchers to
know the inter-reaction of software network from the
overall perspective. The definition of betweenness cen-
trality is the times of a node being traveled in all the
shortest paths of the software network and it reflects the
influence of the node in the whole network.
The existing betweenness centrality algorithms are ei-
ther based on the shortest path or merely approximate
algorithm [1–3], and both have its own shortcomings.
First, the traditional shortest path algorith m has high time
complexity and costs too much time to be available in
large network, which makes it impossible to put the al-
gorithms into practice. Th e research on approximation of
betweenness centrality is unavailable for low accuracy
which brings great obstacle for further study [4,5].
As mentioned in [6], the averag e path length is similar
to normal distribution with the mean of 15.21. LFS algo-
rithm is proposed based on this method.
The dissertation contains three parts: First, take
Dijkstra algorithm as example to conclude shortcomings
of traditional algorithms. Then, propose LFS algorithm.
At last, compare these two algorithms to show the ad-
vantages of LFS algorithm.
2. Traditional Dijkstra Algorithm
The complexity of betweenness centrality comes from
calculating the shortest path between each two nodes in
network, and the time complexity of Dijkstra is O(n3).
The existing algorithms based on the shortest path con-
tain Dijkstra, Floyd-Warshall, and Johnson. Dijkstra is
the most popular and classic algorithm.
The idea of Dijkstra is like this. Abstract the network
into a graph, Put the isolated nodes (the nodes with
out-degree and in-degree being both 0) in set V and the
nodes having been traveled in set S, then calculate the
shortest path from vi to any node in the graph. Move the
node vk with the shortest path from V-S to S till V-S is
empty.
Initialization: Set up a two-dimensional array a to
mark whether the shortest path has been found out be-
tween the two nodes. Set up a none-dimensional array to
store the betweenness centrality. Set up a adjacency ma-
trix arcs with 1 if there exist edge between the nodes,
else with . V is a set of all nodes and S is a set of all
marked nodes. The value from vi to vj is initializes as
Research on LFS Algorithm in Software Network
186
follows:
D[j]=arcs[vi][j] vjV
Then put vi int o S.
Pick up vk which satisfies
D[k]=min{D[j] | v jV-S}
vk is the end point of th e shortest p ath starting from vi.
Put vk into S.
Calculate the length of the shortest path from vi to
each accessible node in set V-S
D[k]+arcs[k][m]<D[m]
and revalue the D[m] as
D[k]+arcs[k][m]=D[m]
Add 1 to betweenness centrality of nodes if only
they are on the shortest path from vi to any node in the
graph. Then mark these nodes being travelled in the two
dimensional array a. Repeat , n-1 times. At last, it
gets all the shortest paths from vi to the other nodes in
the graph.
It is the end.
Repeat the process for n times and get the shortest path
between each two nodes. Since each time contains a
two-cycle, the time complexity of Dijkstra is O(V3). The
space complexity is T(V2).
3. LFS Algorithm
According to the description to Dijkstra, it has a
three-cycle and find out the shortest path between each
two nodes which, then add 1 to between centrality of the
nodes being on the path . The application range is limited
for its high time complexity a nd the time consumption is
unavailable when it is applied into large network of
thousands of nodes. To solve this problem, the disserta-
tion proposed a new algorithm (LFS) w ith the help of the
features of software network.
As a number of papers mentioned [7–9], the length of
shortest path between each two nodes wouldn’t exceed a
constant. For example, [1] says the average of the length
is 15.21. Based on this method, LFS is proposed. Starting
from a node in the network, put the connected nodes with
the shortest path of 1 into array, and then put that of 2
into the array and so on till that of n in the array but
there’s no co nnected nod e with shortest p ath of n+1. Ad d
1 to the nodes on the paths which just have been found.
The time complexity of LFS is the summation of length
of all the shortest path between each two nodes, that is
O(V2). Compared with Dijkstra, the time complexity of
LFS is reduced obviously.
The preparation of LFS is similar to Dijkstra: abstract
the network into a graph, set up an adjacent list which
makes single-link lists for all non-isolated nodes in the
graph and the i-list contains the nodes directly connected
to the non-isolated node vi. Each node is composed by
two parts: neighboring nodes field (adjvex) and linking
field (nextar c). The neighboring nodes field marks wh ere
the nodes connected to node vi are in the graph and the
linking field marks the next node. Each single-link has a
head node which is composed of data field (data) and
linking field (firstarc). Data field marks the number of vi
in the graph and liking filed marks the first node that is
connected to vi.
Initialization: Set up a two-dimensional array c ini-
tialized maximum to store the length of the shortest path
between each two nodes in the graph, then set up a
one-dimensional array bc initialized 0 to store the be-
tweenness centrality of each node. V is the number of
non-isolated nodes in the graph.
Set up array a and b. a is used to restore the end
node of the shortest path. b is used to restore the number
of nodes with the same starting node and the same length
and put the starting point into a and put la=0 into b. At
last, set the relative element 0 in array c.
Judge whether it is the end of array a. If it is, turn to
; else turn to.
Check out the number of nodes with the length of
shortest path la, and set it to be n.
Travel the next node in array a and the find out all
child nodes which are connected to this node (parent
node) directly. If the value from starting node to this
node in array c is bigger than la, set the value from the
starting node to this node and the value from this node to
starting node in array c to be la+1, then pu t the id of par-
ent node and the id of child node into a. The same nodes
being put into an m times means that there are m shortest
paths from the starting node to this node. Add 1 to num
which is the number of nodes on layer la+1. Because the
id of parent node can be found by the id of child node,
the shortest paths from the starting node to the other
nodes in array a can been found, then add 1 to the nodes
on each shortest path.
repeat n-1 times.
put num into array b, la=la+1, turn to .
repeat - V-1 times.
It is the end.
According to the description above, the time complex-
ity of LFS is the summation of all th e shortest path in the
network, that is O(V2). And V is the number of
non-isolated nodes in the network. The space complexity
is T(V2) which is equal to that of Dijkstra.
The comparison between the time complexity and
space complexity of Dijkstra and LFS:
Table 1. Time complexity and space complexity of Dijkstra
and LFS
algorithm time complexity space complexity
Dijkstra OV3 T(V2)
LFS OV2 T(V2)
Copyright © 2010 SciRes JSEA
Research on LFS Algorithm in Software Network
Copyright © 2010 SciRes JSEA
187
4. Performance Evaluations
Dijkstra takes breadth-first method to travels all the
nodes in the software network, find out all the shortest
paths, obtain the nodes on the shortest path, [10–12] and
calculate betweenness centrality. Dijkstra has a three-
cycle which makes the time complexity is so high that it
is a fatal shortcoming when applied into large-scale
software network.
LFS only has one-cycle, and travels nodes in the net-
work one by one, then find out the shortest path from
starting node to the other nodes. The time complexity of
LFS is O(V2) and the space complexity is T(V2). De-
pending on what mentioned above, LFS has great advan-
tages both in time complexity and in space complexity.
With the development of computer science, the com-
puter memory becomes bigger and bigger which can sat-
isfy all kinds of demands and no longer need to be con-
sidered. So we spend no more time on discussing space
complexity and merely pay attention to time complexity.
We get satisfying result with the help of HP computer
which is composed of Core Duo 6300 CPU, 1.86GHz
Frequency, DDR2 667 1GB Memory and Windows XP
SP2 Operation System.
In order to verify that LFS has great advantages in
time complexity, 22 samples are selected and sorted as-
cending which can justify whether LFS is applicable to
software of different sizes. The comparison of time con-
sumption between Dijkstra and LFS is shown as follows:
DT is the time cost by calculating betweenness cen-
trality of the software by Dijkstra, and WT is that by LFS.
Time units are seconds. DT/WT which marks advantages
of LFS in time consumption is the ratio of DT and WT.
Since both Dijkstra and LFS are related to the number
of non-isolated nodes in the software network, the rela-
tion between these two kinds of algorithm is shown
through the number of non-isolated nodes by the follow-
ing figures. Figure 1 shows the time cost by these two
algorithm when measure the same software. Figure 2
shows the multiples that LFS saved when compared to
Dijkstra.
As shown in Figure 1, when the number of
non-isolated nodes goes up, the time consumption of
Dijkstra appears a clear upward trend, while the time
consumption of LFS changes smoothly. And the larger
the number of non-isolated nodes the more obviously
the difference appears, which approves that the times
consumption would not increase as quickly as the
number of nodes, but Dijkstra is on the contrary.
As shown in Figure 2, Dijkstra cost more than 10
times even tens of times of time than LFS when they
are applied into the same software. With the growth of
the number of nodes, the curve has a clear upward
trend although it is fluctuant sometimes. It approves
that the advantages of LFS in time complexity appears
more and more obviously as the software becomes
larger and larger under the same circumstance, which
says that LFS is especially fit for large-scale software
network.
Table 2. The comparison of time consumption betw ee n Dijkstra and LFS
software number of
nodes number of
edges number of non-
isolated nodes DTs WTs DT/WT
waimea 116 193 86 0.359 0.032 11.22
kicad 212 300 180 0.609 0.110 5.54
ktorrent 263 335 217 3.313 0.250 13.25
rhythmbox 366 342 252 8.349 0.531 15.72
filezilla 431 577 358 5.500 0.563 9.77
licq 574 633 491 12.110 1.282 9.45
freemind 713 933 562 53.172 1.812 29.34
espgs 1339 1271 955 150.094 7.063 21.25
abiword 1300 2124 1167 384.391 11.531 33.34
ArgoUML 2031 2217 1731 747.093 31.718 23.55
kdegraphics 2014 3498 1749 1036.781 44.688 23.20
mysql-5.0.56 3132 3837 2182 1685.828 54.453 30.96
mysql_6.0.6 3793 5368 2889 3131.318 104.531 29.96
kdepim 3518 4447 3008 2933.594 136.047 21.56
koffice 4580 5892 3883 4853.296 185.891 26.11
linux 7343 6045 4238 6756.612 296.313 22.80
resin 5076 7875 4261 10613.281 389.072 27.28
node 5418 11451 5418 13693.187 553.985 24.72
firefox 9261 15533 5781 18167.438 725.530 25.04
firefox 7100 48236 7100 24267.391 942.793 25.74
Mozilla 8354
13878 7195 37298.863 1315.750 28.35
firefox 10115 17469 8830 120818.089 3152.580 38.32
Research on LFS Algorithm in Software Network
188
Figure 1. The time cost by Dijkstra and LFS
Figure 2. The multiples that LFS saved when compared to Dijkstra
As shown in Figure 2, the value of DT/WT shows an
upward trend in volatility. Due to the relationship be-
tween the time complexity of LFS algorithm and the
complexity of the network itself, the value of DT/WT
depends on the size of the network which is equals to
the sum of the length of the entire shortest path. When
Copyright © 2010 SciRes JSEA
Research on LFS Algorithm in Software Network 189
the software network is a little more complex and the
lengths of shortest paths between some nodes are rela-
tive longer, LFS algorithm’s time complexity will in-
crease. For the same reason, when a more uniform dis-
tribution of the network and the lengths of shortest
paths between some nodes are relative shorter, the ad-
vantage of LFS is fully reflected that the time com-
plexity of LFS drops to relative small and the time
consumption comes down too, but Dijkstra doesn’t
have such feature. Therefore, the value of DT/WT
shows an upward trend in volatility. However, the
curve appears an upward trend from the whole.
As mentioned above, the low time complexity of
LFS can meet the needs of starting research on
large-scale software and solve the problems brought by
time complexity in the software measurement. At the
same time, the accurate data obtained will bring accu-
rate and reliable conclusion in practical research work.
5 Conclusions
LFS can solve a series of problems brought by the tra-
ditional algorithm. It has advantages both in time com-
plexity and accuracy which are so important in practi-
cal research work that may result in disaster conclusion
without it. LFS improve the efficiency and the accu-
racy to calculate the betweenness centrality, which
ensures the further research to be continued smoothly.
REFERENCES
[1] M. Pióro, A. Szentesi, J. Harmatos, A. Juttner, P. Ga-
jowniczek, and S. Kozdrowski, “On open shortest path
first related network optimization problems,” Perform-
ance Evaluation, Vol. 48, pp. 201–223, May 2002.
[2] E. P. F. Chan and N. Zhang, “Finding shortest paths in
large network systems,” Proceedings of the 9th ACM In-
ternational Workshop on Advances in Geographic Infor-
mation Systems (ACMGIS2001), Atlanta, Georgia, pp.
160–166, 2001.
[3] D. Awduchen, A. Chiu, A. Elwalid, I. Widjaja, and X.
Xiao, “Overview and principles of internet traffic engi-
neering,” RFC 3272, May 2002.
[4] D. Torrieri, “Algorithms for finding an optimal set of
short disjoint paths in a communication network,” Com-
munications, IEEE Transactions, Vol. 40, No. 11, pp.
1698–1702, 1992.
[5] B. Fortz and M. Thorup, “Internet traffic engineering by
optimizing OSPF weights,” in Proceedings IEEE INFO-
COM, pp. 519–528, 2000.
[6] X. Zhang, H. Zhao, W. B. Zhang, and C. Li, “Research on
CFR algorithm for Internet,” Journal on Communications,
Vol. 27, No. 9, September 2006.
[7] B. Fortz and M. Thorup, “Optimizing OSPF/IS-IS
weights in a changing world,” IEEE Journal on Selected
Areas in Communications, Vol. 20, No. 5, pp. 756–767,
May 2002.
[8] G. Rétvári and T. Cinkler, “Practical OSPF traffic engi-
neering,” IEEE Communications Letters, Vol. 8, No. 11,
pp. 689–691, November 2004.
[9] A. R. Soltani, H. Tawfik, J. Y .Goulermas, et al., “Path
planning in construction sites: Performance evaluation of
the dijkstra, a* and GA search algorithms,” Advanced
Engineering Informatics, Vol. 16, No. 4, pp. 291–303,
2002.
[10] Z. Wang, “Internet QoS: Architectures and mechanisms
for quality of service,” Academic Press, CA, San Diego,
2001.
[11] M. Pióro and D. Medhi, “Routing, flow, and capacity
design in communication and computer networks,” Mor-
gan Kaufmann, CA, San Diego, November 2004.
[12] W. Ben-Ameur and E. Gourdin, “Internet routing and
related topology issues,” SIAM Journal on Discrete
Mathematics, Vol. 17, No. 1, pp. 18–49, 2003.
[13] Gamma Erich, Helm Richard, Johnson Ralph, and Vlis-
sides John, “Design patterns: Elements of reusable ob-
ject-oriented software,” Addison-Wesley Longman Pub-
lishing Co., Inc. Boston, USA, 1995.
[14] M. M. Lehma and J. F. Rmail, “Software evolution and
software evolution processes,” Annals of Software Engi-
neering, Vol. 14, No. 1, pp. 275–309, 2002.
[15] B. Dougherty, J. White, C. Thompson, and D. C. Schmidt,
“Automating hardware and software evolution analysis,”
Engineering of Computer Based Systems (ECBS), 16th
Annual IEEE International Conference and Workshop on
the[C], pp. 265–274, 2009.
[16] S. N. Dorogovtsev and J. F. Mendes, “Scaling properties
of scale-free evolving networks: Continuous approach,”
Physical Review E, Vol. 63, No. 5, pp. 56125, 2001.
[17] N. Zhao, T. Li, L. L. Yang, Y. Yu, F. Dai, and W. Zhang,
“The resource optimization of software evolution proc-
esses,” Advanced Computer Control International Con-
ference on [C] (ICACC), pp. 332–336, 2009.
[18] B. Behm, “Some future trends and implications for sys-
tems and software engineering processes,” Systems En-
gineering, Vol. 9, No. 1, pp. 1–19, 2006.
[19] Lollini Paolo, Bondavalli Andrea, and Di Giandomenico
Felicita, “A decomposition-based modeling framework
for complex systems,” IEEE Transaction on Reliability,
Vol. 58, No. 1, pp. 20–33, 2009.
[20] Y. Ma and K. He, “A complexity metrics set for
large-scale object-oriented software systems,” in Pro-
ceedings of 6th International Conference on Computer
and Information Technology, pp. 189–189, 2006.
[21] K. Madhavi and A. A. Rao, “A framework for visualizing
model-driven software evolution,” IEEE International
Advance Computing Conference (IACC), pp. 1628–1633,
2009.
[22] S. Valverde and R. V. Sole, “Network notifs in computa-
tional graphs: A case study in software architecture,”
Physical Review E, Vol. 72, No. 2, pp. 26107, 2005.
Copyright © 2010 SciRes JSEA