 J. Software Engineering & Applications, 2009, 2: 111-115 doi:10.4236/jsea.2009.22016 Published Online July 2009 (www.SciRP.org/journal/jsea) Copyright © 2009 SciRes JSEA An Optimal Algorithm for Prufer Codes* Xiaodong Wang1, 2, Lei Wang3, Yingjie Wu1 1Department of Computer Science, Fuzhou University, Fuzhou, China; 2Department of Computer Science, Quanzhou Normal Uni-versity, Quanzhou, China; 3College of Computing, Georgia Institute of Technology, Atlanta GA 30332, USA. Email: wangxd@fzu.edu.cn Received February 19th, 2009; revised February 23rd, 2009; accepted February 24th, 2009. ABSTRACT This paper studies the algorithms for coding and decoding Prufer codes of a labeled tree. The algorithms for coding and decoding Prufer codes of a labeled tree in the literatures require time usually. Although there exist linear time algorithms for Prufer-like codes [1,2,3], the algorithms utilize the integer sorting algorithms. The special range of the integers to be sorted is utilized to obtain a linear time integer sorting algorith m. The Prufer code problem is reduced to integer sortin g. In this paper we consider the Prufer code problem in a different a ngle and a more direct manner. We start from a naïve algorithm, then improved it gradually and finally we obtain a very prac tical linear time algorithm. The techniques we used in this paper are of interest in their own right. (Onlogn) Keywords: Design of Algorithm, Labeled Trees, Prufer Codes, Integer Sorting1. Introduction Labeled trees are of interest in practical and theoretical areas of computer science. For example, Ethernet has a unique path between terminal devices, thus being a tree: labeling the tree nodes is necessary to uniquely identify each device in the network. An interesting alternative to the usual representations of tree data structures in com-puter memories is based on coding labeled trees by means of strings of node labels. This representation was first used in the proof of Cayley’s theorem  to show a one-to-one correspondence between free labeled trees on n nodes and strings of length n-2. In addition to this purely mathematical use, string-based coding of trees has many practical applications. For instance, they make it possible to generate random uniformly distrib-uted trees and random connected graphs: the generation of a random string followed by the use of a fast decod-ing algorithm is typically more efficient than random tree generation by the addition of edges, since in the latter case one must pay attention not to introduce cy-cles. In addition, tree codes are employed in genetic algorithms, where chromosomes in the population are represented as strings of integers, and in heuristics for computing minimum spanning trees with additional constraints, e.g., on the number of leaves or on the di-ameter of the tree itself. Not last, tree codes are used for data compression and for computing the tree and for-est volumes of graphs. Let T be a labeled tree whose nodes are numbered from 0 to n-1. For some vertex v in T, the degree of v, denoted by d[v], is the number of edges incident to v. If d[v]=1, then v is called a leaf. According to Prufer’s proof, any sequence of n-2 numbers, each number in {0,1,…,n-1} can determine a unique labeled tree of n nodes. Such a number sequence is the Prufer code of a labeled tree. Algorithm A is the straightforward imple-mentation of Prufer’s proof. Algorithm A Input: A labeled tree T of n nodes as a list of n-1 edges. Output: c, the Prufer code of T. Method: Step 1. B←{0,1,…,n-1}. Step 2. For i ← 0 to n-3 do Step 2.1. x ← min{kB: k is a leaf }. Step 2.2. B←B-{x}. Step 2.3. Remove x and its incident edge (x, y) from T. Step 2.4. c[i]←y. Step 3. Return c. End of Algorithm For example, let the input labeled tree T be the graph depicted in Figure 1. After the Algorithm A terminates, c = [2,4,0,1,3,3] which is the Prufer code of T . *Supported by Natural Science Foundation of China under Grant No.60172017 and Natural Science Foundation of Fujian under Grant No.A0510008. An Optimal Algorithm for Prufer Codes 112 Figure 1. A labelled tree The algorithms for coding and decoding Prufer codes of a labeled tree in the literatures require tim- e usually. As stated in , although the problem of pro-ducing a Prufer code in linear time is an exercise in two books [5, 6], there exists no explicit publication of a so-lution. In  an time algorithm for generating a Prufer code from a labeled tree was described, but the time algorithm utilized the integer sorting algo-rithms. The special range of the integers to be sorted was utilized to obtain a linear time integer sorting algorithm. The Prufer code problem is reduced to integer sorting. In this paper we consider the Prufer code problem in a dif-ferent angle and present a very practical linear time algo-rithm directly. The techniques we used in this paper are of interest in their own right. (Onlogn)) )))()On()On2. A Linear Time Algorithm for Coding The most time consuming step in Algorithm A is Step 2. The leaf elimination scheme of the Prufer code implicitly defines a root for the labeled tree T. Actually, it is easy to see that the last element in the code is n-1 which is the root of the labeled tree T. Given a list of n-1 edges, we can build the parent array f and the degree array d of the labeled tree T with only preprocessing time, wh- ich allows checking and updating a node's degree in time. With this preprocessing, the Step 2.3 of Alg- orithm A can be implemented in time. Using a he- ap, the leaf with the smallest number can be found in time. Hence, Step 2 takes time to- tally. The time complexity of Algorithm A is also . ()On(1)O(OnlOnl(1)Oogn()ogn(OnlognWe can improve the algorithms further. An insight into the problem is that for the heap operations in the Step 2.1 of Algorithm A, there is a special kind of nodes that they are deleted from the heap immediately after they are inserted into the heap. This kind of nodes can be treated easily without heap operation. The remained heap operation can be replaced by a linear scan of the degree array d. Therefore the total time to find x in the Step 2.1 is reduced to O(n). The linear time algorithm can be presented detailed as follows. Coding Algorithm: 1: index ← x ← min{0≤k