In computational complexity theory, the travelling salesman problem is a typical problem in the class NP. Using a new approach named the “maximum-deleting method”, a fast algorithm is constructed for it that runs in polynomial time of fourth order, $O(n^4)$, which greatly reduces the computational complexity. Since this problem is also NP-complete, it follows as a corollary that P = NP. This resolves the well-known open problem named “P versus NP”.
The travelling salesman problem asks the following question: “Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?” [
The proof of P = NP relies on an important concept named “NP-complete”, which was first proposed by Cook in 1971 [
Proposition 1. Let L be a language over a finite alphabet. If L is NP-complete and L belongs to P then P = NP.
It follows from [
In October 2018 I began to think about this problem, and a fresh idea appeared when, with pen and paper, I crossed out the longest of the 15 paths that connect 6 cities pairwise (see
may greatly reduce the computational complexity, particularly for problems with a large number of cities. This “maximum-deleting method” shed some light on the travelling salesman problem and the P versus NP problem, and the subsequent work on making it precise lifted the veil on both. The present paper reports these results.
It is necessary to restate the travelling salesman problem in mathematical language. Let $n$ be the total number of cities concerned (including the origin city, $n \ge 4$), expressed as a series of nodes $C_k$ specified by the two-dimensional coordinates $(x_k, y_k)$ ($1 \le k \le n$); the order of the nodes is chosen arbitrarily. The path between each pair of cities is abstracted as a line segment (called a “line” for short in the following), and the one between the $i$-th and the $j$-th nodes is denoted by $l_{i,j}$ ($1 \le i, j \le n$ with $i \ne j$). Since $l_{j,i}$ and $l_{i,j}$ denote the same line, only the case $i < j$ is considered in the following. All these lines compose a set:
$$D^* = \{\, l_{i,j} : 1 \le i, j \le n,\ i < j \,\},$$
whose number of elements is
$$N = (n-1) + (n-2) + \cdots + 2 + 1 = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}. \tag{1}$$
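The count in Equation (1) can be checked by enumerating the index pairs directly. The following is a minimal sketch; the function name `all_lines` is an illustrative assumption, not part of the paper.

```python
from itertools import combinations

def all_lines(n):
    """Return the index pairs (i, j), 1 <= i < j <= n, one per line l_{i,j} in D*."""
    return list(combinations(range(1, n + 1), 2))

lines = all_lines(6)
print(len(lines))  # 15, i.e. 6 * 5 / 2
```

For 6 cities this gives the 15 lines mentioned in the introduction.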
The concerned route that visits each city and returns to the origin city can be expressed as a closed loop Q which connects all the n nodes with two requirements:
1) Each node has two connections (that is, it is an endpoint of exactly two lines);
2) The journey length of $Q$ (that is, the sum of the lengths of the selected $n$ lines in $D^*$) should be the shortest among all the choices.
Within the framework of the maximum-deleting method, the first requirement serves as a criterion and the second as a strategy.
To demonstrate the strategy clearly, we arrange the elements of $D^*$ by comparing their lengths, defined by the Euclidean distance
$$|l_{i,j}| = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{2}$$
and get
$$D_0 = \{\, l_{p(k),q(k)} : 1 \le k \le N,\ |l_{p(r),q(r)}| \ge |l_{p(r+1),q(r+1)}| \text{ for all } 1 \le r \le N-1 \,\},$$
where $N$ is given in Equation (1) and the subscripts $p(k)$ and $q(k)$ are two mappings. For example, if $l_{2,5}$ is the longest line in $D^*$, then it appears as $l_{p(1),q(1)}$ in $D_0$. For a candidate for $Q$, the journey length reads
$$S = s_1 + s_2 + \cdots + s_n = \sum_{k=1}^{n} s_k, \tag{3}$$
where the chosen lines are also arranged by length, and $s_k$ denotes the length of the $k$-th line with $s_k \ge s_{k+1}$ for all $1 \le k \le n-1$. Setting connectivity aside, the upper and lower bounds of $S$ correspond to choosing the first $n$ terms and the last $n$ terms of $D_0$, respectively. Since a candidate for $Q$ is always composed of $n$ segments, each of which has many choices, the longest line should be avoided in order to shorten the total length. Precisely, if the length $s_1$ in Equation (3) corresponds to the first line $l_{p(1),q(1)}$ in $D_0$, then replacing this line by another one in $D_0$ with a shorter length $s'_1$ should help to shorten the length $S$. After deleting $l_{p(1),q(1)}$ one can continue to delete $l_{p(2),q(2)}, l_{p(3),q(3)}, \cdots$, provided that the line to be deleted has more than two connections at each endpoint. This is an ensemble compressing strategy that squeezes the candidate set for $Q$ towards the last $n$ terms of $D_0$. Generally speaking, the last $n$ terms seldom compose a single closed loop (the lower bound of $S$ is seldom achieved). So there always exist some lines $l_{p(k),q(k)}$ with $k \le N-n$ in the remaining set, that is, some shorter lines are deleted. This leaves a certain leeway for modifications in the final process, where the shortest principle should be obeyed. We note that, when the deleting process is finished, the remaining lines usually compose many small loops, which need to be connected into a single one.
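The ordering $D_0$ and the deleting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample coordinates and the 0-based node indices are hypothetical, ties between equal lengths are broken arbitrarily by the sort, and the later repair steps of the algorithm are not included.

```python
import math
from itertools import combinations

def sorted_lines(points):
    """Return D_0: all index pairs (i, j) with i < j, longest line first."""
    pairs = combinations(range(len(points)), 2)
    return sorted(pairs,
                  key=lambda p: math.dist(points[p[0]], points[p[1]]),
                  reverse=True)

def delete_longest(points):
    """Scan D_0 once; delete a line only while both of its endpoints
    still have more than 2 connections."""
    n = len(points)
    degree = [n - 1] * n          # every node starts with n - 1 connections
    kept = []
    for i, j in sorted_lines(points):
        if degree[i] > 2 and degree[j] > 2:
            degree[i] -= 1        # line (i, j) is deleted
            degree[j] -= 1
        else:
            kept.append((i, j))   # line (i, j) remains
    return kept, degree

# Hypothetical coordinates for 6 cities (0-based indices, unlike C_1..C_n in the text).
cities = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3), (1, 1)]
kept, degree = delete_longest(cities)
print(kept, degree)
```

A single pass suffices in this sketch because the number of connections of a node never increases, so a line that cannot be deleted when it is first examined cannot be deleted later either. Nodes whose remaining degree exceeds 2 are left for the later steps.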
Since requirement 1) for $Q$ needs only 2 of the $n-1$ ($\ge 3$) connections of each node, the maximum-deleting strategy can always be executed. The maximum-deleting algorithm for the travelling salesman problem is as follows:
Step 1. Input the $n$ nodes $C_k$ and generate all the lines in $D^*$.
Step 2. Calculate the corresponding line lengths by Equation (2) and arrange them in the form of $D_0$ [the mapping from $D^*$ to $D_0$ needs to be saved].
Footnote 1. After the deleting process, apart from some particular nodes with $2m$ ($m \ge 2$) connections, all the others are normal nodes, which have 2 connections. The reason is that all the nodes originally have the same number of connections and every deleted line connects a pair of nodes, so a single node cannot be left with an odd number of connections. In addition, a particular node connects only to normal nodes in its neighborhood, since if it were connected to another particular node, at least two lines between them should be deleted.
Step 3. Delete the lines in $D_0$ one by one from the beginning (see
Step 4. If there are some particular nodes, transform them into normal ones. For the case in
Footnote 2. After Step 4, all the remaining nodes are normal ones and the number of remaining lines is $n$.
Footnote 3. $|l_{s,q}| + |l_{p,r}| = OC_s + OC_r + OC_p + OC_q > |l_{p,q}| + |l_{r,s}|,\ |l_{s,p}| + |l_{q,r}|$.
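The inequality in the footnote above can be checked numerically for a concrete configuration. The sketch below assumes four hypothetical nodes placed at the corners of a rectangle so that the lines $l_{s,q}$ and $l_{p,r}$ cross; the function name and coordinates are illustrative only.

```python
import math

def uncross_options(cp, cs, cr, cq):
    """Return the crossed length |l_sq| + |l_pr| and the two uncrossed alternatives."""
    crossed = math.dist(cs, cq) + math.dist(cp, cr)
    option_a = math.dist(cp, cq) + math.dist(cr, cs)   # |l_pq| + |l_rs|
    option_b = math.dist(cs, cp) + math.dist(cq, cr)   # |l_sp| + |l_qr|
    return crossed, option_a, option_b

# Hypothetical rectangle: C_p, C_s, C_r, C_q in order around the boundary,
# so that l_pr and l_sq are its two diagonals.
crossed, option_a, option_b = uncross_options((0, 0), (2, 0), (2, 1), (0, 1))
print(crossed > option_a, crossed > option_b)  # True True
```

In this example the two diagonals have total length $2\sqrt{5}$, while the two alternative pairs of sides have total lengths 2 and 4.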
Footnote 4. After Step 5, the remaining $n$ lines compose either a single closed loop or some isolated closed loops without twists.
particular nodes that have $2m$ ($m \ge 3$) connections, one can repeat this process and delete the lines pair by pair until only a single pair of connections is left (see footnote 2).
Step 5. If there are two crossed lines as in
Step 6. If the remaining $n$ lines compose many isolated closed loops, connect them in the following way: For the cases in
Step 7. Modify the unique closed loop according to the shortest principle. First, renumber the nodes from $C_1$ along this loop in anticlockwise order; second, follow this order and search from $C_1$ for the triangle determined by three neighboring nodes whose shortest side is not on the loop, and then make the possible substitution. For example, in
According to this algorithm, when Step 3 is finished all the possible longer lines have been deleted, and the remaining ones compose either a single closed loop or many small closed loops. What is certain is that the first normal node generated retains only its two shortest connections. Each subsequently generated normal node, for the sake of connectivity, either retains its two shortest connections or shares one or two lines with previously generated nodes (whether it connects to a particular node does not matter). Since the deleting process starts from the longest line, it removes all the possible longer connections with respect to all the $n$ nodes, and, to ensure connectivity, a given node has no other choice of a relatively shorter connection. So the closed loop to be found is based on the remaining lines, which have the shortest sum of lengths. In the subsequent steps, the small loops are opened and connected together. During this process the shortest principle is obeyed and the total length of the candidate loop is reduced to the maximum extent. So the final closed loop has the shortest length among all the choices. It is the anticipated solution of the travelling salesman problem.
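For a small number of cities, the length of any candidate loop can be compared with the shortest loop found by exhaustive enumeration. The sketch below is not part of the maximum-deleting algorithm; it is a reference check whose running time grows like $(n-1)!$, and the function names and coordinates are illustrative assumptions.

```python
import math
from itertools import permutations

def tour_length(points, order):
    """Length of the closed loop that visits points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[k]], points[order[(k + 1) % n]])
               for k in range(n))

def brute_force_shortest(points):
    """Try every loop that starts at node 0 and return the shortest one."""
    n = len(points)
    candidates = ((0,) + perm for perm in permutations(range(1, n)))
    best = min(candidates, key=lambda order: tour_length(points, order))
    return best, tour_length(points, best)

# Hypothetical example: four cities at the corners of a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
order, length = brute_force_shortest(square)
print(order, length)
```

Passing the node order of a loop produced by any other procedure to `tour_length` gives a value that can be compared directly with the enumerated minimum.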
Theorem 1. The maximum-deleting algorithm for the travelling salesman problem runs in polynomial time of order $O(n^4)$.
Proof: It follows from the definition of $D^*$ that the first step of this algorithm costs a computation time of about $K_1 N = K_1 \times n(n-1)/2 = O(n^2)$, where $K_1$ is a constant. Selecting the longest line in $D^*$ requires $N-1$ comparisons, selecting the second longest requires $N-2$ comparisons, and so on. So the arranging process for all the lines in the second step needs a computation time of:
$$\sum_{k=1}^{N-1} k = \frac{N(N-1)}{2} = \frac{1}{8} n^2 (n-1)^2 - \frac{1}{4} n(n-1) = O(n^4).$$
The corresponding saving process needs a time of about $O(N) = O(n^2)$. The deleting process in Step 3 also needs a time of about $O(N) = O(n^2)$. From Step 4 to Step 7, only some of the lines are adjusted, and in each step the number of lines involved is no more than $2n$. Including the judgements and substitutions, these calculations cost a computation time of about $O(n)$. Hence the computational complexity of this algorithm is determined by the second step, and the whole process requires polynomial time of order $O(n^4)$. The proof is finished.
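The comparison count used in the proof for Step 2 corresponds to a selection-style arrangement, and it can be reproduced by counting comparisons directly. The following is a minimal sketch, assuming the lengths are given as a plain list of numbers; the function name is illustrative.

```python
def selection_sort_desc(values):
    """Arrange values from longest to shortest and count the comparisons made."""
    items = list(values)
    comparisons = 0
    for start in range(len(items) - 1):
        longest = start
        for k in range(start + 1, len(items)):
            comparisons += 1
            if items[k] > items[longest]:
                longest = k
        items[start], items[longest] = items[longest], items[start]
    return items, comparisons

# N = 15 lengths, as for n = 6 cities; the values themselves are arbitrary.
lengths = [(7 * k) % 15 for k in range(15)]
ordered, count = selection_sort_desc(lengths)
print(count)  # 105 = 15 * 14 / 2
```

For $n = 6$ the closed form gives $\frac{1}{8} \cdot 36 \cdot 25 - \frac{1}{4} \cdot 30 = 105$, in agreement with the count.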
Corollary 1. P = NP.
Proof: On the one hand, it follows from [
Using a new approach named the “maximum-deleting method”, we have constructed a fast algorithm for the travelling salesman problem that runs in polynomial time of order $O(n^4)$, which greatly reduces the computational complexity. Since this problem is NP-complete, we have, as a corollary, also solved the well-known open problem named “P versus NP”. The result indicates that P = NP, which will come as a surprise, since the great majority of people have believed that P ≠ NP.
The author declares no conflicts of interest regarding the publication of this paper.
Wang, J.L. (2018) Fast Algorithm for the Travelling Salesman Problem and the Proof of P = NP. Applied Mathematics, 9, 1351-1359. https://doi.org/10.4236/am.2018.912088