To enhance the approximation and generalization ability of artificial neural networks (ANNs) by employing the principles of the quantum rotation gate and the controlled-NOT gate, a quantum-inspired neuron with sequence input is proposed. In the proposed model, the discrete input sequence is represented by qubits, which, after being rotated by quantum rotation gates, serve as the control qubits of a controlled-NOT gate and jointly control the flipping of the target qubit. The model output is described by the probability amplitude of a state of the target qubit. A quantum-inspired neural network with sequence input (QNNSI) is then designed by employing the sequence-input quantum-inspired neurons in the hidden layer and classical neurons in the output layer, and a learning algorithm is derived from the Levenberg-Marquardt algorithm. Simulation results on a benchmark problem show that, under certain conditions, the QNNSI is clearly superior to the ANN.
Many neurophysiological experiments indicate that the information-processing characteristics of the biological nervous system mainly include the following eight aspects: spatial aggregation, multi-factor aggregation, the temporal cumulative effect, the activation threshold characteristic, self-adaptability, excitatory and inhibitory characteristics, delay characteristics, and conduction and output characteristics [
Since Kak first proposed the concept of quantum-inspired neural computation [
In this paper, in order to more fully simulate biological neuronal information-processing mechanisms and to enhance the approximation and generalization ability of the ANN, we propose a quantum-inspired neural network model with sequence input, called QNNSI. It is worth pointing out that an important issue is how to define, configure, and optimize artificial neural networks. Refs. [
In quantum computers, the “qubit” has been introduced as the counterpart of the “bit” in conventional computers to describe the states of a quantum computational circuit. The two quantum physical states labeled as
An n-qubit system has 2^n computational basis states. For example, a 2-qubit system has basis
where
In quantum computation, a logic function can be realized by applying a series of unitary transformations to the qubit states, where the effect of each unitary transformation is equal to that of a logic gate. Therefore, the quantum devices that perform such logic transformations over a certain interval are called quantum gates, which are the basis of performing quantum computation.
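For illustration, the two gates used throughout this paper can be written as small unitary matrices acting on the amplitude vector of a qubit. The following sketch uses the standard real-plane rotation gate and the NOT gate X; the values of `t` and `theta` are arbitrary examples, not taken from the paper.

```python
import numpy as np

def rotation_gate(theta):
    """Single-qubit rotation gate: rotates the amplitude vector by theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Quantum NOT gate X: swaps the amplitudes of |0> and |1>.
X = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# A qubit with phase t has amplitudes [cos(t), sin(t)];
# rotating it by theta shifts the phase to t + theta.
t, theta = 0.3, 0.5
qubit = np.array([np.cos(t), np.sin(t)])
rotated = rotation_gate(theta) @ qubit   # = [cos(t+theta), sin(t+theta)]
flipped = X @ qubit                      # = [sin(t), cos(t)]
```

Both matrices are unitary, so applying them preserves the normalization of the quantum state.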
The definition of a single qubit rotation gate is written as
The NOT gate is defined by its truth table, in which
The notation X for the quantum NOT is used for historical reasons. If the quantum state
In a true quantum system, a single-qubit state is often affected by the joint control of multiple qubits. A multi-qubit controlled-NOT gate
In
Suppose that the
It is observed from Equation (7) that the output of
In
The probability of the target qubit state
At this time, after the joint control of the n input qubits, the target qubit
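The joint-control mechanism described above can be sketched numerically. Note that the paper's exact aggregation formula (Equation (7)) is not reproduced in this excerpt; the code below is only an illustrative reading in which the target qubit, initialized to |0⟩, is flipped only in the component where every rotated control qubit is |1⟩, so the probability of observing the target in |1⟩ is the product of the controls' |1⟩-probabilities. The function name and arguments are hypothetical.

```python
import numpy as np

def controlled_not_neuron(phases, thetas):
    """Illustrative sketch (not the paper's exact formula): n control
    qubits with phases `phases`, each rotated by `thetas[i]`, jointly
    control a target qubit initialized to |0>.  The multi-qubit
    controlled-NOT flips the target only where every control is |1>,
    so P(target = |1>) = prod_i sin^2(phases[i] + thetas[i])."""
    p = np.asarray(phases) + np.asarray(thetas)
    return np.prod(np.sin(p) ** 2)

p_all_one = controlled_not_neuron([np.pi / 2] * 3, [0.0] * 3)  # controls all |1>
p_all_zero = controlled_not_neuron([0.0] * 3, [0.0] * 3)       # controls all |0>
```

When every control is in |1⟩ the target is flipped with certainty; when any control is in |0⟩ the flip probability drops accordingly.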
In this section, we first propose a quantum-inspired neuron model with sequence input, as shown in
Let
where
for
In this paper, the QNNSI model is shown in
Unlike the ANN, each input sample of the QNNSI is described as a matrix instead of a vector. For example, the l-th sample can be written as
Let
output relationship of quantum-inspired neuron, in interval
The j-th output of the hidden layer (namely, the spatial and temporal aggregation result over [0, T]) is given by
The k-th output of the output layer can be written as
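The hidden-layer and output-layer equations are omitted from this excerpt, so the following forward pass is only a sketch under stated assumptions: the hidden quantum-inspired neuron is assumed to average a controlled-NOT flip probability over the q time steps of the sequence, and the output layer is assumed to be a classical sigmoid neuron. All names (`qnnsi_forward`, the shapes of `theta`, `W`, `b`) are hypothetical.

```python
import numpy as np

def qnnsi_forward(X, theta, W, b):
    """Hedged sketch of a QNNSI forward pass.

    X     : (n, q) input matrix -- n input nodes, sequence length q,
            entries already converted to phases.
    theta : (m, n) rotation angles of the m hidden quantum-inspired neurons.
    W, b  : (k, m) weights and (k,) biases of the classical output layer.
    """
    m, q = theta.shape[0], X.shape[1]
    hidden = np.empty(m)
    for j in range(m):
        # Temporal aggregation: average the assumed controlled-NOT
        # flip probability over the q time steps of the sequence.
        flip = [np.prod(np.sin(X[:, t] + theta[j]) ** 2) for t in range(q)]
        hidden[j] = np.mean(flip)
    # Classical output layer with a sigmoid activation (an assumption).
    return 1.0 / (1.0 + np.exp(-(W @ hidden + b)))

rng = np.random.default_rng(0)
y = qnnsi_forward(rng.uniform(0, 2 * np.pi, (4, 9)),
                  rng.uniform(0, 2 * np.pi, (10, 4)),
                  rng.normal(size=(1, 10)), np.zeros(1))
```

The point of the sketch is the shape of the computation: each hidden neuron aggregates both across the n input dimensions (breadth) and across the q sequence points (depth) before the classical output layer is applied.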
Suppose the l-th sample in n-dimensional input space
These samples can be converted into the quantum states as follows
where
Similarly, suppose the
then, these output samples can be normalized by the following equation
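The conversion and normalization equations are not reproduced in this excerpt. A common choice in quantum-inspired models, shown here purely as an assumed illustration, is to map each raw value linearly to a phase and represent the phase as the amplitudes of a qubit; the helper names `to_phase` and `to_qubit` are hypothetical.

```python
import numpy as np

def to_phase(x, lo, hi):
    """Map raw values in [lo, hi] linearly to phases in [0, 2*pi]
    (an assumed mapping; the paper's exact equation is omitted here)."""
    return 2 * np.pi * (np.asarray(x, dtype=float) - lo) / (hi - lo)

def to_qubit(phase):
    """Represent each phase as qubit amplitudes [cos(phase), sin(phase)]."""
    return np.stack([np.cos(phase), np.sin(phase)], axis=-1)

x = [1.0, 2.5, 4.0]
q = to_qubit(to_phase(x, lo=1.0, hi=4.0))
# Each row is a normalized quantum state: cos^2 + sin^2 = 1.
```

Whatever the exact mapping, the essential property is that every converted input is a valid (normalized) single-qubit state.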
In the QNNSI, the adjustable parameters include the rotation angles of the quantum rotation gates in the hidden layer and the connection weights in the output layer. Suppose
Let
According to the gradient descent algorithm in Ref. [
where
The gradient of the connection weights in output layer can be calculated as follows
Because the gradient calculation is complicated, the standard gradient descent algorithm does not converge easily. Hence, we employ the Levenberg-Marquardt algorithm in Ref. [
Let P denote the parameter vector, e the error vector, and J the Jacobian matrix. P, e, and J are respectively defined as follows
According to Levenberg-Marquardt algorithm, the QNNSI iterative equation is written as follows
where
If the value of the evaluation function E reaches the predefined precision within the preset maximum number of iterative steps, the algorithm stops; otherwise, it stops once the predefined maximum number of iterative steps is reached.
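With P, e, and J defined as above, one Levenberg-Marquardt step takes the damped Gauss-Newton form P ← P − (JᵀJ + μI)⁻¹Jᵀe. The sketch below applies this update to a toy linear least-squares problem; the damping value `mu = 0.1` is illustrative, not the paper's setting.

```python
import numpy as np

def lm_step(P, e, J, mu):
    """One Levenberg-Marquardt update:
    P_next = P - (J^T J + mu * I)^{-1} J^T e
    P : parameter vector, e : error vector, J : Jacobian de/dP."""
    H = J.T @ J + mu * np.eye(P.size)  # damped Gauss-Newton Hessian
    return P - np.linalg.solve(H, J.T @ e)

# Toy check on a linear problem e = A @ P - y, whose exact solution
# is P = [1, 2]; the Jacobian of e with respect to P is simply A.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
y = np.array([1.0, 4.0, 3.0])
P = np.zeros(2)
for _ in range(50):
    e = A @ P - y
    P = lm_step(P, e, A, mu=0.1)
```

The damping term μI interpolates between gradient descent (large μ) and Gauss-Newton (small μ), which is what makes the method robust where plain gradient descent struggles.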
To examine the effectiveness of the proposed QNNSI, Mackey-Glass time series prediction is used in this section to compare it with an ANN with one hidden layer. In this experiment, we implement and investigate the QNNSI in Matlab (Version 7.1.0.246) on a Windows PC with a 2.19 GHz CPU and 1.00 GB RAM. In the simulations, our QNNSI has the same structure and parameters as the ANN, and the same Levenberg-Marquardt algorithm [
The Mackey-Glass time series can be generated by the following iterative equation [
where
From the above equation, we may obtain the time sequence
set, and the remaining 200 as the testing set. Our prediction scheme is to employ n adjacent data points to predict the next one; namely, in our model, the sequence length equals n. Therefore, each sample consists of n input values and one output value.
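The generating equation and its constants are omitted from this excerpt, so the sketch below uses the commonly cited discretized Mackey-Glass iteration with the usual parameter choices (a = 0.2, b = 0.1, τ = 17, x0 = 1.2); the paper's exact values may differ. It also builds sliding-window samples as described above: n adjacent values as input, the next value as the target.

```python
import numpy as np

def mackey_glass(n_points, tau=17, a=0.2, b=0.1, x0=1.2):
    """Discretized Mackey-Glass iteration (common benchmark parameters,
    assumed here): x[t+1] = x[t] + a*x[t-tau]/(1 + x[t-tau]**10) - b*x[t]."""
    x = np.full(n_points + tau, x0)
    for t in range(tau, n_points + tau - 1):
        x[t + 1] = x[t] + a * x[t - tau] / (1 + x[t - tau] ** 10) - b * x[t]
    return x[tau:]

def make_samples(series, n):
    """Each sample: n adjacent values as input, the next value as target."""
    X = np.array([series[i:i + n] for i in range(len(series) - n)])
    y = series[n:]
    return X, y

series = mackey_glass(1000)
X, y = make_samples(series, n=36)  # 36-point windows, one-step-ahead target
```

Consecutive input windows overlap by n − 1 points, so each window shares all but its oldest value with the next one.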
Hence, there is only one output node in the QNNSI and the ANN. In order to fully compare the approximation ability of the two models, the number of hidden nodes is respectively set to
Obviously, the ANN has n input nodes, and an ANN input sample can be described as an n-dimensional vector. For the number of input nodes of the QNNSI, we employ the nine settings shown in
It is worth noting that, in QNNSI, an
Our experiment scheme is as follows: for each combination of input nodes and hidden nodes, one ANN and nine QNNSIs are each run 10 times. We then use four indicators, namely average approximation error, average iterative steps, average running time, and convergence ratio, to compare the QNNSI with the ANN. Comparisons of the training results are shown in Tables 2-5, where QNNSIn_q denotes a QNNSI with n input nodes and sequence length q.
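The nine QNNSI settings all factor the 36-point window into an n × q input matrix (n input nodes, sequence length q), as listed in Table 1. The reshaping can be sketched as follows; whether the window fills the matrix row-wise or column-wise is an assumption, since the paper's indexing convention is not shown in this excerpt.

```python
import numpy as np

# The nine input-node / sequence-length settings of Table 1,
# each factoring the 36-point window as n * q = 36.
settings = [(1, 36), (2, 18), (3, 12), (4, 9), (6, 6),
            (9, 4), (12, 3), (18, 2), (36, 1)]

window = np.arange(36.0)  # one 36-point sample window (dummy values)
samples = {}
for n, q in settings:
    # Row-major fill assumed: row i holds the i-th input node's sequence.
    samples[(n, q)] = window.reshape(n, q)
```

The (36, 1) setting degenerates to an ordinary vector input, while (1, 36) feeds the whole window to a single node as one long sequence; the intermediate factorizations trade breadth for depth.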
From Tables 2-5, we can see that when the number of input nodes is 6, 9, or 12, the performance of the QNNSIs is
| QNNSI input nodes | QNNSI sequence length | ANN input nodes | ANN sequence length |
|---|---|---|---|
| 1 | 36 | 36 | 1 |
| 2 | 18 | 36 | 1 |
| 3 | 12 | 36 | 1 |
| 4 | 9 | 36 | 1 |
| 6 | 6 | 36 | 1 |
| 9 | 4 | 36 | 1 |
| 12 | 3 | 36 | 1 |
| 18 | 2 | 36 | 1 |
| 36 | 1 | 36 | 1 |
| Model \ Hidden nodes | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 | 38 | 40 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| QNNSI1_36 | 0.55 | 0.55 | 0.59 | 0.59 | 0.59 | 0.59 | 0.59 | 0.67 | 0.67 | 0.59 | 0.71 | 0.63 | 0.67 | 0.63 | 0.67 | 0.68 |
| QNNSI2_18 | 0.55 | 0.54 | 0.51 | 0.25 | 0.13 | 0.14 | 0.14 | 0.31 | 0.32 | 0.13 | 0.41 | 0.13 | 0.32 | 0.23 | 0.41 | 0.41 |
| QNNSI3_12 | 0.04 | 0.04 | 0.13 | 0.13 | 0.13 | 0.04 | 0.13 | 0.32 | 0.32 | 0.13 | 0.32 | 0.13 | 0.32 | 0.22 | 0.41 | 0.41 |
| QNNSI4_9 | 0.04 | 0.04 | 0.13 | 0.13 | 0.04 | 0.13 | 0.13 | 0.22 | 0.32 | 0.04 | 0.22 | 0.04 | 0.22 | 0.22 | 0.32 | 0.31 |
| QNNSI6_6 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 |
| QNNSI9_4 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 |
| QNNSI12_3 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 |
| QNNSI18_2 | 0.17 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.05 | 0.04 |
| QNNSI36_1 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.48 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 |
| ANN36 | 0.23 | 0.14 | 0.41 | 0.14 | 0.23 | 0.23 | 0.14 | 0.32 | 0.32 | 0.14 | 0.32 | 0.05 | 0.32 | 0.23 | 0.41 | 0.32 |
| Model \ Hidden nodes | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 | 38 | 40 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| QNNSI1_36 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| QNNSI2_18 | 100 | 100 | 92.5 | 76.8 | 57.1 | 52.6 | 42.5 | 51.4 | 49.6 | 32.8 | 51.7 | 26.2 | 43.8 | 36.8 | 51.0 | 51.1 |
| QNNSI3_12 | 8.90 | 8.20 | 16.6 | 16.5 | 15.2 | 6.70 | 15.0 | 33.5 | 33.6 | 14.4 | 33.1 | 14.1 | 33.4 | 23.9 | 42.5 | 42.5 |
| QNNSI4_9 | 5.60 | 5.40 | 14.8 | 14.5 | 4.70 | 14.1 | 13.8 | 23.7 | 33.0 | 4.20 | 23.5 | 4.10 | 23.6 | 23.1 | 33.0 | 32.9 |
| QNNSI6_6 | 5.4 | 5.5 | 4.9 | 5.1 | 4.9 | 4.6 | 4.4 | 4.5 | 4.2 | 4.1 | 4.1 | 4.0 | 4.1 | 4.0 | 6.2 | 4.5 |
| QNNSI9_4 | 6.6 | 6.0 | 6.0 | 5.5 | 5.8 | 5.3 | 5.3 | 5.2 | 5.0 | 4.6 | 4.6 | 4.7 | 4.4 | 4.2 | 4.4 | 4.6 |
| QNNSI12_3 | 10 | 7.7 | 6.6 | 7.2 | 6.7 | 6.1 | 6.2 | 6.2 | 5.9 | 6.2 | 5.9 | 5.8 | 6.1 | 5.7 | 5.6 | 5.5 |
| QNNSI18_2 | 52.9 | 32.2 | 33.0 | 19.0 | 20.6 | 16.9 | 11.3 | 10.8 | 10.8 | 9.90 | 9.50 | 9.10 | 8.70 | 9.10 | 7.60 | 8.10 |
| QNNSI36_1 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| ANN36 | 32.9 | 21.0 | 48.3 | 20.5 | 30.2 | 29.6 | 20.6 | 38.2 | 45.4 | 20.2 | 38.2 | 10.0 | 37.0 | 5.50 | 46.8 | 36.9 |
| Model \ Hidden nodes | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 | 38 | 40 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| QNNSI1_36 | 99 | 124 | 149 | 180 | 210 | 244 | 279 | 321 | 359 | 396 | 435 | 489 | 538 | 600 | 657 | 722 |
| QNNSI2_18 | 83 | 103 | 117 | 117 | 104 | 111 | 108 | 146 | 158 | 123 | 209 | 126 | 223 | 212 | 316 | 350 |
| QNNSI3_12 | 9 | 11 | 23 | 27 | 30 | 19 | 40 | 92 | 104 | 57 | 131 | 70 | 164 | 135 | 253 | 281 |
| QNNSI4_9 | 7 | 9 | 21 | 26 | 14 | 33 | 39 | 69 | 97 | 21 | 89 | 25 | 112 | 122 | 188 | 208 |
| QNNSI6_6 | 7 | 9 | 10 | 12 | 14 | 15 | 17 | 19 | 21 | 24 | 26 | 27 | 31 | 33 | 51 | 44 |
| QNNSI9_4 | 7 | 9 | 11 | 12 | 15 | 17 | 18 | 21 | 23 | 25 | 27 | 30 | 32 | 33 | 38 | 43 |
| QNNSI12_3 | 9 | 9 | 10 | 13 | 15 | 16 | 19 | 22 | 24 | 28 | 30 | 33 | 37 | 40 | 42 | 44 |
| QNNSI18_2 | 37 | 30 | 37 | 28 | 35 | 35 | 29 | 32 | 37 | 38 | 42 | 45 | 49 | 55 | 53 | 61 |
| QNNSI36_1 | 69 | 88 | 109 | 131 | 150 | 176 | 204 | 235 | 269 | 306 | 346 | 389 | 436 | 486 | 540 | 598 |
| ANN36 | 13 | 13 | 32 | 19 | 32 | 38 | 33 | 66 | 89 | 50 | 101 | 37 | 128 | 139 | 203 | 182 |
| Model \ Hidden nodes | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 | 38 | 40 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| QNNSI1_36 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| QNNSI2_18 | 0 | 0 | 20 | 60 | 90 | 90 | 90 | 70 | 70 | 90 | 60 | 90 | 70 | 80 | 60 | 60 |
| QNNSI3_12 | 100 | 100 | 90 | 90 | 90 | 100 | 90 | 70 | 70 | 90 | 70 | 90 | 70 | 80 | 60 | 60 |
| QNNSI4_9 | 100 | 100 | 90 | 90 | 100 | 90 | 90 | 80 | 70 | 100 | 80 | 100 | 80 | 80 | 70 | 70 |
| QNNSI6_6 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| QNNSI9_4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| QNNSI12_3 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| QNNSI18_2 | 70 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| QNNSI36_1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ANN36 | 80 | 90 | 60 | 90 | 80 | 80 | 90 | 70 | 70 | 90 | 70 | 100 | 70 | 80 | 60 | 70 |
obviously superior to that of the ANN, and the QNNSIs show better stability than the ANN as the number of hidden nodes changes.
Next, we investigate the generalization ability of the QNNSI. Based on the above experimental results, we investigate only three QNNSIs (QNNSI6_6, QNNSI9_4, and QNNSI12_3). Our experiment scheme is as follows: the three QNNSIs and one ANN are each trained 10 times on the training set, and the generalization ability is evaluated on the testing set immediately after each training. The average results of the 10 tests are regarded as the evaluation indexes. For convenience of description, let
Taking 30 hidden nodes as an example, the comparison of evaluation indexes between the QNNSIs and the ANN is shown in
These experimental results can be explained as follows. The QNNSI and the ANN process input information in different ways. The QNNSI directly receives a discrete input sequence; using a quantum information-processing mechanism, the input is cyclically mapped to the outputs of the quantum controlled-NOT gates in the hidden layer. Because a controlled-NOT gate's output is an entangled multi-qubit state, this mapping is highly nonlinear, which gives the QNNSI stronger approximation ability. In addition, each input sample of the QNNSI can be described as a matrix with n rows and q columns. It is clear from the QNNSI algorithm that different combinations of n and q yield different outputs of the quantum-inspired neurons in the hidden layer. In fact, the number of discrete points q denotes the depth of pattern memory, and the number of input nodes n denotes the breadth of pattern memory. When the depth and the breadth are appropriately matched, the QNNSI shows excellent performance. The ANN, whose input can only be described as an nq-dimensional vector, cannot directly deal with a discrete input sequence; it can only capture sample characteristics by way of breadth rather than depth. Hence, in the ANN's information processing there is inevitably a loss of sample characteristics, which affects its approximation and generalization ability.
It is worth pointing out that the QNNSI is potentially much more computationally efficient than the models referenced in the Introduction. The efficiency of many quantum algorithms comes directly from quantum parallelism, a fundamental feature of quantum computation. Heuristically, and at the risk of over-simplifying, quantum parallelism allows quantum computers to evaluate a function f(x) for many different values of x simultaneously. Although quantum simulation requires many resources in general, quantum parallelism leads to very high computational efficiency by using the superposition of quantum states. In the QNNSI, the input samples are converted into corresponding quantum superposition states during preprocessing. Hence, as far as the many quantum rotation gates and controlled-NOT gates used in the QNNSI are concerned, information processing could be performed simultaneously, which would greatly improve computational efficiency. Because the above experiments were performed on a classical computer, this quantum parallelism has not been exploited; however, the efficient computational ability of the QNNSI should stand out on future quantum computers.
This paper proposes a quantum-inspired neural network model with sequence input based on the principles of quantum computing. The architecture of the proposed model comprises three layers, where the hidden layer consists of quantum-inspired neurons and the output layer consists of classical neurons. An obvious difference from the classical ANN is that each dimension of a single input sample is a discrete sequence rather than a single value. The activation function of the hidden layer is redesigned according to the principles of quantum computing, and the Levenberg-Marquardt algorithm is employed for learning. By applying the information-processing mechanism of quantum rotation gates and controlled-NOT gates, the proposed model can effectively capture the sample
| QNNSI model | | | | ANN model | | | |
|---|---|---|---|---|---|---|---|
| QNNSI6_6 | 0.0520 | 0.0084 | 0.0001 | ANN36 | 0.3334 | 0.1598 | 0.0185 |
| QNNSI9_4 | 0.0541 | 0.0089 | 0.0001 | ANN36 | 0.3334 | 0.1598 | 0.0185 |
| QNNSI12_3 | 0.0566 | 0.0093 | 0.0001 | ANN36 | 0.3334 | 0.1598 | 0.0185 |
characteristics by way of both breadth and depth. The experimental results reveal that a greater difference between the number of input nodes and the sequence length leads to lower performance of the proposed model than that of the classical ANN; conversely, when the number of input nodes is close to the sequence length, the approximation and generalization ability of the proposed model is obviously enhanced. Issues such as the continuity, computational complexity, and improvement of the learning algorithm of the proposed model are subjects for further research.
This work was supported by the National Natural Science Foundation of China (Grant No. 61170132), Natural Science Foundation of Heilongjiang Province of China (Grant No. F2015021), Science Technology Research Project of Heilongjiang Educational Committee of China (Grant No. 12541059), and Youth Foundation of Northeast Petroleum University (Grant No. 2013NQ119).