Intelligence is very important to the command decision model and is the key to improving the quality of simulation training and combat experiments. The decision-making content grows more complex as tasks are executed and the nature of the problems varies, so the demand for intelligence is high. To solve this problem better, this paper presents a game method and establishes a game neural network model. The model has been successfully applied to a classification experiment on the winning rates of chess games, and it has good theoretical significance and application value.
Today, simulation training and combat experiments place increasingly high demands on command decision models. Combat experiments need to solve the problems that experimental credibility is low and that the space of possibilities is difficult to explore automatically, which creates a demand for intelligence in the command decision model. In previous literature [
In the modern power system, the participants in dispatch, configuration, use, and other aspects are increasingly diverse, and each has its own independent interests, which inevitably leads to conflicts of interest. It is therefore necessary to establish fair and reasonable mechanisms for coordinating interests and resolving conflicts, balancing and optimizing the interests of all parties [
When energy sources with volatility and randomness, such as wind and photovoltaic power generation, are connected to the grid, the random natural disturbances of wind, light, and other uncertain factors can be regarded as the non-cooperative side of a game in order to control risk effectively and achieve a better control effect; the above problems can then be solved on the basis of non-cooperative game theory [
Because of the confidentiality required in communication, military, and other applications, and because the environment of each signal is increasingly complex, the characteristic information of the target carries some ambiguity. However, fuzzy automata [
This paper proposes a game method, establishes the game neural network model, and applies the model to a classification experiment on the winning rates of chess games.
Games have long been an important application of heuristic search; game-playing systems existed as early as the 1960s. One side of a game tries to maximize its chance of reaching the winning goal, while the other tries to push its opponent away from that goal. Each side always moves toward the state most favorable to itself, which is to say, the state most unfavorable to its opponent.
Each step in the game seeks a Nash equilibrium. A game with n Objects is described as G = {A, I, S, U}, where A = {1, 2, ⋯, n} is the set of all Objects, i.e., the coordinating and decision-making participants of the game, each of which aims to maximize its own payoff or utility level by selecting its action strategy; I denotes the information each Object owns, including the features of the other Objects and their action-strategy information; S is the set of all possible strategies or actions of the Objects, where the set of all feasible strategies of an Object is called its strategy space and S_i denotes the strategy space of the i-th Object; U is the payoff function, which gives the gains and losses of an Object under a given strategy combination, i.e., its benefit level under that particular combination of strategies. If the strategy combination S* = {s_1*, ⋯, s_i*, ⋯, s_n*} in this problem is a Nash equilibrium, it must satisfy
U_i(s_i*, s_−i*) ≥ U_i(s_i, s_−i*),  ∀ s_i ∈ S_i    (1)
where s_i* indicates the strategy chosen by the i-th Object; s_−i* is the vector consisting of the strategies of all Objects except the i-th; U_i is the benefit obtained by the i-th Object; and S_i is the strategy space of the i-th Object.
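Condition (1) can be checked mechanically for any small finite game. The following sketch verifies it for a two-Object, two-strategy game; the payoff tables are illustrative values, not data from this paper:

```python
import itertools

# Illustrative payoff tables: U[i][s1][s2] is the payoff of Object i+1
# when Object 1 plays s1 and Object 2 plays s2.
U = [
    [[3, 0], [5, 1]],   # payoffs of Object 1
    [[3, 5], [0, 1]],   # payoffs of Object 2
]

def is_nash(s1, s2):
    """Check U_i(s_i*, s_-i*) >= U_i(s_i, s_-i*) for every Object i."""
    if any(U[0][a][s2] > U[0][s1][s2] for a in range(2)):
        return False    # Object 1 can profitably deviate
    if any(U[1][s1][b] > U[1][s1][s2] for b in range(2)):
        return False    # Object 2 can profitably deviate
    return True

equilibria = [s for s in itertools.product(range(2), repeat=2) if is_nash(*s)]
print(equilibria)   # -> [(1, 1)]
```

With these payoffs, strategy 1 dominates for both Objects, so the only strategy combination satisfying (1) is (1, 1).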
In a more complex game, under given time and space cost limits, the search is extended only to a certain depth. Because the leaf nodes of the explicit sub-graph produced by this expansion are not the final win-or-lose states of the game, they cannot be given definite final values. In this case, values are assigned to the leaf nodes by some heuristic function that estimates the probability of winning or losing. The value of each node in the search graph, including the root node, is then backed up from bottom to top according to the reverse rule of the ∨/∧ method. The value backed up to the root does not indicate who will finally win; it only reflects the limited number of steps, i.e., the number of layers of the AND/OR search graph, within which the heuristic value of the best reachable state is obtained.
Every Object wants to win the game, so the advantage of one side relative to the other is estimated directly from some heuristic knowledge. In chess the entropy advantage is very important, so a simple heuristic strategy is to compute the difference in entropy advantage between side ∨ and side ∧ and to maximize this difference as far as possible. More sophisticated heuristic strategies assign different heuristic function values according to these entropy differences. The vast majority of games offer plenty of heuristic information that can easily be exploited.
Given the heuristic function h(n), assume the benefit function is U_i(n) = h(n), for example the h(n) used in the ∨/∧ method. Depth- and breadth-limited search is then used to find the optimal move according to the Nash equilibrium principle. For convenience of discussion, the nine-grid game (tic-tac-toe) is used here to describe the game algorithm with a heuristic function.
Example 1. In the nine-grid game, assume that side ∨ plays * and side ∧ plays O, and let ∨ move first.
The full state space of this problem contains 9! = 362,880 nodes in total. Even after homogeneous (symmetric) positions are removed, this is still a large number, so blind search clearly does not work here and a heuristic search method must be considered. In this case, the ∨/∧ search method guided by the heuristic function h(n) is used.
A whole row, a whole column, or a whole diagonal of the chessboard is called a winning line. The orders of winning lines are defined as follows:
1) If there is no chessman on a winning line, it is called a 0-order winning line. A 0-order winning line can be regarded as belonging to either the * side or the O side; it has no effect on the valuation.
2) If there is only one chessman of * (O) side on a winning line, it is called the first-order winning line of the * (O) side.
3) If there are two chessmen of * (O) side on a winning line, the winning line is called the second-order winning line of * (O) side.
4) If there are three chessmen of * (O) side on a winning line, the winning line is called the third-order winning line of * (O) side.
Thus, h(n) can be defined as follows:
1) If node n is a non-final node for the * side, the evaluation function of the * side is:
h(n) = (number of first-order winning lines of the * side − number of first-order winning lines of the O side) + 4 × (number of second-order winning lines of the * side − number of second-order winning lines of the O side) + a + 6 × (number of third-order winning lines of the * side − number of third-order winning lines of the O side) + b
where
a = +2, if the * side, by placing a chessman, can occupy the second-order winning line of the O side;
a = −2, if the O side, by placing a chessman, can occupy the second-order winning line of the * side;
a = 0, otherwise;
and
b = +3, if the * side, by placing a chessman, can occupy the third-order winning line of the O side;
b = −3, if the O side, by placing a chessman, can occupy the third-order winning line of the * side;
b = 0, otherwise.
2) If node n is a final node at which the * side wins, then h(n) = +∞.
3) If node n is a final node at which the * side loses, then h(n) = −∞.
4) If node n is a draw, then h(n) = 0.
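The winning-line counts behind h(n) can be sketched as follows. This is one reading of the definitions above, assuming a k-order winning line of a side contains exactly k of its chessmen and none of the opponent's; the bonus terms a and b are omitted, and the sample position is illustrative:

```python
# The eight winning lines of the nine-grid (tic-tac-toe) board,
# with squares indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def order_counts(board, side):
    """Count the k-order winning lines of `side` (k = 1, 2, 3)."""
    opp = 'O' if side == '*' else '*'
    counts = {1: 0, 2: 0, 3: 0}
    for line in LINES:
        cells = [board[i] for i in line]
        if opp in cells:
            continue                # blocked line: no order for this side
        k = cells.count(side)
        if k in counts:
            counts[k] += 1
    return counts

def h(board):
    """Base part of h(n): order differences weighted 1, 4, 6 (a, b omitted)."""
    s, o = order_counts(board, '*'), order_counts(board, 'O')
    return (s[1] - o[1]) + 4 * (s[2] - o[2]) + 6 * (s[3] - o[3])

board = ['*', ' ', ' ',
         ' ', '*', 'O',
         ' ', ' ', ' ']
print(h(board))   # -> 7: four 1-order and one 2-order * lines vs. one 1-order O line
```

For this position the * side holds four first-order lines and one second-order line against one first-order line for O, giving (4 − 1) + 4 × 1 = 7.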
The search graphs of the ∨/∧ algorithm for the first and second moves can be obtained using h(n), as shown in
Similarly, the ∨/∧ heuristic search graphs for the subsequent moves can be obtained.
As the ∨/∧ search graphs of the full first two moves of the nine-grid game show, both sides of the game are guided by h(n) in the search. The outcome is a draw, and any mistake by either side will then be self-defeating.
The heuristic function is important: if it is defined inappropriately, undesirable results may follow. In this case, redefine the winning line as follows: if a winning line contains only chessmen of the * (O) side or empty squares, and no chessmen of the O (*) side, it is called a winning line of the * (O) side. The heuristic function of the * side can then be defined as the evaluation function h_1(n) as follows:
1) If the node n is a non-final node, then h 1 ( n ) = the number of winning lines of * side − the number of winning lines of O side.
2) If the node n is a draw, then h 1 ( n ) = 0 .
3) If node n is a final node at which the * side wins, then h_1(n) = +∞.
4) If node n is a final node at which the * side loses, then h_1(n) = −∞.
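Under the redefined winning line, h_1(n) simply counts, for each side, the lines free of opponent chessmen. A minimal sketch (board encoding and sample position are illustrative):

```python
# The eight winning lines of the nine-grid board, squares indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def open_lines(board, side):
    """Winning lines of `side`: lines containing no opponent chessman."""
    opp = 'O' if side == '*' else '*'
    return sum(1 for line in LINES
               if all(board[i] != opp for i in line))

def h1(board):
    return open_lines(board, '*') - open_lines(board, 'O')

# With * on the centre square only, all 8 lines stay open for * while
# only the 4 lines avoiding the centre stay open for O, so h1 = 8 - 4 = 4.
board = [' '] * 9
board[4] = '*'
print(h1(board))   # -> 4
```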
Obviously, if the evaluation function h_1′(n) is obtained by evaluating the same position from the point of view of the O side, then h_1′(n) = −h_1(n).
The ∨/∧ heuristic search graph for the first move obtained with h_1(n) is the same as that obtained with h(n), as shown in
However, the defect of h_1(n) is exposed in the ∨/∧ search graph of the second move, because it does not guide the search accurately in this game, as shown in
From
This ∨/∧ heuristic search method completely separates the generation of the game tree from the computation, evaluation, and determination of the optimal move: only after the entire game tree of the specified depth has been generated do the calculations begin, and this separation lowers search efficiency. If the evaluation of the endpoint nodes and the backing-up of values at the intermediate nodes are carried out while the tree grows, i.e., the game tree is generated and evaluated at the same time, much of the generation and computation work can be saved. This technique is called ∨/∧ pruning (commonly known as alpha-beta pruning).
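The idea can be sketched with a generic depth-limited search in which evaluation and back-up happen as the tree is generated; the toy tree, its leaf values, and all names here are illustrative assumptions, not the paper's experiment:

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, h):
    """Depth-limited max/min search with pruning; `children(node)` yields
    successors and `h(node)` is the heuristic value at the frontier."""
    succ = children(node)
    if depth == 0 or not succ:
        return h(node)                  # evaluate while the tree grows
    if maximizing:                      # max node (side to maximize h)
        value = -math.inf
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, h))
            alpha = max(alpha, value)
            if alpha >= beta:           # cut-off: min will never allow this branch
                break
        return value
    else:                               # min node
        value = math.inf
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, h))
            beta = min(beta, value)
            if beta <= alpha:           # cut-off
                break
        return value

# Toy two-ply tree: leaf 'c2' is never evaluated, because 'c1' already
# makes branch C worse than branch B for the maximizing side.
tree = {'A': ['B', 'C'], 'B': ['b1', 'b2'], 'C': ['c1', 'c2']}
leaf = {'b1': 3, 'b2': 5, 'c1': 2, 'c2': 9}
root = alphabeta('A', 2, -math.inf, math.inf, True,
                 lambda n: tree.get(n, []), lambda n: leaf.get(n, 0))
print(root)   # -> 3
```

Because branch B already guarantees the maximizer a value of 3, the minimizer's discovery of 2 under C terminates that branch at once, which is exactly the saving described above.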
Learning a game algorithm by building a neural network is an important method in intelligent decision making. By analyzing the historical chess data of both sides of the game, the winning ratio of each side over 100 games in a certain period is obtained; the winning ratio can then be defined as
r(x) = m/n
where r(x) denotes the winning ratio of a side; x denotes either of the two sides; m is the number of winning events; and n is the total number of matches. The winning ratios of the two sides differ, as shown in
The data in the table are taken as the input sample P of the network. P is a two-dimensional random vector, and its distribution is shown in
The weights are trained by a SOFM (self-organizing feature map) network. The distribution of the network's initial weights is shown in
W(i, 1) and W(i, 2) in
When the number of steps is 100, the distribution of weights is shown in
It can be seen from
Group | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
---|---|---|---|---|---|---|---|---|---|---
Party A | 0.5501 | 0.5113 | 0.5069 | 0.5001 | 0.6017 | 0.5298 | 0.5000 | 0.4961 | 0.5212 | 0.5011
Party B | 0.4499 | 0.4887 | 0.4931 | 0.4999 | 0.3983 | 0.4702 | 0.5000 | 0.5039 | 0.4788 | 0.4989
different samples. As the number of training steps increases, the distribution of neurons becomes more reasonable. However, once the number of training steps reaches a certain value, the change in the weight distribution is no longer obvious, because the weights have moved outside the neighborhood N(i*) of the winning output neuron i*, which is designated by the distance between output neurons. For example, the weight distribution after 300 training steps is similar to that after 500 steps. The weights are adjusted as follows: for each output neuron i ∈ {N(i*), i*}, the weight is updated by ω_kj(t+1) = ω_kj(t) + η(t)d_k(t) if j ∈ N(i), where d_k is the difference between the output weights at the present time and at the last time, and η(t) = η is determined from experimental experience. This rule updates only the near neighborhood of the winning output neuron.
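The training loop above can be sketched with a standard SOFM update in which the winning neuron and its near neighbours move toward the input. The one-dimensional map of 12 neurons, the learning rate, and the neighbourhood radius are illustrative assumptions, and the common form ω(t+1) = ω(t) + η(x − ω) is used for the difference term d_k:

```python
import random

random.seed(0)   # reproducible illustration

def train_sofm(samples, n_neurons=12, steps=100, eta=0.5, radius=1):
    """Train a 1-D self-organizing feature map on 2-D samples."""
    dim = len(samples[0])
    w = [[random.random() for _ in range(dim)] for _ in range(n_neurons)]
    for _ in range(steps):
        x = random.choice(samples)
        # winning output neuron i*: minimal distance to the input
        winner = min(range(n_neurons),
                     key=lambda i: sum((x[k] - w[i][k]) ** 2 for k in range(dim)))
        # update i* and its near neighbourhood N(i*)
        for i in range(max(0, winner - radius),
                       min(n_neurons, winner + radius + 1)):
            for k in range(dim):
                w[i][k] += eta * (x[k] - w[i][k])
    return w

def classify(w, x):
    """Return the index of the neuron stimulated by input x."""
    return min(range(len(w)),
               key=lambda i: sum((x[k] - w[i][k]) ** 2 for k in range(len(x))))

# A subset of the (Party A, Party B) winning-ratio pairs from the table above.
samples = [[0.5501, 0.4499], [0.5113, 0.4887], [0.6017, 0.3983],
           [0.5001, 0.4999], [0.5000, 0.5000], [0.5011, 0.4989]]
w = train_sofm(samples)
print(classify(w, [0.5, 0.5]))
```

After training, feeding the balanced ratio p = [0.5, 0.5] to `classify` reports which neuron it stimulates, i.e., which category it falls into.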
After the network training ends, the weights are fixed. Whenever a value is input, the network classifies it automatically, so the network can be tested by using this property. First, the sample vector P is input to the network for testing, and the simulation function is used to observe the network's classification of the sample data. The simulation result is Output = [3, 9, 9, 12, 1, 5, 12, 12, 10, 12]. The topology of the trained neural network is shown in
Now the winning ratio p = [0.5; 0.5] of the two parties over a certain period is input to verify which category it belongs to. The simulation result is Output = 12, which shows that the 12th neuron of the network is stimulated, so p belongs to the fourth category. Comparing the data directly, p is indeed very close to the data in groups 4, 7, and 10 of the samples.
As a classical method for analyzing the benefit relationships among multiple decision-making parties, game theory is widely used in all aspects of macro decision-making strategy and micro decision-making systems. In this paper, an intelligent command decision model was solved by exploiting the learning capability of neural networks, and the game algorithm on a neural network was established. In general, game theory plays an increasingly important role in engineering decision-making research, from the macro to the micro and from the qualitative to the quantitative. With the rise of the Internet and its various forms of decision-making, the democracy and fairness of decision-making will receive more and more attention, and the game method on neural networks is a powerful tool for solving such problems.
This work was supported by the National 973 Program (No. 613237), the Henan Province Outstanding Youth in Science and Technology Innovation program (No. 164100510017), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2014JQ7248).
Tian, Y., Min, S. and Wu, Q.E. (2018) Application of Neural Network to Game Algorithm. Journal of Computer and Communications, 6, 1-12. https://doi.org/10.4236/jcc.2018.62001