Z. X. YE, J. S. Chen
8
{ (1,0),(-1,0),(0,1),(0,-1)}
o
N
(see Figure 1). This means that each vertex has 4 nearest
neighboring vertices.
Prisoners’ dilemma game: The basic game we study in
this work is two person prisoners’ dilemma game (PDG)
which is defined in its classic form: This game is played
by two players. We assume that each player has only two
choices of strategies which may be identified as {, }
CD
.
Where C represents to cooperate and D represents defect.
At any run of dynamic games, if both players choose C,
they get a pay-off R each; if one player chooses D while
the other chooses C, the defector player gets the biggest
pay-off T, while the other gets S; if both players defect,
they get pay-off P. We can write the payoff in the matrix
form:
(, )(,)(,)(,)
(,)(, )(,)(,)
QCC QCDRRST
QDCQDDTSPP
Q
(1)
where the pay-off values must satisfy the inequalities
2.TRPS andRST
For more about PDG, readers may refer [3-5].
Stage games: In our class of supergames, some stage
games are played over discrete time At
each discrete time every player plays four 2-strategy 2-
person prisoners’ dilemma games simultaneously with
his neighbors. At the end of each game, player receives
payoff
if he plays strategy y while his
neighbor plays strategy
{0,1,2,}.i
i
(, )
ij j
Qyx
j
; so his total payoff from
playing strategy y is the sum of the payoffs received from
playing y against each of his neighbors. Then player i
may revise his strategy from y to z with probability
(|(,))
11
exp( ,)(,)
|| ||
11
exp( ,)
'||||
i
i
i
ij jijj
jN
i
ij j
jN
i
pz iy
Qzx Qyx
N
Qzx
N
x
(2)
where
and '
are normalization factors to make
(| (,))1
i
zA
pz iy
x (3)
The global updating rule is synchronous, i.e., all play-
ers change their strategies simultaneously at the same
time.
The dynamics of a supergame is characterized by a
stochastic process which is called strategy evolution
process (SEP). Technically, the SEP for a large super-
game is a Markov chain whose state at time t is denoted
by ,
{;
tti}
iVX. It takes value over
{, }
VV
t
CD
,
{; }
tti
iV
x is the realization of t
. Equivalently
he state of SEP at time y a probability
distribution t
we may model tt b
on V
. Suppose that the configuration
1t
x determs theategy of player i at time t with
ability (called local transition probability):
,1 ,1,
(| )(|:}
itititi tji
pxpxxj W
ine
prob
str
x (4)
Note that
,
,1,
|;)1
ti
iti tji
x
xjWfor alliV
(p x
(5)
Let be the global one-step transition prob-
)jW
()Py|x
abilities from x toy. Then for Synchronous updating
rule, the global transition probabilities of the SEP are
defined by
P
1,1,
() (|:
ttititj i
iV
pxx
x|x (6)
The global transition probabilities (6) defines a dis-
crete-time Markov process on the configuration space
V
. Given a measure 1t
on the configuration 1t
x
defines a probability msure 1
=
tt
(6) ea
P on t
x.
()( ))ddP
xx
11 1
( |
tttttt
d
xx (7)
We say that a measure
is stationary
va
or time in-
riant if
=
P. We are interested in existence and
uniquenessf the invariant measures under the above
mentioned condition, i.e., the ergodicity and reversibility
of the SEP. In certain cases there may exist multiple in-
variant measures. This phenomenon is called phase tran-
sition. The following result is well known.
Theorem 3.1: The invariant measures
o
for the time
ev
ous updating case, to find the invari-
an
3. Simulation Result of the Limiting
Ine report the simulation results for exam-
We consider the finite sub-lattice with
olution form a nonempty convex set.
Proof: see [6].
For the synchron
t measure analytically is quite difficult. So in this pa-
per we focus on the numerical simulation which shows
interesting behavior. The detail is given in the next sec-
tion.
Behavior
this section w
ining the limiting behavior of dynamic supergames de-
scribed in the previous section. For convenience, we set
the following payoff matrix
(1,1
) (8,0)
(0, 8)(5, 5)
Q
40 40
verti-
ce The los and 3 different boundary conditions. cal and
global transition probabilities are given by (2) and (6),
respectively. We use white color to represent D strategy
state, and black color the C strategy state. For different
which is called configuration space of the SEP at time t.
Copyright © 2013 SciRes. OJAppS