Prisoners' Dilemma Supergame on Rectangle Lattice

doi:10.4236/ojapps.2013.31B1002

Paper Menu >>

Journal Menu >>

Open Journal of Applied Sciences, 2013, 3, 7-11

doi:10.4236/ojapps.2013.31B1002 Published Online April 2013 (http://www.scirp.org/journal/ojapps)

Prisoners' Dilemma Supergame on Rectangle Lattice

Zhongxing Ye1,2, Jingshu Chen3

1School of Business Information Management, Shanghai Institute of Foreign Trade, Shanghai, China

2Department of Mathematics, Shanghai Jiao Tong University, Shanghai, China

3Department of Management Science and Engineering, Stanford University, Stanford, USA

Email: yezx@shift.edu.cn, jingshuc@stanford.edu

Received 2013

ABSTRACT

In this paper a class of large supergames, i.e., infinitely repeated games played by many players are studied. The players

located on the vertex set of planar rectangle lattice play several basic games with his neighbors. The basic game is

two-person prisoners’ dilemma game with asymmetric payoffs. Under the conditions of the pre-specified updating rules

and the transition probabilities, the relevant stochastic process of strategy evolution forms a Markovian process. The

simulation results about the long-run behavior are provided.

Keywords: Prisoners’ Dilemma; Supergame; Planar Rectangle Lattice; Markov Process; Invariant Measure;

Equilibrium

1. Introduction

In a series of our previous works we have investigated a

class of large Ising-type supergames, i.e., infinitely re-

peated games played by (infinitely) many players located

on networks. In these stylized class of supergames, game

players are located on the vertex set of planar rectangle

lattice or trees (see [1,2]). Each player plays several basic

games with his neighbors. The basic game is two-person

Ising-type game with symmetric payoffs. Under the con-

ditions of the pre-specified updating rules and the transi-

tion probabilities i.e., these relevant stochastic process of

strategy configuration given, the formula of invariant

measures which represent the long-run equilibrium plays

are obtained. The phase transition phenomena are discov-

ered.

In this work we extend our previous work to prisoners’

dilemma game with asymmetric payoffs in stead of

Ising-type game with symmetric payoffs. There has been

many research works on prisoner’s dilemma games (see

[3]) and for evolutionary prisoner’s dilemma games (for

example, see [4,5]). In this work, we assume that each

player plays several two person prisoners’ dilemma

games only with her neighbors, in each and every period

of discrete times. Players change their strategy simulta-

neously at every period of time. In section 2, we offer the

structures of planar rectangle lattices and the formulation

of the class of prisoners’ dilemma supergames by offer-

ing the ingredients needed. In section 3, we study a spe-

cial dynamic supergame with basic two person prisoners’

dilemma game with asymmetric payoffs. The theoretic

analysis is difficult. So we provide simulated results to

pursue the limiting behavior of the dynamics. Section 4

is the conclusion, in which some further research direc-

tions are mentioned.

2. Supergames on Planar Rectangle Lattices

Based on Basic Two Person Prisoners’

Dilemma Games

The players: We assume that players are located on the

vertex sites of a planar rectangle lattice (see Figure 1)

(, )GVE



, where V is the vertex set and E the edge set

of the lattice. We also assume that all the players are

identical.

Neighborhood: A neighborhood system

is a collection of nonempty subsets of the vertices of V

such that (i) i does not belong to for all i

{, }

NNiV



; (ii)



if and only if ,



for all is called

the neighborhood of i. We define the set

,;Vi

{}

WN i



.

In this work we only consider the following structure of

neighborhood. The structure of neighboring sites of the

origin (0,0)o



is given by

Figure 1. Planar rectangle lattice and neighborhood struc-

ture.

Z. X. YE, J. S. Chen

{ (1,0),(-1,0),(0,1),(0,-1)}

N

(see Figure 1). This means that each vertex has 4 nearest

neighboring vertices.

Prisoners’ dilemma game: The basic game we study in

this work is two person prisoners’ dilemma game (PDG)

which is defined in its classic form: This game is played

by two players. We assume that each player has only two

choices of strategies which may be identified as {, }



Where C represents to cooperate and D represents defect.

At any run of dynamic games, if both players choose C,

they get a pay-off R each; if one player chooses D while

the other chooses C, the defector player gets the biggest

pay-off T, while the other gets S; if both players defect,

they get pay-off P. We can write the payoff in the matrix

form:

(, )(,)(,)(,)

(,)(, )(,)(,)

QCC QCDRRST

QDCQDDTSPP









Q





(1)

where the pay-off values must satisfy the inequalities

2.TRPS andRST 

For more about PDG, readers may refer [3-5].

Stage games: In our class of supergames, some stage

games are played over discrete time At

each discrete time every player plays four 2-strategy 2-

person prisoners’ dilemma games simultaneously with

his neighbors. At the end of each game, player receives

payoff

if he plays strategy y while his

neighbor plays strategy

{0,1,2,}.i

(, )

ij j

Qyx

; so his total payoff from

playing strategy y is the sum of the payoffs received from

playing y against each of his neighbors. Then player i

may revise his strategy from y to z with probability

(|(,))

exp( ,)(,)

|| ||

exp( ,)

'||||

ij jijj

ij j

pz iy

Qzx Qyx

Qzx























































(2)

where



and '



are normalization factors to make

(| (,))1

pz iy





x (3)

The global updating rule is synchronous, i.e., all play-

ers change their strategies simultaneously at the same

time.

The dynamics of a supergame is characterized by a

stochastic process which is called strategy evolution

process (SEP). Technically, the SEP for a large super-

game is a Markov chain whose state at time t is denoted

by ,

{;

tti}

iVX. It takes value over

{, }

CD 

{; }

tti



x is the realization of t

. Equivalently

he state of SEP at time y a probability

distribution t

we may model tt b



on V

. Suppose that the configuration



x determs theategy of player i at time t with

ability (called local transition probability):

,1 ,1,

(| )(|:}

itititi tji

pxpxxj W



ine

prob

str



x (4)

Note that

,1,

|;)1

iti tji

xjWfor alliV



(p x







(5)

Let be the global one-step transition prob-

)jW

()Py|x

abilities from x toy. Then for Synchronous updating

rule, the global transition probabilities of the SEP are

defined by

1,1,

() (|:

ttititj i

pxx









x|x (6)

The global transition probabilities (6) defines a dis-

crete-time Markov process on the configuration space

. Given a measure 1t





on the configuration 1t



defines a probability msure 1

(6) ea



P on t

()( ))ddP





11 1

( |

tttttt

 

xx (7)

We say that a measure



is stationary

or time in-

riant if



P. We are interested in existence and

uniquenessf the invariant measures under the above

mentioned condition, i.e., the ergodicity and reversibility

of the SEP. In certain cases there may exist multiple in-

variant measures. This phenomenon is called phase tran-

sition. The following result is well known.

Theorem 3.1: The invariant measures

for the time

ous updating case, to find the invari-

3. Simulation Result of the Limiting

Ine report the simulation results for exam-



We consider the finite sub-lattice with

olution form a nonempty convex set.

Proof: see [6].

For the synchron

t measure analytically is quite difficult. So in this pa-

per we focus on the numerical simulation which shows

interesting behavior. The detail is given in the next sec-

tion.

Behavior

this section w

ining the limiting behavior of dynamic supergames de-

scribed in the previous section. For convenience, we set

the following payoff matrix

(1,1



) (8,0)

(0, 8)(5, 5)









40 40



verti-

ce The los and 3 different boundary conditions. cal and

global transition probabilities are given by (2) and (6),

respectively. We use white color to represent D strategy

state, and black color the C strategy state. For different

which is called configuration space of the SEP at time t.

Z. X. YE, J. S. Chen 9

value of parameter



, we simulate the dynamics of

evolution 500 steps. r simplicity, we only report the

following three cases. The dynamic percentage of write

vertices and the final state are provided via graphs.

1) Case 1. 0





The change of every player’s strategy is independent

of the history of his own and his neighbors’ past strate-

gies. So this case is not of interest.

2) Case 2. 1





i) The whitdae bounry conditions: (Figure 2)

ii) The black boundary conditions: (Figure 3)

iii) The periodic boundary conditions: (Figure

3) Case 3. 3





i) White boy condundaritions: (Figure 5)

ii) Black boundary conditions: (Figure 6)

iii) Periodic boundary conditions: (Figure

(a)

(b)

Figure 2. (a)



1

ercen

case white boundary conditions: ith w

the dynamic ptage of write vertices; (b)





1 case

with white boundary conditions: the final coation nfigur

(a)

(b)

Figure 3. (a)





ercen

case wlack boundary conditions: ith b

the dynamic ptage of write vertices; (b)





1 case

with black boundary conditions: the final configuration

after 500 steps of evolution.

Z. X. YE, J. S. Chen

(a)

(b)

Figure 4. (a)



1

mic p

case w periodic boundary co-ithndi

tions: the dynaercentage of write vertices; (b)





case with periodic boundary conditions: the final configura-

tion after 500 steps of evolution.

(a)

(b)

Figure 5. (a)





ercen

case white boundary conditins: ith wo

the dynamic ptage of write vertices; (b)





3 case

with white boundary conditions: the final configuration

after 500 steps of evolution.

(a)

(b)

Figure 6. (a)





ercen

case wlack boundary conditios: ith bn

the dynamic ptage of write vertices; (b)





3 case

with black boundary conditions: the final configuration

after 500 steps of evolution.

Z. X. YE, J. S. Chen

From the above simulation result, we can see that for

small value of



), the dynamic percentage of

white vertices approaches to 50%, while for large value



) the dynamic percentage of white vertices

approaches to 100%. In other word, the final configura-

tion of SEP will approach to the sole state with all white

color. This means that all players approach to defect. The

greater the



, the faster the convergence. So we con-

jecture that there exists a critical value of



denoted by



such that for 0c



 and c



 the limiting

behavior of the dynamic supergames areifferent. For



 cases, the dynamic percentages of white ver-

tices approache to 50%. While for c





 cases, the

dynamic percentages of white vertices approache to

100%. To find the exact value of c



is also of interest.

But the problem to prove these confirmations and con-

jecture theoretically remains open.

(a)

(b)

Figure 7. (a)



3

ercen

case with periodic boundary conditions:

the dynamic ptage of te vertices; (b) wri





3 case

with periodic boundary conditions: the final configuration

after 500 steps of evolution.

4. Conclusions

We study the dynam

rectangle lattice and

ic prisoners’ dilemma supergame on

provide some interesting simulation

ledgements

thank the National Natural

a (Program No. 11171215)

[1] A. J. Wang, Z “On the Long-run

Equilibria of ame on Z^d,” Jour-

erence on Pro-

On,” World Scientific Publishing

a Game on Highly Clustered Commu-

results and some conjucture. The theoretic analysis re-

mains open. Other types of supergames on different lat-

tices and based on other basic games will be reported

elsewhere.

5. Acknow

The authors would like to

Science Foundation of Chin

and Shanghai 085 Project for financial supports.

REFERENCES

. X. Ye and Y. C. Liu,

Class of Large Super Ga

nal of Mathematical Sciences: Advances and Applications,

Vol. 12, No. 1, 2012, pp. 21-46.

[2] H. Ou and Z. X. Ye, “Dynamic Supergames on Trees,”

Proceeding of 2010 International Conf

gress in Informatics and Computing(PIC-2010), Shanghai,

December 10-12, 2010.

[3] G. Kendall, X. Yao, S. Y. Chong, “The Iterated Prison

ers’ Dilemma: 20 Years

Co Pte Ltd, 2007.

[4] Y. K. Liu, Z. Li, X. J. Chen and L. Wang, “Evolutionary

Prisoners’ Dilemm

nity Networks,” Chinese Physics, Vol. 18, No. 7, 2009,

pp. 2623-2628. doi：10.1088/1 674-1056/ 18/7/001

[5] M. A. Saif and P. M. Gade, “Prisoner’s Dilemma with

Semi- synchronous Updates: Evidence for a First Order

,” American Mathe-

Phase Transition,” Vol. 6, 2009.

[6] R. Kinderman and J. L. Snell, “Markov Random

Fields and Their Applicarions

matical Society (AMS), Providence Rhode Island,

1980. doi：10 .1090/conm/001