
Handbook of Game Theory

Peyton Young, Shmuel Zamir (eds.)

Elsevier, 2014

ISBN 9780444537676, 1024 pages


Chapter 2

Advances in Zero-Sum Dynamic Games


Rida Laraki*,†; Sylvain Sorin‡

* CNRS in LAMSADE (Université Paris-Dauphine), France
† Economics Department, École Polytechnique, France
‡ Mathematics, CNRS, IMJ-PRG, UMR 7586, Sorbonne Universités, UPMC Univ Paris 06, Univ Paris Diderot, Sorbonne Paris Cité, Paris, France

Abstract


The survey presents recent results in the theory of two-person zero-sum repeated games and their connections with differential and continuous-time games. The emphasis is on the following points:

(1) A general model allows one to deal simultaneously with stochastic and informational aspects.

(2) All evaluations of the stage payoffs can be covered in the same framework (and not only the usual Cesàro and Abel means).

(3) The model in discrete time can be seen and analyzed as a discretization of a continuous time game. Moreover, tools and ideas from repeated games are very fruitful for continuous time games and vice versa.

(4) Numerous important conjectures have been answered (some in the negative).

(5) New tools and original models have been proposed. As a consequence, the field (discrete versus continuous time, stochastic versus incomplete information models) has a much more unified structure, and research is extremely active.

Keywords

repeated games

stochastic and differential games

discrete and continuous time

Shapley operator

incomplete information

imperfect monitoring

asymptotic and uniform value

dual game

weak and strong approachability.

JEL Codes

C73

C61

C62

AMS Codes

91A15

91A23

91A25

2.1 Introduction


The theory of repeated games focuses on situations involving multistage interactions, where, at each period, the Players are facing a stage game in which their actions have two effects: they induce a stage payoff, and they affect the future of the game (note the difference with other multimove games like pursuit or stopping games where there is no stage payoff). If the stage game is a fixed zero-sum game G, repetition adds nothing: the value and optimal strategies are the same as in G. The situation, however, is drastically different for nonzero-sum games leading to a family of so-called Folk theorems: the use of plans and threats generates new equilibria (Sorin's (1992) chapter 4 in Handbook of Game Theory (HGT1)).
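As a side illustration (not part of the survey itself): since repetition adds nothing in the fixed zero-sum case, solving the repeated game reduces to solving the one-shot matrix game G. The sketch below approximates the value and optimal strategies of a matrix game by fictitious play, which is known to converge in zero-sum games (Robinson, 1951); the example matrix (matching pennies) and the iteration count are arbitrary choices for the illustration.

```python
# Fictitious play for a zero-sum matrix game (row player maximizes).
# In zero-sum games the security levels of the empirical strategies
# converge to the value of the game (Robinson, 1951).
def fictitious_play(A, steps=50000):
    m, n = len(A), len(A[0])
    row_counts = [0] * m
    col_counts = [0] * n
    i = j = 0                                  # arbitrary initial actions
    for _ in range(steps):
        row_counts[i] += 1
        col_counts[j] += 1
        # each player best-responds to the opponent's empirical frequencies
        i = max(range(m), key=lambda r: sum(col_counts[c] * A[r][c] for c in range(n)))
        j = min(range(n), key=lambda c: sum(row_counts[r] * A[r][c] for r in range(m)))
    x = [c / steps for c in row_counts]        # empirical strategy of Player 1
    y = [c / steps for c in col_counts]        # empirical strategy of Player 2
    # security levels: x guarantees at least v_lo, y concedes at most v_hi,
    # and the value of the game lies in [v_lo, v_hi]
    v_lo = min(sum(x[r] * A[r][c] for r in range(m)) for c in range(n))
    v_hi = max(sum(A[r][c] * y[c] for c in range(n)) for r in range(m))
    return x, y, v_lo, v_hi
```

For matching pennies, A = [[1, -1], [-1, 1]], the bracket [v_lo, v_hi] shrinks around the value 0 and both empirical strategies approach (1/2, 1/2).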

In this survey, we will concentrate on the zero-sum case and consider the framework where the stage game belongs to a family Gm, m ∈ M, of two-person zero-sum games played on action sets I×J. Two basic classes of repeated games that have been studied and analyzed extensively in previous volumes of HGT are stochastic games (the subject of Mertens's (2002) chapter 47 and Vieille's (2002) chapter 48 in HGT3) and incomplete information games (the subject of Zamir's (1992) chapter 5 in HGT1). The reader is referred to these chapters for a general introduction to the topic and a presentation of the fundamental results.

In stochastic games, the parameter m, which determines which game is being played, is a publicly known variable, controlled by the Players. It evolves over time and its value mn+1 at stage n+1 (called the state) is a random stationary function of the triple (in, jn, mn), where (in, jn) are the moves and mn the state at stage n. At each period, both Players share the same information and, in particular, know the current state. On the other hand, the state is changing and the issue for the Players at stage n is to control both the current payoff gn (induced by (in, jn, mn)) and the next state mn+1.
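This recursive structure underlies the Shapley operator (one of the keywords above): for a discounted evaluation, the value satisfies v(m) = val[(1−λ)g(·, m) + λE v], where the expectation is over the next state. As a hedged toy sketch, not taken from the chapter, the code below runs value iteration on an invented two-state game with 2×2 stage games, using the closed-form value of 2×2 matrix games; all payoffs, transitions, and the weight λ on the future are illustrative.

```python
# Value iteration with the (normalized) Shapley operator on a toy
# two-state stochastic game. All numbers are illustrative.

def val2x2(M):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = M
    lo = max(min(a, b), min(c, d))   # maxmin over pure rows
    hi = min(max(a, c), max(b, d))   # minmax over pure columns
    if lo == hi:                     # pure saddle point
        return lo
    return (a * d - b * c) / (a + d - b - c)  # mixed-strategy value

# payoff[m][i][j]: stage payoff in state m; p0[m][i][j]: prob. next state is 0
payoff = [[[3.0, 0.0], [0.0, 2.0]],
          [[0.0, 1.0], [1.0, 0.0]]]
p0 = [[[1.0, 0.0], [0.0, 1.0]],
      [[0.5, 0.5], [0.5, 0.5]]]

def shapley_iterate(lam=0.8, tol=1e-10):
    """Iterate v <- val[(1-lam) g + lam E v]; a lam-contraction, so it converges."""
    v = [0.0, 0.0]
    while True:
        new = []
        for m in range(2):
            M = [[(1 - lam) * payoff[m][i][j]
                  + lam * (p0[m][i][j] * v[0] + (1 - p0[m][i][j]) * v[1])
                  for j in range(2)] for i in range(2)]
            new.append(val2x2(M))
        if max(abs(new[m] - v[m]) for m in range(2)) < tol:
            return new
        v = new
```

The normalized values stay within the range of the stage payoffs, and state 0 (with the larger payoffs) is worth more to Player 1 than state 1.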

In incomplete information games, the parameter m is chosen once and for all by nature and kept fixed during play, but at least one Player does not have full information about it. In this situation, the issue is the trade-off between using the information (which increases the set of strategies in the stage game) and revealing it (this decreases the potential advantage for the future).

We will see that these two apparently very different models—evolving known state versus unknown fixed state—are particular incarnations of a more general model and share many common properties.

2.1.1 General model of repeated games (RG)


The general presentation of this section follows Mertens et al. (1994). To make it more accessible, we will assume that all sets (of actions, states, and signals) are finite; in the general case, measurable and/or topological hypotheses are in order, but we will not treat such issues here. Some theorems will be stated with compact action spaces. In that case, payoff and transition functions are assumed to be continuous.

Let M be a parameter space and g a function from I×J×M to ℝ. For every m ∈ M, this defines a two-Player zero-sum game with action sets I and J for Player 1 (the maximizer) and Player 2, respectively, and with payoff function g(‧, m). The initial parameter m1 is chosen at random and the Players receive some initial information about it, say a1 (resp. b1) for Player 1 (resp. Player 2). This choice of nature is performed according to some initial probability distribution π on A×B×M, where A and B are the signal sets of each Player. The game is then played in discrete time.

At each stage n = 1,2,…, Player 1 (resp. Player 2) chooses an action in ∈ I (resp. jn ∈ J). This determines a stage payoff gn = g(in, jn, mn), where mn is the current value of the state parameter. Then, a new value mn+1 of the parameter is selected and the Players get some information about it. This is generated by a map Q from I×J×M to the set of probability distributions on A×B×M. More precisely, at stage n+1, a triple (an+1, bn+1, mn+1) is chosen according to the distribution Q(in, jn, mn) and an+1 (resp. bn+1) is transmitted to Player 1 (resp. Player 2).

Note that each signal may reveal some information about the previous choice of actions (in, jn) and/or past and current values (mn and mn+1) of the parameter:

Stochastic games (with standard signaling: perfect monitoring) (Mertens, 2002) correspond to public signals including the parameter: an+1 = bn+1 = {in, jn, mn+1}.

Incomplete information repeated games (with standard signaling) (Zamir, 1992) correspond to an absorbing transition on the parameter (mn = m1 for every n) and no further information (after the initial one) on the parameter, but previous actions are observed: an+1 = bn+1 = {in, jn}.

A play of the game induces a sequence m1, a1, b1, i1, j1, m2, a2, b2, i2, j2,…, while the information of Player 1 before his move at stage n is a private history of the form (a1, i1, a2, i2, …, an), and similarly for Player 2. The corresponding sequence of payoffs is g1, g2,…, and it is not known to the Players (unless it can be deduced from the signals).

A (behavioral) strategy σ for Player 1 is a map from Player 1's private histories to Δ(I), the space of probability distributions on the set of actions I: in this way, σ defines the probability distribution of the current stage action as a function of the past events known to Player 1. A behavioral strategy τ for Player 2 is defined similarly. Together with the components of the game, π and Q, a pair (σ, τ) of behavioral strategies induces a probability distribution on plays, and hence on the sequence of payoffs. Eσ,τ stands for the corresponding expectation.
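Since π, Q, g and a strategy pair fully determine the distribution over plays, expected payoffs can be approximated by simulation. The sketch below is a minimal illustration under the finiteness assumptions of this section; the encoding of signals, states, and histories as Python tuples is an arbitrary choice made for the example, not a construction from the chapter.

```python
import random

def simulate_play(pi, Q, g, sigma, tau, stages, runs=2000, seed=0):
    """Monte Carlo estimate of the expected average payoff under (sigma, tau).

    pi: list of ((a1, b1, m1), prob) pairs -- the initial distribution.
    Q(i, j, m): list of ((a', b', m'), prob) pairs -- signals and next state.
    g(i, j, m): stage payoff.
    sigma / tau: map a private history (tuple) to a distribution over actions.
    """
    rng = random.Random(seed)
    grand = 0.0
    for _ in range(runs):
        outcomes, weights = zip(*pi)          # nature draws (a1, b1, m1) from pi
        a, b, m = rng.choices(outcomes, weights=weights)[0]
        h1, h2 = [a], [b]                     # private histories of the Players
        total = 0.0
        for _ in range(stages):
            p1 = sigma(tuple(h1))             # distribution over Player 1's actions
            p2 = tau(tuple(h2))
            i = rng.choices(range(len(p1)), weights=p1)[0]
            j = rng.choices(range(len(p2)), weights=p2)[0]
            total += g(i, j, m)
            outs, ws = zip(*Q(i, j, m))       # draw signals and the next state
            a, b, m = rng.choices(outs, weights=ws)[0]
            h1 += [i, a]                      # each Player sees his own action
            h2 += [j, b]                      # and his own signal
        grand += total / stages
    return grand / runs
```

For instance, a fixed state with perfect monitoring (Q returns the action pair as the public signal) and uniform stage strategies in matching pennies gives an average payoff near 0.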

2.1.2 Compact evaluations


Once the description of the repeated game is specified, strategy sets are well defined as well as the play (or the distribution on plays) that they...