GaeBlogX Arista Networks, Software-Defined Networking

Matrix Game Theory

2017-09-10
Yanxi Chen

Matrix Game: Two players each make a choice secretly and play simultaneously, and each outcome has a payoff.

Zero Sum Game

Saddle Point

An entry $x$ at $(X,Y)$ (row $X$, column $Y$) is a saddle point if it is the largest value in its column and the smallest value in its row.

Thm: All saddle points have the same value and appear as the corners of a rectangle.
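The saddle point definition can be checked mechanically. Below is a minimal sketch, assuming the matrix holds the row player's payoffs as a list of rows; the function name `saddle_points` and the 4×4 matrix are made up for illustration.

```python
def saddle_points(A):
    """Return (row, col) index pairs of every saddle point of payoff matrix A.

    An entry qualifies when it is the smallest value in its row
    and the largest value in its column.
    """
    row_mins = [min(row) for row in A]
    col_maxs = [max(col) for col in zip(*A)]
    return [(i, j)
            for i, row in enumerate(A)
            for j, v in enumerate(row)
            if v == row_mins[i] and v == col_maxs[j]]


# Hypothetical 4x4 game: four saddle points, all with value 2,
# sitting at the corners of a rectangle (rows 0 and 2, columns 1 and 3).
example = [[4, 2, 5, 2],
           [2, 1, -1, -20],
           [3, 2, 4, 2],
           [-16, 0, 16, 1]]
print(saddle_points(example))  # [(0, 1), (0, 3), (2, 1), (2, 3)]
```

An empty result means the game has no saddle point, so a mixed strategy is needed.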

2$\times$2 Game

All examples below are between two players, Colin and Rose. Rose picks the row and Colin picks the column. Because it is a zero sum game, we only write Rose’s payoff, and Colin’s payoff is just the negative.

Example:

  A B
A 2 -3
B -1 3

If Colin plays $\frac{2}{3}A,\frac{1}{3}B$:

Rose A average payout is $2\times\frac{2}{3}-3\times\frac{1}{3}=\frac{1}{3}$

Rose B average payout is $-1\times\frac{2}{3}+3\times\frac{1}{3}=\frac{1}{3}$

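Colin playing $\frac{2}{3}A,\frac{1}{3}B$ minimizes Rose’s maximum average payout; similarly, Rose wants to maximize her minimum average payout. So if Rose plays $yA,(1-y)B$, her average payout is $2y-(1-y)$ when Colin plays A and $-3y+3(1-y)$ when Colin plays B.

So she wants $2y-(1-y)=-3y+3(1-y)$, which gives $y=\frac{4}{9}$.

So the average payout is $2y-(1-y)=\frac{1}{3}$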

Note: If there is a saddle point in the game, then it can’t be solved this way; the saddle point itself is the solution.

For

a b
c d

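with no dominant row/column, let Rose play $yA,(1-y)B$. Equalizing her average payout against Colin’s two choices, $ay+c(1-y)=by+d(1-y)$, gives $y=\frac{d-c}{a-b-c+d}$.

So for Colin, whose probability of playing A is $x$: $ax+b(1-x)=cx+d(1-x)$ gives $x=\frac{d-b}{a-b-c+d}$.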
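The equalizing argument for a general 2×2 game can be sketched in Python. This is only an illustration: the function name `solve_2x2` is made up, the entries are laid out as rows `a b` and `c d` of Rose's payoffs, and the closed forms in the comments come from setting the two average payouts equal, assuming the game has no saddle point.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d):
    """Mixed solution of the zero-sum game with rows (a, b) and (c, d).

    Returns (y, x, value), where y is the probability that Rose plays row A,
    x is the probability that Colin plays column A, and value is Rose's
    average payout. The function does not check for a saddle point.
    """
    denom = Fraction(a - b - c + d)
    if denom == 0:
        raise ValueError("a - b - c + d is zero; equalizing does not apply")
    y = (d - c) / denom              # from a*y + c*(1-y) = b*y + d*(1-y)
    x = (d - b) / denom              # from a*x + b*(1-x) = c*x + d*(1-x)
    value = (a * d - b * c) / denom  # common average payout
    return y, x, value


# The example above: rows (2, -3) and (-1, 3).
y, x, value = solve_2x2(2, -3, -1, 3)
print(y, x, value)  # 4/9 2/3 1/3
```

Using `Fraction` keeps the arithmetic exact, so the result matches the hand computation instead of a rounded float.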

$m\times n$ Zero Sum Game

Von Neumann’s Minimax Thm: Every $m\times n$ zero sum game has a solution. Each player has a distribution over their choices such that Rose’s minimum average payout is maximized, the maximum average payout Colin concedes is minimized, and the two payouts are the same.

The proof is omitted, not because I am lazy but as an exercise (fact!). ¯\_(ツ)_/¯

Some Other Lemma and Prop

Let $A$ be an $m\times n$ matrix. $p$ is the strategy vector for Rose, and $q$ is the one for Colin, the column player.

Lemma:

Prop: $p,q$ are strategies for Rose, Colin such that

So how do we find the minimax/maximin? Use linear programming, which is not interesting at all, so we will just skip that.
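For a quick numerical check without a linear programming solver, one alternative is fictitious play. To be clear, this is not the linear programming method mentioned above; it is a different, approximate technique. Each player repeatedly best-responds to the other's past choices, and the empirical mixes bracket the value of the game. The function name `fictitious_play` is made up, and the sketch assumes integer payoffs.

```python
from fractions import Fraction


def fictitious_play(A, rounds=1000):
    """Bracket the value of the zero-sum game A (Rose picks rows and maximizes).

    Returns (lower, upper, row_counts, col_counts). The value of the game
    always lies between lower and upper. Assumes integer entries in A.
    """
    m, n = len(A), len(A[0])
    row_counts = [0] * m   # how often Rose played each row
    col_counts = [0] * n   # how often Colin played each column
    row_totals = [0] * m   # payoff of each row summed over Colin's past plays
    col_totals = [0] * n   # payoff of each column summed over Rose's past plays
    i, j = 0, 0            # arbitrary opening moves
    for _ in range(rounds):
        row_counts[i] += 1
        col_counts[j] += 1
        for c in range(n):
            col_totals[c] += A[i][c]
        for r in range(m):
            row_totals[r] += A[r][j]
        # Each player best-responds to the opponent's history so far.
        i = max(range(m), key=lambda r: row_totals[r])
        j = min(range(n), key=lambda c: col_totals[c])
    lower = Fraction(min(col_totals), rounds)  # guaranteed by Rose's empirical mix
    upper = Fraction(max(row_totals), rounds)  # most Colin's empirical mix concedes
    return lower, upper, row_counts, col_counts


lower, upper, rows, cols = fictitious_play([[2, -3], [-1, 3]], rounds=2000)
print(float(lower), float(upper))  # the two bounds bracket the value 1/3
```

The bounds tighten slowly as `rounds` grows, so this is only useful as a sanity check; an exact solution still needs linear programming.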

Non-zero Sum Game

All examples below are between two players, Colin and Rose, where Colin picks the column and Rose picks the row. Each cell lists Rose’s payoff first and Colin’s second; i.e. for $(X,Y)$, Rose has $X$ and Colin has $Y$.

Let’s start with a simple example

(-1,-1) (-10, 0)
(0, -10) (-5, -5)

Obviously, the point (-5, -5) is the pure Nash equilibrium.
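To see why, one can check every cell for profitable deviations. The sketch below assumes each cell is read as (Rose's payoff, Colin's payoff), with Rose choosing the row and Colin the column, which is the reading under which (-5, -5) is the equilibrium. The function name `pure_nash` is made up for illustration.

```python
def pure_nash(game):
    """Return (row, col) indices of all pure Nash equilibria.

    game[i][j] is a pair (rose_payoff, colin_payoff) for row i, column j.
    A cell is an equilibrium when neither player gains by deviating alone.
    """
    m, n = len(game), len(game[0])
    result = []
    for i in range(m):
        for j in range(n):
            rose, colin = game[i][j]
            # Rose can only change the row; Colin can only change the column.
            rose_ok = all(game[k][j][0] <= rose for k in range(m))
            colin_ok = all(game[i][k][1] <= colin for k in range(n))
            if rose_ok and colin_ok:
                result.append((i, j))
    return result


game = [[(-1, -1), (-10, 0)],
        [(0, -10), (-5, -5)]]
print(pure_nash(game))  # [(1, 1)], the cell (-5, -5)
```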

Mixed Strategy

A mixed strategy is just a vector of probabilities, one for each pure strategy.

Let $R$ and $C$ be the payoff matrices of Rose and Colin. If $q$ is a mixed strategy for Colin, a best response for Rose is any $p$ with $p_i=0$ whenever $(Rq)_i$ is not a maximum entry of $Rq$.

A Nash Equilibrium is a pair $(p,q)$ of mixed strategies such that each is a best response to the other.

Thm: For a $2\times2$ game, $(p,q)$ is a Nash Eq if the two entries of $p^TC$ are equal and the two entries of $Rq$ are equal.
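The best-response condition translates directly into a check. The sketch below assumes $R$ and $C$ are Rose's and Colin's payoff matrices as lists of rows, and that $p$ and $q$ are exact (integers or `Fraction`s), since it compares with `==`. The name `is_nash` and the second game are made up for illustration.

```python
from fractions import Fraction


def is_nash(p, q, R, C):
    """Check that p (Rose, rows) and q (Colin, columns) are mutual best responses.

    R[i][j] and C[i][j] are Rose's and Colin's payoffs for row i, column j.
    Uses exact equality, so pass integers or Fractions rather than floats.
    """
    m, n = len(R), len(R[0])
    Rq = [sum(R[i][j] * q[j] for j in range(n)) for i in range(m)]  # Rose, per row
    pC = [sum(p[i] * C[i][j] for i in range(m)) for j in range(n)]  # Colin, per column
    # Positive probability is allowed only on entries that attain the maximum.
    rose_ok = all(p[i] == 0 or Rq[i] == max(Rq) for i in range(m))
    colin_ok = all(q[j] == 0 or pC[j] == max(pC) for j in range(n))
    return rose_ok and colin_ok


# The game above, split into Rose's and Colin's matrices.
R = [[-1, -10], [0, -5]]
C = [[-1, 0], [-10, -5]]
print(is_nash([0, 1], [0, 1], R, C))  # True: the cell (-5, -5)
print(is_nash([1, 0], [1, 0], R, C))  # False: the cell (-1, -1)

# Hypothetical coordination game where equalizing gives a mixed equilibrium.
R2 = [[2, 0], [0, 1]]
C2 = [[1, 0], [0, 2]]
p = [Fraction(2, 3), Fraction(1, 3)]
q = [Fraction(1, 3), Fraction(2, 3)]
print(is_nash(p, q, R2, C2))  # True: both entries of Rq and of p^T C equal 2/3
```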

Definitions, Axioms, and Nash’s Theorem

We know that a Nash equilibrium is not necessarily globally optimal. In the example above, the players could have picked (-1, -1), which is better for both than (-5, -5), but they end up with the worse one. Therefore, we might want someone to pick an outcome for them, instead of letting them come up with a solution themselves.

Given a game, suppose God selects an outcome $(x,y)$ for R and C that is “fair”. To be a “fair” decision, the outcome should be:

  • Pareto optimal: there is no other outcome $(x',y')$ with $x'\ge x$ and $y'\ge y$.
  • At least as good for each player as their maximin value in their own game (or they will just pick the maximin strategy).

We call the numbers in the matrix utilities, because they are not necessarily money.

Also, we can map the matrix geometrically, where each payoff pair is just a point. I won’t show it here, because it should be obvious. After placing all the points, we can draw the payoff polygon, which is the convex hull of those points. The boundary of the convex hull in the North-East direction is Pareto optimal. If the maximin solution point is inside the convex hull, then the part of that boundary lying North-East of the maximin point is called the negotiation set.

Then we have the following axioms: Any “fair” decision should be:

  1. In the negotiation set.
  2. If either player’s utilities, say Rose’s, are transformed by $g(x)=mx+n,m>0$, then the fair decision transformed in the same way is a fair decision in the new game.
  3. If the payoff polygon is symmetric about the line $x=y$, then the fair decision is on this line.
  4. Suppose $P$ is a payoff polygon with a fixed “status quo” point $SQ$, which could be the maximin or some other “fall back” point, and the fair decision for $P$ is given. Suppose $Q$ is another polygon contained in $P$ that contains both $SQ$ and that fair decision; then it is also the fair decision for $Q$.

Nash’s Theorem

Nash’s Thm:

There is only one point that satisfies all of the above axioms.

If $SQ=(x_0,y_0)$, then this point $(x,y)$ maximizes $(x-x_0)(y-y_0)$ over the points of the payoff polygon with $x\ge x_0$ and $y\ge y_0$.
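The maximization can be sketched numerically. The code below is an illustration only: the function name `nash_bargain` is made up, and it assumes the payoff polygon is the convex hull of the listed payoff points and that some point is at least as good as $SQ$ for both players. It checks the endpoints and the interior critical point of every segment between two payoff points, since the maximum lies on a hull edge.

```python
from fractions import Fraction
from itertools import combinations


def nash_bargain(points, sq):
    """Point of the convex hull of `points` maximizing (x - x0) * (y - y0).

    Only points with x >= x0 and y >= y0 are considered, where sq = (x0, y0).
    Returns None if no candidate qualifies.
    """
    x0, y0 = Fraction(sq[0]), Fraction(sq[1])
    best, best_val = None, None
    for (ax, ay), (bx, by) in combinations(points, 2):
        # Shift SQ to the origin; the segment is (u + du*t, v + dv*t), 0 <= t <= 1.
        u, v = Fraction(ax) - x0, Fraction(ay) - y0
        du, dv = Fraction(bx) - Fraction(ax), Fraction(by) - Fraction(ay)
        ts = [Fraction(0), Fraction(1)]
        if du * dv != 0:
            # Critical point of the quadratic (u + du*t) * (v + dv*t).
            t = -(u * dv + v * du) / (2 * du * dv)
            if 0 < t < 1:
                ts.append(t)
        for t in ts:
            gx, gy = u + du * t, v + dv * t
            if gx >= 0 and gy >= 0 and (best_val is None or gx * gy > best_val):
                best, best_val = (gx + x0, gy + y0), gx * gy
    return best


# The game above, with the maximin point (-5, -5) as status quo.
payoffs = [(-1, -1), (-10, 0), (0, -10), (-5, -5)]
print(nash_bargain(payoffs, (-5, -5)))  # (Fraction(-1, 1), Fraction(-1, 1))
```

For this game the result is (-1, -1), which also agrees with the symmetry axiom, since the polygon is symmetric about the line $x=y$.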

The proof is omitted, not because I am lazy but as an exercise (fact!). ¯\_(ツ)_/¯

