GT #5
1. All situations in which at least one agent can only act to maximize his utility through anticipating the responses to his actions by one or more other agents is called a game. Agents involved in games are referred to as players.
2. Each player in a game faces a choice among two or more possible strategies. A strategy is a predetermined 'programme of play' that tells her what actions to take in response to every possible strategy other players might use. The significance of the italicized phrase here will become clear when we take up some sample games below.
3. The next concept we need is that of a utility function.
4. Remember that an agent is, by definition, an entity with preferences. Economists describe these by means of an abstract concept called utility. This refers to how an agent ranks an object or event, so that an agent who, for example, adores the taste of pickles would be said to associate high utility with them, while an agent who can take or leave them associates a lower level of utility with them. Examples of this kind suggest that 'utility' denotes a measure of subjective psychological fulfillment, and this is indeed how the concept was generally (though not always) interpreted prior to the 1930s. During that decade, however, economists rejected the theoretical use of such unobservable entities as 'psychological fulfillment quotients.' They therefore defined utility in such a way that it became a purely technical concept. When we say that an agent acts so as to maximize her utility, we mean by 'utility' simply whatever it is that the agent's behavior suggests her to consistently desire. This could refer material goods, the happiness of others, world peace, tantric bliss, or anything.
5. Since game theory involves formal reasoning, we must have a device for thinking of utility maximization in mathematical terms. Such a device is called a utility function. The utility-map for an agent is called a 'function' because it maps ordered preferences onto the real numbers. Suppose we have three bundles over which agent X has preferences. (A 'bundle' might be: a new Toyota, a house in Camp's Bay, good sex four times a week and the establishment of democracy in Zimbabwe this year. Or anything else.) Suppose that X prefers bundle a to bundle b and bundle b to bundle c. We then map these onto a list of numbers, where the function maps the highest-ranked bundle onto the largest number in the list, the second-highest-ranked bundle onto the next-largest number in the list, and so on, thus:
bundle a ---> 3
bundle b --> 2
bundle c --> 1
The only property mapped by this function is order. The magnitudes of the numbers are irrelevant; that is, you must not read this as saying that X gets 3 times as much utility from bundle a as she gets from bundle c. Thus we could represent exactly the same utility function as that above by
bundle a --> 7,326
bundle b --> 12.6
bundle c --> -1,000,000
The numbers featuring in an ordinal utility function are thus not measuring any quantity of anything. A utility-function in which magnitudes do matter is called 'cardinal'. Whenever someone refers to a utility function without specifying which kind is meant, you should assume that it's ordinal. These are the sorts we'll need for the first set of games we'll examine. Later, when we come to seeing how to solve games that involve randomization -- our river-crossing game above, for example -- we'll need to treat utility functions as cardinal. The technique for doing this was given by von Neumann & Morgenstern (1947), and was an essential aspect of their invention of game theory. For the moment, however, we will need only ordinal functions.
6. A crucial aspect of the specification of a game involves the information that players have when they choose strategies. The simplest games are those in which agents have perfect information, meaning that at every point where each agent's strategy tells her to take an action, she knows everything that has happened in the game up to that point. A board-game of sequential moves in which in which both players watch all the action (and know the rules in common), such as chess, is an instance of such a game. By contrast, the example of the bridge-crossing game illustrates a game of imperfect information, since the fugitive must choose a bridge to cross without knowing the bridge at which the pursuer has chosen to wait, and the pursuer similarly makes her decision in ignorance of the actions of her quarry. Since game theory is about rational action given the strategically significant actions of others, it should not surprise you to be told that what agents in games know, or fail to know, about each others' actions makes a considerable difference to the logic of our analyses, as we will see.
7. The difference between games of perfect and of imperfect information is closely related to (though certainly not identical with!) a distinction between ways of representing games that is based on order of play. Let us begin by distinguishing between sequential-move and simultaneous-move games in terms of information. It is natural, as a first approximation, to think of sequential-move games as being ones in which players choose their strategies one after the other, and of simultaneous-move games as ones in which players choose their strategies at the same time. This isn't quite right, however, because what is of strategic importance is not the temporal order of events per se, but whether and when players know about other players' actions relative to having to choose their own. For example, if two competing businesses are both planning marketing campaigns, one might commit to its strategy months before the other does; but if neither knows what the other has committed to or will commit to when they make their decisions, this is a simultaneous-move game. Chess, by contrast, is normally played as a sequential-move game: you see what your opponent has done before choosing your own next action.
8. Sometimes the order of play matters in a game and sometimes it doesn't. Obviously, order of play doesn't matter in simultaneous-move games. And for some games, changing them to sequential-move format doesn't make any difference. Wherever order of play is irrelevant to us, for whatever reason, we can model the game in what is called strategic or normal form, using a matrix. We'll illustrate this using the world's most famous game: the Prisoner's Dilemma.
9. The name of the Prisoner's Dilemma game is derived from the following situation typically used to exemplify it. Suppose that the police have arrested two people whom they know have committed an armed robbery together. Unfortunately, they lack enough admissible evidence to get a jury to convict. They do, however, have enough evidence to send each prisoner away for two years for theft of the getaway car. The chief inspector now makes the following offer to each prisoner: If you will confess to the robbery, implicating your partner, and she does not also confess, then you'll go free and she'll get ten years. If you both confess, you'll each get 5 years. If neither of you confess, then you'll each get two years for the auto theft.
10. Our first step in modeling your situation as a game is to represent it in terms of utility functions. Both you and your partner's utility functions are identical:
Go free --> 4
2 years --> 3
5 years --> 2
10 years --> 0
The numbers in the function above are now used to express your and your partner's payoffs in the various outcomes possible in your situation. We will refer to you as 'Player I' and to your partner as 'Player II'. Now we can represent your entire situation on a matrix; this is the strategic form of your game.
11. The players, and analysts, can predict the outcome of the PD using a mechanical procedure known as iterated elimination of strictly dominated strategies. You, as Player I, can see by examining the matrix that your payoffs in each cell of the top row are higher than your payoffs in each corresponding cell of the bottom row. Therefore, it can never be rational for you to play your bottom-row strategy, viz., refusing to confess, regardless of what your opponent does. Since your bottom-row strategy will never be played, we can simply delete the bottom row from the matrix. Now it is obvious that Player II will not refuse to confess, since his payoff from confessing in the two cells that remain is higher than his payoff from refusing. So, once again, we can delete the one-cell column on the right from the game. We now have only one cell remaining, that corresponding to the outcome brought about by mutual confession. Since the reasoning that led us to delete all other possible outcomes depended at each step only on the premise that both players are economically rational - that is, prefer higher payoffs to lower ones - there is very strong grounds for viewing joint confession as the solution to the game, the outcome on which its play must converge. You should note that the order in which strictly dominated rows and columns are deleted doesn't matter. Had we begun by deleting the right-hand column and then deleted the bottom row, we would have arrived at the same solution.
12. The PD is not a typical game in many respects. One of these respects is that all its rows and columns are either strictly dominated or strictly dominant. In any strategic-form game where this is true, iterated elimination of strictly dominated strategies is guaranteed to yield a unique solution. Later, however, we will see that for many games this condition does not apply, and then our analytic task is less straightforward.
13. Most people think there's something disturbing about the outcome of the PD. Had you both refused to confess, you'd have arrived at the lower-right outcome in which you each go to prison for only 2 years, thereby both earning higher utility than you receive when you confess. This is the most important fact about the PD, and its significance for game theory is quite general. We'll therefore return to it below when we discuss equilibrium concepts in game theory. For now, however, let us stay with our use of this particular game to illustrate the difference between strategic and extensive forms.
14. When people introduce the PD into popular discussions, you'll usually hear them say that the police inspector must lock his prisoners into separate rooms so that they can't communicate with one another. The reasoning behind this idea seems obvious: if you could communicate, you'd surely see that you're both better off refusing, and could make an agreement to do so, no? This, one presumes, would remove your conviction that you must confess because you'll otherwise be sold up the river by your partner. In fact, however, this intuition is misleading and its conclusion is false.
15. When we represent the PD as a strategic-form game, we implicitly assume that the prisoners can't attempt collusive agreement since they choose their actions simultaneously. In this case, agreement before the fact can't help. If you are convinced that your partner will stick to the bargain then you can seize the opportunity to go scot-free by confessing. Of course, you realize that the same temptation will occur to her; but in that case you again want to make sure you confess, as this is your only means of avoiding your worst outcome. Your agreement comes to naught because you have no way of enforcing it; it constitutes what game theorists call 'cheap talk'.
16. But now suppose that you do not move simultaneously. That is, suppose that one of you can choose after observing the other's action. This is the sort of situation that people who think non-communication important must have in mind. Now you can see that your partner has remained steadfast when it comes to your choice, and you need not be concerned about being suckered. However, this doesn't change anything, a point that is best made by re-representing the game in extensive form. This gives us our opportunity to introduce game-trees and the method of analysis appropriate to them.
17. First, here are definitions of some concepts that will be helpful in analyzing game-trees:
Node: A point at which a player takes an action.
Initial node: The point at which the first action in the game occurs.
Terminal node: Any node which, if reached, ends the game. Each terminal node corresponds to an outcome.
Subgame: Any set of nodes and branches descending uniquely from one node.
Payoff: an ordinal utility number assigned to a player at an outcome.
Outcome: an assignment of a set of payoffs, one to each player in the game.
Strategy: a program instructing a player which action to take at every node in the tree where she could possibly be called on to make a choice.
18. Let's suppose that you and your partner have studied the matrix above and, seeing that you're both better off in the outcome represented by the lower-right cell, have formed an agreement to cooperate. You are to commit to refusal first, at which point she will reciprocate. We will refer to a strategy of keeping the agreement as 'cooperation', and will denote it in the tree with 'C'. We will refer to a strategy of breaking the agreement as 'defection', and will denote it on the tree with 'D'. As before, you are I and your partner is II. We'll now work through the tree.
19. What happens here is that you realize that if you play C (refuse to confess) at node 1, then your partner will be able to maximize her utility by suckering you and playing D. (On the tree, this happens at node 3.) This leaves you with a payoff of 0 (ten years in prison), which you can avoid only by playing D to begin with. You therefore defect from the agreement.
20. We have thus seen that in the case of the Prisoner's Dilemma, the simultaneous and sequential versions yield the same outcome. This will often not be true, however. In particular, only finite extensive-form (sequential) games of perfect information can be solved using backward induction (Zermelo's algorithm).
21. Sometimes we must represent simultaneous moves within games that are otherwise sequential. In all such cases the game as a whole will be one of imperfect information, so we won't be able to solve it using Zermelo's algorithm. We represent such games using the device of information sets.
22. The oval drawn around nodes b and c indicates that they lie within a common information set. This means that at these nodes players cannot infer back up the path from whence they came; II does not know, in choosing her strategy, whether she is at b or c. For this reason, what properly bear numbers in extensive-form games are information sets, conceived as 'action points', rather than nodes themselves; this is why the nodes inside the oval are labelled with letters rather than numbers. Put another way, II, when choosing, does not know what I has done at node a. But this is just what defines two moves as simultaneous. We can thus see that the method of representing games as trees is entirely general. If no node after the initial node is alone in an information set on its tree, so that the game has only one subgame (itself), then the whole game is one of simultaneous play. If at least one node shares its information set with another, while others are alone, the game involves both simultaneous and sequential play, and so is still a game of imperfect information. Only if all information sets are inhabited by just one node do we have a game of perfect information.
23. What we've called the 'solution' to the PD is the unique Nash equilibrium of the game. (The 'Nash' here refers to John Nash, the Nobel Prize-winning mathematician who in Nash (1950) did most to extend and generalize von Neumann & Morgenstern's pioneering work.) Nash equilibrium (NE) applies (or fails to apply, as the case may be) to whole sets of strategies, one for each player in a game. (Thus, to speak properly, you must say of a game that it's in NE; of a whole set of strategies, one per player, that the set is an NE; and of some one particular strategy that it is a NE strategy.) A set of strategies is a NE just in case no player could improve her payoff, given the strategies of all other players in the game, by changing her strategy. Notice how closely this idea is related to the idea of strict dominance: no strategy could be a NE strategy if it is strictly dominated. Therefore, if iterative elimination of strictly dominated strategies takes us to a unique outcome, we know we have found the game's unique NE. Now, almost all theorists agree that avoidance of strictly dominated strategies is a minimum requirement of rationality. This implies that if a game has an outcome that is a unique NE, as in the case of joint confession in the PD, that must be its unique solution. When we solve an extensive-form game by backward induction, we get a solution even stronger than a mere NE: we get the special NE that is the game's subgame-perfect equilibrium, which means that the strategy set is a NE for the whole game and also for each of its subgames. Sometimes there might be more than one NE, but there can never be more than one SPE.
24. To cement these concepts, we'll now work through a game with more interesting structure than the PD, in both extensive and strategic forms, and find its NE.
25. Note that in this game, as in the PD, we get an outcome that might be regretted from the social point of view. Player I would be better off, and Player II no worse off, at the left-hand node emanating from node 7 than at the SPE outcome. But Player I's rationality, and Player II's awareness of this, blocks the socially efficient outcome. If our players wish to bring about the more equitable outcome (4,5) here, they must do so by redesigning their institutions so as to change the structures of the games they play. Merely wishing that the world could be better doesn't make it so.
26. Here is where many people get confused and imagine that 'morality' could come to the rescue. Surely, the players might be able to just see that outcome (4,5) is socially and morally superior; and since we know they can also see the path of actions that leads to it, who is the game theorist to announce that, within the game they're playing, it's unattainable? Surely, you may think, it simply results from a combination of selfishness and paranoia on the part of the players. To begin with they have no regard for the social good, and then they shoot themselves in the feet by being too untrustworthy to respect agreements.
27. However, this reasoning rests on a misunderstanding of what a game says as a description of a situation. A utility function for a player is supposed to represent everything that player cares about, which may be anything at all. As we have described the situation of our prisoners they do indeed care only about their own relative prison sentences, but there is nothing essential in this. What makes a game an instance of the PD is strictly and only its payoff structure. Thus we could have two Mother Theresa types here, both of whom care little for themselves and wish only to feed starving children. But suppose the original Mother Theresa wishes to feed the children of Calcutta while Mother Juanita wishes to feed the children of Bogota. And suppose that the international aid agency will maximize its donation if the two saints nominate the same city, will give the second-highest amount if they nominate each others' cities, and the lowest amount if they each nominate their own city. Our saints are in a PD here, though hardly selfish or unconcerned with the social good.
28. To return to our prisoners, suppose that, contrary to our assumptions, they do value each other's well-being as well as their own. In that case, this must be reflected in their utility functions, and hence in their payoffs. If their payoff structures are changed, they will no longer be in a PD. But all this shows is that not every possible situation is a PD; it does not show that the threat of inefficient outcomes is a special artifact of selfishness. It is the logic of the prisoners' situation, not their psychology, that traps them in the inefficient outcome, and if that really is their situation then they are stuck in it (barring further complications to be discussed below). Agents who wish to avoid inefficient outcomes are best advised to prevent certain games from arising; the defender of the possibility of 'moral rescue' is really proposing that they try to dig themselves out of such games by turning themselves into different kinds of agents.
29. In general, then, a game is partly defined by the payoffs assigned to the players. If a proposed solution involves tacitly changing these payoffs, then this 'solution' is in fact a disguised way of changing the subject.