Axelrod, Robert M.;
The evolution of cooperation
Basic Books, 1984, 241 pages
ISBN 0465021220, 9780465021222
topics: | social | psychology | evolution |
The classic experiment in game-theoretic analysis of multi-agent interactions. This body of work must be among the most cited game theory work in areas ranging from statistical mechanics to behavioural economics to cognitive psychology. For example, see Matt Ridley's The Origins of Virtue: Human Instincts and the Evolution of Cooperation (1998) or Philip Ball's Critical Mass: How One Thing Leads to Another (2006). The book arose out of an experiment that Axelrod conducted in which he invited programmers to send in code for playing the Iterated Prisoner's Dilemma game. Read on...
I invited experts in game theory to submit programs for a Computer Prisoner's Dilemma tournament — much like a computer chess tournament. [p. vii] Each program would have available to it the history of the interaction so far and could use this history in making its choice of whether or not to cooperate on the current move. Entries came from game theorists in economics, psychology, sociology, political science, and mathematics. I ran the fourteen entries and a random rule against each other in a round robin tournament.

To my considerable surprise, the winner was the simplest of all the programs submitted, TIT FOR TAT. TIT FOR TAT is merely the strategy of starting with cooperation, and thereafter doing what the other player did on the previous move. [defect if it defects, else keep cooperating]

I then circulated the results and solicited entries for a second round of the tournament. This time I received sixty-two entries from six countries. Most of the contestants were computer hobbyists, but there were also professors of evolutionary biology, physics, and computer science, as well as the five disciplines represented in the first round. As in the first round, some very elaborate programs were submitted. There were also a number of attempts to improve on TIT FOR TAT itself. TIT FOR TAT was again sent in by the winner of the first round, Anatol Rapoport of the University of Toronto. Again it won.

I suspected that the properties that made TIT FOR TAT so successful in the tournaments would work in a world where any strategy was possible. If so, then cooperation based solely on reciprocity seemed possible. The tournament results were published in the Journal of Conflict Resolution (Axelrod 1980a and 1980b), and are presented here in revised form in chapter 2. The theoretical results about initial viability, robustness, and stability were published in the American Political Science Review (Axelrod 1981). These findings provide the basis for chapter 3.
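[The setup is simple enough to sketch in a few lines of Python. This is a toy reconstruction, not Axelrod's tournament code: the strategy roster, the 200-move game length, and all function names here are illustrative.]

```python
import random

# Standard one-shot PD payoffs (row, column): T=5, R=3, P=1, S=0.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate on the first move, then copy the opponent's previous move."""
    return their_history[-1] if their_history else 'C'

def always_defect(my_history, their_history):
    return 'D'

def random_rule(my_history, their_history):
    return random.choice('CD')

def play(strategy_a, strategy_b, rounds=200):
    """One iterated-PD game; returns the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Round robin: every entry meets every entry (including itself);
# the tournament winner is the rule with the highest aggregate score.
entries = [tit_for_tat, always_defect, random_rule]
totals = {s.__name__: 0 for s in entries}
for a in entries:
    for b in entries:
        score_a, _ = play(a, b)
        totals[a.__name__] += score_a
print(totals)
```

Note that TIT FOR TAT never outscores its opponent within a single game (it can never defect more often than the other player); it wins on aggregate score across the whole round robin.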
After thinking about the evolution of cooperation in a social context, I realized that the findings also had implications for biological evolution. So I collaborated with a biologist— William Hamilton—to develop the biological implications of these strategic ideas. This resulted in a paper published in Science (Axelrod and Hamilton 1981) which appears here in revised form as chapter 5.
Each senator has an incentive to appear effective to his or her constituents, even at the expense of conflicting with other senators who are trying to appear effective to their constituents. But this is hardly a situation of completely opposing interests, a zero-sum game. On the contrary, there are many opportunities for mutually rewarding activities by two senators. These mutually rewarding actions have led to the creation of an elaborate set of norms, or folkways, in the Senate. Among the most important of these is the norm of reciprocity—a folkway which involves helping out a colleague and getting repaid in kind. It includes vote trading but extends to so many types of mutually rewarding behavior that "it is not an exaggeration to say that reciprocity is a way of life in the Senate" (Matthews 1960, p. 100; see also Mayhew 1975). Washington was not always like this. Early observers saw the members of the Washington community as quite unscrupulous, unreliable, and characterized by "falsehood, deceit, treachery" (Smith 1906, p. 190). In the 1980s the practice of reciprocity is well established.
Payoff matrix (first number = payoff for ROW player):

                          Column player
                       co-op       defect
                  ----------------------------
  Row     co-op  |     3, 3        0, 5      |
  player  defect |     5, 0        1, 1      |
                  ----------------------------

  5   = Temptation to defect
  0   = Sucker's payoff
  3,3 = win-win
  1,1 = lose-lose

Dilemma: both players can see that, whether the other defects or not, each is better off defecting. But in the end, they end up at (1,1), which is not the best for either. Thus rational decisions lead to a poor outcome for both... This is the dilemma.
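[The dominance argument in the table can be checked mechanically: whatever the column player does, the row player's defection payoff is strictly higher, yet mutual defection leaves both worse off than mutual cooperation.]

```python
# One-shot PD payoffs to the row player: T=5, R=3, P=1, S=0.
payoff = {('C', 'C'): 3, ('C', 'D'): 0,
          ('D', 'C'): 5, ('D', 'D'): 1}

# Whatever the column player does, defecting pays the row player more...
for their_move in ('C', 'D'):
    assert payoff[('D', their_move)] > payoff[('C', their_move)]

# ...yet mutual defection (1,1) is worse for both than mutual cooperation (3,3).
assert payoff[('D', 'D')] < payoff[('C', 'C')]

# A PD also requires 2R > T + S: mutual cooperation beats taking turns
# exploiting each other (here 3 + 3 > 5 + 0).
assert 2 * payoff[('C', 'C')] > payoff[('D', 'C')] + payoff[('C', 'D')]
print("defection dominates, but (D,D) is worse for both than (C,C)")
```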
The most important kingmaker was based on an "outcome maximization" principle originally developed as a possible interpretation of what human subjects do in the Prisoner's Dilemma laboratory experiments (Downing 1975). This rule, called DOWNING, is particularly interesting. DOWNING will try to get away with whatever it can by defecting. On the other hand, if the other player does seem responsive, DOWNING will cooperate. To judge the other's responsiveness, DOWNING estimates the probability that the other player cooperates after it (DOWNING) cooperates, and also the probability that the other player cooperates after DOWNING defects. For each move, it updates its estimate of these two conditional probabilities and then selects the choice which will maximize its own long-term payoff under the assumption that it has correctly modeled the other player. If the two conditional probabilities have similar values, DOWNING determines that it pays to defect, since the other player seems to be doing the same thing whether DOWNING cooperates or not. Conversely, if the other player tends to cooperate after a cooperation but not after a defection by DOWNING, then the other player seems responsive, and DOWNING will calculate that the best thing to do with a responsive player is to cooperate. Under certain circumstances, DOWNING will even determine that the best strategy is to alternate cooperation and defection.

The rule that scored lowest was also the one that was least forgiving. This is FRIEDMAN, a totally unforgiving rule that employs permanent retaliation. It is never the first to defect, but once the other defects even once, FRIEDMAN defects from then on. In contrast, the winner, TIT FOR TAT, is unforgiving for one move, but thereafter is totally forgiving of that defection. After one punishment, it lets bygones be bygones.

JOSS is a sneaky rule that tries to get away with an occasional defection. This decision rule is a variation of TIT FOR TAT.
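[DOWNING's conditional-probability bookkeeping might look something like the sketch below. The class name, the prior counts, and the myopic one-move expected-payoff comparison are my assumptions; Axelrod describes a long-term maximization, and Downing (1975) specifies the actual rule.]

```python
class Downing:
    """Sketch of an outcome-maximizing rule in the spirit of DOWNING.

    Tracks P(opponent cooperates | my last move was C) and
    P(opponent cooperates | my last move was D), then picks the move
    with the higher expected payoff. Details are hypothetical.
    """

    def __init__(self):
        # [cooperations, trials] following each of my possible moves;
        # the 1-out-of-2 prior is an assumed, agnostic starting point.
        self.after = {'C': [1, 2], 'D': [1, 2]}
        self.last_move = None

    def estimate(self, my_move):
        coop, trials = self.after[my_move]
        return coop / trials

    def observe(self, their_move):
        """Credit the opponent's move to whichever move I made last."""
        if self.last_move is not None:
            self.after[self.last_move][1] += 1
            if their_move == 'C':
                self.after[self.last_move][0] += 1

    def choose(self):
        p_c = self.estimate('C')  # P(they cooperate | I cooperated)
        p_d = self.estimate('D')  # P(they cooperate | I defected)
        # One-shot expected payoffs with T=5, R=3, P=1, S=0.  If the two
        # probabilities are similar, defection looks better (the opponent
        # seems unresponsive); if cooperation is rewarded and defection
        # punished, cooperation looks better -- matching the text above.
        ev_coop = 3 * p_c
        ev_defect = 5 * p_d + 1 * (1 - p_d)
        self.last_move = 'C' if ev_coop > ev_defect else 'D'
        return self.last_move
```

With this myopic comparison, an unresponsive opponent (p_c ≈ p_d) always makes defection look better, which is the behavior the text describes.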
Like TIT FOR TAT, it always defects immediately after the other player defects. But instead of always cooperating after the other player cooperates, 10 percent of the time it defects after the other player cooperates. Thus it tries to sneak in an occasional exploitation of the other player. This decision rule seems like a fairly small variation of TIT FOR TAT, but in fact its overall performance was much worse, and it is interesting to see exactly why. Table 1 shows the move-by-move history of a game between JOSS and TIT FOR TAT. At first both players cooperated, but on the sixth move JOSS selected one of its probabilistic defections. On the next move JOSS cooperated again, but TIT FOR TAT defected in response to JOSS's previous defection. Then JOSS defected in response to TIT FOR TAT's defection. In effect, the single defection of JOSS on the sixth move created an echo back and forth between JOSS and TIT FOR TAT. This echo resulted in JOSS defecting on all the subsequent even numbered moves and TIT FOR TAT defecting on all the subsequent odd numbered moves.
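[The echo is easy to reproduce. In this sketch, JOSS's 10 percent random defection is replaced by a single forced defection on move six so the run is deterministic; that substitution is mine, not Axelrod's.]

```python
def tit_for_tat(their_history):
    """Cooperate first, then copy the opponent's last move."""
    return their_history[-1] if their_history else 'C'

def joss(their_history, defect_on=frozenset({6})):
    """JOSS as described above, except the random defection is forced on
    fixed move numbers so the echo is reproducible."""
    move = tit_for_tat(their_history)
    this_move = len(their_history) + 1
    return 'D' if (move == 'C' and this_move in defect_on) else move

hist_joss, hist_tft = [], []
for _ in range(12):
    m_j = joss(hist_tft)
    m_t = tit_for_tat(hist_joss)
    hist_joss.append(m_j)
    hist_tft.append(m_t)

print(''.join(hist_joss))  # CCCCCDCDCDCD -- JOSS defects on even moves 6,8,10,12
print(''.join(hist_tft))   # CCCCCCDCDCDC -- TFT echoes on odd moves 7,9,11
```

One defection on move six locks the pair into the alternating echo described in Table 1: JOSS defects on every subsequent even move, TIT FOR TAT on every subsequent odd move.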
[The] entrants to the second round were all given the detailed analysis of the first round, including a discussion of the supplemental rules that would have done very well in the environment of the first round.

An important element (ch. 8) was the addition of noise: A player might not be certain about the choice actually made by the other player in the previous move. There could be problems of random noise or systematic misperception (Jervis 1976). To study this, the first round of the tournament was rerun with a 1 percent chance of misperception of the other's previous move. This resulted in yet another victory for TIT FOR TAT. This result indicates that TIT FOR TAT is relatively robust under conditions of moderate error in perception.

[Later modifications of TfT, known as Contrite Tit-for-Tat, improve performance further in noisy Iterated PD games; see e.g. Boyd, Robert (1989), "Mistakes Allow Evolutionary Stability in the Repeated Prisoner's Dilemma Game", Journal of Theoretical Biology 136(1): 47-56. In high-noise situations, contrite TfT seems to do very well.]
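[The misperception rerun can be mimicked with two TIT FOR TATs that each read a noisy copy of the opponent's history. A sketch under assumed conventions: the standard payoffs T=5, R=3, P=1, S=0, and an independent flip probability per observation.]

```python
import random

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tft(perceived):
    """TIT FOR TAT driven by a (possibly misperceived) opponent history."""
    return perceived[-1] if perceived else 'C'

def flip(move):
    return 'D' if move == 'C' else 'C'

def play_noisy(rounds=200, noise=0.01, seed=0):
    """Two TIT FOR TATs; each misreads the other's move with prob. `noise`."""
    rng = random.Random(seed)
    seen_by_a, seen_by_b = [], []   # each side's possibly-wrong view
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = tft(seen_by_a), tft(seen_by_b)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        seen_by_a.append(flip(b) if rng.random() < noise else b)
        seen_by_b.append(flip(a) if rng.random() < noise else a)
    return score_a, score_b

print(play_noisy(200, 0.0))   # (600, 600): no noise, full cooperation
```

A single misread sends plain TFT-vs-TFT into a long echo of alternating defections (5 points per round split between them instead of 6), which is exactly the weakness the contrite variants repair.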
A fascinating case of the development of cooperation based on continuing interaction occurred in the trench warfare of World War I. In the midst of this very brutal war there developed between the men facing each other what came to be called the "live-and-let-live system." The troops would attack each other when ordered to do so, but between large battles each side would deliberately avoid doing much harm to the other side—provided that the other side reciprocated.

The strategy was not necessarily TIT FOR TAT. Sometimes it was two for one. As a British officer wrote in his memoirs of the takeover of a new sector from the French:

    It was the French practice to "let sleeping dogs lie" when in a quiet sector . . . and of making this clear by retorting vigorously only when challenged. In one sector which we took over from them they explained to me that they had practically a code which the enemy well understood: they fired two shots for every one that came over, but never fired first. (Kelly 1930, p. 18)

Such practices of tacit cooperation were quite illegal—but they were also endemic. For several years this system developed and elaborated itself despite the passions of the war and the best efforts of the generals to pursue a policy of constant attrition. The story is so rich in illuminating detail that all of the next chapter will be devoted to it. [...]

Similarities in basic needs and activities let the soldiers appreciate that the other side would probably not be following a strategy of unconditional defection. Thus, in the summer of 1915, a soldier saw that the enemy would be likely to reciprocate cooperation based on the desire for fresh rations:

    It would be child's play to shell the road behind the enemy's trenches, crowded as it must be with ration wagons and water carts, into a bloodstained wilderness ... but on the whole there is silence. After all, if you prevent your enemy from drawing his rations, his remedy is simple: he will prevent you from drawing yours. (Hay 1916, pp. 224-25)

[This last example seems more like a model of Contrite-TfT...]
cTfT has also spawned a number of variants, as seen in this paper...

@article{boerlijst-nowak-97_logic-of-contrition,
  title     = {The logic of contrition},
  author    = {Boerlijst, Maarten C and Nowak, Martin A and Sigmund, Karl},
  journal   = {Journal of Theoretical Biology},
  volume    = {185},
  number    = {3},
  pages     = {281--293},
  year      = {1997},
  publisher = {Elsevier},
  annote    = {Abstract: A highly successful strategy for the Repeated Prisoner's Dilemma is Contrite Tit For Tat, which bases its decisions on the “standings” of the two players. This strategy is as good as Tit For Tat at invading populations of defectors, and much better at overcoming errors in implementation against players who are also using it. However, it is vulnerable to errors in perception. In this paper, we discuss the merits of Contrite Tit For Tat and compare it with other strategies, like Pavlov and the newly-introduced Remorse. We embed these strategies into an eight-dimensional space of stochastic strategies which we investigate by analytical means and numerical simulations. Finally, we show that if one replaces the conventions concerning the “standing” by other, even simpler conventions, one obtains an evolutionarily stable strategy (called Prudent Pavlov) which is immune against both mis-perception and mis-implementation.}
}
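[The "standing" convention the abstract refers to can be sketched roughly as follows: an unprovoked defection puts a player in bad standing, defecting against a player in bad standing is justified punishment, and cooperating restores good standing. The exact rules below are a common textbook rendering, not necessarily Boerlijst et al.'s formulation.]

```python
def ctft_move(my_standing, their_standing):
    """Contrite TFT (sketch): defect only as justified punishment."""
    return 'D' if (my_standing == 'good' and their_standing == 'bad') else 'C'

def update_standing(standing, opp_standing, move):
    """Assumed convention: unprovoked defection loses good standing;
    cooperation restores it; justified defection leaves it unchanged."""
    if move == 'D' and opp_standing == 'good':
        return 'bad'
    if move == 'C':
        return 'good'
    return standing

# Two contrite players; an implementation error forces A to defect on move 3.
history, s_a, s_b = [], 'good', 'good'
for round_no in range(1, 7):
    a = 'D' if round_no == 3 else ctft_move(s_a, s_b)
    b = ctft_move(s_b, s_a)
    s_a, s_b = (update_standing(s_a, s_b, a),
                update_standing(s_b, s_a, b))
    history.append(a + b)
print(history)  # ['CC', 'CC', 'DC', 'CD', 'CC', 'CC'] -- one round of
                # accepted punishment, then cooperation resumes
```

Contrast with plain TFT, where the same error triggers an endless DC/CD echo: here the erring player contritely accepts one round of punishment without retaliating, and mutual cooperation resumes.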
[also see fascinating later work, Axelrod and Bennett, on modeling the process of coalition formation as minima in a coalition energy landscape:

  Axelrod, Robert and Bennett, D. Scott; A landscape theory of aggregation, British Journal of Political Science, v.2(03), 1993.

Two “energy” minima emerge in the landscape of alliance formation before WW2. The deepest minimum is very similar to the actual split into Allied and Axis powers; only Portugal and Poland are placed in the “wrong” camp. The other basin predicts a very different history, with Europe united against the Soviet Union. A figure merging the hypothetical alignments from Axelrod & Bennett 93 appears in Critical Mass: How One Thing Leads to Another by Philip Ball (2006).]