APEC 3611w: Environmental and Natural Resource Economics


Commons

Always a Tragedy?

Content

Commons Dilemmas and Game Theory

This lecture explores commons dilemmas through the lens of game theory, examining how strategic decision-making leads to collective action problems and investigating potential solutions to these dilemmas.

Introduction to Game Theory and the Prisoner’s Dilemma

Game theory provides a powerful framework for understanding strategic interactions between rational decision-makers. While game theory has broad applications in microeconomics, particularly in analyzing competitive firm behavior and market strategies, this discussion emphasizes its applications to environmental and sustainability issues. The prisoner’s dilemma stands as the most famous game theory model and lies at the core of many sustainability challenges, including climate change.

The Classic Prisoner’s Dilemma Setup

The prisoner’s dilemma involves two suspects who have been charged with a crime. The police interrogate these prisoners in separate rooms, preventing any communication between them. The police are seeking to obtain a confession from at least one of the suspects.

To make the scenario concrete, suppose the two suspects allegedly robbed a bank while it was closed. The police have proof that both suspects were in the bank after hours and that they broke into it. However, the police have no proof that the suspects actually stole anything. Perhaps security cameras show them entering the building, but it remains unclear whether they or someone else actually committed the theft.

The Payoff Matrix Structure

In this game, Suspect A and Suspect B are the two players. Each player has two possible strategies: remain silent or blame the other. The payoffs under all possible strategy combinations must be carefully considered.

If both suspects remain silent, they can only be charged with the lesser crime of breaking and entering. In this case, each suspect receives two years in jail. When expressing these outcomes mathematically, a negative sign is applied because jail time represents a bad outcome. Thus, if both remain silent, Suspect A receives negative two (representing two years in jail) and Suspect B receives the same.

If both suspects blame each other, they can be convicted of bank robbery, resulting in seven years in jail for each. This outcome is represented as negative seven for both players.

The critical asymmetric outcomes occur when one suspect blames while the other remains silent. If Suspect A blames Suspect B while B remains silent, then A goes free with zero years in jail, achieving the best possible personal outcome. This is equivalent to receiving a plea deal. In this scenario, Suspect B faces severe consequences: because Player A blamed them, the police have strong evidence. B gets convicted of bank robbery and receives an additional year for lying under oath, totaling negative eight years.

The reverse situation applies if Suspect B blames while Suspect A remains silent. Suspect A receives negative eight years, while Suspect B goes free with zero years.
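The four outcomes above can be collected into a single payoff matrix. A minimal sketch in Python (the move names and dictionary layout are illustrative; the numbers are the jail terms from the text, written as negative payoffs):

```python
# Payoff matrix for the prisoner's dilemma described above.
# Keys are (A's move, B's move); values are (A's payoff, B's payoff),
# measured in negative years of jail time.
PAYOFFS = {
    ("silent", "silent"): (-2, -2),   # both charged only with breaking and entering
    ("silent", "blame"):  (-8,  0),   # B takes the plea deal; A gets robbery plus perjury
    ("blame",  "silent"): ( 0, -8),   # A takes the plea deal
    ("blame",  "blame"):  (-7, -7),   # both convicted of bank robbery
}

# Look up the asymmetric outcome: A blames, B stays silent.
print(PAYOFFS[("blame", "silent")])  # (0, -8)
```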

Classroom Demonstration of the Prisoner’s Dilemma

To demonstrate the prisoner’s dilemma with real stakes, volunteers can play the game for class participation points. Participants receive eight class points for volunteering, but will lose points according to the payoff matrix based on their decisions. If both blame each other, each loses seven points. If both remain silent, each loses only two points, keeping six. If one blames while the other stays silent, the blamer loses nothing and keeps all eight points, while the silent player loses all eight.

A crucial modification to the standard prisoner’s dilemma can be introduced by allowing participants up to thirty seconds to communicate with each other. This fundamentally changes the game dynamics because communication represents one of the primary ways to overcome the prisoner’s dilemma.

When participants can communicate, they often achieve the cooperative outcome of mutual silence. Several factors contribute to this result that differ from the standard prisoner’s dilemma setup. First, the presence of an audience creates a commitment mechanism because participants may care about their reputation beyond just the game payoffs. Second, and perhaps most importantly, participants know they will see each other for the remainder of the semester, unlike two strangers who will never meet again.

Empirical evidence suggests that the more anonymous the participants and the greater the probability they will never encounter each other again, the more reliably people converge on the prisoner’s dilemma outcome of mutual blame. Conversely, when participants know who the other person is, outcomes tend toward mutual cooperation.

Solving the Prisoner’s Dilemma: Dominant Strategies

The standard economic approach to solving games like the prisoner’s dilemma involves identifying dominant strategies. A dominant strategy is a choice that is better regardless of what the other player does.

To identify dominant strategies mechanically, consider one player at a time and determine their best response given each possible action by the other player. Starting with Player A, examine what happens if Suspect B remains silent. Comparing the payoffs for Player A in this column, zero (from blaming) is greater than negative two (from staying silent). The better option receives a star. Repeating for the second column where Suspect B blames, negative seven is better than negative eight.

The analysis then flips to consider Suspect B’s perspective. If Suspect A stays silent, Suspect B’s payoffs are negative two (from staying silent) or zero (from blaming). Zero is better. If Suspect A blames, Suspect B’s payoffs are negative eight (from staying silent) or negative seven (from blaming). Negative seven is better.

The conclusion is that blame is a dominant strategy for Suspect A because regardless of what Suspect B does, A is better off blaming. Simultaneously, blame is a dominant strategy for Suspect B because regardless of what A does, B is better off blaming.

The fundamental behavioral assumption is that rational agents will always play a dominant strategy.
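The starring procedure described above can be automated. A sketch, reusing the payoffs from the bank-robbery story (the helper function names are my own):

```python
# (A's move, B's move) -> (A's payoff, B's payoff), in negative years of jail.
PAYOFFS = {
    ("silent", "silent"): (-2, -2),
    ("silent", "blame"):  (-8,  0),
    ("blame",  "silent"): ( 0, -8),
    ("blame",  "blame"):  (-7, -7),
}
STRATS = ("silent", "blame")

def my_payoff(player, mine, theirs):
    """Payoff to `player` when they play `mine` and the opponent plays `theirs`."""
    return PAYOFFS[(mine, theirs)][0] if player == "A" else PAYOFFS[(theirs, mine)][1]

def dominant_strategy(player):
    """Return the strategy that is strictly better against every opponent move, if any."""
    for s in STRATS:
        if all(my_payoff(player, s, t) > my_payoff(player, r, t)
               for t in STRATS for r in STRATS if r != s):
            return s
    return None

print(dominant_strategy("A"), dominant_strategy("B"))  # blame blame
```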

Nash Equilibrium

When both players have a dominant strategy, a Nash equilibrium emerges. A Nash Equilibrium is a situation where no player can increase their payoff by unilaterally changing their strategy. The term “unilaterally” is crucial here: it means that if only one player changes their behavior while the other’s choice remains fixed, there is no way for that player to improve their outcome. This definition says nothing about cases where coordination might allow both players to simultaneously change their behavior.

A simpler operational definition of Nash equilibrium in a normal form game is any cell (or pair of choices) where both players have stars indicating their best responses. In the prisoner’s dilemma, there is only one Nash equilibrium: mutual blame.

To verify this is an equilibrium, consider whether either player would benefit from unilateral deviation. From the blame-blame outcome, if Suspect A considers switching to remain silent, they would receive negative eight instead of negative seven, making them worse off. Additionally, the other suspect would go free, which seems particularly unfair. The same logic applies to Suspect B.

The concept is called an equilibrium partly because economists favor equilibrium concepts, but more substantively because it provides powerful predictions. In situations involving rational decision-making, the Nash Equilibrium is very likely to occur. Understanding how to calculate Nash Equilibria enables strategic action. In business contexts, this knowledge improves profit maximization. More generally, like other equilibria in economics such as the supply-equals-demand equilibrium, Nash Equilibria are useful because forces push outcomes toward them. The marble-in-a-bowl analogy captures this: disturbances to the system tend to return to the equilibrium state.
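The "both players have stars" definition can be checked mechanically: a strategy pair is a Nash equilibrium when each player's move is a best response to the other's. A sketch under the same illustrative payoffs:

```python
# (A's move, B's move) -> (A's payoff, B's payoff).
PAYOFFS = {
    ("silent", "silent"): (-2, -2),
    ("silent", "blame"):  (-8,  0),
    ("blame",  "silent"): ( 0, -8),
    ("blame",  "blame"):  (-7, -7),
}
STRATS = ("silent", "blame")

def is_nash(a, b):
    """True if neither player gains by unilaterally deviating from (a, b)."""
    a_ok = all(PAYOFFS[(a2, b)][0] <= PAYOFFS[(a, b)][0] for a2 in STRATS)
    b_ok = all(PAYOFFS[(a, b2)][1] <= PAYOFFS[(a, b)][1] for b2 in STRATS)
    return a_ok and b_ok

equilibria = [(a, b) for a in STRATS for b in STRATS if is_nash(a, b)]
print(equilibria)  # [('blame', 'blame')]
```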

The Normalized Prisoner’s Dilemma

The prisoner’s dilemma can be confusing when discussed in terms of years in prison with negative values. A normalization helps clarify the analysis by transforming outcomes into utility terms. Instead of years in prison, the payoffs represent utility levels: prison reduces utility below what would occur without imprisonment. This transformation connects back to basic preference theory. In the normalized version, the payoffs might show positive utilities that reflect how years in prison affect overall well-being. The structure remains a prisoner’s dilemma where mutual cooperation yields better outcomes than the Nash equilibrium of mutual defection.

Formal Definition of the Prisoner’s Dilemma

A prisoner’s dilemma is formally defined as a game with parameters (the numbers in the payoff matrix) such that two conditions hold. First, a dominant strategy must exist for both players. Second, both players end up worse off when they play their dominant strategies compared to if they had both played the alternative.
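The two conditions can be written in terms of the four payoff parameters. Using the conventional labels T (temptation), R (reward), P (punishment), and S (sucker's payoff) — standard game-theory names, not the lecture's — the bank-robbery numbers satisfy both:

```python
# Symmetric 2x2 game parameters from the bank-robbery example.
T, R, P, S = 0, -2, -7, -8   # temptation, reward, punishment, sucker's payoff

# Condition 1: defection ("blame") is a dominant strategy for both players.
# It must beat cooperation whether the opponent cooperates (T > R)
# or defects (P > S).
has_dominant_strategy = (T > R) and (P > S)

# Condition 2: mutual defection leaves both players worse off than
# mutual cooperation would have.
mutually_worse = R > P

print(has_dominant_strategy and mutually_worse)  # True: this is a prisoner's dilemma
```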

Connection to the Tragedy of the Commons

The prisoner’s dilemma connects directly to environmental commons problems. Resources with common access tend toward overuse, a market failure that occurs when there is no mechanism to limit access. This dynamic was famously analyzed as the Tragedy of the Commons by Garrett Hardin in 1968.

Hardin developed the most famous version of this scenario using the example of a commons where herders must decide how many goats or sheep to graze on communal pastureland. Every herder has an incentive to increase their herd size because they capture all the benefits of additional animals while sharing the costs of overgrazing with all users. This incentive structure inevitably leads each herder to add animals until the commons collapses from overgrazing.

The tragedy of the commons is a tragedy precisely because it represents a prisoner’s dilemma that always drives outcomes toward the negative solution. Instead of silence versus blame, the choices become “limit your goats” versus “don’t limit your goats.” Every herder has an incentive not to limit their animals, and this collective behavior causes the collapse of the commons.

The Iterated Prisoner’s Dilemma

The standard prisoner’s dilemma contains unrealistic assumptions. Perhaps the most significant is that real-world interactions rarely involve one-shot games. Instead, people interact repeatedly over time.

In reality, reputation matters. Someone who consistently cheats, takes too much, or fails to contribute to shared responsibilities develops a reputation. Consider the mundane example of dishes in a shared living situation. If the game were truly one-shot, perhaps leaving dishes for a roommate to clean would be optimal. However, this game repeats every day, and reputation effects come into play.

Real-world situations more closely resemble the iterated prisoner’s dilemma, which unfolds over multiple rounds. This repetition fundamentally changes the strategic landscape because decisions in one round can depend on observations of opponent behavior in previous rounds. This creates information that is useful for predicting future opponent choices.

Laboratory experiments on the iterated prisoner’s dilemma reveal that outcomes depend on numerous factors including communication skills, trust between players, how well participants know each other, and whether they like each other. Personal feelings can enter utility functions: someone who dislikes their opponent might accept mutual defection outcomes that they would reject with a liked opponent.

Classroom Demonstration of the Iterated Prisoner’s Dilemma

Participants can be paired to play an iterated prisoner’s dilemma over six rounds. The payoff matrix maintains the prisoner’s dilemma structure with defection as a dominant strategy in any single round. Mutual cooperation yields ten points each, mutual defection yields zero each, and asymmetric outcomes yield twenty-five for the defector and negative five for the cooperator.

Each round, participants secretly write their strategy (cooperate or defect), then simultaneously reveal. They record individual scores and track cumulative totals. Brief communication periods between rounds allow for apology, persuasion, or strategic discussion.

Several strategic dynamics emerge from iterated play. The final round presents a unique strategic consideration because reputation costs disappear after the game ends. One possible strategy involves establishing cooperation through the early rounds, then defecting on the final round. If a pair cooperated five times, they would each have fifty points entering the final round. Mutual cooperation in the final round yields sixty points each, but unilateral defection yields seventy-five for the defector (fifty plus twenty-five) and only forty-five for the cooperator, with no reputational consequence since no seventh round exists.

When teams can divide winnings, a strategy emerges where one player consistently cooperates while the other consistently defects, maximizing total team points for later division. When payoffs cannot be shared, this strategy requires genuine altruism from the cooperating partner.
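The end-game logic above can be simulated. A sketch using the classroom payoffs, with two illustrative strategies — tit-for-tat and a player who cooperates until the final round:

```python
# Classroom payoffs: (row player's points, column player's points).
PAYOFF = {
    ("C", "C"): (10, 10),   # mutual cooperation
    ("C", "D"): (-5, 25),   # cooperator exploited by defector
    ("D", "C"): (25, -5),
    ("D", "D"): (0, 0),     # mutual defection
}

def play(strat1, strat2, rounds=6):
    """Run an iterated game; strategies see the round number and the opponent's history."""
    hist1, hist2, score1, score2 = [], [], 0, 0
    for r in range(rounds):
        m1 = strat1(r, hist2, rounds)
        m2 = strat2(r, hist1, rounds)
        p1, p2 = PAYOFF[(m1, m2)]
        score1 += p1
        score2 += p2
        hist1.append(m1)
        hist2.append(m2)
    return score1, score2

def tit_for_tat(r, opp_hist, rounds):
    return "C" if r == 0 else opp_hist[-1]

def endgame_defector(r, opp_hist, rounds):
    return "D" if r == rounds - 1 else "C"

print(play(tit_for_tat, tit_for_tat))       # (60, 60): six rounds of cooperation
print(play(tit_for_tat, endgame_defector))  # (45, 75): the final-round defection pays
```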

Real-World Applications of Iterated Prisoner’s Dilemmas

Iterated prisoner’s dilemmas appear throughout real-world strategic interactions. International trade provides a clear example. Tariffs and trade wars fundamentally involve tit-for-tat strategies in an iterated prisoner’s dilemma framework.

Countries face choices about reducing tariffs, which represents cooperation. If all countries reduce tariffs, freer trade benefits everyone. However, if all countries except one maintain free trade, the defecting country can extract significant benefits while free-riding on others’ openness. This creates a standard prisoner’s dilemma structure. When one country initiates trade restrictions, others typically respond in kind.

The actual strategies observed in international trade are quite complicated, reflecting the iterated nature of these interactions and the importance of reputation and future relationship considerations.

Beyond the Prisoner’s Dilemma: Other Game Structures

The prisoner’s dilemma represents just one possible game structure. By varying the payoff parameters in a two-player, two-strategy normal form game, at least seven distinct game types can be derived. Understanding these variations matters because not all strategic interactions are prisoner’s dilemmas, and different real-life negotiation scenarios correspond to different game structures.

The Stag Hunt (Coordination Game)

The stag hunt derives from a philosophical parable about two hunters who leave their separate cabins to hunt. They cannot coordinate before departing. Each hunter must choose between hunting stag or hunting rabbits.

Rabbit hunting requires only one person and essentially guarantees a meal. Stag hunting offers a much larger reward but requires two hunters working together. A single hunter cannot take down a stag.

The stag hunt differs from the prisoner’s dilemma because it contains two Nash equilibria. Both the mutual cooperation outcome (both hunt stag) and the mutual defection outcome (both hunt rabbit) are Nash equilibria. However, the stag hunt is much easier to solve through iteration because it primarily requires building trust. Once trust is established, both players prefer the superior equilibrium of mutual cooperation. This is why it is called a coordination game.

The key difference from the prisoner’s dilemma lies in the ordering of payoff parameters. When the reward for mutual cooperation exceeds the temptation payoff from unilateral defection, the game becomes a stag hunt rather than a prisoner’s dilemma.
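This reordering can be made concrete. With hypothetical payoffs in which mutual stag hunting (4 each) beats the sure rabbit (3), a best-response check finds two equilibria instead of one:

```python
STRATS = ("stag", "rabbit")
# Hypothetical payoffs: the stag requires both hunters; a rabbit is a sure meal.
PAYOFFS = {
    ("stag",   "stag"):   (4, 4),   # mutual cooperation: the superior equilibrium
    ("stag",   "rabbit"): (0, 3),   # lone stag hunter goes hungry
    ("rabbit", "stag"):   (3, 0),
    ("rabbit", "rabbit"): (3, 3),   # safe but inferior equilibrium
}

def is_nash(a, b):
    """True if neither hunter gains by unilaterally switching targets."""
    a_ok = all(PAYOFFS[(a2, b)][0] <= PAYOFFS[(a, b)][0] for a2 in STRATS)
    b_ok = all(PAYOFFS[(a, b2)][1] <= PAYOFFS[(a, b)][1] for b2 in STRATS)
    return a_ok and b_ok

equilibria = [(a, b) for a in STRATS for b in STRATS if is_nash(a, b)]
print(equilibria)  # [('stag', 'stag'), ('rabbit', 'rabbit')]
```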

Chicken (Hawk-Dove)

The game of Chicken gets its name from the dangerous driving game where two cars drive toward each other and the first driver to swerve loses. The decision-making involves weighing the benefit of not swerving (appearing to have stronger resolve, not looking “wimpy”) against the catastrophic outcome if neither swerves (crashing).

In the payoff structure, if one player swerves while the other does not, the non-swerving player gains status and the swerving player loses status. However, both players avoid the crash. The Nash equilibrium analysis reveals incentives that differ from both the prisoner’s dilemma and the stag hunt.

While literal car-based chicken games may be rare, the strategic structure appears frequently in international negotiations. The nuclear standoff between the Kennedy administration and the Soviet Union during the Cuban Missile Crisis exemplifies Chicken dynamics. Both sides threatened nuclear escalation, essentially communicating “I’m moving forward with this threat; you need to back down.” The strategic question becomes how to manipulate the parameters so that the opponent blinks first, or both sides blink simultaneously.
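With hypothetical numbers — a small status gain or loss from swerving decisions and a large loss from crashing — the same best-response check reveals Chicken's distinctive structure: two asymmetric equilibria in which exactly one player backs down:

```python
STRATS = ("swerve", "straight")
# Hypothetical payoffs: small status stakes, catastrophic crash.
PAYOFFS = {
    ("swerve",   "swerve"):   (0, 0),      # both back down; no crash, no glory
    ("swerve",   "straight"): (-1, 1),     # swerver looks "wimpy", other gains status
    ("straight", "swerve"):   (1, -1),
    ("straight", "straight"): (-10, -10),  # neither blinks: crash
}

def is_nash(a, b):
    """True if neither driver gains by unilaterally changing course."""
    a_ok = all(PAYOFFS[(a2, b)][0] <= PAYOFFS[(a, b)][0] for a2 in STRATS)
    b_ok = all(PAYOFFS[(a, b2)][1] <= PAYOFFS[(a, b)][1] for b2 in STRATS)
    return a_ok and b_ok

equilibria = [(a, b) for a in STRATS for b in STRATS if is_nash(a, b)]
print(equilibria)  # [('swerve', 'straight'), ('straight', 'swerve')]
```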

Strategic Diversity in Iterated Games

When playing iterated games, uncertainty about opponent type becomes strategically relevant. Reputation provides information about what type of player someone might be. Numerous researched strategies exist for iterated prisoner’s dilemma play, including various forms of tit-for-tat, always cooperate, always defect, and more sophisticated conditional strategies.

Solving for optimal play requires considering both known opponent strategies (where one can calculate best responses) and unknown opponent types (where optimal strategy must account for the distribution of possible opponent behaviors).

Solutions to Commons Dilemmas

Having established how commons dilemmas arise through prisoner’s dilemma dynamics, attention turns to potential solutions.

Government Ownership and Regulation

One straightforward solution involves government ownership of the commons resource with governmental decisions about access allocation. The government can pass laws requiring cooperation or, in the tragedy of the commons context, simply restrict access. Hunting permit requirements exemplify this approach.

Government solutions face significant challenges. When a small number of people receive rights to access a valuable resource, the winners receive enormous payoffs while others receive nothing. Equity concerns become paramount.

Two specific problems emerge from government-based solutions. First, lobbying becomes profitable when government policies create large potential gains for specific groups. The profit-maximizing choice for potential beneficiaries may be hiring lobbyists to influence policy. Economists call this behavior rent-seeking: attempting to manipulate government or laws so that benefits are captured by a favored subset of people. Agricultural businesses and food industries are among the largest spenders on lobbying because they can benefit from agricultural subsidies that help them while imposing costs on broader populations.

Second, bureaucratic capture presents challenges independent of external lobbying. Public agencies demonstrate tendencies toward excessive bureaucratization, leading to slower processes compared to private sector alternatives. One theorized explanation is that public agencies behave as budget maximizers rather than utility maximizers or efficiency maximizers. Agencies resist any cuts to their budgets or redistribution to other purposes, creating perverse incentives that may not align with public benefit.

Privatization

An alternative to government ownership is privatization: arbitrarily assigning property rights to what was previously common property. This approach underlies programs like cap-and-trade systems.

Privatization faces its own challenges in commons contexts because the ownership of these newly created property rights itself becomes valuable. The government must choose winners and losers in the initial allocation, which creates its own political economy problems.

Building Robust Institutions

A third solution, sometimes underemphasized in standard environmental and natural resource economics, involves building robust institutions. Throughout human history, institutions have effectively solved commons dilemmas. Despite the theoretical predictions of prisoner’s dilemma dynamics always leading to tragedy, the actual historical record shows civilizations repeatedly solving commons dilemmas successfully.

Elinor Ostrom conducted foundational research in this area. She became the first woman to win the Nobel Prize in Economics, though this did not occur until 2009, which represents a troubling delay in recognition. Ostrom examined what actually happens in historical commons dilemmas rather than relying solely on theoretical predictions.

Across the history of commons dilemmas worldwide, Ostrom documented how civilizations develop their own solutions. Whether at the level of villages, tribal communities, or nation states, organizations tend to develop specific institutions that effectively manage common resources. Her research identified principles that successful institutions tend to share.

The key insight from Ostrom’s work is that outcomes are not as dire as the tragedy of the commons framework suggests. Success depends on the strength of institutions that communities can develop. These principles turn out to be highly relevant to contemporary policy frameworks including doughnut economics, where many policy precepts align closely with Ostrom’s institutional design principles.

The institutional approach provides conceptual bridges between local decision-making contexts and global-scale challenges. Understanding how communities have historically solved commons dilemmas through institutional development informs contemporary approaches to challenges ranging from local resource management to global climate change.

Transcript

All right, let’s get started. Welcome, everybody, to Monday. We’re going to talk about commons dilemmas and game theory, picking up where we left off last time.

So this artwork today is pretty dire-looking. The color scheme’s got me sad, but keeping with my theme of making an AI-generated image for the title of the slides based on the content, this is just what we get. So I guess we’re in the 50s in some sort of police dystopian drama or something like that.

But first, talking a little bit about the agenda. We’re going to pick up on the same set of slides. It’s the same link as before, but I made a few edits, so if you’re downloading these things, download the latest ones. Otherwise just click on the link, and that’s up to date.

A few timing notes: your weekly questions are due tonight by the end of the day today, and your assignment 2 is due next class on the 18th. That will correspond with the micro-quiz on similar concepts, so make sure to get those going. Any questions on the agenda, logistics, or any other outstanding issues?

All right, let’s dive on in. Last class, I just barely got started, so I’m probably going to review everything again. I made my joke about John Nash and the movie A Beautiful Mind, but I decided in this course we’re going to jump straight to the prisoner’s dilemma and spend some more time with that. Last class, I was using an example of Boeing versus its competitors. I want to skip that and go straight to the prisoner’s dilemma. This one has been the most famous game theory model, and it lies at the core of many sustainability issues, like climate change. I’m going to redo it again, and I think it’ll make it a little bit more clear this time.

We’ll actually walk it through in the prisoner’s dilemma form, rather than with two competing companies. Oftentimes in a microeconomics class where it’s more focused on making a lot of money, they will focus on market situations where it matches the game theory. It could be a prisoner’s dilemma, but game theory is widely useful in strategic analysis of how one firm should compete with another firm. So that’s why they tend to go in that direction, but here, we’re going to emphasize the parts that are more specific to the environment.

So in this formulation, we’ve got our two prisoners. They’ve been charged with a crime, and the police are busy interrogating these prisoners in different rooms, so they can’t communicate. They’re seeking to get a confession.

I want to add some details from last time. Let’s suppose they have allegedly robbed a bank while it was closed. That’s the situation. The police do have proof that they were in the bank after hours, that they broke into it. But for some reason, they have no proof that they actually stole anything. Maybe their cameras show them going in, but it’s unclear whether they or somebody else actually stole the stuff.

We’re going to use this to fill out the prisoner’s dilemma. I’ll go faster through this than before, because we’ve already talked about the payoff matrix. We’re going to have Suspect A and Suspect B. Those are the players. Then the two strategies: silent or blame.

We’re going to think about what are the payoffs under all of those different alternatives. If they both remain silent, suppose that they can only be charged with the medium crime. Here the medium crime is that they did break into this bank, even if they didn’t steal anything. That’s still breaking and entering. Let’s say they would get two years of jail. One of the things that’s confusing is we’re going to put a negative sign on this, because that’s a bad thing. A year in jail is not good. So we’re going to say, if they are both silent, Suspect A gets minus 2 (so 2 years in jail, a value of minus 2), and Suspect B gets the same.

One thing you’ll often note is, as you get more used to this, the normal convention would actually be just listing it together, rather than trying to divide it this way. But we’ll move to that on our next round. They will both get two years in jail if they both remain silent.

If they both blame each other, conversely, they can be convicted of bank robbery, and so they’d get 7 years in jail.

But the catch here is that if A blames B, but B remains silent, then A gets to go scot-free. They get zero years in jail. Best possible outcome. This is the equivalent of getting a plea deal off of the people, and this is what often happens.

In that case, what happens to the other player? Now, because Player A blamed them, they have perfect evidence. The other player gets convicted of the bank robbery, but they also get another year in jail because they were lying under oath. So negative 8.

And the reverse would be the case if Suspect B chose blame and Suspect A went silent. Suspect A lied in court and robbed a bank, and Suspect B goes free.

Before we learn the correct way to solve these, I want to play this for real class points. I’m going to ask for two volunteers, so think about ahead of time if you want to get some class points.

One other thing about class points: I decided there’s a third possible expenditure for class points. I have been taking notes about who’s missing classes, and class participation is huge. For five class points, you can get 100% class participation score, and it can even be retroactive. If you missed a day and you got marked down for that, you can spend 5 class points and get those points back.

I’m just doing this because last year I would play these games with real dollars, but I’m not rich enough to provide enough money to make people actually be incentivized. Given that you’re paying thousands and thousands of dollars to come here, and your grade depends on that, I feel like I can’t afford thousands of dollars to give you, but I could play with the thousands of dollars you’re paying to hopefully pass this class.

I’ll do whatever it takes to teach people best in this class, so that’s what class points are about.

So the two volunteers will follow these rules. It’s basically exactly what you see here. If you both blame each other for the crime of ignoring economic principles, you’ll lose 7 class points, but just to make this all work, I’m going to give you 8 class points just for participating, because then I’m going to take them away. I feel like it’s bad to take things away from people without first giving something. So you get 8 class points just for being one of the volunteers, but you’ll lose 7 if you end up in the blame-blame scenario. You’ll basically lose according to the different scores. If you both remain silent, you’ll only lose 2, so you can have 6 class points. That’s a free participation day makeup, plus a little extra.

But if one of you blames and the other stays silent, it’s 0 and 8, and vice versa.

One thing I’m going to change: I’m going to say that you’ll be given up to 30 seconds to talk to each other. This totally changes things. The standard prisoner’s dilemma has them in separate rooms, specifically because communication is one of the ways that we can overcome the prisoner’s dilemma. That’s going to be a theme for the later parts of this lecture.

Can I see two hands for two volunteers? Come on up, just so that you can easily talk to each other. The reason for all this drama is you’ve got to do it simultaneous. Don’t write it yet. Any questions on the structure of this game?

You will get however many class points you have left over. So you have 8 right now, but you’re going to lose some based on whether or not you can do better. You have 30 seconds to communicate in any way that you would like with each other.

Now face away from each other and write down simply if you’ll blame or remain silent. And in a dramatic reveal in 3, 2, 1, show. Silent, silent! So you both made it out of the prisoner’s dilemma.

What’s different here? What’s different than the standard prisoner’s dilemma? There’s an audience. That’s a bit of a commitment mechanism, because in addition to the class points, maybe you care a little bit about your reputation. I’d say the biggest one is just that you all know you’re going to see each other for the rest of the semester, so you’re not two random people.

The empirical evidence on this is mixed. The more anonymous the players are, and the greater the probability that they’ll never see each other again, the more reliably people end up at the prisoner’s dilemma outcome of blame-blame. But to the extent that you know who the other person is, play moves towards the remain-silent outcome.

That’s what typically happens. But now I want to go through the tools for solving this the standard econ way.

What I’m going to do here is talk through dominant strategies and show that both of you were irrational. I’m obviously saying that with air quotes because I’m really here just making fun of the definition of rationality, but we’re still going to use it because it’s powerful in certain circumstances.

Basically, whenever you have a normal form game like this (that just means there are players, choices, and a payoff matrix), the way you solve it is to think about what is called a dominant strategy. A dominant strategy just means that one choice is better no matter what the other player does.

The mechanics of how we can see that: we’re going to start with one of our players, let’s just choose A, and we’re going to think about what is their best response, given the two different possible things that Player B will do. To simplify this, let’s take it one column at a time. Simply look at what if Suspect B did remain silent. What would be the better option for Player A?

Ignore the other column, make sure you’re looking at the ones that correspond with Player A, and say which one is better. We’ll circle or star whichever one in that row is the better option. So 0 is greater than negative 2. Then we’ll repeat for the second column and ask which one is better: negative 7 or negative 8. Obviously negative 7.

Now we’re going to flip it around and say we’re going to look at just the row here and the choice of what Suspect B should do. If Suspect A stayed silent, what’s going to be better for Suspect B: negative 2 or 0? Zero. Also down here, negative 7 is better than negative 8.

What this is saying is blame is a dominant strategy for A, because no matter what Suspect B does, they’re better off blaming. And simultaneously for Suspect B, no matter what A does, they’re better off blaming. The statement about human behavior is that rational agents will always play a dominant strategy.
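The column-by-column check above can be written as a short script. This is my own sketch, not course code; the payoff numbers are the ones from the story (lose 2 each if both stay silent, lose 7 each if both blame, 0 and negative 8 in the mixed cases).

```python
# Sketch of the dominant-strategy check: a strategy is dominant if it
# beats every alternative no matter what the opponent chooses.
# payoffs[(a_choice, b_choice)] = (payoff to Suspect A, payoff to Suspect B)
payoffs = {
    ("silent", "silent"): (-2, -2),
    ("silent", "blame"):  (-8,  0),
    ("blame",  "silent"): ( 0, -8),
    ("blame",  "blame"):  (-7, -7),
}
choices = ["silent", "blame"]

def dominant_strategy(player):
    """Return the strictly dominant strategy for "A" or "B", or None."""
    idx = 0 if player == "A" else 1
    for mine in choices:
        # "mine" dominates if it beats every alternative against every
        # possible choice by the opponent
        if all(
            payoffs[(mine, theirs) if idx == 0 else (theirs, mine)][idx]
            > payoffs[(alt, theirs) if idx == 0 else (theirs, alt)][idx]
            for alt in choices if alt != mine
            for theirs in choices
        ):
            return mine
    return None

print(dominant_strategy("A"))  # blame
print(dominant_strategy("B"))  # blame
```

The starring-the-better-payoff step in the lecture is exactly the inner comparison: hold the opponent’s column fixed and compare the player’s two payoffs.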

Then we have a special name for what happens when both players play their dominant strategy: that’s going to be a Nash equilibrium. (A game can have a Nash equilibrium without dominant strategies, but when both players have one, playing them is always a Nash equilibrium.) A Nash equilibrium is just a situation where no player can increase their payoff by unilaterally changing their strategy.

Unpacking that a little bit: unilaterally just means that if only the one player can change their behavior, there’s no way that they can do better. It’s not saying anything about the case of what if somehow you could coordinate the people and have both of them make a change to their behavior. That’s not a unilateral choice. We’re talking about just a single choice by yourself where you can’t do better. That’s going to be what happens whenever you have both players having a dominant strategy.

A simpler definition of a Nash equilibrium, if you have it set up in this structure (which is called a normal form game), is just any cell, or any pair of choices (in this case, blame-blame), where there are two circles. In this particular game, there’s only one Nash equilibrium, and that’s the blame-blame case.

We can walk backwards through the definition. Are we stuck here? Would Player A or B be better off by unilaterally changing? Suppose we’re here and Suspect A is thinking maybe I should remain silent. Well, that would be worse. I would find myself getting negative 8. Oh, and it’d be really unfair too, the other guy would go free. And the same thing for B: they’re saying, well, it’s not great, I’d love to get to the better outcome, but I’m only changing myself, and the only thing I will end up with is a worse payout for myself.
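That walk-backwards-through-the-definition check can also be done mechanically. Here is a small sketch of mine (not course code), with the blame game’s payoffs written out again so it stands alone:

```python
# Sketch of the Nash test: a cell is an equilibrium if neither player can
# do strictly better by changing only their own choice.
payoffs = {
    ("silent", "silent"): (-2, -2),
    ("silent", "blame"):  (-8,  0),
    ("blame",  "silent"): ( 0, -8),
    ("blame",  "blame"):  (-7, -7),
}
choices = ["silent", "blame"]

def is_nash(a, b):
    a_pay, b_pay = payoffs[(a, b)]
    # A deviates alone, holding B fixed; then B deviates alone, holding A fixed
    a_cant_improve = all(payoffs[(alt, b)][0] <= a_pay for alt in choices)
    b_cant_improve = all(payoffs[(a, alt)][1] <= b_pay for alt in choices)
    return a_cant_improve and b_cant_improve

equilibria = [(a, b) for a in choices for b in choices if is_nash(a, b)]
print(equilibria)  # [('blame', 'blame')]
```

Notice that silent-silent fails the test even though both players prefer it: either suspect can gain by unilaterally switching to blame, which is exactly why the good outcome is not an equilibrium.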

We call this an equilibrium partly because economists love equilibria, but mostly because it gives us a powerful prediction. If you are in a situation of rational decision-making, like a marketplace, the Nash equilibrium is very likely to occur. If you can calculate what that Nash equilibrium is, you can act strategically. In business, this would make you a better profit maximizer.

Maybe the more general point is that, like any other equilibrium in economics (like the supply equals demand equilibrium we’ve spent the most time talking about), these are useful because they are very likely to happen, and you can make strong predictions. In other words, there are forces pushing them towards this outcome, hence the bowl analogy of dropping a marble.

That’s the concept of Nash equilibrium. Any questions about that?

We’re going to go a little farther. First off, I want to actually talk about the prisoner’s dilemma in normal form. I’m joking here a little bit, but typically the prisoner’s dilemma is very confusing to talk about because of the negative values that we had and the years in prison. We’re going to normalize everything so that you can think of this as utility.

It’s not going to be years in prison. Rather, we’re going to say that years in prison, when plugged into your utility function (because this ties back to the basic preference theory that we started with), give you less utility than you would have if no prison were plugged in.

So there’s a little bit of a transform here. This is easier to think about, but it’s still the prisoner’s dilemma: staying silent (as long as your partner does too) is the better outcome, but blame-blame is still the Nash equilibrium.

Let’s practice this. I built a web app for this. If you’re in the slides, you can click on it, or you can click the link down there. I’d love feedback from anybody, especially using phones. I don’t test my stuff on phones, so tell me if it doesn’t work. It’s sometimes challenging to design things for widescreens and tall screens.

So what do we have here? First off, just to orient you, we’ve got our normal form game with those payoffs. It’s going to give some instructions here. One thing you’ll notice is there are little stars next to each of the different choices. The instructions are going to be three steps towards calculating what is the dominant strategy of Suspect 1, dominant strategy of Suspect 2, and then finding the Nash equilibrium.

Go ahead and solve that. It’s pretty straightforward. If you get it wrong, it has a very annoying shaking error message that I spent way too long tweaking. The 5 is better than the 4, so it gets a star for blue. 2 is better than zero, so it gets a star, and then vice versa for the red player. In this particular case, that cell is also going to be highlighted as the Nash equilibrium.

We’re just going to practice that, because eventually we’re going to actually do some games where we compete against bots and potentially each other. Has everybody got the basics of that? You should be pretty quick then, eventually, at seeing what is the Nash equilibrium.

While you’re working with that, I just want to point us to how we can now use this to define what the prisoner’s dilemma is more specifically.

A prisoner’s dilemma is going to be defined as just a game (and eventually we’re going to see there’s lots of other possible games) that has parameters (by parameters I just mean the numbers in the payoff matrix) such that a dominant strategy exists for both players. We did see that: blame is dominant, blame is dominant.

The second criterion is that both players end up worse off playing their dominant strategies than they would if both had played the other strategy. That obviously depends on how the numbers are arranged, but when the numbers make those two things true, you get a situation that happens a lot and is a very bad thing.
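Putting the two criteria together, here is a hedged sketch (mine, not from the course) of a checker that classifies a two-player payoff matrix as a prisoner’s dilemma:

```python
def is_prisoners_dilemma(payoffs, choices):
    """Check the lecture's two criteria: (1) each player has a strictly
    dominant strategy, and (2) playing them leaves both players worse off
    than some other cell of the matrix."""
    def key(own, other, idx):
        # orient the cell lookup for the row player (idx 0) vs column player
        return (own, other) if idx == 0 else (other, own)

    def dominant(idx):
        for mine in choices:
            if all(
                payoffs[key(mine, theirs, idx)][idx]
                > payoffs[key(alt, theirs, idx)][idx]
                for alt in choices if alt != mine
                for theirs in choices
            ):
                return mine
        return None

    d = (dominant(0), dominant(1))
    if None in d:
        return False  # criterion 1 fails
    eq_pay = payoffs[d]
    # criterion 2: some other cell makes BOTH players strictly better off
    return any(
        cell != d and p[0] > eq_pay[0] and p[1] > eq_pay[1]
        for cell, p in payoffs.items()
    )

pd_payoffs = {
    ("silent", "silent"): (-2, -2),
    ("silent", "blame"):  (-8,  0),
    ("blame",  "silent"): ( 0, -8),
    ("blame",  "blame"):  (-7, -7),
}
print(is_prisoners_dilemma(pd_payoffs, ["silent", "blame"]))  # True
```

Change the numbers so that blame-blame beats silent-silent and the same function returns False: the dilemma is a property of the parameters, not of the story.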

That’s bad. Now, why are we talking about this? Obviously, I’ve been saying all throughout things like climate commons, other commons: these are ones that tend to have overuse. We talked about that market failure, that when you don’t have the ability to limit how many people come into a commons, it tends to go towards a tragedy.

This was actually famously talked about as the Tragedy of the Commons. This refers back to Garrett Hardin, way back in 1968. He essentially came up with the most famous version of the prisoner’s dilemma, and that’s going to be the Tragedy of the Commons.

Hardin talked through the example of the commons where we’re choosing how many goats or sheep or something like that should we have grazing on that communal land of grass. Inevitably, all of the ranchers or pastoralists have incentive to increase the number of sheep or goats that they have until eventually the commons collapses.

We’ll return to this again with some very detailed mathematics about how you can calculate the optimal strategy, but I just want to make a mapping here: the Tragedy of the Commons is a tragedy because it is a prisoner’s dilemma, and the incentives always push toward the bad outcome. You could think of it, instead of silence and blame, as limit your goats versus don’t limit your goats. Everybody has an incentive not to limit their goats, and that causes the collapse of the commons.

But can the prisoner’s dilemma be escaped? It’s a bad thing, but what is the most unrealistic part of the setup that we had for the prisoner’s dilemma? That’s a hard question, because there’s lots of things that don’t match real life. I would say the biggest one is that in the real world, we play repeated games.

If you are the sort of person who gets a bad reputation in life as being somebody who constantly cheats, or constantly takes too much, or constantly leaves your dishes in the sink without washing them (to put it more relevantly), you have a reputation. It’s not a one-shot game of should I clean up my dishes or hope that my roommate cleans up my dishes. If that was only one time, maybe it’d be optimal to leave it there, but you also have to think about the fact that you’re going to probably play this game again and again, every night.

In the real world, we don’t find ourselves playing the prisoner’s dilemma. We actually find ourselves playing the iterated prisoner’s dilemma, which, exactly as you might think, is going to be one that unfolds over multiple rounds. The big thing that this changes is your decision in one round might be a function of what you observed your opponent play in the previous rounds.

That changes everything, because that gives us information that is useful about how our opponent might play. We’ll show later on that there’s all sorts of laboratory experiments on this, and what are the factors that cause the iterated prisoner’s dilemma to be solved or not. It depends on all sorts of important things, like communication skills and trust, how well you know each other, whether or not you like each other. If the two of you happened to dislike each other when you did that class experiment, maybe you’d be more likely to say, yeah, I’m fine with the loss-loss, just because there’s other things in my utility function.
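As a rough illustration of how history-dependence changes things, here is a small simulation of mine. The strategies (tit-for-tat, always-defect) and the per-round payoffs are standard textbook choices, not numbers from this lecture:

```python
# Each strategy is a function from the OPPONENT's past moves to "C" or "D".
def tit_for_tat(opp_history):
    # cooperate first, then copy whatever the opponent did last round
    return "C" if not opp_history else opp_history[-1]

def always_defect(opp_history):
    return "D"

# per-round payoffs (player 1, player 2); illustrative values only
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(s1, s2, rounds=6):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)  # each player sees only the opponent's history
        p1, p2 = PAYOFF[(m1, m2)]
        score1 += p1
        score2 += p2
        h1.append(m1)
        h2.append(m2)
    return score1, score2

print(play(tit_for_tat, tit_for_tat))    # (18, 18): cooperation sustained
print(play(tit_for_tat, always_defect))  # (5, 10): exploited once, then mutual defection
```

Two tit-for-tat players sustain cooperation every round, while against an unconditional defector the exploitation happens exactly once before play collapses to the one-shot Nash outcome.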

We’re going to play the iterated prisoner’s dilemma. Everybody’s going to play. We’re going to pair up. You two are together, you two are together, you two are together, you two are together, and you two are together. Physically relocate so that you can be next to the person, and we’re going to play a game.

We’re going to play this second experiment, which is going to be pretty similar to the first in terms of the normal form game. Player 1 is going to be given the option to cooperate or defect, and Player 2 has the same.

It’s a prisoner’s dilemma. You can sort of see it. There’s a dominant strategy to defect. Where would that be? 0 is better than negative 5, and 25 is better than 10. 0 is better than negative 5, 25 is better than 10, so you can kind of jump to it. This is the Nash equilibrium here. I’ve drawn it differently, so now it’s the upper left, not the lower right.

But this is the Nash equilibrium, and it’s got the case that it has this much better outcome here. Standard, except for the fact that we’re going to play it for 6 rounds.

You’ve paired off with your nearest neighbor. The way it’s going to work is you’re going to write down simply a D for defect or a C for cooperate, have it hidden, and then we’ll reveal it all at the same time. Then on that same row, based on what your opponent chose, you will write down how much you earned that round. If you revealed cooperate and your opponent also had cooperate, you would each write 10. Then we’re going to go to the next round.

We’re going to keep track of cumulative scores, and whoever in class has the highest score gets 10 class points. Not your actual score, but just whoever gets the very most. One of the key features of this is obviously 10 is better than 0, but there’s a pretty significant payoff from defecting. In the first game, defecting only gained you a little; here it gains you a lot.
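A quick back-of-the-envelope tally (mine, not part of the game instructions) of what 6 rounds are worth under the class payoffs: CC pays 10 each, DD pays 0 each, and a lone defector earns 25 while the cooperator loses 5.

```python
ROUNDS = 6
# (player 1 payoff, player 2 payoff) for each pair of moves in one round
PAYOFF = {("C", "C"): (10, 10), ("C", "D"): (-5, 25),
          ("D", "C"): (25, -5), ("D", "D"): (0, 0)}

def total(moves1, moves2):
    """Cumulative scores for two fixed sequences of moves, e.g. 'CCCCCC'."""
    s1 = sum(PAYOFF[(m1, m2)][0] for m1, m2 in zip(moves1, moves2))
    s2 = sum(PAYOFF[(m1, m2)][1] for m1, m2 in zip(moves1, moves2))
    return s1, s2

print(total("CCCCCC", "CCCCCC"))  # (60, 60): mutual cooperation
print(total("DDDDDD", "DDDDDD"))  # (0, 0): the Nash outcome every round
print(total("DDDDDD", "CCCCCC"))  # (150, -30): pure exploitation
```

Because the prize goes to the single highest scorer in the class, the 150-point exploitation path is tempting in a way it wouldn’t be if everyone were simply paid their own score.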

We’re going to get ready to play Round 1. Does everybody understand the mechanics of the game? You’re going to secretly write your strategy, and then I’ll say 3, 2, 1, reveal.

Everybody write your answer, a C or a D. And then we’re going to reveal in