A social norm can be defined as a rule calling for a certain kind of behavior in a certain kind of circumstance. The grounds for the behavior are typically moral, but following the norm also confers benefits on the group, generally if not in every case, and a party subject to the rule would be sensible to follow it, again at least in general. The typical motivation is that other norms are in place calling on the group to reward compliance and/or punish violations (O’Neill 1999). The latter have been termed metanorms (Axelrod 1986) or supporting norms (Crawford and Ostrom 1995), and they sometimes call for the group to behave in ways that would normally be wrong. For example, no one should be deprived of their freedom, but the government has an obligation to do just that if the person has committed a crime. (This pure stance, that compliance with a social norm is motivated only by the group’s response under other norms, is exaggerated and adopted here partly for simplicity. Often the supporting norm is not a social norm but an internalized one, so that compliance is enforced by one’s conscience. Sometimes norms are supported by conventions.)
It is central to the concept of a social norm that the group responds by rewards or punishments. If the party’s behavior were self-rewarding or self-punishing, the practice would be a convention. If, for example, I always meet my friend at the Grand Central clock but this time I go to Times Square, I will miss him, and it is my own act that harms me. I am not being punished by others in response, so our practice is better termed a convention. Some rules are both norms and conventions: driving on the wrong side of the road is punished by the police and is self-punishing as well.
Supporting norms are just as full-fledged as the ones they support so they too must be supported by other norms, and this implies a network. From the network viewpoint understanding a norm means more than knowing what it calls on the party to do. We must know how other actors in the group, and even the party himself, should respond to compliance or violation, and know how the supporting norms are supported. For apologizing, we must ask many hypothetical questions: How should others respond when someone has failed to apologize? If the non-apologizer is to be ostracized but one group member refuses, how should others respond to the latter? If the individual makes a sincere apology, does the recipient have a duty to accept it? What is one committing to by accepting an apology, and in particular, is it the same as forgiving? Often the answers will depend on the context, on the offense and the recipient of the apology.
Game theory becomes relevant since a group member will generally follow a norm if he or she believes that the others are motivated to follow the supporting norms. The qualification “generally” is needed because, strictly speaking, norms are associated with types of situations, with game forms rather than specific games (O’Neill 1999). They show a typical pattern of utilities but include exceptions. Normative behavior is usually the equilibrium, but sometimes the payoffs motivate the actor to a violation -- to break a promise, steal or murder. Indeed sometimes overlapping norms exist that justify the violation when all factors are considered. The examples here are for a typical payoff pattern.
A simple normative system for apologies
The skeleton of a system around apologies can be shown by a repeated game. It has two players, who at each stage simultaneously choose one of three moves, with these payoff consequences to the mover:
Transfer (T): Transfer 12 units to the other at a cost of 6.
Withhold (W): Transfer 0 units to the other at a cost of 0.
Self-punish (S): Transfer 0 units at a cost of 1.
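Matrix 1. Stage-game payoffs computed from the move definitions above (Row’s payoff first, Column’s second).

              T           W           S
    T       6, 6       -6, 12      -6, 11
    W      12, -6       0, 0        0, -1
    S      11, -6      -1, 0       -1, -1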
The stage game, in Matrix 1, is a Prisoner’s Dilemma augmented with a third row and column. The added moves seem pointless since they are strongly dominated by the second moves and all their outcomes are Pareto-inferior, but they will turn out to influence the equilibria when the game is repeated.
The players move at t = 1, 2, 3, . . ., and know all past moves. Each has the goal of maximizing the present value of its payoff stream, and they use the same discount rate δ in (0, 1). Thus, if both played T forever each would receive 6, 6, 6, . . ., and value that at 6 + 6δ + 6δ² + . . . = 6/(1 − δ).
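As a small numerical illustration of the present-value calculation, the following sketch discounts a finite list of payoffs and then adds a constant payoff received forever afterwards. Here `delta` stands for the common discount rate in (0, 1); the function name `present_value` and its `tail` parameter are assumptions made for this example.

```python
def present_value(stream, delta, tail=0.0):
    """Discounted value of the finite list `stream`, followed by `tail` forever.

    The first payoff is undiscounted, the second is weighted by delta, and so on.
    The constant tail is summed with the geometric-series formula tail / (1 - delta).
    """
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    head = sum(x * delta**t for t, x in enumerate(stream))
    return head + delta**len(stream) * tail / (1 - delta)


# Mutual Transfer forever at delta = 0.5: 6 / (1 - 0.5)
print(present_value([], 0.5, tail=6))
```

Splitting the stream into a finite head and a constant tail matches the paths used later, which all settle into mutual Transfer after at most a few stages.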
A strategy in the game tells a player what to do at each stage for any possible history of what they both have done so far. Our task will be to assign the players a pair of strategies such that neither player can choose an alternative yielding a higher present value than the strategy assigned, given the opponent uses its assigned strategy. This property must hold for any situation they might find themselves in, even those arising from moves that were contrary to the strategies. That is, we will look for a subgame perfect Nash equilibrium.
Rather than consider strategies directly, we take an approach due to Abreu (1988), which is both computationally convenient and fits with the concept of a network of norms. The equilibrium will be given indirectly by specifying three paths of play. A path of play is an infinite sequence of pairs of moves that the players might make; an example would be TT, WW, TT, WW, . . . An equilibrium is then defined by three paths: an initial path and two punishment paths, one for each player. The initial path states their joint play if they follow the equilibrium. A player’s punishment path specifies the pairs of moves that both make if that player deviates from the current path, whether the latter is the initial path or someone’s punishment path. At any point in the game a unilateral deviation from the current path has the same result: it switches play to the start of the deviator’s punishment path. A failure to punish the other appropriately, for example, transfers the game to the start of one’s own punishment path. A simultaneous deviation by both players, however, is ignored and the current path continues. Together the three paths are known as a simple strategy profile (SSP). From an SSP one can derive corresponding strategies for the players, but the reverse is not true: not every pair of strategies can be represented by three paths. However, in terms of observed behavior in equilibrium the two methods are alike: if a certain sequence of moves arises from a subgame perfect Nash equilibrium, it also arises from an SSP (Abreu 1988).
We will introduce four constraints on an equilibrium. The first is that it be in pure strategies, and the second that it yield mutual cooperation, TT forever. As in repeated PD games, mutual “Always Withhold” constitutes an equilibrium, but it is socially undesirable. Third, for the sake of simplicity the equilibrium must treat the players identically. Finally, after a deviation their paths must return to mutual cooperation (TT) reasonably soon; we require that it happen within two moves. This fourth condition is prompted by the earlier idea that norms are represented by game-types rather than games proper, and so they will sometimes be violated. They should therefore be “non-grim”: their violation should not lead to permanent harm.
One SSP, called Apologize-and-Restitute, constitutes a subgame perfect equilibrium that satisfies the conditions. Also, for the particular payoffs used, it will be shown to be the most robust such equilibrium, in the sense that it produces cooperation at the lowest discount rate of any (see the Appendix). Its definition is as follows:
The initial path gives players 6 forever, but should Row unilaterally go for the immediate payoff of 12 by choosing Withhold, Row must then choose Self-punish while Column chooses Withhold, and at the next stage must restitute Column by choosing Transfer while Column chooses Withhold. Then they resume mutual Transfers. Technically they are still on Row’s punishment path, but they are behaving the same as if there had been no deviation. The everyday notion of an apology gets translated into punishing oneself, paying a social cost in face and credibility, then giving restitution, which in the world might mean undoing the damage.
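The path-switching rule of a simple strategy profile can be traced with a short simulation. The sketch below encodes the three Apologize-and-Restitute paths as finite prefixes followed by mutual Transfer, and restarts the deviator's punishment path after any unilateral deviation. It is an illustrative reading of the description above, not the author's code: the names `PATHS`, `prescribed` and `play` are hypothetical, and stages are indexed from 0 rather than from t = 1.

```python
# Each move: (units transferred to the other player, cost to the mover).
MOVES = {"T": (12, 6), "W": (0, 0), "S": (0, 1)}

# Finite prefixes of (row_move, col_move); every path continues with TT forever.
PATHS = {
    "initial": [],
    "punish_row": [("S", "W"), ("T", "W")],   # Row apologizes, then restitutes
    "punish_col": [("W", "S"), ("W", "T")],   # mirror image for Column
}


def payoff(own, other):
    """Stage payoff to a player choosing `own` against `other`."""
    return MOVES[other][0] - MOVES[own][1]


def prescribed(path, pos):
    """Pair of moves that `path` calls for at position `pos`."""
    prefix = PATHS[path]
    return prefix[pos] if pos < len(prefix) else ("T", "T")


def play(n_stages, deviations=None):
    """Simulate play; `deviations` maps a stage to (row_move or None, col_move or None)."""
    deviations = deviations or {}
    path, pos, history = "initial", 0, []
    for t in range(n_stages):
        want_row, want_col = prescribed(path, pos)
        dev_row, dev_col = deviations.get(t, (None, None))
        row, col = dev_row or want_row, dev_col or want_col
        history.append((row, col))
        row_off, col_off = row != want_row, col != want_col
        if row_off and not col_off:
            path, pos = "punish_row", 0      # unilateral deviation by Row
        elif col_off and not row_off:
            path, pos = "punish_col", 0      # unilateral deviation by Column
        else:
            pos += 1                         # no deviation, or a joint one (ignored)
    return history


history = play(5, {0: ("W", None)})           # Row withholds once at the first stage
print(history)
print([payoff(r, c) for r, c in history])     # Row's payoff stream
```

Passing a second deviation, for example `{0: ("W", None), 1: ("W", None)}`, shows the punishment path restarting when Row refuses to self-punish.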
A player is induced to stay cooperative by the fear of having to self-punish and restitute. If that course is called for, a player is induced to endure it by the prospect of the imminent return to cooperation. If a player refuses to follow his or her punishment path, it will be restarted, in the sense that the other will again choose Withhold at the next two stages. Contrary to Gilbert and Sullivan, we do not make the punishment fit the crime: all deviations are dealt with in the same way, so that only a few norms are enough.
The Apologize-and-Restitute SSP is a perfect equilibrium down to a discount rate of δ = .564 (as proven in the Appendix). This minimum measures the equilibrium’s robustness, and lower is better in the rough sense that players would be more ready to stay with the equilibrium if payoffs varied somewhat from those assumed. It can be proved that for the payoffs given, Apologize-and-Restitute has a lower minimum than any other equilibrium satisfying the cooperation-in-equilibrium, symmetry and cooperation-again-within-two-moves conditions. The success of Apologize-and-Restitute follows from the order of the outcomes on the punishment paths: the less costly apology comes first, then the more costly restitution. After a deviation the deviator receives -1, -6, 6, 6, . . . , whereas the reverse SSP, “Restitute-and-Apologize”, would give -6, -1, 6, 6, 6, . . . as a payoff stream. It would not be an equilibrium at such a low discount rate, since a violator would not accept -6 now while waiting two moves for cooperation.
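The threshold can be checked numerically without the Appendix. The sketch below tests every one-shot deviation at every position of the three paths and then bisects on the discount rate. It is a numerical check under stated assumptions, not the Appendix proof: all function names are hypothetical, the bisection assumes the equilibrium conditions hold at 0.99, fail at 0.01 and switch only once in between, and it says nothing about the optimality claim over all admissible equilibria.

```python
# Each move: (units transferred to the other player, cost to the mover).
MOVES = {"T": (12, 6), "W": (0, 0), "S": (0, 1)}


def payoff(own, other):
    return MOVES[other][0] - MOVES[own][1]


def make_paths(row_punishment):
    """Three paths as finite prefixes (then TT forever); Column's path mirrors Row's."""
    return {
        "initial": [],
        "punish_row": list(row_punishment),
        "punish_col": [(c, r) for r, c in row_punishment],
    }


def value(paths, path, pos, player, delta):
    """Present value to `player` (0 = Row, 1 = Column) of following `path` from `pos`."""
    prefix = paths[path][pos:]
    head = sum(delta**t * payoff(m[player], m[1 - player]) for t, m in enumerate(prefix))
    return head + delta**len(prefix) * 6 / (1 - delta)


def is_equilibrium(paths, delta):
    """Check that no single unilateral deviation pays at any position of any path."""
    own_punishment = ("punish_row", "punish_col")
    for path, prefix in paths.items():
        for pos in range(len(prefix) + 1):            # the extra position is the TT tail
            moves = prefix[pos] if pos < len(prefix) else ("T", "T")
            for player in (0, 1):
                stay = value(paths, path, pos, player, delta)
                restart = value(paths, own_punishment[player], 0, player, delta)
                for alt in MOVES:
                    if alt == moves[player]:
                        continue
                    deviate = payoff(alt, moves[1 - player]) + delta * restart
                    if deviate > stay + 1e-12:
                        return False
    return True


def min_delta(paths, lo=0.01, hi=0.99, tol=1e-6):
    """Smallest delta in [lo, hi] at which the checks pass, found by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_equilibrium(paths, mid):
            hi = mid
        else:
            lo = mid
    return hi


apologize_first = make_paths([("S", "W"), ("T", "W")])
restitute_first = make_paths([("T", "W"), ("S", "W")])
print(round(min_delta(apologize_first), 3), round(min_delta(restitute_first), 3))
```

With the payoffs above, the search returns about 0.564 for Apologize-and-Restitute, in line with the stated minimum, and about 0.635 for the reverse order. In both cases the binding constraint is the first stage of the punishment path, where the deviator must prefer starting the punishment to withholding and restarting it.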