What is Savage's subjective expected utility model?
Savage’s subjective expected utility model is an idea in decision theory similar to the VNM theorem. It concerns an agent whose preferences over uncertain mixes of outcomes satisfy some rationality assumptions, and it says that such an agent will maximize expected utility. However, unlike VNM, Savage doesn’t assume that the outcomes in these uncertain mixes start with probabilities attached. Rather, the uncertainty comes in the form of dependence on unknown world states. That this uncertainty can be represented by a probability distribution is derived as a conclusion, not assumed.
Suppose you’re in a game show called “Guess How Much Water,” and the host shows you an unmarked jug. You can guess how much is in the jug based on how it looks, but you have no definite information. The world has a state that you’re uncertain about; different volumes of water, like 1L or 2L, are possible values of that state.
Then suppose the host starts offering you different deals. For example: “If you take the Risky Deal, then if the volume is under 2L, you get nothing; if it’s between 2L and 3L, you get a house; and if it’s over 3L, you get a television. But if you take the Safe Deal, then you get a boat regardless.”
In Savage’s framework, these deals are called “acts.” We’ll use “deals” and “acts” interchangeably to refer to mappings from states to outcomes. They’re similar to lotteries in VNM, but with uncertainty in the form of dependence on the unknown world state, rather than explicit probabilities.
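To make the setup concrete, here’s a minimal sketch (just an illustration, not part of Savage’s formalism) that represents the host’s two deals as Python functions from a world state, the volume of water in litres, to an outcome:

```python
# A world state is the jug's volume in litres; an outcome is a prize.
# An "act" (deal) is any mapping from world states to outcomes.

def risky_deal(volume_litres: float) -> str:
    """The Risky Deal: which prize you get depends on the unknown volume."""
    if volume_litres < 2:
        return "nothing"
    elif volume_litres <= 3:
        return "house"
    else:
        return "television"

def safe_deal(volume_litres: float) -> str:
    """The Safe Deal: the same prize in every world state."""
    return "boat"

# You don't know the true state, so you don't know which outcome the risky
# act delivers: risky_deal(1.7) == "nothing", but risky_deal(2.4) == "house".
```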
Just like VNM uses preferences between all pairs of lotteries, Savage’s theorem uses preferences between all pairs of hypothetical deals — “if I had to choose between A and B, then I’d choose A.” So keep in mind that an “act” may not be something you do, or even something you’re given the option to do. (In a real game show or decision situation, you’d only get to choose from a limited range of acts. There are some results in the context of “revealed preference” that start from limited observations of your choices, but we won’t discuss those here.)
As with VNM, the theorem requires some assumptions to work, which you can think of as rationality requirements. Preferences between acts are assumed to satisfy:
- “Completeness”: for every pair of acts, you either prefer one act to the other, or are indifferent. (Same as VNM.)
- “Transitivity”: if you prefer A to B and B to C, then you prefer A to C. (Same as VNM.)
- The “sure-thing principle.” This principle says that if two deals have the same outcome in some possible worlds, it doesn’t matter what that outcome is; changing it won’t change your preference.[^1] For example, suppose in deal A, you get a bicycle if the volume is under 1L, and a house if the volume is between 1L and 2L. In deal B, you get a car if the volume is under 1L, and a boat if the volume is between 1L and 2L. Finally, in both deals, if the volume is over 2L, you get a gift box. Your preference between deals A and B will depend on your feelings about cars, houses, boats, bicycles, and the amount of water in the jug. But the sure-thing principle says it can’t depend on what is in the gift box. Learning that there’s a television or a snake in the box won’t change your preference between A and B, because it’s the same change in both deals. In other words, according to the sure-thing principle, your preferences between deals depend only on worlds where the deals differ. (There’s a numerical sketch of this after the list.)
- “Monotonicity in consequences.” This axiom roughly says that if there’s an outcome you unconditionally prefer to another, then changing the bad outcome to the good one in some world states makes the deal more attractive (or at least no less attractive). For example, if you’d rather unconditionally get a car than unconditionally get a goat, then you’d prefer a deal where you get a car if the volume is between 5L and 6L and $1000 otherwise over a deal where you get a goat if the volume is between 5L and 6L and $1000 otherwise.
- “Independence of beliefs from tastes.” This axiom says: suppose there’s an outcome you prefer to another outcome, and you’d rather have the good outcome turn on one event (i.e., happen inside but not outside a set of world states) than another event. Then that’s a feature of those events, independent of the particular outcomes: you’d rather have the outcome turn on the one event than the other event for any good and bad outcome. You can think of the first event as the one you’re “more willing to bet on,” the one you “believe more in.” Concretely, suppose you’d rather (unconditionally) have a car than a bicycle, and rather (unconditionally) have a washing machine than a blender. Suppose the “car-if-low” deal gives you a car if the volume is under 1L and a bicycle otherwise, and the “car-if-high” deal gives you a car if the volume is above 2L and a bicycle otherwise. And suppose the “washer-if-low” deal gives you a washing machine if the volume is under 1L and a blender otherwise, and the “washer-if-high” deal gives you a washing machine if the volume is above 2L and a blender otherwise. Then the axiom says if you prefer “car-if-low” to “car-if-high,” you also have to prefer “washer-if-low” to “washer-if-high.” You can’t “believe” the water has a large volume for purposes of making car decisions while also “believing” the water has a small volume for purposes of making washing machine decisions. In other words, events have some property that determines how much you care about prospects of good things resulting from those events, independent of what those good things are. (This property will turn out to be the events’ probability.)
- “Non-triviality.” This simply says there exists one deal that you prefer to another. (The theorem uses your deal preferences as information from which to pin down your subjective beliefs and your outcome preferences, so if you’re indifferent between all deals, there’s nothing to go by.)
- “Continuity in events.” This basically says you can divide events as finely as you need. If you have a better deal and a worse deal, plus some replacement outcome, you can split the possible world states into pieces so small that substituting the replacement outcome on any single piece won’t flip your preference. For example, suppose you’d rather have “car if volume over 2L, or else nothing” than “bicycle if volume under 1L, or else nothing,” and the replacement outcome is a billion dollars. Even though you care a lot about getting a billion dollars, if you divide the possible volumes into ranges a nanoliter wide, the promise of a billion dollars in any one such range probably wouldn’t change your preference. You’d still prefer “car if volume over 2L” to “bicycle if volume under 1L, or a billion dollars if volume between 1L and 1.000000001L.” Like with VNM’s continuity axiom, this says there aren’t outcomes you “care infinitely about.” It also says your beliefs aren’t concentrated in a single world state; with VNM, you’d get the same thing for free out of its assumption of real-numbered probabilities.
- (For infinite cases, there is a further technical assumption that replaces monotonicity — see Wikipedia for the details.)
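As a numerical illustration of the sure-thing principle (referenced in the bullet above), suppose your preferences happened to come from expected utility, which is what the theorem will eventually conclude. Then deals A and B, which agree on the “over 2L” event, have their expected utilities shifted by exactly the same amount when the gift box’s contents change, so the comparison can never flip. All the probabilities and utilities below are made up for the example:

```python
# Made-up beliefs about the three relevant events.
probs = {"under 1L": 0.3, "1L-2L": 0.5, "over 2L": 0.2}

# Made-up utilities, including two candidate gift-box contents.
utility = {"bicycle": 2, "house": 10, "car": 8, "boat": 5,
           "television": 4, "snake": -6}

def expected_utility(deal):
    """Expected utility of a deal, written as a dict from event to outcome."""
    return sum(probs[event] * utility[outcome] for event, outcome in deal.items())

for gift in ["television", "snake"]:
    deal_a = {"under 1L": "bicycle", "1L-2L": "house", "over 2L": gift}
    deal_b = {"under 1L": "car",     "1L-2L": "boat",  "over 2L": gift}
    # Prints True both times: what's in the gift box never affects the comparison.
    print(gift, expected_utility(deal_a) > expected_utility(deal_b))
```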
From here, the first step is to construct your subjective beliefs: if you’d rather have “a car if the volume is between 1L and 2L (else nothing)” than “a car if the volume is between 2L and 3L (else nothing),” then you must think the former range is “more likely.” If you look at lots of comparisons like this, the assumptions guarantee they’re all consistent with some probability distribution on volumes. For example, your beliefs might follow a normal distribution with mean 1.5L and standard deviation 0.2L.[^2]
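For instance, with that hypothetical normal belief, the 1-2L range really does carry far more probability than the 2-3L range, which is what makes “car if 1-2L” the better deal. A quick check using only Python’s standard library:

```python
from math import erf, sqrt

def normal_cdf(x, mean=1.5, sd=0.2):
    """CDF of the hypothetical N(mean=1.5L, sd=0.2L) belief about the volume."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

p_1_to_2 = normal_cdf(2.0) - normal_cdf(1.0)  # ~0.988
p_2_to_3 = normal_cdf(3.0) - normal_cdf(2.0)  # ~0.006

# "Car if 1-2L (else nothing)" beats "car if 2-3L (else nothing)" exactly
# because the first event carries more of your subjective probability.
print(p_1_to_2, p_2_to_3, p_1_to_2 > p_2_to_3)
```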
Then, based on this probability distribution, the next step is to construct your utility function, in a similar way to VNM. If you think the 1-2L range is twice as likely as the 2-3L range, and you’re indifferent between getting a bad car if 1-2L (and nothing otherwise) and getting a good car if 2-3L (and nothing otherwise), then your utility for getting a good car must be twice your utility for getting a bad car (if the utility of getting nothing is set to 0). As with VNM, the utility function isn’t unique: you can add the same number to all utilities, or multiply all utilities by the same positive factor. But the ratio between any two utility differences is fixed: however you shift or rescale, the gap between one pair of outcomes stays the same multiple of the gap between another pair.
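Here’s that arithmetic written out, with made-up probabilities on which the 1-2L range is twice as likely as the 2-3L range:

```python
# All numbers are hypothetical; only the ratios matter.
p_1_to_2 = 0.4  # probability the volume is between 1L and 2L
p_2_to_3 = 0.2  # probability the volume is between 2L and 3L (half as likely)

u_nothing = 0.0  # utilities are only pinned down up to shift and scale,
u_bad_car = 1.0  # so we can fix these two values freely

# Indifference between "bad car if 1-2L, else nothing" and
# "good car if 2-3L, else nothing" means their expected utilities are equal:
#     p_1_to_2 * u_bad_car + (1 - p_1_to_2) * u_nothing
#   = p_2_to_3 * u_good_car + (1 - p_2_to_3) * u_nothing
# With u_nothing = 0, solving for u_good_car gives:
u_good_car = p_1_to_2 * u_bad_car / p_2_to_3
print(u_good_car)  # 2.0, twice the utility of the bad car
```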
So Savage’s claim is that if your preferences have the structure given by the axioms above, then you have a unique subjective distribution mapping sets of world states to probabilities, and a utility function mapping outcomes to utilities, and the expected value of the utility function under these probabilities determines your preferences. That is, you’re an expected utility maximizer. As with other coherence theorems, there’s no doubt that the theorem is mathematically true, but where it can be applied, and what it implies, is a matter of debate.
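Stated compactly (this is one standard way of writing the conclusion, using a finite set of world states for simplicity): you prefer deal f to deal g exactly when f has the higher expected utility,

$$
f \succeq g \iff \sum_{s} P(s)\, u\big(f(s)\big) \;\ge\; \sum_{s} P(s)\, u\big(g(s)\big),
$$

where P is your unique subjective probability over world states, u is your utility function over outcomes (unique up to shifting and positive rescaling), and f(s) is the outcome that deal f gives you in state s.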
[^1]: Another version of the “sure-thing principle” says: if you prefer A to B knowing an event will happen, and you also prefer A to B knowing the event won’t happen, then you should prefer A to B if you don’t know whether the event will happen. This follows from the (stronger) axiom used here, which is Wikipedia’s P2.
[^2]: Though presumably you’re certain that the volume is not below zero.