Tuesday, May 09, 2006
For The Love of Counterintuition
by Tom Bozzo
Even editors of top journals must need a laugh now and then, and I could just hear the yuks among the top ranks at Nature over the research Drek discussed yesterday. After all, what could be more counterintuitive than to suggest that the sight of an attractive woman* (hell, just about any woman) makes men "more rational"? I hate to argue from sex comedy stereotype,** but just maybe Nature referees even have some personal experience to bring to bear. In any event, the claim appears to be among those requiring extraordinary evidence.
So did Van den Bergh and Dewitte marshal the weight of evidence needed to establish that "men... become more economically rational after exposure to lingerie or sexy women?"*** Before readers start installing distractions in anticipation of the private capital guys coming in to seal a big deal, let me suggest not so much.
Herein lies a broader problem with experimental economics. Some types of experiments themselves are facially insufficient to support broad statements about the structure of preferences and/or the bounds on rational choice as a neoclassical (but otherwise mostly sensible) economist might understand the terms. Nor would I expect them to do much convincing for a rational choice doubter. They just don't explore enough of a range of economic decisions to say much one way or another without requiring some leap of faith.
The Van den Bergh/Dewitte experiment involved games played for a 10 euro stake. I gather that double digits of dollars or the foreign currency equivalent are considered "high stakes" in experimental economics-land, which if nothing else says something about the difficulty of obtaining adequate funding for economics experiments. Those stakes are several orders of magnitude smaller than those involved in decisions almost everyone in the developed world ends up making. Buy a house, pick a major, or start smoking, and the consequences may be valued in the hundreds of thousands, if not millions, of dollars. If the standard rational choice model can be screwed up in the vicinity of $10, what really can be extrapolated to the influence of the real estate agent's cosmetic surgery and choice of vehicle when ten or a hundred thousand times as much is at stake? Infinite sample size and an otherwise flawless experimental design wouldn't convince me of much beyond the ability of men to be distracted (or something) while playing small-stakes ultimatum games.
In short, some experiments are interesting but I'm not sure how much "there" is there, Bank of Sweden Prize in memory of Alfred Nobel or no. For instance, under the standard rational choice model in economics, you'd expect symmetry in how people value getting something and having the same thing taken away. If I'm willing to pay $500 for my bike, I should be willing to accept $500 from someone who wanted to take away my bike for some reason. However, there's a stylized fact (not solely derived from experimental economics, though it's been observed in some famous experiments) that people state lower "willingness to pay" (WTP) values than "willingness to accept" (WTA) values. An explanation for the WTP-WTA gap is called the "endowment effect," which is an excess resistance to having stuff taken away from us, even with compensation.
This has pretty obvious policy implications for areas such as environmental economics (measuring the benefits of an environmental regulation) or land use policy (figuring out why people offered more than the fair market values of their properties nevertheless lie down in front of the bulldozers). FWIW, my two cents is that the WTP and WTA scenarios seldom are truly symmetric outside of thought experiments, and at least some of the observed "endowment effect" or WTP-WTA gap is real and reflects the asymmetry of the problems more than an "unconventional" structure of preferences.
A recent paper by Charles Plott and Kathryn Zeiler in the AER purported to show that previous experimental results supporting the existence of an endowment effect were, in effect, artifacts of the experimental design, and the evidence that preferences are misbehaved is accordingly weak. This was heralded in some circles — e.g., anti-regulation "law and economics" circles — as a death knell to regulatory policies based on the existence of the endowment effect. At least Joshua Wright of the GMU law school recommended that interested scholars read the whole thing.
My reading was that there's a fine line between training out "misperceptions" of experimental subjects and training in the behavior the experiment is supposed to be observing neutrally, and I think it's at least conceivable that Plott and Zeiler got the result they trained their subjects to provide. But more than that, they were playing with lotteries involving $1-$8 (the upper end of the range being "high stakes" games) and the valuation of an $8.50 travel mug. Maybe there's just a little difference between those scenarios and real high-stakes decisions, eh? Is valuing an excess statistical case of cancer from allowing some incremental air pollution as hard a problem as valuing a travel mug?
Fairness does require me to admit that the Van den Bergh/Dewitte result is not extraordinary in at least one important sense: using non-informative communications to alter the outcomes of games is not new. But susceptibility to certain kinds of bluffs or irrelevant signals is something that presumably can be trained away.
As a random bullet of nothing for Drek, who is probably a little disappointed by this reply, I learned while Googling around after yesterday's post that one of my zoologist neighbor's areas of research is the effect of sexual selection on speciation. So if the Jeffrey Skillings of the world exclusively married the former secretaries who would consider marrying the Jeffrey Skillings of the world, they could end up as a distinct species. As for the rest of us...
Comments:
Hell, Tom, how can I be disappointed? So far my guest-posts over here have provoked an equal or greater quantity of subsidiary posts from others. How can I ask for more?
The "stakes" criticism has been thrown again and again at experimentalists. Yes, these experiments are usually run on undergraduate students who end up earning, on average, $20/hour (at least this is true at George Mason, where I am more familiar with the procedures).
But even if the stakes are too low for some published results, these findings should be interesting to economists, since most economic theories don't assume "high" stakes.
Secondly, the only way I can see to test whether stakes are affecting the result is by replicating the experiment with higher stakes--not by discounting the experimental method. In the cases where these replications have been done, sometimes using industry professionals (in the case of Vernon Smith's original "Bubble" experiments) and other times by going to developing countries where the payoffs were phenomenally high, the high-stakes results have validated the general finding of the low-stakes results.
Michael: My original post wording was too sweeping in the critique of experiments, and the post as it stands is (or at least is meant to be) toned down. I was being cuter about it, but in mentioning the "difficulty of obtaining adequate funding for economics experiments," I agree that experimentalists should seek to confirm their results (or not) more broadly.
The bigger problem is in the interpretation of the results (not necessarily by the experimenters themselves). For instance, claims that the Plott/Zeiler results overturn other scholarship demonstrating WTP-WTA gaps should wait until Plott/Zeiler is replicated at many scales and (to the best of researchers' ability) for policy-relevant choices -- and some demonstration that Plott and Zeiler didn't train in their result. Ditto with experiments like Van den Bergh and Dewitte before anyone goes about making claims about the direction of scantily clad women-rationality effects.