KEVIN DORST

Stranger Apologies

Confirmation Bias Maximizes Expected Accuracy

10/17/2020

(1700 words; 8 minute read.)
[Figure: What rational polarization looks like.]

​It’s September 21, 2020. Justice Ruth Bader Ginsburg has just died.  Republicans are moving to fill her seat; Democrats are crying foul.

​Fox News publishes an op-ed by Ted Cruz arguing that the Senate has a duty to fill her seat before the election. The New York Times publishes an op-ed on Republicans’ hypocrisy and Democrats’ options.

Becca and I each read both. I—along with my liberal friends—conclude that Republicans are hypocritically and dangerously violating precedent. Becca—along with her conservative friends—concludes that Republicans are doing what needs to be done, and that Democrats are threatening to violate democratic norms (“court packing??”) in response.

In short: we both see the same evidence, but we react in opposite ways—ways that lead each of us to be confident in our opposing beliefs.  In doing so, we exhibit a well-known form of confirmation bias.

And we are rational to do so: we both are doing what we should expect will make our beliefs most accurate.  Here’s why.

Confirmation bias is the tendency to gather and interpret evidence in a way that can be expected to favor your prior beliefs (Nickerson 1998; Whittlestone 2017). There are two parts to this tendency.

Selective exposure is the tendency to look for evidence that confirms your prior beliefs (Frey 1986). This captures the fact that I (a liberal) tend to check the New York Times more than Fox News, and Becca (a conservative) does the opposite.

Biased assimilation is the tendency to interpret evidence in a way that favors your prior beliefs (Lord et al. 1979).  This is what happened when Becca and I read the same two op-eds about RBG’s vacant seat and came to opposite conclusions about them.

Set aside selective exposure for now; today let’s focus on biased assimilation.  I’m going to argue that it’s the rational response to ambiguous evidence.

Consider what those who exhibit biased assimilation actually do (Lord et al. 1979; Taber and Lodge 2006; Kelly 2008).

​
They are presented with two pieces of evidence—one telling in favor of a claim C, one telling against it. They have limited time and energy to process this evidence.  As a result, the group that believes C spends more time scrutinizing the evidence against C; the group that disbelieves C spends more time scrutinizing the evidence in favor of C.

In scrutinizing the evidence against their prior belief, what they are doing is looking for a flaw in the argument; a gap in the reasoning; or, more generally, an alternative explanation that could nullify the force of the evidence.

For example, when I read both op-eds, I spent a lot more time thinking about Cruz’s reasons in favor of appointing someone (I even did some googling to fact check them). In doing so, I was able to spot the fact that some of the reasoning was misleadingly worded; for instance:
“Twenty-nine times in our nation’s history we’ve seen a Supreme Court vacancy in an election year or before an inauguration, and in every instance, the president proceeded with a nomination.”
True. But this glosses over the fact that just 4 years ago, Obama did indeed “proceed with a nomination”—and in response Senate Republicans (with Cruz’s support) blocked that nomination using the excuse that it was an election year.

The point? I decided to spend little time thinking about the details of the New York Times’s argument, and so found little reason to object to it; instead, I spent my time scrutinizing Cruz’s argument, and when I did I found reasons to discount it.

Meanwhile, Becca did the opposite: she scrutinized the New York Times’s argument more than Cruz’s, and in doing so no doubt found flaws in the argument.

Notice what that means: although Becca and I were presented with the same evidence initially, the way we chose to process it meant we ended up with different evidence by the end of it. I knew subtle details about Cruz’s argument that Becca didn’t notice; Becca knew subtle details about the New York Times argument that I didn’t notice.

There are two claims I want to make about the way in which selective scrutiny led us to have different evidence.

First: such selective scrutiny leads to predictable shifts in our beliefs. For example, as I was setting out to scrutinize Fox’s op-ed, I could expect that doing so would make me more confident in my prior belief that RBG’s seat should not yet be replaced.

Second: nevertheless, such selective scrutiny is epistemically rational—if what you want is to get to the truth of the matter, it often makes sense to spend more energy scrutinizing evidence that disconfirms your prior beliefs than that which confirms them.

Why are these claims true?

Scrutinizing a piece of evidence is a form of cognitive search: you are searching for an alternative explanation that would fit the facts of the argument but remove its force.

If you’ve kept up with this blog, that should sound familiar: it’s a lot like searching your lexicon for a word that fits a string—i.e. a word-completion task. When I look closely at Cruz’s argument and search for flaws, cognitively what I’m doing is just like when I look closely at a string of letters—say, ‘_E_RT’—and search for a word that completes it.  (Hint: what’s in your chest?)

In both cases, if I find what I’m looking for (a problem with Cruz’s argument; a word that completes the string) I get strong, unambiguous evidence, and so I know what to think (the argument is no good; the string is completable).  But if I try and fail to find what I’m looking for, I get weak, ambiguous evidence—I should be unsure whether to think the argument is any good; I should be unsure how confident to be that the string is completable.   

Thus scrutinizing an argument leads to predictable polarization in the exact same way our word-completion tasks do.  If I find a flaw in Cruz’s argument, my confidence in my prior belief goes way up; if I don’t find a flaw, it goes only a little bit down. Thus, on average, selective scrutiny will increase my confidence.
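To put rough numbers on that asymmetry, here's a toy calculation in Python. The specific values (a 0.7 prior, a 60% chance of finding a flaw, posteriors of 0.85 and 0.65) are purely illustrative stand-ins, not outputs of the models in the technical appendix:

```python
# Toy numbers (illustrative only) showing how an asymmetric response
# to scrutiny raises confidence on average.

prior = 0.70            # prior confidence that the seat shouldn't be filled yet
p_find_flaw = 0.60      # chance of finding a flaw when scrutinizing the opposing op-ed

post_if_flaw = 0.85     # finding a flaw is unambiguous: confidence jumps
post_if_no_flaw = 0.65  # a failed search is ambiguous: confidence dips only slightly

expected_posterior = p_find_flaw * post_if_flaw + (1 - p_find_flaw) * post_if_no_flaw
print(expected_posterior)  # about 0.77 > 0.70: on average, scrutiny confirms the prior
```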

Nevertheless, such selective scrutiny is epistemically rational.  Why?

Because it's a good way to avoid ambiguous evidence—and, therefore, is often a good way to make your beliefs more accurate.

To see this, ask yourself: would you rather do a word-completion task where, if there’s a word, it’s easy to find (like ‘C_T’), or hard to find (like ’_EAR_T’)?  Obviously you’d prefer to do the former, since the easier it is to recognize a word, the easier it is to assess your evidence and come to an accurate conclusion.

Thus if you’re given a choice between two different cognitive searches—scrutinize Cruz’s argument, or scrutinize the NYT’s—often the best way to get accurate beliefs is to scrutinize the one where you expect to find a flaw.

Which one is that? More likely than not, the argument that disconfirms your prior beliefs, of course!  For, given your prior beliefs, you should think that such arguments are more likely to contain flaws, and that their flaws will be easier to recognize.

Thus I expect Cruz’s argument to contain a flaw, so I scrutinize it; and Becca expects the NYT’s argument to contain a flaw, so she scrutinizes it. These choices are rational—despite the fact that they predictably lead our beliefs to polarize.
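To put numbers on that last claim, here's a back-of-the-envelope calculation. The assumed flaw rates (arguments for the false side of a question are flawed 70% of the time, arguments for the true side only 30% of the time) are illustrative, not estimates from any study:

```python
# Illustrative calculation: a prior in C makes flaws in anti-C arguments expected.

p_C = 0.70                   # prior confidence in claim C
p_flaw_if_wrong_side = 0.70  # assumed: arguments for the false side are usually flawed
p_flaw_if_right_side = 0.30  # assumed: arguments for the true side are rarely flawed

# By my lights, how likely is each argument to contain a flaw?
p_flaw_anti_C = p_C * p_flaw_if_wrong_side + (1 - p_C) * p_flaw_if_right_side  # 0.58
p_flaw_pro_C = p_C * p_flaw_if_right_side + (1 - p_C) * p_flaw_if_wrong_side   # 0.42

print(p_flaw_anti_C, p_flaw_pro_C)  # I expect more flaws in the argument against C
```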

We can buttress this conclusion by formalizing and simulating this process.

Given your prior beliefs and a piece of evidence to scrutinize, we can calculate the expected accuracy of doing so.  (As always, the belief-transitions in my models satisfy the value of evidence with respect to the live question—say, whether the evidence contains a flaw—so you always expect to get more accurate by scrutinizing it. See the technical appendix.)

I randomly generated 10,000 such potential cognitive searches of pieces of evidence, and plotted how likely you are to find a flaw in the evidence (if there is one) against how accurate you expect scrutinizing the argument to make you. As can be seen, there is a substantial positive correlation between the two:
[Figure: Simulation of 10,000 random cognitive searches, plotting the chance of finding the item you're searching for against the expected accuracy of the search.]
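For readers who want to tinker, here's a minimal sketch of a simulation in this spirit. It is not the model from the technical appendix (which builds in the value of evidence and the ambiguity of failed searches); it simply scores a Bayesian flaw-search by expected Brier accuracy, with each random search parameterized by a prior probability p that the evidence is flawed and a detection probability d:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

p = rng.uniform(0.05, 0.95, n)  # prior probability that the evidence contains a flaw
d = rng.uniform(0.05, 0.95, n)  # chance of finding the flaw, if there is one

# Posterior probability of a flaw after a search that comes up empty:
q = p * (1 - d) / (1 - p * d)

# Expected Brier accuracy (1 minus expected squared error) about "is it flawed?":
#   flaw found  (prob p*d):     credence 1, error 0
#   flaw missed (prob p*(1-d)): credence q, error (1-q)^2
#   no flaw     (prob 1-p):     credence q, error q^2
expected_accuracy = 1 - (p * (1 - d) * (1 - q) ** 2 + (1 - p) * q ** 2)

print(np.corrcoef(d, expected_accuracy)[0, 1])  # positive: better detection, higher accuracy
```

The correlation is imperfect (accuracy also depends on the prior p), which matches the scatter in the plot above.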
This means that it makes sense to tend to scrutinize evidence whose flaws you expect to be able to recognize—i.e., often, the evidence that disconfirms your prior beliefs.

In particular, suppose Becca and I started out each expecting 50% of the pieces of evidence for/against replacing RBG to contain flaws, but I am slightly better at finding flaws in the supporting evidence, and she is slightly better at finding flaws in the detracting evidence.

Suppose then we are presented with a series of random pairs of pieces of evidence—one in favor, one against—and at each stage we decide to scrutinize the one that we expect to make us more accurate.  Since accuracy is correlated with whether we expect to find flaws, this means that I will be slightly more likely to scrutinize the supporting evidence, and she will be slightly more likely to scrutinize the detracting evidence.

As a result, we’ll polarize.  Even if, in fact, exactly 50% of the pieces of evidence tell in each direction, I will come to be confident that fewer than 50% of the pieces of evidence support replacing RBG, and she’ll come to be confident that more than 50% of them do:
[Figure: Simulation of two groups of agents scrutinizing bits of evidence about a fixed question. Red lines are agents like Becca, who are better at finding flaws in detracting evidence. Blue lines are agents like me, better at finding flaws in supporting evidence. Thick lines are the averages of each group.]
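For concreteness, here's a toy version of that two-group simulation. It is a crude stand-in rather than the model from the technical appendix: the detection skills, the 60% chance of scrutinizing the side you're better at debunking (a proxy for the accuracy-based choice rule), and especially the fixed 0.6 weight assigned after a failed search (a blunt way of encoding the ambiguity of "no flaw found") are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

ROUNDS = 500          # pairs of evidence (one supporting replacement, one detracting) per agent
P_FLAWED = 0.5        # each piece of evidence is genuinely flawed with probability 0.5
SKILL_BETTER = 0.8    # flaw-detection rate on the side the agent is better at debunking
SKILL_WORSE = 0.6     # flaw-detection rate on the other side
P_PICK_BETTER = 0.6   # agents slightly favor scrutinizing the side they're better at debunking
W_UNSEEN = 0.5        # weight given to the piece that isn't scrutinized (just its prior)
W_NO_FLAW = 0.6       # weight after a failed search; deliberately below the Bayesian
                      # posterior, a blunt proxy for the ambiguity of "no flaw found"

def run_agent(better_at: str) -> float:
    """Final estimate of the share of the evidence that genuinely supports replacement.
    better_at='pro' is a blue agent (like me, better at debunking supporting evidence);
    better_at='anti' is a red agent (like Becca)."""
    weights = {"pro": [], "anti": []}
    for _ in range(ROUNDS):
        other = "anti" if better_at == "pro" else "pro"
        scrutinized = better_at if rng.random() < P_PICK_BETTER else other
        for side in ("pro", "anti"):
            if side != scrutinized:
                weights[side].append(W_UNSEEN)
                continue
            skill = SKILL_BETTER if side == better_at else SKILL_WORSE
            flawed = rng.random() < P_FLAWED
            found = flawed and rng.random() < skill
            weights[side].append(0.0 if found else W_NO_FLAW)
    pro, anti = sum(weights["pro"]), sum(weights["anti"])
    return pro / (pro + anti)

blue = [run_agent("pro") for _ in range(20)]   # agents like me
red = [run_agent("anti") for _ in range(20)]   # agents like Becca
print(round(np.mean(blue), 3), round(np.mean(red), 3))  # blue drifts below 0.5, red above
```

In this toy, the polarization comes entirely from treating a failed search as ambiguous (weighting it below the full Bayesian posterior); with fully Bayesian weights, the asymmetric scrutiny would wash out on average.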

Upshot: biased assimilation can be rational. People with opposing beliefs who care only about the truth and are presented with the same evidence can be expected to polarize, since the best way to assess that evidence will often be to apply selective scrutiny to the evidence that disconfirms their beliefs.

There is some empirical support for this type of explanation (though more is needed). Biased assimilation is clearly driven by the process of selective scrutiny (Lord et al. 1979, Taber and Lodge 2006, Kelly 2008). Biased assimilation is more common when the evidence is ambiguous or hard to interpret (Chaiken and Maheswaran 1994, Petty 1998).  And the best-known “debiasing” technique is to explicitly instruct people to “consider the opposite”, i.e. to do cognitive searches that are expected to disconfirm their prior beliefs (Koriat 1980, Lord et al. 1984).

If my explanation is right, this is, in effect, asking people to not let accuracy guide their choice of cognitive searches—and it therefore is no surprise that people do not do this spontaneously.

In fact, it means that we can prevent people from polarizing only by preventing them from trying to be accurate.


What next?
The argument of this post draws heavily on a fantastic paper by Tom Kelly about belief polarization. It’s definitely worth reading, along with Emily McWilliams’s reply.
Jess Whittlestone has a fantastic blog post summarizing her dissertation on confirmation bias—and how she completely changed her mind about the phenomenon.
For more details, as always, check out the technical appendix (§6).
Next post: Why arguments polarize us.
8 Comments
Travis McKenna
10/20/2020 03:13:59 pm

Hi Kevin! Thanks so much for this. I've been reading some of these posts on and off and I really enjoy them. I'm looking forward to seeing you around at Pitt in the spring.

I had one (likely naive) reaction to this post. The reason that you suggest that biased assimilation can be rational is because the best way to assess the evidence with which we are presented is often to apply selective scrutiny to the evidence that disconfirms their beliefs. Why can that be the case? Because it is better to apply selective scrutiny to the evidence that we expect to contain flaws, and naturally we expect evidence that disconfirms our prior beliefs to be flawed in some way.

Here's a situation I was considering that makes me feel somewhat uneasy about the above. Suppose given some hot button issue, I believe that A is the right course of action. I have the time to apply selective scrutiny to two opinion pieces that are arguing instead for some B incompatible with A. One is written by a popular YouTuber known to run pretty fast and loose with things in order to stir up internet outrage, and the other is written by an extremely thoughtful, well-meaning political commentator whom I respect but with whom I otherwise have fundamental disagreements.

It seems like the reasoning you present (unless I misunderstand, which I may well have) should encourage me to read the popular piece, filled as it likely is with easy-to-spot flaws. But I can't help but feel that most people would say: of course not! I should read the second piece. If I find a flaw in such a piece, the payoff is likely to be far greater. It seems like reading the second piece pushes me in the direction of finding something like the *right kind of flaw*. It seems like we learn more if we discover that a thoughtful and considered argument for B suffers from some kind of flaw than if we discover that an otherwise agenda-driven argument for B rests on some kind of misleading presentation of fact.

So I suppose I want to say: is it really discovering that an argument against my position is flawed simpliciter that ought to provide me with some significant confidence boost in my prior belief, or should that be reserved for the discovery that such an argument suffers from a flaw that is deep and/or instructive? If it is the case that the mere discovery of flaws in arguments for B should not necessarily provide me with a significant increase in confidence in my prior belief that A, then isn't there a problem with the strategy of directing my selective scrutiny towards the evidence likely to present me with the most flaws in a generic sense?

And this seems related to what people, at least in the colloquial sense I find myself exposed to, are trying to indicate when they gesture at something like 'confirmation bias'. There seems something wrong with saying: I had time enough to read 3 pieces of a list of 10 that argued against my view, and so I read the 3 most superficial since they were most likely to contain flaws. We want to say: you should in fact look at the pieces that are least likely to contain flaws, or at least least likely to contain flaws of a certain kind.

In short: it seems like the kind of flaw we find matters, right? If that's right, then it seems that our expectations should be a lot more open. That is to say, if our prior belief is that A, then sure it seems rational to think that arguments in favour of B are more likely to contain flaws than arguments in favour of A. And then it's rational to direct our selective scrutiny towards arguments in favour of B if we are simply looking for any kind of flaw. But if what matters is in fact finding flaws of a particular kind (whatever they may be), then it seems like simply having the prior belief that A doesn't rationally entitle us to any particular belief about the *kinds* of flaws that arguments in favour of B are likely to suffer from. And in that case, it doesn't seem that we are justified in directing our selective scrutiny only to arguments in favour of B. Or something like that. It may be that the pieces most likely to contain flaws are the pieces that are least likely to contain flaws *of the kind that matter for my confidence in my prior belief*, and it seems like a strategy that directs your scrutiny towards the arguments least likely to change your mind is not far away from what seems to be meant by 'confirmation bias'.

In any case, thanks again for the piece. The above may be a confused mess, but hopefully there is something vaguely useful in my reaction. Would love to hear what you think!

Kevin
10/28/2020 09:44:26 am

Thanks for the thoughts, Travis! This is great, and I think you're right that it gives a good articulation of the kind of thought behind seeing CB as a bias. I'll sketch out a few thoughts, but will have to think more about it.

It's definitely right that we should care not only about whether or not the items are likely to contain flaws, but also about how revealing that flaw will be for helping shift our opinions. And a good proxy for that might be how surprising a flaw would be, which would be a force pushing in the opposite direction from the one I focused on in the piece. I think the natural way to think about this in these models is how much finding a flaw would shift your opinion from your prior to your posterior in the claim at issue.

This points to a limitation of the model I used in the simulations, I think. (Not an in-principle limitation; but a helpful one for me to think about complicating the models!) We were only tracking your estimate of the proportion of evidence that favored a given claim, not its force. And indeed, in doing the expected-accuracy calculations in these models, we are using minimal models which focus on being accurate about *which way the evidence points* (and a few other distinctions the models track), not about how strongly it points that way. I'm pretty sure the same dynamic will arise once we add that complication to the model—but your comments helped me to see the importance of actually coding that in there!

There's another sense, though, in which the models already take account of this sort of variability. In particular, the choice of which argument to scrutinize is NOT determined just by which one is more likely to contain a flaw. It's determined by how accurate doing the search will make you (albeit about a restricted range of questions—cf. the above point), which means that it is not uncommon to scrutinize evidence that is less likely to contain a flaw because the evidence is otherwise better at getting to the truth. (Perhaps if you do/don't find a flaw it's easier to know what to do with it; e.g. imagine a word completion task where you don't find a word and it's really obvious there is none.) This is why in the simulations I displayed there's an imperfect correlation between chance of completing the search and expected accuracy (first plot). It's also why the polarization that emerges out of it very quickly approaches a limit, and doesn't extend further than that—trying to maximize accuracy only *sometimes* leads you to do confirmatory searches—the rate at which you do so determines how far apart the red and blue lines end up in the second plot. (If agents *always* scrutinized the evidence that disconfirmed their prior beliefs, they'd limit to overall estimates that were much further apart.) So in that sense, the models do take (some) account of these counterbalancing forces.

So short story: you're absolutely right, and the models partly account for that, but I think I should complicate them to do so even more. Thanks for prodding me to think about that!

Dave Baker
10/21/2020 09:12:04 pm

Very interesting stuff. I'm skeptical about whether real-life belief polarization typically takes the form you suggest, though. If my biased assimilation were a rational process in the way you suggest, I would spend more time scrutinizing the evidence I get from looking at a Huffington Post article (trashy progressive publication prone to making bad arguments for what I think are true conclusions) as compared with e.g. posts on Philippe Lemoine's blog (very smart conservative who maintains a consistently high standard of argument in advocating views I usually find beyond the pale).

But in fact, I (because I'm progressive) tend to skim a HuffPo article and nod along, whereas I go over the Lemoine post with a fine-toothed comb looking carefully for flaws. I would be much more likely to find flaws in the argument if I gave the HuffPo article the fine-toothed comb treatment!

Kevin
10/28/2020 09:54:15 am

Interesting point! Thanks. I'll have to think on this example more—it's a good one.

Some of the stuff I said in the reply to Travis seems relevant here. I'm inclined to think that part of what's going on in such a case is that the heuristic I gave ("you expect to be more accurate when you expect to find flaws") is imperfect, and one of the ways it can fail is when there's an asymmetry in how good the evidence you're scrutinizing is. I take it that part of the reason you don't look for flaws in HuffPo articles is that you don't think it'll make a big difference to what you think—it won't have a strong effect on the accuracy of your beliefs. Whereas if you're reading someone who's presenting cogent and powerful arguments for a conclusion you're inclined to disagree with, it *does* make a big difference to your posterior beliefs whether you find a flaw or not.

Here's where I think it's really important that what the models are formalizing is a search for accurate evidence, and that just happens to be *correlated* with how likely you are to find a flaw. That both gives them a better claim to normative force, and also helps explain why in any given instance you shouldn't do the search that's more likely to lead to flaws—namely, if the evidence differs in other important respects. As I said in the reply to Travis above, there are some ways in which the models account for this, but I think these examples helped me realize there are some important additions to make in that regard as well. I'm pretty sure the same dynamic will emerge in that case, but I need to confirm.

In short, I think your example does a good job of illustrating the looseness between chance of finding flaws and whether we scrutinize a particular piece of evidence. It's less clear to me that it actually reveals that people are doing this in a way that doesn't fit the qualitative constraints of the model, since the model says the choice is made on grounds of expected accuracy, rather than chance of finding. But I need to get a more rigorous version of that reply up and running.

Thanks!

Maarten van Doorn
1/14/2022 12:20:35 pm

Hi Kevin,

I've spent all day going through your work on polarization, great stuff!

What I think I get is why *selective scrutiny* can be rational. The idea that it makes more sense to spend your limited time looking critically at belief-disconfirming evidence. Because (for all you can know) chances are highest you'll find a flaw there.

However, what I don't yet see, is why in scenarios like the famous Lord studies, IF we have then managed to find a flaw in the uncongenial evidence, we not only *discredit* that evidence but also become *more certain* of our opposite view.

That seems like: if Manchester United scores 1-0, but the goal is disallowed because Ronaldo was offside, now City goes 0-1 up (instead of the score returning to 0-0).

Which doesn't seem to make sense, right?

Kevin
1/18/2022 11:02:01 am

Hi Maarten,

Thanks for your kind words and good question!

I think there are a few subtleties here. One issue is that in the normal setup from the Lord et al. article, you get studies pointing in both directions, and you only scrutinize one of them. So in your analogy, it's a bit like things start at 0-0, then the two studies come in moving things to 1-1, and then you scrutinize one of them, debunking it back down to end up with 0-1.

Of course, we can imagine versions of the same sort of process where you're only presented with one study, and it disfavors your view. In that case people will of course be inclined to scrutinize it, and I think I *still* want to say that *sometimes* (not always), finding a flaw in this uncongenial evidence can lead you to increase your confidence in your own position. Here I think the football analogy breaks down a bit, because part of what's important is that when you're presented with a study, you often know that the person is TRYING to convince you—so what you want to assess is whether the case they're making is more convincing than you would've expected, knowing they're trying to convince you.

Here's a different analogy. Suppose a defense attorney stands up to defend their client, and says, "My client is innocent—just look at him, he's such a sweet-looking guy!" If you were on a jury, I take it you'd read this as a pretty damning defense—you'd think, "Of course his defense attorney is going to say he's innocent. If that's the best argument she can come up with, she must not have much to work with—he's probably guilty!"

In that sense, evidence can sometimes (rationally) "backfire" in a way that football goals can't. And I think that plays an important role in seeing how selective scrutiny sometimes (but not always, of course!) increases your confidence in your prior beliefs.

What do you think?

Kevin

Maarten van Doorn
1/23/2022 03:49:24 pm

Hi Kevin,

Thanks for replying!

I think your analogy is better, because it's in an argumentative context. Unlike the football game. But then what's actually doing the work in how selective scrutiny can lead to polarization is just the "if that's your best argument, your view is probably not well supported" mechanism. Right?

And then the idea is that selective scrutiny triggers that response for things like the anti-death-penalty study in the Lord study for pro-death-penalty participants?

One thought here is that for this tendency to be rational, there has to be an upper bound in terms of informational quality for evidence/'arguments the lawyer makes' that triggers this backfiring. The 'backfiring' has to distinguish between good and bad arguments, be (somewhat) proportional to evidential quality.

Another question I have is about the connection between polarization in response to mixed evidence vs polarization in response to ambiguous evidence. As I understand it, the mediating mechanisms are:

mixed evidence --> selective scrutiny --> asymmetry in flaw-finding --> polarization

ambiguous evidence + differences in exposure --> differences in recognizing pro- and con-arguments + asymmetry in evidential value of recognizing vs not-recognizing --> polarization

Sometimes I felt you were saying something like: because of differential exposure to different kinds of ambiguity, you can recognize the *flaw* in opposing arguments. So the ambiguity case would be analogous (imprecise) to the selective scrutiny/Lord study scenario. But sometimes I felt you were saying, rather, that exposure to arguments that systematically differ in ambiguity allows you to recognize *positive arguments in the first place*. Like how you and Becca polarized because you were exposed to clear *pro* arguments and con arguments usually came with their argumentative force explained away. But in that case I don't understand the analogy to the word-completion task. Because it would just be 1 or 0 - you accept the positive arguments and reject the negative arguments because of framing (or some such). Not the 7/12 thingy. So what am I missing here?

Hope you can clear things up! And again I'm deeply impressed by your work on the topic!

Another th



