
The core claim of this series is that political polarization is caused by individuals responding rationally to ambiguous evidence.

To begin, we need a possibility proof: a demonstration of how ambiguous evidence can drive apart those who are trying to get to the truth. That’s what I’m going to do today.

I’m going to polarize you, my rational readers.

In my hand I hold a fair coin. I’m going to toss it twice (…done). From those two tosses, I picked one at random; call it the **Random Toss**. How confident are you that the Random Toss landed heads? 50-50, no doubt––it’s a fair coin, after all.

But I’m going to polarize you on this question. What I’ll do is split you into two groups—the Headsers and the Tailsers—and give those groups different evidence. What’s interesting about this evidence is that we all can predict that it’ll lead Headsers to (on average) end up *more* than 50% confident that the Random Toss landed heads, while Tailsers will end up (on average) *less* than 50% confident. That is: everyone (yourselves included) can predict that you’ll polarize.

The trick? I’m going to use ambiguous evidence.

First, to divide you. If you were born on an *even* day of the month, you’re a **Headser**; if you were born on an *odd* day, you’re a **Tailser**. Welcome to your team.

You’re going to get different evidence about how the coin-tosses landed. That evidence will come in the form of **word-completion tasks**. In such a task, you’re shown a string of letters and some blanks, and asked whether there’s an English word that completes the string. For instance, you might see a string like this:

P_A_ET

And the answer is: *yes*, there is a word that completes that string. (Hint: what is Venus?) Or you might see a string like this:

CO_R_D

And the answer is: *no*, there is no word that completes that string.
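Mechanically, checking such a pattern is just matching blanks against a word list. A quick sketch (the tiny vocabulary here is illustrative; a real check would use a full dictionary):

```python
import re

# '_' is a blank that any letter can fill; the word must match exactly.
WORDS = ["PLANET", "COWARD", "HEART", "STARE", "STORE"]  # toy vocabulary

def completions(pattern, words=WORDS):
    rx = re.compile(pattern.replace("_", "[A-Z]") + "$")
    return [w for w in words if rx.match(w)]

print(completions("P_A_ET"))  # ['PLANET']
print(completions("CO_R_D"))  # []
```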

That’s the *type* of evidence you’ll get. You’ll be given two different word-completion tasks—one for each toss of the coin. However, Headsers and Tailsers will be given *different* tasks. Which task they’ll see will depend on how the coin landed.

__The rule:__

- For each toss of the coin, **Headsers** will see a __completable__ string (like ‘P_A_ET’) if the coin landed __heads__; they’ll see an __uncompletable__ string (like ‘CO_R_D’) if it landed __tails__.
- Conversely: for each toss, **Tailsers** will see a __completable__ string if the coin landed __tails__; they’ll see an __uncompletable__ string if it landed __heads__.

Here’s your job. Click on the appropriate link below to view a widget which will display the tasks for your group. You’ll view the first word-completion task, and then enter how confident you are (between 0–100%) that the string was completable. Enter “100” for “definitely completable”, “50” for “I have no idea”, “0” for “definitely *not* completable”, and so on.

If you’re a Headser, the number you enter is your confidence that the coin landed heads on the first toss. If you’re a Tailser, it’s your confidence that the coin landed *tails*—so the widget will subtract it from 100 to yield your confidence that the coin landed *heads*. (If you’re 60% confident that the coin landed tails, that means you’re 100 – 60 = 40% confident that it landed heads.)

You’ll do this whole procedure twice—once for each toss of the coin. Then the widget will tell you what your *average confidence in heads* was, across the two tosses. This is how confident you should be that the Random Toss landed heads, given your confidence in each individual toss. And this average is the number that will polarize across the two groups.

Enough setup; time to do the tasks.

- If you’re a **Headser**, click here.
- If you’re a **Tailser**, click here.

Welcome back. You have now, I predict, been polarized.

This is a statistical process, so your individual experience may differ. But my guess is that if you’re a Headser, your average confidence in heads was *greater* than 50%; and if you’re a Tailser, your average confidence in heads was *less* than 50%.

I’ve run this study. Participants were divided into Headsers and Tailsers. They each saw four word-completion tasks. Here’s how the two groups’ average confidence in heads (i.e. their confidence that the Random Toss landed heads) evolved as they saw more tasks:


Both groups started out 50% confident on average, but the more tasks they saw, the more their confidences diverged. By the end, the average Headser was 58% confident that the Random Toss landed heads, while the average Tailser was 36% confident of it.

(That difference is statistically significant; the 95%-confidence interval for the mean difference between the groups’ final average confidence in heads is [16.02, 26.82]; the Cohen’s d effect size is 1.58—usually 0.8 is considered “large”. For a full statistical report, including comparison to a control group with unambiguous evidence, see the Technical Appendix.)

Upshot: the more word-completion tasks Headsers and Tailsers see, the more they polarize.

**The crucial question:** *Why?*


Getting a complete answer to this—and to why such polarization should be considered *rational*—will take us a couple more weeks. But the basic idea is simple enough.

A word-completion task presents you with evidence that is **asymmetrically ambiguous**. It’s easier to know what to think if there *is* a completion than if there’s *no* completion. If there *is* a completion, all you have to do is find one, and you know what to think. But if there’s *no* completion, then you can’t find one; nor can you be certain there is *none*—for you can’t rule out the possibility that there’s one you’ve missed.

Staring at ‘_E_RT’, you may be struck by a moment of epiphany—‘HEART!’—and thereby get unambiguous evidence that the string is completable.

But staring at ‘ST_ _RE’, no such epiphany is forthcoming; the best you’ll get is a sense that it’s probably not completable, since you haven’t yet found one. Nevertheless, you should remain unsure whether you’ve made a mistake: “Maybe it *does* have a completion and I should know it; maybe in a second, I’ll think to myself, ‘It *is* completable—duh!’” This self-doubt is the sign of ambiguous evidence, and it prevents you from being too confident that it’s not completable.

The result? When you’re presented with a string that’s completable, you often get strong, unambiguous evidence that it’s completable; when you’re presented with a string that’s *not* completable, you can only get *weak, ambiguous* evidence that it’s not. Thus when the string is completable, you should often be quite confident that it is; when it’s not, you should never be very confident that it’s not.

I polarized you by exploiting this asymmetry. Headsers saw completable strings when the coin landed heads; Tailsers saw them when it landed tails. That means that Headsers were good at recognizing heads-cases and bad at recognizing tails-cases, while Tailsers were good at recognizing tails-cases and bad at recognizing heads-cases.

As a result, if you ask Headsers, they’ll say, “It’s landed heads a lot!”; and if you ask Tailsers, they’ll say, “It’s landed tails a lot!”. They polarize.
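This asymmetry is easy to simulate. Here is a toy Monte Carlo sketch (my own illustration, not the actual study: the 70% chance of spotting a completion and the 40% residual confidence when no completion is found are made-up parameters):

```python
import random

def trial(group, p_find=0.7, no_find_conf=40, rng=random):
    """One word-completion task for one fair-coin toss; returns the
    participant's confidence (0-100) that the toss landed heads."""
    heads = rng.random() < 0.5
    # Headsers see a completable string iff heads; Tailsers iff tails.
    completable = heads if group == "Headser" else not heads
    if completable and rng.random() < p_find:
        conf = 100            # found a completion: unambiguous evidence
    else:
        conf = no_find_conf   # found nothing: weak, ambiguous evidence
    # For Headsers "completable" means heads; for Tailsers it means tails.
    return conf if group == "Headser" else 100 - conf

def average_confidence(group, n=100_000, seed=0):
    rng = random.Random(seed)
    return sum(trial(group, rng=rng) for _ in range(n)) / n

print(average_confidence("Headser"))  # ≈ 61: predictably above 50
print(average_confidence("Tailser"))  # ≈ 39: predictably below 50
```

Note the predictability: both groups face the same fair coin, yet under these assumed parameters each group's expected confidence is computable in advance, and it lands on opposite sides of 50.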

Here it’s worth emphasizing a subtle but important point—one that we’ll return to. The **ambiguous/unambiguous-evidence distinction** is *not* the **weak/strong-evidence distinction**. Ambiguous evidence is evidence *that you should be unsure how to react to*; unambiguous evidence is evidence that you *should* be sure how to react to.

Ambiguous evidence is necessarily weak, but unambiguous evidence can be weak too. Example: if I tell you I’m about to toss a coin that’s 60% biased towards heads, that is *weak but unambiguous* evidence—*weak* because you shouldn’t be very confident it’ll land heads, but *unambiguous* because you know exactly how confident to be (namely, 60%).

The claim is that it is asymmetries in *ambiguity*—not asymmetries in *strength*—which drive polarization. We can test this by comparing our ambiguous-evidence Headsers and Tailsers to a control group that received evidence that was sometimes strong and sometimes weak, but always (relatively) *un*ambiguous. (It came in the form of draws from an urn; see the Technical Appendix for details.)

Here is the evolution of the Headsers and Tailsers who got *un*ambiguous evidence:


As you can see, there is some divergence (most likely a “response bias” due to the phrasing of the questions), but significantly less divergence than in our ambiguous-evidence case. (Again, see the Technical Appendix for a statistical comparison.)

Upshot: ambiguous evidence can be used to drive polarization.

That concludes my possibility proof. The rest of this project will try to figure out what it means. To do that, we need to look further into both the theoretical foundations and the real-world applications.

To preview the foundations: there is a clear sense which

To preview the applications:

In fact, the same sort of ambiguity-asymmetry plays out in

That’s where we’re headed. To get there, we need to get a few more basics on the table. Empirically: what mechanisms have led to the recent rise in polarization? And theoretically: what would it

We’ll tackle those two questions in the coming weeks. Then we’ll put all the pieces together, and examine the role that ambiguity-asymmetries in evidence play in the mechanisms that drive polarization.

What next?

PS. Thanks to Branden Fitelson and especially Joshua Knobe for much help with the experimental design and analysis.

[**5/11 Update:** Since the initial post, I've gotten a ton of extremely helpful feedback (thanks everyone!). In light of some of those discussions I've gone back and added a little bit of material. You can find it by skimming for the purple text.]

[**5/28 Update:** If I rewrote this now, I'd now reframe the thesis as: "Either the gambler's fallacy is rational, or it's much less common than it's often taken to be––and in particular, standard examples used to illustrate it don't do so."]


A title like that calls for some hedges––here are two. First, this is work in progress: the conclusions are tentative (and feedback is welcome!). Second, all I'll show is that rational people would often exhibit this "fallacy"––it's a further question whether real people who actually commit it are being rational.

Off to it.

On my computer, I have a bit of code called a "koin". Like a coin, whenever a koin is "flipped" it comes up either heads or tails. I'm not going to tell you anything about how it works, but the one thing everyone should know about koins is the same thing that everyone knows about coins: they tend to land heads around half the time.

I just tossed the koin a few times. Here's the sequence it's landed in so far:

How likely do you think it is to land heads on the next toss? You might look at that sequence and be tempted to think a heads is "due", i.e. that it's more than 50% likely to land heads on the next toss. After all, koins usually land heads around half the time––so there seems to be an overly long streak of tails occurring.

But wait! If you think that, you're committing the *gambler's fallacy*.

Wrong. Given your evidence about koins, you *should* think it's more likely to land heads on the next toss.

I'll spend most of this post defending this claim for koins, and then talk about how it generalizes to real-life random processes––like *c*oins––at the end.

But first: why care? People don't appeal to the gambler's fallacy to explain polarization or to demonize their political opponents––so if you're here for those topics, this discussion may seem far afield.

But I think it's relevant. The irrationality and pervasiveness of the gambler's fallacy is one of the most widespread pieces of irrationalist folklore. It’s been taken to support a variety of unflattering views of the human mind, including a belief in the "law of small numbers", a tendency to use representativeness as a (poor) substitute for probability, an illusion of control, and even an (unfounded) belief in a just world. Insofar as a general belief that people are irrational leads us to demonize those who disagree with us––as I think it does—scrutinizing such irrationalist claims is important.

So back to gamblers. **What** *is* the gambler's fallacy? Many have suggested to me that it's the tendency to think that a heads is more likely after a string of tails, despite knowing that the tosses are statistically independent. But this can't be right––for no one commits *that* fallacy. After all, knowing that the tosses are independent is just knowing that a heads is not more (or less) likely after a string of tails; therefore anyone who thinks that a heads *is* more likely after a string of tails does *not* know that the tosses are independent.

Here's a more plausible account of the (supposed) fallacy. You commit the gambler's fallacy if, purely on the basis of your knowledge that the koin lands heads 50% of the time, you think it's more likely to land heads after a (long string of) tails. That's what I'll argue is rational.

All you know about koins is that they tend to land heads about half the time. You can infer from this that *on average*––across all flips––the koin's chance of landing heads on a given toss is around 50%. What are the ways that this could be true?

One (obvious) possibility is that the chance of heads is *always* 50%. Call this hypothesis:


Steady: On each toss, the koin has a 50% chance of landing heads.

Given your knowledge about koins, you should leave open that Steady is true.

Should you be *sure* it’s true? If so, then the gambler's fallacy would indeed be a fallacy. But you shouldn't be sure of it, for here are two other hypotheses that would also vindicate your evidence that koins tend to land heads around half the time:


Switchy: When the koin lands heads (tails), it's less than 50% likely to land heads (tails) on the next toss.

Sticky: When the koin lands heads (tails), it's more than 50% likely to land heads (tails) on the next toss.

The Switchy hypothesis says that the koin has a tendency to switch how it lands. For example, perhaps after landing heads (tails), it's 40% likely to land heads (tails) on the next toss, and 60% likely to switch to tails (heads). Similarly, the Sticky hypothesis says the koin has a tendency to stick to how it lands. For example, perhaps after landing heads (tails) it's 60% likely to stick with heads (tails) on the next toss, and 40% likely to land tails (heads).

We can represent hypotheses like Steady, Switchy, and Sticky with what are known as Markov chains: a series of states the koin might be in, along with its chance of transitioning from a given state at one time to other states at the next time. For instance, our example of a Switchy hypothesis can be represented like this:


This diagram indicates that whenever the koin is in state H (has just landed heads), it's 40% likely to land heads on the next flip and 60% likely to land tails on the next flip. Vice versa for when it's in state T (has just landed tails). We can similarly represent our Sticky and Steady hypotheses this way:

Given their symmetry, all of these hypotheses will make it so that the koin usually lands heads around half the time. (For aficionados: their stationary distributions are all 50-50.) Since that's all the evidence you have about koins, you should be uncertain which is true.
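A sketch of those chains in code, using the example transition numbers above (40/60 for Switchy, 60/40 for Sticky), confirms the 50-50 long-run behavior:

```python
# Chance of heads on the next toss, given the last outcome, per hypothesis.
SWITCHY = {"H": 0.4, "T": 0.6}
STEADY  = {"H": 0.5, "T": 0.5}
STICKY  = {"H": 0.6, "T": 0.4}

def long_run_heads_rate(chain, steps=1000):
    """Iterate the chain's one-step update until the probability of
    heads settles at its long-run (stationary) value."""
    p_heads = 1.0  # start certain of heads; the starting point washes out
    for _ in range(steps):
        # P(heads next) = P(heads now)*P(H->H) + P(tails now)*P(T->H)
        p_heads = p_heads * chain["H"] + (1 - p_heads) * chain["T"]
    return p_heads

for name, chain in [("Switchy", SWITCHY), ("Steady", STEADY), ("Sticky", STICKY)]:
    print(name, long_run_heads_rate(chain))  # all three: 0.5
```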

It follows from this uncertainty that, given your evidence, you *should* commit the gambler's fallacy: when it has just landed tails you should be more than 50% confident that the next toss will land heads; and vice versa when it has just landed heads.

Why? I'll focus on explaining a simple case; the Appendix below gives a variety of generalizations.

Let's suppose you can be sure that one of the three particular Sticky/Switchy/Steady hypotheses in Figures 1–3 is true, but you can't be sure which. Suppose you know that the koin has just landed tails (as it has). Given this, you should be more than 50% confident that it'll land heads––you should commit the gambler's fallacy! There are two steps to the reasoning.

First, you know that if Switchy is true, it has a 60% chance to land heads; that if Steady is true, it has a 50% chance to land heads; and that if Sticky is true, it has a 40% chance to land heads. So if you were very confident in Switchy, you'd be around 60% confident in heads; if you were very confident in Steady, you'd be around 50% confident in heads; and if you were very confident in Sticky, you'd be around 40% confident in heads. More generally, it follows (from total probability and the Principal Principle) that your confidence in heads should be a weighted average of these three numbers, with weights determined by how confident you should be in each of Switchy, Steady, and Sticky.

That is, where P(q) represents how confident you should be in q, your confidence that the next flip will be heads given that it has just landed tails should be:


\[P(H) ~=~ P(Switchy)\cdot 0.6 ~+~ P(Steady)\cdot 0.5 ~+~ P(Sticky)\cdot 0.4\]

Notice: whenever P(Switchy) > P(Sticky), this will average out to something greater than 50%. That is, whenever you should be more confident that the koin is Switchy than that it's Sticky, you should think a heads is more than 50% likely to follow a tails, and (by parallel reasoning) that a tails is more than 50% likely to follow a heads.
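To make that step explicit: since the three probabilities sum to 1, the weighted average can be rewritten as

\[P(H) ~=~ 0.5 ~+~ 0.1\big(P(Switchy) - P(Sticky)\big),\]

which exceeds 50% exactly when P(Switchy) > P(Sticky), regardless of P(Steady).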

**Upshot:** whenever you should be more confident the koin is Switchy than that it's Sticky, you should commit the gambler's fallacy!


And you *should* be more confident in Switchy than Sticky––this is step two of the reasoning.

Why? Since you start out with no evidence either way, you should initially be equally confident in Switchy and Sticky. And although both of these hypotheses fit with the observation that the koin tends to land heads about half the time, the Switchy hypothesis makes it *more* likely that this is so––and therefore is more confirmed than the Sticky hypothesis when you learn that the koin tends to land heads around half the time. This is because Switchy makes it less likely that there will be long runs of heads (or tails) than Sticky does, and therefore makes it more likely that the overall proportion of heads will stay close to 50%.

We can see this in action by working through a small example by hand, and through bigger examples on a computer.

Small example first. Suppose all you know about the koin is that I've tossed it twice and it landed heads once. Why does Switchy make this outcome more likely than Sticky?

To land heads on one of two tosses is simply to either land HT or TH, i.e. to land one way initially and then switch. Switchy implies that such a switch is 60% likely, whereas Sticky implies that it is only 40% likely. (Meanwhile, Steady implies that it is 50% likely.) Therefore Switchy makes the "one head in two tosses" outcome more likely than Sticky does.

It follows, for example, that if you were initially equally confident in each of Switchy, Steady, and Sticky, then after learning that it landed heads once out of two tosses, you should become 40% confident in Switchy, 33% confident in Steady, and 27% confident in Sticky. Plugging these numbers into our above average shows that you should then be a bit over 51% confident that it'll switch again on the next toss––i.e. should commit the gambler's fallacy.
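Those numbers are a one-step Bayes update; a quick sketch to check them (using the 60/50/40 switch chances from the example hypotheses):

```python
# Likelihood of "one head in two tosses" (HT or TH, i.e. one switch)
# under each hypothesis is just that hypothesis's chance of a switch.
likelihood = {"Switchy": 0.6, "Steady": 0.5, "Sticky": 0.4}
prior = {h: 1 / 3 for h in likelihood}

# Bayes' rule: posterior is proportional to prior times likelihood.
total = sum(prior[h] * likelihood[h] for h in likelihood)
posterior = {h: prior[h] * likelihood[h] / total for h in likelihood}
print(posterior)  # Switchy ≈ 0.40, Steady ≈ 0.33, Sticky ≈ 0.27

# By total probability, the chance of a switch on the next toss:
p_switch = sum(posterior[h] * likelihood[h] for h in likelihood)
print(p_switch)   # ≈ 0.513: a bit over 51%
```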

The reasoning in this small example generalizes. The closer the koin comes to landing heads 50% of the time, the more ways there are to do this that involve switching between heads and tails many times; meanwhile, the closer the koin comes to landing heads 0% or 100% of the time, the fewer switches there could have been. Switchy makes the former sorts of outcomes more likely; Sticky makes the latter sorts of outcomes more likely. So when you learn that the koin tends to land heads roughly 50% of the time, this is more evidence for Switchy than Sticky––and as a result, you should commit the gambler's fallacy.

So far as I know, there's no tractable formula for determining these likelihoods by hand. But since the systems are Markovian, we can use "dynamic programming" to recursively calculate the likelihoods on a computer.

For example, if we toss the koin 100 times we can plot how likely each of the three hypotheses would make various proportions of heads:
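Here's a minimal sketch of such a dynamic program (my own implementation, assuming the first toss is 50-50 and each later toss repeats the previous outcome with a fixed "stickiness" probability):

```python
def heads_distribution(p_stick, n=100):
    """Probability of each possible number of heads in n tosses, for a
    two-state chain where each toss repeats the previous outcome with
    probability p_stick (Switchy: 0.4, Steady: 0.5, Sticky: 0.6).
    The first toss is 50-50."""
    # dist maps (last outcome, heads so far) -> probability,
    # where last outcome is 1 for heads, 0 for tails.
    dist = {(1, 1): 0.5, (0, 0): 0.5}
    for _ in range(n - 1):
        new = {}
        for (last, k), p in dist.items():
            for nxt in (0, 1):
                step = p_stick if nxt == last else 1 - p_stick
                key = (nxt, k + nxt)
                new[key] = new.get(key, 0.0) + p * step
        dist = new
    # Marginalize out the last outcome.
    out = [0.0] * (n + 1)
    for (_, k), p in dist.items():
        out[k] += p
    return out

switchy = heads_distribution(0.4)
steady = heads_distribution(0.5)
sticky = heads_distribution(0.6)

# Switchy piles more probability near 50 heads than Sticky does:
print(sum(switchy[48:53]))  # ≈ 0.46
print(sum(steady[48:53]))   # ≈ 0.38
print(sum(sticky[48:53]))   # ≈ 0.32
```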


Note that although all three hypotheses generate bell-shaped curves centered around 50% heads, the Switchy hypothesis generates a *tighter* bell curve around 50% heads.

**This is the crucial point.** Take any precise statement of what you know about koins––namely, that they "land heads around half the time". A precise version of that claim will take the form "the koin lands heads between (50 – *c*)% and (50 + *c*)% of the time" for some *c*. (For example, "the koin lands heads between 48% and 52% of the time.") Switchy generates a higher bell curve than Sticky, meaning it makes any such claim more likely than Sticky does––and therefore is *more* confirmed by what you know than Sticky is.

For example, here's how likely each of our three hypotheses would make it that the koin lands heads "roughly 50" times out of 100 tosses, under various sharpenings of the claim:


Since Switchy makes each of these more likely than Sticky, learning that the koin lands heads "roughly 50" of 100 times provides more evidence for the former.

In particular, if you started out ⅓ confident in each of Switchy, Steady, and Sticky, here's how confident you should be in them after updating on various versions of these "roughly 50" claims, along with the resulting confidence you should have that it'll switch on the next flip:
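These updates can be sketched with the same kind of dynamic program (again my own illustrative code, assuming equal ⅓ priors and the 60/50/40 example hypotheses):

```python
def heads_distribution(p_stick, n=100):
    # P(k heads in n tosses) for a chain that repeats the previous
    # outcome with probability p_stick; the first toss is 50-50.
    dist = {(1, 1): 0.5, (0, 0): 0.5}
    for _ in range(n - 1):
        new = {}
        for (last, k), p in dist.items():
            for nxt in (0, 1):
                step = p_stick if nxt == last else 1 - p_stick
                new[(nxt, k + nxt)] = new.get((nxt, k + nxt), 0.0) + p * step
        dist = new
    out = [0.0] * (n + 1)
    for (_, k), p in dist.items():
        out[k] += p
    return out

def update_on_interval(low, high, n=100):
    """Posterior over the hypotheses (from equal 1/3 priors) after learning
    the koin landed heads between low and high times in n tosses, plus the
    resulting chance that it switches on the next toss."""
    switch_prob = {"Switchy": 0.6, "Steady": 0.5, "Sticky": 0.4}
    like = {h: sum(heads_distribution(1 - s, n)[low:high + 1])
            for h, s in switch_prob.items()}
    total = sum(like.values())  # equal priors cancel in Bayes' rule
    post = {h: like[h] / total for h in like}
    p_switch = sum(post[h] * switch_prob[h] for h in post)
    return post, p_switch

for low, high in [(48, 52), (45, 55), (40, 60)]:
    post, p_switch = update_on_interval(low, high)
    print(f"{low}-{high} heads: P(Switchy) = {post['Switchy']:.2f}, "
          f"P(switch next) = {p_switch:.3f}")
```

In each case the posterior favors Switchy over Sticky, so the switch probability lands above 50% (more so for the sharper versions of the claim).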


In each case, since you should be more confident in Switchy than Sticky, you should perform the gambler's fallacy.

That said, as has been helpfully pointed out in the comments, observing a long string of tails provides evidence for Sticky––so sooner or later, a long enough streak will leave you no longer more confident in Switchy than Sticky. How quickly this happens depends on exactly what version of "the koin tends to land heads roughly half the time" you knew beforehand. If all you know is "it landed heads on 50 of 100 tosses", a short streak of tails will dislodge your confidence in Switchy, and in fact make it rational to perform the "**hot hands**" fallacy and expect a *tails* to be more likely to follow a tails (see the discussion in the Appendix for more on this).

But for some versions of "the koin tends to land heads roughly half the time", your confidence in Switchy will be much more robust. Here's one version that's not an implausible characterization of what people often know about processes like this.

Suppose what you know about koins is: "on every set of tosses I've seen, it's landed heads around half the time––sometimes very close to 50%, sometimes a bit further. I can't remember the details, but it's always been between 40–60%, usually between 45–55%, and often between 48–52%". If this is what you know, then every one of those sets of tosses provides more evidence for Switchy over Sticky, meaning your confidence in Switchy will be quite robust.

For example, suppose you started out ⅓ in each hypothesis and then learned that in 10 sets of 100 tosses each, each set had between 40–60 heads, 7 of them had between 45–55 heads, and 4 had between 48–52 heads. Then you should become 72% confident it's Switchy, 22% confident it's Steady, and 6% confident it's Sticky (see the first "full calculation" section of the Appendix). As a result, you can see a string of up to 7 tails in a row (with no Switches), and still be more confident in Switchy than Sticky––and, therefore, still commit the gambler's fallacy.

That's what we should say about the gambler's fallacy with koins. What should we say about ordinary coins?

I think we should say the same thing. Most people aren't––and shouldn't be––sure of how the outcomes of (most of) the random processes they encounter are generated. Many of those outcomes could, for all they know, be generated by Switchy or Sticky processes.

So people––especially those who haven't taken statistics courses––should often leave open that various versions of the Sticky and Switchy hypotheses are true. And since they don't (can't) keep track of the full sequence of outcomes they've seen, what they know about the processes is often much more coarse-grained––e.g. that a given outcome tends to happen around 50% of the time. (See the Appendix for generalizations to other percentages.)

As we've just seen, if that's what they know, then they are rational to commit the gambler's fallacy.

Of course, this doesn't show that the way real people commit the fallacy is rational: they might commit it for the wrong reasons, or in too extreme a way. (See Brian Hedden's post on hindsight bias for a discussion of how we might probe those questions––and why it is difficult to do so.) But the mere fact that people commit the gambler's fallacy does not, on its own, provide evidence that they are handling uncertainty irrationally––after all, it's exactly what we'd expect if they were handling their uncertainty rationally.

Given subtleties like that, it's rather implausible to insist that someone who has never taken a statistics course nor studied coins in any detail should be certain that the processes they encounter are Steady.

Given people's limited knowledge about the outcomes of the random processes they encounter and the statistical mechanisms that give rise to them, they often should leave open the Switchy and Sticky hypotheses––and so often should commit the gambler's fallacy.

What next?

Here I’ll give some generalizations I’ve worked out, and some discussion of the robustness of these results.

Some people have questioned whether the results hold even in the simple case I focus on in the text, so I figured I'd work through the calculations to show that they do.

Take the versions of the Switchy/Steady/Sticky hypotheses we used above. Suppose you are initially ⅓ confident in each: P(Switchy) = P(Steady) = P(Sticky) = ⅓.

Now suppose you learn that 50 of 100 tosses landed heads ("50H").

In particular:

The posterior credences you should have in each hypothesis follow from Bayes' rule, which says that:

\[ P(Switchy \mid 50H) = \frac{P(50H \mid Switchy)\,P(Switchy)}{P(50H \mid Switchy)P(Switchy) + P(50H \mid Steady)P(Steady) + P(50H \mid Sticky)P(Sticky)} \]

Plugging in the likelihoods of 50H under each hypothesis gives P(Switchy | 50H) ≈ 0.403. Parallel calculations show that P(Steady | 50H) ≈ 0.329 and P(Sticky | 50H) ≈ 0.268.

Given that, suppose you've learned that the koin just landed tails. This on its own provides no evidence about Switchy/Steady/Sticky (since you have no information about what state it was in beforehand). Thus your credence that the next toss will land heads should be: 0.403*0.6 + 0.329*0.5 + 0.268*0.4 ≈ 0.513––you should commit the gambler's fallacy.
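The mixture is just a weighted average of each hypothesis' switch probability; a one-line check, using only the posteriors and switch rates from the text:

```python
# Posteriors after "50 of 100 heads", times each hypothesis' probability
# that the koin switches (here: lands heads after a tails).
posterior = {"Switchy": 0.403, "Steady": 0.329, "Sticky": 0.268}
p_heads_after_tails = {"Switchy": 0.6, "Steady": 0.5, "Sticky": 0.4}

credence = sum(posterior[h] * p_heads_after_tails[h] for h in posterior)
print(credence)  # ≈ 0.5135: more than 50% confident in a switch
```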

Suppose you have more information about the koin, of the form discussed above. Out of 10 sets of 100 tosses, the koin landed heads between 40–60 times in all of them, between 45–55 in 7 of them, and between 48–52 in 4 of them. (Incidentally, this is what we'd expect you to see if the koin was in fact *Steady*, and all you remembered was how close it was to 50 heads.)

The likelihoods of 40–60 heads, given Switchy/Steady/Sticky are 0.99 / 0.96 / 0.91. The likelihoods of 45–55 heads are 0.82 / 0.73 / 0.63. And the likelihoods of 48–52 heads are 0.46 / 0.38 / 0.32. The order doesn't matter, so we can just update by each of these likelihoods the relevant number of times (3 for the 40–60 likelihoods, 3 for 45–55 ones, and 4 for the 48–52 ones).
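Where do interval likelihoods like these come from? For a 1-memory Markov hypothesis they can be computed exactly by dynamic programming over (last outcome, number of heads). The sketch below is my own implementation, assuming (as in the text's hypotheses) that the first toss is 50-50; the text reports 0.99 / 0.96 / 0.91 for the 40–60 band.

```python
# Distribution of the number of heads in n tosses under a 1-memory Markov
# hypothesis: p_switch is the chance the next toss differs from the last.
def heads_distribution(n, p_switch):
    # dist[(last, k)] = probability of k heads so far, with given last outcome
    dist = {("H", 1): 0.5, ("T", 0): 0.5}  # first toss is 50-50
    for _ in range(n - 1):
        new = {}
        for (last, k), p in dist.items():
            for nxt in "HT":
                q = p_switch if nxt != last else 1 - p_switch
                key = (nxt, k + (nxt == "H"))
                new[key] = new.get(key, 0.0) + p * q
        dist = new
    counts = [0.0] * (n + 1)
    for (_, k), p in dist.items():
        counts[k] += p
    return counts

probs = {}
for name, ps in [("Switchy", 0.6), ("Steady", 0.5), ("Sticky", 0.4)]:
    probs[name] = sum(heads_distribution(100, ps)[40:61])  # P(40 <= heads <= 60)
    print(name, round(probs[name], 2))
```

Switchy's negative autocorrelation concentrates the head-count near 50, so it assigns the "roughly 50" interval the highest probability; Sticky's positive autocorrelation spreads it out, so it assigns the lowest.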

Starting at ⅓ in each hypothesis and updating on your information about the 10 sets of 100 tosses leaves you with posterior credences of P(Switchy) = 0.72, P(Steady) = 0.22, and P(Sticky) = 0.058.
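Spelled out as a calculation, using the likelihoods quoted above (3 sets fall in 40–60 only, 3 more in 45–55, 4 in 48–52):

```python
# Bayesian update on the ten sets of tosses: multiply each prior of 1/3 by
# the appropriate likelihoods, then renormalize.
liks = {
    "Switchy": (0.99, 0.82, 0.46),
    "Steady":  (0.96, 0.73, 0.38),
    "Sticky":  (0.91, 0.63, 0.32),
}
post = {h: (1 / 3) * a**3 * b**3 * c**4 for h, (a, b, c) in liks.items()}
total = sum(post.values())
post = {h: p / total for h, p in post.items()}

print({h: round(p, 2) for h, p in post.items()})
# Switchy ≈ 0.72, Steady ≈ 0.22, Sticky ≈ 0.06
```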

Upshot: if your knowledge that the koin lands heads "roughly half the time" amounts to knowledge like this––"it always lands heads around 50% of the time, and usually quite close to that"––then you should be *much* more confident in Switchy than in Sticky, and that discrepancy will be robust to seeing a long series of tails in a row, meaning you'll still commit the gambler's fallacy. (In our example, you can see up to 7 tails in a row with no switches and still be more confident in Switchy than Sticky.)

We can easily generalize the Sticky/Switchy hypotheses. Let H be the proposition that the koin switches on the next toss (lands differently than it just did), let Ch(H) be its objective chance, and define:

\begin{align*} Switchy &= [Ch(H)>0.5] \\ Steady &= [Ch(H)=0.5] \\ Sticky &= [Ch(H)<0.5] \end{align*}

It follows from total probability that your confidence should be the following weighted average:

\begin{align*} P(H) =\ & P(Ch(H)>0.5)\,P(H \mid Ch(H)>0.5) + P(Ch(H)=0.5)\,P(H \mid Ch(H)=0.5) \\ & + P(Ch(H)<0.5)\,P(H \mid Ch(H)<0.5) \end{align*}

It follows from the Principal Principle that P(H | Ch(H) > 0.5) > 0.5 and that P(H | Ch(H) < 0.5) < 0.5. In our situation you have no reason to treat these two options asymmetrically, so there should be some constant c such that P(H | Ch(H) > 0.5) = 0.5 + c, while P(H | Ch(H)<0.5) = 0.5 - c. It follows that:

\[ P(H) = P(Ch(H)>0.5)(0.5+c) + P(Ch(H)=0.5)(0.5) + P(Ch(H)<0.5)(0.5-c) \]
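Collecting terms, and using the fact that the three probabilities sum to 1, this simplifies to:

\[ P(H) = 0.5 + c\,\bigl[P(Ch(H)>0.5) - P(Ch(H)<0.5)\bigr] = 0.5 + c\,\bigl[P(Switchy) - P(Sticky)\bigr] \]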

And again, this value will be greater than 50% iff P(Switchy) > P(Sticky).

The trick with using these definitions is that we now need to be careful about what the plausible versions of the Sticky/Switchy hypotheses amount to. We can no longer simply assume they are the 40%-60% hypotheses (from Figures 1–3) I assumed above, so we can’t straightforwardly calculate the likelihoods of various outcomes given Sticky and Switchy. Nevertheless, the plausible versions of these hypotheses will have the same general shape, although some will be more or less extreme in their divergences from 50% probabilities, some may have longer “memories” so that it takes longer streaks to reach these divergences, and so on. See below for direct handling of some of these issues.

Since I have no tractable algebraic expression for the likelihoods generated by various Sticky/Switchy hypotheses—even in the simple cases—there are limits on what I can prove about it. (Hunch: what matters for the difference in likelihoods between Switchy and Sticky is that the former has a shorter mixing time than the latter; perhaps that can be used in a proof?)

Nevertheless, it’s easy to check that these results are robust. For example, here are the likelihoods for various proportions of heads from the three simple hypotheses at 10, 50, 100, and 500 tosses. Clearly we are (quickly) approaching a limit in the ratios of likelihoods of 50% heads, and the differences are not washing out.

And here are graphs that plot the likelihoods considering Switchy and Sticky hypotheses with different levels of probability of sticking or switching at 20, 50, 100, and 300 tosses (for example, Switchy (0.7) has 70% chance of switching; Sticky (0.4) has 40% chance of switching; etc.):

The explicit versions of the Sticky/Switchy hypotheses we’ve looked at so far all had “memories” of only size 1—the probabilities of outcomes only depend on how the previous toss landed. More realistic versions will have longer memories, with the chance of switching (or sticking) building gradually as a streak grows.

This can be modeled easily, simply by multiplying the states in our Markov chain. Instead of simply H and T, they will now include how long the streak of heads or tails has been, with the probabilities shifting gradually as the streak builds up. For example, here are diagrams representing 2-memory Switchy and Sticky hypotheses, where the probabilities build to a 60% chance to stick or switch, in both diagram and transition-matrix notation. (For the matrix, row i column j tells you the probability of transitions from state i to state j.) For example, the Switchy hypothesis says that after one heads, it's 55% likely to switch back to tails, and after two or more heads in a row it's 60% likely to switch to tails.
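Here's a small Python sketch of that 2-memory Switchy chain (the state names `H1`, `H2`, etc. are my own labels for "one head so far" and "two or more heads"): it builds the transition matrix and checks that, despite the streak-dependent switching, the long-run proportion of heads stays at 50%.

```python
# 2-memory Switchy chain from the text: 55% likely to switch after one
# head (or tail), 60% likely after two or more in a row.
states = ["H1", "H2", "T1", "T2"]

def transition(state):
    out, streak = state[0], int(state[1])
    p_switch = 0.55 if streak == 1 else 0.60
    other = "T" if out == "H" else "H"
    return {other + "1": p_switch, out + "2": 1 - p_switch}  # streak capped at 2

# Transition matrix: row i, column j = probability of moving from state i to j.
P = [[transition(s).get(t, 0.0) for t in states] for s in states]

# Power-iterate to the stationary distribution.
pi = [0.25] * 4
for _ in range(1000):
    pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

p_heads = pi[0] + pi[1]
print(round(p_heads, 3))  # → 0.5: long-run proportion of heads is still 50%
```

The same construction extends to 5- or 10-memory chains by adding more streak-length states.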

And though I'm not going to try to draw the 10-state diagram, here's the transition matrix for a 5-memory Switchy hypothesis that grows steadily to a 60% switch rate.

As you’d expect, the qualitative results from these hypotheses are the same as before, but (very) slightly dampened. For example, here are the likelihoods of various outcomes of 100 tosses from our original 1-memory 60% hypotheses (reproduced from Figure 4), vs. the likelihoods of outcomes with the 2-memory, 3-memory, and 5-memory 60% hypotheses:

It’s not until we get "memories" of size 10 or more that we start to see significant dampening of the divergence of likelihoods:

And it's worth noting that the *qualitative* results will be the same in all these cases, though the degree of gambler's fallacy warranted will decrease as the differences in the likelihoods get smaller.

It seems, empirically, that the versions of the Sticky and Switchy hypotheses that people take seriously are in the 5- to 10-memory range. For robustness checks, I'll show the likelihoods at 10, 50, 100, and 500 tosses for various 5-memory hypotheses whose probabilities move at constant increments up to a given extreme; for example "5-memory Switchy (0.7)" is the chain that takes 5 steps to become 70% likely to switch, and "5-memory Sticky (0.3)" is the chain that takes 5 steps to become 30% likely to switch:

Here are the robustness checks for these and other hypotheses at 10, 50, 100, and 500 tosses:

Upshot: the qualitative results will be the same under these various more realistic versions of the hypotheses.

**Hot Hands**

The “hot hands fallacy” is the tendency to think that an outcome is “streaky”: if a given outcome happens, it is *more* likely to happen again on the next trial. In that sense, it's the opposite of the gambler's fallacy: where gamblers expect things to switch, hot-handsers expect things to stick. (The issue comes from basketball; see this recent paper for a fascinating discussion of the statistical mistakes in the original papers claiming to show that there is no "hot hand" in basketball.)

We saw above that when P(Switchy) > P(Sticky), the gambler’s fallacy is rational, and you should be more than 50% confident that the koin will switch how it lands between tosses. By parallel reasoning, whenever P(Switchy) < P(Sticky), it follows that you should be *less* than 50% confident that the koin will land differently to how it did before—i.e. you should be more than 50% confident that it will land the same way. In other words, whenever P(Switchy) < P(Sticky), you should commit the hot hands fallacy!

Upshot: the *only* time you should commit neither the gambler's fallacy nor the hot-hands fallacy is when you should be exactly equally confident in Switchy and Sticky: P(Switchy) = P(Sticky). Since such a perfect balance of evidence will be rare, you should almost always commit one of these “fallacies” (though perhaps only to a very small degree).

In particular, suppose you start out equally confident in each of Switchy and Sticky, and then learn what proportion of times the koin landed heads in some series of tosses. For example, return to our 100-toss example (Figure 4) with the 40/50/60 hypotheses, and recall the likelihoods:

After learning the proportion of heads out of 100 tosses, you should be more confident of Switchy than Sticky iff the blue curve is higher than the green one at the outcome (proportion of heads) you observe, and less confident if vice versa. In fact, there is *no* outcome of a 100-toss sequence that would make these exactly equal, so given any outcome, you should either commit the gambler’s or the hot-hands fallacy on the next toss.

**How can you learn it’s Steady?**

You might think we’ve run ourselves into a bit of a paradox here. Note that in all the graphs I’ve shown, the Steady likelihoods almost never come out ahead overall. In the middle of the graphs they are dominated by the Switchy likelihoods, and at the edges they are dominated by the Sticky likelihoods. This remains true as we crank up the experiment to arbitrarily many tosses of the koin.

So… what gives? Does our reasoning show that it’s impossible to learn, by tossing a koin, that it is Steady? If so, it’s gone wrong somewhere.

But it doesn’t show that. What it shows is that *if all you learn is the proportion of heads*, you won’t be able to get strong evidence that the koin is Steady. To get that evidence, you’d need to look closely at the *sequences* you observe. There Switchy hypotheses will make streaks very unlikely, Sticky hypotheses will make frequent switches very unlikely, and Steady hypotheses will strike a balance. If you looked at the full sequence for long enough, you’d almost surely (in the technical sense) get to the truth about whether the koin is Sticky, Steady, or Switchy.
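To make the "almost surely" concrete, here's a quick sketch of my own, using the simple 1-memory 60%/40% hypotheses: under a truly Steady (fair) koin, the expected log-likelihood per transition is strictly higher for Steady than for Switchy or Sticky, so over a long full sequence Steady's likelihood dominates.

```python
from math import log

# Under a truly Steady koin, the next toss switches with probability 0.5.
# Switchy assigns 0.6 to each switch and 0.4 to each stick; Sticky the reverse.
expected = {
    "Steady":  log(0.5),
    "Switchy": 0.5 * log(0.6) + 0.5 * log(0.4),
    "Sticky":  0.5 * log(0.4) + 0.5 * log(0.6),
}
print(expected)  # Steady's per-transition score is strictly the largest
```

The gap (about 0.02 nats per transition) is exactly what tracking only the proportion of heads throws away.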

But what we *have* shown is that *without* that full data—tracking only the proportions of heads—people will *not* be able to figure out whether the koin is Sticky, Steady, or Switchy. That is an interesting result, because real people can’t keep track of full sequences of tosses––*at best* they can keep track of (rough) proportions. Given only that information, even perfect Bayesians wouldn’t be able to figure out whether the koin is Steady (or Sticky or Switchy).

**Non-50% versions**

Every version we’ve looked at so far is one where the number of heads stays around 50%. This is apt for coin tosses, but not so for other chancy events like basketball shots or drawing a face card. We’ll need to generalize what plausible Sticky and Switchy hypotheses look like for processes where the average number of heads (or “hits”) differs from 50%. For example, in the NBA—where the hot hands fallacy discussion is at home—shooting percentages are often around 45%.

The reasoning generalizes, but it gets a bit subtle. In the 50% case, all my examples assumed “symmetry” in the sense that the probability added (or subtracted) to getting a heads when it just landed heads is the same as that subtracted (or added) to getting a heads when it just landed tails.

This isn’t the right version of symmetry when the Steady hypothesis is no longer 50%. For example, suppose the Steady hypothesis is that no matter how it’s landed, the koin has a 40% chance to land heads on each toss. Then we expect that in the long run, it’ll land heads in 40% of all tosses. So here's our Steady hypothesis:

You might think natural Sticky and Switchy hypotheses centered around this would simply add or subtract a fixed amount (say, 0.1) to the probabilities depending on the state, as before:

But that’s wrong. The “stationary distribution” of these two Markov chains is *not* 40/60—rather, for the Sticky hypothesis it’s around 38/62 and for the Switchy one it’s around 42/58, meaning that in the long run we would expect them to have *these* proportions of heads to tails, rather than 40-60. Accordingly, the likelihood graphs are not properly overlapping:
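The stationary distributions are easy to check by hand for a 2-state chain: solving π = π·P(H|H) + (1 − π)·P(H|T) gives the long-run proportion of heads. A sketch, using the naive ±0.1 chains just described:

```python
# Long-run proportion of heads for a 2-state Markov chain, from solving
# pi = pi * P(H|H) + (1 - pi) * P(H|T).
def stationary_heads(p_h_given_h, p_h_given_t):
    return p_h_given_t / (1 - p_h_given_h + p_h_given_t)

# Naive Sticky around 40%: more likely to repeat (P(H|H)=0.5, P(H|T)=0.3).
sticky_naive = stationary_heads(0.4 + 0.1, 0.4 - 0.1)
# Naive Switchy around 40%: more likely to switch (P(H|H)=0.3, P(H|T)=0.5).
switchy_naive = stationary_heads(0.4 - 0.1, 0.4 + 0.1)

print(round(sticky_naive, 3), round(switchy_naive, 3))  # → 0.375 0.417
```

Neither comes out at 0.40, which is exactly the mismatch in the likelihood graphs.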

The proper Sticky/Switchy hypotheses, if the overall proportion of heads is 40%, are ones whose probabilities move depending on where they are, but in a way that leads them to have the same stationary distribution as the Steady hypothesis. I haven’t yet figured out a general recipe for this (**any mathematicians care to help?**), but here is an example of 1-memory Sticky and Switchy hypotheses that have the correct stationary distributions:

And here are the likelihood graphs for 100 tosses of these two hypotheses vs. the 40%-heads Steady hypothesis:

Upshot: the same qualitative lessons hold for processes that don't come up "heads" 50% of the time; if all you know is that "roughly x% of the tosses land heads", you should be more confident in (the right version of) Switchy than Sticky, and so should commit the gambler's fallacy.

...Phew! Those are all the generalizations and further notes I have (for now). If you have any thoughts or feedback, please do send them along! Thanks!

In *The Enigma of Reason* Hugo Mercier and Dan Sperber argue that the main use of reason is to justify and explain conclusions that we have arrived at sub-rationally. Some reactions to Extinction Rebellion’s blockading of newspaper distribution centres seem to me to illustrate their point.

For example, Culture Secretary Oliver Dowden said their action “damage[s] our democracy”. Johnson said: “A free press is vital in holding the government and other powerful institutions to account.” And Ian Murray, executive director of the Society of Editors, said that “shutting down free speech and an independent media is the first action of totalitarian regimes and dictators.”

Such fancy talk attributes to the press a sanctity it does not merit. And it’s not needed. If XR had blockaded a Cheesy Wotsits factory you might reasonably object that it was interrupting people’s right to earn a living: you wouldn’t need to argue for the gustatory merits of the Wotsit. Equally, you don’t need to attribute virtues of the press in order to defend its freedom. To do so is to seek an unwarranted justification for your priors.

Instead, this issue raises some interesting questions. One is: do we actually have a free press? Rightists say we do. Which is true in the sense of there being little state control, except for onerous libel laws (and guess who they benefit). Leftists say we don’t, as a handful of billionaires control most of it. This split reflects a general divide about the nature of freedom. The right has traditionally identified this with an absence of state intervention. The left has replied that freedom in this sense often permits too much private sector inequality – for example, in working conditions. As Corey Robin argued in *The Reactionary Mind*, Conservatives’ talk of freedom often disguises what is really a love of hierarchy. The issue here is what type of freedom we value rather than the nature of the press itself.

If we accept, arguendo, the rightist conception of freedom here, another question arises. How much good or harm does the exercise of this freedom do?

You might think here that I’ll object to its Tory bias. In this context, I won’t. Yes, there is some evidence that the media does boost (pdf) the Tory and UKIP (pdf) vote. But people have a right to advocate voting Tory. And the media is not the only source of ideas I believe to be mistaken. Pro-capitalist ideology can arise without the aid of the press.

Instead, there are other costs.

One is that far from embodying a diversity of opinion, the media actually narrows it down. Market forces tell us this. Since it became technologically possible, we’ve seen an explosion of blogs, podcasts and alternative online publications. This would not have happened if people had been satisfied with the range of opinion they were getting from the old media. If the media were truly diverse, you’d not be reading this.

Another is that the media systematically fosters false ideas. Simon Wren-Lewis has accused it, rightly, of promoting a false picture of economics. And the admirable Mic Wright documents many other shortcomings. These are not always matters of left-right bias. The media is also biased against the social sciences. It under-reports slow but important changes, ignores the countless unseen mechanisms that create our world; and looks too much for human agency where in fact there is emergence.

Surveys show that the public is woefully ill-informed about many social facts – albeit no more so in the UK than elsewhere. This is surely a clue that the media isn’t doing the job that its advocates claim.

There’s a further problem. Contrary to Johnson’s claim, the press does not adequately hold the government to account. This is not a new phenomenon. 25 years ago Patrick Dunleavy wrote (pdf) that:

Britain now stands out amongst comparable European countries, and perhaps amongst liberal democracies as a whole, as a state unusually prone to make large-scale, avoidable policy mistakes.

This was corroborated by Ivor Crewe and Anthony King in *The Blunders of our Governments*.

Which points to a longstanding failure of the press to improve the standard of government – perhaps because forensic policy analysis does not shift units.

I suspect that our democracy would work better if the mass media did not exist. In such a world, people would vote more on the basis of their local knowledge and one source of correlated opinion would disappear. The conditions required for there to be wisdom in crowds would therefore be closer to being fulfilled.

Which leads to a further problem: should anything be done about this?

One strong answer is nothing, at least legally.

Breaking up the Murdoch empire would not shift the press to the left: there are lots more right-wing billionaires who might buy up his papers.

Also, there are trade-offs between values. Perhaps we must sacrifice some good governance in order to have free speech, just as with Brexit we sacrifice some prosperity for democracy. This is especially true because it is woefully naïve to think restrictions on the press will impinge only upon the Sun and Telegraph. The law is often a weapon against the weak and marginalized. Restricting free speech would mean silencing bloggers, not the Sun.

Why not instead just let market forces and demographics do their job? Newspaper circulation is dropping, and is low among the young. They are losing relevance.

Which raises another possibility. Could we use countervailing power?

We know this can work. Capitalism was much more successful – even by its own lights – when trades unions restrained management. Might a similar thing be true of the press?

I’m not thinking here merely of XR blockades, nor even of the growing influence of new media. For me, one priority for a future Labour government would be the creation of institutions of deliberative (pdf) democracy such as citizens’ juries. These would provide space for people to consider policy issues on the basis of evidence provided by expert witnesses. Policy formation would then become insulated from press pressure, similar to how judges strive to ensure newspapers have no influence upon criminal trials. The press would then lose its political relevance.

Libertarian Marxists have traditionally looked forward to the state withering away. With some effort we might, more justifiably, look forward to the press doing so.
