## Sunday, August 14, 2005

### Ravens and Probability Paradoxes

My lecturer wasn't entirely convinced by my solution to the raven paradox. Recall, I claimed that we only confirm a universal statement (e.g. "all ravens are black") through trying as hard as we can to falsify it (i.e. by finding coloured ravens). If our best attempts at falsification fail, then that confirms the original hypothesis.

An important implication of my view is that no confirmation can arise from sampling a population which couldn't possibly falsify the hypothesis in any case. For example, coloured ravens might turn up in a population of coloured things, but not in a population of herrings. So if you sample coloured things and find a red herring, that (slightly) confirms that there are no coloured ravens. But finding a red herring when you're sampling from a bucket of herrings does not provide any confirmation whatsoever. For the same reason, finding black ravens in a sample of black things (as opposed to, say, a sample of ravens) would, surprisingly enough, do nothing at all to confirm that all ravens are black.

Doug wanted to deny this, and claim instead that even my bucket of herrings confirms the hypothesis C0: 'there are no coloured ravens'. He says that C0 "is disconfirmed by everything that is a coloured raven and confirmed by everything else." In other words, he agrees with Hempel's R1 account of confirmation. Now, I think that R1 is false, and that confirmation instead depends upon surviving attempted falsification, for reasons previously explained. But my lecturer offered a couple of interesting arguments which I want to consider here.
> E.g., consider your bucket of red herrings. It's true that once you tell me it's a bucket of herrings then there is no point in me searching through it in order to obtain confirmation of the raven hypothesis. But that is because I already got all the relevant information when you told me it contained herrings.

But note that if R1 were true, then it would make a difference how many herrings were in the bucket. Each individual herring co-exemplifies the properties of 'being coloured' and 'being non-raven', and thus (according to R1) confirms that all coloured things are non-raven (restatement of C0). But we should recognize that it makes no difference whatsoever whether there are two or two million herrings in the bucket. If I sample the bucket of herrings and find two million red herrings inside, that does not provide any greater evidence for C0 than it would if I merely found two herrings in the bucket. Hence R1 is a mistaken account of confirmation.

Doug's next argument is more complicated. Here's my gloss on it: There are various rival hypotheses we might make about the number of coloured ravens in the world. On one hypothesis, call it H0 (=C0), there are no coloured ravens. On another, let's call it H1, everything is a coloured raven. The bucket of herrings allows us to disconfirm H1, and thus it slightly confirms the rival hypothesis H0.

I don't think this argument works, for two reasons. Firstly, there are infinitely many such rival hypotheses. (Let r be any real number between 0 and 1. Then Hr is the hypothesis that r is the proportion of worldly objects that are coloured ravens.) So when H1 is disconfirmed, the probability value we had previously assigned to it (which might be infinitesimal anyway) gets redistributed over infinitely many rival hypotheses. So the confirming increment to H0 in particular is 1/infinity = zero. [Update: This is mistaken - see comments. However, for some different and better counterarguments, see my new post.]

Second reason: we can develop a paradox of interpretation out of this. Although Doug divided the rival hypotheses up in proportional terms, we might instead list our rival hypotheses in terms of brute cardinality. Let Cn be the claim that there are n coloured ravens. But note that the observation of red herrings does not disconfirm any hypothesis Cn. (The world could contain n coloured ravens plus a bucket of red herrings.) So the observation fails to confirm C0, the claim that there are no coloured ravens.

An interesting paradox arises if we held that H0 was confirmed. For C0 is identical to H0, but C0 was not confirmed. This is contradictory. We assign different probabilities to one and the same state of affairs, depending on how we describe it. (Of course, we can easily avoid this paradox by simply denying that H0 was really confirmed after all, but let's put that aside for now.) This reminds me of Bertrand's paradox:
A factory produces cubes with side-length between 0 and 1 foot; what is the probability that a randomly chosen cube has side-length between 0 and 1/2 a foot? The tempting answer is 1/2, as we imagine a process of production that is uniformly distributed over side-length. But the question could have been given an equivalent restatement: A factory produces cubes with face-area between 0 and 1 square-feet; what is the probability that a randomly chosen cube has face-area between 0 and 1/4 square-feet? Now the tempting answer is 1/4, as we imagine a process of production that is uniformly distributed over face-area. This is already disastrous, as we cannot allow the same event to have two different probabilities...

The principle of indifference is pretty dodgy. That's all I have to say for now.

1. Aren't there a finite number of worldly objects? If I knew there were N objects, then observing k herrings rules out C(N-k+1) up to CN.

In the real world I don't know what N is, I just know it's big, and maybe I could describe some distribution to say how much I believe in all the possible Ns.

Then I would have to formulate a similar distribution describing how much I believe C0..Cn, because my beliefs about the number of coloured ravens depend on my beliefs over all the Ns.

In theory this distribution over C0..Cn should change a tiny bit when you tell me you have a bucket of herrings. In reality if I knew it was just a 10 litre bucket, I wouldn't bother to update my beliefs. But if you told me that you knew of a bucket containing say 10^30 herrings, then updating the belief distribution would be worthwhile.

2. Interesting idea. But I think it's backwards to hold N constant and revise your beliefs about Cn. You should do the opposite instead. If I'm told there exists a bucket with (say) a googol herrings in it, what I should conclude is that N is a whole lot larger than I'd previously thought. It says nothing at all about how many coloured ravens are in the world.

I just don't think it's legitimate (or rational) to treat N as an unrevisable constant.

3. Finding a googol herrings is stronger proof of a rise in the number of H's than of the size of the universe, isn't it? So I still see vague support for the argument.

4. I've never been comfortable with arguments about the validity of induction, because they all seem to be trying really hard to show that induction can produce truth, i.e. certainty. It seems much safer to me to fall back to the lesser hypothesis that induction produces estimates of the likelihood of a proposition.

Suppose that I estimate through some other means that there are 100 ravens in the world (obviously this is wrong, but the problem is not qualitatively different if there are a hundred million ravens, only quantitatively), and that I want to examine the possibility of colored ravens. If there is at least one colored raven in the world, I can ask myself how many random observations of ravens I would have to make before I would be, say, 99% sure to see one. The probability calculation is fairly simple. Suppose there's exactly one colored raven. Then the probability that a set of N observations in a row will contain entirely black ravens is (99/100)^N, where ^ denotes exponentiation. Playing with my trusty calculator for a few minutes shows that (99/100)^450 = .01086 (approx.). What this means is that making 450 random observations consisting only of non-colored ravens is very unlikely (1.1% chance) if my assumption that there is a colored raven is true.

Suppose, on the other hand, that there are a million "things" in the world, including the 100 ravens. How long would I have to examine random things before being able to conclude that colored ravens are unlikely? In other words, for what N is (999,999/1,000,000)^N < .01 ? More fidgeting with my calculator shows that N=4,606,000 gives .00999.
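Those back-of-envelope figures are easy to check. Here's a minimal Python sketch of the same calculations (the `p_all_black` helper is my own naming, and it assumes sampling with replacement, as the comment's rough model implicitly does):

```python
import math

def p_all_black(population, colored, n_obs):
    """Chance of n_obs consecutive non-colored observations when
    `colored` of the `population` items are colored (sampling with
    replacement, per the comment's rough model)."""
    return ((population - colored) / population) ** n_obs

# 100 ravens, exactly 1 colored: chance of 450 all-black observations
print(p_all_black(100, 1, 450))           # roughly 0.0109

# A million "things", 1 colored raven: observations needed for < 1%
n = math.ceil(math.log(0.01) / math.log(999_999 / 1_000_000))
print(n)                                   # about 4.6 million

# Sensitivity check: if there are actually 150 ravens, 450 all-black
# observations still leave under a 5% chance under the 1-colored hypothesis
print(p_all_black(150, 1, 450))            # still under 0.05
```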

The hard part, of course, is making random observations within the right population. If I look out my window and see three black ravens, and do the same thing the next day, I haven't really made six random observations of ravens, I've made six observations of ravens that live near me. What if all the colored ravens live on the other side of the world? Also, it seems likely that the same ravens might be sitting outside my window two days in a row, so my second set of three observations isn't really very random.

All of these calculations are conditional on the accuracy of my initial estimate that there are 100 ravens in the world and 1000000 "things" in the world. If the estimate varies, the number of observations necessary to establish this kind of strong likelihood ( < 1% chance of error) will vary too. (Though less than I might have thought; if there are 150 ravens, then 450 observations of black ravens still gives less than 5% chance that a non-black raven exists.) But more important than the actual calculations, I think, is the more concrete look at exactly how the "evidence" accumulates.

5. Yeah, that's pretty much my take on it too. I wasn't exactly sure how to formalize it though -- is it Bayesian stats?

6. I don't think it's Bayesian. At least, it's not consciously so, since I really don't know anything significant about Bayesian statistics. As far as I'm aware, what I've described is just basic probability as I half-remember it from college years ago. I would hazard a guess that a more serious statistical treatment would not be nearly so careless about the randomness of the observations, and might be able to deal with uncertainty as to the size of the population, too. And there's probably half a dozen other oversimplifications in my post too that I'm not even aware of. Statistics is one of those disciplines I always promised myself I'd get around to learning something about one of these days...

7. I was bored, so I googled for "bayesian statistics introduction" and read this brief paper on it. Based on that, I think a Bayesian treatment might go something like this.

Supposing there are 100 ravens in the world, we want to determine which hypothesis is most likely: that there are 0, 1, or 2 colored ravens among them. Since we're starting from total ignorance, we assign each hypothesis an equal prior probability: p(0) = p(1) = p(2) = 1/3. We then observe, say, 300 actual ravens, all of which are black. In hypothesis 0, the likelihood of this event is (1)^300 = 1. In hypothesis 1 it's (99/100)^300 = .049, and in hypothesis 2 it's (98/100)^300 = .00233. Using the tabular calculation described in the link above, we get:

| hypothesis | 0 colored ravens | 1 colored raven | 2 colored ravens |
|---|---|---|---|
| prior probability | .333 | .333 | .333 |
| likelihood | 1 | .049 | .00233 |
| prior × likelihood | .333 | .0163 | .000776 |
| posterior probability | .951 | .0466 | .00222 |

(The prior × likelihood row totals .350, which is the normalizing constant for the posteriors.)

Which we would then interpret to mean that there was 95% chance that there are no colored ravens, less than 5% chance that there is exactly one, and .2% that there are two.
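For what it's worth, the tabular calculation is a few lines of Python (a sketch assuming the comment's setup: 100 ravens, uniform priors over 0, 1, or 2 colored ravens, and 300 all-black observations):

```python
# Bayesian update for the raven hypotheses described above.
priors = {0: 1/3, 1: 1/3, 2: 1/3}   # uniform over 0, 1, 2 colored ravens
n_obs = 300

# Likelihood of 300 all-black observations, given 100 ravens total
likelihoods = {h: ((100 - h) / 100) ** n_obs for h in priors}

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())   # about .350
posteriors = {h: p / total for h, p in unnormalized.items()}

for h, p in posteriors.items():
    print(f"{h} colored raven(s): {p:.3f}")   # about .951, .047, .002
```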

Remember, everything I know about Bayesian statistics is based on 20 minutes of googling, so don't take me as an authority, but it seems like Bayesian methods are good for comparing several (or large numbers of) hypotheses on a fixed, reasonably large set of data.

8. I suppose we are not yet taking into account Richard's counter that we are not searching for anything that might falsify the hypothesis? I.e., that we would overlook, let's say, a coloured thing before noting it is a raven.
Does that weaken or negate the argument?

And back to a Richard-esque theme: at a wider level I guess this may have practical issues? E.g., you almost have to remove ravens from the bucket at some point, as the number of things in the bucket approaches the universe...

9. "I just don't think it's legitimate (or rational) to treat N as an unrevisable constant."

That's true: every time you get new information, you must revise your belief distribution over the Ns, as well as your belief distribution over C0...Cn, and all your other beliefs as well. But how much you revise each belief will depend on the information.

So if you provided evidence that the 10^30 herrings you observed were somehow unaccounted for in all our previous estimates of the amount of matter in the universe, then I'd revise my belief distribution over Ns upwards, and I might not change my beliefs over C0..Cn.

On the other hand if you just told me that you'd discovered 10^30 herrings living on 10^20 ordinary planets whose surfaces were made entirely of water, I might revise my beliefs in C0...Cn without necessarily revising my beliefs about N.

[In either case the sudden appearance of all those previously unaccounted-for herrings would make a lot of my beliefs about the nature of the universe much less certain.]

But the main point is that any new information you receive should be examined to see how much it will alter all your existing beliefs.

10. Hmmm, it is rather off the topic, but I'm glad that there are not a googol herrings in our visible universe, because they wouldn't fit. I'm going to go try and work out if there is room for 10^30. There might be room, but it might also start a gravitational collapse to a Big Crunch.

Anybody consider trying to keep their examples more on the side of reality? Just a thought

11. driftwood,
well it would make great proof then!!
it might well prove ravens don't exist at all :)

12. If counterfactual A is true, then counterfactual B is true?

Boy, we could really go off in to La La Land with that one. (If wishes were horses...)

13. I once saw a wonderful proof that if there exists any statement that is both false and true at the same time, then every other possible statement is true (and false, since each statement's negation is also true). Good times, that.

14. I meant to write this comment earlier, so I hope that you're still reading these. Richard, your first argument fails. You write:

Firstly, there are infinitely many such rival hypotheses. (Let r be any real number between 0 and 1. Then Hr is the hypothesis that r is the proportion of worldly objects that are coloured ravens.) So when H1 is disconfirmed, the probability value we had previously assigned to it (which might be infinitesimal anyway) gets redistributed over infinitely many rival hypotheses. So the confirming increment to H0 in particular is 1/infinity = zero.

First, consider this as a general claim: redistributing probability from a single disconfirmed hypothesis to infinitely many other hypotheses never increases the probability of another hypothesis by more than an infinitesimal amount. This general claim is false, as the following counterexample shows. Suppose I flip a fair coin until I get a tails, and then stop. How many times will I flip the coin? There are an infinite number of possibilities (1,2,3,...), but each possibility has a nonzero probability. The probability of exactly n flips until the first tails is 1/(2^n). Let's say that we obtain the following piece of evidence: the first flip is a heads. Then the hypothesis of 1 flip is refuted, and the probability value that we had previously assigned to it (which was 1/2) gets redistributed over the other (countably) infinite number of other hypotheses. Each of these hypotheses gains a nonzero amount of probability (in fact, each of them doubles in probability), with the hypothesis of 2 flips gaining by 1/4.
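This coin-flip counterexample can be verified exactly with rational arithmetic. A minimal sketch (the function names are mine):

```python
from fractions import Fraction

# Geometric setup: P(first tails on flip n) = 1/2**n.
def prior(n):
    return Fraction(1, 2 ** n)

# Evidence: the first flip was heads, ruling out n = 1 (prior mass 1/2),
# so the surviving hypotheses also carry total mass exactly 1/2.
remaining_mass = 1 - prior(1)

def posterior(n):                  # defined for n >= 2
    return prior(n) / remaining_mass

# Every surviving hypothesis exactly doubles; n = 2 gains 1/4.
assert all(posterior(n) == 2 * prior(n) for n in range(2, 50))
print(posterior(2) - prior(2))     # 1/4
```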

So the general claim fails. What about the narrower claim that the probability gain to H0 is zero in the case of the ravens? If we assume that the total number of objects is finite, then the narrower claim also looks to be false. In that case, r cannot vary over all of the real numbers from 0 to 1. The only relevant hypotheses are those where r is a rational number (since r is the number of colored ravens divided by the number of objects). This means that we're using discrete probability distributions, not continuous ones, since the domain is a countable set (the rational numbers). With discrete probability distributions you're dealing with the sums of probabilities (possibly infinite sums), so some individual points must have nonzero probabilities. Refuting a hypothesis with nonzero probability must add nonzero probability to at least one other hypothesis. (With continuous distributions, you are dealing with integrals over probability weights, and any single outcome has probability zero, so the kind of thing you're talking about does happen.)

I assumed that the total number of objects is finite, although it could be some really huge integer, and we might not even be able to put an upper bound on it (that is, for any integer n, there is a nonzero probability that the number of objects is greater than n). I think that we want to make this assumption because, besides the philosophical difficulties of saying that there are infinitely many objects, there are mathematical difficulties that create problems for your argument about ravens. For one, hypothesis testing becomes problematic if the total number of objects is infinite (e.g. finite random sampling no longer provides evidence for C0 over C1). Additionally, H0 would not be identical to C0, since the proportion of objects that are colored ravens could equal zero if there are infinitely many objects and any finite number of ravens (so H0 would include the disjunction of Cn for all nonnegative integers n).

I agree with you on the main points of the raven argument. R1 is false and the sampling procedure matters. I think that this becomes even clearer if you use a different example rather than the ravens. For instance, say you're looking for extraterrestrial life. Is there life outside our solar system? Your hypothesis is that, no, there is not: Every object that is C1) alive is also C2) within our solar system. If you look around your house and see your dog, then you have observed an object with both attributes: it is C1) alive and C2) within our solar system. According to R1 that is evidence for your hypothesis that there is no life outside our solar system. But this is ridiculous - there's no way that you could get direct evidence about whether there is life outside our solar system just by looking around your house.

15. Yeah, you're quite right about my first argument -- I hereby retract it.

I am worried about the "ruling out rival hypotheses" argument, however. For it leads to the denial of your final remark. Observing my neighbour's dog allows me to rule out the hypothesis that all lifeforms are extraterrestrial. Thus, my lecturer argues, this slightly confirms each rival hypothesis, including that no lifeforms are extraterrestrial. So we're stuck with R1's "ridiculous" conclusion, unless something like my second ("paradox of interpretation") argument works?

16. The main benefit of the extraterrestrial example over the raven example, I think, is plausibility. Namely, R1 is implausible in this case, whatever you say about proportions. Indeed, it seems more plausible (at least to me) that seeing the dog in this solar system would increase the probability that there are living things outside the solar system. This is because, in addition to ruling out the hypothesis that all lifeforms are extraterrestrial, it rules out the hypothesis that there are no lifeforms (we're assuming, for both of these, that the dog is the only lifeform known to exist - I suppose we could make weaker, probabilistic versions of these claims if we knew of the existence of other lifeforms, like ourselves).

This is clearer with another example: the search for black holes. I'm not sure if the following example is historically accurate, but suppose that the following events occurred, in chronological order:

1. Scientists determine that the sun is one star in a galaxy of many others, and that our galaxy is one of many different galaxies of stars that are similar in many important respects.
2. Physicists develop a theory that allows for the existence of what are now known as black holes. There is no observational evidence for the existence of black holes.
3. A philosopher makes the following conjecture: every black hole in existence is located within our galaxy. That is, every object that is C1) a black hole is also C2) within our galaxy.
4. Astronomers observe a black hole in our galaxy.

What does this observation do to the philosopher's hypothesis? It seems obvious that it greatly reduces the probability that the philosopher's hypothesis is true, but R1 holds that the reverse must happen. I believe that the hypothesis is unlikely to be true if there are many black holes, since galaxies are so similar. Its best chance of being true was if there were no black holes in existence anywhere, in which case it would be trivially true, but the observation rules out this possibility.

I think that this reason could be formalized. Let H(p/q) be the hypothesis that the proportion of some particular type of objects among a larger set of objects is p/q, where p/q is a fraction in reduced form (e.g. lifeforms outside our solar system out of all lifeforms). Let C(p) be the hypothesis that there are p of those objects. Let D(q) be the hypothesis that the larger set contains exactly q objects. Then H(p/q) is equivalent to the infinite disjunction {C(p)&D(q) OR C(2p)&D(2q) OR ... OR C(kp)&D(kq) OR ...}, where k takes on the value of every positive integer. Every observation rules out some joint hypotheses C(x)&D(y) - for instance, seeing a lifeform in our solar system rules out those joint hypotheses where y=0 and those where x=y. Observations can also probabilistically influence hypotheses, making some more likely than others. It's important to note that altering the probabilities of different proportions p/q could just occur by affecting the denominator q rather than the numerator p (in terms of hypotheses, the D's rather than the C's).

I think that the best way to counter R1, though, is to abandon these broad examples for something more controlled, like the Wason selection task. Suppose that it is given that each of these four cards on the table has a letter on one side (a-z) and a number on the other side (1-9). On the sides facing you the four cards say A, B, 1, and 2. The hypothesis in question is that every card that C1) has a vowel on its letter-side also C2) has an odd number on its number-side. R1 implies that flipping over the card with a 1 on it could give you evidence in favor of the hypothesis, but that is wrong, as Chris explained. We could even make this more rigorous, if we claim that the cards received their labels as follows: first, I wrote the letters and numbers that you see on them: A, B, 1, and 2. Then, for each card I flipped a fair coin, and if it landed heads I put an A or a 1 on the other side (depending if it was a letter-side or a number-side), and if it landed tails I put a B or a 2. Then there is no way that the card that says 1 could give you information about whether the hypothesis is correct.
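Under the coin-flip labeling just described, the claim can be checked by brute-force enumeration of the 16 equally likely hidden-side assignments (a sketch; the helper names are mine):

```python
from itertools import product

def hypothesis_holds(a_num, b_num, one_letter, two_letter):
    """Every card with a vowel (A) on its letter side has an odd number."""
    cards = [("A", a_num), ("B", b_num), (one_letter, 1), (two_letter, 2)]
    return all(num % 2 == 1 for letter, num in cards if letter == "A")

# Hidden sides: numbers behind the A and B cards, letters behind 1 and 2,
# each chosen by an independent fair coin flip.
worlds = list(product([1, 2], [1, 2], ["A", "B"], ["A", "B"]))

p_true = sum(hypothesis_holds(*w) for w in worlds) / len(worlds)

# Condition on the flipped "1" card showing an A:
shows_a = [w for w in worlds if w[2] == "A"]
p_given_a = sum(hypothesis_holds(*w) for w in shows_a) / len(shows_a)

print(p_true, p_given_a)   # 0.25 0.25 -- the "1" card tells you nothing
```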
