Lately I’ve been thinking a lot about the basics of probability theory. I originally learned the definition of a probability measure from a logician — Wes Holliday — and his definition has probably permanently influenced how I think about probability, but I think that even analysts like myself agree that the formulation of probability in terms of sample spaces is a little unnatural. For example, one typically makes it a point to never mention specific points in the sample space, but only events — elements of a sigma-algebra over the sample space. Moreover, one typically identifies an event which happens almost surely with the sure event. Random variables are identified with their distributions, and so isomorphism of random variables “forgets” about events that almost surely do not happen, as well as individual points in the sample space. Sample spaces are also utterly uninteresting: up to measurable isomorphism, there is only one uncountable standard Borel space, and in practice most probability spaces are Polish and therefore standard. There is also a lot of “junk” in the power set of a standard Borel space: the Borel sigma-algebra has cardinality only $2^{\aleph_0}$, while the power set has cardinality $2^{2^{\aleph_0}}$, so all but a small cardinality of subsets fail to be measurable. Simply put, the sample space contains far more information than a probabilist is willing to use.

Let me stress that once one gets to serious probability theory, which I guess I would define as beginning when one proves the central limit theorem, the above is not terribly important (though it can help — see “An uncountable Moore-Schmidt theorem”, a paper in ergodic theory which in part inspired this blog post). Worrying about whether one has extra information won’t help you prove inequalities, which is what analysis is all about. But I think when one is still learning the basics, or is considering results which are either deeply infinitary (as in the case of the uncountable Moore-Schmidt theorem) or purely “measurable” (only having to do with sigma-algebras), it is instructive to step back and think about what a sample space really is.

As for myself, I’m on the teaching staff of a measure theory course this semester, and an exercise this week was to prove the following theorem, which appears in the exercises of both Billingsley and papa Rudin.

**Theorem 1.** If $\mathcal{F}$ is an infinite sigma-algebra, then $\mathcal{F}$ has cardinality at least that of $\mathbb{R}$.

This exercise is hard essentially because it is not measure-theoretic. In fact, while the student is probably tempted to concoct a horribly complicated measure-theoretic argument, the statement is purely combinatorial in nature, and has a simple combinatorial proof if one *forgets* the underlying set $\Omega$ over which $\mathcal{F}$ is defined. Of course, this proof is totally equivalent to a similar proof about sigma-algebras, but there will be some pitfalls — I will point out one in particular — that arise if one is tempted to refer to $\Omega$, and so that one is not tempted by Lucifer in his guise as $\Omega$, it is better to remove that temptation entirely.

Before proving Theorem 1, let me demonstrate how one can think about measure theory and probability without ever referring to $\Omega$.

We start with the combinatorial preliminaries.

**Definition 2.** By a *countably complete lattice*, one means a partially ordered set $\mathcal{F}$ such that every countable subset $\{A_n : n \in \mathbb{N}\}$ has a supremum and infimum. In this case, we write $\bigwedge_n A_n$ (read “$A_1$ and $A_2$ and …”) to denote $\inf_n A_n$, and similarly $\bigvee_n A_n$ (“$A_1$ or $A_2$ or …”) for $\sup_n A_n$. One says “$A$ implies $B$” to mean $A \leq B$.

**Definition 3.** By a *countably complete boolean algebra* one means a countably complete lattice $\mathcal{F}$ equipped with a minimum $\bot$ (“false”) and a maximum $\top$ (“true”), such that:

- For every $A, B, C \in \mathcal{F}$, $A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C)$.
- For every $A \in \mathcal{F}$, there is a unique $\neg A$ (“not $A$”) such that $A \wedge \neg A = \bot$ and $A \vee \neg A = \top$.

Since we will only need countably complete boolean algebras, we will simply refer to such objects as *boolean algebras*. One immediately checks that the morphisms in the category of boolean algebras are those maps which preserve countable “and”, countable “or”, “true”, and “false”; that the usual laws of classical logic hold in any boolean algebra; and that every sigma-algebra is a boolean algebra under $\wedge = \cap$, $\vee = \cup$, and $\neg = $ complementation. Conversely, I think that given a suitable form of the axiom of choice, one can recover a measurable space $(\Omega, \mathcal{F})$ from a boolean algebra $\mathcal{F}$ by letting $\Omega$ consist of a suitable subset of the set of ultrafilters on $\mathcal{F}$; however, the choice of $\Omega$ is not unique (a nice exercise).
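To make this concrete, here is a minimal Python sketch (mine, not part of the mathematics above) that models the power set of a small finite set as a boolean algebra, with $\wedge = \cap$, $\vee = \cup$, and $\neg$ given by complementation, and verifies the two laws of Definition 3 by brute force. In the finite case countable completeness is automatic.

```python
from itertools import combinations

OMEGA = frozenset(range(4))  # a four-point "sample space", for illustration only

# the boolean algebra: here, the full power set of OMEGA
F = [frozenset(c) for r in range(len(OMEGA) + 1)
     for c in combinations(sorted(OMEGA), r)]
BOT, TOP = frozenset(), OMEGA

meet = lambda a, b: a & b   # "and"
join = lambda a, b: a | b   # "or"
neg = lambda a: OMEGA - a   # "not"

# Distributivity: A ∧ (B ∨ C) = (A ∧ B) ∨ (A ∧ C)
assert all(meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
           for a in F for b in F for c in F)

# Complementation: A ∧ ¬A = ⊥ and A ∨ ¬A = ⊤
assert all(meet(a, neg(a)) == BOT and join(a, neg(a)) == TOP for a in F)
```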

The setting of boolean algebras feels like a very natural setting for probability theory to me, because probability theory by design is concerned with *events*, things for which it makes sense to conjoin with the words “and” and “or” and “not”. One can now easily use the language of propositional logic: for example, two events $A, B$ are mutually exclusive if $A \wedge B = \bot$. Let us now state the usual basic notions of probability theory in this new language:

**Definition 4.** Let $\mathcal{F}$ be a boolean algebra. A *probability measure* on $\mathcal{F}$ is a mapping $P: \mathcal{F} \to [0, 1]$ such that $P(\top) = 1$ and, whenever $A_1, A_2, \dots$ are a sequence of pairwise mutually exclusive events, $P\left(\bigvee_n A_n\right) = \sum_n P(A_n)$.
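In a finite model, countable additivity reduces to finite additivity, and a probability measure on a power-set algebra is determined by a weight on each point. A small sketch (the weights here are made up purely for illustration):

```python
from fractions import Fraction

OMEGA = frozenset(range(4))
# made-up weights; on a finite power-set algebra, a probability measure
# is determined by the weight it assigns each point
weight = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

def P(event):
    """P(event) for an event, i.e. a subset of OMEGA."""
    return sum((weight[w] for w in event), Fraction(0))

assert P(OMEGA) == 1            # P(true) = 1
A, B = frozenset({0}), frozenset({1, 2})
assert A & B == frozenset()     # A and B are mutually exclusive
assert P(A | B) == P(A) + P(B)  # additivity on exclusive events
```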

**Definition 5.** Let $\mathcal{F}$ be a boolean algebra and $X$ be a standard Borel space. An *$X$-valued random variable* on $\mathcal{F}$ is a morphism of boolean algebras $x^*: \mathcal{B}(X) \to \mathcal{F}$, where $\mathcal{B}(X)$ is the Borel sigma-algebra of $X$.

**Definition 6.** Let $\mathcal{F}$ be a boolean algebra and $X$ be a standard Borel space. Let $x$ be an $X$-valued random variable on $\mathcal{F}$. If $P$ is a probability measure on $\mathcal{F}$ and $B \in \mathcal{B}(X)$, one writes $P(x \in B)$ to mean $P(x^*(B))$. The *distribution* of $x$ is the measure $B \mapsto P(x \in B)$ on $X$. If $X = \mathbb{R}$, the *cdf* of $x$ is the function $t \mapsto P(x \in (-\infty, t])$.
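Definition 6 can also be checked in a finite model: the random variable is the preimage morphism $x^*$ of an ordinary function, and the distribution and cdf are computed by pushing $P$ through it. A hypothetical sketch, where the function `f` and the weights are made up for illustration:

```python
OMEGA = frozenset(range(4))
weight = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}  # made-up probabilities
P = lambda event: sum(weight[w] for w in event)

# a made-up underlying function Omega -> R; its preimage map is the random variable
f = {0: 1.0, 1: 2.0, 2: 2.0, 3: 5.0}

def pullback(B):
    """x*(B): pull a Borel set B back to an event; a morphism of boolean algebras."""
    return frozenset(w for w in OMEGA if f[w] in B)

# distribution: B |-> P(x in B) = P(x*(B))
assert P(pullback({2.0})) == 0.375

# cdf: t |-> P(x in (-inf, t])
cdf = lambda t: P(frozenset(w for w in OMEGA if f[w] <= t))
assert cdf(2.0) == 0.875
```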

**Definition 7.** Let $\mathcal{F}, \mathcal{G}$ be boolean algebras. Let $x, y$ be $X$-valued random variables on $\mathcal{F}, \mathcal{G}$ respectively. A *morphism of random variables* $x \to y$ is a morphism of boolean algebras $\varphi: \mathcal{F} \to \mathcal{G}$ such that $\varphi \circ x^* = y^*$.

Given these definitions, I think that it is now straightforward to formulate the central limit theorem, the Borel-Cantelli lemma, et cetera. One can drop the word “almost” from many such statements by passing to the quotient $\mathcal{F}/\mathcal{N}$, where $\mathcal{N}$ is the ideal of events that almost never happen.

We now prove Theorem 1 in two lemmata.

**Lemma 8.** Let $\mathcal{F}$ be an infinite boolean algebra. Then there exists an infinite set of pairwise mutually exclusive events in $\mathcal{F}$.

To prove Lemma 8, let $\mathcal{F}_0 = \mathcal{F}$ and reason by induction. Suppose that we have an infinite boolean algebra $\mathcal{F}_n$ which is a quotient of $\mathcal{F}$, and events $A_1, \dots, A_n$. Since $\mathcal{F}_n$ is infinite, there exists $A_{n+1} \in \mathcal{F}_n$ such that $\{B \in \mathcal{F}_n : B \leq \neg A_{n+1}\}$ is infinite and $A_{n+1} \neq \bot$.

We now condition on $\neg A_{n+1}$, thus set $\mathcal{F}_{n+1} = \mathcal{F}_n/\langle A_{n+1} \rangle$. Here $\langle A_{n+1} \rangle$ is the ideal generated by $A_{n+1}$. There is a natural bijection $\mathcal{F}_n \to \mathcal{F}_n/\langle A_{n+1} \rangle \times \mathcal{F}_n/\langle \neg A_{n+1} \rangle$, since any element $B \in \mathcal{F}_n$ can be decomposed as $B = (B \wedge \neg A_{n+1}) \vee (B \wedge A_{n+1})$, giving a decomposition $B \mapsto (B \wedge \neg A_{n+1}, B \wedge A_{n+1})$. Therefore we can view any element of $\mathcal{F}_{n+1}$ as not just an element of $\mathcal{F}_{n+1}$ but an element of $\mathcal{F}_n$ (where it is identified with its unique representative $B \leq \neg A_{n+1}$), and hence, by induction, $A_{n+1}$ as an element of $\mathcal{F}$.

The bijection implies that $\mathcal{F}_{n+1}$ is infinite, so we can keep the induction going. Viewing each $A_{n+1}$ as an element of $\mathcal{F}$ (so that $A_{n+1} \leq \neg A_1 \wedge \dots \wedge \neg A_n$), one easily checks that the $A_n$ form a sequence of pairwise mutually exclusive events. This completes the proof of Lemma 8.
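The decomposition at the heart of the proof, $B \mapsto (B \wedge \neg A, B \wedge A)$, can be sanity-checked in a finite model. The following sketch (illustrative only, with $\mathcal{F}$ the power set of a four-point set) verifies that conditioning on $A$ and on $\neg A$ splits the algebra into a product:

```python
from itertools import combinations

OMEGA = frozenset(range(4))
F = [frozenset(c) for r in range(len(OMEGA) + 1)
     for c in combinations(sorted(OMEGA), r)]

A = frozenset({0, 1})  # any event other than false/true
notA = OMEGA - A

# conditioning on an event E: the algebra {B ∧ E : B ∈ F}
restrict = lambda E: {B & E for B in F}

split = lambda B: (B & notA, B & A)  # B = (B ∧ ¬A) ∨ (B ∧ A)
images = {split(B) for B in F}

# the decomposition is a bijection onto the product of the conditioned algebras
assert len(images) == len(F)
assert images == {(u, v) for u in restrict(notA) for v in restrict(A)}
```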

Here, a lot of students got stuck (in the usual “sample space” language). The reason is that they tried to construct the set of mutually exclusive events using properties of $\Omega$. For example, a common thing to do was to try to let the events be in bijection with $\Omega$ by letting $A_\omega = \bigcap \{A \in \mathcal{F} : \omega \in A\}$. In general $A_\omega$ might be a nonmeasurable set; this can be fixed by assuming towards contradiction that $\mathcal{F}$ is countable. More importantly, the map $\omega \mapsto A_\omega$ is not injective in general, essentially for the same reason that $\omega$ is not uniquely determined by the events containing it. By replacing $\Omega$ with a suitable quotient, and showing that this quotient is also infinite, one can force $\omega \mapsto A_\omega$ to be injective, but this can be rather messy. In addition, assuming that $\mathcal{F}$ is countable will weaken this lemma, so that it does not imply Theorem 1 unless one invokes the continuum hypothesis. Moral: do not ever refer to points in the sample space!

**Lemma 9.** Let $\mathcal{F}$ be a boolean algebra and suppose that there is a countably infinite set $\{A_n : n \in \mathbb{N}\}$ of pairwise mutually exclusive events. Then the boolean algebra generated by $\{A_n\}$ has cardinality at least $2^{\aleph_0}$.

To prove Lemma 9, let $A_1, A_2, \dots$ be an enumeration of the given events; discarding one event if necessary, we may assume each $A_n \neq \bot$. For a set $S$ of natural numbers, define $B_S = \bigvee_{n \in S} A_n$. Then $S \mapsto B_S$ is an injective map $2^{\mathbb{N}} \to \mathcal{F}$. Indeed, if $S \neq T$, then suppose without loss of generality that there is some $n \in S \setminus T$. Then $A_n \wedge B_S = A_n \neq \bot$ and $A_n \wedge B_T = \bigvee_{m \in T} (A_n \wedge A_m)$. Since the $A_m$ are pairwise mutually exclusive, the only way that $A_n \wedge B_T \neq \bot$ is if $n \in T$, a contradiction; hence $B_S \neq B_T$. This proves Lemma 9, and hence Theorem 1.
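The injection $S \mapsto B_S$ can be checked on a finite truncation; in this sketch the pairwise exclusive events $A_n$ are made-up singletons:

```python
from itertools import combinations

# pairwise mutually exclusive, nonzero events A_0, ..., A_4 (made-up singletons)
A = [frozenset({n}) for n in range(5)]

def B(S):
    """B_S: the join ("or") of A_n over n in S; here, a union of disjoint sets."""
    return frozenset().union(*(A[n] for n in S))

subsets = [frozenset(c) for r in range(6) for c in combinations(range(5), r)]
images = {B(S) for S in subsets}

# S |-> B_S is injective: all 2^5 subsets give distinct events
assert len(subsets) == 32 and len(images) == 32
```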
