# Boolean algebras and probability

Lately I’ve been thinking a lot about the basics of probability theory. I originally learned the definition of a probability measure from a logician — Wes Holliday — and his definition has probably permanently influenced how I think about probability, but I think that even analysts like myself can agree that the formulation of probability in terms of sample spaces is a little unnatural. For example, one typically makes it a point never to mention specific points in the sample space, but only events — elements of a sigma-algebra over the sample space. Moreover, one typically identifies an event which happens almost surely with the sure event. Random variables are identified with their distributions, and so isomorphism of random variables “forgets” about events that almost surely do not happen, as well as individual points in the sample space. Sample spaces are also utterly uninteresting: up to measurable isomorphism, there is only one uncountable standard Borel space, and in practice most probability spaces are Polish and therefore standard. There is also a lot of “junk” in the power set of a standard Borel space: of its $2^{2^{\aleph_0}}$ subsets, only $2^{\aleph_0}$ are measurable. Simply put, the sample space contains far more information than a probabilist is willing to use.

Let me stress that once one actually gets to serious probability theory, which I guess I would define as beginning when one proves the central limit theorem, the above is not terribly important (though it can help — see An uncountable Moore-Schmidt theorem, a paper in ergodic theory which in part inspired this blog post). Worrying about whether one has extra information won’t help you prove inequalities, which is what analysis is all about. But I think when one is still learning the basics, or is considering results which are either deeply infinitary (as in the case of the uncountable Moore-Schmidt theorem) or purely “measurable” (only having to do with sigma-algebras), it is instructive to step back and think about what a sample space really is.

As for myself, I’m on the teaching staff of a measure theory course this semester, and an exercise this week was to prove the following theorem, which appears in the exercises of both Billingsley and papa Rudin.

Theorem 1. If $\Sigma$ is an infinite sigma-algebra, then $\Sigma$ has cardinality at least that of $\mathbb R$.

This exercise is hard essentially because it is not measure-theoretic. In fact, while the student is probably tempted to concoct a horribly complicated measure-theoretic argument, the statement is purely combinatorial in nature, and it has a simple combinatorial proof if one forgets the underlying set $\Omega$ over which $\Sigma$ is defined. Of course, this proof is totally equivalent to a similar proof about sigma-algebras, but there will be some pitfalls — I will point out one in particular — that arise if one is tempted to refer to $\Omega$; and lest one be tempted by Lucifer in his guise as $\Omega$, it is better to remove the temptation entirely.

Before proving Theorem 1, let me demonstrate how one can think about measure theory and probability without ever referring to $\Omega$.

Definition 2. By a countably complete lattice, one means a partially ordered set $\mathbb B$ such that every countable subset $X \subseteq \mathbb B$ has a supremum and an infimum. In this case, we write $x \wedge y$ (read “$x$ and $y$”) to denote $\inf \{x, y\}$, and similarly $x \vee y$ (“$x$ or $y$”) for $\sup \{x, y\}$. One says “$x$ implies $y$” to mean $x \leq y$.

Definition 3. By a countably complete boolean algebra one means a countably complete lattice $\mathbb B$ equipped with a minimum $0$ (“false”) and a maximum $1$ (“true”), such that:

• For every $x, y, z \in \mathbb B$, $x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z)$.
• For every $x \in \mathbb B$, there is a unique $!x \in \mathbb B$ (“not $x$”) such that $x \wedge !x = 0$ and $x \vee !x = 1$.

Since we will only need countably complete boolean algebras, we will simply refer to such objects as boolean algebras. One immediately checks that the morphisms in the category of boolean algebras are those maps which preserve countable “and”, countable “or”, “true”, and “false”; that the usual laws of classical logic hold in any boolean algebra; and that every sigma-algebra is a boolean algebra under $\subseteq$. Conversely, I think that given a suitable form of the axiom of choice, one can recover a measurable space $(\Omega, \mathbb B)$ from a boolean algebra $\mathbb B$ by letting $\Omega$ consist of a suitable subset of the set of ultrafilters on $\mathbb B$; however, the choice of $\Omega$ is not unique (a nice exercise).
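A finite example may help fix ideas. Below is a small Python sketch (entirely my own illustration, not from the text above): the power set of a three-element set, with “and” as intersection, “or” as union, and “not” as complement, satisfies Definition 3; for a finite algebra, countable completeness is automatic.

```python
from itertools import combinations

# A toy model of Definition 3 (illustrative): the power set of a finite set
# Omega is a boolean algebra with "and" = intersection, "or" = union,
# "not" = complement. Finiteness makes countable completeness automatic.
Omega = frozenset({0, 1, 2})

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

B = powerset(Omega)          # all 8 events; 0 = empty set, 1 = Omega

def AND(x, y): return x & y
def OR(x, y): return x | y
def NOT(x): return Omega - x

# The distributive law x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) holds ...
assert all(AND(x, OR(y, z)) == OR(AND(x, y), AND(x, z))
           for x in B for y in B for z in B)

# ... and NOT(x) is the *unique* complement of x.
for x in B:
    complements = [y for y in B if AND(x, y) == frozenset() and OR(x, y) == Omega]
    assert complements == [NOT(x)]
```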

The setting of boolean algebras feels like a very natural setting for probability theory to me, because probability theory by design is concerned with events, things for which it makes sense to conjoin with the words “and” and “or” and “not”. One can now easily use the language of propositional logic: for example, two events $E,F$ are mutually exclusive if $E \wedge F = 0$. Let us now state the usual basic notions of probability theory in this new language:

Definition 4. Let $\mathbb B$ be a boolean algebra. A probability measure $\mu$ on $\mathbb B$ is a mapping $\mu: \mathbb B \to [0, 1]$ such that $\mu(1) = 1$ and, whenever $(E_i)$ is a sequence of pairwise mutually exclusive events, $\mu\left(\bigvee_i E_i\right) = \sum_i \mu(E_i)$.
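In a finite setting a probability measure in the sense of Definition 4 can be written down directly. Here is a minimal Python sketch with arbitrary weights of my own choosing; for a finite algebra, countable additivity reduces to finite additivity.

```python
from fractions import Fraction

# A probability measure (Definition 4) on the power set of a four-point set,
# built from point masses. The weights are an arbitrary illustration.
Omega = frozenset({0, 1, 2, 3})
weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

def mu(event):
    return sum(weights[w] for w in event)

assert mu(Omega) == 1                      # mu(1) = 1: "true" is the sure event
E, F = frozenset({0}), frozenset({1, 2})
assert E & F == frozenset()                # E and F are mutually exclusive ...
assert mu(E | F) == mu(E) + mu(F)          # ... so mu is additive on them
```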

Definition 5. Let $\mathbb B$ be a boolean algebra and $P = (P, \Sigma)$ be a standard Borel space. A $P$-valued random variable $X$ on $\mathbb B$ is a morphism of boolean algebras $X: \Sigma \to \mathbb B$.

Definition 6. Let $\mathbb B$ be a boolean algebra and $P = (P, \Sigma)$ be a standard Borel space. Let $X$ be a $P$-valued random variable on $\mathbb B$. If $\mu$ is a probability measure on $\mathbb B$ and $E \in \Sigma$, one writes $\mu(X \in E)$ to mean $\mu(X(E))$. The distribution of $X$ is the measure $\mu_X(E) = \mu(X \in E)$ on $P$. If $P = \mathbb R$, the cdf of $X$ is the function $x \mapsto \mu(X \leq x)$, where $X \leq x$ denotes the event $X((-\infty, x])$.

Definition 7. Let $\mathbb B,\mathbb B'$ be boolean algebras. Let $X,X'$ be random variables on $\mathbb B,\mathbb B'$ respectively. A morphism of random variables $F: X \to X'$ is a morphism of boolean algebras $F: \mathbb B \to \mathbb B'$ such that $F \circ X = X'$.
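To reconcile Definition 5 with the classical picture: a point map $X: \Omega \to P$ induces the pullback $E \mapsto X^{-1}(E)$ on events, and it is this pullback, rather than the point map, that Definition 5 retains. A Python sketch (the die-parity example is my own):

```python
# A classical random variable is a point map Omega -> P; Definition 5 keeps
# only its pullback E |-> X^{-1}(E), which is a morphism of boolean algebras.
# Illustrative example: the parity of a six-sided die roll.
Omega = frozenset({1, 2, 3, 4, 5, 6})
P = frozenset({'even', 'odd'})

def X_point(w):                  # the classical random variable, a point map
    return 'even' if w % 2 == 0 else 'odd'

def X(event):                    # the morphism of Definition 5: the pullback
    return frozenset(w for w in Omega if X_point(w) in event)

# The pullback preserves "or", "and", and "not" (hence "true" and "false"):
events = [frozenset(), frozenset({'even'}), frozenset({'odd'}), P]
for E in events:
    for F in events:
        assert X(E | F) == X(E) | X(F)
        assert X(E & F) == X(E) & X(F)
    assert X(P - E) == Omega - X(E)
```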

Given these definitions, I think that it is now straightforward to formulate the central limit theorem, Borel-Cantelli lemma, et cetera. One can drop the word “almost” from many such statements by passing to the quotient $\mathbb B/N$, where $N$ is the ideal of events that almost never happen.

We now prove Theorem 1 in two lemmata.

Lemma 8. Let $\mathbb B$ be an infinite boolean algebra. Then there exists an infinite set $A \subseteq \mathbb B$ of pairwise mutually exclusive events.

To prove Lemma 8, let $\mathbb B_0 = \mathbb B$ and reason by induction. Suppose that we have an infinite boolean algebra $\mathbb B_n$ which is a quotient of $\mathbb B_{n-1}$, and events $x_i \in \mathbb B_i,~i \leq n - 1$. Since $\mathbb B_n$ is infinite, there exists $x_n \in \mathbb B_n$ such that $\{y \in \mathbb B_n: y \leq x_n\}$ is infinite and $!x_n \neq 0$. To see this, pick any $z \in \mathbb B_n$ with $z \neq 0, 1$ (which exists since $\mathbb B_n$ is infinite); since every $y$ decomposes as $(y \wedge z) \vee (y \wedge {!z})$, at least one of $\{y: y \leq z\}$ and $\{y: y \leq {!z}\}$ must be infinite, and we take $x_n$ to be $z$ or $!z$ accordingly.

We now condition on $x_n$; that is, we set $\mathbb B_{n+1} = \mathbb B_n/(!x_n)$, where $(!x_n)$ is the ideal generated by $!x_n$. There is a natural bijection $\{y \in \mathbb B_n: y \leq x_n\} \to \mathbb B_{n+1}$, since any element $y \in \mathbb B_n$ can be decomposed as $(y \wedge x_n) \vee (y \wedge {!x_n})$, giving a decomposition $\mathbb B_n = \mathbb B_{n+1} \oplus (!x_n)$. Iterating these bijections, we can view $x_n$ not just as an element of $\mathbb B_n$ but also as an element of $\mathbb B_0$ (under this identification the lifts satisfy $x_n \leq x_{n-1}$) and as an element of $\mathbb B_{n+1}$ (where it is identified with $1$).

The bijection $\{y \in \mathbb B_n: y \leq x_n\} \to \mathbb B_{n+1}$ implies that $\mathbb B_{n+1}$ is infinite, so we can keep the induction going. Viewing the $x_i$ as elements of $\mathbb B_0$, one checks that they form a strictly decreasing sequence: $x_{i+1} \leq x_i$ by construction, and the inequality is strict precisely because $!x_{i+1} \neq 0$. Consequently the differences $x_i \wedge {!x_{i+1}}$ are nonzero and pairwise mutually exclusive (if $i < j$, then $x_j \leq x_{i+1}$, so $x_j \wedge {!x_{i+1}} = 0$), and we may take $A$ to be the set of these differences. This completes the proof of Lemma 8.
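The decomposition $y = (y \wedge x) \vee (y \wedge {!x})$ used above can be checked concretely on a finite power set. Here is a Python sketch with an arbitrary choice of $x$; for finite algebras the lemma itself is of course vacuous.

```python
from itertools import combinations

# Sanity check of the splitting step in Lemma 8: y |-> (y ∧ x, y ∧ !x) is a
# bijection from B onto (events below x) × (events below !x). Finite
# illustration; x is an arbitrary choice.
Omega = frozenset(range(4))

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

B = powerset(Omega)
x = frozenset({0, 1})
not_x = Omega - x

below_x = [y for y in B if y <= x]          # a copy of B/(!x)
below_not_x = [y for y in B if y <= not_x]  # the ideal (!x)

split = {y: (y & x, y & not_x) for y in B}
assert len(set(split.values())) == len(B)          # the splitting is injective
assert len(B) == len(below_x) * len(below_not_x)   # ... hence a bijection
```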

Here, a lot of students got stuck (in the usual “sample space” language). The reason is that they tried to construct the set $A$ using properties of $\Omega$. For example, a common attempt was to try to put $A$ in bijection with $\Omega$ by letting $x_\omega = \bigcap \{y \in \mathbb B: \omega \in y\}$. In general $x_\omega$ might be a nonmeasurable set; this can be fixed by assuming towards contradiction that $\mathbb B$ is countable. More importantly, the map $\omega \mapsto x_\omega$ is not injective in general, essentially for the same reason that $\Omega$ is not uniquely determined by $\mathbb B$. By replacing $\Omega$ with a suitable quotient, and showing that this quotient is also infinite, one can force $\omega \mapsto x_\omega$ to be injective, but this can be rather messy. In addition, assuming that $\mathbb B$ is countable weakens this lemma, so that it does not imply Theorem 1 unless one invokes the continuum hypothesis. Moral: do not ever refer to points in the sample space!

Lemma 9. Let $\mathbb B$ be a boolean algebra and suppose that there is a countably infinite set $A \subseteq \mathbb B$ of pairwise mutually exclusive events. Then the boolean algebra $\sigma(A)$ generated by $A$ has cardinality at least $2^{\aleph_0}$.

To prove Lemma 9, first note that at most one element of $A$ can be $0$; discarding it if necessary, we may assume every event in $A$ is nonzero. Let $\{x_i\}_{i \in \mathbb N}$ be an enumeration of $A$. For a set $I$ of natural numbers, define $f(I) = \bigvee_{i \in I} x_i$, which lies in $\sigma(A)$ by countable completeness. Then $f$ is an injective map from the power set of $\mathbb N$ into $\sigma(A)$. Indeed, suppose $f(I) = f(J)$ and, without loss of generality, that there is $i \in I \setminus J$. Then $x_i \leq f(I)$, while on the other hand $x_i \wedge f(J) = \bigvee_{j \in J} (x_i \wedge x_j) = 0$, using the distributive law (which holds for countable joins in any boolean algebra) and the fact that the $x_j$ are pairwise mutually exclusive. Hence $x_i = x_i \wedge f(I) = x_i \wedge f(J) = 0$, a contradiction. Since the power set of $\mathbb N$ has cardinality $2^{\aleph_0}$, this proves Lemma 9, and hence Theorem 1.
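The injectivity of $f$ can be watched on a finite fragment. In the Python sketch below (my own illustration), the $x_i$ are singletons in a power set, a convenient choice of pairwise mutually exclusive nonzero events.

```python
from itertools import combinations

# Lemma 9 in miniature: from 4 pairwise disjoint nonzero events x_0,...,x_3,
# the map f(I) = join of the x_i with i in I produces 2^4 distinct events.
# Singletons are a convenient choice of disjoint nonzero events.
xs = [frozenset({i}) for i in range(4)]

def f(I):
    out = frozenset()
    for i in I:
        out = out | xs[i]
    return out

subsets = [frozenset(c) for r in range(5) for c in combinations(range(4), r)]
images = {f(I) for I in subsets}
assert len(images) == 2 ** 4      # f is injective on all 16 subsets
```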