Three funny proofs of the existence of incomparable Turing degrees (2023-04-17)

“People would usually spend a whole class in computability theory proving this. What they are doing is they are very carefully proving the Baire category theorem without explicitly saying it.”

I’ve recently learned of a pretty neat proof of the existence of incomparable Turing degrees. And this reminds me that I’ve actually seen quite a few funny (nuking-the-mosquito-type) proofs of this statement, so I decided to record them here.

The first proof was shown to me today by Andrew Marks. You can do this with either measure or category:

Consider the relation $$R(x,y)\Leftrightarrow x\leq_T y$$. First notice that this is a Borel subset of $$\mathbb{R}\times \mathbb{R}$$ (take your favorite interpretation of what $$\mathbb{R}$$ is). So it is measurable/has the Baire property. Second, observe that each section $$R_y$$ is countable, and so it is a null/meager subset of $$\mathbb{R}$$. By Fubini’s theorem/Kuratowski-Ulam theorem, $$R$$ is a null/meager subset of $$\mathbb{R}^2$$. The analogous argument works to show that $$R^{-1}$$ is also null/meager.

Hence, $$R\cup R^{-1}$$, the set containing all pairs that are Turing comparable, is null/meager. Therefore the set of pairs $$(x,y)$$ such that $$x,y$$ are Turing-incomparable is measure one/comeager.
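In symbols, the measure half of the argument is a single application of Fubini (writing $$\lambda$$ for the measure on $$\mathbb{R}$$ and $$\lambda^2$$ for the product measure; the category version replaces this line with the Kuratowski-Ulam theorem):

$\lambda^2(R) = \int \lambda(R_y)\, d\lambda(y) = \int 0\, d\lambda(y) = 0,$

and symmetrically $$\lambda^2(R^{-1})=0$$, so $$\lambda^2(R\cup R^{-1})\leq \lambda^2(R)+\lambda^2(R^{-1})=0$$.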

The second proof is one I have seen on the internet in a few places. It uses the observation that the continuum hypothesis follows from total comparability of the Turing degrees: each real computes only countably many reals, so the reals ordered by $$\leq_T$$ would form an uncountable linear order in which every proper initial segment is countable. This implies there are at most $$\omega_1$$ many reals, so CH holds.

Now just force to negate CH. In the extension, $$\leq_T$$ is not a linear order. But the sentence “there exist two Turing-incomparable reals” is $$\Sigma^1_1$$, and hence by Mostowski absoluteness it already holds in the ground model.

The third proof is somewhat similar to the second. I came up with it when I was thinking about the question “if a real is computable from a comeager set of reals, is it computable?” The measure analogue of this is true, and it was the first interaction between computability theory and measure theory; that result was independently obtained by Sacks and by de Leeuw-Moore-Shannon-Shapiro. The answer to my question is also yes, and the argument I came up with had already appeared in Andreas Blass’s Needed reals and recursion in generic reals 20 years ago.

The proof goes like this: force to add two Cohen reals; then neither computes the other. But again it is $$\Sigma^1_1$$ to say that there exist two incomparable reals, and so this already holds true in the ground model.

The key fact used in the proof is that Cohen reals carry no computational power. I think this is an independently interesting fact, so I’ll end this post with a properly written proof.

Theorem. Let $$M$$ be a countable transitive model of enough of ZFC, and let $$x$$ be a real in $$M$$ and $$c$$ a Cohen real over $$M$$. If $$x$$ is computable relative to $$c$$, then $$x$$ is computable.

Proof. If $$x$$ is computed by the Turing program $$\Phi^c_e$$, then this arithmetic (hence absolute) fact also holds in $$M[c]$$, and so by the forcing theorem it is forced by some condition $$p$$. That is,

$p\Vdash \text{ the } \check e\text{th Turing program in the oracle }\dot c \text{ computes } \check x$

For any $$i\in\omega$$ we compute $$x(i)$$ as follows: run $$\Phi^s_e(i)$$ in parallel for all finite conditions $$s$$ extending $$p$$.

As soon as any of these computations halts, the output will be the correct value of $$x(i)$$. This is because: if $$s_0,s_1$$ are two different conditions extending $$p$$ and $$\Phi^{s_0}_e(i)=0\neq 1=\Phi^{s_1}_e(i)$$, then we can build two different filters $$G_0$$ and $$G_1$$ containing $$s_0,s_1$$ respectively. Now $$M[G_0]$$ and $$M[G_1]$$ will both think that $$x$$ is computed by $$\Phi_e^a$$, since both filters contain $$p$$ (they interpret the oracle $$a$$ differently, but that doesn’t matter). So $$M[G_0]$$ thinks that $$x(i)=0$$ and $$M[G_1]$$ thinks $$x(i)=1$$. But whatever $$x(i)$$ is, it is an absolute fact about $$x\in M$$, so it must be answered in the same way by all transitive models extending $$M$$. Contradiction! $$\square$$
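The search in this proof can be sketched in code. Below is a toy Python version: `phi` is a hypothetical stand-in for running $$\Phi^s_e(i)$$ for $$|s|$$ steps (a real implementation would simulate an oracle machine), chosen so that all of its halting outputs agree on conditions extending $$p$$, which is exactly the situation the forcing relation guarantees.

```python
from itertools import product

def phi(s, i):
    """Toy stand-in for running Phi_e^s(i) for |s| steps: this particular
    functional halts once the oracle is long enough, and its output depends
    only on s[0], so all conditions extending a fixed p agree on it --
    modeling the situation where p forces the computation."""
    if len(s) > i:
        return (s[0] + i) % 2
    return None  # "has not halted yet"

def compute(i, p, max_len=64):
    """Dovetail over all finite 0-1 strings extending the condition p;
    the first halting computation gives the correct value of x(i)."""
    for n in range(len(p), max_len + 1):
        for tail in product((0, 1), repeat=n - len(p)):
            out = phi(p + tuple(tail), i)
            if out is not None:
                return out
```

With $$p=(1,0)$$, the search recovers $$x(i)=(1+i)\bmod 2$$ for every $$i$$, without ever consulting a generic real.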

A perfect set of not-eventually-equal reals (2022-09-02)

(This post is tagged #lolbvious, because it’s one of those things that’s supposed to be obvious/clear/straightforward/true by the usual argument… etc., but just makes you go lol, yeah right.)

I remember being puzzled by the following passage from the chapter on Borel equivalence relations (by Greg Hjorth) in the Handbook of Set Theory. The claim is that there is a Borel map $$f:2^\omega\to 2^\omega$$ that reduces identity to eventual equality. In other words, $$f$$ is such that

$x=y\Leftrightarrow (f(x)(n)=f(y)(n) \text{ for all but finitely many } n)$

It is well-known (or maybe I should say well-documented?1) that the existence of such a map is equivalent to saying that there is a perfect set of inequivalent elements for the following equivalence relation denoted $$E_0$$:

$xE_0 y \Leftrightarrow x(n)=y(n) \text{ for all but finitely many } n$

which is to say that $$x$$ and $$y$$ are eventually equal.
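For concreteness, here is one standard such reduction, sketched in Python (an illustration under my own choice of coding, not necessarily the map Hjorth has in mind): copy bit $$x(n)$$ to every position $$k$$ with $$k+1=2^m(2n+1)$$, so each bit of $$x$$ recurs infinitely often in $$f(x)$$. Then $$x=y$$ gives $$f(x)=f(y)$$, while $$x\neq y$$ makes $$f(x)$$ and $$f(y)$$ differ at infinitely many positions; in particular the range of $$f$$ is a set of pairwise not-eventually-equal reals.

```python
def f_prefix(x, N):
    """First N bits of f(x): position k holds the bit x(n), where
    k + 1 = 2^m * (2n + 1).  Every n occurs at infinitely many k, so
    two sequences differing at some n have images differing infinitely often."""
    out = []
    for k in range(N):
        j = k + 1
        while j % 2 == 0:   # strip the powers of two: j becomes 2n + 1
            j //= 2
        out.append(x((j - 1) // 2))
    return out

# Two reals differing only at n = 0: their images differ at exactly the k
# with k + 1 a power of two -- infinitely many positions.
zero = lambda n: 0
spike = lambda n: 1 if n == 0 else 0
diffs = [k for k in range(64) if f_prefix(zero, 64)[k] != f_prefix(spike, 64)[k]]
```

Below $$64$$, `diffs` comes out to the positions $$0,1,3,7,15,31,63$$.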

Hjorth says it’s routine to prove the existence of a perfect set of mutually generic reals in the Cantor space. This is puzzling at first glance: mutual genericity is a notion in forcing, which is typically used for proving consistency results, instead of existence.

It turns out that this is one of those cases where one can prove an existence claim using forcing. The trick, of course, is to appeal to Shoenfield absoluteness.

To see this, consider the statement: there is a perfect set such that any two elements are not eventually equal. Now perfect sets are coded by perfect trees, which can in turn be coded by a single real. So this sentence is really saying that there is a real coding a perfect tree, such that any two branches (i.e., real numbers tracing these branches) are not eventually equal.

Since it is arithmetic to say two reals are not eventually equal, this makes the statement $$\Sigma^1_2$$. So if we can force this, this already holds true in the ground model by Shoenfield absoluteness.

But then this is easy to prove: the forcing to add a perfect set of Cohen reals will make this true. This is because the Cohen reals in the perfect set will all fail to be eventually equal with one another: if some $$c_1,c_2$$ were eventually equal, then from $$c_1$$ we could define $$c_2$$ by only changing $$c_1$$ on some finite initial segment. That would contradict the fact that $$c_1$$ and $$c_2$$ are mutually generic.

So I think this is what Hjorth means to say with mutual genericity.

Of course, since Cohen forcing is essentially a Baire-category method, I’ve committed theft over honest toil by sweeping under the Cohen rug any mention of “dense, meager, comeager”, etc. The interested reader can find an argument using Baire category notions in Su Gao’s Invariant Descriptive Set Theory, Theorem 5.3.1, a stronger theorem which is attributed to Mycielski.

1. See, for example, Proposition 5.1.12 in Su Gao’s Invariant Descriptive Set Theory.

What’s difficult about forcing? (2022-01-20)

Note: the goal of this article is to give a sense of the kind of difficulties, both conceptual and technical, one might encounter if one decides to learn the mathematical tool of forcing. As such it is not my intention to provide an intuitive motivation for forcing, or to explain its inner workings in a perspicuous way. There are already excellent sources for that: for example, see A Philosopher’s Guide to Forcing by Toby Meadows or A Beginner’s Guide to Forcing by Timothy Chow.

What I intend to achieve here is somewhat particular: after a first course in formal logic, set theory, or Gödel’s theorems, a motivated student might see the term “forcing” pop up on the occasional Google chase. They might be intrigued as to why this technique won its discoverer, Paul Cohen, a Fields Medal, for instance. But “what is forcing” is a question that is highly difficult to address in office hours, because there are quite a few moving pieces involved. Each of these pieces brings its own special kind of unease to the student, making the whole thing quite daunting. It is simply my hope here to record these moving pieces and their associated difficulties.

Due to this peculiar aim, the article is sprinkled with pausing remarks on what is difficult about the matter at hand and whether the difficulty is conceptual or technical. My judgment is that forcing is not substantially harder than the other main theorems in graduate math textbooks. It is simply the variety of concepts involved and the unfamiliarity of the tools used that have earned forcing its notoriety.

An earlier simpler draft of this post was first published in Chinese on the Q & A platform Zhihu. One may view the original version here: link to the Chinese version.

## What is forcing?

Forcing is the name of a technical method to establish independence results in set theory. These are results saying that certain statements cannot be deduced from the axioms of set theory (the Zermelo-Fraenkel axioms with Choice).

To show that a statement cannot be deduced from a theory, one shows that the negation of the statement is consistent with the theory. Given a mathematical theory $T$, we write $\mathsf{Con}(T)$ for the statement that $T$ is consistent: i.e., there is no proof that starts with the axioms of $T$ and ends with a contradiction.

If for some sentence $P$ we succeed in showing $\mathsf{Con}(\mathsf{ZFC}+P)$, then we have succeeded in showing that $\mathsf{ZFC}$ cannot refute $P$. This is because if $\mathsf{ZFC}$ refutes $P$, then this means $\mathsf{ZFC}$ proves $\neg P$ (“not-P”); therefore from the theory $\mathsf{ZFC}+P$ one can deduce $P \wedge \neg P$, a contradiction, by doing the following: first deduce $\neg P$ from $\mathsf{ZFC}$ alone, and then deduce $P$.

Forcing allows one to conclude, under suitable assumptions laid out below, $\mathsf{Con}(\mathsf{ZFC}+P)$ for various choices of $P$. Different people have different ways of understanding and implementing it, but arguably the most straightforward way to make sense of it is that it is a method of building new models of ZFC from old ones.

## I thought we were talking about proving things. So why models?

It turns out that talking about proofs is not the easiest thing to do. It is easier to work with models instead. A model of a theory, roughly put, is just a set of things that satisfies that theory. The connection between proofs and models was confirmed by Kurt Gödel in his doctoral thesis:

Fact. (Gödel’s Completeness Theorem): $\mathsf{Con}(T)$ if and only if $T$ has a model.

Technical Difficulties
A technical difficulty lies in understanding the statements $\mathsf{Con}(\mathsf{ZFC})$ and "$\mathsf{ZFC}$ has a model" as precise mathematical statements.

In the strictest sense, $\mathsf{Con}(\mathsf{ZFC})$ is to be construed as the number-theoretic claim that, roughly, there does not exist a natural number which in Unicode encodes a sequence of symbols, in which the starting lines are axioms of $\mathsf{ZFC}$, the last line is the sentence $0=1$, and the intermediate lines are all obtained by valid rules of inference in first-order logic. This whole business of encoding sequences of symbols by natural numbers is fiddly and not very fun to go through. Fortunately it accords with one's intuition about how proofs work, so this is familiar territory for someone used to writing proofs. It's just painstakingly formalized in this case.

Fact: One can write down formulas $\mathsf{Axiom}(x)$, $\mathsf{isProof}(x,y)$ in the language of number theory, such that if number theory can prove $\mathsf{Axiom}(n)$, then $n$ really is the Unicode encoding of one of the $\mathsf{ZFC}$ axioms; and if number theory can prove $\mathsf{isProof}(a,b)$, then $a$ really is the Unicode encoding of a $\mathsf{ZFC}$-proof of the statement encoded by $b$.
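To get a feel for this encoding business, here is a toy Python sketch (an illustration only: this is not the actual $\mathsf{isProof}$ predicate, and standard treatments use a fixed Gödel numbering rather than Unicode) that packs a sequence of nonempty symbol strings into a single natural number and unpacks it again:

```python
def encode(lines):
    """Toy arithmetization: pack a sequence of nonempty symbol strings into
    one natural number via their UTF-8 bytes.  Each byte b is stored as the
    base-257 digit b + 1, and the digit 0 is reserved as a line separator."""
    n = 0
    for line in lines:
        for b in line.encode("utf-8"):
            n = n * 257 + (b + 1)
        n = n * 257  # append the separator digit 0
    return n

def decode(n):
    """Recover the sequence of strings from its code number."""
    digits = []
    while n:
        digits.append(n % 257)
        n //= 257
    digits.reverse()
    lines, cur = [], bytearray()
    for d in digits:
        if d == 0:
            lines.append(cur.decode("utf-8"))
            cur = bytearray()
        else:
            cur.append(d - 1)
    return lines
```

Statements about proofs then become statements about the arithmetic of such code numbers.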

Similarly, one needs to strictly formalize what a theory is and what it means for a theory to have a model. This is of course routine in an advanced logic course, but it is highly unlikely that someone without a logic background can spell out what a model really is. Again, the good news is that this doesn't matter much, because the formal notion stays pretty close to our mathematical practice and intuition.

So if our goal is to show that $\mathsf{ZFC}$ cannot deduce $P$ (so we want to show $\mathsf{Con}(\mathsf{ZFC}+\neg P)$), it is enough to build a model of $\mathsf{ZFC}+\neg P$. This shifts the focus from talking about proofs, which are strings of symbols generated by some fixed rule, to models, which are perhaps more tangible and familiar.

Unfortunately, one cannot just build a model of $\mathsf{ZFC}$ out of thin air. Another theorem by Gödel, his second incompleteness theorem, in simple terms states:

If $\mathsf{ZFC}$ proves that $\mathsf{ZFC}$ has a model, then it can also deduce a contradiction.

So if $\mathsf{ZFC}$ is consistent to begin with, we cannot use it to deduce $\mathsf{Con}(\mathsf{ZFC})$, let alone $\mathsf{Con}(\mathsf{ZFC}+P)$. This leaves us with the second-best option: assume that $\mathsf{ZFC}$ is consistent, and then show $\mathsf{Con}(\mathsf{ZFC}+P)$.

This is where models come in: assuming $\mathsf{ZFC}$ is consistent provides us with a model of it. With forcing, one can carefully pick out an element, add it to the model, and make sure that the resulting model does what one wants.

## Countable Transitive Models

Models of $\mathsf{ZFC}$ are very complex. A model of $\mathsf{ZFC}$ will contain everything that $\mathsf{ZFC}$ proves to exist (i.e., pretty much all of math). In particular, $\mathsf{ZFC}$ proves that the following set exists: $\{f \mid f: \mathbb{N}\to \{0,1\}\}$. Intuitively, this is the set of all infinite sequences of zeroes and ones. There is a way of viewing these sequences as real numbers, which is why set theorists and recursion theorists often refer to elements of this set as the real numbers. We denote this set by $2^\mathbb{N}$; it is uncountable.

Paul Cohen invented forcing for the express purpose of adding such a $0-1$ sequence to a model of $\mathsf{ZFC}$. To do this, he restricted his attention to models of $\mathsf{ZFC}$ that he knew for sure would be missing at least one $0-1$ sequence: the countable transitive ones.

Conceptual and Technical Difficulties
One of the first great stumbling blocks for learners of forcing (and mathematical logic generally) is the existence of countable models of $\mathsf{ZFC}$. This presents a seeming contradiction between two facts at hand:

1. (Cantor's Theorem) $\mathsf{ZFC}$ proves that uncountable sets exist; in particular, the set of real numbers, or $2^\mathbb{N}$, is uncountable; but,
2. (Downward Löwenheim-Skolem Theorem) If $\mathsf{ZFC}$ has any model at all, then it has a countable model.

These two facts seem to contradict each other in a confusing way, so much so that the tension has earned its own name (Skolem's paradox) and its own article in the Stanford Encyclopedia of Philosophy.

The existence of countable models certainly presents conceptual difficulties for someone seeing this for the first time. Indeed this is one of those "you'll get used to it" bits in math. But perhaps let me give an intuitive explanation:

$\mathsf{ZFC}$ has only countably many formulas. This entails that the theory can only talk about countably many things. In particular, it can only define countably many things, and prove that they exist. A structure can be a model of $\mathsf{ZFC}$ if it doesn't rule out the existence of these things. So to obtain a model of $\mathsf{ZFC}$, we are only required to throw in the things that $\mathsf{ZFC}$ defines and proves to exist: there are only countably many of these!

What about the set of real numbers, which $\mathsf{ZFC}$ proves to exist? Well, we are only building a structure that satisfies a collection of sentences. There is no guarantee that we are really building the actual mathematical universe. Thus if $M$ is a countable model, $M$ will satisfy "there exists a unique set of real numbers", but there is no guarantee that the set which $M$ judges to be the real numbers really is the set of all real numbers (although there may be overlap).

A model $M$ is transitive if and only if $x\in M$ implies $x\subseteq M$. Intuitively, the model $M$ “sees” the elements of its elements, the elements of those elements, and so on. Transitive models are nice, in that if $M_1\subseteq M_2$ are transitive models, then they agree on the meaning of basic, simple terms. For instance, if $x\in M_1$ is such that $M_1$ satisfies the sentence “$x$ is the set of natural numbers”, then $M_2$ will satisfy the same sentence. Recall that the use of forcing is to build a model of $\mathsf{ZFC}$ on top of another one, so this $M_1\subseteq M_2$ situation is precisely what we will end up with. Therefore it is important to be able to hold the meaning of certain terms fixed across the two models.

Technical Difficulties
One needs of course to define what "basic, simple terms" are and prove that they behave as intended. For those interested, these refer to the things on the $\Delta_1$ level of the Lévy hierarchy. And the technical term for "holding the meaning fixed across models" is "absoluteness".

One should also note that "there exists a countable transitive model of $\mathsf{ZFC}$" is strictly stronger than "there exists a countable model of $\mathsf{ZFC}$". In fact, it is much, much stronger. One can show that "there exists a countable transitive model of $\mathsf{ZFC}$" implies $\mathsf{Con(ZFC)}$, $\mathsf{Con(ZFC+Con(ZFC))}$, $\mathsf{Con(ZFC+Con(ZFC+Con(ZFC)))}$, and so much more.

This means we are really cheating a little bit when we restrict attention to countable transitive models, whose existence is not really guaranteed by the starting assumption of merely $\mathsf{Con(ZFC)}$. The technical difficulty here lies in dispensing with the transitive model assumption while still working with it.

This is done via the Reflection Principle, which says that for any finite subset of the $\mathsf{ZFC}$ axioms, one can find a countable transitive model of it. And the proof of $\mathsf{Con(ZFC)}\rightarrow \mathsf{Con(ZFC+P)}$ proceeds in two steps:

1. If $\mathsf{Con(ZFC+P)}$ does not hold, then $\mathsf{ZFC+P}$ can deduce a contradiction in finitely many steps, using only finitely many of its axioms. Let's say it uses the axioms $A_1,...,A_{1024}$ together with $P$.
2. But the forcing construction will also end up using only finitely many axioms of $\mathsf{ZFC}$, say the axioms $B_1,...,B_{2048}$. Using the Reflection Principle, one obtains a countable transitive model of $\{A_1,...,A_{1024}, B_1,...,B_{2048}\}$, and forcing then produces a countable transitive model of $\{A_1,...,A_{1024}, B_1,...,B_{2048}, P\}$. But we just assumed the latter theory is inconsistent!

Therefore, working only from the $\mathsf{ZFC}$ axioms $\{A_1,...,A_{1024}, B_1,...,B_{2048}\}$, we proved the existence of a model of a theory we assumed to be inconsistent. This is impossible if $\mathsf{ZFC}$ is consistent. So our starting assumption must be wrong, and hence $\mathsf{Con(ZFC+P)}$ must hold.

## So we have a countable transitive model. What can we do with it?

Suppose $M$ is a countable transitive model of $\mathsf{ZFC}$. Since $M$ is countable, we know for sure that there are things $M$ is missing: for example, a bijection $g: M\to \mathbb{N}$ witnessing that $M$ is countable. Another example is the “height of $M$”. This is a common name for the set of ordinals in $M$, that is, $M \cap \text{Ordinals}$. The reason it cannot be in $M$ is that being an ordinal is a basic, simple notion that $M$ correctly recognizes. So if $M \cap \text{Ordinals}$ were an element of $M$, then $M$ would satisfy “the class of ordinals forms a set”, and we know this isn’t a theorem of $\mathsf{ZFC}$.

It is also clear that adding a single set to $M$ won’t be enough. This is because $\mathsf{ZFC}$ axioms are “generative” in spirit: if $x$ is a set, then $\{x\}$ is also a set; if $x,y$ are sets, then so is $\{x,y\}$, and so on. So whatever set we add to $M$, we will also have to add the sets that are generated by it, as required by the axioms. And the sets that are generated by those, ad infinitum.

The point of the last two paragraphs is this: while we know there are plenty of things that cannot be in $M$, we need to be careful about what we add to it. For instance, the new set should not be so informative that it makes the countability of $M$, or the “set-ness” of its ordinals, visible to the extended model.

Recall that our goal is to add a $0-1$ sequence to $M$. Such a sequence must exist, since there are uncountably many of these and $M$ is countable. But how do we pick out such a sequence? It cannot be too simple, because simple notions are guaranteed to exist; it cannot be too specific, because adding it to $M$ might not result in a model of $\mathsf{ZFC}$ any more. This was the challenge that Cohen faced, which he recounts in his account of the discovery of forcing.

## Poset and Filter

The innovation of forcing derives from the remarkable decision that this sequence is going to be constructed using partial information. Given that our object of interest is an infinite $0-1$ sequence, we take partial information to mean the finite initial segments of the sequence.

Definition. The Cohen poset (short for partially ordered set) is the set of finite $0-1$ sequences $\{s\mid s: \{0,…,n\}\to \{0,1\}, n\in \mathbb N\}$, together with the empty sequence, ordered by extension.

And the informal notion of construction using partial information is made precise by appealing to the notion of a filter.

Definition. A filter on the Cohen poset is a collection $$G$$ of finite $$0-1$$ sequences that are coherent in the following sense: 1) the empty sequence is in $$G$$; 2) if a sequence is in $$G$$, then all of its initial segments are also in $$G$$; 3) if $$s_0,s_1$$ are two sequences in $$G$$, then one must be an extension of the other.
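The three coherence conditions are easy to machine-check on finite examples. A small Python sketch, under my own toy representation of conditions as tuples of $$0$$s and $$1$$s:

```python
def is_filter(G):
    """Check the three conditions in the definition for a finite set G of
    conditions; s extends t exactly when s[:len(t)] == t."""
    if () not in G:                                    # 1) empty sequence in G
        return False
    for s in G:                                        # 2) closed under initial segments
        if any(s[:k] not in G for k in range(len(s))):
            return False
    return all(s[:len(t)] == t or t[:len(s)] == s      # 3) pairwise comparable
               for s in G for t in G)
```

The initial segments of $$(1,0,1)$$ form a filter, while $$\{(), (0,), (1,)\}$$ fails condition 3.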

For example, if $$f: \mathbb N\to\{0,1\}$$ is any $$0-1$$ sequence whatsoever, the set of all its initial segments will be a filter. This tells us that filters alone are not enough: every $$0-1$$ sequence corresponds to a filter. So this idea of “construct it using partial information” is too broad. It does not avoid the “too informative” or “too specific” or “too simple” problem. What’s missing is an additional restriction on the filters so that the object we end up obtaining is perfectly average: not so simple as to be fixed across all models, not so specific as to be generated outright in $$M$$, and not so informative as to contradict $$\mathsf{ZFC}$$.

Cohen realized this missing restriction was to be the topological notion of genericity. In addition to being coherent, the filter is further required to pick out elements that are perfectly average in the following sense.

Definition. A dense subset of the Cohen poset is one in which every finite $$0-1$$ sequence has an extension. A filter is $$M$$-generic if it intersects every dense subset of the Cohen poset that is an element of $$M$$.

So generic filters capture the idea of being not too specific.

They are not too simple either. In fact, if $$G$$ is an $$M$$-generic filter on the Cohen poset, then it can be shown that $$G$$ is not an element of $$M$$. This is roughly because, if $$G\in M$$, then we can use $$G$$ to express the property of “not being in $$G$$” and show that the finite $$0-1$$ sequences with this property form a dense subset lying in $$M$$.

We observe that, for each natural number $n$, the set of finite binary sequences of length at least $n$ is dense. Or in other words, it is a generic property of binary sequences to have length $\geq n$. So for each $n$, a generic filter will contain finite sequences of length at least $n$.

This implies that the construction indeed gives us what we want. If $$G$$ is an $$M$$-generic filter, then piecing together its elements (i.e., taking $$\bigcup G$$) gives us a genuine function $$f: \mathbb N\to \{0,1\}$$, an infinite $$0-1$$ sequence that is not in $$M$$.

## Do M-generic filters exist?

If $M$ is a countable transitive model, then yes. This result is known as the Rasiow-Sikorski lemma, proven about a decade before the invention of forcing.

By all means this is one of those pleasant moments in mathematics where ideas from various fields converge. It turns out that the existence of a generic filter, which is motivated by our attempt to build a perfectly average object outside of $$M$$, is a close relative of the Baire category theorem. The exact equivalence is spelled out in this paper by Goldblatt.

The proof of the Rasiowa–Sikorski lemma has a strong flavor of diagonal arguments, hinting at a uniformity between the Baire category proof of the uncountability of the reals and the more familiar diagonalization proof. This analogy underlies much of the study of what’s known today as forcing axioms (see for instance the exposition by Viale).
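The diagonal construction is short enough to sketch in Python. Here each dense set is represented, as a simplification of my own, by a function that maps a condition to an extension of it lying in that dense set:

```python
def rasiowa_sikorski(dense_sets):
    """Given a countable family of dense sets, build a descending chain of
    conditions p_0 <= p_1 <= ... meeting each of them in turn; the initial
    segments of the chain's members then generate a generic filter."""
    p = ()                      # start from the weakest condition
    chain = [p]
    for extend in dense_sets:
        p = extend(p)           # density: p has an extension in the dense set
        chain.append(p)
    return chain

# The dense sets "length >= n" from the text, met here by padding with zeros.
dense = [lambda p, n=n: p + (0,) * max(0, n - len(p)) for n in range(1, 6)]
chain = rasiowa_sikorski(dense)
```

For genericity over a countable transitive $$M$$, one enumerates all the dense sets lying in $$M$$; the countability of $$M$$ is exactly what makes this enumeration possible.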

The resulting sequence obtained by piecing together the generic filter is now called a Cohen (generic) real over $$M$$. Due to later work of Solovay, we know in hindsight that if $$M$$ is a countable transitive model of $$\mathsf{ZFC}$$, then in the sense of topology almost every real number (i.e., a comeager set of them) is Cohen generic over $$M$$.

## What about all the sets generated by the generic object?

Forcing is a tool to build new models out of old ones. So far, we started with a countable transitive model, and motivated the notion of generic filters. Such filters exist, and the resulting object is perfectly average in a precise sense. But we’ve only built an infinite binary sequence. How do we build new models of $\mathsf{ZFC}$?

In the above, we observed that it is not enough to just add the generic object to $$M$$. We need to add the things that are generated by it. A doomed attempt is to try to describe explicitly what one needs to add to $$M$$: i.e., if $$f$$ is the generic sequence, we require that $$\{f\}$$ be in the new model, $$\{\{f\}\}$$ be in the new model, and so on. This is doomed, because the $$\mathsf{ZFC}$$ axioms are so complicated that there is simply no hope for some tangible description to round out what one needs to add to $$M$$ in order to obtain a model of $$\mathsf{ZFC}$$. What’s missing, then, is some kind of process that “takes care of itself”, so to speak.

The incredibly ingenious realization by Cohen was that, just like the generic object is built using partial information, so can the entire extension of $$M$$ be. To understand this, we need to understand a theorem (really an axiom) of $$\mathsf{ZFC}$$: every set can be obtained by the following recursive process, carried out transfinitely:

1. stage 0: take the empty set
2. stage $\alpha+1$: take all the subsets of the things in stage $\alpha$
3. at limit stage $\lambda$: collect all things gathered so far.

The familiar reader will notice that this is an informal description of the von Neumann hierarchy, where $$V_0=\emptyset$$, $$V_{\alpha+1}=\mathcal{P}(V_\alpha)$$, and $$V_\lambda = \bigcup_{\beta<\lambda}V_\beta$$ for limit ordinals $$\lambda$$.
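The first few stages are small enough to compute literally. A Python sketch of the hierarchy up to $$V_4$$ (the sizes come out to $$0, 1, 2, 4, 16$$, and $$V_5$$ would already have $$2^{16}$$ elements):

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, returned as frozensets so they can be set elements."""
    return {frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)}

V = [set()]                     # stage 0: the empty set
for _ in range(4):
    V.append(powerset(V[-1]))   # stage n+1: all subsets of stage n

sizes = [len(stage) for stage in V]
```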

Conceptual and Technical Difficulties
We are slowly getting into the unfamiliar territory of definition by transfinite recursion and proof by transfinite induction. The reader who has never heard the term ordinal might be at a loss here. In more familiar recursions and inductions, one moves along the natural numbers. The ordinals, simply put, are objects that mark the steps of a recursion or induction that goes on longer than the natural numbers.

Some math textbooks cover transfinite recursion and transfinite induction (for instance Terence Tao's Analysis). But more likely than not, they will take a little getting used to. Moreover, the transfinite recursions to come are massively more complicated than what one typically encounters in real analysis or formal logic. This might create a sense of unease for the faint of heart.

It is an axiom of $\mathsf{ZFC}$ that every set belongs to one of these stages. Intuitively, the von Neumann hierarchy describes a process building a model of $\mathsf{ZFC}$. Cohen decided to mimic this process, using the generic sequence we obtained.

## Enter the Names

The trick here requires a new perspective. That is, we no longer view the statement $x\in X$ as a proposition, but we view it as expressing a degree of truth. Of course, in the classical case, to assert $x\in X$ is to assert that $x\in X$ is true to degree 1, and to assert $x\notin X$ is to assert that $x\in X$ is true to degree 0 (i.e., not true at all).

That is, we would like to identify each set $$X\in M$$ with its characteristic function $$char_X$$, where $$char_X(x)=1$$ if and only if $$x\in X$$. Since this is set theory, mind you, functions are really just sets of ordered pairs. So we shall view a set $$X$$ as the set $$X \times \{0,1\}$$, and we take the “true” elements of $$X$$ to be those that are paired with $$1$$.

One way to “extend” $M$ (in scare quotes because we are not really adding anything), is to do this for each element of $M$, so we end up getting the set of “possible elements of $M$”:

$\{(x,i)\mid i\in \{0,1\}~\&~ x\in M\}$

We can view these possible elements of $M$ as being undetermined whether they are in $M$ or not (so a little bit like Schrödinger’s poor cat), waiting to be given a “truth value” as to the question “am I in $M$?” One can also think of them as names of elements that will go into $M$, waiting to be called out by some kind of selector.

Of course, it is trivial to recover $M$ from the set of its possible elements: just take all the $(x,1)$, and take the left coordinate.

Let’s pause here and reflect on the above way of thinking. We have a way of naming the possible elements of $$M$$, and we can recover $$M$$ by designating a privileged “truth value” to pick out which names we call out (in our case $$1$$). Our target is to modify this strategy so that it names the possible elements of the bigger model extending $$M$$, and we would like to find some way to use the generic filter as well.

There is a shortcoming with the current strategy: it’s not expressive enough. By this I mean the names all take the following form: $$(a,b)$$, where $$a$$ is an element of $$M$$ and $$b$$ is either $$0$$ or $$1$$. In their present form, they are too closely tied to the elements of $$M$$ to let us name an element that is not in $$M$$. The strategy also relies too much on $$M$$: in order to “extend” $$M$$, we need to first have $$M$$ ready, to form the names and call out the correct ones. These defects doom the current strategy as a way of building/talking about a structure before we have it.

Recall that it is wise to leave the construction of the extension model to a process that somehow “takes care of itself”. And a sensible candidate is to imitate the von Neumann hierarchy in a way that reflects the recursive nature of the construction. The essential idea of the von Neumann hierarchy is this:

• Build things in stages. At each stage, use only what is available at that stage.

So our current strategy fails to capture this guiding principle, because it considers every element of $$M$$ at once, ignoring the hierarchical nature of the sets.

So let us mimic the von Neumann hierarchy, not in the construction of the new sets, but in the construction of the names.

Definition. Call the two-element poset $\{0,1\}$ the truth value poset.

Definition. Let us consider the Name hierarchy for the truth value poset:

1. $\text{Name}_0$ contains only the empty set. The empty set is the only level $0$ name.
2. $\text{Name}_{\alpha+1} = \mathcal{P}(\text{Name}_\alpha\times \{0,1\})$.
3. For limit ordinals $\lambda$, $\text{Name}_\lambda=\bigcup_{\beta<\lambda}\text{Name}_\beta$.

Of course, the name hierarchy described above is solely restricted to the truth value poset, and intended as an illustration.
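The three clauses can be executed verbatim for the first few stages. Below is a small Python sketch, illustrative only: names are modeled as frozensets of pairs, and the iterated powersets make anything beyond stage 2 or 3 infeasible.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, each returned as a frozenset."""
    s = list(s)
    return {frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

def name_hierarchy(stages):
    """Name_0, ..., Name_stages for the truth value poset {0, 1}."""
    levels = [{frozenset()}]               # Name_0: only the empty set
    for _ in range(stages):
        pairs = {(sigma, v) for sigma in levels[-1] for v in (0, 1)}
        levels.append(powerset(pairs))     # Name_{n+1} = P(Name_n x {0,1})
    return levels

print([len(level) for level in name_hierarchy(2)])  # [1, 4, 256]
```

The stage sizes iterate the powerset: $1$ name at level $0$, then $2^2=4$, then $2^8=256$.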

Conceptual Difficulty
The definition of names is probably one of the greatest challenges for someone seeing forcing for the first time.

The key point is that the construction of the names fully mimics the construction of the von Neumann hierarchy. At stage 0, we take the empty set and declare that to be a name. At stage $\alpha+1$, we take the names obtained in the previous stage, pair those with either $0$ or $1$, take arbitrary collections of the results, and declare these to be names of level $\alpha+1$. At limit stages, gather all names we've obtained so far.

In some texts, names are given the following definition: given a poset $\mathbb{P}$, a $\mathbb{P}$-name is any (possibly empty) set of ordered pairs of the form $(\sigma, p)$, where $\sigma$ is a $\mathbb{P}$-name and $p\in \mathbb{P}$. This seemingly circular definition is secured by noting that the empty set is a name, and names are built in the hierarchical way described above. An analogous description of what a **set** is might help: the empty set is a set; a set is anything that contains elements of the form $x$, where $x$ is a set.
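The apparent circularity is harmless because membership is well-founded, which a direct recursive checker makes visible. A Python sketch, with names modeled as frozensets of `(name, condition)` pairs; `is_name` is a made-up helper for this illustration, not standard notation.

```python
def is_name(x, P):
    """x is a P-name iff x is a set of pairs (sigma, p) with p in P
    and sigma itself a P-name.  The recursion bottoms out at the empty set."""
    return isinstance(x, frozenset) and all(
        isinstance(pair, tuple) and len(pair) == 2
        and pair[1] in P and is_name(pair[0], P)
        for pair in x)

P = {0, 1}            # the truth value poset
empty = frozenset()
print(is_name(empty, P))                      # True: the base case
print(is_name(frozenset({(empty, 1)}), P))    # True: built from a name
print(is_name(frozenset({(empty, 7)}), P))    # False: 7 is not in the poset
```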

I've attempted to tell a somewhat motivating story of how names come about. It's also interesting to note what Cohen himself remarked on the discovery of (something in the spirit of) the name hierarchy:

"There are certainly moments in any mathematical discovery when the resolution of a problem takes place at such a subconscious level that, in retrospect, it seems impossible to dissect it and explain its origin. Rather, the entire idea presents itself at once, often perhaps in a vague form, but gradually becomes more precise.
This idea as it presented itself to me, appeared so different from any normal way of thinking, that I felt it could have enormous consequences."

## Evaluation by a generic filter

So we’ve improved our description of the names of the possible elements in the “extension” of $M$. This is done by mimicking the von Neumann construction of the sets. But what we have are just names so far. We don’t have a model yet. We need a way, again, to turn names into real elements.

Definition. The filter $T$ on the truth value poset is the set $\{1\}$.

To recover $M$ from the names for the truth value poset, we also follow von Neumann’s guiding principle. This is sensible, because the names themselves are constructed recursively in a hierarchical fashion.

Note that we can’t just say “take the left elements that are paired with $1$” like before, since these left elements are themselves names. One needs to recursively give the names values, as follows:

1. $M_0$ contains only the empty set. That is, the empty set (as a name) gets evaluated by the filter $T$ to be the empty set.

2. $M_{\alpha+1}$ contains the $T$-evaluations of the names in $\text{Name}_{\alpha+1}$, in the following sense: if $\tau$ is a name in $\text{Name}_{\alpha+1}$, then its $T$-evaluation is the set of $T$-evaluations of the names it pairs with $1$. Or in symbols:

$val_T(\tau) = \{val_T(\sigma) \mid (\sigma, 1)\in \tau\}$

3. At limit stage, $M_\lambda$ is the set of all evaluations of the stages before $\lambda$.

One may verify, somewhat tediously, that the set of evaluations of the names is exactly $M$.
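The evaluation clause is a one-line recursion. Here it is in Python, run on the four level-1 names for the truth value poset; a toy sketch mirroring the displayed equation for $val_T$.

```python
def val(tau, F):
    """Evaluate a name by a filter F: recursively evaluate exactly those
    names in tau that are paired with some element of F."""
    return frozenset(val(sigma, F) for (sigma, p) in tau if p in F)

T = {1}                # the filter {1} on the truth value poset
empty = frozenset()

# The four level-1 names: the subsets of {(empty, 0), (empty, 1)}.
lvl1 = [frozenset(),
        frozenset({(empty, 0)}),
        frozenset({(empty, 1)}),
        frozenset({(empty, 0), (empty, 1)})]

evals = {val(tau, T) for tau in lvl1}
print(len(evals))  # 2: the evaluations are exactly {} and {{}}
```

Names pairing the empty set only with $0$ evaluate to the empty set, so the four level-1 names collapse to just two sets under evaluation.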

## Generic Extensions

So we’ve improved our method of naming possible elements of a model and then instantiating them as actual elements of a model. This was done in the toy example with the truth value poset $\{0,1\}$ and the filter $T=\{1\}$. Forcing builds new models by taking more complicated posets and using generic filters on them as valuations. For instance:

Definition. The name hierarchy for the Cohen poset $\mathbb{C}$ is defined recursively as follows:

1. $\text{Name}_0$ contains only the empty set. The empty set is the only level $0$ name.
2. $\text{Name}_{\alpha+1} = \mathcal{P}(\text{Name}_\alpha\times \mathbb{C})$
3. For limit ordinals $\lambda$, $\text{Name}_\lambda=\bigcup_{\beta<\lambda}\text{Name}_\beta$

Notice that, in the view we are setting up, elements of the Cohen poset itself take the role of truth values. So we still think of a statement of the form $x\in X$ as an assertion of a degree of truth; however, we no longer take the “degrees of truth” to be just the set $\{0,1\}$. We abstractly take the “degrees of truth” to be the Cohen poset.

And if $G$ is an $M$-generic filter on the Cohen poset, the set of evaluations of the names in $M$, which we shall call the generic extension of $M$ by $G$ (in symbols, $M[G]$), is also obtained recursively.

1. $M[G]_0$ contains only the empty set. That is, the empty set (as a name) gets evaluated by the filter $G$ to be the empty set.

2. $M[G]_{\alpha+1}$ contains the $G$-evaluations of the names in $\text{Name}_{\alpha+1}$, in the following sense: if $\tau$ is a name in $\text{Name}_{\alpha+1}$, then its $G$-evaluation is the set of $G$-evaluations of the names it pairs with some element of $G$. Or in symbols:

$val_G(\tau) = \{val_G(\sigma) \mid \exists s\in G~ (\sigma, s)\in \tau\}$

3. At limit stage, $M[G]_\lambda$ is the set of all evaluations of the stages before $\lambda$.
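To see the recursion do something beyond the truth value poset, here is a finite Python sketch over a finite chunk of the Cohen poset. Everything here is a toy: the filter is faked from one fixed sequence rather than being genuinely $M$-generic, and `generic_real_name` is a made-up helper imitating the standard name for the generic real.

```python
from itertools import product

def val(tau, G):
    """val_G(tau) = {val_G(sigma) | (sigma, s) in tau for some s in G}."""
    return frozenset(val(sigma, G) for (sigma, s) in tau if s in G)

def check(n):
    """The check-name for n as a von Neumann ordinal: pair each check(m),
    m < n, with the weakest condition "" (which lies in every filter)."""
    return frozenset((check(m), "") for m in range(n))

def conditions(k):
    """All 0-1 strings of length at most k: a finite chunk of the Cohen poset."""
    return ["".join(b) for r in range(k + 1) for b in product("01", repeat=r)]

def generic_real_name(max_n):
    """A name for the generic real, read as a set of naturals (truncated at
    max_n): check(n) is paired with every condition asserting bit n is 1."""
    return frozenset((check(n), s)
                     for n in range(max_n)
                     for s in conditions(max_n) if len(s) > n and s[n] == "1")

# Fake a filter: all initial segments of one fixed 0-1 sequence.
seq = "1011010"
G = {seq[:i] for i in range(len(seq) + 1)}

x = val(generic_real_name(len(seq)), G)
# val(check(n), G) is the von Neumann ordinal n, which has n elements,
# so the element counts below decode x back into a set of naturals.
print(sorted(len(m) for m in x))  # [0, 2, 3, 5]: the 1-bits of seq
```

The evaluated name recovers exactly the positions where the "generic" sequence has a $1$, which is the usual first example of a new object named in advance and pinned down only by the filter.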

$M[G]$ denotes the union of all such $M[G]_\alpha$’s. The term “forcing” derives from the following definition of the relation between the poset in question and what $M[G]$ satisfies.

Definition. If $s$ is an element of the Cohen poset and $P$ is a statement about elements of $M[G]$, then we write $s\Vdash P$ (“$s$ forces $P$”) to mean: every generic filter $G$ having $s$ as an element will make $M[G]$ satisfy $P$.

In many texts at this point, basic properties of $M[G]$ will be established. The most important ones are:

1. $M[G]$ is countable and transitive. $G\in M[G]$. $M\subseteq M[G]$. And any transitive model satisfying these conditions will contain $M[G]$.
2. $M[G]$ satisfies $\mathsf{ZFC}$.

The proof of 2 is far from trivial. It requires a careful study of the relations among $M, G,$ and $M[G]$, as well as the forcing relation. A key component is the pair of results known as the Truth Lemma and the Definability Lemma, which jointly say that the forcing relation $s \Vdash P$ is equivalent to a topological property of the poset in question. This makes it a mathematical property that $M$ can handle using the $\mathsf{ZFC}$ axioms, rather than a metamathematical property about countable transitive models.

Technical Difficulty
The majority of the time and energy on a first pass through forcing will be spent on these facts, which involve very tedious proofs by induction. In particular, the equivalent topological property given by the Truth and Definability Lemmas involves complex recursive definitions that are not at all intuitively clear.

## Forcing more generally

So that is how forcing builds new models of $\mathsf{ZFC}$ out of old ones. I should stress that we’ve only looked at forcing with toy examples, namely the Cohen poset and the truth value poset. Forcing itself is incredibly versatile, in that one can use forcing with whatever poset one desires, and sophisticated arguments and extensions have been produced using clever designs of posets. For instance, let me say a little bit about how one obtains a model in which $2^{\aleph_0}\neq \aleph_1$ (so the continuum hypothesis fails in that model).

This is done by considering a suitable product of the Cohen poset.

• $M$ thinks that there is a poset that is $\aleph_2$-many copies of the Cohen poset.
• Forcing with it produces a generic extension that has that many infinite $0-1$ sequences, and hence that many real numbers.
• One then checks that the generic filter $G$ is not so informative as to disturb the calculation of what $\aleph_2$ (“the second uncountable cardinal”) is between $M$ and $M[G]$. This step is necessary because, as we’ve seen, adding things to $M$ might reveal the countability of elements of $M$.
• This is ensured by appealing to the truth lemma and the definability lemma, and using a topological property of the poset known as the countable chain condition.
• So one concludes that, in $M[G]$, $2^{\aleph_0}\geq \aleph_2$.

I should also point out that the approach outlined in this post, namely the approach using countable transitive models, is one of several equivalent approaches to forcing. Some others are:

• Syntactic forcing.

In syntactic forcing, one completely dispenses with any talk of models or truth in a model. Rather, one proceeds directly to the topological property given by the Truth Lemma and the Definability Lemma, making forcing really the study of topology and the coding of formulas.

Pro: no need to talk about countable transitive models or the existence of generics anymore.
Con: involves complicated coding, translating semantic notions (such as truth) into somewhat stilted syntactic notions that are not very intuitive.

• Boolean-valued models.

In this approach, one takes the truth value metaphor seriously. If one forces with the special class of posets known as complete Boolean algebras, one can make precise the notion of a Boolean-valued characteristic function (which is what names are in this approach) and the corresponding Boolean-valued model. So names end up playing the roles of actual elements in a Boolean-valued model. This was invented by Solovay and Scott shortly after Cohen’s invention of forcing, in order to get a conceptually perspicuous way of thinking about what forcing is really doing.

Pro: it is easier to understand what forcing is trying to do in this way, and if one has a category-theoretic background, this is arguably the most natural way of understanding forcing. In addition, many of the results from the study of Boolean algebras can be applied to the study of generic extensions, a connection that is obscured in the countable transitive model approach.
Con: substantial prerequisites in the theory of complete Boolean algebras, including the separative quotient of a poset, regular open completion of a separative poset, etc.

• Boolean-valued ultrapowers.

This is a slight variant of Boolean-valued models. One uses the Boolean analogue of the ultrapower construction to obtain an (uncountable) model over which generics exist. Joel Hamkins provided a naturalist account of forcing based on this idea.

## So I am ready to learn forcing / have learned forcing. What now?

• For an introductory text, Kunen’s Set Theory: An Introduction to Independence Proofs is really hard to beat. This book takes the countable transitive model approach, but has a section devoted to addressing other approaches.
• Jech’s Set Theory: The Third Millennium Edition is another classic. This encyclopedia of a book contains an introductory chapter to forcing and uses the Boolean-valued approach.
• If Kunen’s and Jech’s books are too long or detailed, two shorter, more focused alternatives are Forcing for Mathematicians by Weaver and Fast Track to Forcing by Džamonja. Besides both being dedicated introductions to forcing, the former has the advantage of surveying many actual mathematical problems that have been proven independent using forcing and has a few chapters on forcing axioms; the latter surveys many sophisticated posets invented by set theorists.
• This set of lecture notes by Kameryn Williams. Very compact and self-contained material that takes one from logical preliminaries all the way to applications of forcing to the study of large cardinals (what a ride!). It has a section devoted to the introduction of forcing.
• Intermediate extensions or symmetric submodels. If $M$ satisfies $\mathsf{ZFC}$, then $M[G]$ does too. So it might seem that independence results about the axiom of choice cannot be obtained by forcing. This is not the case, as Cohen proved: one can look at models of set theory between $M$ and $M[G]$ in a systematic way to obtain models where the axiom of choice fails. This subject is treated in Jech’s The Axiom of Choice. The tour de force of this line of thought is no doubt Solovay’s classic 1970 paper in the Annals of Mathematics, where he produced a model in which every set of real numbers is Lebesgue measurable (and much more!).
• Applications of forcing to mathematics. The later chapters of Weaver’s book contain proofs of independence of some open problems. Forcing also provides new tools to study sets of real numbers, which is the topic of Miller’s book Descriptive Set Theory and Forcing: How to prove theorems about Borel sets the hard way. For a more comprehensive overview of set-theoretic tools used in real analysis, see Set Theory and the Analyst.
• The proliferation of generic extensions makes these extensions and the forcing method themselves interesting objects of study. This is a subfield known as Set-theoretic Geology. See also The Modal Logic of Forcing.
• The proliferation of generic extensions also invites the question: are the axioms of set theory more similar to the axioms of group theory than to the axioms of the real numbers? If so, what can one say about the truth/falsity of independent statements, such as the continuum hypothesis? A provocative view comes from Joel Hamkins in his influential paper The Set-theoretic Multiverse. This whole debate is nicely recounted in the early sections of A Reconstruction of Steel’s Multiverse Project by Maddy and Meadows. The same sections also contain a riveting narrative of how the independence phenomenon influenced the development of set theory and set theorists’ thinking.
• And much much more!
]]>
Jason Zesheng Chen
Well-ordering of the reals is not measurable
2021-08-23T15:12:00+00:00
https://jasonzeshengchen.github.io/blog/2021/wellordering-of-reals-unmeasurable

Certain sets of real numbers occur frequently in Mathematical Logic but lack a nice geometric presentation that would get them into Real Analysis textbooks. Some of them are quite fun examples to have in one’s repertoire. I’d like to refer to these as “logical” subsets of the reals, for the boring reason that logic is where they’re typically found. A particularly nice example is a well-ordering of the reals, considered as a subset of the plane. A quick proof using Fubini’s Theorem demonstrates that it is not (Lebesgue) measurable, thus providing another proof that the axiom of choice implies there is a nonmeasurable set. The core idea of the proof is, to the best of my knowledge, due to Sierpiński.

The key to this proof is the following special case of Fubini’s Theorem:

Theorem. Let $$A$$ be a measurable subset of $$\mathbb{R}^2$$. For each $$y\in \mathbb{R}$$ define the section of $$A$$ by $$y$$ as $$A_y:=\{x\in \mathbb{R}\mid (x,y)\in A\}$$. Then $$A$$ has measure zero iff almost every section of $$A$$ has measure zero; that is, iff $$\{y\in\mathbb{R}\mid A_y \text{ is not measure zero}\}$$ is a measure-zero subset of $$\mathbb{R}$$.

Here, by not measure zero (or non-null), I mean either having positive measure or nonmeasurable.
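For reference, here is the special case in exactly the form it gets used below; a routine consequence, stated under the assumption that $$A$$ is measurable:

```latex
A \subseteq \mathbb{R}^2 \text{ measurable, every section } A_y \text{ countable}
\;\Longrightarrow\;
\lambda^2(A) \,=\, \int_{\mathbb{R}} \lambda(A_y)\, dy \,=\, \int_{\mathbb{R}} 0\, dy \,=\, 0.
```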

The key to combining Fubini’s Theorem with a well-ordering of the reals is that we can now identify the first place where things start to have positive measure. In other words, we find a place in the well-ordering below which every proper initial segment is either measure zero or nonmeasurable. This will end up giving us a set to which Fubini’s Theorem applies.

Let us see this in action with a warm-up proof.

Theorem. Assume the continuum hypothesis, which says there is a well-ordering $$\prec$$ of $$\mathbb{R}$$ such that every proper initial segment is countable. Then $$\prec$$ is not measurable as a subset of the plane.

Proof: For each $$x\in\mathbb{R}$$, the initial segment $$\prec_x$$ determined by $$x$$ is countable, and hence of measure zero. It is helpful to visualize $$(\mathbb{R},\prec )\times (\mathbb{R},\prec )$$ as a plane. Assume towards a contradiction that the set $$\prec =\{(x,y)\mid x\prec y\}$$ is measurable. Then it has measure zero, because each of its sections is an initial segment of the well-ordering $$\prec$$ and hence countable. But then the complement of this well-ordering, $$\succeq:=\{(x,y)\mid y\prec x \text{ or } y=x\}$$, would also have measure zero by the same argument (except now we take sections along the other axis). This would mean that $$\mathbb{R}^2=\prec \cup \succeq$$ is null. Contradiction! So under $$\mathsf{CH}$$, a well-ordering of the reals is not measurable. $$\square$$

We now observe that we used $$\mathsf{CH}$$ only to establish that every initial segment of the well-ordering has measure zero. This assumption can be dispensed with, because one may instead consider the least position below which everything is either measure zero or nonmeasurable. This works as follows.

Theorem. Let $$\prec$$ be a well-ordering of the reals. Then $$\prec$$ is not measurable.

Proof: Suppose it were measurable. Let $$a$$ be the $$\prec$$-least real such that the product $$\prec_a\times\prec_a$$ has positive measure (here $$\prec_a$$ denotes the initial segment determined by $$a$$ in the well-ordering). It could very well be that no such real exists, in which case we instead run the argument on the whole space $$(\mathbb{R},\prec)\times(\mathbb{R},\prec)$$.

Now we restrict the well-ordering to everything $$\prec$$-below $$a$$; that is, we consider $$\prec\upharpoonright a :=\{(x,y)\mid x,y\prec a \text{ and }x\prec y\}$$. Every section of this set has measure zero: if its section by some $$z$$ were non-null, then $$z\prec a$$ would violate the minimality of $$a$$. So Fubini’s Theorem implies that $$\prec\upharpoonright a$$ has measure zero.

But the other half of the square $$\prec_a\times\prec_a$$, namely the reflection of $$\prec\upharpoonright a$$ across the diagonal, has the same measure as $$\prec\upharpoonright a$$. Hence the square $$\prec_a\times\prec_a$$ is the union of two measure zero sets (plus the null diagonal), which implies that it has measure zero itself, contradicting the choice of $$a$$. $$\square$$ ]]>