Understanding Metatheory and the Broader Picture of Foundational Set Theory

Question

So I'm trying to put together a clearer picture of what is going on when we study set theory. I'll describe my current picture which I'd appreciate some feedback on, and I'll ask some specific questions as well.

So from the start: if we take a Platonist perspective (which I was taught as the most pedagogically effective philosophy to have when learning set theory) then we assume sets in some way or another exist along with the intuitive properties like membership. Then when we list the ZFC axioms (which can be done via some bootstrapping process without need for sets) and we are just saying that sets satisfy these objects. Using our intuitive mathematical reasoning and the axioms we can develop all everyday mathematics, including mathematical logic. Is it fair to say that this intuitive notion of a set and mathematical reasoning is the `most' meta metatheory?

However, now having developed mathematical logic, using this metatheory we can consider ZFC formally as a mathematical object along with a sequent calculus (which I believe can also be developed purely syntactically without the need for sets) and use results like the completeness theorem to reason about the mathematical object of ZFC. In particular, using the metatheory we can say $ZFC \not\vdash CH$ and $ZFC \not\vdash \neg CH$. This however says that there is no formal proof of $CH$ or its negation. However, if a formal proof is just a mathematical object that is made to faithfully represent our informal notion of a proof within the metatheory, then how do we know that there is no informal proof of $CH$ or $\neg CH$? I guess to show that a formal proof is the same as an informal proof would require us to step outside the current metatheory so that we may talk more concretely about it, but this is not possible as it is the `most' meta, so we just believe that this formal and informal notion of a proof agree?

That's my current picture, but here are a couple of other questions, hopefully not making this post too long.

I am confused about how to know when we are working in a metatheory, and more generally what a metatheory is? For example I can't make sense of this paragraph in the lecture notes of a class I took: "For sets $x_1,...,x_n$, we let $\{x_1,...,x_n\}$ be the set containing exactly $x_1,...,x_n$. (We could prove this exists by induction on n, but one then has to ask where this induction takes place. At this stage it would take place in the metatheory (which is fine). Only once the Axiom of Infinity is introduced could we endeavour to prove a corresponding internal version within set theory.)" What is induction in the metatheory here?

Another question is how to make sense of classes. In ZFC they are informally defined and we think of something belonging to a class if and only if it satisfies some logical formula. When we work with classes and say some property of them holds, we are really just saying, if something satisfies this formula of the class then it has such and such a property? I also understand that in Von Neumann-Bernays-Gödel set theory classes are given a formal existence.

I am quite confused and want to try to get the big picture as straight as possible, one that puts me in a comfortable position to proceed with learning more set theory. I am currently working through Kunen's introduction to independence proofs where there is a strong emphasis on distinguishing between the theory and the metatheory, so any response and even helpful references are really appreciated.

You write: "I am confused about how to know when we are working in a metatheory". The term metatheory refers to the theory you work in; that's exactly what the term means. This is in contrast to the object theory, which is the theory that you study. — Z. A. K., Feb 18 '24 at 05:16

Soundwave · Answer 1 · 2024-02-18T16:27:42.163

I'll take a crack at this.

Any time you make a logical argument, you maybe presume some things that are true (your axioms) and always presume some rules about how you can work with true premises to reach true conclusions (your rules). Together, they form a theory.

Maybe I'm doing some set theory, and I'm working in the theory $\mathsf{ZFC}$. Or maybe I'm doing some arithmetic and I'm working in Peano Arithmetic, $\mathsf{PA}$. But then I get curious, or maybe a little manic, or maybe my name is Hilbert, and I start thinking things like "I wonder if I can be sure the assumptions of my theory don't contradict each other?" or "I wonder if I can prove everything that is true with this theory?" or "My name is Hilbert and I can prove everything with this theory." Then a smart kid by the name of Gödel comes along and shows that actually, you can't. You can encode the syntax and proofs of $\mathsf{PA}$ inside $\mathsf{PA}$ and then show that, if $\mathsf{PA}$ is consistent, $\mathsf{PA}\not\vdash\ulcorner\mathsf{PA\ is\ consistent}\urcorner$.

It sounds like this all probably isn't news to you. But here's the key point: $$\mathsf{ZFC}\vdash\ulcorner\mathsf{PA\ is\ consistent}\urcorner$$

and trivially so! $\mathsf{PA}$ is intended to model the natural numbers and very basic properties of them, so the natural numbers within $\mathsf{ZFC}$ serve as a model quite readily.
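
To make the corner-quote notation concrete, here is one standard way to spell it out. This is only a sketch: the symbols $\mathrm{Prov}_{\mathsf{PA}}$ (an arithmetic formula expressing "there is a $\mathsf{PA}$-proof of the formula with this code") and $\mathrm{Con}(\mathsf{PA})$ are notation introduced here for illustration.

$$\mathrm{Con}(\mathsf{PA}) := \neg\mathrm{Prov}_{\mathsf{PA}}(\ulcorner 0=1\urcorner)$$

$$\mathsf{PA}\not\vdash\mathrm{Con}(\mathsf{PA})\ \text{ (if } \mathsf{PA} \text{ is consistent)},\qquad \mathsf{ZFC}\vdash\mathrm{Con}(\mathsf{PA})$$

The sentence is the same on both sides; only the theory asked to prove it has changed.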

The takeaway here is that whether or not something is provable is not just a function of its truth, but also of the theory the proof is to be conducted in. Always. Even (especially) if this theory is informal.

When we're proving theorems about a theory, whether explicitly, by making statements like the ones above, or implicitly, by, say, doing axiomatic set theory and working within a particular object theory like $\mathsf{ZFC}$, the theory the theorems are argued in is called the metatheory. In fields like model theory, semantics, and constructive mathematics, we're often explicit about what our metatheory is. In logic and other axiomatized fields, we're usually at least careful to be precise about what properties it has. In other fields, less so.

We are always working in a metatheory.

What metatheory? Well, it depends, and also often doesn't matter; much of mathematics is not all that precise about it. Here's a selection of properties that are important:

Is your metatheory classical?
Informal status: if unstated, almost certainly yes.

That is, is $P\vee\neg P$ true for all statements $P$? It's very intuitive that this should be true, and used frequently. "Either P or not P. If P, Q. Also if not P, Q. Thus, Q." But it has some nasty properties that lead some fields of math to avoid it. Due to how intuitive it is, it's really easy to footgun yourself by accidentally using it in informal reasoning when you mean to avoid it. It also likes to hide in other seemingly innocuous statements. Informal reasoning tends to just take it as fact.
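
Written out as an inference, the argument pattern quoted above is proof by cases on $P\vee\neg P$. This is just a sketch of the shape of the reasoning, with $P$ and $Q$ arbitrary statements:

$$\frac{P\vee\neg P\qquad P\rightarrow Q\qquad \neg P\rightarrow Q}{Q}$$

Only the first premise is specifically classical; the step from the three premises to $Q$ is accepted constructively too. One of the innocuous-looking places it hides is double negation elimination, $\neg\neg P\rightarrow P$: taken as a principle for all $P$, it is equivalent to excluded middle.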

To what extent does your metatheory admit infinities?
Informal status: Usually. Sometimes rejected so that the metatheory has nice properties w.r.t. computation or as a philosophical exercise. Countable infinities crop up here and there informally without concern. Uncountable infinities are very suspicious (but also very rare).

Is your metatheory okay with making arguments about an infinite collection? Generally the bar is "effectively enumerable," as in, in principle you could go through all the elements in a particular order. Then the infinity is tamer, and this becomes the same as asking...

Does your metatheory admit induction?
Informal status: Usually. May be avoided as a philosophical exercise.

Not much more to say here. Talking about unending generalities is useful and it's done frequently, with only the strictest of finitists rejecting it as a philosophical exercise.

I will mention two theorems that are commonly taken for granted but that do come from metatheoretic induction, because knowing that they are arguments in the metatheory and not provable in their respective theories is an important point of order: the deduction theorem, that if you can deduce $Q$ from assuming $P$ then you can conclude $P\rightarrow Q$, and predicate extensionality, $\forall x\forall y[x = y\rightarrow(\phi(x)\leftrightarrow\phi(y))]$.
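
Stated a little more formally, as a sketch: $\Gamma$ below is an assumed name for the set of background premises, which the paragraph above leaves implicit, and $P$ is taken to be a sentence. In a Hilbert-style proof system the deduction theorem is a statement about derivations, which is why its proof is an induction in the metatheory (on the length of the derivation of $Q$) rather than a derivation inside the theory:

$$\Gamma\cup\{P\}\vdash Q \quad\Longrightarrow\quad \Gamma\vdash P\rightarrow Q$$

Likewise, predicate extensionality is proved separately for each formula $\phi$, by induction on how $\phi$ is built up, so it is a schema with one instance per formula and not a single sentence of the theory.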

Does your metatheory admit choice?
Informal status: Depends. If you're doing very object level math where the metatheory is implicitly some sort of set theory, like analysis or topology, then usually yes, although the use of choice is generally noteworthy. Otherwise usually not, although category theorists are notoriously bad about this.

This one is mind-bending, and a whole thing. Basically argument forms like "There is an X, consider one such X" or "Consider each X in turn", which are innocuous if you make finitely many such choices, can become a whole other thing if you have to make infinitely many such choices or range over an infinite set. Much has been written about choice and why one may consider leaving it behind; I will not pretend to be able to summarize it succinctly here.

Those are some of the properties you might consider. Technically there's other non-trivial properties, like for example whether talking about "functions from A to B" even makes sense, but virtually every metatheory takes them for granted so you will rarely worry about them. It otherwise usually doesn't matter exactly what the metatheory is, because the point of much of mathematics is that the metatheory is abstracted away, so you can focus on the object theory. In most cases you will be working in a classical metatheory with induction and infinity, with or without choice. It might as well be $\mathsf{ZFC}$ or $\mathsf{ZF}$, depending. If you're more into modern foundations you might consider the metatheories to be $\mathsf{MLTTW + EM}$ with or without non-constructive choice. If you're really out there you might consider yourself working in a topos with or without enough projectives. It really... just... doesn't matter. Pick your poison. There isn't a best theory, and none of these theories, even unspecified informal ones, are immune to formalization and analysis by yet another, more meta, theory, if you were so inclined. As you might be catching on to...

There isn't a "most" meta theory. No one theory is "best" or "canonical."

So how do we know that the informal theories that we adopt casually can fairly be considered to coincide with our formal ones?

Well, at one point, we didn't! For a time we had $\mathsf{ZF}$, and then choice was discovered when close inspection of informal theories of analysis showed that presuming the reals admitted a well-order was a non-trivial assumption. In the very beginning we had unrestricted comprehension, until we found out that it's inconsistent. But at this point, we've learned a lot about theories of logic and sets, and we know all of known informal mathematics (modulo some sticky points at the limits of category and type theory) can be modelled by $\mathsf{ZFC}$, and furthermore that $\mathsf{ZFC}$ is a comically powerful theory, so identifying informal reasoning and $\mathsf{ZFC}$ is a conservative assumption.

We identify formal and informal proofs because the formal notion of proof in modern mathematics is very strong, and it's a conservative assumption to identify them.

But just as we've been wrong about comprehension and choice in the past, we could be wrong about this too. Indeed, informal reasoning outside of the known reaches of $\mathsf{ZFC}$ is common, and is important to steering how research is conducted. For example, you ask working computer scientists if $\mathsf{P}=\mathsf{NP}$, and they likely have an informal argument that they are (probably) not equal. Most research then takes the flavor of trying to find descriptions of these classes that are precise and tractable enough to distinguish them conclusively.

Once an argument coalesces into something formalizable, it becomes natural to then ask how this relates to other theories with known properties. Can the premises be proven from another theory? If not, is it philosophically sensible to include these premises in our metatheory from here on out? Without answering these questions, an informal argument really fails to be very meaningful. After all, as far as logic is concerned "suppose $\neg\mathsf{CH}$, then $\neg\mathsf{CH}$" is as good an informal proof as any that the continuum hypothesis is false, but it doesn't elucidate anything. Hence why we asked these questions, and discovered it was independent from $\mathsf{ZFC}$, and that both supposing it and its negation have interesting consequences.

We don't know that an informal "proof" of the continuum hypothesis doesn't exist (and I'd argue there's actually some fairly convincing arguments for taking it as true), but we do know it is independent of $\mathsf{ZFC}$, so whatever metatheory that proof may come in, it is stronger than $\mathsf{ZFC}$.

It's tempting to cling onto the informal notion of proof as some truth, but in the echo of every proof is always a metatheory. If that metatheory is philosophically compelling, then you may consider the proof philosophically compelling as well. If it isn't, then the proof is perhaps a mere mathematical curiosity (or triviality, as the case may be).

It's metatheories all the way down. Where you decide to stop is a matter between you and the fractal.

And, a humble word of advice from a constructivist: don't try to get smart with $\mathsf{ZFC}$. It's got hands. It's probably already whispering more lies than truths. It'll treat you well.

Classes: An Appendix

Aah yes. Classes. The budding set theorist's first point of metatheoretic subtlety. Here's the deal with classes. If we're playing strictly by the rules, there are no classes. They are merely syntactic abbreviations, as follows: $$\{x\ |\ \phi(x)\}\in\{y\ |\ \psi(y)\} := \exists x[\forall y[y\in x\leftrightarrow\phi(y)]\wedge x\in\{y\ |\ \psi(y)\}]$$ $$x\in\{y\ |\ \psi(y)\} := \psi(x)$$ This is confusing, both because you very quickly build up layers upon layers of nested abbreviations, and because informal arguments start getting a little fast and loose about whether classes are really objects of the theory. But they are just abbreviations.

Yes, classes are just abbreviating statements about the properties of sets. Working with classes is in a very precise sense just working with properties of their elements.
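
As a small worked instance of the two abbreviations above (a sketch; $\phi$ and $\psi$ stand for arbitrary formulas), applying the second rule to the inner membership in the first rule removes the last trace of class notation:

$$\{x\ |\ \phi(x)\}\in\{y\ |\ \psi(y)\} := \exists x[\forall y[y\in x\leftrightarrow\phi(y)]\wedge\psi(x)]$$

The right-hand side mentions only sets, $\in$, and the formulas $\phi$ and $\psi$. For example, with $\phi(x)$ taken to be $x\neq x$ and $\psi(y)$ taken to be $y=y$, it says that there is a set with no elements (which then trivially satisfies $x=x$), so this particular class membership holds exactly because the empty set exists.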

The idea is, in principle, you could unroll those abbreviations and pin down the formalities, and you'd just get theorems about intensely complicated statements not involving classes. You just really don't want to do that. For example, here's a humble extension axiom for $\mathsf{ZFC}$ concerning large sets. With 163 symbols (not including the expansion of equality which is taken as primitive in this metatheory), it's basically illegible without abbreviations. But it can be done.

Now $\mathsf{NBG}$ is just a theory that says, look, if we're serious about classes being "just an abbreviation" and our informal treatment of classes kinda sorta seems like they're sets, surely this should be formalizable. And it turns out it is, and furthermore, it has a nice, expected property: we say $\mathsf{NBG}$ is a conservative extension of $\mathsf{ZFC}$, meaning that if a provable statement of $\mathsf{NBG}$ looks like a statement of $\mathsf{ZFC}$ (or in the case of $\mathsf{NBG}$ even stronger: if it can be interpreted as a statement of $\mathsf{ZFC}$ by the natural interpretation of classes in $\mathsf{ZFC}$), then that statement is also provable in $\mathsf{ZFC}$. This is why informally we consider the theories interchangeable, because there's a very narrow formal sense in which they prove the same things.
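
In symbols, and only as a sketch ($\sigma$ is a name introduced here), conservativity says that for every sentence $\sigma$ in the language of $\mathsf{ZFC}$, that is, one whose quantifiers range only over sets:

$$\mathsf{NBG}\vdash\sigma \quad\Longrightarrow\quad \mathsf{ZFC}\vdash\sigma$$

The converse direction holds as well, since $\mathsf{NBG}$ proves every axiom of $\mathsf{ZFC}$ when it is read as a statement about sets, so the two theories prove exactly the same statements about sets.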

Potato · Answer 2 · 2024-02-19T14:50:57.770

Regarding what constitutes the metatheory, Kunen defines it as "basic finitistic reasoning" in his Foundations of Mathematics (in I.7.2) and elaborates in his section III:

We conclude with two additional questions: What exactly is the metatheory? Why is it beyond reproach, as we said above, or even consistent?

For the first question, unfortunately, we cannot say exactly. Roughly, as we said, the metatheory is basic finitistic reasoning about finite objects such as finite numbers and finite symbolic expressions. One could attempt to give a precise definition of exactly what finitistic reasoning is — for example, we could say that it is what can be formalized within the system PRA mentioned above. But if you look at the definition of PRA (or of any other formal system), you will see that to understand the definition, you need to understand already basic finitistic reasoning. That is, starting from nothing, you can't explain anything.

(I think implicit here is the claim that finitistic reasoning is essentially the most minimal, conservative reasoning system possible. So if we use it as the metatheory, even though the reasoning is informal, there will be no doubts about its validity or the validity of our foundational project.)

For the second question, unfortunately, we cannot say exactly. Presumably, the metatheory uses the same reasoning that we use to reason about the world around us, so this "must" be correct or else we would all be insane but of course, that's not a proof. But, we can contrast this question with the question of whether ZFC is consistent. If ZFC turns out to be inconsistent, this may not be relevant to most people outside of pure mathematics. Perhaps the inconsistency lies with cardinals $\gamma$ such that $\gamma = \beth_\gamma$; then this might not even affect many set theorists. Of course, we would have to revise the axioms so as not to generate such cardinals, just as Zermelo had to axiomatize Cantor's informal set theory so as not to produce $\{x: x \notin x\}$. But much of set theory and independence proofs involves cardinals around $2^{\aleph_0}$ and $2^{2^{\aleph_0}}$, and these might survive the revision unscathed. But, if there are inconsistencies in ordinary finitistic reasoning, then we would all have to revise the way we think about the world. This is an interesting question in general philosophy (how do we know we are sane?), but it departs from the philosophy of mathematics.

I also find the following quote from Mileti's Modern Mathematical Logic (Section 8.2) helpful:

If you believe that there exists a real world of true mathematics, then there is no issue. We do normal mathematics in that world, and when we say the “set of axioms of ZFC,” we mean set in that sense, not in the sense of ZFC itself. In other words, the metatheory is the world of normal mathematical practice. Working in that metatheory, we are just building a toy world axiomatized by AxZFC, which is capable of embedding mathematics. Of course, we should not confuse the real world of true mathematics with this toy first-order theory, just like we should not confuse the actual laws of physics with a computer simulation of the laws of physics. Thinking about exotic models of ZFC is certainly fun and interesting, and can give us insight into which other axioms we might adopt, but a first-order theory is not meant to be a perfect reflection of mathematical reality. Instead, it is a playground that we can explore and analyze in order to gain insight about the true nature of mathematics (just as a computer simulation can give us insight into the physical universe).

The simulation analogy to me feels very apt. To explain why, let me digress a bit.

When Kunen discusses recursion theory in his Foundations, he gives a detailed discussion of the Church–Turing thesis, which basically says that the formal notion of computable functions he gives via $\Delta_1$ sets exactly coincides with our informal, pre-theoretic notion of "computable by an algorithm." Of course, one cannot prove this "rigorously." However, Kunen gives two pieces of evidence. First, we've been studying computability for many decades and have no plausible counterexample to the thesis. Second, all the "reasonable" attempts to formalize the informal notion have ended up being equivalent (the $\Delta_1$ description is provably equivalent to the Turing machine one, for example).

Kunen compares this (somewhat humorously) to the "Newton thesis" that the derivative of momentum (that is, the $F$ in $F = ma = \frac{d}{dt} p$) correctly captures our intuitive notion of "force." Again, there is no way to prove this, but our vast experience with physics suggests it is correct.

To the Church–Turing thesis and the Newton thesis, we might add the "ZFC thesis": all normal mathematical practice is formalizable in ZFC. The justification is the same: some combination of experience and the fact that all other proposed formalizations end up being equivalent, or nearly so. We take this as evidence that ZFC is sufficient to "simulate" our real mathematical practice. By "nearly so," I mean that we may quibble about things like the axiom of choice or whether to admit classes, like in NBG, but this doesn't affect the ability to formalize everyday mathematics. I'm also ignoring proposed formalizations that are not set-theoretic, like type-theoretic foundations, but their strength can again be understood in terms of set theories, so again there's nothing essentially new happening.

(Incidentally, the simulation analogy explains why it is silly to worry about "junk theorems" as in this question, at least according to the Platonist view adopted by your question and this answer. To worry about whether $3 \in \pi$, or whether $\pi$ is "actually" a set confuses the real concept of $\pi$ with its representation in ZFC. This is similar to being worried that a computer simulation of colliding billiard balls is telling us that billiard balls "truly are" bits stored in computer memory with properties like "existing at so-and-so address in memory." Just because we can answer mathematical questions by translating them into questions about sets doesn't imply that the objects of those mathematical questions "truly are" sets or have set-theoretic properties; all claims of "junk theorems" are a result of this conceptual confusion and hence wrong.)

For the formalist point of view, see III.2 of Kunen's Foundations and the discussion surrounding the above quote from Mileti's book.

Regarding your specific question about "induction in the metatheory," the author is just flagging that the discussion at that point is taking place in the metatheory. Induction is part of basic finitistic reasoning and so should be unobjectionable when used to justify that the set $\{x_1, \dots, x_n\}$ exists.
For a basic discussion of classes, see Jech's Set Theory (Third Millennium Edition), pages 5 and 6.

Although we work in ZFC which, unlike alternative axiomatic set theories, has only one type of object, namely sets, we introduce the informal notion of a class. We do this for practical reasons: It is easier to manipulate classes than formulas.

So it's an eliminable shorthand. In a true formalization you would just use the appropriate formula. (This is explained further in the "appendix" at the end of Soundwave's answer.)

Thanks for the great response, the simulation analogy is almost clicking.. If I'm to understand correctly, from the Platonist perspective, the mathematical universe exists in some abstract sense independent of language etc (as do propositions about these objects). Then in axiomatic set theory, platonically believing in only sets, experience shows this is enough to formalise the entire mathematical universe whereby a formal proof, $ZFC \vdash \phi$, coincides with our intuitive notion? So in this formalisation, or interpretation of ZFC, it may be true that $3\in \pi$, but not in our `reality'? — space_broccoli, Feb 21 '24 at 10:50
I guess I am confused then about where we sit when doing `everyday maths'. Are we in the simulation working with set theoretic encodings of abstract mathematical objects, or are we somewhere else? When one does group theory for example, we require a set and a binary operation satisfying some axioms. How do we know how to work with a binary relation without reducing it to be defined in terms of sets? One could then ask the same thing of all the numbers, and then wouldn't we get set theoretic encodings and junk theorems again? — space_broccoli, Feb 21 '24 at 11:06
@space_broccoli Thanks for your comments. For your first question, I agree with what you say. In particular, from the Platonist perspective, it may be true that $3_{\mathrm{ZFC}} \in \pi_{\mathrm{ZFC}}$, where the subscript ZFC stands for "the encoding of this number as a set in ZFC", but it is not true that $3 \in \pi$, where $3$ and $\pi$ are the numbers we talk about in everyday mathematics. — Potato, Feb 21 '24 at 12:29
@space_broccoli The questions in your second comment are more philosophical, so I can only offer my own opinion. I believe that everyday math typically takes place outside the "simulation" of ZFC, and we only enter the "simulation" when we want to talk about metamathematical issues, or our informal reasoning leads to an obvious error or ambiguity. High school students competently manipulate the real numbers every day without knowing about how to encode them as Dedekind cuts, for example (and knowing about this encoding wouldn't help them at all). — Potato, Feb 21 '24 at 12:32
Groups were defined and used in the 1800's before ZFC existed. So clearly binary relations and the other necessary ingredients can be explained informally/intuitively, perhaps with an informal notion of "set" or "collection," without entering the "simulation" and picking up junk theorems. — Potato, Feb 21 '24 at 12:35
Also, the irrelevance of set theory (beyond the basic definitions) to most of research mathematics, and the fact that most mathematicians probably couldn't state the axioms of ZFC, suggests that they basically all operate outside of the "simulation." — Potato, Feb 21 '24 at 12:42

score 1 · Answer 3 · 2024-02-19T16:51:14.783

So from the start: if we take a Platonist perspective (which I was taught as the most pedagogically effective philosophy to have when learning set theory) then we assume sets in some way or another exist along with the intuitive properties like membership.

We might do that, or we might e.g. take a more neutral formalist approach, so no metaphysical import: ultimately, I'd rather contend that the most "pedagogically effective" philosophy (as far as any non-philosophical discipline is concerned, at least) is no philosophy at all, e.g. we might just think of/present any theory, mathematical or otherwise, as a game.

Then when we list the ZFC axioms (which can be done via some bootstrapping process without need for sets) and we are just saying that sets satisfy these objects.

Satisfy these properties, rather. Right, but that does not solve the foundational problem; rather, it moves the goalpost one step deeper, to the "bootstrapping" process now, namely the bootstrapping of a formal logic to even start writing statements of any (formal) theory.

Here is what I think is a wonderful introduction to what that really looks like starting from a blank page. Notice that it is about classical logic then standard set theory, which are not the only possible logics and set theories, but the method is what matters (he eventually proceeds to mathematics for physics, which is the goal of the course, but that is beside our point): Frederic Schuller, Lectures on Geometrical Anatomy of Theoretical Physics (YouTube), lectures 01 and 02 in particular.

Using our intuitive mathematical reasoning and the axioms we can develop all everyday mathematics, including mathematical logic. Is it fair to say that this intuitive notion of a set and mathematical reasoning is the `most' meta metatheory?

No: formal logic (mathematical or otherwise, mathematical is an application) is a pre-requisite to writing any theory: it is the (formal) language in which the theory is written (see my notes above), and it comprises (simply put) logical symbols and rules of inference.

That said, and to the crucial point of your question, it is not meta-theoretic that is relevant to the foundational problem/construction, it is pre-theoretic that we are talking about! (Schuller dubs it "pre-formal": indeed I do not think there is a standard terminology for it.)

In simple terms, we must at least be able to distinguish symbols and count to even start talking about any logic, and those are intuitive notions that it is pointless to formalize (which, by the way, would be meta-theoretical: a theory about another theory), since the very formalization endeavour in principle (I mean, to begin with) has to rely on those intuitive notions...

A quick clarification: If I understand correctly, you are using "pre-theoretic" to mean basic finitistic reasoning. This is what the question asker is calling the "metatheory." That is the terminology Kunen's book (referenced in the question) uses. — Potato, Feb 19 '24 at 19:40
You might call it "basic reasoning", finitistic or not, but it's pre-theoretic, and it's just not the same thing as meta-theoretic: in particular, and as I have said, it becomes meta-theoretic if you start theorising about it... and I do not think Kunen is saying otherwise, but I am not particularly an expert on Kunen. — , Feb 20 '24 at 12:01

score 0 · Answer 4 · answered Jun 17 '25 at 23:22

I am going to answer your questions piece by piece so we can establish the proper prerequisites to answer in full. I am going to start with these questions:

I am confused about how to know when we are working in a metatheory, and more generally what a metatheory is? For example I can't make sense of this paragraph in the lecture notes of a class I took: "For sets $x_1,...,x_n$, we let $\{x_1,...,x_n\}$ be the set containing exactly $x_1,...,x_n$. (We could prove this exists by induction on n, but one then has to ask where this induction takes place. At this stage it would take place in the metatheory (which is fine). Only once the Axiom of Infinity is introduced could we endeavour to prove a corresponding internal version within set theory.)" What is induction in the metatheory here?

Just as a theory contains axioms about some mathematical structure, a metatheory contains basic axioms about provability from a theory. So in a metatheory we can talk about the syntax of a theory: the logical and nonlogical symbols, the strings of symbols that are well-formed formulas (wffs) of the language of the theory, the collection of formulas that are axioms of the theory, and the rules of inference and axioms of the logic used (usually first-order logic). In a metatheory we can also talk about proofs of wffs from the axioms and rules of inference of the theory. Using first-order logic over a metatheory, you can also ask questions such as whether there exists a proof of CH from the axioms of ZFC, or whether ZFC is consistent. It is a landmark result of Kurt Gödel, obtained as a lemma for his incompleteness theorems, that you can encode formulas and provability for any computably enumerable theory into the theory of arithmetic and prove some basic properties about them. In other words, the theory of arithmetic can be used as a metatheory for any practical proof system we might use.

Here is a method for recognizing whether we are working in the metatheory or the object theory. First translate the statement into a formal expression of the metatheory. If it quantifies over formulas, you know you are in the metatheory; if it is a statement contained within provability from the object theory, you are working inside the object theory. For example, $\forall n. ZFC\vdash \varphi_n$ is in the metatheory, since it means that for each of the $\varphi_n$, ZFC is able to prove $\varphi_n$, while $ZFC\vdash \forall x. \varphi (x)$ says that ZFC proves that for all $x$, $\varphi (x)$, which is fully contained inside the object theory, and so we are not working in the metatheory. Now, in the quoted passage the author of your lecture notes is presenting a definitional extension of the language of ZFC to contain a new term $\{x_1,...,x_n\}$ for the set that contains exactly the sets $x_1,...,x_n$. The first issue is that for this definitional extension to be sound, you need to be able to prove from the axioms that such a set exists, which ZFC can do. The second issue is that there are two ways to view the indexing: external or internal. In the external case one would need to show $\forall n. ZFC\vdash \exists y.y=\{x_1,...,x_n\}$ in the metatheory, while in the internal case one would need to show $ZFC\vdash \forall n\in \mathbb{N}.\forall (x_1,...,x_n)\in V^n.\exists y.y=\{x_1,...,x_n\}$, which is proved in the object theory. It is clear from context that the indexing is intended to be external, and the proof in the metatheory goes through even with just the axioms they had already presented. It seems they also wanted to note that the internal version is provable from ZFC as well; however, that proof requires more axioms than they had already presented.

Now we can describe what induction in the metatheory is. It is a method for showing in the metatheory that the formulas in an n-indexed collection are all provable in the object theory. You start with a proof in the object theory of the initial formula, together with a method for generating, from a proof in the object theory of the nth formula, a proof of the (n+1)th formula. Then in the metatheory you can prove by induction that for all n the nth formula is provable in the object theory. Notice how the hypotheses of the induction give you a way, for any particular n, to produce a proof in the object theory of the nth formula, but as n grows the proofs get longer and longer. Also note that the conclusion asserts provability, which is a notion that can only be expressed in the metatheory. In general, statements that every formula in an infinite set of formulas is provable are called theorem schemes, and proofs of theorem schemes always take place inside the metatheory. You need some induction axioms in the metatheory for this argument to work, but not the full scheme of induction: induction for formulas that involve provability and no further quantifiers is enough, which isn't a lot of induction.
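As a worked instance of this pattern, here is a sketch of the metatheoretic induction for the finite-set terms discussed above. It uses only pairing and union. The names $y_n$, $u$ and $v$ for the intermediate sets are my own labels, and I am assuming the lecture notes have these two axioms available at that point.

```latex
% Claim (stated in the metatheory): for every n >= 1,
%   ZFC proves that {x_1, ..., x_n} exists.
\begin{align*}
\text{Base case:}\quad
 & ZFC \vdash \exists y_1.\, y_1 = \{x_1\}
 && \text{(pairing applied to } x_1, x_1\text{)} \\
\text{Step:}\quad
 & \text{given a proof of } \exists y_n.\, y_n = \{x_1,\dots,x_n\}, \\
 & ZFC \vdash \exists u.\, u = \{x_{n+1}\}
 && \text{(pairing)} \\
 & ZFC \vdash \exists v.\, v = \{y_n, u\}
 && \text{(pairing)} \\
 & ZFC \vdash \exists y_{n+1}.\, y_{n+1} = \textstyle\bigcup v = \{x_1,\dots,x_{n+1}\}
 && \text{(union)}
\end{align*}
% Conclusion (by induction in the metatheory):
%   \forall n.\ ZFC \vdash \exists y.\, y = \{x_1,\dots,x_n\}
```

Each step appends a fixed number of lines to the previous proof, so the object-theory proof for a given n has length roughly proportional to n, which is the growth mentioned above.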

Another question is how to make sense of classes. In ZFC they are informally defined and we think of something belonging to a class if and only if it satisfies some logical formula. When we work with classes and say some property of them holds, we are really just saying, if something satisfies this formula of the class then it has such and such a property? I also understand that in Von Neumann-Bernays-Gödel set theory classes are given a formal existence.

There are a few different ways to handle classes over ZFC. Since ZFC can't talk about classes, they all involve reasoning in the metatheory. One way to handle classes is to show how certain formulas involving definable classes can be transformed into formulas not involving classes. Formulas involving class builder notation can be reduced in the metatheory to equivalent formulas not involving class builder notation. The Metamath Proof Explorer has a nice presentation of this idea. Essentially, in any formula that contains a subformula of the form $x\in\{y|\varphi(y)\}$ you substitute $\varphi(x)$ in its place. This handles most uses of classes in set theory. However, there might be times where quantification over classes is desired, in which case we must work in a metatheory. If we follow the standard convention of lowercase variables ranging over sets and uppercase variables ranging over classes, then a formula of the form $\forall A.\varphi(A)$ becomes the metatheoretic formula $\forall\psi.ZFC\vdash \varphi(\{x|\psi(x)\})$, so quantification over classes is transformed into quantification over formulas in the metatheory.

Another approach, which avoids direct reference to the metatheory, involves moving to an object theory which axiomatizes the properties of definable classes. This is what VBG class theory provides. Classes are formed from the scheme of predicative class comprehension: for each $\varphi(x)$ which does not quantify over classes (class parameters allowed), $\exists A.A=\{x|\varphi(x)\}$. This is taken along with the axioms of ZF. The axiom schemes of ZFC are also given as single axioms quantifying over classes: separation becomes $\forall A.\forall x.\exists y.y=A\cap x$, and replacement and (global) choice are packaged as the limitation of size axiom, asserting that every proper class is in bijection with the universal class. 
An important metatheoretic result about VBG is that any statement in the language of ZFC that is provable in VBG is already provable in ZFC, which justifies using provability in VBG when studying provability in ZFC. There are some differences between working with classes in the metatheory and working in an object theory with classes. The way classes were introduced in the metatheory means that we are only talking about classes that are definable by a formula of ZFC. In VBG, although the only classes that provably exist are definable, it is possible for a model of the axioms of VBG to contain classes that aren't definable. Statements provable in VBG must therefore apply to classes in general, whereas the metatheoretic method proves statements that apply specifically to classes definable in ZFC. The two approaches are related by the fact that if we have a model M of ZFC then the classes definable with parameters over M will satisfy the axioms of VBG (apart possibly from global choice, which can require first adding a global choice class).
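The elimination of class builder notation described above is a purely syntactic rewrite, so it can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tuple encoding of formulas, the function names, and the restriction to membership of a variable in a class term. Other uses of classes, such as equality between classes, need further definitional unfolding and are not handled, and the substitution does not rename bound variables.

```python
# Formulas are nested tuples (a hypothetical encoding chosen for this sketch):
#   ("in", s, t)     s ∈ t, where s is a variable name and t is either a
#                    variable name or a class term ("class", y, phi) = {y | phi}
#   ("not", f), ("and", f, g), ("forall", v, f)

def substitute(formula, old, new):
    """Replace free occurrences of variable `old` by variable `new`.

    Assumes `new` is not bound anywhere in `formula` (no capture-avoidance).
    """
    tag = formula[0]
    if tag == "in":
        _, s, t = formula
        s = new if s == old else s
        if isinstance(t, tuple):              # class term {y | phi}
            _, y, phi = t
            if y != old:                      # `old` is free inside the term
                t = ("class", y, substitute(phi, old, new))
        elif t == old:
            t = new
        return ("in", s, t)
    if tag == "not":
        return ("not", substitute(formula[1], old, new))
    if tag == "and":
        return ("and", substitute(formula[1], old, new),
                substitute(formula[2], old, new))
    if tag == "forall":
        _, v, body = formula
        if v == old:                          # `old` is rebound here, so stop
            return formula
        return ("forall", v, substitute(body, old, new))
    raise ValueError(f"unknown connective: {tag!r}")


def eliminate_classes(formula):
    """Rewrite every subformula x ∈ {y | phi(y)} to phi(x), nested ones too."""
    tag = formula[0]
    if tag == "in":
        _, s, t = formula
        if isinstance(t, tuple):
            _, y, phi = t
            return eliminate_classes(substitute(phi, y, s))
        return formula
    if tag == "not":
        return ("not", eliminate_classes(formula[1]))
    if tag == "and":
        return ("and", eliminate_classes(formula[1]),
                eliminate_classes(formula[2]))
    if tag == "forall":
        return ("forall", formula[1], eliminate_classes(formula[2]))
    raise ValueError(f"unknown connective: {tag!r}")


if __name__ == "__main__":
    russell = ("class", "y", ("not", ("in", "y", "y")))   # {y | y ∉ y}
    print(eliminate_classes(("forall", "x", ("in", "x", russell))))
```

Applied to $\forall x.x\in\{y|y\notin y\}$ this yields the class-free formula $\forall x.x\notin x$, which is an ordinary formula of ZFC.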

However, now having developed mathematical logic, using this metatheory we can consider ZFC formally as a mathematical object along with a sequent calculus (which I believe can also be developed purely syntactically without the need for sets) and use results like the completeness theorem to reason about the mathematical object of ZFC. In particular, using the metatheory we can say ZFC⊬CH and ZFC⊬¬CH. This however says that there is no formal proof of CH or its negation. However, if a formal proof is just a mathematical object that is made to faithfully represent our informal notion of a proof within the metatheory, then how do we know that there is no informal proof of CH or ¬CH?

One of the remarkable aspects of set theoretic independence results is that most of the work is done internally to the object theory ZFC. The only essential roles the metatheory plays are the ability to state the theorem and the use of Gödel's completeness theorem to provide a model of ZFC to work in. Since we are usually working with a metatheory that (by Gödel's incompleteness theorems) cannot prove the consistency of ZFC, the official version of the theorem is the statement $Con(ZFC)\implies (ZFC\not\vdash CH\land ZFC\not\vdash\neg CH)$. This was originally proved in two parts. First Gödel proved the $ZFC\not\vdash\neg CH$ part by showing, inside any model of ZFC, how to define a class of sets known as the constructible sets by starting with the empty set and iterating the definable powerset operation along the ordinals of the model. The $\alpha$th stage of the construction is called $L_\alpha$ and can be proved in ZFC to exist as a set for each ordinal $\alpha$. The class $L = \{x|\exists \alpha.x\in L_\alpha\}$ is the class of constructible sets. He then showed that if $\varphi$ is an axiom of ZFC then $ZF\vdash \varphi^L$, where $\varphi^L$ is the restriction of every quantifier in $\varphi$ to the class L. He also demonstrated that $ZF\vdash CH^L$, and so from a model M of ZFC, restricting the domain to its class L generates an inner model of ZFC + CH. Hence Con(ZFC) implies Con(ZFC + CH), which then implies $ZFC\not\vdash\neg CH$. The first proof of $ZFC\not\vdash CH$ was done by Cohen using his method of forcing. 
The details are somewhat technical, but the general idea is this. You start with a forcing notion, which is a set that in a sense contains partial descriptions of a new set to be added to the model. You then need a generic filter to stitch them together into a complete description of the new set. For a nontrivial forcing notion no generic filter exists inside the model itself; some simple arguments in the metatheory can be used to prove that a model with an external generic filter exists, or a somewhat more involved argument using the techniques in this paper of Hamkins and Seabold can simulate its existence within ZFC. Cohen showed that a particular forcing notion forces the new model to satisfy $\neg CH$, so Con(ZFC) implies Con(ZFC + $\neg CH$) and therefore $ZFC\not\vdash CH$.
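For reference, the constructible hierarchy and the overall shape of the two relative consistency arguments above can be written out as follows. Here $\mathrm{Def}(X)$ is my notation for the set of subsets of $X$ that are definable over $(X,\in)$ with parameters from $X$, which is what "definable powerset" is assumed to mean.

```latex
% The constructible hierarchy, defined by transfinite recursion on the ordinals.
\begin{align*}
L_0 &= \emptyset, \\
L_{\alpha+1} &= \mathrm{Def}(L_\alpha), \\
L_\lambda &= \bigcup_{\alpha<\lambda} L_\alpha \quad (\lambda \text{ a limit ordinal}), \\
L &= \{x \mid \exists\alpha.\, x \in L_\alpha\}.
\end{align*}
% Goedel's half uses the inner model L; Cohen's half uses a forcing extension.
\begin{align*}
Con(ZFC) &\implies Con(ZFC + CH) \implies ZFC \not\vdash \neg CH, \\
Con(ZFC) &\implies Con(ZFC + \neg CH) \implies ZFC \not\vdash CH.
\end{align*}
```

In both lines the last implication is just the observation that a theory cannot refute a statement it is consistent with.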

Since CH is independent of ZFC, if an informal argument for $CH$ or $\neg CH$ exists it must use properties of sets that cannot be derived from ZFC. How one approaches this possibility depends strongly on one's philosophy of set theory. Those of a Universist persuasion accept the idea that there is one background notion of set that can be taken as correct and definite. From this position it is natural to look for new axioms to add to ZFC that ought to be true, in some intuitive sense, of this background structure. There are two common ways to try to resolve CH. One is to describe an inner model that should contain all of the sets; these axioms usually imply that CH is true. The other commonly explored alternative is forcing axioms, which try to say that in some sense the universe of sets is saturated under some class of forcing notions; these often resolve CH as false. Those of a Multiversist persuasion claim there are many competing structures that could be called the sets, none of which has the strongest claim to being all of the sets. Generally under this view CH is neither true nor false; it is simply satisfied in some set theoretic universes and not in others. A Multiversist generally believes an informal proof of $CH$ or $\neg CH$ to be impossible, since it would rely on principles found only in some set theoretic universes and not in others.

I guess to show that a formal proof is the same as an informal proof would require us to step outside the current metatheory so that we may talk more concretely about it, but this is not possible as it is the `most' meta, so we just believe that this formal and informal notion of a proof agree?

Here there is an important dichotomy about provability that comes from Gödel's incompleteness theorems. Throughout, an inner occurrence of $T\vdash\ulcorner A\urcorner$ denotes the arithmetized sentence expressing that T proves A. The provability predicate defined in the proof of the incompleteness theorem satisfies the property that, for a computably enumerable theory T that interprets enough arithmetic, $T\vdash A\implies T\vdash T\vdash\ulcorner A\urcorner$. In fact we can internalize this inside T as $T\vdash T\vdash \ulcorner A\urcorner\rightarrow T\vdash\ulcorner T\vdash \ulcorner A\urcorner\urcorner$ (both are principles of Provability Logic). The idea is that since the axioms are computably enumerable you can represent provability in arithmetic. Then, given a proof of A from T, you can break it down into axioms and rules of inference: instances of the nth axiom $A_n$ are replaced with the formula $T\vdash\ulcorner A_n\urcorner$, and the rules of inference are replaced with ones involving the provability predicate instead. The result is a proof from T of $T\vdash\ulcorner A\urcorner$.

While provability is adequately represented, the incompleteness theorem shows that unprovability is quite a different story. If T is a computably enumerable theory that interprets enough arithmetic then $T\not\vdash A\implies T\not\vdash T\not\vdash\ulcorner A\urcorner$. First note that $T\not\vdash A$ is equivalent to $Con(T+\neg A)$, which implies $Con(T)$, and that T itself can prove both of these facts. By Gödel's second incompleteness theorem, $T\not\vdash Con(T)$ whenever T is consistent, which it is here since $T\not\vdash A$. So if T proved $T\not\vdash\ulcorner A\urcorner$ it would also prove $Con(T)$, which shows $T\not\vdash T\not\vdash\ulcorner A\urcorner$. The previously linked article also mentions another limitative result about provability, known as Löb's Theorem: $T\vdash (T\vdash\ulcorner (T\vdash\ulcorner A\urcorner)\rightarrow A\urcorner)\rightarrow (T\vdash\ulcorner A\urcorner)$. 
By taking the contrapositive, substituting $\neg B$ for $A$, and using that nonprovability of $A$ is equivalent to consistency of $\neg A$ with T, Löb's Theorem becomes $Con(T+B)\rightarrow Con(T+\neg(\neg Con(T+B)\rightarrow \neg B))$. Using that $\neg(C\rightarrow D)$ is equivalent to $C\land\neg D$, it then becomes $Con(T+B)\rightarrow Con(T+B+\neg Con(T+B))$, which is what the incompleteness theorem proves. A theory T such that the metatheory can prove $\forall\varphi.( T\vdash\varphi)\rightarrow\varphi$ is said to be sound, or the metatheory is said to satisfy the syntactic reflection scheme over T. Löb's theorem says that the only instances of soundness T can prove of itself are the trivially true ones, namely those where T already proves $\varphi$. So yes: if we use ZFC as a metatheory, although we can show that metatheoretic proofs reflect down to provability from ZFC, we cannot show that ZFC provability implies metatheoretic provability without going beyond ZFC.
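The chain of rewrites from Löb's Theorem to the incompleteness statement is easier to check when laid out line by line. In this sketch $\Box A$ is an abbreviation I am introducing for the arithmetized sentence "T proves A" (written $T\vdash\ulcorner A\urcorner$ above), and every line is understood as provable in T.

```latex
% Notation (for this sketch only): \Box A abbreviates "T \vdash \ulcorner A \urcorner".
\begin{align*}
& \Box(\Box A \rightarrow A) \rightarrow \Box A
  && \text{(L\"ob's Theorem)} \\
& \neg\Box A \rightarrow \neg\Box(\Box A \rightarrow A)
  && \text{(contrapositive)} \\
& \neg\Box\neg B \rightarrow \neg\Box(\Box\neg B \rightarrow \neg B)
  && \text{(substitute } \neg B \text{ for } A\text{)} \\
& Con(T+B) \rightarrow Con\bigl(T + \neg(\neg Con(T+B) \rightarrow \neg B)\bigr)
  && (\neg\Box C \text{ is } Con(T+\neg C)) \\
& Con(T+B) \rightarrow Con\bigl(T + B + \neg Con(T+B)\bigr)
  && (\neg(C \rightarrow D) \equiv C \land \neg D)
\end{align*}
```

Taking $B$ to be any theorem of T, the last line reads $Con(T)\rightarrow Con(T+\neg Con(T))$, that is, if T is consistent then T does not prove $Con(T)$.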

So from the start: if we take a Platonist perspective (which I was taught as the most pedagogically effective philosophy to have when learning set theory) then we assume sets in some way or another exist along with the intuitive properties like membership. Then when we list the ZFC axioms (which can be done via some bootstrapping process without need for sets) and we are just saying that sets satisfy these objects. Using our intuitive mathematical reasoning and the axioms we can develop all everyday mathematics, including mathematical logic. Is it fair to say that this intuitive notion of a set and mathematical reasoning is the `most' meta metatheory?

There are a few ways to look at ZFC's status as a metatheory. If we are only interested in provability from computably enumerable first order theories then we can work in arithmetic and get by. There are, however, some unique advantages to working in set theory over arithmetic, especially from a Platonist perspective. Just as it is convenient to have classes over sets, it is beneficial to extend the language of arithmetic with at least some sets, even if only for expressing proofs of statements about arithmetic. There is also the fact that in set theory uncountable sets can be discussed. If ZFC is sound for the sets then the provable statements about uncountable sets will actually be true. There is the interesting issue, sometimes called Skolem's paradox, that because ZFC is a first order theory, by the downward Löwenheim-Skolem theorem it has a model which is countable, so in that model none of the sets ZFC proves to exist is, in a sense, genuinely uncountable.

I believe understanding second order logic and its semantics will help explain why set theory looks like it can talk about uncountable sets, and will also lead to an argument that this allows set theory to simulate being a higher level metatheory than arithmetic. This article is a good overview of second order logic, but for this claim I am only going to use the fact that you can almost turn a second order theory into a first order one by adding class theory over it. The only second order axiom needed to ensure that the classes reflect the second order theory of its object domain is called fullness. Fullness says that for every property of the object domain there exists a class whose members are exactly those objects that satisfy the property. Fullness is not first order expressible; it is second order expressible only in the sense that if we have a full domain of properties to quantify over then we can represent fullness. 
The best we can do with a computable theory is to have a scheme (separation) expressing that each subset definable in the full language of the theory exists in the powerset of the domain. Then, to strengthen the language so that more subsets can be defined, we add an axiom of infinity (all the hereditarily finite sets are already codable in arithmetic). Powerset, union and instances of replacement then allow us to show there are many uncountable infinities, which allow almost any subset we would care about to be defined, and if that is not enough we can extend ZFC with large cardinals to get more. ZFC then becomes a robust fragment of a metatheory for logics beyond first order logic, including second order and higher order logics with full semantics and infinitary logics which allow infinitely large formulas and proofs, as well as model theory in its full generality.

The way all regular mathematics is reducible to sets is that if we can provide some platonic characterization of a structure the size of a set, we should be able to find an isomorphic copy inside the pure sets, and therefore be able to prove statements about the structure in set theory that will then be true platonically of that structure. This does not mean that ZFC can fully reflect this intuitive platonic structure, especially for uncountable structures. Since ZFC is a first order theory we could very well be working inside a countable model. If so, we can use forcing to show that there are properties of infinite sets that hold in some larger countable model with more infinite sets and fail in others. If we are investigating a property at a level of large cardinal strength for which there is a fine structure theory for an inner model, we can get a very robust picture of the independence phenomenon at that level of strength, and answer a lot of questions about what would be true of the platonic sets if they satisfied the property and had no inner models beyond the strength of the fine structural one. 
Ultimately a platonist needs to find some intuitively justified way of dealing with the forcing phenomenon in order to determine whether the platonic sets satisfy statements like the continuum hypothesis.
