11

I'm wondering what exactly is meant when people say "Skolemization preserves satisfiability but not validity". I'm having trouble wrapping my brain around it because I think of Skolemization, when considered as an inference rule, to simply "be an okay thing to do" when working with a second-order formula.

Skolemization is said to preserve satisfiability but not validity. What exactly do satisfiability and validity mean in the context? Does "co-Skolemization" (dual to Skolemization as a second-order equivalence) have similar properties as Skolemization?


Skolemization, as far as I can tell, is based on the equivalence between the first-order formula (and therefore degenerate second-order formula)

$$ \forall x_1 \cdots x_n . \exists y . P(x_1, \cdots, x_n, y) $$

and the following second order formula, that moves the $\exists$ in front of the $\forall$.

$$ \exists f. \forall x_1 \cdots x_n . P(x_1, \cdots, x_n, f(x_1, \cdots, x_n))$$

It makes perfect sense when the expression is thought of as a game. In the first expression, the opponent first makes $n$ moves and then the player makes 1. In the second expression, the player moves first but commits to a response function of sorts rather than committing to a single move. That's not a proof, but it seems like good intuitive justification for why rewriting an expression like this is okay in the first place.

It seems to me that performing this translation at all only makes sense if we're talking about a closed expression. That is, $P(\cdots)$ must have no free variables or, equivalently, $P$ must be an $n+1$-place predicate (primitive or defined).


In propositional logic, satisfiability and validity can be though of as describing what happens to free variables. All variables are free because there's no way to bind them.

$ P \lor P $ is satisfiable because the assignment $ \{ P = \top \} $ satisfies it.

$ P \lor \lnot P $ is a tautology for obvious reasons. I don't know whether it's right to say $ P \lor \lnot P $ is valid.

An inference rule that maps $P$ to $P \lor Q$ is valid (disjunction introduction).

A rule that maps $P$ to $P \land Q$ preserves satisfiability. We can take any old assignment $\mu$ and construct $\mu \cup \{ Q = \top \}$ which satisfies the new expression.

In the context of Skolemization, are the meanings of satisfiability and validity similar?


Let "co-Skolemization" (probably has a real name) be the translation from

$$ \exists x_1 \cdots x_n . \forall y . P(x_1, \cdots, x_n, y) $$

to

$$ \forall f . \exists x_1 \cdots x_n . P(x_1, \cdots, x_n, f(x_1, \cdots, x_n)) $$

That seems just as legitimate as Skolemization, considered as an inference rule in a second-order setting. There's a caveat here that in a first-order theory definitions are provided for all fully saturated combinations of function symbols and arguments. Therefore "co-Skolemization" definitely takes us out of first-order land, but I don't know how relevant that is.

Greg Nisbet
  • 12,281
  • 3
    It's tangential to your question, but I don't understand your co-Skolemization. Why on earth would you need $f$ to take any inputs at all? In the first place, from the original "$\exists forall$" you immediately get "$\forall \exists$"... – user21820 Oct 07 '18 at 04:16
  • 4
    Then you're just plain wrong. As I stated in my first comment, it is a trivial first-order logic fact that "$\exists \forall$" implies "$\forall \exists$", and I don't know why you don't see that. Moreover, the reverse direction also trivially fails, so even satisfiability is not preserved. Consider "$\exists x \forall y ( x \ne y )$". – user21820 Oct 07 '18 at 09:41
  • @user21820 . removed reference to "co-Skolemization". Thanks. – Greg Nisbet Oct 07 '18 at 19:23
  • 1
    @user21820: It appears that the "co-Skolemization" arose as "negate the sentence, Skolemize the negation, and then negate the result of that (including quantifiers binding the Skolem functions)". Since the higher-order Skolemization (which adds $\exists f$ bindings at the front of a sentence) preserves truth values, its dual would do likewise. – hmakholm left over Monica Oct 08 '18 at 01:06
  • @HenningMakholm: Okay so I had made a silly mistake; I saw the obvious first-order implication, and so co-Skolemization was pointless from the point of view of 'simplifying' a theory, but failed to realize that if we do indeed want equi-satisfiability then the second-order form works; it is stronger than the first-order form. Taking the example of my prior comment, "∃x ∀y ( x≠y )" is unsatisfiable, whereas "∀y ∃x ( x≠y )" is satisfiable, but "∀F ∃x ( x≠F(x) )" is unsatisfiable under both full semantics and Henkin semantics with definable comprehension. Thanks a lot for pointing out my error! – user21820 Oct 08 '18 at 05:23
  • 1
    Perhaps it would be good if you can include that bit about the fact that the second-order Skolemization preserves truth, and that removing the outer second-order existential quantifier (and adding the symbol) preserves satisfiability, into your answer, along with the similar explanation about co-Skolemization. @GregoryNisbet, sorry for my error, which I've explained above. We can also view co-Skolemization as the second-order tautology: ∃x[1..k] ∀y ( P(x[1..k],y) ) ≡ ∀F ∃x[1..k] ( P(x[1..k],F(x[1..k])) ). – user21820 Oct 08 '18 at 05:36
  • @user21820 Thanks. I'll restore the language about "co-Skolemization" in the question. – Greg Nisbet Oct 08 '18 at 05:45

4 Answers4

12

When the existentials that are being removed in the process of Skolemization are not preceded y universals, you simply use a new constant for the respective variables.

As such, consider the formula:

$$\exists x (P(a) \lor \neg P(x))$$

This is a valid formula, since in any domain there is always an object that will satisfy the formula $P(a) \lor \neg P(x)$, namely whatever object $a$ refers to.

However, if we Skolemize, we get:

$$P(a) \lor \neg P(b)$$

which is not a valid formula, since there is a domain and interpretation where this statement is false: make sure that $a$ and $b$ refer to different objects, and where the object that $a$ refers to does not have the property expressed by $P$, while the object referred to by $b$ does have that property.

Bram28
  • 103,721
  • Why is it not valid? You can have $a=b$, right? – Kevin Oct 07 '18 at 02:31
  • 5
    @Kevin Check the definition of "valid". It doesn't mean that the formula could be true if you cleverly choose meanings for the basic non-logical symbols (like $P,a,b$) in it. It means that the formula is guaranteed to be true no matter what meanings you give its basic non-logical symbols. – Andreas Blass Oct 07 '18 at 02:35
  • @AndreasBlass: That's a fine explanation, but it should be in the answer. – Kevin Oct 07 '18 at 02:37
  • @Kevin "Valid" being a standard technical term in this context, I think it is appropriate for the answer to assume it. – Noah Schweber Oct 07 '18 at 15:23
  • 3
    I'm new to this subject, so this question might be silly, but the Skolemization as defined by OP is about switching the order of the quantifiers and creating a function for this. I don't see any of these here, so what is happening? Why did the existence quantifier disappear? How do you Skolemize without a forall quantifier? Where is the function? Thank you very much in advance :) – Pedro A Oct 07 '18 at 16:50
  • 1
    @PedroA Remember that constant symbols are just $0$-ary function symbols. The $b$ which is introduced is an $n$-ary function symbol, where $n$ is the number of universal quantifiers preceding the corresponding existential quantifier; since $n=0$ we're just looking at a constant symbol. – Noah Schweber Oct 07 '18 at 18:10
  • @NoahSchweber I see, but still: since the constant is the same thing as a 0-ary function, I don't see why the existential quantifier is dropped. Why 0-ary functions are treated differently? I'd expect $\exists b$ since $b$ is playing the role of $f$. – Pedro A Oct 07 '18 at 22:50
10

A sentence is valid if it is true in every interpretation of its logical language.

A sentence is satisfiable if it is true in some interpetation of its logical language.

Since Skolemization adds new symbols to the logical language, I think it is useful to introduce concepts of relative validity and satisfiability:

Suppose $\mathcal L\subseteq \mathcal L'$ are logical languages, and $\mathfrak A$ is an interpretation of $\mathcal L$. Then,

  • An $\mathcal L'$-sentence is valid relative to $\mathfrak A$ if it is true in every extension of $\mathfrak A$ to an interpretation of $\mathcal L'$.
  • An $\mathcal L'$-sentence is satisfiable relative to $\mathfrak A$ if it is true in some extension of $\mathfrak A$ to an interpretation of $\mathcal L'$.

(Beware that "relative" validity and satisfiability is probably not standard terminology; I made it up for the purpose of this explanation).

Now, when we Skolemize an $\mathcal L$-sentence $\varphi$, we get an extended language $\mathcal L'\supseteq \mathcal L$ and an $\mathcal L'$-sentence $\varphi'$. This trades truth for satisfiability in the sense that

For every interpretation $\mathfrak A$ of $\mathcal L$ it holds that $\varphi$ is true in $\mathfrak A$ if and only if $\varphi'$ is satisfiable relative to $\mathfrak A$.

In other words, if $\varphi$ is true in $\mathfrak A$, then we can find some interpretation of the Skolem functions that makes $\varphi'$ true, and vice versa. But we have no guarantee that every possible interpretation of the Skolem functions will preserve the truth of $\varphi$ -- indeed, usually they won't (as Bram28's counterexample shows).

This means that if we just have $\varphi$ and is not looking at a particular $\mathfrak A$, the Skolemization preserves validity:

  1. $\varphi$ is satisfiable $\iff$
  2. There is some interpretation of $\mathcal L$ where $\varphi$ is true $\iff$
  3. There is some interpretation of $\mathcal L$ which extends to some interpretation of $\mathcal L'$ where $\varphi'$ is true $\iff$
  4. There is some interpretation of $\mathcal L'$ where $\varphi'$ is true $\iff$
  5. $\varphi'$ is satisfiable.

If we try to replicate this argument with "valid" instead of "satisfiable", we get stuck after step 3:

  1. Every interpretation of $\mathcal L$ extends to some interpretation of $\mathcal L'$ where $\varphi'$ is true.

Whereas the original (3) had two "some" that we could collapse to one in (4), here we have a combination of "every" and "some". And that cannot be collapsed to a simple statement about interpretations of $\mathcal L'$ and $\varphi'$ -- in particular it is not equivalent to $\varphi'$ being valid.

5

When you Skolemize, you drop the existential quantifier in front of $f$. Every structure that is a model of the original formula can be extended by providing an interpretation of $f$. Such an extension may or may not be a model of the Skolemized formula. Even though the original formula may be valid, the Skolemized formula doesn't have to be. What is guaranteed is that the two formulae are equisatisfiable.

0

Skolemization for first order formulas merely preserves satisfiability because, in general, the wff will not be true in every interpretation of the expanded language with new non-logical symbols.


Two and a half years ago when I wrote this, I did not really understand what a model was. That's also why this questions makes no reference at all to the semantics of first and second-order logic. The question does talk about the semantics of propositional logic, though, but it's not connected to the rest of the question. At the time, I thought of closed statements in higher order logic as inherently true or false in some kind of absolute sense, so naturally the way to handle free variables was to move to the higher-order setting and add quantifiers to bind all the free variables.

The reason why Skolemization doesn't preserve validity in general is because it introduces new, nonlogical symbols into the language. These are functions, constant symbols, and relations. If I skolemize $ \exists x \mathop. \phi(x) $ and get $\phi(c)$ where $c$ is a fresh constant, then it is not the case that an arbitrary value of $c$ will preserve the truth of the statement.

Additionally, Skolemization is not thought of as taking a first order formula with some non-logical symbols, translating it into an equivalent second-order formula by adding explicit quantifiers for the non-logical symbols and making them bound variables, transforming the second order formula in a truth-preserving way, and then converting it back to a first order formula.

Although, if you do view it though that lens and do translate the first-order sentences to their corresponding second-order sentences without non-logical symbols, (first order) Skolemization preserving validity would imply the following, which is not true.

$$ \textbf{WRONG:}\;\; \forall x_1 \cdots x_n \mathop. \exists y \mathop. \phi(x_1, \cdots, x_n, y) \;\; \text{implies} \;\; {\huge\forall} f \mathop. \forall x_1 \cdots x_n \mathop.\phi(x_1,\cdots,x_n, f) $$

Rather, what we do have is the following:

$$ \forall x_1 \cdots x_n \mathop. \exists y \mathop. \phi(x_1, \cdots, x_n, y) \;\; \text{implies} \;\; \exists f \mathop. \forall x_1 \cdots x_n \mathop.\phi(x_1,\cdots,x_n, f) $$

The left-hand side is presumed to be valid because that's what it means to take it as a premise and $\phi$ might have free variables. The right hand side, however, potentially has free variables in $\text{FV}(\varphi) \setminus \{f\}$. Those free variables are implicitly universally quantified because the implication as a whole is valid in second order logic, but $f$ is explicitly existentially quantified. Thus Skolemization preserves satisfiability (assuming that the domain is non-empty). We have to give up the implicit universal quantification for the variables in $\text{FV}(\varphi) \setminus \{f\}$ and downgrade it to implicit existential quantification.

Greg Nisbet
  • 12,281