18. Beyond First-Order Logic
We have explored propositional logic, first-order logic, and axiomatic first-order theories like linear and integer arithmetic. These form the basis for several automated reasoning tools used today, including SAT solvers, SMT solvers, and first-order theorem provers.
In a sense, we never need to go beyond first-order logic. Many mathematicians take Zermelo-Fraenkel set theory with the axiom of choice (ZFC), an axiomatic theory based on first-order logic, to be the official foundation of mathematics. In principle, we can encode any mathematical statement in ZFC and search for a proof using a first-order theorem prover. The line between logic and mathematics is not sharp, but many philosophers take first-order logic to be the embodiment of general logical reasoning, whereby additional axioms provide the rest of mathematics.
Nonetheless, there are often practical reasons to add more syntax and rules to our logical systems and to give these special standing in our implementations. Doing so gives us better means to isolate decidable fragments of mathematical reasoning, and to design efficient decision procedures and search procedures based on the additional structure of the logics. It provides more expressive power to proof assistants, and it enables us to design better mechanisms to support interactive proof.
There is a tradeoff: making a logic more powerful and more expressive generally makes it harder to automate. One common strategy is to choose a powerful and expressive logic as the basis for a proof assistant like Lean, and, within that framework, invoke automation built on more restricted logics to carry out focused reasoning tasks.
Lean’s underlying logic is a version of dependent type theory, which we will not discuss in detail here. You can learn more about dependent type theory from Theorem Proving in Lean or a textbook like Avigad, Mathematical Logic and Computation. This chapter takes a complementary, bottom-up approach to exploring extensions of first-order logic by describing specific extensions one at a time. All of them are subsumed by Lean’s logic, and we can therefore use Lean to illustrate the syntax.
18.1. Sorts
We have already seen that Lean’s foundation includes propositional logic and first-order logic.
variable (P Q R : Prop)
#check P ∧ Q
#check P ∨ Q
#check P → Q
#check True
#check False
#check ¬ P
#check P ↔ Q
variable (U : Type)
-- function symbols
variable (f : U → U) (g : U → U → U)
-- a unary predicate and a binary relation
variable (S : U → Prop) (T : U → U → Prop)
#check ∀ x, S x → ∃ y z, y ≠ z ∧ T y (g x z)
In first-order logic, as we have presented it, all the variables and quantifiers range over a single universe of objects. In many applications, it is useful to have different universes, or sorts, of objects. For example, in geometry, we might want separate variables ranging over points, lines, and planes. In many-sorted first-order logic, we can have different sorts of objects. Function symbols and relation symbols are specified with not only their arities but also the types of arguments they can have. We can easily model this in Lean by using type variables to represent the different sorts.
variable (U1 U2 U3 : Type)
variable (h : U1 → U1 → U2)
variable (Q : U1 → U3 → Prop)
In this example, h is a function that takes two arguments of sort U1 and returns a value of sort U2, and Q is a binary relation between U1 and U3.
In this way, for example, we can express the relationship of a point being on a line,
and write down axioms that express that any two distinct points determine a unique line.
variable (Point Line Plane : Type)
variable (onl : Point → Line → Prop)
variable (ax1 : ∀ p q : Point, ∃ L : Line, onl p L ∧ onl q L)
variable (ax2 : ∀ p q : Point, ∀ L M : Line, p ≠ q → onl p L → onl q L →
  onl p M → onl q M → L = M)
We have already seen that SMT solvers distinguish between different sorts of objects, like integers, real numbers, and bitvectors. Many first-order provers have native support for sorted reasoning as well. Many-sorted logic is only a mild extension of first-order logic, since we can instead use predicates and relativization to distinguish different types of objects. For example, the statements of geometry above could be represented in ordinary first-order logic with additional predicates \(\fn{Point}(x)\), \(\fn{Line}(x)\), and \(\fn{Plane}(x)\).
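As an illustration, here is how the first geometry axiom might look after relativizing to a single universe. (This is a sketch; the names U, IsPoint, IsLine, and onl' are illustrative, chosen to avoid clashing with the declarations above.)

```lean
variable (U : Type)
variable (IsPoint IsLine : U → Prop)
variable (onl' : U → U → Prop)

-- ax1, relativized: the quantifiers range over the single universe U,
-- restricted by the predicates IsPoint and IsLine
#check ∀ p q, IsPoint p → IsPoint q →
  ∃ L, IsLine L ∧ onl' p L ∧ onl' q L
```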
18.2. Function Types and Product Types
In first-order logic, we have function symbols, but we do not have variables that range over functions. We obtain such a system by adding function types to our logic.
#check Nat → Bool
#check (Nat → Nat) → Nat
variable (f : Nat → Nat) (g : Nat → Bool) (h : (Nat → Nat) → Nat)
In this example, we start with basic types Nat and Bool and declare a function f from Nat to Nat, a function g from Nat to Bool, and a function h that takes a function from Nat to Nat as an argument and returns a Nat.
A system known as the simply typed lambda calculus provides two fundamental rules for building complex expressions:

- Function abstraction (also known as lambda abstraction): If x is a variable of type A and t is an expression of type B, then fun x : A => t (also written λ x : A => t) is an expression of type A → B.
- Function application: If s is an expression of type A → B and t is an expression of type A, then s t is an expression of type B.
-- function abstraction
#check fun x : Nat => x
#check λ x : Nat => x -- alternate notation
-- function application
#check f 3
#check h f
The two operations are related by an equation known as the beta reduction rule, which says that the result of applying fun x => s to t is equal to s[t/x], where s[t/x] denotes the result of substituting t for x in s.
example : (fun x => f (f x)) (f 3) = f (f (f 3)) := rfl
Some people take the simply-typed lambda calculus to include product types as well, with a pairing operation and projections. The corresponding reduction then says that pairing two objects and taking the first or second projection returns the first or second object, respectively.
variable (p : Nat × Nat)
#check (3, 5)
#check p.1
#check p.2
example : (3, 5).1 = 3 := rfl
As a logical system, the simply typed lambda calculus is usually presented as a purely
equational calculus, that
is, a system for deriving equations s = t
using equality rules and axioms.
You can think of the simply typed lambda calculus as a simple programming language
together with a means for reasoning about it.
One can extend the calculus, for example, with recursive definitions on the natural numbers
or if-then-else expressions based on a boolean conditional.
One can show that any expression can be reduced to a unique normal form by applying reduction rules until no more reductions are possible. As a result, one can decide whether two expressions are provably equal by determining whether they have the same normal form.
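Lean’s rfl illustrates this: an equation is accepted when both sides reduce to the same normal form. A minimal sketch, using Lean’s built-in arithmetic and boolean conditional:

```lean
-- two beta reductions to a common normal form
example : (fun x : Nat => (fun y : Nat => x + y) 3) 4 = 7 := rfl

-- with a boolean conditional
example : (fun b => if b then 1 else 2) true = 1 := rfl
```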
In the simply typed lambda calculus, one can represent relations as functions that return an element of Bool, but the built-in equality relation is not represented that way, nor does the simply typed lambda calculus include quantifiers.
Extending the simply typed lambda calculus to include equality and quantifiers as
term-forming operations gives rise to a much more expressive system known as simple type theory.
This is, in turn, a version of higher-order logic, which we turn to next.
18.3. Higher-Order Logic
First-order logic allows us to quantify over objects in the domain of discourse, but it does not allow us to quantify over functions, predicates, or relations. If we could do that in a theory of arithmetic, for example, we could express the principle of induction on the natural numbers as follows:

\[ \forall P\, (P(0) \land \forall x\, (P(x) \to P(x + 1)) \to \forall x\, P(x)) \]

This says that any property \(P\) of natural numbers that holds for 0 and is preserved by adding one holds of all the natural numbers. Without the ability to quantify over \(P\), the best we can do is state an induction principle for each particular formula \(A(x)\) separately.
Similarly, if we can quantify over functions, we can say that the universe is infinite by saying that there exists an injective function from the universe to itself that is not surjective. This is something we cannot say directly in first-order logic.
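In Lean, which subsumes this kind of quantification, the statement can be written and proved directly. A sketch, using the successor function as the witness:

```lean
-- the universe Nat is infinite: succ is injective but not surjective
example : ∃ f : Nat → Nat,
    (∀ x y, f x = f y → x = y) ∧ ¬ (∀ y, ∃ x, f x = y) :=
  ⟨Nat.succ, fun _ _ h => Nat.succ.inj h,
    fun h => match h 0 with
      | ⟨x, hx⟩ => Nat.succ_ne_zero x hx⟩
```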
One option is to extend first-order logic with sorts for functions and predicates on the universe, and add variables ranging over those sorts. That gives rise to second-order logic. For unary predicates, for example, we would add an axiom that says that every formula \(A(x)\) gives rise to a predicate:

\[ \exists P\, \forall x\, (P(x) \leftrightarrow A(x)) \]

This is known as a comprehension axiom.
Notice that \(A(x)\) may involve second-order quantifiers as well, so, in general, it is impossible to interpret predicates as ranging only over formulas. For example, if \(A\) has a universal quantifier over predicates, then that quantifier ranges over all predicates, including the one \(P\) that is defined by the axiom.
Having added functions and predicates on objects, we might then want to add
functions and predicates that take functions and predicates as arguments, and so on.
Iterating this process gives rise to higher-order logic. The language of simple type theory provides a convenient way to do this. If we define a type of propositions, Prop, we can take the propositional connectives to be functions on Prop, and we can take the equality symbol to return values in Prop. We can take the universal and existential quantifiers to be higher-order functions that take predicates to propositions:
variable (U : Type) (P : U → Prop)
#check (Exists : (U → Prop) → Prop)
example : (∃ x, P x) = Exists P := rfl
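In the same spirit, the propositional connectives and equality can be checked as functions into Prop. A minimal sketch using Lean’s built-in constants:

```lean
variable (U : Type)

#check (And : Prop → Prop → Prop)
#check (Or : Prop → Prop → Prop)
#check (Not : Prop → Prop)
#check (Eq : U → U → Prop)
```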
(In Lean’s foundation, the existential quantifier is treated this way, but the universal quantifier is not; the latter is more integrally a part of the logical foundation.) The comprehension axiom is now subsumed by function abstraction:
variable (f g : U → U) (h : U → U → U)
#check (fun x : U => ∀ y, ∃ z, f (f y) = h x (g z) : U → Prop)
In other words, if A is any expression of type Prop, possibly involving the variable x, the expression fun x => A is the predicate that, when applied to an expression t, asserts that A holds of t.
There is a duality between sets S of elements of a type U and predicates P : U → Prop: for any set S, we can define the predicate fun x => x ∈ S, and for any predicate P, we can define the set {x | P x}. For that reason, the philosopher W. V. O. Quine called second-order logic “set theory in sheep’s clothing,” by which he meant that higher-order reasoning is really set-theoretic reasoning in disguise.
That makes higher-order logic very powerful. If we add a basic type of natural numbers, for example,
higher-order logic is strong enough to interpret a substantial amount of mathematics.
This expressivity comes at a cost, however. To start with, there is no axiomatic proof system that is complete for even second-order logic, under the standard semantics where a quantifier over predicates is interpreted as ranging over all predicates on the universe. There is a precise technical sense in which second-order logic is more undecidable than first-order logic, and higher-order logic is even more undecidable than that. Even fundamental logical operations that are straightforwardly computable in the first-order setting can become uncomputable in the higher-order setting. For example, we have seen that there is a straightforward procedure for determining whether two first-order terms are unifiable. In contrast, the unification problem for second-order terms is already undecidable.
Because functions, predicates, and relations are so fundamental to mathematical reasoning, it often makes sense to consider them part of the underlying logic. Provers like Vampire have built-in support for higher-order reasoning. Theory tells us that there is no complete search procedure for certifying even second-order validity, but that need not dissuade us from trying to design search procedures that capture common patterns of inference, and the expressivity of higher-order logic is often useful in practice.
18.4. Inductive Types
Another way to extend a logic is to build in a notion of an inductively defined data type, or inductive type for short. An inductive type is a data type generated by constructors, where “generated by” means, first, that applying a constructor to an element of a type gives a new element, and second, that every element of the type is obtained in this way. The latter condition justifies the principles of induction and recursion for the inductive type.
You have already seen inductive data types informally, in
Chapter 2, and formally, in Lean, in
Chapter 13.
The archetypal example of an inductive type is the natural numbers, Nat, generated by an element zero : Nat and a function succ : Nat → Nat.
In Lean, it is declared as follows:
inductive Nat : Type where
| zero : Nat
| succ : Nat → Nat
This declaration gives rise to the principles of induction and recursion that we saw in Chapter 13. Here is the analogous definition of a binary tree:
inductive BinaryTree where
| empty : BinaryTree
| node : BinaryTree → BinaryTree → BinaryTree
We have already seen the induction and recursion principles play out formally in Chapter 13.
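As a small illustration, recursion and induction can be put to work on both types. This is a sketch in a fresh file, assuming Lean’s built-in Nat; the names double and size are illustrative:

```lean
-- recursion on the constructors of Nat
def double : Nat → Nat
  | 0 => 0
  | n + 1 => double n + 2

-- a proof by induction on Nat
example (n : Nat) : double n = n + n := by
  induction n with
  | zero => rfl
  | succ n ih => simp only [double, ih]; omega

-- recursion on the constructors of BinaryTree
def size : BinaryTree → Nat
  | .empty => 0
  | .node l r => size l + size r + 1

example : size (.node .empty .empty) = 1 := rfl
```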
The data type of lists of elements of another type α is another example of an inductive type, generated by a constructor for the empty list and, for each a : α, a constructor cons a that takes a list l and returns a new list with a as its head and l as its tail.
inductive List (α : Type) where
| nil : List α
| cons : α → List α → List α
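Recursion on this declaration follows the same pattern. A sketch computing the length of a list; the name len is illustrative, chosen to avoid clashing with the built-in List.length:

```lean
def len {α : Type} : List α → Nat
  | .nil => 0
  | .cons _ l => len l + 1

example : len (.cons 1 (.cons 2 .nil)) = 2 := rfl
```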
Different logical systems allow different forms of inductive types. In Lean, a constructor can take a function as an argument. The example CBTree below is a type of countably branching trees; given a sequence of trees indexed by the natural numbers, the constructor sup returns a new tree that has a node at the root and children given by the sequence.
inductive CBTree where
| leaf : CBTree
| sup : (Nat → CBTree) → CBTree
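Recursion works for CBTree as well, through the function argument. A sketch; the name chain is illustrative:

```lean
-- a tree of height n, each of whose nodes has countably many children
def chain : Nat → CBTree
  | 0 => CBTree.leaf
  | n + 1 => CBTree.sup (fun _ => chain n)

-- a tree whose n-th child is chain n
#check CBTree.sup chain
```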
The logical strength of adding inductive types depends on the nature of inductive types allowed and the other logical constructions and rules. Because induction and recursion are so fundamentally useful, it makes sense to give them special treatment in a system for interactive theorem proving or automated reasoning, rather than, say, simply adding axiomatized constants. For example, SMT solvers like cvc5 and Z3, as well as first-order provers like Vampire, have built-in support for reasoning about inductive types.
18.5. Dependent Types
We have seen that one can extend first-order logic by adding sorts,
which are basic, unaxiomatized data types, as well as function types, product types, and
inductive types.
Dependent type theory goes one step further by making the type system more expressive.
What makes dependent type theory “dependent” is that data types depend on parameters.
For example, the type List α of lists of elements of type α depends on a variable α that can be instantiated in various ways.
#check List Nat
#check List (Nat × Nat)
#check List (List Nat)
In this case, the List type is parameterized by a data type α. Types can also be parameterized by mathematical objects like numbers. For example, in Lean, the type Vector α n of vectors of length n of elements of type α depends on both the type α and the number n. Similarly, Fin n, the type of natural numbers less than n, depends on the parameter n.
variable (n : Nat)
#check Vector (List Nat) 5
#check Vector Nat (n + 3)
#check Fin n
#check Fin (2*n + 5)
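Dependency also shows up in functions whose return type depends on their argument. A sketch using Lean’s built-in Fin; the name last is illustrative:

```lean
-- the return type Fin (n + 1) depends on the argument n
def last (n : Nat) : Fin (n + 1) := ⟨n, Nat.lt_succ_self n⟩

#check last 7  -- : Fin 8
```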
In Lean’s type theory, all of the following kinds of expressions are on the same footing:

- data types, like Nat : Type and Bool : Type
- objects of those data types, like 5 : Nat and true : Bool
- propositions, like 2 + 2 = 4 : Prop and ∀ x : Nat, ∃ y : Nat, y > x
- proofs of those propositions, like rfl : 2 + 2 = 4.
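All of these can be checked side by side in Lean:

```lean
#check (Nat : Type)
#check (5 : Nat)
#check (2 + 2 = 4 : Prop)
#check (rfl : 2 + 2 = 4)
```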
Type dependency means that all these components can be mixed in a single expression. The syntactic details are complicated, and, as a result, dependent type theory is hard to automate. But the expressive power turns out to be very useful for interactive theorem proving, including formalization of mathematics and hardware and software verification. As a result, it provides a framework in which the methods we have explored in this course can be put to good use.