11. Implementing First-Order Logic

Our implementation of first-order logic in Lean is similar to our implementation of propositional logic in Lean, covering both the syntax and the semantics. We will also show how to implement unification, and algorithm that is fundamental to the mechanization of first-order reasoning.

11.1. Terms

Our implementation of terms is straightforward.

inductive FOTerm
  | var : String  FOTerm
  | app : String  List FOTerm  FOTerm
  deriving Repr, Inhabited, BEq

A term is either a variable or a function symbol applied to a list of terms. We have defined syntax for FOTerm:

def ex1 := term!{ %x }
def ex2 := term!{ c }
def ex3 := term!{ f(f(a, %x), f(g(c, f(%y, d)), b)) }

#print ex1
#print ex2
#print ex3

#eval ex1
#eval ex2
#eval ex3

The notation %x is used for a variable. Notice that a constant like c is represented as an application of the symbol to the empty list. Notice also that the definition does nothing to check the arity of the function symbols. Ordinarily, first-order logic allows us to specify that f and g are binary functions and that another function, h, is unary. Our definition of FOTerm allows the application of any string to any number of arguments. This simplifies a number of aspects of the implementation. As an exercise, you might want to write a function well-formed in Lean that, given a specification of a list of symbols and their arities, checks that an FOTerm uses only those symbols and with the correct arity. Later in the course, we will talk about systems more expressive than first-order logic that provide other ways of specifying a function’s intended arguments. Lean’s type system provides a very elaborate and expressive means for doing so, and you can think of the specification of arities in first-order logic as being a minimal form of a typing judgment.

Remember that to evaluate a first-order language, we need an assignment of values to the variables, as well as interpretations of the function and relation symbols. Since our symbols are identified as strings, in general an interpretation of symbols is an assignment of values to strings:

def FOAssignment α := String  α

Any function definable in Lean can serve this purpose. Keep in mind that we have to fix a type α, corresponding to the universe of the structure in which we carry out the interpretation.

Since it is often useful to specify an assignment by giving a finite list of values, we have implemented syntax for that:

#eval assign!{x  3, y  5, z  2}

You can type the symbol as \mapsto. Formally, the notation produces an association list, essentially just a list of key / value pairs. But we have also told Lean how to coerce such an association list to an FOAssignment when necessary. The following examples provide a few different Lean idioms for specifying that an assign! expression should be interpreted as an FOAssignment. (It should also happen automatically when you pass such an expression as an argument to a function that expects an FOAssignment.)

def assign1 : FOAssignment Nat := assign!{x  3, y  5, z  2}
#check assign1
#eval  assign1 "x"

#check (assign!{x  3, y  5, z  2} : FOAssignment Nat)

#check @id (FOAssignment Nat) assign!{x  3, y  5, z  2}

#check show FOAssignment Nat from assign!{x  3, y  5, z  2}

#check let this : FOAssignment Nat := assign!{x  3, y  5, z  2}
       this

It is now easy to define substitution for terms. Such a function should take a term and an assignment of terms to variables, and replace the variables by the assigned terms.

partial def subst (σ : FOAssignment FOTerm) : FOTerm  FOTerm
  | var x   => σ x
  | app f l => app f (l.map (subst σ))

Here we try it out:

#eval ex3.subst assign!{x  term!{h(a)}, y  term!{f(a, h(a), d)}}

11.2. Evaluating terms

To evaluate a term, we need not only an assignment of values to the variables occurring in the term, but also an interpretation of all the function symbols. Setting aside concerns about arities, we can interpret any function taking some number of elements of α to α as an element of type List α → α.

/-- an interpretation of function symbols -/
def FnInterp α := FOAssignment (List α  α)

-- coerces an association list to a function
instance [Inhabited α] : Coe (AssocList String (List α  α)) (FnInterp α) :=
fun l => l.getA

If a function is intended to be used as a binary function, we really care about the interpretation when it is applied to lists of length two. In our quick-and-dirty implementation, we have to define values for lists of other lengths, but any values will do. For example, we can define an interpretation of constants and functions on the natural numbers as follows:

def arithFnInterp : FnInterp Nat
  | "plus"  => fun l => l.getA 0 + l.getA 1
  | "times" => fun l => l.getA 0 * l.getA 1
  | "zero"  => fun _ => 0
  | "one"   => fun _ => 1
  | "two"   => fun _ => 2
  | "three" => fun _ => 3
  | _       => fun _ => default

Or, alternatively:

def arithFnInterp' : FnInterp Nat :=
assign!{
  plus  fun l : List Nat => l.getA 0 + l.getA 1,
  times  fun l => l.getA 0 * l.getA 1,
  zero  fun _ => 0,
  one  fun _ => 1,
  two  fun _ => 2,
  three  fun _ => 3 }

With FnInterp in place, it is easy to define a function that evaluates terms:

/-- evaluate a term relative to a variable assignment -/
partial def eval {α} [Inhabited α] (fn : FnInterp α) (σ : FOAssignment α) : FOTerm  α
  | var x   => σ x
  | app f l => fn f (l.map (eval fn σ))

Even though the function always terminates, Lean 4 is not yet able to prove termination automatically, so we use the keyword partial. Let’s try it out.

def arith_ex1 := term!{ plus(times(%x, two), plus(%y, three)) }

#eval arith_ex1.eval arithFnInterp assign!{ x  3, y  5 }
#eval arith_ex1.eval arithFnInterp assign!{ x  7, y  11 }

When we talked about propositional logic, we proved a theorem that says that evaluating the result of a substitution is the same as evaluating the original formula relative to an assignment of the values of the substituted formula. In the context of terms, the identity is as follows:

\[\tval{t[s/x]}_\sigma = \tval{t}_{\sigma[x \mapsto \tval{s}_\sigma]}.\]

The proof is essentially the same. Our current implementation is more general in that it allows us to substitute multiple terms at once, but we can see the principle at play in the fact that the two evaluations below produce the same answer.

def arith_ex2 := term!{ plus(one, times(three, %z))}

def arith_ex3 := term!{ plus(%z, two) }

-- these two should give the same result!

#eval (arith_ex1.subst
        assign!{ x  arith_ex2, y  arith_ex3 }).eval
        arithFnInterp assign!{z  7}

#eval arith_ex1.eval arithFnInterp
  assign!{ x  (arith_ex2.eval arithFnInterp assign!{z  7}),
           y  (arith_ex3.eval arithFnInterp assign!{z  7}) }

And here is another crazy idea: we can view substitution as the result of evaluating a term in a model where the universe consists of terms, and where each function symbol f is interpreted as the function “build a term by applying f.”

def TermFnInterp : FnInterp FOTerm := FOTerm.app

def FOTerm.subst' := eval TermFnInterp

-- the same!
#eval arith_ex1.subst assign!{ x  arith_ex2, y  arith_ex3 }
#eval arith_ex1.subst' assign!{ x  arith_ex2, y  arith_ex3  }

You should think about what is going on here. Such a model is known as a term model.

11.3. Formulas

Since the universe of a first-order model may be infinite, we would not expect to be able to evaluate arbitrary first-order formulas in an arbitrary model. Evaluating quantifiers in our model of arithmetic would require testing instantiations to all the natural numbers. If we could do that, we could easily settle the truth of the sentence

\[\fa x \ex y y > x \land \fn{prime}(y) \land \fn{prime}(y + 2),\]

which says that there are infinitely many values \(y\) such that \(y\) and \(y + 2\) are both prime. This is known as the twin primes conjecture, and it is a major open question in number theory.

We can, however, evaluate quantifiers over finite universes. In the definition of a first-order model below, we assume that the universe is given by a finite list of values.

/-- an interpretation of relation symbols -/
def RelInterp α := FOAssignment (List α  Bool)

structure FOModel (α : Type) where
  (univ : List α)
  (fn : FnInterp α)
  (rel : RelInterp α)

In our quick-and-dirty implementation, we don’t require that the universe univ is closed under the functions. In other words, it’s possible that we can apply a function to some elements on the list univ and get something that isn’t in the list. It wouldn’t be hard to write a procedure that checks that, given a finite list of functions and their intended arities. (A more efficient way of handling this is instead to prove that the functions all return values in universe, using Lean’s theorem proving capabilities.) In the examples in this section, we won’t use function symbols at all, except for some constants that are clearly o.k.

To handle the quantifiers, we need a procedure that takes an assignment \(\sigma\) and produces an updated assignment \(\sigma[x \mapsto v]\).

def FOAssignment.update (σ : FOAssignment α) (x : String) (v : α) : FOAssignment α
  | y => if y == x then v else σ y

With that in hand, the evaluation function is entirely straightforward.

def FOForm.eval {α} [Inhabited α] [BEq α]
    (M : FOModel α) (σ : FOAssignment α) : FOForm  Bool
  | eq t1 t2 => t1.eval M.fn σ == t2.eval M.fn σ
  | rel r ts =>  M.rel r (ts.map <| FOTerm.eval M.fn σ)
  | tr => true
  | fls => false
  | neg A => !(eval M σ A)
  | conj A B => (eval M σ A) && (eval M σ B)
  | disj A B => (eval M σ A) || (eval M σ B)
  | impl A B => !(eval M σ A) || (eval M σ B)
  | biImpl A B => (!(eval M σ A) || (eval M σ B)) && (!(eval M σ B) || (eval M σ A))
  | ex x A => M.univ.any fun val => eval M (σ.update x val) A
  | all x A => M.univ.all fun val => eval M (σ.update x val) A

Let’s try it out on a baby model of arithmetic that only has the numbers 0 to 9 in the universe. We reuse the function interpretation arithFnInterp from before, so that we have the constants zero, one, two, and three. The interpretation also gives us addition and multiplication, but we won’t use those. As far as relations, we interpret two: the binary less-than relation lt and the predicate even. We also define the trivial assignment which assigns 0 to all the variables.

def babyArithMdl : FOModel Nat where
  univ := List.range 10  /- 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 -/
  fn   := arithFnInterp
  rel  := assign!{
            lt  fun l : List Nat => if l.getA 0 < l.getA 1 then true else false,
            even  fun l : List Nat => l.getA 0 % 2 == 0 }

def trivAssignment : FOAssignment Nat := fun _ => 0

We can try it out:

#eval fo!{even(%x)}.eval babyArithMdl assign!{x  5}
#eval fo!{even(%x)}.eval babyArithMdl assign!{x  6}
#eval fo!{ y. lt(%x, %y)}.eval babyArithMdl assign!{x  8}
#eval fo!{ y. lt(%x, %y)}.eval babyArithMdl assign!{x  9}

It’s an unpleasant feature of our syntax that we have to put a percent sign in front of variables in order to distinguish them from constants, while we do not use the percent sign with the quantifiers. To facilitate testing sentences, we write a simple function testeval that evaluates any formula in our model with respect to the trivial variable assignment.

def FOForm.testeval (A : FOForm) : Bool := A.eval babyArithMdl trivAssignment

#eval fo!{even(two)}.testeval
#eval fo!{even(three)}.testeval
#eval fo!{ x. even(%x)}.testeval
#eval fo!{ x. even(%x)}.testeval

#eval fo!{ x. lt(%x, two)  even(%x)}.testeval
#eval fo!{ x.  y. lt(%x, %y)  lt(%y, two)  even(%x)  even(%y)}.testeval
#eval fo!{ x. even(%x)  lt(%x,two)  %x = zero}.testeval
#eval fo!{ x. even(%x)  lt(%x,three)  %x = zero}.testeval
#eval fo!{ x. even(%x)  lt(%x,three)  %x = zero  %x = two}.testeval

#eval fo!{ x.  y. lt(%x, %y)}.testeval
#eval fo!{ x. even(%x)   y. lt(%x, %y)}.testeval
#eval fo!{ x. ¬ even(%x)   y. lt(%x, %y)}.testeval

A software package called Tarski’s World by Jon Barwise and John Etchemendy offers a fun way of experimenting with first-order logic. It allows users to lay down blocks on an \(8 \times 8\) grid and evaluate sentences about them. Each block is either a tetrahedron, a cube, or a dodecahedron, and each is either small, medium, or large. The language includes predicates for these, as well as relations like FrontOf(x, y) and Between(x, y, z). The second one of these holds if and only if x, y, and z are either in the same row, in the same column, or on the same diagonal, and x is between y and z.

The file TarskisWorld in the Examples folder includes a simple implementation in Lean. You can define a world and display it:

def myWorld : World := [
  tet, medium, 0, 2⟩,
  tet, small, 0, 4⟩,
  cube, small, 4, 4⟩,
  cube, medium, 5, 6⟩,
  dodec, small, 7, 0⟩,
  dodec, large, 7, 4 ]

#eval myWorld.show

#3d_world myWorld

/-
-----------------------------------------
| D- |    |    |    | D+ |    |    |    |
-----------------------------------------
|    |    |    |    |    |    |    |    |
-----------------------------------------
|    |    |    |    |    |    | C  |    |
-----------------------------------------
|    |    |    |    | C- |    |    |    |
-----------------------------------------
|    |    |    |    |    |    |    |    |
-----------------------------------------
|    |    |    |    |    |    |    |    |
-----------------------------------------
|    |    |    |    |    |    |    |    |
-----------------------------------------
|    |    | T  |    | T- |    |    |    |
-----------------------------------------
-/

Here a plus symbol means that the object is large, a minus symbol means that it is small, and no symbol means that it is of medium size. You can then evaluate statements about the world:

#eval myWorld.eval fo!{ x.  y.  z. Between(%x, %y, %z)}
#eval myWorld.eval fo!{ x.  y.  z. Cube(%x)  Between(%x, %y, %z)}
#eval myWorld.eval fo!{ x.  y.  z. Dodec(%x)  Between(%x, %y, %z)}
#eval myWorld.eval fo!{ x. Small(%x)}
#eval myWorld.eval fo!{ x. Small(%x)  Cube(%x) }
#eval myWorld.eval fo!{ x.  y. Cube(%x)  Tet(%y)  FrontOf(%x, %y)}
#eval myWorld.eval fo!{ x.  y. Cube(%x)  Dodec(%y)  FrontOf(%x, %y)}
#eval myWorld.eval fo!{ x. Tet(%x)   y. Cube(%y)  RightOf(%y, %x) }
#eval myWorld.eval fo!{ x. Dodec(%x)   y. Tet(%y)  RightOf(%y, %x) }

The file TWExamples provides two puzzles, taken from Tarski’s World, for you to try.

11.4. Unification

Suppose we are working with a language that is meant to describe the real numbers and we have either proved or assumed the following sentence:

\[\fa {x, y, z} x < y \to x + z < y + z.\]

Suppose also that we are trying to prove

\[ab + 7 < c + 7.\]

Even though we haven’t talked about proof systems for first-order logic yet, it should be clear that we want to instantiate the universal quantifiers in the sentence by substituting \(ab\) for \(x\), \(c\) for \(y\), and \(7\) for \(z\), so that the conclusion matches the goal.

You probably did not have to think about this much. You identified the problem as that of finding a substitution for \(x\), \(y\), and \(z\) that has the effect of making the expression \(x + z < y + z\) the same as \(ab + 7 < c + 7\). This is what is known as a matching problem for first-order terms. This kind of pattern matching is fundamental to reasoning, and hence also fundamental to logic. The general formulation of the problem is as follows: we are given pairs of terms \((s_1, t_1), (s_2, t_2), \ldots, (s_n, t_n)\) and we are looking for a substitution \(\sigma\) with the property that for every \(i\), \(\sigma \, s_i = t_i\).

The generalization of the matching problem in which we are allowed to substitute for the variables in the terms \(t_i\) as well as the terms \(s_i\) is known as unification. In other words, the input to a unification problem is the same as before, but now we are looking for a substitution \(\sigma\) with the property that for every \(i\), \(\sigma \, s_i = \sigma \, t_i\). For example, consider the following two expressions:

\[f(x, f(x, a)) < z \quad\quad f(b, y) < c\]

If we substitute \(b\) for \(x\), \(f(b, a)\) for \(y\), and \(c\) for \(z\), both expressions become \(f(b, f(b, a)) < c\). When we discuss resolution proofs for first-order logic, we will see that such unification problems come up in a context where are are trying to prove a contradiction and we have derived something like \(\fa {x, z} f(x, f(x, a)) < z\) and \(\fa y f(b, y) \not< c\). The unification problem tells us how to instantiate the universal quantifiers in order to prove a contradiction.

To prevent confusion in the future, we would like to point out that what counts as a variable or a constant depends on the context. For example, we can ask for an assignment to \(x\) and \(y\) that unifies \(f(x, z)\) and \(f(z, y)\) without assigning anything to \(z\). In that case, we are treating \(x\) and \(y\) as variables and \(z\) as a constant. In Lean, the variables \(x\) and \(y\) that can be assigned in a unification problem are sometimes called metavariables and written as \(?x\) and \(?y\). For simplicity, we will continue to use letters like \(x\), \(y\), and \(z\) for variables and letters like \(a\), \(b\), and \(c\) as constants. What is important when specifying a unification problem is simply that it is clear which symbols play the role of variables and which have to remain fixed.

There is a linear-time algorithm that solves any unification problem or determines that there is no solution, and, in fact produces a most general unifier, or mgu. A most general unifier \(\sigma\) for a problem is a substitution that unifies each pair in the list and has the property that if \(\tau\) is any unifier for the problem, then \(\tau\) can be expressed as the result of following \(\sigma\) by another substitution. Here we will describe such an algorithm, but we will not go so far as to prove its correctness or show that it produces a most general unifier. Our implementation is adapted from one by John Harrison in his Handbook of Practical Logic and Automated Reasoning, which is an excellent reference for most of the topics we cover in this book, and a lot more. In particular, it proves that the algorithm we describe below always terminates with a most general unifier if the pairs can be unified at all, and fails with none otherwise. There is also a nice presentation of unification in the book Term Rewriting and All that by Franz Baader and Tobias Nipkow.

Some unification problems are easy. To unify \(x\) and \(f(a, z)\), we simply assign \(x\) to \(f(a, z)\). To be a most general unifier, it is important that the resulting term, \(f(a, z)\) has the variable z. We can also solve the problem by assigning \(f(a, a)\) to \(x\) and \(a\) to \(z\), but that is less general. It can be seen as the result of mapping \(x\) to \(f(a, z)\) and then applying another substitution that maps \(z\) to \(a\).

For another easy example, we can unify \(f(x, b)\) and \(f(g(c), b)\) by mapping \(x\) to \(g(c)\). Since both expressions are of the form \(f(\cdot, \cdot)\), to unify the expressions it is clearly necessary and sufficient to unify the arguments. For the same reason, it is clear that \(f(x, b)\) and \(g(c)\) can’t be unified.

An interesting phenomenon arises with the unification problem \((x, f(y)), (y, g(x))\). The first pair tells us we have to map \(x\) to \(f(y)\), and the second pair tells us we have to map \(y\) to \(g(x)\). Applying this substitution to each pair yields \((f(y), f(g(x)))\) in the first pair and \((g(x), g(f(y)))\) in the second, which doesn’t solve the problem. Repeating the substitution doesn’t help. The problem is that when we start with \(x\), we find a chain of assignments to variables in the terms on the right-hand side that ultimately leads to a term that involves \(x\). In others words, the list of associations contains a cycle. If we ever reach a point where we are forced to admit a cycle of associations, the algorithm should fail. This is known as the occurs check.

The algorithm below maintains an association list of pairs \((x, t)\) indicating that the variable \(x\) should be unified with the term \(t\). (An association list is essentially just a list of pairs, but for efficiency it uses a constructor with three arguments to cons a pair onto the list.) Each variable \(x\) occurs only once on the left side of a pair. We allow the list to contain pairs like \((x, y)\), \((y, z)\), and \((z, w)\), since we can clean that up later, say, by assigning all the variables to \(w\).

Suppose we have an association list \(\fn{env}\) and we are considering adding the pair \((x, t)\). The following function returns some true if the assignment is trivial, which is to say, \(t\) is \(x\) or there is a chain of variable associations that amounts to assigning \(x\) to \(x\). In that case, we can ignore the pair. The function returns some false for any nontrivial assignment, unless it detects a cycle, in which case it returns none to indicate failure.

partial def isTriv? (env : AssocList String FOTerm) (x : String) :
    FOTerm  Option Bool
  | var y      => if y = x then true
                   else if !env.contains y then false
                   else isTriv? env x (env.getA y)
  | app _ l    => loop l
where
  loop : List FOTerm  Option Bool
    | []    => false
    | a::as => match isTriv? env x a with
                  | true  => none
                  | false => loop as
                  | none  => none

With that in place, the following function takes an association list, env, and a list of pairs to unify, and it returns an association list. The clauses are as follows:

  • If the list of pairs is empty, there is nothing to do.

  • If the first pair on the list is a pair of function applications, then

    • if the pair is of the form \(f(t_1, \ldots, t_n)\) and \(f(s_1, \ldots, s_n)\), add the pairs \((s_1, t_1) \ldots (s_n, t_n)\) to the list of pairs to unify and continue recursively, and

    • if the function symbols don’t match or the number of arguments is not the same, fail.

  • If the pair is of the form \((x, t)\), then

    • if there is a pair \((x, s)\) already in \(\fn{env}\), add \((s, t)\) to the list of pairs to unify and continue recursively, and

    • otherwise add \((x, t)\) to \(\fn{env}\) unless it is a trivial assignment, and continue recursively with the remaining pairs.

  • If the pair is of the form \((t, x)\), then turn it around and recursively use the previous case.

The algorithm is implemented as follows.

partial def unify? (env : AssocList String FOTerm) : List (FOTerm × FOTerm) 
                                                      Option (AssocList String FOTerm)
  | [] => some env
  | (app f1 l1, app f2 l2) :: eqs =>
      if f1 = f2  l1.length = l2.length then
        unify? env ((l1.zip l2) ++ eqs)
      else none
  | (var x, t) :: eqs =>
      if env.contains x then unify? env (eqs.cons (env.getA x, t))
      else match isTriv? env x t with
        | true  => unify? env eqs
        | false => unify? (env.cons x t) eqs
        | none  => none
  | (t, var x) :: eqs => unify? env ((var x, t) :: eqs)

The final association list might contain pairs \((x, t)\) and \((y, s)\) where \(s\) contains the variable \(x\). This means that the variable \(x\) has to unify with \(t\), which we can achieve by mapping \(x\) to \(t\). But the resulting substitution will also replace \(x\) in \(s\), so we had better carry out the substitution for \(x\) there too. The following function, usolve, cleans up the list of pairs by iteratively substituting the terms on the right for the variables on the left, until the association list no longer changes. The fact that we have avoided cycles guarantees that it terminates. The function after that, fullUnify, puts it all together: given a list of pairs of terms to unify, it computes the association list and uses usolve to turn it into a substitution.

partial def usolve (env : AssocList String FOTerm) : AssocList String FOTerm := Id.run do
  let env' := env.mapVal (fun _ => subst env)
  if env' == env then env else usolve env'

partial def fullUnify (eqs : List (FOTerm × FOTerm)) : Option (AssocList String FOTerm) :=
  match unify? AssocList.nil eqs with
    | some l => usolve l
    | none   => none

Below we try the procedure out by computing some unifiers and applying them to the original pairs to make sure that the pairs are indeed unified.

partial def unifyAndApply (eqs : List (FOTerm × FOTerm)) : Option (List (FOTerm × FOTerm)) :=
  match fullUnify eqs with
    | some l => let σ : FOAssignment FOTerm := l
                eqs.map (fun (s, t) => (subst σ s, subst σ t))
    | none   => none

def unify_ex1 := [ (term!{ f(%x, g(%y))}, term!{ f(f(%z), %w) }) ]

#eval toString <| fullUnify unify_ex1
#eval toString <| unifyAndApply unify_ex1

def unify_ex2 := [ (term!{ f(%x, %y) }, term!{ f(%y, %x) }) ]

#eval toString <| fullUnify unify_ex2
#eval toString <| unifyAndApply unify_ex2

def unify_ex3 := [ (term!{ f(%x, g(%y)) }, term!{ f(%y, %x) }) ]

#eval toString <| fullUnify unify_ex3

def unify_ex4 := [ (term!{ %x0 }, term!{ f(%x1, %x1) } ),
                   (term!{ %x1 }, term!{ f(%x2, %x2) } ),
                   (term!{ %x2 }, term!{ f(%x3, %x3) } ),
                   (term!{ %x3 }, term!{ f(%x4, %x4) } )]

#eval toString <| fullUnify unify_ex4
#eval toString <| unifyAndApply unify_ex4