3. Lean as a Programming Language
3.1. About Lean
Lean 4 is a new programming language and interactive proof assistant. It is currently used to formalize mathematics, to verify hardware and software, and to explore applications of machine learning to mathematical reasoning. Though the system is still new and under active development, in many ways it is an ideal system for working with logical syntax and putting logic to use.
You can learn more about Lean on the Lean home page, on the Lean community home page, and by asking questions on the Lean Zulip chat, which you are heartily encouraged to join. Lean has a very large mathematical library, known as mathlib, which you can learn more about on the Lean community pages.
The following documentation is available:
As a functional programming language, Lean bears some similarity to Haskell and OCaml. If you are new to functional programming, you might also find it helpful to consult an introduction to functional programming in Haskell. Lean 4 has a number of interesting features, and it is designed for implementing powerful logic-based systems, as evidenced by the fact that most of Lean 4 is implemented in Lean 4 itself.
The goal of this section is to give you a better sense of what Lean is, how it can possibly be a programming language and proof assistant at the same time, and why that makes sense. The rest of this section will give you a quick tour of some of its features, and we will learn more about them as the course progresses.
At the core, Lean is an implementation of a formal logical foundation known as type theory. More specifically, it is an implementation of dependent type theory, and even more specifically than that, it implements a version of the Calculus of Inductive Constructions. Saying that it implements a formal logic foundation means that there is a precise grammar for writing expressions, and precise rules for using them. In Lean, every well-formed expression has a type.
#check 2 + 2
#check -5
#check [1, 2, 3]
#check #[1, 2, 3]
#check (1, 2, 3)
#check "hello world"
#check true
#check fun x => x + 1
#check fun x => if x = 1 then "yes" else "no"
You can find this example in the file using_lean_as_a_programming_language/examples1.lean in the
LAMR/Examples folder of the course repository.
We strongly recommend copying that entire folder into the User
folder,
so you can edit the files and try examples of your own.
That way, you can always find the original file in the folder LAMR/Examples
,
which you should not edit.
It will also make it easier to update your copy when we make changes.
If you hover over the #check
statements or move your cursor to one
of these lines and check the information window,
Lean reports the result of the command.
It tells you that 2 + 2
has type Nat
, -5
has type Int
, and so on.
In fact, in the formal foundation, types are expressions as well.
The types of all the expressions above are listed below:
#check Nat
#check Int
#check List Nat
#check Array Nat
#check Nat × Nat × Nat
#check String
#check Bool
#check Nat → Nat
#check Nat → String
Now Lean tells you each of these has type Type
, indicating that they are
all data types. If you know the type of an expression, you can ask Lean to confirm it:
#check (2 + 2 : Nat)
#check ([1, 2, 3] : List Nat)
Lean will report an error if it cannot construe the expression as having the indicated type.
In Lean, you can define new objects with the def
command.
The new definition becomes part of the environment: the defined expression
is associated with the identifier that appears after the word def
.
def four : Nat := 2 + 2
def isOne (x : Nat) : String := if x = 1 then "yes" else "no"
#check four
#print four
#check isOne
#print isOne
The type annotations indicate the intended types of the arguments and the result, but they can be omitted when Lean can infer them from the context:
def four' := 2 + 2
def isOne' x := if x = 1 then "yes" else "no"
So far, so good: in Lean, we can define expressions and check their types. What makes Lean into a programming language is that the logical foundation has a computational semantics, under which expressions can be evaluated.
#eval four
#eval isOne 3
#eval isOne 1
The #eval
command evaluates the expression and then
displays the return value.
Evaluation can also have side effects,
which are generally related to system IO.
For example, displaying the string “Hello, world!”
is a side effect of the following evaluation:
#eval IO.println "Hello, world!"
Theoretical computer scientists are used to thinking about programs as expressions and identifying the act of running the program with the act of evaluating the expression. In Lean, this view is made manifest, and the expressions are defined in a formal system with a precise specification.
But what makes Lean into a proof assistant? To start with, some expressions in the proof system express propositions:
#check 2 + 2 = 4
#check 2 + 2 < 5
#check isOne 3 = "no"
#check 2 + 2 < 5 ∧ isOne 3 = "no"
Lean confirms that each of these is a proposition
by reporting that each of them has type Prop
.
Notice that they do not all express true propositions;
theorem proving is about certifying the ones that are.
But the language of Lean is flexible enough to express just about
any meaningful mathematical statement at all. For example,
here is the statement of Fermat’s last theorem:
def Fermat_statement : Prop :=
∀ a b c n : Nat, a * b * c ≠ 0 ∧ n > 2 → a^n + b^n ≠ c^n
In Lean’s formal system, data types are expressions of type Type
,
and if T
is a type, an expression of type T
denotes an object
of that type. We have also seen that propositions are expressions
of type Prop
. In the formal system, if P
is a proposition,
a proof of P
is just an expression of type P
.
This is the final piece of the puzzle:
we use Lean as a proof assistant by writing down a proposition P
,
writing down an expression p
, and asking Lean to confirm that
p
has type P
. The fact that 2 + 2 = 4 has an easy proof,
that we will explain later:
theorem two_plus_two_is_four : 2 + 2 = 4 := rfl
In contrast, proving Fermat’s last theorem is considerably harder.
theorem Fermat_last_theorem : Fermat_statement := sorry
Lean knows that sorry
is not a real proof, and it flags a warning there.
If you manage to replace sorry
by a real Lean expression, please let us know.
We will be very impressed.
So, in Lean, one can write programs and execute them, and one can state propositions and prove them. In fact, one can state propositions about programs and then prove those statements as well. This is known as software verification; it is a means of obtaining a strong guarantee that a computer program behaves as intended, something that is important, say, if you are using the software to control a nuclear reactor or fly an airplane.
This course is not about software verification. We will be using Lean 4 primarily as a programming language, one in which we can easily define logical expressions and manipulate them. To a small extent, we will also write some simple proofs in Lean. This will help us think about proof systems and rules, and understand how they work. Taken together, these two activities embody the general vision that animates this course: knowing how to work with formally specified expressions and rules opens up a world of opportunity. It is the key to unlocking the secrets of the universe.
3.2. Using Lean as a functional programming language
The fact that Lean is a functional programming language means that instead of presenting a program as a list of instructions, you simply define functions and ask Lean to evaluate them.
def foo n := 3 * n + 7
#eval foo 3
#eval foo (foo 3)
def bar n := foo (foo n) + 3
#eval bar 3
#eval bar (bar 3)
There is no global state: any value a function can act on is passed as an explicit argument and is never changed. For that reason, functional programming languages are amenable to parallelization.
Nonetheless, Lean can do handle system IO using the IO monad, and can accommodate an imperative style of programming using do notation.
def printExample : IO Unit:= do
IO.println "hello"
IO.println "world"
#eval printExample
Recursive definitions are built into Lean.
def factorial : Nat → Nat
| 0 => 1
| (n + 1) => (n + 1) * factorial n
#eval factorial 10
#eval factorial 100
Here is a solution to the Towers of Hanoi problem:
def hanoi (numDisks start finish aux : Nat) : IO Unit :=
match numDisks with
| 0 => pure ()
| n + 1 => do
hanoi n start aux finish
IO.println s!"Move disk {n + 1} from peg {start} to peg {finish}"
hanoi n aux finish start
#eval hanoi 7 1 2 3
You can also define things by recursion on lists:
def addNums : List Nat → Nat
| [] => 0
| a::as => a + addNums as
#eval addNums [0, 1, 2, 3, 4, 5, 6]
In fact, there are a number of useful functions built
into Lean’s library. The function List.range n
returns the list
[0, 1, ..., n-1]
, and the functions List.map
and List.foldl
and List.foldr
implement the usual map and fold functions for lists.
By opening the List
namespace, we can refer to these as range
, map
,
foldl
, and foldr
. In the examples below,
the operation <|
has the same effect as putting parentheses around
everything that appears afterward.
#eval List.range 7
section
open List
#eval range 7
#eval addNums <| range 7
#eval map (fun x => x + 3) <| range 7
#eval foldl (. + .) 0 <| range 7
end
The scope of the open
command is limited to the section,
and the cryptic inscription (. + .)
is notation for the
addition function. Lean also supports projection notation
that is useful when the corresponding namespace is not open:
def myRange := List.range 7
#eval myRange.map fun x => x + 3
Because myRange
has type List Nat
, Lean interprets
myrange.map fun x => x + 3
as List.map (fun x => x + 3) myrange
.
In other words, it automatically interprets map
as being
in the List
namespace,
and then it interprets myrange
as the first List
argument.
This course assumes you have some familiarity with functional programming. One way to cope with the fact that there is not yet much documentation for Lean is to nose around the Lean code base itself. If you ctrl-click on the name of a function in the Lean library, the editor will jump to the definition, and you can look around and see what else is there. Another strategy is simply to ask us, ask each other, or ask questions on the Lean Zulip chat. We are all in this together.
When working with a functional programming language,
there are often clever tricks for doing things that you
may be more comfortable doing in an imperative programming language.
For example, as explained in Section 2.3,
here are Lean’s definitions of the reverse
and append
functions for lists:
namespace hidden
def reverseAux : List α → List α → List α
| [], r => r
| a::l, r => reverseAux l (a::r)
def reverse (as : List α) :List α :=
reverseAux as []
protected def append (as bs : List α) : List α :=
reverseAux as.reverse bs
end hidden
The function reverseAux l r
reverses the elements of list l
and adds them to the front of r
. When called from reverse l
,
the argument r
acts as an accumulator, storing the partial result.
Because reverseAux
is tail recursive, Lean’s compiler
can implement it efficiently as a loop rather than a recursive function.
We have defined these functions in a namespace named hidden
so that they don’t conflict with the ones in Lean’s library
if you open the List
namespace.
In Lean’s foundation, every function is totally defined.
In particular, every function that Lean computes has to
terminates (in principle) on every input.
Lean 4 will eventually support arbitrary recursive definitions in which
the arguments in a recursive call decrease by some measure,
but some work is needed to justify these calls in the underlying
foundation. In the meanwhile, we can always cheat by using the partial
keyword,
which will let us perform arbitrary recursive calls.
partial def gcd m n :=
if n = 0 then m else gcd n (m % n)
#eval gcd 45 30
#eval gcd 37252 49824
Using partial
takes us outside the formal foundation; Lean
will not let us prove anything about gcd
when we define it this way.
Using partial
also makes it easy for us to shoot ourselves in the foot:
partial def bad (n : Nat) : Nat := bad (n + 1)
On homework exercises, you should try to use structural recursion
when you can,
but don’t hesitate to use partial
whenever Lean complains
about a recursive definition.
We will not penalize you for it.
The following definition of the Fibonacci numbers does not require
the partial
keyword:
def fib' : Nat → Nat
| 0 => 0
| 1 => 1
| n + 2 => fib' (n + 1) + fib' n
But it is inefficient; you should convince yourself that the natural evaluation strategy requires exponential time. The following definition avoids that.
def fibAux : Nat → Nat × Nat
| 0 => (0, 1)
| n + 1 => let p := fibAux n
(p.2, p.1 + p.2)
def fib n := (fibAux n).1
#eval (List.range 20).map fib
Producing a list of Fibonacci numbers, however, as we have done here is inefficient; you should convince yourself that the running time is quadratic. In the exercises, we ask you to define a function that computes a list of Fibonacci values with running time linear in the length of the list.
3.3. Inductive data types in Lean
One reason that computer scientists and logicians tend to like functional programming languages is that they often provide good support for defining inductive data types and then using structural recursion on such types. For example, here is a Lean definition of the extended binary trees that we defined in mathematical terms in Section 2.3:
inductive BinTree
| empty : BinTree
| node : BinTree → BinTree → BinTree
deriving Repr, DecidableEq, Inhabited
open BinTree
The command import Init
imports a part of the initial library for us to use.
The command open BinTree
allows us to write empty
and node
instead of
BinTree.empty
and BinTree.node
.
Note the Lean convention of capitalizing the names of data types.
The last line of the definition, the one that begins with the word deriving
,
is boilerplate.
It tells Lean to automatically generate a few additional functions that are
useful. The directive deriving Repr
tells Lean to define an internal function
that can be used to represent any BinTree
as a string.
This is the string that is printed out by any #eval
command whose argument
evaluates to a BinTree
.
Adding DecidableEq
defines a function that tests whether two BinTrees
are equal,
and adding Inhabited
defines an arbitrary value of the data type to serve as
a default value for function that need one. The following illustrates their use.
#eval node empty (node empty empty)
#eval empty == node empty empty -- evaluates to false
#eval (default : BinTree) -- BinTree.empty
We can now define the functions size
and depth
by structural recursion:
def size : BinTree → Nat
| empty => 0
| node a b => 1 + size a + size b
def depth : BinTree → Nat
| empty => 0
| node a b => 1 + Nat.max (depth a) (depth b)
def example_tree := node (node empty empty) (node empty (node empty empty))
#eval size example_tree
#eval depth example_tree
Lean also supports match
syntax.
def foo (b : BinTree) : Nat :=
match b with
| empty => 0
| node _ _ => 1
#eval foo (node empty empty)
In fact, the List
data type is also inductively defined.
#print List
You should try writing the inductive definition on your own. Call
it MyList
, and then try #print MyList
to see how it compares.
Option
types are commonly used in functional programming to
represent functions that might fail to return a value.
For any type α
, and element of type Option α
is either
of the form some a
, where a
is an element of α
, or none
.
You can use a match
to determine which case we are in.
#print Option
def bar (n? : Option Nat) : Nat :=
match n? with
| some n => n
| none => 0
#eval bar (some 5)
#eval bar none
It is a Lean convention to use variable names like n?
to range over an option type.
Similarly, functions that return an element of an option type
usually have names that end with a question mark.
The function Option.getD
can be used to return a default
value in case the input is none.
#eval (some 5).getD 0
#eval none.getD 0
3.4. Using Lean as an imperative programming language
The fact that Lean is a functional programming language means that there is no global notion of state. Functions take values as input and return values as output; there are no global or even local variables that are changed by the result of a function call.
But one of the interesting features of Lean is a functional programming language is that it incorporates features that make it feel like an imperative programming language. The following example shows how to print out, for each value \(i\) less than 100, the the sum of the numbers up to \(i\).
def showSums : IO Unit := do
let mut sum := 0
for i in [0:100] do
sum := sum + i
IO.println s!"i: {i}, sum: {sum}"
#eval showSums
You can use a loop not just to print values, but also to compute values. The following is a boolean test for primality:
def isPrime (n : Nat) : Bool := Id.run do
if n < 2 then false else
for i in [2:n] do
if n % i = 0 then
return false
if i * i > n then
return true
true
You can use such a function with the list primitives to construct a list of the first 10,000 prime numbers.
Note that in both cases, the program begins with the special
identifier do
,
which invokes notation that makes sense when the return type is what
is known as a monad.
In the first case, the return value is in the IO
monad.
You can think of the fact that showSums
has type IO Unit
as saying that it doesn’t return any data but has side effects, namely, sending output to the standard output channel.
In the second case, Bool
is not a monad, but Lean allows us
to treat it as one by inserting the prefix Id.run
.
Technically, it is reinterpreting Bool
as Id Bool
,
where Id
is the identity monad.
Don’t worry about the details, though.
For the most part, you can treat do
notation as a magical black box.
#eval (List.range 10000).filter isPrime
Within a do
block, there is nice syntax for handling option types.
def bar (n? : Option Nat) : IO Unit := do
let some n := n? |
IO.println "oops"
IO.println n
#eval bar (some 2)
#eval bar none
You can also combine do
blocks with Lean’s support for arrays.
Within the formal foundation these are modeled as lists,
but the compiler implements them as dynamic arrays, and for efficiency
it will modify values rather than copy them whenever the old value is
not referred to by another part of an expression.
def primes (n : Nat) : Array Nat := Id.run do
let mut result := #[]
for i in [2:n] do
if isPrime i then
result := result.push i
result
#eval (primes 10000).size
Notice the notation: #[]
denotes a fresh array (Lean infers the type from context),
and the Array.push
function adds a new element at the end of the array.
The following example shows how to compute a two-dimensional array, a ten by ten multiplication table.
def mulTable : Array (Array Nat) := Id.run do
let mut table := #[]
for i in [:10] do
let mut row := #[]
for j in [:10] do
row := row.push ((i + 1) * (j + 1))
table := table.push row
table
#eval mulTable
Alternatively, you can use the function Array.mkArray
to initialize an array
(in this case, to the values 0), and then use the Array.set!
function
to replace the elements later one.
def mulTable' : Array (Array Nat) := Id.run do
let mut s : Array (Array Nat) := mkArray 10 (mkArray 10 0)
for i in [:10] do
for j in [:10] do
s := s.set! i <| s[i]!.set! j ((i + 1) * (j + 1))
s
Here we replace the ith row by the previous ith row, with the jth column updated.
The notation s[i]!
asks Lean’s type checker to trust that the array access is
within bounds. If it isn’t, Lean will throw an error at runtime.
Lean also provides mechanisms by which we can provide a static guarantee
that the array access is in bounds by providing a proof. But talking about how to do that
now would take us too far afield.
The following snippet prints out the table. The idiom show T from t
is a way of telling Lean that term t
should have type T
.
Writing @id T t
has a similar effect, as does writing (t : T)
.
(A difference is that the first two expressions have type T
exactly,
whereas (t : T)
only ensures that t
has a type that Lean recognizes as being
equivalent to T
.)
#eval show IO Unit from do
for i in [:10] do
for j in [:10] do
let numstr := toString mulTable[i]![j]!
-- print 1-3 spaces
IO.print <| " ".pushn ' ' (3 - numstr.length)
IO.print numstr
IO.println ""
3.5. Exercises
Using operations on
List
, write a Lean function that for every \(n\) returns the list of all the divisors of \(n\) that are less than \(n\).A natural number \(n\) is perfect if it is equal to the sum of the divisors less than \(n.\) Write a Lean function (with return type Bool) that determines whether a number \(n\) is perfect. Use it to find all the perfect numbers less than 1,000.
Define a recursive function \(\fn{sublists}(\ell)\) that, for every list \(\ell\), returns a list of all the sublists of \(\ell\). For example, given the list \([1, 2, 3]\), it should compute the list
\[[ [], [1], [2], [3], [1,2], [1,3], [2, 3], [1, 2, 3] ].\]The elements need not be listed in that same order.
Prove in Lean that the length of \(\fn{sublists}(\ell)\) is \(2^{\fn{length}(\ell)}\).
Define a function \(\fn{permutations}(\ell)\) that returns a list of all the permutations of \(\ell\).
Prove in Lean that the length of \(\fn{permutations}(\ell)\) is \(\fn{factorial}(\fn{length}(\ell))\).
Define in Lean a function that, assuming \(\ell\) is a list of lists representing an \(n \times n\) array, returns a list of lists representing the transpose of that array.
Write a program that solves the Tower of Hanoi problem with \(n\) disks on the assumption that disks can only be moved to an adjacent peg. (See Section 2.5.)
Write a program that solves the Tower of Hanoi problem with \(n\) disks on the assumption that disks can only be moved clockwise. (See Section 2.5.)
Define a Lean data type of binary trees in which every node is numbered by a label. Define a Lean function to compute the sum of the nodes in such a tree. Also write functions to list the elements in a preorder, postorder, and inorder traversal.
Write a Lean function
pascal
which, on inputn
, outputs the firstn
rows of Pascal’s triangle.