9. Linear algebra

9.1. Vector spaces and linear maps

9.1.1. Vector spaces

We will start directly abstract linear algebra, taking place in a vector space over any field. However you can find information about matrices in Section 9.4.1 which does not logically depend on this abstract theory. Mathlib actually deals with a more general version of linear algebra involving the word module, but for now we will pretend this is only an eccentric spelling habit.

The way to say “let \(K\) be a field and let \(V\) be a vector space over \(K\)” (and make them implicit arguments to later results) is:

variable {K : Type*} [Field K] {V : Type*} [AddCommGroup V] [Module K V]

We explained in Chapter 7 why we need two separate typeclasses [AddCommGroup V] [Module K V]. The short version is the following. Mathematically we want to say that having a \(K\)-vector space structure implies having an additive commutative group structure. We could tell this to Lean. But then whenever Lean would need to find such a group structure on a type \(V\), it would go hunting for vector space structures using a completely unspecified field \(K\) that cannot be inferred from \(V\). This would be very bad for the type class synthesis system.

The multiplication of a vector v by a scalar a is denoted by a • v. We list a couple of algebraic rules about the interaction of this operation with addition in the following examples. Of course simp or apply? would find those proofs. There is also a module tactic that solves goals following from the axioms of vector spaces and fields, in the same way the ring tactic is used in commutative rings or the group tactic is used in groups. But it is still useful to remember that scalar multiplication is abbreviated smul in lemma names.

example (a : K) (u v : V) : a  (u + v) = a  u + a  v :=
  smul_add a u v

example (a b : K) (u : V) : (a + b)  u = a  u + b  u :=
  add_smul a b u

example (a b : K) (u : V) : a  b  u = b  a  u :=
  smul_comm a b u

As a quick note for more advanced readers, let us point out that, as suggested by terminology, Mathlib’s linear algebra also covers modules over (not necessarily commutative) rings. In fact it even covers semi-modules over semi-rings. If you think you do not need this level of generality, you can meditate the following example that nicely captures a lot of algebraic rules about ideals acting on submodules:

example {R M : Type*} [CommSemiring R] [AddCommMonoid M] [Module R M] :
    Module (Ideal R) (Submodule R M) :=
  inferInstance

9.1.2. Linear maps

Next we need linear maps. Like group morphisms, linear maps in Mathlib are bundled maps, i.e. packages made of a map and proofs of its linearity properties. Those bundled maps are converted to ordinary functions when applied. See Chapter 7 for more information about this design.

The type of linear maps between two K-vector spaces V and W is denoted by V →ₗ[K] W. The subscript l stands for linear. At first it may feel odd to specify K in this notation. But this is crucial when several fields come into play. For instance real-linear maps from \(ℂ\) to \(ℂ\) are every map \(z ↦ az + b\bar{z}\) while only the maps \(z ↦ az\) are complex linear, and this difference is crucial in complex analysis.

variable {W : Type*} [AddCommGroup W] [Module K W]

variable (φ : V →ₗ[K] W)

example (a : K) (v : V) : φ (a  v) = a  φ v :=
  map_smul φ a v

example (v w : V) : φ (v + w) = φ v + φ w :=
  map_add φ v w

Note that V →ₗ[K] W itself carries interesting algebraic structures (this is part of the motivation for bundling those maps). It is a K-vector space so we can add linear maps and multiply them by scalars.

variable (ψ : V →ₗ[K] W)

#check (2  φ + ψ : V →ₗ[K] W)

One downside of using bundled maps is that we cannot use ordinary function composition. We need to use LinearMap.comp or the notation ∘ₗ.

variable (θ : W →ₗ[K] V)

#check (φ.comp θ : W →ₗ[K] W)
#check (φ ∘ₗ θ : W →ₗ[K] W)

There are two main ways to construct linear maps. First we can build the structure by providing the function and the linearity proof. As usual, this is facilitated by the structure code action: you can type example : V →ₗ[K] V := _ and use the code action “Generate a skeleton” attached to the underscore.

example : V →ₗ[K] V where
  toFun v := 3  v
  map_add' _ _ := smul_add ..
  map_smul' _ _ := smul_comm ..

You may wonder why the proof fields of LinearMap have names ending with a prime. This is because they are defined before the coercion to functions is defined, hence they are phrased in terms of LinearMap.toFun. Then they are restated as LinearMap.map_add and LinearMap.map_smul in terms of the coercion to function. This is not yet the end of the story. One also wants a version of map_add that applies to any (bundled) map preserving addition, such as additive group morphisms, linear maps, continuous linear maps, K-algebra maps etc… This one is map_add (in the root namespace). The intermediate version, LinearMap.map_add is a bit redundant but allows to use dot notation, which can be nice sometimes. A similar story exists for map_smul, and the general framework is explained in Chapter 7.

#check (φ.map_add' :  x y : V, φ.toFun (x + y) = φ.toFun x + φ.toFun y)
#check (φ.map_add :  x y : V, φ (x + y) = φ x + φ y)
#check (map_add φ :  x y : V, φ (x + y) = φ x + φ y)

One can also build linear maps from the ones that are already defined in Mathlib using various combinators. For instance the above example is already known as LinearMap.lsmul K V 3. There are several reason why K and V are explicit arguments here. The most pressing one is that from a bare LinearMap.lsmul 3 there would be no way for Lean to infer V or even K. But also LinearMap.lsmul K V is an interesting object by itself: it has type K →ₗ[K] V →ₗ[K] V, meaning it is a K-linear map from K —seen as a vector space over itself— to the space of K-linear maps from V to V.

#check (LinearMap.lsmul K V 3 : V →ₗ[K] V)
#check (LinearMap.lsmul K V : K →ₗ[K] V →ₗ[K] V)

There is also a type LinearEquiv of linear isomorphisms denoted by V ≃ₗ[K] W. The inverse of f : V ≃ₗ[K] W is f.symm : W ≃ₗ[K] V, composition of f and g is f.trans g also denoted by f ≪≫ₗ g, and the identity isomorphism of V is LinearEquiv.refl K V. Elements of this type are automatically coerced to morphisms and functions when necessary.

example (f : V ≃ₗ[K] W) : f ≪≫ₗ f.symm = LinearEquiv.refl K V :=
  f.self_trans_symm

One can use LinearEquiv.ofBijective to build an isomorphism from a bijective morphism. Doing so makes the inverse function noncomputable.

noncomputable example (f : V →ₗ[K] W) (h : Function.Bijective f) : V ≃ₗ[K] W :=
  .ofBijective f h

Note that in the above example, Lean uses the announced type to understand that .ofBijective refers to LinearEquiv.ofBijective (without needing to open any namespace).

9.1.3. Sums and products of vector spaces

We can build new vector spaces out of old ones using direct sums and direct products. Let us start with two vectors spaces. In this case there is no difference between sum and product, and we can simply use the product type. In the following snippet of code we simply show how to get all the structure maps (inclusions and projections) as linear maps, as well as the universal properties constructing linear maps into products and out of sums (if you are not familiar with the category-theoretic distinction between sums and products, you can simply ignore the universal property vocabulary and focus on the types of the following examples).

section binary_product

variable {W : Type*} [AddCommGroup W] [Module K W]
variable {U : Type*} [AddCommGroup U] [Module K U]
variable {T : Type*} [AddCommGroup T] [Module K T]

-- First projection map
example : V × W →ₗ[K] V := LinearMap.fst K V W

-- Second projection map
example : V × W →ₗ[K] W := LinearMap.snd K V W

-- Universal property of the product
example (φ : U →ₗ[K] V) (ψ : U →ₗ[K] W) : U →ₗ[K]  V × W := LinearMap.prod φ ψ

-- The product map does the expected thing, first component
example (φ : U →ₗ[K] V) (ψ : U →ₗ[K] W) : LinearMap.fst K V W ∘ₗ LinearMap.prod φ ψ = φ := rfl

-- The product map does the expected thing, second component
example (φ : U →ₗ[K] V) (ψ : U →ₗ[K] W) : LinearMap.snd K V W ∘ₗ LinearMap.prod φ ψ = ψ := rfl

-- We can also combine maps in parallel
example (φ : V →ₗ[K] U) (ψ : W →ₗ[K] T) : (V × W) →ₗ[K] (U × T) := φ.prodMap ψ

-- This is simply done by combining the projections with the universal property
example (φ : V →ₗ[K] U) (ψ : W →ₗ[K] T) :
  φ.prodMap ψ = (φ ∘ₗ .fst K V W).prod (ψ ∘ₗ .snd K V W) := rfl

-- First inclusion map
example : V →ₗ[K] V × W := LinearMap.inl K V W

-- Second inclusion map
example : W →ₗ[K] V × W := LinearMap.inr K V W

-- Universal property of the sum (aka coproduct)
example (φ : V →ₗ[K] U) (ψ : W →ₗ[K] U) : V × W →ₗ[K] U := φ.coprod ψ

-- The coproduct map does the expected thing, first component
example (φ : V →ₗ[K] U) (ψ : W →ₗ[K] U) : φ.coprod ψ ∘ₗ LinearMap.inl K V W = φ :=
  LinearMap.coprod_inl φ ψ

-- The coproduct map does the expected thing, second component
example (φ : V →ₗ[K] U) (ψ : W →ₗ[K] U) : φ.coprod ψ ∘ₗ LinearMap.inr K V W = ψ :=
  LinearMap.coprod_inr φ ψ

-- The coproduct map is defined in the expected way
example (φ : V →ₗ[K] U) (ψ : W →ₗ[K] U) (v : V) (w : W) :
    φ.coprod ψ (v, w) = φ v + ψ w :=
  rfl

end binary_product

Let us now turn to sums and products of arbitrary families of vector spaces. Here we will simply see how to define a family of vector spaces and access the universal properties of sums and products. Note that the direct sum notation is scoped to the DirectSum namespace, and that the universal property of direct sums requires decidable equality on the indexing type (this is somehow an implementation accident).

section families
open DirectSum

variable {ι : Type*} [DecidableEq ι]
         (V : ι  Type*) [ i, AddCommGroup (V i)] [ i, Module K (V i)]

-- The universal property of the direct sum assembles maps from the summands to build
-- a map from the direct sum
example (φ : Π i, (V i →ₗ[K] W)) : ( i, V i) →ₗ[K] W :=
  DirectSum.toModule K ι W φ

-- The universal property of the direct product assembles maps into the factors
-- to build a map into the direct product
example (φ : Π i, (W →ₗ[K] V i)) : W →ₗ[K] (Π i, V i) :=
  LinearMap.pi φ

-- The projection maps from the product
example (i : ι) : (Π j, V j) →ₗ[K] V i := LinearMap.proj i

-- The inclusion maps into the sum
example (i : ι) : V i →ₗ[K] ( i, V i) := DirectSum.lof K ι V i

-- The inclusion maps into the product
example (i : ι) : V i →ₗ[K] (Π i, V i) := LinearMap.single K V i

-- In case `ι` is a finite type, there is an isomorphism between the sum and product.
example [Fintype ι] : ( i, V i) ≃ₗ[K] (Π i, V i) :=
  linearEquivFunOnFintype K ι V

end families

9.2. Subspaces and quotients

9.2.1. Subspaces

Just as linear maps are bundled, a linear subspace of V is also a bundled structure consisting of a set in V, called the carrier of the subspace, with the relevant closure properties. Again the word module appears instead of vector space because of the more general context that Mathlib actually uses for linear algebra.

variable {K : Type*} [Field K] {V : Type*} [AddCommGroup V] [Module K V]

example (U : Submodule K V) {x y : V} (hx : x  U) (hy : y  U) :
    x + y  U :=
  U.add_mem hx hy

example (U : Submodule K V) {x : V} (hx : x  U) (a : K) :
    a  x  U :=
  U.smul_mem a hx

In the example above, it is important to understand that Submodule K V is the type of K-linear subspaces of V, rather than a predicate IsSubmodule U where U is an element of Set V. Submodule K V is endowed with a coercion to Set V and a membership predicate on V. See Section 7.3 for an explanation of how and why this is done.

Of course, two subspaces are the same if and only if they have the same elements. This fact is registered for use with the ext tactic, which can be used to prove two subspaces are equal in the same way it is used to prove that two sets are equal.

To state and prove, for example, that is a -linear subspace of , what we really want is to construct a term of type Submodule whose projection to Set is , or, more precisely, the image of in .

noncomputable example : Submodule   where
  carrier := Set.range (() :   )
  add_mem' := by
    rintro _ _ n, rfl m, rfl
    use n + m
    simp
  zero_mem' := by
    use 0
    simp
  smul_mem' := by
    rintro c - a, rfl
    use c*a
    simp

The prime at the end of proof fields in Submodule are analogous to the one in LinearMap. Those fields are stated in terms of the carrier field because they are defined before the MemberShip instance. They are then superseded by Submodule.add_mem, Submodule.zero_mem and Submodule.smul_mem that we saw above.

As an exercise in manipulating subspaces and linear maps, you will define the pre-image of a subspace by a linear map (of course we will see below that Mathlib already knows about this). Remember that Set.mem_preimage can be used to rewrite a statement involving membership and preimage. This is the only lemma you will need in addition to the lemmas discussed above about LinearMap and Submodule.

def preimage {W : Type*} [AddCommGroup W] [Module K W] (φ : V →ₗ[K] W) (H : Submodule K W) :
    Submodule K V where
  carrier := φ ⁻¹' H
  zero_mem' := by
    dsimp
    sorry
  add_mem' := by
    sorry
  smul_mem' := by
    dsimp
    sorry

Using type classes, Mathlib knows that a subspace of a vector space inherits a vector space structure.

example (U : Submodule K V) : Module K U := inferInstance

This example is subtle. The object U is not a type, but Lean automatically coerces it to a type by interpreting it as a subtype of V. So the above example can be restated more explicitly as:

example (U : Submodule K V) : Module K {x : V // x  U} := inferInstance

9.2.2. Complete lattice structure and internal direct sums

An important benefit of having a type Submodule K V instead of a predicate IsSubmodule : Set V Prop is that one can easily endow Submodule K V with additional structure. Importantly, it has the structure of a complete lattice structure with respect to inclusion. For instance, instead of having a lemma stating that an intersection of two subspaces of V is again a subspace, we use the lattice operation to construct the intersection. We can then apply arbitrary lemmas about lattices to the construction.

Let us check that the set underlying the infimum of two subspaces is indeed, by definition, their intersection.

example (H H' : Submodule K V) :
    ((H  H' : Submodule K V) : Set V) = (H : Set V)  (H' : Set V) := rfl

It may look strange to have a different notation for what amounts to the intersection of the underlying sets, but the correspondence does not carry over to the supremum operation and set union, since a union of subspaces is not, in general, a subspace. Instead one needs to use the subspace generated by the union, which is done using Submodule.span.

example (H H' : Submodule K V) :
    ((H  H' : Submodule K V) : Set V) = Submodule.span K ((H : Set V)  (H' : Set V)) := by
  simp [Submodule.span_union]

Another subtlety is that V itself does not have type Submodule K V, so we need a way to talk about V seen as a subspace of V. This is also provided by the lattice structure: the full subspace is the top element of this lattice.

example (x : V) : x  ( : Submodule K V) := trivial

Similarly the bottom element of this lattice is the subspace whose only element is the zero element.

example (x : V) : x  ( : Submodule K V)  x = 0 := Submodule.mem_bot K

In particular we can discuss the case of subspaces that are in (internal) direct sum. In the case of two subspaces, we use the general purpose predicate IsCompl which makes sense for any bounded partially ordered type. In the case of general families of subspaces we use DirectSum.IsInternal.

-- If two subspaces are in direct sum then they span the whole space.
example (U V : Submodule K V) (h : IsCompl U V) :
  U  V =  := h.sup_eq_top

-- If two subspaces are in direct sum then they intersect only at zero.
example (U V : Submodule K V) (h : IsCompl U V) :
  U  V =  := h.inf_eq_bot

section
open DirectSum
variable {ι : Type*} [DecidableEq ι]

-- If subspaces are in direct sum then they span the whole space.
example (U : ι  Submodule K V) (h : DirectSum.IsInternal U) :
   i, U i =  := h.submodule_iSup_eq_top

-- If subspaces are in direct sum then they pairwise intersect only at zero.
example {ι : Type*} [DecidableEq ι] (U : ι  Submodule K V) (h : DirectSum.IsInternal U)
    {i j : ι} (hij : i  j) : U i  U j =  :=
  (h.submodule_independent.pairwiseDisjoint hij).eq_bot

-- Those conditions characterize direct sums.
#check DirectSum.isInternal_submodule_iff_independent_and_iSup_eq_top

-- The relation with external direct sums: if a family of subspaces is
-- in internal direct sum then the map from their external direct sum into `V`
-- is a linear isomorphism.
noncomputable example {ι : Type*} [DecidableEq ι] (U : ι  Submodule K V)
    (h : DirectSum.IsInternal U) : ( i, U i) ≃ₗ[K] V :=
  LinearEquiv.ofBijective (coeLinearMap U) h
end

9.2.3. Subspace spanned by a set

In addition to building subspaces out of existing subspaces, we can build them out of any set s using Submodule.span K s which builds the smallest subspace containing s. On paper it is common to use that this space is made of all linear combinations of elements of s. But it is often more efficient to use its universal property expressed by Submodule.span_le, and the whole theory of Galois connections.

example {s : Set V} (E : Submodule K V) : Submodule.span K s  E  s  E :=
  Submodule.span_le

example : GaloisInsertion (Submodule.span K) (() : Submodule K V  Set V) :=
  Submodule.gi K V

When those are not enough, one can use the relevant induction principle Submodule.span_induction which ensures a property holds for every element of the span of s as long as it holds on zero and elements of s and is stable under sum and scalar multiplication.

As an exercise, let us reprove one implication of Submodule.mem_sup. Remember that you can use the module tactic to close goals that follow from the axioms relating the various algebraic operations on V.

example {S T : Submodule K V} {x : V} (h : x  S  T) :
     s  S,  t  T, x = s + t  := by
  rw [ S.span_eq,  T.span_eq,  Submodule.span_union] at h
  induction h using Submodule.span_induction with
  | mem y h =>
      sorry
  | zero =>
      sorry
  | add x y hx hy hx' hy' =>
      sorry
  | smul a x hx hx' =>
      sorry

9.2.4. Pushing and pulling subspaces

As promised earlier, we now describe how to push and pull subspaces by linear maps. As usual in Mathlib, the first operation is called map and the second one is called comap.

section

variable {W : Type*} [AddCommGroup W] [Module K W] (φ : V →ₗ[K] W)

variable (E : Submodule K V) in
#check (Submodule.map φ E : Submodule K W)

variable (F : Submodule K W) in
#check (Submodule.comap φ F : Submodule K V)

Note those live in the Submodule namespace so one can use dot notation and write E.map φ instead of Submodule.map φ E, but this is pretty awkward to read (although some Mathlib contributors use this spelling).

In particular the range and kernel of a linear map are subspaces. Those special cases are important enough to get declarations.

example : LinearMap.range φ = .map φ  := LinearMap.range_eq_map φ

example : LinearMap.ker φ = .comap φ  := Submodule.comap_bot φ -- or `rfl`

Note that we cannot write φ.ker instead of LinearMap.ker φ because LinearMap.ker also applies to classes of maps preserving more structure, hence it does not expect an argument whose type starts with LinearMap, hence dot notation doesn’t work here. However we were able to use the other flavor of dot notation in the right-hand side. Because Lean expects a term with type Submodule K V after elaborating the left-hand side, it interprets .comap as Submodule.comap.

The following lemmas give the key relations between those submodule and the properties of φ.

open Function LinearMap

example : Injective φ  ker φ =  := ker_eq_bot.symm

example : Surjective φ  range φ =  := range_eq_top.symm

As an exercise, let us prove the Galois connection property for map and comap. One can use the following lemmas but this is not required since they are true by definition.

#check Submodule.mem_map_of_mem
#check Submodule.mem_map
#check Submodule.mem_comap

example (E : Submodule K V) (F : Submodule K W) :
    Submodule.map φ E  F  E  Submodule.comap φ F := by
  sorry

9.2.5. Quotient spaces

Quotient vector spaces use the general quotient notation (typed with \quot, not the ordinary /). The projection onto a quotient space is Submodule.mkQ and the universal property is Submodule.liftQ.

variable (E : Submodule K V)

example : Module K (V  E) := inferInstance

example : V →ₗ[K] V  E := E.mkQ

example : ker E.mkQ = E := E.ker_mkQ

example : range E.mkQ =  := E.range_mkQ

example ( : E  ker φ) : V  E →ₗ[K] W := E.liftQ φ 

example (F : Submodule K W) ( : E  .comap φ F) : V  E →ₗ[K] W  F := E.mapQ F φ 

noncomputable example : (V  LinearMap.ker φ) ≃ₗ[K] range φ := φ.quotKerEquivRange

As an exercise, let us prove the correspondence theorem for subspaces of quotient spaces. Mathlib knows a slightly more precise version as Submodule.comapMkQRelIso.

open Submodule

#check Submodule.map_comap_eq
#check Submodule.comap_map_eq

example : Submodule K (V  E)  { F : Submodule K V // E  F } where
  toFun := sorry
  invFun := sorry
  left_inv := sorry
  right_inv := sorry

9.3. Endomorphisms

An important special case of linear maps are endomorphisms: linear maps from a vector space to itself. They are interesting because they form a K-algebra. In particular we can evaluate polynomials with coefficients in K on them, and they can have eigenvalues and eigenvectors.

Mathlib uses the abbreviation Module.End K V := V →ₗ[K] V which is convenient when using a lot of these (especially after opening the Module namespace).

variable {K : Type*} [Field K] {V : Type*} [AddCommGroup V] [Module K V]

variable {W : Type*} [AddCommGroup W] [Module K W]


open Polynomial Module LinearMap

example (φ ψ : End K V) : φ * ψ = φ ∘ₗ ψ :=
  LinearMap.mul_eq_comp φ ψ -- `rfl` would also work

-- evaluating `P` on `φ`
example (P : K[X]) (φ : End K V) : V →ₗ[K] V :=
  aeval φ P

-- evaluating `X` on `φ` gives back `φ`
example (φ : End K V) : aeval φ (X : K[X]) = φ :=
  aeval_X φ

As an exercise manipulating endomorphisms, subspaces and polynomials, let us prove the (binary) kernels lemma: for any endomorphism \(φ\) and any two relatively prime polynomials \(P\) and \(Q\), we have \(\ker P(φ) ⊕ \ker Q(φ) = \ker \big(PQ(φ)\big)\).

Note that IsCoprime x y is defined as a b, a * x + b * y = 1.

#check Submodule.eq_bot_iff
#check Submodule.mem_inf
#check LinearMap.mem_ker

example (P Q : K[X]) (h : IsCoprime P Q) (φ : End K V) : ker (aeval φ P)  ker (aeval φ Q) =  := by
  sorry

#check Submodule.add_mem_sup
#check map_mul
#check LinearMap.mul_apply
#check LinearMap.ker_le_ker_comp

example (P Q : K[X]) (h : IsCoprime P Q) (φ : End K V) :
    ker (aeval φ P)  ker (aeval φ Q) = ker (aeval φ (P*Q)) := by
  sorry

We now move to the discussions of eigenspaces and eigenvalues. The eigenspace associated to an endomorphism \(φ\) and a scalar \(a\) is the kernel of \(φ - aId\). Eigenspaces are defined for all values of a, although they are interesting only when they are non-zero. However an eigenvector is, by definition, a non-zero element of an eigenspace. The corresponding predicate is End.HasEigenvector.

example (φ : End K V) (a : K) : φ.eigenspace a = LinearMap.ker (φ - a  1) :=
  End.eigenspace_def

Then there is a predicate End.HasEigenvalue and the corresponding subtype End.Eigenvalues.

example (φ : End K V) (a : K) : φ.HasEigenvalue a  φ.eigenspace a   :=
  Iff.rfl

example (φ : End K V) (a : K) : φ.HasEigenvalue a   v, φ.HasEigenvector a v  :=
  End.HasEigenvalue.exists_hasEigenvector, fun _, hv  φ.hasEigenvalue_of_hasEigenvector hv

example (φ : End K V) : φ.Eigenvalues = {a // φ.HasEigenvalue a} :=
  rfl

-- Eigenvalue are roots of the minimal polynomial
example (φ : End K V) (a : K) : φ.HasEigenvalue a  (minpoly K φ).IsRoot a :=
  φ.isRoot_of_hasEigenvalue

-- In finite dimension, the converse is also true (we will discuss dimension below)
example [FiniteDimensional K V] (φ : End K V) (a : K) :
    φ.HasEigenvalue a  (minpoly K φ).IsRoot a :=
  φ.hasEigenvalue_iff_isRoot

-- Cayley-Hamilton
example [FiniteDimensional K V] (φ : End K V) : aeval φ φ.charpoly = 0 :=
  φ.aeval_self_charpoly

9.4. Matrices, bases and dimension

9.4.1. Matrices

Before introducing bases for abstract vector spaces, we go back to the much more elementary setup of linear algebra in \(K^n\) for some field \(K\). Here the main objects are vectors and matrices. For concrete vectors, one can use the ![…] notation, where components are separated by commas. For concrete matrices we can use the !![…] notation, lines are separated by semi-colons and components of lines are separated by colons. When entries have a computable type such as or , we can use the eval command to play with basic operations.

section matrices

-- Adding vectors
#eval !![1, 2] + !![3, 4]  -- !![4, 6]

-- Adding matrices
#eval !![1, 2; 3, 4] + !![3, 4; 5, 6]  -- !![4, 6; 8, 10]

-- Multiplying matrices
#eval !![1, 2; 3, 4] * !![3, 4; 5, 6]  -- !![4, 6; 8, 10]

It is important to understand that this use of #eval is interesting only for exploration, it is not meant to replace a computer algebra system such as Sage. The data representation used here for matrices is not computationally efficient in any way. It uses functions instead of arrays and is optimized for proving, not computing. The virtual machine used by #eval is also not optimized for this use.

Beware the matrix notation list rows but the vector notation is neither a row vector nor a column vector. Multiplication of a matrix with a vector from the left (resp. right) interprets the vector as a row (resp. column) vector. This corresponds to operations Matrix.vecMul, with notation ᵥ* and Matrix.mulVec, with notation ` *ᵥ`. Those notations are scoped in the Matrix namespace that we therefore need to open.

open Matrix

-- matrices acting on vectors on the left
#eval !![1, 2; 3, 4] *ᵥ ![1, 1] -- ![3, 7]

-- matrices acting on vectors on the left, resulting in a size one matrix
#eval !![1, 2] *ᵥ ![1, 1]  -- ![3]

-- matrices acting on vectors on the right
#eval  ![1, 1, 1] ᵥ* !![1, 2; 3, 4; 5, 6] -- ![9, 12]

In order to generate matrices with identical rows or columns specified by a vector, we use Matrix.row and Matrix.column, with arguments the type indexing the rows or columns and the vector. For instance one can get single row or single column matrixes (more precisely matrices whose rows or columns are indexed by Fin 1).

#eval row (Fin 1) ![1, 2] -- !![1, 2]

#eval col (Fin 1) ![1, 2] -- !![1; 2]

Other familiar operations include the vector dot product, matrix transpose, and, for square matrices, determinant and trace.

-- vector dot product
#eval ![1, 2] ⬝ᵥ ![3, 4] -- `11`

-- matrix transpose
#eval !![1, 2; 3, 4] -- `!![1, 3; 2, 4]`

-- determinant
#eval !![(1 : ), 2; 3, 4].det -- `-2`

-- trace
#eval !![(1 : ), 2; 3, 4].trace -- `5`

When entries do not have a computable type, for instance if they are real numbers, we cannot hope that #eval can help. Also this kind of evaluation cannot be used in proofs without considerably expanding the trusted code base (i.e. the part of Lean that you need to trust when checking proofs).

So it is good to also use the simp and norm_num tactics in proofs, or their command counter-part for quick exploration.

#simp !![(1 : ), 2; 3, 4].det -- `4 - 2*3`

#norm_num !![(1 : ), 2; 3, 4].det -- `-2`

#norm_num !![(1 : ), 2; 3, 4].trace -- `5`

variable (a b c d : ) in
#simp !![a, b; c, d].det -- `a * d – b * c`

The next important operation on square matrices is inversion. In the same way as division of numbers is always defined and returns the artificial value zero for division by zero, the inversion operation is defined on all matrices and returns the zero matrix for non-invertible matrices.

More precisely, there is general function Ring.inverse that does this in any ring, and, for any matrix A, A⁻¹ is defined as Ring.inverse A.det A.adjugate. According to Cramer’s rule, this is indeed the inverse of A when the determinant of A is not zero.

#norm_num [Matrix.inv_def] !![(1 : ), 2; 3, 4]⁻¹ -- !![-2, 1; 3 / 2, -(1 / 2)]

Of course this definition is really useful only for invertible matrices. There is a general type class Invertible that helps recording this. For instance, the simp call in the next example will use the inv_mul_of_invertible lemma which has an Invertible type-class assumption, so it will trigger only if this can be found by the type-class synthesis system. Here we make this fact available using a have statement.

example : !![(1 : ), 2; 3, 4]⁻¹ * !![(1 : ), 2; 3, 4] = 1 := by
  have : Invertible !![(1 : ), 2; 3, 4] := by
    apply Matrix.invertibleOfIsUnitDet
    norm_num
  simp

In this fully concrete case, we could also use the norm_num machinery, and apply? to find the final line:

example : !![(1 : ), 2; 3, 4]⁻¹ * !![(1 : ), 2; 3, 4] = 1 := by
  norm_num [Matrix.inv_def]
  exact one_fin_two.symm

All the concrete matrices above have their rows and columns indexed by Fin n for some n (not necessarily the same for rows and columns). But sometimes it is more convenient to index matrices using arbitrary finite types. For instance the adjacency matrix of a finite graph has rows and columns naturally indexed by the vertices of the graph.

In fact when simply wants to define matrices without defining any operation on them, finiteness of the indexing types are not even needed, and coefficients can have any type, without any algebraic structure. So Mathlib simply defines Matrix m n α to be m n α for any types m, n and α, and the matrices we have been using so far had types such as Matrix (Fin 2) (Fin 2) . Of course algebraic operations require more assumptions on m, n and α.

Note the main reason why we do not use m n α directly is to allow the type class system to understand what we want. For instance, for a ring R, the type n R is endowed with the point-wise multiplication operation, and similarly m n R has this operation which is not the multiplication we want on matrices.

In the first example below, we force Lean to see through the definition of Matrix and accept the statement as meaningful, and then prove it by checking all entries.

But then the next two examples reveal that Lean uses the point-wise multiplication on Fin 2 Fin 2 but the matrix multiplication on Matrix (Fin 2) (Fin 2) .

section

example : (fun _  1 : Fin 2  Fin 2  ) = !![1, 1; 1, 1] := by
  ext i j
  fin_cases i <;> fin_cases j <;> rfl

example : (fun _  1 : Fin 2  Fin 2  ) * (fun _  1 : Fin 2  Fin 2  ) = !![1, 1; 1, 1] := by
  ext i j
  fin_cases i <;> fin_cases j <;> rfl

example : !![1, 1; 1, 1] * !![1, 1; 1, 1] = !![2, 2; 2, 2] := by
  norm_num

In order to define matrices as functions without losing the benefits of Matrix for type class synthesis, we can use the equivalence Matrix.of between functions and matrices. This equivalence is secretly defined using Equiv.refl.

For instance we can define Vandermonde matrices corresponding to a vector v.

example {n : } (v : Fin n  ) :
    Matrix.vandermonde v = Matrix.of (fun i j : Fin n  v i ^ (j : )) :=
  rfl
end
end matrices

9.4.2. Bases

We now want to discuss bases of vector spaces. Informally there are many ways to define this notion. One can use a universal property. One can say a basis is a family of vectors that is linearly independent and spanning. Or one can combine those properties and directly say that a basis is a family of vectors such that every vectors can be written uniquely as a linear combination of bases vectors. Yet another way to say it is that a basis provides a linear isomorphism with a power of the base field K, seen as a vector space over K.

This isomorphism version is actually the one that Mathlib uses as a definition under the hood, and other characterizations are proven from it. One must be slightly careful with the “power of K” idea in the case of infinite bases. Indeed only finite linear combinations make sense in this algebraic context. So what we need as a reference vector space is not a direct product of copies of K but a direct sum. We could use i : ι, K for some type ι indexing the basis But we rather use the more specialized spelling ι →₀ K which means “functions from ι to K with finite support”, i.e. functions which vanish outside a finite set in ι (this finite set is not fixed, it depends on the function). Evaluating such a function coming from a basis B at a vector v and i : ι returns the component (or coordinate) of v on the i-th basis vector.

The type of bases indexed by a type ι of V as a K vector space is Basis ι K V. The isomorphism is called Basis.repr.

variable {K : Type*} [Field K] {V : Type*} [AddCommGroup V] [Module K V]

section

variable {ι : Type*} (B : Basis ι K V) (v : V) (i : ι)

-- The basis vector with index ``i``
#check (B i : V)

-- the linear isomorphism with the model space given by ``B``
#check (B.repr : V ≃ₗ[K] ι →₀ K)

-- the component function of ``v``
#check (B.repr v : ι →₀ K)

-- the component of ``v`` with index ``i``
#check (B.repr v i : K)

Instead of starting with such an isomorphism, one can start with a family b of vectors that is linearly independent and spanning, this is Basis.mk.

The assumption that the family is spanning is spelled out as Submodule.span K (Set.range b). Here is the top submodule of V, i.e. V seen as submodule of itself. This spelling looks a bit tortuous, but we will see below that it is almost equivalent by definition to the more readable v, v Submodule.span K (Set.range b) (the underscores in the snippet below refers to the useless information v ).

noncomputable example (b : ι  V) (b_indep : LinearIndependent K b)
    (b_spans :  v, v  Submodule.span K (Set.range b)) : Basis ι K V :=
  Basis.mk b_indep (fun v _  b_spans v)

-- The family of vectors underlying the above basis is indeed ``b``.
example (b : ι  V) (b_indep : LinearIndependent K b)
    (b_spans :  v, v  Submodule.span K (Set.range b)) (i : ι) :
    Basis.mk b_indep (fun v _  b_spans v) i = b i :=
  Basis.mk_apply b_indep (fun v _  b_spans v) i

In particular the model vector space ι →₀ K has a so-called canonical basis whose repr function evaluated on any vector is the identity isomorphism. It is called Finsupp.basisSingleOne where Finsupp means function with finite support and basisSingleOne refers to the fact that basis vectors are functions which vanish expect for a single input value. More precisely the basis vector indexed by i : ι is Finsupp.single i 1 which is the finitely supported function taking value 1 at i and 0 everywhere else.

variable [DecidableEq ι]

example : Finsupp.basisSingleOne.repr = LinearEquiv.refl K (ι →₀ K) :=
  rfl

example (i : ι) : Finsupp.basisSingleOne i = Finsupp.single i 1 :=
  rfl

The story of finitely supported functions is unneeded when the indexing type is finite. In this case we can use the simpler Pi.basisFun which gives a basis of the whole ι K.

example [Finite ι] (x : ι  K) (i : ι) : (Pi.basisFun K ι).repr x i = x i := by
  simp

Going back to the general case of bases of abstract vector spaces, we can express any vector as a linear combination of basis vectors. Let us first see the easy case of finite bases.

example [Fintype ι] :  i : ι, B.repr v i  (B i) = v :=
  B.sum_repr v

When ι is not finite, the above statement makes no sense a priori: we cannot take a sum over ι. However the support of the function being summed is finite (it is the support of B.repr v). But we need to apply a construction that takes this into account. Here Mathlib uses a special purpose function that requires some time to get used to: Finsupp.linearCombination (which is built on top of the more general Finsupp.sum). Given a finitely supported function c from a type ι to the base field K and any function f from ι to V, Finsupp.linearCombination K f c is the sum over the support of c of the scalar multiplication c f. In particular, we can replace it by a sum over any finite set containing the support of c.

example (c : ι →₀ K) (f : ι  V) (s : Finset ι) (h : c.support  s) :
    Finsupp.linearCombination K f c =  i  s, c i  f i :=
  Finsupp.linearCombination_apply_of_mem_supported K h

One could also assume that f is finitely supported and still get a well defined sum. But the choice made by Finsupp.linearCombination is the one relevant to our basis discussion since it allows to state the generalization of Basis.sum_repr.

example : Finsupp.linearCombination K B (B.repr v) = v :=
  B.linearCombination_repr v

One could wonder why K is an explicit argument here, despite the fact it can be inferred from the type of c. The point is that the partially applied Finsupp.linearCombination K f is interesting in itself. It is not a bare function from ι →₀ K to V but a K-linear map.

variable (f : ι  V) in
#check (Finsupp.linearCombination K f : (ι →₀ K) →ₗ[K] V)

The above subtlety also explains why dot notation cannot be used to write c.linearCombination K f instead of Finsupp.linearCombination K f c. Indeed Finsupp.linearCombination does not take c as an argument, Finsupp.linearCombination K f is coerced to a function type and then this function takes c as an argument.

Returning to the mathematical discussion, it is important to understand that the representation of vectors in a basis is less useful in formalized mathematics than you may think. Indeed it is very often more efficient to directly use more abstract properties of bases. In particular the universal property of bases connecting them to other free objects in algebra allows to construct linear maps by specifying the images of basis vectors. This is Basis.constr. For any K-vector space W, our basis B gives a linear isomorphism Basis.constr B K from ι W to V →ₗ[K] W. This isomorphism is characterized by the fact that it sends any function u : ι W to a linear map sending the basis vector B i to u i, for every i : ι.

section

variable {W : Type*} [AddCommGroup W] [Module K W]
         (φ : V →ₗ[K] W) (u : ι  W)

#check (B.constr K : (ι  W) ≃ₗ[K] (V →ₗ[K] W))

#check (B.constr K u : V →ₗ[K] W)

example (i : ι) : B.constr K u (B i) = u i :=
  B.constr_basis K u i

This property is indeed characteristic because linear maps are determined by their values on bases:

example (φ ψ : V →ₗ[K] W) (h :  i, φ (B i) = ψ (B i)) : φ = ψ :=
  B.ext h

If we also have a basis B' on the target space then we can identify linear maps with matrices. This identification is a K-linear isomorphism.

variable {ι' : Type*} (B' : Basis ι' K W) [Fintype ι] [DecidableEq ι] [Fintype ι'] [DecidableEq ι']

open LinearMap

#check (toMatrix B B' : (V →ₗ[K] W) ≃ₗ[K] Matrix ι' ι K)

open Matrix -- get access to the ``*ᵥ`` notation for multiplication between matrices and vectors.

example (φ : V →ₗ[K] W) (v : V) : (toMatrix B B' φ) *ᵥ (B.repr v) = B'.repr (φ v) :=
  toMatrix_mulVec_repr B B' φ v


variable {ι'' : Type*} (B'' : Basis ι'' K W) [Fintype ι''] [DecidableEq ι'']

example (φ : V →ₗ[K] W) : (toMatrix B B'' φ) = (toMatrix B' B'' .id) * (toMatrix B B' φ) := by
  simp

end

As an exercise on this topic, we will prove part of the theorem which guarantees that endomorphisms have a well-defined determinant. Namely we want to prove that when two bases are indexed by the same type, the matrices they attach to any endomorphism have the same determinant. This would then need to be complemented using that bases all have isomorphic indexing types to get the full result.

Of course Mathlib already knows this, and simp can close the goal immediately, so you shouldn’t use it too soon, but rather use the provided lemmas.

open Module LinearMap Matrix

-- Some lemmas coming from the fact that `LinearMap.toMatrix` is an algebra morphism.
#check toMatrix_comp
#check id_comp
#check comp_id
#check toMatrix_id

-- Some lemmas coming from the fact that ``Matrix.det`` is a multiplicative monoid morphism.
#check Matrix.det_mul
#check Matrix.det_one

example [Fintype ι] (B' : Basis ι K V) (φ : End K V) :
    (toMatrix B B φ).det = (toMatrix B' B' φ).det := by
  set M := toMatrix B B φ
  set M' := toMatrix B' B' φ
  set P := (toMatrix B B') LinearMap.id
  set P' := (toMatrix B' B) LinearMap.id
  sorry
end

9.4.3. Dimension

Returning to the case of a single vector space, bases are also useful to define the concept of dimension. Here again, there is the elementary case of finite-dimensional vector spaces. For such spaces we expect a dimension which is a natural number. This is Module.finrank. It takes the base field as an explicit argument since a given abelian group can be a vector space over different fields.

section
#check (Module.finrank K V : )

-- `Fin n → K` is the archetypical space with dimension `n` over `K`.
example (n : ) : Module.finrank K (Fin n  K) = n :=
  Module.finrank_fin_fun K

-- Seen as a vector space over itself, `ℂ` has dimension one.
example : Module.finrank   = 1 :=
  Module.finrank_self 

-- But as a real vector space it has dimension two.
example : Module.finrank   = 2 :=
  Complex.finrank_real_complex

Note that Module.finrank is defined for any vector space. It returns zero for infinite dimensional vector spaces, just as division by zero returns zero.

Of course many lemmas require a finite dimension assumption. This is the role of the FiniteDimensional typeclass. For instance, think about how the next example fails without this assumption.

example [FiniteDimensional K V] : 0 < Module.finrank K V  Nontrivial V  :=
  Module.finrank_pos_iff

In the above statement, Nontrivial V means V has at least two different elements. Note that Module.finrank_pos_iff has no explicit argument. This is fine when using it from left to right, but not when using it from right to left because Lean has no way to guess K from the statement Nontrivial V. In that case it is useful to use the name argument syntax, after checking that the lemma is stated over a ring named R. So we can write:

example [FiniteDimensional K V] (h : 0 < Module.finrank K V) : Nontrivial V := by
  apply (Module.finrank_pos_iff (R := K)).1
  exact h

The above spelling is strange because we already have h as an assumption, so we could just as well give the full proof Module.finrank_pos_iff.1 h but it is good to know for more complicated cases.

By definition, FiniteDimensional K V can be read from any basis.

variable {ι : Type*} (B : Basis ι K V)

example [Finite ι] : FiniteDimensional K V := FiniteDimensional.of_fintype_basis B

example [FiniteDimensional K V] : Finite ι :=
  (FiniteDimensional.fintypeBasisIndex B).finite
end

Using that the subtype corresponding to a linear subspace has a vector space structure, we can talk about the dimension of a subspace.

section
variable (E F : Submodule K V) [FiniteDimensional K V]

open Module

example : finrank K (E  F : Submodule K V) + finrank K (E  F : Submodule K V) =
    finrank K E + finrank K F :=
  Submodule.finrank_sup_add_finrank_inf_eq E F

example : finrank K E  finrank K V := Submodule.finrank_le E

In the first statement above, the purpose of the type ascriptions is to make sure that coercion to Type* does not trigger too early.

We are now ready for an exercise about finrank and subspaces.

example (h : finrank K V < finrank K E + finrank K F) :
    Nontrivial (E  F : Submodule K V) := by
  sorry
end

Let us now move to the general case of dimension theory. In this case finrank is useless, but we still have that, for any two bases of the same vector space, there is a bijection between the types indexing those bases. So we can still hope to define the rank as a cardinal, i.e. an element of the “quotient of the collection of types under the existence of a bijection equivalence relation”.

When discussing cardinal, it gets harder to ignore foundational issues around Russel’s paradox like we do everywhere else in this book. There is no type of all types because it would lead to logical inconsistencies. This issue is solved by the hierarchy of universes that we usually try to ignore.

Each type has a universe level, and those levels behave similarly to natural numbers. In particular there is zeroth level, and the corresponding universe Type 0 is simply denoted by Type. This universe is enough to hold almost all of classical mathematics. For instance and have type Type. Each level u has a successor denoted by u + 1, and Type u has type Type (u+1).

But universe levels are not natural numbers, they have a really different nature and don’t have a type. In particular you cannot state in Lean something like u u + 1. There is simply no type where this would take place. Even stating Type u Type (u+1) does not make any sense since Type u and Type (u+1) have different types.

Whenever we write Type*, Lean inserts a universe level variable named u_n where n is a number. This allows definitions and statements to live in all universes.

Given a universe level u, we can define an equivalence relation on Type u saying two types α and β are equivalent if there is a bijection between them. The quotient type Cardinal.{u} lives in Type (u+1). The curly braces denote a universe variable. The image of α : Type u in this quotient is Cardinal.mk α : Cardinal.{u}.

But we cannot directly compare cardinals in different universes. So technically we cannot define the rank of a vector space V as the cardinal of all types indexing a basis of V. So instead it is defined as the supremum Module.rank K V of cardinals of all linearly independent sets in V. If V has universe level u then its rank has type Cardinal.{u}.

#check V -- Type u_2
#check Module.rank K V -- Cardinal.{u_2}

One can still relate this definition to bases. Indeed there is also a commutative max operation on universe levels, and given two universe levels u and v there is an operation Cardinal.lift.{u, v} : Cardinal.{v} Cardinal.{max v u} that allows to put cardinals in a common universe and state the dimension theorem.

universe u v -- `u` and `v` will denote universe levels

variable {ι : Type u} (B : Basis ι K V)
         {ι' : Type v} (B' : Basis ι' K V)

example : Cardinal.lift.{v, u} (.mk ι) = Cardinal.lift.{u, v} (.mk ι') :=
  mk_eq_mk_of_basis B B'

We can relate the finite dimensional case to this discussion using the coercion from natural numbers to finite cardinals (or more precisely the finite cardinals which live in Cardinal.{v} where v is the universe level of V).

example [FiniteDimensional K V] :
    (Module.finrank K V : Cardinal) = Module.rank K V :=
  Module.finrank_eq_rank K V