\nonstopmode
\documentclass{scrartcl} % KOMA-Script article
\usepackage{amsmath}
% current annoyance: this will be fixed
% by the next update of agda.fmt
\def\textmu{}
\newcommand{\RR}{\mathbb{R}}
\newcommand{\hide}[1]{}
%include agda.fmt
% %include polycode.fmt
\title{An Introduction to Dependent Types and Agda}
\author{Andreas Abel}
\date{3 July 2009}
\begin{document}
\maketitle
\section{Types}
Types in programming languages serve several purposes:
\begin{enumerate}
\item Find runtime-errors at compile-time.
\item Enable compiler optimizations.
\item Allow the programmer to express some of his intention about the
code.
\item Provide a machine-checked documenation.
\end{enumerate}
Strongly typed languages include JAVA and Haskell. Dependent types
allow the programmer to add even more redundant information about his
code which leads to detection logical errors or full program
verification.
\subsection{What is a dependent type?}
Dependent types mean that types can refer to values, not just to other
types. From the mathematics, uses of dependent types are familiar.
For instance, in linear algebra we define the type of vectors $\RR^n$
of length $n$ by
\[
\begin{array}{lll}
\RR^n = \underbrace{\RR \times \dots \times \RR}_{n \mbox{ times}}
\end{array}
\]
and the inner product of two vectors of length $n$ receives type
$\RR^n \times \RR^n \to \RR$. This type is dependent on the value
$n$. Its purpose is to make clear that the inner product is not
defined on two arbitrary vectors of reals, but only of those of the
same length. Most strongly typed programming languages do not support
dependent types. Haskell has a rich type system that can simulate
dependent types to some extent \cite{mcBride:fakingIt}.
Dependently typed languages include Agda, Coq, Omega, ATS (replacing
DML). Cayenne is no longer supported.
\section{Core Features of Agda}
In the following, we give a short introduction into dependent types
using the language Agda. Agda is similar to Haskell in many aspects,
in particular, \emph{indentation matters}!
\begin{code}
module DepTypes where
\end{code}
\subsection{Dependent Function Type}
As opposed to ordinary function types |A -> B|,
dependent function types |(x : A) -> B| allow to give a name, |x|, to
the domain |A| of the function space. The name |x| can be
subsequently used in the codomain |B| to refer to the particular value
|x| of type |A| that has been passed to the function.
Using a dependent function type, we can specify the |inner| product
function as follows:
\begin{spec}
inner : (n : Nat) -> Vect Nat n -> Vect Nat n -> Nat
\end{spec}
Herein |Vect Nat n| denotes the type of vectors of natural numbers of
length |n|, which we will define later. An application |inner 5 v w|
is only well-typed if |v| and |w| both have length |5|.
\subsection{Inductive Types}
As in Haskell, we can declare new data types by giving their
constructors. Only the new |data| syntax of Haskell is supported by
Agda:
\begin{code}
data Nat : Set where
zero : Nat
suc : Nat -> Nat
\end{code}
This means that we introduce a new type, a |Set| in Agda terminology,
with a nullary constructor |zero| and a unary, recursive constructor
|suc|. Thus, natural numbers are possibly empty
sequences of |suc| terminated by a |zero|, which is a unary
presentation of natural numbers, as opposed to a binary representation.
Polymorphic data types, \emph{parametric} data types in proper
terminology, can be defined by providing a sequence of variable
declarations just after the name of the data type. For example,
polymorphic lists have one parameter |A : Set|, which is the list
element type. All the parameter names are in scope in the constructor
declarations.
\begin{code}
data [_] (A : Set) : Set where
[] : [ A ]
_::_ : A -> [ A ] -> [ A ]
\end{code}
Agda supports pre-, post-, in-, and mixfix identifiers. Here, we
have declared the type |[ A ]| of lists over |A| as mixfix identifier,
the constructor |[]| for empty lists is an ordinary identifier but
made up of special symbols, the constructor |_::_| is infix, to be
used in the form |x :: xs|.
\subsection{Inductive Families}
Vectors are lists over a certain element type |A| of a certain length
|n|. While the \emph{parameter} |A| remains fixed for the whole list, the
\emph{index} |n| varies for each sublist. Indexed inductive types are
called inductive families and declared like
\begin{spec}
data Vect (A : Set) : Nat -> Set where
vnil : Vect A zero
vcons : (n : Nat) -> A -> Vect A n -> Vect A (suc n)
\end{spec}
Note that |Vect| itself is of type |Set -> Nat -> Set| and |Vect A| is
of type |Nat -> Set|.
\subsection{Recursive Definitions and Pattern Matching}
In order to define the inner product of two vectors of natural
numbers, we define addition and multiplication for natural numbers first.
\begin{code}
infix 2 _+_
infix 3 _*_
_+_ : Nat -> Nat -> Nat
n + zero = n
n + suc m = suc (n + m)
_*_ : Nat -> Nat -> Nat
n * zero = zero
n * suc m = n * m + n
\end{code}
Both are defined by recursion and case distinction over the second
argument.
Our first attempt to define the inner product is:
\begin{spec}
inner : (n : Nat) -> Vect Nat n -> Vect Nat n -> Nat
inner zero vnil vnil = zero
inner (suc n) (vcons n x xs) (vcons n y ys) = x * y + inner n xs ys
-- FAILS
\end{spec}
However this fails. The second clause for |inner| violates the
\emph{linearity condition}. The variable |n| is mentioned thrice in
the patterns on the left hand side. Each variable can however only be
bound once. What we meant to express is that whatever the values of
the three occurrences of |n| are, because of the type of |inner| we
know they are equal. This can be expressed properly using Agda's
\emph{dot patterns}.
\begin{spec}
inner : (n : Nat) -> Vect Nat n -> Vect Nat n -> Nat
inner zero vnil vnil = zero
inner (suc .n) (vcons .n x xs) (vcons n y ys) = x * y + inner n xs ys
\end{spec}
There is now only one binding occurrence of |n|, the other two
occurrences have been dotted. A dot can be followed by \emph{any}
expression, it does not have to be a variable as in our case. What
|.n| it means is: \emph{whatever stands here, do not match against it,
for I know it is equal to |n|}.
\subsection{Omitted and Hidden Arguments}
A special expression is the hole |_| which stands for any
expression we do not care about. Agda's unification procedure tries
to find the correct expressions which fit into the holes for us. For
instance we can leave the administration of the vector length to Agda,
since it is inferable from the constructors |vnil| and |vcons|.
\begin{spec}
inner : (n : Nat) -> Vect Nat n -> Vect Nat n -> Nat
inner ._ vnil vnil = zero
inner ._ (vcons ._ x xs) (vcons _ y ys) = x * y + inner _ xs ys
\end{spec}
We can even hide things we do not care about completely. The hidden
dependent function space |{x : A} -> B| contains functions whose
argument is declared as hidden, i.e., it is not written by default,
but one can supply it enclosed in braces. It is convenient to hide
the length annotations in vectors so that they look like lists.
\begin{code}
data Vect (A : Set) : Nat -> Set where
vnil : Vect A zero
vcons : {n : Nat} -> A -> Vect A n -> Vect A (suc n)
inner : {n : Nat} -> Vect Nat n -> Vect Nat n -> Nat
inner vnil vnil = zero
inner (vcons x xs) (vcons y ys) = x * y + inner xs ys
\end{code}
Internally, the last definition of |inner| is read as:
\begin{spec}
inner : {n : Nat} -> Vect Nat n -> Vect Nat n -> Nat
inner {_} vnil vnil = zero
inner {_} (vcons {._} x xs) (vcons {._} y ys) = x * y + inner {_} xs ys
\end{spec}
Can we really not fool the type checker? Let us test it.
\begin{code}
one = suc zero
two = suc one
v1 = vcons one vnil
v2 = vcons zero v1
\end{code}
Now the following program is rejected by the type checker:
\begin{spec}
foo = inner v1 v2 -- FAILS
\end{spec}
\section{Some Library Functions for Vectors}
In the following, we gain some more experiences with Agda by playing
around with vectors. Let us try concatenation of vectors first.
\begin{spec}
append : {A : Set}{n m : Nat} ->
Vect A n -> Vect A m -> Vect A (n + m) -- FAILS
append vnil ys = ys
append (vcons x xs) ys = vcons x (append xs ys)
\end{spec}
This code is rejected by Agda with the error message |.m != zero + .m
of type Nat| pointing at the right hand side of the first clause.
What is going on here? Since |vnil : Vect A n| the type checker
infers |n = zero|, thus it expects the right hand side |ys| to be of type
|Vect A (zero + m)|. It has type |Vect A m|, and zero plus something
is something, so where is the problem? The problem is that we know
knowledge about addition that Agda does not have. By definition, Agda
knows that |m + zero = m|, but it does not know that addition is
commutative.\footnote{Agda is dumb as a chicken!} Flipping the sum
solves our problem:
\begin{code}
append : {A : Set}{n m : Nat} ->
Vect A n -> Vect A m -> Vect A (m + n)
append vnil ys = ys
append (vcons x xs) ys = vcons x (append xs ys)
\end{code}
We got away this time, but in other cases we might have to teach Agda
that addition is commutative!
A nice application of vectors is that looking up the element at a
certain index can be made safe statically. I.e., we can define a
lookup function |_!!_| that only accept indices that are below the
length of the vector. The trick is done using finite sets:
\begin{code}
data Fin : Nat -> Set where
fzero : {n : Nat} -> Fin (suc n)
fsuc : {n : Nat} -> Fin n -> Fin (suc n)
\end{code}
The finite set |Fin n| contains exactly |n| elements. In particular,
|Fin 0| is empty, and |Fin (suc n)| contains the elements |fzero|,
|fsuc fzero|, \dots, |fsuc|$^{n-1}$ |fzero|.
Using |Fin|, we construct the following total lookup function
\begin{code}
_!!_ : {A : Set}{n : Nat} -> Vect A n -> Fin n -> A
vnil !! ()
vcons x xs !! fzero = x
vcons x xs !! fsuc m = xs !! m
\end{code}
The first clause uses the absurd pattern |()|. If the vector is
|vnil|, then |n = 0| and |Fin n| is empty, so there is no match for
the index. Of course, no right hand side has to be given in this case.
Again, we cannot fool the type checker:
\begin{spec}
foo = vnil !! fzero -- FAILS
\end{spec}
\section{Sorted Lists and Logic in Agda}
Let us introduce booleans and comparison of naturals:
\begin{code}
data Bool : Set where
true : Bool
false : Bool
if_then_else_ : {A : Set} -> Bool -> A -> A -> A
if true then b1 else b2 = b1
if false then b1 else b2 = b2
_≤_ : Nat -> Nat -> Bool
zero ≤ n = true
(suc m) ≤ zero = false
(suc m) ≤ (suc n) = m ≤ n
\end{code}
An idea to implement a type of descendingly sorted lists of natural
numbers is to index the list type by a lower bound for the next
element which can be prepended.
\begin{spec}
data SList : Nat -> Set where
snil : SList zero
scons : {n : Nat}(m : Nat) -> (n ≤ m) -> SList n -> SList m
-- FAILS
\end{spec}
The empty list is indexed by |zero| so any natural number can be
prepended without violating sorting. The list |scons {n} m _ l| is
only sorted if |n ≤ m|. The placeholder needs to be filled with some
evidence for this fact, or in mathematical terms we need to provide a
proof that can be checked by Agda!
However, the way we wrote it above does not make sense. The term
|n ≤ m| is a Boolean, i.e., either |true| or |false|, thus, it is a value
and not a set and we cannot form a function space from a value to a set.
\subsection{The Curry Howard Correspondence}
We need to find a representation of truth values as sets, such that we
can integrate evidence for valid conditions, such as |n ≤ m| into data
structures. The solution is based on an observation by Haskell Curry (1934
and 1958) and William A. Howard (1969) that a proposition can be viewed
as the set of its proofs, and proofs correspond to programs, at least
in intuitionistic logic, which is a logic without the principle of the
excluded middle. The correspondence is called
Curry-Howard-Correspondence or -Isomorphism, sometimes also
Curry-Howard-de Bruijn Correspondence.
\begin{code}
Proposition = Set
\end{code}
An empty set corresponds to a unprovable, i.e., false proposition, an
inhabited (i.e., non-empty) set to a true proposition.
\begin{code}
data Absurd : Proposition where
data Truth : Proposition where
tt : Truth
\end{code}
Consequently, absurdity can be modelled as a data type with no
constructors, and truth as a data type with a constructor that has no
arguments. Pure truth needs no further evidence (but also conveys no
information).
\begin{code}
data _∧_ (A B : Proposition) : Proposition where
_,_ : A -> B -> A ∧ B
\end{code}
To prove the conjunction |A ∧ B|, we need a proof |a : A| of |A| and a proof
|b : B| of |B|, which we put together to form the pair |a , b|.
\begin{code}
fst : {A B : Proposition} -> A ∧ B -> A
fst (a , b) = a
snd : {A B : Proposition} -> A ∧ B -> B
snd (a , b) = b
\end{code}
If we have a proof of |A ∧ B|, we can obtain proofs of |A| and |B| by
projection. Viewed as set, the conjunction is just the Cartesian product.
\begin{code}
_×_ = _∧_
\end{code}
A disjunction |A ∨ B| is proven by either providing a proof of |A| or
a proof of |B|. In Agda this is modelled by a data type with two
constructors |inl| and |inr|, the first to turn a proof of |A| into a
proof of |A ∨ B| and the second to turn a proof of |B| into a proof of
|A ∨ B|.
\begin{code}
data _∨_ (A B : Proposition) : Proposition where
inl : A -> A ∨ B
inr : B -> A ∨ B
\end{code}
A proof of a disjunction |A ∨ B| can be used by performing case analysis,
i.e., distinguishing whether a proof of |A| was given or a proof of
|B|.
\begin{code}
case : {A B C : Proposition} -> A ∨ B -> (A -> C) -> (B -> C) -> C
case (inl a) f g = f a
case (inr b) f g = g b
\end{code}
So if we have a method |f : A -> C| to turn a proof of |A| into a
proof of |C|, and we have a method |g : B -> C| to obtain a proof of
|C| from a proof of |B|, then given a proof of |A ∨ B| we obtain a
proof of |C| by case analysis.
Implication is just the functions space. Thus, a proof |f : A -> B|
of the implication ``|A| implies |B|'', is a function which computes a
proof of |B| from a proof of |A|. For example, we can prove the
following two (trivial) propositions:
\begin{code}
lemma : {A : Proposition} -> A -> A ∧ Truth
lemma a = (a , tt)
comm∧ : {A B : Proposition} -> A ∧ B -> B ∧ A
comm∧ (a , b) = (b , a)
\end{code}
The first lemma states that |A| implies |A ∧ Truth|, its proof is a program
taking |a : A| and producing the pair |a , tt|. The second shows that
conjunction is commutative, the proof just swaps the two components of
the pair.
\subsection{Booleans and Propositions}
We can translate booleans into propositions by mapping |true| to
|Truth| and |false| to |Absurd|.
\begin{code}
True : Bool -> Proposition
True true = Truth
True false = Absurd
\end{code}
The opposite translation is not possible, since there are more
propositions than just the trivial ones (truth and absurdity).
\subsection{Proof by Induction}
Now we can show that comparison |≤| is reflexive.
\begin{code}
refl≤ : (n : Nat) -> True (n ≤ n)
refl≤ zero = tt
refl≤ (suc n) = refl≤ n
\end{code}
The proof of |True (n ≤ n)| proceeds by induction on |n : Nat|. In
case |zero|, the term |zero ≤ zero| simplifies to |true| by definition
of |≤|. Thus, it remains to show |True true| which in turn simplifies
to |Truth| and this has the trivial proof |tt|. In case |suc n| we
have to show |True (suc n ≤ suc n)| which simplifies to |True (n ≤ n)|
by definition. We conclude by the induction hypothesis, which is
obtained via the induction hypothesis |refl≤ n : True (n ≤ n)|.
By the Curry-Howard correspondence, a proof by induction is a
recursive function over the natural numbers, with one important
condition: It needs to be terminating on all inputs. Termination of
|refl≤| is checked by Agda using the \emph{structural ordering} on the
natural numbers. The term |n| is structurally smaller than |suc n|
since |n| is an argument to the constructor |suc|. The structural
ordering is wellfounded for any data type, thus, termination checking
using the structural ordering is sound.
Using non-terminating programs, we can prove false propositions:
%% BEGIN WORKAROUND lhs2TeX
\hide{
\begin{code}
{-# NO_TERMINATION_CHECK #-}
\end{code}
}
\begin{verbatim}
{-# NO_TERMINATION_CHECK #-}
\end{verbatim}
%% END
\begin{code}
bla : (n : Nat) -> True (one ≤ zero)
bla n = bla (suc n)
\end{code}
\hide{
\begin{code}
{-# NO_TERMINATION_CHECK #-}
\end{code}
}
\begin{verbatim}
{-# NO_TERMINATION_CHECK #-}
\end{verbatim}
\begin{code}
foo : Absurd
foo = foo
\end{code}
The Agda termination checker marks all programs that fail the
termination check in red
color, indicating that it cannot guarantee logical consistency in the
presence of non-termination. However, it does not report an error,
since the program might terminate even if Agda cannot guarantee it.
(Remember the halting problem: termination is undecidable in general.)
Another source of logical inconsistency are incomplete pattern
matches, for instance:
\begin{spec}
A : Bool -> Proposition
A true = Truth
A false = True (one ≤ zero)
prf : (b : Bool) -> A b
prf true = tt
-- FAILS
bar : True (one ≤ zero)
bar = prf false
\end{spec}
The proof |prf| is incomplete the case |b = false| has been omitted,
which happens to be unprovable. Now |bar| is a proof of an
inconsistency, which if executed, would result in a runtime error.
However, Agda already rejects |prf| since the pattern matching is
incomplete. By declaring the options
\begin{spec}
{-# OPTIONS --no-termination-check --no-coverage-check #-}
\end{spec}
termination and case coverage check can be turned off, which might be
convenient during drafting a program/verification project.
\subsection{Statically Sorted Lists}
We can now come back to the data type of descending lists.
\begin{code}
data SList : Nat -> Set where
snil : SList zero
scons : {n : Nat}(m : Nat) -> True (n ≤ m) -> SList n -> SList m
slist1 = scons one tt (scons zero tt snil)
\end{code}
Whenever we prepend a new element to the list with |scons|, we need to
provide a proof that the new element |m| is greater or equal to the
head element |n| of the List (or |zero| if the list is empty). For
concrete numbers, these proofs are always trivial, since |≤| is
decidable.
This completes the crash course in Agda. More information can be found in Ulf
Norell's tutorial or the LERNET summer school notes by Ana Bove and
Peter Dybjer.
\bibliographystyle{alpha}
\bibliography{auto-DepTypes}
\end{document}