\nonstopmode
\documentclass{scrartcl} % KOMA-Script article
\usepackage{amsmath}
%include agda.fmt
%% dot patterns:
%format . = "\mathord{.\!\!}"
\DeclareUnicodeCharacter{9675}{\ensuremath{\circ}}
\title{Verifying Program Optimizations in Agda \\
Case Study: List Deforestation}
\author{Andreas Abel}
\date{3 July 2012}
\begin{document}
\maketitle
This is a case study on proving program optimizations correct. We
prove the |foldr-unfold| fusion law, an instance of deforestation. As
a result we show that the summation of the first |n| natural numbers,
implemented by producing the list |n ∷ ... ∷ 1 ∷ 0 ∷ []| and summing
up the its elements, can be automatically optimized into a version
which does not use an intermediate list.
\begin{code}
module Fusion where
open import Data.Maybe
open import Data.Nat
open import Data.Product
open import Data.List hiding (downFrom)
open import Relation.Binary.PropositionalEquality
import Relation.Binary.EqReasoning as Eq
\end{code}
From |Data.List| we import |foldr| which is the standard iterator for
lists.
\begin{spec}
foldr : {a b : Set} → (a → b → b) → b → List a → b
foldr c n [] = n
foldr c n (x ∷ xs) = c x (foldr c n xs)
\end{spec}
Further, |sum| sums up the elements of a list by replacing |[]| by |0|
and |_∷_| by |+|.
\begin{spec}
sum : List ℕ → ℕ
sum = foldr _+_ 0
\end{spec}
Finally, |unfold| is a generic list producer. It takes two
parameters, |f : B → Maybe (A × B)|, the transition function, and |s :
B|, the start state. Now |f| is iterated on the start state. If the
result of applying |f| on the current state is |nothing|, an empty
list is output and the list production terminates. If the application
of |f| yields |just (x , s')| then |x| is taken to be the next
element of the list and |s'| the new state of the production.
In Agda, everything needs to terminate, so we add a (hidden) parameter
|n : ℕ| which is an upper bound on the number of elements to be
produced. Each iteration decreases this number. Consequently the
type |B : ℕ → Set| is now parameterized by |n|, and |f : ∀ {n} → B
(suc n) → Maybe (A × B n)| can only be applied to a state |B (suc n)|
where still an element could be output.
\begin{spec}
unfold : {A : Set} (B : ℕ → Set)
(f : ∀ {n} → B (suc n) → Maybe (A × B n)) →
∀ {n} → B n → List A
unfold B f {n = zero} s = []
unfold B f {n = suc n} s with f s
... | nothing = []
... | just (x , s') = x ∷ unfold B f s'
\end{spec}
A typical instance of |unfold| is
the function |downFrom| from the standard library with
the behavior |downFrom 3 = 2 ∷ 1 ∷ 0 ∷ []|. We reimplement it here,
avoiding local definitions as used in the standard library.
\begin{code}
data Singleton : ℕ → Set where
wrap : (n : ℕ) → Singleton n
downFromF : ∀ {n} → Singleton (suc n) → Maybe (ℕ × Singleton n)
downFromF {n} (wrap .(suc n)) = just (n , wrap n)
downFrom : ℕ → List ℕ
downFrom n = unfold Singleton downFromF (wrap n)
\end{code}
\begin{code}
sumFrom : ℕ → ℕ
sumFrom zero = zero
sumFrom (suc n) = n + sumFrom n
\end{code}
Our goal is to show the theorem |∀ n → sum (downFrom n) ≡ sumFrom n|.
The theorem follows from general considerations:
\begin{itemize}
\item |sum| is a |foldr|, it consumes a list.
\item |downFrom| is a |unfold|, it produces a list.
\end{itemize}
The list is only produced to be consumed again.
Can we optimize away the intermediate list?
Removing intermediate data structures is called \emph{deforestation},
since data structures are tree-shaped in the general case.
In our case, we would like to fuse an |unfold| followed by a |foldr|
into a single function |foldUnfold| which does not need lists.
We observe that a |foldr| after an |unfold| satisfies the following equations:
\begin{spec}
foldr c n (unfold B f {zero} s) = n
foldr c n (unfold B f {suc m} s | f s = nothing) = n
foldr c n (unfold B f {suc m} s | f s = just (x , s'))
= foldr c n (x ∷ unfold B f s')
= c x (foldr c n (unfold B f s'))
\end{spec}
In the recursive case, the pattern |foldr c n . unfold B f|
resurfaces, and it contains all the recursive calls to |foldr| and
|unfold|. Hence, we can introduce a new function |foldUnfold| as
\begin{spec}
foldUnfold c n B f = foldr c n ○ unfold B f
\end{spec}
\begin{code}
foldUnfold : {A C : Set} → (A → C → C) → C →
(B : ℕ → Set) → (∀ {n} → B (suc n) → Maybe (A × B n)) →
{n : ℕ} → B n → C
foldUnfold c n B f {zero} s = n
foldUnfold c n B f {suc m} s with f s
... | nothing = n
... | just (x , s') = c x (foldUnfold c n B f {m} s')
\end{code}
|foldUnfold| does not produce an intermediate list.
It is easy to show that the definition of |foldUnfold| is correct.
\begin{code}
foldr-unfold : {A C : Set} → (c : A → C → C) → (n : C) →
(B : ℕ → Set) → (f : ∀ {n} → B (suc n) → Maybe (A × B n)) →
{m : ℕ} → (s : B m) →
foldr c n (unfold B f s) ≡ foldUnfold c n B f s
foldr-unfold c n B f {zero} s = refl
foldr-unfold c n B f {suc m} s with f s
... | nothing = refl
... | just (x , s') = cong (c x) (foldr-unfold c n B f {m} s')
\end{code}
|sumFrom| is a special case of |foldUnfold|.
\begin{code}
lem1 : ∀ {n} → foldUnfold _+_ 0 Singleton downFromF (wrap n) ≡ sumFrom n
lem1 {zero} = refl
lem1 {suc n} = cong (\ m → n + m) (lem1 {n})
\end{code}
Our theorem follows by composition of the two lemmata.
\begin{code}
thm : ∀ {n} → sum (downFrom n) ≡ sumFrom n
thm {n} = begin
sum (downFrom n)
≡⟨ refl ⟩
foldr _+_ 0 (unfold Singleton downFromF (wrap n))
≡⟨ foldr-unfold _+_ 0 Singleton downFromF (wrap n) ⟩
foldUnfold _+_ 0 Singleton downFromF (wrap n)
≡⟨ lem1 {n} ⟩
sumFrom n
∎
where open ≡-Reasoning
\end{code}
That's it!
\end{document}