Type Inhabitation in Template Haskell
Template Haskell allows us to write metaprograms, in the code-as-data sense of programs that manipulate Haskell code (in syntax tree form, represented as ADTs). In principle, this allows us to write a program that takes in the code of a Haskell type, and finds a term that satisfies this type - this is the type inhabitation problem. I will attempt to do exactly this, but only for the fragment of Haskell that corresponds to the pure simply typed lambda calculus with only type variables and function types:
\[ \begin{array}{r c l} {A, B} & ::= & {\phi \ |\ A \rightarrow B}\\ \end{array} \qquad \begin{array}{r c l} {M, N} & ::= & {x \ |\ \lambda x.M \ |\ M N}\\ \end{array} \]
\[ \begin{prooftree} \AxiomC{} \RightLabel{$(\textrm{Ax})$} \UnaryInfC{$\Gamma, A \vdash A$} \end{prooftree} \qquad \begin{prooftree} \AxiomC{$\Gamma, A \vdash B$} \RightLabel{$(\rightarrow\textrm{I})$} \UnaryInfC{$\Gamma \vdash A \rightarrow B$} \end{prooftree} \qquad \begin{prooftree} \AxiomC{$\Gamma \vdash A \rightarrow B$} \AxiomC{$\Gamma \vdash A$} \RightLabel{$(\rightarrow\textrm{E})$} \BinaryInfC{$\Gamma \vdash B$} \end{prooftree} \]
Basics of (Untyped) Template Haskell
We will be working with the untyped variant of Template Haskell where we can
build up program terms that won’t type check. The typed variant is safer to work
with but more complicated (perhaps a future post). The key components of Template
Haskell are:
1. the Exp
, Type
, Dec
and Pat
data types which represent the syntax
trees of expressions, types, function declarations and patterns (for pattern
matching) respectively
2. the quote
monad typeclass (concretely instantiated by the Q
datatype) which encapsulates information about the syntax tree we are
building, including auxiliary information such as variable counters for
generating fresh variables. Given a term of type Q Exp
for example we
can unquote it using the $(...)
operator which during compilation is
replaced by the code built up in the Q Exp
term we pass in.
Here’s an example generating the n-argument curry
function,
courtesy of the Haskell wiki:
{-# LANGUAGE TemplateHaskell #-}
module Curry where
import Control.Monad
import Language.Haskell.TH
-- given n, construct the lambda term that performs currying for n arguments
curryN :: Int -> Q Exp
= do
curryN n <- newName "f" -- newName generates a fresh name based on the
f <- replicateM n (newName "x") -- given string by appending a counter number
xs let args = map VarP (f:xs)
= TupE (map VarE xs)
ntup return $ LamE args (AppE (VarE f) ntup) -- builds the code for \f x1 ... xn -> f (x1 ... xn)
genCurries :: Int -> Q [Dec]
= forM [1..n] mkCurryDec
genCurries n where
= do
mkCurryDec ith <- curryN ith
cury let name = mkName $ "curry" ++ show ith -- mkName verbatim converts the given
return $ FunD name [Clause [] (NormalB cury) []] -- string into a name, unlike newName
-- builds the curry declarations curry15 = \f x1 ... x15 -> f (x1 ... x15)
As long as these functions are defined in another module, we can import them and unquote the result of applying these functions:
import Curry
curry4 :: ((a, b, c, d) -> e) -> a -> b -> c -> d -> e
= $(curryN 4)
curry4
-- you can also unquote the list of declarations directly. This declares curry1,
-- curry2 ... curry100
$(genCurries 100)
A Naive Attempt at Type inhabitation
There’s a simple type inhabitation algorithm that we can start with, that finds normal forms of a given type. Here is its pseudocode:
inhabitation Γ (A -> B) = λx. M
where
x is a fresh variable
M = inhabitation (Γ, x:A) B
inhabitation Γ B = x M_1 ... M_n
where
x:A_1 -> ... -> A_n -> B ∈ Γ (for some n ≥ 0)
M_i = inhabitation Γ A_i (for each 1 ≤ i ≤ n)
This algorithm does not terminate for certain types such as (A -> A) -> A
: to
begin with, we add x:A -> A
to the empty context and recursively attempt to
inhabit A
with this new context. In attempting to do so, the only suitable
variable in the context is x:A -> A
, which has A
as its only input type.
Hence we must recursively attempt to inhabit A
with the same context as
before, so we are stuck in an infinite loop. We hopefully can do better than
this, but its a good place to start.
To begin with, what should the type of the function be? For the context, we
represent this as a list of tuples mapping variable names to types. The function
is expected to return a program of the given type, but it may fail, so we should
return a Q (Maybe Exp)
. However, dealing with nested monads is annoying so
instead, we go for the transformer MaybeT Q Exp
.
inhabitation :: [(Name, Type)] -> Type -> MaybeT Q Exp
In the first case for types A -> B
, we have a rather straightforward
translation from pseudocode:
AppT (AppT ArrowT t1) t2) = do -- arrow type constructor is partially applied
inhabitation cxt (<- lift $ newName "x"
x <- inhabitation ((x, t1):cxt) t2
body return $ LamE [VarP x] body
In the second case, things are slightly more complicated, mostly due to having to deal with the recursive structure of repeated arrows and lambda applications:
= do
inhabitation cxt t head, headType) <- MaybeT . return $ find correctReturnType cxt
(<- mapM (inhabitation cxt) (getInputTypes headType)
args return $ foldl AppE (VarE head) args
where
-- This is used to check whether there is an x:A_1 -> ... -> A_n -> B ∈ Γ
correctReturnType :: (Name, Type) -> Bool
AppT (AppT ArrowT _) ret)
correctReturnType (x, | t == ret = True
| otherwise = correctReturnType (x, ret)
= t == ret
correctReturnType (_, ret)
-- This collects the input types A_1 ... A_n into a list so we can
-- recursively map the inhabitation function, and fold the result into a
-- repeatedly applied term
AppT (AppT ArrowT input) rest)
getInputTypes (= input : getInputTypes rest
getInputTypes ret= []
With this, we can set up a convenience function to wrap around inhabitation
by
converting the output into a Q (Maybe Exp)
and actually setting up a
declaration.
-- this will give an exception if the inhabitation returned empty-handed
inhabit :: String -> Q Type -> Q [Dec]
= do
inhabit name typ <- typ
t <- liftM fromJust $ inhabitation' t
term return $ [FunD (mkName name) [Clause [] (NormalB term) []]]
inhabitation' :: Type -> Q (Maybe Exp)
ForallT _ _ t) = inhabitation t -- See below for why we need to unwrap foralls
inhabitation' (= runMaybeT $ inhabitation' [] t inhabitation' t
inhabit
should now allow us to construct function declarations built by type
inhabitation. However, we have to build up a Type
syntax tree to pass in,
which is tedious. Luckily, there’s a quasi-quoter feature of Template Haskell
that lets us convert regular typed code into syntax tree objects. This is best
shown by example:
AppT (AppT ArrowT (ConT $ mkName "Int")) (ConT $ mkName "Int")
[t|Int -> Int|] ≡
error,
[t|a -> b -> a|] ≡ an type variables a and b occur free. The alternative is to either
because the or bind it using a forall and then later
build the syntax tree by constructors, forall types (as I did in inhabitation').
remove the
ForallT [PlainTV a_2 SpecifiedSpec] []
[t|forall a. a -> a|] ≡ AppT (AppT ArrowT (VarT a_2)) (VarT a_2)) (
Finally, this allows us to declare new functions by type inhabitation:
$(inhabit "i" [t|forall a. a -> a|])
$(inhabit "k" [t|forall a b. a -> b -> a|])
$(inhabit "s" [t|forall a b c. (a -> b -> c) -> (a -> b) -> a -> c|])
Because the declarations are not typed, the compiler will infer the principal type from the program that was found, which leads to a curious effect:
$(inhabit "k'" [t|forall a. a -> a -> a|])
-- but ghci gives:
-- ghci> :t k'
-- k' :: p1 -> p2 -> p2
Attempting to inhabit types with constants such as Int
will often fail unless
you populate the context with the built-in and prelude functions first, and of
course, $(inhabit "y" [t|forall a. (a -> a) -> a|])
causes the compiler to
endlessly loop.