Unification algorithm
On of the key steps involved in type inference is the unification algorithm. Given two type and a unifier if it exists is a type such that is a specialisation of both and . The most general unifier of and , if it exists, is the unifier which is most general, i.e. it is the unifier of and such that all other unifiers is a specialisation of . In this lecture we develop an algorithm to compute the most general unifier of two types and .
1 2 3 4 5 6 7 8 9 10 | > {- | Module defining the unification algorithm -} > > module Unification where > import Lambda -- See the last lecture > import qualified Data.Map as M -- To define specialisation. > import Data.Either > import Data.Maybe > > type Map = M.Map
|
Although we defined the unifier as a type, it is convenient when computing the unifier of and to compute the type specialisation that unifies them to their most general unifier. We will therefore abuse the term most general unifier to also mean this specialisation. Also for simplicity we drop the adjective ``most general'' and just use unifier to mean the most general unifier.
1 2 3 4 5 6 | > type Specialisation = Map String Type -- ^ A type specialisation > type Result = Either String -- ^ The error message if it failed > Specialisation -- ^ The unifier > > unify :: Type -> Type -> Result
|
A type specialisation is captured by a Map
from type variable names to the corresponding specialisation. We given an function to compute given a specialisation .
1 2 3 4 5 | > specialise :: Specialisation -> Type -> Type > specialise sp t@(TV x) = maybe t (specialise sp) $ M.lookup x sp > specialise sp (TA s t) = TA (specialise sp s) (specialise sp t) > specialise sp i = i
|
We generalise the unification in two ways:
- We consider unification of types under a particular specialisation . The unifier of and under the specialisation is nothing but the unifier of and . The unification of two types can then be seen as unification under empty specialisation.
1 2 3 | > genUnify :: Specialisation -> Type -> Type -> Result > unify = genUnify M.empty
|
- Instead of considering unifiers of pairs of types and , we consider the simultaneous unifier of a sequence of pairs . A unifier of such a sequence is a specialisation such that for all . Of course we want to find the most general of such unifier. We call this function
genUnify'
. Given, genUnify
it is easy to define genUnify'
.
1 2 3 4 5 6 7 | > genUnify' :: Specialisation -- ^ pair of types to unify > -> [(Type,Type)] -- ^ the specialisation to unify under > -> Result -- ^ the resulting unifier > genUnify' = foldl fld . Right > where fld (Right sp) (tau,sigma) = genUnify sp tau sigma > fld err _ = err
|
What is left is the definition of genUnify
.
1 2 3 4 5 6 7 | > genUnify sp (TV x) t = unifyV sp x t > genUnify sp t (TV x) = unifyV sp x t > genUnify sp INTEGER t = unifyI sp t > genUnify sp t INTEGER = unifyI sp t > genUnify sp ap1 ap2 = unifyA sp ap1 ap2 >
|
The order of the pattern matches are important. For example notice that in line 6, both ap1
and ap2
have to be an arrow type (why?) Also in lines 3 and 4, t can either be INTEGER
or an arrow type.
So our rules of unification can are split into unifying a variable with a type , a constant types (here Integer) with a type and unifying two arrow types. We capture these rules with functions unifyV
, unifyI
and unifyA
respectively.
1 2 3 4 | > unifyV :: Specialisation -> String -> Type -> Result > unifyI :: Specialisation -> Type -> Result > unifyA :: Specialisation -> Type -> Type -> Result
|
Unifying a variable and a type
Unification of variables can be tricky since specialisations are involved. Firstly, notice that if occurs in a type , we cannot unify with unless is itself. We first give a function to check this.
1 2 3 4 5 6 | > occurs :: Specialisation -> String -> Type -> Bool > occurs sp x (TV y) | x == y = True > | otherwise = maybe False (occurs sp x) $ M.lookup y sp > occurs sp x (TA s t) = occurs sp x s || occurs sp x t > occurs sp x _ = False
|
The rules of unification of a variable and a type are thus
If has a specialisation , unify and tau,
If is or can be specialised to then we already have the specialisation.
If is unspecialised then specialise it with provided there occurs no recursion either directly or indirectly.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | > unifyV sp x t = maybe (unifyV' sp x t) (genUnify sp t) $ M.lookup x sp > > unifyV' sp x INTEGER = Right $ M.insert x INTEGER sp > unifyV' sp x var@(TV y) | x == y = Right $ sp > | otherwise = maybe (Right $ M.insert x var sp) > (unifyV' sp x) > $ M.lookup y sp > unifyV' sp x ap@(TA s t) | occurs sp x s = failV sp x ap > | occurs sp x t = failV sp x ap > | otherwise = Right $ M.insert x ap sp > > failV sp x ap = Left $ unwords [ "Fail to unify", x, "with" , ppSp sp ap > , ": recursion detected." > ] >
|
Notice here that M.lookup x sp
returns Nothing
if x
is not specialised under the specialisation sp
. The function unifies only an unspecialised variable with .
Unifying an integer and type
Notice here that the type is not a variable. Then it can only be Integer or an arrow type.
1 2 3 4 5 | > unifyI sp INTEGER = Right sp > unifyI sp t = Left $ unwords [ "Failed to unify Integer with type" > , ppSp sp t > ]
|
Unifying two arrow types
1 2 3 4 5 6 7 | > unifyA sp ap1@(TA a b) ap2@(TA c d) = either errmsg Right > $ genUnify' sp [(a,c),(b,d)] > where errmsg str = Left $ unwords [ "while unifying" > , ppSp sp ap1, "and" > , ppSp sp ap2 > ] ++ "\n" ++ str
|
Helper functions.
We now document the helper functions used in the unification algorithm. First we give an algorithm to pretty print types. This makes our error messages more readable. Notice that this version does not put unnecessary brackets.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | > -- | Pretty print a type > > pp :: Type -> String > pp INTEGER = "Integer" > pp (TV x) = x > pp (TA s t) = bracket s ++ " -> " ++ pp t > where bracket t@(TA r s) = "(" ++ pp t ++ ")" > bracket s = pp s > > -- | Pretty print a specialised type > ppSp :: Specialisation -> Type -> String > ppSp sp = pp . specialise sp
|
Testing this code using ghci
Since the lecture used the module of the previous lecture you need to give the following commandline arguments.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | $ ghci src/lectures/Unification-algorithm.lhs src/lectures/Towards-type-inference.lhs GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi-1.0 ... linking ... done. [1 of 2] Compiling Lambda ( src/lectures/Towards-type-inference.lhs, interpreted ) [2 of 2] Compiling Unification ( src/lectures/Unification-algorithm.lhs, interpreted ) Ok, modules loaded: Unification, Lambda. *Unification> let [a,b,c,d,e] = map TV $ words "a b c d e" *Unification> unify (TA INTEGER a) b Loading package array-0.3.0.2 ... linking ... done. Loading package containers-0.4.0.0 ... linking ... done. Right (fromList [("b",TA INTEGER (TV "a"))]) *Unification> unify (TA INTEGER a) (TA b (TA b c)) Right (fromList [("a",TA (TV "b") (TV "c")),("b",INTEGER)]) *Unification> *Unification> let t = (TA a b) *Unification> unify (TA INTEGER a) (TA c t) Left "while unifying Integer -> a and c -> a -> b\nFail to unify a with a -> b : recursion detected." *Unification> let Left l = unify (TA INTEGER a) (TA c t) *Unification> putStrLn l while unifying Integer -> a and c -> a -> b Fail to unify a with a -> b : recursion detected. *Unification>
|