Beware of numeric type classes.
April 4, 2017 (Tuesday) reddit thisThis blog post is to document some cases where unconstrained Num type definitons can seriously compromise type safety. None of what I say here is new but it is worth repeating. Take it as a note of warning against being zealous when defining instances of some standard type classes, in particular the Num
class
Here is an example from the cryptographic library Raaz
. To prevent inadvertent use of the wrong offset measure, raaz uses what we call type-safe lengths. One such length is BYTES
but there are others. For example, the type BLOCKS c
measures length in multiples of the block length of the the cipher c
. The advantage of such a definition is that for low level functions like encryptBlocks
the length argument is in BLOCKS c
which ensures that one does not accidently supply length in bytes. The constructors of BLOCKS should not be directy availabe to the outside world to avoid users from circumventing such gurantees. Instead we define a class LengthUnit
which has a member function that converts instances to BYTES
. This instance function is what is use to perform the actual size calculation of pointer manipulation.
It is tempting to define the Num
instance for BLOCKS
as this is a great convenience for users. However, such an instance will effectively bypass all the type safety we so carefully built into the system. Consider for example, a code like encryptBlocks chacha20imp ptr 42
. Haskell is able to infer that the 42 is of type BLOCKS ChaCha20
because it knows the type of chacha20imp
. This will not raise any kind of type error and will result in encrypting 42 blocks (i.e. 2688 bytes) of data. This is a disaster if the user actually meant only 42
bytes. Haskell should not have the privilege of such bugs unlike others like Fortran (See Mariner 1).
A theoretical reason to see why the Num
instance is plain wrong for length units is the see that multiplication does not make sense here. In the language of physics, length offset is not a dimensionless scalar and hence the product of is which is not a length unit at all. On the other hand, makes sense when calculating the size of say an array of Word32
of length 10. This is because the quantity 10 is a dimensionless and therefore the resulting product has the same dimension as bytes.
In the heat of the moment it is pretty easy to overlook these bugs. This is particularly tempting because of the “unnecessary” boilerplate that it gets rid of that to, at a very minimal cost; just a deriving
clause away. I would freely confess my stupidity in this aspect. At least we have repented and corrected this to avoid perdition.