Tuesday, May 10, 2011

Haskell Type Woes

Everything has a learning curve, and Haskell is no exception. One thing that seems to take getting used to for new learners of
Haskell (myself included, especially) is its strong type system. Coming
from a background in Python (C and Java before that), I'm used to fluid
return types and the language taking care of almost all aspects of type for me.

Enter Haskell, where everyone has to be on the same page about what type
a function's argument is, and what type it is returning.

While not mandatory, function types can be included with their
definitions, as follows:

square :: Int -> Int
square n = n * n

which is clearly a function that takes one argument and squares it.
Haskell knows this function takes an Int, and returns an Int. If passed
a Char or a Float, it will explode. And rightly so. That would be breaking
your promise to it.

"But I want a function that can handle large squares."
And this happens, for sure (I'm looking at you, Project Euler).
In that case, we can change our function's type to look like this:

square :: Integer -> Integer

and our function will be able to handle a number as big as can be stored in
your computer's memory.

"But I want a function that can handle any number!"

Alright, fine. Here you go:

square :: Num a => a -> a

This essentially allows any number to be passed to the function, then squared.
The following will all work:

Prelude> square 5
25
Prelude> square 1.5
2.25
Prelude> square 123821935012837612834
15331871590323381123938231883275681511556

And this is all fairly straight forward.
My problems, however, have been coming from mid-function type changes,
in which you want a number of one type to be changed to another.

The following, for example, doesn't work.

toFloat :: Int -> Float
toFloat n = n

Nor does a vain attempt at C casting.

toFloat :: Int -> Float
toFloat n = n :: Float

But this will.

toFloat :: Int -> Float
toFloat n = fromIntegral n

Recall our definition of the function 'square' above. It takes a Num
and returns a Num. But Num itself isn't a type. The actually value returned
needs to be something concrete. Take a look at this:

*Main> let x = square 5
*Main> :t x
x :: Integer

Who told x to be an Integer? Well, Haskell did. Wait a minute...

*Main> let y = square 5.5
*Main> :t y
y :: Double

Aha! Proof. Haskell is dealing with the type change for us, since we only
told it that our function 'square' takes Nums and returns Nums.
(Both Integer and Double are part of the Num typeclass)

Here is where the magic of fromIntegral (and functions like it) comes in.
Doing,

toFloat :: Int -> Float
toFloat n = fromIntegral n

tells Haskell, "Yeah, I know n was an Int, but now it's an ambiguous Integral."
It is as if you had defined the type of toFloat to be:

toFloat :: Integral a => a -> Float  -- Though just this wouldn't fix the problem

Haskell then looks at the return type, since a value can't be just
an 'Integral' and says "I declare you a Float!" and everyone lives happily
ever after.

As a more contrived example, consider some library functions I wrote
to convert Integral numbers to lists of their digits (and back again).

import Data.Int (Int8())  -- I wish there were Int4s, even.

itol :: Integral a => a -> [Int8]
itol n | n == 0    = [0]
       | n < 0     = error "No negative numbers! Bad!"
       | otherwise = itol' n

itol' :: Integral a => a -> [Int8]
-- This does all the work.
itol' 0 = []
itol' n = itol' (n `div` 10) ++ [fromIntegral $ n `mod` 10]

ltoi :: [Int8] -> Integer
ltoi ns  = foldl1 (\acc n -> acc * 10 + n) $ map fromIntegral ns

The function itol turns any Integral into a list of Int8s. We don't need
normal 32-bit Ints, so we use the lowest we can, Int8s, to save memory.
Notice the appearance of fromIntegral in the code above.

Let's look at two of the lines from above. First,

itol' n = itol' (n `div` 10) ++ [fromIntegral $ n `mod` 10]

This function accepts any Integral and returns a list of Int8s.
Just,

itol' n = itol' (n `div` 10) ++ [n `mod` 10]

would explode, as n `mod` 10 would return a number of the same type as n,
make a list out of it, then attempt to concat it with itol' (n `div` 10),
which if you look at the types, we get:

[Int8] ++ [Integral a]  -- Where 'Integral a' is whatever type n was.

This won't work as numbers of different type can't be operated together like
this. (Although admittedly, had n been an Int8 it would work, but we wouldn't
have a very big range of numbers to work with.)

[fromIntegral $ n `mod` 10]

will thus convince the result of n `mod` 10 to become an ambiguous Integral,
which Haskell will then force to an Int8, as that is the return type
of the function. We then have,

[Int8] ++ [Int8]

and everything is fine.

The second line we'll look at is,

 ltoi ns  = foldl1 (\acc n -> acc * 10 + n) $ map fromIntegral ns

which takes a list of digits like one we'd create with itol, and turns it
back into an Integer (not just an Int).
The goal of this function is to take every little Int8 in the list and
squeeze them all together into an Integer using a fold. Since an 8-bit
integer would roll-over pretty quickly, all the values need to be forced out
of their 8-bitness from the get-go. Hence we have,

map fromIntegral ns

which moves through the list of Int8s, ns, and convinces them they aren't
just 8-bits anymore.

Our example is concluded. Hopefully this clears up type changes for someone.

Bottom line: When converting between number types, if the compiler is
whining at you, chances are you're missing a cleverly placed fromIntegral.
Really think about what types your variables are, and what the return types
of operations performed on them are!

No comments:

Post a Comment