Haskell - The language of types

by Zebulun Arendsee


** In development **

One of the main concepts I want to convey is that Haskell is fundamentally simpler than Python and most other mainstream languages.

Don’t believe me? If Haskell is so simple, why is it hard to learn? The answer is that complex things have been built from these simple components. You could spend a lifetime exploring the idealized forms and puzzles the community creates. But you don’t have to. Most of the complexity of Haskell is optional.

An additional reason why Haskell is harder for some people is that it is very different from the familiar Algol languages (C, Java, Python, etc). Your first thought when starting with a new language may be, how do I write a for loop? Or, how do I print “hello world”? If so, the responses, “there is no for loop” and “let me tell you about monads”, will not seem very friendly.

I will introduce six main concepts: functions, type constructors, type signatures, type classes, pattern matching and IO.

This order may seem strange to you. Why walk through three sections on types before writing a single line of “normal” code? A functional language is all about functions, functions are maps between things, and things are described by types. So I start by explaining how data is described with types. Then how function types are defined. Then how properties of generics are refined with type classes. With these tools, we can specify an entire Haskell program and type check it. I will introduce you to the design process. Then I will show how to actually write a function (this is easy since Haskell is so simple) and how to take apart data with pattern matching. Finally, I will show you how IO is handled in Haskell.

Type constructors

What is a boolean? In most languages, this is a complex question. In C, there is no built in boolean type. Relational operators (==, <, etc) return int values of 0 or 1. The conditional clause in an if statement evaluates to true if the condition has a binary value of 0. This “0” could be a NULL pointer, a 0 integer, an empty string (just a NULL), or a double that is 0. For floats, there is the odd case of -0, which does not have a binary value of 0 but still evaluates to false.

In Python, there is a built-in boolean type with just two possible values, False and True. This simplicity is deceptive though. As with C, the bool type is also an integer. The expressions 1 + True is 2 and "ab"[True] is “b”. All objects may be used in a conditional clause and Python will coerce them to a boolean. “True” objects include non-empty sets, non-empty lists, non-zero numbers, and empty strings. Classes in Python3 can implement a __bool__ method that allows them to be converted to boolean values. Some types may have multiple zeroes. For instance, if you have a value that may be either None or an integer, and you use it in a boolean, you cannot be sure why the else branch is entered.

In short, the boolean is heavily blessed in most languages. Special boolean behavior is cooked into the compiler. Learning all the quirks of the boolean is just one of many chores on the path language mastery.

In Haskell, though, there is nothing to learn. The Bool type is defined as follows:

data Bool = True | False 

There is no boolean primitive in Haskell. A boolean is simply a set of two values, True and False. It is not an integer. It has no properties. It is exactly equivalent to the logical notion of a boolean that every child understands.

Haskell programmers often speak of how Haskell removes whole categories of errors. This is one such case. It is also the first of many examples I will show of why Haskell is simpler than languages such as Python.

Now let’s switch from booleans and talk about things that may fail. If you look up a key in a map, and the key does not exist, what is returned? Of course, there are many things one might do. Maybe you could raise an exception, as is done for dictionaries in Python. This halts further evaluation and unwinds the stack until caught. Or you could return a NULL value and hope the calling function handles it correctly. However, if a NULL is returned, how can you know whether a NULL is returned because the data structure stored a NULL or because the key did not exist?

In Haskell, the broad notion of values that may or may not exist, and computations that may or may not succeed, is captured by the type Maybe, defined as follows:

data Maybe a = Nothing | Just a

Maybe is a parameterized type with one type variable a. Lowercase type terms in type declarations always represent generics. The type of a depends on use. Nothing and Just are type constructors. Just takes an argument. As with Bool, everything there is to know about Maybe is contained in the type signature. There is no special compiler handling, no coupled dictionary of methods. It is an minimal concept. The featurelessness of Maybe allow its universal use.

But Maybe is not very expressive. What if we want to know more about why on value is returned? What if we are writing a text parser and want to know where and why the parse failed. In this case, we want to return one of two possible values, either the result of a successful parse, or a description of a failing parse. This concept is captured by the data type Either:

data Either a b = Left a | Right b

There is no try-catch in Haskell. There is nothing special about code that may “fail”. What “failure” even is is subjective in Haskell. In most languages, the two paths a computation can take, the “exception” path were something goes wrong, and the successful path where nothing goes wrong, use different semantics. In Haskell, both paths are normal computation. The pass and fail branches are equal parts of the model in Haskell. Both are, in general, handled with equal care. A Haskell programmer can usually say with confidence that all corner cases are modeled within their program. If there are any cases that are not handled (i.e., partial functions), the programmer will know (if they pay any attention to compiler warnings). This leads us to another category of bugs that Haskell eliminates in practice: edge cases.

In practice, you always know if a function you are using can fail because its failure is modeled in the return type as Maybe or Either.

So with three lines we define everything needed for handling conditionals, null values and exceptions. All this without adding any special handling in the compiler.

data Bool = True | False 
data Maybe a = Nothing | Just a
data Either a b = Left a | Right b

We can define data structures in a similar fashion. Perhaps the simplest such data structure is the list:

data List a = Head a | List a 

Square brackets are used in Haskell to represent this type, for example, [Int] is a list of integers with values such as [1,2,3].

More complex types may also be easily defined, such as binary trees:

data BinaryTree a = Leaf a | Node a (BinaryTree a) (BinaryTree a)

Haskell also has special syntax for tuples, which are lists of types of set length and variable type. For example, a tuple of type (Int, String, Bool) may contain values such as (1, "yolo", True).

Long tuples, though, are confusing to work with. Usually “Records” are used instead. Records are just tuples with names for easier access. Here is how you could define a tuple and a record that store the same information:

-- a tuple containing 3d coordinates
type Point = (Double, Double, Double)

-- a record that assigns names to each dimension
data PointRecord = PointRecord 
    { pointX :: Double
    , pointY :: Double 
    , pointZ :: Double 

The type keyword defines a type synonym. This is just a name that exists purely for convenience.

The fields of PointRecord are specified in curly brackets. The compiler will generate accessor functions named pointX, pointY and pointZ for accessing each field in the record. Since these go into the global namespace, they must be unique. Which is why we likely cannot list the labels as just x, y and z. Annoying? Yes, yes it is. Why can’t we just define Point as:

data Point = Point { x :: Double, y :: Double, z :: Double}

And access with foo.x syntax? Well, actually we can with a new Haskell extension. In truth, records are one of the warts in Haskell. Ask any Haskeller what they dislike about Haskell, and awkward records is likely to be one of the items they list.

One last thing, why do we repeat Point? Why not just write:

data Point = {pointX :: Double, pointY :: Double, pointZ :: Double}

The reason is that the first Point is the name of the type. It is the name that goes in type signatures. The second Point is the name of the data constructor. It is the name used to build or match against the record. While records usually have only one data constructor, they may have more, for example:

data Point
    = Point1D {x1D :: Double}
    | Point2D {x2D :: Double, y2D :: Double}
    | Point3D {x3D :: Double, y3D :: Double, z3D :: Double}

By conventions, when there is a single data constructor for a type, the type constructor and data constructor will have the same name.

Before I go on, I would like to introduce the concepts of “sum types” and “product types”. These terms are often used in practice and understanding the intuition behind them may be useful. And if it is not useful, it is at least beautiful.

Types describe sets and sets have sizes. The size of the boolean type is 2. The size of the set of days is 7. What about Maybe and Either? The type Maybe can be Nothing or Just a. Nothing is a atom with just one value. Just a has a size that is equal to the size of a. So the size of Maybe a is $1 + ||a||, where the double bars represent size (or cardinality). What aboutEither a b? It can beLeft aorRight b, so we can just add the sizes ofaandb`: ||Eitherab|| = ||a|| + ||b||. Since the sizes of these types are a summation over the sizes of their different options, we call them “sum types”.

The elements in a tuple or record, though, are free to be anything. The tuple (a,b) will have will have ||a|| * ||b|| different possible values. So we call this a “product type”.

This may all seem mathy for the sake of being mathy. But it leads to an interesting idea. If two types have the same size, then each element in one type can be uniquely mapped to an element in the other. This means that are “isomorphic”. One can be translated into the other without loss of information.

Haskell adds more syntactic sugar than is really necessary. Implementing sum and product types really only requires two constructors (operators). From a mathematical perspective, it might be more elegant to use the type syntax a + b rather than a | b and a * b rather than (a, b). Phrased like this, what are the algebraic laws governing this system? Thinking along these lines will turn you into a computer scientist. You might ask wonder, “what is subtraction, exponentiation, derivation and integration at the type level”? This is why people hate Haskellers.

OK, so that about covers everything you need to know about data declarations. The next layer of abstraction is to combine the data types with functions.

Type signatures

To review, we declare data with the data keyword. For example:

data Weather = Sunny | Rainy | Cloudy

Functions of data may be defined as follows:

goOutsideToPlay :: Weather -> Bool

This type signature describes functions that map values from the Weather type to values in the Bool type.

A function signature generally maps to many possible functions. Just for fun, what is the size of the the goOutsideToPlay type? That is, how many possible functions are there between Weather and Bool. We can make a table of all possible functions. Each function is a particular list of input/output pairs (with True/False reduced to 0/1 for convenience):

Sunny   0 0 0 0 1 1 1 1
Rainy   0 1 0 1 0 1 0 1
Cloudy  0 0 1 1 0 0 1 1

These functions range from never going outside to always going outside. We see that there are 8 different functions. Or 2^3. Or to phrase it differently, ||Bool||^||Weather||. In the previous section, we discussed “sum types” and “product types”. Funtions can be described as “exponential types”. The size of the function a -> b is ||b||||a||. Cool, huh? Useful? Well, probably not to you. But these mathematical rules can actually be used within the compiler to optimize code. The extreme simplicity of the Haskell language allows the compiler to reason powerfully about the code.

Some functions signatures actually are specific enough to map to just one function or reasonable function. In these cases, implementations may be generated automatically. This principle will be applied in the following section to generate many typeclasses instances for a given data type.

What if we want to make functions of multiple arguments? Well. You can’t. But you can make a function that returns a function. Whether the playing outside is dependent on more than just the weather. Maybe it also depends on the child’s age. Now for a given child, you might want a function that tells you when they specifically will go outside. This can be written as follows:

goOutsideToPlayAge :: Int -> (Weather -> Bool)

Then you can write goOutsideToPlayAge 14 to get a function for children of age 14. Or you can serially apply arguments: (goOutsideToPlayAge 14) Sunny. That is a lot of unnecessary parentheses. They can be skipped since the -> operator is right associative and function application is left associative. So we can write the function signature as:

goOutsideToPlayAge :: Int -> Weather -> Bool

This signature is similar to the C prototype:

bool goOutsideToPlayAge(int, Weather);

But it is more flexible, since it allows convenient partial application, which is very useful when coupled with higher order functions. I will introduce these soon.

Being comfortable with type signatures is an essential skill for the Haskell programmer. This skill is best learned by seeing many examples. So I will show a few here.

The simplest function is the identity function:

id :: a -> a

As seen before, generic types are written in lowercase. It is conventional to use a, b, etc as generic type variables. Single letter variables are frowned upon in many communities, but they are common in Haskell. This is especially true for generics. The a in the identity function an be anything. It has no distinguishing features. If we really wanted to call it something, we would have to use something vague, like anything or whatever or thing. By conventions, we stick with a unless we have a good reason to do otherwise.

Here are a few more functions, you can probably guess what all of them do:

fst :: (a, b) -> a
snd :: (a, b) -> b
onFst :: (a -> c) -> (a, b) -> c

In fact, all of these functions have a single non-pathological implementation. The only non-pathological way to get a generic a from (a, b) is to take the first value from the tuple.

Actually, there is also exactly one pathological implementation for each of these functions. They could be undefined. That is, they could map all possible input to “bottom”, the empty/undefined member that exists in all types. Since we know nothing about the input, we cannot do any specific branching. So either we do the one correct operation or we are undefined on everything.

Of course, most functions are not so simple. Here are few examples of functions that return the first element in a list:

head :: [a] -> a
headMay :: [a] -> Maybe a
headEither :: [a] -> Either String a

Since a list may be of 0 length, the first function head is a partial function. That is, one possible input maps (domain) to no output (codomain). So head could crash the program at runtime. The compiler knows this and will raise warnings. Now notice what I did here, even without reading the implementation of head, I already know that it is partial. I know exactly how to use (and when to avoid) head.

headMay is a total function. It handles the empty case by returning Nothing and Just a otherwise.

headEither returns some sort of string describing the error. This is probably a boring string since it can’t contain any information that Nothing would not capture equally well.

Now, notice all through here I assumed that head was returning the first element. What if the impementation actually returns the second element? This would mean that the function is mis-named, which the type-system obviously won’t catch. The head function actually has a name that is narrow than the signature. You could write the following functions with the same signature:

lastElement :: [a] -> a
middleElement :: [a] -> a
secondElement :: [a] -> a

A type system more expressive than vanilla Haskell might allow further refinement of the signature, for example:

head :: xs:[a]{ len xs > 0 } -> x:a{ x == xs[0] }

With these type annotations, the signature will uniquely map to one function for each type a. In languages like Idris, this kind of signature is possible. Thinking along these lines is another reason why people hate Haskellers. For most practical purposes, the type-level ambiguity of what head means is acceptable. The signatures narrow the space of possible implementations and our mental model, using hints from the function name and documentation, does the rest.

Type classes (a foreshadowing)

Lets talk about a few more complex functions:

sum :: [a] -> a
generalSum :: col a -> a
sort :: [a] -> [a]

All of these signatures are broken.

In sum we want to add together all values in a list. We want this function to work for anything numeric. But addition does not apply to all things. So a is, again, too generic. There is not, for instance, an obvious way to add two Person objects.

In generalSum, we want to generalize sum to work on any collection that can be iterated over. This might include a set of numbers, a binary tree of numbers, or values in a hashmap of numbers. But col is too generic. Not all types are collections and certainly not all types take a single parameter.

In sort, we want to take a list and return a sorted list. But any sort function will require comparison between elements, and how do we compare things we know nothing about? We need to specify that for any type in a a comparator function exists.

The problem is that generics are too generic. We need to be able to limit generics to types that have specific required traits.

One solution to this problem would be to always pass in higher order functions that provide the required functionality.

Here is one such signature for sum, written as sum'.:

sum' :: a -> (a -> a -> a) -> [a] -> a

A reasonable sum of an empty list is 0. So whatever type a is, it needs a zero element. We also need a function that taks two members from set a and returns (adds) a new member also in set a. This function has type (a -> a -> a). Taking the summation is then a simple process of iteratively applying the addition operator to elements in the list and the accumulated value. You may know this as a “reduce” operation. More commmonly known as “fold” in the Haskell world.

The name sum is actually a bit too general for the function we have developed. The operation here needn’t be addition. It could also be multiplication. Or it could be string concatenation with an empty string as the “zero” element. Or it could be list appending, with the empty list as the “zero” element.

Now let’s look at generalSum. It needs to be passed the addition functions as well as a reduce/fold function for operating on the generic collection col:

    :: ((b -> c -> c) -> c -> col b -> c) 
    -> a 
    -> (a -> a -> a) 
    -> col a 
    -> a

Wow. That is a long signature. The first element in the signature is a general fold function. It takes a function (b -> c -> c) which combines a value from the initial collection with the accumulator and yields a new accumulator. The second argument c in the fold signature, is the initial accumulator. col b is the original collection.

The second, third, and fourth arguments of generalSum' will all be applied to the fold function to yield the final summation.

The signatures are really awkward, though. Maybe we could role all the helper functions into a named record and pass that in instead? While would could put all three helper functions in one record, it would be more modular to separate out the summing functions and the folding functions. So we may get the following general data types:

data Foldable col = Foldable {fold :: ((a -> b -> b) -> b -> col a -> b)}
data Summable a = Summable {zero :: a, add :: a -> a -> a}

Now we can write:

generalSum'' :: Foldable col -> Summable a -> col a -> a

Now we can define a Foldable record for each type of collection we use and a Summbable record for each type we want to sum.

Again, “Summbable” is a bit to specific a word. We could say “Multiplicatable” if the operator were \* rather than +. Or “Appendable” if the operator were list or string appending and the zero element [] or "". What word is there for the class of things with an empty/zero element and a closed binary operator? Well, there isn’t one in English. So we must either choose a deceptively specific word like “Summable” or choose a precise but possibly unfamiliar (and doubtlessly pretentious) mathematical term. The Haskell community has opted for the second path and borrowed the term “Monoid” from abstract algebra.

Passing in records of required functions “specializes” the generics without any builtin shenanigans. No type-level magic. No special methods baked into the objects. No over-riding or inheritance or (in this case) multiple inheritance patterns. The functions (folding, adding) and the zero element must by provided for generalSum'' to be called.

Haskell does, however, elaborate on this idea by adding “typeclasses”.

built on 2024-03-13 21:43:54.026524143 UTC from file 2023-03-17-type-language