## Contents:

- Why I write and who you are
- Haskell - The language of terms
- Haskell - The language of types
- Haskell - Typeclasses
- Parser combinators: Fast streaming of huge and complex files
- Multiple sequence alignment: Algorithms and performance
- Command line scripting: IO, system calls and parallelism
- Rogue: Graphics and concurrency

In this series, I will compare code from Python and Haskell. While I expect you to be familiar with Python, I assume Haskell is new. The goal of the next three posts is to give you a foundation in the language. You may want to jump ahead to the following practical chapters, but I suggest you at least skim the tutorial first.

Haskell can be divided into two languages: one of terms and one of types. The term-level language in Haskell is extremely simple. Indeed, you will see that it is simpler than Python.

I will introduce Haskell in three sections. The first will introduce the term language. The second will introduce the basic type language. The third will introduce the main typeclasses.

In this section on the language of terms, I will introduce basic data types, functions, imperative IO blocks and finally patterns/guards.

# Basic data types

Haskell has the usual string, integer, and float/double types. It also has tuples (as in Python) and lists, which must contain elements of a single type. Here are a few examples:

This is the GHCI shell. I will use three different shells in this series: the system shell (Bash), the Python shell, and the Haskell shell. To distinguish between these, I will use three different prompts: `$` for Bash, `>>>` for Python, and `λ>` for Haskell.

```
λ> 1 + 1
2
λ> "asdf" ++ "qwer"
"asdfqwer"
λ> [1,2,3] ++ [4,5,6]
[1,2,3,4,5,6]
```

Lists also have a special operator `:` for attaching a new element to the front of a list.

```
λ> 1 : [2,3,4]
[1,2,3,4]
```

Haskell also offers list comprehensions, which may be familiar from Python. They differ only in that Haskell follows the more succinct mathematical set-builder notation:

```
λ> squares = [x * x | x <- [1..10], mod x 2 == 0]
λ> timesTable = [[x * y | x <- [1..10], mod x 2 == 0] | y <- [1..10], mod y 2 == 0]
```

Haskell also has named tuples called “records” (like C structs) and “sum types” that generalize enums. But I will wait to introduce these types until the next post.

Beyond linked lists, more sophisticated containers are available in modules. I won’t cover them here, though.

# Functions

In mathematics, a function is a mapping from one set (the domain) to another set (the codomain). Every element in the domain must map to exactly one element in the codomain.

Functions in Haskell preserve the mathematical definition. That is all you need to know to reason about them. Here is an example showing the basic syntax:

```
f x y = g x + h y
```

This is equivalent to the mathematical statement *f*(*x*,*y*) = *g*(*x*) + *h*(*y*). Arguments
to functions are not wrapped in parentheses, as is customary in Python. The
equals sign indicates true equality. This means that the right and left sides of
the equals sign can be substituted for each other anywhere in the
code. Equivalence has deep consequences. Consider the following Haskell code:


```
x = 5
y = f x
z = (y, y) -- this is the syntax for a tuple, a pair of values
```

Since `y = f x`, `z = (f x, f x)`. This entails that `f x` cannot mutate `x` or have any side effects. For example, if executing the function `f x` printed a value to the console, then substituting `f x` for `y` in the tuple would cause two print statements. Similarly, if `f x` mutated `x` or changed some global variable, then calling `f x` once and using the return value `y` twice would differ from calling `f x` twice in the tuple definition. From this we see that all data must be immutable and all function execution must have no side effects. But wait, if there are no side effects, how can we even do anything in Haskell?

Haskell supports `where` clauses with scope local to the function:

```
f x y = q + r where
  q = g x
  r = h x
```

Function arguments are positional. There are no argument names or defaults. This may seem limiting to a Python programmer who is accustomed to representing the default parameterization of a function with keyword arguments. In Haskell, the same context provided by keyword arguments would typically be wrapped into a single record (like a C struct).
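Records are not introduced until the next post, but as a hedged preview of the pattern, here is a minimal sketch. The `Config` type, its fields, and `defaultConfig` are my own invented names, not anything from this series:

```haskell
-- Hypothetical record standing in for Python keyword arguments
data Config = Config
  { host    :: String
  , port    :: Int
  , verbose :: Bool
  }

-- A named default value plays the role of Python's keyword defaults
defaultConfig :: Config
defaultConfig = Config { host = "localhost", port = 8080, verbose = False }

describe :: Config -> String
describe c = host c ++ ":" ++ show (port c)

main :: IO ()
main = do
  -- Callers override only the fields they care about (record update syntax)
  putStrLn (describe (defaultConfig { port = 9000 }))  -- localhost:9000
```

Instead of a call like `connect(host="localhost", port=9000)` in Python, the caller updates a default record and passes it as one positional argument.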

Haskell functions support partial application, or currying, of arguments. In Haskell, all functions take exactly one argument. When the argument is applied to the function, a new value is returned. This return value may be another function. So the expression `f x y` is just syntactic sugar for `(f x) y`. First `x` is applied to `f`, then `y` is applied to the function `(f x)`. Currying simplifies reasoning about functions (as you will see in the next post) and allows very clean syntax for making closures. For example, adding 1 to every value in a list can be done with `map (+ 1) xs`. When the binary function `+` is applied to 1, it yields a new function of a single argument that is then applied to each value in `xs`.
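A small runnable sketch of currying in action (the `add` and `increment` names are mine, not from the post):

```haskell
add :: Int -> Int -> Int
add x y = x + y

-- Partially applying `add` to one argument yields a new one-argument function
increment :: Int -> Int
increment = add 1

main :: IO ()
main = do
  print (increment 41)         -- 42
  print (map (+ 1) [1, 2, 3])  -- [2,3,4]
```

`increment` is defined without ever naming its argument; it is simply `add` with its first argument already supplied.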

## Imperative IO blocks

Here is the obligatory “Hello World” program:

```
main = putStrLn "Hello World"
```

Write this into a file called `Main.hs`, and then you can compile it with GHC:

```
$ ghc Main.hs
$ ./Main
Hello World
```

Every executable Haskell program has a `main` function, usually in a file named `Main.hs`. Module filenames must be capitalized.

The Haskell Prelude also provides the function `interact`, which reads from STDIN, passes the text to a function that takes and returns a string, and writes the final string to STDOUT. With this, we can write simple string manipulation programs. For example, we can write a program that prints the first 3 lines of a file:

```
headLines nlines text = unlines (take nlines (lines text))

main = interact (headLines 3)
```

Here `lines` breaks a string on newlines, `take` returns the first n elements of a list, and `unlines` concatenates many strings into one newline-delimited string. These three functions are defined in Prelude, the default module that is loaded into every Haskell module unless specified otherwise.
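To see each piece on its own, here is a small runnable sketch (the sample text is made up):

```haskell
main :: IO ()
main = do
  let text = "one\ntwo\nthree\nfour\n"
  print (lines text)                      -- ["one","two","three","four"]
  print (take 3 (lines text))             -- ["one","two","three"]
  putStr (unlines (take 3 (lines text)))  -- prints the first three lines
```

Composing the three functions right-to-left gives exactly the `headLines` definition above.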

Sometimes we may want to string together many IO operations. This can be done in an imperative style using a `do` block:

```
main = do
  let filename = "z.sh"
  files <- ls
  cat (unlines files) filename
  chmod "444" filename
```

Within a `do` block, variables can be defined with the `let` keyword. Pure values can be extracted from IO operations using the `<-` notation. IO operations that run only for their effects can be called by themselves, like `chmod`, which changes the permissions on a file. The functions used here – `ls`, `cat`, and `chmod` – are all functions that I haven’t defined yet. I will introduce them in the upcoming post on using Haskell as a scripting language.
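Since `ls`, `cat`, and `chmod` are not defined yet, here is a hedged sketch of a comparable `do` block using only standard library functions from `base` and the `directory` package (`listDirectory`, `writeFile`, `setPermissions`); treat it as an illustration, not the scripting API of the later post:

```haskell
import System.Directory
  (emptyPermissions, listDirectory, setOwnerReadable, setPermissions)

main :: IO ()
main = do
  let filename = "z.sh"
  files <- listDirectory "."           -- stands in for `ls`
  writeFile filename (unlines files)   -- stands in for `cat`
  setPermissions filename              -- stands in for `chmod "444"`: read-only
    (setOwnerReadable True emptyPermissions)
```

The shape is identical: a `let` binding, a pure value extracted with `<-`, and two operations run only for their effects.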

`do` notation can be used for far more than just IO operations. I will introduce the full power of this system in the third post of this tutorial when I introduce monads. It is sometimes said that Haskell is the best imperative language. While any use of the word “best” is a bit dubious, one of Haskell’s greatest strengths certainly is the ability to create very safe, powerful, and elegant imperative domain-specific languages (DSLs).

## Pattern matching and guards

Pattern matching allows the behavior of a function to be specialized for particular structures in the input data.

For example, here is a function that merges two sorted lists:

```
merge (x:xs) (y:ys)
  | x > y = y : merge (x:xs) ys
  | otherwise = x : merge xs (y:ys)
merge xs ys = xs <> ys
```

The `:` operator is the “cons” operator that adds an element to the beginning of a list. Thus in the pattern `(x:xs)`, `x` is the first element of the list and `xs` is all following elements. If the list is empty, the pattern `x:xs` will not match. If the pair of patterns `(x:xs) (y:ys)` do not both match, then one of the lists is empty, and the second pattern `xs ys`, which matches everything, will match and the lists `xs` and `ys` will be joined (using the Semigroup operator `<>`).

Guard patterns start with `|` and branch on boolean conditions. They allow comparisons, which are not currently supported by patterns.

Pattern matching practically eliminates missing edge cases: with warnings enabled, the compiler will report any cases a function fails to cover. But even apart from this, the elegance of pattern matching makes the design of functions very systematic. Rather than handling edge cases as an afterthought after writing the main program, they are handled first, in a clean, systematic way. For example, the Fibonacci series can be written as follows:

If you’ve ever seen a single example of recursion, then you probably know that this Fibonacci implementation runs in exponential time. I’ll show you an efficient implementation in a future post.

```
fibonacci 0 = 1
fibonacci 1 = 1
fibonacci n = fibonacci (n - 1) + fibonacci (n - 2)
```

Most functions in Haskell are designed in this way. First the base cases are specified, then the general case is specified. The functions are thus correct by induction.
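The same base-case-first shape appears everywhere. For instance, here is a hand-rolled list length (a sketch; the Prelude’s `length` already behaves this way):

```haskell
-- Base case first, then the inductive case: correct by induction on the list
myLength :: [a] -> Int
myLength []       = 0
myLength (_ : xs) = 1 + myLength xs

main :: IO ()
main = print (myLength "haskell")  -- 7
```

The empty list is handled before the general case is even considered, so there is no edge case left to forget.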

Is our `fibonacci` function correct? Well, that depends on the domain. If the inputs are natural numbers (0, 1, …), then the function is “total”. That is, all values in the domain map to a single value in the codomain. However, if the input to `fibonacci` is a signed integer, then the function is “partial”: a negative input recurses forever, never reaching a base case.

If we assume the input is a signed integer, we can fix our code as follows:

```
fibonacci n
  | n < 2 = 1
  | otherwise = fibonacci (n - 1) + fibonacci (n - 2)
```

Let me show you a few more examples of patterns and guards:

```
fst (a, _) = a

pairs [] = []
pairs [x] = []
pairs (x:y:xs) = (x, y) : pairs xs

filter _ [] = []
filter f (x:xs)
  | f x = x : filter f xs
  | otherwise = filter f xs
```

But what if the input type is not what we expect? In Python, you could pass `None` into the Python equivalent of `pairs`. In Haskell, however, that is not possible. The functions written above are correct and will *never* go wrong. They cannot be misused. In Haskell, you can build systems of great complexity with high confidence in their correctness. You know the component pieces are correct. You know the operations you use to compose them are correct. And thus their composition is correct.