# An interpreter in Haskell

I haven’t touched Haskell for a couple of years. To get back into it, I made an interpreter for a small imperative language. Haskell is great for making interpreters!

First I define the syntactic structure.
My language has integers (its only native datatype),
pure operators on integers,
a set of global variables,
some control flow constructs,
and two I/O primitives.
Below the syntax is encoded as the type `E`

(for “expression”).

```
data E =
-- literals
EInt Int
-- pure operators
| EBinOp Op E E
| ENot E
-- global variables
| EGet String
| ESet String E
-- control flow
| EIf E E E
| ESeq E E
| EWhile E E
| EDoWhile E E
| ESkip
-- I/O
| EWriteByte E
| EReadByte
data Op = Add | Sub | Eq | Lt | Lte | And -- and many more
```

Here’s an example of a program written in this language,
called `writeXsForever`

.

```
writeXsForever :: E
writeXsForever = EWhile (EInt 1) (EWriteByte (EInt 120))
```

You might be able to eyeball this expression and guess its intended behavior.
My intended behavior for this program is that it writes the byte `x`

to stdout repeatedly forever!
But to give this program meaning, I must define the interpreter.
In Haskell, the interpreter could be a function with the type:

```
eval :: E -> IO ()
```

My interpreter is more complex for two reasons.
First, since the interpreter is evaluating an *expression*,
it needs to return the value that expression evaluated to;
in my language this is always an `Int`

.
Second, my language has mutable global variables (`Map String Int`

)
which must be “threaded” through each evaluation.

```
eval :: Map.Map String Int -> E -> IO (Int, Map.Map String Int)
```

The Haskell `main`

function begins the interpreter by calling `eval`

on the root expression of the program:

```
main :: IO ()
main = do
eval Map.empty writeXsForever
return ()
```

```
$ ./jimscript
xxxxxxxxxxxxxxxx^C
$
```

Now I define `eval`

case-by-case on each syntactic form in `E`

.
I start with perhaps the simplest, `EInt`

,
a literal which evaluates to itself
and does not modify any variables:

```
eval vars (EInt i) = return (i, vars)
```

Next, I interpret `BinOp`

, the syntax form for all binary operators on integers.
Notice how we evaluate the left-hand side expression before the right-hand side,
and that this matters because of to the side-effects that expressions can have
on global variables and on input/output.
Notice the “threading” of `vars`

through each evaluation gets quite verbose
(I’ve chosen not to abstract this,
because I plan to implement more sophisticated variable “scoping” in future).
Also notice that `evalOp`

is rather tedious,
translating between `Op`

values and Haskell functions which implement them.
Much of the work in writing an interpreter is handed off to the host language!

```
eval vars (EBinOp op e1 e2) = do
(val1, vars') <- eval vars e1
(val2, vars'') <- eval vars' e2
return (evalOp op val1 val2, vars'')
evalOp :: Op -> Int -> Int -> Int
evalOp Add a b = a + b
evalOp Sub a b = a - b
evalOp Eq a b = if a == b then 1 else 0
evalOp Lt a b = if a < b then 1 else 0
evalOp Lte a b = if a <= b then 1 else 0
evalOp And a b = if a == 0 || b == 0 then 0 else 1
```

The global variable map has primitive “get” and “set” expressions,
which are evaluated as follows.
Notice the call to `error`

if the variable isn’t set
(I’m not a Haskell purist).

```
eval vars (EGet var) = case Map.lookup var vars of
Nothing -> error $ "no such variable: " ++ var
Just x -> return (x, vars)
eval vars (ESet var e) = do
(val, vars') <- eval vars e
return (val, Map.insert var val vars)
```

On to control flow,
an interesting one is `EWhile`

.
Its “looping” behavior is implemented using Haskell recursion;
notice the subcall evaluating a new `EWhile`

with the new global variable set:

```
eval vars (EWhile c e) = do
(cond, vars') <- eval vars c
case cond of
0 -> return (0, vars')
_ -> do
(_, vars'') <- eval vars' e
eval vars'' (EWhile c e)
```

On to I/O, here’s the interpreter for `IWriteByte`

.
My language can only write to stdout,
but it could be extended to write to files, sockets and so on
(but this would want a native string datatype, not just integers).

```
eval vars (EWriteByte byteE) = do
(byte, vars') <- eval vars byteE
if byte < 0 then error $ "Tried to print byte < 0: " ++ show byte
else if 255 < byte then error $ "Tried to print byte > 255: " ++ show byte
else PosixIO.fdWrite PosixIO.stdOutput [Char.chr byte]
return (byte, vars')
```

Now here are some more interesting JimScript programs:

```
writeTheAlphabet :: E
writeTheAlphabet =
ESeq
(ESet "x" (EInt 1))
(EWhile (ENot (EBinOp Eq (EGet "x") (EInt 27))) (ESeq
(EWriteByte (EBinOp Add (EInt 64) (EGet "x")))
(ESet "x" (EBinOp Add (EGet "x") (EInt 1)))))
```

```
$ ./jimscript
ABCDEFGHIJKLMNOPQRSTUVWXYZ
```

```
uppercase :: E
uppercase =
(EDoWhile (ESeq
(ESet "c" EReadByte)
(EIf (EBinOp Eq (EGet "c") (EInt (-1)))
ESkip
(EIf (EBinOp And
(EBinOp Lte (EInt 97) (EGet "c"))
(EBinOp Lte (EGet "c") (EInt 122)))
(EWriteByte (EBinOp Sub (EGet "c") (EInt 32)))
(EWriteByte (EGet "c")))))
(ENot (EBinOp Eq (EGet "c") (EInt (-1)))))
```

```
$ ./jimscript
hello
HELLO
```

Programs in this language are Haskell expressions of type `E`

;
there is no defined syntax.
I might define a syntax and write a parser next.

Addendum:
some of the `eval`

definitions were long-winded so I omitted them.
Here’s are the rest.

```
eval vars (ENot e) = do
(v, vars') <- eval vars e
case v of
0 -> return (1, vars)
_ -> return (0, vars)
eval vars (EIf c t e) = do
(cond, vars') <- eval vars c
case cond of
0 -> eval vars' e
_ -> eval vars' t
eval vars (EDoWhile e c) = do
(_, vars') <- eval vars e
(cond, vars'') <- eval vars' c
case cond of
0 -> return (0, vars'')
_ -> eval vars'' (EDoWhile e c)
eval vars (ESeq e1 e2) = do
(_, vars') <- eval vars e1
eval vars' e2
eval vars ESkip = return (0, vars)
eval vars EReadByte = do
exp :: Either Exception.SomeException (String,Foreign.C.Types.CSize) <- Exception.try (PosixIO.fdRead PosixIO.stdInput 1)
case exp of
Left _ -> return (-1, vars)
Right (str,count) -> do
if count == 0 then
return (-1, vars)
else do
let [c] = str
return (Char.ord c, vars)
```

### More by Jim

- The inception bar: a new phishing method
- The hacker hype cycle
- Project C-43: the lost origins of asymmetric crypto
- How Hacker News stays interesting
- My parents are Flat-Earthers
- The dots do matter: how to scam a Gmail user
- The sorry state of OpenSSL usability
- I hate telephones
- The Three Ts of Time, Thought and Typing: measuring cost on the web
- Granddad died today
- Your syntax highlighter is wrong

Tagged #programming, #haskell. All content copyright James Fisher 2018. This post is not associated with my employer. Found an error? Edit this page.