Haskell is just some steps away from Java

Start with Java, then:

  1. Forbid using null;
  2. Use only immutable objects, add “final” modifier to everything;
  3. Swap methods by static functions with the the original “this” as the first argument, e.g. “foo.bar()” turns into “bar(foo)”;
  4. Add a lot of features to the type system;
  5. Remove type annotations, i.e. “Foo bar(Foo self)” turns into “bar(self)”;
  6. Remove useless parens, i.e. “bar(foo)” turns into “bar foo”;
  7. Add call-by-need evaluation;
  8. Done, you have Haskell.

It’s not that hard, is it?

One, using null references is a recognized bad practice, see “Null References: The Billion Dollar Mistake.” Java 8 already provides the Optional type to stop using nulls.

Two, immutable objects are a win strategy, see posts by Hirondelle Systems, IBM, Yegor, and others.

Three, as you only have immutable objects, there is no reason to use methods instead of static functions, considering you maintain polymorphism (not quite the case for Java, but for the sake of this rant, consider as if it has this feature).

Four, improve the type system. The type system used by Java language misses a lot of features. If you don’t feel it, just consider this as an added bonus. When you start using the type system features in your favor, you end up with much better code.

Five, imagine the Java compiler could infer the type of the arguments, so you don’t need to type them everywhere. You still have the same static typing language, you just don’t need to write the types. Shorter code means less liability to haunt you.

Six, why all those parens? Just drop them. Less code to write, hurray!

Seven, call-by-need just makes a lot of things easier (also makes a lot of things harder), but I really think it is a winner when you talk about productivity. When coding, I feel it a lot easier to express things in terms of values instead of steps (mathematicians have been doing this since long before computers). Expressing things in terms of values in a universe without call-by-need will result in a lot of useless computations, so call-by-need is a must.

Eight, done! This is Haskell. No functors, monads, arrows, categories or lists needed.

Why this post? Well, I don’t know. It just occurred to me that if you really go into following good coding practices in Java (e.g. avoid null, use immutable objects), you will eventually feel familiar with functional code. Add some more things to the mix, and you end up with Haskell. I think people feel a bit scared at first contact with Haskell (and family) because of the academic and mathematical atmosphere it has, but in the end it is just a lot of good practices that you are kind of required to comply with.

Useful pure functional programming

I guess all programmers used to the imperative model gets stunned wondering how crazy the pure functional model is. After all, how could we do anything useful without side effects, printing to the terminal, opening a window, producing logs and variables? OMG.

I did get stunned with Haskell and because of a happy insistency I realized that the problem was in my head and not in the pure functional model.

Differences

Living for a long time in the context of an imperative world made me get used to think in a specific sequential way. I always needed to tell the computer how to do every computational step, what states to hold, what variables to update, when to return some value, etc.

On the other hand, in the pure functional world, I’m forced to think in a way to transform data. Instead of thinking on each step, now I need to think how to transform some data into another, i.e. how to extract the thing I want from the thing I have.

The classical examples to show this difference involves list manipulation.

Example #1

To calculate the sum of a list in the imperative style, you need to control the current sum state, which needs to be updated at each element.

int sum(int[] list) {
  int result = 0;
  for (int i : list)
    result += i;
  return result;
}

In the functional world, you need to transform a list into a single number that represents the sum of all elements, i.e. the sum of an empty list will always be zero and the sum of any other list can be extracted by adding its first value and the sum of the rest of the list.

sum [] = 0
sum (x:xs) = x + sum xs

Example #2

When we need to maintain control over two lists in the imperative world, one for input and one for output, we need to include the current index of each list in the “bag of things we need to reason about in order to avoid silly errors”.

String[] toText(int[] list) {
    String[] result = new String[list.length];
    for (int i = 0; i < list.length; ++i)
        result[i] = Integer.toString(list[i]);
    return result;
}

The same example in the functional world stays simple as the sum of the list, the only difference is that we transform the list using the list constructor instead of the plus operator.

toText [] = []
toText (x:xs) = show x : toText xs

Motivation

The pure functional model can do many more things beyond these classical examples in a way that I consider to be pretty and with free benefits. This post is yet another try to demonstrate this. As I’m quite new to the pure functional world and have no profound knowledge of Monads, Functors, Arrows and all that stuff that looks brilliant on the hand of those who knows how to use it, I guess this post will be easy to follow, as long as you have a basic understanding of Haskell.

Calculator

We’ll create a calculator in Haskell. All code shown will be pure, with no side effects, including a DSL to write tests, the test executor itself and two other test case transformations, one to write the test in text form and one to write the same text in JUnit form.

Test cases

First, let’s define the test cases’ format. They are a sequence of actions and assertions:

data TestSequence =
    Do Action TestSequence
  | Check Assertion TestSequence
  | Done

With that data definition, the test cases can be written like this:

test =
  Do a.
  Do b.
  Do c.
  Check d$
  Done

The need to write “Done” at the end of the test case bothers me. To avoid that, we can create a type synonym to open-ended tests:

type Test = TestSequence -> TestSequence

Now the tests can be written as:

test =
  Do a.
  Do b.
  Do c.
  Check d

To me, this is a valid small DSL to write our tests cases.

The only action available to our user will be to push the buttons and the only information that could be checked is the display:

data Action =
    Press Button

data Assertion =
    DisplayHasNumber Int

data Button =
    Zero
  | One
  | Two
  | Three
  | Four
  | Five
  | Six
  | Seven
  | Eight
  | Nine
  | Plus
  | Minus
  | Times
  | Divide
  | Equals
  | Clear
    deriving (Show)

First test

The first step is done. We already have a DSL to write tests that type-check. Time to write our first test:

sample :: Test
sample =
    Do (Press One).
    Do (Press Plus).
    Do (Press One).
    Do (Press Equals).
    Check (DisplayHasNumber 2).

    Do (Press Clear).
    Check (DisplayHasNumber 0).

    Do (Press Two).
    Do (Press Zero).
    Check (DisplayHasNumber 20).
    Do (Press Divide).
    Do (Press Two).
    Check (DisplayHasNumber 2).
    Do (Press Equals).
    Check (DisplayHasNumber 10)

Test transformation

Note that the test case is actually a data structure, giving us the possibility to transform it. To make the transformation easier, let’s create a function that applies some generic function to each step, creating a list with the results:

unroll :: (TestSequence -> a) -> Test -> [a]
unroll f t = g (t Done)
  where g Done = [f Done]
        g v@(Do _ next) = f v : g next
        g v@(Check _ next) = f v : g next

Test case -> Text

We can transform the test case into a text:

prettyPrint :: Test -> String
prettyPrint = unlines . unroll prettyPrintTestSequence

prettyPrintTestSequence :: TestSequence -> String
prettyPrintTestSequence s =
    case s of
      Done              -> "end"
      Do action _       -> prettyPrintAction action
      Check assertion _ -> prettyPrintAssertion assertion

prettyPrintAction :: Action -> String
prettyPrintAction (Press button) = 
    "press " ++ prettyPrintButton button

prettyPrintButton :: Button -> String
prettyPrintButton = map toLower . show

prettyPrintAssertion :: Assertion -> String
prettyPrintAssertion (DisplayHasNumber number) = 
    "the display should be showing the number " ++ 
    show number

Let’s try in GHCi:

\> putStr $ prettyPrint sample
press one
press plus
press one
press equals
the display should be showing the number 2
press clear
the display should be showing the number 0
press two
press zero
the display should be showing the number 20
press divide
press two
the display should be showing the number 2
press equals
the display should be showing the number 10
end

Test case -> JUnit

We can translate our test case into JUnit:

generateJUnit :: Test -> String
generateJUnit = 
    ("@Test\npublic void test() {\n" ++) . 
    unlines .
    unroll generateJUnitTestSequence

generateJUnitTestSequence :: TestSequence -> String
generateJUnitTestSequence s =
    case s of
      Done      -> "}"
      Do a _    -> generateJUnitAction a
      Check a _ -> generateJUnitAssertion a

generateJUnitAction :: Action -> String
generateJUnitAction (Press b) =
    generateJUnitButton b ++ ".press();"

generateJUnitButton :: Button -> String
generateJUnitButton b = "getButton" ++ show b ++ "()"

generateJUnitAssertion :: Assertion -> String
generateJUnitAssertion (DisplayHasNumber n) =
    "assertEquals(" ++ show n ++ ", getDisplayNumber());"

At GHCi:

\> putStr $ generateJUnit sample
@Test
public void test() {
getButtonOne().press();
getButtonPlus().press();
getButtonOne().press();
getButtonEquals().press();
assertEquals(2, getDisplayNumber());
getButtonClear().press();
assertEquals(0, getDisplayNumber());
getButtonTwo().press();
getButtonZero().press();
assertEquals(20, getDisplayNumber());
getButtonDivide().press();
getButtonTwo().press();
assertEquals(2, getDisplayNumber());
getButtonEquals().press();
assertEquals(10, getDisplayNumber());
}

Concrete test

And of course, we can use our test case data structure to actually test a calculator implementation:

data TestResult =
    Ok
  | Failed FailureMessage

type FailureMessage = String

instance Show TestResult where
    show Ok = "Test passed"
    show (Failed m) = "Test failed: " ++ m

checkTest :: Test -> TestResult
checkTest t = 
    evalState 
    (threadCheckState . unroll checkTestSequence $ t)
    mkCalculator

threadCheckState :: [State Calculator TestResult] ->
                    State Calculator TestResult 
threadCheckState = go 0
  where go _ [] = return Ok
        go n (x:xs) = x >>= (f n xs)
        f n xs Ok = go (n + 1) xs
        f n _ (Failed m) = 
          return . Failed $
            "Step " ++ show n ++ ". " ++ m

checkTestSequence :: TestSequence ->
                     State Calculator TestResult
checkTestSequence Done = return Ok
checkTestSequence (Do a _) = checkAction a
checkTestSequence (Check a _) = checkAssertion a

checkAction :: Action -> State Calculator TestResult
checkAction (Press b) = do
    modify $ pressButton b 
    return Ok

checkAssertion :: Assertion -> State Calculator TestResult
checkAssertion (DisplayHasNumber n) =
    get >>= \c ->
      if displayNumber c == n
        then return Ok
        else return . Failed $ 
          "Wrong number in display, should be " ++
          show n ++ " but was " ++ show (displayNumber c)

Calculator’s core

Finally, to type-check our tester, we need to define a first version of our calculator. Here is your opportunity to try your own. I’ve produced the following code to pass the test:

data Calculator = Calculator {
    displayNumber :: Int
  , operation :: Maybe (Int -> Int -> Int)
  , savedNumber :: Int
  }

pressButton :: Button -> Calculator -> Calculator
pressButton b =
  case b of
    Zero    -> appendNumber 0
    One     -> appendNumber 1
    Two     -> appendNumber 2
    Three   -> appendNumber 3
    Four    -> appendNumber 4
    Five    -> appendNumber 5
    Six     -> appendNumber 6
    Seven   -> appendNumber 7
    Eight   -> appendNumber 8
    Nine    -> appendNumber 9
    Plus    -> saveOperation (+)
    Minus   -> saveOperation (-)
    Times   -> saveOperation (*)
    Divide  -> saveOperation div
    Equals  -> performOperation
    Clear   -> clear

appendNumber :: Int -> Calculator -> Calculator
appendNumber i c = 
  c { 
    displayNumber = (displayNumber c) * 10 + i 
  }

saveOperation :: (Int -> Int -> Int) ->
                 Calculator -> Calculator
saveOperation f c = 
  c { 
    savedNumber = (displayNumber c)
  , displayNumber = 0
  , operation = Just f
  }

performOperation :: Calculator -> Calculator
performOperation c = 
  c { 
    savedNumber = newNumber
  , displayNumber = newNumber 
  }
  where newNumber = 
          case (operation c) of
            Nothing -> displayNumber c
            Just f  -> let a = savedNumber c
                           b = displayNumber c 
                        in
                           f a b

clear :: Calculator -> Calculator
clear = const mkCalculator

mkCalculator :: Calculator
mkCalculator = 
  Calculator { 
    displayNumber = 0
  , operation = Nothing
  , savedNumber = 0
  }

Running the tests

Now we can run the tester against our implementation at GHCi, let’s do it:

\> checkTest sample
Test passed

This means our calculator is good enough for our use case. To assure the test case can capture an unexpected behavior, let’s introduce a bug changing the value of the multiplication on the function “appendNumber” from “10” to “100” and run the test again:

\> checkTest sample
Test failed: Step 9. Wrong number in display, should be 20 but was 200

Wonderful.

Summary

We’ve built a basic code for a working calculator that is good enough for a simple use case. We can add new tests and keep factoring our code, TDD style. All that was made in a 100% pure way, with no dependency upon IO, state, variables or logs.

Beyond the benefit to be able to do any kind of transformation in the test cases, we can also run as many transformations and as many tests we want in parallel, because the test and the calculator are thread-safe by the simple fact that they are pure.

We did this in only 251 lines of code. How many classes or lines are needed to do the same in your favorite language?

I hope this post can enlighten you to think outside the box of the imperative world. There are many advantages in learning a pure functional language, even if you use imperative languages at work.

The code is available at GitHub: https://gist.github.com/3354394.