Is implementation of memoization in Haskell a monad? - haskell

I tried to solve Project Euler's Problem 14 (involving the length of a Collatz sequence) using memoization, and this is what I did to keep the results of previous calculations. I have a function, collatzSequence, that I want to memoize, and I memoize it with computeWithMemo, which takes a function, a value to apply the function to, and a Map, and returns the function's value at that point together with an updated Map. Is this essentially what the Monad pattern is? Thanks!
import Data.Map

computeWithMemo :: (Int -> Int) -> Int -> Map Int Int -> (Maybe Int, Map Int Int)
computeWithMemo fun key memo
  | elem key (Data.Map.keys memo) = (Data.Map.lookup key memo, memo)
  | otherwise = (Just (fun key), Data.Map.insert key (fun key) memo)

collatzSequence :: Int -> Int
collatzSequence x
  | x == 1 = 1
  | even x = 1 + collatzSequence (x `div` 2)
  | odd x = 1 + collatzSequence (x*3 + 1)

memoize f = computeWithMemo f
memoizedCollatz = memoize collatzSequence

solve x m
  | x > 1 = solve (x-1) (snd (computeWithMemo collatzSequence x m))
  | otherwise = m

solution = solve 10000 Data.Map.empty

It's an ad-hoc reimplementation of parts of the internals of the State monad, in the sense that it creates and executes functions that take and return an additional argument in a way that simulates state.
The main differences between your code and State are:
You hard-code the logic of passing around state for a certain type of function in your solve function.
State provides a function >>= (bind) that defines how to combine two stateful functions, or how to call one stateful function from another (all monads are required to do this).
You hard-code the process of creating a stateful function from a stateless one taking and returning an Int.
State provides a function return that can be used to make any stateless function stateful (all monads are required to do this).
You hard-code the operations you can do on your state, specifically memoizing functions in a Map Int Int.
State provides some functions to get, set and modify the state that together with >>= can be used to create functions being stateful in all sorts of ways (this is specific to State, and not to monads in general).
So yes, you have basically defined a very, very specific and narrow case of one specific monad!
If you want to formally make it a true monad, you can define analogs to >>= and return, and perhaps even implement the Monad typeclass so you can use Haskell's combinators and syntactic sugar on them.
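For comparison, here is a rough sketch (not the question's code) of the same memoized Collatz computation written against the real State monad; collatzM and the usage line are illustrative names only, and the snippet assumes the mtl package is available:
import qualified Data.Map as Map
import Control.Monad.State (State, gets, modify, evalState)

-- Memoized Collatz chain length, with the cache carried in a State (Map Int Int).
collatzM :: Int -> State (Map.Map Int Int) Int
collatzM 1 = return 1
collatzM x = do
  cached <- gets (Map.lookup x)
  case cached of
    Just n  -> return n
    Nothing -> do
      rest <- collatzM (if even x then x `div` 2 else 3 * x + 1)
      let len = 1 + rest
      modify (Map.insert x len)
      return len

-- Longest chain length among starting values 1..9999, for example:
-- maximum (evalState (mapM collatzM [1 .. 9999]) Map.empty)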

Related

How to define the type of a non-recursive function that still has a recursive type

Given a JavaScript function like
const isNull = x => x === null ? [] : [isNull];
Such a function might be nonsense, but that is not the question here.
When I tried to express a Haskell-like type annotation, I failed. Likewise with attempting to implement a similar function in Haskell:
let isZero = \n -> if n == 0 then [] else [isZero] -- doesn't compile
Is there a term for this kind of function that isn't recursive itself, but is recursive in its type? Can such functions only be expressed in dynamically typed languages?
Sorry if this is obvious - my Haskell knowledge (including strict type systems) is rather superficial.
You need to define an explicit recursive type for that.
newtype T = T (Int -> [T])
isZero :: T
isZero = T (\n -> if n == 0 then [] else [isZero])
The price to pay is the wrapping/unwrapping of the T constructor, but it is feasible.
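As a quick illustration of that wrapping/unwrapping, here is a hypothetical helper (applyT is my name, not part of the answer) that peels off the constructor so the wrapped function can be applied:
-- Unwrap a T and apply the function inside it.
applyT :: T -> Int -> [T]
applyT (T f) = f

-- applyT isZero 0  evaluates to []
-- applyT isZero 5  evaluates to [isZero], i.e. one more T to apply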
If you want to emulate a Javascript-like untyped world (AKA unityped, or dynamically typed), you can even use
data Value
= VInt Int
| VList [Value]
| VFun (Value -> Value)
...
(beware of a known bug)
In principle, every Javascript value can be represented by the above huge sum type. For example, application becomes something like
jsApply (VFun f) v = f v
jsApply _ _ = error "Can not apply a non-function value"
Note how static type checks are turned into dynamic checks, in this way. Similarly, static type errors are turned into runtime errors.
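To make this concrete, here is a hedged sketch of the original JavaScript isNull encoded against such a Value type; it restates the sum type for self-containedness, and the VNull constructor is hypothetical, standing in for one of the elided cases above. The comments use the jsApply from the answer:
data Value
  = VInt Int
  | VNull                  -- hypothetical constructor for JavaScript's null
  | VList [Value]
  | VFun (Value -> Value)

-- The JavaScript `isNull` as a Value:
jsIsNull :: Value
jsIsNull = VFun $ \x -> case x of
  VNull -> VList []
  _     -> VList [jsIsNull]

-- jsApply jsIsNull VNull     ==> VList []
-- jsApply (VInt 3) (VInt 4)  ==> runtime error, not a type error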
Chi showed how such an infinite type can be implemented: you need a newtype wrapper to “hide” the infinite recursion.
An intriguing alternative is to use a fixpoint formulation. Recall that you could pseudo-define something recursive like your example as
isZero = fix $ \f n -> if n == 0 then [] else [f]
Likewise, the type can actually be expressed as a fixpoint of the relevant functors, namely of the composition of (Int ->) and [] (which in transformer gestalt is ListT):
isZero :: Fix (ListT ((->) Int))
isZero = Fix . ListT $ \n -> if n==0 then [] else [isZero]
Also worth noting is that you probably don't really want ListT there. MaybeT would be more natural if you only ever have zero or one elements. Even more nicely though, you can use the fact that functor fixpoints are closely related to the free monad, which gives you exactly that “possibly trivial” alternative case:
isZero' :: Free ((->) Int) ()
isZero' = wrap $ \n -> if n==0 then Pure () else isZero'
Pure () is just return () in the monad instance, so you can as well replace the if construct with the standard when:
isZero' = wrap $ \n -> when (n/=0) isZero'

Short-lived memoization in Haskell?

In an object-oriented language when I need to cache/memoize the results of a function for a known life-time I'll generally follow this pattern:
Create a new class
Add to the class a data member and a method for each function result I want to cache
Implement the method to first check to see if the result has been stored in the data member. If so, return that value; else call the function (with the appropriate arguments) and store the returned result in the data member.
Objects of this class will be initialized with values that are needed for the various function calls.
This object-based approach is very similar to the function-based memoization pattern described here: http://www.bardiak.com/2012/01/javascript-memoization-pattern.html
The main benefit of this approach is that the results are kept around only for the life time of the cache object. A common use case is in the processing of a list of work items. For each work item one creates the cache object for that item, processes the work item with that cache object then discards the work item and cache object before proceeding to the next work item.
What are good ways to implement short-lived memoization in Haskell? And does the answer depend on if the functions to be cached are pure or involve IO?
Just to reiterate - it would be nice to see solutions for functions which involve IO.
Let's use Luke Palmer's memoization library: Data.MemoCombinators
import qualified Data.MemoCombinators as Memo
import Data.Function (fix) -- we'll need this too
I'm going to define things slightly differently from how his library does, but it's basically the same (and furthermore, compatible). A "memoizable" thing takes itself as input, and produces the "real" thing.
type Memoizable a = a -> a
A "memoizer" takes a function and produces the memoized version of it.
type Memoizer a b = (a -> b) -> a -> b
Let's write a little function to put these two things together. Given a Memoizable function and a Memoizer, we want the resultant memoized function.
runMemo :: Memoizer a b -> Memoizable (a -> b) -> a -> b
runMemo memo f = fix (f . memo)
This is a little magic using the fixpoint combinator (fix). Never mind that; you can google it if you are interested.
So let's write a Memoizable version of the classic fib example:
fib :: Memoizable (Integer -> Integer)
fib self = go
  where
    go 0 = 1
    go 1 = 1
    go n = self (n-1) + self (n-2)
Using a self convention makes the code straightforward. Remember, self is what we expect to be the memoized version of this very function, so recursive calls should be on self. Now fire up ghci.
ghci> let fib' = runMemo Memo.integral fib
ghci> fib' 10000
WALL OF NUMBERS CRANKED OUT RIDICULOUSLY FAST
Now, the cool thing about runMemo is you can create more than one freshly memoized version of the same function, and they will not share memory banks. That means that I can write a function that locally creates and uses fib', but then as soon as fib' falls out of scope (or earlier, depending on the intelligence of the compiler), it can be garbage collected. It doesn't have to be memoized at the top level. This may or may not play nicely with memoization techniques that rely on unsafePerformIO. Data.MemoCombinators uses a pure, lazy Trie, which fits perfectly with runMemo.
Rather than creating an object which essentially becomes a memoization manager, you can simply create memoized functions on demand. The catch is that if your function is recursive, it must be written as Memoizable. The good news is you can plug in any Memoizer that you wish. You could even use:
noMemo :: Memoizer a b
noMemo f = f
ghci> let fib' = runMemo noMemo fib
ghci> fib' 30 -- wait a while; it's computing stupidly
1346269
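To illustrate the short-lived aspect (building on the runMemo and fib defined above), here is a sketch of a function that creates a memoized fib only for the duration of a single call; sumOfFibs is an illustrative name, not from the answer:
-- The memo table behind fib' lives only as long as this call is running,
-- so it can be garbage collected once sumOfFibs returns.
sumOfFibs :: Integer -> Integer
sumOfFibs n = sum (map fib' [0 .. n])
  where
    fib' = runMemo Memo.integral fib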
Lazy Haskell programming is, in a way, the memoization paradigm taken to an extreme. Also, whatever you can do in an imperative language is possible in Haskell, using the IO monad, the ST monad, monad transformers, arrows, or what have you.
The only problem is that these abstraction devices are much more complicated than the imperative equivalents you mentioned, and they need a pretty deep mind-rewiring.
I believe the above answers are both more complex than necessary, although they might be more portable than what I'm about to describe.
As I understand it, there is a rule in GHC that each value is computed exactly once when its enclosing lambda expression is entered. You may thus create exactly your short-lived memoization object as follows.
import qualified Data.Vector as V

indexerVector :: (t -> Int) -> V.Vector t -> Int -> [t]
indexerVector idx vec = \e -> tbl V.! e
  where
    m   = maximum $ map idx $ V.toList vec
    tbl = V.accumulate (flip (:)) (V.replicate (m+1) [])
                       (V.map (\v -> (idx v, v)) vec)
What does this do? It groups all the elements of the Data.Vector t passed as its second argument vec according to the integer computed by its first argument idx, retaining the grouping as a Data.Vector [t]. It returns a function of type Int -> [t] which looks up this grouping by the pre-computed index value.
Our compiler GHC has promised that tbl shall only be thunked once when we invoke indexerVector. We may therefore assign the lambda expression \e -> tbl V.! e returned by indexerVector to another value, which we may use repeatedly without fear that tbl ever gets recomputed. You may verify this by inserting a trace on tbl.
In short, your caching object is exactly this lambda expression.
I've found that almost anything you can accomplish with a short term object can be better accomplished by returning a lambda expression like this.
You can use the very same pattern in Haskell too. Lazy evaluation will take care of checking whether a value has been evaluated already. This has been mentioned multiple times already, but a code example could be useful. In the example below, memoedValue will be calculated only once, when it is demanded.
data Memoed = Memoed
  { value       :: Int
  , memoedValue :: Int
  }

memo :: Int -> Memoed
memo i = Memoed
  { value       = i
  , memoedValue = expensiveComputation i
  }
Even better, you can memoize values which depend on other memoized values. You should avoid dependency loops, though; they can lead to nontermination.
data Memoed = Memoed
  { value        :: Int
  , memoedValue1 :: Int
  , memoedValue2 :: Int
  }

memo :: Int -> Memoed
memo i = r
  where
    r = Memoed
      { value        = i
      , memoedValue1 = expensiveComputation i
      , memoedValue2 = anotherComputation (memoedValue1 r)
      }
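As a usage sketch (assuming the first Memoed/memo definitions above and a hypothetical placeholder expensiveComputation), the expensive field is forced only once even if it is consulted several times:
-- Hypothetical stand-in for some costly pure function:
expensiveComputation :: Int -> Int
expensiveComputation i = sum [1 .. i * 100000]

demo :: Int
demo = memoedValue m + memoedValue m  -- the second access reuses the first result
  where
    m = memo 42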

Why monads? How do they resolve side effects?

I am learning Haskell and trying to understand Monads. I have two questions:
From what I understand, Monad is just another typeclass that declares ways to interact with data inside "containers", including Maybe, List, and IO. It seems clever and clean to implement these 3 things with one concept, but really, the point is so there can be clean error handling in a chain of functions, containers, and side effects. Is this a correct interpretation?
How exactly is the problem of side effects solved? With this concept of containers, the language essentially says anything inside the containers is non-deterministic (such as I/O). Because lists and IOs are both containers, lists are equivalence-classed with IO, even though values inside lists seem pretty deterministic to me. So what is deterministic and what has side effects? I can't wrap my head around the idea that a basic value is deterministic, until you stick it in a container (which is nothing more than the same value with some other values next to it, e.g. Nothing) and it can now be random.
Can someone explain how, intuitively, Haskell gets away with changing state with inputs and output? I'm not seeing the magic here.
The point is so there can be clean error handling in a chain of functions, containers, and side effects. Is this a correct interpretation?
Not really. You've mentioned a lot of concepts that people cite when trying to explain monads, including side effects, error handling and non-determinism, but it sounds like you've gotten the incorrect sense that all of these concepts apply to all monads. But there's one concept you mentioned that does: chaining.
There are two different flavors of this, so I'll explain it two different ways: one without side effects, and one with side effects.
No Side Effects:
Take the following example:
addM :: (Monad m, Num a) => m a -> m a -> m a
addM ma mb = do
  a <- ma
  b <- mb
  return (a + b)
This function adds two numbers, with the twist that they are wrapped in some monad. Which monad? Doesn't matter! In all cases, that special do syntax de-sugars to the following:
addM ma mb =
  ma >>= \a ->
  mb >>= \b ->
  return (a + b)
... or, with operator precedence made explicit:
ma >>= (\a -> mb >>= (\b -> return (a + b)))
Now you can really see that this is a chain of little functions, all composed together, and its behavior will depend on how >>= and return are defined for each monad. If you're familiar with polymorphism in object-oriented languages, this is essentially the same thing: one common interface with multiple implementations. It's slightly more mind-bending than your average OOP interface, since the interface represents a computation policy rather than, say, an animal or a shape or something.
Okay, let's see some examples of how addM behaves across different monads. The Identity monad is a decent place to start, since its definition is trivial:
instance Monad Identity where
  return a = Identity a      -- create an Identity value
  (Identity a) >>= f = f a   -- apply f to a
So what happens when we say:
addM (Identity 1) (Identity 2)
Expanding this, step by step:
(Identity 1) >>= (\a -> (Identity 2) >>= (\b -> return (a + b)))
(\a -> (Identity 2) >>= (\b -> return (a + b))) 1
(Identity 2) >>= (\b -> return (1 + b))
(\b -> return (1 + b)) 2
return (1 + 2)
Identity 3
Great. Now, since you mentioned clean error handling, let's look at the Maybe monad. Its definition is only slightly trickier than Identity:
instance Monad Maybe where
  return a = Just a        -- same as Identity monad!
  (Just a) >>= f = f a     -- same as Identity monad again!
  Nothing >>= _ = Nothing  -- the only real difference from Identity
So you can imagine that if we say addM (Just 1) (Just 2) we'll get Just 3. But for grins, let's expand addM Nothing (Just 1) instead:
Nothing >>= (\a -> (Just 1) >>= (\b -> return (a + b)))
Nothing
Or the other way around, addM (Just 1) Nothing:
(Just 1) >>= (\a -> Nothing >>= (\b -> return (a + b)))
(\a -> Nothing >>= (\b -> return (a + b))) 1
Nothing >>= (\b -> return (1 + b))
Nothing
So the Maybe monad's definition of >>= was tweaked to account for failure. When a function is applied to a Maybe value using >>=, you get what you'd expect.
Okay, so you mentioned non-determinism. Yes, the list monad can be thought of as modeling non-determinism in a sense... It's a little weird, but think of the list as representing alternative possible values: [1, 2, 3] is not a collection, it's a single non-deterministic number that could be either one, two or three. That sounds dumb, but it starts to make some sense when you think about how >>= is defined for lists: it applies the given function to each possible value. So addM [1, 2] [3, 4] is actually going to compute all possible sums of those two non-deterministic values: [4, 5, 5, 6].
Okay, now to address your second question...
Side Effects:
Let's say you apply addM to two values in the IO monad, like:
addM (return 1 :: IO Int) (return 2 :: IO Int)
You don't get anything special, just 3 in the IO monad. addM does not read or write any mutable state, so it's kind of no fun. Same goes for the State or ST monads. No fun. So let's use a different function:
fireTheMissiles :: IO Int -- returns the number of casualties
Clearly the world will be different each time missiles are fired. Clearly. Now let's say you're trying to write some totally innocuous, side effect free, non-missile-firing code. Perhaps you're trying once again to add two numbers, but this time without any monads flying around:
add :: Num a => a -> a -> a
add a b = a + b
and all of a sudden your hand slips, and you accidentally typo:
add a b = a + b + fireTheMissiles
An honest mistake, really. The keys were so close together. Fortunately, because fireTheMissiles was of type IO Int rather than simply Int, the compiler is able to avert disaster.
Okay, totally contrived example, but the point is that in the case of IO, ST and friends, the type system keeps effects isolated to some specific context. It doesn't magically eliminate side effects, making code referentially transparent that shouldn't be, but it does make it clear at compile time what scope the effects are limited to.
So getting back to the original point: what does this have to do with chaining or composition of functions? Well, in this case, it's just a handy way of expressing a sequence of effects:
fireTheMissilesTwice :: IO ()
fireTheMissilesTwice = do
  a <- fireTheMissiles
  print a
  b <- fireTheMissiles
  print b
Summary:
A monad represents some policy for chaining computations. Identity's policy is pure function composition, Maybe's policy is function composition with failure propagation, IO's policy is impure function composition, and so on.
Let me start by pointing at the excellent "You could have invented monads" article. It illustrates how the Monad structure can naturally manifest while you are writing programs. But the tutorial doesn't mention IO, so I will have a stab here at extending the approach.
Let us start with what you probably have already seen - the container monad. Let's say we have:
f, g :: Int -> [Int]
One way of looking at this is that it gives us a number of possible outputs for every possible input. What if we want all possible outputs for the composition of both functions? Giving all possibilities we could get by applying the functions one after the other?
Well, there's a function for that:
fg x = concatMap g $ f x
If we put this more generally, we get
fg x = f x >>= g
xs >>= f = concatMap f xs
return x = [x]
Why would we want to wrap it like this? Well, writing our programs primarily using >>= and return gives us some nice properties - for example, we can be sure that it's relatively hard to "forget" solutions; we would have to drop them explicitly, say by adding another function skip. And also, we now have a monad and can use all the combinators from the monad library!
Now, let us jump to your trickier example. Let's say the two functions are "side-effecting". That's not non-deterministic, it just means that in theory the whole world is both their input (as it can influence them) as well as their output (as the function can influence it). So we get something like:
f, g :: Int -> RealWorld# -> (Int, RealWorld#)
If we now want f to get the world that g left behind, we'd write:
fg x rw = let (y, rw')  = f x rw
              (r, rw'') = g y rw'
          in (r, rw'')
Or generalized:
fg x = f x >>= g
x >>= f = \rw -> let (y, rw')  = x rw
                     (r, rw'') = f y rw'
                 in (r, rw'')
return x = \rw -> (x, rw)
Now if the user can only use >>=, return and a few pre-defined IO values we get a nice property again: The user will never actually see the RealWorld# getting passed around! And that is a very good thing, as you aren't really interested in the details of where getLine gets its data from. And again we get all the nice high-level functions from the monad libraries.
So the important things to take away:
The monad captures common patterns in your code, like "always pass all elements of container A to container B" or "pass this real-world-tag through". Often, once you realize that there is a monad in your program, complicated things become simply applications of the right monad combinator.
The monad allows you to completely hide the implementation from the user. It is an excellent encapsulation mechanism, be it for your own internal state or for how IO manages to squeeze non-purity into a pure program in a relatively safe way.
Appendix
In case someone is still scratching his head over RealWorld# as much as I did when I started: There's obviously more magic going on after all the monad abstraction has been removed. Then the compiler will make use of the fact that there can only ever be one "real world". That's good news and bad news:
It follows that the compiler must guarantee execution ordering between functions (which is what we were after!)
But it also means that actually passing the real world isn't necessary as there is only one we could possibly mean: The one that is current when the function gets executed!
Bottom line is that once execution order is fixed, RealWorld# simply gets optimized out. Therefore programs using the IO monad actually have zero runtime overhead. Also note that using RealWorld# is obviously only one possible way to implement IO - but it happens to be the one GHC uses internally. The good thing about monads is that, again, the user really doesn't need to know.
You could see a given monad m as a set/family (or realm, domain, etc.) of actions (think of a C statement). The monad m defines the kind of (side-)effects that its actions may have:
with [] you can define actions which can fork their executions in different "independent parallel worlds";
with Either Foo you can define actions which can fail with errors of type Foo;
with IO you can define actions which can have side-effects on the "outside world" (access files, network, launch processes, do a HTTP GET ...);
you can have a monad whose effect is "randomness" (see package MonadRandom);
you can define a monad whose actions can make a move in a game (say chess, Go…) and receive moves from an opponent, but are not able to write to your filesystem or anything else.
Summary
If m is a monad, m a is an action which produces a result/output of type a.
The >> and >>= operators are used to create more complex actions out of simpler ones:
a >> b is a macro-action which does action a and then action b;
a >> a does action a and then action a again;
with >>= the second action can depend on the output of the first one.
The exact meaning of what an action is and what doing an action and then another one is depends on the monad: each monad defines an imperative sublanguage with some features/effects.
Simple sequencing (>>)
Let's say we have a given monad M and some actions incrementCounter, decrementCounter, readCounter:
instance Monad M where ...
-- Modify the counter and do not produce any result:
incrementCounter :: M ()
decrementCounter :: M ()
-- Get the current value of the counter
readCounter :: M Integer
Now we would like to do something interesting with those actions. The first thing we would like to do with those actions is to sequence them. As in say C, we would like to be able to do:
// This is C:
counter++;
counter++;
We define an "sequencing operator" >>. Using this operator we can write:
incrementCounter >> incrementCounter
What is the type of "incrementCounter >> incrementCounter"?
It is an action made of two smaller actions, just as in C you can write compound statements from atomic statements:
// This is a macro statement made of several statements
{
    counter++;
    counter++;
}
// and we can use it anywhere we may use a statement:
if (condition) {
    counter++;
    counter++;
}
it can have the same kind of effects as its subactions;
it does not produce any output/result.
So we would like incrementCounter >> incrementCounter to be of type M (): an (macro-)action with the same kind of possible effects but without any output.
More generally, given two actions:
action1 :: M a
action2 :: M b
we define a >> b as the macro-action which is obtained by doing (whatever that means in our domain of action) a then b, and which produces as output the result of the execution of the second action. The type of >> is:
(>>) :: M a -> M b -> M b
or more generally:
(>>) :: (Monad m) => m a -> m b -> m b
We can define bigger sequence of actions from simpler ones:
action1 >> action2 >> action3 >> action4
Input and outputs (>>=)
We would like to be able to increment by something other than 1 at a time:
incrementBy 5
We want to provide some input to our actions; in order to do this we define a function incrementBy taking an Int and producing an action:
incrementBy :: Int -> M ()
Now we can write things like:
incrementCounter >> readCounter >> incrementBy 5
But we have no way to feed the output of readCounter into incrementBy. In order to do this, a slightly more powerful version of our sequencing operator is needed. The >>= operator can feed the output of a given action as input to the next action. We can write:
readCounter >>= incrementBy
It is an action which executes the readCounter action, feeds its output to the incrementBy function and then executes the resulting action.
The type of >>= is:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
A (partial) example
Let's say I have a Prompt monad which can only display information (text) to the user and ask the user for information:
-- We don't have access to the internal structure of the Prompt monad
module Prompt (Prompt(), echo, prompt) where
-- Opaque
data Prompt a = ...
instance Monad Prompt where ...
-- Display a line to the CLI:
echo :: String -> Prompt ()
-- Ask a question to the user:
prompt :: String -> Prompt String
Let's try to define a promptBoolean message action which asks a question and produces a boolean value.
We use the prompt (message ++ "[y/n]") action and feed its output to a function f:
f "y" should be an action which does nothing but produce True as output;
f "n" should be an action which does nothing but produce False as output;
anything else should restart the action (do the action again);
promptBoolean would look like this:
-- Incomplete version, some bits are missing:
promptBoolean :: String -> Prompt Bool
promptBoolean message = prompt (message ++ "[y/n]") >>= f
  where f result = if result == "y"
                   then ???? -- We need here an action which does nothing but produce `True` as output
                   else if result == "n"
                        then ???? -- We need here an action which does nothing but produce `False` as output
                        else echo "Input not recognised, try again." >> promptBoolean message
Producing a value without effect (return)
In order to fill the missing bits in our promptBoolean function, we need a way to represent dummy actions without any side effect but which only outputs a given value:
-- "return 5" is an action which does nothing but outputs 5
return :: (Monad m) => a -> m a
and we can now write our promptBoolean function:
promptBoolean :: String -> Prompt Bool
promptBoolean message = prompt (message ++ "[y/n]") >>= f
  where f result = if result == "y"
                   then return True
                   else if result == "n"
                        then return False
                        else echo "Input not recognised, try again." >> promptBoolean message
By composing those two simple actions (promptBoolean, echo) we can define any kind of dialogue between the user and your program (the actions of the program are deterministic as our monad does not have a "randomness effect").
promptInt :: String -> Prompt Int
promptInt = ... -- similar

-- Classic "guess a number" game/dialogue
guess :: Int -> Prompt ()
guess n = promptInt "Guess:" >>= f
  where f m = if m == n
              then echo "Found"
              else (if m > n
                    then echo "Too big"
                    else echo "Too small") >> guess n
The operations of a monad
A Monad is a set of actions which can be composed with the return and >>= operators:
>>= for action composition;
return for producing a value without any (side-)effect.
These two operators are the minimal operators needed to define a Monad.
In Haskell, the >> operator is needed as well, but it can in fact be derived from >>=:
(>>) :: Monad m => m a -> m b -> m b
a >> b = a >>= f
  where f x = b
In Haskell, an extra fail operator is needed as well, but this is really a hack (and it might be removed from Monad in the future).
This is the Haskell definition of a Monad:
class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
  (>>) :: m a -> m b -> m b -- can be derived from (>>=)
  fail :: String -> m a -- mostly a hack
Actions are first-class
One great thing about monads is that actions are first-class. You can store them in variables, and you can define functions which take actions as input and produce other actions as output. For example, we can define a while operator:
-- while x y : does action y while action x outputs True
while :: (Monad m) => m Bool -> m a -> m ()
while x y = x >>= f
  where f True  = y >> while x y
        f False = return ()
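As a hypothetical usage example (reusing the counter actions sketched earlier, which are themselves only type signatures), while could drive a loop like this:
-- Keep incrementing until the counter reaches 10 (sketch only; M and the
-- counter actions are the hypothetical ones from above).
countToTen :: M ()
countToTen = while (fmap (< 10) readCounter) incrementCounter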
Summary
A Monad is a set of actions in some domain. The monad/domain defines the kind of "effects" which are possible. The >> and >>= operators represent sequencing of actions, and monadic expressions may be used to represent any kind of "imperative (sub)program" in your (functional) Haskell program.
The great things are that:
you can design your own Monad which supports the features and effects that you want
see Prompt for an example of a "dialogue only subprogram",
see Rand for an example of "sampling only subprogram";
you can write your own control structures (while, throw, catch or more exotic ones) as functions taking actions and composing them in some way to produce a bigger macro-actions.
MonadRandom
A good way of understanding monads is the MonadRandom package. The Rand monad is made of actions whose output can be random (the effect is randomness). An action in this monad is some kind of random variable (or, more exactly, a sampling process):
-- Sample an Int from some distribution
action :: Rand Int
Using Rand to do some sampling/random algorithms is quite interesting because you have random variables as first class values:
-- Estimate mean by sampling nsamples times the random variable x
sampleMean :: Real a => Int -> Rand a -> Rand a
sampleMean n x = ...
In this setting, the sequence function from Prelude,
sequence :: Monad m => [m a] -> m [a]
becomes
sequence :: [Rand a] -> Rand [a]
It creates a random variable obtained by sampling independently from a list of random variables.
There are three main observations concerning the IO monad:
1) You can't get values out of it. Other types like Maybe might allow you to extract values, but neither the monad class interface itself nor the IO data type allows it.
2) "Inside" IO is not only the real value but also that "RealWorld" thing. This dummy value is used to enforce the chaining of actions by the type system: If you have two independent calculations, the use of >>= makes the second calculation dependent on the first.
3) Assume a non-deterministic thing like random :: () -> Int, which isn't allowed in Haskell. If you change the signature to random :: Blubb -> (Blubb, Int), it is allowed, if you make sure that nobody ever can use a Blubb twice: Because in that case all inputs are "different", it is no problem that the outputs are different as well.
Now we can use fact 1): Nobody can get anything out of IO, so we can use the RealWorld dummy hidden in IO to serve as a Blubb. There is only one IO in the whole application (the one we get from main), and it takes care of proper sequencing, as we have seen in 2). Problem solved.
One thing that often helps me to understand the nature of something is to examine it in the most trivial way possible. That way, I'm not getting distracted by potentially unrelated concepts. With that in mind, I think it may be helpful to understand the nature of the Identity Monad, as it's the most trivial implementation of a Monad possible (I think).
What is interesting about the Identity Monad? I think it is that it allows me to express the idea of evaluating expressions in a context defined by other expressions. And to me, that is the essence of every Monad I've encountered (so far).
If you already had a lot of exposure to 'mainstream' programming languages before learning Haskell (like I did), then this doesn't seem very interesting at all. After all, in a mainstream programming language, statements are executed in sequence, one after the other (excepting control-flow constructs, of course). And naturally, we can assume that every statement is evaluated in the context of all previously executed statements and that those previously executed statements may alter the environment and the behavior of the currently executing statement.
All of that is pretty much a foreign concept in a functional, lazy language like Haskell. The order in which computations are evaluated in Haskell is well-defined, but sometimes hard to predict, and even harder to control. And for many kinds of problems, that's just fine. But other sorts of problems (e.g. IO) are hard to solve without some convenient way to establish an implicit order and context between the computations in your program.
As far as side-effects go, specifically, often they can be transformed (via a Monad) in to simple state-passing, which is perfectly legal in a pure functional language. Some Monads don't seem to be of that nature, however. Monads such as the IO Monad or the ST monad literally perform side-effecting actions. There are many ways to think about this, but one way that I think about it is that just because my computations must exist in a world without side-effects, the Monad may not. As such, the Monad is free to establish a context for my computation to execute that is based on side-effects defined by other computations.
Finally, I must disclaim that I am definitely not a Haskell expert. As such, please understand that everything I've said is pretty much my own thoughts on this subject and I may very well disown them later when I understand Monads more fully.
the point is so there can be clean error handling in a chain of functions, containers, and side effects
More or less.
how exactly is the problem of side-effects solved?
A value in the I/O monad, i.e. one of type IO a, should be interpreted as a program. p >> q on IO values can then be interpreted as the operator that combines two programs into one that first executes p, then q. The other monad operators have similar interpretations. By assigning a program to the name main, you declare to the compiler that that is the program that has to be executed by its output object code.
As for the list monad, it's not really related to the I/O monad except in a very abstract mathematical sense. The IO monad gives deterministic computation with side effects, while the list monad gives non-deterministic (but not random!) backtracking search, somewhat similar to Prolog's modus operandi.
With this concept of containers, the language essentially says anything inside the containers is non-deterministic
No. Haskell is deterministic. If you ask for integer addition 2+2 you will always get 4.
"Nondeterministic" is only a metaphor, a way of thinking. Everything is deterministic under the hood. If you have this code:
do x <- [4,5]
   y <- [0,1]
   return (x+y)
it is roughly equivalent to Python code
l = []
for x in [4,5]:
    for y in [0,1]:
        l.append(x+y)
You see nondeterminism here? No, it's deterministic construction of a list. Run it twice, you'll get the same numbers in the same order.
You can describe it this way: Choose arbitrary x from [4,5]. Choose arbitrary y from [0,1]. Return x+y. Collect all possible results.
That way seems to involve nondeterminism, but it's only a nested loop (list comprehension). There is no "real" nondeterminism here, it's simulated by checking all possibilities. Nondeterminism is an illusion. The code only appears to be nondeterministic.
This code using the State monad:
do put 0
   x <- get
   put (x+2)
   y <- get
   return (y+3)
gives 5 and seems to involve changing state. As with lists, it's an illusion. There are no "variables" that change (as in imperative languages). Everything is immutable under the hood.
You can describe the code this way: put 0 to a variable. Read the value of a variable to x. Put (x+2) to the variable. Read the variable to y, and return y+3.
That way seems to involve state, but it's only composing functions passing additional parameter. There is no "real" mutability here, it's simulated by composition. Mutability is an illusion. The code only appears to be using it.
Haskell does it this way: you've got functions
a -> s -> (b,s)
Such a function takes an old value of the state and returns a new one. It does not involve mutability or changing variables. It's a function in the mathematical sense.
For example, the function put takes a new value of the state, ignores the current state and returns the new state:
put x _ = ((), x)
Just like you can compose two normal functions
a -> b
b -> c
into
a -> c
using (.) operator you can compose "state" transformers
a -> s -> (b,s)
b -> s -> (c,s)
into a single function
a -> s -> (c,s)
Try writing the composition operator yourself. This is what really happens: there are no "side effects", only passing arguments to functions.
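For reference, one possible spelling of that composition operator (a sketch; composeState is an illustrative name, and this is essentially what the State monad's >>= does under the hood):
-- Compose two "state transformers" into one, threading the state through.
composeState :: (a -> s -> (b, s)) -> (b -> s -> (c, s)) -> (a -> s -> (c, s))
composeState f g = \a s ->
  let (b, s') = f a s  -- run the first transformer on the initial state
  in  g b s'           -- feed its result and the new state to the second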
From what I understand, Monad is just another typeclass that declares ways to interact with data [...]
...providing an interface common to all those types which have an instance. This can then be used to provide generic definitions which work across all monadic types.
It seems clever and clean to implement these 3 things with one concept [...]
...the only three things that are implemented are the instances for those three types (list, Maybe and IO) - the types themselves are defined independently elsewhere.
[...] but really, the point is so there can be clean error handling in a chain of functions, containers, and side effects.
Not just error handling e.g. consider ST - without the monadic interface, you would have to pass the encapsulated-state directly and correctly...a tiresome task.
How exactly is the problem of side-effects solved?
Short answer: Haskell manages them by using types to indicate their presence.
Can someone explain how, intuitively, Haskell gets away with changing state with inputs and output?
"Intuitively"...like what's available over here? Let's try a simple direct comparison instead:
From How to Declare an Imperative by Philip Wadler:
(* page 26 *)
type 'a io = unit -> 'a

infix >>=
val >>= : 'a io * ('a -> 'b io) -> 'b io
fun m >>= k = fn () => let
                         val x = m ()
                         val y = k x ()
                       in
                         y
                       end

val return : 'a -> 'a io
fun return x = fn () => x

val putc : char -> unit io
fun putc c = fn () => putcML c

val getc : char io
val getc = fn () => getcML ()

fun getcML () =
    valOf(TextIO.input1(TextIO.stdIn))

(* page 25 *)
fun putcML c =
    TextIO.output1(TextIO.stdOut,c)
Based on these two answers of mine, this is my Haskell translation:
type IO a = OI -> a

(>>=) :: IO a -> (a -> IO b) -> IO b
m >>= k = \ u -> let !(u1, u2) = partOI u in
                 let !x = m u1 in
                 let !y = k x u2 in
                 y

return :: a -> IO a
return x = \ u -> let !_ = partOI u in x

putc :: Char -> IO ()
putc c = \ u -> putcOI c u

getc :: IO Char
getc = \ u -> getcOI u

-- primitives
data OI
partOI :: OI -> (OI, OI)
putcOI :: Char -> OI -> ()
getcOI :: OI -> Char
Now remember that short answer about side-effects?
Haskell manages them by using types to indicate their presence.
Data.Char.chr :: Int -> Char -- no side effects
getChar :: IO Char -- side effects at
{- :: OI -> Char -} -- work: beware!

Using functors for global variables?

I'm learning Haskell, and am implementing an algorithm for a class. It works fine, but a requirement of the class is that I keep a count of the total number of times I multiply or add two numbers. This is what I would use a global variable for in other languages, and my understanding is that it's anathema to Haskell.
One option is to just have each function return this data along with its actual result. But that doesn't seem fun.
Here's what I was thinking: suppose I have some function f :: Double -> Double. Could I create a data type (Double, IO) then use a functor to define multiplication across a (Double, IO) to do the multiplication and write something to IO. Then I could pass my new data into my functions just fine.
Does this make any sense? Is there an easier way to do this?
EDIT: To be more clear, in an OO language I would declare a class which inherits from Double and then override the * operation. This would allow me to not have to rewrite the type signature of my functions. I'm wondering if there's some way to do this in Haskell.
Specifically, if I define f :: Double -> Double then I should be able to make a functor :: (Double -> Double) -> (DoubleM -> DoubleM) right? Then I can keep my functions the same as they are now.
Actually, your first idea (return the counts with each value) is not a bad one, and can be expressed more abstractly by the Writer monad (in Control.Monad.Writer from the mtl package or Control.Monad.Trans.Writer from the transformers package). Essentially, the writer monad allows each computation to have an associated "output", which can be anything as long as it's an instance of Monoid - a class which defines:
The empty output (mempty), which is the output assigned to return
An associative function (mappend) that combines outputs, which is used when sequencing operations
In this case, your output is a count of operations, the 'empty' value is zero, and the combining operation is addition. For example, if you're tracking operations separately:
data Counts = Counts { additions :: Int, multiplications :: Int }
Make that type an instance of Monoid (which is in the module Data.Monoid), and define your operations as something like:
add :: Num a => a -> a -> Writer Counts a
add x y = do
  tell (Counts { additions = 1, multiplications = 0 })
  return (x + y)
The writer monad, together with your Monoid instance, then takes care of propagating all the 'tells' to the top level. If you wanted, you could even implement a Num instance for Num a => Writer Counts a (or, preferably, for a newtype so you're not creating an orphan instance), so that you can just use the normal numerical operators.
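For concreteness, here is one way the Monoid instance for the Counts type above might look (a sketch only; with recent GHC versions a Semigroup instance is required as well):
instance Semigroup Counts where
  Counts a1 m1 <> Counts a2 m2 = Counts (a1 + a2) (m1 + m2)

instance Monoid Counts where
  mempty = Counts 0 0   -- no operations counted yet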
Here is an example of using Writer for this purpose:
import Control.Monad.Writer
import Data.Monoid
import Control.Applicative -- only for the <$> spelling of fmap
type OpCountM = Writer (Sum Int)
add :: (Num a) => a -> a -> OpCountM a
add x y = tell (Sum 1) >> return (x+y)
mul :: (Num a) => a -> a -> OpCountM a
mul x y = tell (Sum 1) >> return (x*y)
-- and a computation
fib :: Int -> OpCountM Int
fib 0 = return 0
fib 1 = return 1
fib n = do
  n1 <- add n (-1)
  n2 <- add n (-2)
  fibn1 <- fib n1
  fibn2 <- fib n2
  add fibn1 fibn2

main = print (result, opcount)
  where
    (result, opcount) = runWriter (fib 10)
That definition of fib is pretty long and ugly... monadifying can be a pain. It can be made more concise with applicative notation:
fib 0 = return 0
fib 1 = return 1
fib n = join (add <$> (add n (-1) >>= fib) <*> (add n (-2) >>= fib))
But admittedly more opaque for a beginner. I wouldn't recommend that way until you are pretty comfortable with the idioms of Haskell.
What level of Haskell are you learning? There are probably two reasonable answers: have each function return its counts along with its return value like you suggested, or (more advanced) use a monad such as State to keep the counts in the background. You could also write a special-purpose monad to keep the counts; I do not know if that is what your professor intended. Using IO for mutable variables is not the elegant way to solve the problem, and is not necessary for what you need.
Another solution, apart from returning a tuple or using the state monad explicitly, might be to wrap it up in a data type. Something like:
data OperationCountNum = OperationCountNum Int Double deriving (Show,Eq)
instance Num OperationCountNum where
...insert appropriate definitions here
The class Num defines functions on numbers, so you can define the functions +, * etc on your OperationCountNum type in such a way that they keep track of the number of operations required to produce each number.
That way, counting the operations would be hidden and you can use the normal +, * etc operations. You just need to wrap your numbers up in the OperationCountNum type at the start and then extract them at the end.
In the real world, this probably isn't how you'd do it, but it has the advantage of making the code easier to read (no explicit detupling and tupling) and being fairly easy to understand.
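As a rough sketch of how such a Num instance might look for the OperationCountNum type declared above (one possible design, not the original answer's: the Int is read as the running operation count, the Double as the value, and each result carries the number of operations used to produce it):
instance Num OperationCountNum where
  OperationCountNum c1 x + OperationCountNum c2 y =
    OperationCountNum (c1 + c2 + 1) (x + y)   -- one more addition
  OperationCountNum c1 x * OperationCountNum c2 y =
    OperationCountNum (c1 + c2 + 1) (x * y)   -- one more multiplication
  negate (OperationCountNum c x) = OperationCountNum c (negate x)
  abs    (OperationCountNum c x) = OperationCountNum c (abs x)
  signum (OperationCountNum c x) = OperationCountNum c (signum x)
  fromInteger n = OperationCountNum 0 (fromInteger n)

-- Wrap the inputs, compute as usual, then read off the count, e.g.
-- let OperationCountNum count result = 3 * 4 + 5 :: OperationCountNum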

How does Data.MemoCombinators work?

I've been looking at the source for Data.MemoCombinators but I can't really see where the heart of it is.
Please explain to me what the logic is behind all of these combinators and the mechanics of how they actually work to speed up your program in real world programming.
I'm looking for specifics for this implementation, and optionally comparison/contrast with other Haskell approaches to memoization. I understand what memoization is and am not looking for a description of how it works in general.
This library is a straightforward combinatorization of the well-known technique of memoization. Let's start with the canonical example:
fib = (map fib' [0..] !!)
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n-1) + fib (n-2)
I interpret what you said to mean that you know how and why this works. So I'll focus on the combinatorization.
We are essentially trying to capture and generalize the idea of (map f [0..] !!). The type of this function is (Int -> r) -> (Int -> r), which makes sense: it takes a function from Int -> r and returns a memoized version of the same function. Any function which is semantically the identity and has this type is called a "memoizer for Int" (even id, which doesn't memoize). We generalize to this abstraction:
type Memo a = forall r. (a -> r) -> (a -> r)
So a Memo a, a memoizer for a, takes a function from a to anything, and returns a semantically identical function that has been memoized (or not).
The idea of the different memoizers is to find a way to enumerate the domain with a data structure, map the function over them, and then index the data structure. bool is a good example:
bool :: Memo Bool
bool f = table (f True, f False)
  where
    table (t,f) True  = t
    table (t,f) False = f
Functions from Bool are equivalent to pairs, except a pair will only evaluate each component once (as is the case for every value that occurs outside a lambda). So we just map to a pair and back. The essential point is that we are lifting the evaluation of the function above the lambda for the argument (here the last argument of table) by enumerating the domain.
Memoizing Maybe a is a similar story, except now we need to know how to memoize a for the Just case. So the memoizer for Maybe takes a memoizer for a as an argument:
maybe :: Memo a -> Memo (Maybe a)
maybe ma f = table (f Nothing, ma (f . Just))
  where
    table (n,j) Nothing  = n
    table (n,j) (Just x) = j x
The rest of the library is just variations on this theme.
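For example, composing the combinators follows the shape of the argument type. This is a small sketch (f is an arbitrary example function, not from the answer) using the library's actual bool and maybe exports:
import qualified Data.MemoCombinators as Memo

-- An arbitrary function over a compound domain:
f :: Maybe Bool -> Int
f Nothing      = 0
f (Just True)  = 1
f (Just False) = 2

-- Memoize it by composing a memoizer for Maybe with one for Bool:
fMemo :: Maybe Bool -> Int
fMemo = Memo.maybe Memo.bool f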
The way it memoizes integral types uses a more appropriate structure than [0..]. It's a bit involved, but basically just creates an infinite tree (representing the numbers in binary to elucidate the structure):
1
  10
    100
      1000
      1001
    101
      1010
      1011
  11
    110
      1100
      1101
    111
      1110
      1111
So that looking up a number in the tree has running time proportional to the number of bits in its representation.
As sclv points out, Conal's MemoTrie library uses the same underlying technique, but uses a typeclass presentation instead of a combinator presentation. We released our libraries independently at the same time (indeed, within a couple hours!). Conal's is easier to use in simple cases (there is only one function, memo, and it will determine the memo structure to use based on the type), whereas mine is more flexible, as you can do things like this:
boundedMemo :: Integer -> Memo Integer
boundedMemo bound f = \z -> if z < bound then memof z else f z
  where
    memof = integral f
Which only memoizes values less than a given bound, needed for the implementation of one of the Project Euler problems.
There are other approaches, for example exposing an open fixpoint function over a monad:
memo :: MonadState ... m => ((Integer -> m r) -> (Integer -> m r)) -> m (Integer -> m r)
Which allows yet more flexibility, eg. purging caches, LRU, etc. But it is a pain in the ass to use, and also it puts strictness constraints on the function to be memoized (e.g. no infinite left recursion). I don't believe there are any libraries that implement this technique.
Did that answer what you were curious about? If not, perhaps make explicit the points you are confused about?
The heart is the bits function:
-- | Memoize an ordered type with a bits instance.
bits :: (Ord a, Bits a) => Memo a
bits f = IntTrie.apply (fmap f IntTrie.identity)
It is the only function (except the trivial unit :: Memo ()) which can give you a Memo a value. It uses the same idea as in this page about Haskell memoization. Section 2 shows the simplest memoization strategy using a list and section 3 does the same using a binary tree of naturals, similar to the IntTrie used in MemoCombinators.
The basic idea is to use a construction like (map fib [0 ..] !!) or, in the MemoCombinators case, IntTrie.apply (fmap f IntTrie.identity). The thing to notice here is the correspondence between IntTrie.apply and !! and also between IntTrie.identity and [0..].
The next step is memoizing functions with other types of arguments. This is done with the wrap function which uses an isomorphism between types a and b to construct a Memo b from a Memo a. For example:
Memo.integral f
=>
wrap fromInteger toInteger bits f
=>
bits (f . fromInteger) . toInteger
=>
IntTrie.apply (fmap (f . fromInteger) IntTrie.identity) . toInteger
~> (semantically equivalent)
(map (f . fromInteger) [0..] !!) . toInteger
The rest of the source code deals with types like List, Maybe, Either and memoizing multiple arguments.
Some of the work is done by IntTrie: http://hackage.haskell.org/package/data-inttrie-0.0.4
Luke's library is a variation of Conal's MemoTrie library, which he described here: http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/
Some further expansion -- the general notion behind functional memoization is to take a function from a -> b and map it across a datastructure indexed by all possible values of a and containing values of b. Such a datastructure should be lazy in two ways -- first it should be lazy in the values it holds. Second, it should be lazily produced itself. The former is by default in a nonstrict language. The latter is accomplished by using generalized tries.
The various approaches of memocombinators, memotrie, etc are all just ways of creating compositions of pieces of tries over individual types of datastructures to allow for the simple construction of tries for increasingly complex structures.
#luqui One thing that is not clear to me: does this have the same operational behaviour as the following:
fib :: [Int]
fib = map fib' [0..]
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib!!(n-1) + fib!!(n-2)
The above should memoize fib at the top level, and hence if you define two functions:
f n = fib!!n + fib!!(n+1)
If we then compute f 5, we obtain that fib 5 is not recomputed when computing fib 6. It is not clear to me whether the memoization combinators have the same behaviour (i.e. top-level memoization instead of only prohibiting the recomputation "inside" the fib computation), and if so, why exactly?
