How is this hylo solution to the "coin-change" problem designed? - haskell

I came across a nice post on SO by amalloy while looking for hylomorphism examples; it illustrates recursion scheme (RS) usage, with useful discussion and a full implementation:
{-# LANGUAGE DeriveFunctor #-}
import Control.Arrow ( (>>>), (<<<) )
newtype Term f = In {out :: f (Term f)}
type Algebra f a = f a -> a
type Coalgebra f a = a -> f a
cata :: (Functor f) => Algebra f a -> Term f -> a
cata fn = out >>> fmap (cata fn) >>> fn
ana :: (Functor f) => Coalgebra f a -> a -> Term f
ana f = In <<< fmap (ana f) <<< f
hylo :: Functor f => Algebra f b -> Coalgebra f a -> a -> b
hylo alg coalg = ana coalg >>> cata alg
data ChangePuzzle a = Solved Cent
                    | Pending {spend, forget :: a}
                    deriving Functor
type Cent = Int
type ChangePuzzleArgs = ([Cent], Cent)
coins :: [Cent]
coins = [50, 25, 10, 5, 1]
divide :: Coalgebra ChangePuzzle ChangePuzzleArgs
divide (_, 0) = Solved 1
divide ([], _) = Solved 0
divide (coins@(x:xs), n) | n < 0     = Solved 0
                         | otherwise = Pending (coins, n - x) (xs, n)
conquer :: Algebra ChangePuzzle Cent
conquer (Solved n) = n
conquer (Pending a b) = a + b
waysToMakeChange :: ChangePuzzleArgs -> Int
waysToMakeChange = hylo conquer divide
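For a quick sanity check of what this computes (my addition, assuming the code above is loaded in GHCi; 292 is the well-known count for these five US coin denominations):
-- >>> waysToMakeChange (coins, 6)
-- 2        -- {5,1} and six 1s
-- >>> waysToMakeChange (coins, 100)
-- 292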
The code works as expected. Despite having some vague intuition for the RS aspect already, I am still wondering:
since this is about counting combinations, why Solved Cent and not Solved Int? (This may sound like a nitpick, if it is even a reasonable question, but I am hoping it may be the root of the rest of the uncertainty below, although I suspect I missed something more fundamental!)
since we're later summing, in divide, Solved 0/1 presumably signifies failure/success?
in conquer, what does it mean to add, a and b, of Pending? What do those 2 values (as Cents) signify, and what would their sum mean in this context?
in conquer, I would have expected we just need to sum the Solveds, and the author touches on this, but it's not clear, yet, how the Pending case is contributing (eg fixing conquer (Pending a b) = 11 does have an adverse impact on functionality, and it is probably a clue that waysToMakeChange returns 11, or whatever constant that case is fixed to).
in conquer, a and b are Cents, whereas in divide they're ChangePuzzleArgs (aka ([Cent], Cent)) - where does that transformation occur?
Note: being new to SO, I was not able to comment below the original answer, which may have been more appropriate, but I hope this is also useful as is.

since this is about counting combinations, why Solved Cent and not Solved Int? (This may sound like a nitpick, if it is even a reasonable question, but I am hoping it may be the root of the rest of the uncertainty below, although I suspect I missed something more fundamental!)
I would also use Int here.
since we're later summing, in divide, Solved 0/1 presumably signifies failure/success?
Yes, but it's slightly more than that. Solved 0 means "there are exactly 0 ways to generate that change amount" (i.e., failure), while Solved 1 means "there is exactly 1 way to generate that change amount" (i.e., success). In the latter case, not only do we mean "success", but we also report that there is exactly one way to solve the task.
in conquer, what does it mean to add, a and b, of Pending? What do those 2 values (as Cents) signify, and what would their sum mean in this context?
Essentially, Pending a b with a,b::Int means "the number of ways to generate that change amount can be split into two disjoint sets, the first one having a elements, and the second one having b elements".
When we divide, we return Pending ... ... to split the problem into two disjoint subcases, (coins, n - x) and (xs, n). Here coins=(x:xs). We split according to whether we want to use coin x at least one time (hence we need to generate n-x with all the coins), or we don't want to use it at all (hence we need to generate n with the other coins, only).
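To make the split concrete, here is the same recursion written as an ordinary recursive function (my sketch, for intuition only; it is the function that hylo conquer divide computes):
waysDirect :: ([Int], Int) -> Int
waysDirect (_, 0)  = 1
waysDirect ([], _) = 0
waysDirect (cs@(x:xs), n)
  | n < 0     = 0
  | otherwise = waysDirect (cs, n - x) + waysDirect (xs, n)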
in conquer, I would have expected we just need to sum the Solveds, and the author touches on this, but it's not clear, yet, how the Pending case is contributing (eg fixing conquer (Pending a b) = 11 does have an adverse impact on functionality, and it is probably a clue that waysToMakeChange returns 11, or whatever constant that case is fixed to).
Summing all the Solved ... is what we do. The cata magic essentially replaces the straightforward recursive sum
foo (Solved n) = n
foo (Pending case1 case2) = foo case1 + foo case2
with cata conquer where
conquer (Solved n) = n
conquer (Pending a b) = a + b
The magic of cata makes it so that inside Pending, we do not find subtrees upon which we want to recurse, but the result of the recursion, already computed.
in conquer, a and b are Cents, whereas in divide they're ChangePuzzleArgs (aka ([Cent], Cent)) - where does that transformation occur?
This can be indeed subtle at first. I'll provide only a rough intuition.
After ana divide we produce a result in a fixed point of the functor ChangePuzzle. Note how ana at the end returns Term ChangePuzzle, which is the fixed point. There, the pair ([Cent], Cent) magically disappears.
Dually, the Int reappears when we use cata, even if we started from Term ChangePuzzle.
Very roughly, you can think of Term ChangePuzzle as the infinite nesting
ChangePuzzle (ChangePuzzle (ChangePuzzle ( ....
which is coherent with the fact that such a tree might be arbitrarily nested. There, the "argument" of ChangePuzzle essentially disappears.
How do we get the final Int then? Well, we get it because Solved always takes a Cent (i.e. Int) argument, and not an a argument. This provides the base case that makes the final cata recursion work.
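To see the fixed point concretely, here is a tiny hand-built Term ChangePuzzle (my example, not from the original answer); the a parameter is gone, and every hole is filled by another Term ChangePuzzle:
tiny :: Term ChangePuzzle
tiny = In (Pending (In (Solved 1)) (In (Solved 0)))
-- cata conquer tiny = 1 + 0 = 1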

Related

Is `Monad` constraint necessary in `<$!>`

As claimed in the documentation <$!> is the strict version of <$>, but surprisingly
(<$!>) :: Monad m => (a -> b) -> m a -> m b
f <$!> m = do
  x <- m
  let z = f x
  z `seq` return z
instead of the more natural (in my opinion; because it keeps the weaker constraint and mimics $!)
(<$!>) :: Functor f => (a -> b) -> f a -> f b
f <$!> x = x `seq` (f <$> x)
I guess that applying seq after the binding is different from the "natural" approach, but I don't know how different it is. My question is: is there any reason that makes the "natural" approach useless, and is that why the implementation is constrained to Monad?
GHC's commit message includes the following two links, which shed more light on this function:
https://mail.haskell.org/pipermail/libraries/2013-November/021728.html
https://mail.haskell.org/pipermail/libraries/2014-April/022864.html
This is the reason Johan Tibell gave for it (quoting from the linked mailing list):
It works on Monads instead of Functors as required by us inspecting
the argument.
This version is highly convenient if you want to work with
functors/applicatives in e.g. parser and avoid spurious thunks at the
same time. I realized that it was needed while fixing large space
usage (but not space-leak) issues in cassava.
I guess that appliying seq after the binding is different than the "natural" approach, but I don't know how different it is
Since Haskell is functional, seq must work through data dependencies; it sets up a relationship: "when seq x y is evaluated to WHNF, x will have been as well".
The idea here is to pin the evaluation of the result f x to the outer m a, which we know must be evaluated for each >>= or <*> to proceed.
In your version:
Prelude> f <$!> x = x `seq` (f <$> x)
Prelude> let thunk = error "explode"
Prelude> case (+) <$!> Just thunk <*> Just thunk of ; Just _ -> "we can easily build up thunks"
"we can easily build up thunks"
I do wonder if there's a better solution possible though

Using State Monad turns all of my functions into monadic functions

I write a cryptography library in Haskell to learn about cryptography and monads. (Not for real-world use!) The type of my function for primality testing is
prime :: (Integral a, Random a, RandomGen g) => a -> State g Bool
So as you can see, I use the State monad so I don't have to thread the generator through all the time. Internally the prime function uses the Miller-Rabin test, which relies on random numbers, which is why the prime function must also rely on randomness. It makes sense in a way, since the prime function only does a probabilistic test.
Just for reference, the entire prime function is below, but I don't think you need to read it.
-- | findDS n, for odd n, gives odd d and s >= 0 s.t. n=2^s*d.
findDS :: Integral a => a -> (a, a)
findDS n = findDS' (n-1) 0
  where
    findDS' q s
      | even q = findDS' (q `div` 2) (s+1)
      | odd q  = (q, s)

-- | millerRabinOnce n d s a does one MR round test on
-- n using a.
millerRabinOnce :: Integral a => a -> a -> a -> a -> Bool
millerRabinOnce n d s a
  | even n    = False
  | otherwise = not (test1 && test2)
  where
    test1 = powerModulo a d n /= 1
    test2 = and $ map (\t -> powerModulo a ((2^t)*d) n /= n-1)
                      [0..s-1]

-- | millerRabin k n does k MR rounds testing n for primality.
millerRabin :: (RandomGen g, Random a, Integral a) =>
               a -> a -> State g Bool
millerRabin k n = millerRabin' k
  where
    (d, s) = findDS n
    millerRabin' 0 = return True
    millerRabin' k = do
      rest <- millerRabin' $ k - 1
      test <- randomR_st (1, n - 1)
      let this = millerRabinOnce n d s test
      return $ this && rest

-- | primeK k n. Probabilistic primality test of n
-- using k Miller-Rabin rounds.
primeK :: (Integral a, Random a, RandomGen g) =>
          a -> a -> State g Bool
primeK k n
  | n < 2            = return False
  | n == 2 || n == 3 = return True
  | otherwise        = millerRabin (min n k) n

-- | Probabilistic primality test with 64 Miller-Rabin rounds.
prime :: (Integral a, Random a, RandomGen g) =>
         a -> State g Bool
prime = primeK 64
The thing is, everywhere I need to use prime numbers, I have to turn that function into a monadic function too, even where there is seemingly no randomness involved. For example, below is my former function for recovering a secret in Shamir's Secret Sharing Scheme. A deterministic operation, right?
recover :: Integral a => [a] -> [a] -> a -> a
recover pi_s si_s q = sum prods `mod` q
  where
    bi_s  = map (beta pi_s q) pi_s
    prods = zipWith (*) bi_s si_s
Well, that was when I used a naive, deterministic primality test function. I haven't rewritten the recover function yet, but I already know that the beta function relies on prime numbers, and hence it, and recover too, will. Both will have to go from simple non-monadic functions to monadic ones, even though the reason they use the State monad / randomness is buried really deep down.
I can't help but think that all the code becomes more complex now that it has to be monadic. Am I missing something or is this always the case in situations like these in Haskell?
One solution I could think of is
prime' n = runState (prime n) (mkStdGen 123)
and use prime' instead. This solution raises two questions.
Is this a bad idea? I don't think it's very elegant.
Where should this "cut" from monadic to non-monadic code be? Because I also have functions like this genPrime:
genPrime :: (RandomGen g, Random a, Integral a) => a -> State g a
genPrime b = do
  n  <- randomR_st (2^(b-1), 2^b-1)
  ps <- filterM prime [n..]
  return $ head ps
The question becomes whether to have the "cut" before or after genPrime and the like.
That is indeed a valid criticism of monads as they are implemented in Haskell. I don't see a better short-term solution than what you mention. Switching all the code to monadic style is probably the most robust option, even though monadic functions are more heavyweight than the natural style, and porting a large codebase can indeed be a pain, although it may pay off later if you want to add more external effects.
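For concreteness, the kind of top-level "cut" you mention might look like this (a sketch only; evalState comes from Control.Monad.State, newStdGen from System.Random, and genPrime is your function from the question):
import Control.Monad.State (evalState)
import System.Random (newStdGen)

main :: IO ()
main = do
  g <- newStdGen                                 -- obtain a generator once, in IO
  let p = evalState (genPrime (16 :: Integer)) g -- discharge the State at the boundary
  print p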
I think algebraic effects can solve this elegantly; for example:
eff (example program with randomness)
F*
All functions are annotated with their effects a -> eff b, however, contrary to Haskell, they can all be composed simply like pure functions a -> b (which are thus a special case of effectful functions, with an empty effect signature). The language then ensures that effects form a semi-lattice so that functions with different effects can be composed.
It seems difficult to have such a system in Haskell. Free(r) monad libraries allow composing types of effects in a similar way, but still require explicit monadic style at the term level.
One interesting idea would be to overload function application, so it can be implicitly changed to (>>=), but a principled way to do so eludes me. The main issue is that a function a -> m b is seen as both an effectful function with effects in m and codomain b, and as a pure function with codomain m b. How can we infer when to use ($) or (>>=)?
In the particular case of randomness, I once had a somewhat related idea involving splittable random generators (shameless plug): https://blog.poisson.chat/posts/2017-03-04-splittable-generators.html
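For the record, the core of that idea is System.Random's split: give every callee its own independent generator instead of threading one generator through sequentially. A minimal sketch (withOwnGens is an illustrative name of mine):
import System.Random (RandomGen, split)

-- Each sub-computation gets its own generator, so nothing needs to be threaded back out.
withOwnGens :: RandomGen g => (g -> a) -> (g -> b) -> g -> (a, b)
withOwnGens f h g = let (g1, g2) = split g in (f g1, h g2)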

What are the benefits of replacing Haskell record with a function

I was reading this interesting article about continuations and I discovered this clever trick. Where I would naturally have used a record, the author uses instead a function with a sum type as the first argument.
So for example, instead of doing this
data Processor = Processor { processString :: String -> IO ()
                           , processInt    :: Int -> IO ()
                           }

processor = Processor (\s -> print $ "Hello "++ s)
                      (\x -> print $ "value" ++ (show x))
We can do this:
data Arg = ArgString String | ArgInt Int
processor :: Arg -> IO ()
processor (ArgString s) = print $ "Hello " ++ s
processor (ArgInt x)    = print $ "value" ++ show x
Apart from being clever, what are the benefits of it over a simple record ?
Is it a common pattern and does it have a name ?
Well, it's just a simple isomorphism. In ADT algebra (writing function types as exponentials):
(IO ())^String × (IO ())^Int
≅ (IO ())^(String + Int)
The obvious benefit of the RHS is perhaps that it only contains IO() once – DRY FTW.
This is a very loose example but you can see the Arg method as being an initial encoding and the Processor method as being a final encoding. They are, as others have noted, of equal power when viewed in many lights; however, there are some differences.
Initial encodings enable us to examine the "commands" being executed. In some sense, it means we've sliced the operation so that the input and the output are separated. This lets us choose many different outputs given the same input.
Final encodings enable us to abstract over implementations more easily. For instance, if we have two values of type Processor then we can treat them identically even if the two have different effects or achieve their effects by different means. This kind of abstraction is popularized in OO languages.
Initial encodings enable (in some sense) an easier time adding new functions since we just have to add a new branch to the Arg type. If we had many different ways of building Processors then we'd have to update each of these mechanisms.
Honestly, what I've described above is rather stretched. It is the case that Arg and Processor fit these patterns somewhat, but they do not do so in such a significant way as to really benefit from the distinction. It may be worth studying more examples if you're interested—a good search term is the "expression problem" which emphasizes the distinction in points (2) and (3) above.
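To make the trade-off concrete, here is a small sketch reusing the question's two encodings (describe and quietProcessor are illustrative names of mine, not from the post):
-- With the initial encoding, a new interpreter is just one more function over Arg:
describe :: Arg -> String
describe (ArgString s) = "string " ++ s
describe (ArgInt n)    = "int " ++ show n

-- With the final encoding, a new "implementation" is just one more record value:
quietProcessor :: Processor
quietProcessor = Processor (\_ -> pure ()) (\_ -> pure ())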
To expand a bit on leftroundabout's response, there is a way of writing function types as Output^Input, because of cardinality (how many things there are). So for example if you think about all of the mappings of the set {0, 1, 2} of cardinality 3 to the set {0, 1} of cardinality 2, you see that 0 can map to 0 or 1, independent of 1 mapping to 0 or 1, independent of 2 mapping to 0 or 1. When counting the total number of functions we get 2 * 2 * 2, or 2^3.
In this same notation, sum types are written with + and product types are written with *, and there is a cute way to phrase the isomorphism as Out^(In1 + In2) = Out^In1 * Out^In2; we could write it as:
combiner :: (a -> z, b -> z) -> Either a b -> z
combiner (za, zb) e_ab = case e_ab of Left a -> za a; Right b -> zb b
splitter :: (Either a b -> z) -> (a -> z, b -> z)
splitter z_eab = (\a -> z_eab $ Left a, \b -> z_eab $ Right b)
and we can reify it in your code with:
type Processor = Either String Int -> IO ()
So what's the difference? There aren't many:
The combined form requires both things to have the exact same tail-end. You can't apply combiner to something of type a -> b -> z since that parses as a -> (b -> z) and b -> z is not unifiable with z. If you wanted to unify a -> b -> z with c -> z then you have to first uncurry the function to (a, b) -> z, which looks like a bit of work -- it's just not an issue when you use the record version.
The split form is also a little more concise for application; you just write fst split a instead of combined $ Left a. But this also means that you can't quite do something like yz . combined (whose equivalent is (yz . fst split, yz . snd split)) so easily. When you've actually got the Processor record defined it might be worth it to extend its kind to * -> * and make it a Functor.
The record can in general participate in type classes more easily than the sum-type-function.
Sum types will look more imperative, so they'll probably be clearer to read. For example, if I hand you the pattern withProcState p () [Read path1, Apply (map toUpper), Write path2] it's pretty easy to see that this feeds the processor with commands to uppercase path1 into path2. The equivalent of defining processors would look like procWrite p path2 $ procApply p (map toUpper) $ procRead p path1 () which is still pretty clear but not quite as awesome as the previous case.

Haskell monad return arbitrary data type

I am having trouble defining return for a custom recursive data type.
The data type is as follows:
data A a = B a | C (A a) (A a)
However, I don't know how to define return, since I can't figure out when to return a B value and when to recursively return a C.
Any help is appreciated!
One way to define a Monad instance for this type is to treat it as a free monad. In effect, this takes A a to be a little syntax with one binary operator C, and variables represented by values of type a embedded by the B constructor. That makes return the B constructor, embedding variables, and >>= the operator which performs substitution.
instance Monad A where
  return = B
  B x   >>= f = f x
  C l r >>= f = C (l >>= f) (r >>= f)
It's not hard to see that (>>= B) performs the identity substitution, and that composition of substitutions is associative.
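(On a modern GHC, 7.10 and later, this also needs Functor and Applicative superclass instances. A minimal sketch consistent with the Monad instance above; with these in place, return = B could equally be left as the default return = pure:)
instance Functor A where
  fmap f (B x)   = B (f x)
  fmap f (C l r) = C (fmap f l) (fmap f r)

instance Applicative A where
  pure = B
  B f   <*> x = fmap f x
  C l r <*> x = C (l <*> x) (r <*> x)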
Another, more "imperative" way to see this monad is that it captures the idea of computations that can flip coins (or read a bitstream or otherwise have some access to a sequence of binary choices).
data Coin = Heads | Tails
Any computation which can flip coins must either stop flipping and be a value (with B), or flip a coin and carry on (with C) in one way if the coin comes up Heads and another if Tails. The monadic operation which flips a coin and tells you what came up is
coin :: A Coin
coin = C (B Heads) (B Tails)
The >>= of A can now be seen as sequencing coin-flipping computations, allowing the choice of a subsequent computation to depend on the value delivered by an earlier computation.
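A small usage sketch (my example, not from the original answer): flip twice and pair the results.
twoFlips :: A (Coin, Coin)
twoFlips = do
  a <- coin
  b <- coin
  return (a, b)
-- twoFlips is a complete binary tree of depth 2 with four B leaves:
-- C (C (B (Heads,Heads)) (B (Heads,Tails)))
--   (C (B (Tails,Heads)) (B (Tails,Tails)))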
If you have an infinite stream of coins, then (apart from your extraordinary good fortune) you're also lucky enough to be able to run any A-computation to its value, as follows
data Stream x = x :> Stream x -- actually, I mean "codata"
flipping :: Stream Coin -> A v -> v
flipping _ (B v) = v
flipping (Heads :> cs) (C h t) = flipping cs h
flipping (Tails :> cs) (C h t) = flipping cs t
The general pattern in this sort of monad is to have one constructor for returning a value (B here) and a bunch of others which represent the choice of possible operations and the different ways computations can continue given the result of an operation. Here C has no non-recursive parameters and two subtrees, so I could tell that there must be just one operation and that it must have just two possible outcomes, hence flipping a coin.
So, it's substitution for a syntax with variables and one binary operator, or it's a way of sequencing computations that flip coins. Which view is better? Well... they're two sides of the same coin.
A good rule of thumb for return is to make it the simplest possible thing which could work (of course, any definition that satisfies the monad laws is fine, but usually you want something with minimal structure). In this case it's as simple as return = B (now write a (>>=) to match!).
By the way, this is an example of a free monad -- in fact, it's the example given in the documentation, so I'll let the documentation speak for itself.

How does Data.MemoCombinators work?

I've been looking at the source for Data.MemoCombinators but I can't really see where the heart of it is.
Please explain to me what the logic is behind all of these combinators and the mechanics of how they actually work to speed up your program in real world programming.
I'm looking for specifics for this implementation, and optionally comparison/contrast with other Haskell approaches to memoization. I understand what memoization is and am not looking for a description of how it works in general.
This library is a straightforward combinatorization of the well-known technique of memoization. Let's start with the canonical example:
fib = (map fib' [0..] !!)
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n-1) + fib (n-2)
I interpret what you said to mean that you know how and why this works. So I'll focus on the combinatorization.
We are essentially trying to capture and generalize the idea of (map f [0..] !!). The type of this function is (Int -> r) -> (Int -> r), which makes sense: it takes a function from Int -> r and returns a memoized version of the same function. Any function which is semantically the identity and has this type is called a "memoizer for Int" (even id, which doesn't memoize). We generalize to this abstraction:
type Memo a = forall r. (a -> r) -> (a -> r)
So a Memo a, a memoizer for a, takes a function from a to anything, and returns a semantically identical function that has been memoized (or not).
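Concretely, the canonical (map f [0..] !!) trick fits this type directly. A toy sketch (my code, not the library's; it only behaves for non-negative arguments, and it repeats the Memo synonym so the snippet stands alone):
{-# LANGUAGE RankNTypes #-}
type Memo a = forall r. (a -> r) -> (a -> r)

listMemo :: Memo Int       -- valid only for non-negative arguments
listMemo f = (map f [0 ..] !!)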
The idea of the different memoizers is to find a way to enumerate the domain with a data structure, map the function over them, and then index the data structure. bool is a good example:
bool :: Memo Bool
bool f = table (f True, f False)
  where
    table (t,f) True  = t
    table (t,f) False = f
Functions from Bool are equivalent to pairs, except a pair will only evaluate each component once (as is the case for every value that occurs outside a lambda). So we just map to a pair and back. The essential point is that we are lifting the evaluation of the function above the lambda for the argument (here the last argument of table) by enumerating the domain.
Memoizing Maybe a is a similar story, except now we need to know how to memoize a for the Just case. So the memoizer for Maybe takes a memoizer for a as an argument:
maybe :: Memo a -> Memo (Maybe a)
maybe ma f = table (f Nothing, ma (f . Just))
  where
    table (n,j) Nothing  = n
    table (n,j) (Just x) = j x
The rest of the library is just variations on this theme.
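For instance, memoizing a function of a pair composes two memoizers in the same way (a sketch using the Memo synonym above, with RankNTypes; the real library exports a pair combinator along these lines):
pair :: Memo a -> Memo b -> Memo (a, b)
pair ma mb f = uncurry (ma (\a -> mb (\b -> f (a, b))))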
The way it memoizes integral types uses a more appropriate structure than [0..]. It's a bit involved, but basically just creates an infinite tree (representing the numbers in binary to elucidate the structure):
1
  10
    100
      1000
      1001
    101
      1010
      1011
  11
    110
      1100
      1101
    111
      1110
      1111
So that looking up a number in the tree has running time proportional to the number of bits in its representation.
As sclv points out, Conal's MemoTrie library uses the same underlying technique, but uses a typeclass presentation instead of a combinator presentation. We released our libraries independently at the same time (indeed, within a couple hours!). Conal's is easier to use in simple cases (there is only one function, memo, and it will determine the memo structure to use based on the type), whereas mine is more flexible, as you can do things like this:
boundedMemo :: Integer -> Memo Integer
boundedMemo bound f = \z -> if z < bound then memof z else f z
  where
    memof = integral f
Which only memoizes values less than a given bound, needed for the implementation of one of the project euler problems.
There are other approaches, for example exposing an open fixpoint function over a monad:
memo :: MonadState ... m => ((Integer -> m r) -> (Integer -> m r)) -> m (Integer -> m r)
Which allows yet more flexibility, eg. purging caches, LRU, etc. But it is a pain in the ass to use, and also it puts strictness constraints on the function to be memoized (e.g. no infinite left recursion). I don't believe there are any libraries that implement this technique.
Did that answer what you were curious about? If not, perhaps make explicit the points you are confused about?
The heart is the bits function:
-- | Memoize an ordered type with a bits instance.
bits :: (Ord a, Bits a) => Memo a
bits f = IntTrie.apply (fmap f IntTrie.identity)
It is the only function (except the trivial unit :: Memo ()) which can give you a Memo a value. It uses the same idea as in this page about Haskell memoization. Section 2 shows the simplest memoization strategy using a list, and section 3 does the same using a binary tree of naturals, similar to the IntTrie used in memocombinators.
The basic idea is to use a construction like (map fib [0..] !!) or, in the memocombinators case, IntTrie.apply (fmap f IntTrie.identity). The thing to notice here is the correspondence between IntTrie.apply and (!!), and also between IntTrie.identity and [0..].
The next step is memoizing functions with other types of arguments. This is done with the wrap function which uses an isomorphism between types a and b to construct a Memo b from a Memo a. For example:
Memo.integral f
=>
wrap fromInteger toInteger bits f
=>
bits (f . fromInteger) . toInteger
=>
IntTrie.apply (fmap (f . fromInteger) IntTrie.identity) . toInteger
~> (semantically equivalent)
(map (f . fromInteger) [0..] !!) . toInteger
The rest of the source code deals with types like List, Maybe, Either and memoizing multiple arguments.
Some of the work is done by IntTrie: http://hackage.haskell.org/package/data-inttrie-0.0.4
Luke's library is a variation of Conal's MemoTrie library, which he described here: http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/
Some further expansion -- the general notion behind functional memoization is to take a function from a -> b and map it across a data structure indexed by all possible values of a and containing values of b. Such a data structure should be lazy in two ways -- first, it should be lazy in the values it holds; second, it should be lazily produced itself. The former comes for free in a non-strict language. The latter is accomplished by using generalized tries.
The various approaches of memocombinators, memotrie, etc are all just ways of creating compositions of pieces of tries over individual types of datastructures to allow for the simple construction of tries for increasingly complex structures.
@luqui One thing that is not clear to me: does this have the same operational behaviour as the following:
fib :: [Int]
fib = map fib' [0..]
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib!!(n-1) + fib!!(n-2)
The above should memoize fib at the top level, and hence if you define a function such as:
f n = fib!!n + fib!!(n+1)
and compute f 5, then fib 5 is not recomputed when computing fib 6. It is not clear to me whether the memoization combinators have the same behaviour (i.e. top-level memoization instead of only prohibiting recomputation "inside" the fib computation), and if so, why exactly?
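For what it's worth, the usual usage pattern is designed to give exactly that top-level sharing: the memoized fib is a top-level value, so the trie built behind it is created once and shared by every caller. A sketch, essentially the package's own example plus an f like yours:
import qualified Data.MemoCombinators as Memo

fib :: Integer -> Integer
fib = Memo.integral fib'
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n - 1) + fib (n - 2)

f :: Integer -> Integer
f n = fib n + fib (n + 1)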
