How do you make a generic memoize function in Haskell? - haskell

I've seen the other post about this, but is there a clean way of doing this in Haskell?
As a 2nd part, can it also be done without making the function monadic?

The package data-memocombinators on hackage provides lots of reusable memoization routines. The basic idea is:
type Memo a = forall r. (a -> r) -> (a -> r)
I.e. it can memoize any function from a. The module then provides some primitives (like unit :: Memo () and integral :: Memo Int), and combinators for building more complex memo tables (like pair :: Memo a -> Memo b -> Memo (a,b) and list :: Memo a -> Memo [a]).
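For example, the package's memoized Fibonacci looks roughly like this (a sketch; Data.MemoCombinators is the module the package exposes):

import qualified Data.MemoCombinators as Memo

fib :: Integer -> Integer
fib = Memo.integral fib'
  where
    fib' 0 = 0
    fib' 1 = 1
    fib' n = fib (n - 1) + fib (n - 2)  -- recursive calls go through the memoized fib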

You can modify Jonathan's solution with unsafePerformIO to create a "pure" memoizing version of your function.
import qualified Data.Map as Map
import Data.IORef
import System.IO.Unsafe
memoize :: Ord a => (a -> b) -> (a -> b)
memoize f = unsafePerformIO $ do
    r <- newIORef Map.empty
    return $ \x -> unsafePerformIO $ do
        m <- readIORef r
        case Map.lookup x m of
            Just y  -> return y
            Nothing -> do
                let y = f x
                writeIORef r (Map.insert x y m)
                return y
This will work with recursive functions:
fib :: Int -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib_memo (n-1) + fib_memo (n-2)
fib_memo :: Int -> Integer
fib_memo = memoize fib
Although this example is a function with one integer parameter, the type of memoize tells us that it can be used with any function that takes a comparable type. If you have a function with more than one parameter, just group them in a tuple before applying memoize. For instance:
f :: String -> [Int] -> Float
f ...
f_memo = curry (memoize (uncurry f))

This largely follows http://www.haskell.org/haskellwiki/Memoization.
You want a function of type (a -> b). If it doesn't call itself, then
you can just write a simple wrapper that caches the return values. The
best way to store this mapping depends on what properties of a you can
exploit. Ordering is pretty much a minimum. With integers
you can construct an infinite lazy list or tree holding the values.
type Cacher a b = (a -> b) -> a -> b
positive_list_cacher :: Cacher Int b
positive_list_cacher f n = (map f [0..]) !! n
or
integer_list_cacher :: Cacher Int b
integer_list_cacher f n = map f (interleave [0 ..] [-1, -2 ..]) !! index n
  where
    index i | i < 0     = 2 * abs i - 1
            | otherwise = 2 * i
    -- interleave is not in the Prelude; one simple definition:
    interleave (x:xs) ys = x : interleave ys xs
    interleave []     ys = ys
So, suppose it is recursive. Then you need it to call not itself, but
the memoized version, so you pass that in instead:
-- Schematic (pseudocode): base_answer, calc, and simpler stand for the
-- pieces of the real function; memoed is the memoized version passed in.
f_with_memo :: (a -> b) -> a -> b
f_with_memo memoed base = base_answer
f_with_memo memoed arg  = calc (memoed (simpler arg))
The memoized version is, of course, what we're trying to define.
But we can start by creating a function that caches its inputs:
We could construct one level by passing in a function that creates a
structure that caches values. Except we need to create the version of f
that already has the cached function passed in.
Thanks to laziness, this is no problem:
memoize cacher f = cached
  where cached = cacher (f cached)
then all we need is to use it:
exposed_f = memoize cacher_for_f f
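For concreteness, here is the whole technique applied to Fibonacci (a sketch; fib_open is a name introduced here for the open-recursive body, and reliable sharing of the cache list relies on GHC floating map f [0..] out of the inner lambda, which the full-laziness optimization does):

-- fib_open calls the memoized version it is handed, never itself.
fib_open :: (Int -> Integer) -> Int -> Integer
fib_open _      0 = 1
fib_open _      1 = 1
fib_open memoed n = memoed (n - 1) + memoed (n - 2)

fib :: Int -> Integer
fib = memoize positive_list_cacher fib_open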
The article gives hints as to how to use a type class selecting on the
input to the function to do the above, rather than choosing an explicit
caching function. This can be really nice -- rather than explicitly
constructing a cache for each combination of input types, we can implicitly
combine caches for types a and b into a cache for a function taking a and b.
One final caveat: using this lazy technique means the cache never shrinks,
it only grows. If you instead use the IO monad, you can manage this, but
doing it wisely depends on usage patterns.
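For instance, a minimal sketch of such management (memoizeResettable is a hypothetical name; it assumes the Data.Map and Data.IORef imports used elsewhere on this page) hands back, alongside the memoized function, an action that empties the cache:

memoizeResettable :: Ord a => (a -> IO b) -> IO (a -> IO b, IO ())
memoizeResettable f = do
    r <- newIORef Map.empty
    let memoed x = do
            m <- readIORef r
            case Map.lookup x m of
                Just y  -> return y
                Nothing -> do
                    y <- f x
                    writeIORef r (Map.insert x y m)
                    return y
    return (memoed, writeIORef r Map.empty)  -- the second action clears the cache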

Doing a direct translation from the more imperative languages, I came up with this.
import qualified Data.Map as Map
import Data.IORef

memoize :: Ord a => (a -> IO b) -> IO (a -> IO b)
memoize f = do
    r <- newIORef Map.empty
    return $ \x -> do
        m <- readIORef r
        case Map.lookup x m of
            Just y  -> return y
            Nothing -> do
                y <- f x
                writeIORef r (Map.insert x y m)
                return y
But this is somehow unsatisfactory. Also, Data.Map constrains the parameter to be an instance of Ord.
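For reference, usage looks something like this (a sketch; the doubling function is just a stand-in):

demo :: IO ()
demo = do
    fastF <- memoize (\x -> return (2 * x :: Int))
    y1 <- fastF 21  -- computed and cached
    y2 <- fastF 21  -- served from the cache
    print (y1, y2)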

If your arguments are going to be natural numbers, you can do simply:
memo f = let values = map f [0 ..]
         in  \n -> values !! n
However, that doesn't really help you with the stack overflowing, and it doesn't work with recursive calls. You can see some fancier solutions at http://www.haskell.org/haskellwiki/Memoization.
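The usual fix for recursion (this is essentially the haskellwiki example) is to route the recursive calls through the memoized top-level binding, so every subresult lands in the shared cache:

memoized_fib :: Int -> Integer
memoized_fib = memo fib
  where fib 0 = 0
        fib 1 = 1
        fib n = memoized_fib (n - 1) + memoized_fib (n - 2)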

Related

Haskell instance of `bind` for a custom type

I'm trying to create an instance of the bind operator (>>=) for the custom type ST a.
I found this way to do it but I don't like that hardcoded 0.
Is there any way to implement it without having the hardcoded 0 and respecting the type of the function?
newtype ST a = S (Int -> (a, Int))

-- This may be useful to implement ">>=" (bind), but it is not mandatory to use it
runState :: ST a -> Int -> (a, Int)
runState (S s) = s

instance Monad ST where
  return :: a -> ST a
  return x = S (\n -> (x, n))

  (>>=) :: ST a -> (a -> ST b) -> ST b
  s >>= f = f (fst (runState s 0))
I often find it easier to follow such code with a certain kind of pseudocode rewrite, like this: starting with the
instance Monad ST where
  return :: a -> ST a
  return x = S (\n -> (x, n))
we get to the
runState (return x) n = (x, n)
which expresses the same thing exactly. It is now a kind of definition through an interaction law that it must follow. This allows me to ignore the "noise"/wrapping around the essential stuff.
Similarly, then, we have
(>>=) :: ST a -> (a -> ST b) -> ST b
s >>= f = -- f (fst (runState s 0))   -- nah, 0? what's that?
          --
          -- runState (s >>= f) n = runState (f a) i  where
          --     (a, i) = runState s n
          --
          S $ \n -> let (a, i) = runState s n
                    in  runState (f a) i
because now we have an Int in sight (i.e. in scope), n, that will get provided to us when the combined computation s >>= f will "run". I mean, when it will runState.
Of course nothing actually runs until called upon from main. But it can be a helpful metaphor to hold in mind.
The way we've defined it is both the easiest and the most general, which is usually the way to go. There are more ways to make the types fit though.
One is to use n twice, in the input to the second runState as well, but this will leave the i hanging unused.
Another way is to flip the time arrow around w.r.t. the state passing, with
S $ \n -> let (a, i2) = runState s i
              (b, i ) = runState (f a) n
          in  (b, i2)
which is a bit weird to say the least. s still runs first (as expected for the s >>= f combination) to produce the value a from which f creates the second computation stage, but the state is being passed around in the opposite direction.
The most important thing to keep in mind is that your ST type is a wrapper around a function. What if you started your definition as (>>=) = \s -> \f -> S (\n -> ... )? It might be (ok, is) a bit silly to write separate lambdas for the s and f parameters there, but I did it to show that they're not really any different from the n parameter. You can use it in your definition of (>>=).
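As a quick sanity check of the corrected definition, here is a sketch (tick is a hypothetical helper that returns the current counter and increments it):

tick :: ST Int
tick = S (\n -> (n, n + 1))

-- runState (tick >>= \a -> tick >>= \b -> return (a + b)) 0
--   = (1, 2)  -- 0 + 1 from the two ticks, final counter 2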

Does the result of a continuation have to be the final result? [duplicate]

the Cont r a type stands for a function which takes a continuation a->r and produces a result of type r. So both the continuation and the entire Cont r a produce a result of the same type r.
My question is: are the two results necessarily the same value, or can a Cont r a post-process the result from the continuation and produce a different value, albeit of the same type r?
I tried using (+1) for post-processing (note the + 1 --<--):
c1 :: Int -> Cont r Int
c1 x = let y = 2*x
       in  cont $ \k -> (k y) + 1  --<--
Now that doesn't typecheck, because my post-processing function (+1) only accepts an argument whose type belongs to the Num typeclass. However, I pass the result of the continuation (k y) which is of some type r that is not guaranteed to belong to the Num typeclass.
Whatever I do to (k y), it must be a function of type r->r. The only function which can do this for all r is the id function and using id for post-processing is no post-processing at all.
However, the whole thing does typecheck if I restrict r to the Num typeclass or even to the concrete type Int. It then produces the expected result:
*Main> runCont (c1 1) id
3
I am quite unsure
whether such post-processing and restricting the type of r is a normal thing to do, and if so, in what circumstances this might be useful,
or whether the type variable r has to be read as "for all r", and restricting the type of r will lead to all sorts of trouble.
Can someone shed some light on this?
Technically, I think it's fine. Specializing Cont r a to Num r => Cont r a doesn't seem fundamentally more problematic than specializing Reader r a to Num r => Reader r a.
An implication of doing so is that the resulting CPS computation can only be run against a (final) continuation that produces a number, but that's obvious -- if you have a computation that post-processes the continuation result as a number, it can only be used with continuations that produce numbers!
As additional evidence that this is sanctioned at least to some degree, note that there's a function:
mapCont :: (r -> r) -> Cont r a -> Cont r a
If this function were to be used with no restriction on r, the only valid values for its first argument would be id or functions that don't terminate, as you have noted.
A version of your c1 using mapCont might look like:
c2 :: (Num r) => Int -> Cont r Int
c2 x = mapCont (+1) $ return (2*x)
and seems to work fine:
> runCont (c2 10) id
21
> runCont (c2 10) (const 5)
6
> runCont (c2 10) show
... No instance for (Num String) arising from a use of 'c2' ...
As for when this would be useful, I'm not sure. I can think of a few somewhat lame applications. You could define a computation that overrides the final result (provided no other kind of post-processing is used):
override x = cont (const x)
to be used like:
> runCont (return 2 >>= \x -> cont (\f -> f (x*3))) id
6
> runCont (return 2 >> override 1000 >>= \x -> cont (\f -> f (x*3))) id
1000
or a computation transformer that emulates a writer to add log functionality:
annotate note comp = mapCont (\(a, w) -> (a, note:w)) comp
which you might use like this:
runCont (annotate "two" (return 2)
           >>= \x -> annotate "times three" (cont (\f -> f (x*3))))
        (\a -> (a, []))
yielding:
(6,["two","times three"])
These don't seem like very compelling applications, though.
#KABuhr has shown that post-processing in the ordinary Cont works, but didn't find "very compelling applications". I'm going to show you how post-processing is useful, but it only works best when you generalize Cont. First, some header stuff (mostly used in the examples):
{-# LANGUAGE RebindableSyntax #-}
import Prelude(Num(..), Eq(..), Enum(..))
import Data.Bool
import Data.Function
import Data.Functor.Identity
import Data.List
import Data.Maybe
import Data.Tuple
import Control.Lens(_1, _2, traversed)
Now, a generalized Cont.
newtype Cont r f a = Cont { runCont :: (a -> r) -> f }
Your question was "is post-processing allowed in Cont?" The answer is yes. If you would like it not to be, you can use newtype ContS a = ContS { runContS :: forall r. (a -> r) -> r }, which totally disallows it. In fact, ContS a is isomorphic to a. The Cont I just defined takes the opposite position: even type-changing post-processors are allowed. We can define a standard Functorial (<$>).
infixl 1 <$>
(<$>) :: (a -> b) -> Cont r f a -> Cont r f b
f <$> Cont x = Cont $ \cont -> x $ \realX -> cont (f realX)
Before continuing, let's get an understanding of the metaphor behind Cont. A Cont r f a is a computation that can produce as. It will give you the as, but will ask you to produce rs. Once you do that, it'll make fs. It's sort of like a (r -> f, a), but with heavy restrictions on use. If we try to define an Applicative-ish operator, we see something interesting.
infixl 1 <*>
(<*>) :: Cont m f (a -> b) -> Cont r m a -> Cont r f b
Cont f <*> Cont x = Cont $ \cont -> x $ \realX -> f $ \realF -> cont (realF realX)
(<*>) is sort of doing two operations at once. It is applying the a -> b to an a to get b, but it's also composing the m -> f and r -> m aspects into a r -> f part. However, the type of (<*>) no longer fits into the normal Applicative format. This is why we use Cont r a instead of Cont r f a. The former is less powerful, but it fits into our existing framework. To get our Cont to work, we have to leave some of the established infrastructure behind.
Before we get into the RebindableSyntax-level stuff, here's some usage.
complete :: Cont a f a -> f
complete (Cont x) = x id

amb :: [a] -> Cont (Maybe b) (Maybe (a, b)) a
amb [] = Cont (const Nothing)
amb (x : xs) = Cont $ \test -> case test x of
  Nothing -> runCont (amb xs) test
  Just y  -> Just (x, y)
poly :: Num a => a -> a -> a -> a
poly x y z = sq x * y + sq y + z + sq z * x
  where sq x = x * x

solution :: (Num a, Enum a, Eq a) => Maybe (a, (a, (a, ())))
solution = complete $ testRoot <$> amb [-5 .. 5]
                               <*> amb [-10 .. -5]
                               <*> amb [5 .. 10]
  where testRoot x y z = case poly x y z of
          0 -> Just ()
          _ -> Nothing
complete completes a computation when there isn't actually a gap holding it up. amb takes a [a], and goes through each a, one by one. It passes each into the test, and searches until it finds one that succeeds. It post-processes the result of the test in two ways. It retries until the result is a Just (or gives up), and a Just result gets paired with the input that built it.
In solution, the complete is delimiting the extent of the continuation passed to the ambs. Each amb is passed the code that lies between it and the complete. E.g., the continuation given to the amb [-5 .. 5] is \x -> testRoot x <*> amb [-10 .. -5] <*> amb [5 .. 10]. This style of continuations is called shift/reset. Cont is shift, complete is reset. The idea is that amb [-5 .. 5] is a "liar"; it "looks like" a Num a => a because it's getting passed to testRoot, but it's actually a control structure that turns everything around it inside-out. Compared to the normal Cont r a, the control structures allowed in our Cont are more powerful.
Now, here's what we need RebindableSyntax for:
(=<<) :: (a -> Cont r m b) -> Cont m f a -> Cont r f b
f =<< Cont x = Cont $ \cont -> x $ \realX -> runCont (f realX) cont
(>>=) = flip (=<<)
return :: a -> Cont r r a
return x = Cont ($ x)
(=<<) is the Monad-style function application operator. Again, our version doesn't fit the usual type. With (>>=) and return, do-notation has now been redefined to work with Cont. You can go back and rewrite solution in do-notation to see that it works.
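For concreteness, here is a sketch of that rewrite (solution' is just a fresh name):

solution' :: (Num a, Enum a, Eq a) => Maybe (a, (a, (a, ())))
solution' = complete $ do
  x <- amb [-5 .. 5]
  y <- amb [-10 .. -5]
  z <- amb [5 .. 10]
  return (case poly x y z of 0 -> Just (); _ -> Nothing)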
Let's really get out there. The idea behind profunctor optics is that data structures give rise to "transformer transformers". E.g. a Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t takes a transformer between the "small" structures a and b and makes one from between the "bigger" s and t. Look what lies just a flip away...
editing :: ((a -> Identity b) -> s -> Identity t) -> s -> Cont b t a
editing optic x = Cont (runIdentity . flip optic x . (Identity .))
editing, as a control structure, takes a reference to a field inside a structure, a structure to use it on, and then mutates that structure with "the rest of the program." Using it, you can write the following:
example :: (a -> a) -> [(Bool, (a, a))] -> [(Bool, (a, a))]
example f xs = complete $ do
  x  <- editing traversed xs
  n2 <- editing _2 x
  n  <- case fst x of
          True  -> editing _1 n2
          False -> editing _2 n2
  return (f n)
I hope, with even these contrived examples, that you're convinced that post-processing is useful in Cont. There's nothing wrong with doing it. However, if you want to use it at its full potential, you have to break out of the existing Applicative and Monad form. This is painful, so we cripple Cont to make it fit, disabling type-changing post-processing as a trade-off.

Unexpected memory growth with Control.Monad foldM

I have the following code, which has been stripped down to what I think is the minimal example that still shows some very odd behaviour.
The code consists of two source files:
One to define some data:
module MyFunction where

data MyFunction =
  MyFunction {
    functionNumber :: Int,
    functionResult :: IO String
  }

makeMyFunction :: Show a => Int -> IO a -> MyFunction
makeMyFunction number result = MyFunction {
    functionNumber = number,
    functionResult = result >>= return . show }
And the other is Main:
module Main (main) where
import System.CPUTime (getCPUTime)
import Data.List (foldl')
import Data.Foldable (foldlM)
import Control.Monad (foldM)
import MyFunction
exampleFunction = do
  --let x = foldl' (\a b -> a `seq` (a + b)) 0 [1..20000000]     -- This works
  --x <- foldlM (\a b -> a `seq` return (a + b)) 0 [1..20000000] -- This works (*)
  x <- foldM (\a b -> a `seq` return (a + b)) 0 [1..20000000]    -- This doesn't
  print x
  return ()

runFunction fn = do
  result <- functionResult fn
  duration <- getCPUTime
  if result /= "()"
    then putStrLn ""
    else return ()
  putStrLn (show (fromIntegral duration / (10^9)) ++ "ms")
  return fn

main = do
  runFunction (makeMyFunction 123 exampleFunction)
  return ()
The code as above (compiled using GHC 7.10.3 with stack 1.0.0 with default flags) has a rapid increase in memory usage (exceeding 1GB), and takes typically 3.3 seconds.
If I make changes to the code, for example:
use one of the commented alternatives to the problem line, or
take out any line from runFunction,
then the memory usage remains minimal, and the program takes only about 1 second.
One feature I find most surprising is that replacing foldM with foldlM fixes the problem (even though, as far as I know, foldM is simply defined as foldM = foldlM).
Making changes to code that I can't see having any relationship to the problem lines also fixes the problem, for example removing the last putStrLn.
Another oddity is that if I merge the MyFunction module into the Main module, while it doesn't fix the problem, it actually causes foldlM to behave like foldM, using excessive memory.
In the real code that this came from, I have a large number of exampleFunctions, and there is significantly more Main code, and every so often I encounter this sort of unexplained memory usage from functions, which can usually be resolved by some sort of voodoo.
I'm looking for an explanation for the behaviour. If I know why this occurs I can then look into avoiding it. Could this be a compiler issue, or maybe just a misunderstanding on my part?
(*) I've highlighted the secondary issue that causes the same memory growth to occur with foldlM.
Here is foldlM from Foldable.hs (ghc)
-- | Monadic fold over the elements of a structure,
-- associating to the left, i.e. from left to right.
foldlM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
foldlM f z0 xs = foldr f' return xs z0
  where f' x k z = f z x >>= k
and foldM from Monad.hs
foldM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
{-# INLINEABLE foldM #-}
{-# SPECIALISE foldM :: (a -> b -> IO a) -> a -> [b] -> IO a #-}
{-# SPECIALISE foldM :: (a -> b -> Maybe a) -> a -> [b] -> Maybe a #-}
foldM = foldlM
I placed these definitions into a separate module Test and tested the execution with and without the INLINEABLE / SPECIALISE lines. Whatever the reason is, leaving out the SPECIALISE directives helped: the execution time and memory usage were like those of foldlM.
After a little bit more digging, removing the line
{-# SPECIALISE foldM :: (a -> b -> IO a) -> a -> [b] -> IO a #-}
had the largest effect.
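A sketch of that experiment (local copies of the definitions, with primed names to avoid clashing with the library versions, and the problematic pragma left commented out so it can be toggled):

module Test (foldM', foldlM') where

foldlM' :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
foldlM' f z0 xs = foldr f' return xs z0
  where f' x k z = f z x >>= k

-- {-# SPECIALISE foldM' :: (a -> b -> IO a) -> a -> [b] -> IO a #-}  -- the line in question
foldM' :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
foldM' = foldlM'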

Constructing efficient monad instances on `Set` (and other containers with constraints) using the continuation monad

Set, similarly to [], has perfectly sensible monadic operations. The problem is that they require the values to satisfy an Ord constraint, and so it's impossible to define return and >>= without any constraints. The same problem applies to many other data structures that require some kind of constraints on possible values.
The standard trick (suggested to me in a haskell-cafe post) is to wrap Set into the continuation monad. ContT doesn't care if the underlying type functor has any constraints. The constraints become needed only when wrapping/unwrapping Sets into/from continuations:
import Control.Monad.Cont
import Data.Foldable (foldrM)
import Data.Set
setReturn :: a -> Set a
setReturn = singleton
setBind :: (Ord b) => Set a -> (a -> Set b) -> Set b
setBind set f = foldl' (\s -> union s . f) empty set
type SetM r a = ContT r Set a
fromSet :: (Ord r) => Set a -> SetM r a
fromSet = ContT . setBind
toSet :: SetM r r -> Set r
toSet c = runContT c setReturn
This works as needed. For example, we can simulate a non-deterministic function that either increases its argument by 1 or leaves it intact:
step :: (Ord r) => Int -> SetM r Int
step i = fromSet $ fromList [i, i + 1]
-- repeated application of step:
stepN :: Int -> Int -> Set Int
stepN times start = toSet $ foldrM ($) start (replicate times step)
Indeed, stepN 5 0 yields fromList [0,1,2,3,4,5]. If we used [] monad instead, we would get
[0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5]
instead.
The problem is efficiency. If we call stepN 20 0 the output takes a few seconds and stepN 30 0 doesn't finish within a reasonable amount of time. It turns out that all Set.union operations are performed at the end, instead of performing them after each monadic computation. The result is that exponentially many Sets are constructed and unioned only at the end, which is unacceptable for most tasks.
Is there any way around it, to make this construction efficient? I tried but without success.
(I even suspect that there could be some kinds of theoretical limits following from Curry-Howard isomorphism and Glivenko's theorem. Glivenko's theorem says that for any propositional tautology φ the formula ¬¬φ can be proved in intuitionistic logic. However, I suspect that the length of the proof (in normal form) can be exponentially long. So, perhaps, there could be cases when wrapping a computation into the continuation monad will make it exponentially longer?)
Monads are one particular way of structuring and sequencing computations. The bind of a monad cannot magically restructure your computation so as to happen in a more efficient way. There are two problems with the way you structure your computation.
When evaluating stepN 20 0, the result of step 0 will be computed 20 times. This is because each step of the computation produces 0 as one alternative, which is then fed to the next step, which also produces 0 as alternative, and so on...
Perhaps a bit of memoization here can help.
A much bigger problem is the effect of ContT on the structure of your computation. With a bit of equational reasoning, expanding out the result of replicate 20 step, the definition of foldrM and simplifying as many times as necessary, we can see that stepN 20 0 is equivalent to:
(...(return 0 >>= step) >>= step) >>= step) >>= ...)
All parentheses of this expression associate to the left. That's great, because it means that the RHS of each occurrence of (>>=) is an elementary computation, namely step, rather than a composed one. However, zooming in on the definition of (>>=) for ContT,
m >>= k = ContT $ \c -> runContT m (\a -> runContT (k a) c)
we see that when evaluating a chain of (>>=) associating to the left, each bind will push a new computation onto the current continuation c. To illustrate what is going on, we can use again a bit of equational reasoning, expanding out this definition for (>>=) and the definition for runContT, and simplifying, yielding:
setReturn 0 `setBind`
  (\x1 -> step x1 `setBind`
    (\x2 -> step x2 `setBind`
      (\x3 -> ...) ...))
Now, for each occurrence of setBind, let's ask ourselves what the RHS argument is. For the leftmost occurrence, the RHS argument is the whole rest of the computation after setReturn 0. For the second occurrence, it's everything after step x1, etc. Let's zoom in to the definition of setBind:
setBind set f = foldl' (\s -> union s . f) empty set
Here f represents all the rest of the computation, everything on the right hand side of an occurrence of setBind. That means that at each step, we are capturing the rest of the computation as f, and applying f as many times as there are elements in set. The computations are not elementary as before, but rather composed, and these computations will be duplicated many times.
The crux of the problem is that the ContT monad transformer is transforming the initial structure of the computation, which you meant as a left-associative chain of setBind's, into a computation with a different structure, i.e. a right-associative chain. This is after all perfectly fine, because one of the monad laws says that, for every m, f and g we have
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
However, the monad laws do not impose that the complexity remain the same on each side of the equations of each law. And indeed, in this case, the left associative way of structuring this computation is a lot more efficient. The left associative chain of setBind's evaluates in no time, because only elementary subcomputations are duplicated.
It turns out that other solutions shoehorning Set into a monad also suffer from the same problem. In particular the set-monad package yields similar runtimes. The reason is that it, too, rewrites left-associative expressions into right-associative ones.
I think you have put the finger on a very important yet rather subtle problem with insisting that Set obeys a Monad interface. And I don't think it can be solved. The problem is that the type of the bind of a monad needs to be
(>>=) :: m a -> (a -> m b) -> m b
i.e. no class constraint is allowed on either a or b. That means that we cannot nest binds on the left without first invoking the monad laws to rewrite into a right-associative chain. Here's why: given (m >>= f) >>= g, the type of the computation (m >>= f) is of the form m b. A value of the computation (m >>= f) is of type b. But because we can't hang any class constraint onto the type variable b, we can't know that the value we got satisfies an Ord constraint, and therefore cannot use it as the element of a set on which we want to be able to compute unions.
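To make that concrete, here is a sketch of the direct instance one would like to write, with the spot where the missing constraint bites marked in a comment:

-- Rejected by the type checker: (>>=) must work for every b, but the
-- result set needs Ord b, which the Monad class signature cannot provide.
-- instance Monad Set where
--   return  = singleton
--   s >>= f = foldl' (\acc x -> acc `union` f x) empty s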
Recently on Haskell Cafe Oleg gave an example how to implement the Set monad efficiently. Quoting:
... And yet, the efficient genuine Set monad is possible.
...
Enclosed is the efficient genuine Set monad. I wrote it in direct style (it seems to be faster, anyway). The key is to use the optimized choose function when we can.
{-# LANGUAGE GADTs, TypeSynonymInstances, FlexibleInstances #-}
module SetMonadOpt where
import qualified Data.Set as S
import Control.Monad
data SetMonad a where
  SMOrd :: Ord a => S.Set a -> SetMonad a
  SMAny :: [a] -> SetMonad a

instance Monad SetMonad where
  return x = SMAny [x]
  m >>= f  = collect . map f $ toList m
toList :: SetMonad a -> [a]
toList (SMOrd x) = S.toList x
toList (SMAny x) = x
collect :: [SetMonad a] -> SetMonad a
collect []            = SMAny []
collect [x]           = x
collect ((SMOrd x):t) = case collect t of
  SMOrd y -> SMOrd (S.union x y)
  SMAny y -> SMOrd (S.union x (S.fromList y))
collect ((SMAny x):t) = case collect t of
  SMOrd y -> SMOrd (S.union y (S.fromList x))
  SMAny y -> SMAny (x ++ y)
runSet :: Ord a => SetMonad a -> S.Set a
runSet (SMOrd x) = x
runSet (SMAny x) = S.fromList x
instance MonadPlus SetMonad where
  mzero = SMAny []
  mplus (SMAny x) (SMAny y) = SMAny (x ++ y)
  mplus (SMAny x) (SMOrd y) = SMOrd (S.union y (S.fromList x))
  mplus (SMOrd x) (SMAny y) = SMOrd (S.union x (S.fromList y))
  mplus (SMOrd x) (SMOrd y) = SMOrd (S.union x y)
choose :: MonadPlus m => [a] -> m a
choose = msum . map return
test1 = runSet (do
  n1 <- choose [1..5]
  n2 <- choose [1..5]
  let n = n1 + n2
  guard $ n < 7
  return n)
-- fromList [2,3,4,5,6]

-- Values to choose from might be higher-order or actions
test1' = runSet (do
  n1 <- choose . map return $ [1..5]
  n2 <- choose . map return $ [1..5]
  n <- liftM2 (+) n1 n2
  guard $ n < 7
  return n)
-- fromList [2,3,4,5,6]

test2 = runSet (do
  i <- choose [1..10]
  j <- choose [1..10]
  k <- choose [1..10]
  guard $ i*i + j*j == k * k
  return (i,j,k))
-- fromList [(3,4,5),(4,3,5),(6,8,10),(8,6,10)]

test3 = runSet (do
  i <- choose [1..10]
  j <- choose [1..10]
  k <- choose [1..10]
  guard $ i*i + j*j == k * k
  return k)
-- fromList [5,10]
-- Test by Petr Pudlak
-- First, general, unoptimal case
step :: (MonadPlus m) => Int -> m Int
step i = choose [i, i + 1]

-- repeated application of step on 0:
stepN :: Int -> S.Set Int
stepN = runSet . f
  where
    f 0 = return 0
    f n = f (n-1) >>= step
-- it works, but clearly exponential
{-
*SetMonad> stepN 14
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14]
(0.09 secs, 31465384 bytes)
*SetMonad> stepN 15
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
(0.18 secs, 62421208 bytes)
*SetMonad> stepN 16
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
(0.35 secs, 124876704 bytes)
-}
-- And now the optimization
chooseOrd :: Ord a => [a] -> SetMonad a
chooseOrd x = SMOrd (S.fromList x)

stepOpt :: Int -> SetMonad Int
stepOpt i = chooseOrd [i, i + 1]

-- repeated application of step on 0:
stepNOpt :: Int -> S.Set Int
stepNOpt = runSet . f
  where
    f 0 = return 0
    f n = f (n-1) >>= stepOpt
{-
stepNOpt 14
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14]
(0.00 secs, 515792 bytes)
stepNOpt 15
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
(0.00 secs, 515680 bytes)
stepNOpt 16
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
(0.00 secs, 515656 bytes)
stepNOpt 30
fromList [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]
(0.00 secs, 1068856 bytes)
-}
I don't think your performance problems in this case are due to the use of Cont:
step' :: Int -> Set Int
step' i = fromList [i,i + 1]
foldrM' f z0 xs = Prelude.foldl f' setReturn xs z0
  where f' k x z = f x z `setBind` k
stepN' :: Int -> Int -> Set Int
stepN' times start = foldrM' ($) start (replicate times step')
This gets similar performance to the Cont-based implementation, but occurs entirely in the Set "restricted monad".
I am not sure if I believe your claim about Glivenko's theorem leading to an exponential increase in (normalized) proof size -- at least in the call-by-need context. That is because we can arbitrarily reuse subproofs (and since our logic is second order, we need only a single proof of forall a. ~~(a \/ ~a)). Proofs are not trees, they are graphs (sharing).
In general, you are likely to see performance costs from Cont wrapping Set, but they can usually be avoided via
smash :: (Ord r, Ord k) => SetM r r -> SetM k r
smash = fromSet . toSet
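For instance, a sketch reusing step, fromSet, and toSet from the question (stepNSmashed is a fresh name): smashing after every bind collapses the accumulated continuation structure, so only small Sets get unioned at each step.

stepNSmashed :: Int -> Int -> Set Int
stepNSmashed times start = toSet (go times (fromSet (singleton start)))
  where
    go 0 m = m
    go k m = go (k - 1) (smash (m >>= step))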
I found out another possibility, based on GHC's ConstraintKinds extension. The idea is to redefine Monad so that it includes a parametric constraint on allowed values:
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE RebindableSyntax #-}
import qualified Data.Foldable as F
import qualified Data.Set as S
import Prelude hiding (Monad(..), Functor(..))
class CFunctor m where
  -- Each instance defines a constraint its values must satisfy:
  type Constraint m a
  -- The default is no constraints.
  type Constraint m a = ()
  fmap :: (Constraint m a, Constraint m b) => (a -> b) -> (m a -> m b)

class CFunctor m => CMonad (m :: * -> *) where
  return :: (Constraint m a) => a -> m a
  (>>=) :: (Constraint m a, Constraint m b) => m a -> (a -> m b) -> m b
  fail :: String -> m a
  fail = error
-- [] instance
instance CFunctor [] where
  fmap = map
instance CMonad [] where
  return = (: [])
  (>>=) = flip concatMap

-- Set instance
instance CFunctor S.Set where
  -- Sets need Ord.
  type Constraint S.Set a = Ord a
  fmap = S.map
instance CMonad S.Set where
  return = S.singleton
  (>>=) = flip F.foldMap
-- Example:
-- prints fromList [3,4,5]
main = print $ do
  x <- S.fromList [1,2]
  y <- S.fromList [2,3]
  return $ x + y
(The problem with this approach arises when the monadic values are functions, such as m (a -> b), because functions can't satisfy constraints like Ord (a -> b). So one can't use combinators like <*> (or ap) for this constrained Set monad.)

Applying a function to an arbitrarily long list of arguments

I want to create a function apply that takes a function with an arbitrary number of arguments as well as a list of integers, and returns the result of the function (where each integer in the list is an argument, in order).
I was thinking something like:
apply :: ([Int] -> Int) -> [Int] -> Int
apply f (x:xs) = apply (f x) xs
apply f [] = f
But I know this won't work because the type signature is wrong - the function doesn't take a list of ints, it just takes some amount of int arguments.
Additionally, when I get to the base case the f argument to apply should actually be an integer, violating the type signature anyway.
Does anyone know how to deal with this sort of problem?
I want to create a function apply that takes a function with an arbitrary number of arguments as well as a list of integers,
Why do you want to do this? Perhaps your argument structure should be passed as a data structure, but so far you've over-constrained the problem to ensure it won't produce an idiomatic Haskell solution.
You can do it with some fancy type classes
{-# LANGUAGE FlexibleInstances #-}
-- for ApplyType (Int -> r)

class ApplyType t where
  apply :: t -> [Int] -> Int

instance ApplyType Int where
  apply f _ = f

instance (ApplyType r) => ApplyType (Int -> r) where
  apply f (x:xs) = apply (f x) xs

main :: IO ()
main = do
  print $ apply ((+) :: Int -> Int -> Int) [1, 2]
  print $ apply ((\x y z w -> x*y - z `div` w) :: Int -> Int -> Int -> Int -> Int) [3, 5, 8, 2]
