I am trying to implement a global counter using monads in Haskell. I want to get an incremented value every time I use the counter, but I am stuck getting the same value each time!
The code is as follows:
module CounterMonad where

data Counter a = C (Int -> (Int, a))

-- reset the counter
new :: Counter ()
new = C $ \_ -> (0, ())

-- increment the counter:
--inc :: Counter Int
--inc = C $ \n -> (n+1, n)
inc = get >>= \s -> put (s+1)

-- returning the current value of the counter
get :: Counter Int
get = C $ \n -> (n, n)

put x = C $ \n -> (x, x)

-- return is nop, >>= is sequential execution
instance Monad Counter where
  return r = C $ \n -> (n, r)
  (>>=) (C f) g = C $ \n0 ->
    let (n1, r1) = f n0
        C g' = g r1
    in g' n1

run :: Counter a -> a
run (C f) = snd (f 0)

tickC = do
  inc
  c <- get
  return c
When I execute run tickC, it always returns 1.
What I want is that every time I run tickC, it returns the incremented value: 1, 2, 3, 4, ...
I know there must be some stupid problem lying there; can you guys point it out?
This is correct behavior. Every time you call run tickC, the run function evaluates the operation with the counter set to zero. Essentially, each time you call run you get a different "copy" of the counter initialized to zero.
If you want to have the counter increment each time, you have to execute all operations in the same call to run. For example,
tickMany = do
  x <- tickC
  y <- tickC
  z <- tickC
  return [x, y, z]
> run tickMany
[1, 2, 3]
This is true of all monads, including IO (ignoring the "unsafe" operations). Since the only way to run an IO operation is through main, the IO monad is threaded through every function that uses it.
So if you want a global counter, that counter has to be managed by a monad that is used globally (i.e., by every function that needs to access it). You can either use your Counter monad globally or put the counter in the IO monad. It seems to be accepted design practice to put state like this in your own monad, but of course, it depends on the application and IO is fine too.
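For example, here is a minimal sketch of the IO route, with the counter kept in an IORef created in main (the names counter and tick are my own):

import Data.IORef

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  -- tick increments the shared counter and returns the new value
  let tick = modifyIORef counter (+1) >> readIORef counter
  tick >>= print  -- 1
  tick >>= print  -- 2
  tick >>= print  -- 3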
You may also wish to look at Control.Monad.State, which would allow you to define your monad with much less typing. Something like:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad.State
newtype Counter a = Counter (State Int a) deriving (Functor, Applicative, Monad)
...
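For instance, here is a minimal sketch of one possible completion (the operation names mirror the question's; run starts the counter at 0):

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad.State

newtype Counter a = Counter (State Int a)
  deriving (Functor, Applicative, Monad)

-- increment the counter and return the new value
inc :: Counter Int
inc = Counter $ do
  n <- get
  put (n + 1)
  return (n + 1)

run :: Counter a -> a
run (Counter m) = evalState m 0

With this, run (inc >> inc >> inc) evaluates to 3; all the increments happen inside a single run, as described above.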
I have written this function that computes Collatz sequences, and I see wildly varying times of execution depending on the spin I give it. Apparently it is related to something called "memoization", but I have a hard time understanding what it is and how it works, and, unfortunately, the relevant article on HaskellWiki, as well as the papers it links to, have all proven to not be easily surmountable. They discuss intricate details of the relative performance of highly layman-indifferentiable tree constructions, while what I miss must be some very basic, very trivial point that these sources neglect to mention.
This is the code. It is a complete program, ready to be built and executed.
module Main where

import Data.Function
import Data.List (maximumBy)

size :: (Integral a) => a
size = 10 ^ 6

-- Nail the basics.

collatz :: Integral a => a -> a
collatz n | even n = n `div` 2
          | otherwise = n * 3 + 1

recollatz :: Integral a => a -> a
recollatz = fix $ \f x -> if (x /= 1)
                          then f (collatz x)
                          else x

-- Now, I want to do the counting with a tuple monad.

mocollatz :: Integral b => b -> ([b], b)
mocollatz n = ([n], collatz n)

remocollatz :: Integral a => a -> ([a], a)
remocollatz = fix $ \f x -> if x /= 1
                            then f =<< mocollatz x
                            else return x

-- Trivialities.

collatzLength :: Integral a => a -> Int
collatzLength x = (length . fst $ (remocollatz x)) + 1

collatzPairs :: Integral a => a -> [(a, Int)]
collatzPairs n = zip [1..n] (collatzLength <$> [1..n])

longestCollatz :: Integral a => a -> (a, Int)
longestCollatz n = maximumBy order $ collatzPairs n
  where
    order :: Ord b => (a, b) -> (a, b) -> Ordering
    order x y = snd x `compare` snd y

main :: IO ()
main = print $ longestCollatz size
With ghc -O2 it takes about 17 seconds, without ghc -O2 -- about 22 seconds to deliver the length and the seed of the longest Collatz sequence starting at any point below size.
Now, if I make these changes:
diff --git a/Main.hs b/Main.hs
index c78ad95..9607fe0 100644
--- a/Main.hs
+++ b/Main.hs
@@ -1,6 +1,7 @@
 module Main where
 
 import Data.Function
+import qualified Data.Map.Lazy as M
 import Data.List (maximumBy)
 
 size :: (Integral a) => a
@@ -22,10 +23,15 @@ recollatz = fix $ \f x -> if (x /= 1)
 
 mocollatz :: Integral b => b -> ([b], b)
 mocollatz n = ([n], collatz n)
 
-remocollatz :: Integral a => a -> ([a], a)
-remocollatz = fix $ \f x -> if x /= 1
-                            then f =<< mocollatz x
-                            else return x
+remocollatz :: (Num a, Integral b) => b -> ([b], a)
+remocollatz 1 = return 1
+remocollatz x = case M.lookup x (table mutate) of
+    Nothing -> mutate x
+    Just y  -> y
+  where mutate x = remocollatz =<< mocollatz x
+
+table :: (Ord a, Integral a) => (a -> b) -> M.Map a b
+table f = M.fromList [ (x, f x) | x <- [1..size] ]
 
 -- Trivialities.
Then it will take just about 4 seconds with ghc -O2, but I would not live long enough to see it complete without ghc -O2.
Looking at the details of cost centres with ghc -prof -fprof-auto -O2 reveals that the first version enters collatz about a hundred million times, while the patched one enters it only about one and a half million times. This must be the reason for the speedup, but I have a hard time understanding the inner workings of this magic. My best idea is that we replace a portion of expensive recursive calls with O(log n) map lookups, but I don't know if that's true, nor why it depends so much on some godforsaken compiler flags, while, as I see it, such performance swings should all follow solely from the language.
Can I haz an explanation of what happens here, and why the performance differs so vastly between ghc -O2 and plain ghc builds?
P.S. There are two requirements for achieving automagical memoization, highlighted elsewhere on Stack Overflow:
1. Make the function to be memoized a top-level name.
2. Make the function to be memoized monomorphic.
In line with these requirements, I rebuilt remocollatz as follows:
remocollatz :: Int -> ([Int], Int)
remocollatz 1 = return 1
remocollatz x = mutate x
mutate :: Int -> ([Int], Int)
mutate x = remocollatz =<< mocollatz x
Now it's as top level and as monomorphic as it gets. Running time is about 11 seconds, versus the similarly monomorphized table version:
remocollatz :: Int -> ([Int], Int)
remocollatz 1 = return 1
remocollatz x = case M.lookup x (table mutate) of
  Nothing -> mutate x
  Just y  -> y

mutate :: Int -> ([Int], Int)
mutate = \x -> remocollatz =<< mocollatz x

table :: (Int -> ([Int], Int)) -> M.Map Int ([Int], Int)
table f = M.fromList [ (x, f x) | x <- [1..size] ]
Running in less than 4 seconds.
I wonder why the memoization ghc is supposedly performing in the first case here is almost 3 times slower than my dumb table.
Can I haz an explanation of what happens here, and why the performance differs so vastly between ghc -O2 and plain ghc builds?
Disclaimer: this is a guess, not verified by viewing GHC core output. A careful answer would do so to verify the conjectures outlined below. You can try peering through it yourself: add -ddump-simpl to your compilation line and you will get copious output detailing exactly what GHC has done to your code.
You write:
remocollatz x = {- ... -} table mutate {- ... -}
  where mutate x = remocollatz =<< mocollatz x
The expression table mutate in fact does not depend on x; but it appears on the right-hand side of an equation that takes x as an argument. Consequently, without optimizations, this table is recomputed each time remocollatz is called (presumably even from inside the computation of table mutate).
With optimizations, GHC notices that table mutate does not depend on x, and floats it to its own definition, effectively producing:
fresh_variable_name = table mutate
  where mutate x = remocollatz =<< mocollatz x

remocollatz x = case M.lookup x fresh_variable_name of
  {- ... -}
The table is therefore computed just once for the entire program run.
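Here is a hypothetical mini-example of the same phenomenon (the names slow, fast, and expensiveTop are mine):

-- `expensive` does not depend on x, but it is a local binding, so
-- without -O2 it may be recomputed on every call to slow:
slow :: Int -> Int
slow x = expensive + x
  where expensive = sum [1 .. 10^7 :: Int]

-- floated to the top level by hand, it becomes a constant that is
-- computed at most once for the entire program run:
expensiveTop :: Int
expensiveTop = sum [1 .. 10^7]

fast :: Int -> Int
fast x = expensiveTop + x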
I don't know why it [the performance] depends so much on some godforsaken compiler flags, while, as I see it, such performance swings should all follow solely from the language.
Sorry, but Haskell doesn't work that way. The language definition tells clearly what the meaning of a given Haskell term is, but does not say anything about the runtime or memory performance needed to compute that meaning.
Another approach to memoization that works in some situations, like this one, is to use a boxed vector, whose elements are computed lazily. The function used to initialize each element can use other elements of the vector in its calculation. As long as the evaluation of an element of the vector doesn't loop and refer to itself, just the elements it recursively depends on will be evaluated. Once evaluated, an element is effectively memoized, and this has the further benefit that elements of the vector that are never referenced are never evaluated.
The Collatz sequence is a nearly ideal application for this technique, but there is one complication. The next Collatz value(s) in sequence from a value under the limit may be outside the limit, which would cause a range error when indexing the vector. I solved this by just iterating through the sequence until back under the limit and counting the steps to do so.
The following program takes 0.77 seconds to run unoptimized and 0.30 when optimized:
import qualified Data.Vector as V

limit = 10 ^ 6 :: Int

-- The Collatz function, which given a value returns the next in the sequence.
nextCollatz val
  | odd val   = 3 * val + 1
  | otherwise = val `div` 2

-- Given a value, return the next Collatz value in the sequence that is less
-- than the limit and the number of steps to get there. For example, the
-- sequence starting at 13 is: [13, 40, 20, 10, 5, 16, 8, 4, 2, 1], so if
-- limit is 100, then (nextCollatzWithinLimit 13) is (40, 1), but if limit is
-- 15, then (nextCollatzWithinLimit 13) is (10, 3).
nextCollatzWithinLimit val = (firstInRange, stepsToFirstInRange)
  where
    firstInRange = head rest
    stepsToFirstInRange = 1 + (length biggerThanLimit)
    (biggerThanLimit, rest) = span (>= limit) (tail collatzSeqStartingWithVal)
    collatzSeqStartingWithVal = iterate nextCollatz val

-- A boxed vector holding the Collatz length for each index. The collatzFn
-- used to generate the value for each element refers back to other elements
-- of this vector, but since the vector elements are only evaluated as needed
-- and there aren't any loops in the Collatz sequences, the values are
-- calculated only as needed.
collatzVec :: V.Vector Int
collatzVec = V.generate limit collatzFn
  where
    collatzFn :: Int -> Int
    collatzFn index
      | index <= 1 = 1
      | otherwise = (collatzVec V.! nextWithinLimit) + stepsToGetThere
      where
        (nextWithinLimit, stepsToGetThere) = nextCollatzWithinLimit index

main :: IO ()
main = do
  -- Use a fold through the vector to find the longest Collatz sequence under
  -- the limit, and keep track of both the maximum length and the initial
  -- value of the sequence, which is the index.
  let (maxLength, maxIndex) = V.ifoldl' accMaxLen (0, 0) collatzVec
      accMaxLen acc@(accLen, _accIndex) index currLen
        | currLen <= accLen = acc
        | otherwise = (currLen, index)
  putStrLn $ "Max Collatz length below " ++ show limit ++ " is "
          ++ show maxLength ++ " at index " ++ show maxIndex
I am trying to generate a tuple of Vectors by using a function that creates a custom data type (or a tuple) of values from an index. Here is an approach that achieves the desired result:
import Prelude hiding (map, unzip)
import Data.Vector hiding (map)
import Data.Array.Repa
import Data.Functor.Identity

data Foo = Foo {fooX :: Int, fooY :: Int}

unfoo :: Foo -> (Int, Int)
unfoo (Foo x y) = (x, y)

make :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make n f = unzip $ generate n getElt
  where getElt i = unfoo $ f i
Except that I would like to do it in a single iteration per Vector, almost like what is shown below, but avoiding multiple evaluations of the function f:
make' :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make' n f = (generate n getElt1, generate n getElt2)
  where
    getElt1 i = fooX $ f i
    getElt2 i = fooY $ f i
Just as a note, I understand that the vector library supports fusion, and the first example is already pretty efficient. I need a solution for the generate concept in general; other libraries have very similar constructors (Repa has fromFunction, for example), and I am using Vectors here simply to demonstrate the problem.
Maybe some sort of memoization of the calls to f would work, but I cannot think of anything.
Edit:
Another demonstration of the problem using Repa:
makeR :: Int -> (Int -> Foo) -> (Array U DIM1 Int, Array U DIM1 Int)
makeR n f = runIdentity $ do
  let arr = fromFunction (Z :. n) (\ (Z :. i) -> unfoo $ f i)
  arr1 <- computeP $ map fst arr
  arr2 <- computeP $ map snd arr
  return (arr1, arr2)
Same as with vectors, fusion saves the day on performance, but an intermediate array arr of tuples is still required, which I am trying to avoid.
Edit 2: (3 years later)
In the Repa example above, no intermediate array is actually created, since fromFunction creates a delayed array. Instead it is even worse: f is evaluated twice for each index, once for the first array and a second time for the second array. The delayed array must be computed in order to avoid such duplication of work.
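For the record, a sketch of that fix under the definitions above (it forces the tuple array once so f runs only once per index, at the cost of materializing the intermediate array; makeR' is my name for it):

makeR' :: Int -> (Int -> Foo) -> (Array U DIM1 Int, Array U DIM1 Int)
makeR' n f = runIdentity $ do
  -- compute the delayed array once, so f is evaluated once per index
  tups <- computeP (fromFunction (Z :. n) (\ (Z :. i) -> unfoo (f i)))
            :: Identity (Array U DIM1 (Int, Int))
  arr1 <- computeP $ map fst tups
  arr2 <- computeP $ map snd tups
  return (arr1, arr2)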
Looking back at my own question from a few years ago, I can now easily show what I was trying to do back then and how to get it done.
In short, it can't be done purely, so we need to resort to the ST monad and manual mutation of two vectors, but in the end we do get a nice and pure function that creates only two vectors and does not rely on fusion.
import Control.Monad.ST
import Data.Vector.Primitive
import Data.Vector.Primitive.Mutable

data Foo = Foo {fooX :: Int, fooY :: Int}

make :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make n f = runST $ do
  let n' = max 0 n
  mv1 <- new n'
  mv2 <- new n'
  let fillVectors i
        | i < n' = let Foo x y = f i
                   in write mv1 i x >> write mv2 i y >> fillVectors (i + 1)
        | otherwise = return ()
  fillVectors 0
  v1 <- unsafeFreeze mv1
  v2 <- unsafeFreeze mv2
  return (v1, v2)
And then we use it in a similar fashion to how it is done with generate:
λ> make 10 (\ i -> Foo (i + i) (i * i))
([0,2,4,6,8,10,12,14,16,18],[0,1,4,9,16,25,36,49,64,81])
The essential thing you're trying to write is
splat f = unzip . fmap f
which shares the results of evaluating f between the two result vectors, but you want to avoid the intermediate vector. Unfortunately, I'm pretty sure you can't have it both ways in any meaningful sense. Consider a vector of length 1 for simplicity. In order for the result vectors to share the result of f (v ! 0), each will need a reference to a thunk representing that result. Well, that thunk has to be somewhere, and it really might as well be in a vector.
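Spelled out with Data.Vector, that looks like this (a sketch; splat is the hypothetical name from above):

import qualified Data.Vector as V

-- Each element of the intermediate vector is one shared thunk of `f x`;
-- unzip hands both result vectors references to those thunks.
splat :: (a -> (b, c)) -> V.Vector a -> (V.Vector b, V.Vector c)
splat f = V.unzip . V.map f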
TL;DR: Is there a way to do Example 3 without passing an argument?
I'm trying to understand the state monad in Haskell (Control.Monad.State). I made an extremely simple function:
Example 1
example :: State Int Int
example = do
  e <- get
  put (e*5)
  return e
This example works in ghci...
runState example 3
(3,15)
I modified it to be able to take arguments....
Example 2
example :: Int -> State Int Int
example n = do
  e <- get
  put (e*n)
  return e
also works in ghci...
runState (example 5) 3
(3,15)
I made it recursive, counting the number of steps it takes for a computation to satisfy some condition
Example 3
example :: Int -> State Int Int
example n = do
  e <- get
  if (n /= 1)
    then do
      put (succ e)
      example (next n)
    else return (succ e)

next :: Int -> Int
next n
  | even n = div n 2
  | otherwise = 3*n+1
ghci
evalState (example 13) 0
10
My question is, is there a way to do the previous example without explicitly passing a value?
You can store n in the state along side of e, for example, something like:
example = do
  (e, n) <- get
  if n /= 1
    then do put (succ e, next n); example
    else return e
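For reference, here is a self-contained version of this idea (the type signature, the starting state, and the reuse of next from the question are my additions):

import Control.Monad.State

next :: Int -> Int
next n
  | even n = div n 2
  | otherwise = 3*n+1

example :: State (Int, Int) Int
example = do
  (e, n) <- get
  if n /= 1
    then do put (succ e, next n); example
    else return e

Here evalState example (0, 13) gives 9 rather than 10, because this variant does not take the final succ; returning succ e in the else branch would match the original exactly.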
There is some overhead to using the State monad, so you should compare this with the alternatives.
For instance, a more Haskelly way of approaching this problem is to compose list operations to compute the answer, e.g.:
collatz :: Int -> [Int]
collatz n = iterate next n
collatzLength n = length $ takeWhile (/= 1) $ collatz n
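A usage sketch for comparison; note that this counts the values strictly before reaching 1, so for 13 it gives 9 where the State version above gave 10:

λ> takeWhile (/= 1) (collatz 13)
[13,40,20,10,5,16,8,4,2]
λ> collatzLength 13
9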
I am working with Haskell and the Maybe monad, but I am a little bit confused by it.
Here is my code, but I am getting an error and I do not know how to improve it.
doAdd :: Int -> Int -> Maybe Int
doAdd x y = do
  result <- x + y
  return result
Let's look critically at the type of the function that you're writing:
doAdd :: Int -> Int -> Maybe Int
The point of the Maybe monad is to work with types that are wrapped with a Maybe type constructor. In your case, the two Int arguments are just plain Ints, and the + function always produces an Int so there is no need for the monad.
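That is, with plain Int arguments the direct version needs no Maybe at all:

doAdd :: Int -> Int -> Int
doAdd x y = x + y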
If instead, your function took Maybe Int as its arguments, then you could use do notation to handle the Nothing case behind the scenes:
doAdd :: Maybe Int -> Maybe Int -> Maybe Int
doAdd mx my = do x <- mx
                 y <- my
                 return (x + y)
example1 = doAdd (Just 1) (Just 3) -- => Just 4
example2 = doAdd (Just 1) Nothing -- => Nothing
example3 = doAdd Nothing (Just 3) -- => Nothing
example4 = doAdd Nothing Nothing -- => Nothing
But we can extract a pattern from this: what you are doing, more generically, is taking a function ((+) :: Int -> Int -> Int) and adapting it to work in the case where the arguments it wants are "inside" a monad. We can abstract away from the specific function (+) and the specific monad (Maybe) and get this generic function:
liftM2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2 f ma mb = do a <- ma
                    b <- mb
                    return (f a b)
Now with liftM2 you can write:
doAdd :: Maybe Int -> Maybe Int -> Maybe Int
doAdd = liftM2 (+)
The reason why I chose the name liftM2 is because this is actually a library function—you don't need to write it, you can import the Control.Monad module and you'll get it for free.
What would be a better example of using the Maybe monad? When you have an operation that, unlike +, can intrinsically produce a Maybe result. One idea would be if you wanted to catch division-by-zero mistakes. You could write a "safe" version of the div function:
-- | Returns `Nothing` if second argument is zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)
Now in this case the monad does become more useful:
-- | This function tests whether `x` is divisible by `y`. Returns `Nothing`
-- in case of division by zero.
divisibleBy :: Int -> Int -> Maybe Bool
divisibleBy x y = do z <- safeDiv x y
                     let x' = z * y
                     return (x == x')
Another more interesting monad example is if you have operations that return more than one value—for example, positive and negative square roots:
-- Compute both square roots of x.
allSqrt x = [sqrt x, -(sqrt x)]
-- Example: add the square roots of 5 to those of 7.
example = do x <- allSqrt 5
             y <- allSqrt 7
             return (x + y)
Or using liftM2 from above:
example = liftM2 (+) (allSqrt 5) (allSqrt 7)
So anyway, a good rule of thumb is this: never "pollute" a function with a monad type if it doesn't really need it. Your original doAdd—and even my rewritten version—are a violation of this rule of thumb, because what the function does is adding, but adding has nothing to do with Maybe—the Nothing handling is just a behavior that we add on top of the core function (+). The reason for this rule of thumb is that any function that does not use monads can be generically adapted to add the behavior of any monad you want, using utility functions like liftM2 (and many other similar utility functions).
On the other hand, safeDiv and allSqrt are examples where you can't really write the function you want without using Maybe or []; if you are dealing with a function like that, then monads are often a convenient abstraction for eliminating boilerplate code.
A better example might be
justPositive :: (Num a, Ord a) => a -> Maybe a
justPositive x
  | x <= 0 = Nothing
  | otherwise = Just x

addPositives x y = do
  x' <- justPositive x
  y' <- justPositive y
  return $ x' + y'
This will filter out any non-positive values passed into the function, using do notation.
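A quick usage check (my examples):

λ> addPositives 3 4
Just 7
λ> addPositives 3 (-4)
Nothing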
That isn't how you'd write that code. The <- operator is for getting a value out of a monad. The result of x + y is just a number, not a monad wrapping a number.
Do notation is actually completely wasteful here. If you were bound and determined to write it that way, it would have to look like this:
doAdd x y = do
  let result = x + y
  return result
But that's just a longwinded version of this:
doAdd x y = return $ x + y
Which is in turn equivalent to
doAdd x y = Just $ x + y
Which is how you'd actually write something like this.
The use case you give doesn't justify do notation, but here is a more common use case: you can chain functions of this type together.
-- func would be a function like division, which is undefined for division by zero
func :: Int -> Int -> Maybe Int

-- (named `chained` rather than `main` here, since its result is a Maybe Int, not an IO action)
chained :: Maybe Int
chained = do
  result1 <- func 1 2
  result2 <- func 3 4
  result3 <- func result1 result2
  return result3
This is the whole point of monads anyway: chaining together functions of type a -> m b.
When used this way, the Maybe monad acts much like exceptions in Java (you can use Either if you want to propagate a message up).
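For instance, here is a sketch of an Either variant of the safeDiv idea from the earlier answer, where the Left case carries a message that propagates through the chain:

safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

λ> safeDiv 10 2 >>= safeDiv 100
Right 20
λ> safeDiv 10 0 >>= safeDiv 100
Left "division by zero"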
How do you increment a variable in a functional programming language?
For example, I want to do:
main :: IO ()
main = do
  let i = 0
  i = i + 1
  print i
Expected output:
1
A simple way is to introduce shadowing of a variable name:

main :: IO ()
main = do
  let i = 0
  let j = i
  let i = j + 1
  print i

-- another way, simpler, specific to monads:
main = do
  let i = 0
  i <- return (i + 1)  -- because monadic bind is non-recursive
  print i

Prints 1.
Just writing let i = i+1 doesn't work because let in Haskell makes recursive definitions — it is actually Scheme's letrec. The i in the right-hand side of let i = i+1 refers to the i on its left-hand side — not to the upper-level i as might be intended. So we break that equation up by introducing another variable, j.
Another, simpler way is to use monadic bind, <- in the do-notation. This is possible because monadic bind is not recursive.
In both cases we introduce a new variable under the same name, thus "shadowing" the old entity, i.e. making it no longer accessible.
How to "think functional"
One thing to understand here is that functional programming with pure — immutable — values (like we have in Haskell) forces us to make time explicit in our code.
In an imperative setting, time is implicit. We "change" our vars — but any change is sequential. We can never change what a var was a moment ago — only what it will be from now on.
In pure functional programming this is just made explicit. One of the simplest forms this can take is using lists of values as records of the sequential change of imperative programming. Even simpler is to use different variables altogether to represent different values of an entity at different points in time (cf. single assignment and static single assignment form, or SSA).
So instead of "changing" something that can't really be changed anyway, we make an augmented copy of it, and pass that around, using it in place of the old thing.
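A tiny illustration of the "different variables for different points in time" idea (my own example):

-- imperative: i = 0; i = i + 1; i = i * 2
steps :: Int
steps = let i0 = 0        -- i at time 0
            i1 = i0 + 1   -- i at time 1
            i2 = i1 * 2   -- i at time 2
        in i2             -- 2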
As a general rule, you don't (and you don't need to). However, in the interests of completeness:
import Data.IORef
main = do
  i <- newIORef 0       -- new IORef i
  modifyIORef i (+1)    -- increase it by 1
  readIORef i >>= print -- print it
However, any answer that says you need to use something like MVar, IORef, STRef etc. is wrong. There is a purely functional way to do this, which in this small rapidly written example doesn't really look very nice.
import Control.Monad.State

type Lens a b = ((a -> b -> a), (a -> b))

setL = fst
getL = snd

modifyL :: Lens a b -> a -> (b -> b) -> a
modifyL lens x f = setL lens x (f (getL lens x))

lensComp :: Lens b c -> Lens a b -> Lens a c
lensComp (set1, get1) (set2, get2) =  -- Compose two lenses
  ( \s x -> set2 s (set1 (get2 s) x)  -- Not needed here
  , get1 . get2 )                     -- But added for completeness

(+=) :: (Num b) => Lens a b -> Lens a b -> State a ()
x += y = do
  s <- get
  put (modifyL x s (+ (getL y s)))

swap :: Lens a b -> Lens a b -> State a ()
swap x y = do
  s <- get
  let x' = getL x s
  let y' = getL y s
  put (setL y (setL x s y') x')

nFibs :: Int -> Int
nFibs n = evalState (nFibs_ n) (0,1)

nFibs_ :: Int -> State (Int,Int) Int
nFibs_ 0 = fmap snd get  -- The second Int is our result
nFibs_ n = do
  x += y          -- Add y to x
  swap x y        -- Swap them
  nFibs_ (n-1)    -- Repeat
  where x = ((\(x,y) x' -> (x', y)), fst)
        y = ((\(x,y) y' -> (x, y')), snd)
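A quick check of the machinery above (my addition):

λ> map nFibs [0..7]
[1,1,2,3,5,8,13,21]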
There are several ways to translate imperative i=i+1 programming into functional programming. A recursive function is the recommended way in functional programming; creating state is almost never what you want to do.
After a while you will learn that you can use [1..] if you need an index, for example, but it takes a lot of time and practice to think functionally instead of imperatively.
Here's another way to do something similar to i=i+1, though not identical, because there are no destructive updates. Note that the State monad example is just for illustration; you probably want [1..] instead:
module Count where

import Control.Monad.State

count :: Int -> Int
count c = c+1

count' :: State Int Int
count' = do
  c <- get
  put (c+1)
  return (c+1)

main :: IO ()
main = do
  -- purely functional, value-modifying (state-passing) way:
  print $ count . count . count . count . count . count $ 0
  -- purely functional, State monad way:
  print $ (`evalState` 0) $ do {
    count' ; count' ; count' ; count' ; count' ; count' }
Note: this is not an ideal answer, but hey, sometimes it might be better to give anything at all.
A simple function to increase the variable would suffice.
For example:
incVal :: Integer -> Integer
incVal x = x + 1

main :: IO ()
main = do
  let i = 1
  print (incVal i)
Or even an anonymous function to do it.
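For instance, a sketch of the anonymous-function version:

main :: IO ()
main = do
  let i = 1
  print ((\x -> x + 1) i)  -- applies an unnamed increment function; prints 2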