Good day. Given the following code:
import Control.DeepSeq
import Control.Exception
import Control.Parallel
import Control.Parallel.Strategies
import System.Environment
import Text.Printf
l = [34,56,43,1234,456,765,345,4574,58,878,978,456,34,234,1234123,1234,12341234]
f x = Just (sum [1..x])
fun1 :: [Maybe Integer]
fun1 = map f l `using` parList rdeepseq
fun2 :: [Maybe Integer]
fun2 = map f l `using` evalList (rparWith rdeepseq)
fun3 :: [Maybe Integer]
fun3 = map f l `using` evalList (rpar . force)
main :: IO ()
main = print fun1
Why do fun1 and fun2 run sequentially?
From what I understood, rparWith should spark its argument. The answer here states the same. But for fun1 and fun2 I'm getting output like "SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)", so sparks were not even created.
fun3 works as expected, with sparks being created.
Thanks for the help.
UPD: I also found that rdeepseq makes an example from the book (Parallel and Concurrent Programming in Haskell) run sequentially. The book says:
And we can use parPair to write a Strategy that fully evaluates both components of a pair in parallel:
parPair rdeepseq rdeepseq :: (NFData a, NFData b) => Strategy (a,b)
To break down what happens when this Strategy is applied to a pair: parPair calls evalPair, and evalPair calls rparWith rdeepseq on each component of the pair. So the effect is that each component will be fully evaluated to normal form in parallel.
But if I run
(Just (fib 35), Just (fib 36)) `using` parPair rdeepseq rdeepseq
or even
(fib 35, fib 36) `using` parPair rdeepseq rdeepseq
ThreadScope shows only one core running and 0 sparks created.
fib is implemented like this (also from the book):
fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
rparWith was defined using realWorld#, a deeply magical GHC internal value. The way it was used is essentially the same as applying a "function" sometimes called accursedUnutterablePerformIO (more officially, unsafeInlinePerformIO). Using it is only legitimate when the IO in question is actually exceptionally pure. The thinking was that since Eval is just for calculation, that should be fine. But in fact, sparking threads is an IO effect, and one we care about! The optimizer was rearranging those effects in an unfortunate way, causing them ultimately to be dropped. The fix was to use unsafeDupablePerformIO instead. That's a much better-behaved "function", and seems to do the trick. See the ticket for details.
Note: my initial fix turned out to be a bit wrong; it's now been modified once again.
The original paper describes rdeepseq as
rdeepseq :: NFData a => Strategy a
rdeepseq x = rnf x `pseq` return x
And indeed, if you use this definition, it will create sparks, as you'd expect. It looks like the rdeepseq semantics were changed (probably here), intentionally or accidentally. I don't see any note either in the documentation or in the changelog, so it is probably a bug. Please create an issue on their bug tracker and ask the maintainers for clarification.
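In case it helps, here is a minimal sketch of that workaround: the paper's definition, renamed rdeepseqOld (my name, to avoid clashing with the library's rdeepseq), plugged into the question's fun1. Per the above, this variant should create sparks again.
import Control.DeepSeq (NFData, rnf)
import Control.Parallel (pseq)
import Control.Parallel.Strategies (Strategy, parList, using)

-- The original paper's definition of rdeepseq, under a fresh name.
rdeepseqOld :: NFData a => Strategy a
rdeepseqOld x = rnf x `pseq` return x

-- fun1 from the question, but using the paper's strategy.
fun1' :: [Maybe Integer]
fun1' = map (\x -> Just (sum [1..x])) l `using` parList rdeepseqOld
  where l = [34, 56, 43, 1234, 456, 765, 345, 4574, 1234123]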
Related
I have a function fun1 that is not in IO and can be computationally expensive, so I want to run it for at most a specified number of seconds. I found the function timeout, but it requires fun1 to be in IO.
timeout :: Int -> IO a -> IO (Maybe a)
How can I circumvent this, or is there a better approach to achieve my goal?
Edit:
I revised the first sentence: fun1 is NOT in IO; it is of type fun1 :: Formula -> Bool.
Something close to what talex said, but with the seq moved, should work. Here is an example using an inefficient fib as the expensive computation.
Prelude> import System.Timeout
Prelude System.Timeout> :{
Prelude System.Timeout| let fib 0 = 0
Prelude System.Timeout| fib 1 = 1
Prelude System.Timeout| fib n = fib (n-1) + fib (n-2)
Prelude System.Timeout| :}
Prelude System.Timeout> timeout 1000000 (let x = fib 44 in x `seq` return x)
Nothing
Prelude System.Timeout>
Limiting function execution to a specific length of time is not pure (i.e. it does not ensure the same result every time), hence you should not be pursuing such behavior outside of IO. You can, for example, use something evil like unsafePerformIO (timeout 1000 (pure fun1)), but such usage will quickly lead to programs that are hard to understand and full of unexpected quirks. A better idea may be to define a custom monad that allows limited-time execution and can be lifted to IO, but I don't know whether such a thing exists.
import System.Timeout (timeout)
import Control.Exception (evaluate)
import Control.DeepSeq (NFData, force)
timeoutPure :: Int -> a -> IO (Maybe a)
timeoutPure t = timeout t . evaluate
timeoutPureDeep :: NFData a => Int -> a -> IO (Maybe a)
timeoutPureDeep t = timeoutPure t . force
You may not want to actually write these functions, but they demonstrate the right approach. evaluate is better than seq for this sort of thing, because seq can potentially be moved around by the compiler, escaping the timeout. I'm not sure if that's actually possible in this case, but it's better to just do the thing that's sure to work than to try to analyze carefully whether the riskier approach is okay.
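To round it out, a quick usage sketch of the helpers above (assuming they're in scope; expensive is just a placeholder for whatever pure computation you want to bound, e.g. the asker's Formula -> Bool):
main :: IO ()
main = do
  -- timeout takes microseconds, so this allows one second.
  result <- timeoutPureDeep 1000000 (expensive 100000000)
  case result of
    Nothing -> putStrLn "timed out"
    Just v  -> print v
  where
    -- Stand-in for a real expensive pure function.
    expensive :: Integer -> Integer
    expensive n = sum [1 .. n]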
According to the definitions given by "Parallel and Concurrent Programming in Haskell" on page 29, the class method rnf ("reduce to normal form") is given the default definition:
rnf a = a `seq` ()
Just to see if wrapping seq in another function, as the authors prescribe, really had the result of forcing 'a' to be evaluated to normal form, I tried implementing the function myself and got a result in the negative:
Prelude Control.DeepSeq > myrnf a = a `seq` ()
Prelude Control.DeepSeq > xs = map (+1) [1..10] :: [Int]
Prelude Control.DeepSeq > :sprint xs
xs = _
Prelude Control.DeepSeq > myrnf xs
()
Prelude Control.DeepSeq > :sprint xs
xs = _ : _
Prelude Control.DeepSeq > rnf xs
()
Prelude Control.DeepSeq > :sprint xs
xs = [2,3,4,5,6,7,8,9,10,11]
So have the authors made a glaring mistake, or am I missing something here?
Edit: I realised my original question contained elementary mistakes. Here is the proper form of the question.
Perhaps, by declaring the implementation in the class, one is saying that it is the default implementation, unless stated otherwise in a specific instance declaration?
Yes, that's exactly what it means. What's unusual (IMO) is that normally this default implementation is valid (i.e. does what you want) for all instances but may be overridden for efficiency or to break a cycle of default definitions (e.g. in Eq the default implementations are x == y = not (x /= y) and x /= y = not (x == y), so you can override whichever is more convenient).
But in the case of rnf, the documentation for deepseq-1.3.0.0 says
The default implementation of rnf ... may be convenient when defining instances for data types with no unevaluated fields (e.g. enumerations).
i.e. the default does not actually do the right thing for most types.
This problem has been fixed since 1.4.0.0.
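To make the fix concrete, here is a small sketch (my own example, not from the book or the library) of the kind of instance you'd write by hand, since the old default rnf a = a `seq` () only evaluates to weak head normal form:
import Control.DeepSeq

data Tree = Leaf | Node Tree Int Tree

instance NFData Tree where
  -- Recursing into every field is what reaches normal form; the old
  -- default (a `seq` ()) would stop at the outermost constructor.
  rnf Leaf         = ()
  rnf (Node l x r) = rnf l `seq` rnf x `seq` rnf r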
I was wondering if it is possible in Haskell to define a function which upon calling gives the next element of an (infinite) list, so for example:
Prelude> func
1
Prelude> func
2
Is it possible to have such a function in Haskell, and if so, can you give me an example?
You could do a Stateful thing like this:
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import Data.List
import Data.Maybe
-- This is not a function! The misleading name func comes from the question text.
func :: MonadState [a] m => m a
func = state (fromJust . uncons)
exampleUsage :: State [Int] (Int, Int)
exampleUsage = do
x <- func
y <- func
return (x, y)
You can try it in ghci:
> evalState exampleUsage [1..]
(1, 2)
However, at a high level, I would suggest rethinking your requirements. func is not very idiomatic at all; simply working with the infinite list directly is generally going to be much clearer, have lower (syntactic) overhead, and lead to better generated code. For example:
exampleUsage' :: [a] -> (a, a)
exampleUsage' (x:y:_) = (x,y)
N.B. this is two lines of code with no extensions or imports, compared to the previous 11 lines of code including a language extension and three imports. Usage is also simplified; you can drop the call to evalState entirely and be done.
> exampleUsage' [1..]
(1, 2)
You can use mutable references and the IO monad (or other stateful monad). This can be made rather pretty via partial application:
Prelude> import Data.IORef
Prelude Data.IORef> ref <- newIORef 0
Prelude Data.IORef> let func = readIORef ref >>= \r -> writeIORef ref (r+1) >> return r
Prelude Data.IORef> func
0
Prelude Data.IORef> func
1
Or closer to what you requested:
Prelude Data.IORef> ref2 <- newIORef [0..]
Prelude Data.IORef> let func2 = readIORef ref2 >>= \(x:xs) -> writeIORef ref2 xs >> return x
Prelude Data.IORef> func2
0
Prelude Data.IORef> func2
1
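If you want to reuse that pattern, here is a small sketch that packages it up for any infinite list (mkNext is my own name, not a library function):
import Data.IORef

-- Given an (infinite) list, build an IO action that yields the next
-- element each time it is run.
mkNext :: [a] -> IO (IO a)
mkNext xs0 = do
  ref <- newIORef xs0
  pure (atomicModifyIORef' ref (\(x:xs) -> (xs, x)))  -- partial: assumes a non-empty/infinite list

main :: IO ()
main = do
  next <- mkNext [1 :: Int ..]
  next >>= print  -- 1
  next >>= print  -- 2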
It sounds like you are looking for something like other languages' Iterator or Generator constructs. If so, this seems like a good use case for the conduit library. Note that there are other options (e.g. pipes); however, conduit may be a good starting point for you.
If you are trying to operate only over lists, using the State Monad may be a simpler answer (as Daniel suggests); however, if you are looking for a more general solution, conduit (or the like) may indeed be the answer.
The func you are searching for is therefore most likely the await function.
Here's a simple example -
import Prelude
import Conduit
import Data.MonoTraversable
main :: IO ()
main = runConduit (source .| consume) >>= print
source :: Monad m => Producer m (Element [Integer])
source = yieldMany [0..]
consume :: Monad m => ConduitM i o m (Maybe (i, i))
consume = do
mx <- await
my <- await
return $ (,) <$> mx <*> my
And its output -
λ main
Just (0,1)
I occasionally would like to delay specific parts of a pure algorithm while developing / testing, so I can monitor the evaluation simply by watching the lazy result build up piece by piece (which would generally be too fast to be useful in the final, un-delayed version). I then find myself inserting ugly stuff like sum [1..1000000] `seq` q, which kind of works (though often with the usual thunk-explosion problems, because I never think much about this), but is rather trial-and-error-like.
Is there a nicer, more controllable alternative that's still just as simple, when I want to do some quick testing in that way and can't be bothered to do proper profiling, criterion etc.?
I'd also like to avoid unsafePerformIO $ threadDelay, though I reckon this might actually be an appropriate use.
This looping solution avoids calling threadDelay, but still calls unsafePerformIO, so maybe we don't gain much:
import Data.AdditiveGroup
import Data.Thyme.Clock
import Data.Thyme.Clock.POSIX
import System.IO.Unsafe
pureWait :: NominalDiffTime -> ()
pureWait time = let tsList = map unsafePerformIO ( repeat getPOSIXTime ) in
case tsList of
(t:ts) -> loop t ts
where
loop t (t':ts') = if (t' ^-^ t) > time
then ()
else loop t ts'
main :: IO ()
main = do
putStrLn . show $ pureWait (fromSeconds 10)
UPDATE: Here's an alternative solution. First determine (using IO) how many iterations you need to achieve a given delay, and then just use a pure looping function (this also needs foldl', genericTake, and intersperse from Data.List).
pureWait :: Integer -> Integer
pureWait i = foldl' (+) 0 $ genericTake i $ intersperse (negate 1) (repeat 1)
calibrate :: NominalDiffTime -> IO Integer
calibrate timeSpan = let iterations = iterate (*2) 2 in loop iterations
where
loop (i:is) = do
t1 <- getPOSIXTime
if pureWait i == 0
then do
t2 <- getPOSIXTime
if (t2 ^-^ t1) > timeSpan
then return i
else loop is
else error "should never happen"
main :: IO ()
main = do
requiredIterations <- calibrate (fromSeconds 10)
putStrLn $ "iterations required for delay: " ++ show requiredIterations
putStrLn . show $ pureWait requiredIterations
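For comparison, the unsafePerformIO $ threadDelay route mentioned in the question can be wrapped in a tiny helper; pureDelay is my own name, and this is only meant for throwaway test builds:
import Control.Concurrent (threadDelay)
import System.IO.Unsafe (unsafePerformIO)

-- Delay the given number of microseconds, then yield x. Each distinct
-- application delays (at most) once, when its result is first forced.
pureDelay :: Int -> a -> a
pureDelay micros x = unsafePerformIO (threadDelay micros >> pure x)
{-# NOINLINE pureDelay #-}

-- Example: each element appears roughly half a second after it is demanded.
slowed :: [Int]
slowed = map (pureDelay 500000) [1..10]

main :: IO ()
main = mapM_ print slowed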
I have written the following code in Haskell:
import Data.IORef
import Control.Monad
import Control.Monad.Trans.Cont
import Control.Monad.IO.Class
fac n = do
i<-newIORef 1
f<-newIORef 1
replicateM_ n $ do
ri<-readIORef i
modifyIORef f (\x->x*ri)
modifyIORef i (+1)
readIORef f
This is very nice code which implements factorial as an imperative function. But replicateM_ cannot fully simulate the use of a real for loop. So I tried to create something using continuations, but I have failed. Here is my code:
ff = (`runContT` id) $ do
callCC $ \exit1 -> do
liftIO $ do
i<-newIORef 1
f<-newIORef 1
callCC $ \exit2 -> do
liftIO $ do
ri<-readIORef i
modifyIORef (\x->x*ri)
modifyIORef i (+1)
rri<-readIORef i
when (rri<=n) $ exit2(())
liftIO $ do
rf<-readIORef f
return rf
Can you help me correct my code?
Thanks
Since you're a beginner to Haskell and not doing this simply to learn how continuations and IORefs work, you're doing it wrong.
The Haskell-y way to write an imperative loop is tail calls or folds.
factorial n = foldl1' (*) [1..n]
factorial' n = go 1 n
  where go accum 0 = accum
        go accum n = go (accum * n) (n - 1)
Also since Haskell's callCC in essence provides you an early return, using it to simulate loops is not going to work.
callCC (\c -> ???)
Think about what we would have to put in for ??? in order to loop. Somehow, we want to run callCC again if it returns a certain value, otherwise just keep going on our merry way.
But nothing we put in ??? can make the callCC run again! It's going to return a value no matter what we do. So instead we'll need to do something around that callCC:
let (continue, val) = callCC (someFunc val)
in if continue
then callCallCCAgain val
else val
Something like this, right? But wait, callCallCCAgain is recursion! It's even tail recursion! In fact, that callCC is doing no one any good:
loop val = let (continue, val') = doBody val
in if continue
then loop val'
else val'
Look familiar? This is the same structure as factorial' above.
You can still use IORefs and something like the monad-loops package (sketched below), but it's always going to be an uphill battle because Haskell isn't meant to be written like that.
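For completeness, here is roughly what that looks like with whileM_ from monad-loops (facIORef is my name); it works, but notice how much noisier it is than the two-line tail-recursive version above:
import Data.IORef
import Control.Monad.Loops (whileM_)

facIORef :: Integer -> IO Integer
facIORef n = do
  i <- newIORef 1
  f <- newIORef 1
  -- Loop while i <= n, multiplying the accumulator and bumping the counter,
  -- just like the replicateM_ version in the question.
  whileM_ ((<= n) <$> readIORef i) $ do
    ri <- readIORef i
    modifyIORef' f (* ri)
    modifyIORef' i (+ 1)
  readIORef f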
Summary
When you want to do "loops" directly in Haskell, use tail recursion. But really, try to use combinators like fold and map; they're like little specialized loops, and GHC is fantastic at optimizing them. And definitely don't use IORefs: trying to program Haskell like it's C will just hurt your performance and readability, and everyone will be sad.