I have function fun1 that is not IO and can be computationally expensive, so I want to run it for a specified amount of seconds max. I found a function timeout, but it requires this fun1 to be of IO.
timeout :: Int -> IO a -> IO (Maybe a)
How can I circumvent this, or is there a better approach to achieve my goal?
Edit:
I revised first sentence fun1 is NOT IO, it is of type fun1 :: Formula -> Bool.
Close to what talex said except moving the seq should work. Here is an example using inefficient fib as the expensive computation.
Prelude> import System.Timeout
Prelude System.Timeout> :{
Prelude System.Timeout| let fib 0 = 0
Prelude System.Timeout| fib 1 = 1
Prelude System.Timeout| fib n = fib (n-1) + fib (n-2)
Prelude System.Timeout| :}
Prelude System.Timeout> timeout 1000000 (let x = fib 44 in x `seq` return x)
Nothing
Prelude System.Timeout>
Limiting function execution to a specific time length is not pure (i.e. it does not ensure the same result every time), hence you should not be pursuing such behavior outside of IO. You can, for example, use something evil like unsafePerformIO (timeout 1000 (pure fun1)) but such usage will quickly lead to programs that are hard to understand with unexpected quirks. A better idea may be to define a custom monad that allows limited time execution and can be lifted to IO but I don't know if such a thing exists.
import System.Timeout (timeout)
import Control.Exception (evaluate)
import Control.DeepSeq (NFData, force)
timeoutPure :: Int -> a -> IO (Maybe a)
timeoutPure t = timeout t . evaluate
timeoutPureDeep :: NFData a => Int -> a -> IO (Maybe a)
timeoutPureDeep t = timeoutPure t . force
You may not want to actually write these functions, but they demonstrate the right approach. evaluate is better than seq for this sort of thing, because seq can potentially be moved around by the compiler, escaping the timeout. I'm not sure if that's actually possible in this case, but it's better to just do the thing that's sure to work than to try to analyze carefully whether the riskier approach is okay.
Related
Good day. Given code
import Control.DeepSeq
import Control.Exception
import Control.Parallel
import Control.Parallel.Strategies
import System.Environment
import Text.Printf
l = [34,56,43,1234,456,765,345,4574,58,878,978,456,34,234,1234123,1234,12341234]
f x = Just (sum [1..x])
fun1 :: [Maybe Integer]
fun1 = map f l `using` parList rdeepseq
fun2 :: [Maybe Integer]
fun2 = map f l `using` evalList (rparWith rdeepseq)
fun3 :: [Maybe Integer]
fun3 = map f l `using` evalList (rpar . force)
main :: IO ()
main = print fun1
Why fun1 and fun2 run sequentially?
From what I understood, rparWith should spark its argument. Answer here states the same. But for fun1 and fun2 I'm getting output like "SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)". So Sparks were not even created.
fun3 works as expected with sparks being created.
Ty for help
UPD: And I found that rdeepseq makes example from book (Parallel and Concurrent Programming in Haskell) works in sequence. Book says:
And we can use parPair to write a Strategy that fully evaluates both components of a pair in parallel:
parPair rdeepseq rdeepseq :: (NFData a, NFData b) => Strategy (a,b)
To break down what happens when this Strategy is applied to a pair: parPair calls, and evalPair calls rparWith rdeepseq on each component of the pair. So
the effect is that each component will be fully evaluated to normal form in parallel.
But if I run
(Just (fib 35), Just (fib 36)) `using` parPair rdeepseq rdeepseq
or even
(fib 35, fib 36) `using` parPair rdeepseq rdeepseq
Threadscope shows only one core running and 0 sparks created.
fib implemented like this (from book too)
fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
rparWith was defined using realWorld#, a deeply magical GHC internal value. The way it was used is essentially the same as applying a "function" sometimes called accursedUnutterablePerformIO (more officially, unsafeInlinePerformIO). Using it is only legitimate when the IO in question is actually exceptionally pure. The thinking was that since Eval is just for calculation, that should be fine. But in fact, sparking threads is an IO effect, and one we care about! The optimizer was rearranging those effects in an unfortunate way, causing them ultimately to be dropped. The fix was to use unsafeDupablePerformIO instead. That's a much better-behaved "function", and seems to do the trick. See the ticket for details.
Note: my initial fix turned out to be a bit wrong; it's now been modified once again.
The original paper describes rdeepseq as
rdeepseq :: NFData a => Strategy a
rdeepseq x = rnf x ‘pseq‘ return x
And indeed, if you use this definition, it will create sparks, like you'd expect. Looks like rdeepseq sematrics was changes (probably here), intentionally or incidentally. I don't see any note neither in the documentation, nor in the changelog, so it is probably a bug. Please create at issue on their bug tracker and ask maintainers for clarification.
I'm trying to learn the pipes package by writing my own sum function and I'm getting stumped. I'd like to not use the utility functions from Pipes.Prelude (since it has sum and fold and other functions which make it trivial) and only use the information as described in Pipes.Tutorial. The tutorial doesn't talk about the constructors of Proxy, but if I look in the source of sum and fold it uses those constructors and I wonder whether it is possible to write my sum function without knowledge of these low level details.
I'm having trouble coming to terms with how this function would be able to continue taking in values as long as there are values available, and then somehow return that sum to the user. I guess the type would be:
sum' :: Monad m => Consumer Int m Int
It appears to me this could work because this function could consume values until there are no more, then return the final sum. I would use it like this:
mysum <- runEffect $ inputs >-> sum'
However, the function in Pipes.Prelude has the following signature instead:
sum :: (Monad m, Num a) => Producer a m () -> m a
So I guess this is my first hurdle. Why does the sum function take a Producer as an argument as opposed to using >-> to connect?
FYI I ended up with the following after the answer from danidiaz:
sum' = go 0
where
go n p = next p >>= \x -> case x of
Left _ -> return n
Right (_, p') -> go (n + 1) p'
Consumers are actually quite limited in what they can do. They can't detect end-of-input (pipes-parse uses a different technique for that) and when some other part of the pipeline stops (for example the Producer upstream) that part is the one that must provide the result value for the pipeline. So putting the sum in the return value of the Consumer won't work in general.
Some alternatives are:
Implement a function that deals directly with Producer internals, or perhaps uses an auxiliary function like next. There are adapters of this type that can feed Producer data to "smarter" consumers, like Folds from the foldl package.
Keep using a Consumer, but instead of putting the sum in the return value of the Consumer, use a WriterT as the base monad with a Sum Int monoid as accumulator. That way, even if the Producer stop first, you can still run the writer to get to the accumulator This solution is likely to be less efficient, though.
Example code for the WriterT approach:
import Data.Monoid
import Control.Monad
import Control.Monad.Trans.Writer
import Pipes
producer :: Monad m => Producer Int m ()
producer = mapM_ yield [1..10]
summator :: Monad n => Consumer Int (WriterT (Sum Int) n) ()
summator = forever $ await >>= lift . tell . Sum
main :: IO ()
main = do
Sum r <- execWriterT . runEffect $ producer >-> summator
print r
Excuse me if this is a really dumb question, but I've read through one book and most of another book on Haskell already and don't seem to remember anywhere this was brought up.
How do I do the same thing n times? If you want to know exactly what I am doing, I'm trying to do some of the Google Code Jam questions to learn Haskell, and the first line of the input gives you the number of test cases. That being the case, I need to do the same thing n times where n is the number of test cases.
The only way I can think of to do this so far is to write a recursive function like this:
recFun :: Int -> IO () -> IO ()
recFun 0 f = do return ()
recFun n f = do
f
recFun (n-1) f
return ()
Is there no built in function that already does this?
forM_ from Control.Monad is one way.
Example:
import Control.Monad (forM_)
main = forM_ [1..10] $ \_ -> do
print "We'll do this 10 times!"
See the documentation here
http://hackage.haskell.org/package/base-4.8.0.0/docs/Control-Monad.html#v:forM-95-
I have:
main :: IO ()
main = do
iniciofibonaccimap <- getCPUTime
let fibonaccimap = map fib listaVintesete
fimfibonaccimap <- getCPUTime
let difffibonaccimap = (fromIntegral (fimfibonaccimap - iniciofibonaccimap)) / (10^12)
printf "Computation time fibonaccimap: %0.3f sec\n" (difffibonaccimap :: Double)
listaVintesete :: [Integer]
listaVintesete = replicate 100 27
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
But
*Main> main
Computation time fibonaccimap: 0.000 sec
I do not understand why this happens.
Help-me thanks.
As others have said, this is due to lazy evaluation. To force evaluation you should use the deepseq package and BangPatterns:
{-# LANGUAGE BangPatterns #-}
import Control.DeepSeq
import Text.Printf
import System.CPUTime
main :: IO ()
main = do
iniciofibonaccimap <- getCPUTime
let !fibonaccimap = rnf $ map fib listaVintesete
fimfibonaccimap <- getCPUTime
let difffibonaccimap = (fromIntegral (fimfibonaccimap - iniciofibonaccimap)) / (10^12)
printf "Computation time fibonaccimap: %0.3f sec\n" (difffibonaccimap :: Double)
...
In the above code you should notice three things:
It compiles (modulo the ... of functions you defined above). When you post code for questions please make sure it runs (iow, you should include imports)
The use of rnf from deepseq. This forces the evaluation of each element in the list.
The bang pattern on !fibonaccimap, meaning "do this now, don't wait". This forces the list to be evaluated to weak-head normal form (whnf, basically just the first constructor (:)). Without this the rnf function would itself remain unevaluated.
Resulting in:
$ ghc --make ds.hs
$ ./ds
Computation time fibonaccimap: 6.603 sec
If you're intending to do benchmarking you should also use optimization (-O2) and the Criterion package instead of getCPUTime.
Haskell is lazy. The computation you request in the line
let fibonaccimap = map fib listaVintesete
doesn't actually happen until you somehow use the value of fibonaccimap. Thus to measure the time used, you'll need to introduce something that will force the program to perform the actual computation.
ETA: I originally suggested printing the last element to force evaluation. As TomMD points out, this is nowhere near good enough -- I strongly recommend reading his response here for an actually working way to deal with this particular piece of code.
I suspect you are a "victim" of lazy evaluation. Nothing forces the evaluation of fibonaccimap between the timing calls, so it's not computed.
Edit
I suspect you're trying to benchmark your code, and in that case it should be pointed out that there are better ways to do this more reliably.
10^12 is an integer, which forces the value of fromIntegral to be an integer, which means difffibonaccimap is assigned a rounded value, so it's 0 if the time is less than half a second. (That's my guess, anyway. I don't have time to look into it.)
Lazy evaluation has in fact bitten you, as the other answers have said. Specifically, 'let' doesn't force the evaluation of an expression, it just scopes a variable. The computation won't actually happen until its value is demanded by something, which probably won't happen until an actual IO action needs its value. So you need to put your print statement between your getCPUTime evaluations. Of course, this will also get the CPU time used by print in there, but most of print's time is waiting on IO. (Terminals are slow.)
I'm needing some Ints to use as seed to random number generation and so I wanted to use the old trick of using the system time as seed.
So I tried to use the Data.Time package and I managed to do the following:
import Data.Time.Clock
time = getCurrentTime >>= return . utctDayTime
When I run time I get things like:
Prelude Data.Time.Clock> time
55712.00536s
The type of time is IO DiffTime. I expected to see an IO Something type as this depends on things external to the program. So I have two questions:
a) Is it possible to somehow unwrap the IO and get the underlying DiffTime value?
b) How do I convert a DiffTime to an integer with it's value in seconds? There's a function secondsToDiffTime but I couldn't find its inverse.
Is it possible to somehow unwrap the IO and get the underlying DiffTime value?
Yes. There are dozens of tutorials on monads which explain how. They are all based on the idea that you write a function that takes DiffTime and does something (say returning IO ()) or just returns an Answer. So if you have f :: DiffTime -> Answer, you write
time >>= \t -> return (f t)
which some people would prefer to write
time >>= (return . f)
and if you have continue :: DiffTime -> IO () you have
time >>= continue
Or you might prefer do notation:
do { t <- time
; continue t -- or possibly return (f t)
}
For more, consult one of the many fine tutorals on monads.
a) Of course it is possible to get the DiffTime value; otherwise, this function would be rather pointless. You'll need to read up on monads. This chapter and the next of Real World Haskell has a good introduction.
b) The docs for DiffTime say that it's an instance of the Real class, i.e. it can be treated as a real number, in this case the number of seconds. Converting it to seconds is thus a simple matter of chaining conversion functions:
diffTimeToSeconds :: DiffTime -> Integer
diffTimeToSeconds = floor . toRational
If you are planning to use the standard System.Random module for random number generation, then there is already a generator with a time-dependent seed initialized for you: you can get it by calling getStdGen :: IO StdGen. (Of course, you still need the answer to part (a) of your question to use the result.)
This function is not exactly what the OP asks. But it's useful:
λ: import Data.Time.Clock
λ: let getSeconds = getCurrentTime >>= return . fromRational . toRational . utctDayTime
λ: :i getSeconds
getSeconds :: IO Double -- Defined at <interactive>:56:5
λ: getSeconds
57577.607162
λ: getSeconds
57578.902397
λ: getSeconds
57580.387334