I am playing with the async library and trying to figure out its API in practice. I've noticed strange behavior I didn't expect. It looks like a bug, but maybe it's a feature and I just need to know a workaround.
import Control.Concurrent
import Control.Concurrent.Async
> withAsync (putStrLn "HELLO") (\_ -> putStrLn "WORLD")
WORLHEDL
The snippet above works just fine - both lines are printed (interleaved) - but a more complex async body is only partially executed.
> withAsync (putStrLn "XXXXXXXXXX" >> putStrLn "HELLO") (\_ -> putStrLn "WORLD")
WOXRXLXDX
See, the second putStrLn is not executed.
I guess I need to wrap the whole async body in some sort of bnf, but that looks weird anyway. Why doesn't withAsync do that for me?
forkIO works just fine, but I don't want to bother with unlifting.
async has a companion library, lifted-async, which propagates the monad automatically.
forkIO (Prelude.putStrLn "XXXXXXXXXX" >> Prelude.putStrLn "HELLO")
ThreadXIXdX X1X1X3X
XXX
HELLO
I found lifted-base with its fork function. It works like forkIO and passes the parent monad to the child thread.
withAsync cancels the forked thread (the 1st argument) when the function passed as the 2nd argument returns. This is explicit in the documentation of withAsync:
When the function returns or throws an exception, uninterruptibleCancel is called on the Async.
(...) This is a useful variant of async that ensures an Async is never left running unintentionally.
If you want to just fork a thread, use async.
async (putStrLn "XXXXXXXX" >> putStrLn "WORLD") >> putStrLn "HELLO"
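Alternatively, if you want withAsync's scoped cancellation but also want the forked action to run to completion, you can wait for it inside the body. A minimal sketch:

import Control.Concurrent.Async (wait, withAsync)

main :: IO ()
main = withAsync (putStrLn "XXXXXXXXXX" >> putStrLn "HELLO") $ \a -> do
  putStrLn "WORLD"
  wait a  -- block until the forked action completes, so uninterruptibleCancel
          -- has nothing left to interrupt when the body returns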
I'm a Haskell newbie and I want to do something which has a side effect like this:
i = 3.0
main :: IO ()
main = let m = print i in putStrLn "Hello world"
Then I could see the value of i when main runs, but it didn't print. I added ! before m but that also didn't work. I would like to know how to hack this, thanks in advance!
For debugging, use trace and friends.
import Debug.Trace
i = 3.0
main :: IO ()
main = traceShow i $ putStrLn "Hello world"
Note the trace appears on the standard error stream, as debugging output should.
The function in which you use trace doesn't have to be IO-typed. This, for example, will also work:
add a b = a + traceShow i b
Trace functions are a bit foreign to Haskell because they are technically impure. However the side effects are limited in scope and not observable by the program itself, so it's kinda OK.
In an IO action, you can just use putStrLn or print as usual, like
do
  print i
  putStrLn "Hello world"
Which desugars to print i >> putStrLn "…". The following is equivalent, but not really any better, because the action m doesn't really need to be named:
let m = print i  -- define an action
in do
  m                       -- ensure the action is actually executed
  putStrLn "Hello world"
In a pure function, you can use trace or traceShow from Debug.Trace:
traceShow i (putStrLn "Hello world")
But be aware that these print when the expression is forced, which, due to lazy evaluation, may happen in a different order than you expect, or not at all if the value is never used. You can add strictness annotations with seq or BangPatterns, as you tried for the monadic code, to help make sure things are forced when you expect. The reason !m = … didn't work for your IO action is that the strictness annotation only evaluates the expression, producing an IO action but not executing it, because it is never sequenced with another action as part of main. Remember: you can only (purely) construct IO actions and bind them together; the runtime is what actually executes them.
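A small sketch of that last point: a bang pattern forces the IO action value itself, but forcing an action does not run it.

{-# LANGUAGE BangPatterns #-}

i :: Double
i = 3.0

main :: IO ()
main = do
  let !m = print i        -- m is evaluated to WHNF here (an IO action value),
                          -- but the print is still not performed
  putStrLn "Hello world"  -- only this runs; add m as its own statement to run the print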
Finally, in a “pure” monad where you don’t have IO available, you can still use trace &c., for example in the list monad:
numbers :: [Int]
numbers = do
  x <- [1, 2, 3]
  traceShow x (pure ())
  y <- [4, 5, 6]
  traceShow y (pure ())
  pure (x * y)
You created m but never used it. To fix that you can try:
i = 3.0
main :: IO ()
main = let m = print i in m >> putStrLn "Hello world"
I'm trying to dive into building concurrent and robust code with Haskell, and it was recommended that I use the safe-exceptions and async libraries. However, I'm having a hard time understanding how to handle non-fatal errors thrown within an async action.
For instance, if there is a simple loop that is checking a network resource every n seconds, it would make sense to stop this using the cancel function, which would cause an AsyncCancelled exception to be thrown within the separate thread. Of course, there would also be the possibility that an IOError would be thrown from within the thread due to the network going down or some other problem. Depending on the type of exception and the data it contains, I would want to control whether the separate thread ignores the exception, performs some action, stops, or raises the exception in the main thread.
With the safe-exceptions library, the only functions that are able to do this are catchAsync and others like it, which are labelled as dangerous in the documentation. Other than that, there is waitCatch in the async library, but the fromException function always returns Nothing when I try to extract an IOError:
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Concurrent.Async
import Control.Concurrent hiding (throwTo)
import Control.Exception.Safe
import Control.Monad
import System.IO
import System.IO.Error hiding (catchIOError)
main = do
    hSetBuffering stdin NoBuffering
    putStrLn "Press any key to continue..."
    a <- async runLoop
    async $ hWaitForInput stdin (-1) *>
        throwTo (asyncThreadId a) (userError "error")
    waitCatch a >>= either handler nothing
  where
    printThenWait i = putStr (show i ++ " ") *> threadDelay 1000000
    runLoop = sequence_ $ printThenWait <$> [1..]
    nothing _ = pure ()
    handler e
        | Just (e' :: IOError) <- fromException e =
            putStrLn "It's an IOError!"
        | (Nothing :: Maybe IOError) <- fromException e =
            putStrLn "We got Nothing!"
I'm a bit confused about the danger of recovering from async exceptions, especially when standard functions such as cancel cause them to be thrown, and I don't know what the recommended way to handle them would be when using these two libraries. Is this an instance where catchAsync would be recommended, or is there another way to handle these types of situations that I haven't discovered?
Notice that Control.Exception.Safe.throwTo wraps synchronous exceptions into AsyncExceptionWrapper, and IOError is synchronous. (I have no idea why this wrapping is necessary, you should never throw synchronous exceptions asynchronously anyway.)
To make your code work, you should either catch AsyncExceptionWrapper or use Control.Exception.throwTo. But actually I don't completely understand what you are trying to do; most likely you are overcomplicating things.
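To make the first option concrete, here is a sketch of an alternative handler, assuming your version of safe-exceptions exports AsyncExceptionWrapper from Control.Exception.Safe (recent versions do): unwrap the wrapper first, then look for the IOError inside it.

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Exception (SomeException, fromException, toException)
import Control.Exception.Safe (AsyncExceptionWrapper (..))

-- Unwrap AsyncExceptionWrapper before trying to match the IOError inside.
handler :: SomeException -> IO ()
handler e
  | Just (AsyncExceptionWrapper inner) <- fromException e
  , Just (ioe :: IOError) <- fromException (toException inner)
  = putStrLn ("It's a wrapped IOError: " ++ show ioe)
  | otherwise
  = putStrLn ("Something else: " ++ show e)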
While learning Haskell I am wondering when an IO action will be performed. In several places I found descriptions like this:
"What’s special about I/O actions is that if they fall into the main function, they are performed."
But in the following example, 'greet' never returns and therefore nothing should be printed.
import Control.Monad
main = greet
greet = forever $ putStrLn "Hello World!"
Or maybe I should ask: what does it mean to "fall into the main function"?
First of all, main is not a function. It is indeed just a regular value and its type is IO (). The type can be read as: An action that, when performed, produces a value of type ().
Now the run-time system plays the role of an interpreter that performs the actions that you have described. Let's take your program as example:
main = forever (putStrLn "Hello world!")
Notice that I have performed a transformation. That one is valid, since Haskell is a referentially transparent language. The run-time system resolves the forever and finds this:
main = putStrLn "Hello world!" >> MORE1
It doesn't yet know what MORE1 is, but it now knows that it has a composition with one known action, which is executed. After executing it, it resolves the second action, MORE1 and finds:
MORE1 = putStrLn "Hello world!" >> MORE2
Again it executes the first action in that composition and then keeps on resolving.
Of course this is a high level description. The actual code is not an interpreter. But this is a way to picture how a Haskell program gets executed. Let's take another example:
main = forever (getLine >>= putStrLn)
The RTS sees this:
main = forever MORE1
<< resolving forever >>
MORE1 = getLine >>= MORE2
<< executing getLine >>
MORE2 result = putStrLn result >> MORE1
<< executing putStrLn result (where 'result' is the line read)
and starting over >>
When understanding this you understand how an IO String is not "a string with side effects" but rather the description of an action that would produce a string. You also understand why laziness is crucial for Haskell's I/O system to work.
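To make that concrete, here is a small self-contained sketch (askTwice is just an illustrative name):

-- getLine :: IO String is a description of an action, not a string.
askTwice :: IO (String, String)
askTwice = do
  let ask = getLine  -- merely names the description; nothing is read here
  a <- ask           -- the action runs when the RTS performs it as part of main
  b <- ask           -- performing the same description again reads a second line
  pure (a, b)

main :: IO ()
main = askTwice >>= print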
In my opinion the point of the statement "What’s special about I/O actions is that if they fall into the main function, they are performed." is that IO actions are first class citizens. That is, IO-actions can occur at all places where values of other data types like Int can occur. For example, you can define a list that contains IO actions as follows.
actionList = [putStr "Hello", putStr "World"]
The list actionList has type [IO ()]. That is, the list contains actions that interact with the world, for example, print on the console or read in input from the user. But, in defining this list we do not execute the actions, we simply put them in a list for later use.
If an IO action can occur somewhere in your program, the question arises when these actions are performed, and here main comes into play. Consider the following definition of main.
main = do
  actionList !! 0
  actionList !! 1
This main function selects the first and the second element of the list and "executes" the corresponding actions by using them within its definition. Note that it does not necessarily have to be the main function itself that executes an IO action. Any function that is called from the main function can execute actions as well. For example, we can define a function that calls the actions from actionList and have main call this function as follows.
main = do
  caller
  putStr "!"

caller = do
  actionList !! 0
  actionList !! 1
To highlight that it does not have to be a simple renaming like in main = caller I have added an action that prints an exclamation mark after it has performed the actions from the list.
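As an aside, indexing with !! isn't required; here is a small sketch of the same idea using sequence_, a standard combinator that performs each action in a list, in order:

actionList :: [IO ()]
actionList = [putStr "Hello", putStr "World"]

-- sequence_ performs every action in the list, left to right, then we add the "!"
main :: IO ()
main = sequence_ actionList >> putStr "!"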
Simple IO actions can be combined into more advanced ones by using do notation.
main = do
  putStrLn "Hello"
  putStrLn "World"
combines the IO action putStrLn "Hello" with the IO action putStrLn "World". main is now an IO action that first prints a line saying "Hello" and then a line saying "World". Written without do-notation (which is just syntactic sugar) it looks like this:
main = putStrLn "Hello" >> putStrLn "World"
Here you can see the >> function combining the two actions.
You can create an IO action that reads a line, passes it to a function (that does awesome stuff to it :)) and then prints the result like this:
main = do
  input <- getLine
  let result = doAwesomeStuff input
  putStrLn result
or without binding the result to a variable:
main = do
  input <- getLine
  putStrLn (doAwesomeStuff input)
This can of course also be written as IO actions and functions that combine them, like this:
main = getLine >>= (\input -> putStrLn (doAwesomeStuff input))
When you run the program, the main IO action is executed. This is the only time any IO actions are actually executed. (Well, technically you can also execute them from within your program, but it is not safe. The function that does this is called unsafePerformIO.)
You can read more here: http://www.haskell.org/haskellwiki/Introduction_to_Haskell_IO/Actions
(This link is probably a better explanation than mine, but I only found it after I had written nearly everything. It is also quite a bit longer.)
-- openASilo, loadCoordinates and pressTheBigRedButton are placeholder actions
launchAMissile :: IO ()
launchAMissile = do
  openASilo
  loadCoordinates
  pressTheBigRedButton

main = do
  let launch3missiles = launchAMissile >> launchAMissile >> launchAMissile
  putStrLn "Not actually launching any missiles"
forever isn't a loop like C's while (true). It is a function that produces an IO value (which contains an infinitely repeated sequence of actions), which is consumed by the caller. (In this case, the caller is main, which means that the actions get executed by the runtime system).
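For a rough picture of how forever builds that value, here is a sketch (the actual definition in Control.Monad is written with Applicative operators, but the idea is the same):

-- forever' builds an infinite description of actions; nothing runs until the
-- resulting IO value is performed by the RTS (here, because main returns it).
forever' :: Monad m => m a -> m b
forever' act = act >> forever' act

main :: IO ()
main = forever' (putStrLn "Hello World!")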
I have a worker thread which reads data repeatedly from an MVar and performs some useful work on that. After a while, the rest of the program forgets about that worker thread, which means that it will wait on an empty MVar and become very lonely. My question is:
Will the MVar be garbage collected if threads no longer write to it, for instance because they all wait for it?
Will garbage collection kill the waiting threads?
If neither, can I somehow indicate to the compiler that the MVar should be garbage collected and the thread be killed?
EDIT: I should probably clarify the purpose of my question. I don't desire general protection against deadlocks; instead, what I would like to do is to tie the life of the worker thread to the life of a value (as in: dead values are claimed by garbage collection). In other words, the worker thread is a resource that I would like to free not by hand, but when a certain value (the MVar or a derivative) is garbage collected.
Here is an example program that demonstrates what I have in mind:
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad (forever)

-- work and etc are placeholders for the rest of the program
main = do
  something
  -- the thread forked in something can be killed here
  -- because the MVar used for communication is no longer in scope
  etc

something = do
  v <- newEmptyMVar
  forkIO $ forever $ work =<< takeMVar v
  putMVar v "Haskell"
  putMVar v "42"
In other words, I want the thread to be killed when I can no longer communicate with it, i.e. when the MVar used for communication is no longer in scope. How to do that?
It will just work: when the MVar is only reachable by the thread that is blocked on it, then the thread is sent the BlockedIndefinitelyOnMVar exception, which will normally cause it to die silently (the default exception handler for a thread ignores this exception).
BTW, for doing some cleanup when the thread dies, you'll want to use forkFinally (which I just added to Control.Concurrent).
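For example, a minimal sketch of forkFinally-based cleanup (assuming a GHC new enough to ship forkFinally in Control.Concurrent, i.e. base >= 4.6):

import Control.Concurrent (forkFinally, newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkFinally
         (putStrLn "worker running")              -- the worker action
         (\r -> do
            putStrLn ("cleanup, worker ended with: " ++ show r)
            putMVar done ())                      -- runs even if the worker throws
  takeMVar done                                   -- wait for the cleanup to finish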
If you're lucky, you'll get a "BlockedIndefinitelyOnMVar", indicating that you're waiting on an MVar that no thread will ever write to.
But, to quote Ed Yang,
GHC only knows that a thread can be considered garbage if there are no
references to the thread. Who is holding a reference to the thread?
The MVar, as the thread is blocking on this data structure and has
added itself to the blocking list of this. Who is keeping the MVar
alive? Why, our closure that contains a call to takeMVar. So the
thread stays.
So without a bit of work (which would be, by the way, quite interesting to see), BlockedIndefinitelyOnMVar is not an obviously useful mechanism for giving your Haskell programs deadlock protection.
GHC just can't solve the problem in general of knowing whether your thread will make progress.
A better approach would be to explicitly terminate threads by sending them a Done message. E.g. just lift your message type into an optional value that also includes an end-of-message value:
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad
import Control.Exception
import Prelude hiding (catch)
main = do
    something
    threadDelay (10 * 10^6)
    print "Still here"

something = do
    v <- newEmptyMVar
    forkIO $
        finally
            (let go = do
                   x <- takeMVar v
                   case x of
                     Nothing -> return ()
                     Just v  -> print v >> go
             in go)
            (print "Done!")
    putMVar v $ Just "Haskell"
    putMVar v $ Just "42"
    putMVar v Nothing
and we get the correct clean up:
$ ./A
"Haskell"
"42"
"Done!"
"Still here"
I tested the simple weak MVar and it did get finalized and killed. The code is:
import Control.Monad
import Control.Exception
import Control.Concurrent
import Control.Concurrent.MVar
import System.Mem(performGC)
import System.Mem.Weak
dologger :: MVar String -> IO ()
dologger mv = do
  tid <- myThreadId
  weak <- mkWeakPtr mv (Just (putStrLn "X" >> killThread tid))
  logger weak

logger :: Weak (MVar String) -> IO ()
logger weak = act where
  act = do
    v <- deRefWeak weak
    case v of
      Just mv -> do
        a <- try (takeMVar mv) :: IO (Either SomeException String)
        print a
        either (\_ -> return ()) (\_ -> act) a
      Nothing -> return ()

play mv = act where
  act = do
    c <- getLine
    if c == "quit"
      then return ()
      else putMVar mv c >> act

doplay mv = do
  forkIO (dologger mv)
  play mv

main = do
  putStrLn "Enter a string to escape, or quit to exit"
  mv <- newEmptyMVar
  doplay mv
  putStrLn "*"
  performGC
  putStrLn "*"
  yield
  putStrLn "*"
  threadDelay (10^6)
  putStrLn "*"
The session with the program was:
(chrisk)-(/tmp)
(! 624)-> ghc -threaded -rtsopts --make weak2.hs
[1 of 1] Compiling Main ( weak2.hs, weak2.o )
Linking weak2 ...
(chrisk)-(/tmp)
(! 625)-> ./weak2 +RTS -N4 -RTS
Enter a string to escape, or quit to exit
This is a test
Right "This is a test"
Tab Tab
Right "Tab\tTab"
quit
*
*
X
*
Left thread killed
*
So blocking on takeMVar did not keep the MVar alive on ghc-7.4.1 despite expectations.
While BlockedIndefinitelyOnMVar should work, also consider using ForeignPtr finalizers. The normal role of those is to free C structures that are no longer accessible from Haskell. However, you can attach any IO finalizer to them.
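For illustration, here is a minimal sketch of attaching an arbitrary IO finalizer via Foreign.Concurrent.newForeignPtr (unlike Foreign.ForeignPtr.newForeignPtr it accepts a plain IO action; note that finalizers may run late or, at program exit, not at all):

import Control.Concurrent (threadDelay)
import Data.Word (Word8)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import System.Mem (performGC)

main :: IO ()
main = do
  p <- mallocBytes 1 :: IO (Ptr Word8)
  fp <- newForeignPtr p (putStrLn "finalizer ran" >> free p)
  withForeignPtr fp $ \_ -> putStrLn "using the pointer"
  performGC          -- fp is unreachable from here on; GC may now run the finalizer
  threadDelay 100000 -- give the finalizer a chance to print before the program exits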
I'm toying with Haskell threads, and I'm running into the problem of communicating lazily-evaluated values across a channel. For example, with N worker threads and 1 output thread, the workers communicate unevaluated work and the output thread ends up doing the work for them.
I've read about this problem in various documentation and seen various solutions, but I only found one solution that works and the rest do not. Below is some code in which worker threads start some computation that can take a long time. I start the threads in descending order, so that the first thread should take the longest, and the later threads should finish earlier.
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan -- .Strict
import Control.Concurrent.MVar
import Control.Exception (finally, evaluate)
import Control.Monad (forM_)
import Control.Parallel.Strategies (using, rdeepseq)
main = (>>=) newChan $ (>>=) (newMVar []) . run
run :: Chan (Maybe String) -> MVar [MVar ()] -> IO ()
run logCh statVars = do
    logV <- spawn1 readWriteLoop
    say "START"
    forM_ [18,17..10] $ spawn . busyWork
    await
    writeChan logCh Nothing -- poison the logger
    takeMVar logV
    putStrLn "DONE"
  where
    say mesg = force mesg >>= writeChan logCh . Just

    force s = mapM evaluate s                 -- works
    -- force s = return $ s `using` rdeepseq  -- no difference
    -- force s = return s                     -- no-op; try this with strict channel

    busyWork = say . show . sum . filter odd . enumFromTo 2 . embiggen
    embiggen i = i*i*i*i*i

    readWriteLoop = readChan logCh >>= writeReadLoop
    writeReadLoop Nothing = return ()
    writeReadLoop (Just mesg) = putStrLn mesg >> readWriteLoop

    spawn1 action = do
        v <- newEmptyMVar
        forkIO $ action `finally` putMVar v ()
        return v

    spawn action = do
        v <- spawn1 action
        modifyMVar statVars $ \vs -> return (v:vs, ())

    await = do
        vs <- modifyMVar statVars $ \vs -> return ([], vs)
        mapM_ takeMVar vs
Using most techniques, the results are reported in the order spawned; that is, the longest-running computation first. I interpret this to mean that the output thread is doing all the work:
-- results in order spawned (longest-running first = broken)
START
892616806655
503999185040
274877906943
144162977343
72313663743
34464808608
15479341055
6484436675
2499999999
DONE
I thought the answer to this would be strict channels, but they didn't work. I understand that WHNF for strings is insufficient because that would just force the outermost constructor (nil or cons for the first character of the string). The rdeepseq is supposed to fully evaluate, but it makes no difference. The only thing I've found that works is to map Control.Exception.evaluate :: a -> IO a over all the characters in the string. (See the force function comments in the code for several different alternatives.) Here's the result with Control.Exception.evaluate:
-- results in order finished (shortest-running first = correct)
START
2499999999
6484436675
15479341055
34464808608
72313663743
144162977343
274877906943
503999185040
892616806655
DONE
So why don't strict channels or rdeepseq produce this result? Are there other techniques? Am I misinterpreting why the first result is broken?
There are two issues going on here.
The reason the first attempt (using an explicit rnf) doesn't work is that, by using return, you've created a thunk that would fully evaluate itself when forced, but the thunk itself has never been forced. Notice that the type of evaluate is a -> IO a: the fact that it returns a value in IO means that evaluate can impose ordering:
return (error "foo") >> return 1 == return 1
evaluate (error "foo") >> return 1 == error "foo"
The upshot is that this code:
force s = evaluate $ s `using` rdeepseq
will work (as in, have the same behavior as mapM_ evaluate s).
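A tiny self-contained demonstration of that ordering difference, using ErrorCall to observe which thunk actually gets forced:

import Control.Exception (ErrorCall, evaluate, try)

main :: IO ()
main = do
  r1 <- try (return (error "foo" :: ()) >> return (1 :: Int))
  print (r1 :: Either ErrorCall Int)  -- Right 1: the error thunk was never forced
  r2 <- try (evaluate (error "foo" :: ()) >> return (1 :: Int))
  print (r2 :: Either ErrorCall Int)  -- Left foo: evaluate forced the thunk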
The case of using strict channels is a little trickier, but I believe this is due to a bug in strict-concurrency. The expensive computation is actually being run on the worker threads, but it's not doing you much good (you can check for this explicitly by hiding some asynchronous exceptions in your strings and seeing which thread the exception surfaces on).
What's the bug? Let's take a look at the code for strict writeChan:
writeChan :: NFData a => Chan a -> a -> IO ()
writeChan (Chan _read write) val = do
  new_hole <- newEmptyMVar
  modifyMVar_ write $ \old_hole -> do
    putMVar old_hole $! ChItem val new_hole
    return new_hole
We see that modifyMVar_ is called on write before we evaluate the thunk. The sequence of operations then is:
1. writeChan is entered
2. We takeMVar write (blocking anyone else who wants to write to the channel)
3. We evaluate the expensive thunk
4. We put the expensive thunk onto the channel
5. We putMVar write, unblocking all of the other threads
You don't see this behavior with the evaluate variants, because they perform the evaluation before the lock is acquired.
I’ll send Don mail about this and see if he agrees that this behavior is kind of suboptimal.
Don agrees that this behavior is suboptimal. We're working on a patch.
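For reference, here is a minimal self-contained version of the approach that works (forcing to normal form on the worker thread before writing to an ordinary lazy Chan), spelled with force from Control.DeepSeq rather than using/rdeepseq:

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Control.Monad (forM_, replicateM_)

main :: IO ()
main = do
  ch <- newChan
  forM_ [7, 6, 5 :: Integer] $ \n -> forkIO $ do
    let msg = show (sum [1 .. 10 ^ n :: Integer])  -- an expensive, lazily built String
    evaluate (force msg) >>= writeChan ch          -- forced here, on the worker thread
  replicateM_ 3 (readChan ch >>= putStrLn)         -- the reader only prints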