Updating MVar Throws Exception - haskell

Given:
λ: >let x = Control.Concurrent.MVar.newMVar ""
λ: >:t x
x :: IO (MVar [Char])
I tried to call putMVar:
λ: >:t putMVar
putMVar :: MVar a -> a -> IO ()
λ: >:t x
x :: IO (MVar [Char])
yet it failed
λ: >x >>= \y -> putMVar y "foo"
:^?^?
*** Exception: thread blocked indefinitely in an MVar operation
Why did it fail and how can I update x with "foo" rather than ""?

x is not an MVar. It is an action that creates an MVar, i.e. it's another name for newMVar "".
x >>= \y -> putMVar y "foo" is an action that creates an MVar and names it y. It then tries to put "foo" in the MVar. However, y already contains "", so putMVar blocks. It doesn't simply block forever because y is a local variable in this action, meaning no one else has access to it and no readers exist. putMVar detects this situation (deadlock) and throws an exception instead.
What you should do instead is start with:
x <- newMVar ""
This makes x an MVar.
Then you can take the old value ("") out:
takeMVar x
And put a new value in
putMVar x "foo"
(MVar doesn't support replacing the existing value in one step; you first have to take it out, then put a new value in.)

Let's look up the documentation:
data MVar a
An MVar (pronounced "em-var") is a synchronising variable, used for
communication between concurrent threads. It can be thought of as a a
box, which may be empty or full.
and
newMVar :: a -> IO (MVar a)
Create an MVar which contains the supplied value.
and
putMVar :: MVar a -> a -> IO ()
Put a value into an MVar. If the MVar is currently full, putMVar will
wait until it becomes empty.
There are two further important properties of putMVar:
putMVar is single-wakeup. That is, if there are multiple threads
blocked in putMVar, and the MVar becomes empty, only one thread will
be woken up. The runtime guarantees that the woken thread completes
its putMVar operation. When multiple threads are blocked on an MVar,
they are woken up in FIFO order. This is useful for providing fairness
properties of abstractions built using MVars.
melpomene's answer contains the correct explanation. I let my answer remain here for the cited documentation.

Related

How does the RTS detect that a thread is blocked indefinitely on an MVar operation?

I have this code
mvarToList :: MVar (Maybe a) -> IO [a]
mvarToList mv = do
mby <- takeMVar mv
case mby of
Nothing -> return []
Just x -> do
xs <- unsafeInterleaveIO (mvarToList mv)
return (x : xs)
{-# NOINLINE mvarToList #-}
streamQuery_ :: FromRow a => Connection -> Query -> IO [a]
streamQuery_ conn q = do
mv <- newEmptyMVar
void $ forkIO $ do
fold_ conn q () (\_ x -> putMVar mv (Just x))
putMVar mv Nothing
mvarToList mv
It uses the postgresql-simple library which provides a fold_ function. It allows you to provide a function that's ran on every row of the query's result. The internal implementation is such that if the provided function blocks, the next result won't be fetched (modulo batching).
What I've done here is basically attached the evaluation of an element of the list to takeMVar which gives me a lazy list that streams results from the DB.
The problem here is leakage. The only way I can hope to avoid any leaks is if SOMEHOW garbage collecting the list results in the RTS determining that the fetching thread is blocked forever (since nothing will be taken from the MVar anymore) and throws an exception to it. Maybe I could even catch that exception and handle cleaning up?

Using TChan with Timeout

I have a TChan as input for a thread which should behave like this:
If sombody writes to the TChan within a specific time, the content should be retrieved. If there is nothing written within the specified time, it should unblock and continue with Nothing.
My attempt on this was to use the timeout function from System.Timeout like this:
timeout 1000000 $ atomically $ readTChan pktChannel
This seemed to work but now I discovered, that I am sometimes loosing packets (they are written to the channel, but not read on the other side. In the log I get this:
2014.063.11.53.43.588365 Pushing Recorded Packet: 2 1439
2014.063.11.53.43.592319 Run into timeout
2014.063.11.53.44.593396 Run into timeout
2014.063.11.53.44.593553 Pushing Recorded Packet: 3 1439
2014.063.11.53.44.597177 Sending Recorded Packet: 3 1439
Where "Pushing Recorded Packet" is the writing from the one thread and "Sending Recorded Packet" is the reading from the TChan in the sender thread. The line with Sending Recorded Packet 2 1439 is missing, which would indicate a successful read from the TChan.
It seems that if the timeout is received at the wrong point in time, the channel looses the packet. I suspect that the threadKill function used inside timeout and STM don't play well together.
Is this correct? Does somebody have another solution that does not loose the packet?
Use registerDelay, an STM function, to signal a TVar when the timeout is reached. You can then use the orElse function or the Alternative operator <|> to select between the next TChan value or the timeout.
import Control.Applicative
import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM
import System.Random
-- write random values after a random delay
packetWriter :: Int -> TChan Int -> IO ()
packetWriter maxDelay chan = do
let xs = randomRs (10000 :: Int, maxDelay + 50000) (mkStdGen 24036583)
forM_ xs $ \ x -> do
threadDelay x
atomically $ writeTChan chan x
-- block (retry) until the delay TVar is set to True
fini :: TVar Bool -> STM ()
fini = check <=< readTVar
-- Read the next value from a TChan or timeout
readTChanTimeout :: Int -> TChan a -> IO (Maybe a)
readTChanTimeout timeoutAfter pktChannel = do
delay <- registerDelay timeoutAfter
atomically $
Just <$> readTChan pktChannel
<|> Nothing <$ fini delay
-- | Print packets until a timeout is reached
readLoop :: Show a => Int -> TChan a -> IO ()
readLoop timeoutAfter pktChannel = do
res <- readTChanTimeout timeoutAfter pktChannel
case res of
Nothing -> putStrLn "timeout"
Just val -> do
putStrLn $ "packet: " ++ show val
readLoop timeoutAfter pktChannel
main :: IO ()
main = do
let timeoutAfter = 1000000
-- spin up a packet writer simulation
pktChannel <- newTChanIO
tid <- forkIO $ packetWriter timeoutAfter pktChannel
readLoop timeoutAfter pktChannel
killThread tid
The thumb rule of concurrency is: if adding a sleep in some point inside an IO action matters, your program is not safe.
To understand why the code timeout 1000000 $ atomically $ readTChan pktChannel does not work, consider the following alternative implementation of atomically:
atomically' :: STM a -> IO a
atomically' action = do
result <- atomically action
threadDelay someTimeAmount
return result
The above is equal to atomically, but for an extra innocent delay. Now it is easy to see that if timeout kills the thread during the threadDelay, the atomic action has completed (consuming a message from the channel), yet timeout will return Nothing.
A simple fix to timeout n $ atomically ... could be the following
smartTimeout :: Int -> STM a -> IO (Maybe a)
smartTimeout n action = do
v <- atomically $ newEmptyTMvar
_ <- timeout n $ atomically $ do
result <- action
putTMvar v result
atomically $ tryTakeTMvar v
The above uses an extra transactional variable v to do the trick. The result value of the action is stored into v inside the same atomic block in which the action is run. The return value of timeout is not trusted, since it does not tell us if action was run or not. After that, we check the TMVar v, which will be full if and only if action was run.
Instead of TChan a, use TChan (Maybe a) . Your normal producer (of x) now writes Just x. Fork an extra "ticking" process that writes Nothing to the channel (every x seconds). Then have a reader for the channel, and abort if you get two successive Nothing. This way, you avoid exceptions, which may cause data to get lost in your case (but I am not sure).

theoretical deadlock in Control.Concurrent.Chan readChan

Browsing the source of readChan one finds the following implementation and comment, starting with version 4.6 of base:
-- |Read the next value from the 'Chan'.
readChan :: Chan a -> IO a
readChan (Chan readVar _) = do
modifyMVarMasked readVar $ \read_end -> do -- Note [modifyMVarMasked]
(ChItem val new_read_end) <- readMVar read_end
-- Use readMVar here, not takeMVar,
-- else dupChan doesn't work
return (new_read_end, val)
-- Note [modifyMVarMasked]
-- This prevents a theoretical deadlock if an asynchronous exception
-- happens during the readMVar while the MVar is empty. In that case
-- the read_end MVar will be left empty, and subsequent readers will
-- deadlock. Using modifyMVarMasked prevents this. The deadlock can
-- be reproduced, but only by expanding readMVar and inserting an
-- artificial yield between its takeMVar and putMVar operations.
Prior to base version 4.6, modifyMVar was used rather than modifyMVarMasked.
I don't understand what theoretical problem is solved for here. The last sentence states there is a problem if the thread yields between the takeMVar and putMVar that comprise readMVar. But as readMVar executes under mask_, how can an async exception prevent the put after successful take?
Any help understanding the issue here is appreciated.
Let's compare the source of modifyMVar and modifyMVarMasked, since the code changed from using one to using the other:
modifyMVar m io =
mask $ \restore -> do
a <- takeMVar m
(a',b) <- restore (io a) `onException` putMVar m a
putMVar m a'
return b
modifyMVarMasked m io =
mask_ $ do
a <- takeMVar m
(a',b) <- io a `onException` putMVar m a
putMVar m a'
return b
The key here is that modifyMVar calls restore before executing its second argument, whereas modifyMVarMasked does not. So readMVar was not called under mask_ in the old version of the code as you claim in your question! It was called under restore, instead, and therefore asynchronous exceptions could be enabled after all.
Here's me working through it.
So in readMVar...
readMVar :: MVar a -> IO a
readMVar m =
mask_ $ do
a <- takeMVar m
putMVar m a
return a
...despite the mask_ the runtime may raise an exception in a blocked takeMVar. Note in that function there's no need to actually handle that case; either the readMVar worked, in which case we're safe from async exceptions, or the takeMVar never succeeds; either way we never break the MVar by leaving it empty. (Is this correct? This is what I took away from the answer to my own related question.)
modifyMVar and modifyMVarMasked are:
modifyMVar :: MVar a -> (a -> IO (a,b)) -> IO b
modifyMVar m io =
mask $ \restore -> do
a <- takeMVar m
(a',b) <- restore (io a) `onException` putMVar m a
putMVar m a'
return b
modifyMVarMasked :: MVar a -> (a -> IO (a,b)) -> IO b
modifyMVarMasked m io =
mask_ $ do
a <- takeMVar m
(a',b) <- io a `onException` putMVar m a
putMVar m a'
return b
...where the difference is in modifyMVar the masking state is restored (i.e. async exceptions probably become unmasked) in io a, which in our case is more or less readMVar.
EDIT: Although readMVar is mask_-ed as well, so now I can't see why either choice of modifyMVarMasked or modifyMVar would make a difference...
The comment seems to imply that yield (inserted into readMVar) is interruptible (I can't find this documented anywhere) and so an async exception might be raised, in which case readVar would be restored (in both current and pre-4.6 versions), but in a non-empty queue readers would see an empty one and block.
You may be interested in reading the GHC trac on this commit, which has a sample program that consistently reproduces this bug when both Control.Concurrent.Chan and the test program are compiled -O0
https://ghc.haskell.org/trac/ghc/ticket/6153
In a similar vein:
https://ghc.haskell.org/trac/ghc/ticket/5870

Compilation error: "The last statement in a 'do' construct must be an expression"

The following is my dining philosophers code and yields a compilation error saying "The last statement in a 'do' construct must be an expression: mVar2 <- newEmptyMVar mVar3"
Can Somebody help me fix this error and get this program working? thank you
import Control.Concurrent
import Control.Concurrent.MVar
import System.Random
takefork :: Int -> forks -> IO ()
takefork n forks = takeMVar (forks!!n)
releasefork :: Int -> forks -> IO ()
releasefork n forks = putMVar (forks!!n)
philosopher :: [Int]
philosopher = [1,2,3,4,5]
forks :: [MVar] -> [Int]
forks = do
takefork n ( philosopher - 1)
threadDelay delay
let delay = 100000
takefork n philosopher
putStrLn("Philosopher" ++ philosopher ++ "has started eating")
releasefork n philosopher
releasefork n ( philosopher - 1)
ptStrLn ("Philosopher" ++ philosopher ++ "has stopped eating")
forks
main :: IO ()
main = do
mVar1 <- newEmptyMVar
mVar2 <- newEmptyMVar
mVar3 <- newEmptyMVar
mVar4 <- newEmptyMVar
mVar5 <- newEmptyMVar
let mVar = [mVar1, mVar2, mVar3, mVar4, mVar5]
sequence_ [ forkIO forks (mVar philosopher) ]
There are many problems with your code.
The error message you report indicates you are probably mixing spaces and tabs. Get rid of the tabs and use only spaces.
You are presumably writing this program in order to practice writing Haskell programs, not in order to run the program for fun and profit. So we don't want to simply give you a working Dining Philosophers implementation, we want to help you write your implementation.
I cannot tell from your code how you expect it to work.
I'm going to focus on the last line:
sequence_ [ forkIO forks (mVar philosopher) ]
sequence_ :: [IO a] -> IO () --- give sequence_ a list of i/o actions, and it (returns an i/o action that) performs each action in order. From the [...], it looks like you are trying to give it a list, but with only one element. This is probably not what you mean.
forkIO :: IO () -> IO ThreadID --- give forkIO an i/o action, and it (returns an i/o action that) starts that i/o action running in a new thread, giving you the id of that thread.
There are two problems here:
forks is a function, not an i/o action (it's not even a function that returns an i/o action, though you probably mean it to be)
you give forkIO a second argunment ((mVar philosopher)), but it only takes one argument
mVar philosopher itself doesn't make any sense: mVar :: [MVar a] (it's a list of MVars, and I haven't worked out what type the MVars are supposed to contain) but you treat it like a function, passing it philosopher as an argument.
At this point a lightbulb blinks on above my head. You wish to call forks with parameters mVar and philosopher?
sequence_ [ forkIO (forks mVar philosopher) ]
We're still sequencing a single action though. Perhaps you wish to call forks with each element of philosopher in turn?
sequence_ $ map (\n -> forkIO (forks mVar n)) philosopher
We can simplify this to
mapM_ (\n -> forkIO (forks mVar n)) philosopher
This doesn't match up with the type you given forks :: [MVar] -> [Int]. But that's probably wrong, so you'll want to fix that function next.

Help understanding MVar example in Haskell

I'm trying to understand the MVar example in the GHC latest docs -
data SkipChan a = SkipChan (MVar (a, [MVar ()])) (MVar ())
newSkipChan :: IO (SkipChan a)
newSkipChan = do
sem <- newEmptyMVar
main <- newMVar (undefined, [sem])
return (SkipChan main sem)
putSkipChan :: SkipChan a -> a -> IO ()
putSkipChan (SkipChan main _) v = do
(_, sems) <- takeMVar main
putMVar main (v, [])
mapM_ (sem -> putMVar sem ()) sems
getSkipChan :: SkipChan a -> IO a
getSkipChan (SkipChan main sem) = do
takeMVar sem
(v, sems) <- takeMVar main
putMVar main (v, sem:sems)
return v
dupSkipChan :: SkipChan a -> IO (SkipChan a)
dupSkipChan (SkipChan main _) = do
sem <- newEmptyMVar
(v, sems) <- takeMVar main
putMVar main (v, sem:sems)
return (SkipChan main sem)
I understand most of the program but for two questions -
Are operations like putSkipChan atomic? It seems to avoid blocking on putMVar by first doing a takeMVar. But wouldn't that fail if something else calls putMVar after the takeMVar but before the putMVar? In such cases, it seems the program would block forever.
Why does dupSkipChan append sem to the list of semaphores in the SkipChan? Isn't that done by getSkipChan. It seems to me that calling dupSkipChan followed by getSkipChan (which seems to be what you have to do to have multiple readers) would cause a block when putSkipChan tries to wake up the same semaphore twice?
You are correct, another thread could call putMVar main and mess up putSkipChan. But the module creating the above code would not export the SkipChan constructor so such a rogue operation would be impossible.
dupSkipChan makes a new emptyMVar called sem and adds that to the list in main. It does not add the pre-existing one that was created in newSkipChan. Thus there is no block.
To explain more to other readers of this question and comment: The idea is that there may be multiple reader threads. Initially SkipChan main sem1 is the only such reader. dupSkipChan makes a SkipChan main sem2. If there are thousands of readers then you would not want to notify all of them of a new value in putSkipChan, thus the design is that getSkipChan puts its sem into the list in main. Initializing SkipChan as done in newSkipChan and dupSkipChan also includes putting the new empty sem into the list in main.
The above initialization and design means that the first getSkipChan obtains the most recent past value to have been written (or block for the first value to arrive). Future getSkipChan on that SkipChan will always get a newer value than any gotten before, and these will not block if that value is already available.

Resources