Behaviour of withMVar and bracket_ differs - Haskell

This ensures mutual exclusion of actions:
do
  lock <- newMVar ()
  let atomicPrint = withMVar lock . const . print
  mapM_ (forkIO . atomicPrint) [['1'..'8'],['a'..'h']]
This doesn't:
do
  lock <- newMVar ()
  let atomicPrint x = bracket_ (takeMVar lock) (print x) (putMVar lock ())
  mapM_ (forkIO . atomicPrint) [['1'..'8'],['a'..'h']]
Can you explain why? From the definitions of withMVar and bracket_ on Hackage and some equational reasoning I came to the conclusion that the two pieces of code should behave the same, but running them in GHCi proves me wrong.

You have the arguments in the wrong order: you are acquiring the lock and immediately releasing it, whereas the signature says:
bracket_
  :: IO a -- computation to run first ("acquire resource")
  -> IO b -- computation to run last ("release resource")
  -> IO c -- computation to run in-between
  -> IO c -- returns the value from the in-between computation
so it should be:
bracket_ (takeMVar lock) (putMVar lock ()) $ print x
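Putting the pieces together, here is a minimal sketch of the corrected version as a standalone program (the trailing threadDelay is my addition, just so main does not exit before the forked threads get to print):

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newMVar, takeMVar, putMVar)
import Control.Exception (bracket_)

main :: IO ()
main = do
  lock <- newMVar ()
  -- acquire the lock first, release it last, and print in between
  let atomicPrint x = bracket_ (takeMVar lock) (putMVar lock ()) (print x)
  mapM_ (forkIO . atomicPrint) [['1'..'8'], ['a'..'h']]
  threadDelay 100000  -- give the forked threads a moment to finish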

Related

Haskell ForkIO limit number of threads to certain value

It seems like forkIO creates as many threads as there are cores in the Haskell program I work with:
-- | Fork a thread in checker monad.
fork :: Checker a b () -> Checker a b ()
fork act = do
  s0 <- get
  void $ liftIO $ forkIO (curTGroup s0) $ evalChecker act s0

occurs :: Eq a => a -> [a] -> Int
occurs x = length . filter (x==)
https://github.com/PLSysSec/sys/blob/821c4d7cf924e68838c128cbe824be46c9955416/src/Static/Check.hs#L73
I'm new to Haskell's forkIO and wanted to limit the number of threads using setNumCapabilities.
I tried adding
let setNumCapabilities = 1
but GHC warned about an unused variable and it had no effect.
How do I do it properly?
setNumCapabilities :: Int -> IO () is a function. You thus use it in your code with:
import Control.Concurrent(setNumCapabilities)

fork :: Checker a b () -> Checker a b ()
fork act = do
  s0 <- get
  void $ liftIO $ do
    setNumCapabilities 1
    forkIO (curTGroup s0) $ evalChecker act s0
or somewhere else, for example in the main function.
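For example, a minimal, self-contained sketch of doing it in main instead (here just setting the capability count at program start and reading it back with getNumCapabilities as a check):

import Control.Concurrent (setNumCapabilities, getNumCapabilities)

main :: IO ()
main = do
  setNumCapabilities 1     -- limit the RTS to a single capability before forking any threads
  n <- getNumCapabilities  -- read the current setting back
  print n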

Updating MVar Throws Exception

Given:
λ: >let x = Control.Concurrent.MVar.newMVar ""
λ: >:t x
x :: IO (MVar [Char])
I tried to call putMVar:
λ: >:t putMVar
putMVar :: MVar a -> a -> IO ()
λ: >:t x
x :: IO (MVar [Char])
yet it failed
λ: >x >>= \y -> putMVar y "foo"
*** Exception: thread blocked indefinitely in an MVar operation
Why did it fail and how can I update x with "foo" rather than ""?
x is not an MVar. It is an action that creates an MVar, i.e. it's another name for newMVar "".
x >>= \y -> putMVar y "foo" is an action that creates an MVar and names it y. It then tries to put "foo" in the MVar. However, y already contains "", so putMVar blocks. It doesn't simply block forever because y is a local variable in this action, meaning no one else has access to it and no readers exist. putMVar detects this situation (deadlock) and throws an exception instead.
What you should do instead is start with:
x <- newMVar ""
This makes x an MVar.
Then you can take the old value ("") out:
takeMVar x
And put a new value in:
putMVar x "foo"
(MVar doesn't support replacing the existing value in one step; you first have to take it out, then put a new value in.)
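Putting that together, the GHCi session would look something like this (a sketch, using readMVar at the end only to inspect the contents):

λ: >x <- Control.Concurrent.MVar.newMVar ""
λ: >Control.Concurrent.MVar.takeMVar x
""
λ: >Control.Concurrent.MVar.putMVar x "foo"
λ: >Control.Concurrent.MVar.readMVar x
"foo"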
Let's look up the documentation:
data MVar a
An MVar (pronounced "em-var") is a synchronising variable, used for
communication between concurrent threads. It can be thought of as a
box, which may be empty or full.
and
newMVar :: a -> IO (MVar a)
Create an MVar which contains the supplied value.
and
putMVar :: MVar a -> a -> IO ()
Put a value into an MVar. If the MVar is currently full, putMVar will
wait until it becomes empty.
There are two further important properties of putMVar:
putMVar is single-wakeup. That is, if there are multiple threads
blocked in putMVar, and the MVar becomes empty, only one thread will
be woken up. The runtime guarantees that the woken thread completes
its putMVar operation. When multiple threads are blocked on an MVar,
they are woken up in FIFO order. This is useful for providing fairness
properties of abstractions built using MVars.
melpomene's answer contains the correct explanation. I'm leaving my answer here for the cited documentation.

STM and atomically: why do the semantics of these two programs differ?

Let's consider this simple Haskell program:
module Main where

import Control.Concurrent.STM
import Control.Concurrent
import Control.Exception
import Control.Monad
import Data.Maybe
import Data.Monoid
import Control.Applicative

terminator :: Either SomeException () -> IO ()
terminator r = print $ "Dying with " <> show r

doStuff :: TMVar () -> TChan () -> Int -> IO ()
doStuff writeToken barrier w = void $ flip forkFinally terminator $ do
  hasWriteToken <- isJust <$> atomically (tryTakeTMVar writeToken)
  case hasWriteToken of
    True -> do
      print $ show w <> "I'm the lead.."
      threadDelay (5 * 10^6)
      print "Done heavy work"
      atomically $ writeTChan barrier ()
    False -> do
      print $ show w <> " I'm the worker, waiting for the barrier..."
      myChan <- atomically $ dupTChan barrier
      _ <- atomically $ readTChan myChan
      print "Unlocked!"

main :: IO ()
main = do
  writeToken <- newTMVarIO ()
  barrier <- newBroadcastTChanIO
  _ <- forM [1..20] (doStuff writeToken barrier)
  threadDelay (20 * 10^6)
  return ()
It essentially models a concurrency scenario where a "lead" acquires the write token and does some work, while the workers sync on a barrier and wait for the "green light" from the lead. This works, but if we replace the worker's atomically blocks with this:
_ <- atomically $ do
  myChan <- dupTChan barrier
  readTChan myChan
All the workers remain blocked indefinitely inside an STM transaction:
"Done heavy work"
"Dying with Right ()"
"Dying with Left thread blocked indefinitely in an STM transaction"
"Dying with Left thread blocked indefinitely in an STM transaction"
"Dying with Left thread blocked indefinitely in an STM transaction"
...
I suspect the key lies in the semantics of atomically. Any ideas?
Thanks!
Alfredo
I think this behavior comes from the definition of dupTChan. Copied here for readability, along with readTChan:
dupTChan :: TChan a -> STM (TChan a)
dupTChan (TChan _read write) = do
  hole <- readTVar write
  new_read <- newTVar hole
  return (TChan new_read write)

readTChan :: TChan a -> STM a
readTChan (TChan read _write) = do
  listhead <- readTVar read
  head <- readTVar listhead
  case head of
    TNil -> retry
    TCons a tail -> do
      writeTVar read tail
      return a
Inlining those functions, we get this STM block:
worker_block (TChan _read write) = do
  hole <- readTVar write
  new_read <- newTVar hole
  listhead <- readTVar new_read
  head <- readTVar listhead
  case head of
    TNil -> retry
    ...
When you try to run this block atomically, it makes a new read end pointing at the current end of the channel and then reads from it. That end is of course empty, so the read hits TNil and the transaction retries. However, when the lead writes to the channel, that write invalidates this transaction, so it re-runs from the top and again duplicates the (once more empty) end of the channel. So every follower transaction will always have to retry.
In fact, I don't think there is any case where dupTChan >>= readTChan will ever result in anything other than the thread being blocked indefinitely on an STM transaction. You can reason this out from the documentation as well. dupTChan begins empty, so within a single atomic transaction it will never have any items unless that same transaction adds them.
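To make the contrast concrete, here is a small sketch of the two worker variants side by side (same barrier channel as in the question, not presented as the only possible fix):

import Control.Concurrent.STM

-- Works: the duplicate is created in its own transaction, so a write by the
-- lead that happens afterwards is visible to the separate read transaction.
workerOk :: TChan () -> IO ()
workerOk barrier = do
  myChan <- atomically $ dupTChan barrier
  atomically $ readTChan myChan

-- Blocks forever: inside a single transaction the freshly duplicated channel
-- is always empty, so readTChan retries; every re-run duplicates an empty
-- read end again.
workerStuck :: TChan () -> IO ()
workerStuck barrier = atomically $ do
  myChan <- dupTChan barrier
  readTChan myChan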

theoretical deadlock in Control.Concurrent.Chan readChan

Browsing the source of readChan one finds the following implementation and comment, starting with version 4.6 of base:
-- |Read the next value from the 'Chan'.
readChan :: Chan a -> IO a
readChan (Chan readVar _) = do
  modifyMVarMasked readVar $ \read_end -> do  -- Note [modifyMVarMasked]
    (ChItem val new_read_end) <- readMVar read_end
      -- Use readMVar here, not takeMVar,
      -- else dupChan doesn't work
    return (new_read_end, val)

-- Note [modifyMVarMasked]
-- This prevents a theoretical deadlock if an asynchronous exception
-- happens during the readMVar while the MVar is empty. In that case
-- the read_end MVar will be left empty, and subsequent readers will
-- deadlock. Using modifyMVarMasked prevents this. The deadlock can
-- be reproduced, but only by expanding readMVar and inserting an
-- artificial yield between its takeMVar and putMVar operations.
Prior to base version 4.6, modifyMVar was used rather than modifyMVarMasked.
I don't understand what theoretical problem is being solved here. The last sentence states there is a problem if the thread yields between the takeMVar and putMVar that comprise readMVar. But since readMVar executes under mask_, how can an async exception prevent the put after a successful take?
Any help understanding the issue here is appreciated.
Let's compare the source of modifyMVar and modifyMVarMasked, since the code changed from using one to using the other:
modifyMVar m io =
  mask $ \restore -> do
    a <- takeMVar m
    (a',b) <- restore (io a) `onException` putMVar m a
    putMVar m a'
    return b

modifyMVarMasked m io =
  mask_ $ do
    a <- takeMVar m
    (a',b) <- io a `onException` putMVar m a
    putMVar m a'
    return b
The key here is that modifyMVar calls restore before executing its second argument, whereas modifyMVarMasked does not. So readMVar was not called under mask_ in the old version of the code as you claim in your question! It was called under restore, instead, and therefore asynchronous exceptions could be enabled after all.
Here's me working through it.
So in readMVar...
readMVar :: MVar a -> IO a
readMVar m =
  mask_ $ do
    a <- takeMVar m
    putMVar m a
    return a
...despite the mask_ the runtime may raise an exception in a blocked takeMVar. Note in that function there's no need to actually handle that case; either the readMVar worked, in which case we're safe from async exceptions, or the takeMVar never succeeds; either way we never break the MVar by leaving it empty. (Is this correct? This is what I took away from the answer to my own related question.)
modifyMVar and modifyMVarMasked are:
modifyMVar :: MVar a -> (a -> IO (a,b)) -> IO b
modifyMVar m io =
  mask $ \restore -> do
    a <- takeMVar m
    (a',b) <- restore (io a) `onException` putMVar m a
    putMVar m a'
    return b

modifyMVarMasked :: MVar a -> (a -> IO (a,b)) -> IO b
modifyMVarMasked m io =
  mask_ $ do
    a <- takeMVar m
    (a',b) <- io a `onException` putMVar m a
    putMVar m a'
    return b
...where the difference is that in modifyMVar the masking state is restored (i.e. async exceptions probably become unmasked) inside io a, which in our case is more or less readMVar.
EDIT: Although readMVar is mask_-ed as well, so now I can't see why either choice of modifyMVarMasked or modifyMVar would make a difference...
The comment seems to imply that yield (inserted into readMVar) is interruptible (I can't find this documented anywhere) and so an async exception might be raised, in which case readVar would be restored (in both current and pre-4.6 versions), but in a non-empty queue readers would see an empty one and block.
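For reference, here is a sketch of the expansion the Note describes, with the artificial yield made explicit (hypothetical code, not from base):

import Control.Concurrent (yield)
import Control.Concurrent.MVar (MVar, takeMVar, putMVar)
import Control.Exception (mask_)

-- Hypothetical expansion of readMVar as the Note describes. If yield is an
-- interruptible point here (as suggested above), an async exception delivered
-- at it leaves the MVar empty, and subsequent readers of the Chan deadlock.
readMVarWithYield :: MVar a -> IO a
readMVarWithYield m =
  mask_ $ do
    a <- takeMVar m
    yield  -- artificial interruptible point between take and put
    putMVar m a
    return a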
You may be interested in reading the GHC Trac ticket for this commit, which has a sample program that consistently reproduces this bug when both Control.Concurrent.Chan and the test program are compiled with -O0:
https://ghc.haskell.org/trac/ghc/ticket/6153
In a similar vein:
https://ghc.haskell.org/trac/ghc/ticket/5870

Help understanding MVar example in Haskell

I'm trying to understand the MVar example in the latest GHC docs:
data SkipChan a = SkipChan (MVar (a, [MVar ()])) (MVar ())

newSkipChan :: IO (SkipChan a)
newSkipChan = do
  sem <- newEmptyMVar
  main <- newMVar (undefined, [sem])
  return (SkipChan main sem)

putSkipChan :: SkipChan a -> a -> IO ()
putSkipChan (SkipChan main _) v = do
  (_, sems) <- takeMVar main
  putMVar main (v, [])
  mapM_ (\sem -> putMVar sem ()) sems

getSkipChan :: SkipChan a -> IO a
getSkipChan (SkipChan main sem) = do
  takeMVar sem
  (v, sems) <- takeMVar main
  putMVar main (v, sem:sems)
  return v

dupSkipChan :: SkipChan a -> IO (SkipChan a)
dupSkipChan (SkipChan main _) = do
  sem <- newEmptyMVar
  (v, sems) <- takeMVar main
  putMVar main (v, sem:sems)
  return (SkipChan main sem)
I understand most of the program, but I have two questions:
Are operations like putSkipChan atomic? It seems to avoid blocking on putMVar by first doing a takeMVar. But wouldn't that fail if something else calls putMVar after the takeMVar but before the putMVar? In such cases, it seems the program would block forever.
Why does dupSkipChan append sem to the list of semaphores in the SkipChan? Isn't that done by getSkipChan? It seems to me that calling dupSkipChan followed by getSkipChan (which seems to be what you have to do to have multiple readers) would cause a block when putSkipChan tries to wake up the same semaphore twice?
You are correct, another thread could call putMVar main and mess up putSkipChan. But the module containing the above code would not export the SkipChan constructor, so such a rogue operation would be impossible.
dupSkipChan makes a new empty MVar called sem and adds it to the list in main. It does not add the pre-existing one that was created in newSkipChan, so there is no block.
To explain more to other readers of this question and comment: The idea is that there may be multiple reader threads. Initially SkipChan main sem1 is the only such reader. dupSkipChan makes a SkipChan main sem2. If there are thousands of readers then you would not want to notify all of them of a new value in putSkipChan, thus the design is that getSkipChan puts its sem into the list in main. Initializing SkipChan as done in newSkipChan and dupSkipChan also includes putting the new empty sem into the list in main.
The above initialization and design mean that the first getSkipChan obtains the most recent value to have been written (or blocks until the first value arrives). Future getSkipChan calls on that SkipChan will always get a newer value than any gotten before, and they will not block if that value is already available.
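As an illustration, here is a minimal sketch of using the SkipChan API above with one writer and two readers (assuming the definitions above are in scope in the same module; the threadDelay values are arbitrary and only there to order the output):

import Control.Concurrent (forkIO, threadDelay)

main :: IO ()
main = do
  chan  <- newSkipChan
  chan2 <- dupSkipChan chan           -- a second, independent reader of the same channel
  _ <- forkIO $ getSkipChan chan  >>= putStrLn . ("reader 1 saw " ++)
  _ <- forkIO $ getSkipChan chan2 >>= putStrLn . ("reader 2 saw " ++)
  threadDelay 100000
  putSkipChan chan "hello"            -- fills both readers' semaphores
  threadDelay 100000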
