STM and atomically: why the semantic of these two programs differ? - multithreading

Let's consider this simple Haskell program:
module Main where
import Control.Concurrent.STM
import Control.Concurrent
import Control.Exception
import Control.Monad
import Data.Maybe
import Data.Monoid
import Control.Applicative
terminator :: Either SomeException () -> IO ()
terminator r = print $ "Dying with " <> show r
doStuff :: TMVar () -> TChan () -> Int -> IO ()
doStuff writeToken barrier w = void $ flip forkFinally terminator $ do
hasWriteToken <- isJust <$> atomically (tryTakeTMVar writeToken)
case hasWriteToken of
True -> do
print $ show w <> "I'm the lead.."
threadDelay (5 * 10^6)
print "Done heavy work"
atomically $ writeTChan barrier ()
False -> do
print $ show w <> " I'm the worker, waiting for the barrier..."
myChan <- atomically $ dupTChan barrier
_ <- atomically $ readTChan myChan
print "Unlocked!"
main :: IO ()
main = do
writeToken <- newTMVarIO ()
barrier <- newBroadcastTChanIO
_ <- forM [1..20] (doStuff writeToken barrier)
threadDelay (20 * 10^6)
return ()
It essentially model a concurrency scenario where a "lead" acquire the write token, do something and the workers will sync on a barrier and way for the "green light" from the lead. This works, but if we replace worker "atomically" block with this:
_ <- atomically $ do
myChan <- dupTChan barrier
readTChan myChan
All the workers remains blocked indefinitely inside a STM transaction:
"Done heavy work"
"Dying with Right ()"
"Dying with Left thread blocked indefinitely in an STM transaction"
"Dying with Left thread blocked indefinitely in an STM transaction"
"Dying with Left thread blocked indefinitely in an STM transaction"
...
I suspect the key lies inside the semantic of atomically. Any idea?
Thanks!
Alfredo

I think this behavior comes from the definition of dupTChan. Copied here for readability, along with readTChan
dupTChan :: TChan a -> STM (TChan a)
dupTChan (TChan _read write) = do
hole <- readTVar write
new_read <- newTVar hole
return (TChan new_read write)
readTChan :: TChan a -> STM a
readTChan (TChan read _write) = do
listhead <- readTVar read
head <- readTVar listhead
case head of
TNil -> retry
TCons a tail -> do
writeTVar read tail
return a
inlining those functions, we get this STM block:
worker_block (TChan _read write) = do
hole <- readTVar write
new_read <- newTVar hole
listhead <- readTVar new_read
head <- readTVar listhead
case head of
TNil -> retry
...
When you try to run this block atomically, we make a new read_end from the tail of the channel, then call readTVar on it. The tail is of course empty, so this readTVar will retry. However, when the lead writes to the channel, the act of writing to the channel invalidates this transaction! So every follower transaction will always have to retry.
In fact, I don't think there is any case where dupTChan >>= readTChan will ever result in anything other than the thread being blocked indefinitely on an STM transaction. You can reason this out from the documentation as well. dupTChan begins empty, so within a single atomic transaction it will never have any items unless that same transaction adds them.

Related

Behaviour of withMVar and bracket_ different

This ensures mutual exclusion of actions:
do
lock <- newMVar ()
let atomicPrint = withMVar lock . const . print
mapM_ (forkIO . atomicPrint) [['1'..'8'],['a'..'h']]
This doesn't:
do
lock <- newMVar ()
let atomicPrint x = bracket_ (takeMVar lock) (print x) (putMVar lock ())
mapM_ (forkIO . atomicPrint) [['1'..'8'],['a'..'h']]
Can you explain why? By the definitions of withMVar and bracket_ on Hackage and equational reasoning I came to the conclusion that the two pieces of code should do the same. But running in GHCi proves me wrong.
You have the order wrong; You are acquiring the lock and immediately releasing it; whereas the signature says:
bracket_
:: IO a -- computation to run first ("acquire resource")
-> IO b -- computation to run last ("release resource")
-> IO c -- computation to run in-between
-> IO c -- returns the value from the in-between computation
so it should be:
bracket_ (takeMVar lock) (putMVar lock ()) $ print x

How to get input for a game loop in haskell

http://pastebin.com/2CS1k1Zq
In this game i need to get step the game forward every half a second or so while occasionally getting input to change direction. These too things seem impossible to do with haskell is there a way to do it? Currently I am having an mv tread stall exception.
Update: Found the hWaitForInput function in System.IO which is essentially the same as waitFor.
Here is some code largely based on this answer.
The main difference I made is that the thread waiting for a key press does not perform the getChar directly. The result communicated in the MVar is an indication of timeout or that a key press has occurred. It is the responsibility of the main thread to actually get the character. This prevents a possible race condition in case the char reading thread is killed between getting the character and putting it into the MVar.
import Control.Concurrent
import Control.Monad
import Data.Maybe
import System.IO
import Control.Exception
data Event = CharReady | TimedOut
withRawStdin :: IO a -> IO a
withRawStdin = bracket uncook restore . const
where
uncook = do
oldBuffering <- hGetBuffering stdin
oldEcho <- hGetEcho stdin
hSetBuffering stdin NoBuffering
hSetEcho stdin False
return (oldBuffering, oldEcho)
restore (oldBuffering, oldEcho) = do
hSetBuffering stdin oldBuffering
hSetEcho stdin oldEcho
waitFor :: Int -> IO Event
waitFor delay = do
done <- newEmptyMVar
withRawStdin . bracket (start done) cleanUp $ \_ -> takeMVar done
where
start done = do
t1 <- forkIO $ hLookAhead stdin >> putMVar done CharReady
t2 <- forkIO $ threadDelay delay >> putMVar done TimedOut
return (t1, t2)
cleanUp (t1, t2) = do
killThread t1
killThread t2
loop state = do
if state <= 0
then putStrLn "Game over."
else do putStrLn $ "Rounds to go: " ++ show state
e <- waitFor 3000000
case e of
TimedOut -> do putStrLn "Too late!"; loop state
CharReady -> do c <- getChar -- should not block
if c == 'x'
then do putStrLn "Good job!"; loop (state-1)
else do putStrLn "Wrong key"; loop state
main = loop 3

Using TChan with Timeout

I have a TChan as input for a thread which should behave like this:
If sombody writes to the TChan within a specific time, the content should be retrieved. If there is nothing written within the specified time, it should unblock and continue with Nothing.
My attempt on this was to use the timeout function from System.Timeout like this:
timeout 1000000 $ atomically $ readTChan pktChannel
This seemed to work but now I discovered, that I am sometimes loosing packets (they are written to the channel, but not read on the other side. In the log I get this:
2014.063.11.53.43.588365 Pushing Recorded Packet: 2 1439
2014.063.11.53.43.592319 Run into timeout
2014.063.11.53.44.593396 Run into timeout
2014.063.11.53.44.593553 Pushing Recorded Packet: 3 1439
2014.063.11.53.44.597177 Sending Recorded Packet: 3 1439
Where "Pushing Recorded Packet" is the writing from the one thread and "Sending Recorded Packet" is the reading from the TChan in the sender thread. The line with Sending Recorded Packet 2 1439 is missing, which would indicate a successful read from the TChan.
It seems that if the timeout is received at the wrong point in time, the channel looses the packet. I suspect that the threadKill function used inside timeout and STM don't play well together.
Is this correct? Does somebody have another solution that does not loose the packet?
Use registerDelay, an STM function, to signal a TVar when the timeout is reached. You can then use the orElse function or the Alternative operator <|> to select between the next TChan value or the timeout.
import Control.Applicative
import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM
import System.Random
-- write random values after a random delay
packetWriter :: Int -> TChan Int -> IO ()
packetWriter maxDelay chan = do
let xs = randomRs (10000 :: Int, maxDelay + 50000) (mkStdGen 24036583)
forM_ xs $ \ x -> do
threadDelay x
atomically $ writeTChan chan x
-- block (retry) until the delay TVar is set to True
fini :: TVar Bool -> STM ()
fini = check <=< readTVar
-- Read the next value from a TChan or timeout
readTChanTimeout :: Int -> TChan a -> IO (Maybe a)
readTChanTimeout timeoutAfter pktChannel = do
delay <- registerDelay timeoutAfter
atomically $
Just <$> readTChan pktChannel
<|> Nothing <$ fini delay
-- | Print packets until a timeout is reached
readLoop :: Show a => Int -> TChan a -> IO ()
readLoop timeoutAfter pktChannel = do
res <- readTChanTimeout timeoutAfter pktChannel
case res of
Nothing -> putStrLn "timeout"
Just val -> do
putStrLn $ "packet: " ++ show val
readLoop timeoutAfter pktChannel
main :: IO ()
main = do
let timeoutAfter = 1000000
-- spin up a packet writer simulation
pktChannel <- newTChanIO
tid <- forkIO $ packetWriter timeoutAfter pktChannel
readLoop timeoutAfter pktChannel
killThread tid
The thumb rule of concurrency is: if adding a sleep in some point inside an IO action matters, your program is not safe.
To understand why the code timeout 1000000 $ atomically $ readTChan pktChannel does not work, consider the following alternative implementation of atomically:
atomically' :: STM a -> IO a
atomically' action = do
result <- atomically action
threadDelay someTimeAmount
return result
The above is equal to atomically, but for an extra innocent delay. Now it is easy to see that if timeout kills the thread during the threadDelay, the atomic action has completed (consuming a message from the channel), yet timeout will return Nothing.
A simple fix to timeout n $ atomically ... could be the following
smartTimeout :: Int -> STM a -> IO (Maybe a)
smartTimeout n action = do
v <- atomically $ newEmptyTMvar
_ <- timeout n $ atomically $ do
result <- action
putTMvar v result
atomically $ tryTakeTMvar v
The above uses an extra transactional variable v to do the trick. The result value of the action is stored into v inside the same atomic block in which the action is run. The return value of timeout is not trusted, since it does not tell us if action was run or not. After that, we check the TMVar v, which will be full if and only if action was run.
Instead of TChan a, use TChan (Maybe a) . Your normal producer (of x) now writes Just x. Fork an extra "ticking" process that writes Nothing to the channel (every x seconds). Then have a reader for the channel, and abort if you get two successive Nothing. This way, you avoid exceptions, which may cause data to get lost in your case (but I am not sure).

Stop threads from interleaving output

The following program creates two threads running concurrently, that each sleep for a random amount of time, before printing a line of text to stdout.
import Control.Concurrent
import Control.Monad
import System.Random
randomDelay t = randomRIO (0, t) >>= threadDelay
printer str = forkIO . forever $ do
randomDelay 1000000 -- μs
putStrLn str
main = do
printer "Hello"
printer "World"
return ()
The output generally looks something like
>> main
Hello
World
World
Hello
WoHrelld
o
World
Hello
*Interrupted
>>
How do you ensure that only one thread can write to stdout at a time? This seems like the kind of thing that STM should be good at, but all STM transactions must have the type STM a for some a, and an action that prints to the screen has type IO a, and there doesn't seem to be a way to embed IO into STM.
The way to handle output with STM is to have an output queue that is shared between all threads and which is processed by a single thread.
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad
import System.Random
randomDelay t = randomRIO (0, t) >>= threadDelay
printer queue str = forkIO . forever $ do
randomDelay 1000000 -- μs
atomically $ writeTChan queue str
prepareOutputQueue = do
queue <- newTChanIO
forkIO . forever $ atomically (readTChan queue) >>= putStrLn
return queue
main = do
queue <- prepareOutputQueue
printer queue "Hello"
printer queue "World"
return ()
Locking in the way you're describing isn't possible usingSTM. This is because STM is based on optimistic locking and so every transaction must be restartable at any point. If you embedded an IO operation into STM, it could be executed multiple times.
Probably the easiest solution for this problem is to use a MVar as a lock:
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad
import System.Random
randomDelay t = randomRIO (0, t) >>= threadDelay
printer lock str = forkIO . forever $ do
randomDelay 1000000
withMVar lock (\_ -> putStrLn str)
main = do
lock <- newMVar ()
printer lock "Hello"
printer lock "World"
return ()
In this solution the lock is passed as an argument to printer.
Some people prefer to declare the lock as a top-level global variable, but currently this requires unsafePerformIO and relies on properties of GHC that AFAIK aren't part of the Haskell Language Report (in particular, it relies on the fact that a global variable with non-polymorphic type is evaluated at most once during the execution of a program).
A bit of research, based on Petr Pudlák's answer shows that there is a module Control.Concurrent.Lock in the concurrent-extra package that provides an abstraction around MVar ()-based locks.
The solution using that library is
import Control.Concurrent
import qualified Control.Concurrent.Lock as Lock
import Control.Monad
import System.Random
randomDelay t = randomRIO (0, t) >>= threadDelay
printer lock str = forkIO . forever $ do
randomDelay 1000
Lock.with lock (putStrLn str)
main = do
lock <- Lock.new
printer lock "Hello"
printer lock "World"
return ()
This is the example using global lock as mentioned by Petr.
import Control.Concurrent
import Control.Monad
import System.Random
import Control.Concurrent.MVar (newMVar, takeMVar, putMVar, MVar)
import System.IO.Unsafe (unsafePerformIO)
{-# NOINLINE lock #-}
lock :: MVar ()
lock = unsafePerformIO $ newMVar ()
printer x = forkIO . forever $ do
randomDelay 100000
() <- takeMVar lock
let atomicPutStrLn str = putStrLn str >> putMVar lock ()
atomicPutStrLn x
randomDelay t = randomRIO (0, t) >>= threadDelay
main = do
printer "Hello"
printer "World"
return ()
You can actually implement a lock using STM if you want, though an MVar will almost certainly perform better.
newtype Lock = Lock (TVar Status)
data Status = Locked | Unlocked
newLocked :: IO Lock
newLocked = Lock <$> newTVarIO Locked
newUnlocked :: IO Lock
newUnlocked = Lock <$> newTVarIO Unlocked
-- | Acquire a lock.
acquire :: Lock -> IO ()
acquire (Lock tv) = atomically $
readTVar tv >>= \case
Locked -> retry
Unlocked -> writeTVar tv Locked
-- | Try to acquire a lock. If the operation succeeds,
-- return `True`.
tryAcquire :: Lock -> IO Bool
tryAcquire (Lock tv) = atomically $
readTVar tv >>= \case
Locked -> pure False
Unlocked -> True <$ writeTVar tv Locked
-- | Release a lock. This version throws an exception
-- if the lock is unlocked.
release :: Lock -> IO ()
release (Lock tv) = atomically $
readTVar tv >>= \case
Unlocked -> throwSTM DoubleRelease
Locked -> writeTVar tv Unlocked
data DoubleRelease = DoubleRelease deriving Show
instance Exception DoubleRelease where
displayException ~DoubleRelease = "Attempted to release an unlocked lock."
-- | Release a lock. This version does nothing if
-- the lock is unlocked.
releaseIdempotent :: Lock -> IO ()
releaseIdempotent (Lock tv) = atomically $ writeTVar tv Unlocked
-- | Get the status of a lock.
isLocked :: Lock -> IO Status
isLocked (Lock tv) = readTVarIO tv
acquire/release pairs need careful masking and exception handling, much like primitive MVar operations. The documentation suggests, but does not actually state, that STM operations are interruptible when they retry; assuming this is true, the same approach used for withMVar will work here. Note: I've opened a GHC ticket to document retry interruptibility.

How to use System.IO.Unsafe together with TVars?

I would like to call a UDP send function within an STM transaction so that I can avoid code like below where m' is read (and could be updated by an other thread) before the values are eventually sent & where two consecutive where clauses make me look quite "helpless".
sendRecv s newmsgs q m = do
m' <- atomically $ readTVar m
time <- getPOSIXTime
result <- appendMsg newmsgs key m
when (result > 0) (atomically $ do
mT <- readTVar m
qT <- readTVar q
--let Just messages = Map.lookup key mT in sendq s (B.pack $ unwords messages) "192.168.1.1" 4711
let mT' = Map.delete key mT
qT' = PSQ.delete key qT
writeTVar q (PSQ.insert key time qT')
writeTVar m (Map.insert key [newmsgs] mT'))
when (result > 0) (let Just messages = Map.lookup key m' in sendq s (B.pack $ unwords messages) "192.168.1.1" 4711)
sendq :: Socket -> B.ByteString -> String -> PortNumber -> IO ()
sendq s datastring host port = do
hostAddr <- inet_addr host
sendAllTo s datastring (SockAddrInet port hostAddr)
return ()
I thought that by invoking TVars with newTVarIO and using import System.IO.Unsafe I could eventually use unsafePerformIO somewhere and call my sendq function (that returns IO() ) from within the transaction.
However, I do not find where this "somewhere" is? Is it at the creation of the TVar? Is it instead of atomically $ do? Do I understand the sense an applicability of unsafePerformIO wrong?
IO cannot be done from inside an STM block, because general IO cannot be undone. If you want to do some IO, you must schedule it in the STM block, but do it outside. For example:
foo tvar = do
scheduledAction <- atomically $ do
v <- readTVar tvar
when v retry
return (sendSomethingOnASocket "okay, we're done here")
scheduledAction
If you really need to do IO within a transaction, there's unsafeIOToSTM :: IO a -> STM a, however you should make sure to read the documentation first, as there are several gotchas to be aware of. In particular, the IO action may be run several times if the transaction has to be retried.
That said, I don't think that is appropriate in this case, and you should probably refactor your code so that the message is sent outside the transaction.

Resources