Selective send on TChan? - multithreading

AFAIK TChan acts as a hub, every message sent is seen by others right ?!
i want a TChan that acts as a switch to send a message to specific thread, and also support broadcasting.
is there such thing ?

Edit: I re-read your question. This answer doesn't quite address "selective send", though it clarifies what a TChan can do.
The "broadcast" approach described below will wake up all listeners (though on the bright side, it won't make 1000 copies of each item). To avoid this, use the Map approach as #Mikhail suggested. I did this in my chat server example.
A TChan is a FIFO queue:
writeTChan adds an item to the end.
readTChan reads an item from the beginning.
For example, the following example forks 10 threads which fight over a single channel:
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad
main = do
chan <- newTChanIO
forM_ [1..10] $ \i ->
forkIO $
forever $ do
x <- atomically $ readTChan chan
putStrLn $ "Thread " ++ show i ++ ": " ++ show x
mapM_ (atomically . writeTChan chan) [1..1000]
-- Wait for channel to empty out
atomically $ do
empty <- isEmptyTChan chan
when (not empty) retry
Here, each item is read by exactly one thread.
In contrast, the following example "broadcasts" a stream of items to 10 threads, by making ten copies of the channel using dupTChan:
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad
main = do
master <- newTChanIO
forM_ [1..10] $ \i -> do
chan <- atomically $ dupTChan master
forkIO $
forever $ do
x <- atomically $ readTChan chan
putStrLn $ "Thread " ++ show i ++ ": " ++ show x
mapM_ (atomically . writeTChan master) [1..100]
-- Give threads time to complete
threadDelay 1000000
Now each thread gets all of the items written to the channel.
A couple subtleties to note:
Items written to a channel prior to dupTChan will not appear in the new channel. If we called dupTChan from the child threads rather than the main thread, some writeTChans could happen first, meaning the children might not see all the items.
Since nobody is reading the master channel, items written to it will pile up and will likely not be garbage collected. To avoid this caveat, use newBroadcastTChan to create the master channel.

Related

How dangerous is forkProcess? How can I use it safely?

I’d like to play tricks with forkProcess, where I want to clone my Haskell process, and then let both clones talk to each other (maybe using Cloud Haskell to send even closures around).
But I wonder how well that works with the GHC runtime. Does anyone have experience here?
The documenation for forkProcess says that no other threads are copied, so I assume all data used by other threads will then be garbage collected in the fork, which sounds good. But that means that finalizers will run in both clone, which may or may not be the right thing to do…
I assume I can’t just use it without worry; but are there rules I can follow that will make sure its use is safe?
But that means that finalizers will run in both clone, which may or may not be the right thing to do…
Finalizers are very rarely used in Haskell, and even where they are used, I would expect them to only have in-process effects. For example, a finalizer calls hClose on garbage-collected Handles if you forgot to do it yourself. This is easy to demonstrate: the following program fails with openFile: resource exhausted (Too many open files), but if you uncomment the pure (), the Handles get garbage-collected and the program completes successfully.
import Control.Concurrent
import Control.Monad
import System.IO
import System.Mem
main :: IO ()
main = do
rs <- replicateM 1000 $ do
threadDelay 1000 -- not sure why did is needed; maybe to give control back
-- to the OS, so it can recycle the file descriptors?
performGC
openFile "input" ReadMode
--pure ()
print rs -- force all the Handles to still be alive by this point
File descriptors are process-owned and are copied by forkProcess, so it makes sense to have each clone close their copies.
The case which would be problematic is if a finalizer was cleaning up a system-owned resource, e.g. deleting a file. But I hope no library is relying on finalizers to delete such resources, because as the documentation explains, finalizers are not guaranteed to run. So it's better to use something like bracket to cleanup resources (although the cleanup is still not guaranteed, e.g. if bracket is used from a thread).
What the documentation for forkProcess is warning about is not finalizers, but the fact that other threads will appear to end abruptly inside the forked process. This is especially problematic if those threads are holding locks. Normally, two threads can use modifyMVar_ to ensure that only one thread at a time is running a critical section, and as long as each thread is only holding the lock for a finite amount of time, the other thread can simply wait for the MVar to become available. If you call forkProcess while one thread is in the middle of a modifyMVar_, however, that thread will not continue in the cloned process, and so the cloned process cannot simply call modifyMVar_ or it could get stuck forever while waiting for a non-existing thread to release the lock. Here is a program demonstrating the problem.
import Control.Concurrent
import Control.Monad
import System.Posix.Process
-- >>> main
-- (69216,"forkIO thread",0)
-- (69216,"main thread",1)
-- (69216,"forkIO thread",2)
-- (69216,"main thread",3)
-- (69216,"forkIO thread",4)
-- (69216,"main thread",5)
-- calling forkProcess
-- forkProcess main thread waiting for MVar...
-- (69216,"forkIO thread",6)
-- (69216,"original main thread",7)
-- (69216,"forkIO thread",8)
-- (69216,"original main thread",9)
-- (69216,"forkIO thread",10)
-- (69216,"original main thread",11)
main :: IO ()
main = do
mvar <- newMVar (0 :: Int)
_ <- forkIO $ replicateM_ 6 $ do
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "forkIO thread", i)
pure (i+1)
threadDelay 50000
replicateM_ 3 $ do
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "main thread", i)
pure (i+1)
putStrLn "calling forkProcess"
_ <- forkProcess $ do
threadDelay 25000
replicateM_ 3 $ do
putStrLn "forkProcess main thread waiting for MVar..."
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "forkProcess main thread", i)
pure (i+1)
replicateM_ 3 $ do
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "original main thread", i)
pure (i+1)
threadDelay 100000
As the output shows, the forkProcess main thread gets stuck waiting forever for the MVar, and never prints the forkProcess main thread line. If you move the threadDelays outside the modifyMVar_ critical section, the forkIO thread is a lot less likely to be in the middle of that critical section when forkProcess is called, so you'll see an output which looks like this instead:
(69369,"forkIO thread",0)
(69369,"main thread",1)
(69369,"forkIO thread",2)
(69369,"main thread",3)
(69369,"forkIO thread",4)
(69369,"main thread",5)
calling forkProcess
(69369,"forkIO thread",6)
(69369,"original main thread",7)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",6)
(69369,"forkIO thread",8)
(69369,"original main thread",9)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",7)
(69369,"forkIO thread",10)
(69369,"original main thread",11)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",8)
After the forkProcess call, there are now two MVars which both hold the value 5, and so in the original process, original main thread and forkIO thread are both incrementing one MVar, while in the other process forkProcess main thread is incrementing the other.

Haskell: TMVar vs MVar

I want a small operation where one thread adds to a shared state some value, while another thread takes the value out and prints it. Here are two versions using TMVar and MVar respectively. The TMVar version is not working somehow, it keeps printing out the first value. What is the problem in the STM first version? How to fix the first TMVar version to make it work?
import Control.Concurrent (forkIO, takeMVar,newEmptyMVar,putMVar)
import Control.Monad (forM_, replicateM_)
import Control.Concurrent.STM (atomically, readTMVar, putTMVar, newEmptyTMVarIO)
n=10
main = do
mvar<- newEmptyTMVarIO
forkIO $ do
forM_ [1..n] $ \x-> atomically $ do
putTMVar mvar $! x
replicateM_ n $ do
a<- atomically $ readTMVar mvar
print $ show a
main2 = do
mvar<- newEmptyMVar
forkIO $ do
mapM_ (\x-> putMVar mvar x) [1..n]
replicateM_ n $ do
a<- takeMVar mvar
print $ show a
You're using readTMVar, which just looks at what's in the TMVar. I imagine you mean to use takeTMVar to give the other thread a chance to put something new in it.

STM with partial atomicity for certain TVars

I am doing things with STM and have among other things used the TBQueue data structure with great success. An useful feature I've been using it for involves reading from it based on a precondition in a TVar, basically like so:
shouldRead <- readTVar shouldReadVar
if shouldRead
then do
a <- readTBQueue queue
doSomethingWith a
else doSomethingElse
If we assume that queue is empty and shouldReadVar contains True before executing this block, it will result in readTBQueue calling retry, and the block will be re-executed when shouldReadVar contains False or queue contains an element, whatever happens first.
I am now in need of a synchronous channel data structure, similar to the structure described in this article (Please read it if you want to understand this question), except it needs to be readable with a pre-condition like in the previous example, and possibly compose with other stuff as well.
Let's call this data structure SyncChan with writeSyncChan and readSyncChan operations defined on it.
And here's a possible use case: This (pseudo) code (which will not work because I mix STM/IO concepts):
shouldRead <- readTVar shouldReadVar
if shouldRead
then do
a <- readSyncChan syncChan
doSomethingWith a
else doSomethingElse
Assuming that no other thread is currently blocking on a writeSyncChan call, and shouldReadChan contains True, I want the block to "retry" until either shouldReadChan contains False, or a different thread blocks on a writeSyncChan. In other words: when one thread retrys on writeSyncChan and another thread blocks reaches a readSyncChan, or vice versa, I want the value to be transferred along the channel. In all other cases, both sides should be in a retry state and thus react to a change in shouldReadVar, so that the read or write can be cancelled.
The naïve approach described in the article linked above using two (T)MVars is of course not possible. Because the data structure is synchronous, it is impossible to use it within two atomically blocks, because you cannot change one TMVar and wait for another TMVar to be changed in an atomic context.
Instead, I am looking for a kind of partial atomicity, where I can "commit" a certain part of a transaction and only roll it back when certain variables change, but not others. If I have "msg" and "ack" variables like the first example in the article above, I want to be able to write to the "msg" variable, then wait for either a value to arrive on "ack", or for my other transactional variables to change. If other transactional variables change, the whole atomic block should be retried, and if an "ack" value arrives, the transaction should continue as it was in the previous state. For the reading side, something similar should happen, except I would of course be reading from "msg" and writing to "ack."
Is this possible to do using GHC STM, or do I need to do manual MVar/rollback handling?
This is what you want:
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad
data SyncChan a = SyncChan (TMVar a) (TMVar ())
newSyncChan :: IO (SyncChan a)
newSyncChan = do
msg <- newEmptyTMVarIO
ack <- newEmptyTMVarIO
return (SyncChan msg ack)
readIf :: SyncChan a -> TVar Bool -> STM (Maybe a)
readIf (SyncChan msg ack) shouldReadVar = do
b <- readTVar shouldReadVar
if b
then do
a <- takeTMVar msg
putTMVar ack ()
return (Just a)
else return Nothing
write :: SyncChan a -> a -> IO ()
write (SyncChan msg ack) a = do
atomically $ putTMVar msg a
atomically $ takeTMVar ack
main = do
sc <- newSyncChan
tv <- newTVarIO True
forkIO $ forever $ forM_ [False, True] $ \b -> do
threadDelay 2000000
atomically $ writeTVar tv b
forkIO $ forM_ [0..] $ \i -> do
putStrLn "Writing..."
write sc i
putStrLn "Write Complete"
threadDelay 300000
forever $ do
putStrLn "Reading..."
a <- atomically $ readIf sc tv
print a
putStrLn "Read Complete"
This gives the behavior you had in mind. While the TVar is True the input and output ends will be synchronized with each other. When the TVar switches to False then the read end freely aborts and returns Nothing.

Killing a thread when MVar is garbage collected

I have a worker thread which reads data repeatedly from an MVar and performs some useful work on that. After a while, the rest of the program forgets about that worker thread, which means that it will wait on an empty MVar and become very lonely. My question is:
Will the MVar be garbage collected if threads no longer write to it, for instance because they all wait for it?
Will garbage collection kill the waiting threads?
If neither, can I somehow indicate to the compiler that the MVar should be garbage collected and the thread be killed?
EDIT: I should probably clarify the purpose of my question. I don't desire general protection against deadlocks; instead, what I would like to do is to tie the life of the worker thread to life of a value (as in: dead values are claimed by garbage collection). In other words, the worker thread is a resource that I would like to free not by hand, but when a certain value (the MVar or a derivative) is garbage collected.
Here an example program that demonstrates what I have in mind
import Control.Concurrent
import Control.Concurrent.MVar
main = do
something
-- the thread forked in something can be killed here
-- because the MVar used for communication is no longer in scope
etc
something = do
v <- newEmptyMVar
forkIO $ forever $ work =<< takeMVar v
putMVar v "Haskell"
putMVar v "42"
In other words, I want the thread to be killed when I can no longer communicate with it, i.e. when the MVar used for communication is no longer in scope. How to do that?
It will just work: when the MVar is only reachable by the thread that is blocked on it, then the thread is sent the BlockedIndefinitelyOnMVar exception, which will normally cause it to die silently (the default exception handler for a thread ignores this exception).
BTW, for doing some cleanup when the thread dies, you'll want to use forkFinally (which I just added to Control.Concurrent).
If you're lucky, you'll get a "BlockedIndefinitelyOnMVar", indicating that you're waiting on an MVar that no thread will ever write to.
But, to quote Ed Yang,
GHC only knows that a thread can be considered garbage if there are no
references to the thread. Who is holding a reference to the thread?
The MVar, as the thread is blocking on this data structure and has
added itself to the blocking list of this. Who is keeping the MVar
alive? Why, our closure that contains a call to takeMVar. So the
thread stays.
without a bit of work (which would be, by the way, quite interesting to see), BlockedIndefinitelyOnMVar is not an obviously useful mechanism for giving your Haskell programs deadlock protection.
GHC just can't solve the problem in general of knowing whether your thread will make progress.
A better approach would be to explicitly terminate threads by sending them a Done message. E.g. just lift your message type into an optional value that also includes an end-of-message value:
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad
import Control.Exception
import Prelude hiding (catch)
main = do
something
threadDelay (10 * 10^6)
print "Still here"
something = do
v <- newEmptyMVar
forkIO $
finally
(let go = do x <- takeMVar v
case x of
Nothing -> return ()
Just v -> print v >> go
in go)
(print "Done!")
putMVar v $ Just "Haskell"
putMVar v $ Just "42"
putMVar v Nothing
and we get the correct clean up:
$ ./A
"Haskell"
"42"
"Done!"
"Still here"
I tested the simple weak MVar and it did get finalized and killed. The code is:
import Control.Monad
import Control.Exception
import Control.Concurrent
import Control.Concurrent.MVar
import System.Mem(performGC)
import System.Mem.Weak
dologger :: MVar String -> IO ()
dologger mv = do
tid <- myThreadId
weak <- mkWeakPtr mv (Just (putStrLn "X" >> killThread tid))
logger weak
logger :: Weak (MVar String) -> IO ()
logger weak = act where
act = do
v <- deRefWeak weak
case v of
Just mv -> do
a <- try (takeMVar mv) :: IO (Either SomeException String)
print a
either (\_ -> return ()) (\_ -> act) a
Nothing -> return ()
play mv = act where
act = do
c <- getLine
if c=="quit" then return ()
else putMVar mv c >> act
doplay mv = do
forkIO (dologger mv)
play mv
main = do
putStrLn "Enter a string to escape, or quit to exit"
mv <- newEmptyMVar
doplay mv
putStrLn "*"
performGC
putStrLn "*"
yield
putStrLn "*"
threadDelay (10^6)
putStrLn "*"
The session with the program was:
(chrisk)-(/tmp)
(! 624)-> ghc -threaded -rtsopts --make weak2.hs
[1 of 1] Compiling Main ( weak2.hs, weak2.o )
Linking weak2 ...
(chrisk)-(/tmp)
(! 625)-> ./weak2 +RTS -N4 -RTS
Enter a string to escape, or quit to exit
This is a test
Right "This is a test"
Tab Tab
Right "Tab\tTab"
quit
*
*
X
*
Left thread killed
*
So blocking on takeMVar did not keep the MVar alive on ghc-7.4.1 despite expectations.
While BlockedIndefinitelyOnMVar should work, also consider using ForeignPointer finalizers. The normal role of those is to delete C structures that are no longer accessible in Haskell. However, you can attach any IO finalizer to them.

Compilation error: "The last statement in a 'do' construct must be an expression"

The following is my dining philosophers code and yields a compilation error saying "The last statement in a 'do' construct must be an expression: mVar2 <- newEmptyMVar mVar3"
Can Somebody help me fix this error and get this program working? thank you
import Control.Concurrent
import Control.Concurrent.MVar
import System.Random
takefork :: Int -> forks -> IO ()
takefork n forks = takeMVar (forks!!n)
releasefork :: Int -> forks -> IO ()
releasefork n forks = putMVar (forks!!n)
philosopher :: [Int]
philosopher = [1,2,3,4,5]
forks :: [MVar] -> [Int]
forks = do
takefork n ( philosopher - 1)
threadDelay delay
let delay = 100000
takefork n philosopher
putStrLn("Philosopher" ++ philosopher ++ "has started eating")
releasefork n philosopher
releasefork n ( philosopher - 1)
ptStrLn ("Philosopher" ++ philosopher ++ "has stopped eating")
forks
main :: IO ()
main = do
mVar1 <- newEmptyMVar
mVar2 <- newEmptyMVar
mVar3 <- newEmptyMVar
mVar4 <- newEmptyMVar
mVar5 <- newEmptyMVar
let mVar = [mVar1, mVar2, mVar3, mVar4, mVar5]
sequence_ [ forkIO forks (mVar philosopher) ]
There are many problems with your code.
The error message you report indicates you are probably mixing spaces and tabs. Get rid of the tabs and use only spaces.
You are presumably writing this program in order to practice writing Haskell programs, not in order to run the program for fun and profit. So we don't want to simply give you a working Dining Philosophers implementation, we want to help you write your implementation.
I cannot tell from your code how you expect it to work.
I'm going to focus on the last line:
sequence_ [ forkIO forks (mVar philosopher) ]
sequence_ :: [IO a] -> IO () --- give sequence_ a list of i/o actions, and it (returns an i/o action that) performs each action in order. From the [...], it looks like you are trying to give it a list, but with only one element. This is probably not what you mean.
forkIO :: IO () -> IO ThreadID --- give forkIO an i/o action, and it (returns an i/o action that) starts that i/o action running in a new thread, giving you the id of that thread.
There are two problems here:
forks is a function, not an i/o action (it's not even a function that returns an i/o action, though you probably mean it to be)
you give forkIO a second argunment ((mVar philosopher)), but it only takes one argument
mVar philosopher itself doesn't make any sense: mVar :: [MVar a] (it's a list of MVars, and I haven't worked out what type the MVars are supposed to contain) but you treat it like a function, passing it philosopher as an argument.
At this point a lightbulb blinks on above my head. You wish to call forks with parameters mVar and philosopher?
sequence_ [ forkIO (forks mVar philosopher) ]
We're still sequencing a single action though. Perhaps you wish to call forks with each element of philosopher in turn?
sequence_ $ map (\n -> forkIO (forks mVar n)) philosopher
We can simplify this to
mapM_ (\n -> forkIO (forks mVar n)) philosopher
This doesn't match up with the type you given forks :: [MVar] -> [Int]. But that's probably wrong, so you'll want to fix that function next.

Resources