Haskell Job Pool - multithreading

I'm interested in Haskell programming, but I'd want to create a job pool system, and I wonder if that will be a problem in Haskell.
Below is a simple program in Ruby. On one thread of execution, words are taken from a user and added to a list. On another thread, words are taken from the list and processed in some way (in this case, reversed and printed back to the user).
words = []
# Create new thread to take words from array, one at a time, and process them
t = Thread.new {
  loop do
    unless words.empty?
      word = words.pop
      break if word == 'quit'
      sleep 1
      puts word.reverse
    end
  end
}
# Take words from user and add to array
loop do
  puts "Enter word:"
  word = gets.chomp
  words << word
  break if word == 'quit'
end
t.join
What is the equivalent Haskell code?

Here's a pretty close translation.
Chan is a FIFO queue for message passing between Haskell threads.
Below I use an MVar to wait until the spooler has exited. An MVar is like a regular mutable variable, but it is protected with a mutex. It can either be empty (only a put is allowed, takes wait) or full (only a take is allowed, puts wait).
I also use Haskell threads below, which might be run on separate OS-level threads or not -- the Haskell runtime chooses that. Haskell threads are very cheap, compared with OS threads.
See e.g. Real World Haskell for more discussion.
{-# OPTIONS_GHC -Wall #-}
module Main where
import Control.Monad (when)
import Control.Concurrent
-- Reads words from the channel until "quit" arrives, then signals the MVar
spooler :: Chan String -> MVar () -> IO ()
spooler ch stop = do
  word <- readChan ch
  if word == "quit"
    then putMVar stop ()
    else do
      threadDelay 1000000 -- microseconds
      putStrLn (reverse word)
      spooler ch stop
main :: IO ()
main = do
  stop <- newEmptyMVar
  ch <- newChan
  _ <- forkIO $ spooler ch stop
  let loop = do
        word <- getLine
        writeChan ch word
        when (word /= "quit") loop
  loop
  takeMVar stop -- block until the spooler has seen "quit"
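The translation above uses a single spooler thread. If several workers should share one queue, the same Chan and MVar pieces can be extended roughly as follows. This is only a sketch: worker, runPool and demo are names made up for illustration, and Nothing is used as an assumed stop marker in place of the "quit" string.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM, replicateM_)
import Data.List (sort)

-- One worker: take jobs until a Nothing arrives, then signal completion.
worker :: Chan (Maybe String) -> Chan String -> MVar () -> IO ()
worker jobs results done = loop
  where
    loop = do
      job <- readChan jobs
      case job of
        Nothing   -> putMVar done ()
        Just word -> writeChan results (reverse word) >> loop

-- Hypothetical helper: process all words on n workers and collect the results.
-- The order of the results depends on scheduling.
runPool :: Int -> [String] -> IO [String]
runPool n ws = do
  jobs    <- newChan
  results <- newChan
  dones   <- replicateM n newEmptyMVar
  forM_ dones $ \d -> forkIO (worker jobs results d)
  forM_ ws (writeChan jobs . Just)
  replicateM_ n (writeChan jobs Nothing)  -- one stop marker per worker
  forM_ dones takeMVar                    -- wait for every worker to exit
  replicateM (length ws) (readChan results)

-- Small driver; results are sorted because their arrival order is not fixed.
demo :: IO ()
demo = runPool 3 ["hello", "job", "pool"] >>= print . sort
```

Loading this in GHCi and running demo prints the reversed words. Because Chan is unbounded, workers never block when writing results, so waiting for all the done signals before reading the results is safe here.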

Related

Endless loop and a break for TUI in Haskell

I want to listen for keypresses and, depending on those, use commands from the System.Console.ANSI package to manipulate the console interface for my program.
In Python I would do this:
while True:
    read_from_console()
    if condition:
        print_stuff_into_console()
        break
How do I approach such a task in Haskell, in the simplest possible way?
Thanks

The equivalent abstract pseudo-ish code in Haskell would look like:
loop = do
  line <- readFromConsole
  if condition line
    then do
      printStuffToConsole
      loop -- Recurse - i.e. repeat the same thing again
    else
      pure () -- Don't recurse - the function execution ends
main = loop
But of course the devil would be in how readFromConsole and printStuffToConsole look. And these really depend on what exactly you'd like to do.
I will offer the dumbest possible implementation, just to illustrate how everything works and to build a complete program.
Let's say that "read from console" just means having the user enter a line of text and press Enter. For that, you can use the getLine function:
readFromConsole = getLine
And let's say you want to print the same thing every time. For printing, you can use the putStrLn function:
printStuffToConsole = putStrLn "Give me another!"
And then let's say that the condition for stopping is that the user enters "STOP". This can be expressed with a string comparison:
condition line = line /= "STOP"
If you put all of that together, you get a complete program:
loop = do
  line <- readFromConsole
  if condition line
    then do
      printStuffToConsole
      loop -- Recurse - i.e. repeat the same thing again
    else
      pure () -- Don't recurse - the function execution ends
  where
    readFromConsole = getLine
    printStuffToConsole = putStrLn "Give me another!"
    condition line = line /= "STOP"
main = loop
Of course, while it's nice to have parts of the program semantically named, you don't strictly speaking have to do it if you wanted to make the whole thing shorter:
main = do
  line <- getLine
  if line /= "STOP"
    then do
      putStrLn "Give me another!"
      main
    else
      pure ()
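If the same loop shape comes up repeatedly, the recursion can be factored into a small helper. The following is a sketch under assumptions: whileM_, iteration, consoleLoop and scriptedLoop are names invented here, not standard library functions. The scripted variant exists only so the loop can be tried without a terminal.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef', newIORef, readIORef, writeIORef)

-- Hypothetical helper: run a step repeatedly until it returns False.
whileM_ :: Monad m => m Bool -> m ()
whileM_ step = do
  continue <- step
  when continue (whileM_ step)

-- One iteration, parameterised over how a line is read and how output is emitted.
iteration :: IO String -> (String -> IO ()) -> IO Bool
iteration readLine emit = do
  line <- readLine
  if line /= "STOP"
    then emit "Give me another!" >> pure True
    else pure False

-- Console version: reads with getLine and prints with putStrLn.
consoleLoop :: IO ()
consoleLoop = whileM_ (iteration getLine putStrLn)

-- Scripted version: input comes from a list and output is collected.
-- Running out of scripted input is treated as "STOP" (an assumption of this sketch).
scriptedLoop :: [String] -> IO [String]
scriptedLoop inputs = do
  pending <- newIORef inputs
  printed <- newIORef []
  let readLine = do
        remaining <- readIORef pending
        case remaining of
          []       -> pure "STOP"
          (x : xs) -> writeIORef pending xs >> pure x
      emit s = modifyIORef' printed (s :)
  whileM_ (iteration readLine emit)
  reverse <$> readIORef printed
```

Separating the iteration from its input and output sources keeps the stopping condition in one place and makes the loop easy to exercise from GHCi with scriptedLoop.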

Fyodor Soikin already provided the simple way to do it.
Here I'll comment on a general way to "break" a loop: using continuations and callCC.
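import Control.Monad (forever)     -- newer mtl versions no longer re-export this
import Control.Monad.Cont
import Control.Monad.Trans (lift)  -- explicit import, in case it is not re-exported
main :: IO ()
main = do
  putStrLn "start"
  flip runContT return $ callCC $ \break -> forever $ do
    l <- lift $ getLine
    if l == "quit"
      then break ()
      else lift $ putStrLn $ "not a quit command " ++ l
    lift $ putStrLn "next iteration"
  putStrLn "end"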
Continuations are infamously hard to grasp, but the above code is not too complex. A rough intuition is as follows.
The forever library function repeats an action indefinitely; it is the Haskell equivalent of while true.
The flip runContT return $ callCC $ \f -> .... part means "define f to be a break-like function, which will exit the block .... immediately". In the code, I call that function break to make this clear. The call break () interrupts the forever (and returns the () outside; we could use that value if we wrote x <- flip runContT .... to bind it to x).
There is a downside, though. In the .... part we no longer work inside the IO monad, but in the ContT () IO monad. That is what lets us call break (). In order to use regular IO there, we need to lift the IO actions. So, we can't use putStrLn ".." but we need to use lift $ putStrLn ".." instead.
The rest should be more or less straightforward to follow.
Here's a small demo in GHCi.
> main
start
1 (typed by the user)
not a quit command 1
next iteration
2 (typed by the user)
not a quit command 2
next iteration
3 (typed by the user)
not a quit command 3
next iteration
4 (typed by the user)
not a quit command 4
next iteration
quit (typed by the user)
end
Is it a good idea to use continuation just for break? Maybe. If you are not familiar with this technique, probably it is not worth it. The plain recursive approach looks much simpler.

Why this haskell code cannot be compiled

I was trying to compile some Haskell game code. The code spawns three threads: one for an infinite loop, one for collecting the user's input, and one for triggering the events. However, the code does not compile and I don't know why.
Here's the code:
module Main where
import Control.Concurrent
import Control.Monad
import System.IO
import System.Random
import Text.Printf
data Msg = C Char | Time
data Event = C Char | Time Char
main :: IO ()
main = do
hSetBuffering stdout NoBuffering
hSetBuffering stdin NoBuffering
hSetEcho stdin False
-- shared resources
chan <- newEmptyMVar
removedDigits <- newEmptyMVar
unmatchedDigits <- newEmptyMVar
numberOfGuesses <- newEmptyMVar
--starting the generating thread and the user thread
forkIO $ generatingThread chan
forkIO $ userThread chan
--the main loop
if mainloop chan == True then "Congratulations! You won!" else "Better luck next time!"
return()
mainloop :: Chan c -> Bool
let mainloop = do
if length unmatchedDigits >= 10
then return False
Event <- readChan c
if Event == "timer"
then unmatchedDigits ++ param
else if testGuessedNumber param unmatchedDigits == True
then
removeMatchedDigit
if length unmatchedDigits == 0
then return True
mainloop c
-- Generating Thread aka event thread generating the random numbers
generatingThread :: Chan msgChan -> IO ()
generatingThread msgChan = forever $ do
publishTimerEvent msgChan 1000000
publishTimerEvent :: Chan msgChan -> Int delay ()
publishTimerEvent msgChan = do
c <- getRandomChar
putMVar msgChan ("timer" c)
threadDelay newDelay
velocity <- 0.9
if delay * velocity < 100
then newDelay <- 100
else newDelay <- delay * velocity
publishTimerEvent msgChan newDelay
getRandomChar :: Char c ()
getRandomChar = do
i <- randomRIO (0,9)
let c = "0123456789" !! i
return c
-- User Thread
userThread :: MVar Msg -> IO ()
userThread chan = forever $ do
c <- getChar
putMVar chan (C c)
showStr(show c)
testGuessedNumber :: Int -> Int -> Bool
testGuessedNumber a b
| a == b = True
| otherwise = False
-- Shows the given string at the left edge of the current terminal line after
-- having blanked out the first 20 characters on that line.
showStr :: String -> IO ()
showStr s = putStr ("\r" ++ replicate 20 ' ' ++ "\r" ++ s)
The error is "test.hs:36:3: error: parse error on input ‘Event’"

Variable names cannot begin with an uppercase letter, as Event does. Try renaming the variable to something like event.

In Haskell, all if ... then ... else blocks must have all their components; what would the result be otherwise?
The problem is that the compiler was expecting an else, but it actually got Event. That said, you have more problems than a simple parse error. return does not do what you think it does. For example, this code will print hi.
main = do
  return ()
  putStrLn "hi"
The return function simply lifts a value into a monad; it doesn't stop the computation or anything like that. Here is what you probably want to have:
...
  if length unmatchedDigits >= 10
    then return False
    else do
      Event <- readChan c
      if Event == "timer"
        then ...
        else ...
This way, nothing happens after the if block, so the function just ends there, with the last value as False (if length unmatchedDigits >= 10) or continues on properly (if length unmatchedDigits < 10).
You also almost certainly don't want to use Event (uppercase E), because that would mean it is a data constructor. You probably meant event (lowercase e), which is just an ordinary variable name.
Additionally: This is very, very non-idiomatic Haskell. You definitely don't need MVars in this situation, and certainly not four of them. Chans are not the same thing as MVars, and you don't need either one unless you are doing heavy-duty multi-threading. I highly recommend you completely rewrite this and try to minimize the amount of code that uses IO (it should be maybe 10-15 lines of IO code in this example, probably less).
This is not Java; you don't need to name your variables in your type signature (Chan msgChan -> Int delay ()), nor do you need to write wrapper functions for standard library functions to monomorphize their type. testGuessedNumber is literally the same function as (==).
You will have much more success if you revisit basic, pure function syntax and understand how problems are solved in Haskell than with trying to emulate another language. Read some LYAH or Real World Haskell.
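To illustrate what "minimize the amount of code that uses IO" could look like, here is a sketch of the game rules as pure functions. The rules are an assumption read off the question's code: a timer tick adds a digit, a key press removes one matching digit, ten unmatched digits lose, and clearing the list wins. The Event, Status, step and play names are invented for this example.

```haskell
import Data.List (delete)

-- Hypothetical event and status types; the names are not from the original program.
data Event = Key Char | Tick Char

data Status = Playing | Won | Lost
  deriving (Eq, Show)

-- Pure state update: no IO and no MVar, just the old digits and one event.
step :: [Char] -> Event -> ([Char], Status)
step digits (Tick c)
  | length digits' >= 10 = (digits', Lost)
  | otherwise            = (digits', Playing)
  where
    digits' = digits ++ [c]
step digits (Key c)
  | c `notElem` digits = (digits, Playing)  -- wrong guess: nothing changes
  | null digits'       = (digits', Won)
  | otherwise          = (digits', Playing)
  where
    digits' = delete c digits

-- Feed a list of events through step, stopping at the first win or loss.
play :: [Event] -> Status
play = go []
  where
    go _ [] = Playing
    go digits (e : es) =
      case step digits e of
        (digits', Playing) -> go digits' es
        (_, final)         -> final
```

With the rules written this way, the IO part of the program only has to produce events (from the timer and the keyboard) and pass them to step, and the logic can be checked in GHCi, for example with play [Tick '1', Key '1'].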

How to exit a forkIO thread when it executes a loop procedure

I'm writing something like a music player and got stuck on the playback progress bar.
In my program, when the play button is clicked, I use forkIO to fork a thread which controls the progress bar. However, the forked thread now executes a loop. How can I tell that thread to terminate when I stop the current song or change songs?
I've been trying to use IORef Var, for example
flag <- newIORef False
forkIO $ progressBarFunc flag
and in the function progressBarFunc it checks whether flag is True and decides whether to exit the loop or not.
But this does not work.
More generally, how can I tell the forked thread to stop when I use forkIO to fork threads?
In addition, if I have an IORef Var and pass it to the function in forkIO, do the main thread and the forked thread share the same IORef Var or the forked thread actually has a copy of it?

You can communicate between threads using IORefs. The IORef refers to the same thing in the forked thread as it did in the main thread.
There are a few things you should check:
Does the forked thread actually get a chance to test the IORef?
Can the UI interactions you are expecting actually happen from the forked thread? Many UI libraries, including both gtk and OpenGL, have restrictions on which threads can interact with the UI.
Is the flag set for long enough that the forked thread had a chance to see it? If the flag is set to True and then back to False before the forked thread calls readIORef, it won't detect the stop.
One way to address the final problem is to use an Integer instead of a Bool for a flag.
newFlag :: IO (IORef Integer)
newFlag = newIORef 0
An observer of the flag remembers the value of the flag when the observer was created, and stops when it becomes greater. This returns True when the thread can continue (the flag has not been raised).
testFlag :: IORef Integer -> IO (IO Bool)
testFlag flag = do
  n <- readIORef flag
  return (fmap (<=n) (readIORef flag))
To raise the flag, the signaler increments the value.
raiseFlag :: IORef Integer -> IO ()
raiseFlag ref = atomicModifyIORef ref (\x -> (x+1,()))
This little example program demonstrates an IORef sharing a flag with other threads. It forks new threads when given the input "f", signals the threads to stop when given the input "s", and quits when given the input "q".
main = do
  flag <- newFlag
  let go = do
        command <- getLine
        case command of
          "f" -> do
            continue <- testFlag flag
            forkIO $ thread continue
            go
          "s" -> do
            raiseFlag flag
            go
          "q" -> do
            raiseFlag flag
            return ()
          _ -> go -- ignore unrecognised input instead of crashing
  go
The threads periodically do some "work", which takes half a second, and test for the continue condition before continuing.
thread :: IO Bool -> IO ()
thread continue = go
  where
    go = do
      me <- myThreadId
      putStrLn (show me ++ " Outputting")
      threadDelay 500000
      c <- continue
      if c then go else putStrLn (show me ++ " Stopping") >> return ()
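For the simpler case in the question (one worker that should stop once and never restart), a plain Bool flag that is only ever set to True is enough, and an MVar lets the main thread wait until the worker has really exited. This is a minimal sketch: progressLoop and runAndStop are hypothetical names, and the delays are arbitrary stand-ins for real work.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.IORef (IORef, atomicWriteIORef, newIORef, readIORef)

-- Stand-in for the progress bar loop: checks the flag before every tick
-- and returns how many ticks it completed.
progressLoop :: IORef Bool -> Int -> IO Int
progressLoop stopFlag ticks = do
  stop <- readIORef stopFlag
  if stop
    then pure ticks
    else do
      threadDelay 10000                  -- pretend to update the progress bar
      progressLoop stopFlag (ticks + 1)

-- Fork the loop, let it run briefly, request a stop, and wait for it to finish.
runAndStop :: IO Int
runAndStop = do
  stopFlag <- newIORef False
  done     <- newEmptyMVar
  _ <- forkIO (progressLoop stopFlag 0 >>= putMVar done)
  threadDelay 100000
  atomicWriteIORef stopFlag True         -- both threads share this same IORef
  takeMVar done                          -- blocks until the worker has exited
```

The flag is never reset to False, so the "set and cleared before the thread looks" problem described above cannot occur. A new song would get a fresh IORef and a fresh thread.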

STM with partial atomicity for certain TVars

I am doing things with STM and have, among other things, used the TBQueue data structure with great success. A useful feature I've been using it for involves reading from it based on a precondition in a TVar, basically like so:
shouldRead <- readTVar shouldReadVar
if shouldRead
  then do
    a <- readTBQueue queue
    doSomethingWith a
  else doSomethingElse
If we assume that queue is empty and shouldReadVar contains True before executing this block, it will result in readTBQueue calling retry, and the block will be re-executed when shouldReadVar contains False or queue contains an element, whatever happens first.
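As a self-contained sketch of that pattern (readIfEnabled and demo are illustrative names, the queue capacity is arbitrary, and the doSomething branches are replaced by returning a Maybe):

```haskell
import Control.Concurrent.STM

-- Read from the queue only while the TVar holds True.
-- If the TVar is True and the queue is empty, readTBQueue calls retry, so the
-- transaction re-runs when either the TVar or the queue changes.
readIfEnabled :: TVar Bool -> TBQueue a -> STM (Maybe a)
readIfEnabled shouldReadVar queue = do
  shouldRead <- readTVar shouldReadVar
  if shouldRead
    then Just <$> readTBQueue queue
    else pure Nothing

-- Single-threaded demonstration of both branches.
demo :: IO (Maybe Char, Maybe Char)
demo = do
  shouldReadVar <- newTVarIO True
  queue <- newTBQueueIO 4
  atomically (writeTBQueue queue 'x')
  first <- atomically (readIfEnabled shouldReadVar queue)   -- reads the element
  atomically (writeTVar shouldReadVar False)
  second <- atomically (readIfEnabled shouldReadVar queue)  -- precondition is off
  pure (first, second)
```

The demo never hits the blocking case, since that would need a second thread to change the TVar or fill the queue.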
I am now in need of a synchronous channel data structure, similar to the structure described in this article (Please read it if you want to understand this question), except it needs to be readable with a pre-condition like in the previous example, and possibly compose with other stuff as well.
Let's call this data structure SyncChan with writeSyncChan and readSyncChan operations defined on it.
And here's a possible use case: This (pseudo) code (which will not work because I mix STM/IO concepts):
shouldRead <- readTVar shouldReadVar
if shouldRead
  then do
    a <- readSyncChan syncChan
    doSomethingWith a
  else doSomethingElse
Assuming that no other thread is currently blocking on a writeSyncChan call, and shouldReadVar contains True, I want the block to "retry" until either shouldReadVar contains False, or a different thread blocks on a writeSyncChan. In other words: when one thread retries on writeSyncChan and another thread reaches a readSyncChan, or vice versa, I want the value to be transferred along the channel. In all other cases, both sides should be in a retry state and thus react to a change in shouldReadVar, so that the read or write can be cancelled.
The naïve approach described in the article linked above using two (T)MVars is of course not possible. Because the data structure is synchronous, it is impossible to use it within two atomically blocks, because you cannot change one TMVar and wait for another TMVar to be changed in an atomic context.
Instead, I am looking for a kind of partial atomicity, where I can "commit" a certain part of a transaction and only roll it back when certain variables change, but not others. If I have "msg" and "ack" variables like the first example in the article above, I want to be able to write to the "msg" variable, then wait for either a value to arrive on "ack", or for my other transactional variables to change. If other transactional variables change, the whole atomic block should be retried, and if an "ack" value arrives, the transaction should continue as it was in the previous state. For the reading side, something similar should happen, except I would of course be reading from "msg" and writing to "ack."
Is this possible to do using GHC STM, or do I need to do manual MVar/rollback handling?

This is what you want:
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad
data SyncChan a = SyncChan (TMVar a) (TMVar ())
newSyncChan :: IO (SyncChan a)
newSyncChan = do
  msg <- newEmptyTMVarIO
  ack <- newEmptyTMVarIO
  return (SyncChan msg ack)
readIf :: SyncChan a -> TVar Bool -> STM (Maybe a)
readIf (SyncChan msg ack) shouldReadVar = do
  b <- readTVar shouldReadVar
  if b
    then do
      a <- takeTMVar msg
      putTMVar ack ()
      return (Just a)
    else return Nothing
write :: SyncChan a -> a -> IO ()
write (SyncChan msg ack) a = do
  atomically $ putTMVar msg a
  atomically $ takeTMVar ack
main = do
  sc <- newSyncChan
  tv <- newTVarIO True
  forkIO $ forever $ forM_ [False, True] $ \b -> do
    threadDelay 2000000
    atomically $ writeTVar tv b
  forkIO $ forM_ [0..] $ \i -> do
    putStrLn "Writing..."
    write sc i
    putStrLn "Write Complete"
    threadDelay 300000
  forever $ do
    putStrLn "Reading..."
    a <- atomically $ readIf sc tv
    print a
    putStrLn "Read Complete"
This gives the behavior you had in mind. While the TVar is True the input and output ends will be synchronized with each other. When the TVar switches to False then the read end freely aborts and returns Nothing.

Killing a thread when MVar is garbage collected

I have a worker thread which reads data repeatedly from an MVar and performs some useful work on that. After a while, the rest of the program forgets about that worker thread, which means that it will wait on an empty MVar and become very lonely. My question is:
Will the MVar be garbage collected if threads no longer write to it, for instance because they all wait for it?
Will garbage collection kill the waiting threads?
If neither, can I somehow indicate to the compiler that the MVar should be garbage collected and the thread be killed?
EDIT: I should probably clarify the purpose of my question. I don't desire general protection against deadlocks; instead, what I would like to do is to tie the life of the worker thread to life of a value (as in: dead values are claimed by garbage collection). In other words, the worker thread is a resource that I would like to free not by hand, but when a certain value (the MVar or a derivative) is garbage collected.
Here is an example program that demonstrates what I have in mind:
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad (forever)
main = do
  something
  -- the thread forked in something can be killed here
  -- because the MVar used for communication is no longer in scope
  etc
something = do
  v <- newEmptyMVar
  forkIO $ forever $ work =<< takeMVar v
  putMVar v "Haskell"
  putMVar v "42"
In other words, I want the thread to be killed when I can no longer communicate with it, i.e. when the MVar used for communication is no longer in scope. How to do that?

It will just work: when the MVar is only reachable by the thread that is blocked on it, then the thread is sent the BlockedIndefinitelyOnMVar exception, which will normally cause it to die silently (the default exception handler for a thread ignores this exception).
BTW, for doing some cleanup when the thread dies, you'll want to use forkFinally (which I just added to Control.Concurrent).
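A minimal sketch of forkFinally in use: its second argument runs when the thread ends, whether it returned normally or died from an exception, and receives the outcome as an Either. The helper names runWithCleanup and describe are made up for this example.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException)

-- Hypothetical helper: run work on its own thread and report how it ended.
-- The handler passed to forkFinally hands the outcome to the waiting thread.
runWithCleanup :: IO a -> IO (Either SomeException a)
runWithCleanup work = do
  outcome <- newEmptyMVar
  _ <- forkFinally work (putMVar outcome)
  takeMVar outcome

-- Turn the outcome into a message; real cleanup code would go in the handler.
describe :: Either SomeException a -> String
describe (Left e)  = "worker died: " ++ show e
describe (Right _) = "worker finished normally"
```

For example, runWithCleanup (ioError (userError "boom")) yields a Left, while runWithCleanup (pure ()) yields a Right.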

If you're lucky, you'll get a "BlockedIndefinitelyOnMVar", indicating that you're waiting on an MVar that no thread will ever write to.
But, to quote Ed Yang,
GHC only knows that a thread can be considered garbage if there are no
references to the thread. Who is holding a reference to the thread?
The MVar, as the thread is blocking on this data structure and has
added itself to the blocking list of this. Who is keeping the MVar
alive? Why, our closure that contains a call to takeMVar. So the
thread stays.
without a bit of work (which would be, by the way, quite interesting to see), BlockedIndefinitelyOnMVar is not an obviously useful mechanism for giving your Haskell programs deadlock protection.
GHC just can't solve the problem in general of knowing whether your thread will make progress.
A better approach would be to explicitly terminate threads by sending them a Done message. E.g. just lift your message type into an optional value that also includes an end-of-message value:
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad
import Control.Exception
main = do
  something
  threadDelay (10 * 10^6)
  print "Still here"
something = do
  v <- newEmptyMVar
  forkIO $
    finally
      (let go = do x <- takeMVar v
                   case x of
                     Nothing -> return () -- end-of-messages marker: stop the worker
                     Just v -> print v >> go
       in go)
      (print "Done!")
  putMVar v $ Just "Haskell"
  putMVar v $ Just "42"
  putMVar v Nothing
and we get the correct clean up:
$ ./A
"Haskell"
"42"
"Done!"
"Still here"

I tested the simple weak MVar and it did get finalized and killed. The code is:
import Control.Monad
import Control.Exception
import Control.Concurrent
import Control.Concurrent.MVar
import System.Mem(performGC)
import System.Mem.Weak
dologger :: MVar String -> IO ()
dologger mv = do
  tid <- myThreadId
  weak <- mkWeakPtr mv (Just (putStrLn "X" >> killThread tid))
  logger weak
logger :: Weak (MVar String) -> IO ()
logger weak = act where
  act = do
    v <- deRefWeak weak
    case v of
      Just mv -> do
        a <- try (takeMVar mv) :: IO (Either SomeException String)
        print a
        either (\_ -> return ()) (\_ -> act) a
      Nothing -> return ()
play mv = act where
  act = do
    c <- getLine
    if c=="quit" then return ()
      else putMVar mv c >> act
doplay mv = do
  forkIO (dologger mv)
  play mv
main = do
  putStrLn "Enter a string to escape, or quit to exit"
  mv <- newEmptyMVar
  doplay mv
  putStrLn "*"
  performGC
  putStrLn "*"
  yield
  putStrLn "*"
  threadDelay (10^6)
  putStrLn "*"
The session with the program was:
(chrisk)-(/tmp)
(! 624)-> ghc -threaded -rtsopts --make weak2.hs
[1 of 1] Compiling Main ( weak2.hs, weak2.o )
Linking weak2 ...
(chrisk)-(/tmp)
(! 625)-> ./weak2 +RTS -N4 -RTS
Enter a string to escape, or quit to exit
This is a test
Right "This is a test"
Tab Tab
Right "Tab\tTab"
quit
*
*
X
*
Left thread killed
*
So blocking on takeMVar did not keep the MVar alive on ghc-7.4.1 despite expectations.

While BlockedIndefinitelyOnMVar should work, also consider using ForeignPtr finalizers. Their normal role is to delete C structures that are no longer accessible from Haskell. However, you can attach any IO finalizer to them.
