How do I create a thread pool?

Sometimes I want to run at most a fixed number of IO actions in parallel, for network activity and the like. I whipped up a small concurrent function that works well (https://gist.github.com/810920), but it isn't really a pool, because all of the running IO actions must finish before any others can start.
The type of what I'm looking for would be something like:
    runPool :: Int -> [IO a] -> IO [a]
and it should be able to operate on both finite and infinite lists.
The pipes package looks like it could achieve this quite well, but I suspect there is a solution similar to my gist that uses only MVars and the like from the haskell-platform.
Has anyone encountered an idiomatic solution without any heavy dependencies?

You need a thread pool. If you want something short, you could take inspiration from Control.ThreadPool (from the control-engine package, which also provides more general functions). For instance, threadPoolIO is just:
    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
    import Control.Monad (forM_, forever)

    threadPoolIO :: Int -> (a -> IO b) -> IO (Chan a, Chan b)
    threadPoolIO nr mutator = do
        input <- newChan
        output <- newChan
        -- start nr workers; each loops forever, reading a job and writing its result
        forM_ [1..nr] $
            \_ -> forkIO (forever $ do
                i <- readChan input
                o <- mutator i
                writeChan output o)
        return (input, output)
It uses two Chans to communicate with the outside, but that's usually what you want, and it really helps you write code that doesn't get tangled up.
If you absolutely want to wrap it up in a function of your type, you can encapsulate the communication too:
    runPool :: Int -> [IO a] -> IO [a]
    runPool n as = do
        (input, output) <- threadPoolIO n id
        forM_ as $ writeChan input
        -- collect exactly one result per submitted action
        sequence (replicate (length as) (readChan output))
This won't preserve the order of your actions. Is that a problem? (It's easy to correct by transmitting the index of each action along with it, or by storing the responses in an array.)
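The index-tagging fix mentioned above might look like the following minimal sketch. The name runPoolOrdered is made up for illustration. Like the simple version, it handles finite lists only and leaves its workers running, and an action that throws kills its worker without producing a result, so the final collection step would not complete normally.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Monad (forM_, forever, replicateM)
import Data.List (sortOn)

-- Hypothetical order-preserving variant of runPool: every job carries its
-- position in the input list, and the results are sorted by that position.
runPoolOrdered :: Int -> [IO a] -> IO [a]
runPoolOrdered n actions = do
  input  <- newChan
  output <- newChan
  -- n workers, each looping forever: take a tagged job, run it, report the tagged result
  forM_ [1 .. n] $ \_ -> forkIO $ forever $ do
    (i, act) <- readChan input
    r <- act
    writeChan output (i, r)
  let jobs = zip [0 :: Int ..] actions
  forM_ jobs (writeChan input)
  -- results arrive in completion order; restore the submission order afterwards
  tagged <- replicateM (length jobs) (readChan output)
  return (map snd (sortOn fst tagged))
```

Sorting at the end keeps the code short; writing each result into an array slot, as suggested above, would avoid the sort.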
Note: the n threads stay alive forever in this simplistic version. Returning a "killAll" action from threadPoolIO would solve this handily if you intend to create and discard several such pools in a long-running application (if not, given how lightweight threads are in Haskell, it's probably not worth the bother).
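One way the "killAll" idea could look is sketched below. The name threadPoolIO' and the three-element result are assumptions for illustration, not part of the control-engine package.

```haskell
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forever, replicateM)

-- Hypothetical variant of threadPoolIO that also returns an action
-- which kills every worker thread of the pool.
threadPoolIO' :: Int -> (a -> IO b) -> IO (Chan a, Chan b, IO ())
threadPoolIO' nr mutator = do
  input  <- newChan
  output <- newChan
  -- remember the ThreadId of each worker so it can be killed later
  tids <- replicateM nr $ forkIO $ forever $ do
    i <- readChan input
    o <- mutator i
    writeChan output o
  return (input, output, mapM_ killThread tids)
```

A caller would run the returned action once all expected results have been read; jobs still waiting in the input channel at that point are simply dropped.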
Note that this function works on finite lists only. IO is normally strict, so you can't start processing elements of the IO [a] result before the whole list has been produced. If you really want that, you'll have to either use lazy IO with unsafeInterleaveIO (maybe not the best idea) or completely change your model and use something like conduits to stream your results.

Related

Timeout on StateT with IO

I have a custom type
    type GI a = StateT GenState IO a
where GenState is the state I keep for generating random trees of some kind.
When generating my trees, termination within a reasonable amount of time is not guaranteed. That's why I thought I might abort the calculation on a timeout and restart it over and over until it produces a result.
So my question is how to write a function of the form
    tryGeneration :: GI a -> GI a
    tryGeneration action = ...
where action is the calculation to attempt within some number of microseconds; if it runs out of time, the action should start again from the beginning.
Please keep in mind that I'm quite new to monad transformers and can't say that I fully understand them yet.
I tried to use lift with System.Timeout.timeout and did not succeed.
EDIT: Thank you all for your suggestions. I followed them and got it done in the IO monad.
    tryGenerationTime :: Int -> GenState -> GI a -> IO (a, GenState)
    tryGenerationTime time state action = do
        (_, s') <- -- change the random state to not generate the same thing over and over
        res <- timeout time (runStateT action s')
        case res of
            Nothing -> tryGenerationTime time s' action
            Just r -> return r

    timeItT :: Int -> GI a -> GI a
    timeItT time action = do
        state <- get
        (x, s') <- lift $ tryGenerationTime time state action
        put s'
        return x
Any suggestions for improving this code are welcome. I just wanted to get it done fast, since it turned out not to be the solution to my generation problem, and I needed to limit the tree height to succeed.

I suspect what you actually want is more like
    tryGeneration :: GI a -> IO a
    tryGeneration action = ...
in such a way that all of your "build a tree" actions have timeout-based retries.
The key thing to understand is that "attempt to do X; if you aren't done in n milliseconds, start over" is IO's job; IO is where you have access to things like time. (Of course there are wrappers you could and should use when you only need part of what IO has to offer.)
This is fine; you have access to IO in GI, you probably just have to lift it.
That said, there's not enough information here to say exactly how to do what you want, and I'm more familiar with free-monad effect systems than mtl transformers anyway...
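As a rough illustration of keeping the retry loop in IO while still exposing a StateT-typed function, here is a minimal sketch. The names retryWithTimeout, reseed and slowAtZero are invented for this example, and the state type is left generic rather than tied to GenState. It assumes the mtl package. Two caveats apply: timeout works by throwing an asynchronous exception, and because of laziness an attempt can "finish" by returning an unevaluated result whose real cost is only paid after the timeout has already been passed.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad.State (StateT (StateT), runStateT)
import System.Timeout (timeout)

-- Hypothetical helper: rerun a StateT-over-IO action until one attempt
-- finishes within `micros` microseconds. `reseed` changes the state between
-- attempts so that a retry does not repeat exactly the same computation.
retryWithTimeout :: Int -> (s -> s) -> StateT s IO a -> StateT s IO a
retryWithTimeout micros reseed action = StateT attempt
  where
    attempt s = do
      res <- timeout micros (runStateT action s)
      case res of
        Just r  -> return r            -- finished in time: keep result and final state
        Nothing -> attempt (reseed s)  -- timed out: change the state and start over

-- Toy action for illustration only: stalls for two seconds when the state
-- is 0, otherwise returns immediately.
slowAtZero :: StateT Int IO Int
slowAtZero = StateT $ \s -> do
  if s == 0 then threadDelay 2000000 else return ()
  return (s * 10, s)
```

For example, `runStateT (retryWithTimeout 100000 (+ 1) slowAtZero) 0` abandons the first attempt after a tenth of a second and succeeds on the second one, starting from state 1.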

How to limit code changes when introducing state?

I am a senior C/C++/Java/Assembler programmer and I have always been fascinated by the pure functional programming paradigm. From time to time, I try to implement something useful with it, e.g., a small tool, but I often quickly reach a point where I realize that I (and my tool, too) would be much faster in a non-pure language. It's probably because I have much more experience with imperative programming languages, with thousands of idioms, patterns and typical solution approaches in my head.
Here is one of those situations. I have encountered it several times and I hope you guys can help me.
Let's assume I write a tool to simulate communication networks. One important task is the generation of network packets. The generation is quite complex, consisting of dozens of functions and configuration parameters, but at the end there is one master function and because I find it useful I always write down the signature:
    generatePackets :: Configuration -> [Packet]
However, after a while I notice that it would be great if packet generation had some kind of random behavior deep down in one of the many sub-functions of the generation process. Since I need a random number generator for that (and I also need it in some other places in the code), this means manually changing dozens of signatures to something like
    f :: Configuration -> RNGState [Packet]
with
    type RNGState = State StdGen
I understand the "mathematical" necessity (no state) behind this. My question is on a higher (?) level: How would an experienced Haskell programmer have approached this situation? What kind of design pattern or workflow would have avoided the extra work later?
I have never worked with an experienced Haskell programmer. Maybe you will tell me that you never write signatures because you have to change them too often afterwards, or that you give all your functions a state monad, "just in case" :)

One approach that I've been fairly successful with is using a monad transformer stack. This lets you both add new effects when needed and also track the effects required by particular functions.
Here's a really simple example.
    {-# LANGUAGE FlexibleContexts #-}  -- for constraints such as MonadState Int m, where not enabled by default
    import Control.Monad.State
    import Control.Monad.Reader

    data Config = Config { v1 :: Int, v2 :: Int }

    -- the type of the entire program describes all the effects that it can do
    type Program = StateT Int (ReaderT Config IO) ()

    runProgram program config startState =
        runReaderT (runStateT program startState) config

    -- doesn't use configuration values. doesn't do IO
    step1 :: MonadState Int m => m ()
    step1 = get >>= \x -> put (x+1)

    -- can use configuration and change state, but can't do IO
    step2 :: (MonadReader Config m, MonadState Int m) => m ()
    step2 = do
        x <- asks v1
        y <- get
        put (x+y)

    -- can use configuration and do IO, but won't touch our internal state
    step3 :: (MonadReader Config m, MonadIO m) => m ()
    step3 = do
        x <- asks v2
        liftIO $ putStrLn ("the value of v2 is " ++ show x)

    program :: Program
    program = step1 >> step2 >> step3

    main :: IO ()
    main = do
        let config = Config { v1 = 42, v2 = 123 }
            startState = 17
        result <- runProgram program config startState
        return ()
Now if we want to add another effect:
    import Control.Monad.Writer

    step4 :: MonadWriter String m => m ()
    step4 = tell "done!"

    program :: Program
    program = step1 >> step2 >> step3 >> step4
Just adjust Program and runProgram:
    type Program = StateT Int (ReaderT Config (WriterT String IO)) ()

    runProgram program config startState =
        runWriterT $ runReaderT (runStateT program startState) config
To summarize, this approach lets us decompose a program in a way that tracks effects but also allows adding new effects as needed without a huge amount of refactoring.
edit:
It's come to my attention that I didn't answer the question about what to do for code that's already written. In many cases, it's not too difficult to change pure code into this style:
    computation :: Double -> Double -> Double
    computation x y = x + y
becomes
    computation :: Monad m => Double -> Double -> m Double
    computation x y = return (x + y)
This function will now work for any monad, but doesn't have access to any extra effects. Specifically, if we add another monad transformer to Program, then computation will still work.
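To make the "works for any monad" point concrete, here is a small sketch that runs the same function in three different monads using only base. The names pureResult, maybeResult and ioResult are chosen for this illustration.

```haskell
import Data.Functor.Identity (runIdentity)

computation :: Monad m => Double -> Double -> m Double
computation x y = return (x + y)

-- With Identity there are no effects at all, so the pure value comes straight back
pureResult :: Double
pureResult = runIdentity (computation 1 2)

-- The same definition used in Maybe
maybeResult :: Maybe Double
maybeResult = computation 1 2

-- and in IO, without touching computation itself
ioResult :: IO Double
ioResult = computation 1 2
```

The caller's type annotation picks the monad, which is why the function keeps working when the transformer stack around it changes.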

Sampling an MVar, can I avoid unsafePerformIO?

I have
    sample :: MVar a -> IO [a]
    sample v = do
        a <- takeMVar v
        pure (a:unsafePerformIO (sample v))
which appears to be a legitimate use of unsafePerformIO to me. But I am very interested to know how to avoid it! Is there a pattern for this use already?

You can implement a similar function using a thread, a Chan and getChanContents:
    import Control.Concurrent (forkIO)
    import Control.Concurrent.Chan (getChanContents, newChan, writeChan)
    import Control.Concurrent.MVar (MVar, takeMVar)
    import Control.Monad (forever)

    sample :: MVar a -> IO [a]
    sample v = do
        c <- newChan
        forkIO $ forever $ takeMVar v >>= writeChan c
        getChanContents c
The thread/getChanContents approach is slightly better, since at least you can rely on the MVar being taken continuously. The unsafePerformIO approach, by contrast, will run takeMVar at unpredictable points, making calls to putMVar block in a similarly unpredictable way. Of course, the getChanContents approach will buffer all the data, possibly requiring more memory.
However, both approaches are essentially similar to lazy IO, which is best to be avoided, in my opinion.
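If the number of samples is known up front, one way to sidestep lazy IO entirely is to take exactly that many values strictly. This is a sketch of that alternative, not a drop-in replacement for an endless list; sampleN and demo are names invented here.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM)

-- Hypothetical helper: take exactly n values from the MVar, in order.
-- Every takeMVar has run by the time the list is returned.
sampleN :: Int -> MVar a -> IO [a]
sampleN n v = replicateM n (takeMVar v)

-- Small demonstration: one producer keeps putting 1, 2, 3, ... into the MVar
-- (blocking whenever it is full) while the caller samples the first five values.
demo :: IO [Int]
demo = do
  v <- newEmptyMVar
  _ <- forkIO (mapM_ (putMVar v) [1 ..])
  sampleN 5 v
```

The trade-off is that the caller decides how much to read, instead of consuming an unbounded list on demand.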

Is it possible to run several instances of hint in parallel?

Is there any way to start two hint interpreters and then, at runtime, assign smaller computations to one or the other? When I invoke hint for a small expression (e.g. one typed into a website), it seems to me, without reliable testing, that starting and loading hint takes approximately one second. If an instance were already running, that second could be saved.
hint seems to have no function that lets me start it and keep it waiting for later use.
(Auto)Plugins would be a further option, of course, but I think that is more suitable for modules and less elegant for smaller computations.

The GHC API, which hint is implemented in terms of (as are the various plugin packages), does not support concurrent use.
You can leave hint running, though. It's an instance of MonadIO.
    interpreterLoop :: (MonadIO m, Typeable a) => Chan (MVar a, String) -> InterpreterT m ()
    interpreterLoop ch = do
        (mvar, command) <- liftIO $ readChan ch
        a <- interpret command $ argTypeWitness mvar
        liftIO $ putMVar mvar a
        interpreterLoop ch
      where
        argTypeWitness :: MVar a -> a
        argTypeWitness = undefined -- this value is only used for type checking, never evaluated

    runInLoop :: Typeable a => Chan (MVar a, String) -> String -> IO a
    runInLoop ch command = do
        mvar <- newEmptyMVar
        writeChan ch (mvar, command)
        takeMVar mvar
(I didn't test this, so I may have missed a detail or two, but the basic idea will work.)

How can I emulate Go's channels with Haskell?

I recently started reading about the Go programming language and I found the channel variables a very appealing concept. Is it possible to emulate the same concept in Haskell? Maybe with a data type Channel a and a monad structure to enable mutable state, plus functions that work like the go keyword.
I'm not very good at concurrent programming, and a simple channel-passing mechanism like this in Haskell would really make my life easier.
EDIT
People asked me to clarify what kind of Go's patterns I was interested in translating to Haskell. So Go has channel variables that are first class and can be passed around and returned by functions. I can read and write to these channels, and so communicate easily between routines that can run concurrently. Go also has a go keyword, that according to the language spec initiates the execution of a function concurrently as an independent thread and continues to execute the code without waiting.
The exact pattern I'm interested in is something like this (Go's syntax is weird - variables are declared by varName varType instead of the usual inverted way - but I think it is readable):
    func generateStep(ch chan int) {
        // ch is a variable of type chan int, a channel that communicates integers
        for {
            ch <- randomInteger() // just sends random integers into the channel
        }
    }

    func filter(input, output chan int) {
        var state int
        for {
            step := <-input // reads an int from the input channel
            newstate := update(state, step) // update the variable with some update function
            if criteria(newstate, state) {
                state = newstate // if the new state passes some criteria, accept the update
            }
            output <- state // pass it to the output channel
        }
    }

    func main() {
        intChan := make(chan int)
        mcChan := make(chan int)
        go generateStep(intChan) // run the goroutines concurrently
        go filter(intChan, mcChan)
        var x int
        for i := 0; i < numSteps; i++ {
            x = <-mcChan // get values from the filtered channel
            accumulateStats(x) // calculate some statistics
        }
        printStatisticsAbout(x)
    }
My primary interest is in Monte Carlo simulations, in which I generate configurations sequentially by trying to modify the current state of the system and accepting the modification if it satisfies some criteria.
What really impressed me is that, using channels, I could write a very simple, readable and small Monte Carlo simulation that runs in parallel on my multicore processor.
The problem is that Go has some limitations (in particular, it lacks polymorphism in the way I'm accustomed to in Haskell), and besides that, I really like Haskell and don't want to trade it away. So the question is whether there's some way to use mechanics that look like the code above to write a concurrent simulation in Haskell easily.
EDIT(2, context):
I'm not trained in Computer Science, especially not in concurrency. I'm just a guy who writes simple programs to solve simple problems in my daily research routine, in a discipline not at all related to CS. I just find the way Haskell works interesting and like to use it for my little chores.
I had never heard of pi-calculus or CSP channels. Sorry if the question seems ill-posed; that's probably the fault of my huge ignorance of the matter.
You are right, I should be more specific about which Go pattern I'd like to replicate in Haskell, and I'll try to edit the question accordingly. But don't expect profound theoretical questions. It's just that, from the little I have read and coded, Go seems to have a neat way to do concurrency (which in my case just means it's easier to get all my cores humming with numerical calculations), and if I could use a similar syntax in Haskell I'd be glad.

I think what you are looking for is Control.Concurrent.Chan from base. I haven't found it to be any different from Go's chans other than the obvious Haskellifications. Channels aren't something special to Go; have a look at the wiki page about them.
Channels are part of a more general concept called communicating sequential processes (CSP), and if you want to program in the CSP style in Haskell, you might want to take a look at the Communicating Haskell Processes (CHP) package.
CHP is only one way of doing concurrency in Haskell; take a look at the HaskellWiki concurrency page for more information. I think your use case might be best written using Data Parallel Haskell; however, that is currently a work in progress, so you might want to use something else for now.
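As a rough sketch of how the generator/filter pipeline from the question might look with Control.Concurrent.Chan and forkIO, consider the following. The update rule and acceptance criterion are placeholders invented for the example, and the generator emits 1, 2, 3, ... instead of random numbers. One difference worth noting: a Chan is unbounded, whereas make(chan int) in Go creates an unbuffered channel, so an endless producer like the one in the Go code would keep filling memory; the generator here is therefore finite.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (replicateM)

-- Sends a fixed number of proposal steps (placeholder for random input).
generateStep :: Int -> Chan Int -> IO ()
generateStep n ch = mapM_ (writeChan ch) [1 .. n]

-- Keeps a running state: each incoming step proposes a new state, which is
-- accepted only if it meets the (placeholder) criterion of being even.
filterSteps :: Chan Int -> Chan Int -> IO ()
filterSteps input output = loop 0
  where
    loop state = do
      step <- readChan input
      let proposed = state + step                             -- placeholder update
          next     = if even proposed then proposed else state -- placeholder criterion
      writeChan output next
      loop next

-- Wires the two stages together, each in its own thread, and collects
-- numSteps filtered states.
runSimulation :: Int -> IO [Int]
runSimulation numSteps = do
  intChan <- newChan
  mcChan  <- newChan
  _ <- forkIO (generateStep numSteps intChan)
  _ <- forkIO (filterSteps intChan mcChan)
  replicateM numSteps (readChan mcChan)
```

With these placeholder rules, `runSimulation 5` yields `[0,2,2,6,6]`. To actually use several cores, the program would also need to be built with GHC's -threaded option and run with +RTS -N.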

Extending HaskellElephant's answer, Control.Concurrent.Chan is the way to go for channels and Control.Concurrent's forkIO can emulate the go keyword. To make the syntax a bit more similar to Go, this set of aliases can be used:
    import Control.Concurrent (ThreadId, forkIO)
    import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
    import Control.Concurrent.MVar (MVar, newMVar, swapMVar, readMVar)

    -- Values travel as Just v; close sends Nothing as an end-of-stream marker,
    -- so readers still receive everything written before the close.
    data GoChan a = GoChan { chan :: Chan (Maybe a), closed :: MVar Bool }

    go :: IO () -> IO ThreadId
    go = forkIO

    make :: IO (GoChan a)
    make = do
        ch <- newChan
        cl <- newMVar False
        return $ GoChan ch cl

    -- Helper: read one item; Nothing means the channel is closed and drained.
    -- The marker is written back so that later reads see it too.
    recv :: GoChan a -> IO (Maybe a)
    recv ch = do
        m <- readChan (chan ch)
        case m of
            Nothing -> writeChan (chan ch) Nothing >> return Nothing
            Just v -> return (Just v)

    get :: GoChan a -> IO a
    get ch = recv ch >>= maybe (error "Can't read from closed channel!") return

    (=->) :: a -> GoChan a -> IO ()
    v =-> ch = do
        cl <- readMVar $ closed ch
        if cl
            then error "Can't write to closed channel!"
            else writeChan (chan ch) (Just v)

    -- Apply func to every item until the channel is closed, collecting the results.
    forRange :: GoChan a -> (a -> IO b) -> IO [b]
    forRange ch func = fmap reverse $ range_ []
      where range_ acc = do
                m <- recv ch
                case m of
                    Nothing -> return acc
                    Just v -> do
                        r <- func v
                        range_ (r : acc)

    close :: GoChan a -> IO ()
    close ch = do
        _ <- swapMVar (closed ch) True
        writeChan (chan ch) Nothing
This can be used like so:
    import Control.Monad

    generate :: GoChan Int -> IO ()
    generate c = do
        forM_ [1..100] (=-> c)
        close c

    process :: GoChan Int -> IO ()
    process c = void $ forRange c print

    main :: IO ()
    main = do
        c <- make
        go $ generate c
        process c
(Warning: untested code)
