I recently started reading about the Go programming language and I found the channel variables a very appealing concept. Is it possible to emulate the same concept in Haskell? Maybe to have a data type Channel a and a monad structure to enable mutable state and functions that work like the keyword go.
I'm not very good at concurrent programming, and a simple channel-passing mechanism like this in Haskell would really make my life easier.
EDIT
People asked me to clarify which of Go's patterns I was interested in translating to Haskell. Go has channel variables that are first class: they can be passed around and returned by functions. I can read from and write to these channels, and so communicate easily between routines that run concurrently. Go also has a go keyword that, according to the language spec, starts the execution of a function concurrently as an independent thread and continues executing the code without waiting.
The exact pattern I'm interested in is something like this (Go's syntax is weird - variables are declared by varName varType instead of the usual inverted way - but I think it is readable):
func generateStep(ch chan int) {
    // ch is a variable of type chan int, which is a channel that communicates integers
    for {
        ch <- randomInteger() // just sends random integers into the channel
    }
}

func filter(input, output chan int) {
    var state int
    for {
        step := <-input // reads an int from the input channel
        newstate := update(state, step) // update the variable with some update function
        if criteria(newstate, state) {
            state = newstate // if the new state passes some criteria, accept the update
        }
        output <- state // pass it to the output channel
    }
}

func main() {
    intChan := make(chan int)
    mcChan := make(chan int)
    go generateStep(intChan) // run the functions concurrently
    go filter(intChan, mcChan)
    var x int
    for i := 0; i < numSteps; i++ {
        x = <-mcChan // get values from the filtered channel
        accumulateStats(x) // calculate some statistics
    }
    printStatisticsAbout(x)
}
My primary interest is to do Monte Carlo simulations, in which I generate configurations sequentially by trying to modify the current state of the system and accepting the modification if it satisfies some criteria.
The fact that, using channels, I could write a very simple, readable and small Monte Carlo simulation that would run in parallel on my multicore processor really impressed me.
The problem is that Go has some limitations (especially, it lacks polymorphism in the way I'm accustomed to in Haskell), and besides that, I really like Haskell and don't want to trade it away. So the question is whether there's some way to use mechanics that look like the code above to do a concurrent simulation in Haskell easily.
EDIT(2, context):
I have no formal training in computer science, especially not in concurrency. I'm just a guy who creates simple programs to solve simple problems in my daily research routine, in a discipline not at all related to CS. I just find the way Haskell works interesting and like to use it for my little chores.
I had never heard of CSP channels, let alone pi-calculus. Sorry if the question seems ill posed; that's probably down to my huge ignorance of the matter.
You are right, I should be more specific about which pattern in Go I'd like to replicate in Haskell, and I'll try to edit the question to be more specific. But don't expect profound theoretical questions. The thing is just that, from the little I have read and coded, it seems Go has a neat way to do concurrency (and in my case this just means that my job of getting all my cores humming with numerical calculations is easier), and if I could use a similar syntax in Haskell I'd be glad.
I think what you are looking for is Control.Concurrent.Chan from base. I haven't found it to be any different from Go's chans, other than the obvious Haskellifications. Channels aren't special to Go; have a look at the wiki page about them.
Channels are part of a more general concept called communicating sequential processes (CSP), and if you want to do programming in the style of CSP in Haskell you might want to take a look at the Communicating Haskell Processes (CHP) package.
CHP is only one way of doing concurrency in Haskell; take a look at the HaskellWiki concurrency page for more information. I think your use case might be best written using Data Parallel Haskell; however, that is currently a work in progress, so you might want to use something else for now.
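To make the Chan suggestion a bit more concrete, here is a minimal sketch of the pipeline from the question, using only base. The names lcg, generateStep, filterSteps and simulate, the step range and the acceptance rule are placeholders made up for this sketch, not anything from the question or from a library. One difference worth keeping in mind: Chan is unbounded, so unlike an unbuffered Go channel the producer never blocks.

```haskell
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (replicateM)

-- Hypothetical stand-in for a random source: a tiny linear congruential
-- generator (assumes a 64-bit Int so the multiplication does not overflow).
lcg :: Int -> Int
lcg s = (s * 1103515245 + 12345) `mod` 2147483648

-- Producer: sends an endless stream of pseudo-random steps in [-5, 5].
generateStep :: Chan Int -> IO ()
generateStep ch = loop 42
  where
    loop s = do
      let s' = lcg s
      writeChan ch ((s' `div` 65536) `mod` 11 - 5)
      loop s'

-- Filter: proposes state + step and accepts it only if it passes the
-- (made-up) criterion of staying within [-10, 10].
filterSteps :: Chan Int -> Chan Int -> IO ()
filterSteps input output = loop 0
  where
    loop state = do
      step <- readChan input
      let proposed = state + step
          state'   = if abs proposed <= 10 then proposed else state
      writeChan output state'
      loop state'

-- Runs both stages concurrently, collects the first n filtered states,
-- then stops the worker threads.
simulate :: Int -> IO [Int]
simulate n = do
  steps    <- newChan
  filtered <- newChan
  producer <- forkIO (generateStep steps)
  worker   <- forkIO (filterSteps steps filtered)
  xs <- replicateM n (readChan filtered)
  mapM_ killThread [producer, worker]
  return xs

main :: IO ()
main = do
  xs <- simulate 1000
  let mean = fromIntegral (sum xs) / fromIntegral (length xs) :: Double
  putStrLn ("mean state: " ++ show mean)
```

To actually use several cores, the program would need to be built with -threaded and run with +RTS -N; without that the threads are still concurrent but share one core.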
Extending HaskellElephant's answer, Control.Concurrent.Chan is the way to go for channels and Control.Concurrent's forkIO can emulate the go keyword. To make the syntax a bit more similar to Go, this set of aliases can be used:
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newMVar, swapMVar, readMVar)

data GoChan a = GoChan { chan :: Chan a, closed :: MVar Bool }

go :: IO () -> IO ThreadId
go = forkIO

make :: IO (GoChan a)
make = do
  ch <- newChan
  cl <- newMVar False
  return $ GoChan ch cl

get :: GoChan a -> IO a
get ch = do
  cl <- readMVar $ closed ch
  if cl
    then error "Can't read from closed channel!"
    else readChan $ chan ch

(=->) :: a -> GoChan a -> IO ()
v =-> ch = do
  cl <- readMVar $ closed ch
  if cl
    then error "Can't write to closed channel!"
    else writeChan (chan ch) v

-- Applies func to each value until the channel is marked closed,
-- returning the results in order.
forRange :: GoChan a -> (a -> IO b) -> IO [b]
forRange ch func = fmap reverse $ range_ []
  where
    range_ acc = do
      cl <- readMVar $ closed ch
      if cl
        then return acc
        else do
          v <- get ch
          r <- func v
          range_ (r : acc)

close :: GoChan a -> IO ()
close ch = do
  _ <- swapMVar (closed ch) True
  return ()
This can be used like so:
import Control.Monad

generate :: GoChan Int -> IO ()
generate c = do
  forM_ [1..100] (=-> c)
  close c

process :: GoChan Int -> IO ()
process c = void $ forRange c print

main :: IO ()
main = do
  c <- make
  go $ generate c
  process c
(Warning: untested code)
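One caveat with a closed flag that is checked separately from the queue: as far as I can tell, a reader can miss values that are still queued when close is called, or block on an empty channel that gets closed afterwards. A different technique (not the GoChan wrapper above) is to send the end-of-stream marker through the channel itself. Here is a minimal sketch of that idea using Chan (Maybe a); produce and consume are names invented for this sketch.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

-- Producer: sends every value wrapped in Just, then Nothing to mark the end.
produce :: [a] -> Chan (Maybe a) -> IO ()
produce xs ch = do
  mapM_ (writeChan ch . Just) xs
  writeChan ch Nothing

-- Consumer: applies the handler to each value until the end marker arrives,
-- and returns the handler results in order.
consume :: Chan (Maybe a) -> (a -> IO b) -> IO [b]
consume ch handler = loop []
  where
    loop acc = do
      m <- readChan ch
      case m of
        Nothing -> return (reverse acc)
        Just v  -> do
          r <- handler v
          loop (r : acc)

main :: IO ()
main = do
  ch <- newChan
  _ <- forkIO (produce [1 .. 5 :: Int] ch)
  squares <- consume ch (\v -> return (v * v))
  print squares  -- [1,4,9,16,25]
```

This only works as written for a single consumer; with several consumers, each one would need its own end marker.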
I have a custom type
type GI a = StateT GenState IO a
where GenState is a state I keep for Generating Random Trees of some kind.
When generating my trees, termination is not guaranteed in a reasonable amount of time. That's why I thought I might terminate the calculation and restart it over and over again with a timeout until a result is given.
So my question is how to write a function of the form
tryGeneration :: GI a -> GI a
tryGeneration action = ...
where action is the computation to attempt within a given number of microseconds; if it runs out of time, the action should be restarted from the beginning.
Please keep in mind that I'm quite new to monad transformers and cannot say that I fully understand them yet.
I tried to use lift with System.Timeout.timeout and did not succeed.
EDIT: thank you all for your suggestions. I followed them, and got it done in the IO monad.
tryGenerationTime :: Int -> GenState -> GI a -> IO (a, GenState)
tryGenerationTime time state action = do
  (_, s') <- -- change the random state to not generate the same thing over and over
  res <- timeout time (runStateT action s')
  case res of
    Nothing -> tryGenerationTime time s' action
    Just r -> return r

timeItT :: Int -> GI a -> GI a
timeItT time action = do
  state <- get
  (x, s') <- lift $ tryGenerationTime time state action
  put s'
  return x
Any suggestions for improving this code are welcome. I just wanted to get it done fast, since that wasn't the solution to my generation problem and I needed to set a limit on the tree height to succeed.
I suspect what you actually want is more like
tryGeneration :: GI a -> IO a
tryGeneration action = ...
in such a way that all of your "build a tree" actions have timeout-based retries.
The key thing to understand is that "attempt to do X; if you aren't done in n milliseconds, start over" is IO's job; IO is where you have access to things like time. (Of course there are wrappers you could and should use when you only need part of what IO has to offer.)
This is fine; you have access to IO in GI, you probably just have to lift it.
That said, there's not enough information here to say exactly how to do what you want, and I'm more familiar with free-monad effect systems than mtl transformers anyway...
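For what it's worth, here is a rough sketch of the "run it in IO with a timeout, then lift it back" idea for a plain StateT s IO stack. It uses the transformers package rather than mtl, and retryWithTimeout, the perturb argument and the demo action are all made up for illustration; GenState and GI from the question are not modelled. Note that timeout only covers the IO action itself: if the expensive part is a lazy pure value, it would have to be forced inside the timed action.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, put, runStateT)
import System.Timeout (timeout)

-- | Run an action with a time limit in microseconds. On timeout, the state
-- changes made by the failed attempt are discarded, the starting state is
-- perturbed with the supplied function, and the action is tried again.
retryWithTimeout :: Int -> (s -> s) -> StateT s IO a -> StateT s IO a
retryWithTimeout micros perturb action = do
  s0 <- get
  (x, s1) <- lift (attempt s0)
  put s1
  return x
  where
    attempt s = do
      res <- timeout micros (runStateT action s)
      case res of
        Just r  -> return r
        Nothing -> attempt (perturb s)

-- Demo: the state counts attempts; the action is too slow for the first two.
demo :: IO Int
demo = evalStateT (retryWithTimeout 50000 (+ 1) slowAtFirst) 0
  where
    slowAtFirst :: StateT Int IO Int
    slowAtFirst = do
      n <- get
      lift (threadDelay (if n < 2 then 500000 else 1000))
      return n

main :: IO ()
main = demo >>= print  -- prints 2
```

The perturb function plays the role of "change the random state" so that a retry does not repeat exactly the same attempt.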
I am a senior C/C++/Java/Assembler programmer and I have always been fascinated by the pure functional programming paradigm. From time to time, I try to implement something useful with it, e.g., a small tool, but often I quickly reach a point where I realize that I (and my tool, too) would be much faster in a non-pure language. It's probably because I have much more experience with imperative programming languages, with thousands of idioms, patterns and typical solution approaches in my head.
Here is one of those situations. I have encountered it several times and I hope you guys can help me.
Let's assume I write a tool to simulate communication networks. One important task is the generation of network packets. The generation is quite complex, consisting of dozens of functions and configuration parameters, but at the end there is one master function and because I find it useful I always write down the signature:
generatePackets :: Configuration -> [Packet]
However, after a while I notice that it would be great if the packet generation had some kind of random behavior deep down in one of the many sub-functions of the generation process. Since I need a random number generator for that (and I also need it in some other places in the code), this means manually changing dozens of signatures to something like
f :: Configuration -> RNGState [Packet]
with
type RNGState = State StdGen
I understand the "mathematical" necessity (no states) behind this. My question is on a higher (?) level: How would an experienced Haskell programmer have approached this situation? What kind of design pattern or work flow would have avoided the extra work later?
I have never worked with an experienced Haskell programmer. Maybe you will tell me that you never write signatures because you have to change them too often afterwards, or that you give all your functions a state monad, "just in case" :)
One approach that I've been fairly successful with is using a monad transformer stack. This lets you both add new effects when needed and also track the effects required by particular functions.
Here's a really simple example.
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import Control.Monad.Reader

data Config = Config { v1 :: Int, v2 :: Int }

-- the type of the entire program describes all the effects that it can do
type Program = StateT Int (ReaderT Config IO) ()

runProgram program config startState =
  runReaderT (runStateT program startState) config

-- doesn't use configuration values. doesn't do IO
step1 :: MonadState Int m => m ()
step1 = get >>= \x -> put (x+1)

-- can use configuration and change state, but can't do IO
step2 :: (MonadReader Config m, MonadState Int m) => m ()
step2 = do
  x <- asks v1
  y <- get
  put (x+y)

-- can use configuration and do IO, but won't touch our internal state
step3 :: (MonadReader Config m, MonadIO m) => m ()
step3 = do
  x <- asks v2
  liftIO $ putStrLn ("the value of v2 is " ++ show x)

program :: Program
program = step1 >> step2 >> step3

main :: IO ()
main = do
  let config = Config { v1 = 42, v2 = 123 }
      startState = 17
  result <- runProgram program config startState
  return ()
Now if we want to add another effect:
-- requires: import Control.Monad.Writer
step4 :: MonadWriter String m => m ()
step4 = tell "done!"
program :: Program
program = step1 >> step2 >> step3 >> step4
Just adjust Program and runProgram:
type Program = StateT Int (ReaderT Config (WriterT String IO)) ()
runProgram program config startState =
  runWriterT $ runReaderT (runStateT program startState) config
To summarize, this approach lets us decompose a program in a way that tracks effects but also allows adding new effects as needed without a huge amount of refactoring.
edit:
It's come to my attention that I didn't answer the question about what to do for code that's already written. In many cases, it's not too difficult to change pure code into this style:
computation :: Double -> Double -> Double
computation x y = x + y
becomes
computation :: Monad m => Double -> Double -> m Double
computation x y = return (x + y)
This function will now work for any monad, but doesn't have access to any extra effects. Specifically, if we add another monad transformer to Program, then computation will still work.
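As a small illustration of "works for any monad", here is a sketch that runs the same computation in Identity (effectively pure), in Maybe and in IO, using only base. The names pureResult and maybeResult are just for this example.

```haskell
import Data.Functor.Identity (runIdentity)

computation :: Monad m => Double -> Double -> m Double
computation x y = return (x + y)

-- In Identity, the monadic version is as good as the pure one.
pureResult :: Double
pureResult = runIdentity (computation 1 2)

-- The same definition, used at type Maybe.
maybeResult :: Maybe Double
maybeResult = computation 1 2

main :: IO ()
main = do
  -- ... and here at type IO.
  r <- computation 1 2
  print (pureResult, maybeResult, r)  -- (3.0,Just 3.0,3.0)
```

The caller picks the monad, so the function keeps working when the surrounding transformer stack changes.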
I am writing a simple application that can save and load its current state.
The save function looks like this:
doSave :: BoardType -> Field -> [Char] -> Bool
doSave board player fileName = do
  let x = encodeFile fileName (board :: BoardType, player :: Field)
  True -- there will be exception handling
And my load function:
doLoad :: [Char] -> IO (BoardType, Field)
doLoad fileName = decodeFile fileName :: IO (BoardType, Field)
And here's my problem: after loading, I have an IO (BoardType, Field), which does not fit my program and other functions, which probably should not accept IO parameters. If I followed this IO escalation, IO would end up everywhere in my application. Is that necessary (or normal in Haskell)?
And finally - is there a simple way I can get rid of this IO?
It takes a little while to get used to.
Some monads let you extract the "inner value" after some work, but IO never does.
There is no way that e.g. returning the system time can ever be "pure" so any calculations you make with it need to remain wrapped in IO.
However, that doesn't mean most of your code lives in IO-land.
main = do
  (bt, fld) <- doLoad "somefilename"
  let bResult = doSomethingPureWithBoard bt
  let fResult = doSomethingPureWithField fld
  let bt2 = updateBoard bt bResult fResult
  doSave bt2 fld "someFilename" -- This should also be in IO
You can always call pure functions from IO, just not the other way around. The <- gives you the "unwrapped" values while you are in an IO function. Actually it's passing the results as parameters to the next "statement" - google around "de-sugaring do notation" and similar.
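To show what that desugaring looks like, here is a tiny sketch with the same function written twice, once with do notation and once with (>>=) directly. The names double, sugared and desugared are made up for the example.

```haskell
-- Pure function: no IO in its type, so it cannot perform side effects.
double :: Int -> Int
double n = n * 2

-- With do notation: `<-` binds the result of the IO action to a name.
sugared :: IO Int -> IO Int
sugared load = do
  n <- load
  let m = double n
  return (m + 1)

-- The same function with the do block written out by hand: the result of
-- `load` is passed as a parameter to the rest of the computation.
desugared :: IO Int -> IO Int
desugared load = load >>= \n -> let m = double n in return (m + 1)

main :: IO ()
main = do
  a <- sugared (return 20)
  b <- desugared (return 20)
  print (a, b)  -- (41,41)
```

In both versions the pure function double is called from inside IO, which is the direction that is always allowed.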
is it necessary (or - normal in haskell language)
Your application will typically have an outer wrapper of IO actions, beginning with main :: IO (), and then progressively more restricted code with fewer and fewer privileges, until you're only dealing with pure code.
Sometimes I want to run at most a fixed number of IO actions in parallel at once for network activity, etc. I whipped up a small concurrent thread function which works well with https://gist.github.com/810920, but this isn't really a pool, as all IO actions must finish before others can start.
The type of what I'm looking for would be something like:
runPool :: Int -> [IO a] -> IO [a]
and should be able to operate on finite and infinite lists.
The pipes package looks like it would be able to achieve this quite well, but I feel there is probably a similar solution to the gist I have provided just using mvars, etc, from the haskell-platform.
Has anyone encountered an idiomatic solution without any heavy dependencies?
You need a thread pool. If you want something short, you could take inspiration from Control.ThreadPool (from the control-engine package, which also provides more general functions). For instance, threadPoolIO is just:
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, forever)

threadPoolIO :: Int -> (a -> IO b) -> IO (Chan a, Chan b)
threadPoolIO nr mutator = do
  input <- newChan
  output <- newChan
  forM_ [1..nr] $
    \_ -> forkIO (forever $ do
      i <- readChan input
      o <- mutator i
      writeChan output o)
  return (input, output)
It uses two Chans for communication with the outside, but that's usually what you want; it really helps in writing code that doesn't get messed up.
If you absolutely want to wrap it up in a function of your type, you can encapsulate the communication too:
runPool :: Int -> [IO a] -> IO [a]
runPool n as = do
  (input, output) <- threadPoolIO n id
  forM_ as $ writeChan input
  sequence (replicate (length as) $ readChan output)
This won't preserve the order of your actions. Is that a problem? (It's easy to fix by transmitting the index of each action, or by using an array to store the responses.)
Note: the n threads will stay alive forever with this simplistic version. Adding a "killAll" action to the values returned by threadPoolIO would solve this handily if you intend to create and discard several such pools in a long-running application (if not, given how lightweight threads are in Haskell, it's probably not worth the bother).
Note that this function works on finite lists only. That's because IO is normally strict, so you can't start processing elements of the IO [a] result before the whole list has been produced. If you really want that, you'll either have to use lazy IO with unsafeInterleaveIO (maybe not the best idea) or completely change your model and use something like conduits to stream your results.
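Here is a rough sketch of the two fixes mentioned above, combined: each action is tagged with its index so the results can be put back in order, and the workers are killed once all results are in. runPoolOrdered is a made-up name, and the sketch assumes n >= 1 and that no action throws (a failing action would leave the collector waiting forever).

```haskell
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Monad (forM_, forever, replicateM)
import Data.List (sortOn)

-- | Run the actions on n worker threads and return the results in the
-- order of the input list. Finite lists only.
runPoolOrdered :: Int -> [IO a] -> IO [a]
runPoolOrdered n actions = do
  input  <- newChan
  output <- newChan
  -- Each worker repeatedly takes an (index, action) pair, runs the action,
  -- and reports the result together with its index.
  workers <- replicateM n $ forkIO $ forever $ do
    (i, act) <- readChan input
    r <- act
    writeChan output (i, r)
  let indexed = zip [0 :: Int ..] actions
  forM_ indexed (writeChan input)
  results <- replicateM (length indexed) (readChan output)
  -- The "killAll" step: stop the idle workers once everything is collected.
  mapM_ killThread workers
  return (map snd (sortOn fst results))

main :: IO ()
main = do
  rs <- runPoolOrdered 3 (map (return . (* 2)) [1 .. 10 :: Int])
  print rs  -- [2,4,6,8,10,12,14,16,18,20]
```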
Here's an example Haskell FRP program using the reactive-banana library. I'm only just starting to feel my way with Haskell, and especially haven't quite got my head around what FRP means. I'd really appreciate some critique of the code below
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

{-
Example FRP/zeromq app.
The idea is that messages come into a zeromq socket in the form "id state". The state of each id is tracked until it's complete.
-}

import Control.Monad
import Data.ByteString.Char8 as C (unpack)
import Data.Map as M
import Data.Maybe
import Reactive.Banana
import System.Environment (getArgs)
import System.ZMQ

data Msg = Msg {mid :: String, state :: String}
  deriving (Show, Typeable)

type IdMap = Map String String

-- | Deserialize a string to a Maybe Msg
fromString :: String -> Maybe Msg
fromString s =
  case words s of
    (x:y:[]) -> Just $ Msg x y
    _ -> Nothing

-- | Map a message to a partial operation on a map
-- If the 'state' of the message is "complete" the operation is a delete
-- otherwise it's an insert
toMap :: Msg -> IdMap -> IdMap
toMap msg = case msg of
  Msg id_ "complete" -> delete id_
  _ -> insert (mid msg) (state msg)

main :: IO ()
main = do
  (socketHandle,runSocket) <- newAddHandler
  args <- getArgs
  let sockAddr = case args of
        [s] -> s
        _ -> "tcp://127.0.0.1:9999"
  putStrLn ("Socket: " ++ sockAddr)
  network <- compile $ do
    recvd <- fromAddHandler socketHandle
    let
      -- Filter out the Nothings
      justs = filterE isJust recvd
      -- Accumulate the partially applied toMap operations
      counter = accumE M.empty $ (toMap . fromJust <$> justs)
    -- Print the contents
    reactimate $ fmap print counter
  actuate network
  -- Get a socket and kick off the eventloop
  withContext 1 $ \ctx ->
    withSocket ctx Sub $ \sub -> do
      connect sub sockAddr
      subscribe sub ""
      linkSocketHandler sub runSocket

-- | Receive a message, deserialize it to a 'Msg' and call the action with the message
linkSocketHandler :: Socket a -> (Maybe Msg -> IO ()) -> IO ()
linkSocketHandler s runner = forever $ do
  receive s [] >>= runner . fromString . C.unpack
There's a gist here: https://gist.github.com/1099712.
I'd particularly welcome any comments on whether this is a "good" use of accumE (I'm unclear whether this function will traverse the whole event stream each time, although I'm guessing not).
I'd also like to know how one would go about pulling in messages from multiple sockets; at the moment I have one event loop inside a forever. As a concrete example of this, how would I add a second socket (a REQ/REP pair in zeromq parlance) to query the current state of the IdMap inside counter?
(Author of reactive-banana speaking.)
Overall, your code looks fine to me. I don't actually understand why you are using reactive-banana in the first place, but you'll have your reasons. That said, if you are looking for something like Node.js, remember that Haskell's lightweight threads make it unnecessary to use an event-based architecture.
Addendum: Basically, functional reactive programming is useful when you have a variety of different inputs, states and output that must work together with just the right timing (think GUIs, animations, audio). In contrast, it's overkill when you are dealing with many essentially independent events; these are best handled with ordinary functions and the occasional state.
Concerning the individual questions:
"I'd particularly welcome any comments around whether this is a "good" use of accumE, (I'm unclear of this function will traverse the whole event stream each time although I'm guessing not)."
Looks fine to me. As you guessed, the accumE function is indeed real-time; it will only store the current accumulated value.
Judging from your guess, you seem to be thinking that whenever a new event comes in, it will travel through the network like a firefly. While this does happen internally, it is not how you should think about functional reactive programming. Rather, the right picture is this: the result of fromAddHandler is the complete list of input events as they will happen. In other words, you should think that recvd contains the ordered list of each and every event from the future. (Of course, in the interest of your own sanity, you shouldn't try to look at them before their time has come. ;-)) The accumE function simply transforms one list into another by traversing it once.
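To make the "list of future events" picture concrete, here is a toy model in which an event stream is just a list of occurrences in time order. This is only a mental model, not how the library is implemented, and EventModel, accumEModel and updates are names invented for the sketch. In this model, accumulating is a single left-to-right scan.

```haskell
import qualified Data.Map as M

-- Toy model: an event stream is the list of its occurrences, in time order.
type EventModel a = [a]

-- Model of accumulation: each incoming function is applied to the value
-- accumulated so far, and every intermediate result becomes an occurrence.
-- The initial value itself is not an occurrence, hence the drop 1.
accumEModel :: a -> EventModel (a -> a) -> EventModel a
accumEModel x0 = drop 1 . scanl (\acc f -> f acc) x0

type IdMap = M.Map String String

-- Three updates, in the style of the partially applied toMap operations.
updates :: EventModel (IdMap -> IdMap)
updates = [M.insert "a" "started", M.insert "b" "started", M.delete "a"]

main :: IO ()
main = mapM_ (print . M.toList) (accumEModel M.empty updates)
-- [("a","started")]
-- [("a","started"),("b","started")]
-- [("b","started")]
```

Each output depends only on the previous accumulated value and the newest update, which is why nothing needs to be re-traversed when a new event arrives.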
I will need to make this way of thinking more clear in the documentation.
"Also I'd like to know how one would go about pulling in messages from multiple sockets - at the moment I have on event loop inside a forever. As a concrete example of this how would I add second socket (a REQ/REP pair in zeromq parlance) to query to the current state of the IdMap inside counter?"
If the receive function does not block, you can simply call it twice on different sockets:
linkSocketHandler s1 s2 runner1 runner2 = forever $ do
  receive s1 [] >>= runner1 . fromString . C.unpack
  receive s2 [] >>= runner2 . fromString . C.unpack
If it does block, you will need to use threads, see also the section Handling Multiple TCP Streams in the book Real World Haskell. (Feel free to ask a new question on this, as it is outside the scope of this one.)
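For the blocking case, one possible shape is a reader thread per source, with everything merged into a single channel so that one loop can dispatch the events. The sketch below uses only base and stands in plain Chans for the sockets; mergeSources and demo are made-up names, and nothing here is zeromq or reactive-banana API.

```haskell
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Monad (forever, replicateM)

-- | Fork one reader thread per blocking source and merge whatever they
-- receive into a single channel, tagged by origin. Returns an action that
-- fetches the next merged event and an action that stops both readers.
mergeSources :: IO a -> IO b -> IO (IO (Either a b), IO ())
mergeSources recvA recvB = do
  merged <- newChan
  ta <- forkIO (forever (recvA >>= writeChan merged . Left))
  tb <- forkIO (forever (recvB >>= writeChan merged . Right))
  return (readChan merged, mapM_ killThread [ta, tb])

-- Demo with two pre-filled channels standing in for two blocking sockets.
demo :: IO [Either Int String]
demo = do
  a <- newChan
  b <- newChan
  mapM_ (writeChan a) [1, 2, 3]
  mapM_ (writeChan b) ["x", "y"]
  (next, stop) <- mergeSources (readChan a) (readChan b)
  events <- replicateM 5 next
  stop
  return events

main :: IO ()
main = demo >>= mapM_ print
```

The order of events within one source is preserved, but how the two sources interleave depends on scheduling.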