Ensure IO computations are run in a specific thread

Ensure IO computations are run in a specific thread - haskell

I need to make sure that some actions are run on a specific OS thread. I wrote an API where this thread runs a loop listening to a TQueue and executes the given commands. From the API user side, there is an opaque value that is really just a newtype over this queue.
One problem is that what I really need is to embed arbitrary actions (type IO a), but I believe I can't directly exchange messages of that type. So I currently have something like this (pseudo code) :
makeSafe :: RubyInterpreter -> IO a -> IO (Either RubyError a)
makeSafe (RubyInterpreter q) a = do
mv <- newEmptyTMVarIO
-- embedded is of type IO (), letting me send this in my queue
let embedded = handleErrors a >>= atomically . putTMVar mv
atomically (writeTQueue q (SomeMessage embedded))
atomically (readTMVar mv)
(for more details, this is for the hruby package)
edit - clarifications :
Being able to send actions of type IO a would be nicer, but is not my main objective.
My main problem is that you can shoot yourself in the foot with this API, for example if there is a makeSafe call in the IO action that is passed as a parameter, this will hang.
My secondary problem is that this solution feels a bit contrived, and I wondered if there was a nicer/safer solution around.

Related

How to modify a state monad?

I use State Monad Transformer to manage global state like this
data State = State ...
StateT State IO ()
And I use amqp to consume messages from RabbitMQ. The state will be modified according to messages received. The function has the type like
consumeMsgs :: Channel
-> Text
-> Ack
-> ((Message, Envelope) -> IO ()) -- ^ the callback function
-> IO ConsumerTag
Right now we can ignore other parameters but the third which is a callback function I will supply and where the modification happen.
Because it's a mainly IO Monad, so I use this function as follows
consumeMsgs chan queue Rmq.Ack (flip evalStateT ssss . rmqCallback)
Here the ssss is the state I put in and I find that during the process of my callback function rmqCallback the state can be correctly modified. But, every time next callback happens the global state is the same as before the consumeMsgs is called or equal with ssss.
I understand State Monad is just a process needing an initial state to put in and maintain the state during whole way but has nothing to do with the state out of Monad (am I missing something?), so I count on MVar to hold and modify the state, and that works. I want to know it's there other way to handle this, maybe another Monad?

It looks like you could use Network.AMQP.Lifted.consumeMsgs. StateT s IO is an instance of MonadBaseControl IO m, so you could run whole consumeMsgs inside single runStateT
Yes, StateT monad transformer is basically a nice notation for a pure code, so if your API accepts only IO callbacks you have no choice but to use "real" state like MVar or IORef etc.
PS: As other answer suggests, the state changes done in Network.AMQP.Lifted.consumeMsgs's callback do not propagate to subsequent callback runs or resulting state. I cannot wrap my head around the implementation, but I tried liftBaseWith a bit and it really looks like so.

To add a clarification that might be useful for future reference, the accepted answer is not exact. While Network.AMQP.Lifted.consumeMsgs should work with StateT s IO, the RabbitMQ haskell library actually discards the monadic state after each use. This means that if you do use that instance, you will not see changes made after the initial consumeMsgs call, including changes made by the callback itself. The callback is basically called with the same Monadic state every time - the state in which it was when the callback was registered.
This means that you can use it to pass global configuration state, but not to keep track of state between callback executions.

Concurrency considerations between pipes and non-pipes code

I'm in the process of wrapping a C library for some encoding in a pipes interface, but I've hit upon some design decisions that need to be made.
After the C library is set up, we hold on to an encoder context. With this, we can either encode, or change some parameters (let's call the Haskell interface to this last function tune :: Context -> Int -> IO ()). There are two parts to my question:
The encoding part is easily wrapped up in a Pipe Foo Bar IO (), but I would also like to expose tune. Since simultaneous use of the encoding context must be lock protected, I would need to take a lock at every iteration in the pipe, and protect tune with taking the same lock. But now I feel I'm forcing hidden locks on the user. Am I barking up the wrong tree here? How is this kind of situation normally resolved in the pipes ecosystem? In my case I expect the pipe that my specific code is part of to always run in its own thread, with tuning happening concurrently, but I don't want to force this point of view upon any users. Other packages in the pipes ecosystem do not seem to force their users like either.
An encoding context that is no longer used needs to be properly de-initialized. How does one, in the pipes ecosystem, ensure that such things (in this case performing som IO actions) are taken care of when the pipe is destroyed?
A concrete example would be wrapping a compression library, in which case the above can be:
The compression strength is tunable. We set up the pipe and it runs along merrily. How should one best go about allowing the compression strength setting to be changed while the pipe keeps running, assuming that concurrent access to the compression codec context must be serialized?
The compression library allocated a bunch of memory off the Haskell heap when set up, and we'll need to call some library function to clean this up when the pipe is torn down.
Thanks… this might all be obvious, but I'm quite new to the pipes ecosystem.
Edit: Reading this after posting, I'm quite sure it's the vaguest question I've ever asked here. Ugh! Sorry ;-)

Regarding (1), the general solution is to change your Pipe's type to:
Pipe (Either (Context, Int) Foo) Bar IO ()
In other words, it accepts both Foo inputs and tune requests, which it processes internally.
So let's then assume that you have two concurrent Producers corresponding to inputs and tune requests:
producer1 :: Producer Foo IO ()
producer2 :: Producer (Context, Int) IO ()
You can use pipes-concurrency to create a buffer that they both feed into, like this:
example = do
(output, input) <- spawn Unbounded
-- input :: Input (Either (Context, Int) Foo)
-- output :: Output (Either (Context, Int) Foo)
let io1 = runEffect $ producer1 >-> Pipes.Prelude.map Right >-> toOutput output
io2 = runEffect $ producer2 >-> Pipes.Prelude.map Left >-> toOutput output
as <- mapM async [io1, io2]
runEffect (fromInput >-> yourPipe >-> someConsumer)
mapM_ wait as
You can learn more about the pipes-concurrency library by reading this tutorial.
By forcing all tune requests to go through the same single-threaded Pipe you can ensure that you don't accidentally have two concurrent invocations of the tune function.
Regarding (2) there are two ways you can acquire a resource using pipes. The more sophisticated approach is to use the pipes-safe library, which provides a bracket function that you can use within a Pipe, but that is probably overkill for your purpose and only exists for acquiring and releasing multiple resources over the lifetime of a pipe. A simpler solution is just to use the following with idiom to acquire the pipe:
withEncoder :: (Pipe Foo Bar IO () -> IO r) -> IO r
withEncoder k = bracket acquire release $ \resource -> do
k (createPipeFromResource resource)
Then a user would just write:
withEncoder $ \yourPipe -> do
runEffect (someProducer >-> yourPipe >-> someConsumer)
You can optionally use the managed package, which simplifies the types a bit and makes it easier to acquire multiple resources. You can learn more about it from reading this blog post of mine.

Functional Banana Traveller - input Handling : Will this do what I want?

The way I want to manage input for my game is to poll a TChan, and then create an Event when an eTick happens. But will the way I'm trying it work?
data UAC = UAC (AID,PlayerCommand) deriving Show
makeNetworkDescription :: forall t . Frameworks t =>
TChan UAC ->
AddHandler () ->
TChan GameState ->
Moment t ()
makeNetworkDescription commandChannel tickHandler gsChannel = do
eTick <- fromAddHandler tickHandler
bCChannel <- fromPoll $ grabCommands commandChannel
let eCChannel = bCChannel <# eTick
...
reactimate ...
grabCommands :: TChan UAC -> IO [UAC]
grabCommands unval = do
(atomically $ readTChan unval) `untilM` (atomically $ isEmptyTChan unval)
from the documentation for fromPoll
"Input, obtain a Behavior by frequently polling mutable data, like the current time. The resulting Behavior will be updated on whenever the event network processes an input event."
Am I understanding this correctly? The TChan is being populated from other code and then every eTick I empty it and get another Event t [UAC]?
Maybe my understanding is wrong, or this computation is too expensive for fromPoll. In that case what's a better direction to go in?

I was facing the same question in my game. I decided to try to keep the event network as free as possible from implementation-specific stuff (such as input protocols). Instead I block in a IO thread outside of the event network and send processed events to the event network from there (using something like the EventSource "design pattern" used in the reactive-banana examples.
The reason for this is that this way the event network has to process only well-defined and simple input commands and fromPoll is not needed. The particulars (such as if the input is coming from local input or a network, if the input events are well-formed, how errors are handled) are done in other parts of the program.
On the other hand, if the design of your game is such that input is handled only during the game ticks and input events must be buffered, then you will need some place to hold them. A TChan will do the trick as well as any other way, I suppose.

State-dependent event processing with state updates

I want to use FRP (i.e., reactive banana 0.6.0.0) for my project (a GDB/MI front-end). But I have troubles declaring the event network.
There are commands from the GUI and there are stop events from GDB. Both need to be handled and handling them depends on the state of the system.
My current approach looks like this (I think this is the minimum required complexity to show the problem):
data Command = CommandA | CommandB
data Stopped = ReasonA | ReasonB
data State = State {stateExec :: Exec, stateFoo :: Int}
data StateExec = Running | Stopped
create_network :: NetworkDescription t (Command -> IO ())
create_network = do
(eCommand, fCommand) <- newEvent
(eStopped, fStopped) <- newEvent
(eStateUpdate, fStateUpdate) <- newEvent
gdb <- liftIO $ gdb_init fStopped
let
eState = accumE initialState eStateUpdate
bState = stepper initialState eState
reactimate $ (handleCommand gdb fStateUpdate <$> bState) <#> eCommand
reactimate $ (handleStopped gdb fStateUpdate <$> bState) <#> eStopped
return fCommand
handleCommand and handelStopped react on commands and stop events depending on the current state. Possible reactions are calling (synchronous) GDB I/O functions and firing state update events. For example:
handleCommand :: GDB -> ((State -> State) -> IO ()) -> State -> Command -> IO ()
handleCommand gdb fStateUpdate state CommandA = case stateExec state of
Running -> do
gdb_interrupt gdb
fStateUpdate f
where f state' = state' {stateFoo = 23}
The problem is, when f gets evaluated by accumE, state' sometimes is different from state.
I am not 100% sure why this can happen as I don't fully understand the semantics of time and simultaneity and the order of "reactimation" in reactive banana. But I guess that state update functions fired by handleStopped might get evaluated before f thus changing the state.
Anyway, this event network leads to inconsistent state because the assumptions of f on the "current" state are sometimes wrong.
I have been trying to solve this problem for over a week now and I just cannot figure it out. Any help is much appreciated.

It looks like you want to make a eStateUpdate event occur whenever eStop or eCommand occurs?
If so, you can simply express it as the union of the two events:
let
eStateUpdate = union (handleCommand' <$> eCommand)
(handleStopped' <$> eStopped)
handleCommand' :: Command -> (State -> State)
handleStopped' :: Stopped -> (State -> State)
eState = accumE initialState eStateUpdate
etc.
Remember: events behave like ordinary values which you can combine to make new ones, you're not writing a chain of callback functions.
The newEvent function should only be used if you want to import an event from the outside world. That's the case for eCommand and eStopped, as they are triggered by the external GDB, but the eStateUpdate event seems to be internal to the network.
Concerning behavior of your current code, reactive-banana always does the following things when receiving an external event:
Calculate/update all event occurrences and behavior values.
Run the reactimates in order.
But it may well happen happen that step 2 triggers the network again (for instance via the fStateUpdate function), in which case the network calculates new values and calls the reactimates again, as part of this function call. After this, flow control returns to the first sequence of reactimates that is still being run, and a second call to fStateUpdate will have strange effects: the behaviors inside the network have been updated already, but the argument to this call is still an old value. Something like this:
reactimate1
reactimate2
fStateUpdate -- behaviors inside network get new values
reactimate1'
reactimate2'
reactimate3 -- may contain old values from first run!
Apparently, this is tricky to explain and tricky to reason about, but fortunately unnecessary if you stick to the guidelines above.
In a sense, the latter part embodies the trickiness of writing event handlers in the traditional style, whereas the former part embodies the (relative) simplicity of programming with events in FRP-style.
The golden rule is:
Do not call another event handler while handling an event.
You don't have to follow this rule, and it can be useful at times; but things will become complicated if you do that.

As far as I can see, FRP seems not to be the right abstraction for my problem.
So I switched to actors with messages of type State -> IO State.
This gives me the required serialization of events and the possibility to do IO when updating the state. What I loose is the nice description of the event network. But it's not too bad with actors either.

How can one implement a forking try-catch in Haskell?

I want to write a function
forkos_try :: IO (Maybe α) -> IO (Maybe α)
which Takes a command x. x is an imperative operation which first mutates state, and then checks whether that state is messed up or not. (It does not do anything external, which would require some kind of OS-level sandboxing to revert the state.)
if x evaluates to Just y, forkos_try returns Just y.
otherwise, forkos_try rolls back state, and returns Nothing.
Internally, it should fork() into threads parent and child, with x running on child.
if x succeeds, child should keep running (returning x's result) and parent should die
otherwise, parent should keep running (returning Nothing) and child should die
Question: What's the way to write something with equivalent, or more powerful semantics than forkos_try? N.B. -- the state mutated (by x) is in an external library, and cannot be passed between threads. Hence, the semantic of which thread to keep alive is important.
Formally, "keep running" means "execute some continuation rest :: Maybe α -> IO () ". But, that continuation isn't kept anywhere explicit in code.
For my case, I think it will (for the time) work to write it in different style, using forkOS (which takes the entire computation child will run), since I can write an explicit expression for rest. But, it troubles me that I can't figure out how do this with the primitive function forkOS -- one would think it would be general enough to support any specific case (which could appear as a high-level API, like forkos_try).
EDIT -- please see the example code with explicit rest if the problem's still not clear [ http://pastebin.com/nJ1NNdda ].
p.s. I haven't written concurrency code in a while; hopefully my knowledge of POSIX fork() is correct! Thanks in advance.

Things are a lot simpler to reason about if you model state explicitly.
someStateFunc :: (s -> Maybe (a, s))
-- inside some other function
case someStateFunc initialState of
Nothing -> ... -- it failed. stick with initial state
Just (a, newState) -> ... -- it suceeded. do something with
-- the result and new state
With immutable state, "rolling back" is simple: just keep using initialState. And "not rolling back" is also simple: just use newState.
So...I'm assuming from your explanation that this "external library" performs some nontrivial IO effects that are nevertheless restricted to a few knowable and reversible operations (modify a file, an IORef, etc). There is no way to reverse some things (launch the missiles, write to stdout, etc), so I see one of two choices for you here:
clone the world, and run the action in a sandbox. If it succeeds, then go ahead and run the action in the Real World.
clone the world, and run the action in the real world. If it fails, then replace the Real World with the snapshot you took earlier.
Of course, both of these are actually the same approach: fork the world. One world runs the action, one world doesn't. If the action succeeds, then that world continues; otherwise, the other world continues. You are proposing to accomplish this by building upon forkOS, which would clone the entire state of the program, but this would not be sufficient to deal with, for example, file modifications. Allow me to suggest instead an approach that is nearer to the simplicity of immutable state:
tryIO :: IO s -> (s -> IO ()) -> IO (Maybe a) -> IO (Maybe a)
tryIO save restore action = do
initialState <- save
result <- action
case result of
Nothing -> restore initialState >> return Nothing
Just x -> return (Just x)
Here you must provide some data structure s, and a way to save to and restore from said data structure. This allows you the flexibility to perform any cloning you know to be necessary. (e.g. save could copy a certain file to a temporary location, and then restore could copy it back and delete the temporary file. Or save could copy the value of certain IORefs, and then restore could put the value back.) This approach may not be the most efficient, but it's very straightforward.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string