How to modify a state monad? - haskell

I use the State monad transformer to manage global state, like this:
data State = State ...
StateT State IO ()
And I use amqp to consume messages from RabbitMQ. The state is to be modified according to the messages received. The consuming function has a type like this:
consumeMsgs :: Channel
            -> Text
            -> Ack
            -> ((Message, Envelope) -> IO ()) -- ^ the callback function
            -> IO ConsumerTag
For now we can ignore all the parameters except the callback function, which I supply and which is where the modifications happen.
Because the callback runs in plain IO, I use the function as follows:
consumeMsgs chan queue Rmq.Ack (flip evalStateT ssss . rmqCallback)
Here ssss is the initial state I pass in. I find that during a single run of my callback rmqCallback the state is modified correctly, but each time the next callback fires, the global state is back to what it was before consumeMsgs was called, i.e. equal to ssss.
I understand that the State monad is just a computation that needs an initial state and threads the state through the whole way, but has nothing to do with any state outside the monad (am I missing something?). So I fell back on an MVar to hold and modify the state, and that works. I want to know whether there is another way to handle this, maybe another monad?

It looks like you could use Network.AMQP.Lifted.consumeMsgs. StateT s IO is an instance of MonadBaseControl IO, so you could run the whole consumeMsgs call inside a single runStateT.
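A rough sketch of what that could look like (MyState, the queue name and the callback body are my assumptions, not from the question), with the caveat discussed below that the monadic state does not actually propagate between callback invocations:

{-# LANGUAGE OverloadedStrings #-}
import qualified Network.AMQP.Lifted as L
import           Network.AMQP (Ack (..), Channel, Envelope, Message, ackEnv)
import           Control.Monad.IO.Class (liftIO)
import           Control.Monad.State (StateT, modify, runStateT)

newtype MyState = MyState Int

-- StateT MyState IO has a MonadBaseControl IO instance (via monad-control),
-- which is what the lifted consumeMsgs requires.
consumeWithState :: Channel -> IO ((), MyState)
consumeWithState chan = runStateT register (MyState 0)
  where
    register = do
        _ <- L.consumeMsgs chan "myQueue" Ack callback
        return ()
    callback :: (Message, Envelope) -> StateT MyState IO ()
    callback (_, env) = do
        modify (\(MyState n) -> MyState (n + 1))  -- runs, but is discarded between calls
        liftIO (ackEnv env)
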
Yes, the StateT monad transformer is basically nice notation for pure code, so if your API accepts only IO callbacks, you have no choice but to use "real" mutable state such as an MVar or IORef (see the sketch below).
PS: As the other answer points out, state changes made in Network.AMQP.Lifted.consumeMsgs's callback do not propagate to subsequent callback runs or to the resulting state. I cannot wrap my head around the implementation, but I experimented with liftBaseWith a bit and it really does look like that.
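A minimal, self-contained sketch of the IORef approach (the connection parameters, queue name and MyState counter are illustrative assumptions, not part of the amqp API):

{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent (threadDelay)
import Data.IORef
import Network.AMQP

newtype MyState = MyState { msgCount :: Int }

main :: IO ()
main = do
    conn <- openConnection "127.0.0.1" "/" "guest" "guest"
    chan <- openChannel conn
    ref  <- newIORef (MyState 0)
    _ <- consumeMsgs chan "myQueue" Ack $ \(_, env) -> do
        modifyIORef' ref (\s -> s { msgCount = msgCount s + 1 })  -- survives across callbacks
        ackEnv env
    threadDelay 10000000  -- keep the main thread alive while the consumer runs
    MyState n <- readIORef ref
    putStrLn ("Handled " ++ show n ++ " messages")
    closeConnection conn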

To add a clarification that might be useful for future reference: the accepted answer is not exact. While Network.AMQP.Lifted.consumeMsgs should work with StateT s IO, the Haskell amqp library actually discards the monadic state after each use. This means that if you do use that instance, you will not see changes made after the initial consumeMsgs call, including changes made by the callback itself. The callback is basically invoked with the same monadic state every time: the state as it was when the callback was registered.
This means that you can use it to pass global configuration state, but not to keep track of state between callback executions.

Related

Accessing state in an IO-Monad

I'm trying to access the state of a State monad inside an IO action.
More specifically: I'm trying to write a state-dependent signal handler using installHandler from System.Posix.Signals, which requires IO; however, I'd like to perform different actions and change the state from inside the handler. I took a look at unliftio, but I read that state monads shouldn't be unlifted.
Is this possible? I'm more looking for an explanation than for a "copy-paste" solution. If unlifting State inside IO doesn't work, what would a solution look like, for when one wants to do some state-aware processing inside IO?
A value of type State s a does not contain a state. It is just an encapsulated function that can produce a resulting state and a result if you pass it a starting state (using the runState function). There is no way to access an intermediate (or "current") state from outside this function. This is what makes the State monad "pure".
You seem to intend to have a handler that does not always behave the same way (even when invoked with the same parameters) but depends on an outside state. This kind of behaviour is "impure" and cannot be achieved by pure means alone. So what you need here is something that encapsulates the impurity in a way that lets you access the "current value" of some state from within the handler, without that value itself being passed into the handler.
As you already know from the comments, the go-to tool for giving an IO action access to mutable state is an IORef. IORefs work because the Haskell runtime (traditionally, before multithreading at least) serializes IO actions, so the concept of a "current value" always makes sense: after every IO action, the value pointed to by each IORef is fixed, and the order in which IO actions happen is the order in which you chain them in do blocks or with the >>= operator. Signal handling is performed by the Haskell runtime in a deterministic way; roughly, every time two IO actions are chained, the runtime checks for pending signals and invokes the corresponding handler.
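To make that concrete for the signal handler in the question, a minimal sketch (the choice of SIGUSR1 and the counter are my own illustrative assumptions, not from the original post):

import Control.Concurrent (threadDelay)
import Data.IORef
import System.Posix.Signals

main :: IO ()
main = do
    ref <- newIORef (0 :: Int)
    let handler = do
            n <- atomicModifyIORef' ref (\c -> (c + 1, c + 1))  -- update and read the "current value"
            putStrLn ("Caught SIGUSR1, count = " ++ show n)
    _ <- installHandler sigUSR1 (Catch handler) Nothing
    threadDelay 60000000  -- keep the process alive so signals can be delivered
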
In case you want to write code that manipulates data in an imperative way (where you can have lots of variables, and even arrays whose single elements you update), you could write your code as an I/O action and use IORef and IOArray for it. But there is also a special "lite" version of IO that supports mutable state in the same way as I/O without being able to interact with the environment. The shared state has to be created, read and written from inside the same "capsule" of this lite IO, so that running the whole action does not interact with outside state, only with its internal state; the capsule as a whole is thus pure, even if single statements inside it can be considered impure. This lite version of IO is called ST, which is short for "state thread".
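A small, self-contained illustration of ST (my own example, not from the answer): the mutable STRef exists only inside runST, so the function as a whole stays pure.

import Control.Monad.ST
import Data.STRef

sumTo :: Int -> Int
sumTo n = runST $ do
    acc <- newSTRef 0                              -- mutable cell, local to this ST computation
    mapM_ (\i -> modifySTRef' acc (+ i)) [1 .. n]  -- imperative-style updates
    readSTRef acc                                  -- the final value escapes; the STRef does not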

Haskell STM alwaysSucceeds

There is a function in haskell's stm library with the following type signature:
alwaysSucceeds :: STM a -> STM ()
From what I understand of STM in haskell, there are three ways that something can "go wrong" (using that term loosely) while an STM computation is executing:
The value of a TVar that has been read is changed by another thread.
A user-specified invariant is violated. This seems usually to be triggered by calling retry to make the transaction start over. This effectively makes the thread block and then retry once a TVar in the read set changes.
An exception is thrown. Calling throwSTM causes this. This one differs from the first two in that the transaction doesn't get restarted; instead, the error is propagated and either crashes the program or is caught in the IO monad.
If these are accurate (and if they are not, please tell me), I can't understand what alwaysSucceeds could possibly do. The always function, which appears to be built on top of it, seems like it could be written without alwaysSucceeds as:
--This is probably wrong
always :: STM Bool -> STM ()
always stmBool = stmBool >>= check
The documentation for alwaysSucceeds says:
alwaysSucceeds adds a new invariant that must be true when passed to alwaysSucceeds, at the end of the current transaction, and at the end of every subsequent transaction. If it fails at any of those points then the transaction violating it is aborted and the exception raised by the invariant is propagated.
But since the argument is of type STM a (polymorphic in a), it can't use the value that the transaction returns for any part of the decision making. So, it seems like it would be looking for the different types of failures that I listed earlier. But what's the point of that? The STM monad already handles the failures. How would wrapping it in this function affect it? And why does the variable of type a get dropped, resulting in STM ()?
The special effect of alwaysSucceeds is not how it checks for failure at the point where it's run (running the "invariant" action by itself would do the same thing), but that it re-runs the invariant check at the end of later transactions.
Basically, this function creates a user-specified invariant as in (2) above, that has to hold not just right now, but also at the end of later transactions.
Note that a "transaction" doesn't refer to every single subaction in the STM monad, but to a combined action that is passed to atomically.
I guess the a is dropped just for convenience, so you don't have to convert an action to STM () (e.g. with void) before passing it to alwaysSucceeds. The return value would be useless for the later repeated checks anyhow.
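For illustration, a small sketch using always (my own example; note that the STM invariant mechanism was later deprecated and removed, so this only compiles with GHC/stm versions that still provide always and alwaysSucceeds):

import Control.Concurrent.STM
import Control.Exception (SomeException, try)

main :: IO ()
main = do
    balance <- newTVarIO (100 :: Int)
    -- Register the invariant: it is checked now and at the end of every
    -- subsequent transaction.
    atomically $ always ((>= 0) <$> readTVar balance)
    -- Fine: the invariant still holds afterwards.
    atomically $ modifyTVar' balance (subtract 30)
    -- Violates the invariant: this transaction is aborted and the exception
    -- raised by the invariant check is propagated.
    r <- try (atomically $ modifyTVar' balance (subtract 200))
    print (r :: Either SomeException ())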

Ensure IO computations are run in a specific thread

I need to make sure that some actions are run on a specific OS thread. I wrote an API where this thread runs a loop listening to a TQueue and executes the given commands. From the API user side, there is an opaque value that is really just a newtype over this queue.
One problem is that what I really need is to embed arbitrary actions (of type IO a), but I believe I can't directly exchange messages of that type. So I currently have something like this (pseudo code):
makeSafe :: RubyInterpreter -> IO a -> IO (Either RubyError a)
makeSafe (RubyInterpreter q) a = do
    mv <- newEmptyTMVarIO
    -- embedded is of type IO (), letting me send this in my queue
    let embedded = handleErrors a >>= atomically . putTMVar mv
    atomically (writeTQueue q (SomeMessage embedded))
    atomically (readTMVar mv)
(for more details, this is for the hruby package)
edit - clarifications:
Being able to send actions of type IO a would be nicer, but is not my main objective.
My main problem is that you can shoot yourself in the foot with this API: for example, if the IO action passed as a parameter itself contains a makeSafe call, this will hang.
My secondary problem is that this solution feels a bit contrived, and I wondered if there was a nicer/safer solution around.
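For context, here is a sketch of the receiving side that this API implies (the names are my assumptions, not taken from hruby): one bound OS thread loops forever and runs every queued action, so everything sent through the queue executes on that single thread.

import Control.Concurrent (forkOS)
import Control.Concurrent.STM
import Control.Monad (forever)

newtype Interpreter = Interpreter (TQueue (IO ()))

startInterpreter :: IO Interpreter   -- needs the threaded RTS for forkOS
startInterpreter = do
    q <- newTQueueIO
    _ <- forkOS $ forever $ do
        action <- atomically (readTQueue q)
        action  -- every queued action runs on this one OS thread
    return (Interpreter q)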

State-dependent event processing with state updates

I want to use FRP (i.e., reactive-banana 0.6.0.0) for my project (a GDB/MI front-end), but I have trouble declaring the event network.
There are commands from the GUI and there are stop events from GDB. Both need to be handled and handling them depends on the state of the system.
My current approach looks like this (I think this is the minimum required complexity to show the problem):
data Command = CommandA | CommandB
data Stopped = ReasonA | ReasonB

data State = State {stateExec :: StateExec, stateFoo :: Int}
data StateExec = Running | Stopped

create_network :: NetworkDescription t (Command -> IO ())
create_network = do
    (eCommand, fCommand) <- newEvent
    (eStopped, fStopped) <- newEvent
    (eStateUpdate, fStateUpdate) <- newEvent
    gdb <- liftIO $ gdb_init fStopped
    let
        eState = accumE initialState eStateUpdate
        bState = stepper initialState eState
    reactimate $ (handleCommand gdb fStateUpdate <$> bState) <#> eCommand
    reactimate $ (handleStopped gdb fStateUpdate <$> bState) <#> eStopped
    return fCommand
handleCommand and handleStopped react to commands and stop events depending on the current state. Possible reactions are calling (synchronous) GDB I/O functions and firing state update events. For example:
handleCommand :: GDB -> ((State -> State) -> IO ()) -> State -> Command -> IO ()
handleCommand gdb fStateUpdate state CommandA = case stateExec state of
    Running -> do
        gdb_interrupt gdb
        fStateUpdate f
  where
    f state' = state' {stateFoo = 23}
The problem is that when f eventually gets evaluated by accumE, state' sometimes differs from state.
I am not 100% sure why this can happen, as I don't fully understand the semantics of time and simultaneity and the order of "reactimation" in reactive-banana. But I guess that state update functions fired by handleStopped might get evaluated before f, thus changing the state.
Anyway, this event network leads to inconsistent state because the assumptions of f on the "current" state are sometimes wrong.
I have been trying to solve this problem for over a week now and I just cannot figure it out. Any help is much appreciated.
It looks like you want an eStateUpdate event to occur whenever eStopped or eCommand occurs?
If so, you can simply express it as the union of the two events:
let
    eStateUpdate = union (handleCommand' <$> eCommand)
                         (handleStopped' <$> eStopped)

    handleCommand' :: Command -> (State -> State)
    handleStopped' :: Stopped -> (State -> State)

    eState = accumE initialState eStateUpdate

etc.
Remember: events behave like ordinary values which you can combine to make new ones, you're not writing a chain of callback functions.
The newEvent function should only be used if you want to import an event from the outside world. That's the case for eCommand and eStopped, as they are triggered by the external GDB, but the eStateUpdate event seems to be internal to the network.
Concerning the behavior of your current code, reactive-banana always does the following when receiving an external event:
1. Calculate/update all event occurrences and behavior values.
2. Run the reactimates in order.
But it may well happen that step 2 triggers the network again (for instance via the fStateUpdate function), in which case the network calculates new values and calls the reactimates again, as part of this function call. After this, flow control returns to the first sequence of reactimates that is still being run, and a second call to fStateUpdate will have strange effects: the behaviors inside the network have already been updated, but the argument to this call is still an old value. Something like this:
reactimate1
reactimate2
    fStateUpdate  -- behaviors inside the network get new values
    reactimate1'
    reactimate2'
reactimate3       -- may contain old values from the first run!
Apparently, this is tricky to explain and tricky to reason about, but fortunately unnecessary if you stick to the guidelines above.
In a sense, the latter part embodies the trickiness of writing event handlers in the traditional style, whereas the former part embodies the (relative) simplicity of programming with events in FRP-style.
The golden rule is:
Do not call another event handler while handling an event.
You don't have to follow this rule, and it can be useful at times; but things will become complicated if you do that.
As far as I can see, FRP does not seem to be the right abstraction for my problem.
So I switched to actors with messages of type State -> IO State.
This gives me the required serialization of events and the ability to do IO when updating the state. What I lose is the nice declarative description of the event network, but it's not too bad with actors either.
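A minimal sketch of what such an actor can look like (my own reconstruction, not the poster's actual code): state updates of type State -> IO State are queued and applied one at a time by a single thread, which serializes them and allows IO during the update.

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan

startActor :: s -> IO (Chan (s -> IO s))
startActor s0 = do
    chan <- newChan
    _ <- forkIO (loop chan s0)
    return chan
  where
    loop chan s = do
        update <- readChan chan  -- take the next state-update message
        s'     <- update s       -- IO is allowed while computing the new state
        loop chan s'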

How can one implement a forking try-catch in Haskell?

I want to write a function
forkos_try :: IO (Maybe α) -> IO (Maybe α)
which takes a command x. Here x is an imperative operation which first mutates state, and then checks whether that state is messed up or not. (It does not do anything external, which would require some kind of OS-level sandboxing to revert the state.)
if x evaluates to Just y, forkos_try returns Just y.
otherwise, forkos_try rolls back state, and returns Nothing.
Internally, it should fork() into parent and child threads, with x running in the child.
if x succeeds, the child should keep running (returning x's result) and the parent should die
otherwise, the parent should keep running (returning Nothing) and the child should die
Question: What's the way to write something with equivalent, or more powerful semantics than forkos_try? N.B. -- the state mutated (by x) is in an external library, and cannot be passed between threads. Hence, the semantic of which thread to keep alive is important.
Formally, "keep running" means "execute some continuation rest :: Maybe α -> IO () ". But, that continuation isn't kept anywhere explicit in code.
For my case, I think it will (for the time being) work to write it in a different style, using forkOS (which takes the entire computation the child will run), since I can write an explicit expression for rest. But it troubles me that I can't figure out how to do this with the primitive function forkOS -- one would think it would be general enough to support any specific case (which could appear as a high-level API, like forkos_try).
EDIT -- please see the example code with explicit rest if the problem's still not clear [ http://pastebin.com/nJ1NNdda ].
p.s. I haven't written concurrency code in a while; hopefully my knowledge of POSIX fork() is correct! Thanks in advance.
Things are a lot simpler to reason about if you model state explicitly.
someStateFunc :: s -> Maybe (a, s)

-- inside some other function
case someStateFunc initialState of
    Nothing -> ...            -- it failed; stick with the initial state
    Just (a, newState) -> ... -- it succeeded; do something with
                              -- the result and the new state
With immutable state, "rolling back" is simple: just keep using initialState. And "not rolling back" is also simple: just use newState.
So...I'm assuming from your explanation that this "external library" performs some nontrivial IO effects that are nevertheless restricted to a few knowable and reversible operations (modify a file, an IORef, etc). There is no way to reverse some things (launch the missiles, write to stdout, etc), so I see one of two choices for you here:
clone the world, and run the action in a sandbox. If it succeeds, then go ahead and run the action in the Real World.
clone the world, and run the action in the real world. If it fails, then replace the Real World with the snapshot you took earlier.
Of course, both of these are actually the same approach: fork the world. One world runs the action, one world doesn't. If the action succeeds, then that world continues; otherwise, the other world continues. You are proposing to accomplish this by building upon forkOS, which would clone the entire state of the program, but this would not be sufficient to deal with, for example, file modifications. Allow me to suggest instead an approach that is nearer to the simplicity of immutable state:
tryIO :: IO s -> (s -> IO ()) -> IO (Maybe a) -> IO (Maybe a)
tryIO save restore action = do
    initialState <- save
    result <- action
    case result of
        Nothing -> restore initialState >> return Nothing
        Just x  -> return (Just x)
Here you must provide some data structure s, and a way to save to and restore from said data structure. This allows you the flexibility to perform any cloning you know to be necessary. (e.g. save could copy a certain file to a temporary location, and then restore could copy it back and delete the temporary file. Or save could copy the value of certain IORefs, and then restore could put the value back.) This approach may not be the most efficient, but it's very straightforward.
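As a hypothetical usage example of the tryIO sketch above (the IORef counter is my own invention): save snapshots the current value, and restore writes it back whenever the action reports failure.

import Data.IORef

bumpIfSmall :: IORef Int -> IO (Maybe Int)
bumpIfSmall ref =
    tryIO (readIORef ref)             -- save: snapshot the current value
          (writeIORef ref)            -- restore: put the snapshot back on failure
          (do modifyIORef' ref (+ 1)  -- the action mutates state ...
              n <- readIORef ref
              return (if n > 10 then Nothing else Just n))  -- ... and may then fail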
