I'm trying to access a state of a State Monad inside an IO action.
More specifically: I'm trying to write a state-dependent signal handler using installHandlerfrom System.Posix.Signals which requires IO, however, I'd like to do different actions and change the state from inside the handler. I took a look at unliftio, but I read that State-Monads shouldn't be unlifted.
Is this possible? I'm more looking for an explanation than for a "copy-paste" solution. If unlifting State inside IO doesn't work, what would a solution look like, for when one wants to do some state-aware processing inside IO?
A value of type State a b does not contain a state. It is just an encapsulated function that can provide a resulting state and a result if you pass it a starting state (using the runState function. There is no way to access an intermediate (or "current") state from the outside of this function. This is what makes the State Monad "pure".
You seem to intend to have a handler, that does not always behave the same (even if invoked with the same parameters), but depends on an outside state. This kind of behaviour is "impure", and cannot be achieved by using only pure means. So what you need in this case is something that encapsulates the impureness in a way that you can access a "current value" of some state from the handler, without the current value itself getting passed into the handler.
As you already know from the comments, the go-to tool to provide access to mutable state to an IO action is using an IORef. IORefs work, because the Haskell runtime (traditionally, before multithreading at least) serializes IO actions. So the concept of the "current value" always makes sense. After every IO action, the value pointed to by every IORef is fixed. The order IO actions happen is also fixed, as it needs to be the order you chain them inside do blocks or using the >>= operators. Handling of Signals is performed by the Haskell runtime in a deterministic way, kind of like everytime you chain two IO actions, the runtime checks for pending signals, and invokes the corresponding handler.
In case you want to write code that manipulates data in an imperative way (where you can have a lot of variables, and even arrays where you update sinlge elements), you could write your code as I/O action and use IORef and IOArray for it, but there is a special "lite" version of IO that just supports mutable state in the same way as I/O without being able to interact with the environment. The shared state needs to be created, read and written from inside the same "capsule" of this special IO lite, so that running the whole action does not interact with outside state, but just with its internal state - the capsule as a whole is thus pure, even if single statements inside the capsule can be considered impure. The name of this lite version of IO is called ST, which is short for "state thread".
Related
I use State Monad Transformer to manage global state like this
data State = State ...
StateT State IO ()
And I use amqp to consume messages from RabbitMQ. The state will be modified according to messages received. The function has the type like
consumeMsgs :: Channel
-> Text
-> Ack
-> ((Message, Envelope) -> IO ()) -- ^ the callback function
-> IO ConsumerTag
Right now we can ignore other parameters but the third which is a callback function I will supply and where the modification happen.
Because it's a mainly IO Monad, so I use this function as follows
consumeMsgs chan queue Rmq.Ack (flip evalStateT ssss . rmqCallback)
Here the ssss is the state I put in and I find that during the process of my callback function rmqCallback the state can be correctly modified. But, every time next callback happens the global state is the same as before the consumeMsgs is called or equal with ssss.
I understand State Monad is just a process needing an initial state to put in and maintain the state during whole way but has nothing to do with the state out of Monad (am I missing something?), so I count on MVar to hold and modify the state, and that works. I want to know it's there other way to handle this, maybe another Monad?
It looks like you could use Network.AMQP.Lifted.consumeMsgs. StateT s IO is an instance of MonadBaseControl IO m, so you could run whole consumeMsgs inside single runStateT
Yes, StateT monad transformer is basically a nice notation for a pure code, so if your API accepts only IO callbacks you have no choice but to use "real" state like MVar or IORef etc.
PS: As other answer suggests, the state changes done in Network.AMQP.Lifted.consumeMsgs's callback do not propagate to subsequent callback runs or resulting state. I cannot wrap my head around the implementation, but I tried liftBaseWith a bit and it really looks like so.
To add a clarification that might be useful for future reference, the accepted answer is not exact. While Network.AMQP.Lifted.consumeMsgs should work with StateT s IO, the RabbitMQ haskell library actually discards the monadic state after each use. This means that if you do use that instance, you will not see changes made after the initial consumeMsgs call, including changes made by the callback itself. The callback is basically called with the same Monadic state every time - the state in which it was when the callback was registered.
This means that you can use it to pass global configuration state, but not to keep track of state between callback executions.
There is a function in haskell's stm library with the following type signature:
alwaysSucceeds :: STM a -> STM ()
From what I understand of STM in haskell, there are three ways that something can "go wrong" (using that term loosely) while an STM computation is executing:
The value of an TVar that has been read is changed by another thread.
An user-specified invariant is violated. This seems to usually be triggered by calling retry to make it start over. This effectively makes the thread block and then retry once a TVar in the read set changes.
An exception is thrown. Calling throwSTM causes this. This one differs from the first two because the transaction doesn't get restarted. Instead, the error is propagated and either crashes the program or is caught in the IO monad.
If these are accurate (and if they are not, please tell me), I can't understand what alwaysSucceeds could possibly do. The always function, which appears to be built on top of it, seems like it could be written without alwaysSucceeds as:
--This is probably wrong
always :: STM Bool -> STM ()
always stmBool = stmBool >>= check
The documentation for alwaysSucceeds says:
alwaysSucceeds adds a new invariant that must be true when passed to
alwaysSucceeds, at the end of the current transaction, and at the end
of every subsequent transaction. If it fails at any of those points
then the transaction violating it is aborted and the exception raised
by the invariant is propagated.
But since the argument is of type STM a (polymorphic in a), it can't use the value that the transaction returns for any part of the decision making. So, it seems like it would be looking for the different types of failures that I listed earlier. But what's the point of that? The STM monad already handles the failures. How would wrapping it in this function affect it? And why does the variable of type a get dropped, resulting in STM ()?
The special effect of alwaysSucceeds is not how it checks for failure at the spot it's run (running the "invariant" action alone should do the same thing), but how it reruns the invariant check at the end of transactions.
Basically, this function creates a user-specified invariant as in (2) above, that has to hold not just right now, but also at the end of later transactions.
Note that a "transaction" doesn't refer to every single subaction in the STM monad, but to a combined action that is passed to atomically.
I guess the a is dropped just for convenience so you don't have to convert an action to STM () (e.g. with void) before passing it to alwaysSucceeds. The return value will be useless anyhow for the later repeated checks.
I wrote a simple Wai-to-uwsgi proxy, but in doing so, I had to use unwrapResumable. That gives an unwrapped Pipe and a "release" function that needs to be called eventually. The release function's type is ResourceT IO (), and I think I want to register it with my current resource, but to do that I'd need the release to just be IO (). What should I be doing with the release function?
The release action should already be registered with your ResourceT. In proper conduit code, there are two different ways of taking care of resource cleanup:
Within the Pipe itself. This cleanup will be called as early as possible, but is not exception safe.
From ResourceT. This is exception safe, but may be delayed.
The cleanup action provided by unwrapResumable is allowing you to reclaim the "early as possible" aspect. But if you'd just be calling the cleanup outside of the ResourceT block, there's no need to worry about it anyway.
I want to write a function
forkos_try :: IO (Maybe α) -> IO (Maybe α)
which Takes a command x. x is an imperative operation which first mutates state, and then checks whether that state is messed up or not. (It does not do anything external, which would require some kind of OS-level sandboxing to revert the state.)
if x evaluates to Just y, forkos_try returns Just y.
otherwise, forkos_try rolls back state, and returns Nothing.
Internally, it should fork() into threads parent and child, with x running on child.
if x succeeds, child should keep running (returning x's result) and parent should die
otherwise, parent should keep running (returning Nothing) and child should die
Question: What's the way to write something with equivalent, or more powerful semantics than forkos_try? N.B. -- the state mutated (by x) is in an external library, and cannot be passed between threads. Hence, the semantic of which thread to keep alive is important.
Formally, "keep running" means "execute some continuation rest :: Maybe α -> IO () ". But, that continuation isn't kept anywhere explicit in code.
For my case, I think it will (for the time) work to write it in different style, using forkOS (which takes the entire computation child will run), since I can write an explicit expression for rest. But, it troubles me that I can't figure out how do this with the primitive function forkOS -- one would think it would be general enough to support any specific case (which could appear as a high-level API, like forkos_try).
EDIT -- please see the example code with explicit rest if the problem's still not clear [ http://pastebin.com/nJ1NNdda ].
p.s. I haven't written concurrency code in a while; hopefully my knowledge of POSIX fork() is correct! Thanks in advance.
Things are a lot simpler to reason about if you model state explicitly.
someStateFunc :: (s -> Maybe (a, s))
-- inside some other function
case someStateFunc initialState of
Nothing -> ... -- it failed. stick with initial state
Just (a, newState) -> ... -- it suceeded. do something with
-- the result and new state
With immutable state, "rolling back" is simple: just keep using initialState. And "not rolling back" is also simple: just use newState.
So...I'm assuming from your explanation that this "external library" performs some nontrivial IO effects that are nevertheless restricted to a few knowable and reversible operations (modify a file, an IORef, etc). There is no way to reverse some things (launch the missiles, write to stdout, etc), so I see one of two choices for you here:
clone the world, and run the action in a sandbox. If it succeeds, then go ahead and run the action in the Real World.
clone the world, and run the action in the real world. If it fails, then replace the Real World with the snapshot you took earlier.
Of course, both of these are actually the same approach: fork the world. One world runs the action, one world doesn't. If the action succeeds, then that world continues; otherwise, the other world continues. You are proposing to accomplish this by building upon forkOS, which would clone the entire state of the program, but this would not be sufficient to deal with, for example, file modifications. Allow me to suggest instead an approach that is nearer to the simplicity of immutable state:
tryIO :: IO s -> (s -> IO ()) -> IO (Maybe a) -> IO (Maybe a)
tryIO save restore action = do
initialState <- save
result <- action
case result of
Nothing -> restore initialState >> return Nothing
Just x -> return (Just x)
Here you must provide some data structure s, and a way to save to and restore from said data structure. This allows you the flexibility to perform any cloning you know to be necessary. (e.g. save could copy a certain file to a temporary location, and then restore could copy it back and delete the temporary file. Or save could copy the value of certain IORefs, and then restore could put the value back.) This approach may not be the most efficient, but it's very straightforward.
The scenario:
I have a simple state machine:
Happy path:
Uninitialized->Initialized->InProgress->Done
Unhappy path:
Uninitialized->Initialized->Error
Simply put, I need to cause a transition (either into InProgress or in Error state) without an external event/trigger. I.e. Initialized state should immediately result in one of those states.
Questions:
Is it OK to cause state transition from within Initialized.Enter() ?
I could use state guards to do this, but I'd rather not have non-trivial logic in the state guard (and initialization can very well be complex).
If it is NOT OK, how can I do it differently?
Should I just take this decision out of he FSM all together and have some other component cause the appropriate transition? But then, wouldn't I still have to call that external component from within Initialized.Enter() ? so it solves nothing?
In a state machine, next state is a combinatorial logic function of both input and current state.
In the case you are describing, the same cause (Initialized state) seems to be able to trigger two different effects (either InProgress or Error state). I guess that there is a hidden input whose value makes the difference. I also guess that this input is received during transition from Uninitialized to Initialized.
Therefore I would have a different model:
Uninitialized -> Successfully initialized -> InProgress -> Done
\
`-> Failed Initialization -> Error
Possibly combining Successfully initialized with InProgress and Failed initialization with Error.
EDIT: From your comment, I understand that the hidden input actually is the result of an action (a device initialization). Taking your model, I assume that initialization takes place while in Initialized state (let's call it Initializing). This way, the result from the device is your external event which will trigger transition either to InProgress or to Error.
So keep your state machine and simply add the result of device.Initialize() to the list of inputs or external events.