Haskell: Monad transformers and global state - haskell

I'm trying to learn Haskell. I'm trying to write a programm that contains a "global state": Vars. I want to change a component of the state (e.g. var1) each time I call a function. The change can be a simple function on the components (e.g. +4). Also, it prints out the component changed. Here is what I've done so far (but I'm stuck). Edit: after running the code I want to see the recent version of the global state.
import Control.Monad.State
import Control.Monad.IO.Class (liftIO)
data Vars = Vars {
var1 :: Int,
var2 :: Float
} deriving (Show)
sample :: StateT Vars IO a
sample = do
a <- change
liftIO $ print a
-- I want to call change again and apply more change to the state
change :: StateT Vars IO a
change = do
dd <- get
-- I don't know what to do next!
main = do
runStateT sample (Vars 20 3)
evalStateT sample (Vars 20 3)

Let's try to solve your problem step-by-step starting from easier and small parts. It's important skill in programming and FP teaches you that skill in nice way. Also, working with State monad and especially with several effects in monad-transformers helps you to reason about effects and understand things better.
You want to update var1 inside your immutable data type. This can be done only by creating new object. So let's write such function:
plusFour :: Vars -> Vars
plusFour (Vars v1 v2) = Vars (v1 + 4) v2
There exist ways in Haskell to write this function much shorter though less understandable, but we don't care about those things now.
Now you want to use this function inside State monad to update immutable state and by this simulate mutability. What can be told about this function only by looking at its type signature: change :: StateT Vars IO a? We can say that this function have several effects: it has access to Vars state and it can do arbitrary IO actions. Also this function returns value of type a. Hmm, this last one is strange. What is a? What this function should return? In imperative programming this function will have type void or Unit. It just do things, it doesn't return everything. Only updates context. So it's result type should be (). It can be different. For example we might want to return new Vars after change. But this is generally bad approach in programming. It makes this function more complex.
After we understood what type function should have (try to always start with defining types) we can implement it. We want to change our state. There're functions that operates with stateful parts of our context. Basically, you interested in this one:
modify :: Monad m => (s -> s) -> StateT s m ()
modify function takes function which updates state. After you run this function you can observe that state is modified according to passed function. Now change can be written like this:
change :: StateT Vars IO ()
change = modify plusFour
You can implement modify (and thus change using only put and get functions which is nice exercise for beginner).
Let's now call change function from some other function. What does calling mean in this case? It means that you execute monadic action change. This action changes your context, you don't care about it's result because it's (). But if you run get function (which binds whole state to variable) after change you can observe new change. If you want to print only changed component, like var1 you can use gets function. And, again, what type should sample have? What should it return? If on the caller side you're interested only in resulting state, then, again, it should be () like this:
sample :: StateT Vars IO ()
sample = do
change
v1 <- gets var1
liftIO $ print v1
change
v1' <- gets var1
liftIO $ print v1' -- this should be v1 + 4
This should add you some understanding of what is happening. Monad transformers require some time to get used to them though it's a powerful tool (not perfect but extremely useful).
As as a side note I want to add that these function can be written much nicer using common Haskell design patterns. But you don't need to care about those right now, just try to understand what's going on here.

Related

Is print in Haskell a pure function?

Is print in Haskell a pure function; why or why not? I'm thinking it's not, because it does not always return the same value as pure functions should.
A value of type IO Int is not really an Int. It's more like a piece of paper which reads "hey Haskell runtime, please produce an Int value in such and such way". The piece of paper is inert and remains the same, even if the Ints eventually produced by the runtime are different.
You send the piece of paper to the runtime by assigning it to main. If the IO action never comes in the way of main and instead languishes inside some container, it will never get executed.
Functions that return IO actions are pure like the others. They always return the same piece of paper. What the runtime does with those instructions is another matter.
If they weren't pure, we would have to think twice before changing
foo :: (Int -> IO Int) -> IO Int
foo f = liftA2 (+) (f 0) (f 0)
to:
foo :: (Int -> IO Int) -> IO Int
foo f = let x = f 0 in liftA2 (+) x x
Yes, print is a pure function. The value it returns has type IO (), which you can think of as a bunch of code that outputs the string you passed in. For each string you pass in, it always returns the same code.
If you just read the Tag of pure-function (A function that always evaluates to the same result value given the same argument value(s) and that does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.) and then Think in the type of print:
putStrLn :: String -> IO ()
You will find a trick there, it always returns IO (), so... No, it produces effects. So in terms of Referential Transparency is not pure
For example, getLine returns IO String but it is also a pure function. (#interjay contribution), What I'm trying to say, is that the answer depends very close of the question:
On matter of value, IO () will always be the same IO () value for the same input.
And
On matter of execution, it is not pure because the execution of that
IO () could have side effects (put an string in the screen, in this
case looks so innocent, but some IO could lunch nuclear bombs, and
then return the Int 42)
You could understand better with the nice approach of #Ben here:
"There are several ways to explain how you're "purely" manipulating
the real world. One is to say that IO is just like a state monad, only
the state being threaded through is the entire world outside your
program;= (so your Stuff -> IO DBThing function really has an extra
hidden argument that receives the world, and actually returns a
DBThing along with another world; it's always called with different
worlds, and that's why it can return different DBThing values even
when called with the same Stuff). Another explanation is that an IO
DBThing value is itself an imperative program; your Haskell program is
a totally pure function doing no IO, which returns an impure program
that does IO, and the Haskell runtime system (impurely) executes the
program it returns."
And #Erik Allik:
So Haskell functions that return values of type IO a, are actually not
the functions that are being executed at runtime — what gets executed
is the IO a value itself. So these functions actually are pure but
their return values represent non-pure computations.
You can found them here Understanding pure functions in Haskell with IO

Haskell State Monad

Does the put function of the State Monad update the actual state or does it just return a new state with the new value? My question is, can the State Monad be used like a "global variable" in an imperative setting? And does put modify the "global variable"?
My understanding was NO it does NOT modify the initial state, but using the monadic interface we can just pass around the new states b/w computations, leaving the initial state "intact". Is this correct? if not, please correct me.
The answer is in the types.
newtype State s a = State {runState :: s -> (a, s)}
Thus, a state is essentially a function that takes one parameter, 's' (which we call the state), and returns a tuple (value, state). The monad is implemented like below
instance Monad (State s) where
return a = State $ \s -> (a,s)
(State f) >>= h = State $ \s -> let (a,s') = f s
in (runState h a) s'
Thus, you have a function that operates on the initial state and spits out a value-state tuple to be processed by the next function in the composition.
Now, put is the following function.
put newState = State $ \s -> ((),newState)
This essentially sets the state that will be passed to the next function in the composition and the downstream function will see the modified state.
In fact, the State monad is completely pure (that is, nothing is being set); only what is passed downstream changes. In other words, the State monad saves you the trouble of carrying around a state explicitly in a pure language like Haskell. In other words, State monad just provides an interface that hides the details of state threading (that's what is called in the WikiBooks and or Learn you a Haskell, I think).
The following shows this in action. You have get, which sets the value field to be the same as the state field (Note that, when I mean setting, I mean the output, not a variable). put gets the state via the value passed to it, increments it and sets the state with this new value.
-- execState :: State s a -> s -> s
let x = get >>= \x -> put (x+10)
execState x 10
The above outputs 20.
Now, let's do the following.
execState (x >> x) 10
This will give an output of 30. The first x sets the state to 20 via the put. This is now used by the second x. The get at this point sets the state passed it to the value field, which is now 20. Now, our put will get this value, increment it by 10 and set this as the new state.
Thus, you have states in a pure context. Hope this helps.
There's nothing magical about State. You could implement it like this:
newtype State s a = State {runState :: s -> (a, s)}
That is, a State s a (which we think of as a computation that uses a state of type s to produce a result of type a) is just a function that takes a state and returns a result and a new state. You should attempt to write out the Monad instance and definitions of get and put for this definition. The real definition is more general:
type State s = StateT s Identity
newtype Identity a = Identity a
newtype StateT s m a = StateT {runStateT :: s -> m (a, s)}
This allows state to be added to other monadic computations. It's also possible to define state transformers as "operational monads". Apfelmus has a tutorial on those somewhere.
First, the state isn't "global" as such; you could have several different copies of the state monad running, each with its own independent state, and they won't interfere with each other. (Indeed, that's arguably the whole point.) If you wanted the state to be global to the entire program, you'd have to put the entire program into a single state monad.
Second, calling put changes the result that subsequent calls to get will return. That's all. It doesn't "change" the actual value itself. Like, if you call get and put the result into a variable somewhere, and then call put, your variable won't change. Even if the state is a dictionary or something, if you were to add a new key and put that, anybody still looking at the old copy of the dictionary will still see the old dictionary. This isn't special to the state monad; it's just how Haskell works.

How to modify parts of a State in Haskell

I have a number of operations which modify a System. System is defined like this:
data System = Sys {
sysId :: Int,
sysRand :: StdGen,
sysProcesses :: ProcessDb,
sysItems :: ItemDb
}
with e.g.
type ProcessDb = M.Map Int Process
But I also have some functions, which do not need access to the full System, but have types like this:
foo' :: (Process, ItemDb) -> ((Process, ItemDb),[Event])
Currently I gave them types like
foo: System -> (System, [Event])
But this is a needlessly broad interface. To use the narrow interface above in conjuntion with System I would have to extract a single Process and the ItemDb from System, run foo' and then modify System with the results.
This is quite some unwrapping and wrapping and results in more lines of code than just passing system as a whole and let foo extract whatever it needs. In the latter case, the wrapping and unwrapping is mingled with the actual foo' operation and I have the feeling that these two aspects should be separated.
I suppose I need some kind of lifting operation which turns a narrow foo' into a foo. I suppose I could write this, but I would have to write such a lifter for every signature of the narrow functions, resulting is lots of different lifters.
is there an idiom how to solve such problems?
is it worth bothering?
One common solution is to use a class, possibly created by the Template Haskell magic of Control.Lens.TH.makeClassy. The gist is that you pass in the whole System, but you don't let the function know that that's what you're giving it. All it's allowed to know is that what you're giving it offers methods for getting and/or modifying the pieces it's supposed to handle.
I ended up writing a function which work on any State and which requires a "Lens" which captures the specfic transformation from the bigger State to the smaller State and back
focus :: (Lens s' s) -> State s' a -> State s a
focus lens ms'= do
s <- get
let (s', set) = lens s
(a, s'') = runState ms' s'
put (set s'')
return a
It allows me to write things like
run :: ExitP -> State SimState Log
...
do
evqs' <-focus onSys $ step (t,evt)
...
Where step operates on the "smaller" state
step :: Timed Event -> State Sys.System [EventQu]
Here onSys is a "Lens" and it works like this:
onSys :: Lens Sys.System SimState
onSys (Sis e s) = (s, Sis e)
where
data SimState = Sis {
events :: EventQu,
sisSys :: Sys.System
I suppose the existing Lens libraries follow a similar approach, but do much more magic, like creating lenses automatically. I did shy away from lenses. Instead I was pleased to realise that all it takes was a few lines of codes to get what I need.

Accessing vector element by index using lens

I'm looking for a way to reference an element of a vector using lens library...
Let me try to explain what I'm trying to achieve using a simplified example of my code.
I'm working in this monad transformer stack (where StateT is the focus, everything else is not important)
newtype MyType a = MyType (StateT MyState (ExceptT String IO) a)
MyState has a lot of fields but one of those is a vector of clients which is a data type I defined:
data MyState = MyState { ...
, _clients :: V.Vector ClientT
}
Whenever I need to access one of my clients I tend to do it like this:
import Control.Lens (use)
c <- use clients
let neededClient = c V.! someIndex
... -- calculate something, update client if needed
clients %= (V.// [(someIndex, updatedClient)])
Now, here is what I'm looking for: I would like my function to receive a "reference" to the client I'm interested in and use it (retrieve it from State, update it if needed).
In order to clear up what I mean here is a snippet (that won't compile even in pseudo code):
...
myFunction (clients.ix 0)
...
myFunction clientLens = do
c <- use clientLens -- I would like to access a client in the vector
... -- calculate stuff
clientLens .= updatedClient
Basically, I would like to pass to myFunction something from Lens library (I don't know what I'm passing here... Lens? Traversal? Getting? some other thingy?) which will allow me to point at particular element in the vector which is kept in my StateT. Is it at all possible? Currently, when using "clients.ix 0" I get an error that my ClientT is not an instance of Monoid.
It is a very dumbed down version of what I have. In order to answer the question "why I need it this way" requires a lot more explanation. I'm interested if it is possible to pass this "reference" which will point to some element in my vector which is kept in State.
clients.ix 0 is a traversal. In particular, traversals are setters, so setting and modifying should work fine:
clients.ix 0 .= updatedClient
Your problem is with use. Because a traversal doesn't necessarily contain exactly one value, when you use a traversal (or use some other getter function on it), it combines all the values assuming they are of a Monoid type.
In particular,
use (clients.ix n)
would want to return mempty if n is out of bounds.
Instead, you can use the preuse function, which discards all but the first target of a traversal (or more generally, a fold), and wraps it in a Maybe. E.g.
Just c <- preuse (clients.ix n)
Note this will give a pattern match error if n is out of bounds, since preuse returns Nothing then.

Why does State need a value?

Just learning about State monad from this excellent tutorial. However, when I tried to explain it to a non-programmer they had a question that stumped me.
If the purpose of the State is to simulate mutable memory, why is the function that state monad stores is of the type:
s -> (a, s)
and not simply:
s -> s
In other words, what is the need for the "intermediate" value? For example, couldn't we, in the cases where we need it, simulate it by simply defining a state as a tuple of (state, value)?
I'm sure I confused something, any help is appreciated.
To draw a parallel with an imperative language like C, s -> s corresponds to a function with the return type void, which is invoked purely for side effects (such as mutating the memory). It is isomorphic to State s ().
And indeed, it is possible to write C functions which communicate only through global variables. But, as in C, it is often convenient to return values from functions. That's what a is for.
Of course it's possible that for your particular problem s -> s is a better choice. Although it's not a Monad, it is a Monoid (when wrapped in Endo). So you can construct such functions using <> and mempty, which correspond to >>= and return of Monad.
To expand a bit on Nick's answer:
s is the state. If all your functions were s -> s (state to state), your functions would not be able to return any values. You could define your state as (the actual state, value returned), but that conflates the state with the value the state-ful functions are computing. And it's also the common case that you'll want functions to actually compute and return values...
s' -> s' is equivalent to (a, s) -> (a, s). Here it is obvious that your State will need an initial a to start things off in addition to s.
On the other hand s -> (a, s) only needs the seed s to begin things and does not require an a value at all.
Thus the type of s -> (a, s) tells you that State is less complex than if it were (a, s) -> (a, s). Types in Haskell convey LOTS of information.
If the purpose of the State is to simulate mutable memory, why is the function that state monad stores is of the type:
s -> (a, s)
and not simply:
s -> s
The purpose of the State monad is not to simulate mutable memory, but rather to model computations that both produce a value and have a side effect. Simply, given some initial state of type s, your computation will produce some value of type a, as well as an updated state.
Maybe your computation does not produce a value... Then, easy: the value type a is simply (). Perhaps on the other hand your computation does not have a side effect. Again, easy: you might think of your state transition function (the s -> s argument to modify) as just being id. But often you're dealing with both at the same time.
You can actually use get and put as relatively simple examples:
get :: State s s -- s -> (s, s)
put :: s -> State () -- s -> (s -> ((), s))
get is a computation which, given the current state (the first s), will return it both as a value -- that is, the result of the computation -- and as the "new" (unmodified) state.
put is a computation which, given a new state (the first s) and a current state (the second s), will simply ignore the current state. It will produce () as the computed value (because, of course, it hasn't computed any value!) and hang onto the new state provided.
Presumably you want to use your stateful computations inside of do notation?
You should ask yourself what the Monad instance would look like for a stateful computation defined by
newtype State s = { runState :: s -> s }
The problem to be solved is that you have an input and a series of functions, and you want to apply the functions to the input in order.
If the functions are purely state-changing functions, s -> s on an input of type s, then you don't need State to use them. Haskell is very good at chaining together functions like these, e.g. with the standard composition operator ., or something like foldr (.) id, or foldr id.
However, if the functions both mutate a state and report some result of doing so, so that you could give them the type s -> (s,a), then gluing them all together is a bit of a nuisance. You have to unpack the result tuple and pass the new state value to the next function, use the reported value somewhere else, and then unpack that result, and so on. It's easy to pass the wrong state to an input function because you have to name each result and input explicitly to do the unpacking. You end up with something like this:
let
(res1, s1) = fun1 s0
(res2, s2) = fun2 s1
(res3, s3) = fun3 res1 res2 s1
...
in resN
There, I accidentally passed s1 instead of s2, maybe because I added the second line in later and didn't realise the third line needed changing. When composing the s -> s functions, this problem can't possibly arise because there are no names to get right:
let
resN = fun1 . fun2 . fun3 . -- etc.
So we invented State to do the same trick. State is essentially just a way of gluing functions like s -> (s,a) together in such a way that the right state always gets passed to the right function.
So it's not so much that people went "we want to use State, let's use s -> (s,a)" but rather "we're writing functions like s -> (s,a), let's invent State to make that easy". With functions s -> s, it's already easy and we don't have to invent anything.
As an example of how s -> (s,a) arises naturally, consider parsing: a parser will be given some input, consume some of the input and return a value. In Haskell, this is naturally modelled as taking an input list, and returning a pair of the value and the remaining input - i.e. [Input] -> ([Input], a), or State [Input].
a is the value returned, and the s is the final state.
http://www.haskell.org/haskellwiki/State_Monad#Implementation

Resources