How to refactor program that uses state monad transformer? - haskell

I finished (well, almost) my first more-or-less useful project in Haskell. It consists of several modules and almost all the modules use StateT a lot.
Big picture is: on top level I need to work with state and IO simultaneously, so I use StateT myState IO monad transformer. It is OK and my code magically 'just works', but now I think that perhaps the code isn't perfect because a lot of functions in other modules are inside the monad transformer, so they potentially can perform IO, although they are pretty pure by their nature. And that's a bad thing.
Can you advise me how to refactor the program so that I can somehow write functions in the modules inside State monad, without any IO, but being able to combine this code with IO on top level?

If your function only needs StateT, you could give it a signature such as
incrementCounter :: (Monad m) => StateT Counter m ()
incrementCounter = do
    count <- get
    put (increment count)
    return ()
That way your function needs to work with any Monad m (and can't rely on it being IO). At the top level you can instantiate m = IO.
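A minimal runnable sketch of this idea follows. The Counter type and the increment function are placeholders invented for the example (the answer does not define them), and mtl is assumed to be available.

```haskell
import Control.Monad.State (StateT, execStateT, get, put)
import Control.Monad.Trans.Class (lift)
import Data.Functor.Identity (runIdentity)

-- Hypothetical state type; the answer above leaves Counter undefined.
newtype Counter = Counter Int deriving (Eq, Show)

increment :: Counter -> Counter
increment (Counter n) = Counter (n + 1)

-- Works for any base monad m, so it cannot rely on IO being there.
incrementCounter :: Monad m => StateT Counter m ()
incrementCounter = do
    count <- get
    put (increment count)

-- Pure use: m is instantiated to Identity, no IO anywhere.
pureResult :: Counter
pureResult =
    runIdentity (execStateT (incrementCounter >> incrementCounter) (Counter 0))

-- Top-level use: m is instantiated to IO and real IO is mixed in with lift.
topLevel :: StateT Counter IO ()
topLevel = do
    incrementCounter
    count <- get
    lift (putStrLn ("counter is now " ++ show count))
```

Running execStateT topLevel (Counter 0) from main gives you the combined state-plus-IO program, while the module that defines incrementCounter never needs to mention IO.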

Related

lift, return, and a transformer type constructor

For well over a year, I have been intensely using lift, return, and constructors such as EitherT, ReaderT, and so forth. I've read Real World Haskell, Learn You a Haskell, almost every monad tutorial out there, and tried writing my own. Yet, I constantly remain confused about these three operations. Any time I am writing new code I try to figure out which of the three to use, and it almost always takes me an hour or more on the first function in a particular block of code.
What is an intuitive understanding of the three? Simple types are insufficient, as in all three cases I can instantly recite the types to you. What is a meaning for what these do that is consistent across all of the standard monad transformers?
(Unfortunately, if you respond in math terms, I'm still not going to understand you. While I can write code to solve math problems and can set up time complexity based on the code I see, I cannot after many years of trying to work in Haskell relate math terms to programming terms.)
return takes a pure computation and turns it into a computation which claims to have some monad-y side-effects, but doesn't.
lift takes a computation that has some side-effects, and adds more.
EitherT, ReaderT, and so on take a computation that already has all the side-effects you're interested in and "spells them differently" -- for example, where before your state was spelled as a function that returns an updated value, it is now spelled as a State(T)-ful computation.
So let's say you have a computation. In a lazy language like Haskell you'd write
comp1 :: a
and know that this computation will be performed upon request and result in a value of type a.
Let's say you have a similar computation, but in addition to computing a value of type a, it might "fail" for some reason or another. For example, a might be Integer and this computation will "fail" if it's a division by zero. We'd write this now as
comp2 :: Maybe a
where the Maybe constructor "tags" the a to indicate failure.
Let's say we have a similar computation as before, but now we are allowed to fail, but also collect a log during the computation. "Log collecting" is called Writer so we'd like to tag our type with Writer as well as Maybe. Unfortunately
comp3_bad :: (Writer String) Maybe a
doesn't make any sense. The definition of writer allows for a single parameter, not two. We can consider a bit of what the underlying mechanics of this combined effect would be, though—it needs to return a Maybe paired with the log... or perhaps if the computation fails, the log is discarded. There are two options
comp3_1 :: (String, Maybe a)
comp3_2 :: Maybe (String, a)
If we unpack the Writer, we can see that these are equivalent to
comp3_1' :: Writer String (Maybe a)
comp3_2' :: Maybe (Writer String a)
This pattern of nesting is called composition. If you want to combine the effects of two monads then you'd like to compose them. For some monads this works directly, though it's a little cumbersome.
Unfortunately, some monads start to break the monad laws once they are composed. They can still be "stacked" but not in the normal way. So, we allow each type to determine its stacking method by creating the transformer version <monad>T.
newtype WriterT w m a = WriterT { runWriterT :: m (a, w) }
newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }
-- note that, up to the newtype wrappers (the library stores the log second in the pair)
WriterT String Maybe a == Maybe (a, String)
MaybeT (Writer String) a == (Maybe a, String)
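The two orderings behave differently when a computation fails, which a small experiment makes concrete. This is a sketch using the transformers package; the names logThenFail1, logThenFail2, result1 and result2 are made up for the example.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))
import Control.Monad.Trans.Writer (Writer, WriterT (..), runWriter, tell)

-- Writer layered on top of Maybe: a failure throws the log away.
logThenFail1 :: WriterT String Maybe Int
logThenFail1 = do
    tell "started;"
    lift Nothing

-- Maybe layered on top of Writer: the log survives the failure.
logThenFail2 :: MaybeT (Writer String) Int
logThenFail2 = do
    lift (tell "started;")
    MaybeT (return Nothing)

-- Unwrapping exposes the underlying shapes.
result1 :: Maybe (Int, String)
result1 = runWriterT logThenFail1

result2 :: (Maybe Int, String)
result2 = runWriter (runMaybeT logThenFail2)
```

Here result1 is Nothing, with no log left to inspect, while result2 is (Nothing, "started;").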
These composed stacks of monads are called monad transformer stacks and they allow you to assemble side effects in layers.
So what happens if we have two different but similar stacks that we'd like to use together? For instance, we can consider Maybe to be a monad... or a monad transformer stack of a single layer. Compare that to WriterT String Maybe which is a monad transformer stack of two layers, the bottom of which is Maybe.
These two stacks are very similar, but we cannot transport computations from one to the other. Or rather, we can, but it's fairly annoying
transport :: Maybe a -> WriterT String Maybe a
transport Nothing = WriterT Nothing
transport (Just a) = WriterT (Just (a, ""))
This transport forms a general pattern where we "add another layer" onto a stack. This general pattern is called lift
lift :: Maybe a -> WriterT String Maybe a
Or, written polymorphically we see the extra layer t being prepended.
lift :: MonadTrans t => m a -> t m a
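To check that the hand-written transport and the library's lift really coincide at this type, here is a small sketch using the transformers package (viaLift is a name invented for the comparison):

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Writer (WriterT (..))

-- Hand-written "add a Writer layer on top of Maybe".
-- The log starts out empty because nothing has been written yet.
transport :: Maybe a -> WriterT String Maybe a
transport Nothing  = WriterT Nothing
transport (Just a) = WriterT (Just (a, ""))

-- The library's lift, specialised to the same type.
viaLift :: Maybe a -> WriterT String Maybe a
viaLift = lift
```

Unwrapping both with runWriterT gives the same result for Just and for Nothing inputs.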
Finally, we've come a long way from our pure computation at the beginning
comp1 :: a
and demonstrated that we can lift simple transformer stacks into more complex ones. Can we consider comp1 to be living in the very simplest of transformer stacks—the empty stack?
It turns out that this is actually a really valid point of view. We can even "lift" comp1 into a more sophisticated transformer stack... but the terminology changes slightly.
return :: Monad m => a -> m a
So, it's valid to think of return as lifting a pure computation into a basic monad. This is a foundational principle of monads even—that they can embed pure computations within them.
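Putting the three operations side by side in one concrete stack may help. The following is only a sketch: StateT Int IO is chosen arbitrarily, and the names viaReturn, viaLift and viaConstructor are invented for the comparison.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT (..))

-- return: a pure value enters the stack with no effects at all.
viaReturn :: StateT Int IO String
viaReturn = return "no effects"

-- lift: an action from the layer below (IO) gains a state layer it never touches.
viaLift :: StateT Int IO String
viaLift = lift (putStrLn "hello from IO" >> return "IO only")

-- constructor: a function that already has the shape s -> m (a, s)
-- is respelled as a StateT computation.
viaConstructor :: StateT Int IO String
viaConstructor = StateT (\n -> return ("state step", n + 1))
```

Running each with runStateT from an initial state of 0 leaves the state at 0 for the first two and at 1 for the third, since only the constructor version describes a state change.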

Partially lift with liftIO

I'm trying to do something that's probably impossible. I have a type that is an instance of MonadIO. If you liftIO an IO action in a context where this type is the base monad of some transformer stack, it will work fine. So, what I'd like to be able to do is take a value that's already been lifted part-way (to my type) and lift it "the rest of the way" in one step.
I can do this in two ways. One is that my type can actually be trivially re-embedded into normal IO, so I can do this:
liftMore :: (MonadIO m) => MyType a -> m a
liftMore x = liftIO $ embedMyTypeInIO x
And this works. However, this also provides a way to fully escape from my type if used in context where just IO is the base monad, which is undesirable.
I can also do this by building a new typeclass like MonadIO that uses my type as a base, but then it needs to be instantiated for everything, which is very undesirable. I tried using a newtype wrapper to make every monad transformer an instance of such a class, but couldn't quite get it.
Any ideas on strategies I could try to accomplish this? (I'm willing to play with language extensions, but of course a solution that is Haskell98 is much preferable).

What is MonadBaseControl for?

I'm digging deeper into Yesod's monads, and have encountered MonadBaseControl.
I took a look at the hackage doc, and got lost. Could someone tell me the problem it is trying to solve?
Michael Snoyman actually wrote a small tutorial on monad-control: http://www.yesodweb.com/book/monad-control
The gist of that article might be the following:
Imagine you have this piece of code:
withMyFile :: (Handle -> IO a) -> IO a
withMyFile = withFile "test.txt" WriteMode
You can apply withMyFile to any function of the type Handle -> IO a and get a nice IO a value. However, what if you have a function of the type Handle -> ErrorT MyError IO a and want to get a value of type ErrorT MyError IO a? Well, basically, you will have to modify withMyFile in order to incorporate a lot of wrapping/unwrapping. MonadBaseControl allows you to somewhat 'lift' functions like withMyFile to certain monad transformers, namely those that allow unwrapping ("running"). Thus, the resulting code looks like this:
useMyFileError :: (Handle -> ErrorT MyError IO ()) -> ErrorT MyError IO ()
useMyFileError func = control $ \run -> withMyFile $ run . func
It comes from the package monad-control, and is one of a pair of type classes (the other one being MonadTransControl) that enhance MonadBase (resp. MonadTrans) by supporting an alternative liftBase (resp. lift) operation for monads that implement it. This enhanced version no longer takes a simple action in the absolute base monad (resp. immediate base monad), but instead takes a function that gets the base monad's (resp. monad transformer's) whole state at that point as its only parameter and returns the aforementioned action.
As the package documentation states, this enhancement, along with the rest of the contents of these type classes, allows you to lift functions like catch, alloca, and forkIO from the absolute base monad (resp. immediate base monad). This is not possible with the simpler scheme in MonadBase (resp. MonadTrans), because that pair only lets you lift the results of a function, not its arguments; the approach taken by monad-control allows both.
As a result, the set of monads (resp. monad transformers) that can be used with MonadBaseControl (resp. MonadTransControl) is a strict subset of the set of monads that can be used with MonadBase (resp. MonadTrans), but the former groups are much more powerful than the latter for the same reason.
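For one simple transformer, the wrapping and unwrapping that control takes care of can be written by hand, which shows what is being automated. The sketch below is not how monad-control is implemented. It assumes ExceptT (the replacement for the deprecated ErrorT) from transformers, and withCounter is a hypothetical stand-in for withMyFile so the example does not touch the file system.

```haskell
import Control.Exception (bracket)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Except (ExceptT (..), runExceptT, throwE)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for withMyFile: acquire a resource, run the body, release it.
withCounter :: (IORef Int -> IO a) -> IO a
withCounter = bracket (newIORef 0) (\_ -> return ())

-- Unwrap the body down to plain IO, run it under withCounter,
-- then wrap the Either result back into ExceptT.
withCounterE :: (IORef Int -> ExceptT String IO a) -> ExceptT String IO a
withCounterE body = ExceptT (withCounter (runExceptT . body))

succeeds :: ExceptT String IO Int
succeeds = withCounterE $ \ref -> do
    liftIO (modifyIORef ref (+ 1))
    liftIO (readIORef ref)

failsInside :: ExceptT String IO Int
failsInside = withCounterE $ \_ -> throwE "boom"
```

This works because ExceptT's whole "state" is the Either it returns. For a stack of several transformers the same trick has to be repeated per layer, which is the boilerplate MonadBaseControl is meant to remove.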

Avoiding lift with monad transformers

I have a problem for which a stack of monad transformers (or even one monad transformer) over IO works well. Everything is good, except that using lift before every action is terribly annoying! I suspect there is really nothing to do about that, but I thought I'd ask anyway.
I am aware of lifting entire blocks, but what if the code is really of mixed types? Would it not be nice if GHC threw in some syntactic sugar (for example, <-$ = <- lift)?
For all the standard mtl monads, you don't need lift at all. get, put, ask, tell — they all work in any monad with the right transformer somewhere in the stack. The missing piece is IO, and even there liftIO lifts an arbitrary IO action down an arbitrary number of layers.
This is done with typeclasses for each "effect" on offer: for example, MonadState provides get and put. If you want to create your own newtype wrapper around a transformer stack, you can do deriving (..., MonadState MyState, ...) with the GeneralizedNewtypeDeriving extension, or roll your own instance:
instance MonadState MyState MyMonad where
    get = MyMonad get
    put s = MyMonad (put s)
You can use this to selectively expose or hide components of your combined transformer, by defining some instances and not others.
(You can easily extend this approach to all-new monadic effects you define yourself, by defining your own typeclass and providing boilerplate instances for the standard transformers, but all-new monads are rare; most of the time, you'll get by simply composing the standard set offered by mtl.)
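Here is a sketch of the deriving route, with one effect exposed and one hidden. The names App, MyState, counter, logLine, runApp and tick are all invented for the example. MonadIO is deliberately not derived, so code written against App can only reach IO through logLine.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.IO.Class (liftIO)
import Control.Monad.State (MonadState, StateT, gets, modify, runStateT)

-- Hypothetical application state.
newtype MyState = MyState { counter :: Int } deriving (Eq, Show)

-- The state effect is exposed through MonadState; MonadIO is not derived,
-- so arbitrary IO stays hidden behind the newtype.
newtype App a = App (StateT MyState IO a)
    deriving (Functor, Applicative, Monad, MonadState MyState)

-- The only IO operation the rest of the program is given.
logLine :: String -> App ()
logLine msg = App (liftIO (putStrLn msg))

runApp :: App a -> MyState -> IO (a, MyState)
runApp (App m) = runStateT m

-- No lift anywhere: modify and gets come from the derived MonadState instance.
tick :: App Int
tick = do
    modify (\s -> s { counter = counter s + 1 })
    n <- gets counter
    logLine ("tick " ++ show n)
    return n
```

For the hiding to be real, the App constructor would have to be left out of the module's export list.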
You can make your functions monad-agnostic by using typeclasses instead of concrete monad stacks.
Let's say that you have this function, for example:
bangMe :: State String ()
bangMe = do
    str <- get
    put $ str ++ "!"
    -- or just modify (++"!")
Of course, you realize that it works as a transformer as well, so one could write:
bangMe :: Monad m => StateT String m ()
However, if you have a function that uses a different stack, let's say ReaderT [String] (StateT String IO) () or whatever, you'll have to use the dreaded lift function! So how is that avoided?
The trick is to make the function signature even more generic, so that it says that the State monad can appear anywhere in the monad stack. This is done like this:
bangMe :: MonadState String m => m ()
This forces m to be a monad that supports state (virtually) anywhere in the monad stack, and the function will thus work without lifting for any such stack.
There's one problem, though; since IO isn't part of the mtl, it doesn't have a transformer (e.g. IOT) nor a handy type class per default. So what should you do when you want to lift IO actions arbitrarily?
To the rescue comes MonadIO! It behaves almost identically to MonadState, MonadReader etc, the only difference being that it has a slightly different lifting mechanism. It works like this: you can take any IO action, and use liftIO to turn it into a monad agnostic version. So:
action :: IO ()
liftIO action :: MonadIO m => m ()
By transforming all of the monadic actions you wish to use in this way, you can intertwine monads as much as you want without any tedious lifting.
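As a sketch of how this plays out, the same bangMe can be used in plain State and in the deeper stack mentioned above without a single lift. It assumes mtl; the names inState, shout and runShout are made up for the example.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.IO.Class (liftIO)
import Control.Monad.Reader (ReaderT, ask, runReaderT)
import Control.Monad.State (MonadState, StateT, execState, execStateT, modify)

-- Only asks for "some monad with String state", wherever it sits in the stack.
bangMe :: MonadState String m => m ()
bangMe = modify (++ "!")

-- Used in plain State.
inState :: String
inState = execState bangMe "hi"

-- Used in a two-layer stack over IO: reader, state and IO actions
-- are mixed freely, and none of them needs lift.
shout :: ReaderT [String] (StateT String IO) ()
shout = do
    names <- ask
    mapM_ (\name -> modify (++ name)) names
    bangMe
    liftIO (putStrLn "done")

runShout :: IO String
runShout = execStateT (runReaderT shout ["a", "b"]) ""
```

Here inState is "hi!" and runShout produces "ab!" after printing "done".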

Using Haskell's type system to enforce modularity

I'm thinking about ways to use Haskell's type system to enforce modularity in a program. For example, if I have a web application, I'm curious if there's a way to separate all database code from CGI code from filesystem code from pure code.
For example, I'm envisioning a DB monad, so I could write functions like:
countOfUsers :: DB Int
countOfUsers = select "count(*) from users"
I would like it to be impossible to use side effects other than those supported by the DB monad. I am picturing a higher-level monad that would be limited to direct URL handlers and would be able to compose calls to the DB monad and the IO monad.
Is this possible? Is this wise?
Update: I ended up achieving this with Scala instead of Haskell: http://moreindirection.blogspot.com/2011/08/implicit-environment-pattern.html
"I am picturing a higher-level monad that would be limited to direct URL handlers and would be able to compose calls to the DB monad and the IO monad."
You can certainly achieve this, and get very strong static guarantees about the separation of the components.
At its simplest, you want a restricted IO monad. Using something like a "tainting" technique, you can create a set of IO operations lifted into a simple wrapper, then use the module system to hide the underlying constructors for the types.
In this way you'll only be able to run CGI code in a CGI context, and DB code in a DB context. There are many examples on Hackage.
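A minimal sketch of the wrapper technique might look like the following. The constructor name UnsafeDB, the fake in-memory data and runDB are assumptions made for the example. In a real program the type would live in its own module whose export list omits the constructor.

```haskell
-- Would be exported as: module DB (DB, countOfUsers, runDB) where
-- Leaving UnsafeDB out of the export list is what stops other modules
-- from smuggling arbitrary IO into DB.
newtype DB a = UnsafeDB (IO a)

instance Functor DB where
    fmap f (UnsafeDB io) = UnsafeDB (fmap f io)

instance Applicative DB where
    pure = UnsafeDB . pure
    UnsafeDB f <*> UnsafeDB x = UnsafeDB (f <*> x)

instance Monad DB where
    UnsafeDB io >>= k = UnsafeDB (io >>= \a -> let UnsafeDB next = k a in next)

-- Hypothetical primitive: a fake "users table" kept in memory
-- instead of a real database query.
countOfUsers :: DB Int
countOfUsers = UnsafeDB (return (length fakeUsers))
  where
    fakeUsers = ["alice", "bob"]

-- The single controlled exit into IO, meant for the top-level handler.
runDB :: DB a -> IO a
runDB (UnsafeDB io) = io
```

Client code can combine the exported primitives with do-notation, but the only way to get an IO action out is runDB.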
Another way is to construct an interpreter for the actions, and then use data constructors to describe each primitive operation you wish. The operations should still form a monad, and you can use do-notation, but you'll instead be building a data structure that describes the actions to run, which you then execute in a controlled way via an interpreter.
This gives you perhaps more introspection than you need in typical cases, but the approach does give you full power to inspect user code before you execute it.
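The interpreter approach can be sketched with a small hand-rolled program type. Everything here (DBProgram, its constructors, runPure) is invented for illustration, and a real implementation might use a free monad library instead.

```haskell
-- Each constructor describes one primitive; nothing runs until interpreted.
data DBProgram a
    = Done a
    | CountUsers (Int -> DBProgram a)   -- continue with the count
    | AddUser String (DBProgram a)      -- add a user, then continue

instance Functor DBProgram where
    fmap f (Done a)            = Done (f a)
    fmap f (CountUsers k)      = CountUsers (fmap f . k)
    fmap f (AddUser name next) = AddUser name (fmap f next)

instance Applicative DBProgram where
    pure = Done
    pf <*> px = pf >>= \f -> fmap f px

instance Monad DBProgram where
    Done a            >>= f = f a
    CountUsers k      >>= f = CountUsers (\n -> k n >>= f)
    AddUser name next >>= f = AddUser name (next >>= f)

countUsers :: DBProgram Int
countUsers = CountUsers Done

addUser :: String -> DBProgram ()
addUser name = AddUser name (Done ())

-- One possible interpreter: pure, over an in-memory list of users.
-- Another interpreter could walk the same structure and talk to a real database.
runPure :: [String] -> DBProgram a -> (a, [String])
runPure users (Done a)            = (a, users)
runPure users (CountUsers k)      = runPure users (k (length users))
runPure users (AddUser name next) = runPure (users ++ [name]) next

example :: DBProgram Int
example = do
    addUser "alice"
    addUser "bob"
    countUsers
```

Because example is just a data structure, it cannot contain any effect that is not one of the constructors.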
I think there's a third way beyond the two Don Stewart mentioned, which may even be simpler:
class Monad m => MonadDB m where
    someDBop1 :: String -> m ()
    someDBop2 :: String -> m [String]

class Monad m => MonadCGI m where
    someCGIop1 :: ...
    someCGIop2 :: ...

functionWithOnlyDBEffects :: MonadDB m => Foo -> Bar -> m ()
functionWithOnlyDBEffects = ...

functionWithDBandCGIEffects :: (MonadDB m, MonadCGI m) => Baz -> Quux -> m ()
functionWithDBandCGIEffects = ...

instance MonadDB IO where
    someDBop1 = ...
    someDBop2 = ...

instance MonadCGI IO where
    someCGIop1 = ...
    someCGIop2 = ...
The idea is very simply that you define type classes for the various subsets of operations you want to separate out, and then parametrize your functions using them. Even if the only concrete monad you ever make an instance of the classes is IO, the functions parametrized on any MonadDB will still only be allowed to use MonadDB operations (and ones built from them), so you achieve the desired result. And in a "can do anything" function in the IO monad, you can use MonadDB and MonadCGI operations seamlessly, because IO is an instance.
(Of course, you can define other instances if you want to. Ones to lift the operations through various monad transformers would be straightforward, and I think there's actually nothing stopping you from writing instances for the "wrapper" and "interpreter" monads Don Stewart mentions, thereby combining the approaches - although I'm not sure if there's a reason you would want to.)
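One reason to define another instance is testing: the same MonadDB-constrained function can run against a pure in-memory instance. The sketch below uses its own hypothetical operations (addUser, listUsers) rather than the elided ones above, and assumes mtl.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Control.Monad.State (State, get, modify, runState)

-- Hypothetical DB operations, standing in for someDBop1 / someDBop2.
class Monad m => MonadDB m where
    addUser   :: String -> m ()
    listUsers :: m [String]

-- Only MonadDB operations are available here, whatever m turns out to be.
registerTwice :: MonadDB m => String -> m Int
registerTwice name = do
    addUser name
    addUser (name ++ "2")
    length <$> listUsers

-- A pure instance backed by a list held in State; no IO involved.
instance MonadDB (State [String]) where
    addUser name = modify (++ [name])
    listUsers    = get
```

With this instance, runState (registerTwice "bob") [] exercises the function without a database, and an IO instance can be added alongside it for production use.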
Thanks for this question!
I did some work on a client/server web framework that used monads to distinguish between different execution environments. The obvious ones were client-side and server-side, but it also allowed you to write both-side code (which could run on both client and server, because it didn't contain any special features) and also asynchronous client-side which was used for writing non-blocking code on the client (essentially a continuation monad on the client-side). This sounds quite related to your idea of distinguishing between CGI code and DB code.
Here are some resources about my project:
Slides from a presentation that I did about the project
Draft paper that I wrote with Don Syme
And I also wrote my Bachelor thesis on this subject (which is quite long though)
I think this is an interesting approach and it can give you interesting guarantees about the code. There are some tricky questions though. If you have a server-side function that takes an int and returns int, then what should be the type of this function? In my project, I used int -> int server (but it may also be possible to use server (int -> int)).
If you have a couple of functions like this, then it isn't as straightforward to compose them. Instead of writing goo (foo (bar 1)), you need to write the following code:
do b <- bar 1
   f <- foo b
   return (goo f)
You could write the same thing using some combinators, but my point is that composition is a bit less elegant.
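In Haskell, one such combinator is Kleisli composition (>=>) from Control.Monad. The sketch below uses Maybe as a stand-in for the "server" monad, and the bodies of bar, foo and goo are invented so the example can run.

```haskell
import Control.Monad ((>=>))

-- Hypothetical effectful steps; Maybe stands in for the "server" monad.
bar :: Int -> Maybe Int
bar n = Just (n + 1)

foo :: Int -> Maybe Int
foo n = if n > 0 then Just (n * 2) else Nothing

-- A plain pure function at the end of the pipeline.
goo :: Int -> String
goo = show

-- The do-notation version.
withDo :: Maybe String
withDo = do
    b <- bar 1
    f <- foo b
    return (goo f)

-- The same pipeline with Kleisli composition; the pure step is wrapped with return.
withKleisli :: Maybe String
withKleisli = (bar >=> foo >=> return . goo) 1
```

Both give the same result, but the combinator version is still noisier than goo (foo (bar 1)), which is the point being made.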

Resources