I have some traditional state-passing code that I need to optimize. I've heard that if you're maintaining and updating a lot of state, the ST monad can help improve efficiency. However, after looking into ST a bit, I'm unclear as to how and where the ST monad should be used.
A couple of approaches that come to mind:
Instead of passing state everywhere, pass STRefs instead. So, for instance, foo :: State -> a -> b -> (State, c) becomes foo :: STRef s State -> a -> b -> ST s c, and so on.
Keep my function signatures the same but use ST under the hood using runST.
Only use ST when updating the state in my main execution loop and escape ST using either runST or stToIO.
Obviously, these questions will ultimately depend on the specifics of my project but I'm wondering if there are any rule-of-thumb type guidelines that might be helpful before more detail is required.
Only approach 1 seems to be viable, in the general case.
When runST action returns and delivers its result, it deallocates all the memory that was allocated during the action with newSTRef. If you use one runST per function, you can't have any reference persist across function calls. Even if you don't need that, making each function call allocate its mutable state on entry only to deallocate it on exit (as in your option 2) looks pointless. I would just pass an STRef s State around, and mutate the state through that.
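A minimal sketch of option 1 (the State type, its field, and the update are made up purely for illustration):

import Control.Monad.ST
import Data.STRef

-- Hypothetical state type, just for illustration.
data State = State { counter :: Int }

-- Instead of State -> a -> b -> (State, c): take a reference to the
-- state and mutate it in place; only the result is returned.
foo :: STRef s State -> Int -> Int -> ST s Int
foo ref a b = do
    modifySTRef' ref (\st -> st { counter = counter st + a })
    st <- readSTRef ref
    return (counter st * b)

-- One runST at the top level; the reference persists across calls.
example :: Int
example = runST $ do
    ref <- newSTRef (State 0)
    _ <- foo ref 1 2
    foo ref 3 4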
There might be other aspects to consider, though. If a function performs many mutations inside, option 2 might still be OK, since the extra cost of entering/leaving would be negligible.
(I probably don't completely understand your option 3. Isn't it option 1, essentially?)
Context:
When considering the signature of a function in a typical imperative language, some parameters might be marked as mutable references, some as immutable references, and some as plain constant values.
I am trying to understand how to reproduce this in Haskell, most importantly the mutable/immutable handling of variables that carry the state.
There are several approaches to manage state in Haskell.
One approach seems to be via State/StateT/MonadState which fit well with monad transformers.
Among stateful function parameters, if I want to make it explicit that one should be regarded as immutable inside the function body, I believe answers to that question: Making Read-Only functions for a State in Haskell explain well how to do it, by using Reader or MonadReader.
Another approach to managing state (the one I am more interested in here) is ST. I like ST better because it allows managing more than one memory cell at a time, and it appears to be more performant than State.
The problem now is that I don't know how to properly manage a distinction between mutable/immutable stateful variables in ST. The Reader way does not seem to apply in that case. I have been looking at the STMonadTrans package which seems to help make ST fit with monad transformers, but I am not sure how to use it.
Question: Do you have a simple example of a function f that creates a mutable variable x with newSTRef, and passes x to a function g, in an immutable way, that is, in such a way that g can read x but not modify x? If not is there a workaround?
Remark 1: A workaround could be to freeze the mutable variables before passing them, to make them pure. In my case, however, that is not an acceptable solution, because freezing can be expensive or unsafe, and complex structures such as vectors of vectors cannot be frozen quickly. unsafeCoerce is not acceptable either. I am looking for a safe, zero-runtime-cost solution.
Remark 2: Someone said I could just read the reference before going into the function, but that is an oversimplified answer to my simplified question. In a more general context, it may be impossible to readSTRef the variable x before going into the function g, because x is something more complex, like a set of mutable arrays.
I am still asking my question in that simple way to try to figure out how to do the general thing on a simple example.
Thanks
I haven't read the novel that is the comments section, but a pattern like
newtype ReadOnly s a = ReadOnly (ST s a)
makeReadOnly :: STRef s a -> ReadOnly s a
makeReadOnly = ReadOnly . readSTRef
has served me well. This is a classic trick: if you want a data type to support some operations, just define the data type to be a record of the operations you want to support. In this case, there is only one.
(As a bonus, it can be seen from this that "read only variables" are highly composable!)
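To make the pattern concrete, here is a hedged sketch of the f/g example from the question (the accessor name runReadOnly is my own invention, not from any library):

import Control.Monad.ST
import Data.STRef

newtype ReadOnly s a = ReadOnly (ST s a)

makeReadOnly :: STRef s a -> ReadOnly s a
makeReadOnly = ReadOnly . readSTRef

-- The only way to consume a ReadOnly is to read it.
runReadOnly :: ReadOnly s a -> ST s a
runReadOnly (ReadOnly r) = r

-- g can observe x, but it holds no STRef, so it cannot write to x.
g :: ReadOnly s Int -> ST s Int
g x = do
    v <- runReadOnly x
    return (v + 1)

f :: ST s Int
f = do
    x <- newSTRef 41
    g (makeReadOnly x)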
Assume the code below. Is there a quicker way to get the contextual values out of findSerial than writing a function like outOfContext?
The underlying question is: does one usually stick within the context and use Functors, Applicatives, Monoids, and Monads to get the job done, or is it better to take the value out of its context and apply the usual non-contextual computation methods? In brief: I don't want to learn Haskell all wrong, since it takes enough time as it is.
import qualified Data.Map as Map
type SerialNumber = (String, Int)
serialList :: Map.Map String SerialNumber
serialList = Map.fromList [("belt drive",    ("BD", 0001))
                          ,("chain drive",   ("CD", 0002))
                          ,("drive pulley",  ("DP", 0003))
                          ,("drive sprocket",("DS", 0004))
                          ]
findSerial :: Ord k => k -> Map.Map k a -> Maybe a
findSerial input = Map.lookup input
outOfContext :: Maybe (a, b) -> (a, b)
outOfContext (Just (a, b)) = (a, b)
Assuming I understand it correctly, I think your question essentially boils down to “Is it idiomatic in Haskell to write and use partial functions?” (which your outOfContext function is, since it’s just a specialized form of the built-in partial function fromJust). The answer to that question is a resounding no. Partial functions are avoided whenever possible, and code that uses them can usually be refactored into code that doesn’t.
The reason partial functions are avoided is that they voluntarily compromise the effectiveness of the type system. In Haskell, when a function has type X -> Y, it is generally assumed that providing it an X will actually produce a Y, and that it will not do something else entirely (i.e. crash). If you have a function that doesn’t always succeed, reflecting that information in the type by writing X -> Maybe Y forces the caller to somehow handle the Nothing case, and it can either handle it directly or defer the failure further to its caller (by also producing a Maybe). This is great, since it means that programs that typecheck won’t crash at runtime. The program might still have logical errors, but knowing before even running the program that it won’t blow up is still pretty nice.
Partial functions throw this guarantee out the window. Any program that uses a partial function will crash at runtime if the function’s preconditions are accidentally violated, and since those preconditions are not reflected in the type system, the compiler cannot statically enforce them. A program might be logically correct at the time of its writing, but without enforcing that correctness with the type system, further modification, extension, or refactoring could easily introduce a bug by mistake.
For example, a programmer might write the expression
if isJust n then fromJust n else 0
which will certainly never crash at runtime, since fromJust’s precondition is always checked before it is called. However, the type system cannot enforce this, and a further refactoring might swap the branches of the if, or it might move the fromJust n to a different part of the program entirely and accidentally omit the isJust check. The program will still compile, but it may fail at runtime.
In contrast, if the programmer avoids partial functions, using explicit pattern-matching with case or total functions like maybe and fromMaybe, they can replace the tricky conditional above with something like
fromMaybe 0 n
which is not only clearer, but ensures any accidental misuse will simply fail to typecheck, and the potential bug will be detected much earlier.
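Applied to the findSerial example from the question, a total version might look like this (the fallback value and the describe helper are my own illustrative names, not part of the original code):

import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

type SerialNumber = (String, Int)

-- Total: the caller decides what happens when the key is missing.
findSerialOr :: Ord k => SerialNumber -> k -> Map.Map k SerialNumber -> SerialNumber
findSerialOr fallback input = fromMaybe fallback . Map.lookup input

-- Or handle both cases explicitly at the use site:
describe :: String -> Map.Map String SerialNumber -> String
describe name m = case Map.lookup name m of
    Nothing        -> name ++ ": no serial assigned"
    Just (code, n) -> name ++ ": " ++ code ++ "-" ++ show n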
For some concrete examples of how the type system can be a powerful ally if you stick exclusively to total functions, as well as some good food for thought about different ways to encode type safety for your domain into Haskell’s type system, I highly recommend reading Matt Parsons’s wonderful blog post, Type Safety Back and Forth, which explores these ideas in more depth. It additionally highlights how using Maybe as a catch-all representation of failure can be awkward, and it shows how the type system can be used to enforce preconditions to avoid needing to propagate Maybe throughout an entire system.
I'm working on implementing the UCT algorithm in Haskell, which requires a fair amount of data juggling. Without getting into too much detail, it's a simulation algorithm where, at each "step," a leaf node in the search tree is selected based on some statistical properties, a new child node is constructed at that leaf, and the stats corresponding to the new leaf and all of its ancestors are updated.
Given all that juggling, I'm not really sharp enough to figure out how to make the whole search tree a nice immutable data structure à la Okasaki. Instead, I've been playing around with the ST monad a bit, creating structures composed of mutable STRefs. A contrived example (unrelated to UCT):
import Control.Monad
import Control.Monad.ST
import Data.STRef

data STRefPair s a b = STRefPair { left :: STRef s a, right :: STRef s b }

mkStRefPair :: a -> b -> ST s (STRefPair s a b)
mkStRefPair a b = do
    a' <- newSTRef a
    b' <- newSTRef b
    return $ STRefPair a' b'

derp :: (Num a, Num b) => STRefPair s a b -> ST s ()
derp p = do
    modifySTRef (left p) (\x -> x + 1)
    modifySTRef (right p) (\x -> x - 1)

herp :: (Num a, Num b) => (a, b)
herp = runST $ do
    p <- mkStRefPair 0 0
    replicateM_ 10 $ derp p
    a <- readSTRef $ left p
    b <- readSTRef $ right p
    return (a, b)

main = print herp -- should print (10, -10)
Obviously this particular example would be much easier to write without using ST, but hopefully it's clear where I'm going with this... if I were to apply this sort of style to my UCT use case, is that wrong-headed?
Somebody asked a similar question here a couple years back, but I think my question is a bit different... I have no problem using monads to encapsulate mutable state when appropriate, but it's that "when appropriate" clause that gets me. I'm worried that I'm reverting to an object-oriented mindset prematurely, where I have a bunch of objects with getters and setters. Not exactly idiomatic Haskell...
On the other hand, if it is a reasonable coding style for some set of problems, I guess my question becomes: are there any well-known ways to keep this kind of code readable and maintainable? I'm sort of grossed out by all the explicit reads and writes, and especially grossed out by having to translate from my STRef-based structures inside the ST monad to isomorphic but immutable structures outside.
I don't use ST much, but sometimes it is just the best solution. This can be in many scenarios:
There are already well-known, efficient ways to solve a problem. Quicksort is a perfect example of this. It is known for its speed and in-place behavior, which cannot be imitated by pure code very well.
You need rigid time and space bounds. Especially with lazy evaluation (and Haskell doesn't even specify whether there is lazy evaluation, just that it is non-strict), the behavior of your programs can be very unpredictable. Whether there is a memory leak could depend on whether a certain optimization is enabled. This is very different from imperative code, which usually has a fixed set of variables and a defined evaluation order.
You've got a deadline. Although the pure style is almost always better practice and cleaner code, if you are used to writing imperatively and need the code soon, starting imperative and moving to functional later is a perfectly reasonable choice.
When I do use ST (and other monads), I try to follow these general guidelines:
Use Applicative style often. This makes the code easier to read and, if you do switch to an immutable version, much easier to convert. Not only that, but Applicative style is much more compact.
Don't just use ST. If you program only in ST, the result will be no better than a huge C program, possibly worse because of the explicit reads and writes. Instead, intersperse pure Haskell code where it applies. I often find myself using things like STRef s (Map k [v]), as in the sketch after this list. The map itself is being mutated, but much of the heavy lifting is done purely.
Don't remake libraries if you don't have to. A lot of code written for IO can be cleanly, and fairly mechanically, converted to ST. Replacing all the IORefs with STRefs and IOs with STs in Data.HashTable was much easier than writing a hand-coded hash table implementation would have been, and probably faster too.
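A hedged sketch of that STRef s (Map k [v]) pattern (the function names are mine, not from any library): the only mutation is swapping a new map into the reference; building the new map is Data.Map's pure insertWith.

import Control.Monad.ST
import Data.STRef
import qualified Data.Map as Map

-- Mutate the reference, but compute the new map purely.
addValue :: Ord k => STRef s (Map.Map k [v]) -> k -> v -> ST s ()
addValue ref k v = modifySTRef' ref (Map.insertWith (++) k [v])

collect :: Ord k => [(k, v)] -> Map.Map k [v]
collect pairs = runST $ do
    ref <- newSTRef Map.empty
    mapM_ (uncurry (addValue ref)) pairs
    readSTRef ref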
One last note - if you are having trouble with the explicit reads and writes, there are ways around it.
Algorithms which make use of mutation and algorithms which do not are different algorithms. Sometimes there is a straightforward bounds-preserving translation from the former to the latter, sometimes a difficult one, and sometimes only one which does not preserve complexity bounds.
A skim of the paper suggests to me that it doesn't make essential use of mutation, so I think a potentially really nifty lazy functional algorithm could be developed. But it would be a different, though related, algorithm to the one described.
Below, I describe one such approach -- not necessarily the best or most clever, but pretty straightforward:
Here's the setup as I understand it: A) a branching tree is constructed; B) payoffs are then pushed back from the leaves to the root, which then indicates the best choice at any given step. But this is expensive, so instead only portions of the tree are explored to the leaves, in a nondeterministic manner. Furthermore, each further exploration of the tree is determined by what has been learned in previous explorations.
So we build code to describe the "stage-wise" tree. Then we have another data structure to define a partially explored tree, along with partial reward estimates. We then have a function of type randseed -> ptree -> ptree that, given a random seed and a partially explored tree, embarks on one further exploration of the tree, updating the ptree structure as it goes. Then we can just iterate this function over an empty ptree to get a list of increasingly well-sampled ptrees, and walk down this list until some specified cutoff condition is met.
So now we've gone from one algorithm where everything is blended together to three distinct steps: 1) building the whole state tree, lazily; 2) updating some partial exploration with some sampling of a structure; and 3) deciding when we've gathered enough samples.
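A hedged type sketch of that decomposition (all names and types are my own invention; explore is a stub standing in for the real UCT selection/backup step, which is elided):

import System.Random (StdGen, mkStdGen)

-- 1) The full branching tree, built lazily: only explored parts are forced.
newtype GameTree move = Node [(move, GameTree move)]

-- 2) A partially explored tree with partial reward estimates.
data PTree move = PTree
    { visits   :: Int
    , reward   :: Double
    , expanded :: [(move, PTree move)]
    }

-- randseed -> ptree -> ptree: one further exploration of the tree.
explore :: StdGen -> GameTree move -> PTree move -> (StdGen, PTree move)
explore g _tree pt = (g, pt { visits = visits pt + 1 })  -- stub only

-- 3) Iterate, walking the results until a cutoff (here: a visit budget).
search :: Int -> GameTree move -> PTree move -> PTree move
search budget tree = go (mkStdGen 42)
  where
    go g pt
        | visits pt >= budget = pt
        | otherwise           = let (g', pt') = explore g tree pt
                                in  go g' pt'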
It can be really difficult to tell when using ST is appropriate. I would suggest you do it both with and without ST (not necessarily in that order). Keep the non-ST version simple; using ST should be seen as an optimization, and you don't want to do that until you know you need it.
I have to admit that I cannot read the Haskell code. But if you use ST for mutating the tree, then you can probably replace this with an immutable tree without losing much because:
Same complexity for mutable and immutable tree
You have to mutate every node above the new leaf. An immutable tree has to replace all nodes above the modified node. So in both cases the touched nodes are the same, thus you don't gain anything in complexity.
In Java, for example, object creation is more expensive than mutation, so maybe you can gain a bit here in Haskell by using mutation. But I don't know this for sure, and a small gain does not buy you much, because of the next point.
Updating the tree is presumably not the bottleneck
The evaluation of the new leaf will probably be much more expensive than updating the tree. At least this is the case for UCT in computer Go.
Use of the ST monad is usually (but not always) as an optimization. For any optimization, I apply the same procedure:
Write the code without it.
Profile and identify bottlenecks.
Incrementally rewrite the bottlenecks and test for improvements/regressions.
The other use case I know of is as an alternative to the state monad. The key difference being that with the state monad the type of all of the data stored is specified in a top-down way, whereas with the ST monad it is specified bottom-up. There are cases where this is useful.
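A small illustrative sketch of that difference (my own example): with State, the single state type is fixed at the top; with ST, each cell is allocated where it is needed, and cells can have different types.

import Control.Monad.State
import Control.Monad.ST
import Data.STRef

-- Top-down: the whole computation is committed to one state type (Int).
bumpTwice :: State Int Int
bumpTwice = do
    modify (+1)
    modify (+1)
    get

-- Bottom-up: state cells of different types appear as needed.
twoCells :: (Int, Bool)
twoCells = runST $ do
    n <- newSTRef (0 :: Int)
    b <- newSTRef False
    modifySTRef' n (+1)
    writeSTRef b True
    (,) <$> readSTRef n <*> readSTRef b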
I wrote a function in Haskell that takes a few parameters like Word32 and String (ignore currying) and outputs IO Word32. Now, this is a function in the true sense: for the same inputs, the output will always be the same. There are no side effects. The reason the function returns IO Word32 instead of Word32 is that it updates many 32-bit Linear Feedback Shift Registers (LFSRs) and other registers several times in a loop in order to compute the final Word32 output.
My question is this: Given that this function effectively has no side-effects, is it possible to hide those register updates inside the function implementation so that the function returns Word32 and not IO Word32? If so, how?
Yes! Haskell can do this.
The ST monad
If you actually use mutable state (registers) that is entirely hidden from any observer outside the function, then you are in the ST monad, a monad for memory effects only. You enter the ST world via runST, and when you exit the function, all effects are guaranteed not to be visible.
It is precisely the right computational environment for working with local, mutable state.
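For instance, a minimal sketch, assuming a made-up Galois-style LFSR step (the tap constant and the update are illustrative, not the asker's actual registers):

import Control.Monad (replicateM_)
import Control.Monad.ST
import Data.STRef
import Data.Bits (shiftR, xor, (.&.))
import Data.Word (Word32)

-- All register mutation happens inside runST, so the result is a
-- plain Word32: callers cannot observe any effects.
lfsrRun :: Word32 -> Int -> Word32
lfsrRun seed rounds = runST $ do
    reg <- newSTRef seed
    replicateM_ rounds $
        modifySTRef' reg $ \r ->
            (r `shiftR` 1) `xor` (negate (r .&. 1) .&. 0xB4BCD35C)
    readSTRef reg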
Purely functional state: the State monad
If, however, you're not actually mutating registers or cells, but rather updating a purely functional value many times, a simpler environment is available: the State monad. This doesn't allow mutable state, but gives an illusion of local state.
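The same sketch in State form (again illustrative only): the register is now a pure value threaded through the computation, and evalState returns a plain Word32.

import Control.Monad (replicateM_)
import Control.Monad.State
import Data.Bits (shiftR, xor, (.&.))
import Data.Word (Word32)

lfsrRunS :: Word32 -> Int -> Word32
lfsrRunS seed rounds = evalState (replicateM_ rounds step >> get) seed
  where
    -- One purely functional register update per round.
    step = modify $ \r ->
        (r `shiftR` 1) `xor` (negate (r .&. 1) .&. 0xB4BCD35C)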
IO, and unsafePerformIO
Finally, if you have local, mutable effects, like in the ST monad, but for some reason or another, you are going to need IO operations on that state (such as via an FFI call), you can simulate the ST monad, with almost as much safety, by using unsafePerformIO, instead of runST, to introduce a local IO environment. As the IO monad doesn't have nice types to enforce abstraction, you will need to manually assure yourself that the side effects will not be observable.
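A sketch of that pattern (a toy example of mine, not the asker's code; it is safe only because the IORef never escapes and the result depends only on the argument):

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

sumWithRef :: [Int] -> Int
sumWithRef xs = unsafePerformIO $ do
    acc <- newIORef 0                       -- local, never escapes
    mapM_ (\x -> modifyIORef' acc (+ x)) xs
    readIORef acc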
If you imported that function using the FFI, just remove the IO from the return type. Otherwise, use unsafePerformIO :: IO a -> a from System.IO.Unsafe. Please note that this function is one of the most dangerous functions in Haskell. Don't use it if you're not really sure about the consequences. But for your purpose, it seems okay.
Yes, this would be a legitimate use of unsafePerformIO.
But it's only OK if you are really sure there are no visible effects.
Let's say we want to use ReaderT [(a,b)] over the Maybe monad, and then we want to do a lookup in the list.
Now, an easy and not too uncommon way to do this is:
first possibility
find a = ReaderT (lookup a)
However, this does seem to assert something non-trivial about how the ReaderT transformer works. Looking at the source code for Control.Monad.Reader, it's clear that this works just fine, but I haven't read any documentation supporting it. Alternatively, we could write find like this:
second possibility
find a = do y <- ask
            lift (lookup a y)
Similar ideas hold for wrapping MaybeT, StateT, State and Reader. Usually I write something like the first example, but most of the time it is really obvious how to write it like the second example, and you might even say it's more readable. So my question is: should code like the first example be considered bad?
I think the biggest problem with the first way is:
If the mtl authors (or those of whatever transformer library you use) decide to stop exporting the data constructor for ReaderT, then it will stop working. This happened with the State monad in the version bump from mtl 1 to mtl 2, and it's quite annoying. Whereas ask is part of the official API of Reader, and you should plan on it sticking around.
On the other hand, I wouldn't consider the first way wrong.
The current version of the mtl library (which is based on the transformers library) exports the function reader :: (r -> a) -> Reader r a for exactly this purpose when using the simple Reader monad. So we see that the design of the library does take this usage into account. Since no such function is provided for ReaderT, we can say with some confidence that the officially supported way to do this with ReaderT is to use the constructor directly.
I'll agree with you if you say that an analogous readerT :: Monad m => (r -> a) -> ReaderT r m a ought to be added to the library. That would be good both for consistency, and for allowing the possibility of changing the internal representation someday without breaking anyone's code.
But for now, your "first possibility" is the way to go.
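For reference, both versions give find the type Eq a => a -> ReaderT [(a, b)] Maybe b, so a (hypothetical) GHCi session looks like:

ghci> runReaderT (find 2) [(1,'a'),(2,'b')]
Just 'b'
ghci> runReaderT (find 3) [(1,'a'),(2,'b')]
Nothing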
At least there is a speed difference.
I wrote a program which uses a random generator as its state and must generate about 5,000,000 random values while running. Now consider these two functions, which roll a die:
random16 = State $ randomR (1,6) -- using the internal representation

random16' = do
    s <- get
    let (r, s') = randomR (1,6) s
    put s'
    return r
With the first one, the program runs in about 6 seconds, while the second one is much slower, taking about 8 seconds. I can imagine that it is similar for Reader, so maybe use the first style instead of the clearer one when runtime is important. I used the strict version of State for this.