Maintaining complex state in Haskell

Suppose you're building a fairly large simulation in Haskell. There are many different types of entities whose attributes update as the simulation progresses. Let's say, for the sake of example, that your entities are called Monkeys, Elephants, Bears, etc.
What is your preferred method for maintaining these entities' states?
The first and most obvious approach I thought of was this:
mainLoop :: [Monkey] -> [Elephant] -> [Bear] -> String
mainLoop monkeys elephants bears =
  let monkeys'   = updateMonkeys monkeys
      elephants' = updateElephants elephants
      bears'     = updateBears bears
  in if shouldExit monkeys elephants bears
       then "Done"
       else mainLoop monkeys' elephants' bears'
It's already ugly having each type of entity explicitly mentioned in the mainLoop function signature. You can imagine how it would get absolutely awful if you had, say, 20 types of entities. (20 is not unreasonable for complex simulations.) So I think this is an unacceptable approach. But its saving grace is that functions like updateMonkeys are very explicit in what they do: They take a list of Monkeys and return a new one.
So then the next thought would be to roll everything into one big data structure that holds all state, thus cleaning up the signature of mainLoop:
mainLoop :: GameState -> String
mainLoop gs0 =
  let gs1 = updateMonkeys gs0
      gs2 = updateElephants gs1
      gs3 = updateBears gs2
  in if shouldExit gs0
       then "Done"
       else mainLoop gs3
Some would suggest that we wrap GameState up in a State Monad and call updateMonkeys etc. in a do. That's fine. Some would rather suggest we clean it up with function composition. Also fine, I think. (BTW, I'm a novice with Haskell, so maybe I'm wrong about some of this.)
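For concreteness, here is a minimal sketch of the State-monad variant; GameState and the update functions are hypothetical stand-ins for the ones above:
import Control.Monad.State

data GameState = GameState  -- fields elided

-- The update functions, as plain GameState transformers.
updateMonkeys, updateElephants, updateBears :: GameState -> GameState
updateMonkeys   = id  -- placeholder
updateElephants = id  -- placeholder
updateBears     = id  -- placeholder

step :: State GameState ()
step = do
  modify updateMonkeys
  modify updateElephants
  modify updateBears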
But then the problem is, functions like updateMonkeys don't give you useful information from their type signature. You can't really be sure what they do. Sure, updateMonkeys is a descriptive name, but that's little consolation. When I pass in a god object and say "please update my global state," I feel like we're back in the imperative world. It feels like global variables by another name: You have a function that does something to the global state, you call it, and you hope for the best. (I suppose you still avoid some concurrency problems that would be present with global variables in an imperative program. But meh, concurrency isn't nearly the only thing wrong with global variables.)
A further problem is this: Suppose the objects need to interact. For example, we have a function like this:
stomp :: Elephant -> Monkey -> (Elephant, Monkey)
stomp elephant monkey =
  (elongateEvilGrin elephant, decrementHealth monkey)
Say this gets called in updateElephants, because that's where we check to see if any of the elephants are in stomping range of any monkeys. How do you elegantly propagate the changes to both the monkeys and elephants in this scenario? In our second example, updateElephants takes and returns a god object, so it could effect both changes. But this just muddies the waters further and reinforces my point: With the god object, you're effectively just mutating global variables. And if you're not using the god object, I'm not sure how you'd propagate those types of changes.
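For what it's worth, one way to propagate both changes without a god object is to thread the monkey list through the elephant updates explicitly. A minimal sketch, with hypothetical types and a placeholder range check:
import Data.List (mapAccumL)

data Monkey   = Monkey   { health :: Int }
data Elephant = Elephant { grin   :: Int }

inRange :: Elephant -> Monkey -> Bool
inRange _ _ = True  -- placeholder for a real distance check

stomp :: Elephant -> Monkey -> (Elephant, Monkey)
stomp e m = (e { grin = grin e + 1 }, m { health = health m - 1 })

-- mapAccumL threads the monkey list through the elephants, so each
-- stomp updates both sides and no global mutation is needed.
updateHerd :: ([Monkey], [Elephant]) -> ([Monkey], [Elephant])
updateHerd (ms, es) = mapAccumL stompAll ms es
  where
    stompAll monkeys e =
      let (e', monkeys') = mapAccumL step e monkeys
      in (monkeys', e')
    step e m
      | inRange e m = stomp e m
      | otherwise   = (e, m)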
What to do? Surely many programs need to manage complex state, so I'm guessing there are some well-known approaches to this problem.
Just for the sake of comparison, here's how I might solve the problem in the OOP world. There would be Monkey, Elephant, etc. objects. I'd probably have class methods to do lookups in the set of all live animals. Maybe you could look up by location, by ID, whatever. Thanks to the data structures underlying the lookup functions, they'd stay allocated on the heap. (I'm assuming GC or reference counting.) Their member variables would get mutated all the time. Any method of any class would be able to mutate any live animal of any other class. E.g. an Elephant could have a stomp method that would decrement the health of a passed-in Monkey object, and there would be no need to pass that change back up the call stack.
Likewise, in an Erlang or other actor-oriented design, you could solve these problems fairly elegantly: Each actor maintains its own loop and thus its own state, so you never need a god object. And message passing allows one object's activities to trigger changes in other objects without passing a bunch of stuff all the way back up the call stack. Yet I have heard it said that actors in Haskell are frowned upon.

The answer is functional reactive programming (FRP). It is a hybrid of two coding styles: component state management and time-dependent values. Since FRP is actually a whole family of design patterns, I want to be more specific: I recommend Netwire.
The underlying idea is very simple: You write many small, self-contained components each with their own local state. This is practically equivalent to time-dependent values, because each time you query such a component you may get a different answer and cause a local state update. Then you combine those components to form your actual program.
While this sounds complicated and inefficient, it's actually just a very thin layer around regular functions. The design pattern implemented by Netwire is inspired by AFRP (Arrowized Functional Reactive Programming). It's probably different enough to deserve its own name (WFRP?). You may want to read the tutorial.
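To demystify "component with local state", here is a toy model of the idea (this is not Netwire's actual wire type, just a sketch of the concept): a component maps an input to an output together with its own successor, so state lives in the closure.
newtype Toy a b = Toy { stepToy :: a -> (b, Toy a b) }

-- A counter: each step yields a different answer and an updated component.
counter :: Toy () Int
counter = go 0
  where go n = Toy (\() -> (n, go (n + 1)))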
In any case a small demo follows. Your building blocks are wires:
myWire :: WireP A B
Think of this as a component. It is a time-varying value of type B that depends on a time-varying value of type A, for example a particle in a simulator:
particle :: WireP [Particle] Particle
It depends on a list of particles (for example all currently existing particles) and is itself a particle. Let's use a predefined wire (with a simplified type):
time :: WireP a Time
This is a time-varying value of type Time (= Double). Well, it's time itself (starting at 0 counted from whenever the wire network was started). Since it doesn't depend on another time-varying value you can feed it whatever you want, hence the polymorphic input type. There are also constant wires (time-varying values that don't change over time):
pure 15 :: WireP a Integer
-- or even:
15 :: WireP a Integer
To connect two wires you simply use categorical composition:
integral_ 3 . 15
This gives you a clock at 15x real time speed (the integral of 15 over time) starting at 3 (the integration constant). Thanks to various class instances wires are very handy to combine. You can use your regular operators as well as applicative style or arrow style. Want a clock that starts at 10 and is twice the real time speed?
10 + 2*time
Want a particle that starts at (0, 0) with (0, 0) velocity and accelerates with (2, 1) per second per second?
integral_ (0, 0) . integral_ (0, 0) . pure (2, 1)
Want to display statistics while the user presses the spacebar?
stats . keyDown Spacebar <|> "stats currently disabled"
This is just a small fraction of what Netwire can do for you.

I know this is an old topic, but I am facing the same problem right now while trying to implement the Rail Fence cipher exercise from exercism.io. It is quite disappointing to see such a common problem get such poor attention in Haskell. I don't accept that to do something as simple as maintaining state I need to learn FRP. So I continued googling and found a solution that looks more straightforward: the State monad: https://en.wikibooks.org/wiki/Haskell/Understanding_monads/State
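For anyone landing here the same way, a minimal sketch of that approach with Control.Monad.State (the simulation content is a made-up tick counter):
import Control.Monad.State

newtype Sim = Sim { tick :: Int } deriving Show

-- The whole state lives in one value that the State monad threads for
-- us, instead of the numbered gs0, gs1, gs2 from the question.
mainLoop :: State Sim String
mainLoop = do
  modify (\s -> s { tick = tick s + 1 })
  s <- get
  if tick s >= 10 then return "Done" else mainLoop

main :: IO ()
main = print (runState mainLoop (Sim 0))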

Related

Why does FRP consider time as a factor for values?

Behaviors are ubiquitously defined as “time-varying values”.¹
Why? Time being the dependency/parameter for varying values is very uncommon.
My intuition for FRP would be to have behaviors as event-varying values instead; that is much more common, much simpler, I'd wager much more efficient, and extensible enough to support time too (via a tick event).
For instance, if you write a counter, you don't care about time/associated timestamps, you just care about the "Increase-button clicked" and "Decrease-button clicked" events.
If you write a game and want a position/force behavior, you just care about the WASD/arrow keys held events, etc. (unless you ban your players from moving to the left in the afternoon; how iniquitous!).
So: why is time a consideration at all? Why timestamps? Why do some libraries (e.g. reactive-banana, reactive) take it to the extent of having Future and Moment values? Why work with event streams instead of just responding to an event occurrence? All of this just seems to over-complicate a simple idea (event-varying/event-driven values); what's the gain? What problem are we solving here? (I'd love to also get a concrete example along with a wonderful explanation, if possible.)
¹ Behaviors have been defined this way here, here, here... and pretty much everywhere I've encountered them.
Behaviors differ from Events primarily in that a Behavior has a value right now while an Event only has a value whenever a new event comes in.
So what do we mean by "right now"? Technically all changes are implemented as push or pull semantics over event streams, so we can only possibly mean "the most recent value as of the last event of consequence for this Behavior". But that's a fairly hairy concept—in practice "now" is much simpler.
The reasoning for why "now" is simpler comes down to the API. Here are two examples from Reactive Banana.
Eventually an FRP system must always produce some kind of externally visible change. In Reactive Banana this is facilitated by the reactimate :: Event (IO ()) -> Moment () function which consumes event streams. There is no way to have a Behavior trigger external changes---you always have to do something like reactimate (someBehavior <@ sampleTickEvent) to sample the behavior at concrete times.
Behaviors are Applicatives, unlike Events. Why? Well, let's assume Event were an Applicative and think about what happens when we have two event streams f and x and write f <*> x: since events occur at all different times, the chance of f and x being defined simultaneously is (almost certainly) 0. So f <*> x would always be the empty event stream, which is useless.
What you really want is for f <*> x to cache the most recent value of each and take their combined value "all of the time". That's a really confusing concept to talk about in terms of an event stream, so instead let's consider f and x as taking values for all points in time. Now f <*> x is also defined at all points in time. We've just invented Behaviors.
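To make that last step concrete, here is the toy denotational model (not any real library's type): a Behavior is literally a function of time, and <*> is pointwise application, so f <*> x is defined at all points in time by construction:
newtype Behavior a = Behavior (Double -> a)

instance Functor Behavior where
  fmap f (Behavior x) = Behavior (f . x)

instance Applicative Behavior where
  pure a = Behavior (const a)
  Behavior f <*> Behavior x = Behavior (\t -> f t (x t))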
Because it was the simplest way I could think of to give a precise denotation (implementation-independent meaning) to the notion of behaviors, including the sorts of operations I wanted, including differentiation and integration, as well as tracking one or more other behaviors (including but not limited to user-generated behavior).
Why? Time being the dependency/parameter for varying values is very uncommon.
I suspect that you're confusing the construction (recipe) of a behavior with its meaning. For instance, a behavior might be constructed via a dependency on something like user input, possibly with additional synthetic transformation. So there's the recipe. The meaning, however, is simply a function of time, related to the time-function that is the user input. Note that by "function", I mean in the math sense of the word: a (deterministic) mapping from domain (time) to range (value), not in the sense that there's a purely programmatic description.
I've seen many questions asking why time matters and why continuous time. If you apply the simple discipline of giving a mathematical meaning in the style of denotational semantics (a simple and familiar style for functional programmers), the issues become much clearer.
If you really want to grok the essence of and thinking behind FRP, I recommend you read my answer to "Specification for a Functional Reactive Programming language" and follow pointers, including "What is Functional Reactive Programming".
Conal Elliott's Push-Pull FRP paper describes event-varying data, where the only points in time that are interesting are when events occur. Reactive event-varying data is the current value and the next Event that will change it. An Event is a Future point in the event-varying Reactive data.
data Reactive a = a `Stepper` Event a
newtype Event a = Ev (Future (Reactive a))
The Future doesn't need to have a time associated with it, it just needs to represent the idea of a value that hasn't happened yet. In an impure language with events, for example, a future can be an event handle and a value. When the event occurs, you set the value and raise the handle.
Reactive a has a value for a at all points in time, so why would we need Behaviors? Let's make a simple game. In between when the user presses the WASD keys, the character, accelerated by the force applied, still moves on the screen. The character's position at different points in time is different, even though no event has occurred in the intervening time. This is what a Behavior describes - something that not only has a value at all points in time, but its value can be different at all points in time, even with no intervening events.
One way to describe Behaviors would be to repeat what we just stated. Behaviors are things that can change in-between events. In-between events they are time-varying values, or functions of time.
type Behavior a = Reactive (Time -> a)
We don't need Behavior, we could simply add events for clock ticks, and write all of the logic in our entire game in terms of these tick events. This is undesirable to some developers as the code declaring what our game is is now intermingled with the code providing how it is implemented. Behaviors allow the developer to separate this logic between the description of the game in terms of time-varying variables and the implementation of the engine that executes that description.

FRP - Event streams and Signals - what is lost in using just signals?

In recent implementations of Classic FRP, for instance reactive-banana, there are event streams and signals, which are step functions (reactive-banana calls them behaviours, but they are nevertheless step functions). I've noticed that Elm only uses signals, and doesn't differentiate between signals and event streams. Also, reactive-banana allows going from event streams to signals (edited: and it's sort of possible to act on behaviours using reactimate', although it's not considered good practice), which kind of means that in theory we could apply all the event stream combinators on signals/behaviours by first converting the signal to an event stream, applying the combinator, and then converting back. So, given that it's in general easier to use and learn just one abstraction, what is the advantage of having separate signals and event streams? Is anything lost in using just signals and converting all the event stream combinators to operate on signals?
edit: The discussion has been very interesting. The main conclusions I took from it myself are that behaviours/event sources are both needed for mutually recursive definitions (feedback) and for having an output depend on two inputs (a behaviour and an event source) but only cause an action when one of them changes (<@>).
(Clarification: In reactive-banana, it is not possible to convert a Behavior back to an Event. The stepper function is a one-way ticket. There is a changes function, but its type indicates that it is "impure" and it comes with a warning that it does not preserve the semantics.)
I believe that having two separate concepts makes the API more elegant. In other words, it boils down to a question of API usability. I think that the two concepts behave sufficiently differently that things flow better if you have two separate types.
For example, the direct product for each type is different. A pair of Behaviors is equivalent to a Behavior of pairs
(Behavior a, Behavior b) ~ Behavior (a,b)
whereas a pair of Events is equivalent to an Event of a direct sum:
(Event a, Event b) ~ Event (EitherOrBoth a b)
If you merge both types into one, then neither of these equivalences will hold anymore.
However, one of the main reasons for the separation of Event and Behavior is that the latter does not have a notion of changes or "updates". This may seem like an omission at first, but it is extremely useful in practice, because it leads to simpler code. For instance, consider a monadic function newInput that creates an input GUI widget that displays the text indicated in the argument Behavior,
input <- newInput (bText :: Behavior String)
The key point now is that the text displayed does not depend on how often the Behavior bText may have been updated (to the same or a different value), only on the actual value itself. This is a lot easier to reason about than the other case, where you would have to think about what happens when two successive event occurrences have the same value. Do you redraw the text while the user edits it?
(Of course, in order to actually draw the text, the library has to interface with the GUI framework and does keep track of changes in the Behavior. This is what the changes combinator is for. However, this can be seen as an optimization and is not available from "within FRP".)
The other main reason for the separation is recursion. Most Events that recursively depend on themselves are ill-defined. However, recursion is always allowed if you have mutual recursion between an Event and a Behavior:
e = ((+) <$> b) <@> einput
b = stepper 0 e
There is no need to introduce delays by hand, it just works out of the box.
Something critically important to me is lost, namely the essence of behaviors, which is (possibly continuous) variation over continuous time.
Precise, simple, useful semantics (independent of a particular implementation or execution) is often lost as well.
Check out my answer to "Specification for a Functional Reactive Programming language", and follow the links there.
Whether in time or in space, premature discretization thwarts composability and complicates semantics.
Consider vector graphics (and other spatially continuous models like Pan's): premature discretization there is just like the premature finitization of data structures explained in Why Functional Programming Matters.
I don't think there's any benefit to using the signals/behaviors abstraction over elm-style signals. As you point out, it's possible to create a signal-only API on top of the signal/behavior API (not at all ready for use, but see https://github.com/JohnLato/impulse/blob/dyn2/src/Reactive/Impulse/Syntax2.hs for an example). I'm pretty sure it's also possible to write a signal/behavior API on top of an elm-style API as well. That would make the two APIs functionally equivalent.
WRT efficiency, with a signals-only API the system should have a mechanism where only signals that have updated values will cause recomputations (e.g. if you don't move the mouse, the FRP network won't re-calculate the pointer coordinates and redraw the screen). Provided this is done, I don't think there's any loss of efficiency compared to a signals-and-streams approach. I'm pretty sure Elm works this way.
I don't think the continuous-behavior issue makes any difference here (or really at all). What people mean by saying behaviors are continuous over time is that they are defined at all times (i.e. they're functions over a continuous domain); the behavior itself isn't a continuous function. But we don't actually have a way to sample a behavior at any time; they can only be sampled at times corresponding to events, so we can't use the full power of this definition!
Semantically, starting from these definitions:
Event == for some t ∈ T: [(t,a)]
Behavior == ∀ t ∈ T: t -> b
since behaviors can only be sampled at times where events are defined, we can create a new domain TX where TX is the set of all times t at which Events are defined. Now we can loosen the Behavior definition to
Behavior == ∀ t ∈ TX: t -> b
without losing any power (i.e. this is equivalent to the original definition within the confines of our frp system). Now we can enumerate all times in TX to transform this to
Behavior == ∀ t ∈ TX: [(t,b)]
which is identical to the original Event definition except for the domain and quantification. Now we can change the domain of Event to TX (by the definition of TX), and the quantification of Behavior (from forall to for some) and we get
Event == for some t ∈ TX: [(t,a)]
Behavior == for some t ∈ TX: [(t,b)]
and now Event and Behavior are semantically identical, so they could obviously be represented using the same structure in an FRP system. We do lose a bit of information at this step; if we don't differentiate between Event and Behavior we don't know that a Behavior is defined at every time t, but in practice I don't think this really matters much. What elm does IIRC is require both Events and Behaviors to have values at all times and just use the previous value for an Event if it hasn't changed (i.e. change the quantification of Event to forall instead of changing the quantification of Behavior). This means you can treat everything as a signal and it all Just Works; it's just implemented so that the signal domain is exactly the subset of time that the system actually uses.
I think this idea was presented in a paper (which I can't find now, anyone else have a link?) about implementing FRP in Java, perhaps from POPL '14? Working from memory, so my outline isn't as rigorous as the original proof.
There's nothing to stop you from creating a more-defined Behavior by e.g. pure someFunction, this just means that within an FRP system you can't make use of that extra defined-ness, so nothing is lost by a more restricted implementation.
As for notional signals such as time, note that it's impossible to implement an actual continuous-time signal using typical programming languages. Since the implementation will necessarily be discrete, converting that to an event stream is trivial.
In short, I don't think anything is lost by using just signals.
I unfortunately have no references in mind, but I distinctly remember different reactive authors claiming this choice is just for efficiency. You expose both to give the programmer a choice in which implementation of the same idea is more efficient for your problem.
I might be lying now, but I believe Elm implements everything as event streams under the hood. Things like time wouldn't be so nice as event streams though, since there is an infinite number of events during any time frame. I'm not sure how Elm solves this, but I think it's a good example of something that makes more sense as a signal, both conceptually and in implementation.

Full example for netwire?

I've read the netwire quickstart, but I have trouble imagining what the whole thing would look like in a 'real' application. As the tutorial only deals with pure wires, I'm especially interested in games, if there are any examples, but I wouldn't mind others. Reactive Banana examples would probably do as well. They should just illustrate how FRP is useful.
Netwire is used in some real-world applications, but I don't know where to find them. Likely they are currently behind closed doors. However, a few example applications have been blogged about on Reddit, so you may want to check out /r/haskell. Just search for "netwire" there. Unfortunately reactive-banana examples won't help, because the concepts of these two libraries especially for event handling are entirely different.
The overall structure of a Netwire application is this: First you define a reactive value, let's call it simulation. In the simplest case it's a pure wire:
simulation :: WireP a [Particle]
Without detailing (yet) how to write that wire let me now explain what it is. The output type is [Particle], so it's a reactive list of particles. That means, this list can change over time. Notably the input type is fully polymorphic, so you know that this reactive value does not depend on other reactive values.
Now you want to get the actual value of the particle list at certain points in time. This is where your session and stepping functions come in. Most applications only need one of the session stepping functions, in this case stepSessionP. You just call this function in a loop to get the current value of the wire at that instant. There is no need to call this function continuously.
You will notice that the stepping function doesn't give you a [Particle], but an Either LastException [Particle]. This is because reactive values in Netwire can inhibit. This is the event concept. You know from the category laws that
w . id
is the same as just w in about the same way x + 0 is the same as x. The identity wire is neutral with respect to (.). However, now imagine
w . myId
where myId acts like the identity wire, just resulting in whatever reactive value it depends on, but sometimes it doesn't result at all. Sometimes it ignores the value and just inhibits, in which case the composition itself inhibits. You can interpret myId as an event wire and read the composition as "w if myId":
w . keyDown Space
Then you have your selection operator (<|>), which acts like an "or" for events:
w1 . ev1 <|> w2 . ev2 <|> w3
If ev1 inhibits, the rest is tried. Ideally the main wire never inhibits (in which case you could use stepSessionP_ instead), but it is composed of potentially inhibiting wires. You can also use your own inhibition monoid to have things like quit signals.
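To illustrate inhibition and (<|>) without the real library, here is a toy model (again, not Netwire's actual types): a step may produce Nothing, and the choice combinator falls through to the right-hand wire. Note that both sides are stepped, so each keeps its local state:
newtype WireI a b = WireI { stepI :: a -> (Maybe b, WireI a b) }

orElse :: WireI a b -> WireI a b -> WireI a b
orElse w1 w2 = WireI $ \x ->
  let (r1, w1') = stepI w1 x
      (r2, w2') = stepI w2 x
  in (maybe r2 Just r1, orElse w1' w2')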
For those of you who stumble on this in the future, I thought this was a great example of a full application using netwire. He is also starting (as of today) to work on PACMAN.
http://www.reddit.com/r/haskell/comments/1kmes7/building_an_asteroids_clone_in_haskell_using/
http://ocharles.org.uk/blog/posts/2013-08-18-asteroids-in-netwire.html
https://github.com/ocharles/netwire-classics/

Is there an object-identity-based, thread-safe memoization library somewhere?

I know that memoization seems to be a perennial topic here on the haskell tag on stack overflow, but I think this question has not been asked before.
I'm aware of several different 'off the shelf' memoization libraries for Haskell:
The memo-combinators and memotrie packages, which make use of a beautiful trick involving lazy infinite data structures to achieve memoization in a purely functional way. (As I understand it, the former is slightly more flexible, while the latter is easier to use in simple cases: see this SO answer for discussion.)
The uglymemo package, which uses unsafePerformIO internally but still presents a referentially transparent interface. The use of unsafePerformIO internally results in better performance than the previous two packages. (Off the shelf, its implementation uses comparison-based search data structures, rather than perhaps-slightly-more-efficient hash functions; but I think that if you find and replace Cmp with Hashable and Data.Map with Data.HashMap and add the appropriate imports, you get a hash-based version.)
However, I'm not aware of any library that looks answers up based on object identity rather than object value. This can be important, because sometimes the kinds of object which are being used as keys to your memo table (that is, as input to the function being memoized) can be large---so large that fully examining the object to determine whether you've seen it before is itself a slow operation. Slow, and also unnecessary, if you will be applying the memoized function again and again to an object which is stored at a given 'location in memory'¹. (This might happen, for example, if we're memoizing a function which is being called recursively over some large data structure with a lot of structural sharing.) If we've already computed our memoized function on that exact object before, we can already know the answer, even without looking at the object itself!
Implementing such a memoization library involves several subtle issues and doing it properly requires several special pieces of support from the language. Luckily, GHC provides all the special features that we need, and there is a paper by Peyton-Jones, Marlow and Elliott which basically worries about most of these issues for you, explaining how to build a solid implementation. They don't provide all details, but they get close.
The one detail which I can see which one probably ought to worry about, but which they don't worry about, is thread safety---their code is apparently not threadsafe at all.
My question is: does anyone know of a packaged library which does the kind of memoization discussed in the Peyton-Jones, Marlow and Elliott paper, filling in all the details (and preferably filling in proper thread-safety as well)?
Failing that, I guess I will have to code it up myself: does anyone have any ideas of other subtleties (beyond thread safety and the ones discussed in the paper) which the implementer of such a library would do well to bear in mind?
UPDATE
Following @luqui's suggestion below, here's a little more data on the exact problem I face. Let's suppose there's a type:
data Node = Node [Node] [Annotation]
This type can be used to represent a simple kind of rooted DAG in memory, where Nodes are DAG Nodes, the root is just a distinguished Node, and each node is annotated with some Annotations whose internal structure, I think, need not concern us (but if it matters, just ask and I'll be more specific.) If used in this way, note that there may well be significant structural sharing between Nodes in memory---there may be exponentially more paths which lead from the root to a node than there are nodes themselves. I am given a data structure of this form, from an external library with which I must interface; I cannot change the data type.
I have a function
myTransform :: Node -> Node
the details of which need not concern us (or at least I think so; but again I can be more specific if needed). It maps nodes to nodes, examining the annotations of the node it is given and the annotations of its immediate children, to come up with a new Node with the same children but possibly different annotations. I wish to write a function
recursiveTransform :: Node -> Node
whose output 'looks the same' as the data structure you would get by doing:
recursiveTransform (Node originalChildren annotations) =
  myTransform (Node recursivelyTransformedChildren annotations)
  where
    recursivelyTransformedChildren = map recursiveTransform originalChildren
except that it uses structural sharing in the obvious way so that it doesn't return an exponential data structure, but rather one on the order of the same size as its input.
I appreciate that this would all be easier if say, the Nodes were numbered before I got them, or I could otherwise change the definition of a Node. I can't (easily) do either of these things.
I am also interested in the general question of the existence of a library implementing the functionality I mention quite independently of the particular concrete problem I face right now: I feel like I've had to work around this kind of issue on a few occasions, and it would be nice to slay the dragon once and for all. The fact that SPJ et al felt that it was worth adding not one but three features to GHC to support the existence of libraries of this form suggests that the feature is genuinely useful and can't be worked around in all cases. (BUT I'd still also be very interested in hearing about workarounds which will help in this particular case too: the long term problem is not as urgent as the problem I face right now :-) )
¹ Technically, I don't quite mean location in memory, since the garbage collector sometimes moves objects around a bit---what I really mean is 'object identity'. But we can think of this as being roughly the same as our intuitive idea of location in memory.
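For reference, a rough sketch of what identity-based memoization looks like when built directly on GHC's System.Mem.StableName, with a single MVar for thread safety. This is a simplification of the paper's scheme: there are no weak pointers, so the table retains every result forever:
import Control.Concurrent.MVar
import qualified Data.IntMap.Strict as IM
import System.Mem.StableName

memoIO :: (a -> b) -> IO (a -> IO b)
memoIO f = do
  cache <- newMVar IM.empty
  return $ \x -> do
    sn <- makeStableName x       -- names the heap object, not its value
    let h = hashStableName sn
    table <- readMVar cache
    case lookup sn (IM.findWithDefault [] h table) of
      Just r  -> return r        -- this exact object was seen before
      Nothing -> do
        let r = f x
        modifyMVar_ cache (return . IM.insertWith (++) h [(sn, r)])
        return r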
If you only want to memoize based on object identity, and not equality, you can just use the existing laziness mechanisms built into the language.
For example, if you have a data structure like this
data Foo = Foo { ... }
expensive :: Foo -> Bar
then you can just add the value to be memoized as an extra field and let the laziness take care of the rest for you.
data Foo = Foo { ..., memo :: Bar }
To make it easier to use, add a smart constructor to tie the knot.
makeFoo ... = let foo = Foo { ..., memo = expensive foo } in foo
Though this is somewhat less elegant than using a library, and requires modification of the data type to really be useful, it's a very simple technique and all thread-safety issues are already taken care of for you.
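A concrete toy instance of the trick (hypothetical names): the memo field refers back to the value under construction, and laziness ensures it is computed at most once, on first use:
data Foo = Foo { payload :: Int, memoSquare :: Int }

expensive :: Foo -> Int
expensive f = payload f ^ (2 :: Int)  -- stand-in for a costly function

makeFoo :: Int -> Foo
makeFoo n = let foo = Foo { payload = n, memoSquare = expensive foo } in foo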
It seems that stable-memo would be just what you need (although I'm not sure if it can handle multiple threads):
Whereas most memo combinators memoize based on equality, stable-memo does it based on whether the exact same argument has been passed to the function before (that is, is the same argument in memory).
stable-memo only evaluates keys to WHNF.
This can be more suitable for recursive functions over graphs with cycles.
stable-memo doesn't retain the keys it has seen so far, which allows them to be garbage collected if they will no longer be used. Finalizers are put in place to remove the corresponding entries from the memo table if this happens.
Data.StableMemo.Weak provides an alternative set of combinators that also avoid retaining the results of the function, only reusing results if they have not yet been garbage collected.
There is no type class constraint on the function's argument.
stable-memo will not work for arguments which happen to have the same value but are not the same heap object. This rules out many candidates for memoization, such as the most common example, the naive Fibonacci implementation whose domain is machine Ints; it can still be made to work for some domains, though, such as the lazy naturals.
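Assuming stable-memo's documented memo :: (a -> b) -> a -> b, the DAG transform from the question might look like the following sketch; shared child nodes hit the cache by identity, so sharing is preserved:
import Data.StableMemo (memo)

data Annotation = Annotation  -- details elided, as in the question
data Node = Node [Node] [Annotation]

myTransform :: Node -> Node
myTransform = id              -- placeholder for the real transform

recursiveTransform :: Node -> Node
recursiveTransform = memo go
  where
    go (Node children annotations) =
      myTransform (Node (map recursiveTransform children) annotations)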
Ekmett just uploaded a library that handles this and more (produced at HacPhi): http://hackage.haskell.org/package/intern. He assures me that it is thread safe.
Edit: Actually, strictly speaking I realize this does something rather different. But I think you can use it for your purposes. It's really more of a stringtable-atom type interning library that works over arbitrary data structures (including recursive ones). It uses WeakPtrs internally to maintain the table. However, it uses Ints to index the values to avoid structural equality checks, which means packing them into the data type, when what you want are apparently actually StableNames. So I realize this answers a related question, but requires modifying your data type, which you want to avoid...

Keeping State in a Purely Functional Language

I am trying to figure out how to do the following: assume you are working on a controller for a DC motor and you want to keep it spinning at a certain speed set by the user:
(def set-point (ref {:sp 90}))

(while true
  (let [curr (read-speed)]
    (controller @set-point curr)))
Now, that set-point can change at any time via a web application. I can't think of a way to do this without using a ref, so my question is: how do functional languages deal with this sort of thing? (Even though the example is in Clojure, I am interested in the general idea.)
This will not answer your question but I want to show how these things are done in Clojure. It might help someone reading this later so they don't think they have to read up on monads, reactive programming or other "complicated" subjects to use Clojure.
Clojure is not a purely functional language and in this case it might be a good idea to leave the pure functions aside for a moment and model the inherent state of the system with identities.
In Clojure, you would probably use one of the reference types. There are several to choose from and knowing which one to use might be difficult. The good news is they all support the unified update model, so changing the reference type later should be pretty straightforward.
I've chosen an atom but depending on your requirements it might be more appropriate to use a ref or an agent.
The motor is an identity in your program. It is a "label" for some thing that has different values at different times and these values are related to each other (i.e., the speed of the motor). I have put a :validator on the atom to ensure that the speed never drops below zero.
(def motor (atom {:speed 0} :validator (comp not neg? :speed)))

(defn add-speed [n]
  (swap! motor update-in [:speed] + n))

(defn set-speed [n]
  (swap! motor update-in [:speed] (constantly n)))

> (add-speed 10)
> (add-speed -8)
> (add-speed -4) ;; This will not change the state of motor
                 ;; since the speed would drop below zero and
                 ;; the validator does not allow that!
> (:speed @motor)
2
> (set-speed 12)
> (:speed @motor)
12
If you want to change the semantics of the motor identity you have at least two other reference types to choose from.
If you want to change the speed of the motor asynchronously you would use an agent. Then you need to replace swap! with send. This would be useful if, for example, the clients adjusting the motor speed are different from the clients using it, so that it's fine for the speed to be changed "eventually".
Another option is to use a ref, which would be appropriate if the motor needs to coordinate with other identities in your system. If you choose this reference type, you replace swap! with alter. In addition, all state changes are run in a transaction with dosync to ensure that all identities in the transaction are updated atomically.
Monads are not needed to model identities and state in Clojure!
For this answer, I'm going to interpret "a purely functional language" as meaning "an ML-style language that excludes side effects" which I will interpret in turn as meaning "Haskell" which I'll interpret as meaning "GHC". None of these are strictly true, but given that you're contrasting this with a Lisp derivative and that GHC is rather prominent, I'm guessing this will still get at the heart of your question.
As always, the answer in Haskell is a bit of sleight-of-hand where access to mutable data (or anything with side effects) is structured in such a way that the type system guarantees that it will "look" pure from the inside, while producing a final program that has side effects where expected. The usual business with monads is a large part of this, but the details don't really matter and mostly distract from the issue. In practice, it just means you have to be explicit about where side effects can occur and in what order, and you're not allowed to "cheat".
Mutability primitives are generally provided by the language runtime, and accessed through functions that produce values in some monad also provided by the runtime (often IO, sometimes more specialized ones). First, let's take a look at the Clojure example you provided: it uses ref, which is described in the documentation here:
While Vars ensure safe use of mutable storage locations via thread isolation, transactional references (Refs) ensure safe shared use of mutable storage locations via a software transactional memory (STM) system. Refs are bound to a single storage location for their lifetime, and only allow mutation of that location to occur within a transaction.
Amusingly, that whole paragraph translates pretty directly to GHC Haskell. I'm guessing that "Vars" are equivalent to Haskell's MVar, while "Refs" are almost certainly equivalent to TVar as found in the stm package.
So to translate the example to Haskell, we'll need a function that creates the TVar:
setPoint :: STM (TVar Int)
setPoint = newTVar 90
...and we can use it in code like this:
updateLoop :: IO ()
updateLoop = do
    tvSetPoint <- atomically setPoint
    sequence_ . repeat $ update tvSetPoint
  where
    update tv = do
      curSpeed <- readSpeed
      curSet   <- atomically $ readTVar tv
      controller curSet curSpeed
In actual use my code would be far more terse than that, but I've left things more verbose here in hopes of being less cryptic.
I suppose one could object that this code isn't pure and is using mutable state, but... so what? At some point a program is going to run and we'd like it to do input and output. The important thing is that we retain all the benefits of code being pure, even when using it to write code with mutable state. For instance, I've implemented an infinite loop of side effects using the repeat function; but repeat is still pure and behaves reliably and nothing I can do with it will change that.
A technique to tackle problems that apparently scream for mutability (like GUI or web applications) in a functional way is Functional Reactive Programming.
The pattern you need for this is called a monad. If you really want to get into functional programming you should try to understand what monads are used for and what they can do. As a starting point I would suggest this link.
As a short informal explanation for monads:
Monads can be seen as data + context that is passed around in your program. This is the "space suit" often used in explanations. You pass data and context around together and insert any operation into this monad. There is usually no way to get the data back out once it is inserted into the context; you can just go the other way round, inserting operations so that they handle data combined with context. This way it almost seems as if you get the data out, but if you look closely you never do.
Depending on your application, the context can be almost anything: a data structure that combines multiple entities, exceptions, optionals, or the real world (I/O monads). In the paper linked above the context will be execution states of an algorithm, so this is quite similar to the things you have in mind.
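As a tiny illustration of data + context, using Maybe, the simplest such context:
-- Maybe is a context meaning "may be absent"; you keep inserting
-- operations into the context with >>= rather than taking the data out.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

quarter :: Int -> Maybe Int
quarter n = halve n >>= halve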
In Erlang you could use a process to hold the value. Something like this:
holdVar(SomeVar) ->
    receive                           %% wait for a message
        {From, get} ->                %% if you receive a get
            From ! {value, SomeVar},  %% respond with SomeVar
            holdVar(SomeVar);         %% recurse to start listening again
        {From, {set, SomeNewVar}} ->  %% if you receive a set
            From ! {ok},              %% respond with ok
            holdVar(SomeNewVar)       %% recurse with the new value
    end.
