Do we care about the 'past' in FRP? - haskell

When toying around with implementing FRP, one thing I've found confusing is what to do with the past. Basically, my understanding was that I would be able to do this with a Behaviour at any point:
beh.at(x) // where time x < now
This seems like it could be problematic performance wise in a case such as this:
val beh = Stepper(0, event) // stepwise behaviour
Here we can see that to evaluate the Behaviour in the past we need to keep all the Events and we will end up performing (at worst) linear scans each time we sample.
Do we want this ability to be available or should Behaviours only be allowed to be evaluated at a time >= now? Do we even want to expose the at function to the programmer?

While a behaviour is considered to be a function of time, reliance on an arbitrary amount of past data in FRP is a Bad Thing, and is referred to as a time leak. That is, transformations on behaviours should generally be streaming/reactive in that they do not rely on more than a bounded amount of the past (and should accumulate this knowledge of the history explicitly).
So, no, at is not desirable in a real FRP system: it should not be possible to look at either the past or the future. (The latter is, of course, impossible, if the state of the future depends on anything external to the FRP system.)
Of course, this leads to the problem that only being able to look at the exact present severely restricts what you can do when writing a function to transform behaviours: Behaviour a -> Behaviour b becomes the same as a -> b, which makes many things we'd like to do impossible. But this is more an issue of finding a semantics, one of FRP's persistent problems, than anything else; as long as the primitive transformations on behaviours you provide are powerful enough without causing time leaks, everything should be fine. For more information on this, see Garbage collecting the semantics of FRP.
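To make the "bounded amount of the past" point concrete, here is a minimal sketch in Haskell. It is not taken from any FRP library: the names Stepper, occur, lastN and runEvents are hypothetical, and event occurrences are modelled as a plain list. The stepwise behaviour stores only its current value, and any history it needs is accumulated explicitly with a fixed bound.

```haskell
import Data.List (foldl')

-- Hypothetical minimal model: a stepwise behaviour stores only its current
-- value, so sampling "now" is O(1) and old occurrences can be collected.
newtype Stepper a = Stepper { current :: a }

-- Feed one event occurrence in, discarding the previous value.
occur :: a -> Stepper a -> Stepper a
occur x _ = Stepper x

-- History is kept only on request and with an explicit bound: a behaviour
-- holding the last n event values, newest first. The spine is forced so
-- that no thunk keeps older values alive.
lastN :: Int -> a -> Stepper [a] -> Stepper [a]
lastN n x (Stepper xs) =
  let ys = take n (x : xs)
  in length ys `seq` Stepper ys

-- Run a list of occurrences (oldest first) through an update function.
runEvents :: (a -> s -> s) -> s -> [a] -> s
runEvents step = foldl' (flip step)
```

With this shape there is simply no at function to call on a past time; a transformation that wants the past has to say how much of it, as lastN does.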

Related

Handling invalid states in haskell

I'm trying to get a better feel for how to handle error states in Haskell, since there seem to be a lot of ways to do it. Ideally, my data structures would make any invalid inputs unrepresentable, but despite considerable effort to the contrary, I still occasionally end up working with data where the type system can allow invalid states. As an example, let's consider that my program input is the training results for a neural network. In order for math to work, each matrix needs to have the correct bounds, and that's not (really) representable by the type system. If data is invalid, there's really nothing the application can do but halt any further processing and notify someone of the problem (so it's not recoverable). What's the best way to handle this in Haskell? It seems like I could:
1) Use error or other partial functions when processing my data. My understanding is this should only be used to represent a bug in the code. So it would have to be coupled with some sort of validation at the point that I load the data, and any point "after" that check I just assume that the data is in a valid format. This feels imperative to me, and doesn't seem to fit very well with lazy, declarative code.
2) Throw an exception when processing the data using Control.Exception.throw, and then catch it at the top level where I can alert someone. Contrary to error, I believe this doesn't indicate a bug in the program, so perhaps there wouldn't be verification when I load the data beyond what can be represented through the type system? The presence or absence of an exception when processing the data would define the verification.
3) Lift any data processing that could fail into the IO monad and use Control.Exception.throwIO.
4) Lift any data processing that could fail into the IO monad and use fail (I've read that using fail is frowned on by the community?)
5) Return an Either or something similar, and let that bubble up through all your logic. I've definitely had some cases where composing Eithers becomes (to me) exceedingly impractical.
6) Use Control.Monad.Exception, which I only marginally understand, but which seems to involve lifting any data processing that could fail into some exceptional monad that I think is supposed to be more easily composable than Either?
and I'm not even sure that's all the options. Is there an approach to this problem that's generally accepted by the community, or is this really an opinionated topic?
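For illustration only, here is one way options 3 and 5 can be combined: validate once at load time with Either, then turn a failure into an exception with throwIO at the IO boundary. This is a sketch, not a community-endorsed answer. Matrix, ValidationError, mkMatrix, checkChain and loadOrThrow are all hypothetical names, and a list of lists stands in for a real matrix type.

```haskell
import Control.Exception (Exception, throwIO)

-- Hypothetical stand-in for a weight matrix: a list of rows.
newtype Matrix = Matrix [[Double]] deriving (Show, Eq)

data ValidationError
  = EmptyMatrix
  | RaggedRows Int Int          -- expected width, offending width
  | DimensionMismatch Int Int   -- columns on the left, rows on the right
  deriving (Show, Eq)

instance Exception ValidationError

-- Smart constructor: rejects empty input and rows of unequal length.
mkMatrix :: [[Double]] -> Either ValidationError Matrix
mkMatrix [] = Left EmptyMatrix
mkMatrix rows@(r : rs) =
  case filter (/= length r) (map length rs) of
    []      -> Right (Matrix rows)
    (w : _) -> Left (RaggedRows (length r) w)

dims :: Matrix -> (Int, Int)
dims (Matrix rows) = (length rows, case rows of { (r : _) -> length r; [] -> 0 })

-- Consecutive layers must be multipliable: columns of one = rows of the next.
checkChain :: [Matrix] -> Either ValidationError [Matrix]
checkChain ms =
  case [ (c, r) | (a, b) <- zip ms (drop 1 ms)
                , let c = snd (dims a), let r = fst (dims b), c /= r ] of
    []           -> Right ms
    ((c, r) : _) -> Left (DimensionMismatch c r)

-- Pure validation, then a single conversion to an exception in IO.
loadOrThrow :: [[[Double]]] -> IO [Matrix]
loadOrThrow raw = either throwIO pure (traverse mkMatrix raw >>= checkChain)
```

The pure part stays composable with >>= and traverse, and only the top level needs to catch ValidationError and alert someone.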

Why does FRP consider time as a factor for values?

Behaviors are ubiquitously defined as “time-varying values” [1].
Why? Time being the dependency/parameter for varying values is very uncommon.
My intuition for FRP would be to have behaviors as event-varying values instead; it is much more common, much simpler, I'd wager a much more efficient idea, and extensible enough to support time too (tick event).
For instance, if you write a counter, you don't care about time/associated timestamps, you just care about the "Increase-button clicked" and "Decrease-button clicked" events.
If you write a game and want a position/force behavior, you just care about the WASD/arrow keys held events, etc. (unless you ban your players for moving to the left in the afternoon; how iniquitous!).
So: why is time a consideration at all? Why timestamps? Why do some libraries (e.g. reactive-banana, reactive) take it to the extent of having Future and Moment values? Why work with event streams instead of just responding to an event occurrence? All of this just seems to over-complicate a simple idea (event-varying/event-driven value); what's the gain? What problem are we solving here? (I'd love to also get a concrete example along with a wonderful explanation, if possible).
1 Behaviors have been defined so here, here, here... & pretty much everywhere I've encountered.
Behaviors differ from Events primarily in that a Behavior has a value right now while an Event only has a value whenever a new event comes in.
So what do we mean by "right now"? Technically all changes are implemented as push or pull semantics over event streams, so we can only possibly mean "the most recent value as of the last event of consequence for this Behavior". But that's a fairly hairy concept—in practice "now" is much simpler.
The reasoning for why "now" is simpler comes down to the API. Here are two examples from Reactive Banana.
Eventually an FRP system must always produce some kind of externally visible change. In Reactive Banana this is facilitated by the reactimate :: Event (IO ()) -> Moment () function which consumes event streams. There is no way to have a Behavior trigger external changes; you always have to do something like reactimate (someBehavior <@ sampleTickEvent) to sample the behavior at concrete times.
Behaviors are Applicatives unlike Events. Why? Well, let's assume Event was an applicative and think about what happens when we have two event streams f and x and write f <*> x: since events occur all at different times the chances of f and x being defined simultaneously are (almost certainly) 0. So f <*> x would always mean the empty event stream which is useless.
What you really want is for f <*> x to cache the most current values for each and take their combined value "all of the time". That's a really confusing concept to talk about in terms of an event stream, so instead let's consider f and x as taking values for all points in time. Now f <*> x is also defined as taking values for all points in time. We've just invented Behaviors.
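The Applicative argument can be sketched with a toy denotational model, where a behavior is literally a function of time. This is not Reactive Banana's implementation or API; Time, Behavior, at and stepper are names assumed for the example. Being a semantic model, it happily keeps the whole occurrence list around, which a real implementation would avoid.

```haskell
type Time = Double

-- Toy model: a behavior has a value at every point in time.
newtype Behavior a = Behavior { at :: Time -> a }

instance Functor Behavior where
  fmap f (Behavior g) = Behavior (f . g)

instance Applicative Behavior where
  pure x = Behavior (const x)
  -- Both sides are defined at every time t, so the combination is too.
  Behavior f <*> Behavior x = Behavior (\t -> f t (x t))

-- Stepwise behavior: an initial value, then the value of the latest
-- occurrence at or before the sample time (occurrences sorted by time).
stepper :: a -> [(Time, a)] -> Behavior a
stepper x0 occs = Behavior $ \t ->
  case takeWhile ((<= t) . fst) occs of
    []   -> x0
    past -> snd (last past)
```

For example, stepper (+ 1) [(2, (* 10))] <*> stepper 1 [(1, 5)] is defined at every time, even though the two underlying event streams never fire simultaneously.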
Because it was the simplest way I could think of to give a precise denotation (implementation-independent meaning) to the notion of behaviors, including the sorts of operations I wanted, including differentiation and integration, as well as tracking one or more other behaviors (including but not limited to user-generated behavior).
Why? time being the dependency/parameter for varying values is very uncommon.
I suspect that you're confusing the construction (recipe) of a behavior with its meaning. For instance, a behavior might be constructed via a dependency on something like user input, possibly with additional synthetic transformation. So there's the recipe. The meaning, however, is simply a function of time, related to the time-function that is the user input. Note that by "function", I mean in the math sense of the word: a (deterministic) mapping from domain (time) to range (value), not in the sense that there's a purely programmatic description.
I've seen many questions asking why time matters and why continuous time. If you apply the simple discipline of giving a mathematical meaning in the style of denotational semantics (a simple and familiar style for functional programmers), the issues become much clearer.
If you really want to grok the essence of and thinking behind FRP, I recommend you read my answer to "Specification for a Functional Reactive Programming language" and follow pointers, including "What is Functional Reactive Programming".
Conal Elliott's Push-Pull FRP paper describes event-varying data, where the only points in time that are interesting are when events occur. Reactive event-varying data is the current value and the next Event that will change it. An Event is a Future point in the event-varying Reactive data.
data Reactive a = a `Stepper` Event a
newtype Event a = Ev (Future (Reactive a))
The Future doesn't need to have a time associated with it; it just needs to represent the idea of a value that hasn't happened yet. In an impure language with events, for example, a future can be an event handle and a value. When the event occurs, you set the value and raise the handle.
Reactive a has a value for a at all points in time, so why would we need Behaviors? Let's make a simple game. In between when the user presses the WASD keys, the character, accelerated by the force applied, still moves on the screen. The character's position at different points in time is different, even though no event has occurred in the intervening time. This is what a Behavior describes - something that not only has a value at all points in time, but its value can be different at all points in time, even with no intervening events.
One way to describe Behaviors would be to repeat what we just stated. Behaviors are things that can change in-between events. In-between events they are time-varying values, or functions of time.
type Behavior a = Reactive (Time -> a)
We don't need Behavior, we could simply add events for clock ticks, and write all of the logic in our entire game in terms of these tick events. This is undesirable to some developers as the code declaring what our game is is now intermingled with the code providing how it is implemented. Behaviors allow the developer to separate this logic between the description of the game in terms of time-varying variables and the implementation of the engine that executes that description.
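As a rough sketch of the game example, the following models Behavior a as a reactive value whose payload is a function of time. It simplifies the paper's types considerably: instead of Future, the Reactive here is an initial value plus a list of time-stamped changes, and sample and position are hypothetical names introduced for the illustration.

```haskell
type Time = Double

-- Simplified stand-in for the paper's Reactive: an initial value plus
-- time-stamped changes in ascending order (the paper uses Future instead).
data Reactive a = Stepper a [(Time, a)]

type Behavior a = Reactive (Time -> a)

-- Pick the function installed by the latest change at or before t,
-- then apply it to t itself.
sample :: Behavior a -> Time -> a
sample (Stepper f0 changes) t = pick t
  where
    pick = foldl (\f (tc, g) -> if tc <= t then g else f) f0 changes

-- A character at rest, then moving, then stopped again.
position :: Behavior Double
position = Stepper (const 0)
  [ (1, \t -> 2 * (t - 1))  -- key pressed at t = 1: move at speed 2
  , (4, const 6)            -- key released at t = 4: stay at 6
  ]
```

Between t = 1 and t = 4 no event occurs, yet sample position returns a different value at each time, which is exactly what a plain Reactive Double could not express without tick events.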

Suitable Haskell type for large, frequently changing sequence of floats

I have to pick a type for a sequence of floats with 16K elements. The values will be updated frequently, potentially many times a second.
I've read the wiki page on arrays. Here are the conclusions I've drawn so far. (Please correct me if any of them are mistaken.)
IArrays would be unacceptably slow in this case, because they'd be copied on every change. With 16K floats in the array, that's 64KB of memory copied each time.
IOArrays could do the trick, as they can be modified without copying all the data. In my particular use case, doing all updates in the IO monad isn't a problem at all. But they're boxed, which means extra overhead, and that could add up with 16K elements.
IOUArrays seem like the perfect fit. Like IOArrays, they don't require a full copy on each change. But unlike IOArrays, they're unboxed, meaning they're basically the Haskell equivalent of a C array of floats. I realize they're strict. But I don't see that being an issue, because my application would never need to access anything less than the entire array.
Am I right to look to IOUArrays for this?
Also, suppose I later want to read or write the array from multiple threads. Will I have backed myself into a corner with IOUArrays? Or is the choice of IOUArrays totally orthogonal to the problem of concurrency? (I'm not yet familiar with the concurrency primitives in Haskell and how they interact with the IO monad.)
A good rule of thumb is that you should almost always use the vector library instead of arrays. In this case, you can use mutable vectors from the Data.Vector.Mutable module.
The key operations you'll want are read and write which let you mutably read from and write to the mutable vector.
You'll want to benchmark of course (with criterion) or you might be interested in browsing some benchmarks I did e.g. here (if that link works for you; broken for me).
The vector library is a nice interface (crazy understatement) over GHC's more primitive array types which you can get to more directly in the primitive package. As are the things in the standard array package; for instance an IOUArray is essentially a MutableByteArray#.
Unboxed mutable arrays are usually going to be the fastest, but you should compare them in your application to IOArray or the vector equivalent.
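As a minimal sketch of the unboxed mutable approach, the following uses IOUArray from the array package rather than vector, purely so that it needs nothing beyond what ships with GHC. The helper names newBuffer and scaleInPlace are made up for the example; the vector equivalent would look similar with read and write.

```haskell
import Control.Monad (forM_)
import Data.Array.IO (IOUArray, getBounds, newArray, readArray, writeArray)

-- Allocate n unboxed Floats, all initialised to 0.
newBuffer :: Int -> IO (IOUArray Int Float)
newBuffer n = newArray (0, n - 1) 0

-- Update every element in place; no copy of the array is made.
scaleInPlace :: Float -> IOUArray Int Float -> IO ()
scaleInPlace k arr = do
  (lo, hi) <- getBounds arr
  forM_ [lo .. hi] $ \i -> do
    x <- readArray arr i
    writeArray arr i (k * x)
```

A 16K buffer would be created once with newBuffer 16384 and then updated repeatedly. Whether this beats the alternatives in a given application is something only a benchmark can tell.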
My advice would be:
if you probably don't need concurrency, first try a mutable unboxed Vector, as Gabriel suggests
if you know you will want concurrent updates (and feel a little brave) then first try a MutableArray and then do atomic updates with these functions from the atomic-primops library. If you want fine-grained locking, this is your best choice. Of course concurrent reads will work fine on whatever array you choose.
It should also be theoretically possible to do concurrent updates on a MutableByteArray (equivalent to IOUArray) with those atomic-primops functions too, since a Float should always fit into a word (I think), but you'd have to do some research (or bug Ryan).
Also be aware of potential memory reordering issues when doing concurrency with the atomic-primops stuff, and help convince yourself with lots of tests; this is somewhat uncharted territory.

Parallelism in functional languages

One of the advertised features of FP is that a program is "parallel by default", which naturally fits modern multi-core processors. Indeed, reducing a tree is parallel by its nature. However, I don't understand how it maps to multi-threading. Consider the following fragment (pseudo code):
let y = read-and-parse-a-number-from-console
let x = get-integer-from-web-service-call
let r = 5 * x - y * 4
write-r-to-file
How can a translator determine which of the tree branches should be run on a separate thread? After you have obtained x or y it would be stupid to reduce the 5 * x or y * 4 expressions on a separate thread (even if we grab it from a thread pool), wouldn't it? So how do different functional languages handle this?
We're not quite there yet.
Programs in pure declarative style (functional style is included in this category, but so are some other styles) tend to be much more amenable to parallelisation, because all data dependencies are explicit. This makes it very easy for the programmer to manually use primitives the language provides for specifying that two independent computations should be done in parallel, regardless of whether they share access to any data; if everything's immutable and there are no side effects, then changing the order in which things are done can't affect the result.
If purity is enforced by the language (as in Haskell, Mercury, etc, but unlike in Scala, F#, etc where purity is encouraged but unenforced), then it is possible for the compiler to attempt to automatically parallelise the program, but no existing language that I know of does this by default. If the language allows unchecked impure operations then it's generally impossible for the compiler to do the analysis necessary to prove that a given attempt to auto-parallelise the program is valid. So I do not expect any such language to ever support auto-parallelisation very effectively.
Note that the pseudo program you wrote is probably not pure declarative code. let y = read-and-parse-a-number-from-console and let x = get-integer-from-web-service-call are calculating x and y from impure external operations, and there's nothing in the program that fixes the order in which they should run. It's possible in general that executing two impure operations in either order will produce different results, and running those two operations in different threads gives up control of the order in which they run. So if a language like that was to auto-parallelise your program, it would almost certainly either introduce horrible concurrency bugs, or refuse to significantly parallelise anything.
However, the functional style still makes it easy to manually parallelise such programs. A human programmer can tell that it almost certainly doesn't matter in which order you read from the console and the network. Knowing that there's no shared mutable state, they can decide to run those two operations in parallel without digging into their implementations (which you'd have to do in imperative algorithms where there might be mutable shared state even if it doesn't look like there is from the interface).
But the big complication that's in the way of auto-parallelising compilers for enforced-purity languages is knowing how much parallelisation to do. Running every computation possible in parallel vastly overwhelms any possible benefit with all the startup cost of spawning new threads (not to mention the context switches), as you try to run huge numbers of very short-lived threads on a small number of processors. The compiler needs to identify a much smaller number of "chunks" of computation that are reasonably large, and run the chunks in parallel while running the sub-computations of each chunk sequentially.
But only "embarrassingly parallel" programs decompose nicely into very large completely independent computations. Most programs are much more interdependent. So unless you only want to be able to auto-parallelise programs that are very easy to manually parallelise, your auto-parallelisation probably needs to be able to identify and run "chunks" in parallel which are partially dependent on each other, having them wait when they get to points that really need a result that's supposed to be computed by another "chunk". This introduces extra overhead of synchronisation between the threads, so the logic that chooses what to run in parallel needs to be even better in order to beat the trivial strategy of just running everything sequentially.
The developers of Mercury (a pure logic programming language) are working on various methods of tackling these problems, from static analysis to using profiling data. If you're interested, their research papers have a lot more information. I presume other researchers are working on this area in other languages, but I don't know much about any other projects.
In that specific example, the third statement depends on the first and the second, but there is no interdependency between the first and the second. Therefore, a runtime environment could execute read-and-parse-a-number-from-console on a different thread from get-integer-from-web-service-call, but the execution of the third statement would have to wait until the first two are finished.
Some languages or runtime environments may be able to calculate a partial result (such as y * 4) before obtaining an actual value of x. As a high level programmer though, you would be unlikely to be able to detect this.
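Done by hand in Haskell, the dependency structure of the example (two independent inputs, one combining step) might look like the sketch below. It uses only forkIO and MVar from base; concurrently2 and compute are hypothetical names, and the two inputs are passed in as arbitrary IO actions since the original fragment is pseudo code. There is no exception handling here: if either action throws, the waiting takeMVar never receives a value, which is one reason real code would more likely reach for a library such as async.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Run two independent IO actions on separate threads and wait for both.
concurrently2 :: IO a -> IO b -> IO (a, b)
concurrently2 ioA ioB = do
  boxA <- newEmptyMVar
  boxB <- newEmptyMVar
  _ <- forkIO (ioA >>= putMVar boxA)
  _ <- forkIO (ioB >>= putMVar boxB)
  a <- takeMVar boxA
  b <- takeMVar boxB
  pure (a, b)

-- The third statement needs both results, so it runs after the join.
-- The arithmetic itself is far too cheap to be worth a thread.
compute :: IO Int -> IO Int -> IO Int
compute getX getY = do
  (x, y) <- concurrently2 getX getY
  pure (5 * x - y * 4)
```

The decision to parallelise only the two slow inputs is made by the programmer here, not inferred by the compiler.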

Is there an object-identity-based, thread-safe memoization library somewhere?

I know that memoization seems to be a perennial topic here on the haskell tag on stack overflow, but I think this question has not been asked before.
I'm aware of several different 'off the shelf' memoization libraries for Haskell:
The memo-combinators and memotrie packages, which make use of a beautiful trick involving lazy infinite data structures to achieve memoization in a purely functional way. (As I understand it, the former is slightly more flexible, while the latter is easier to use in simple cases: see this SO answer for discussion.)
The uglymemo package, which uses unsafePerformIO internally but still presents a referentially transparent interface. The use of unsafePerformIO internally results in better performance than the previous two packages. (Off the shelf, its implementation uses comparison-based search data structures, rather than perhaps-slightly-more-efficient hash functions; but I think that if you find and replace Cmp with Hashable and Data.Map with Data.HashMap and add the appropriate imports, you get a hash-based version.)
However, I'm not aware of any library that looks answers up based on object identity rather than object value. This can be important, because sometimes the kinds of object which are being used as keys to your memo table (that is, as input to the function being memoized) can be large---so large that fully examining the object to determine whether you've seen it before is itself a slow operation. Slow, and also unnecessary, if you will be applying the memoized function again and again to an object which is stored at a given 'location in memory' 1. (This might happen, for example, if we're memoizing a function which is being called recursively over some large data structure with a lot of structural sharing.) If we've already computed our memoized function on that exact object before, we can already know the answer, even without looking at the object itself!
Implementing such a memoization library involves several subtle issues and doing it properly requires several special pieces of support from the language. Luckily, GHC provides all the special features that we need, and there is a paper by Peyton-Jones, Marlow and Elliott which basically worries about most of these issues for you, explaining how to build a solid implementation. They don't provide all details, but they get close.
The one detail which I can see which one probably ought to worry about, but which they don't worry about, is thread safety---their code is apparently not threadsafe at all.
My question is: does anyone know of a packaged library which does the kind of memoization discussed in the Peyton-Jones, Marlow and Elliott paper, filling in all the details (and preferably filling in proper thread-safety as well)?
Failing that, I guess I will have to code it up myself: does anyone have any ideas of other subtleties (beyond thread safety and the ones discussed in the paper) which the implementer of such a library would do well to bear in mind?
UPDATE
Following @luqui's suggestion below, here's a little more data on the exact problem I face. Let's suppose there's a type:
data Node = Node [Node] [Annotation]
This type can be used to represent a simple kind of rooted DAG in memory, where Nodes are DAG Nodes, the root is just a distinguished Node, and each node is annotated with some Annotations whose internal structure, I think, need not concern us (but if it matters, just ask and I'll be more specific.) If used in this way, note that there may well be significant structural sharing between Nodes in memory---there may be exponentially more paths which lead from the root to a node than there are nodes themselves. I am given a data structure of this form, from an external library with which I must interface; I cannot change the data type.
I have a function
myTransform :: Node -> Node
the details of which need not concern us (or at least I think so; but again I can be more specific if needed). It maps nodes to nodes, examining the annotations of the node it is given, and the annotations of its immediate children, to come up with a new Node with the same children but possibly different annotations. I wish to write a function
recursiveTransform :: Node -> Node
whose output 'looks the same' as the data structure you would get by doing:
recursiveTransform (Node originalChildren annotations) =
  myTransform (Node recursivelyTransformedChildren annotations)
  where
    recursivelyTransformedChildren = map recursiveTransform originalChildren
except that it uses structural sharing in the obvious way so that it doesn't return an exponential data structure, but rather one on the order of the same size as its input.
I appreciate that this would all be easier if say, the Nodes were numbered before I got them, or I could otherwise change the definition of a Node. I can't (easily) do either of these things.
I am also interested in the general question of the existence of a library implementing the functionality I mention quite independently of the particular concrete problem I face right now: I feel like I've had to work around this kind of issue on a few occasions, and it would be nice to slay the dragon once and for all. The fact that SPJ et al felt that it was worth adding not one but three features to GHC to support the existence of libraries of this form suggests that the feature is genuinely useful and can't be worked around in all cases. (BUT I'd still also be very interested in hearing about workarounds which will help in this particular case too: the long term problem is not as urgent as the problem I face right now :-) )
1 Technically, I don't quite mean location in memory, since the garbage collector sometimes moves objects around a bit---what I really mean is 'object identity'. But we can think of this as being roughly the same as our intuitive idea of location in memory.
If you only want to memoize based on object identity, and not equality, you can just use the existing laziness mechanisms built into the language.
For example, if you have a data structure like this
data Foo = Foo { ... }
expensive :: Foo -> Bar
then you can just add the value to be memoized as an extra field and let the laziness take care of the rest for you.
data Foo = Foo { ..., memo :: Bar }
To make it easier to use, add a smart constructor to tie the knot.
makeFoo ... = let foo = Foo { ..., memo = expensive foo } in foo
Though this is somewhat less elegant than using a library, and requires modification of the data type to really be useful, it's a very simple technique and all thread-safety issues are already taken care of for you.
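Since the snippets above elide the actual fields, here is a small self-contained instance of the same trick. The Tree type, its fields and the choice of a subtree sum as the "expensive" function are all hypothetical, chosen only to show the knot-tying constructor working with shared substructure.

```haskell
-- Hypothetical concrete instance of the memo-field technique.
data Tree = Tree
  { children :: [Tree]
  , payload  :: Int
  , total    :: Int   -- memoised result of expensive, filled in lazily
  }

-- Smart constructor: ties the knot so the memo field refers to the node.
mkTree :: Int -> [Tree] -> Tree
mkTree x cs =
  let t = Tree { children = cs, payload = x, total = expensive t }
  in t

-- Reads the children's memo fields instead of recomputing their sums.
expensive :: Tree -> Int
expensive t = payload t + sum (map total (children t))
```

If a node built with mkTree is shared by several parents, its total field is one thunk in one heap object, so all parents reuse the same result once it has been forced.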
It seems that stable-memo would be just what you needed (although I'm not sure if it can handle multiple threads):
Whereas most memo combinators memoize based on equality, stable-memo does it based on whether the exact same argument has been passed to the function before (that is, is the same argument in memory).
stable-memo only evaluates keys to WHNF.
This can be more suitable for recursive functions over graphs with cycles.
stable-memo doesn't retain the keys it has seen so far, which allows them to be garbage collected if they will no longer be used. Finalizers are put in place to remove the corresponding entries from the memo table if this happens.
Data.StableMemo.Weak provides an alternative set of combinators that also avoid retaining the results of the function, only reusing results if they have not yet been garbage collected.
There is no type class constraint on the function's argument.
stable-memo will not work for arguments which happen to have the same value but are not the same heap object. This rules out many candidates for memoization, such as the most common example, the naive Fibonacci implementation whose domain is machine Ints; it can still be made to work for some domains, though, such as the lazy naturals.
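To show roughly what identity-based lookup involves, here is a deliberately naive sketch built on System.Mem.StableName from base. It is not stable-memo's implementation and memoById is a made-up name. It makes several simplifying assumptions: the table lives behind a single MVar, the lock is held while the function runs (so calls are serialised, and a function that calls its own memoised version would deadlock), and entries are never removed, whereas stable-memo uses finalizers to drop them.

```haskell
import Control.Concurrent.MVar (modifyMVar, newMVar)
import Control.Exception (evaluate)
import qualified Data.IntMap.Strict as IM  -- from containers, ships with GHC
import System.Mem.StableName (hashStableName, makeStableName)

-- Memoise an IO function on the identity of its argument, not its value.
memoById :: (a -> IO b) -> IO (a -> IO b)
memoById f = do
  table <- newMVar IM.empty
  pure $ \x -> do
    -- Force to WHNF first: a thunk and its evaluated form may get
    -- different stable names.
    x' <- evaluate x
    name <- makeStableName x'
    modifyMVar table $ \m -> do
      let h      = hashStableName name
          bucket = IM.findWithDefault [] h m
      case lookup name bucket of
        Just y  -> pure (m, y)            -- same heap object seen before
        Nothing -> do
          y <- f x'
          pure (IM.insert h ((name, y) : bucket) m, y)
```

Two arguments that are equal but live in different heap objects are treated as different keys, which matches the limitation described above.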
Ekmett just uploaded a library that handles this and more (produced at HacPhi): http://hackage.haskell.org/package/intern. He assures me that it is thread safe.
Edit: Actually, strictly speaking I realize this does something rather different. But I think you can use it for your purposes. It's really more of a stringtable-atom type interning library that works over arbitrary data structures (including recursive ones). It uses WeakPtrs internally to maintain the table. However, it uses Ints to index the values to avoid structural equality checks, which means packing them into the data type, when what you want are apparently actually StableNames. So I realize this answers a related question, but requires modifying your data type, which you want to avoid...
