Suitable Haskell type for large, frequently changing sequence of floats - haskell

I have to pick a type for a sequence of floats with 16K elements. The values will be updated frequently, potentially many times a second.
I've read the wiki page on arrays. Here are the conclusions I've drawn so far. (Please correct me if any of them are mistaken.)
IArrays would be unacceptably slow in this case, because they'd be copied on every change. With 16K floats in the array, that's 64KB of memory copied each time.
IOArrays could do the trick, as they can be modified without copying all the data. In my particular use case, doing all updates in the IO monad isn't a problem at all. But they're boxed, which means extra overhead, and that could add up with 16K elements.
IOUArrays seem like the perfect fit. Like IOArrays, they don't require a full copy on each change. But unlike IOArrays, they're unboxed, meaning they're basically the Haskell equivalent of a C array of floats. I realize they're strict. But I don't see that being an issue, because my application would never need to access anything less than the entire array.
Am I right to look to IOUArrays for this?
Also, suppose I later want to read or write the array from multiple threads. Will I have backed myself into a corner with IOUArrays? Or is the choice of IOUArrays totally orthogonal to the problem of concurrency? (I'm not yet familiar with the concurrency primitives in Haskell and how they interact with the IO monad.)

A good rule of thumb is that you should almost always use the vector library instead of arrays. In this case, you can use mutable vectors from the Data.Vector.Mutable module.
The key operations you'll want are read and write which let you mutably read from and write to the mutable vector.

You'll want to benchmark of course (with criterion) or you might be interested in browsing some benchmarks I did e.g. here (if that link works for you; broken for me).
The vector library is a nice interface (crazy understatement) over GHC's more primitive array types which you can get to more directly in the primitive package. As are the things in the standard array package; for instance an IOUArray is essentially a MutableByteArray#.
Unboxed mutable arrays are usually going to be the fastest, but you should compare them in your application to IOArray or the vector equivalent.
My advice would be:
if you probably don't need concurrency first try a mutable unboxed Vector as Gabriel suggests
if you know you will want concurrent updates (and feel a little brave) then first try a MutableArray and then do atomic updates with these functions from the atomic-primops library. If you want fine-grained locking, this is your best choice. Of course concurrent reads will work fine on whatever array you choose.
It should also be theoretically possible to do concurrent updates on a MutableByteArray (equivalent to IOUArray) with those atomic-primops functions too, since a Float should always fit into a word (I think), but you'd have to do some research (or bug Ryan).
Also be aware of potential memory reordering issues when doing concurrency with the atomic-primops stuff, and help convince yourself with lots of tests; this is somewhat uncharted territory.

Related

How can I obtain constant time access (like in an array) in a data structure in Haskell?

I'll get straight to it - is there a way to have a dynamically sized constant-time access data-structure in Haskell, much like an array in any other imperative language?
I'm sure there is a module somewhere that does this for us magically, but I'm hoping for a general explanation of how one would do this in a functional manner :)
As far as I'm aware, Map uses a binary tree representation so it has O(log(n)) access time, and lists of course have O(n) access time.
Additionally, if we made it so that it was immutable, it would be pure, right?
Any ideas how I could go about this (beyond something like Array = Array { one :: Int, two :: Int, three :: Int ...} in template Haskell or the like)?
If your key is isomorphic to Int then you can use IntMap as most of its operations are O(min(n,W)), where n is the number of elements and W is the number of bits in Int (usually 32 or 64), which means that as the collection gets large the cost of each individual operation converges to a constant.
a dynamically sized constant-time access data-structure in Haskell,
Data.Array
Data.Vector
etc etc.
For associative structures you can choose between:
Log-N tree and trie structures
Hash tables
Mixed hash mapped tries
With various different log-complexities and constant factors.
All of these are on hackage.
In addition to the other good answers, it might be useful to say that:
When restricted to Algebraic Data Types and purity, all dynamically
sized data structure must have at least logarithmic worst-case access
time.
Personally, I like to call this the price of purity.
Haskell offers you three main ways around this:
Change the problem: Use hashes or prefix trees.
For constant-time reads use pure Arrays or the more recent Vectors; they are not ADTs and need compiler support / hidden IO inside. Constant-time writes are not possible since purity forbids the original data structure to be modified.
For constant-time writes use the IO or ST monad, preferring ST when you can to avoid externally visible side effects. These monads are implemented in the compiler.
It's true that you can't have constant time access arrays in Haskell without compiler/runtime magic.
However, this isn't (just) because Haskell is functional. Arrays in Java and C# also require runtime magic. In Rust you might be able to implement them in unsafe code, but not in safe Rust.
The truth is any language that doesn't allow you to allocate memory of dynamic size, or that doesn't allow you to use pointers is going to require runtime magic to implement arrays.
That excludes any safe language, whether object oriented, or functional.
The only difference between Haskell and eg. Java with respect to Arrays, is that arrays are far less useful in Haskell than in Java, but in Java arrays are so core to everything we do that we don't even notice that they're magic.
There is one way though that Haskell requires more magic for arrays than eg. Java.
With Java you can initialise an empty array (which requires magic) and then fill it up with values (which doesn't).
With Haskell this would obviously go against immutability. So any array would have to be initialised with its values. Thus the compiler magic doesn't just stretch to giving you an empty chunk of memory to index into. It also requires giving you a way to initialise the array with values. So creation and initialisation of the array has to be a single step, entirely handled by the compiler.

Why aren't FingerTrees used enough to have a stable implementation?

A while ago, I ran across an article on FingerTrees (See Also an accompanying Stack Overflow Question) and filed the idea away. I have finally found a reason to make use of them.
My problem is that the Data.FingerTree package seems to have a little bit rot around the edges. Moreover, Data.Sequence in the Containers package which makes use of the data structure re-implements a (possibly better) version, but doesn't export it.
As theoretically useful as this structure seems to be, it doesn't seem to get a lot of actual use or attention. Have people found that FingerTrees are not useful as a practical matter, or is this a case not enough attention?
further explanation:
I'm interested in building a data structure holding text that has good concatenation properties. Think about building an HTML document from assorted fragments. Most pre-built solutions use bytestrings, but I really want something that deals with Unicode text properly. My plan at the moment is to layer Data.Text fragments into a FingerTree.
I would also like to borrow the trick from Data.Vector of taking slices without copying using (offset,length) manipulation. Data.Text.Text has this built in to the data type, but only uses it for efficient uncons and unsnoc opperations. In FingerTree this information could very easily becomes the v or annotation of the tree.
To answer your question about finger trees in particular, I think the problem is that they have relatively high constant costs compared to arrays, and are more complex than other ways of achieving efficient concatenation. A Builder has a more efficient interface for just appending chunks, and they're usually readily available (see the links in #informatikr's answer). Suppose that Data.Text.Lazy is implemented with a linked list of chunks, and you're creating a Data.Text.Lazy from a builder. Unless you have a lot of chunks (probably more than 50), or are accessing data near the end of the list repeatedly, the high constant cost of a finger tree probably isn't worth it.
The Data.Sequence implementation is specialized for performance reasons, and isn't as general as the full interface provided by the fingertree package. That's why it isn't exported; it's not really possible to use it for anything other than a Sequence.
I also suspect that many programmers are at a loss as to how to actually use the monoidal annotation, as it's behind a fairly significant abstraction barrier. So many people wouldn't use it because they don't see how it can be useful compared to other data types.
I didn't really get it until I read Chung-chieh Shan's blog series on word numbers (part2, part3, part4). That's proof that the idea can definitely be used in practical code.
In your case, if you need to both inspect partial results and have efficient appends, using a fingertree may be better than a builder. Depending on the builder's implementation, you may end up doing a lot of repeated work as you convert to Text, add more stuff to the builder, convert to Text again, etc. It would depend on your usage pattern though.
You might be interested in my splaytree package, which provides splay trees with monoidal annotations, and several different structures build upon them. Other than the splay tree itself, the Set and RangeSet modules have more-or-less complete API's, the Sequence module is mostly a skeleton I used for testing. It's not a "batteries included" solution to what you're looking for (again, #informatikr's answer provides those), but if you want to experiment with monoidal annotations it may be more useful than Data.FingerTree. Be aware that a splay tree can get unbalanced if you traverse all the elements in sequence (or continually snoc onto the end, or similar), but if appends and lookups are interleaved performance can be excellent.
In addition to John Lato's answer, I'll add some specific details about the performance of finger trees, since I spent some time looking at that in the past.
The broad summary is:
Data.Sequence has great constant factors and asymptotics: it is almost as fast as [] when accessing the front of the list (where both data structures have O(1) asymptotics), and much faster elsewhere in the list (where Data.Sequence's logarithmic asymptotics trounce []'s linear asymptotics).
Data.FingerTree has the same asymptotics as Data.Sequence, but is about an order of magnitude slower.
Just like lists, finger trees have high per-element memory overheads, so they should be combined with chunking for better memory and cache use. Indeed, a few packages do this (yi, trifecta, rope). If Data.FingerTree could be brought close to Data.Sequence in performance, I would hope to see a Data.Text.Sequence type, which implemented a finger tree of Data.Text values. Such a type would lose the streaming behaviour of Data.Text.Lazy, but benefit from improved random access and concatenation performance. (Similarly, I would want to see Data.ByteString.Sequence and Data.Vector.Sequence.)
The obstacle to implementing these now is that no efficient and generic implementation of finger trees exists (see below where I discuss this further). To produce efficient implementations of Data.Text.Sequence one would have to completely reimplement finger trees, specialised to Text - just as Data.Text.Lazy completely reimplements lists, specialised to Text. Unfortunately, finger trees are much more complex than lists (especially concatenation!), so this is a considerable amount of work.
So as I see it the answer is:
specialised finger trees are great, but a lot of work to implement
chunked finger trees (e.g. Data.Text.Sequence) would be great, but at present the poor performance of Data.FingerTree means they are not a viable alternative to chunked lists in the common case
builders and chunked lists achieve many of the benefits of chunked finger trees, and so they suffice for the common case
in the uncommon case where builders and chunked lists don't suffice, we grit our teeth and put up with the poor constant factors of chunked finger trees (e.g. in yi and trifecta).
Obstacles to an efficient and generic finger tree
Much of the performance gap between Data.Sequence and Data.FingerTree is due to two optimisations in Data.Sequence:
The measure type is specialised to Int, so measure manipulations will compile down to efficient integer arithmetic rather
The measure type is unpacked into the Deep constructor, which saves pointer dereferences in the inner loops of the tree operations.
It is possible to apply these optimisations in the general case of Data.FingerTree by using data families for generic unpacking and by exploiting GHC's inliner and specialiser - see my fingertree-unboxed package, which brings generic finger tree performance almost up to that of Data.Sequence. Unfortunately, these techniques have some significant problems:
data families for generic unpacking is unpleasant for the user, because they have to define lots of instances. There is no clear solution to this problem.
finger trees use polymorphic recursion, which GHC's specialiser doesn't handle well (1, 2). This means that, to get sufficient specialisation on the measure type, we need lots of INLINE pragmas, which causes GHC to generate huge amounts of code.
Due to these problems, I never released the package on Hackage.
Ignoring your Finger Tree question and only responding to your further explanation: did you look into Data.Text.Lazy.Builder or, specifically for building HTML, blaze-html?
Both allow fast concatenation. For slicing, if that is important for solving your problem, they might not have ideal performance.

Is it possible to make fast big circular buffer arrays for stream recording in Haskell?

I'm considering converting a C# app to Haskell as my first "real" Haskell project. However I want to make sure it's a project that makes sense. The app collects data packets from ~15 serial streams that come at around 1 kHz, loads those values into the corresponding circular buffers on my "context" object, each with ~25000 elements, and then at 60 Hz sends those arrays out to OpenGL for waveform display. (Thus it has to be stored as an array, or at least converted to an array every 16 ms). There are also about 70 fields on my context object that I only maintain the current (latest) value, not the stream waveform.
There are several aspects of this project that map well to Haskell, but the thing I worry about is the performance. If for each new datapoint in any of the streams, I'm having to clone the entire context object with 70 fields and 15 25000-element arrays, obviously there's going to be performance issues.
Would I get around this by putting everything in the IO-monad? But then that seems to somewhat defeat the purpose of using Haskell, right? Also all my code in C# is event-driven; is there an idiom for that in Haskell? It seems like adding a listener creates a "side effect" and I'm not sure how exactly that would be done.
Look at this link, under the section "The ST monad":
http://book.realworldhaskell.org/read/advanced-library-design-building-a-bloom-filter.html
Back in the section called “Modifying array elements”, we mentioned
that modifying an immutable array is prohibitively expensive, as it
requires copying the entire array. Using a UArray does not change
this, so what can we do to reduce the cost to bearable levels?
In an imperative language, we would simply modify the elements of the
array in place; this will be our approach in Haskell, too.
Haskell provides a special monad, named ST, which lets us work
safely with mutable state. Compared to the State monad, it has some
powerful added capabilities.
We can thaw an immutable array to give a mutable array; modify the
mutable array in place; and freeze a new immutable array when we are
done.
...
The IO monad also provides these capabilities. The major difference between the two is that the ST monad is intentionally designed so that we can escape from it back into pure Haskell code.
So should be possible to modify in-place, and it won't defeat the purpose of using Haskell after all.
Yes, you would probably want to use the IO monad for mutable data. I don't believe the ST monad is a good fit for this problem space because the data updates are interleaved with actual IO actions (reading input streams). As you would need to perform the IO within ST by using unsafeIOToST, I find it preferable to just use IO directly. The other approach with ST is to continually thaw and freeze an array; this is messy because you need to guarantee that old copies of the data are never used.
Although evidence shows that a pure solution (in the form of Data.Sequence.Seq) is often faster than using mutable data, given your requirement that data be pushed out to OpenGL, you'll possible get better performance from working with the array directly. I would use the functions from Data.Vector.Storable.Mutable (from the vector package), as then you have access to the ForeignPtr for export.
You can look at arrows (Yampa) for one very common approach to event-driven code. Another area is Functional Reactivity (FRP). There are starting to be some reasonably mature libraries in this domain, such as Netwire or reactive-banana. I don't know if they'd provide adequate performance for your requirements though; I've mostly used them for gui-type programming.

Haskell: Pure data structure that efficiently models a block of memory?

I'm interested in writing a virtual machine that works with a block of memory. I'd like to model the block of memory (say 1MB) with a pure data structure that's still efficient to reads and writes anywhere within the block. Not very interested in a mutable structure. Does such a structure exist?
The vector package offers immutable (and mutable) boxed and unboxed vectors, with all the time complexities you'd expect. The mutable ones are usable from both IO and ST, and you can have an unboxed array of any instance of Storable. It's a lot nicer than the standard array modules.
However, since you mentioned efficient immutable updates, I would suggest using a Map-like data structure; perhaps a HashMap from unordered-containers. It might even be worthwhile to have a map with small unboxed vectors at the leaves, to avoid some tree overhead.
Depending on your use-case, you might also be interested in the standard Data.Sequence, which has O(1) access to the beginning and end, and pretty good access times into the middle of the sequence.
Is Data.Array.ST good enough for you?

Is there an object-identity-based, thread-safe memoization library somewhere?

I know that memoization seems to be a perennial topic here on the haskell tag on stack overflow, but I think this question has not been asked before.
I'm aware of several different 'off the shelf' memoization libraries for Haskell:
The memo-combinators and memotrie packages, which make use of a beautiful trick involving lazy infinite data structures to achieve memoization in a purely functional way. (As I understand it, the former is slightly more flexible, while the latter is easier to use in simple cases: see this SO answer for discussion.)
The uglymemo package, which uses unsafePerformIO internally but still presents a referentially transparent interface. The use of unsafePerformIO internally results in better performance than the previous two packages. (Off the shelf, its implementation uses comparison-based search data structures, rather than perhaps-slightly-more-efficient hash functions; but I think that if you find and replace Cmp for Hashable and Data.Map for Data.HashMap and add the appropraite imports, you get a hash based version.)
However, I'm not aware of any library that looks answers up based on object identity rather than object value. This can be important, because sometimes the kinds of object which are being used as keys to your memo table (that is, as input to the function being memoized) can be large---so large that fully examining the object to determine whether you've seen it before is itself a slow operation. Slow, and also unnecessary, if you will be applying the memoized function again and again to an object which is stored at a given 'location in memory' 1. (This might happen, for example, if we're memoizing a function which is being called recursively over some large data structure with a lot of structural sharing.) If we've already computed our memoized function on that exact object before, we can already know the answer, even without looking at the object itself!
Implementing such a memoization library involves several subtle issues and doing it properly requires several special pieces of support from the language. Luckily, GHC provides all the special features that we need, and there is a paper by Peyton-Jones, Marlow and Elliott which basically worries about most of these issues for you, explaining how to build a solid implementation. They don't provide all details, but they get close.
The one detail which I can see which one probably ought to worry about, but which they don't worry about, is thread safety---their code is apparently not threadsafe at all.
My question is: does anyone know of a packaged library which does the kind of memoization discussed in the Peyton-Jones, Marlow and Elliott paper, filling in all the details (and preferably filling in proper thread-safety as well)?
Failing that, I guess I will have to code it up myself: does anyone have any ideas of other subtleties (beyond thread safety and the ones discussed in the paper) which the implementer of such a library would do well to bear in mind?
UPDATE
Following #luqui's suggestion below, here's a little more data on the exact problem I face. Let's suppose there's a type:
data Node = Node [Node] [Annotation]
This type can be used to represent a simple kind of rooted DAG in memory, where Nodes are DAG Nodes, the root is just a distinguished Node, and each node is annotated with some Annotations whose internal structure, I think, need not concern us (but if it matters, just ask and I'll be more specific.) If used in this way, note that there may well be significant structural sharing between Nodes in memory---there may be exponentially more paths which lead from the root to a node than there are nodes themselves. I am given a data structure of this form, from an external library with which I must interface; I cannot change the data type.
I have a function
myTransform : Node -> Node
the details of which need not concern us (or at least I think so; but again I can be more specific if needed). It maps nodes to nodes, examining the annotations of the node it is given, and the annotations its immediate children, to come up with a new Node with the same children but possibly different annotations. I wish to write a function
recursiveTransform : Node -> Node
whose output 'looks the same' as the data structure as you would get by doing:
recursiveTransform Node originalChildren annotations =
myTransform Node recursivelyTransformedChildren annotations
where
recursivelyTransformedChildren = map recursiveTransform originalChildren
except that it uses structural sharing in the obvious way so that it doesn't return an exponential data structure, but rather one on the order of the same size as its input.
I appreciate that this would all be easier if say, the Nodes were numbered before I got them, or I could otherwise change the definition of a Node. I can't (easily) do either of these things.
I am also interested in the general question of the existence of a library implementing the functionality I mention quite independently of the particular concrete problem I face right now: I feel like I've had to work around this kind of issue on a few occasions, and it would be nice to slay the dragon once and for all. The fact that SPJ et al felt that it was worth adding not one but three features to GHC to support the existence of libraries of this form suggests that the feature is genuinely useful and can't be worked around in all cases. (BUT I'd still also be very interested in hearing about workarounds which will help in this particular case too: the long term problem is not as urgent as the problem I face right now :-) )
1 Technically, I don't quite mean location in memory, since the garbage collector sometimes moves objects around a bit---what I really mean is 'object identity'. But we can think of this as being roughly the same as our intuitive idea of location in memory.
If you only want to memoize based on object identity, and not equality, you can just use the existing laziness mechanisms built into the language.
For example, if you have a data structure like this
data Foo = Foo { ... }
expensive :: Foo -> Bar
then you can just add the value to be memoized as an extra field and let the laziness take care of the rest for you.
data Foo = Foo { ..., memo :: Bar }
To make it easier to use, add a smart constructor to tie the knot.
makeFoo ... = let foo = Foo { ..., memo = expensive foo } in foo
Though this is somewhat less elegant than using a library, and requires modification of the data type to really be useful, it's a very simple technique and all thread-safety issues are already taken care of for you.
It seems that stable-memo would be just what you needed (although I'm not sure if it can handle multiple threads):
Whereas most memo combinators memoize based on equality, stable-memo does it based on whether the exact same argument has been passed to the function before (that is, is the same argument in memory).
stable-memo only evaluates keys to WHNF.
This can be more suitable for recursive functions over graphs with cycles.
stable-memo doesn't retain the keys it has seen so far, which allows them to be garbage collected if they will no longer be used. Finalizers are put in place to remove the corresponding entries from the memo table if this happens.
Data.StableMemo.Weak provides an alternative set of combinators that also avoid retaining the results of the function, only reusing results if they have not yet been garbage collected.
There is no type class constraint on the function's argument.
stable-memo will not work for arguments which happen to have the same value but are not the same heap object. This rules out many candidates for memoization, such as the most common example, the naive Fibonacci implementation whose domain is machine Ints; it can still be made to work for some domains, though, such as the lazy naturals.
Ekmett just uploaded a library that handles this and more (produced at HacPhi): http://hackage.haskell.org/package/intern. He assures me that it is thread safe.
Edit: Actually, strictly speaking I realize this does something rather different. But I think you can use it for your purposes. It's really more of a stringtable-atom type interning library that works over arbitrary data structures (including recursive ones). It uses WeakPtrs internally to maintain the table. However, it uses Ints to index the values to avoid structural equality checks, which means packing them into the data type, when what you want are apparently actually StableNames. So I realize this answers a related question, but requires modifying your data type, which you want to avoid...

Resources