How to use Compare-and-swap for complex data structures - multithreading

Classic CaS examples deal with simple data structures where critical change is to one primitive variable. However, even linked list, for example, requires change in multiple data items. For example: head, tail, next and prev. How to deal with it?
I have in mind some "capturing" 4 pointers in 4-long memory block and putting the block into CAS routine. Is it technically available and good practice?

Related

Haskell alternative for Doubly-linked-list coupled with Hash-table pattern

There's a useful pattern in imperative programming, namely, a doubly-linked-list coupled with a hash-table for constant time lookup in the linked list.
One application of this pattern is in LRU cache. The head of the doubly-linked-list will contain the least recently used entry in the cache and the last element in the doubly-linked-list will contain the most recently used entry. The keys in the hash-table are keys of the entries and the values are pointers to nodes in the linked-list corresponding to the key/entry. When an entry is queried in the cache, hash-table will be used to point to its node in the linked-list and then the node will be removed from its current location in the linked-list and be placed at the end of the linked-list making it the most-recently-used entry. For eviction, we simply remove entries from the head of the linked-list as they are the least recently used ones. Both lookup and eviction operations will take constant time.
I can think of implementing this in Haskell using two TreeMaps and I know that the time complexity will be O(log n). But I am a little uncomfortable as the constant factor in the time complexity seems a little high. Specifically, to perform a look-up, first I need to check if the entry exists and save its value, then I need to first delete it from the LRU map and re-insert it with a new key. This means that each lookup will result in a root-to-node traversal three times.
Is there a better way of doing this in Haskell?
As comments indicate, mutable vectors are perfectly acceptable when required. However, I think there's an issue with the way you've stated the question - unless the idea is to duplicate "as closely as possible" (without mutable structures) the imperative code, why bother having 2 treemaps? A single priority search queue (see packages pqueue or PSQueue) would be an appropriate structure whilst maintaining purity. It supports efficiently both priorities (for eviction) and searching (for lookups of your desired cached argument).
On a related note, some structures support eg. Data.Map's alterF, which effectively provides you with a continuation allowing you to "do something else" dependent on the Maybe value at a key, but "remembering" where you are and thus avoiding to pay the full cost to re-traverse the structure to subsequently modify at this key. See also the at lens.

C++ Threads writing to different parts of array of vector

I have an std::array<std::vector, NUM_THREADS> and I basically want each thread to go get some data, and store it in its own std::vector, and also to read from its vector.
Is this safe? Or am I going to have to use a mutex or something?
The rule regarding data-races is that if every memory location is either accessed by no more than one thread at a time, or is only read (by any number of threads, but no writes), you don't need atomicity. Otherwise, you need either atomicity or synchronization (such as mutual-exclusion).
If every thread is only writing to and reading from its own vector, this would be safe. If two threads are writing to the same vector elements without synchronization, or if they're both writing to the same vector itself (e.g., appending or truncating the vector), you're pretty much clobbered --- that's two simultaneous writes. If two threads are each writing to elements of their own vectors and reading from both vectors, it's more complicated, but in general I would expect it to be unsafe. There are very specific arrangements where it may be safe/legal, but they will be very brittle, and likely hard to maintain, so it's probably better to re-architect to avoid it.
As an example of a usage like this where it would be legal (but again, brittle and hard to retain safety during code maintenance) would be where none of the vectors are changing size (a reallocation is going to be a write to the vector itself which would preclude any reads on the vector or its elements by other threads) and each thread is able to avoid reading from any specific element of a vector that is written to by any other thread (for example, you have two threads, one reading from and writing to even elements of the vectors and the other reading from and writing to odd elements of the vectors).
The above example is very artificial and probably not all that useful for real access patterns that might be desired. Other examples I could think of would probably also be artificial and unhelpful. And it's very easy to do some simple operation that would destroy the whole guarantee. In particular, if any thread performs push_back() on their own vector, any threads that may be concurrently reading the vector are almost guaranteed to result in undefined behavior. (You might be able to align the stars using reserve() very carefully and make code that is legal, but I certainly wouldn't attempt it myself.)

Suitable Haskell type for large, frequently changing sequence of floats

I have to pick a type for a sequence of floats with 16K elements. The values will be updated frequently, potentially many times a second.
I've read the wiki page on arrays. Here are the conclusions I've drawn so far. (Please correct me if any of them are mistaken.)
IArrays would be unacceptably slow in this case, because they'd be copied on every change. With 16K floats in the array, that's 64KB of memory copied each time.
IOArrays could do the trick, as they can be modified without copying all the data. In my particular use case, doing all updates in the IO monad isn't a problem at all. But they're boxed, which means extra overhead, and that could add up with 16K elements.
IOUArrays seem like the perfect fit. Like IOArrays, they don't require a full copy on each change. But unlike IOArrays, they're unboxed, meaning they're basically the Haskell equivalent of a C array of floats. I realize they're strict. But I don't see that being an issue, because my application would never need to access anything less than the entire array.
Am I right to look to IOUArrays for this?
Also, suppose I later want to read or write the array from multiple threads. Will I have backed myself into a corner with IOUArrays? Or is the choice of IOUArrays totally orthogonal to the problem of concurrency? (I'm not yet familiar with the concurrency primitives in Haskell and how they interact with the IO monad.)
A good rule of thumb is that you should almost always use the vector library instead of arrays. In this case, you can use mutable vectors from the Data.Vector.Mutable module.
The key operations you'll want are read and write which let you mutably read from and write to the mutable vector.
You'll want to benchmark of course (with criterion) or you might be interested in browsing some benchmarks I did e.g. here (if that link works for you; broken for me).
The vector library is a nice interface (crazy understatement) over GHC's more primitive array types which you can get to more directly in the primitive package. As are the things in the standard array package; for instance an IOUArray is essentially a MutableByteArray#.
Unboxed mutable arrays are usually going to be the fastest, but you should compare them in your application to IOArray or the vector equivalent.
My advice would be:
if you probably don't need concurrency first try a mutable unboxed Vector as Gabriel suggests
if you know you will want concurrent updates (and feel a little brave) then first try a MutableArray and then do atomic updates with these functions from the atomic-primops library. If you want fine-grained locking, this is your best choice. Of course concurrent reads will work fine on whatever array you choose.
It should also be theoretically possible to do concurrent updates on a MutableByteArray (equivalent to IOUArray) with those atomic-primops functions too, since a Float should always fit into a word (I think), but you'd have to do some research (or bug Ryan).
Also be aware of potential memory reordering issues when doing concurrency with the atomic-primops stuff, and help convince yourself with lots of tests; this is somewhat uncharted territory.

Any way to manually indicate element of a MutableArray# safe to GC?

In my application I'm working with MutableArrays (via the primitive package) shared across threads. I know when individual elements are no longer used and I'd like some way (unsafeMarkGarbage or something) to indicate to the runtime that they can be collected. At least I'd like to experiment with that if such a function or equivalent technique exists.
EDIT, to add a bit more detail: I've got a conceptual "infinite tape" implemented as a linked list of short MutableArray segments, something like:
data Seg a = Seg (MutableArray a) (IORef (Maybe (Seg a)))
I access the tape using a concurrent counter and always know when an element of the tape will no longer be accessed. In certain cases when a thread is descheduled it's possible that entire array segments (both the array and its elements) which could have been GC'd will stick around as their references will persist.
An ideal solution would avoid an additional write (maybe that's silly), avoid another layer of indirection in the array, and allow entire MutableArrays to be collected when all their elements expire.
Weak references do seem to be the most promising sort of mechanism I've seen, but I can't yet see how they can help me here.
I would suggest you store undefined in the positions that you would like to garbage collect.

Weak Tables in lua - What are the practical uses?

I understand what weak tables are.
But I'd like to know where weak tables can be used practically?
The docs say
Weak tables are often used in situations where you wish to annotate
values without altering them.
I don't understand that. What does that mean?
Posted as an answer from comments...
Since Lua doesn't know what you consider garbage, it won't collect anything it isn't sure to be garbage. In some situations (one of which could be debugging) you want to specify a value for a variable without causing it to be considered "not trash" by Lua. From my understanding, weak tables allow you to do what you'd normally do with variables/objects/etc, but if they're weak referenced (or in a weak table), they will still be considered garbage by Lua and collected when the garbage collection function is called.
Example: Think about if you wanted to use an associative array, with key/value pairs in two separate private tables. If you only wanted to use the key table for one specific use, once you are done using it, it will be locked into existence in Lua. If you were to use a weak table, however, you'd be able to collect it as garbage as soon as you were done using it, freeing up the resources it was using.
To explain that one cryptic sentence about annotating, when you "alter" a variable, you lock it into existence and Lua no longer considers it to be garbage. To "annotate" a variable means to give it a name, number, or some other value. So, it means that you're allowed to give a variable a name/value without locking it into existence (so then Lua can garbage collect it).
Translation:
Weak tables are often used in situations where you wish to give a name to a value without locking the value into existence, which takes up memory.
Normally, storing a reference to an obect will prevent that object from being reclaimed when the object goes out of scope. Weak references do not prevent garbage collection.

Resources