Uniqueness Types Instead of STM - programming-languages

A forum post indicates using uniqueness types instead of STM. I don't understand what it is saying. How are uniqueness types supposed to deal with the problem that STM is trying to solve, for example where multiple threads are updating the same variable?
I've looked at Wikipedia's articles on uniqueness types and linear types, and it's still not clear what the forum post meant.

Designing systems where data is shared and mutated concurrently by multiple threads is hard.
Approaches to make concurrency easier include:
STM -- With STM, data can still be shared and mutated by multiple threads, but concurrent mutations are detected thanks to the use of transactions.
Uniqueness types -- With uniqueness types, at most one reference to an object exists. So, by definition, it is impossible to mutate the same data concurrently (you would need two references at least, one per thread).
Immutability -- Avoid the problem of concurrent mutations altogether and share only immutable data.
Actors -- Actors rely on asynchronous messages, and serialize the messages they receive, thus avoiding concurrent modifications.
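To make the uniqueness-types item more concrete, here is a minimal Python sketch. Python has no uniqueness types, so this only emulates the idea at run time: the hypothetical Unique box gives up its contents when they are handed over, so the sender can no longer reach the data the receiver is mutating. A real uniqueness type system would reject the stale access at compile time instead of raising an error.

```python
import queue
import threading


class Unique:
    """Hypothetical run-time stand-in for a unique reference."""

    def __init__(self, value):
        self._value = value
        self._moved = False

    def take(self):
        # Hand the value over exactly once; afterwards this box is empty.
        if self._moved:
            raise RuntimeError("value was already moved elsewhere")
        self._moved = True
        value, self._value = self._value, None
        return value


def worker(inbox, outbox):
    # The worker receives the list itself, so it is now the only holder.
    data = inbox.get()
    data.append(4)
    outbox.put(data)


inbox, outbox = queue.Queue(), queue.Queue()
box = Unique([1, 2, 3])

t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
inbox.put(box.take())   # ownership moves to the worker thread
result = outbox.get()   # ownership moves back with the result
t.join()

try:
    box.take()          # the old handle is dead: no second reference
    stale_access_rejected = False
except RuntimeError:
    stale_access_rejected = True
```

Because only one holder exists at any moment, no lock is needed around the append.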

Related

What constructs are not possible using Ponylang's lock-free model?

Ponylang is a new language that is lock-free and datarace-free. My impression is that to accomplish this, Ponylang looks at the sentence "if two threads can see the same object, then writes must prohibit any other operation by another thread", and uses a type system to enforce the various special cases. For example, there's a type descriptor that says, "no other thread can see this object", and one that says, "this reference is read-only", and various others. Admittedly my understanding of this is quite poor, and ponylang's documentation is short on examples.
My question is: are there operations possible with a lock-based language that aren't translatable into ponylang's type-based system at all? Also, are there such operations that are not translatable into efficient constructs in ponylang?
[...] are there operations possible with a lock-based language that aren't translatable into ponylang's type-based system at all?
The whole point of reference capabilities in Pony is to prevent you from doing things that are possible, and even trivial, in other languages, like sharing a list between two threads and adding elements to it concurrently. So, yes, in languages like Java, you can share data between threads in a way that is impossible in Pony.
Also, are there such operations that are not translatable into efficient constructs in ponylang?
If you're asking whether lock-based languages can be more efficient than Pony in some situations, then I think so. You can always create a situation that benefits from N threads and 1 lock and is worse when you use the actor model, which forces you to pass information around in messages.
The thing is not to see the actor model as superior in all cases. It's a different model of concurrency, and problems are solved differently. For example, to compute N values and accumulate the results in a list:
In a thread-model you would
create a thread pool,
create a thread-safe list,
create N tasks sharing the list, and
wait for N tasks to finish.
In an actor-model you would
create an actor A waiting for N values,
create N actors B sharing the actor A, and
wait for A to produce a list.
Obviously, each task would add a value to the list and each actor B would send the value to actor A. Depending on how messages are passed between actors, it can be slower to send N values than to lock N times. Typically it will be slower but, on the other hand, you will never get a list with an unexpected size.
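The two recipes above can be sketched side by side. This is a minimal Python illustration, not Pony or Java: the names compute, task, actor_a and actor_b are hypothetical, and plain threads with a queue stand in for real actors.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

N = 8


def compute(i):
    # Placeholder computation; any pure function would do.
    return i * i


# Thread model: N tasks share one list guarded by a lock.
shared = []
shared_lock = threading.Lock()


def task(i):
    value = compute(i)
    with shared_lock:
        shared.append(value)


with ThreadPoolExecutor(max_workers=4) as pool:
    for i in range(N):
        pool.submit(task, i)
# Leaving the "with" block waits for all N tasks to finish.


# Actor model: actor A owns the list; the B actors only send messages.
mailbox = queue.Queue()
done = queue.Queue()


def actor_a(expected):
    collected = []          # private state, never shared
    while len(collected) < expected:
        collected.append(mailbox.get())
    done.put(collected)


def actor_b(i):
    mailbox.put(compute(i))


a = threading.Thread(target=actor_a, args=(N,))
a.start()
bs = [threading.Thread(target=actor_b, args=(i,)) for i in range(N)]
for b in bs:
    b.start()
for b in bs:
    b.join()
actor_result = done.get()
a.join()
```

In both versions the order of the values depends on scheduling; only the contents are the same.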
I believe it can do anything that a shared-everything system with locks can do. With just iso objects and consume, it is basically a pure message-passing system, which can do anything that a lock-based system does. As in, mach3 can do anything Linux can.

C++ Threads writing to different parts of array of vector

I have an std::array<std::vector, NUM_THREADS> and I basically want each thread to go get some data, and store it in its own std::vector, and also to read from its vector.
Is this safe? Or am I going to have to use a mutex or something?
The rule regarding data-races is that if every memory location is either accessed by no more than one thread at a time, or is only read (by any number of threads, but no writes), you don't need atomicity. Otherwise, you need either atomicity or synchronization (such as mutual-exclusion).
If every thread is only writing to and reading from its own vector, this would be safe. If two threads are writing to the same vector elements without synchronization, or if they're both writing to the same vector itself (e.g., appending or truncating the vector), you're pretty much clobbered --- that's two simultaneous writes. If two threads are each writing to elements of their own vectors and reading from both vectors, it's more complicated, but in general I would expect it to be unsafe. There are very specific arrangements where it may be safe/legal, but they will be very brittle, and likely hard to maintain, so it's probably better to re-architect to avoid it.
An example of a usage like this that would be legal (but again, brittle and hard to keep safe during code maintenance) is one where none of the vectors change size (a reallocation is a write to the vector itself, which would preclude any reads of the vector or its elements by other threads) and each thread is able to avoid reading from any specific element of a vector that is written to by any other thread (for example, you have two threads, one reading from and writing to even elements of the vectors and the other reading from and writing to odd elements of the vectors).
The above example is very artificial and probably not all that useful for real access patterns that might be desired. Other examples I could think of would probably also be artificial and unhelpful. And it's very easy to do some simple operation that would destroy the whole guarantee. In particular, if any thread performs push_back() on their own vector, any threads that may be concurrently reading the vector are almost guaranteed to result in undefined behavior. (You might be able to align the stars using reserve() very carefully and make code that is legal, but I certainly wouldn't attempt it myself.)
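The safe case from the answer, where every thread touches only its own container, can be sketched as follows. This is Python rather than C++, so it shows only the partitioning pattern; Python's memory model differs from the C++ data-race rules described above. The names buckets and worker are hypothetical.

```python
import threading

NUM_THREADS = 4

# One list per thread, mirroring the array of per-thread vectors in the question.
buckets = [[] for _ in range(NUM_THREADS)]


def worker(index):
    mine = buckets[index]        # each thread touches only its own list
    for k in range(1000):
        mine.append(index * 1000 + k)
    # Reading back its own data is also fine.
    assert mine[0] == index * 1000


threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()                     # join before the main thread reads anything

sizes = [len(b) for b in buckets]
```

The outer container is never resized while the threads run, and the main thread reads the results only after joining, which is the point where synchronization happens.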

Clojure, atoms and refs

I have a couple of questions about refs and atoms, and Clojure reference types in general, after reading Clojure Programming. Most of the questions relate to this book.
First:
The book talks about coordination and says "A coordinated operation is one where multiple actors must cooperate in order to yield correct results." Does this mean that if I have three functions fn1, fn2, and fn3, and each of them performs some operation that may change the state of the reference (assuming each runs in its own thread), the changes happen synchronously, as a chained operation? Something like: the output of fn1 is the input of fn2, and so on.
Second:
I cannot understand the difference between refs and atoms. The book says refs are for coordinated synchronous changes and atoms are for uncoordinated synchronous changes. Each has its own example: an atom operated on by multiple functions (1 atom, 2 functions), and multiple refs operated on by 1 function. The book didn't give an example of why we shouldn't or can't do it the other way around.
Atoms allow multiple threads to apply transformations to a single value and guarantee the transformations are atomic. swap! takes the atom and a function expecting the current value of the atom. The result of calling that function with the current value is stored in the atom. Multiple calls to swap! may interleave, but each update is applied atomically: if another thread changes the atom in the meantime, swap! retries the function with the new value, so the function should be free of side effects.
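As a rough illustration of how swap! behaves, here is a minimal Python sketch of an atom built on a compare-and-set retry loop. This is not Clojure's implementation: the Atom class and its method names are hypothetical, and a lock stands in for the hardware compare-and-set instruction.

```python
import threading


class Atom:
    """Hypothetical Atom; a lock stands in for a hardware compare-and-set."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def deref(self):
        return self._value

    def compare_and_set(self, expected, new):
        # Succeeds only if nobody replaced the value in the meantime.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def swap(self, fn):
        # Retry until the update is applied to an unchanged value.
        # fn may run more than once, so it should have no side effects.
        while True:
            old = self._value
            new = fn(old)
            if self.compare_and_set(old, new):
                return new


visits = Atom(0)


def record_visits(n):
    for _ in range(n):
        visits.swap(lambda count: count + 1)


threads = [threading.Thread(target=record_visits, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = visits.deref()
```

No increment is lost, because an update computed from a stale value fails the compare-and-set and is recomputed.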
Refs allow multiple threads to update multiple values in a co-ordinated way. All updates to all refs inside a sync will complete or none will. You MUST write your code such that transaction retries are catered for. There are a few potential performance tweaks, if you can relax the ordering of operations, which MAY reduce the chance of transaction retry (but don't guarantee it).
The difference is really easy.
Refs operate within a transaction (similar to database transactions). Imagine a banking system. You can represent an account as a ref.
To transfer money, you start a Clojure STM transaction via (dosync). Subtract X amount of money from account ref-1 and add that amount to account ref-2.
If something goes wrong, then Clojure STM will restart the operation.
Imagine there is no transaction. You subtracted X amount of money from ref-1 and, before you added that amount to ref-2, something went wrong in your system. Your customers will not be happy at all (if you aren't sued anyway).
The Clojure STM is implemented as MVCC.
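The all-or-nothing transfer can be sketched with a toy optimistic transaction in Python. This is an assumption-laden illustration, not Clojure's STM: it keeps a single version number per ref instead of MVCC history, serializes commits with one lock, and the names Ref, dosync, read, write and transfer are hypothetical.

```python
import threading

_commit_lock = threading.Lock()   # serializes commits only


class Ref:
    def __init__(self, value):
        self.value = value
        self.version = 0


def dosync(body):
    """Run body(read, write) optimistically and retry on conflict."""
    while True:
        reads = {}    # ref -> version seen by this attempt
        writes = {}   # ref -> pending new value

        def read(ref):
            if ref in writes:
                return writes[ref]
            with _commit_lock:  # take a consistent (value, version) pair
                value, version = ref.value, ref.version
            reads.setdefault(ref, version)
            return value

        def write(ref, value):
            writes[ref] = value

        result = body(read, write)

        with _commit_lock:
            if all(ref.version == seen for ref, seen in reads.items()):
                for ref, value in writes.items():
                    ref.value = value
                    ref.version += 1
                return result
        # Another transaction committed first: discard and retry.


def transfer(src, dst, amount):
    def body(read, write):
        write(src, read(src) - amount)
        write(dst, read(dst) + amount)
    dosync(body)


account_1 = Ref(1000)
account_2 = Ref(0)

threads = [threading.Thread(target=transfer, args=(account_1, account_2, 10))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

balances = (account_1.value, account_2.value)
```

Pending writes stay private until commit, so no other thread can observe the money missing from one account before it appears in the other.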
Atoms, on the other hand, don't need a transaction in place to operate. Atoms are convenient when there is no coordination. For example, a counter that tracks the total number of visits to a page in a web analytics system.
Have a look at Clojure Refs. It offers a lot of valuable information.

"Wait-free" data in Haskell

I've been led to believe that the GHC implementation of TVars is lock-free, but not wait-free. Are there any implementations that are wait-free (e.g. a package on Hackage)?
Wait-freedom is a term from distributed computing. An algorithm is wait-free if a thread (or distributed node) is able to terminate correctly even if all input from other threads is delayed/lost at any time.
If you care about consistency, then you cannot guarantee wait-freedom (assuming that you always want to terminate correctly, i.e. guarantee availability). This follows from the CAP theorem [1], since wait-freedom essentially implies partition-tolerance.
[1] http://en.wikipedia.org/wiki/CAP_theorem
Your question "Are there any implementations that are wait-free?" is a bit incomplete. STM (and thus TVar) is rather complex and has support built into the compiler - you can't build it properly with Haskell primitives.
If you're looking for any data container that allows mutation and can be non-blocking, then you want IORefs or MVars (but MVars can block if no value is available).

DDD: what's the use of the difference between entities and value objects?

Entities and value objects are both domain objects. What's the use of knowing the distinction between the two in DDD? E.g., does thinking about domain objects as being either an entity or a value object foster a cleaner domain model?
Yes, it is very helpful to be able to tell the difference, particularly when you are designing and implementing your types.
One of the main differences is when it comes to dealing with equality, since Entities should have quite different behavior than Value Objects. Knowing whether your object is an Entity or a Value Object tells you how you should implement equality for the type. This is helpful in itself, but it doesn't stop there.
Entities are mutable types (at least by concept). The whole idea behind an Entity is that it represents a Domain concept with a known lifetime progression (i.e. it is created, it undergoes several transformations, it is archived and perhaps eventually deleted). It represents the same particular 'thing' even if months or years pass by, and it changes state along the way.
Value Objects, on the other hand, simply represent values without any inherent identity. Although you don't have to do this, they lend themselves tremendously well to being implemented as immutable types. This is very interesting because any immutable type is by definition thread-safe. As we are moving into the multi-core age, knowing when to implement an object as an immutable type is very valuable.
It also helps a lot in unit testing when the equality semantics are well-known. In both cases, equality is well-defined. I don't know what language you use, but in many languages (C#, Java, VB.NET) equality is determined by reference by default, which in many cases isn't particularly useful.
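The different equality rules can be sketched in Python. The Money and Customer types below are hypothetical examples, not taken from any particular domain model: the Value Object is immutable and compared by its values, while the Entity is mutable and compared by its identifier.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Money:
    """Value Object: immutable, compared by its values."""
    amount: int
    currency: str


class Customer:
    """Entity: mutable, compared by identity (its id), not by state."""

    def __init__(self, customer_id, name):
        self.customer_id = customer_id
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Customer) and self.customer_id == other.customer_id

    def __hash__(self):
        return hash(self.customer_id)


same_value = Money(10, "EUR") == Money(10, "EUR")

before = Customer(42, "Ann")
after = Customer(42, "Ann Smith")      # same entity after a name change
same_entity = before == after

try:
    Money(10, "EUR").amount = 20       # frozen dataclass rejects mutation
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True
```

Because Money cannot change after construction, instances can be shared between threads without locking, which is the thread-safety benefit mentioned above.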

Resources