What constructs are not possible using Ponylang's lock-free model? - multithreading

Ponylang is a new language that is lock-free and datarace-free. My impression is that to accomplish this, Ponylang looks at the sentence "if two threads can see the same object, then writes must prohibit any other operation by another thread", and uses a type system to enforce the various special cases. For example, there's a type descriptor that says, "no other thread can see this object", and one that says, "this reference is read-only", and various others. Admittedly my understanding of this is quite poor, and ponylang's documentation is short on examples.
My question is: are there operations possible with a lock-based language that aren't translatable into ponylang's type-based system at all? Also, are there such operations that are not translatable into efficient constructs in ponylang?

[...] are there operations possible with a lock-based language that aren't translatable into ponylang's type-based system at all?
The whole point of reference capabilities in Pony is to prevent you from doing things that are possible, and even trivial, in other languages, like sharing a list between two threads and adding elements to it concurrently. So yes, in languages like Java you can share data between threads in a way that is impossible in Pony.
Also, are there such operations that are not translatable into efficient constructs in ponylang?
If you're asking whether lock-based languages can be more efficient than Pony in some situations, then I think so. You can always construct a situation that benefits from N threads and 1 lock and that performs worse under the actor model, which forces you to pass information around in messages.
The point is not to see the actor model as superior in all cases. It's a different model of concurrency, and problems are solved differently. For example, to compute N values and accumulate the results in a list:
In a thread-model you would
create a thread pool,
create a thread-safe list,
create N tasks sharing the list, and
wait for N tasks to finish.
In an actor-model you would
create an actor A waiting for N values,
create N actors B sharing the actor A, and
wait for A to produce a list.
Obviously, each task would add a value to the list, and each actor B would send its value to actor A. Depending on how messages are passed between actors, it can be slower to send N values than to lock N times. Typically it will be slower but, on the other hand, you will never get a list with an unexpected size.
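The two recipes above can be sketched side by side. This is a minimal illustration in Python rather than Pony or Java: compute is a hypothetical stand-in for the real work, and a queue.Queue plays the role of actor A's mailbox, which is an assumption made for illustration and not how Pony schedules actors.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

N = 8


def compute(i):
    # Hypothetical stand-in for the real per-task computation.
    return i * i


def with_lock():
    # Thread model: N tasks append to one shared list guarded by a lock.
    results = []
    lock = threading.Lock()

    def task(i):
        value = compute(i)
        with lock:
            results.append(value)

    with ThreadPoolExecutor(max_workers=4) as pool:
        for i in range(N):
            pool.submit(task, i)
    # Leaving the "with" block waits for all submitted tasks to finish.
    return results


def with_messages():
    # Actor-like model: workers never touch the list. They send messages
    # to a mailbox, and only the collector ("actor A") owns the list.
    mailbox = queue.Queue()
    collected = []

    def collector():
        for _ in range(N):
            collected.append(mailbox.get())

    def worker(i):
        mailbox.put(compute(i))

    a = threading.Thread(target=collector)
    a.start()
    bs = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
    for b in bs:
        b.start()
    for b in bs:
        b.join()
    a.join()
    return collected


locked = with_lock()
messaged = with_messages()
```

Both versions end up with the same N values, possibly in a different order. The difference is who is allowed to touch the list: every task in the first version, only the collector in the second.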

I believe it can do anything that a shared-everything model with locks can do. With just iso objects and consume, it is basically a pure message-passing system, which can do anything that a lock-based system does, much as Mach 3 can do anything Linux can.

Related

C++ Threads writing to different parts of array of vector

I have an std::array<std::vector, NUM_THREADS> and I basically want each thread to go get some data, and store it in its own std::vector, and also to read from its vector.
Is this safe? Or am I going to have to use a mutex or something?
The rule regarding data-races is that if every memory location is either accessed by no more than one thread at a time, or is only read (by any number of threads, but no writes), you don't need atomicity. Otherwise, you need either atomicity or synchronization (such as mutual-exclusion).
If every thread is only writing to and reading from its own vector, this would be safe. If two threads are writing to the same vector elements without synchronization, or if they're both writing to the same vector itself (e.g., appending or truncating the vector), you're pretty much clobbered --- that's two simultaneous writes. If two threads are each writing to elements of their own vectors and reading from both vectors, it's more complicated, but in general I would expect it to be unsafe. There are very specific arrangements where it may be safe/legal, but they will be very brittle, and likely hard to maintain, so it's probably better to re-architect to avoid it.
One usage like this that would be legal (but again, brittle and hard to keep safe during code maintenance) is where none of the vectors changes size (a reallocation is a write to the vector itself, which precludes any reads of the vector or its elements by other threads) and each thread avoids reading any element of a vector that is written to by any other thread (for example, you have two threads, one reading from and writing to even elements of the vectors and the other reading from and writing to odd elements of the vectors).
The above example is very artificial and probably not all that useful for real access patterns that might be desired. Other examples I could think of would probably also be artificial and unhelpful. And it's very easy to do some simple operation that would destroy the whole guarantee. In particular, if any thread performs push_back() on its own vector, any other thread that is concurrently reading that vector is almost guaranteed to cause undefined behavior. (You might be able to align the stars using reserve() very carefully and make code that is legal, but I certainly wouldn't attempt it myself.)
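The safe pattern from the answer (each thread reads and writes only its own container) can be sketched as follows. This is Python, not C++, so it only illustrates the access pattern; it says nothing about the C++ memory model, and the names buckets, NUM_THREADS and ITEMS_PER_THREAD are assumptions chosen for the example.

```python
import threading

NUM_THREADS = 4
ITEMS_PER_THREAD = 1000

# One list per thread, mirroring the array of vectors in the question.
buckets = [[] for _ in range(NUM_THREADS)]


def worker(index):
    own = buckets[index]  # the only container this thread ever touches
    for n in range(ITEMS_PER_THREAD):
        own.append(index * ITEMS_PER_THREAD + n)  # write to its own list
    assert own[-1] == index * ITEMS_PER_THREAD + ITEMS_PER_THREAD - 1  # read back


threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread reads the lists only after join(), so no reader ever
# overlaps with a writer.
sizes = [len(b) for b in buckets]
```

No lock appears anywhere because no container is shared while it is being written. The moment one worker starts reading another worker's list, that argument no longer holds.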

Clojure, atoms and refs

I have a couple of questions about refs and atoms, and Clojure reference types in general, after reading Clojure Programming. Most of the questions relate to this book.
First:
The book talks about coordination and says, "A coordinated operation is one where multiple actors must cooperate in order to yield correct results." Does this mean that if I have 3 functions fn1, fn2, and fn3, and each of them does some operation that possibly changes the state of the reference (assuming each runs in its own thread), they happen synchronously as a chained operation? Something like: the output of fn1 is the input of fn2, and so on.
Second:
I cannot understand the difference between refs and atoms. The book says refs are for coordinated synchronous changes and atoms are for uncoordinated synchronous changes. Each of them has its own example: an atom is operated on by multiple functions (1 atom, 2 functions), and multiple refs are operated on by 1 function. The book doesn't give an example of why we shouldn't or can't do it the other way around.
Atoms allow multiple threads to apply transformations to a single value and guarantee the transformations are atomic. swap! takes the atom and a function expecting the current value of the atom. The result of calling that function with the current value is stored in the atom. Multiple calls to swap! may interleave, but each call will run in isolation.
Refs allow multiple threads to update multiple values in a co-ordinated way. All updates to all refs inside a sync will complete or none will. You MUST write your code such that transaction retries are catered for. There are a few potential performance tweaks, if you can relax the ordering of operations, which MAY reduce the chance of transaction retry (but don't guarantee it).
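To make the swap! behaviour concrete, here is a rough Python emulation of an atom. The Atom class is hypothetical and is not Clojure's implementation: a lock stands in for the hardware compare-and-set, and swap retries whenever another thread got in first, which is why the function passed to it should be free of side effects.

```python
import threading


class Atom:
    """Hypothetical stand-in for a Clojure atom, for illustration only."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # emulates an atomic compare-and-set

    def deref(self):
        return self._value

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def swap(self, fn):
        # Retry until no other thread changed the value between our read
        # and our write. fn may therefore run more than once.
        while True:
            old = self._value
            new = fn(old)
            if self.compare_and_set(old, new):
                return new


counter = Atom(0)


def hit_page():
    for _ in range(1000):
        counter.swap(lambda v: v + 1)


threads = [threading.Thread(target=hit_page) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight threads each apply 1000 increments and none is lost, even though the calls interleave freely.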
The difference is really easy.
Refs operate under a transaction (similar to database transactions). Imagine a banking system. You can represent an account as a ref.
To transfer money, you start a Clojure STM transaction via (dosync), subtract X amount of money from account ref-1, and add that amount to account ref-2.
If something goes wrong, then Clojure STM will restart the operation.
Imagine there is no transaction. You subtracted X amount of money from ref-1 and, before you added that amount to ref-2, something went wrong in your system. Your customers will not be happy at all (if you aren't sued anyway).
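The all-or-nothing property of the transfer can be sketched in Python, with a loud caveat: Python has no STM, so a single lock stands in for (dosync) here. That is an assumption made purely to show the coordination requirement; it does not model retries or MVCC. The account names follow the example above.

```python
import threading

balances = {"ref-1": 100, "ref-2": 0}  # hypothetical accounts
tx_lock = threading.Lock()             # stand-in for dosync; this is not STM


def transfer(src, dst, amount):
    # Both updates happen while holding the lock, so no other transfer or
    # reader that uses the lock ever sees money missing or duplicated.
    with tx_lock:
        if balances[src] < amount:
            return False  # nothing was changed
        balances[src] -= amount
        balances[dst] += amount
        return True


def total():
    with tx_lock:
        return balances["ref-1"] + balances["ref-2"]


threads = [
    threading.Thread(target=transfer, args=("ref-1", "ref-2", 10))
    for _ in range(15)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Fifteen transfers of 10 are attempted against a balance of 100, so ten succeed and five are refused, and the total across both accounts never changes.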
The Clojure STM is implemented as MVCC.
Atoms, on the other hand, don't need a transaction in place to operate. Atoms are convenient when there is no coordination. For example, a counter that tracks the total number of visits to a page in a web analytics system.
Have a look at Clojure Refs. It offers a lot of valuable information.

Parallelism in functional languages

One of FP features advertised is that a program is "parallel by default" and that naturally fits modern multi-core processors. Indeed, reducing a tree is parallel by its nature. However, I don't understand how it maps to multi-threading. Consider the following fragment (pseudo code):
let y = read-and-parse-a-number-from-console
let x = get-integer-from-web-service-call
let r = 5 * x - y * 4
write-r-to-file
How can a translator determine which of the tree branches should be run on a thread? After you obtained x or y it would be stupid to reduce the 5 * x or y * 4 expressions on a separate thread (even if we grab it from a thread pool), wouldn't it? So how do different functional languages handle this?
We're not quite there yet.
Programs in pure declarative style (functional style is included in this category, but so are some other styles) tend to be much more amenable to parallelisation, because all data dependencies are explicit. This makes it very easy for the programmer to manually use primitives the language provides for specifying that two independent computations should be done in parallel, regardless of whether they share access to any data; if everything's immutable and there are no side effects, then changing the order in which things are done can't affect the result.
If purity is enforced by the language (as in Haskell, Mercury, etc, but unlike in Scala, F#, etc where purity is encouraged but unenforced), then it is possible for the compiler to attempt to automatically parallelise the program, but no existing language that I know of does this by default. If the language allows unchecked impure operations then it's generally impossible for the compiler to do the analysis necessary to prove that a given attempt to auto-parallelise the program is valid. So I do not expect any such language to ever support auto-parallelisation very effectively.
Note that the pseudo program you wrote is probably not pure declarative code. let y = read-and-parse-a-number-from-console and let x = get-integer-from-web-service-call are calculating x and y from impure external operations, and there's nothing in the program that fixes the order in which they should run. It's possible in general that executing two impure operations in either order will produce different results, and running those two operations in different threads gives up control of the order in which they run. So if a language like that was to auto-parallelise your program, it would almost certainly either introduce horrible concurrency bugs, or refuse to significantly parallelise anything.
However, the functional style still makes it easy to manually parallelise such programs. A human programmer can tell that it almost certainly doesn't matter in which order you read from the console and the network. Knowing that there's no shared mutable state, they can decide to run those two operations in parallel without digging into their implementations (which you'd have to do in imperative algorithms, where there might be mutable shared state even if it doesn't look like there is from the interface).
But the big complication that's in the way of auto-parallelising compilers for enforced-purity languages is knowing how much parallelisation to do. Running every computation possible in parallel vastly overwhelms any possible benefit with all the startup cost of spawning new threads (not to mention the context switches), as you try to run huge numbers of very short-lived threads on a small number of processors. The compiler needs to identify a much smaller number of "chunks" of computation that are reasonably large, and run the chunks in parallel while running the sub-computations of each chunk sequentially.
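The chunking idea can be sketched by hand. In this Python illustration, work is a hypothetical tiny pure computation, and the chunk size and worker count are arbitrary assumptions. CPython threads will not actually speed up CPU-bound work like this; the point is only the granularity: one task per chunk, sequential inside each chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def work(n):
    # Hypothetical tiny pure computation, far too small to deserve a thread.
    return n * n


def chunked(items, size):
    # Split the input into reasonably large pieces.
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_chunk(chunk):
    # Sub-computations inside a chunk run sequentially.
    return [work(n) for n in chunk]


def parallel_map(items, chunk_size=250, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per chunk instead of one task per element.
        parts = pool.map(run_chunk, chunked(items, chunk_size))
        return [value for part in parts for value in part]


squares = parallel_map(list(range(1000)))
```

With 1000 items and chunks of 250, only four tasks are scheduled instead of a thousand.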
But only "embarrassingly parallel" programs decompose nicely into very large completely independent computations. Most programs are much more interdependent. So unless you only want to be able to auto-parallelise programs that are very easy to manually parallelise, your auto-parallelisation probably needs to be able to identify and run "chunks" in parallel which are partially dependent on each other, having them wait when they get to points that really need a result that's supposed to be computed by another "chunk". This introduces extra overhead of synchronisation between the threads, so the logic that chooses what to run in parallel needs to be even better in order to beat the trivial strategy of just running everything sequentially.
The developers of Mercury (a pure logic programming language) are working on various methods of tackling these problems, from static analysis to using profiling data. If you're interested, their research papers have a lot more information. I presume other researchers are working on this area in other languages, but I don't know much about any other projects.
In that specific example, the third statement depends on the first and the second, but there is no interdependency between the first and the second. Therefore, a runtime environment could execute read-and-parse-a-number-from-console on a different thread from get-integer-from-web-service-call, but the execution of the third statement would have to wait until the first two are finished.
Some languages or runtime environments may be able to calculate a partial result (such as y * 4) before obtaining an actual value of x. As a high level programmer though, you would be unlikely to be able to detect this.
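Written out by hand, the dependency structure described here looks roughly like this in Python. The two input functions are hypothetical stubs that return fixed numbers instead of touching the console or the network.

```python
from concurrent.futures import ThreadPoolExecutor


def read_number_from_console():
    return 3  # hypothetical stub for read-and-parse-a-number-from-console


def get_integer_from_web_service():
    return 10  # hypothetical stub for get-integer-from-web-service-call


def compute_r():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The first two statements do not depend on each other,
        # so both can be started right away.
        y_future = pool.submit(read_number_from_console)
        x_future = pool.submit(get_integer_from_web_service)
        # The third statement needs both results, so it waits here.
        x = x_future.result()
        y = y_future.result()
    return 5 * x - y * 4


r = compute_r()
```

Only the two slow inputs get their own threads; the arithmetic stays on the calling thread, since handing 5 * x to another thread would cost more than computing it.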

Why is concurrent haskell non deterministic while parallel haskell primitives (par and pseq) deterministic?

Don't quite understand determinism in the context of concurrency and parallelism in Haskell. Some examples would be helpful.
Thanks
When dealing with pure values, the order of evaluation does not matter. That is essentially what parallelism does: Evaluating pure values in parallel. As opposed to pure values, order usually matters for actions with side-effects. Running actions simultaneously is called concurrency.
As an example, consider the two actions putStr "foo" and putStr "bar". Depending on the order in which those two actions are executed, the output is either "foobar", "barfoo", or some interleaving of the two. The output is nondeterministic because it depends on the specific order of execution.
As another example, consider the two values sum [1..10] and 5 * 3. Regardless of the order in which those two get evaluated, they always reduce to the same results. This determinism is something you can usually only guarantee with pure values.
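The contrast between the two examples can be checked directly. This Python sketch is only an analogy for the Haskell behaviour: appending to a list stands in for putStr, and the order is chosen explicitly instead of being left to a scheduler.

```python
# Two pure computations, matching sum [1..10] and 5 * 3.
PURE = {
    "sum": lambda: sum(range(1, 11)),
    "product": lambda: 5 * 3,
}


def pure_results(order):
    # Evaluate the pure values in the given order.
    values = {}
    for name in order:
        values[name] = PURE[name]()
    return values


def run_actions(order):
    # Appending to the output stands in for the side effect of putStr.
    output = []
    for text in order:
        output.append(text)
    return "".join(output)


same_either_way = pure_results(["sum", "product"]) == pure_results(["product", "sum"])
first_order = run_actions(["foo", "bar"])
second_order = run_actions(["bar", "foo"])
```

Swapping the evaluation order leaves the pure results untouched but changes the observable output of the actions.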
Concurrency and parallelism are two different things.
Concurrency means that you have multiple threads interacting non-deterministically. For example, you might have a chat server where each client is handled by one thread. The non-determinism is essential to the system you're trying to model.
Parallelism is about using multiple threads for simply making your program run faster. However, the end result should be exactly the same as if you run the algorithm sequentially.
Many languages don't have primitives for parallelism, so you have to implement it using concurrency primitives like threads and locks. However, this means that you the programmer have to be careful to ensure that you don't accidentally introduce unwanted non-determinism or other concurrency issues. With explicit parallelism primitives like par and pseq, many of these concerns simply go away.

What does it mean for something to "compose well"?

Many times, I've come across statements of the form
X does/doesn't compose well.
I can remember a few instances that I've read recently:
Macros don't compose well (context: clojure)
Locks don't compose well (context: clojure)
Imperative programming doesn't compose well... etc.
I want to understand the implications of composability in terms of designing/reading/writing code ? Examples would be nice.
"Composing" functions basically just means sticking two or more functions together to make a big function that combines their functionality in a useful way. Essentially, you define a sequence of functions and pipe the results of each one into the next, finally giving the result of the whole process. Clojure provides the comp function to do this for you, you could do it by hand too.
Functions that you can chain with other functions in creative ways are more useful in general than functions that you can only call in certain conditions. For example, if we didn't have the last function and only had the traditional Lisp list functions, we could easily define last as (def last (comp first reverse)). Look at that — we didn't even need to defn or mention any arguments, because we're just piping the result of one function into another. This would not work if, for example, reverse took the imperative route of modifying the sequence in-place. Macros are problematic as well because you can't pass them to functions like comp or apply.
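The same idea can be written in Python. Here comp, first and reverse are hypothetical helpers defined for the example and are only modelled on Clojure's functions. The in-place list.reverse method shows why the imperative variant does not compose: it returns None instead of a sequence.

```python
from functools import reduce


def comp(*fns):
    """Compose right to left, so comp(f, g)(x) == f(g(x))."""
    def composed(x):
        return reduce(lambda acc, fn: fn(acc), reversed(fns), x)
    return composed


def first(seq):
    return seq[0]


def reverse(seq):
    return seq[::-1]  # returns a new sequence and leaves the input alone


last = comp(first, reverse)

# list.reverse mutates in place and returns None, so piping its result
# into first fails.
broken_last = comp(first, list.reverse)
try:
    broken_last([1, 2, 3])
    in_place_composes = True
except TypeError:
    in_place_composes = False
```

last([1, 2, 3]) yields 3, while the in-place version raises a TypeError because first receives None.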
Composition in programming means assembling bigger pieces out of smaller ones.
Composition of unary functions creates a more complicated unary function by chaining simpler ones.
Composition of control flow constructs places control flow constructs inside other control flow constructs.
Composition of data structures combines multiple simpler data structures into a more complicated one.
Ideally, a composed unit works like a basic unit and you as a programmer do not need to be aware of the difference. If things fall short of the ideal, if something doesn't compose well, your composed program may not have the (intended) combined behavior of its individual pieces.
Suppose I have some simple C code.
void run_with_resource(void) {
    Resource *r = create_resource();
    do_some_work(r);
    destroy_resource(r);
}
C facilitates compositional reasoning about control flow at the level of functions. I don't have to care about what actually happens inside do_some_work(); I know just by looking at this small function that every time a resource is created on line 2 with create_resource(), it will eventually be destroyed on line 4 by destroy_resource().
Well, not quite. What if create_resource() acquires a lock and destroy_resource() frees it? Then I have to worry about whether do_some_work acquires the same lock, which would prevent the function from finishing. What if do_some_work() calls longjmp(), and skips the end of my function entirely? Until I know what goes on in do_some_work(), I won't be able to predict the control flow of my function. We no longer have compositionality: we can no longer decompose the program into parts, reason about the parts independently, and carry our conclusions back to the whole program. This makes designing and debugging much harder and it's why people care whether something composes well.
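The lock scenario can be reproduced in a few lines. This is a Python sketch rather than C, and every function in it is a hypothetical stand-in with the same shape as the code above. The second worker uses a non-blocking acquire so the example reports the problem instead of hanging; a blocking acquire at that point would never return.

```python
import threading

resource_lock = threading.Lock()  # a plain, non-reentrant lock


def create_resource():
    resource_lock.acquire()
    return {"open": True}


def destroy_resource(r):
    r["open"] = False
    resource_lock.release()


def do_some_work_plain(r):
    return "done"


def do_some_work_locking(r):
    # The caller already holds resource_lock, so this acquire cannot succeed.
    if not resource_lock.acquire(blocking=False):
        return "would deadlock"
    resource_lock.release()
    return "done"


def run_with_resource(do_some_work):
    r = create_resource()
    try:
        return do_some_work(r)
    finally:
        destroy_resource(r)


plain = run_with_resource(do_some_work_plain)
locking = run_with_resource(do_some_work_locking)
```

Nothing in the signature of do_some_work_locking reveals that it touches the lock, which is exactly why run_with_resource cannot be reasoned about on its own.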
"Bang for the Buck" - composing well implies a high ratio of expressiveness per rule-of-composition. Each macro introduces its own rules of composition. Each custom data structure does the same. Functions, especially those using general data structures have far fewer rules.
Assignment and other side effects, especially wrt concurrency have even more rules.
Think about when you write functions or methods. You create a group of functionality to do a specific task. When working in an Object Oriented language you cluster your behavior around the actions you think a distinct entity in the system will perform. Functional programs break away from this by encouraging authors to group functionality according to an abstraction. For example, the Clojure Ring library comprises a group of abstractions that cover routing in web applications.
Ring is composable in that functions that describe paths in the system (routes) can be grouped into higher-order functions (middleware). In fact, Clojure is so dynamic that it is possible (and you are encouraged) to come up with patterns of routes that can be dynamically created at runtime. This is the essence of composability: instead of coming up with patterns that solve a certain problem, you focus on patterns that generate solutions to a certain class of problem. Builders and code generators are just two of the common patterns used in functional programming. Functional programming is the art of patterns that generate other patterns (and so on and so on).
The idea is to solve a problem at its most basic level then come up with patterns or groups of the lowest level functions that solve the problem. Once you start to see patterns in the lowest level you've discovered composition. As folks discover second order patterns in groups of functions they may start to see a third level. And so on...
Composition (in the context you describe at a functional level) is typically the ability to feed one function into another cleanly and without intermediate processing. Such an example of composition is in std::cout in C++:
cout << each << item << links << on;
That is a simple example of composition which doesn't really "look" like composition.
Another example with a form more visibly compositional:
foo(bar(baz()));
Composition is useful for readability and compactness, however chaining large collections of interlocking functions which can potentially return error codes or junk data can be hazardous (this is why it is best to minimize error code or null return values.)
Provided your functions use exceptions, or alternatively return null objects you can minimize the requirement for branching (if) on errors and maximize the compositional potential of your code at no extra risk.
Object composition (vs inheritance) is a separate issue (and not what you are asking, but it shares the name). It is one of containment to derive object hierarchy as opposed to direct inheritance.
Within the context of clojure, this comment addresses certain aspects of composability. In general, it seems to emerge when units of the system do one thing well, do not require other units to understand its internals, eschew side-effects, and accept and return the system's pervasive data structures. All of the above can be seen in M2tM's C++ example.
composability, applied to functions, means that the functions are smaller and well-defined, thus easy to integrate into other functions (i have seen this idea in the book "the joy of clojure")
the concept can apply to other things that are supposed be composed into something else.
the purpose of composability is reuse. for example, a well-built (composable) function is easier to reuse
macros aren't that composable because you can't pass them as parameters
locks are crap because you can't really give them names (define them well) or reuse them. you just do them in place
imperative languages aren't that composable because (some of them, at least) don't have closures. if you want functionality passed as parameter, you're screwed. you have to build an object and pass that; disclaimer here: this last idea i'm not entirely convinced is true, therefore research more before taking it for granted
another idea on imperative languages is that they don't compose well because they imply state (from wikipedia knowledgebase :) "Imperative programming - describes computation in terms of statements that change a program state").
state does not compose well because although you have given a specific "something" in input, that "something" generates an output according to its state. different internal state, different behaviour. and thus you can say good-bye to what you were expecting to happen.
with state, you depend too much on knowing what the current state of an object is... if you want to predict its behavior. more stuff to keep in the back of your mind, less composable (remember well-defined ? or "small and simple", as in "easy to use" ?)
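a tiny python sketch of that point, with made-up function names: the stateful version gives different answers for the same input depending on how often it was called before, while the pure one depends only on its arguments.

```python
def make_stateful_scaler():
    calls = 0  # hidden internal state

    def scale(x):
        nonlocal calls
        calls += 1
        return x * calls  # the result depends on the call history

    return scale


def pure_scale(x, factor):
    # The result depends only on the arguments.
    return x * factor


scale = make_stateful_scaler()
first_call = scale(10)
second_call = scale(10)
```

same input, two different outputs from scale. pure_scale(10, 2) gives 20 no matter what ran before it.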
ps: thinking of learning clojure, huh ? investigating... ? good for you ! :P
