Why doesn't a swap / exchange operator exist in imperative or OO languages like C/C++/C#/Java...? - programming-languages

I was always wondering why such a simple and basic operation like swapping the contents of two variables is not built-in for many languages.
It is one of the most basic programming exercises in computer science classes; it is heavily used in many algorithms (e.g. sorting); every now and then one needs it and one must use a temporary variable or use a template/generic function.
It is even a basic machine instruction on many processors, so that the standard scheme with a temporary variable will get optimized.
Many less obvious operators have been created, like the assignment operators (e.g. +=, which was probably created for reflecting the cumulative machine instructions, e.g. add ax,bx), or the ?? operator in C#.
So, what is the reason? Or does it actually exist, and I always missed it?

In my experience, it isn't that commonly needed in actual applications, apart from the already-mentioned sort algorithms and occasionally in low level hardware poking, so in my view it's a bit too special-purpose to have in a general-purpose language.
As has also been mentioned, not all processors support it as an instruction (and many do not support it for objects bigger than a word). So if it were supported with some useful additional semantics (e.g. being an atomic operation) it would be difficult to support on some processors, and if it didn't have the additional semantics then it's just (seldom used) synatatic sugar.
The assigment operators (+= etc) were supported because these are much more common in real-world programs - and so the syntacic sugar they provide was more useful, and also as an optimisation - remember C dates from the late 60s/early 70s, and compiler optimisation wasn't as advanced (and the machines less capable, so you didn't want lengthy optimisation passes anyway).
Paul

C++ does have swapping.
#include <algorithm>
#include <cassert>
int
main()
{
using std::swap;
int a(3), b(5);
swap(a, b);
assert(a == 5 && b == 3);
}
Furthermore, you can specialise swap for custom types too!

It's a widely used example in computer science courses, but I almost never find myself needing it in real code - whereas I use += very frequently.
Yes, in sorting it would be handy - but you don't tend to need to implement sorting yourself, so the number of actual uses in source code would still be pretty low.

You do have the XOR operator that does a variable substitution for primitive type...

I think they just forgot to add it :-) Yes, not all CPUs have this kind of instructions, so what ? We have bunch of other things that most CPUs don't have instructions to compute. It would be much easier/clearer and also faster ( by intrinsic ) if we had it !!!

Related

Are floating point operations deterministic when running in multiple threads?

Suppose I have a function that runs calculations, example being something like a dot product - I pass in an arrays A, B of vectors and a float array C, and the functions assigns:
C[i] = dot(A[i], B[i]);
If I create and start two threads that will run this function, and pass in the same three arrays to both the threads, under what circumstances is this type of action (perhaps using a different non-random mathematical operation etc.) not guaranteed give the same result (running the same application without any recompilation, and on the same machine)? I'm only interested in the context of a consumer PC.
I know that float operations are in general deterministic, but I do wonder whether perhaps something weird could happen and maybe on one thread the calculations will use an intermediate 80 bit register, but not in the other.
I would assume it's pretty much guaranteed the same binary code should run in both threads (is there some way this could not happen? The function being compiled multiple times for some reason, the compiler somehow figuring out it will run in multiple threads, and compiling it again, for some reason, for the second thread?).
But I'm a a bit more worried that CPU cores might not have the same instruction sets, even on consumer level PCs.
Side question - what about GPUs in a similar scenario?
//
I'm assuming x86_64, Windows, c++, and dot is a.x * b.x + a.y * b.y. Can't give more info than that - using Unity IL2CPP, don't know how it compiles/with what options.
Motivation for the question: I'm writing a computational geometry procedure that modifies a mesh - I'll call this the "geometric mesh". The issue is that it could happen that the "rendering mesh" has multiple vertices for certain geometric positions - it's needed for flat shading for example - you have multiple vertices with different normals. However, the actual computational geometry procedure only uses purely geometric data of the positions in space.
So I see two options:
Create a map from the rendering mesh to the geometric mesh (example - duplicate vertices being mapped to one unique vertex), run the procedure on the geometric mesh, then somehow modify the rendering mesh based on the result.
Work with the rendering mesh directly. Slightly more inefficient as the procedure does calculations for all vertices, but much easier from a code perspective. But most of all I'm a bit worried that I could get two different values for two vertices that actually have the same position and that shouldn't happen. Only the position is used, and the position would be the same for both such vertices.
Floating point (FP) operations are not associative (but it is commutative). As a result, (x+y)+z can give different results than x+(y+z). For example, (1e-13 + (1 - 1e-13)) == ((1e-13 + 1) - 1e-13) is false with 64-bit IEEE-754 floats. The C++ standard is not very restrictive about floating-point numbers. However, the widely-used IEEE-754 standard is. It specifies the precision of 32-bit and 64-bit number operations, including rounding modes. x86-64 processors are IEEE-754 compliant and mainstream compilers (eg. GCC, Clang and MSVC) are also IEEE-754 compliant by default. ICC is not compliant by default since it assumes the FP operations are associative for the sake of performance. Mainstream compilers have compilation flags to make such assumption so to speed up codes. It is generally combined with other ones like the assumption that all FP values are not NaN (eg. -ffast-math). Such flags break the IEEE-754 compliance, but they are often used in the 3D or video game industry so to speed up codes. IEEE-754 is not required by the C++ standard, but you can check this with std::numeric_limits<T>::is_iec559.
Threads can have different rounding modes by default. However, you can set the rounding mode using the C code provided in this answer. Also, please note that denormal numbers are sometimes disabled on some platforms because of their very-high overhead (see this for more information).
Assuming the IEEE-754 compliance is not broken, the rounding mode is the same and the threads does the operations in the same order, then the result should be identical up to at least 1 ULP. In practice, if they are compiled using a same mainstream compiler, the result should be exactly the same.
The thing is using multiple threads often result in a non-deterministic order of the applied FP operations which causes non-deterministic results. More specifically, atomic operations on FP variables often cause such an issue because the order of the operations often changes at runtime. If you want deterministic results, you need to use a static partitioning, avoid atomic operations on FP variables or more generally atomic operations that could result in a different ordering. The same thing applies for locks or any synchronization mechanisms.
The same thing is true for GPUs. In fact, such problem is very frequent when developers use atomic FP operations for example to sum values. They often do that because implementing fast reductions is complex (though it is more deterministic) and atomic operations as pretty fast on modern GPUs (since they use dedicated efficient units).
According to the accepted answer to floating point processor non-determinism?, C++ floating point is not non-deterministic. The same sequence of instructions will give the same results.
There are a few things to take into account though:
Firstly, the behavior (i.e. the result) of a particular piece of C++ source code doing a FP calculation may depend on the compiler and the chosen compiler options. For example, it may depend on whether the compiler chooses to emit 64 or 80 bit FP instructions. But this is deterministic.
Secondly, similar C++ source code may give different results; e.g. due to non-associative behavior of certain FP instructions. This also is deterministic.
Determinism won't be affected by multi-threading by default. The C++ compiler will probably be unaware of whether the code is multi-threaded or not. And it definitely has no reason to emit different FP code.
Admittedly, FP behavior depends on the rounding mode selected, and that can be set on a per-thread basis. However, for this to happen, something (application code) would have to explicitly set different rounding modes for different threads. Once again, that is deterministic. (And a pretty daft thing for the application code to do, IMO.)
The idea that a PC would would use different FP hardware with different behavior for different threads seems far-fetched to me. Sure a PC could have (say) an Intel chipset and an ARM chipset, but it is not plausible that different threads of the same C++ application (executable) would simultaneously run on both chipsets.
Likewise for GPUs. Indeed, given that you need to program GPUs in a way that is radically different to ordinary (or threaded) C++, I would doubt that they could even share the same source code.
In short, I think that you are worrying about a hypothetical problem that you are unlikely to encounter in reality ... given the current state of the art in hardware and C++ compilers.

Equivalent terms of LLVM IR for watermarking by renumbering?

I want to apply an algorithm for watermarking that basically reorders equivalent terms of a programming language:
https://books.google.dk/books?id=mig-bH3u0Z0C&pg=PT595&lpg=PT595&dq=obfuscation+renumbering+register&source=bl&ots=b3vMhp-yTq&sig=RERdnDNewRqBi7ZmSNMlsnPy-Hw&hl=da&sa=X&ved=0ahUKEwiLw-zWrpnSAhWEHJoKHXCpAkMQ6AEIGTAA#v=onepage&q=obfuscation%20renumbering%20register&f=false
Say, T1, T2,...,Tn are equivalent terms of the language, then the watermark is a permutation f such that f(Ti) = Tj.
In this case the programming language is LLVM IR, which is an intermediate language.
The book gives an example of renumbering registers by applying a permutation. However, registers are not in the scope of LLVM IR, since they are a lower-level detail?
I've been thinking of equivalent terms of LLVM, but cannot come up with some. The more the better, since this means a more flexible degree of watermarking.
Can you think of equivalent terms of LLVM IR such that each could be substituted for some other? Or is it only possible to do such watermarking at the machine code level?
Even if you perform this at the IR level (and you could by changing patterns), you won't get far since the machine instruction level would reshuffle everything. You better off writing a (potentially post-RA) machine instruction level pass.

Parallelism in functional languages

One of FP features advertised is that a program is "parallel by default" and that naturally fits modern multi-core processors. Indeed, reducing a tree is parallel by its nature. However, I don't understand how it maps to multi-threading. Consider the following fragment (pseudo code):
let y = read-and-parse-a-number-from-console
let x = get-integer-from-web-service-call
let r = 5 * x - y * 4
write-r-to-file
How can a translator determine which of tree branches should be run on a thread? After you obtained x or y it would be stupid to reduce 5 * x or y * 4 expressions on a separate thread (even if we grab it from a thread pool), wouldn't it? So how different functional languages handle this?
We're not quite there yet.
Programs in pure declarative style (functional style is included in this category, but so are some other styles) tend to be much more amenable to parallelisation, because all data dependencies are explicit. This makes it very easy for the programmer to manually use primitives the language provides for specifying that two independent computations should be done in parallel, regardless of whether they share access to any data; if everything's immutable and there are no side effects, then changing the order in which things are done can't affect the result.
If purity is enforced by the language (as in Haskell, Mercury, etc, but unlike in Scala, F#, etc where purity is encouraged but unenforced), then it is possible for the compiler to attempt to automatically parallelise the program, but no existing language that I know of does this by default. If the language allows unchecked impure operations then it's generally impossible for the compiler to do the analysis necessary to prove that a given attempt to auto-parallelise the program is valid. So I do not expect any such language to ever support auto-parallelisation very effectively.
Note that the pseudo program you wrote is probably not pure declarative code. let y = read-and-parse-a-number-from-console and let x = get-integer-from-web-service-call are calculating x and y from impure external operations, and there's nothing in the program that fixes the order in which they should run. It's possible in general that executing two impure operations in either order will produce different results, and running those two operations in different threads gives up control of the order in which they run. So if a language like that was to auto-parallelise your program, it would almost certainly either introduce horrible concurrency bugs, or refuse to significantly parallelise anything.
However the functional style still makes it easy to manually parallelise such programs. A human programmer can tell that it almost certainly doesn't matter in which order you read from the console and the network. Knowing that there's no shared mutable state can decide to run those two operations in parallel without digging into their implementations (which you'd have to do in imperative algorithms where there might be mutable shared state even if it doesn't look like there is from the interface).
But the big complication that's in the way of auto-parallelising compilers for enforced-purity languages is knowing how much parallelisation to do. Running every computation possible in parallel vastly overwhelms any possible benefit with all the startup cost of spawning new threads (not to mention the context switches), as you try to run huge numbers of very short-lived threads on a small number of processors. The compiler needs to identify a much smaller number of "chunks" of computation that are reasonably large, and run the chunks in parallel while running the sub-computations of each chunk sequentially.
But only "embarrassingly parallel" programs decompose nicely into very large completely independent computations. Most programs are much more interdependent. So unless you only want to be able to auto-parallelise programs that are very easy to manually parallelise, your auto-parallelisation probably needs to be able to identify and run "chunks" in parallel which are partially dependent on each other, having them wait when they get to points that really need a result that's supposed to be computed by another "chunk". This introduces extra overhead of synchronisation between the threads, so the logic that chooses what to run in parallel needs to be even better in order to beat the trivial strategy of just running everything sequentially.
The developers of Mercury (a pure logical programming language) are working on various methods of tackling these problem, from static analysis to using profiling data. If you're interested, their research papers have a lot more information. I presume other researches are working on this area in other languages, but I don't know much about any other projects.
In that specific example, the third statement depends on the first and the second, but there is no interdependency between the first and the second. Therefore, a runtime environment could execute read-and-parse-a-number-from-console on a different thread from get-integer-from-web-service-call, but the execution of the third statement would have to wait until the first two are finished.
Some languages or runtime environments may be able to calculate a partial result (such as y * 4) before obtaining an actual value of x. As a high level programmer though, you would be unlikely to be able to detect this.

A language in which everything compiles

I'm trying to do some research for a new project, and I need to create objects dynamically from random data.
For this to work, I need a language / compiler that doesn't have problems with weird uncompilable code lying around.
Basically, I need the random code to compile (or be interpreted) as much as possible - Meaning that the uncompilable parts will be ignored, and only the compilable parts will create the objects (which could be ran).
Object Oriented-ness is not a must, but is a very strong advantage.
I thought of ASM, but it's very messy, and I'd probably need a more readable code
Thanks!
It sounds like you're doing something very much like genetic programming; even if you aren't, GP has to solve some of the same problems—using randomness to generate valid programs. The approach to this that is typically used is to work with a syntax tree: rather than storing x + y * 3 - 2, you store something like the following:
Then, instead of randomly changing the syntax, one can randomly change nodes in the tree instead. And if x should randomly change to, say, +, you can statically know that this means you need to insert two children (or not, depending on how you define +).
A good choice for a language to work with for this would be any Lisp dialect. In a Lisp, the above program would be written (- (+ x (* y 3)) 2), which is just a linearization of the syntax tree using parentheses to show depth. And in fact, Lisps expose this feature: you can just as easily work with the object '(- (+ x (* y 3)) 2) (note the leading quote). This is a three-element list, whose first element is -, second element is another list, and third element is 2. And, though you might or might not want it for your particular application, there's an eval function, such that (eval '(- (+ x (* y 3)) 2)) will take in the given list, treat it as a Lisp syntax tree/program, and evaluate it. This is what makes Lisps so attractive for doing this sort of work; Lisp syntax is basically a reification of the syntax-tree, and if you operate at the syntax-tree level, you can work on code as though it was a value. Lisp won't help you read /dev/random as a program directly, but with a little interpretation layered on top, you should be able to get what you want.
I should also mention, though I don't know anything about it (not that I know much about ordinary genetic programming) the existence of linear genetic programming. This is sort of like the assembly model that you mentioned—a linear stream of very, very simple instructions. The advantage here would seem to be that if you are working with /dev/random or something like it, the amount of interpretation needed is very small; the disadvantage would be, as you mentioned, the low-level nature of the code.
I'm not sure if this is what you're looking for, but any programming language can be made to function this way. For any programming language P, define the language Palways as follows:
If p is a valid program in P, then p is a valid program in Palways whose meaning is the same as its meaning in P.
If p is not a valid program in P, then p is a valid program in Palways whose meaning is the same as a program that immediately terminates.
For example, I could make the language C++always so that this program:
#include <iostream>
using namespace std;
int main() {
cout << "Hello, world!" << endl;
}
would compile as "Hello, world!", while this program:
Hahaha! This isn't legal C++ code!
Would be a legal program that just does absolutely nothing.
To solve your original problem, just take any OOP language like Java, Smalltalk, etc. and construct the appropriate Javaalways, Smalltalkalways, etc. language from it. Again, I'm not sure if this is at all what you're looking for, but it could be done very easily.
Alternatively, consider finding a grammar for any OOP language and then using that grammar to produce random syntactically valid programs. You could then filter those programs down by using the Palways programming language for that language to eliminate syntactically but not semantically valid programs.
Divide the ASCII byte values into 9 classes (division modulo 9 would help). Then assign then to Brainfuck codewords (see http://en.wikipedia.org/wiki/Brainfuck). Then interpret as Brainfuck.
There you go, any sequence of ASCII characters is a program. Not that it's going to do anything sensible... This approach has a much better chance, compared to templatetypedef's answer, to get a nontrivial program from a random byte sequence.
Text Editors
You could try feeding random character strings to an editor like Emacs or VI. Many (most?) characters will perform an editing action but some will do nothing (other than beep, perhaps). You would have to ensure that the random code mutator never generates the character sequence that exits the editor. However, this experience would be much like programming a Turing machine -- the code is not too readable.
Mathematica
In Mathematica, undefined symbols and other expressions evaluate to themselves, without error. So, that language might be a viable choice if you can arrange for the random code mutator to always generate well-formed expressions. This would be readily achievable since the basic Mathematica syntax is trivial, making it easy to operate on syntactic units rather than at the character level. It would be even easier if the mutator were written in Mathematica itself since expression-munging is Mathematica's forte. You could define a mini-language of valid operations within a Mathematica package that does not import the system-defined symbols. This would allow you to generate well-formed expressions to your heart's content without fear of generating a dangerous expression, like DeleteFile[FileNames["*.*", "/", Infinity]].
I believe Common Lisp should suit your needs. I always have some code in my SLIME/Emacs session that wouldn't compile. You can always tweak things, redefine functions in run-time. It is actually very good for prototyping.
A few years ago it took me quite a while to learn. But nowadays we have quicklisp and everything is so much easier.
Here I describe my development environment:
Install lisp on my linux machine
PS: I want to give an example, where Common Lisp was useful for me:
Up to maybe 2004 I used to write small programs in C (the keep it simple Unix way).
The last 3 years I had to get lots of different hardware running. Motorized stages, scientific cameras, IO cards.
The cameras turned out to be quite annoying. Usually you have to cool them down to -50 degree celsius or so and (in some SDKs) they don't like it when you close them. But this
is exactly how my C development cycle worked: write (30s), compile (1s), run (0.1s), repeat.
Eventually I decided to just use Common Lisp. Often it is straight forward to define the foreign function interfaces to talk to the SDKs and I can do this without ever leaving the running Lisp image. I start the editor in the morning define the open-device function, to talk to the device and after 3 hours I have enough of the functions implemented to set gain, temperature, region of interest and obtain the video.
Then I can often put the SDK manual away and just use the camera.
I used the same interactive programming approach when I have to parse some webpage or some weird XML.

Is there a language with native pass-by-reference/pass-by-name semantics, which could be used in modern production applications?

This is a reopened question.
I look for a language and supporting platform for it, where the language could have pass-by-reference or pass-by-name semantics by default. I know the history a little, that there were Algol, Fortran and there still is C++ which could make it possible; but, basically, what I look for is something more modern and where the mentioned value pass methodology is preferred and by default (implicitly assumed).
I ask this question, because, to my mind, some of the advantages of pass-by-ref/name seem kind of obvious. For example when it is used in a standalone agent, where copyiong of values is not necessary (to some extent) and performance wouldn't be downgraded much in that case. So, I could employ it in e.g. rich client app or some game-style or standalone service-kind application.
The main advantage to me is the clear separation between identity of a symbol, and its current value. I mean when there is no reduntant copying, you know that you're working with the exact symbol/path you have queried/received. And intristing boxing of values will not interfere with the actual logic of program.
I know that there is C# ref keyword, but it's something not so intristic, though acceptable. Equally, I realize that pass-by-reference semantics could be simulated in virtually any language (Java as an instant example) and so on.. not sure about pass by name :)
What would you recommend - create a something like DSL for such needs wherever it be appropriate; or use some languages that I already know? Maybe, there is something that I'm missing?
Thank you!
UPDATE: Currently, I think that Haskell would be appropriate. But I didn't investigate much, so I think I'll update this text later.
Scala provides very flexible parameter passing semantics including real call-by-name:
def whileLoop(cond: => Boolean)(body: => Unit) {
if (cond) {
body
whileLoop(cond)(body)
}
}
And it really works
var i = 10
whileLoop (i > 0) {
println(i)
i -= 1
}
Technical details:
Though all parameters are passed by value (and these are usually references) much like Java, the notation => Type will make Scala generate the required closures automatically in order to emulate call-by-name.
Note that there is lazy evaluation too.
lazy val future = evalFunc()
The interesting thing is that you have consistent strict call-by-value semantics but can punctually change these where you really need to - nearly without any syntactic overhead.
Haskell has call-by-need as its default (and indeed only) evaluation strategy.
Now, you asked for two things: call-by-name and modern. Well, Haskell is a pure language and in a pure language call-by-name and call-by-need are semantically the same thing, or more precisely they always have the same result, the only difference being that call-by-need is usually faster and at worst only a constant factor slower than call-by-name. And Haskell surely is a modern language: it is merely 23 years old and in many of its features it is actually 10 years ahead of many languages that were created just recently.
The other thing you asked about is call-by-reference. Again, in a pure language, call-by-value and call-by-reference are the same thing, except that the latter is faster. (Which is why, even though most functional languages are usually described as being call-by-value, they actually implement call-by-reference.)
Now, call-by-name (and by extension call-by-need) are not the same thing as call-by-value (and by extension call-by-reference), because call-by-name may return a result in cases where call-by-value doesn't terminate.
However, in all cases where call-by-value or call-by-reference terminates, in a pure language, call-by-value, call-by-reference, call-by-name and call-by-need are the same thing. And in cases where they are not the same thing, call-by-name and call-by-need are in some sense "better", because they give you an answer in cases where call-by-value and call-by-reference would basically have run into an infinite loop.
Ergo, Haskell is your answer. Although probably not the one you were looking for :-)
Pass-by name is rare nowadays. However, you can simulate it in most functional programming languages using a lambda-nill:
// Pass by value
(dosomething (random))
// Pass by name hack
(dosomething (lambda () (random)))
Other then that: ML and O'CaML has a distinction between pass-by-value (default), pass-by-ref (using ref variables) and of course using lambdas. However, I'm not sure either of them qualifies as a "modern" language.
I'm not quite following your reasoning for why C#'s ref and out modifiers aren't "intrinsic." Seems to me that it provides almost exactly what you're looking for: A modern language and environment that supports pass-by-value and pass-by-reference. (As Little Bobby Tables pointed out, pass-by-name is very rare these days, you're better off with a lambda/closure.)
AFAIK, modern Fortran is pass-by-reference (preserving compatibility with ye olde FORTRAN).
Modern Fortran has all the niceties you expect of a modular language, so you can build just fine systems in it. Nobody does, because "Fortran is passe" and everybody wants to code in C# "because its cool.".
In Java, all objects are passed by reference.

Resources