I am trying to find my feet in lock-free programming. Having read different explanations of memory ordering semantics, I would like to clear up what reordering may actually happen. As far as I understand, instructions may be reordered by the compiler (as an optimization when the program is compiled) and by the CPU (at runtime?).
For relaxed semantics, cppreference provides the following example:
// Thread 1:
r1 = y.load(memory_order_relaxed); // A
x.store(r1, memory_order_relaxed); // B
// Thread 2:
r2 = x.load(memory_order_relaxed); // C
y.store(42, memory_order_relaxed); // D
It is said that with x and y initially zero the code is allowed to produce r1 == r2 == 42 because, although A is sequenced-before B within thread 1 and C is sequenced before D within thread 2, nothing prevents D from appearing before A in the modification order of y, and B from appearing before C in the modification order of x. How could that happen? Does it imply that C and D get reordered, so the execution order would be DABC? Is it allowed to reorder A and B?
For the acquire-release semantics there is the following sample code:
std::atomic<std::string*> ptr;
int data;
void producer()
{
std::string* p = new std::string("Hello");
data = 42;
ptr.store(p, std::memory_order_release);
}
void consumer()
{
std::string* p2;
while (!(p2 = ptr.load(std::memory_order_acquire)))
;
assert(*p2 == "Hello"); // never fires
assert(data == 42); // never fires
}
I'm wondering: what if we used relaxed memory order instead of acquire? I guess the value of data could be read before p2 = ptr.load(std::memory_order_relaxed), but what about p2?
Finally, why is it fine to use relaxed memory order in this case?
template<typename T>
class stack
{
std::atomic<node<T>*> head;
public:
void push(const T& data)
{
node<T>* new_node = new node<T>(data);
// put the current value of head into new_node->next
new_node->next = head.load(std::memory_order_relaxed);
// now make new_node the new head, but if the head
// is no longer what's stored in new_node->next
// (some other thread must have inserted a node just now)
// then put that new head into new_node->next and try again
while(!head.compare_exchange_weak(new_node->next, new_node,
std::memory_order_release,
std::memory_order_relaxed))
; // the body of the loop is empty
}
};
I mean both head.load(std::memory_order_relaxed) and head.compare_exchange_weak(new_node->next, new_node, std::memory_order_release, std::memory_order_relaxed).
To summarize all of the above, my question is essentially: when do I have to care about potential reordering, and when don't I?
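For #1, the compiler may issue the store to y before the load from x (there is no dependency between them), and even if it doesn't, the load from x can be delayed at the CPU/memory level. So yes, C and D can effectively swap, which gives an order like D, A, B, C. In practice A and B cannot swap the same way, because B stores the value that A loaded.
For #2, p2 would be non-null (the loop only exits once it is), but neither *p2 nor data would necessarily hold a meaningful value.
For #3, there is only one act of publishing the non-atomic stores made by this thread, and it is a release: the successful compare_exchange_weak. The relaxed load and the relaxed failure order only fetch a candidate value for head, which the compare-exchange re-checks anyway.
You should always care about reordering, or, better, not assume any order: neither C++ nor the hardware executes code strictly top to bottom; they only respect dependencies.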
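To make the answer to #2 concrete, here is a minimal sketch of the same publish/consume pattern with a plain int instead of a string. The names payload, ready and run_handoff are made up for illustration; they are not from the question or from cppreference.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Illustrative names only: payload, ready and run_handoff are assumptions.
int payload = 0;                  // ordinary, non-atomic data
std::atomic<bool> ready{false};   // flag used to publish payload

int run_handoff()
{
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread consumer([&seen] {
        // Acquire load: once it reads true, every write the producer made
        // before its release store is guaranteed to be visible here.
        while (!ready.load(std::memory_order_acquire))
            ;
        seen = payload;
    });

    std::thread producer([] {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // publish it
    });

    producer.join();
    consumer.join();
    assert(seen == 42);   // cannot fire with the acquire/release pair
    return seen;
}
```

If the load in the consumer were memory_order_relaxed, the loop would still terminate once the flag became true, but the read of payload would no longer be ordered after the producer's write, so it would be a data race.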
I have read a lot of articles about PTHREAD_MUTEX_INITIALIZER and I understand what it does, but I am unable to understand how it does it. How can a macro be used to initialize a variable just by assigning its name to that variable?
What I know about macros is that they can be used just as functions, such as:
#define MAX(a, b) ((a) > (b) ? (a) : (b))
Now we can use this macro like a function: MAX(a, b).
But how can we write a macro that can be used in the way which PTHREAD_MUTEX_INITIALIZER is used like:
int x = Macro_Name;
Then x will be initialized to a specific value (like when a mutex is initialized once PTHREAD_MUTEX_INITIALIZER is assigned to it).
Here is a snippet from the source code of libpthread, taken from http://git.savannah.gnu.org/cgit/hurd/libpthread.git/tree/sysdeps/pthread/bits/types/struct___pthread_mutex.h (I only removed comments that are irrelevant to the question)
/* User visible part of a mutex. */
struct __pthread_mutex
{
__pthread_spinlock_t __held;
__pthread_spinlock_t __lock;
char *__cthreadscompat1;
struct __pthread *__queue;
struct __pthread_mutexattr *__attr;
void *__data;
void *__owner;
unsigned __locks;
};
# define __PTHREAD_MUTEX_INITIALIZER \
{ __PTHREAD_SPIN_LOCK_INITIALIZER, __PTHREAD_SPIN_LOCK_INITIALIZER, 0, 0, 0, 0, 0, 0 }
From that, it can be seen that the macro hides an initializer list for the structure that represents the "user visible part of a mutex". Most members of the struct (including pointers) are set to 0, and internal spin locks are initialized with their own initializer macro, which is probably defined similarly.
Of course it's just one implementation, but I guess other implementations might have something similar.
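As a small illustration of the same mechanism, here is a sketch with a made-up struct and macro. toy_mutex, TOY_MUTEX_INITIALIZER and ANSWER are hypothetical names, not the real pthread definitions. The point is only that a macro without a parameter list is replaced by its text wherever its name appears.

```cpp
#include <cassert>

// Hypothetical type, for illustration only (not the real pthread struct).
struct toy_mutex
{
    int      locked;
    void*    owner;
    unsigned recursion;
};

// Object-like macros: no parameter list, so the bare name is replaced by
// the text on the right before the compiler proper sees the code.
#define TOY_MUTEX_INITIALIZER { 0, nullptr, 0 }
#define ANSWER 42

toy_mutex m = TOY_MUTEX_INITIALIZER;   // becomes: toy_mutex m = { 0, nullptr, 0 };
int x = ANSWER;                        // becomes: int x = 42;

void check_initialised()
{
    assert(m.locked == 0 && m.owner == nullptr && m.recursion == 0);
    assert(x == 42);
}
```

Nothing is "assigned" by the macro itself; the preprocessor pastes in an ordinary initializer and the compiler does the rest.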
When passing an array of structs to my kernel as an argument, I get weird values for the items after the first (array[1], array[2], etc). It seems to be an alignment issue maybe?
Here is the struct:
typedef struct Sphere
{
float3 color;
float3 position;
float3 reflectivity;
float radius;
int phong;
bool isReflective;
} Sphere;
Here is the host side init code:
cl::Buffer cl_spheres = cl::Buffer(context, CL_MEM_READ_ONLY, sizeof(Sphere) * MAX_SPHERES, NULL, &err);
err = queue.enqueueWriteBuffer(cl_spheres, CL_TRUE, 0, sizeof(Sphere) * MAX_SPHERES, spheres, NULL, &event);
err = kernel.setArg(3, cl_spheres);
What happens is that the color of the second Sphere in the array ends up holding the last value I set for color on the host side (s2 or z), an uninitialized zero value, and the first value I set for position on the host side (s0 or x). I noticed that the float3 data type actually still has a fourth component (s3) that does not get initialized; I think that is where the uninitialized zero comes from. So it seems to be an alignment issue, but I am at a loss as to how to fix it. I was hoping someone could shed some light on this problem. I have ensured that my struct definitions are exactly the same on both sides.
From the OpenCL 1.2 specs, section 6.11.1:
Note that the alignment of any given struct or union type is required
by the ISO C standard to be at least a perfect multiple of the lowest
common multiple of the alignments of all of the members of the struct
or union in question and must also be a power of two.
Also, cl_float3 is laid out as a cl_float4 (see section 6.1.5).
Finally, in section 6.9.k:
Arguments to kernel functions in a program cannot be declared with the
built-in scalar types bool, half, size_t, ptrdiff_t, intptr_t, and
uintptr_t or a struct and/or union that contain fields declared to be
one of these built-in scalar types.
To comply with these rules, and probably make accesses faster, you can try (OpenCL C side; on the host use cl_float4):
typedef struct Sphere
{
float4 color;
float4 position;
float4 reflectivity;
float4 radiusPhongReflective; // each value uses 1 float
} Sphere;
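On the host side, the layout of that struct can be checked at compile time. This is only a sketch: Float4 is a stand-in I made up so the example builds without the OpenCL headers (with them you would use cl_float4), and make_sphere is a hypothetical helper showing one way to pack the three scalars.

```cpp
#include <cassert>
#include <cstddef>

// Float4 is a hypothetical stand-in for cl_float4; it assumes the usual
// 16-byte size and 16-byte alignment of a four-float vector.
struct alignas(16) Float4 { float s[4]; };

struct Sphere
{
    Float4 color;
    Float4 position;
    Float4 reflectivity;
    Float4 radiusPhongReflective;  // s[0] = radius, s[1] = phong, s[2] = isReflective
};

static_assert(sizeof(Float4) == 16, "Float4 must be 16 bytes");
static_assert(sizeof(Sphere) == 64, "no hidden padding between members");
static_assert(offsetof(Sphere, position) == 16, "position starts at byte 16");

// Pack the three scalar fields into the last vector.
Sphere make_sphere(float radius, int phong, bool isReflective)
{
    assert(radius >= 0.0f);
    Sphere s{};  // zero every component, including the unused fourth lanes
    s.radiusPhongReflective.s[0] = radius;
    s.radiusPhongReflective.s[1] = static_cast<float>(phong);
    s.radiusPhongReflective.s[2] = isReflective ? 1.0f : 0.0f;
    return s;
}
```

With every member 16 bytes wide, element i of the array starts at byte 64 * i on both sides, so there is no padding for the host and device to disagree about.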
I am getting all kinds of errors when passing my array to this function. The function is supposed to have the user enter a name and a score and store them in 2 separate arrays, one for the names, one for the scores. I believe I have to use pointers but have no idea how to use them. I don't want the answer, just a push in the right direction. Here is the code:
#include <iostream>
int InputData(int &, char, int);
using namespace std;
int main()
{
char playerName[100][20];
int score[100];
int numPlayers = 0;
InputData(numPlayers, playerName, score);
return 0;
}
int InputData(int &numPlayers, char playerName[][20], int score[])
{
while (numPlayers <= 100)
{
cout << "Enter Player Name (Q to quit): ";
cin.getline(playerName, 100, ‘\n’);
if ((playerName[numPlayers] = 'Q') || (playerName[numPlayers] = 'q'))
return 0;
cout << "Enter score for " << playerName[numPlayers] <<": ";
cin >> score[numPlayers];
numPlayers++;
}
}
Ok, I made some more changes and the errors are less, must be getting close, Lol!
This looks like a school assignment and I applaud you for not asking for the answer. There are several ways to do it, but you are already fairly close in the approach that you are using. When you pass an array reference, you do not want to include the length of the array. For example, the parameter int score[100] should be int score[]. The exception, especially in your scenario, is with multidimensional arrays. In this case, you want to use char playerName[][20]. Your function declaration also needs to change to match. Don't forget InputData returns an int. Your declarations and function call are correct; you just need to adjust your function signature.
Setting the other errors aside:
InputData(numPlayers, playerName, score, size);
// ^^^^ size is nowhere declared,
// resulting in an undeclared identifier error
The prototype says the function takes 3 arguments, but the call passes 4.
Hint regarding errors:
A 1D array decays to a pointer to its first element when passed to a function.
A 2D array decays to a pointer to a 1D array (i.e., T (*)[size], which can be written T[][size] in a parameter list) when passed to a function.
Return type of main() should be int.
It seems with the given hints you corrected most of the errors. But you forgot to change the prototype. So, change -
int InputData(int &, char, int);
to
int InputData(int &, char[][20], int[]);
Why aren't you using an array of std::string for the player names? Use it and fix the rest of the errors. Good luck.
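To illustrate the decay hints without giving away the assignment, here is a small unrelated sketch. setPlayer and demo are hypothetical names of my own; only the parameter shapes match the discussion above.

```cpp
#include <cassert>
#include <cstring>

const int NAME_LEN = 20;

// names is written as char[][NAME_LEN] but is really a pointer to rows of
// NAME_LEN chars; scores is really an int*. Neither parameter carries the
// number of rows, so the caller has to keep index in range.
void setPlayer(char names[][NAME_LEN], int scores[], int index,
               const char* name, int score)
{
    assert(index >= 0);
    std::strncpy(names[index], name, NAME_LEN - 1);
    names[index][NAME_LEN - 1] = '\0';   // strncpy may leave the row unterminated
    scores[index] = score;
}

bool demo()
{
    char playerName[100][NAME_LEN];
    int score[100];
    setPlayer(playerName, score, 0, "Ann", 7);   // both arrays decay at the call
    return std::strcmp(playerName[0], "Ann") == 0 && score[0] == 7;
}
```

Because the function receives pointers, the writes inside setPlayer land in the caller's arrays, which is why no explicit pointer syntax is needed at the call site.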
I was wondering, is there any programming language where you can have function calls like this:
function_name(parameter1)function_name_continued(parameter2);
or
function_name(param1)function_continued(param2)...function_continued(paramN);
For example you could have this function call:
int dist = distanceFrom(cityA)to(cityB);
if you have defined distanceFromto function like this:
int distanceFrom(city A)to(city B)
{
// find distance between city A and city B
// ...
return distance;
}
As far as I know, in C, Java and SML programming languages, this cannot be done.
Are you aware of any programming language that lets you define and call functions in this way?
It looks an awful lot like Objective-C:
- (int)distanceFrom:(City *)cityA to:(City *)cityB {
// woah!
}
Sounds a lot like Smalltalk's syntax, (which would explain Objective-C's syntax - see kubi's answer).
Example:
dist := metric distanceFrom: cityA to: cityB
where #distanceFrom:to: is a method on some object called metric.
So you have "function calls" (they're really message sends) like
'hello world' indexOf: $o startingAt: 6. "$o means 'the character literal o'"
EDIT: I'd said, "Really, #distanceFrom:to: should be called #distanceTo: on a City class, but anyway." Justice points out that this couples a City to a Metric, which is Bad. There are good reasons why you might want to vary the metric: aeroplanes might use a geodesic, while cars might use a shortest path based on the road network.
For the curious, Agda2 has a similar, very permissive syntax. The following is valid code:
data City : Set where
London : City
Paris : City
data Distance : Set where
_km : ℕ → Distance
from_to_ : City → City → Distance
from London to London = 0 km
from London to Paris = 342 km
from Paris to London = 342 km
from Paris to Paris = 0 km
If
from Paris to London
is evaluated, the result is
342 km
Looks a lot like a fluent interface or method chaining to me.
In Python, you can pass arguments by name (keyword arguments), which lets you pass them in a different order or skip optional arguments:
>>> l = [3,5,1,2,4]
>>> print l.sort.__doc__
L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
cmp(x, y) -> -1, 0, 1
>>> l.sort (reverse=True)
>>> l
[5, 4, 3, 2, 1]
This looks a lot like what the Objective-C syntax is doing: tagging each argument to a function with its name.
C# 4.0's Named and Optional Arguments feature allows you to achieve something pretty similar:
public static int Distance(string from, string to, string via = "")
{
...
}
public static void Main()
{
int distance;
distance = Distance(from: "New York", to: "Tokyo");
distance = Distance(to: "Tokyo", from: "New York");
distance = Distance(from: "New York", via: "Athens", to: "Tokyo");
}
(see my very favourite personal effort - the final C++ approach at the end of this answer)
Language One
Objective-C, but the calling syntax is [object message], so it would look like:
int dist = [cities distanceFrom:cityA to:cityB];
if you have defined the distanceFrom:to: method like this, within a cities object:
- (int)distanceFrom:(City *)cityA to:(City *)cityB
{
// find distance between city A and city B
// ...
return distance;
}
Language Two
I also suspect you could achieve something very close to this in the Io language, but I'm only just looking at it. You may also want to read about it in comparison to other languages in Seven Languages in Seven Weeks, which has a free excerpt about Io.
Language Three
There's an idiom ("chaining") in C++, used as a replacement for keyword arguments according to The Design and Evolution of C++, where you return temporary objects or the current object. It looks like this:
int dist = distanceFrom(cityA).to(cityB);
if you have defined the distanceFrom function like this, with a little helper object. Note that inline functions make this kind of thing compile to very efficient code.
class DistanceCalculator
{
public:
DistanceCalculator(City* from) : fromCity(from) {}
int to(City * toCity)
{
// find distance between fromCity and toCity
// ...
return distance;
}
private:
City* fromCity;
};
inline DistanceCalculator distanceFrom(City* from)
{
return DistanceCalculator(from);
}
Duh, I was in a hurry earlier; I realised I can refactor to just use a temporary object to give the same syntax:
class distanceFrom
{
public:
distanceFrom(City* from) : fromCity(from) {}
int to(City * toCity)
{
// find distance between fromCity and toCity
// ...
return distance;
}
private:
City* fromCity;
};
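For anyone who wants to try the temporary-object version, here is a filled-in sketch. The City type is my own assumption (a position in km along one road) so that there is something to compute; the original leaves the distance calculation open.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical City: a position on a single road, in km.
struct City { int km; };

class distanceFrom
{
public:
    explicit distanceFrom(const City* from) : fromCity(from) {}

    // Called on the temporary created by distanceFrom(...).
    int to(const City* toCity) const
    {
        assert(fromCity != nullptr && toCity != nullptr);
        return std::abs(toCity->km - fromCity->km);
    }

private:
    const City* fromCity;
};

int demo_distance()
{
    City a{10};
    City b{352};
    return distanceFrom(&a).to(&b);   // temporary lives until the end of the expression
}
```

The constructor call looks like a function call, so no separate helper function is needed; the temporary is destroyed as soon as the full expression has been evaluated.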
MY FAVOURITE
and here's an even more inspired C++ version that allows you to write
int dist = distanceFrom cityA to cityB;
or even
int dist = distanceFrom cityA to cityB to cityC;
based on a wonderfully C++ ish combination of #define and classes:
#include <vector>
#include <numeric>
class City;
#define distanceFrom DistanceCalculator() <<
#define to <<
class DistanceCalculator
{
public:
operator int()
{
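// find distance between chain of cities by summing each leg;
// legDistance(City*, City*) is a hypothetical helper, not defined here
int total = 0;
for (std::size_t i = 1; i < cities.size(); ++i)
total += legDistance(cities[i - 1], cities[i]);
return total;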
}
DistanceCalculator& operator<<(City* aCity)
{
cities.push_back(aCity);
return *this;
}
private:
std::vector<City*> cities;
};
NOTE this may look like a useless exercise but in some contexts it can be very useful to give people a domain-specific language in C++ which they compile alongside libraries. We used a similar approach with Python for geo-modeling scientists at the CSIRO.
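Here is a self-contained sketch of the macro version that can be compiled and run. As before, City with a km field is an assumption of mine, and the leg distance is just the absolute difference of positions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct City { int km; };  // hypothetical: position along a single road

class DistanceCalculator
{
public:
    DistanceCalculator& operator<<(const City* aCity)
    {
        assert(aCity != nullptr);
        cities.push_back(aCity);
        return *this;
    }

    // Sum the legs between consecutive cities in the chain.
    operator int() const
    {
        int total = 0;
        for (std::size_t i = 1; i < cities.size(); ++i)
            total += std::abs(cities[i]->km - cities[i - 1]->km);
        return total;
    }

private:
    std::vector<const City*> cities;
};

// Define the macros after all #includes: "to" is a common identifier and
// would otherwise be rewritten inside any header that uses it.
#define distanceFrom DistanceCalculator() <<
#define to <<

int demo_route()
{
    City a{0}, b{342}, c{300};
    const City* cityA = &a;
    const City* cityB = &b;
    const City* cityC = &c;
    int dist = distanceFrom cityA to cityB to cityC;   // 342 + 42
    return dist;
}

#undef to
#undef distanceFrom
```

The #undef lines at the end are there to limit how far the macros leak; a macro named to is exactly the kind of thing that breaks unrelated code.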
You can do this in C, albeit unsafely:
struct Arg_s
{
int from;
int to;
};
int distance_f(struct Arg_s args)
{
return args.to - args.from;
}
#define distance(...) distance_f( ((struct Arg_s){__VA_ARGS__}) )
#define from_ .from =
#define to_ .to =
This uses compound literals and designated initializers (both C99 features):
printf("5 to 7 = %i\n",distance(from_ 5, to_ 7));
// 5 to 7 = 2
Three of the four confederated languages from RemObjects in their Elements Compiler have this capability in precisely the OP's requested syntax (added to support the Objective-C runtime, but made available on all operating systems).
in Hydrogene (an extended C#)
https://docs.elementscompiler.com/Hydrogene/LanguageExtensions/MultiPartMethodNames
in Iodine (an extended Java)
https://docs.elementscompiler.com/Iodine/LanguageExtensions/MultiPartMethodNames
in Oxygene (an extended ObjectPascal), scroll down to Multi-Part Method Names section
https://docs.elementscompiler.com/Oxygene/Members/Methods
This looks similar to function overloading (C++/C#)/default parameters (VB).
Default parameters allow the person defining the function to set defaults for the trailing parameters. Overloading achieves the same effect,
e.g. in C#:
int CalculateDistance(city A, city B, city via1, city via2)
{....}
int CalculateDistance(city A, city B)
{
return CalculateDistance(A, B, null, null);
}
You can use a member function for this.
cityA.distance_to(cityB);
That's valid code in C++, C (with a little tweaking), C#, and Java. Using method chains, you can do:
cityA.something(cityB).something(cityC).something(cityD).something(cityE);
In SML you could simply make "to" some value (unit, for example), and "distanceFrom" a curried function that takes three parameters. For example:
val to = ()
fun distanceFrom x _ y = (* implementation function body *)
val foo = distanceFrom cityA to cityB
You could also take advantage of the fact that SML doesn't enforce naming conventions on datatype constructors (much to many people's annoyance), so if you want to make sure that the type system enforces your custom syntax:
datatype comp = to
fun distanceFrom x to y = (* implementation *)
val foo = distanceFrom cityA to cityB (* works *)
val foo' = distanceFrom cityA cityB (* whoops, forgot 'to' - type error! *)
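The same trick carries over to C++ with curried lambdas and a tag type, in case that helps to see what the SML is doing. This is only a sketch: To, to and the use of plain int km markers in place of cities are my own assumptions.

```cpp
#include <cassert>

// Hypothetical stand-ins: cities are plain km markers, To is a tag type
// whose only purpose is to force the word "to" into the call.
struct To {};
constexpr To to{};

// Curried function: each call takes one argument and returns the next stage.
auto distanceFrom = [](int fromKm) {
    return [fromKm](To) {
        return [fromKm](int toKm) { return toKm - fromKm; };
    };
};

int demo_curried()
{
    int d = distanceFrom(3)(to)(10);
    assert(d == 7);
    // distanceFrom(3)(10) would not compile: an int is not a To.
    return d;
}
```

As in the datatype version, leaving out the to argument is a type error rather than a silent mistake.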
You could do this in Scheme or LISP using macros.
The form will be something like:
(DISTANCE-FROM city-a TO city-b)
The symbols in uppercase denote syntax.
You could even do something like 'named parameters':
(DISTANCE TO city-a FROM city-b)
(DISTANCE FROM city-a TO city-b)
Tcl allows you to do something like this:
proc distance {from cityA to cityB} {...}
set distance [distance from "Chicago IL" to "Tulsa OK"]
I'm not sure if that's quite what you are thinking of though.
You can do it in Java using the Builder pattern that appears in the book Effective Java by Joshua Bloch (this is the second time I have put this link on SO; I still haven't used that pattern, but it looks great).
Well, in Felix you can implement this in two steps: first, you write an ordinary function. Then, you can extend the grammar and map some of the new non-terminals to the function.
This is a bit heavyweight compared to what you might want (you are welcome to help make it easier!!), but I think this does what you want and a whole lot more!
I will give a real example because the whole of the Felix language is actually defined by this technique (below x is the non-terminal for expressions, the p in x[p] is a precedence code):
// alternate conditional
x[sdollar_apply_pri] := x[stuple_pri] "unless" x[let_pri]
"then" x[sdollar_apply_pri] =>#
"`(ast_cond ,_sr ((ast_apply ,_sr (lnot ,_3)) ,_1 ,_5))";
Here's a bit more:
// indexes and slices
x[sfactor_pri] := x[sfactor_pri] "." "[" sexpr "]" =>#
"`(ast_apply ,_sr (,(noi 'subscript) (,_1 ,_4)))";
x[sfactor_pri] := x[sfactor_pri] "." "[" sexpr "to" sexpr "]" =>#
"`(ast_apply ,_sr (,(noi 'substring) (,_1 ,_4 ,_6)))";
x[sfactor_pri] := x[sfactor_pri] "." "[" sexpr "to" "]" =>#
"`(ast_apply ,_sr (,(noi 'copyfrom) (,_1 ,_4)))";
x[sfactor_pri] := x[sfactor_pri] "." "[" "to" sexpr "]" =>#
"`(ast_apply ,_sr (,(noi 'copyto) (,_1 ,_5)))";
The Felix grammar is ordinary user code. In the examples the grammar actions are written in Scheme. The grammar is GLR. It allows "context sensitive keywords", that is, identifiers that are keywords in certain contexts only, which makes it easy to invent new constructs without worrying about breaking existing code.
Perhaps you would like to examine Felix Grammar Online.