MSVC++ Class Cast to a struct - visual-c++

I am currently reading existing code written in MS Visual C++ 6.0. I noticed a pattern where an object is cast to a structure.
There is a CMemory object.
CMemory mem;
MY_STRUCTURE_A* pA = (MY_STRUCTURE_A*)(void *)mem;
MY_STRUCTURE_B* pB = (MY_STRUCTURE_B*)(void *)mem;
I checked the custom memory class and it really is a class object. It does have an = operator defined, but I do not think that would allow it to be reinterpreted as a structure. Why is this being done? How can an object of one type be cast to different, unrelated types?
Any idea why this is being done? I know there is reinterpret_cast, and I am guessing that casting to a void pointer and then to a structure pointer is similar, but I am not sure it is the same. Is this pattern of casting a class object to a struct safe?
Note: CMemory is just an arbitrary name for the object used. It is not an MFC class.
Added based on Necrolis' comment.
CMemory has only 3 members, declared in the following order: (1) a char pointer, (2) an int specifying the allocated size of (1), and (3) previous and next pointers to other instances of CMemory. It also has a lot of member functions. From what I understand, even if I cast the class directly to a structure, the object's layout should start with the first member variable, which is the char pointer.
class CMemory {
public:
    CMemory();
    // ... other methods
private:
    char *m_pMemory;
    int m_memorySize;
    // ... other fields
};

Going by the name of the class and the casting, CMemory is more than likely a generic memory block tag (for a GC, an arbitrary hash table, etc.), and accessing the memory it's tagging requires a cast. Of course this is a "best guess"; it means nothing without seeing the full code for CMemory.
Is this safe? Totally not. It's not only UB, but there is also no check (at least in your example) as to whether the type you are casting to is the type actually represented by the memory layout. Also, with this being C++, they should be avoiding C casts (as you have noted). The double cast is in fact there to get around compiler errors/warnings, which is always the worst way to solve them.
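For the cast in the question to compile at all, CMemory presumably provides a conversion to void*, since an object cannot otherwise be cast to a pointer. The sketch below is a guess at that shape, not the real class: the operator void*, the constructor argument, and the fields of MY_STRUCTURE_A are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for one of the structures from the question.
struct MY_STRUCTURE_A {
    int id;
    int count;
};

// Hypothetical CMemory: owns a raw buffer and converts implicitly to void*.
class CMemory {
public:
    explicit CMemory(int size)
        : m_pMemory(static_cast<char*>(std::calloc(1, size))), m_memorySize(size) {
        assert(m_pMemory != nullptr);
    }
    ~CMemory() { std::free(m_pMemory); }
    CMemory(const CMemory&) = delete;
    CMemory& operator=(const CMemory&) = delete;

    // Assumed conversion operator: this is what (void *)mem would call.
    operator void*() const { return m_pMemory; }
    int size() const { return m_memorySize; }

private:
    char* m_pMemory;
    int m_memorySize;
};

int demo_cast() {
    CMemory mem(sizeof(MY_STRUCTURE_A));
    // C-style double cast, as in the question.
    MY_STRUCTURE_A* a = (MY_STRUCTURE_A*)(void*)mem;
    a->id = 7;
    a->count = 3;
    // The same thing spelled with named casts.
    MY_STRUCTURE_A* b = static_cast<MY_STRUCTURE_A*>(static_cast<void*>(mem));
    return b->id * 10 + b->count;
}
```

Under that assumption the object itself is never reinterpreted; only the buffer it owns is. Nothing in the cast checks that the buffer really holds a MY_STRUCTURE_A or is large enough for one, which is the safety problem described above.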

Related

Is it safe to use Class level Predicate in Multithreading Application

I am trying to understand whether there could be any issues with a Predicate defined at class level in a multithreaded application. We have defined such predicates in our services and use them in different methods of the same class. So far we have not seen any issues, but I am curious how our class-level Predicate object is going to behave. Will there be any inconsistency in its behaviour?
eg:
class SomeClass{
    Predicate<String> check = (value) -> value.contains("SomeString");
    // remaining impl. of the class.
}
The predicate in your example is categorically thread-safe. It is calling a method on an intrinsically thread-safe (and immutable) object.
This does not generalize to all predicates though. For example
Predicate<StringBuilder> check = (value) -> value.indexOf("SomeString") >= 0;
is not thread-safe. Another thread could mutate the contents of the StringBuilder argument while this predicate is checking it. The predicate could also be vulnerable to memory model related inconsistencies.
(The StringBuilder class is not thread-safe; see javadoc.)
It is not clear what you mean by "class level". Your example shows a predicate declared as a regular field, not a static (class level) field.
With a variable declared as a (mutable) instance field, it is difficult to reason about the thread-safety of the field in isolation. This can be solved by declaring the field as final.

Why do some struct types let us set members that can only be a certain value?

I was reading up on some Vulkan struct types. This is one of many examples, but the one I will use is VkInstanceCreateInfo. The documentation states:
The VkInstanceCreateInfo structure is defined as:
typedef struct VkInstanceCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkInstanceCreateFlags flags;
    const VkApplicationInfo* pApplicationInfo;
    uint32_t enabledLayerCount;
    const char* const* ppEnabledLayerNames;
    uint32_t enabledExtensionCount;
    const char* const* ppEnabledExtensionNames;
} VkInstanceCreateInfo;
Then below in the options we see:
sType is the type of this structure
sType must be VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO
If we don't have any options anyway, why is this parameter not just set implicitly upon creation of the type?
Note: I realise this is not something specific to the vulkan API.
Update: I'm not just talking specifically about vulkan, just all parameters that can only be a certain type.
The design allows structures to be chained together so that extensions can create additional parameters to existing calls without interfering with the original API structures and without interfering with each other.
Nearly every struct in Vulkan has sType as its first member and pNext as its second member. That means that if you have a void* and all you know is that it is some kind of Vulkan API struct, you can safely read the first 32 bits and it will be a VkStructureType, and then read the pointer-sized pNext field that follows (32 or 64 bits, after any alignment padding) and it will tell you if there's another structure in the chain.
So for instance, there's a VkMemoryAllocateInfo structure for allocating memory that has (aside from sType and pNext) the size of the allocation and the index of the memory type it should come from. But what if I want to use the "dedicated allocation" extension? Then I also need to fill out a VkMemoryDedicatedAllocateInfo structure with extra information. But I still need to call the same vkAllocateMemory function that only takes a VkMemoryAllocateInfo... so where do I put the VkMemoryDedicatedAllocateInfo structure I filled out? I put a pointer to it in the pNext field of VkMemoryAllocateInfo.
Maybe I also want to share this memory with some OpenGL code. There's an extension that lets you do that, but you need to fill out a VkExportMemoryAllocateInfo structure and pass it in during the allocation as well. Well, I can do that by putting it in the pNext field of my VkMemoryDedicatedAllocateInfo structure. I can create a chain of structures like that as long as I want.
Here's the really important part. Since all structures have sType as their first field, an extension can navigate along this chain of structures and find the ones it cares about without knowing anything about the structures other than that they always start with sType and pNext.
All of this means that Vulkan can be extended in ways that alter the behavior of existing functions, but without changing the function itself, or the structures that are passed to it.
You might ask why all of the core structures have sType and pNext, even though you're passing them to functions with typed pointers, rather than void pointers. The reason is consistency, and because you never know when an existing structure might be needed as part of the chain for some new extension.
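The chain walk described above can be sketched without the Vulkan headers. Everything here is a hypothetical stand-in (StructType, Header, AllocateInfo, DedicatedInfo, ExportInfo, find_in_chain); the only property borrowed from Vulkan is that every structure starts with sType followed by pNext.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical tags standing in for VkStructureType values.
enum StructType : std::uint32_t {
    TYPE_ALLOCATE_INFO = 1,
    TYPE_DEDICATED_INFO = 2,
    TYPE_EXPORT_INFO = 3
};

// The only part every structure is assumed to share.
struct Header {
    StructType sType;
    const void* pNext;
};

struct AllocateInfo  { StructType sType; const void* pNext; std::uint64_t size; std::uint32_t typeIndex; };
struct DedicatedInfo { StructType sType; const void* pNext; std::uint64_t image; };
struct ExportInfo    { StructType sType; const void* pNext; std::uint32_t handleTypes; };

// Walk the chain knowing only the common header; return the first match or nullptr.
const void* find_in_chain(const void* head, StructType wanted) {
    const void* p = head;
    while (p != nullptr) {
        Header h;
        std::memcpy(&h, p, sizeof h);  // copy the header out rather than aliasing the struct
        if (h.sType == wanted) return p;
        p = h.pNext;
    }
    return nullptr;
}

// Build allocate -> dedicated -> export and look up the last link.
bool demo_chain() {
    ExportInfo exportInfo = {TYPE_EXPORT_INFO, nullptr, 1u};
    DedicatedInfo dedicatedInfo = {TYPE_DEDICATED_INFO, &exportInfo, 0u};
    AllocateInfo allocInfo = {TYPE_ALLOCATE_INFO, &dedicatedInfo, 4096u, 0u};
    return find_in_chain(&allocInfo, TYPE_EXPORT_INFO) == &exportInfo;
}
```

The lookup never needs the definitions of the structures it skips over, which is why one extension's structure can sit in the chain without any other extension knowing about it.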
If we don't have any options anyway, why is this parameter not just set implicitly upon creation of the type?
Because C isn't C++. There's no way to declare a structure in C and say that this portion of the structure will always have this value. In C++ you can, by declaring something as const and providing the initial default value. In fact, one of the things I like about the Vulkan C++ bindings is that you can basically forget about sType forever. If you're using extensions you still need to populate pNext as appropriate.
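A minimal sketch of that C++ idea, using a const member with a default initializer. The names StructureType and InstanceCreateInfo and the tag value are made up for illustration; this is not the actual Vulkan C++ binding, only the language feature it can rely on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag type; the value 1 is arbitrary.
enum class StructureType : std::uint32_t { eInstanceCreateInfo = 1 };

struct InstanceCreateInfo {
    // Fixed at construction: callers never have to write it.
    const StructureType sType = StructureType::eInstanceCreateInfo;
    const void* pNext = nullptr;           // still the caller's job when extensions are used
    std::uint32_t enabledLayerCount = 0;
};

bool demo_default_stype() {
    InstanceCreateInfo info;               // sType is already correct here
    info.enabledLayerCount = 2;
    // info.sType = StructureType::eInstanceCreateInfo;  // would not compile: sType is const
    return info.sType == StructureType::eInstanceCreateInfo && info.pNext == nullptr;
}
```

A plain C struct has no equivalent of the default member initializer, so in C the caller has to fill in sType by hand.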

Are There Any Hidden Costs to Passing Around a Struct With a Single Reference?

I was recently reading this article on structs and classes in D, and at one point the author comments that
...this is a perfect candidate for a struct. The reason is that it contains only one member, a pointer to an ALLEGRO_CONFIG. This means I can pass it around by value without care, as it's only the size of a pointer.
This got me thinking; is that really the case? I can think of a few situations in which believing you're passing a struct around "for free" could have some hidden gotchas.
Consider the following code:
struct S
{
    int* pointer;
}
void doStuff(S ptrStruct)
{
    // Some code here
}
int n = 123;
auto s = S(&n);
doStuff(s);
When s is passed to doStuff(), is a single pointer (wrapped in a struct) really all that's being passed to the function? Off the top of my head, it seems that any pointers to member functions would also be passed, as well as the struct's type information.
This wouldn't be an issue with classes, of course, since they're always reference types, but a struct's pass by value semantics suggests to me that any extra "hidden" data such as described above would be passed to the function along with the struct's pointer to int. This could lead to a programmer thinking that they're passing around an (assuming a 64-bit machine) 8-byte pointer, when they're actually passing around an 8-byte pointer, plus several other 8-byte pointers to functions, plus however many bytes an object's typeinfo is. The unwary programmer is then allocating far more data on the stack than was intended.
Am I chasing shadows here, or is this a valid concern when passing a struct with a single reference, and thinking that you're getting a struct that is a pseudo reference type? Is there some mechanism in D that prevents this from being the case?
I think this question can be generalized to wrapping native types. E.g. you could make a SafeInt type which wraps and acts like an int, but throws on any integer overflow conditions.
There are two issues here:
Compilers may not optimize your code as well as with a native type.
For example, if you're wrapping an int, you'll likely implement overloaded arithmetic operators. A sufficiently smart compiler will inline those methods, and the resulting code will be no different from that for a plain int. In your example, a dumb compiler might compile a dereference in some clumsy way (e.g. get the address of the struct's start, add the offset of the pointer field (which is 0), then dereference that).
Additionally, when calling a function, the compiler may decide to pass the struct in some other way (due to e.g. poor optimization, or an ABI restriction). This could happen e.g. if the compiler doesn't pay attention to the struct's size, and treats all structs in the same way.
struct types in D may indeed have a hidden member, if you declare it in a function.
For example, the following code works:
import std.stdio;
void main()
{
    string str = "I am on the stack of main()";
    struct S
    {
        string toString() const { return str; }
    }
    S s;
    writeln(s);
}
It works because S saves a hidden pointer to main()'s stack frame. You can force a struct to not have any hidden pointers by prefixing static to the declaration (e.g. static struct S).
There is no hidden data being passed. A struct consists exactly of what's declared in it (and any padding bytes if necessary), nothing else. There is no need to pass type information and member function information along because it's all static. Since a struct cannot inherit from another struct, there is no polymorphism.
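The question is about D, but the same point can be checked in C++, where non-virtual member functions likewise add nothing to an object. The sketch below is a C++ analogue, not D code, and it assumes a mainstream ABI; the C++ standard itself does not promise that the sizes are equal.

```cpp
#include <cassert>

struct S {
    int* pointer;
    // Non-virtual member functions live in the code segment, not in the object.
    int get() const { return *pointer; }
    void set(int v) { *pointer = v; }
};

// Takes the wrapper by value, as doStuff() does in the question.
int read_through(S s) { return s.get(); }

// True when the wrapper carries nothing besides its one pointer.
bool wrapper_is_pointer_sized() { return sizeof(S) == sizeof(int*); }
```

A copy of S is a copy of one pointer, so the callee still sees the caller's int through it.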

C++ pointer of one type or another?

I have a program that solves PDEs in big grids working in C, but want to port it to C++ to learn object oriented programming.
My problem is that I have two data structures, let's call them Class1 and Class2. To make things simple, let's assume that Class1 contains two doubles (a and b) and Class2 contains four doubles (a, b, c and d).
Now, at runtime I want to generate a dynamic array of some sort that will hold one class OR the other depending on the conditions of the PDE.
Something like:
if (PDEtype == 1) pointer = new Class1[n];
else pointer = new Class2[n];
Then after that I will need to access pointer with something like: pointer[2].a = 1.0 or pointer[4].d = 10.0 etc.
A union will not work because I need to store the lowest amount of memory. (I will be working with big problems of possibly millions of points)
Is there a way to do this in C++?
Thanks in advance!!!
If your Class2 is actually derived from Class1, you can have an array of Class1*'s. If not, I'd base both Class1 and Class2 off a common base class, and have an array of BaseClass*'s.
Of course, you need some way to know what's the actual content of the entry (which is better done in C++ with virtual functions, if applicable).
What you have is pretty close, actually. The problem is that you need to declare the pointer beforehand and THEN call new Class1[n], so the real question is how to declare that pointer.
There are two options that I can think of:
1) Don't worry about declaring two different pointer types. Declare ONE pointer type (in this case it would make sense for this to be Class2, since Class1 can be thought of as a "subset" of Class2) and then set any unused members of Class2 to some specified value signifying that they are unused.
2) I'm not quite sure this technique is correct, but something along these lines will definitely work: declare pointer to be of type void*. You will also need to define ptr1 and ptr2 to point to Class1 and Class2 respectively. Have your void* pointer point to the array you actually allocate, then create a function to cast this void* pointer to the type of ptr1 (a pointer to Class1) or ptr2 (a pointer to Class2).
This is much like creating a void* (a pointer to void) and then casting it to an int* if the memory location actually contains ints. You can then treat the result as a pointer to an integer.
Finally, you say "pointer[2].a = 1.0" and "pointer[4].d = 10.0" in your example. Are you creating an array of classes? If you want your pointer to point to a single object, then you will end up using "pointer->a" and "pointer->d", right?
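Option 2 can be sketched roughly as follows. Grid, as1(), as2() and bytes() are hypothetical names, and this is only one way to arrange the void* idea: a type tag is stored next to the untyped pointer so that the cast back (and the delete[]) uses the right element type.

```cpp
#include <cassert>
#include <cstddef>

struct Class1 { double a, b; };
struct Class2 { double a, b, c, d; };

// Hypothetical holder: one untyped pointer plus a tag saying what it points at.
class Grid {
public:
    Grid(int pdeType, std::size_t n) : type_(pdeType), n_(n), data_(nullptr) {
        if (type_ == 1) data_ = new Class1[n]();  // () zero-initializes the elements
        else            data_ = new Class2[n]();
    }
    ~Grid() {
        // delete[] must see the real element type, so cast back first.
        if (type_ == 1) delete[] static_cast<Class1*>(data_);
        else            delete[] static_cast<Class2*>(data_);
    }
    Grid(const Grid&) = delete;
    Grid& operator=(const Grid&) = delete;

    // Typed views of the storage; the asserts catch a mismatched request.
    Class1* as1() { assert(type_ == 1); return static_cast<Class1*>(data_); }
    Class2* as2() { assert(type_ != 1); return static_cast<Class2*>(data_); }

    // Payload size: only the chosen element type is ever allocated.
    std::size_t bytes() const {
        return n_ * (type_ == 1 ? sizeof(Class1) : sizeof(Class2));
    }

private:
    int type_;
    std::size_t n_;
    void* data_;
};
```

With this arrangement, `Grid g(2, n); g.as2()[4].d = 10.0;` matches the access pattern in the question, and a Class1 grid costs two doubles per point rather than four.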

What's going on in the 'offsetof' macro?

The Visual C++ 2008 C runtime offers 'offsetof', which is actually a macro defined like this:
#define offsetof(s,m) (size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
This allows you to calculate the offset of the member variable m within the class s.
What I don't understand in this declaration is:
Why are we casting m to anything at all and then dereferencing it? Wouldn't this have worked just as well:
&(((s*)0)->m)
?
What's the reason for choosing char reference (char&) as the cast target?
Why use volatile? Is there a danger of the compiler optimizing the loading of m? If so, in what exact way could that happen?
An offset is in bytes. So to get a number expressed in bytes, you have to cast the addresses to char, because that is the same size as a byte (on this platform).
The use of volatile is perhaps a cautious step to ensure that no compiler optimisations (either that exist now or may be added in the future) will change the precise meaning of the cast.
Update:
If we look at the macro definition:
(size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
With the cast-to-char removed it would be:
(size_t)&((((s *)0)->m))
In other words, get the address of member m in an object at address zero, which does look okay at first glance. So there must be some way that this would potentially cause a problem.
One thing that springs to mind is that the operator & may be overloaded on whatever type m happens to be. If so, this macro would be executing arbitrary code on an "artificial" object that is somewhere quite close to address zero. This would probably cause an access violation.
This kind of abuse may be outside the applicability of offsetof, which is supposed to only be used with POD types. Perhaps the idea is that it is better to return a junk value instead of crashing.
(Update 2: As Steve pointed out in the comments, there would be no similar problem with operator ->)
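The overloaded operator& concern can be shown on a real object, which avoids the null-pointer dereference the macro performs. Tricky, Holder and member_offset are hypothetical names for this sketch; the cast is the same shape as in the macro.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical member type whose operator& does not return its address.
struct Tricky {
    const Tricky* operator&() const { return nullptr; }
};

struct Holder {
    int first;
    Tricky member;
};

// Offset of h.member computed with the macro's cast, but on a real object.
std::size_t member_offset(const Holder& h) {
    // After the cast the expression has type char, so & is the built-in operator.
    const volatile char& asChar = reinterpret_cast<const volatile char&>(h.member);
    const volatile char* base = reinterpret_cast<const volatile char*>(&h);
    return static_cast<std::size_t>(&asChar - base);
}
```

Here `&h.member` would call Tricky's overload and return a null pointer, while member_offset(h) agrees with offsetof(Holder, member).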
offsetof is something to be very careful with in C++. It's a relic from C. These days we are supposed to use member pointers. That said, I believe that member pointers to data members are overdesigned and broken - I actually prefer offsetof.
Even so, offsetof is full of nasty surprises.
First, for your specific questions, I suspect the real story is that they've adapted the traditional C macro (which I thought was mandated in the C++ standard). They probably use reinterpret_cast for "it's C++!" reasons (so why the (size_t) cast?), and a char& rather than a char* to try to simplify the expression a little.
Casting to char looks redundant in this form, but probably isn't. (size_t) is not equivalent to reinterpret_cast, and if you try to cast pointers to other types into integers, you run into problems. I don't think the compiler even allows it, but to be honest, I'm suffering memory failure ATM.
The fact that char is a single byte type has some relevance in the traditional form, but that may only be why the cast is correct again. To be honest, I seem to remember casting to void*, then char*.
Incidentally, having gone to the trouble of using C++-specific stuff, they really should be using std::ptrdiff_t for the final cast.
Anyway, coming back to the nasty surprises...
VC++ and GCC probably won't use that macro. IIRC, they have a compiler intrinsic, depending on options.
The reason is to do what offsetof is intended to do, rather than what the macro does, which is reliable in C but not in C++. To understand this, consider what would happen if your struct uses multiple or virtual inheritance. In the macro, when you dereference a null pointer, you end up trying to access a virtual table pointer that isn't there at address zero, meaning that your app probably crashes.
For this reason, some compilers have an intrinsic that just uses the specified struct's layout instead of trying to deduce a run-time type. But the C++ standard doesn't mandate or even suggest this - it's only there for C compatibility reasons. And you still have to be careful if you're working with class hierarchies, because as soon as you use multiple or virtual inheritance, you cannot assume that the layout of the derived class matches the layout of the base class - you have to ensure that the offset is valid for the exact run-time type, not just a particular base.
If you're working on a data structure library, maybe using single inheritance for nodes, but apps cannot see or use your nodes directly, offsetof works well. But strictly speaking, even then, there's a gotcha. If your data structure is in a template, the nodes may have fields with types from template parameters (the contained data type). If that isn't POD, technically your structs aren't POD either. And all the standard demands for offsetof is that it works for POD. In practice, it will work - your type hasn't gained a virtual table or anything just because it has a non-POD member - but you have no guarantees.
If you know the exact run-time type when you dereference using a field offset, you should be OK even with multiple and virtual inheritance, but ONLY if the compiler provides an intrinsic implementation of offsetof to derive that offset in the first place. My advice - don't do it.
Why use inheritance in a data structure library? Well, how about...
class node_base { ... };
class leaf_node : public node_base { ... };
class branch_node : public node_base { ... };
The fields in the node_base are automatically shared (with identical layout) in both the leaf and branch, avoiding a common error in C with accidentally different node layouts.
BTW - offsetof is avoidable with this kind of stuff. Even if you are using offsetof for some jobs, node_base can still have virtual methods and therefore a virtual table, so long as it isn't needed to dereference member variables. Therefore, node_base can have pure virtual getters, setters and other methods. Normally, that's exactly what you should do. Using offsetof (or member pointers) is a complication, and should only be used as an optimisation if you know you need it. If your data structure is in a disk file, for instance, you definitely don't need it - a few virtual call overheads will be insignificant compared with the disk access overheads, so any optimisation efforts should go into minimising disk accesses.
Hmmm - went off on a bit of a tangent there. Whoops.
char is guaranteed to be the smallest unit the architecture can "bite" (aka a byte).
All pointers are actually numbers, so cast address 0 to that type because it's the beginning.
Take the address of member starting from 0 (resulting into 0 + location_of_m).
Cast that back to size_t.
1) I also do not know why it is done in this way.
2) The char type is special in two ways.
No other type has weaker alignment restrictions than the char type. This is important for reinterpret cast between pointers and between expression and reference.
It is also the only type (together with its unsigned variant) for which the specification defines behavior in case the char is used to access stored value of variables of different type. I do not know if this applies to this specific situation.
3) I think that the volatile modifier is used to ensure that no compiler optimization will result in attempt to read the memory.
2. What's the reason for choosing char reference (char&) as the cast target?
If the type of m overloads operator&, we can't get its address using &m,
so we reinterpret_cast the member to the primitive type char, because char
doesn't have operator& overloaded.
Now we can take the address of that.
In C, reinterpret_cast is not required.
3. Why use volatile? Is there a danger of the compiler optimizing the loading of m? If so, in what exact way could that happen?
Here volatile is not related to compiler optimization.
If the type has a const or volatile qualifier (or both), then
reinterpret_cast can't cast it to char&, because reinterpret_cast can't remove cv-qualifiers.
So <const volatile char&> is used so that the cast works with any combination of qualifiers.
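The qualifier point can be checked with a small sketch. Rec and byte_address are hypothetical names; the helper uses the same cast target as the macro and accepts plain, const and volatile members alike.

```cpp
#include <cassert>

struct Rec {
    int plain;
    const int fixed;
    volatile int hw;
};

// Address of any object as a byte pointer, whatever its cv-qualifiers.
template <class T>
const volatile char* byte_address(T& obj) {
    // reinterpret_cast<char&>(obj) is rejected when T is const or volatile,
    // because it would cast away qualifiers; const volatile char& only adds them.
    return &reinterpret_cast<const volatile char&>(obj);
}
```

Calling byte_address on r.plain, r.fixed and r.hw of a `Rec r = {1, 2, 3};` compiles in all three cases, whereas a version casting to plain char& would only accept r.plain.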

Resources