C99 Macro to build a quoted string literal after evaluation - string

I'm developing an embedded application in C99, and the project contains some integer constants defined like:
#define LEVEL1 0x0000
#define LEVEL2 (LEVEL1 + 1)
It has since become useful to keep track of these values for logging purposes, so I would like to use a macro to create a string literal from the evaluated versions of the above. For example:
strncpy(str, STRING(LEVEL2), len);
would ideally evaluate to
strncpy(str, "0x0001", len);
or even
strncpy(str, "0001", len);
Using a two-stage macro with the # operator (as suggested by this question) almost works. It evaluates to
strncpy(str, "(LEVEL1 + 1)", len);
I would like to avoid the use of a run-time function - hence my attempt at a macro solution. Suggestions?

Since the pre-processor stringizer is a massive pain, you need to add a level of indirection both when creating version numbers and when stringizing:
#define STRING1(s) #s
#define STRING(s) STRING1(s)
#define LEVEL(x) x
#define LEVEL1 LEVEL(1)
#define LEVEL2 LEVEL(2)
printf(STRING(LEVEL2));
//2

You cannot do this because the preprocessor knows nothing about the C language, so it cannot do the evaluation.
I see two options to get the desired result:
Manual evaluation
Write your levels exactly as you want them to appear and use the two-stage stringizer (a single # applied directly to the argument would produce "LEVEL2"):
#define LEVEL1 0x0000
#define LEVEL2 0x0001
#define STRING1(x) #x
#define STRING(x) STRING1(x)
strncpy(str, STRING(LEVEL2), len);
A disadvantage is that this is error prone and might clash with local coding conventions.
Runtime evaluation
Use one of the string format functions sprintf or snprintf.
#define LEVEL1 0x0000
#define LEVEL2 0x0001
char level[7];
snprintf(level, sizeof level, "%#06x", LEVEL2);
strncpy(str, level, len);
This has the runtime overhead you wanted to avoid.

Related

How to use memcpy in function that is passed a char pointer?

I'm quite new to pointers in c.
Here is a snippet of code I'm working on. I am probably not passing the pointer correctly but I can't figure out what's wrong.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
__uint16_t CCrc8();
__uint16_t process_command();
int main () {
//command format: $SET,<0-1023>*<checksum,hex>\r\n
char test_payload[] = "SET,1023*6e";
process_command(test_payload);
return 0;
}
__uint16_t process_command(char *str1) {
char local_str[20];
memcpy(local_str, str1, sizeof(str1));
printf(str1);
printf("\n");
printf(local_str);
}
This results in:
SET,1023*6e
SET,1023
I'm expecting both lines to be the same. Anything past 8 characters is left off.
The only thing I can determine is that the problem is something with sizeof(str1). Any help appreciated.
Update: I've learned that sizeof(char *) is 2 on 16-bit systems, 4 on 32-bit systems and 8 on 64-bit systems.
So how can I use memcpy to get a local copy of str1 when I'm unsure of the size it will be?
sizeof is a compiler keyword. What you need is strlen from #include <string.h>.
The value of sizeof is determined at compile time (variable-length arrays aside). For example, sizeof(char[10]) is just 10. strlen, on the other hand, is a libc function that determines the string length at run time by scanning for the terminating NUL.
sizeof on a pointer tells you the size of the pointer itself, not of what it points to. Since you're on a 64-bit system, pointers are 8 bytes long, so your memcpy is always copying 8 bytes. Since your string is null terminated, you should use stpncpy instead, like this:
if(stpncpy(local_str, str1, 20) == local_str + 20) {
// too long - handle it somehow
}
That will copy the string until it gets to a NUL terminator or runs out of space in the destination, and in the latter case you can handle it.

Ext2/3: Block Type Clarification: IND vs DIND vs TIND

I'm seeing references to "IND" vs "DIND" vs "TIND" block-types in a few places, whereas the definition in the code is very terse:
(https://github.com/torvalds/linux/blob/master/fs/ext4/ext4.h#L362)
#define EXT4_NDIR_BLOCKS 12
#define EXT4_IND_BLOCK EXT4_NDIR_BLOCKS
#define EXT4_DIND_BLOCK (EXT4_IND_BLOCK + 1)
#define EXT4_TIND_BLOCK (EXT4_DIND_BLOCK + 1)
#define EXT4_N_BLOCKS (EXT4_TIND_BLOCK + 1)
Can someone clarify what they are, as well as, potentially, why the definitions imply that a TIND block includes a DIND, and a DIND block includes an IND block.
I've looked, feverishly, but there aren't any obvious discussions or comments on the subject and it's going to take me a bit more time to figure it out from the code.
#define EXT4_NDIR_BLOCKS /* number of direct blocks */
#define EXT4_IND_BLOCK /* single indirect block */
#define EXT4_DIND_BLOCK /* double indirect block */
#define EXT4_TIND_BLOCK /* triple indirect block */
#define EXT4_N_BLOCKS /* total number of blocks */
NDIR is the number of direct blocks.
IND is the single indirect block.
DIND is the double indirect block.
TIND is the triple indirect block.
N is the total number of blocks.

any way to stop unaligned access from c++ standard library on x86_64?

I am trying to check for any unaligned reads in my program. I enable unaligned access processor exception via (using x86_64 on g++ on linux kernel 3.19):
asm volatile("pushf \n"
"pop %%rax \n"
"or $0x40000, %%rax \n"
"push %%rax \n"
"popf \n" ::: "rax");
I do an optional forced unaligned read which triggers the exception, so I know it's working. After I disable that, I get an error in a piece of code which otherwise seems fine:
char fullpath[eMaxPath];
snprintf(fullpath, eMaxPath, "%s/%s", "blah", "blah2");
The stack trace shows a failure via __memcpy_sse2, which leads me to suspect that the standard library is using SSE to fulfill my memcpy but doesn't realize that I have now made unaligned reads unacceptable.
Is my thinking correct, and is there any way around this (i.e. can I make the standard library use an unaligned-safe sprintf/memcpy instead)?
thanks
While I hate to discourage an admirable notion, you're playing with fire, my friend.
It's not merely sse2 access but any unaligned access. Even a simple int fetch.
Here's a test program:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <malloc.h>
void *intptr;
void
require_aligned(void)
{
asm volatile("pushf \n"
"pop %%rax \n"
"or $0x00040000, %%eax \n"
"push %%rax \n"
"popf \n" ::: "rax");
}
void
relax_aligned(void)
{
asm volatile("pushf \n"
"pop %%rax \n"
"andl $0xFFFBFFFF, %%eax \n"
"push %%rax \n"
"popf \n" ::: "rax");
}
void
msg(const char *str)
{
int len;
len = strlen(str);
write(1,str,len);
}
void
grab(void)
{
volatile int x = *(int *) intptr;
}
int
main(void)
{
setlinebuf(stdout);
// minimum alignment from malloc is [usually] 8
intptr = malloc(256);
printf("intptr=%p\n",intptr);
// normal access to aligned pointer
msg("normal\n");
grab();
// enable alignment check exception
require_aligned();
// access aligned pointer under check [will be okay]
msg("aligned_norm\n");
grab();
// this grab will generate a bus error
intptr += 1;
msg("aligned_except\n");
grab();
return 0;
}
The output of this is:
intptr=0x1996010
normal
aligned_norm
aligned_except
Bus error (core dumped)
The program generated this simply because of an attempted 4 byte int fetch from address 0x1996011 [which is odd and not a multiple of 4].
So, once you turn on the AC [alignment check] flag, even simple things will break.
IMO, if you truly have some things that are not aligned optimally and are trying to find them, using printf, instrumenting your code with debug asserts, or using gdb with some special watch commands or breakpoints with condition statements are a better/safer way to go
UPDATE:
I am using my own custom allocator and am preparing my code to run on an architecture that doesn't support unaligned reads/writes, so I want to make sure my code will not break on that architecture.
Fair enough.
Side note: My curiosity has gotten the better of me as the only [major] arches I can recall [at the moment] that have this issue are Motorola mc68000 and older IBM mainframes (e.g. IBM System 370).
One practical reason for my curiosity is that for certain arches (e.g. ARM/android, MIPS) there are emulators available. You could rebuild the emulator from source, adding any extra checks, if needed. Otherwise, doing your debugging under the emulator might be an option.
I can trap unaligned reads/writes using either the asm or gdb, but both cause SIGBUS, which I can't continue from in gdb, and I'm getting too many false positives from the standard library (in the sense that its implementation would be aligned-access only on the target).
I can tell you from experience that trying to resume from a signal handler after this doesn't work too well [if at all]. Using gdb is the best bet if you can eliminate the false positives by having AC off in the standard functions [see below].
Ideally i guess i would like to use something like perf to show me callstacks that have misaligned but so far no dice.
This is possible, but you'd have to verify that perf even reports them. To see, you could try perf against my original test program above. If it works, the "counter" should be zero before and one after.
The cleanest way may be to pepper your code with "assert" macros [that can be compiled in and out with a -DDEBUG switch].
However, since you've gone to the trouble of laying the groundwork, it may be worthwhile to see if the AC method can work.
Since you're trying to debug your memory allocator, you only need AC on in your functions. If one of your functions calls libc, disable AC, call the function, and then reenable AC.
A memory allocator is fairly low level, so it can't rely on too many standard functions. Most standard functions rely on being able to call malloc. So, you might also want to consider a vtable interface to the rest of the [standard] library.
I've coded some slightly different AC bit set/clear functions. I put them into a .S file to eliminate inline asm hassles.
I've coded up a simple sample usage in three files.
Here are the AC set/clear functions:
// acbit/acops.S -- low level AC [alignment check] operations
#define AC_ON $0x00040000
#define AC_OFF $0xFFFFFFFFFFFBFFFF
.text
// acpush -- turn on AC and return previous mask
.globl acpush
acpush:
// get old mask
pushfq
pop %rax
mov %rax,%rcx // save to temp
or AC_ON,%ecx // turn on AC bit
// set new mask
push %rcx
popfq
ret
// acpop -- restore previous mask
.globl acpop
acpop:
// get current mask
pushfq
pop %rax
and AC_OFF,%rax // clear current AC bit
and AC_ON,%edi // isolate the AC bit in argument
or %edi,%eax // lay it in
// set new mask
push %rax
popfq
ret
// acon -- turn on AC
.globl acon
acon:
jmp acpush
// acoff -- turn off AC
.globl acoff
acoff:
// get current mask
pushfq
pop %rax
and AC_OFF,%rax // clear current AC bit
// set new mask
push %rax
popfq
ret
Here is a header file that has the function prototypes and some "helper" macros:
// acbit/acbit.h -- common control
#ifndef _acbit_acbit_h_
#define _acbit_acbit_h_
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <malloc.h>
typedef unsigned long flags_t;
#define VARIABLE_USED(_sym) \
do { \
if (1) \
break; \
if (!! _sym) \
break; \
} while (0)
#ifdef ACDEBUG
#define ACPUSH \
do { \
flags_t acflags = acpush()
#define ACPOP \
acpop(acflags); \
} while (0)
#define ACEXEC(_expr) \
do { \
acoff(); \
_expr; \
acon(); \
} while (0)
#else
#define ACPUSH /**/
#define ACPOP /**/
#define ACEXEC(_expr) _expr
#endif
void *intptr;
flags_t
acpush(void);
void
acpop(flags_t omsk);
void
acon(void);
void
acoff(void);
#endif
Here is a sample program that uses all of the above:
// acbit/acbit2 -- sample allocator
#include <acbit.h>
// mymalloc1 -- allocation function [raw calls]
void *
mymalloc1(size_t len)
{
flags_t omsk;
void *vp;
// function prolog
// NOTE: do this on all "outer" (i.e. API) functions
omsk = acpush();
// do lots of stuff ...
vp = NULL;
// encapsulate standard library calls like this to prevent false positives:
acoff();
printf("%p\n",vp);
acon();
// function epilog
acpop(omsk);
return vp;
}
// mymalloc2 -- allocation function [using helper macros]
void *
mymalloc2(size_t len)
{
void *vp;
// function prolog
ACPUSH;
// do lots of stuff ...
vp = NULL;
// encapsulate standard library calls like this to prevent false positives:
ACEXEC(printf("%p\n",vp));
// function epilog
ACPOP;
return vp;
}
int
main(void)
{
int x;
setlinebuf(stdout);
// minimum alignment from malloc is [usually] 8
intptr = mymalloc1(256);
intptr = mymalloc2(256);
x = *(int *) intptr;
return x;
}
UPDATE #2:
I like the idea of disabling the check before any library calls.
If the AC H/W works and you wrap the library calls, this should yield no false positives. The only exception would be if the compiler made a call to its internal helper library (e.g. doing 64 bit divide on 32 bit machine, etc.).
Be aware/wary of the ELF loader (e.g. /lib64/ld-linux-x86-64.so.2) doing dynamic symbol resolution on "lazy" bindings of symbols. Shouldn't be a big problem. There are ways to force the relocations to occur at program start, if necessary.
I have given up on perf for this as it seems to show me garbage even for a simple program like the one you wrote.
The perf code in the kernel is complex enough that it may be more trouble than it's worth. It has to communicate with the perf program with a pipe [IIRC]. Also, doing the AC thing is [probably] uncommon enough that the kernel's code path for this isn't well tested.
I'm using ocperf with misalign_mem_ref.loads and stores, but either way the counters don't correlate at all. If I record and look at the callstacks, I get completely unrecognizable callstacks for these counters, so I suspect either the counter doesn't work on my hardware/perf or it doesn't actually count what I think it counts.
I honestly don't know if perf handles process reschedules to different cores properly [or not]--it should [IMO]. But, using sched_setaffinity to lock your program to a single core might help.
But, using the AC bit is more direct and definitive, IMO. I think that's the better bet.
I've talked about adding "assert" macros in the code.
I've coded some up below. These are what I'd use. They are independent of the AC code. But, they can also be used in conjunction with the AC bit code in a "belt and suspenders" approach.
These macros have one distinct advantage. When properly [and liberally] inserted, they can check for bad pointer values at the time they're calculated. That is, much closer to the true source of the problem.
With AC, you may calculate a bad value, but AC only kicks in [sometime] later, when the pointer is dereferenced [which may not happen in your API code at all].
I've done a complete memory allocator before [with overrun checks and "guard" pages, etc.]. The macro approach is what I used. And, if I had only one tool for this, it is the one I'd use. So, I recommend it above all else.
But, as I said, it can be used with the AC code as well.
Here's the header file for the macros:
// acbit/acptr.h -- alignment check macros
#ifndef _acbit_acptr_h_
#define _acbit_acptr_h_
#include <stdio.h>
typedef unsigned int u32;
// bit mask for given width
#define ACMSKOFWID(_wid) \
((1u << (_wid)) - 1)
#ifdef ACDEBUG2
#define ACPTR_MSK(_ptr,_msk) \
acptrchk(_ptr,_msk,__FILE__,__LINE__)
#else
#define ACPTR_MSK(_ptr,_msk) /**/
#endif
#define ACPTR_WID(_ptr,_wid) \
ACPTR_MSK(_ptr,(_wid) - 1)
#define ACPTR_TYPE(_ptr,_typ) \
ACPTR_WID(_ptr,sizeof(_typ))
// acptrfault -- pointer alignment fault
void
acptrfault(const void *ptr,const char *file,int lno);
// acptrchk -- check pointer for given alignment
static inline void
acptrchk(const void *ptr,u32 msk,const char *file,int lno)
{
#ifdef ACDEBUG2
#if ACDEBUG2 >= 2
printf("acptrchk: TRACE ptr=%p msk=%8.8X file='%s' lno=%d\n",
ptr,msk,file,lno);
#endif
if (((unsigned long) ptr) & msk)
acptrfault(ptr,file,lno);
#endif
}
#endif
Here's the "fault" handler function:
// acbit/acptr -- alignment check macros
#include <acbit/acptr.h>
#include <acbit/acbit.h>
#include <stdlib.h>
// acptrfault -- pointer alignment fault
void
acptrfault(const void *ptr,const char *file,int lno)
{
// NOTE: it's easy to set a breakpoint on this function
printf("acptrfault: pointer fault -- ptr=%p file='%s' lno=%d\n",
ptr,file,lno);
exit(1);
}
And, here's a sample program that uses them:
// acbit/acbit3 -- sample allocator using check macros
#include <acbit.h>
#include <acptr.h>
static double static_array[20];
// mymalloc3 -- allocation function
void *
mymalloc3(size_t len)
{
void *vp;
// get something valid
vp = static_array;
// do lots of stuff ...
printf("BEF vp=%p\n",vp);
// check pointer
// NOTE: these can be peppered after every [significant] calculation
ACPTR_TYPE(vp,double);
// do something bad ...
vp += 1;
printf("AFT vp=%p\n",vp);
// check again -- this should fault
ACPTR_TYPE(vp,double);
return vp;
}
int
main(void)
{
int x;
setlinebuf(stdout);
// minimum alignment from malloc is [usually] 8
intptr = mymalloc3(256);
x = *(int *) intptr;
return x;
}
Here's the program output:
BEF vp=0x601080
acptrchk: TRACE ptr=0x601080 msk=00000007 file='acbit/acbit3.c' lno=22
AFT vp=0x601081
acptrchk: TRACE ptr=0x601081 msk=00000007 file='acbit/acbit3.c' lno=29
acptrfault: pointer fault -- ptr=0x601081 file='acbit/acbit3.c' lno=29
I left off the AC code in this example. On your real target system, the dereference of intptr in main would/should fault on an alignment, but notice how much later that is in the execution timeline.
Like I commented on the question, that asm isn't safe, because it steps on the red-zone. Instead, use
asm volatile ("add $-128, %rsp\n\t"
"pushf\n\t"
"orl $0x40000, (%rsp)\n\t"
"popf\n\t"
"sub $-128, %rsp\n\t"
);
(-128 fits in a sign-extended 8bit immediate, but 128 doesn't, hence using add $-128 to subtract 128.)
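There are also dedicated instructions for toggling that bit (stac and clac, part of the SMAP extension), like there are for the carry and direction flags, but they can only be executed in kernel mode (CPL 0) and raise #UD in user space, so they don't help a normal process: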
asm("stac"); // Set AC flag
asm("clac"); // Clear AC flag
It's a good idea to have some idea when your code uses unaligned memory. It's not necessarily a good idea to change your code to avoid it in every case. Sometimes better locality from packing data closer together is more valuable.
Given that you shouldn't necessarily aim to eliminate all unaligned accesses anyway, I don't think this is the easiest way to find the ones you do have.
Modern x86 hardware has fast hardware support for unaligned loads/stores. When they don't span a cache-line boundary, or lead to store-forwarding stalls, there's literally no penalty.
What you might try is looking at performance counters for some of these events:
misalign_mem_ref.loads [Speculative cache line split load uops dispatched to L1 cache]
misalign_mem_ref.stores [Speculative cache line split STA uops dispatched to L1 cache]
ld_blocks.store_forward [This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load.
The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceding smaller uncompleted store.
See the table of not supported store forwards in the Intel® 64 and IA-32 Architectures Optimization Reference Manual.
The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.]
(from ocperf.py list output on my Sandybridge CPU).
There are probably other ways to detect unaligned memory access. Maybe valgrind? I searched on valgrind detect unaligned and found this mailing list discussion from 13 years ago. Probably still not implemented.
The hand-optimized library functions do use unaligned accesses because it's the fastest way for them to get their job done. e.g. copying bytes 6 to 13 of a string to somewhere else can and should be done with just a single 8-byte load/store.
So yes, you would need special slow&safe versions of library functions.
If your code would have to execute extra instructions to avoid using unaligned loads, it's often not worth it. Esp. if the input is usually aligned, having a loop that does the first up-to-alignment-boundary elements before starting the main loop may just slow things down. In the aligned case, everything works optimally, with no overhead of checking alignment. In the unaligned case, things might work a few percent slower, but as long as the unaligned cases are rare, it's not worth avoiding them.
Esp. if it's not SSE code, since non-AVX legacy SSE can only fold loads into memory operands for ALU instructions when alignment is guaranteed.
The benefit of having good-enough hardware support for unaligned memory ops is that software can be faster in the aligned case. It can leave alignment-handling to hardware, instead of running extra instructions to handle pointers that are probably aligned. (Linus Torvalds had some interesting posts about this on the http://realworldtech.com/ forums, but they're not searchable so I can't find them.)
You're not going to like it, but there is only one answer: don't link against the standard libraries. By changing that setting you have changed the ABI and the standard library doesn't like it. memcpy and friends are hand-written assembly so it's not a matter of compiler options to convince the compiler to do something else.

Arduino: need assistance in understanding <keyboard.h> library

I have a Leonardo/Micro device that should emulate a keyboard.
I would like to modify the library. The reason is that I would like to be able to send raw scancodes, whereas the library does some preparation.
I looked at the source code, including that of the HID library, but have difficulty understanding the following points:
Keyboard_::begin() and Keyboard_::end() are supposed to start and stop keyboard emulation, but they have empty bodies; https://www.arduino.cc/en/Reference/KeyboardBegin
KeyReport is especially mysterious:
What exactly happens to the KeyReport? I lost track in the USB_Send function in HID.cpp and couldn't find where it comes from.
What are modifiers, and what are they doing?
Is the number of keys sent limited to 6, or could it theoretically be arbitrary?
I will try to answer your questions the best I can. Let me know if you still have questions:
Keyboard_::begin() and Keyboard_::end() are supposed to start and stop keyboard emulation, but they have empty bodies
I believe those are just placeholders in case any initialization or cleanup would need to be done. The other libraries have the same functions (e.g. the Mouse library). I suspect they are there for consistency and just in case they are needed.
KeyReport is especially mysterious.
typedef struct
{
uint8_t modifiers;
uint8_t reserved;
uint8_t keys[6];
} KeyReport;
KeyReport is the data structure that represents the USB message sent to the host computer.
The modifiers member is an 8-bit unsigned integer that contains various flags (e.g. Left Shift, Left Ctrl, Left Alt, etc.)
The reserved member is an 8-bit unsigned integer that is not used, but must be there.
The keys member is an array of six 8-bit unsigned integers that represent the keys that are currently pressed.
What exactly happens to the keyreport? I lost track in USB_Send function in HID.cpp.
It gets sent to the host computer.
What are modifiers, what they are doing?
Some keys are “regular” keys (e.g. A, B, 1, 2, #, etc.). Other keys are modifiers (e.g. Shift, Ctrl, Alt). The modifier keys set flags in KeyReport.modifiers. For example, the Left Shift key is 0x02.
Is number of keys sent limited to 6 or, theoretically could be arbitrary?
The number of “regular” keys that can be pressed simultaneously is 6, but you could also have the modifier keys pressed (Shift, Alt, Ctrl, etc.).
FYI: I have been able to add additional keys (e.g. the numeric keypad keys) by adding new key definitions to the USBAPI.h file:
#define KEY_NUMPAD_DIVIDE 0xDC
#define KEY_NUMPAD_MULTIPLY 0xDD
#define KEY_NUMPAD_MINUS 0xDE
#define KEY_NUMPAD_PLUS 0xDF
#define KEY_NUMPAD_ENTER 0xE0
#define KEY_NUMPAD_1 0xE1
#define KEY_NUMPAD_2 0xE2
#define KEY_NUMPAD_3 0xE3
#define KEY_NUMPAD_4 0xE4
#define KEY_NUMPAD_5 0xE5
#define KEY_NUMPAD_6 0xE6
#define KEY_NUMPAD_7 0xE7
#define KEY_NUMPAD_8 0xE8
#define KEY_NUMPAD_9 0xE9
#define KEY_NUMPAD_0 0xEA
#define KEY_NUMPAD_DEL 0xEB

What is the point of using arrays of one element in ddk structures?

Here is an excerpt from ntdddisk.h
typedef struct _DISK_GEOMETRY_EX {
DISK_GEOMETRY Geometry; // Standard disk geometry: may be faked by driver.
LARGE_INTEGER DiskSize; // Must always be correct
UCHAR Data[1]; // Partition, Detect info
} DISK_GEOMETRY_EX, *PDISK_GEOMETRY_EX;
What is the point of UCHAR Data[1];? Why not just UCHAR Data; ?
And there are a lot of structures in DDK which have arrays of one element in declarations.
Thanks, that's clear now. The one thing that is not clear is the implementation of offsetof.
It's defined as
#ifdef _WIN64
#define offsetof(s,m) (size_t)( (ptrdiff_t)&(((s *)0)->m) )
#else
#define offsetof(s,m) (size_t)&(((s *)0)->m)
#endif
How this works:
((s *)0)->m ???
This
(size_t)&((DISK_GEOMETRY_EX *)0)->Data
is like
sizeof (DISK_GEOMETRY) + sizeof( LARGE_INTEGER);
But there are two additional questions:
1)
What type is this? And why should we use & for this?
((DISK_GEOMETRY_EX *)0)->Data
2) ((DISK_GEOMETRY_EX *)0)
This gives me 00000000. Is it converting to the address alignment? Is it interpreted as an address?
Very common in the winapi as well, these are variable length structures. The array is always the last element in the structure and it always includes a field that indicates the actual array size. A bitmap for example is declared that way:
typedef struct tagBITMAPINFO {
BITMAPINFOHEADER bmiHeader;
RGBQUAD bmiColors[1];
} BITMAPINFO, FAR *LPBITMAPINFO, *PBITMAPINFO;
The color table has a variable number of entries: 2 for a monochrome bitmap, 16 for a 4bpp and 256 for an 8bpp bitmap. Since the actual length of the structure varies, you cannot declare a variable of that type. The compiler won't reserve enough space for it. So you always need the free store to allocate it using code like this:
#include <stddef.h> // for offsetof() macro
....
size_t len = offsetof(BITMAPINFO, bmiColors) + 256 * sizeof(RGBQUAD);
BITMAPINFO* bmp = (BITMAPINFO*)malloc(len);
bmp->bmiHeader.biClrUsed = 256;
// etc...
//...
free(bmp);

Resources