Assign string to zmq::message_t without copying

Assign string to zmq::message_t without copying - string

I need to do some high performance c++ stuff and that is why I need to avoid copying data whenever possible.
Therefore I want to directly assign a string buffer to a zmq::message_t object without copying it. But there seems to be some deallocation of the string which avoids successful sending.
Here is the piece of code:
for (pair<int, string> msg : l) {
comm_out.send_int(msg.first);
comm_out.send_int(t_id);
int size = msg.second.size();
zmq::message_t m((void *) std::move(msg.second).data(), size, NULL, NULL);
comm_out.send_frame_msg(m, false); // some zmq-wrapper class
}
How can I avoid that the string is deallocated before the message is send out? And when is the string deallocated exactly?
Regards

I think that zmq::message_t m((void *) std::move(msg.second).data()... is probably undefined behaviour, but is certainly the cause of your problem. In this instance, std::move isn't doing what I suspect you think it does.
The call to std::move is effectively creating an anonymous temporary of a string, moving the contents of msg.second into it, then passing a pointer to that temporary data into the message_t constructor. The 0MQ code assumes that pointer is valid, but the temporary object is destroyed after the constructor of message_t completes - i.e. before you call send_frame.
Zero-copy is a complicated matter in 0mq (see the 0MQ Guide) for more details, but you have to ensure that the data that hasn't been copied is valid until 0MQ tells you explicitly that it's finished with it.
Using C++ strings in this situation is hard, and requires a lot of thought. Your question about how to "avoid that the string is deallocated..." goes right to the heart of the issue. The only answer to that is "with great care".
In short, are you sure you need zero-copy at all?

Related

Analyzing a crash from a rust-generated C library

I needed to call Rust code from my Go code. Then I used C as my interface. I did this:
I've created a Rust library that takes a CStr as a parameter, and returns a new processed string back as CStr.
This code is statically compiled to a static C library my_lib.a.
Then, this library is statically linked with my Go code, that then calls the library API using CGo (Go's representation to C String, just like Rusts's Cstr).
The final Go binary is sitting inside a docker container in my kubernetes. Now, my problem is that is some cases where the library's API is called, my pod (container) is crashing. Due to the nature of using CStr and his friends, I must use unsafe scopes in some places, and I highly suspect a segfault that is caused by one of the ptrs used in my code, but I have no way of communicating the error back to the Go code that could be then printed OR alternatively get some sort of a core dump from Rust/C so I can point out the problematic code. The pod just crashes with no info whatsoever, at least to my knowledge..
So my question is, how can I:
Recover from panic/crashes that happen inside an unsafe code? or maybe wrap it with a recoverable safe scope?
Override the SIG handlers so I can at least "catch" the errors and not crash? So I can debug it.
Perhaps communicate a signal interruption that was caused in my c-lib that was generated off Rust back to the caller?
I realize that once Rust is compiled to a c-library, it is a matter of C, but I have no idea how to tackle this one.
Thanks!

I've created a Rust library that takes a CStr as a parameter, and returns a new processed string back as CStr.
Neither operation seems OK:
the CStr documentation specifically notes that CStr is not repr(C)
CStr is a borrowed string, a "new processed string" would have to be owned (so a CString, which also isn't repr(C)).
Due to the nature of using CStr and his friends, I must use unsafe scopes in some places, and I highly suspect a segfault that is caused by one of the ptrs used in my code, but I have no way of communicating the error back to the Go code that could be then printed OR alternatively get some sort of a core dump from Rust/C so I can point out the problematic code. [...] Recover from panic/crashes that happen inside an unsafe code? or maybe wrap it with a recoverable safe scope?
If you're segfaulting there's no panic or crash which Rust can catch or manipulate in any way. A segfault means the OS itself makes your program go away. However you should have a core dump the usual way, this might be a configuration issue with your container thing.
Override the SIG handlers so I can at least "catch" the errors and not crash? So I can debug it.
You can certainly try to handle SIGSEGV, but after a SIGSEGV I'd expect the program state to be completely hosed, this is not an innocuous signal.

Placing a small object at the beggining of memory block

I need to store a object describing the memory details of a memory block allocated by sbrk(), at the beggining of the memory block itself.
for example:
metaData det();
void* alloc = sbrk(sizeof(det)+50000);
//a code piece to locate det at the beggining of alocate.
I am not allowed to use placement new, And not allowed to allocate memory using new/malloc etc.
I know that simply assigning it to the memory block would cause undefined behaviour.
I was thinking about using memcpy (i think that can cause problems as det is not dynamicly allocated).
Could assigning a pointer to the object at the beginning work (only if theres no other choise), or memcpy?
thanks.

I am not allowed to use placement new
Placement new is the only way to place an object in an existing block of memory.
[intro.object] An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created
You cannot make a definition to refer to an existing memory region, so that's out.
There's no union, so that's also out.
A temporary object cannot be created in an existing block of memory, so that's also out.
The only remaining way is with a new expression, and of those, only placement new can refer to an existing block of memory.
So you are out of luck as far as the C++ standard goes.
However the same problem exists with malloc. There are tons of code out there that use malloc without bothering to use placement new. Such code just casts the result of malloc to the target type and proceeds from there. This method works in practice, and there is no sign of it being ever broken.
metaData *det = static_cast<metaData *>(alloc);
On an unrelated note, metaData det(); declares a function, and sizeof is not applicable to functions.

Safely zeroing buffers after working with crypto/*

Is there a way to zero buffers containing e. g. private keys after
using them and make sure that compilers don't delete the zeroing code as
unused? Something tells me that a simple:
copy(privateKey, make([]byte, keySize))
Is not guaranteed to stay there.

Sounds like you want to prevent sensitive data remaining in memory. But have you considered the data might have been replicated, or swapped to disk?
For these reasons I use the https://github.com/awnumar/memguard package.
It provides features to destroy the data when no longer required, while keeping it safe in the mean time.
You can read about its background here; https://spacetime.dev/memory-security-go

How about checking (some of) the content of the buffer after zeroing it and passing it to another function? For example:
copy(privateKey, make([]byte, keySize))
if privateKey[0] != 0 {
// If you pass the buffer to another function,
// this check and above copy() can't be optimized away:
fmt.Println("Zeroing failed", privateKey[0])
}
To be absolutely safe, you could XOR the passed buffer content with random bytes, but if / since the zeroing is not optimized away, the if body is never reached.
You might think a very intelligent compiler might deduce the above copy() zeros privateKey[0] and thus determine the condition is always false and still optimize it away (although this is very unlikely). The solution to this is not to use make([]byte, keySize) but e.g. a slice coming from a global variable or a function argument (whose value can only be determined at runtime) so the compiler can't be smart enough to deduce the condition is going to be always false at compile time.

Are objects accessed indirectly in D?

As I've read all objects in D are fully location independent. How this requirement is achieved?
One thing that comes to my mind, is that all references are not pointers to the objects, but to some proxy, so when you move object (in memory) you just update that proxy, not all references used in program.
But this is just my guess. How it is done in D for real?

edit: bottom line up front, no proxy object, objects are referenced directly through regular pointers. /edit
structs aren't allowed to keep a pointer to themselves, so if they get copied, they should continue to just work. This isn't strictly enforced by the language though:
struct S {
S* lol;
void beBad() {
lol = &this; // this compiler will allow this....
}
}
S pain() {
S s;
s.beBad();
return s;
}
void main() {
S s;
s = pain();
assert(s.lol !is &s); // but it will also move the object without notice!
}
(EDIT: actually, I guess you could use a postblit to update internal pointers, so it isn't quite without notice. If you're careful enough, you could make it work, but then again, if you're careful enough, you can shoot between your toes without hitting your foot too. EDIT2: Actually no, the compiler/runtime is still allowed to move it without even calling the postblit. One example of where this happens is if it copies a stack frame to the heap to make a closure. The struct data is moved to a new address without being informed. So yeah. /edit)
And actually, that assert isn't guaranteed to pass, the compiler might choose to call pain straight on the local object declared in main, so the pointer would work (though I'm not able to force this optimization here for a demo, generally, when you return a struct from a function, it is actually done via a hidden pointer the caller passes - the caller says "put the return value right here" thus avoiding a copy/move in some cases).
But anyway, the point just is that the compiler is free to copy or not to copy a struct at its leisure, so if you do keep the address of this around in it, it may become invalid without notice; keeping that pointer is not a compile error, but it is undefined behavior.
The situation is different with classes. Classes are allowed to keep references to this internally since a class is (in theory, realized by the garbage collector implementation)) an independent object with an infinite lifetime. While it may be moved (such as be a moving GC (not implemented in D today)), if it is moved, all references to it, internal and external, would also be required to be updated.
So classes can't have the memory pulled out from under them like structs can (unless you the programmer take matters into your own hands and bypass the GC...)
The location independent thing I'm pretty sure is referring only to structs and only to the rule that they can't have pointers to themselves. There's no magic done with references or pointers - they indeed work with memory addresses, no proxy objects.

c++ multi threading - lock one pointer assignment?

I have a method as below
SomeStruct* abc;
void NullABC()
{
abc = NULL;
}
This is just example and not very interesting.
Many thread could call this method at the same time.
Do I need to lock "abc = NULL" line?
I think it is just pointer so it could be done in one shot and there isn't really need for it but just wanted to make sure.
Thanks

It depends on the platform on which you are running. On many platforms, as long as abc is correctly aligned, the write will be atomic.
However, if your platform does not have such a guarantee, you need to synchronize access to the variable, using a lock, an atomic variable, or an interlocked operation.

No you do not need a lock, at least not on x86. A memory barrier is required in may real world situations though, and locking is one way to get this (the other would be an explicit barrier). You may also consider using an interlocked operation, like VisualC's InterlockedExchangePointer if you need access to the original pointer. There are equivalent intrinsics supported by most compilers.

If no other threads are ever using abc for any other purpose, then the code as shown is fine... but of course it's a bit silly to have a pointer that never gets used except to set it to NULL.
If there is some other code somewhere that does something like this, OTOH:
if (abc != NULL)
{
abc->DoSomething();
}
Then in this case both the code that uses the abc pointer (above) and the code that changes it (that you posted) needed to lock a mutex before accessing (abc). Otherwise the code above risks crashing if the value of abc gets set to NULL after the if statement but before the DoSomething() call.
A borderline case would be if the other code does this:
SomeStruct * my_abc = abc;
if (my_abc != NULL)
{
my_abc->DoSomething();
}
That will probably work, because at the time the abc pointer's value is copied over to my_abc, the value of abc is either NULL or it isn't... and my_abc is a local variable, so other thread's won't be able to change it before DoSomething() is called. The above could theoretically break on some platforms where copying of pointers isn't atomic though (in which case my_abc might end up being an invalid pointer, with have of abc's bits and half NULL bits)... but common/PC hardware will copy pointers atomically, so it shouldn't be an issue there. It might be worthwhile to use a Mutex anyway just to for paranoia's sake though.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string