How to reset/empty a std::wstring? - visual-c++

How to reset/empty a std::wstring?
It seems that my function is delayed when it runs these lines:
std::wstring currentUrl; // <--- I declare this as global.
currentUrl = _bstr_t(url->bstrVal);
Any idea how can I resolve this?

How did you measure that delay? The only reliable way is through a profiler, and a profiler would also show you how that time was spent.
That said, assigning to a string often (unless the string can reuse its old buffer or small string optimization kicks in) involves deleting the old buffer and allocating a new buffer. And dynamic memory allocation is slow.
I don't know _bstr_t, but since the relevant assignment operators of std::wstring take either another std::wstring or a const wchar_t*, I assume this is the latter. If that is the case, the string isn't told the length of the data it is being assigned, so it first has to scan for the terminating null character to find it. For a big string that is an extra pass over the data, on top of the allocation, deallocation, and character copying, so this might be quite expensive.
You could try the assign() member function instead of the assignment operator. There is an overload of assign() that takes a const wchar_t* and a character count, so the string knows the exact buffer size beforehand.
However, as always with performance problems, you need to measure using a profiler. Guessing will not get you far.
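A minimal sketch of the two operations discussed here, assuming the caller already knows the character count. The helper names assignKnownLength and resetString are made up for illustration and plain wchar_t data stands in for the _bstr_t, so treat it as a starting point for measurement rather than a guaranteed speed-up:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy exactly `len` characters from `src` into `dst`.
// Passing the length lets assign() size the buffer without first
// scanning for the terminating null character.
void assignKnownLength(std::wstring& dst, const wchar_t* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    dst.assign(src, len);
}

// Hypothetical helper: empty the string. clear() sets the size to zero;
// whether the old buffer is kept for reuse is up to the implementation.
void resetString(std::wstring& s) {
    s.clear();
}
```

With a _bstr_t, the count would presumably come from its length() member; that part is an assumption about the asker's setup and is not exercised here.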

Related

Placing a small object at the beginning of memory block

I need to store an object describing the memory details of a memory block allocated by sbrk() at the beginning of the memory block itself.
For example:
metaData det();
void* alloc = sbrk(sizeof(det)+50000);
// some code to place det at the beginning of alloc
I am not allowed to use placement new, and I am not allowed to allocate memory using new/malloc etc.
I know that simply assigning it to the memory block would cause undefined behaviour.
I was thinking about using memcpy (I think that can cause problems, as det is not dynamically allocated).
Could assigning a pointer to the object at the beginning work (only if there's no other choice), or memcpy?
Thanks.
I am not allowed to use placement new
Placement new is the only way to place an object in an existing block of memory.
[intro.object] An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created
You cannot make a definition to refer to an existing memory region, so that's out.
There's no union, so that's also out.
A temporary object cannot be created in an existing block of memory, so that's also out.
The only remaining way is with a new expression, and of those, only placement new can refer to an existing block of memory.
So you are out of luck as far as the C++ standard goes.
However, the same problem exists with malloc. There is a ton of code out there that uses malloc without bothering to use placement new. Such code just casts the result of malloc to the target type and proceeds from there. This method works in practice, and there is no sign of it ever being broken.
metaData *det = static_cast<metaData *>(alloc);
On an unrelated note, metaData det(); declares a function, and sizeof is not applicable to functions.
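The memcpy idea from the question can be sketched as follows, under some loud assumptions: the members of MetaData are invented (the question does not show them), the type is trivially copyable, and the block is suitably aligned. Because the header is written and read back purely as bytes, the sketch does not depend on an object formally living inside the block. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical header type; the real metaData members are not shown in the question.
struct MetaData {
    std::size_t payloadSize;
    bool inUse;
};
static_assert(std::is_trivially_copyable<MetaData>::value,
              "memcpy is only appropriate for trivially copyable types");

// Writes a header at the start of `block` and returns the address just past it.
// `block` must be aligned for MetaData and hold at least
// sizeof(MetaData) + payloadSize bytes (memory from sbrk() would be passed here).
void* writeHeader(void* block, std::size_t payloadSize) {
    assert(block != nullptr);
    MetaData det{payloadSize, true};       // braces: an object, not a function declaration
    std::memcpy(block, &det, sizeof det);  // copy the header bytes into the block
    return static_cast<unsigned char*>(block) + sizeof(MetaData);
}

// Reads the header back out by value, again through memcpy.
MetaData readHeader(const void* block) {
    MetaData det;
    std::memcpy(&det, block, sizeof det);
    return det;
}
```

Copying the header in and out by value costs a few bytes per access, but it avoids holding a MetaData* into raw memory at all.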

Safely zeroing buffers after working with crypto/*

Is there a way to zero buffers containing, e.g., private keys after
using them and make sure that compilers don't delete the zeroing code as
unused? Something tells me that a simple:
copy(privateKey, make([]byte, keySize))
is not guaranteed to stay there.
Sounds like you want to prevent sensitive data remaining in memory. But have you considered the data might have been replicated, or swapped to disk?
For these reasons I use the https://github.com/awnumar/memguard package.
It provides features to destroy the data when no longer required, while keeping it safe in the meantime.
You can read about its background here: https://spacetime.dev/memory-security-go
How about checking (some of) the content of the buffer after zeroing it and passing it to another function? For example:
copy(privateKey, make([]byte, keySize))
if privateKey[0] != 0 {
// If you pass the buffer to another function,
// this check and above copy() can't be optimized away:
fmt.Println("Zeroing failed", privateKey[0])
}
To be absolutely safe, you could XOR the passed buffer content with random bytes, but if / since the zeroing is not optimized away, the if body is never reached.
You might think a very intelligent compiler might deduce the above copy() zeros privateKey[0] and thus determine the condition is always false and still optimize it away (although this is very unlikely). The solution to this is not to use make([]byte, keySize) but e.g. a slice coming from a global variable or a function argument (whose value can only be determined at runtime) so the compiler can't be smart enough to deduce the condition is going to be always false at compile time.

Writing and reading strings in a multithreaded environment

Having two threads running simultaneously can give strange behavior when writing to and reading from a variable from both threads simultaneously. It can be thread safe, but is not in every case.
Thread safe example: TThread.Terminated
The Boolean Terminated just reads FTerminated, which is set only once and since it is a Boolean, the writing process is atomic. So the value can be read in the MainThread as well as in the thread and is always thread safe to read.
My example: I have a string, which is written only once. Unlike TThread.Terminated, the writing of my string is not atomic, so the reading of it is not thread safe per se. But there may be a thread safe way in a special case: I have a situation where I just want to compare the string to another string. I only do something if they are the same (and it's not critical if they are not equal because the string is just not completely written yet). So I thought about whether this may be thread safe or not. So what happens exactly when the string is written and what may go wrong if I read the string when it's only half way written?
Steps to be done when writing a string:
Reference Count = 1:
Allocate additional memory, if new string is longer than old one
Copy Characters
Set new string length
Deallocate memory, if new string is shorter than old one
Reference Count > 1 (due to copy-on-write semantics a new string instance is needed):
Allocate memory for new string instance
Copy characters to new location and set length of the string
Locate string instance pointer to new location
Under what circumstances is it safe to read the string which is written to in just this same moment?
Reference Count = 1:
It is only (and in this case always) safe to read if the order of steps is as listed above and reading the string right before its length is set only gives the set length back (not all the allocated bytes)
Reference Count > 1:
It is only (and in this case always) safe to read if the pointer to the string is set as the last step (as setting this pointer is an atomic operation) or if length is initialized to 0 before the pointer to the string is set and the conditions for the case "Reference Count = 1" apply to the new string
Question for those with deep knowledge of this: Are my assumptions true? If yes, can I rely on this safely? Or is it such a bad idea to rely on these implementation specifics that it's not even worth thinking about all this, and I should just never read strings without protection while they are being written in another thread?
Delphi strings are "thread-safe" only in the sense that a string's reference count is guaranteed to be valid in multithreaded code.
Copy-On-Write of Delphi strings is not a threadsafe operation; if you need a multithreaded read/write access to the same string you generally should use some synchronization, otherwise you are potentially in trouble.
Example of what could happen without any lock.
String is being written: it should become bigger than it was, so new memory is allocated. But pointer is not yet modified, it points to old string.
At the same time reading thread got a pointer and began to read old string.
Context switched again to writing thread. It changed pointer, so now it is valid. Old string got refcount 0 and was immediately freed.
Context switch again: reading thread continues to process old string, but now it is accessing deallocated memory, which may easily result in an access violation.

Assign string to zmq::message_t without copying

I need to do some high performance C++ stuff and that is why I need to avoid copying data whenever possible.
Therefore I want to directly assign a string buffer to a zmq::message_t object without copying it. But there seems to be some deallocation of the string which prevents successful sending.
Here is the piece of code:
for (pair<int, string> msg : l) {
comm_out.send_int(msg.first);
comm_out.send_int(t_id);
int size = msg.second.size();
zmq::message_t m((void *) std::move(msg.second).data(), size, NULL, NULL);
comm_out.send_frame_msg(m, false); // some zmq-wrapper class
}
How can I avoid that the string is deallocated before the message is sent out? And when is the string deallocated exactly?
Regards
I think that zmq::message_t m((void *) std::move(msg.second).data()... is probably undefined behaviour, but is certainly the cause of your problem. In this instance, std::move isn't doing what I suspect you think it does.
The call to std::move does not move anything by itself; it only casts msg.second to an rvalue, so data() still returns a pointer into msg.second's own buffer. The 0MQ code assumes that pointer stays valid, but msg is a per-iteration copy that is destroyed at the end of the loop body, and with no free function supplied, 0MQ has no way to keep the buffer alive until it has finished with it, which can be after send_frame_msg returns.
Zero-copy is a complicated matter in 0MQ (see the 0MQ Guide for more details), but you have to ensure that the data that hasn't been copied is valid until 0MQ tells you explicitly that it's finished with it.
Using C++ strings in this situation is hard, and requires a lot of thought. Your question about how to "avoid that the string is deallocated..." goes right to the heart of the issue. The only answer to that is "with great care".
In short, are you sure you need zero-copy at all?
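If zero-copy really is needed, one common ownership pattern is to move the string onto the heap and let the message's free callback delete it. The following is a library-free sketch of that pattern only: BorrowingMessage, releaseString, toHeap, and g_released are hypothetical names, and BorrowingMessage merely imitates a message that calls a (data, hint) callback when it is done with the buffer. It is not the 0MQ API itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Counts releases so the sketch can be checked; not needed in real code.
static int g_released = 0;

// Same shape as a zero-copy free callback: it receives the data
// pointer and a caller-supplied hint.
using FreeFn = void(void* data, void* hint);

// The hint is the heap-allocated std::string that owns the buffer.
void releaseString(void* /*data*/, void* hint) {
    delete static_cast<std::string*>(hint);
    ++g_released;
}

// Hypothetical stand-in for a zero-copy message: it borrows `data`
// and calls freeFn(data, hint) once it no longer needs the buffer.
class BorrowingMessage {
public:
    BorrowingMessage(void* data, std::size_t size, FreeFn* freeFn, void* hint)
        : data_(data), size_(size), freeFn_(freeFn), hint_(hint) {
        assert(data != nullptr || size == 0);
    }
    ~BorrowingMessage() {
        if (freeFn_ != nullptr) freeFn_(data_, hint_);
    }
    BorrowingMessage(const BorrowingMessage&) = delete;
    BorrowingMessage& operator=(const BorrowingMessage&) = delete;

    const char* data() const { return static_cast<const char*>(data_); }
    std::size_t size() const { return size_; }

private:
    void* data_;
    std::size_t size_;
    FreeFn* freeFn_;
    void* hint_;
};

// Moves `text` onto the heap so its buffer outlives the caller's scope.
std::string* toHeap(std::string text) {
    return new std::string(std::move(text));
}
```

In the loop from the question, the equivalent would presumably be to pass such a callback and the heap string in place of the two NULL arguments; that mapping is an assumption and has not been tested against 0MQ here.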

What's the best way to send QStrings in a function call?

I would like to know the most efficient and practical way of sending a QString as a parameter to a function, in Qt more specifically. I want to use a reference. The problem is that I also want to instantiate that string in the function call itself, like so for example:
This is the function prototype:
void myFunction(QString & theMsg);
This is the function call:
myFunction(tr("Hello StringWorld"));
Now the function tr() returns a QString, but it doesn't work with a reference (I can see why).
I have to do this:
QString theQstr("Hello StringWorld");
myFunction(theQstr);
Is there a simpler way to do this while still using references or could I just change the function parameter to use a QString and it would still be efficient?
QString uses COW (Copy On Write) behind the scenes, so the actual string isn't copied even if you use a signature like this:
void myFunction(QString theMsg)
(until you modify it that is).
If you absolutely want a reference I would use a const& unless you plan to modify the input argument.
void myFunction(QString const& theMsg)
The most efficient and practical way is using a const reference. The QString COW will be slower than pass by reference but faster than a regular copy.
When you pass a QString, the call site has to call the QString copy ctor and then a dtor: both imply an atomic reference counting operation (not negligible), and some more generated assembly code. Hence slower and bigger code (I don't mention here the less common std::move scenario).
On the other hand, when you pass a const QString&, on the callee side there is a double indirection to access the characters: a pointer to a pointer. Hence this is slower than passing a QString, especially if the QString parameter is used heavily.
I would recommend always passing a const QString&, and if you need maximum speed on the callee side, making a QString copy there and accessing this local copy to avoid the double indirection (faster, less generated code).
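The reference-binding rule behind the original problem is ordinary C++ rather than anything Qt-specific, so it can be sketched without Qt. In this sketch std::string stands in for QString and makeGreeting stands in for tr(); both substitutions, and all function names, are assumptions made so that the example compiles on its own.

```cpp
#include <cstddef>
#include <string>

// Stand-in for tr(): returns a temporary string by value.
std::string makeGreeting() {
    return "Hello StringWorld";
}

// Non-const reference: binds only to a named, modifiable string.
void appendMark(std::string& msg) {
    msg += '!';
}

// Const reference: binds to named strings and temporaries alike,
// without copying the characters.
std::size_t messageLength(const std::string& msg) {
    return msg.size();
}
```

A call such as messageLength(makeGreeting()) compiles, whereas appendMark(makeGreeting()) is rejected because a temporary cannot bind to a non-const lvalue reference. That is why changing the prototype to take a const reference makes the one-line call from the question work.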
