How to compare managed and unmanaged Strings?

I have an application that uses managed System::String in the UI elements, but then calls into unmanaged (read: legacy) code for the more complex computation.
Additionally, there is no consistent encoding for the strings: managed strings can be either regular "strings" or Unicode L"strings", and unmanaged strings come in all of the char *, wchar_t *, std::string, and std::wstring varieties.
What is the best way to compare the various flavors of strings? I'm hoping that I can do this without having to implement half a dozen comparison methods like:
int compare(System::String ^ s1, char * s2);
int compare(System::String ^ s1, wchar_t * s2);
int compare(System::String ^ s1, std::string s2);
int compare(System::String ^ s1, std::wstring s2);
int compare(char * s1, System::String ^ s2);
int compare(wchar_t * s1, System::String ^ s2);
...
The primary purpose will be equality comparisons, so if those are significantly easier to do, then I would like to see those answers as well.

Here's an excellent MSDN article covering this topic in both directions:
http://msdn.microsoft.com/en-us/library/42zy2z41.aspx
And here's the Marshal class:
http://msdn.microsoft.com/en-us/library/atxe881w.aspx
With this, I would suggest defining various managed code methods that take the different types of native strings, convert them into a managed string, then compare them.
Unfortunately, I see no way to avoid enumerating the different native string types. They are literally different data types, even though they all represent what we call a string. And if the conversion goes wrong, you can get into some dangerous territory.
Also, I would drop std::string out of the running, since you can easily call c_str() to get a const char * out. The same goes for std::wstring and const wchar_t *.
Here's an example of one:
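using namespace System::Runtime::InteropServices;
// Declare this as a static member of a ref class.
static int NativeCompare(System::String ^ s1, char * s2)
{
    // Copy the native ANSI string into a managed String, then compare.
    System::String ^ ms2 = Marshal::PtrToStringAnsi(System::IntPtr(s2));
    return s1->CompareTo(ms2);
}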
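The same normalize-then-compare idea can also be sketched on the native side in portable C++. This is only an illustration, not the answer's code: the names widen and equals are made up here, and the sketch assumes the narrow strings hold plain ASCII, so widening byte by byte is a valid conversion. Because const char * converts implicitly to std::string and const wchar_t * to std::wstring, four overloads cover all four native types:

```cpp
#include <string>

// Widen a narrow string one byte at a time.
// Assumption: the input is plain ASCII; other encodings need a real conversion.
std::wstring widen(const std::string& narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char ch : narrow) {
        wide.push_back(static_cast<wchar_t>(ch));
    }
    return wide;
}

// Same-width comparisons: raw pointers convert implicitly to the string types.
bool equals(const std::wstring& a, const std::wstring& b) { return a == b; }
bool equals(const std::string& a, const std::string& b) { return a == b; }

// Mixed-width comparisons: normalize the narrow side to wide first.
bool equals(const std::wstring& a, const std::string& b) { return a == widen(b); }
bool equals(const std::string& a, const std::wstring& b) { return widen(a) == b; }
```

A call such as equals(L"abc", "abc") then resolves to the mixed-width overload without any explicit conversion at the call site.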

Related

How to convert AnsiString to std::string in C++ Builder?

I would like to ask how can I get a text input from TEdit control and cast it to std::string (not AnsiString).
For example, if I have a TEdit control with the name User, I get the text from it with the User->Text command. What I want to do is to assign that value to a std::string, for example string my_str = User->Text;.
How can I do this in C++ Builder? Is there some sort of ToString() method or something similar? I was not able to find one.

In C++Builder 2007 and earlier, TEdit::Text is an 8-bit AnsiString in the user's default ANSI locale. It is very straightforward to convert an AnsiString to a std::string: just use the AnsiString::c_str() method to get a null-terminated char* pointer to the AnsiString data, and then assign that to the std::string, eg:
std::string my_str = User->Text.c_str();
/* or:
System::AnsiString text = User->Text;
std::string my_str(text.c_str(), text.Length());
*/
If you want the std::string data to be in another character encoding, such as UTF-8, then you will have to convert the AnsiString data accordingly, such as with MultiByteToWideChar()/WideCharToMultiByte(), UTF8Encode(), etc, before assigning it to the std::string.
In C++Builder 2009 and later, TEdit::Text is a 16-bit UnicodeString in UTF-16 format. The easiest way to convert a UnicodeString to a std::string is to first convert to an AnsiStringT<CP> (where CP is the desired ANSI codepage - AnsiString uses CP=0, UTF8String uses CP=65001, etc), and then convert that to std::string, eg:
std::string my_str = AnsiString(User->Text).c_str(); // or UTF8String, etc...
/* or:
System::AnsiString text = User->Text; // or UTF8String, etc...
std::string my_str(text.c_str(), text.Length());
*/
Alternatively, in C++11 and later, you can convert the UnicodeString to a std::wstring first, and then use std::wstring_convert, eg:
#include <codecvt> // std::codecvt_utf8_utf16
#include <locale> // std::wstring_convert
#include <string>
std::wstring my_wstr = User->Text.c_str();
/* or:
System::UnicodeString text = User->Text;
std::wstring my_wstr(text.c_str(), text.Length());
*/
// System::Char may be either wchar_t or char16_t, depending
// on which platform you are compiling for...
std::string my_str = std::wstring_convert<std::codecvt_utf8_utf16<System::Char>>{}.to_bytes(my_wstr);
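For readers without the VCL at hand, here is a minimal portable sketch of the same UTF-16 to UTF-8 step. It assumes a std::u16string stands in for the UnicodeString data, and the function name utf16_to_utf8 is hypothetical. Note that std::wstring_convert is deprecated since C++17, though common standard libraries still ship it:

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Convert UTF-16 text to a UTF-8 encoded std::string.
// Both the facet and the converter are instantiated for char16_t,
// so the element types match.
std::string utf16_to_utf8(const std::u16string& text) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.to_bytes(text);
}
```

ASCII text passes through unchanged, while characters outside ASCII become multi-byte UTF-8 sequences.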

I had a lot of those to migrate from Borland to Embarcadero Rio. So I created a method to do it.
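#include <cwchar>    // std::wcslen
#include <cstring>   // memset
#include <string>
#include <windows.h> // WideCharToMultiByte
// STR_CONV_BUF_SIZE is assumed to be defined elsewhere in the project.
// Note: the result lives in a static buffer, so this is not thread-safe
// and each call overwrites the previous result.
char* __fastcall AnsiOf(const wchar_t* w)
{
    static char c[STR_CONV_BUF_SIZE];
    memset(c, 0, sizeof(c));
    // Leave the last byte untouched so the result is always null-terminated.
    WideCharToMultiByte(CP_ACP, WC_NO_BEST_FIT_CHARS, w, (int)std::wcslen(w), c, STR_CONV_BUF_SIZE - 1, NULL, NULL);
    return c;
}
std::string my_str = AnsiOf((User->Text).c_str());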

__sync_bool_compare_and_swap with different parameter types in Cython

I am using Cython for fast parallel processing of data, adding items to a shared memory linked list from multiple threads. I use __sync_bool_compare_and_swap, which provides an atomic compare and swap (CAS) operation to compare if the value was not modified (by another thread) before replacing it with a new value.
cdef extern int __sync_bool_compare_and_swap (void **ptr, void *oldval, void *newval) nogil
cdef bint firstAttempt = 1
cdef type *next = NULL
cdef type *newlink = ....
while firstAttempt or not __sync_bool_compare_and_swap(<void**> c, <void*> next, <void*> newlink):
    firstAttempt = 0
    next = c[0]
    newlink.next = next
This works very well. However, I now also want to keep track of the size of the linked list, and I want to use the same CAS function for the updates. This time it is not a pointer that needs to be updated but an int. How can I use the same external function twice in Cython, once with a void** parameter and once with an int* parameter?
EDIT
What I have in mind is two separate atomic operations, in one atomic operation I want to update the linked list, in the other I want to update the size. You can do it in C, but for Cython it means you have to reference the same external function twice with different parameters, can that be done?
CONCLUSION
The answer suggested by DavidW works. In case anyone is thinking of using a similar construction, be aware that with two separate update functions there is no guarantee that they are processed in sequence (i.e. another thread can update in between). However, if the objective is to update a cumulative value, for instance to monitor progress while multithreading or to build an aggregated result that is not used until all threads have finished, CAS does guarantee that every update is applied exactly once. Unexpectedly, gcc refuses to compile without casting to void*, so you either define separately typed versions or cast. A snippet from my code:
in some_header.h:
#define sync_bool_compare_and_swap_int __sync_bool_compare_and_swap
#define sync_bool_compare_and_swap_vp __sync_bool_compare_and_swap
in some_prog.pxd:
cdef extern from "some_header.h":
    int sync_bool_compare_and_swap_vp(void **ptr, void *oldval, void *newval) nogil
    int sync_bool_compare_and_swap_int(int *ptr, int oldval, int newval) nogil
in some_prog.pyx:
cdef void updateInt(int *value, int incr) nogil:
    cdef int previous = value[0]
    cdef int updated = previous + incr
    # Retry until no other thread changed the value between the read and the swap.
    while not sync_bool_compare_and_swap_int(value, previous, updated):
        previous = value[0]
        updated = previous + incr
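For comparison, the same two retry loops can be sketched in standard C++. This swaps in std::atomic for the GCC intrinsic, so it is an analogue of the pattern rather than the code above, and the names Node, add_atomic and push_front are made up for the example:

```cpp
#include <atomic>

struct Node {
    int value;
    Node* next;
};

// Atomically add incr to counter with a CAS retry loop; returns the new value.
int add_atomic(std::atomic<int>& counter, int incr) {
    int previous = counter.load();
    // On failure, compare_exchange_weak reloads `previous` with the current value.
    while (!counter.compare_exchange_weak(previous, previous + incr)) {
    }
    return previous + incr;
}

// Atomically push node onto the front of the list headed by head.
void push_front(std::atomic<Node*>& head, Node* node) {
    node->next = head.load();
    // On failure, node->next is refreshed with the head another thread installed.
    while (!head.compare_exchange_weak(node->next, node)) {
    }
}
```

As in the Cython version, each operation is atomic on its own, but the list update and the size update together are not.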

So the issue (as I understand it) is that __sync_bool_compare_and_swap is a compiler intrinsic rather than a function, so it doesn't really have a fixed signature: the compiler just figures it out. However, Cython demands to know the types, and because you want to use it with two different types, you have a problem.
I can't see a simpler way than resorting to a (very) small amount of C to "help" Cython. Create a header file with a bunch of #defines:
/* compare_swap.h */
#define sync_bool_compare_swap_voidp __sync_bool_compare_and_swap
#define sync_bool_compare_swap_int __sync_bool_compare_and_swap
You can then tell Cython that each of these is a separate function
cdef extern from "compare_swap.h":
    int sync_bool_compare_swap_voidp(void**, void*, void*)
    int sync_bool_compare_swap_int(int*, int, int)
At this stage you should be able to use them naturally as plain functions without any type casting (i.e. no <void**> in your code, since this tends to hide real errors). The C preprocessor will generate the code you want and all is well.
Edit: Looking at this a few years later I can see a couple of simpler ways you could probably use (untested, but I don't see why they shouldn't work). First you could use Cython's ability to map a name to a "cname" to avoid the need for an extra header:
cdef extern from *:
    int sync_bool_compare_swap_voidp "__sync_bool_compare_and_swap"(void**, void*, void*)
    int sync_bool_compare_swap_int "__sync_bool_compare_and_swap"(int*, int, int)
Second (and probably best) you could use a single generic definition (just telling Cython that it's a varargs function):
cdef extern from "compare_swap.h":
    int __sync_bool_compare_and_swap(...)
This way Cython won't try to understand the types used, and will just defer it entirely to C (which is what you want).
I wouldn't like to comment on whether it's safe for you to use two atomic operations like this, or whether that will pass through a state with dangerously inconsistent data....

How to convert BSTR string to Unsigned Char (Using com technology in the appln)

I am writing a small application that uses COM. I want to convert a BSTR string to an unsigned char array. To do this, I used the W2A() macro to convert the BSTR to a std::string and then copied the result of c_str() into an unsigned char array. The code snippet is as follows:
Send(BSTR *packet, int length)
{
std::string strPacket = W2A(*packet);
unsigned char * pBuffer = new unsigned char [strPacket.length()+1];
memset(pBuffer,0,strPacket.length()+1);
memcpy(pBuffer,strPacket.c_str(),strPacket.length()+1);
}
This works fine when packet contains a normal string. But if the packet contains a NUL character, a problem occurs: some unknown characters appear after that NUL in pBuffer after the conversion.
Can anyone please let me know how to avoid that? Or is there any other way to do it correctly?

A BSTR is a Windows API type and must be managed with the API macros or functions. If you cannot use the W2A macro because your string may have null characters inside, you will have to use a function such as WideCharToMultiByte, which converts the wide characters of the BSTR to narrow characters for a char* and takes an explicit length instead of stopping at the first null. Be sure to have the SDK documentation at hand. Alternatively, you could make your program use WCHARs.
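The difference between a NUL-terminated conversion and a length-driven one can be shown without the Windows API. The sketch below is an illustration only: copy_packet and terminated_length are hypothetical names, a plain wchar_t buffer stands in for the BSTR, and it assumes every character fits in one byte. On Windows the length would come from SysStringLen and be passed to WideCharToMultiByte as the source character count.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Copy exactly `length` wide characters into a byte buffer, embedded NULs included.
// Assumption: every character fits in one byte (values 0-255), as in a binary packet.
std::vector<unsigned char> copy_packet(const wchar_t* data, std::size_t length) {
    std::vector<unsigned char> bytes;
    bytes.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        bytes.push_back(static_cast<unsigned char>(data[i]));
    }
    return bytes;
}

// For contrast: a NUL-terminated conversion only sees the text before the first NUL.
std::size_t terminated_length(const wchar_t* data) {
    return std::wstring(data).size();
}
```

Returning a std::vector also removes the need for the manual new[] and memset in the question's code.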

Returning string from a remote server using rpcgen

I am going through an RPC tutorial and learning a few techniques in rpcgen. I understand how to add and multiply different data types using rpcgen.
But I have not found any clue as to how I could declare a function in the .x file that returns a string. Actually, I am trying to build a procedure which returns a random string (the array of random strings lives on the server).
Can anyone advise me how to proceed? It would be helpful if you could point me to any tutorial on returning strings/pointers.
Thank you in advance.

Ok, answering to the original question (more than 2 years old), the first answer is correct but a little tricky.
In your .x file, you define your structure with the string inside, having defined previously the size of the string:
typedef string str_t<255>;
struct my_result {
str_t data;
};
...
Then you invoke rpcgen on your .x file to generate client and server stubs and .xdr file:
$ rpcgen -N file.x
Now you can compile the client and server, along with any program in which you intend to use the remote functions. To do so, I followed the "rpcgen Tutorial" on Oracle's web page:
https://docs.oracle.com/cd/E19683-01/816-1435/rpcgenpguide-21470/index.html
The tricky part is that, although you defined a string of size m (an array of m characters), what rpcgen and the .xdr file create is a pointer to allocated memory. Something like this:
.h file
typedef char *str_t;
struct my_result {
int res;
str_t data;
};
typedef struct my_result my_result;
.xdr file
bool_t xdr_str_t (XDR *xdrs, str_t *objp)
{
register int32_t *buf;
if (!xdr_string (xdrs, objp, 255))
return FALSE;
return TRUE;
}
So keep in mind when using this structure on the server side that it is not a string of size m but a char pointer, for which you have to reserve memory before using it; otherwise you'll get the same error I did on execution:
Segmentation fault!
To use it on the server you can write:
static my_result response;
static char text[255];
memset(&response, '\0', sizeof(my_result));
memset(text, '\0', sizeof(text));
response.data = text;
And from there you are ready to use it wisely! :)

According to the XDR protocol specification, you can define a string type where m is the maximum length of the string in bytes:
The standard defines a string of n (numbered 0 to n -1) bytes to be the number n encoded as an unsigned integer (as described above), and followed by the n bytes of the string. Each byte must be regarded by the implementation as being 8-bit transparent data. This allows use of arbitrary character set encodings. Byte m of the string always precedes byte m +1 of the string, and byte 0 of the string always follows the string's length. If n is not a multiple of four, then the n bytes are followed by enough (0 to 3) residual zero bytes, r, to make the total byte count a multiple of four.
string object<m>;
You can then define a struct with the string type str_t as one of the variables:
typedef string str_t<255>;
struct my_result {
str_t data;
};
Then in your .x file you can define an RPC in your program which returns a struct of type my_result. Since rpcgen will give you a pointer to this struct (which I have called res), you can print the message with printf("%s\n", res->data);.
program HELLO_PROG {
version HELLO_VERSION {
my_result abc() = 1;
} = 1;
} = 1000;
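To make the wire format quoted above concrete, here is a small sketch of how a string would be laid out according to that rule. It is not what rpcgen generates (the real work is done by xdr_string); the function name encode_xdr_string is hypothetical, and the only fact added to the quote is that XDR integers are big-endian:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Lay out a string as XDR does on the wire: a 4-byte big-endian length,
// the bytes themselves, then zero padding up to a multiple of four.
std::vector<std::uint8_t> encode_xdr_string(const std::string& text) {
    const std::uint32_t n = static_cast<std::uint32_t>(text.size());
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>((n >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((n >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((n >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(n & 0xFF));
    for (char ch : text) {
        out.push_back(static_cast<std::uint8_t>(ch));
    }
    while (out.size() % 4 != 0) {
        out.push_back(0);  // residual zero bytes
    }
    return out;
}
```

A five-byte string such as "hello" therefore occupies twelve bytes: four for the length, five for the data, and three of padding.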

C++/CLI from tracking reference to (native) reference - wrapping

I need a C# interface to call some native C++ code via the CLI dialect. The C# interface uses the out attribute specifier in front of the required parameters. That translates to a % tracking reference in C++/CLI.
The method has the following signature and body (it calls another native method to do the job):
virtual void __clrcall GetMetrics(unsigned int %width, unsigned int %height, unsigned int %colourDepth, int %left, int %top) sealed
{
mRenderWindow->getMetrics(width, height, colourDepth, left, top);
}
Now the code won't compile because of a few compile time errors (all being related to not being able to convert parameter 1 from 'unsigned int' to 'unsigned int &').
As a modest C++ programmer, to me CLI is looking like Dutch to a German speaker. What can be done to make this wrapper work properly in CLI?

As was also suggested in a deleted answer, I did the obvious and used local variables to pass the relevant values around:
virtual void __clrcall GetMetrics(unsigned int %width, unsigned int %height, unsigned int %colourDepth, int %left, int %top) sealed
{
unsigned int w = width, h = height, c = colourDepth;
int l = left, t = top;
mRenderWindow->getMetrics(w, h, c, l, t);
width = w; height = h; colourDepth = c; left = l; top = t;
}
In hindsight it was fairly obvious, given how tracking references work: the values they refer to can be relocated by the garbage collector, so they are not as stable in memory as normal & references. Thus this seemed the only way reliable enough to overcome the issue. Thanks to the initial answer.

If your parameters use 'out' on the C# side, you need to define your C++/CLI parameters like this: [Out] unsigned int ^%width
Here's an example:
virtual void __clrcall GetMetrics([Out] unsigned int ^%width)
{
width = gcnew UInt32(42);
}
Then on your C# side, you'll get back 42:
ValueType vt;
cppClass.GetMetrics(out vt);
//vt == 42
In order to use the [Out] parameter on the C++/CLI side you'll need to include:
using namespace System::Runtime::InteropServices;
Hope this helps!

You can use pin_ptr so that 'width' doesn't move while native code changes it. Pinning has a cost on the managed side, but I don't think you can get around that if you want native code to access the value directly, without a local copy such as 'w'.
virtual void __clrcall GetMetrics(unsigned int %width, unsigned int %height, unsigned int %colourDepth, int %left, int %top) sealed
{
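    // Pin each tracked value so it cannot move while native code writes to it.
    pin_ptr<unsigned int> pw = &width;
    pin_ptr<unsigned int> ph = &height;
    pin_ptr<unsigned int> pc = &colourDepth;
    pin_ptr<int> pl = &left;
    pin_ptr<int> pt = &top;
    mRenderWindow->getMetrics(*pw, *ph, *pc, *pl, *pt);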
}
