I use the Boost library to implement my application. All the strings in my application's data model use wide characters (wchar_t type). But some classes in the Boost library only handle narrow characters (char type), e.g. "address boost::asio::ip::address::from_string(const char* str)". So I need to convert between std::string and std::wstring when calling the Boost functions.
Is there a performance issue due to the string conversions?
Is there a converter in Boost that converts between std::wstring and std::string with good performance?
UPDATE
Regarding the converter function. I find the code below works.
std::wstring wstr(L"Hello World");
const std::string nstr( wstr.begin(), wstr.end());
const std::wstring wstr2(nstr.begin(), nstr.end());
Adding my own research conclusion.
Regarding the performance overhead of the string conversion: I stepped through the functions above in the debugger. The conversion is implemented as a C-style cast, character by character, so the time complexity is O(L), where L is the length of the string. In my application, the strings that need to be converted are not very long, so I don't expect any noticeable performance penalty from the conversions.
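One caveat about the range-constructor conversion above: because each wchar_t is simply cast to char, it only round-trips text that is 7-bit ASCII (which is enough for something like a dotted IP address). A minimal sketch of a checked variant follows; narrow_ascii and widen_ascii are hypothetical helper names, not Boost or standard library functions.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: narrows a wide string, refusing anything outside 7-bit ASCII.
std::string narrow_ascii(const std::wstring& wide) {
    std::string out;
    out.reserve(wide.size());
    for (wchar_t wc : wide) {
        // The cast to unsigned long also catches negative values when wchar_t is signed.
        if (static_cast<unsigned long>(wc) > 0x7F) {
            throw std::invalid_argument("non-ASCII character in wide string");
        }
        out.push_back(static_cast<char>(wc));
    }
    return out;
}

// Hypothetical helper: widens a narrow string, refusing anything outside 7-bit ASCII.
std::wstring widen_ascii(const std::string& narrow) {
    std::wstring out;
    out.reserve(narrow.size());
    for (char c : narrow) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (uc > 0x7F) {
            throw std::invalid_argument("non-ASCII byte in narrow string");
        }
        out.push_back(static_cast<wchar_t>(uc));
    }
    return out;
}
```

The cost is still O(L); the only addition is one comparison per character.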
I am trying to implement Rusty wrappers for those extended attributes syscalls. If they are guaranteed to be UTF-8 encoded, then I can use the type String; otherwise I have to use OsString.
I googled a lot, but only found these two pages:
freedesktop: CommonExtendedAttributes says that:
Attribute strings should be in UTF-8 encoding.
macOS man page for setxattr(2) says that: The extended attribute names are simple NULL-terminated UTF-8 strings
This seems to tell us that the name is guaranteed to be UTF-8 encoded on macOS.
I would like to know information on as many platforms as possible since I try to cover them all in my implementation.
No, in Linux they are absolutely not guaranteed to be in UTF-8. Attribute values are not even guaranteed to be strings at all. They are just arrays of bytes with no constraints on them.
int setxattr(const char *path, const char *name,
const void *value, size_t size, int flags);
const void *value, size_t size is not a good way to pass a string to a function. const char* name is, and attribute names are indeed strings, but they are null-terminated byte strings.
Freedesktop recommendations are just that, recommendations. They don't prevent anyone from creating any attribute they want.
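Since nothing enforces the encoding, a wrapper can keep names and values as raw bytes and only check for UTF-8 at the point where text is actually needed. Here is a minimal sketch of such a check, written in C++ for illustration; is_valid_utf8 is a hypothetical helper and not part of any xattr API.

```cpp
#include <cstddef>
#include <string>

// Returns true if `bytes` is well-formed UTF-8. Rejects truncated sequences,
// overlong encodings, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& bytes) {
    // Smallest code point that legitimately needs a sequence of the given length.
    static const unsigned long min_cp[] = {0, 0x0, 0x80, 0x800, 0x10000};
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;                       // stray continuation or invalid lead byte
        if (i + len > n) return false;           // sequence cut off at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp[len]) return false;              // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += len;
    }
    return true;
}
```

A name that fails this check is still a perfectly legal attribute name on Linux, which is why a byte-oriented type is the safe default.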
I have a C++ application, and I need to convert an LPCTSTR to a wchar_t*.
Is there a function to perform this conversion?
Using Visual Studio 2008.
Thank you
From the comments, you are compiling for Unicode. In which case LPCTSTR evaluates as const wchar_t* and so no conversion is necessary. If you need a modifiable buffer, then you can allocate one and perform a memory copy. That works because the string is already encoded in UTF-16.
Since you are using C++, it makes sense to store strings in string classes rather than using raw C strings. For example, you could use std::wstring. Or you could use the MFC/ATL string classes. Exactly which of these options is best for you depends on the specifics of the rest of your code base.
LPCTSTR may either be multibyte or Unicode, determined at compile-time. WinNT.h defines it as follows
#ifdef UNICODE
typedef LPCWSTR LPCTSTR;
#else
typedef LPCSTR LPCTSTR;
#endif
meaning that, in a Unicode build, it is already composed of wchar_t, as Rup points out in a comment. So you might want to check UNICODE and use MultiByteToWideChar() if it is undefined. Of course, you'd need to know the code page the string is using, which depends on where and how it originates. The MultiByteToWideChar documentation has good code samples.
I have a C++/CX Windows 8 application and I need to do something similar to the following conversion:
String^ foo = "32";
byte bar = <the numeric value of foo>
How can I convert the number stored within the String^ into the byte type? I am lost without all of the C# magic that I normally use to achieve this!
Thanks in advance for any help on this.
You are getting into trouble by assuming that C++/CX resembles C#. That's not the case at all: it is pure C++ with just some language extensions to make dealing with WinRT types easier. This is not an appropriate use of the Platform::String type, which is not a general-purpose string class. That role is already covered by the standard C++ library. The class was intentionally crippled to discourage the usage you have in mind. This MSDN library article explains it well:
Use the Platform::String Class when you pass strings back and forth to methods in Windows Runtime classes, or when you are interacting with other Windows Runtime components across the application binary interface (ABI) boundary. The Platform::String Class provides methods for several common string operations, but it's not designed to be a full-featured string class. In your C++ module, use standard C++ string types such as wstring for any significant text processing, and then convert the final result to Platform::String^ before you pass it to or from a public interface. It's easy and efficient to convert between wstring or wchar_t* and Platform::String.
So the appropriate code ought to resemble:
#include <string>
...
std::wstring foo(L"32");
auto bar = static_cast<unsigned char>(std::stol(foo));
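Note that the static_cast silently wraps values that do not fit in a byte. If that matters, a range-checked variant might look like the following sketch; parse_byte is a hypothetical helper name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a decimal number and ensures it fits in 0..255.
unsigned char parse_byte(const std::wstring& text) {
    std::size_t consumed = 0;
    // std::stol throws std::invalid_argument if no number can be parsed.
    const long value = std::stol(text, &consumed);
    if (consumed != text.size()) {
        throw std::invalid_argument("trailing characters after number");
    }
    if (value < 0 || value > 255) {
        throw std::out_of_range("value does not fit in a byte");
    }
    return static_cast<unsigned char>(value);
}
```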
In a C++/CLI project (Visual Studio 2010), what is the best way to convert a System::String to a char* so it can be sent to a system function, and similarly convert the received char* to System::String?
Is there a faster way than using the System::Runtime::InteropServices::Marshal class?
char* or wchar_t*? const or not? If what you really need is a const wchar_t*, you can get it pretty quickly with PtrToStringChars from vcclr.h, which won't incur any copying overhead (you'll still need to pin the result, however).
Going the other way, you're probably not going to be able to significantly beat Marshal. System::String does have constructors that take pointers, though.
I have an MFC application in C++ that uses std::string and std::wstring, and frequently casts from one to the other, and a whole lot of other nonsense. I need to standardize everything to a single format, so I was wondering if I should go with CString or std::wstring.
In the application, I'll need to generate strings from a string table, work with a lot of Windows calls that require const TCHAR or wchar_t pointers, Edit Controls, and interact with a COM object's API that requires BSTR.
I also have vectors of strings, so is there any problem with a vector of CStrings?
Which one is better? What are the pros and cons of each?
Examples
BSTR to wstring
CComBSTR tstr;
wstring album;
if( (trk->get_Info((BSTR *)&tstr)) == S_OK && tstr!= NULL)
album = (wstring)tstr;
wstring to BSTR
CComBSTR tstr = path.c_str();
if(trk->set_Info(tstr) == S_OK)
return true;
String resource to wstring
CString t;
wstring url;
t.LoadString(IDS_SCRIPTURL);
url = t;
GetProfileString() returns a CString.
integer to string format:
wchar_t total[32];
swprintf_s(total, 32, L"%d", trk->getInt());
wstring tot(total);
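For the integer case specifically, a compiler with C++11 library support can skip the fixed-size buffer and use std::to_wstring. A small sketch, where format_count is a hypothetical wrapper name and the argument stands in for trk->getInt():

```cpp
#include <string>

// Hypothetical wrapper: formats an int as a wide decimal string with no manual buffer.
std::wstring format_count(int count) {
    return std::to_wstring(count);
}
```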
std::basic_string<> (or rather its specialisations) is horrible to work with; it's imo one of the major shortcomings of the STL (and I'd say C++ in general). It doesn't even know about encodings - c'mon, this is 2010. Being able to define the size of your character isn't enough, 'cause there's no way to indicate variable-size characters in a basic_string<>. Now, UTF-8 isn't nice to work with in a CString, but it's not as bad as trying to do it with basic_string. While I agree with the spirit of the above posters that a standard solution is better than the alternatives, CString is (if your project uses MFC or ATL anyway) much nicer to work with than std::string/wstring: conversions between ANSI/Unicode (through CStringA and CStringW), BSTR, loading from string table, cast operators to TCHAR (.c_str()? really?), ...
CString also has Format(), which, although not safe and somewhat ugly, is convenient. If you prefer safe formatting libraries, you'll be better off with basic_string.
Furthermore, CString has some algorithms as member functions, such as trim and split, for which you'd need the Boost string utilities to do the same on basic_string.
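For a sense of the gap, here is a minimal sketch of a trim on std::wstring using only the standard library. trim_copy is a hypothetical name, and the whitespace set is an assumption that may not match CString's Trim() exactly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `text` without leading or trailing
// spaces, tabs, carriage returns, or newlines.
std::wstring trim_copy(const std::wstring& text) {
    const wchar_t* whitespace = L" \t\r\n";
    const std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::wstring::npos) {
        return std::wstring();  // empty or all whitespace
    }
    const std::size_t last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}
```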
Vectors of CString are no problem.
Guard against a dogmatic dismissal of CString on the basis of it being Windows-only: if you use it in a Windows GUI, the application is Windows-only anyway. That being said, if there's any chance that your code will need to be cross-platform in the future, you're going to be stuck with basic_string<>.
I personally would go with CStrings in this case, since you state that you're working with BSTRs, using COM, and writing this in MFC. While wstrings would be more standards compliant, you'll run into issues with constantly converting from one to the other. Since you're working with COM and writing it in MFC, there's no real reason to worry about making it cross-platform, since no other OS has COM like Windows and MFC is already locking you into Windows.
As you noted, CStrings also have built-in functions to help load strings and convert to BSTRs and the like, all pre-made and already built to work with Windows. So while you need to standardize on one format, why not make it easier to work with?
std::wstring would be much more portable, and would benefit from a lot of existing prewritten code in the STL and in Boost. CString would probably go better with Windows APIs.
wchar_t: remember that you can get the data out of a wstring at any time by using the data() or c_str() function, so you get the needed wchar_t pointer anyway.
BSTR: use SysAllocString to create a BSTR from wstring.data().
As for platform dependence, remember that you can use std::basic_string<T> to define your own string type, based on how large you want a single character to be.
I'd go for wstring every day....