Why is using tinyxml2-ex::text returning corrupted text? - visual-c++

I am trying to use the tinyxml2-ex library to read some XML data.
When I try using its specific API call:
const CString strNameToUse(tinyxml2::text(pAssign).c_str());
The resulting string loses things like accents. In the end I have reverted to my original approach with the UTF8 handling:
const CString strNameToUse(CA2CT(pAssign->GetText(), CP_UTF8));
This works fine. Does anyone know why the tinyxml2-ex::text approach fails? Note that it is permissible to use the tinyxml2 namespace.
The referred to library is using std::string and does it like this:
// helper function to get element text as a string, blank if none
inline std::string text (const XMLElement * element)
{
    if (!element)
        throw XmlException ("null element"s);
    if (auto value = element -> GetText())
        return std::string (value);
    else
        return ""s;
}

The library author explained (in a GitHub discussion):
It's because tixml2ex::text (see line 465 in tixml2ex.h) does this:
if (auto value = element -> GetText())
return std::string (value);
which will corrupt any string containing characters outside ASCII 127.
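To see where the accents can get lost, here is a small portable sketch. It is not the library's code and it does not use MFC; the names widen_bytewise and decode_utf8 are made up for illustration. It assumes GetText() hands back UTF-8, which the working CA2CT(..., CP_UTF8) call suggests. Copying those bytes into a std::string keeps them intact; as far as I can tell, the damage appears when the bytes are later widened one at a time (roughly what a CString built from a char* does in a Unicode build, via the ANSI code page) instead of being decoded as UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Naive widening: every byte becomes one character. This mimics reading
// UTF-8 bytes as if they were a single-byte (ANSI-style) encoding.
std::u32string widen_bytewise(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) out.push_back(b);
    return out;
}

// Minimal UTF-8 decoder. Assumption: the input is well formed
// (no validation of continuation bytes or overlong forms).
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else                         { cp = b & 0x07; extra = 3; }
        assert(i + extra < bytes.size());  // sequence must not be truncated
        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For the UTF-8 bytes of "café", decode_utf8 gives four characters ending in U+00E9, while widen_bytewise gives five, with the accent split into two unrelated characters.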

Related

C++ override quotes

Ok, so I'm using C++ to make a library that'd help me to print lines into a console.
So, I want to override " " (the quote operators) to create a std::string instead of a string literal, to make it easier to append other data types to the string I want to output.
I've seen this done before in the wxWidgets with their wxString, but I have no idea how I can do that myself.
Is that possible and how would I go about doing it?
I've already tried using this code, but with no luck:
class PString{
    std::string operator""(const char* text, std::size_t len) {
        return std::string(text, len);
    }
};
I get this error:
error: expected suffix identifier
std::string operator""(const char* text, std::size_t len) {
^~
which, I'd assume, want me to add a suffix after the "", but I don't want that. I want to only use ""(quotes).
Thanks!
You can't use "" without defining a suffix. On its own, "" is a string literal (an array of const char). It can take a prefix (L"", u"", U"", u8"", R"()") or be followed by a suffix such as ""s or ""sv, and only suffixes can be overloaded.
wxString works through an implicit constructor, wxString::wxString(const char*), so passing "some string" to a function that expects a wxString is essentially the same as passing wxString("some string").
Overloading operator""X gives you string literals with the suffix X, as the other answer describes.
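Both options from the answers can be sketched side by side. The suffix _ps, the class PString, and the function length_of are hypothetical names chosen for this example. Note that the literal operator has to be declared at namespace scope, not as a class member, and that user-defined suffixes should start with an underscore.

```cpp
#include <cstddef>
#include <string>

// Option 1: a user-defined literal. "text"_ps yields a std::string.
std::string operator""_ps(const char* text, std::size_t len) {
    return std::string(text, len);
}

// Option 2: the wxString approach. A non-explicit constructor lets a
// plain string literal convert to PString wherever one is expected.
class PString {
public:
    PString(const char* text) : value_(text) {}
    const std::string& str() const { return value_; }
private:
    std::string value_;
};

// Accepts a PString, so callers may pass a bare literal.
std::size_t length_of(const PString& s) {
    return s.str().size();
}
```

With these in place, "abc"_ps + std::to_string(42) builds a std::string directly, and length_of("hello") compiles because the literal is converted through the PString constructor.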

How would I populate a vector with all the elements from a List of type system string while converting it to std::string?

I am trying to understand lambda functions better and would like some example of how I could add to a vector while converting System.String^ to std::string with such a Lambda example (If I am able to).
My current foreach:
List<String^>^ names = //Returning 'System.String' List from C#
for each (System::String^ name in names)
{
    std::string convertedString = msclr::interop::marshal_as< std::string >(name);
    nameObjects.push_back(MyObject(convertedString, "test"));
}
But I would like to extend it to something like this (My best guess but I am missing the logic to convert each element of "names" to a single string, this is where a Lambda would help me):
std::vector<nameObjects> testObjects{ std::begin(msclr::interop::marshal_as< std::string >(names)), std::end(msclr::interop::marshal_as< std::string >(names)) };
Alright, I figured out a way to make this work... it requires using the obscure cliext classes.
First, create a cliext::vector; there is a constructor overload that takes an IEnumerable.
cliext::vector<String^> v_names(names);
Now you can use cliext::transform() (not std::transform) to do STL-style iteration and create MyObject instances with a lambda:
std::vector<MyObject> testObjects;
cliext::transform(v_names.begin(), v_names.end(), std::back_inserter(testObjects), [](String^ name)
{
    std::string convertedString = msclr::interop::marshal_as< std::string >(name);
    return MyObject(convertedString, "test");
});
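For comparison, this is the same transform-with-a-lambda pattern in plain standard C++, without the C++/CLI parts. It is only an analogy: the MyObject struct here is a hypothetical stand-in for the asker's class, the function make_objects is invented for the example, and the input is already a std::vector<std::string> so no marshalling is needed.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the MyObject class in the question.
struct MyObject {
    std::string name;
    std::string tag;
};

// Builds one MyObject per input name, appending through back_inserter.
std::vector<MyObject> make_objects(const std::vector<std::string>& names) {
    std::vector<MyObject> out;
    out.reserve(names.size());
    std::transform(names.begin(), names.end(), std::back_inserter(out),
                   [](const std::string& name) {
                       return MyObject{name, "test"};
                   });
    return out;
}
```

The lambda plays the same role as in the cliext version: it converts one element, and the algorithm plus std::back_inserter takes care of the loop and the push_back.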

C++ : Strings, Structures and Access Violation Writing Locations

I'm trying to take a string passed into a method and assign it to a member of a structure, which I then place in a linked list. I didn't include all of the code, but I did post the constructor and the relevant parts. The code is breaking at the lines
node->title = newTitle;
node->isbn = newISBN;
So newTitle is the string input from the method that I'm trying to assign to the title member of the Book structure that node points to. Now, I'm assuming this has to do with an issue with pointers and trying to set data through them, but I can't figure out a fix/alternative.
Also, I tried using
strcpy(node->title, newTitle)
But that had an issue with converting the string into a list of chars because strcpy only uses a list of characters. Also tried a few other things, but none seemed to pan out, help with an explanation would be appreciated.
struct Book
{
    string title;
    string isbn;
    struct Book * next;
};
//class LinkedList will contain a linked list of books
class LinkedList
{
private:
    Book * head;
public:
    LinkedList();
    ~LinkedList();
    bool addElement(string title, string isbn);
    bool removeElement(string isbn);
    void printList();
};
//Constructor
//It sets head to be NULL to create an empty linked list
LinkedList::LinkedList()
{
    head = NULL;
}
//Description: Adds an element to the list in alphabetical order, unless a book with
// the same title exists, in which case it is discarded
// Returns true if added, false otherwise
bool LinkedList::addElement(string newTitle, string newISBN)
{
    struct Book *temp;
    struct Book *lastEntry = NULL;
    temp = head;
    if (temp==NULL) //If the list is empty, sets data to first entry
    {
        struct Book *node;
        node = (Book*) malloc(sizeof(Book));
        node->title = newTitle;
        node->isbn = newISBN;
        head = node;
    }
    while (temp!=NULL)
    {
        ... //Rest of Code
Note that your Book struct is already a linked list implementation, so you don't need the LinkedList class at all, or alternatively you don't need the 'next' element of the struct.
But there's no reason from the last (long) code snippet you pasted to have an error at the lines you indicated. node->title = newTitle should copy the string in newTitle to the title field of the struct. The string object is fixed size so it's not possible to overwrite any buffer and cause a seg fault.
However, there may be memory corruption from something you do further up the code, which doesn't cause an error until later on. The thing to look for is any arrays, including char[], that you might be overfilling. Another idea is you mention you save method parameters. If you copy, it's ok, but if you do something like
char* f() {
    char str[20];
    strcpy(str, "hello");
    return str;
}
...then you've got a problem. (Because str is allocated on the stack and you return only the pointer to a location that won't be valid after the function returns.) Method parameters are local variables.
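One way to avoid that dangling pointer, sketched here on the same f() example, is to return a std::string by value so the characters are copied out before the local array goes away. This is just one option; returning a heap buffer or filling a caller-supplied buffer would also work.

```cpp
#include <cstring>
#include <string>

// Returns the text by value, so the caller gets its own copy.
std::string f() {
    char str[20];
    std::strcpy(str, "hello");
    return std::string(str);  // characters are copied before str is destroyed
}
```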
The answer you seek can be found here.
In short: the memory malloc returns does not contain a properly constructed object, so you can't use it as such. Try using new / delete instead.
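A minimal sketch of the new / delete version, assuming the Book struct from the question. The helpers makeBook and destroyList are hypothetical names, not part of the asker's class. The point is that new runs the std::string constructors for title and isbn, whereas malloc only hands back raw bytes, so assigning to node->title afterwards operates on an object that was never constructed.

```cpp
#include <string>
#include <utility>

struct Book {
    std::string title;
    std::string isbn;
    Book* next = nullptr;
};

// Allocates a node with new, so both std::string members are constructed
// before anything is assigned to them.
Book* makeBook(std::string title, std::string isbn) {
    Book* node = new Book;
    node->title = std::move(title);
    node->isbn = std::move(isbn);
    return node;
}

// Frees every node with delete, which also runs the string destructors.
void destroyList(Book* head) {
    while (head != nullptr) {
        Book* next = head->next;
        delete head;
        head = next;
    }
}
```

In addElement, the malloc line would become a call like makeBook(newTitle, newISBN), and the LinkedList destructor would walk the list the way destroyList does.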

String splitting in the D language

I am learning D and trying to split strings:
import std.stdio;
import std.string;
auto file = File(path, "r");
foreach (line; file.byLine) {
string[] parts = split(line);
This fails to compile with:
Error: cannot implicitly convert expression (split(line)) of type char[][] to string[]
This works:
auto file = File(path, "r");
foreach (line; file.byLine) {
char[][] parts = split(line);
But why do I have to use a char[][]? As far as I understand the documentation, it says that split returns a string[], which I would prefer.
Use split(line.idup);
split is a template function, the return type depends on its argument. file.byLine.front returns a char[] which is also reused for performance reasons. So if you need the parts after the current loop iteration you have to do a dup or idup, whatever you need.
You can use std.stdio.lines. Depending on how you type the variable of your foreach loop, it will allocate a new buffer for every iteration or reuse the old. This way you can save the .dup/.idup.
However what type to choose depends on your use case (i.e. how long do you need the data).
foreach(string line; lines(file)) { /* new string every iteration */ }
foreach(char[] line; lines(file)) { /* reuse buffer */ }
Using ubyte instead of char will disable the UTF-8 validation.

Listing files in directory

I have created a windows form in c++ which, upon a button click, opens a dialog box for folder selection.
Now what I would like to do is get the list of files in that directory so that I can process them one by one.
I have googled it in many ways, and found many approaches that rely on external libraries (such as boost and dirent.h). I would rather not use external resources, only the default ones at my disposal.
I've read about FindFirstFile and FindNextFile, but couldn't get that combination to work.
Could you please assist?
Thanks a lot,
Idan.
Here is the updated code:
HANDLE hFind;
WIN32_FIND_DATA FindFileData;
FolderBrowserDialog^ folderBrowserDialog1 = gcnew FolderBrowserDialog;
if (folderBrowserDialog1->ShowDialog() == System::Windows::Forms::DialogResult::OK)
{
    String ^ selected = folderBrowserDialog1->SelectedPath;
    selected += "\\*";
    char* stringPointer = (char*) Marshal::StringToHGlobalAnsi(selected).ToPointer();
    hFind = FindFirstFile((LPCWSTR)stringPointer, &FindFileData);
    while(hFind != INVALID_HANDLE_VALUE)
    {
        printf("Found file: %s\r\n", FindFileData.cFileName);
        if(FindNextFile(hFind, &FindFileData) == FALSE)
            break;
    }
}
You are obviously compiling for UNICODE (wide char), since you need to cast the string pointer for the lpFileName parameter of FindFirstFile. But since you pass an ANSI string, you probably won't get a useful result. You didn't say what you expect to find.
In the code before FindFirstFile you manually convert the SelectedPath value to ANSI chars. That makes no sense when you need a wide char string anyway. Get the LPCWSTR from the String selected with the StringToHGlobalUni method. It looks something like this (not tested):
LPCWSTR stringPointer = static_cast<LPCWSTR>(Marshal::StringToHGlobalUni(selected).ToPointer());
hFind = FindFirstFile(stringPointer, &FindFileData);
In general: don't use casts except when you need to adapt to a badly designed interface, and only when you know exactly what you are doing.
Further you don't check the hFind result of FindFirstFile. It will be INVALID_HANDLE_VALUE if you pass a pointer to the wrong string format.
