Debug Inspector showing strange characters?

I'm just getting into Go for the first time and finally got things running on my Win10 machine. I got breakpoints working inside of IntelliJ IDEA, and I'm seeing output like this in my debugger window. Those messes of Unicode chars should actually be a 24-char hex id coming from MongoDB.
My best guess is that this is a problem with mgo not properly unmarshalling ObjectId values, but this doesn't seem to be a problem for any of the devs running Linux or macOS, so maybe it's just a Windows thing?
Any input would be appreciated!

No error here. bson.ObjectId has an underlying type of string:
type ObjectId string
But it is used to store 12 "arbitrary" bytes ("arbitrary" meaning they are not meant to be interpreted as characters, and they are not a valid UTF-8 encoded sequence). For humans, it is usually displayed using the hex representation of its bytes.
Debuggers don't extend that convenience. They see it's a string, so they attempt to display it as a string (even though it's not meant to be read that way). This is not a Windows-only thing; the Atom editor with the delve debugger does the same on Linux too. Nothing to worry about.
If you print an ObjectId, the fmt package will normally call its String() method to obtain the string value to display. Debuggers do not necessarily do that.
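The same idea can be shown outside Go. Here is a minimal Python sketch (the id value below is just an example) of why 12 raw bytes look like garbage when rendered as text but read fine as hex:

raw = bytes.fromhex("507f1f77bcf86cd799439011")  # 12 arbitrary bytes, like an ObjectId
print(raw)        # b'P\x7f\x1fw\xbc\xf8l\xd7\x99C\x90\x11' - roughly what a debugger shows
print(raw.hex())  # 507f1f77bcf86cd799439011 - the human-readable 24-char hex form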

Related

Beautiful Soup - meaning of letter 'u' in documentation [duplicate]

Like in:
u'Hello'
My guess is that it indicates "Unicode", is that correct?
If so, since when has it been available?
You're right, see 3.1.3. Unicode Strings.
It's been the syntax since Python 2.0.
Python 3 made them redundant, as the default string type is Unicode. Versions 3.0 through 3.2 removed them, but they were re-added in 3.3+ for compatibility with Python 2 to aid the 2-to-3 transition.
The u in u'Some String' means that your string is a Unicode string.
Q: I'm in a terrible, awful hurry and I landed here from Google Search. I'm trying to write this data to a file, I'm getting an error, and I need the dead simplest, probably flawed, solution this second.
A: You should really read Joel's Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) essay on character sets.
Q: sry no time code pls
A: Fine. try str('Some String') or 'Some String'.encode('ascii', 'ignore'). But you should really read some of the answers and discussion on Converting a Unicode string and this excellent, excellent, primer on character encoding.
My guess is that it indicates "Unicode", is it correct?
Yes.
If so, since when is it available?
Python 2.x.
In Python 3.x the strings use Unicode by default and there's no need for the u prefix. Note: in Python 3.0-3.2, the u is a syntax error. In Python 3.3+ it's legal again to make it easier to write 2/3 compatible apps.
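A small illustration of the difference, runnable on Python 3.3+ (the literal here is just an example):

s1 = u'Hello'  # the u prefix is accepted again in Python 3.3+
s2 = 'Hello'   # but it is redundant: str is already Unicode
print(s1 == s2)            # True
print(type(s1), type(s2))  # <class 'str'> <class 'str'>
# In Python 2, u'Hello' is of type unicode while 'Hello' is of type str (bytes).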
I came here because I had funny-char-syndrome on my requests output. I thought response.text would give me a properly decoded string, but in the output I found funny double-chars where German umlauts should have been.
Turns out response.encoding was empty somehow and so response did not know how to properly decode the content and just treated it as ASCII (I guess).
My solution was to get the raw bytes with 'response.content' and manually apply decode('utf_8') to it. The result was schöne Umlaute.
The correctly decoded für vs. the improperly decoded fĂźr.
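For reference, a minimal sketch of the workaround described above (the URL is a placeholder, and requests must be installed):

import requests

response = requests.get("https://example.com/some-german-page")  # placeholder URL
# response.encoding turned out to be empty, so decode the raw bytes manually:
text = response.content.decode('utf_8')
print(text)  # umlauts such as "für" now come out correctly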
All strings meant for humans should use u"".
I found that the following mindset helps a lot when dealing with Python strings: all Python string literals should use the u"" syntax. The "" syntax is for byte strings only.
Before the bashing begins, let me explain. Most Python programs start out using "" for strings. But then they need to handle documents off the Internet, so they start using "".decode, and all of a sudden they are getting exceptions everywhere about decoding this and that, all because of the use of "" for strings. In this case, Unicode really does act like a virus and will wreak havoc.
But, if you follow my rule, you won't have this infection (because you will already be infected).
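In Python 3 terms, a rough sketch of the same point; keeping text and bytes separate makes the failure show up at the boundary instead of spreading through the program:

text = u"für"         # text meant for humans: a Unicode string
data = b"f\xc3\xbcr"  # bytes off the wire: not text yet
# text + data would raise TypeError: can only concatenate str (not "bytes") to str
print(text + data.decode('utf-8'))  # decode at the boundary, then work in Unicode only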

Cassandra client UTF-8 garbled

I use Cassandra in my project and it works well, but now I want to use cassandra-cli.bat to initialize some data (including some UTF-8 words). When I run cassandra-cli.bat and input some words, the UTF-8 words are displayed as ????. I don't know why; can you help me?
This sounds like a terminal issue, rather than specifically related to cassandra-cli. If your terminal application supports UTF8, it should just work. I'm not familiar with windows at all, but searching for 'UTF8 Windows terminal' gives a few hints on things you might try.
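A rough Python sketch of why ???? appears: when the console codepage cannot represent a character, it is replaced with a question mark (cp437 is just an example codepage here):

s = u"中文"                                  # some non-Latin text
print(s.encode('cp437', errors='replace'))  # b'??' - this codepage has no such characters
print(s.encode('utf-8'))                    # b'\xe4\xb8\xad\xe6\x96\x87' - UTF-8 can represent them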

Vb6 Printer Object Print Japanese

I want to use the printer (Windows driver) to print Japanese in a VB6 project.
My project runs in a Japanese Windows environment (the OS is originally English, with the region and language set to Japanese).
I use the Printer object to print a simple Japanese string such as "レジースター"; the code looks like:
Dim s As String
s="レジースター"
Printer.Print s
Printer.EndDoc
but the output is garbled, something like "􀀀OEvƒOEƒ|􀀀[ƒg".
Has anybody succeeded in printing Japanese with the VB6 Printer object in a Japanese-language Windows environment? Please help me.
Finally found the key, and it is simple. It's a little bit tricky and I still don't know why: just set the charset of the Printer object's font, like "Printer.Font.Charset = 128" (128 for Japanese).
Note: please pay attention to my case; my OS is English, with the language and region set to Japanese.
What confuses me is the default ANSI behavior of Windows. As we know, the default value of Printer.Font.Charset is 0, which means ANSI (if the language environment is Japanese it should use code page 932; if it is English, Windows-1252).
My OS is set to Japanese (not a purely Japanese OS; it was originally English). When I write a file in Japanese it displays Japanese correctly, but when I print with the Printer object, .Font.Charset is 0 (ANSI) and yet it still uses the original OS code page, which is weird. And when I set the system to Chinese or Korean, both languages print normally; only Japanese has this problem.
The trick that I have used for something like this is to use double StrConv() calls, one with the vbFromUnicode constant and the other with the vbToUnicode constant.
It takes a little experimenting to get right, but it should look something like this; swap the constants and/or codepage values until you get the right conversion for your system:
Dim s As String
s = "レジースター"
Dim newS As String
newS = StrConv(StrConv(s, vbFromUnicode, CodePage1), vbToUnicode, CodePage2)
Printer.Print newS
CodePageN is the Windows codepage value: 1252 for English, 932 for Japanese.
Although all strings in VB6 are Unicode (UTF-16), when it comes to interfacing with the outside world, VB6 is completely non-Unicode.
You cannot store レジースター in your project file because the file is in ANSI.
You cannot simply pass a string to a declared API function because it will undergo an automatic conversion to ANSI first. To avoid this, you have to declare string parameters as pointers.
Apparently the same happens with the Print call: the string gets converted to the current ANSI codepage before it reaches the printer driver.
You can try printing manually by creating a device context and printing on it.
Or you can search for another solution, like this one (I did not try it).
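A rough Python sketch of that ANSI conversion problem (not VB6, just the same bytes-vs-codepage idea): Shift-JIS bytes reinterpreted under Windows-1252 produce exactly this kind of ƒ/Œ garbage.

s = u"レジースター"
sjis = s.encode('cp932')                        # the ANSI (code page 932) bytes
print(sjis.decode('cp1252', errors='replace'))  # something like ƒŒƒW�[ƒXƒ^�[ - mojibake much like the printer output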

Why pointer show unicode string reversed?

I have problems with Unicode strings. My pointer to a string in Farsi (saved as Unicode, codepage 1200) returns the string reversed. Why? I know that Farsi is a right-to-left language, but this is a C/C++ matter. My pointer to a string should point to the start of the sequence as it is stored in the file.
I'm using VC++2005, standard console app.
Any help will be welcomed, I have attached screenshot and sample project.
Regards,
Juan
If the order is reversed in VC++2005, then probably it just does not handle directionality right, i.e. it displays Arabic letters left to right instead of properly obeying their inherent directionality. Such things happen in many editors and development tools. It does not as such affect the behavior of applications.
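A minimal Python sketch (not VC++) of the point above: the code units of a right-to-left string are stored in logical (typing) order, and only the rendering runs right to left.

import unicodedata

s = "سلام"  # a short Farsi/Arabic-script word
for ch in s:
    print(hex(ord(ch)), unicodedata.name(ch))
# Prints SEEN, LAM, ALEF, MEEM in that (logical) order, even though the
# rendered string reads right to left on screen.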

Xcode debugger: display long strings

While debugging a program in Xcode I have several CFStringRef variables that point to strings with lengths around the 200 character mark.
In the debugger, it only shows the value of these strings up to a certain length and then just ellipses them away. I'd really like to see the full value of the strings.
Is there some option I can configure so it doesn't terminate them at arbitrary length?
In the debugging console you can get the string value by doing something like:
(gdb) print (void)CFShow(myCFString)
or:
(gdb) po (NSString*)myCFString
Either of those will display the entire string's contents to the debugging console. It's probably the easiest way to deal with large, variable-length strings or data structures of any kind.
For more information: the print command in the debugger basically dumps some data structure to the console. You can also call any functions or whatever, but since print doesn't have access to the function declarations, you have to make sure you provide the types explicitly (as shown in the example above, with the cast), or the print command will complain.
po is a shortcut for print-object and works like print, but for Objective-C objects. It basically functions like this:
(gdb) print (const char *)[[theObject debugDescription] UTF8String]
This is really useful for examining things like NSData objects and NSArray/NSDictionary objects.
For a lot more information on debugging topics, see Technical Note TN2124 - Mac OS X Debugging Magic and (from the debugger console) you can issue the help command as well.
To display a really long string, use the method from print long string in xcode 6 debugging console:
In the lldb console, increase max-string-summary-length:
settings set target.max-string-summary-length 10000
Then print your string with the print or po command:
print my_string
If you are compiling a C++ project in Xcode, just use this command:
po string_name

Resources