Linux console output gets corrupted with ASCII characters

I am implementing a software project using C++ on Debian. When I execute the standalone binary on a Debian box, the program runs fine for at least 15-20 minutes, but after a while the console output becomes corrupted: most characters display as weird symbols, while some characters still render fine, so the output is almost unreadable. If I press CTRL+C and stop the execution, whatever I type on the command line is also displayed as weird characters. If I reboot the box and start over, everything works fine for another 15-20 minutes, then the same thing happens. Does anybody have any idea what might be going on here? The Debian box has command-line support only, no GUI.

It sounds like you are printing some unwanted characters at some point, which suggests a problem with how you manage the memory backing your strings. Try running your program under valgrind; you can follow this tutorial. You should expect warnings about reads from uninitialized memory.

I don't think you're using "ASCII" properly here. ASCII covers the range 0-127, and there's not much "weird" stuff in that range. I've seen this happen before; it's usually caused by control or escape sequences being interpreted by the terminal as display commands. I am a bit fuzzy on this -- I haven't done console stuff in a long while -- but I'm pretty sure it's caused by raw output of bytes that are actually outside the ASCII printable range.
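As an illustration of that mechanism (a hedged sketch, not a diagnosis of the original program): on a VT100-compatible terminal such as the Linux console, one stray shift-out byte (0x0E) switches the display to the alternate character set, and every ordinary letter after it renders as a line-drawing glyph until a shift-in byte (0x0F) arrives. A program that dumps uninitialized or binary data to stdout can easily emit such a byte.

```python
# Hedged demo: a single control byte "corrupts" the console.
# 0x0E (shift-out) selects the VT100 alternate character set, so the
# letters after it render as line-drawing glyphs; 0x0F (shift-in)
# switches back to normal text.
import sys

sys.stdout.buffer.write(b"before \x0e lqqqqk \x0f after\n")
sys.stdout.flush()
```

If a program dies between the shift-out and the shift-in, the shell prompt stays garbled too; typing reset (blind) restores the terminal without rebooting the box.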

Related

POP3 buffer gets translated in a strange way. Characters are bad when they shouldn't be

I've been trying to write a script for a buffer overflow attack on the SLmail service (within a legal setting, on a VPN; I am following a penetration testing course).
The issue I'm having came up while trying to determine which characters were bad. As it turns out, anything above 7F is getting mangled. I followed the exact example my textbook gave me and tried googling for similar examples, but not a single one I found ever mentioned this issue.
In theory, the only bad characters for this particular buffer should be \x00, \x0a and \x0d.
Here, everything above 7F is a mess. I get C2s and C3s appearing every other byte, while the bytes between them are translated (FF turns into BF, for example). This is rendering me completely unable to get my shellcode through. I've tried removing some characters or changing their order; no matter the order I put them in, anything above 7F comes out translated, with C2/C3s every other byte.
Link to both my script code and the memory dump resulting from it.
(The for loop is weird, I know.)
I figured it out.
I was using Python 3, which requires strings to be encoded before they are sent, and the encoding step is what mangled the high bytes.
By translating the script into Python 2.7, I no longer needed to encode them, and they went through without any mangling.
https://imgur.com/a/OOct5Z9
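For anyone hitting the same wall, here is a minimal sketch of what was happening (the variable names and the socket call are illustrative, not from the original script): Python 3 str objects hold code points, and str.encode("utf-8") turns every code point from U+0080 to U+00FF into a two-byte sequence starting with C2 or C3. Building a bytes literal sidesteps the encoding step entirely, which is the Python 3 equivalent of what the Python 2.7 rewrite achieved.

```python
# Hedged sketch: why high bytes grew C2/C3 prefixes in the py3 version.
payload = "\x41\xff\x0d"              # str of code points, as in the py3 script
print(payload.encode("utf-8").hex())  # 41c3bf0d -- FF became C3 BF on the wire

raw = b"\x41\xff\x0d"                 # bytes literal: no encoding step at all
print(raw.hex())                      # 41ff0d -- exactly the intended bytes
# sock.send(raw)                      # hypothetical send; raw bytes go out unmangled
```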

vim - Cursor randomly jumps on Linux

This only occurs when I am using vim on Linux (Kali Linux, to be precise, though I haven't tested it on other distributions). I am using a standard German keyboard layout.
Sometimes when I type in vim (it often happens when I exit insert mode or when I use :w, possibly only on one of the two, since I often do one right after the other), the cursor randomly jumps elsewhere, usually about 100 lines upwards (I don't have an exact number). At the same time, the next number in the line my cursor was in is decremented.
I suspect this happens because I hit some sequence of keys too quickly, since on my Linux distribution this can cause special characters to be inserted, with one key modifying the other. For example, if I type "yt" quickly on this keyboard, it becomes "yŧ" (a t with a second bar).
That behaviour is annoying by itself, so if someone knows a way to turn it off on Linux while still retaining the basic keyboard layout, that would solve my problem; but telling me the exact command I accidentally executed, so I can avoid or remove it, would also help.
As far as I can remember, this problem has only occurred when I was editing .tex files, but those are also what I have been using vim for the most recently, so I wouldn't assume it only happens there.
Still, I can post my list of plugins and my .vimrc if necessary. Just in case it has something to do with LaTeX files only: the only vim plugin I have for that is vimtex.
The command you are looking for is mapped to Ctrl-X by default: it decrements the next number on or after the cursor on the current line. If you want to disable it, :nnoremap <C-x> <Nop> removes the behaviour in normal mode while leaving the rest of your layout alone.

Linux Terminal Mouse Reporting -- Basic Questions

I have made a simple, mouse-controlled taskbar using a shell script. It works very well and uses rxvt-unicode to make the "graphics".
Unfortunately, when I moved this script from my netbook to my laptop, changed the size of the terminal window, and updated the code, I discovered that mouse reporting stopped working beyond column 95 (it always returns ! no matter where I click past that column).
I discovered that there is a "limit" on mouse reporting at column 95. My program now requires 123 columns, whereas before it happened to fit within 95.
I looked up the problem and found only one reference to the 95-column limit; most of what I found referred to a 223-column limit. A 223-column limit would suit me fine, but I do not understand how to switch over to it.
Basically, I do not understand enough of the problem to apply what I'm reading on Google. Usually I can do my own fishing, but this problem has me stuck.
I'm using this guide to tell me which escape sequence to use (I picked X10 click-only reporting, i.e. the escape sequence \033[?9h):
how to get MouseMove and MouseClick in bash?
I found this page, which mentions a 95-column limit, though I could make little sense of it:
Emacs, unicode, xterm mouse escape sequences, and wide terminals
I am using small code snippets, more or less based on this:
http://www.imbe.net/bizen?Linux/Mouse_In_A_Terminal
I found other examples that subtracted 255 rather than 223; my code seemed unaffected by that change.
I have solved my problem. What I never understood was that several of the mouse-reporting settings can be active at the same time. When \033[?1015h and \033[?9h were combined, I started to get mouse reporting beyond the 95-column mark. Mode 1015 is the urxvt extension, which reports the coordinates as plain decimal numbers; I believe 1005h is the corresponding extended mode for xterm. (The limits come from the original X10 encoding, which sends each coordinate as a single byte holding column + 32: a byte tops out at 255, hence the 223-column limit, and on a UTF-8 terminal bytes above the 7-bit range get reinterpreted as multi-byte sequences, which shrinks the usable range to 127 - 32 = 95.)
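For reference, a minimal sketch of the working combination (assumptions: a urxvt-compatible terminal and Python; the regex targets the 1015-style "CSI button;column;row M" reply and is an illustration, not a complete protocol handler):

```python
# Hedged sketch: enable X10 click reporting plus urxvt's 1015 decimal
# coordinate mode, then read and decode a single click.
import re
import sys
import termios
import tty

def read_click():
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)                          # byte-at-a-time input
        sys.stdout.write("\033[?9h\033[?1015h")    # X10 clicks + decimal coords
        sys.stdout.flush()
        buf = ""
        while True:
            buf += sys.stdin.read(1)
            m = re.search(r"\033\[(\d+);(\d+);(\d+)M", buf)
            if m:
                button, col, row = (int(g) for g in m.groups())
                return button - 32, col, row       # columns are not capped at 95
    finally:
        sys.stdout.write("\033[?1015l\033[?9l")    # turn reporting back off
        sys.stdout.flush()
        termios.tcsetattr(fd, termios.TCSADRAIN, old)

if __name__ == "__main__":
    print("click anywhere in the terminal...")
    print("button=%d col=%d row=%d" % read_click())
```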

Special characters display differently on Windows and Linux

I have two projects, one on Windows and one on Linux, and I use the same database (Oracle 10g) for both. I have an input file consisting of text that includes special characters (ÁTUL ÁD).
The program logic is this: read the input file data into the database. On Windows the data, including the special characters, is displayed correctly; on Linux the special characters display as other characters. As I already said, I use the same database for both of them. Could you give me some help?
The program is complex and uses the Spring Batch framework. Maybe the encoding causes the problem, but I have no idea how to solve it; I am using Linux for the first time.
Thanks in advance.
One solution I found that works for me: use UTF-8 encoding everywhere, for Windows, Linux, and the database.
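A hedged illustration of the failure mode (the original project is Java with Spring Batch rather than Python, and the mechanism sketched here is an assumption about what is going wrong): the bytes of "ÁTUL ÁD" only round-trip when every layer that touches them (file I/O, the runtime's default charset, the database's character settings) agrees on UTF-8; read the same UTF-8 bytes with a single-byte charset and mojibake appears.

```python
# Hedged sketch: the same bytes read with two different charsets.
# "Á" is U+00C1, stored in UTF-8 as the two bytes C3 81.
data = "ÁTUL ÁD".encode("utf-8")

print(data.decode("utf-8"))    # ÁTUL ÁD -- correct when the charsets agree
print(data.decode("latin-1"))  # each Á becomes 'Ã' plus a stray control
                               # byte: classic mojibake
# The analogous Java-side fix is to pass an explicit UTF-8 charset when
# reading the file (and to match it in the Oracle NLS settings) instead
# of relying on the platform default.
```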

How to determine codepage of a file (that had some codepage transformation applied to it)

For example, if I know that Ä‡ should be ć, how can I find out which codepage transformation occurred there?
It would be nice if there were an online site for this, but any tool will do the job. The final goal is to reverse the codepage transformation (with iconv or recode, but the tools are not important; I'll take anything that works, including Python scripts).
EDIT:
Could you please be a little more verbose? Do you know for certain what some substring should be exactly? Or do you know just the language? Or are you just guessing? And the transformation that was applied, was it correct (i.e. the result is valid in the other charset)? Or was it a single transformation from charset X to Y while the text was actually in Z, so it is now wrong? Or was it a series of such transformations?
Actually, ideally I am looking for a tool that will tell me what happened (or what possibly happened), so I can try to transform the text back to the proper encoding.
What I presume happened in the problem I am trying to fix now is what is described in this answer: a UTF-8 text file got opened as an ASCII text file and then exported as CSV.
It's extremely hard to do this in general. The main problem is that all the ASCII-based encodings (iso-8859-*, DOS and Windows codepages) use the same range of code points, so no particular code point or set of code points will tell you which codepage the text is in.
There is one encoding that is easy to tell apart: if the text is valid UTF-8, then it is almost certainly not iso-8859-* or any Windows codepage, because while all byte values are valid in those, the chance of a valid UTF-8 multi-byte sequence appearing in text encoded with them is almost zero.
Beyond that, it depends on which other encodings may be involved. A valid sequence in Shift-JIS or Big5 is also unlikely to be valid in any other encoding, while telling apart similar encodings like cp1250 and iso-8859-2 requires spell-checking the words that contain the three or so characters that differ and seeing which way produces fewer errors.
If you can limit the number of transformations that may have happened, it shouldn't be too hard to put together a Python script that tries them out, eliminates the obvious wrongs, and uses a spell-checker to pick the most likely; something like the sketch below. I don't know of any existing tool that does this.
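A minimal sketch of that brute-force idea, assuming the single mis-decode scenario from the question (the candidate list and the Ä‡/ć pair are just illustrations): for each candidate charset X, test whether re-encoding the damaged text as X and decoding the result as UTF-8 reproduces the expected substring.

```python
# Hedged sketch: find which charset the UTF-8 bytes were mis-decoded as.
MOJIBAKE = "Ä‡"      # the damaged substring we can see
EXPECTED = "ć"       # what we know it should have been
CANDIDATES = ["latin-1", "cp1250", "cp1252", "iso8859_2"]

for wrong in CANDIDATES:
    try:
        # Undo the damage: turn the text back into bytes via the suspected
        # wrong charset, then decode those bytes as UTF-8.
        fixed = MOJIBAKE.encode(wrong).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        continue      # this pair cannot even reproduce the damage
    if fixed == EXPECTED:
        print(f"plausible: text was UTF-8 mis-decoded as {wrong}")
# Prints cp1250 and cp1252 here -- a spell-checker pass over the whole
# file would be needed to pick between such near-identical candidates.
```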
Tools like that were quite popular a decade ago, but these days it's quite rare to see damaged text.
As far as I know, this can be done effectively at least for a particular language. If you assume the text's language is, say, Russian, you can collect statistics about characters or small groups of characters from a large number of sample texts; in English, for example, the combination "th" appears much more often than "ht".
You can then run through the different encoding combinations and choose the one whose output has the most probable text statistics.
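A hedged sketch of that statistical idea (the corpus and input file names are hypothetical, and a real implementation would want smoothed log probabilities rather than raw counts): build a bigram frequency model from sample text, then score each candidate decoding and keep the best one.

```python
# Hedged sketch: pick the decoding whose bigrams best match the language model.
from collections import Counter

def bigram_score(text, model):
    # Higher total bigram frequency = more plausible text in the language.
    return sum(model.get(text[i:i + 2], 0) for i in range(len(text) - 1))

# Hypothetical corpus in the language we suspect the text is in.
sample = open("corpus.txt", encoding="utf-8").read().lower()
model = Counter(sample[i:i + 2] for i in range(len(sample) - 1))

damaged = open("damaged.txt", "rb").read()  # hypothetical damaged input
candidates = ["utf-8", "cp1250", "cp1252", "iso8859_2", "koi8_r"]
best = max(candidates,
           key=lambda enc: bigram_score(damaged.decode(enc, "replace").lower(),
                                        model))
print("most plausible source encoding:", best)
```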
