nasm assembly limit of storage on the stack - linux

I am writing a program to print out a 32-bit number, and I was thinking of storing each digit on the stack to make use of its last-in-first-out behavior. This raised the question: could I store 32 digits on the stack?
My question is, how many digits of information could I store on the stack? What is the limit of the number of things I can push onto the stack? Could I store 64 digits? 128? A number of arbitrary length?
Thanks in advance,
Rileyh

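It's not actually nasm that dictates this. All nasm does is create object files, which are then linked together.
On Linux, the stack limit for the main thread is a per-process resource limit (RLIMIT_STACK) enforced by the kernel, and the stack grows on demand up to that limit. The default on most distributions is 8 MiB; ulimit -s shows the current value in KiB.
So, no, 32 bytes is not going to have a massive impact on that and, even if you did run out of stack, you could raise the limit with something like ulimit -s 16384 before starting the program. (ld's --stack option only applies to PE targets, not ELF.)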
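On Linux, the limit actually in effect for the main thread's stack can be checked at run time with getrlimit(RLIMIT_STACK). The following is a minimal C sketch of that check; the helper names stack_soft_limit and print_stack_limit are made up for illustration.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft stack limit in bytes.
   Returns 0 on success, -1 if getrlimit fails. */
int stack_soft_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;   /* may be RLIM_INFINITY ("unlimited") */
    return 0;
}

/* Print the limit in a human-readable form. */
void print_stack_limit(void)
{
    rlim_t lim;

    if (stack_soft_limit(&lim) != 0) {
        perror("getrlimit");
        return;
    }
    if (lim == RLIM_INFINITY)
        puts("stack limit: unlimited");
    else
        printf("stack limit: %llu bytes\n", (unsigned long long)lim);
}
```

Calling print_stack_limit() from main should report the same value that ulimit -s shows in the shell, only in bytes rather than KiB.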

Depends a tiny bit on the OS and a bit more on the linker you use, but you should be fine. It's common to allocate a stack of a megabyte or more by default, so 128 bytes is nothing. Just make sure you reset the stack pointer before you return, and everything should be fine.
You can typically tell the linker to allocate a stack of a certain size as well, if you find you need more than you get by default.
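For a sense of scale, here is the push-then-pop idea from the question sketched in C rather than NASM (an illustration only; the function name u32_to_decimal is invented). A 32-bit value never has more than 10 decimal digits, so the "stack" needs only 10 slots. In a 32-bit NASM program each push takes 4 bytes, so ten digits cost 40 bytes; in 64-bit code each push takes 8 bytes, so 80 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert value to decimal text in out, which must have room for
   11 bytes (up to 10 digits plus the terminating NUL).
   Returns the number of digits written. */
size_t u32_to_decimal(uint32_t value, char out[11])
{
    char stack[10];   /* a 32-bit value has at most 10 decimal digits */
    size_t top = 0;

    /* "Push" the digits, least significant first. */
    do {
        stack[top++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    /* "Pop" them again, which yields the most significant digit first. */
    size_t len = top;
    size_t i = 0;
    while (top > 0)
        out[i++] = stack[--top];
    out[i] = '\0';

    return len;
}
```

The do/while form makes sure that 0 still produces one digit.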


Can anyone explain why a NO-OP slide is used in shellcoding?

An example where NO-OP slide is a must for the exploit to work would be really helpful.
An example of when it is a must is when you want an exploit to be portable when targeting a non-ASLR enabled executable/system. Consider a local privilege escalation exploit where you return to shellcode on the stack. Because the stack holds the environment, the shellcode on the stack will be at slightly different offsets from the top of the stack when executing from within different users' shells, or on different systems. By prefixing the shellcode with, for example, 64k nop instructions, you provide a large margin of error for the stack address since your code will execute the same whether you land on the first nop or the last one.
Using nops is generally not as useful when targeting ASLR-enabled systems, since data sections will be mapped in entirely different areas of memory.
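The "margin of error" can be shown with a harmless simulation in C. This is only a model: nothing here is executed as machine code, the marker byte stands in for the start of a payload, and the names build_sled and run_from are invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define NOP          0x90   /* x86 single-byte no-op opcode */
#define PAYLOAD_MARK 0xCC   /* stand-in for the first payload byte, not real shellcode */

/* Fill buf with sled_len NOPs followed by one marker byte.
   buf must hold at least sled_len + 1 bytes. */
void build_sled(unsigned char *buf, size_t sled_len)
{
    memset(buf, NOP, sled_len);
    buf[sled_len] = PAYLOAD_MARK;
}

/* Model a CPU that starts at offset landing and steps over NOPs.
   Returns the offset of the first non-NOP byte it reaches. */
size_t run_from(const unsigned char *buf, size_t buf_len, size_t landing)
{
    size_t pc = landing;

    while (pc < buf_len && buf[pc] == NOP)
        pc++;
    return pc;
}
```

With a 64-byte sled, run_from returns 64 for every landing offset from 0 to 63, so the guessed address only has to fall somewhere inside the sled.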

What is the poisoned NUL byte, in the 1998 and 2014 editions?

I have to give a 10-minute presentation about the "poisoned null-byte (glibc)".
I searched a lot but found nothing useful. I need help, please, because Linux, memory and process management are not my strong suit.
Here is the original article, and here is an older article about the same problem in another version.
What I want is a short and simple explanation of the old and new versions of the problem, and/or good references where I can read more about this security threat.
To even begin to understand how this attack works, you will need at least a basic understanding of how a CPU works, how memory works, what the "heap" and "stack" of a process are, what pointers are, what libc is, what linked lists are, how function calls are implemented at the machine level (including calls to function pointers), what the malloc and free functions from the C library do, and so on. Hopefully you at least have some basic knowledge of C programming? (If not, you will probably not be able to complete this assignment in time.)
If you have a couple "gaps" in your knowledge of the basic topics mentioned above, hit the books and fill them in as quickly as you can. Talk to others if you need to, to make sure you understand them. Then read the following very carefully. This will not explain everything in the article you linked to, but will give you a good start. OK, ready? Let's start...
C strings are "null-terminated". That means the end of a string is marked by a zero byte. So for example, the string "abc" is represented in memory as (hex): 0x61 0x62 0x63 0x00. Notice, that 3-character string actually takes 4 bytes, due to the terminating null.
Now if you do something like this:
char *buffer = malloc(3); // not checking for error, this is just an example
strcpy(buffer, "abc");
...then that terminating null (zero byte) will go past the end of the buffer and overwrite something. We allocated a 3-byte buffer, but copied 4 bytes into it. So whatever was stored in the byte right after the end of the buffer will be replaced by a zero byte.
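Actually overflowing a malloc'd buffer is undefined behavior, so the effect is easier to observe in a well-defined model. In the C sketch below (the function name and layout are made up for illustration), the first 3 bytes of an 8-byte array play the role of the buffer and the fourth byte plays the role of whatever sits right after it.

```c
#include <string.h>

/* Copy the 4 bytes of "abc" (3 characters plus NUL) into a region whose
   first 3 bytes stand for the buffer. Returns the value of the
   neighbouring byte afterwards. */
unsigned char neighbour_after_copy(void)
{
    unsigned char memory[8];

    /* 0xAA marks bytes that belong to "someone else". */
    memset(memory, 0xAA, sizeof memory);

    /* bytes 0..2: the 3-byte "buffer"; byte 3: its neighbour */
    memcpy(memory, "abc", 4);   /* 4 bytes: 'a', 'b', 'c', '\0' */

    return memory[3];
}
```

The neighbour started out as 0xAA and comes back as 0x00: exactly one byte past the "buffer" was replaced by the terminator.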
That was what happened in __gconv_translit_find. They had a buffer, which had been allocated with enough space to append ".so", including the terminating null byte, onto the end of a string. But they copied ".so" in starting from the wrong position. They started the copy operation one byte too far to the "right", so the terminating null byte went past the end of the buffer and overwrote something.
Now, when you call malloc to get back a dynamically allocated buffer, most implementations of malloc actually store some housekeeping data right before the buffer. For example, they might store the size of the buffer. Later, when you pass that buffer to free to release the memory, so it can be reused for something else, it will find that "hidden" data right before the beginning of the buffer, and will know how many bytes of memory you are actually freeing. malloc may also "hide" other housekeeping data in the same location. (In the 2014 article you referred to, the implementation of malloc used also stored some "flag" bits there.)
The attack described in the article passed carefully crafted arguments to a command-line program, designed to trigger the buffer overflow error in __gconv_translit_find, in such a way that the terminating null byte would wipe out the "flag" bits stored by malloc -- not the flag bits for the buffer which overflowed, but those for another buffer which was allocated right after the one which overflowed. (Since malloc stores that extra housekeeping data before the beginning of an allocated buffer, and we are overrunning the previous buffer. You follow?)
The article shows a diagram, where 0x00000201 is stored right after the buffer which overflows. The overflowing null byte wipes out the bottom 1 and changes that into 0x00000200. That might not make sense at first, until you remember that x86 CPUs are little-endian -- if you don't understand what "little-endian" and "big-endian" CPUs are, look it up.
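The byte-order effect can be checked with a few lines of C. This sketch writes the little-endian layout out explicitly, so it behaves the same on any host; the function names are invented for the example.

```c
#include <stdint.h>

/* Store v into b[0..3] in little-endian order (lowest byte first). */
void store_le32(unsigned char b[4], uint32_t v)
{
    b[0] = (unsigned char)(v & 0xFF);
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Read a 32-bit little-endian value back from b[0..3]. */
uint32_t load_le32(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Model the stray NUL landing on the first byte of a 32-bit field. */
uint32_t clobber_first_byte(uint32_t field)
{
    unsigned char bytes[4];

    store_le32(bytes, field);   /* 0x00000201 is laid out as 01 02 00 00 */
    bytes[0] = 0x00;            /* the overflowing NUL hits the first byte */
    return load_le32(bytes);
}
```

clobber_first_byte(0x00000201) returns 0x00000200: the first byte in memory is the least significant one, so the NUL clears the low bits rather than the high ones.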
Later, the buffer whose flag bit was wiped out is passed to free. As it turns out, wiping out that one flag bit "confuses" free and makes it, in turn, also overwrite some other memory. (You will have to understand the implementation of malloc and free which are used by GNU libc, in order to understand why this is so.)
By carefully choosing the input arguments to the original program, you can set things up so that the memory overwritten by the "confused" free is that used for something called tls_dtor_list. This is a linked list maintained by GNU libc, which holds pointers to certain functions which it must call when the main program is exiting.
So tls_dtor_list is overwritten. The attacker has set things up just right, so that the function pointers in the overwritten tls_dtor_list will point to some code which they want to run. When the main program is exiting, some code in libc iterates over that list and calls each of the function pointers. Result: the attacker's code is executed!
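To see why control of such a list means control of execution, here is a deliberately simplified C model of a list of function pointers that gets walked at exit. The struct layout and names are invented for illustration and are not glibc's real tls_dtor_list definition.

```c
#include <stddef.h>

/* Simplified stand-in for a destructor list; NOT glibc's real layout. */
struct dtor_node {
    void (*func)(void *);     /* function to call at exit */
    void *arg;                /* argument passed to it */
    struct dtor_node *next;   /* next entry, or NULL */
};

/* Call every registered function in list order; returns how many ran. */
size_t run_dtors(struct dtor_node *head)
{
    size_t count = 0;

    for (struct dtor_node *n = head; n != NULL; n = n->next) {
        n->func(n->arg);   /* control goes wherever this pointer says */
        count++;
    }
    return count;
}

/* Example callback: increments the int it is given. */
void bump(void *arg)
{
    ++*(int *)arg;
}
```

The loop calls whatever address is stored in func without any further check, which is why an attacker who can rewrite the list gets to choose what runs.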
Now, in this case, the attacker already has access to the target system. If all they can do is run some code with the privilege level of their own account, that doesn't get them anywhere. They want to run code with root (administrator) privileges. How is that possible? It is possible because the buggy program is a setuid program, owned by root. If you don't know what "setuid" programs in Unix are, look it up and make sure you understand it, because that is also a key to the whole exploit.
This is all about the 2014 article -- I didn't look at the one from 1998. Good luck!

Portable executable stack size and operation (x64)

In this image: https://i.imgur.com/LIImg.jpg
Under the code section it lists the first instruction as subtracting 0x28 from the stack pointer. Why would it need to subtract from a stack pointer that should be 0, right? Or does it start at the top and work down? Where in the PE headers do you specify the stack size?
The stack pointer doesn't have to be 0. In fact, and as Windows uses a flat memory model, it will have some non-zero value, big enough to allow growing downwards as stack is needed.
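Subtracting a value from the stack pointer is commonly found in the standard prologue of C functions. It lets a function reserve stack space for local variables. Sometimes the compiler adds its own local variables to aid some optimizations, or to support stack-checking functions linked into the program if you chose to check for stack buffer overflows at run time. On x64 Windows, the reserved space also includes 32 bytes of shadow space for any function the code calls, plus padding to keep the stack 16-byte aligned, which is how a function with no locals ends up with 0x28.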
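The 0x28 in the question can be reproduced with a little arithmetic. The C sketch below assumes the Windows x64 calling convention (32 bytes of shadow space for callees, stack 16-byte aligned at each call) and a function that saves no extra registers; the name frame_adjust is made up.

```c
#include <stddef.h>

#define SHADOW_SPACE  32   /* bytes the Windows x64 ABI reserves for a callee's 4 register arguments */
#define RET_ADDR_SIZE 8    /* pushed by the call instruction that entered the function */

/* Smallest value to subtract from rsp so that there is room for the
   shadow space plus `locals` bytes, and rsp is 16-byte aligned again.
   Assumes no additional registers are pushed in the prologue. */
size_t frame_adjust(size_t locals)
{
    size_t size = SHADOW_SPACE + locals;

    /* On entry rsp is 8 past a 16-byte boundary because of the pushed
       return address, so size + 8 must be a multiple of 16. */
    while ((size + RET_ADDR_SIZE) % 16 != 0)
        size++;
    return size;
}
```

frame_adjust(0) gives 40, which is 0x28: 32 bytes of shadow space plus 8 bytes of padding.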
You can see the committed and reserved stack space of a PE executable by running the DUMPBIN utility on it with the /HEADERS option; the values come from the SizeOfStackReserve and SizeOfStackCommit fields of the optional header. You can change both the reserved and committed stack size using linker options (in Visual Studio).

Kernel Panic -- Failed copy_from_user, kmalloc?

I am writing a rootkit for my OS class (the teacher is okay with me asking for help here). My rootkit hooks the sys_read system call to hide "magic" ports from the user. When I copy the user buffer *buf (one of the arguments of sys_read) to kernel space (into a buffer called kbuf), I get a kernel panic/core dump. It is possible that this is just because breaking read brings the system to a halt, but I wonder if anyone has any perspective on this.
The code is available online. Look at line 207: https://github.com/joshimhoff/toykit/blob/master/toykit.c
I hooked getdents and used copy_from_user to bring the getdents structs into kernel space, and this worked well! I am not sure what is different about read.
Thanks for the help!
I figured it out. I called the actual sys_read function and didn't check the return value. Sometimes it is negative to indicate an error. Instead of failing early, I asked kmalloc for a negative number of bytes.
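The same mistake and its fix can be shown in user space. The C sketch below is not the kernel code from the question: it uses malloc in place of kmalloc, and the function name copy_result is invented. The point is that a negative return value converted to an unsigned size becomes an enormous number, so the error check has to come before the allocation.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* ret is the value returned by a read-like call: bytes read, or a
   negative number on error.
   On success, stores a malloc'd copy of src[0..ret) in *out and returns ret.
   On error or zero bytes, leaves *out as NULL and returns ret unchanged. */
ssize_t copy_result(ssize_t ret, const char *src, char **out)
{
    *out = NULL;

    /* Fail early: never size an allocation from an error code. */
    if (ret <= 0)
        return ret;

    char *copy = malloc((size_t)ret);
    if (copy == NULL)
        return -1;

    memcpy(copy, src, (size_t)ret);
    *out = copy;
    return ret;
}
```

Without the early return, a result of -1 would be converted to the largest possible size_t value and passed to the allocator.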
Imagine that. Allocating negative memory. That would be a crazy world.

Is there any profiler that works with -fomit-frame-pointer on x86_64?

SysProf doesn't properly generate call stack without it, GProf isn't accurate at all. And also, are profilers that work without -fno-omit-frame-pointer as accurate as those that rely on it?
Recent versions of Linux perf can be used (with --call-graph dwarf):
perf record -F99 --call-graph dwarf myapp
It uses .eh_frame (or .debug_frame) with libunwind to unwind the stack.
In my experience, it sometimes gets lost.
With a recent version of perf and the kernel on Haswell, you might be able to use the Last Branch Record with --call-graph lbr.
There are none that I'm aware of. With frame pointers, walking a stack is a fairly simple exercise. You simply dereference the frame pointer to find the old frame pointer, stack pointer, and instruction pointer, and repeat until you're done. Without frame pointers you cannot reliably walk a stack without additional information, which on ELF platforms generally means DWARF CFI. DWARF is fairly complex to parse, and requires you to read in a fair amount of additional information which is tricky to do in the time constraints that profilers need to work in.
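The frame-pointer walk described above is short enough to sketch in C. This is a model, not a real unwinder: it walks a chain of hand-built records instead of reading the live stack, and the struct and function names are invented. The layout mirrors the usual x86-64 arrangement, where the saved frame pointer sits at the frame pointer and the return address sits just above it.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of what sits at the frame pointer when frame pointers are kept:
   the caller's saved frame pointer, then the return address. */
struct frame {
    struct frame *prev;   /* saved frame pointer of the caller */
    uintptr_t ret_addr;   /* return address into the caller */
};

/* Walk the chain starting at fp, writing up to max return addresses
   into out. Returns the number of frames visited. */
size_t walk_frames(const struct frame *fp, uintptr_t *out, size_t max)
{
    size_t n = 0;

    while (fp != NULL && n < max) {
        out[n++] = fp->ret_addr;   /* record where this frame returns to */
        fp = fp->prev;             /* step to the caller's frame */
    }
    return n;
}
```

Each step is two loads and a comparison, which is why this is cheap enough to do at every sample. Without frame pointers there is no prev field to follow, and the unwinder has to consult CFI to find the caller's frame.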
One plausible method for implementing this would be to simply save the stack memory at every sample and then walk it offline using the CFI to unwind properly. Depending on the depth of the stack this could require quite a bit of storage, and the copying could be prohibitive. I've never heard of a profiler using this technique, but Julian Seward floated it as a potential implementation strategy for Firefox's built-in profiler.
It would be hard for most profilers to work when -fomit-frame-pointer is asserted. You probably need to not use that and to link against debugging versions of the libraries (which are almost certainly compiled without -fomit-frame-pointer) if you want to do reasonable profiling.
