How to find the current location of the program break

How to find the current location of the program break - linux

I tried adding this inside the brk system call function :
void *addr = sbrk(0);
printk("current-add-is-%p-\n", addr);
But it returned error during kernel compilation that implicit declaration of sbrk function. And I could not find where sbrk is defined!!
All I need to measure that whenever some user process tries to extended its program break address, I would know its current program break address, so that I can measure how much memory processes are requesting.
Thank you.

Looks like you are trying to do something wrong.
There is no 'sbrk' syscall, there is 'brk'. Except then it would be named sys_brk, but you have no reasons to call it. So if you want to find out how to learn the current break address, read brk's sources.
However, where exactly did you put this in if you did not happen to find brk's sources?

Add this line of code:
printf("Address of program break is %p\n", (void *)sbrk(0));
It will return a message to terminal with hex address of the program break.(e.g., 0x#### #### ####.)
If you want the address in other than hex, then use %u or similar. The use of sbrk(0) is documented in man pages (linux programmers manual).
To see documentation, type in command line: man sbrk and documentation will pop up.

Related

How to find the address of a not imported libc function when ASLR is on?

I have a 32bit elf program that I have to exploit remotely (for academic purposes).
The final goal is to spawn a shell. I have a stack that I can fill with any data I want and I can abuse one of the printf format strings. The only problem is that system/execv/execvp is not imported. The .got.plt segment is full of not-very-useful functions and I want to replace atoi with system because of how similar their signature is and the flow of the code indicates that that is the right function to replace. For the following attempts, I used IDA remote debug, so bad stack alignment and not proper format string is out of question. I wanted to make sure it is doable and apparently for me it isn't yet.
At first I tried to replace atoi#.got.plt with the unrandomized address of system. Got SIGSEGV.
Alright, it's probably because of ASLR, so let's try something else. I loaded up gdb and looked up system#0xb7deeda0 and atoi#0xb7de1250. Then I calculated the diff, which is 0xDB50. So the next time when I changed the address of atoi to system in the .got.plt segment, I actually just added diff to that value to get the address of system. Got SIGSEGV again.
My logic:
0xb7deeda0 <__libc_system>
0xb7de1250 <atoi>
diff = 0xb7deeda0 - 0xb7de1250
system#.got.plt = atoi#.got.plt + diff
example: 0x08048726 + DB50 = 0x08056276
Can anyone tell me what I did wrong and how can I jump to a "valid system()" with the help of leaking a function address from .got.plt?

Answering to my own question. Measuring the distance between functions in your
l̲o̲c̲a̲l̲ libc does not guarantee that the r̲e̲m̲o̲t̲e̲ libc will have the same alignment.
You have to find the libc version somehow, then you can get the address difference like so:
readelf -s /lib32/libc-2.19.so | grep printf
Possible ways to find the libc version if you know two addresses:
Libc binary collection
libcdb.com
pwnlib
... or you have access to the shell on the remote machine and can peek into the library with readelf yourself

Why does the sys_read system call end when it detects a new line?

I'm a beginner in assembly (using nasm). I'm learning assembly through a college course.
I'm trying to understand the behavior of the sys_read linux system call when it's invoked. Specifically, sys_read stops when it reads a new line or line feed. According to what I've been taught, this is true. This online tutorial article also affirms the fact/claim.
When sys_read detects a linefeed, control returns to the program and the users input is located at the memory address you passed in ECX.
I checked the linux programmer's manual for the sys_read call (via "man 2 read"). It does not mention the behavior when it's supposed to, right?
read() attempts to read up to count bytes from file descriptor fd
into the buffer starting at buf.
On files that support seeking, the read operation commences at the
file offset, and the file offset is incremented by the number of bytes
read. If the file offset is at or past the end of file, no bytes are
read, and read() returns zero.
If count is zero, read() may detect the errors described below. In
the absence of any errors, or if read() does not check for errors, a
read() with a count of 0 returns zero and has no other effects.
If count is greater than SSIZE_MAX, the result is unspecified.
So my question really is, why does the behavior happen? Is it a specification in the linux kernel that this should happen or is it a consequence of something else?

It's because you're reading from a POSIX tty in canonical mode (where backspace works before you press return to "submit" the line; that's all handled by the kernel's tty driver). Look up POSIX tty semantics / stty / ioctl. If you ran ./a.out < input.txt, you wouldn't see this behaviour.
Note that read() on a TTY will return without a newline if you hit control-d (the EOF tty control-sequence).
Assuming that read() reads whole lines is ok for a toy program, but don't start assuming that in anything that needs to be robust, even if you've checked that you're reading from a TTY. I forget what happens if the user pastes multiple lines of text into a terminal emulator. Quite probably they all end up in a single read() buffer.
See also my answer on a question about small read()s leaving unread data on the terminal: if you type more characters on one line than the read() buffer size, you'll need at least one more read system call to clear out the input.
As you noted, the read(2) libc function is just a thin wrapper around sys_read. The answer to this question really has nothing to do with assembly language, and is the same for systems programming in C (or any other language).
Further reading:
stty(1) man page: where you can change which control character does what.
The TTY demystified: some history, and some diagrams showing how xterm, the kernel, and the process reading from the tty all interact. And stuff about session management, and signals.
https://en.wikipedia.org/wiki/POSIX_terminal_interface#Canonical_mode_processing and related parts of that article.

This is not an attribute of the read() system call, but rather a property of termios, the terminal driver. In the default configuration, termios buffers incoming characters (i.e. what you type) until you press Enter, after which the entire line is sent to the program reading from the terminal. This is for convenience so you can edit the line before sending it off.
As Peter Cordes already said, this behaviour is not present when reading from other kinds of files (like regular files) and can be turned off by configuring termios.
What the tutorial says is garbage, please disregard it.

how can I call a system call in freebsd?

I created a syscall same as /usr/share/examples/kld/syscall/module/syscall.c with a little change in message.
I used kldload and module loaded. now I want to call the syscall.
what is this syscall number so I can call it?
or what is the way to call this syscall?

I suggest you take a look at Designing BSD rootkits, that's how I learned kernel programming on FreeBSD, there's even a section that talks all about making your own syscalls.

Well, if you check /usr/share/examples/kld/syscall directory you will see it contains a test program..... but hey, let's assume the program is not there.
Let's take a look at part of the module itself:
/*
* The offset in sysent where the syscall is allocated.
*/
static int offset = NO_SYSCALL;
[..]
case MOD_LOAD :
printf("syscall loaded at %d\n", offset);
break;
The module prints syscall number on load, so the job now is to learn how to call it... a 'freebsd call syscall' search on google...
Reveals: http://www.freebsd.cz/doc/en/books/developers-handbook/x86-system-calls.html (although arguably not something to use on amd64) and.. https://www.freebsd.org/cgi/man.cgi?query=syscall&sektion=2 - a manual page for a function which allows you to call arbitrary syscalls.
I strongly suggest you do some digging on your own. If you don't, there is absolutely no way you will be able to write any kernel code.

Using assertion in the Linux kernel

I have a question about assert() in Linux: can I use it in the kernel?
If no, what techniques do you usually use if, for example I don't want to enter NULL pointer?

The corresponding kernel macros are BUG_ON and WARN_ON. The former is for when you want to make the kernel panic and bring the system down (i.e., unrecoverable error). The latter is for when you want to log something to the kernel log (viewable via dmesg).
As #Michael says, in the kernel, you need to validate anything that comes from userspace and just handle it, whatever it is. BUG_ON and WARN_ON are to catch bugs in your own code or problems with the hardware.

One option would be to use the macro BUG_ON(). It will printk a message, and then panic() (i.e. crash) the kernel.
http://kernelnewbies.org/KernelHacking-HOWTO/Debugging_Kernel
Of course, this should only be used as an error handling strategy of last resort (just like assert)...

No. Unless you're working on the kernel core and rather on a module, you should do your best to never crash (technically, abort()) the kernel. If you don't want to use a NULL pointer, just don't do it. Check it before using it, and produce an error log if it is.
The closest thing you might want to do if you're actually handling a fatal case is the panic() function or the BUG_ON and WARN_ON macros, which will abort execution and produce diagnostic messages, a stack trace and a list of modules.

Well, dereferencing null pointer will produce an oops, which you can use to find the offending code. Now, if you want to assert() a given condition, you can use
BUG_ON(condition)
A less lethal mechanism is WARN_ON, which will produce a backtrace without crashing the kernel.

I use this macro, it uses BUG() but adds some more info I normally use for debugging, and of course you can edit it to include more info if you wish:
#define ASSERT(x) \
do { if (x) break; \
printk(KERN_EMERG "### ASSERTION FAILED %s: %s: %d: %s\n", \
__FILE__, __func__, __LINE__, #x); dump_stack(); BUG(); \
} while (0)

BUG_ON() is the appropriate approach to do it. It checks for the condition to be true and calls the macro BUG().
How BUG() handles the rest is explained very well in the following article:
http://kernelnewbies.org/FAQ/BUG

gdb break when program opens specific file

Back story: While running a program under strace I notice that '/dev/urandom' is being open'ed. I would like to know where this call is coming from (it is not part of the program itself, it is part of the system).
So, using gdb, I am trying to break (using catch syscall open) program execution when the open call is issued, so I can see a backtrace. The problem is that open is being called alot, like several hundred times so I can't narrow down the specific call that is opening /dev/urandom. How should I go about narrowing down the specific call? Is there a way to filter by arguments, and if so how do I do it for a syscall?
Any advice would be helpful -- maybe I am going about this all wrong.

GDB is a pretty powerful tool, but has a bit of a learning curve.
Basically, you want to set up a conditional breakpoint.
First use the -i flag to strace or objdump -d to find the address of the open function or more realistically something in the chain of getting there, such as in the plt.
set a breakpoint at that address (if you have debug symbols, you can use those instead, omitting the *, but I'm assuming you don't - though you may well have them for library functions if nothing else.
break * 0x080482c8
Next you need to make it conditional
(Ideally you could compare a string argument to a desired string. I wasn't getting this to work within the first few minutes of trying)
Let's hope we can assume the string is a constant somewhere in the program or one of the libraries it loads. You could look in /proc/pid/maps to get an idea of what is loaded and where, then use grep to verify the string is actually in a file, objdump -s to find it's address, and gdb to verify that you've actually found it in memory by combining the high part of the address from maps with the low part from the file. (EDIT: it's probably easier to use ldd on the executable than look in /proc/pid/maps)
Next you will need to know something about the abi of the platform you are working on, specifically how arguments are passed. I've been working on arm's lately, and that's very nice as the first few arguments just go in registers r0, r1, r2... etc. x86 is a bit less convenient - it seems they go on the stack, ie, *($esp+4), *($esp+8), *($esp+12).
So let's assume we are on an x86, and we want to check that the first argument in esp+4 equals the address we found for the constant we are trying to catch it passing. Only, esp+4 is a pointer to a char pointer. So we need to dereference it for comparison.
cond 1 *(char **)($esp+4)==0x8048514
Then you can type run and hope for the best
If you catch your breakpoint condition, and looking around with info registers and the x command to examine memory seems right, then you can use the return command to percolate back up the call stack until you find something you recognize.

(Adapted from a question edit)
Following Chris's answer, here is the process that eventually got me what I was looking for:
(I am trying to find what functions are calling the open syscall on "/dev/urandom")
use ldd on executable to find loaded libraries
grep through each lib (shell command) looking for 'urandom'
open library file in hex editor and find address of string
find out how parameters are passed in syscalls (for open, file is first parameter. on x86_64 it is passed in rdi -- your mileage may vary
now we can set the conditional breakpoint: break open if $rdi == _addr_
run program and wait for break to hit
run bt to see backtrace
After all this I find that glib's g_random_int() and g_rand_new() use urandom. Gtk+ and ORBit were calling these functions -- if anybody was curious.

Like Andre Puel said:
break open if strcmp($rdi,"/dev/urandom") == 0
Might do the job.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string