How to search for specific instructions in an ELF binary

How do I search for a specific assembly instruction in an ELF executable?
E.g. I want to check whether the sequence mov $0,%eax occurs in my executable. Are there any tools for this? Is there a tool that lets me search for similar instructions that differ only in the register used?
E.g. it should match both mov $0,%eax and mov $0,%ecx.

I ended up using two methods for this.
The easiest was objdump. Disassemble the binary with objdump -S myexe > myexe.s and use grep for the most basic kind of search.
When my search requirements got more advanced, e.g. finding jmp instructions with pop instructions just before them, I moved from grep to Ruby regexes so I could keep track of each jmp and backtrack to look for pops.
But at times I had to scan really large binaries (100 MB), and the objdump-plus-Ruby method was slow and kept getting killed by the kernel. (My script was very basic, and bad scripting could be part of the reason, but it was still very slow.)
So I went through the code of llvm-objdump and modified it to keep track of when a particular opcode or particular operands occurred. The existing code was easy to understand and modify. The modified objdump was much faster and got the job done. I made the modifications in DisassembleObject, where there is a
for (Index = Start; Index < End; Index += Size)
main loop that goes through each instruction in the executable. Inst is the current instruction, which you can run checks on and add your own code around.
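
As a rough illustration of the objdump-plus-regex approach (not the Ruby script or the llvm-objdump patch described above), here is a minimal Python sketch. It assumes objdump's default AT&T output, where each instruction line is tab-separated into address, raw bytes, and instruction text. The names instructions, find_matching, find_pop_before_jmp and disassemble are made up for this example. Reading the disassembly as a stream, one line at a time, avoids holding the whole listing of a large binary in memory.

```python
import re
import subprocess

# "mov $0x0,%eXX" for any 32-bit general-purpose register (AT&T syntax,
# as printed by objdump; the immediate is shown in hex as $0x0).
MOV_ZERO = re.compile(r"^mov\s+\$0x0,%e(?:ax|bx|cx|dx|si|di|bp|sp)$")


def instructions(lines):
    """Yield (address, text) for each instruction line of `objdump -d` output."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            # Headers, labels, blank lines and byte-only continuation lines.
            continue
        yield parts[0].strip().rstrip(":"), parts[2].strip()


def find_matching(lines, pattern=MOV_ZERO):
    """Return every instruction whose text matches the pattern."""
    return [(addr, text) for addr, text in instructions(lines)
            if pattern.search(text)]


def find_pop_before_jmp(lines):
    """Return (pop_address, jmp_address) pairs where a pop directly precedes a jmp."""
    hits = []
    prev = None
    for addr, text in instructions(lines):
        if text.startswith("jmp") and prev and prev[1].startswith("pop"):
            hits.append((prev[0], addr))
        prev = (addr, text)
    return hits


def disassemble(path):
    """Stream `objdump -d` output line by line instead of loading it all."""
    proc = subprocess.Popen(["objdump", "-d", path],
                            stdout=subprocess.PIPE, text=True)
    yield from proc.stdout
    proc.wait()
```

Usage would look like find_matching(disassemble("myexe")). Note that the startswith("pop") check also accepts mnemonics such as popf; tighten it if that matters for your search.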

Related

How to find the address of a not imported libc function when ASLR is on?

I have a 32bit elf program that I have to exploit remotely (for academic purposes).
The final goal is to spawn a shell. I have a stack that I can fill with any data I want, and I can abuse one of the printf format strings. The only problem is that system/execv/execvp is not imported. The .got.plt section is full of not-very-useful functions, and I want to replace atoi with system because their signatures are so similar and the flow of the code indicates that it is the right function to replace. For the following attempts I used IDA remote debugging, so bad stack alignment or a malformed format string is out of the question. I wanted to make sure this is doable, and so far, for me, it isn't.
At first I tried to replace atoi#.got.plt with the unrandomized address of system. Got SIGSEGV.
Alright, it's probably because of ASLR, so let's try something else. I loaded up gdb and looked up system#0xb7deeda0 and atoi#0xb7de1250. Then I calculated the diff, which is 0xDB50. So the next time when I changed the address of atoi to system in the .got.plt segment, I actually just added diff to that value to get the address of system. Got SIGSEGV again.
My logic:
0xb7deeda0 <__libc_system>
0xb7de1250 <atoi>
diff = 0xb7deeda0 - 0xb7de1250
system#.got.plt = atoi#.got.plt + diff
example: 0x08048726 + DB50 = 0x08056276
Can anyone tell me what I did wrong and how can I jump to a "valid system()" with the help of leaking a function address from .got.plt?

Answering my own question: measuring the distance between functions in your local libc does not guarantee that the remote libc has the same layout. A different libc version or build can place the functions at different offsets.
You have to find out the remote libc version somehow; then you can read the symbol offsets from a matching copy and compute the address difference, like so:
readelf -s /lib32/libc-2.19.so | grep printf
Possible ways to find the libc version if you know two addresses:
Libc binary collection
libcdb.com
pwnlib
... or, if you have shell access to the remote machine, you can inspect the library with readelf yourself
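
The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration: the function name resolve is made up, and the two offsets below are hypothetical placeholders chosen to be consistent with the addresses quoted in the question. Real offsets have to come from the symbol table (readelf -s) of the exact libc build the target uses.

```python
def resolve(leaked_addr, leaked_offset, target_offset):
    """Compute a function's runtime address from one leaked libc address.

    leaked_addr   -- runtime address of a known function (e.g. read from .got.plt)
    leaked_offset -- that function's offset inside the matching libc file
    target_offset -- offset of the wanted function inside the same libc file
    """
    base = leaked_addr - leaked_offset
    # Shared libraries are mapped at page-aligned addresses, so a base that
    # is not a multiple of 0x1000 suggests the offsets are from the wrong libc.
    if base & 0xFFF:
        raise ValueError("libc base is not page-aligned; wrong libc version?")
    return base + target_offset


# Hypothetical offsets, for illustration only.
ATOI_OFFSET = 0x2D250
SYSTEM_OFFSET = 0x3ADA0

system_addr = resolve(0xB7DE1250, ATOI_OFFSET, SYSTEM_OFFSET)
print(hex(system_addr))
```

The page-alignment check is a cheap sanity test, not a proof: offsets from the wrong libc can still happen to produce an aligned base.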

nasm system calls Linux

I have a question about Linux x86 system calls in assembly.
When I write a new assembly program with NASM on Linux, I'd like to know which system calls to use for a specific task (for example reading a file, writing output, or simply exiting). I know a few syscalls from examples found around the internet (such as eax=1, ebx=1, int 0x80 to exit with a return value of 1), but nothing more. How can I find out whether the exit syscall takes other arguments? Or the arguments of any other syscall? I'm looking for documentation that explains which arguments each syscall takes and which registers they go in.
I've read the man page for the exit function etc., but it didn't answer this.
Hope I was clear enough,
Thank you!

The x86 wiki (which I just updated again :) has links to the system call ABI (what the numbers are for every call, where to put the parameters, what instruction to run, and which registers will be clobbered on return). This is not documented in the man pages because it's architecture-specific. The same goes for binary constants: they don't have to be the same on every architecture.
To look up the value of a constant, run grep -r O_APPEND /usr/include for your target architecture to search the .h files recursively.
Even better is to set things up so you can use the symbolic constants in your asm source, for readability and to avoid the risk of errors.
gcc does run the C preprocessor on .S files, but including most C header files will also pull in C prototypes, which are not valid assembly.
Or convert the #defines to NASM macros with sed or something. Maybe feed some #include<> lines to the C preprocessor and have it print out just the macro definitions.
printf '#include <%s>\n' unistd.h sys/stat.h |
gcc -dD -E - |
sed -ne 's/^#define \([A-Za-z_0-9]*\) \(.\)/\1\tequ \2/p'
That turns every non-empty #define into a NASM symbol equ value line. The resulting file produced many lines of "error: expression syntax error" when I ran NASM on it, but manually selecting some valid lines from it may work.
Some constants are defined in multiple steps, e.g. #define S_IRGRP (S_IRUSR >> 3). This might or might not work when converted to NASM equ symbol definitions.
Also note that in C, 0666 is an octal constant. In NASM, you need either 0o666 or 666o; a leading 0 is not special. Otherwise, NASM syntax for hex and decimal constants is compatible with C.
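
As a hedged alternative to the sed one-liner, here is a small Python sketch of the same conversion. It is deliberately conservative: it only converts #defines whose value is a plain integer literal, rewrites C octal literals into NASM's 0o form, and skips everything else (expressions, strings, function-like macros), which should avoid most of the syntax errors mentioned above. The function name to_nasm and the two regexes are made up for this example.

```python
import re

# "#define NAME value"; function-like macros (NAME immediately followed
# by a parenthesis) do not match because whitespace is required after NAME.
DEFINE = re.compile(r"^#define\s+([A-Za-z_][A-Za-z_0-9]*)\s+(\S.*)$")
# Plain C integer literal: hex, octal (leading 0), or decimal, no suffixes.
INT_LITERAL = re.compile(r"^(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)$")


def to_nasm(line):
    """Convert one '#define NAME <int>' line to 'NAME<TAB>equ <int>', else None."""
    m = DEFINE.match(line.strip())
    if not m:
        return None
    name, value = m.group(1), m.group(2).strip()
    if not INT_LITERAL.match(value):
        return None  # expressions like (S_IRUSR >> 3) are skipped
    if len(value) > 1 and value[0] == "0" and value[1] not in "xX":
        value = "0o" + value[1:]  # C octal 0666 becomes NASM 0o666
    return f"{name}\tequ {value}"
```

To use it, pipe the output of gcc -dD -E - through a loop that calls to_nasm on each line and prints the results that are not None.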

Perhaps you are looking for something like linux/syscalls.h[1], which you have on your system if you've installed the Linux source code via apt-get or whatever your distro uses.
[1] http://lxr.free-electrons.com/source/include/linux/syscalls.h#L326

Requesting examples of sys_fork in nasm

I'm trying to run bzip and have it return control to the caller from inside a NASM-coded assembly program (under Linux). I apparently need to use a combination of the sys_fork and sys_execve system calls to achieve this. After much searching I found an example of how to use sys_execve, but I can't find an example of how to use sys_fork. Any help will be appreciated.

My experience is limited, but as I recall sys_fork is easy. "Just do it": it takes no parameters. At this point, you're "in two places at once". If eax is zero, you're the child: do sys_execve on bzip. If eax is non-zero (and non-negative!), you're the parent and eax is the child's PID. Do a sys_waitpid on that PID. As I recall, this returns the exit status of bzip shifted left 8 bits. sys_execve itself never returns on success.
I have a crude example that runs an editor, nasm, and ld (all on a hard-coded "hello.asm"). Longish to post, but I can make it available some way if you need it. I found getting the correct parameters to sys_execve the hardest part, as I recall.
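
The control flow described above can be sketched in Python, whose os.fork, os.execvp and os.waitpid are thin wrappers around the same system calls. This is not NASM and not the author's example, only an illustration of the order of operations and of how the exit code sits in bits 8 to 15 of the wait status. The function name run_and_wait is made up.

```python
import os


def run_and_wait(argv):
    """Fork, exec argv in the child, and return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. On success execvp never returns.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; leave without running
            # the parent's cleanup handlers.
            os._exit(127)
    # Parent: fork returned the child's PID, so wait for that child.
    _, status = os.waitpid(pid, 0)
    return status


status = run_and_wait(["sh", "-c", "exit 3"])
exit_code = (status >> 8) & 0xFF  # exit code is stored shifted left 8 bits
print(exit_code)
```

In assembly the same three steps apply: test eax after the fork, call execve in the child, and call waitpid with the child's PID in the parent.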

how to boot this code?

I am a newbie to assembly and program in C (using GCC on Linux).
Can anyone tell me how to compile C code into assembly and boot from it using a pen drive?
I use this command (in a Linux terminal):
gcc -S bootcode.c
It gives me a bootcode.s file.
What do I do with that?
I just want to compile the following code and run it directly from a USB stick:
#include<stdio.h>
void main()
{
printf ("hi");
}
any help here ???

First of all, be aware that when you write bootloader code you are creating your own execution environment. There is no ready-made C library or anything similar available to you, only the BIOS services (interrupt routines).
Given that, the code above won't boot: with no C library present, there is no implementation of printf for your code to call. (stdio.h only declares printf; the implementation lives in the C library, which in turn relies on an operating system.)
So if you want to print a string, you need to write that function yourself, either in the same file or in a separate file whose object file you link in when creating the final binary. It is up to you. There may be other ways that I'm not familiar with, so do some research.
Another thing you should know: the BIOS is responsible for loading the boot code into memory at physical address 0x07C00 (0x0000:0x7C00 in segment:offset representation), so your code has to account for being located at this address, either by
1- using the ORG directive
2- or loading the appropriate segment registers (cs, ds, es)
You should also get familiar with the segment:offset addressing scheme; just google it or read the Intel manuals.
Finally, the BIOS loads only the first sector of the bootable media to 0x07C00, so the boot code must not exceed 512 bytes (one sector), and the last two bytes of that sector (bytes 510 and 511) must hold the boot signature 0x55, 0xAA. Otherwise the BIOS won't consider the sector bootable.
Usually this is coded as:
ORG 0x7C00
...
; your boot code, including code to load more sectors, since 512 bytes won't be sufficient
...
times 510 - ($ - $$) db 0x00 ; zero-fill up to byte 510
dw 0xAA55 ; boot sector signature; stored little-endian, so the bytes on disk are 0x55, 0xAA
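
To make the padding and signature layout concrete, here is a small Python sketch that builds the same 512-byte image from raw machine code. It is only an illustration of the byte layout, not a replacement for assembling with NASM. The function name make_boot_sector is made up, and the two-byte payload (EB FE, a jump-to-self infinite loop) is a placeholder for real boot code.

```python
import struct

SECTOR_SIZE = 512


def make_boot_sector(code: bytes) -> bytes:
    """Pad code with zeros to 510 bytes and append the boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("boot code must fit in 510 bytes")
    padding = bytes(SECTOR_SIZE - 2 - len(code))  # like: times 510-($-$$) db 0
    signature = struct.pack("<H", 0xAA55)         # like: dw 0xAA55 -> 55 AA
    return code + padding + signature


# Placeholder payload: EB FE is "jmp short" to itself (an infinite loop).
image = make_boot_sector(b"\xeb\xfe")
print(len(image), image[510:].hex())
```

Packing 0xAA55 as a little-endian 16-bit word yields the bytes 0x55 then 0xAA, which is why the assembly source writes the word "reversed".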
Note that I'm not covering everything here, because that would take pages. You will need to look for more resources on the net; here is a link to start with (it covers coding in assembly):
http://www.brokenthorn.com/Resources/OSDevIndex.html
That's all, hopefully this was helpful to you...^_^
Khilo - ALGERIA

Booting a computer is not that easy. You need to write a bootloader, and the bootloader must obey certain rules and cooperate with the hardware and its firmware in ROM. You also need to disable interrupts, reserve some memory, etc. Look up MikeOS; it's a great project that can help you understand the process better.
Cheers

gdb break when program opens specific file

Back story: While running a program under strace I noticed that '/dev/urandom' is being opened. I would like to know where this call is coming from (it is not part of the program itself, it is part of the system).
So, using gdb, I am trying to break program execution (using catch syscall open) when the open call is issued, so I can see a backtrace. The problem is that open is called a lot, several hundred times, so I can't narrow down the specific call that opens /dev/urandom. How should I go about narrowing it down? Is there a way to filter by arguments, and if so, how do I do it for a syscall?
Any advice would be helpful -- maybe I am going about this all wrong.

GDB is a pretty powerful tool, but has a bit of a learning curve.
Basically, you want to set up a conditional breakpoint.
First use the -i flag to strace, or objdump -d, to find the address of the open function or, more realistically, something in the chain of getting there, such as its PLT entry.
Set a breakpoint at that address. (If you have debug symbols, you can use a symbol name instead and omit the *, but I'm assuming you don't, though you may well have them for library functions if nothing else.)
break * 0x080482c8
Next you need to make it conditional.
(Ideally you could compare a string argument to a desired string. I couldn't get this to work within the first few minutes of trying.)
Let's hope we can assume the string is a constant somewhere in the program or one of the libraries it loads. You could look in /proc/pid/maps to get an idea of what is loaded and where, then use grep to verify the string is actually in a file, objdump -s to find its address, and gdb to verify that you've actually found it in memory by combining the high part of the address from maps with the low part from the file. (EDIT: it's probably easier to use ldd on the executable than to look in /proc/pid/maps.)
Next you need to know something about the ABI of the platform you are working on, specifically how arguments are passed. I've been working on ARM lately, which is very nice as the first few arguments just go in registers r0, r1, r2, etc. 32-bit x86 is a bit less convenient: arguments go on the stack, i.e. at function entry they are *($esp+4), *($esp+8), *($esp+12).
So let's assume we are on x86 and want to check that the first argument, at esp+4, equals the address we found for the constant we are trying to catch. esp+4 is the address of a char pointer, so we need to dereference it for the comparison.
cond 1 *(char **)($esp+4)==0x8048514
Then you can type run and hope for the best.
If the breakpoint condition triggers, and what you see with info registers and the x command (to examine memory) looks right, you can use the return command to work your way back up the call stack until you find something you recognize.

(Adapted from a question edit)
Following Chris's answer, here is the process that eventually got me what I was looking for:
(I am trying to find what functions are calling the open syscall on "/dev/urandom")
use ldd on executable to find loaded libraries
grep through each lib (shell command) looking for 'urandom'
open library file in hex editor and find address of string
find out how parameters are passed in syscalls (for open, the file name is the first parameter; on x86_64 it is passed in rdi, but your mileage may vary)
now we can set the conditional breakpoint: break open if $rdi == _addr_
run program and wait for break to hit
run bt to see backtrace
After all this I find that glib's g_random_int() and g_rand_new() use urandom. Gtk+ and ORBit were calling these functions -- if anybody was curious.
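
The grep and hex-editor steps above (finding which library contains the string and where) can be combined in a short Python sketch. This is only an illustration with a made-up function name, find_string_offsets. It reports offsets within the file, which you still have to translate into the runtime address of the mapped library before using them in a breakpoint condition.

```python
def find_string_offsets(path, needle=b"/dev/urandom"):
    """Return every file offset at which needle occurs in the file at path."""
    with open(path, "rb") as f:
        data = f.read()
    offsets = []
    start = 0
    while True:
        pos = data.find(needle, start)
        if pos < 0:
            break
        offsets.append(pos)
        start = pos + 1  # keep searching after this hit
    return offsets
```

Calling it on each library path reported by ldd, and printing the hits in hex, gives the same information as the manual search.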

Like Andre Puel said:
break open if strcmp($rdi,"/dev/urandom") == 0
Might do the job.
