gdb/ddd Program received signal SIGILL - linux

I wrote a very simple program on Linux in C++ that downloads images from a website over HTTP (basically an HTTP client request), using the cURL library: http://curl.haxx.se/libcurl/c/allfuncs.html
#define CURL_STATICLIB
#include <stdio.h>
#include <stdlib.h>
#include </usr/include/curl/curl.h>
#include </usr/include/curl/stdcheaders.h>
#include </usr/include/curl/easy.h>

size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream) {
    size_t written = fwrite(ptr, size, nmemb, stream);
    return written;
}

int main(void) {
    CURL *curl;
    FILE *fp;
    CURLcode res;
    char *url = "http://www.example.com/test_img.png";
    char outfilename[FILENAME_MAX] = "/home/c++_proj/output/web_req_img.png";

    curl = curl_easy_init();
    if (curl) {
        fp = fopen(outfilename, "wb");
        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
        res = curl_easy_perform(curl);
        /* always cleanup */
        curl_easy_cleanup(curl);
        fclose(fp);
    }
    return 0;
}
I verified the code and it works fine: the image is downloaded and I can view it (with no errors or warnings). Since I plan on expanding the code, I installed ddd to use a debugger, but the debugger doesn't work; my program exits with some sort of signal error when I run it under ddd.
This is the error:
(Thread debugging using libthread_db enabled)
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1"
Program received signal SIGILL, Illegal instruction.
0xb6a5c4c0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
First I thought that I hadn't installed ddd properly, so I went back to gdb, but I get the exact same errors when I run the program. (And I believe I am using the latest versions of gdb and ddd.)
Then I tried ddd on another simple program that doesn't involve the cURL library, and it worked fine!
Does anyone know why this is the case, and what the solution is? Do I somehow need to point ddd to the cURL libraries while it is running? I don't recall ever having to do this with other libraries. Maybe it is something about cURL that ddd doesn't like? But the program itself runs fine without the debugger. I would appreciate some help.

I am guessing it may be part of some instruction-set detection code. Just let the program continue and see if it handles the signal by itself (since it runs fine outside of gdb, it probably does). Alternatively, you can tell gdb not to bother you with SIGILL at all before you run the program: handle SIGILL pass nostop noprint.
It's only a problem if the program dies, which was not clear from your question.
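For example, the command can be issued at the gdb prompt before running the program (ddd accepts the same command in its GDB console); this is a purely illustrative session:
(gdb) handle SIGILL pass nostop noprint
(gdb) run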

Program received signal SIGILL, Illegal instruction.
0xb6a5c4c0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
Does anyone know why this is the case, and what is the solution?
Jester gave you the solution. Here's the reason why it happens.
libcrypto.so is OpenSSL's crypto library. OpenSSL performs CPU feature probes by executing an instruction to see if it is available. If a SIGILL is generated, then the feature is not available and an appropriate fallback function is used instead.
The reason you see them on ARM and not on IA-32 is that on Intel's IA-32 the CPUID instruction is non-privileged. Any program can execute CPUID to detect CPU features, so there is no need for SIGILL-based feature probing.
In contrast, ARM's equivalent of CPUID is a privileged instruction: your program would need Exception Level 1 (EL1), but it runs at EL0. To sidestep the need for privileges, programs on ARM set up a jmp_buf and install a SIGILL handler. They then try the instruction in question, and the SIGILL handler indicates whether the instruction or feature is available.
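A minimal sketch of that probing pattern, assuming POSIX signals and sigsetjmp; this is not OpenSSL's actual code, and the probed instruction here is just a stand-in:
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static sigjmp_buf probe_env;

static void sigill_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);          /* jump back to the probe site */
}

/* Returns 1 if the probed instruction executed, 0 if it raised SIGILL. */
static int probe_instruction(void)
{
    struct sigaction sa, old;
    int supported = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigill_handler;
    sigaction(SIGILL, &sa, &old);

    if (sigsetjmp(probe_env, 1) == 0) {
        /* Put the instruction to probe here (via inline asm). If the CPU
         * lacks it, SIGILL is delivered and the handler longjmps back. */
        __asm__ volatile("nop");       /* stand-in; always supported */
        supported = 1;
    }

    sigaction(SIGILL, &old, NULL);     /* restore the previous handler */
    return supported;
}

int main(void)
{
    printf("probe result: %d\n", probe_instruction());
    return 0;
}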
OpenSSL recently changed to SIGILL-free feature detection on some Apple platforms because Apple corrupts things. Also see PR 3108, SIGILL-free processor capabilities detection on MacOS X. Other libraries are doing the same. Also see How to determine ARMv8 features at runtime?
OpenSSL also documents this SIGILL behavior in its FAQ; see item 17, "When debugging I observe SIGILL during OpenSSL initialization: why?" for more details. Also see SSL_library_init cause SIGILL when running under gdb on Stack Overflow.

For Android developers: you can configure the debugger in Android Studio to ignore SIGILL:
https://developer.oculus.com/documentation/native/android/mobile-studio-debug/#troubleshooting

Related

Stack smashing detected while applying stack & register on the remote identical process

Consider an application that is to be executed on a first node. This application, however, cannot execute some function on this first node because the node lacks the capability. So, to make the application run flawlessly, I plan to steal the process's stack, heap and registers using ptrace and send them over to a second, fully capable node. On this second node I would like to run the same process (i.e. the same executable on the same architecture, e.g. x86) up to the exact point the first process has executed, apply the previously stolen stack, heap and register values to this process, execute it there, transfer the results back to the first node, and resume executing the application from there.
I have also disabled ASLR (address space layout randomization) so that there is a one-to-one mapping with the process executed on the remote node.
When I apply this logic, the program ends with "stack smashing detected".
Is there anything I am missing here, or is the idea itself not feasible?
NOTE: I am also skipping copying the kernel stack, since the processes on both sides are executed up to exactly the same instruction. Please also note that I tried this with a very simple program, because I don't want the complexity of the heap to be involved.
#include <unistd.h>
#include <stdio.h>
#include <signal.h>

void add_one(int *p){
    *p += 2;
}

int main(int argc, char **argv)
{
    int i = 0;
    add_one(&i);
    return 0;
}
The code above is the program I experimented with. I disassembled it and found the address of the function add_one, the point at which I steal the stack and process registers and send them over to apply to the identical process on node 2.
Any help on how to do such migrations, and on what I am missing, would really help me move forward.
If you want to do this you need to at least disable stack canaries, because those will always mismatch when carrying execution over to another machine, even if you copy the entire address space: the canary value is chosen randomly per process at startup, so the canaries embedded in the copied stack will not match the value the destination process checks against.
Compiling with -fno-stack-protector will do.
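For example, rebuilding the toy program without the stack protector (the file name is illustrative only):
gcc -g -O0 -fno-stack-protector -o toy toy.c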

Troubles at singlestepping on ARM machine [duplicate]

OK, this is a simple question: does Android support PTRACE_SINGLESTEP in the ptrace system call? When I want to ptrace an Android APK program, I find that I can't perform a SINGLESTEP trace. But the situation changes when I use PTRACE_SYSCALL; that works perfectly. Does Android remove this functionality, or does ARM lack the hardware support for it? Any help will be appreciated, thanks.
This is my core program:
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <android/log.h>

#define TAG "ptrace-monitor"   /* log tag, added for completeness */

int main(int argc, char *argv[])
{
    pid_t target_pid;
    int status;

    if (argc != 2) {
        __android_log_print(ANDROID_LOG_DEBUG, TAG, "please input the pid!");
        return -1;
    }
    target_pid = atoi(argv[1]);   /* pid of the process to trace */

    if (0 != ptrace(PTRACE_ATTACH, target_pid, NULL, NULL)) {
        __android_log_print(ANDROID_LOG_DEBUG, TAG, "ptrace attach error");
        return -1;
    }
    __android_log_print(ANDROID_LOG_DEBUG, TAG, "start monitor process: %d", target_pid);

    while (1) {
        wait(&status);
        if (WIFEXITED(status))
            break;
        if (ptrace(PTRACE_SINGLESTEP, target_pid, 0, 0) != 0)
            __android_log_print(ANDROID_LOG_DEBUG, TAG, "PTRACE_SINGLESTEP attach error");
    }

    ptrace(PTRACE_DETACH, target_pid, NULL, NULL);
    __android_log_print(ANDROID_LOG_DEBUG, TAG, "monitor finished");
    return 0;
}
I run this program from a shell, and I have root privileges.
If I change the request to PTRACE_SYSCALL the program runs normally, but if the request is PTRACE_SINGLESTEP, it fails with an error.
PTRACE_SINGLESTEP was removed from ARM Linux in 2011, by this commit.
The hardware has no support for single-stepping; the previous kernel support involved decoding the current instruction to figure out which one comes next (for branches) and temporarily replacing it with a software breakpoint.
Quoting a mailing list message about the same commit, describing the old situation: http://lists.infradead.org/pipermail/linux-arm-kernel/2011-February/041324.html
PTRACE_SINGLESTEP is a ptrace request designed to offer single-stepping
support to userspace when the underlying architecture has hardware
support for this operation.
On ARM, we set arch_has_single_step() to 1 and attempt to emulate
hardware single-stepping by disassembling the current instruction to
determine the next pc and placing a software breakpoint on that
location.
Unfortunately this has the following problems:
Only a subset of ARMv7 instructions are supported
Thumb-2 is unsupported
The code is not SMP safe
We could try to fix this code, but it turns out that because of the
above issues it is rarely used in practice. GDB, for example, uses
PTRACE_POKETEXT and PTRACE_PEEKTEXT to manage breakpoints itself and
does not require any kernel assistance.
This patch removes the single-step emulation code from ptrace meaning
that the PTRACE_SINGLESTEP request will return -EIO on ARM. Portable
code must check the return value from a ptrace call and handle the
failure gracefully.
Signed-off-by: Will Deacon <will.deacon at arm.com>
---
The comments I received about v1 suggest that:
If emulation is required, it is plausible to do it from userspace
ltrace uses the SINGLESTEP call (conditionally at compile-time since other architectures, such as mips, do not support this
request) but does not check the return value from ptrace. This is a
bug in ltrace.
strace does not use SINGLESTEP
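A minimal sketch of the portable pattern the commit message describes: check the return value of the ptrace call and fall back gracefully. It assumes target_pid refers to an already-attached, stopped tracee:
#include <errno.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>

static long resume_tracee(pid_t target_pid)
{
    if (ptrace(PTRACE_SINGLESTEP, target_pid, 0, 0) == 0)
        return 0;                      /* single-step request accepted */

    if (errno == EIO) {
        /* No kernel single-step support (e.g. ARM since 2011); fall back
         * to stopping at syscall boundaries instead. */
        fprintf(stderr, "PTRACE_SINGLESTEP unsupported, using PTRACE_SYSCALL\n");
        return ptrace(PTRACE_SYSCALL, target_pid, 0, 0);
    }
    return -1;                         /* some other ptrace failure */
}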

Why a segfault instead of privilege instruction error?

I am trying to execute the privileged instruction rdmsr in user mode, and I expect to get some kind of privilege error, but I get a segfault instead. I have checked the asm and I am loading 0x186 into ecx, which is supposed to be PERFEVTSEL0, based on the manual, page 1171.
What is the cause of the segfault, and how can I modify the code below to fix it?
I want to resolve this before hacking a kernel module, because I don't want this segfault to blow up my kernel.
Update: I am running on Intel(R) Xeon(R) CPU X3470.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <sched.h>
#include <assert.h>

uint64_t
read_msr(int ecx)
{
    unsigned int a, d;
    __asm __volatile("rdmsr" : "=a"(a), "=d"(d) : "c"(ecx));
    return ((uint64_t)a) | (((uint64_t)d) << 32);
}

int main(int ac, char **av)
{
    uint64_t start, end;
    cpu_set_t cpuset;
    unsigned int c = 0x186;
    int i = 0;

    CPU_ZERO(&cpuset);
    CPU_SET(i, &cpuset);
    assert(sched_setaffinity(0, sizeof(cpuset), &cpuset) == 0);

    printf("%lu\n", read_msr(c));
    return 0;
}
The question I will try to answer: why does the above code cause SIGSEGV instead of SIGILL, even though the code has no memory error, only an illegal instruction (a privileged instruction called from non-privileged user space)?
I would expect to get a SIGILL with si_code ILL_PRVOPC instead of a segfault, too. Your question is now 3 years old, and today I stumbled upon the same behavior. I am disappointed too :-(
What is the cause of the segfault
The cause seems to be that the Linux kernel code decides to send SIGSEGV. Here is the responsible function:
http://elixir.free-electrons.com/linux/v4.9/source/arch/x86/kernel/traps.c#L487
Have a look at the last line of the function.
In your follow-up question, you got a list of other assembly instructions which get propagated to userspace as SIGSEGV even though they actually cause general protection faults. I found your question because I triggered the behavior with cli.
and how can I modify the code below to fix it?
As of Linux kernel 4.9, I'm not aware of any reliable way to distinguish between a memory error (what I would expect to be a SIGSEGV) and a privileged instruction error from userspace.
There may be a very hacky and unportable way to distinguish these cases. When a privileged instruction causes a SIGSEGV, the siginfo_t si_code is set to a value which is not directly listed in the SIGSEGV section of man 2 sigaction. The documented values are SEGV_MAPERR, SEGV_ACCERR, and SEGV_PKUERR, but I get SI_KERNEL (0x80) on my system. According to the man page, SI_KERNEL is a code "which can be placed in si_code for any signal". In strace, you see SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0}. The responsible kernel code is here.
It would also be possible to grep dmesg for this string.
Please, never ever use these two methods to distinguish between a general protection fault and a memory error on a production system.
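For illustration only, a minimal sketch of the first hack: install a SIGSEGV handler with SA_SIGINFO and inspect si_code. The cli instruction is used as the trigger because it is privileged and expected to fault in user mode:
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    /* async-signal-safety is ignored here for brevity */
    fprintf(stderr, "SIGSEGV: si_code=%d si_addr=%p\n",
            info->si_code, info->si_addr);
    _exit(1);
}

int main(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    __asm__ volatile("cli");   /* privileged; faults in user mode, delivered as SIGSEGV */
    return 0;
}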
Specific solution for your code: Just don't run rdmsr from user space. But this answer is really unsatisfying if you are looking for a generic way to figure out why a program received a SIGSEGV.

Simple heap overflow exploit with toy example on old glibc

Consider this example of a program vulnerable to a heap buffer overflow on Linux, taken directly from the book "Buffer Overflow Attacks" (p. 248):
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *A, *B;

    A = malloc(128);
    B = malloc(32);
    strcpy(A, argv[1]);
    free(A);
    free(B);
    return 0;
}
Since unlink() has been changed to prevent the simplest form of exploit using the FD and BK pointers (with a sanity check), I'm using a very old system with an old version of glibc (2.3.2). I'm also setting MALLOC_CHECK_=0 for this testing.
My goal with this toy example is simply to see if I can write 4 bytes to some arbitrary address I specify. The simplest test I can think of is to try to write something to 0x41414141, which is an illegal address and should make the program crash, just to confirm that it is indeed trying to write to that address (something I should be able to observe in GDB).
So I try executing with the argument perl -e 'print "A"x128 . "\xf8\xff\xff\xff" . "\xf8\xff\xff\xff" . "\x41\x41\x41\x41" . "\x41\x41\x41\x41" '
So I have:
Buffer A: 128 bytes of 0x41.
prev_size: 0xfffffff8
size: 0xfffffff8
FD: 0x41414141
BK: 0x41414141
I'm using 0xfffffff8 instead of 0xfffffffc because there is a note that with glibc 2.3 the third-lowest bit, NON_MAIN_ARENA, is used for arena management purposes and has to be 0.
This should attempt to write 0x41414141 to 0x41414141 (+ 12 to be more precise, but still an illegal address), correct? However, when I execute this, the program simply terminates normally.
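For reference, in old unhardened glibc/dlmalloc the unlink() step this fake chunk is meant to trigger boils down to roughly the following (a simplified sketch, not the exact glibc 2.3.2 source); FD->bk lives 12 bytes past FD on 32-bit, which is where the "+ 12" comes from:
/* P is the chunk being unlinked from its bin's doubly linked list. */
#define unlink(P, BK, FD) {                                         \
    FD = P->fd;                                                     \
    BK = P->bk;                                                     \
    FD->bk = BK;   /* writes BK to the address FD + 12 (32-bit) */  \
    BK->fd = FD;   /* writes FD to the address BK + 8  (32-bit) */  \
}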
What am I missing here? This seems simple enough that it shouldn't be that hard to get to work.
I've tried various things, such as using 0xfffffffc instead for prev_size and size, and using legal addresses for FD (some address on the heap). I've tried swapping the order in which A and B are free()'d, and I've tried to step into free() in GDB to see what happens, but I got lost. Note that there shouldn't be any other security features on this system, as it is very old and wouldn't have the NX bit, ASLR, etc. (not that it should matter for the purpose of just writing 4 bytes to an illegal address).
Any ideas for how to make this work?
I could add that when using MALLOC_CHECK_=3 I get this:
malloc: using debugging hooks
malloc: using debugging hooks
free(): invalid pointer 0x8049688!
Program received signal SIGABRT, Aborted.
0x4004a1b1 in kill () from /lib/libc.so.6

SetEnvironmentVariable within 32bit Process on 64 bit Windows OS

I recently found an interesting issue. When using SetEnvironmentVariable, I can use Process Explorer to see the newly created environment variable. However, when the process itself is 32-bit and the OS is 64-bit, Process Explorer (at least v10 through the latest v11.33) cannot find the new variable. If the program is native 64-bit then everything works fine, just as with a 32-bit process running on a 32-bit OS.
The SetEnvironmentVariable API call should be successful, because the return value is TRUE and calling GetEnvironmentVariable returns the correct value. Also, if you create a child process, you can see with Process Explorer that the variable is correctly set in the new process.
I'm not sure if this is a limitation of SysWOW64 or a bug in Process Explorer. Does anyone know?
Also, is there any way to read the 32-bit environment variables correctly (for example, by forcing Process Explorer to run in 32-bit mode, or with some other tool)?
Sample source to reproduce:
#include <stdio.h>
#include <windows.h>

int main(int argc, char *argv[])
{
    printf("setting variable... %s\n",
           SetEnvironmentVariable("a_new_var", "1.0") ? "OK" : "FAILED");
    printf("press anykey to continue...\n");
    getchar();
    // system(argv[0]); // uncomment to inspect the child process
    return 0;
}
I'm not sure how WOW64 works, but I'm pretty (99%) sure there are two PEBs (Process Environment Blocks) created: a 32-bit one and a 64-bit one. The process parameter structures (RTL_USER_PROCESS_PARAMETERS) are probably duplicated as well. So when you call SetEnvironmentVariable it only modifies the 32-bit environment block. Process Explorer runs as a native 64-bit program, which would mean it only knows about the 64-bit PEB and the 64-bit environment block (which hasn't changed).
Update (2010-07-10):
Just some new info on this old topic: You can find the 32-bit PEB by calling NtQueryInformationProcess with ProcessWow64Information. It gives you a PVOID with the address of the PEB.
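A hedged sketch of that query from a native 64-bit program; the PID, the function-pointer lookup, and the minimal error handling are illustrative only:
#include <windows.h>
#include <winternl.h>
#include <stdio.h>

typedef LONG (NTAPI *NtQueryInformationProcess_t)(
    HANDLE, PROCESSINFOCLASS, PVOID, ULONG, PULONG);

int main(void)
{
    DWORD pid = 1234;    /* hypothetical PID of the 32-bit (WOW64) target */
    PVOID wow64_peb = NULL;

    NtQueryInformationProcess_t pNtQIP =
        (NtQueryInformationProcess_t)GetProcAddress(
            GetModuleHandleW(L"ntdll.dll"), "NtQueryInformationProcess");
    HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pid);
    if (!pNtQIP || !h)
        return 1;

    /* ProcessWow64Information yields the address of the 32-bit PEB,
     * or NULL if the target is not a WOW64 process. */
    if (pNtQIP(h, ProcessWow64Information, &wow64_peb, sizeof(wow64_peb), NULL) == 0)
        printf("32-bit PEB at %p\n", wow64_peb);

    CloseHandle(h);
    return 0;
}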
