I have a "Seagate Central" NAS running embedded Linux:
$ cat /etc/*release
MontaVista Linux 6, (.dev-snapshot-20130726)
When I try to run my own application on this NAS, it is "Killed"
without any message in dmesg or /var/log/messages.
$ cat /proc/cpuinfo
Processor : ARMv6-compatible processor rev 4 (v6l)
BogoMIPS : 279.34
Features : swp half thumb fastmult vfp edsp java
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xb02
CPU revision : 4
Hardware : Cavium Networks CNS3420 Validation Board
Revision : 0000
Serial : 0000000000000000
My toolchain is
Sourcery_CodeBench_Lite_for_ARM_GNU_Linux/arm-none-linux-gnueabi
and my compile switches are
-march=armv6k -mcpu=mpcore -mfloat-abi=softfp -mfpu=vfp
How can I find out which process is killing my application, or what setting I have to change?
PS: I have also created a simple HelloWorld application, which does not work either.
$ ldd Hello
not a dynamic executable
readelf -a Hello
=> http://pastebin.com/kT9FvkjE
readelf -a zip
=> http://pastebin.com/3V6kqA9b
UPDATE 1
I have compiled a new binary with hard float.
Readelf output
http://pastebin.com/a87bKksY
But no success ;(
I guess it really is some kind of "lock" that prevents my application from executing. How can I find out which application kills mine?
Or how can I disable that kind of function?
Use these compiler switches:
-march=armv6k -Wl,-z,max-page-size=0x10000,-z,common-page-size=0x10000,-Ttext-segment=0x10000
See also this link regarding the toolchain.
You can run readelf -a against one of the built-in binaries (e.g. /usr/bin/nano) to see the proper text-segment offset in the section headers and page size / alignment in the program headers. The above compiler flags make self-compiled programs match the structure of built in binaries, and have been tested to work. It seems the Seagate Central NAS uses a page size / offset of 0x10000 while the default for ARM gcc is 0x8000.
Edit: I see you ran readelf already. Your pastebin shows
HelloWorld:[ 1] .interp PROGBITS 00008134 000134 000013 00 A 0 0 1
zip:[ 1] .interp PROGBITS 00010134 000134 000013 00 A 0 0 1
The value 10134-134=10000 (hex) yields the correct text-segment linker option. Further down (LOAD...) are the alignment specifiers, which are 0x8000 for your HelloWorld, but 0x10000 for the zip built-in. In my experience, soft-float has not caused problems.
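To compare the alignment specifiers without reading the whole readelf dump, the check can be scripted. The following Python is only a minimal sketch: the names load_alignments and page_aligned are made up for illustration, it handles only 32-bit little-endian ELF files, and the 0x10000 default is simply the page size suggested above, not something verified on the device.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_alignments(elf_bytes):
    """Return the p_align value of every PT_LOAD segment (ELF32, little endian)."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_bytes[4] != 1 or elf_bytes[5] != 1:
        raise ValueError("this sketch only handles ELF32, little endian")
    # ELF32 header: e_phoff at offset 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", elf_bytes, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 42)
    aligns = []
    for i in range(e_phnum):
        # Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr,
        #             p_filesz, p_memsz, p_flags, p_align
        fields = struct.unpack_from("<8I", elf_bytes, e_phoff + i * e_phentsize)
        if fields[0] == PT_LOAD:
            aligns.append(fields[7])
    return aligns


def page_aligned(elf_bytes, page_size=0x10000):
    """True if every LOAD segment's alignment is a multiple of page_size.

    page_size=0x10000 is an assumption taken from the answer above.
    """
    return all(a % page_size == 0 for a in load_alignments(elf_bytes))
```

Something like page_aligned(open("Hello", "rb").read()) should then return False for a binary linked with the 0x8000 default and True once it is linked with the options above.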
Do you see any output at all?
Is your application dynamically linked?
If so, run the dynamic linker with the verbose option (you'll have to figure out the name of the dynamic linker on your platform; on Arch Linux, it is ldd):
ldd --verbose 'your_program_name'
That will tell you if you're missing any dependencies (shared libs etc)
Run readelf -a 'your_program_name'
Make sure the file mentioned in Requesting program interpreter: /lib/ld-linux.so.2 exists. In this case, that filename is /lib/ld-linux.so.2
If this fails to help you figure out the problem, post the complete output of ldd --verbose 'your_program_name' and readelf -a 'your_program_name' in your question.
Another issue may be that the NAS software just kills foreign programs. I'm not sure why it would, but we're talking about a big corporation here (Seagate) and they have odd ideas of how the world works at times...
Edit, after looking at the pastebin of readelf:
From what I see, your Hello executable differs in 2 ways from the zip executable:
It is not dynamically linked, so that throws out a whole load of problems to look for.
There's a difference in how the 2 programs are built. zip does not use softfloats and Hello does. I suspect the soft-float dependency is due to one or both of these compiler switches: -mfloat-abi=softfp -mfpu=vfp
Hello Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
zip Flags: 0x5000002, has entry point, Version5 EABI
I'd start with either:
Removing the soft-float option from the Hello build or:
make sure the soft-float emulation libraries are on the machine. I don't know what libs this would depend on, but I do remember MontaVista supplying them the last time I touched their software. It's been 8+ years since I touched MontaVista so it's clouded in a bit of old-memory fog.
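The difference between the two Flags values can be decoded by hand. Here is a small Python sketch; the constants are the EF_ARM_* values from the ELF headers for ARM, while the function name describe_arm_flags is just an illustrative choice. Note that a binary with neither float bit set, such as zip, may simply have been linked by a toolchain that does not record the float ABI, so the absence of the bit is a hint rather than proof.

```python
# e_flags bits for ARM ELF files (values as defined in elf.h)
EF_ARM_HASENTRY = 0x02
EF_ARM_ABI_FLOAT_SOFT = 0x200
EF_ARM_ABI_FLOAT_HARD = 0x400
EF_ARM_EABIMASK = 0xFF000000


def describe_arm_flags(e_flags):
    """Return a readelf-like list of descriptions for an ARM e_flags value."""
    parts = []
    if e_flags & EF_ARM_HASENTRY:
        parts.append("has entry point")
    # The top byte holds the EABI version number.
    parts.append("Version%d EABI" % ((e_flags & EF_ARM_EABIMASK) >> 24))
    if e_flags & EF_ARM_ABI_FLOAT_SOFT:
        parts.append("soft-float ABI")
    if e_flags & EF_ARM_ABI_FLOAT_HARD:
        parts.append("hard-float ABI")
    return parts


print(describe_arm_flags(0x5000202))  # Hello
print(describe_arm_flags(0x5000002))  # zip
```

The two values differ only in bit 0x200, which is the soft-float marker.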
This is an old thread, but I just wanted to add that I succeeded in compiling a "hello world" for this old NAS today.
Running ld-linux.so.3 <app> told me that
ELF load command alignment not page-aligned
Googling this, I found this: https://github.com/JuliaLang/julia/issues/33293, which pointed me to linker-options:
-Wl,-z,max-page-size=0x10000
Compiling with these options yielded an ELF that actually did work!
Are you sure your compilation options are correct?
Try the following:
strace your application (if strace is present on the NAS)
Download one of the NAS binaries and run arm-none-linux-gnueabi-readelf -a on it, do the same on your HelloWorld program, and see if the ABI tags differ.
It looks like an illegal instruction issue, a floating point issue or an incompatible libc issue.
Edit: according to the readelf output, the NAS programs are compiled without soft float; you should try that.
The result of the investigation below: recent Node.js is not portable to AMD Geode (or other non-SSE x86) processors!
I dug deeper into the code and got stuck in the ia32 assembler implementation, which deeply integrates SSE/SSE2 instructions into its code (macros, macros, macros, ...). The main consequence is that you cannot run a recent version of node.js on AMD Geode processors, because they lack the newer instruction set extensions. The fallback to 387 arithmetic only works for the node.js code, but not for the V8 JavaScript compiler implementation that it depends on. Adjusting V8 to support non-SSE x86 processors is painful and a lot of effort.
If someone can prove the contrary, I would be really happy to hear about it ;-)
Investigation History
I have a running ALIX.2D13 (https://www.pcengines.ch), which has an AMD Geode LX as its main processor. It runs Voyage Linux, a Debian jessie-based distribution for resource-restricted embedded devices.
root@voyage:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 5
model : 10
model name : Geode(TM) Integrated Processor by AMD PCS
stepping : 2
cpu MHz : 498.004
cache size : 128 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu de pse tsc msr cx8 sep pge cmov clflush mmx mmxext 3dnowext 3dnow 3dnowprefetch vmmcall
bugs : sysret_ss_attrs
bogomips : 996.00
clflush size : 32
cache_alignment : 32
address sizes : 32 bits physical, 32 bits virtual
When I install nodejs 8.x following the instructions on https://nodejs.org/en/download/package-manager/, I get an "invalid machine instruction" error (translated from the German error output, so the wording may differ). This also happens when I download the binary for 32-bit x86 and when I compile it manually.
After the answers below, I changed the compiler flags in deps/v8/gypfiles/toolchain.gypi by removing -msse2 and adding -march=geode -mtune=geode. And now I get the same error but with a stack trace:
root@voyage:~/GIT/node# ./node
#
# Fatal error in ../deps/v8/src/ia32/assembler-ia32.cc, line 109
# Check failed: cpu.has_sse2().
#
==== C stack trace ===============================
./node(v8::base::debug::StackTrace::StackTrace()+0x12) [0x908df36]
./node() [0x8f2b0c3]
./node(V8_Fatal+0x58) [0x908b559]
./node(v8::internal::CpuFeatures::ProbeImpl(bool)+0x19a) [0x8de6d08]
./node(v8::internal::V8::InitializeOncePerProcessImpl()+0x96) [0x8d8daf0]
./node(v8::base::CallOnceImpl(int*, void (*)(void*), void*)+0x35) [0x908bdf5]
./node(v8::internal::V8::Initialize()+0x21) [0x8d8db6d]
./node(v8::V8::Initialize()+0xb) [0x86700a1]
./node(node::Start(int, char**)+0xd3) [0x8e89f27]
./node(main+0x67) [0x846845c]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xb74fc723]
./node() [0x846a09c]
Ungültiger Maschinenbefehl
root@voyage:~/GIT/node#
If you now look into this file, you will find the following
... [line 107-110]
void CpuFeatures::ProbeImpl(bool cross_compile) {
base::CPU cpu;
CHECK(cpu.has_sse2()); // SSE2 support is mandatory.
CHECK(cpu.has_cmov()); // CMOV support is mandatory.
...
I commented out that line, but I still get "Ungültiger Maschinenbefehl" (invalid machine instruction).
This is what gdb ./node shows (executed run):
root@voyage:~/GIT/node# gdb ./node
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
[...]
This GDB was configured as "i586-linux-gnu".
[...]
Reading symbols from ./node...done.
(gdb) run
Starting program: /root/GIT/node/node
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
[New Thread 0xb7ce2b40 (LWP 29876)]
[New Thread 0xb74e2b40 (LWP 29877)]
[New Thread 0xb6ce2b40 (LWP 29878)]
[New Thread 0xb64e2b40 (LWP 29879)]
Program received signal SIGILL, Illegal instruction.
0x287a23c0 in ?? ()
(gdb)
I think, it is necessary to compile with debug symbols...
make clean
make CFLAGS="-g"
There is no chance of resolving all the SSE/SSE2 problems... Giving up! See the topmost section.
Conclusion: node.js + V8 normally requires SSE2 when running on x86.
On the V8 ports page: x87 (not officially supported)
Contact/CC the x87 team in the CL if needed. Use the mailing list v8-x87-ports.at.googlegroups.com for that purpose.
Javascript generally requires floating point (every numeric variable is floating point, and using integer math is only an optimization), so it's probably hard to avoid having V8 actually emit FP math instructions.
V8 is currently designed to always JIT, not interpret. It starts off / falls-back to JITing un-optimized machine code when it's still profiling, or when it hits something that makes it "de-optimize".
There is an effort to add an interpreter to V8, but it might not help because the interpreter itself will be written using the TurboFan JIT backend. It's not intended to make V8 portable to architectures it doesn't currently know how to JIT for.
Crazy idea: run node.js on top of a software emulation layer (like Intel's SDE or maybe qemu-user) that could emulate x86 with SSE/SSE2 on an x86 CPU supporting only x87. They use dynamic translation, so would probably run at near-native speed for code that didn't use any SSE instructions.
This may be crazy because node.js + V8 probably use some virtual-memory tricks that might confuse an emulation layer. I'd guess that qemu should be robust enough, though.
Original answer left below as a generic guide to investigating this kind of issue for other programs. (tip: grep the Makefiles and so on for -msse or -msse2, or check compiler command lines for that with pgrep -a gcc while it's building).
Your cpuinfo says it has CMOV, which is a 686 (ppro / p6) feature. This says that Geode supports i686. What's missing compared to a "normal" CPU is SSE2, which is enabled by default for -m32 (32-bit mode) in some recent compiler versions.
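Reading the flags line by eye is error-prone, so here is a minimal Python sketch of the same check. The function names and the default list of required features are illustrative assumptions; it only understands the x86 "flags" line of /proc/cpuinfo.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature names from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required=("cmov", "sse", "sse2")):
    """List the required features that the CPU does not advertise.

    The default 'required' tuple is only an example, not an official list.
    """
    have = cpu_flags(cpuinfo_text)
    return [name for name in required if name not in have]


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(missing_features(f.read()))
```

On the Geode flags line quoted in the question, cmov is present while sse and sse2 are reported as missing.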
Anyway, what you should do is compile with -march=geode -O3, so gcc or clang will use everything your CPU supports, but no more.
-O3 -msse2 -march=geode would tell gcc that it can use everything Geode supports as well as SSE2, so you need to remove any -msse and -msse2 options, or add -mno-sse after them. In node.js, deps/v8/gypfiles/toolchain.gypi was setting -msse2.
Using -march=geode implies -mtune=geode, which affects code-gen choices that don't involve using new instructions, so with luck your binary will run faster than if you'd simply used -mno-sse to control instruction-set stuff without overriding -mtune=generic. (If you're building on the geode, you could use -march=native, which should be identical to using -march=geode.)
The other possibility is the problem instructions are in Javascript functions that were JIT-compiled.
node.js uses V8. I did a quick google search, but didn't find anything about telling V8 to not assume SSE/SSE2. If it doesn't have a fall-back code-gen strategy (x87 instructions) for floating point, then you might have to disable JIT altogether and make it run in interpreter mode. (Which is slower, so that may be a problem.)
But hopefully V8 is well-behaved, and checks what instruction sets are supported before JITing.
You should check by running gdb /usr/bin/node, and see where it faults. Type run my_program.js on the GDB command line to start the program. (You can't pass args to node.js when you first start gdb. You have to specify args from inside gdb when you run.)
If the address of the instruction that raised SIGILL is in a region of memory that's mapped to a file (look in /proc/pid/maps if gdb doesn't tell you), that tells you which ahead-of-time compiled executable or library is responsible. Recompile it with -march=geode.
If it's in anonymous memory, it's most likely JIT-compiler output.
GDB will print the instruction address when it stops when the program receives SIGILL. You can also print $ip to see the current value of EIP (the 32-bit mode instruction pointer).
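Looking up the faulting address in /proc/pid/maps can also be scripted. This is a rough Python sketch, assuming the usual maps line layout (address range, permissions, offset, device, inode, optional path); find_mapping is a made-up helper name.

```python
def find_mapping(maps_text, address):
    """Return the backing path for an address, '<anonymous>' if the mapping
    has no path, or None if no mapping contains the address."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a mapping line
        start, end = (int(x, 16) for x in fields[0].split("-"))
        if start <= address < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            return path or "<anonymous>"
    return None
```

With the contents of the maps file of the stopped process as maps_text, find_mapping(maps_text, 0x287a23c0) would name the file that contains the faulting instruction, or return "<anonymous>", which points at JIT output. Pseudo-paths such as [heap] or [stack] are returned unchanged.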
I’d like to run an x86 shared library that I grabbed from an apk on a non-android linux machine.
It’s linked against android libc, so I grabbed the libc.so from the android ndk.
After debugging segfaults for a while, I figured that libc.so is “cheating” and contains only nop implementations of many library functions:
$ objdump -d libc.so | grep memalign -A 8
0000bf82 <memalign>:
bf82: 55 push %ebp
bf83: 89 e5 mov %esp,%ebp
bf85: 5d pop %ebp
bf86: c3 ret
Now the ndk also contains a libc.a that contains actual implementations of these functions, but how do I get my process to load these and override the nop functions of libc.so?
Would also be interested on some more context on why android is doing this trick and how the overriding works there.
As you can see, the libc.so taken from the NDK contains only stubs, since its purpose is to provide the necessary information to the linker when it creates your own shared library or executable. Here is a nice explanation of why we need stub libraries.
So if you need a real libc.so binary - there are two alternatives:
grab it directly from Android device:
$ adb pull /system/lib/libc.so <local_destination>
Download factory ROM image for your device, unpack it, mount system.img to your local filesystem, and then again copy it from /system/lib of that mounted partition.
But even if you get the proper binary, making it work on your desktop Linux is a really painful exercise. There are at least two reasons:
Android and desktop Linux ELFs require different interpreters. You can check it with readelf:
$ readelf --all <android_binary> | grep interpreter
[Requesting program interpreter: /system/bin/linker]
$ readelf --all <linux_x64_binary> | grep interpreter
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
(The interpreter is a small program that is loaded by the kernel and performs the actual loading of your binary.) Obviously your Linux system has no /system/bin/linker, so the kernel will refuse to load such a binary. You would have to load the sections properly and resolve all dependencies yourself.
The Android kernel is not the same as a desktop one; it has some extra features that libc.so depends on, so even if you load the ELF somehow, it is still incompatible with your kernel and you will surely run into problems at some point.
To sum up: it is practically impossible to reuse Android binaries on desktop GNU/Linux, even if they target the same hardware architecture.
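If readelf is not at hand, the requested interpreter can also be read straight from the program headers. The Python below is a sketch under stated assumptions: it supports only little-endian ELF32/ELF64 files, and program_interpreter is a hypothetical helper name, not part of any tool.

```python
import struct

PT_INTERP = 3  # program header type that names the interpreter


def program_interpreter(elf_bytes):
    """Return the PT_INTERP path of an ELF image, or None if it has none."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_bytes[5] != 1:
        raise ValueError("this sketch only handles little-endian files")
    is64 = elf_bytes[4] == 2
    if is64:
        (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 32)
        e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 54)
    else:
        (e_phoff,) = struct.unpack_from("<I", elf_bytes, 28)
        e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 42)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", elf_bytes, base)
        if p_type != PT_INTERP:
            continue
        if is64:
            # Elf64_Phdr: p_offset at +8, p_filesz at +32
            (p_offset,) = struct.unpack_from("<Q", elf_bytes, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", elf_bytes, base + 32)
        else:
            # Elf32_Phdr: p_offset at +4, p_filesz at +16
            (p_offset,) = struct.unpack_from("<I", elf_bytes, base + 4)
            (p_filesz,) = struct.unpack_from("<I", elf_bytes, base + 16)
        raw = elf_bytes[p_offset:p_offset + p_filesz]
        return raw.rstrip(b"\0").decode()
    return None  # statically linked, or no interpreter requested
```

A result of /system/bin/linker on a machine that has no such file is enough to explain why the kernel refuses to start the binary.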
I'm trying to build custom binaries for the initrd of an x86 system. I took the generic precompiled Debian 7 gcc (version 4.7.2-5) and compiled the kernel with it. The next step was to put a helloworld program into the initrd in place of the init script, to check my progress. The helloworld program was also compiled with that gcc. When I tried to start my custom system, the kernel started with no problem, but the helloworld program hit errors:
kernel: init[24879] general protection ip:7fd7271585e0 sp:7fff1ef55070 error:0 in init[7fd727142000+20000]
(The numbers are not mine; I took a similar line from Google.) Helloworld program:
#include <stdio.h>
#include <unistd.h>  /* declares sleep() */
int main(){
    printf("Helloworld\r\n");
    sleep(9999999);
    return 0;
}
Compilation:
gcc -static -o init test.c
Earlier I got stuck on the same problem on an ARM system (I took a generic compiler, compiled the kernel and some binaries with it and tried to run them; the kernel ran, but the binaries did not). I solved it with a complete buildroot system, and used the buildroot compiler in later projects.
So my question is: what is the difference between a gcc compiled as part of buildroot and a generic precompiled gcc?
I know that the buildroot compiler is built in several steps, with different libs and so on. Is this the main difference, platform independence?
I don't need a solution; I can use buildroot anytime. I want to know the source of my problem, to avoid such problems in the future. Thanks.
UPD: I replaced sleep with while(1); and got the same result. My kernel output:
init[1]: general protection ip: 8053682 sp: bf978294 error: 0 in init[8048000+81000]
printk: 14300820 message suppressed.
and repeating every second.
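As a side note on reading these kernel lines: the bracketed part after "in" gives the base address and size of the mapping, so the offset of the faulting instruction is ip minus base. Here is a small Python sketch of that arithmetic; the regular expression and the name decode_gp_fault are my own assumptions, written only against the two message variants shown in this question.

```python
import re

_GP_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]:? general protection "
    r"ip:\s*(?P<ip>[0-9a-f]+) sp:\s*(?P<sp>[0-9a-f]+) "
    r"error:\s*(?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_gp_fault(line):
    """Extract ip, mapping base and offset from a 'general protection' line."""
    m = _GP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    return {
        "object": m["obj"],
        "ip": ip,
        "base": base,
        "offset": ip - base,                   # position inside the mapping
        "inside": base <= ip < base + size,    # sanity check
    }


line = ("init[1]: general protection ip: 8053682 sp: bf978294 "
        "error: 0 in init[8048000+81000]")
info = decode_gp_fault(line)
print(hex(info["offset"]))
```

For the line above the offset is 0xb682. Since this init is statically linked and appears to be loaded at its link address, the ip value 0x8053682 itself can probably be looked up in the binary's disassembly or given to addr2line.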
UPD2: I added vdso32-int80.so (original name, as in the kernel tree) and tested: no luck.
I added ld-linux.so (2 files: ld-2.13.so with a symbolic link) and tested: same error.
The Busybox way allows running binaries without any of these libraries; I tested this on an ARM platform.
Thanks for trying to help me, any other ideas?
After a Xen guest domain hang, I took a dump using xm dump-core. Following the sparse documentation I found, I tried using the crash utility to analyze the dump.
Unfortunately, the kernel image (Debian lenny) is stripped, so I am forced to make use of the map file.
However,
crash /boot/System.map-2.6.26-2-xen-amd64 vmlinux-2.6.26-2-xen-amd64 /mnt/my-core-file
(with vmlinux-2.6.26-2-xen-amd64 being the gunzip'ed vmlinuz image) fails:
crash: vmlinux-2.6.26-2-xen-amd64: no debugging data available
Then I read that current Xen versions produce ELF-compatible dumps for guest domains. Indeed, this seems to be the case:
~$ sudo file my-core-dump
my-core-dump: ELF 64-bit LSB core file x86-64, version 1
However, gdb vmlinux-2.6.26-2-xen-amd64 my-core-dump fails, too:
...is not a core dump: File format not recognized
Any hints?
Have you tried attaching to the domU console?
xm create domU.conf -c
On the subject of the core-dump file, I found this:
http://lists.xensource.com/archives/html/xen-devel/2006-12/msg00456.html
I just want to check that you aren't under the impression that 'xm
dump-core' emits an Elf core file. It doesn't -- the format is custom and as
far as I know is only interpreted by a set of gdbserver patches that we ship
in our repository. Does the crash utility really support this special
format?
Edit: This might help to debug the core-dump: http://os-drive.com/files/docbook/xen-faq.html#setup_gdb
We have an issue related to a Java application running under a (rather old) FC3 on an Advantech POS board with a Via C3 processor. The java application has several compiled shared libs that are accessed via JNI.
The Via C3 processor is supposed to be i686 compatible. Some time ago, after installing Ubuntu 6.10 on a Mini-ITX board with the same processor, I found out that this statement is not 100% true. The Ubuntu kernel hung on startup because the C3 lacks some specific, optional instructions of the i686 set. GCC uses these instructions by default when optimizing for i686. The solution, in this case, was to use an i386-compiled version of the Ubuntu distribution.
The base problem with the Java application is that the FC3 distribution was installed on the HD by cloning from an image of the HD of another PC, this time an Intel P4. Afterwards, the distribution needed some hacking to have it running such as replacing some packages (such as the kernel one) with the i386 compiled version.
The problem is that after working for a while the system completely hangs without a trace. I am afraid that some i686 code is left somewhere in the system and could be executed randomly at any time (for example after recovering from suspend mode or something like that).
My question is:
Is there any tool or way to find out which specific architecture extensions a binary file (executable or library) requires? file does not give enough information.
The Unix/Linux file command is great for this. It can generally detect the target architecture and operating system for a given binary (and has been maintained on and off since 1973, wow!).
Of course, if you're not running under Unix/Linux, you're a bit stuck. I'm currently trying to find a Java-based port that I can call at runtime, but no luck so far.
The unix file command gives information like this:
hex: ELF 32-bit LSB executable, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.17, not stripped
More detail about the architecture is hinted at by the (Unix) objdump -f <fileName> command, which returns:
architecture: arm, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000876c
This executable was compiled by a gcc cross compiler (compiled on an i86 machine for the ARM processor as a target)
I decided to add one more solution for anyone who ends up here: in my case the information provided by file and objdump wasn't enough, and grep wasn't much help. I resolved my case with readelf -a -W.
Note that this gives you quite a lot of information. The architecture-related information is at the very beginning and the very end. Here's an example:
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x83f8
Start of program headers: 52 (bytes into file)
Start of section headers: 2388 (bytes into file)
Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 8
Size of section headers: 40 (bytes)
Number of section headers: 31
Section header string table index: 28
...
Displaying notes found at file offset 0x00000148 with length 0x00000020:
Owner Data size Description
GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag)
OS: Linux, ABI: 2.6.16
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3
Tag_Advanced_SIMD_arch: NEONv1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_rounding: Needed
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_CPU_unaligned_access: v6
I think you need a tool that checks every instruction, to determine exactly which set it belongs to. Is there even an official name for the specific set of instructions implemented by the C3 processor? If not, it's even hairier.
A quick'n'dirty variant might be to do a raw search in the file, if you can determine the bit pattern of the disallowed instructions. Just test for them directly, could be done by a simple objdump | grep chain, for instance.
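A sketch of that quick'n'dirty idea in Python, working on the text that objdump -d prints. Which instructions to look for is an assumption on my part: the cmov family is used here purely as an example of optional i686 instructions, and count_mnemonics is a made-up name. Instruction prefixes such as lock or rep are not handled.

```python
from collections import Counter


def count_mnemonics(disassembly_text, prefixes=("cmov", "fcmov")):
    """Count instructions in 'objdump -d' output whose mnemonic starts
    with one of the given prefixes (example prefixes, not a C3 list)."""
    hits = Counter()
    for line in disassembly_text.splitlines():
        # Instruction lines look like: address:<TAB>bytes<TAB>mnemonic operands
        fields = line.split("\t")
        if len(fields) < 3 or not fields[2].strip():
            continue
        mnemonic = fields[2].split()[0]
        if mnemonic.startswith(tuple(prefixes)):
            hits[mnemonic] += 1
    return hits
```

Feeding it the captured output of objdump -d for each suspicious binary gives a rough list of candidates. It cannot prove that a flagged instruction is ever executed, only that it is present in the file.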
To answer the ambiguity of whether a Via C3 is a i686 class processor: It's not, it's an i586 class processor.
Cyrix never produced a true 686 class processor, despite their marketing claims with the 6x86MX and MII parts. Among other missing instructions, two important ones they didn't have were CMPXCHG8b and CPUID, which were required to run Windows XP and beyond.
National Semiconductor, AMD and VIA have all produced CPU designs based on the Cyrix 5x86/6x86 core (NxP MediaGX, AMD Geode, VIA C3/C7, VIA Corefusion, etc.) which have resulted in oddball designs where you have a 586 class processor with SSE1/2/3 instruction sets.
My recommendation if you come across any of the CPUs listed above and it's not for a vintage computer project (ie. Windows 98SE and prior) then run screaming away from it. You'll be stuck on slow i386/486 Linux or have to recompile all of your software with Cyrix specific optimizations.
Expanding upon @Hi-Angel's answer, I found an easy way to check the bit width of a static library:
readelf -a -W libsomefile.a | grep Class: | sort | uniq
Where libsomefile.a is my static library. Should work for other ELF files as well.
Quickest thing to find architecture would be to execute:
objdump -f testFile | grep architecture
This works even for binary.
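For completeness, the same two facts (class and machine) can be read from the first 20 bytes of an ELF file without any binutils. This is a minimal Python sketch; elf_summary is a hypothetical name and the machine table lists only a few well-known e_machine values. Like file and objdump -f, it says nothing about instruction set extensions, and it does not unpack .a archives.

```python
import struct

_CLASSES = {1: "ELF32", 2: "ELF64"}
# A few e_machine values from the ELF specification; not exhaustive.
_MACHINES = {3: "Intel 80386", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def elf_summary(header_bytes):
    """Return (class, machine) for the start of an ELF file."""
    if header_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] gives the byte order: 1 = little endian, 2 = big endian.
    endian = "<" if header_bytes[5] == 1 else ">"
    (e_machine,) = struct.unpack_from(endian + "H", header_bytes, 18)
    elf_class = _CLASSES.get(header_bytes[4], "unknown class")
    machine = _MACHINES.get(e_machine, "machine %d" % e_machine)
    return elf_class, machine
```

Calling elf_summary(open("testFile", "rb").read(20)) on the ARM executable from the readelf example above should give ("ELF32", "ARM").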