I have installed all the cross-compile packages on my Ubuntu system so far, but I am having a problem and need some help.
Processor : ARM926EJ-S rev 5 (v5l)
BogoMIPS : 184.72
Features : swp half thumb fastmult edsp java
CPU implementer : 0x41
CPU architecture: 5TEJ
CPU variant : 0x0
CPU part : 0x926
CPU revision : 5
Cache type : write-back
Cache clean : cp15 c7 ops
Cache lockdown : format C
Cache format : Harvard
I size : 32768
I assoc : 4
I line length : 32
I sets : 256
D size : 32768
D assoc : 4
D line length : 32
D sets : 256
Hardware : MT7108
Revision : 0000
Serial : 0000000000000000
This is the target machine I need to cross compile for. What flags should I
use when compiling?
You have an ARMv5 CPU with no floating-point unit. The -march=armv5 and -mfloat-abi=soft flags should be enough.
However, if those flags don't work for you, I would suggest writing the smallest possible C application to test the toolchain:
/* no includes */
int main(void) {
return 42;
}
and compiling it with the most complete/strict flags:
$ arm-linux-gnueabi-gcc -Wall --static -O2 -marm -march=armv5 simple.c -o simple
After this, push simple to the target, run it, then issue echo $? to verify that you get 42. If it works, try to see if you can get printf working. If that also works, you are pretty much set for everything. If printf fails, the easiest solution would be to find the right toolchain for your target.
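For the printf step, a minimal sketch along the same lines (the printf_test.c name is just illustrative), built with the same flags so any libc mismatch shows up immediately:
/* printf test for the cross toolchain; statically linked so libc
   problems surface right away on the target */
#include <stdio.h>

int main(void) {
    printf("hello from the target\n");
    return 0;
}
$ arm-linux-gnueabi-gcc -Wall --static -O2 -marm -march=armv5 printf_test.c -o printf_test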
apt-cache search arm | grep ^gcc- gives the following list,
gcc-4.7-aarch64-linux-gnu - GNU C compiler
gcc-4.7-arm-linux-gnueabi - GNU C compiler
gcc-4.7-arm-linux-gnueabi-base - GCC, the GNU Compiler Collection (base package)
gcc-4.7-arm-linux-gnueabihf - GNU C compiler
gcc-4.7-arm-linux-gnueabihf-base - GCC, the GNU Compiler Collection (base package)
gcc-4.7-multilib-arm-linux-gnueabi - GNU C compiler (multilib files)
gcc-4.7-multilib-arm-linux-gnueabihf - GNU C compiler (multilib files)
gcc-aarch64-linux-gnu - The GNU C compiler for arm64 architecture
gcc-arm-linux-gnueabi - The GNU C compiler for armel architecture
gcc-arm-linux-gnueabihf - The GNU C compiler for armhf architecture
You should install gcc-arm-linux-gnueabi, which is an alias for gcc-4.7-arm-linux-gnueabi. gcc-4.7-multilib-arm-linux-gnueabi is also possible, but more complicated. Use the flags -march=armv5te -mtune=arm926ej-s -msoft-float -mfloat-abi=soft. You can do more tuning by specifying the --param NAME=VALUE option to gcc with parameters tuned to your system's memory sub-system timing.
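As a hedged sketch, a tiny program that does some floating-point math is a quick way to confirm the soft-float setup; the float_test.c name and values are just illustrative:
/* soft-float sanity check: with -mfloat-abi=soft, all FP math here goes
   through libgcc helper routines and no FPU instructions are emitted */
#include <stdio.h>

int main(void) {
    double a = 3.5, b = 1.25;
    printf("a*b = %f, a/b = %f\n", a * b, a / b);
    return 0;
}
$ arm-linux-gnueabi-gcc -Wall --static -O2 -march=armv5te -mtune=arm926ej-s -msoft-float -mfloat-abi=soft float_test.c -o float_test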
You may not be able to use these gcc versions, as your Linux may be compiled with OABI and/or be quite ancient compared to the one the compiler was built for. In some cases, the libc will call a newer Linux API, which may not be present. If the compiler/libc was not configured to be backwards compatible, then it may not work with your system. You can use crosstool-ng to create a custom compiler that is built to suit your system, but this is much more complex.
I'm trying to cross compile a Go program for the armv6l architecture, specifically to get it running on a Raspberry Pi Zero W.
$ cat /proc/cpuinfo
processor : 0
model name : ARMv6-compatible processor rev 7 (v6l)
BogoMIPS : 697.95
Features : half thumb fastmult vfp edsp java tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xb76
CPU revision : 7
Hardware : BCM2835
Revision : 9000c1
Serial : 0000000059e0c5a9
Model : Raspberry Pi Zero W Rev 1.1
I was doing what I thought was right by running the command:
env CC=arm-linux-gnueabihf-gcc CGO_ENABLED=1 GOOS=linux GOARM=6 GOARCH=arm go build -o build/dht_sensor_exporter-$(VERSION)-linux-armv6 ./cmd/main.go
At first glance, after running file dht_sensor_exporter-$(VERSION)-linux-armv6 it looks good, as it outputs:
dht_sensor_exporter-$(VERSION)-linux-armv6: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=d84b6c4b0dc4b77b9c65db7bc30823e5dbe1078c, for GNU/Linux 3.2.0, with debug_info, not stripped
When I try to run it on the Pi Zero W, it outputs Illegal instruction. After a bit of digging, it seems that the cross-compile didn't actually generate the file for the right architecture.
I found this using the following command:
$ readelf -A dht_sensor_exporter
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3-D16
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_rounding: Needed
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_enum_size: int
Tag_ABI_VFP_args: VFP registers
Tag_CPU_unaligned_access: v6
Looking at Tag_CPU_arch, it seems it has actually been compiled for ARMv7.
Any help would be much appreciated. The repo in question is here and is built using GitHub Actions (workflow, Makefile) on the latest Ubuntu version.
I'm still new at this and had to do some "interesting," potentially incorrect things with the C/C++ compiler (arm-linux-gnueabihf-gcc).
The result of the investigation below: recent Node.js is not portable to AMD Geode (or other non-SSE x86) processors!
I dived deeper into the code and got stuck in the ia32 assembler implementation, which deeply integrates SSE/SSE2 instructions into the code (macros, macros, macros, ...). The main consequence is that you cannot run a recent version of Node.js on AMD Geode processors due to the lack of the newer instruction set extensions. The fallback to 387 arithmetic only works for the Node.js code, not for the V8 JavaScript compiler it depends on. Adjusting V8 to support non-SSE x86 processors is a pain and a lot of effort.
If someone produces proof to the contrary, I would be really happy to hear about it ;-)
Investigation History
I have a running ALIX.2D13 (https://www.pcengines.ch), which has an AMD Geode LX as its main processor. It runs Voyage Linux, a Debian Jessie-based distribution for resource-restricted embedded devices.
root@voyage:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 5
model : 10
model name : Geode(TM) Integrated Processor by AMD PCS
stepping : 2
cpu MHz : 498.004
cache size : 128 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu de pse tsc msr cx8 sep pge cmov clflush mmx mmxext 3dnowext 3dnow 3dnowprefetch vmmcall
bugs : sysret_ss_attrs
bogomips : 996.00
clflush size : 32
cache_alignment : 32
address sizes : 32 bits physical, 32 bits virtual
When I install Node.js 8.x following the instructions on https://nodejs.org/en/download/package-manager/, I get an "illegal instruction" error (translated from the German error output). This also happens when I download the binary for 32-bit x86, and also when I compile it manually.
After the answers below, I changed the compiler flags in deps/v8/gypfiles/toolchain.gypi by removing -msse2 and adding -march=geode -mtune=geode. Now I get the same error, but with a stack trace:
root@voyage:~/GIT/node# ./node
#
# Fatal error in ../deps/v8/src/ia32/assembler-ia32.cc, line 109
# Check failed: cpu.has_sse2().
#
==== C stack trace ===============================
./node(v8::base::debug::StackTrace::StackTrace()+0x12) [0x908df36]
./node() [0x8f2b0c3]
./node(V8_Fatal+0x58) [0x908b559]
./node(v8::internal::CpuFeatures::ProbeImpl(bool)+0x19a) [0x8de6d08]
./node(v8::internal::V8::InitializeOncePerProcessImpl()+0x96) [0x8d8daf0]
./node(v8::base::CallOnceImpl(int*, void (*)(void*), void*)+0x35) [0x908bdf5]
./node(v8::internal::V8::Initialize()+0x21) [0x8d8db6d]
./node(v8::V8::Initialize()+0xb) [0x86700a1]
./node(node::Start(int, char**)+0xd3) [0x8e89f27]
./node(main+0x67) [0x846845c]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xb74fc723]
./node() [0x846a09c]
Ungültiger Maschinenbefehl
root@voyage:~/GIT/node#
If you now look into this file, you will find the following:
... [lines 107-110]
void CpuFeatures::ProbeImpl(bool cross_compile) {
base::CPU cpu;
CHECK(cpu.has_sse2()); // SSE2 support is mandatory.
CHECK(cpu.has_cmov()); // CMOV support is mandatory.
...
I commented out the line, but I still get "Ungültiger Maschinenbefehl" (illegal instruction).
This is what gdb ./node shows (after executing run):
root@voyage:~/GIT/node# gdb ./node
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
[...]
This GDB was configured as "i586-linux-gnu".
[...]
Reading symbols from ./node...done.
(gdb) run
Starting program: /root/GIT/node/node
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
[New Thread 0xb7ce2b40 (LWP 29876)]
[New Thread 0xb74e2b40 (LWP 29877)]
[New Thread 0xb6ce2b40 (LWP 29878)]
[New Thread 0xb64e2b40 (LWP 29879)]
Program received signal SIGILL, Illegal instruction.
0x287a23c0 in ?? ()
(gdb)
I think it is necessary to compile with debug symbols...
make clean
make CFLAGS="-g"
No chance to resolve all the SSE/SSE2 problems... Giving up! See the topmost section.
Conclusion: Node.js + V8 normally requires SSE2 when running on x86.
On the V8 ports page: x87 (not officially supported)
Contact/CC the x87 team in the CL if needed. Use the mailing list v8-x87-ports.at.googlegroups.com for that purpose.
JavaScript generally requires floating point (every numeric variable is floating point, and using integer math is only an optimization), so it's probably hard to avoid having V8 actually emit FP math instructions.
V8 is currently designed to always JIT, not interpret. It starts off with, or falls back to, JITing unoptimized machine code while it's still profiling, or when it hits something that makes it "deoptimize".
There is an effort to add an interpreter to V8, but it might not help because the interpreter itself will be written using the TurboFan JIT backend. It's not intended to make V8 portable to architectures it doesn't currently know how to JIT for.
Crazy idea: run node.js on top of a software emulation layer (like Intel's SDE or maybe qemu-user) that could emulate x86 with SSE/SSE2 on an x86 CPU supporting only x87. They use dynamic translation, so would probably run at near-native speed for code that didn't use any SSE instructions.
This may be crazy because Node.js + V8 probably uses some virtual-memory tricks that might confuse an emulation layer. I'd guess that qemu should be robust enough, though.
The original answer is left below as a generic guide to investigating this kind of issue for other programs. (Tip: grep the Makefiles and so on for -msse or -msse2, or check the compiler command lines for those flags with pgrep -a gcc while it's building.)
Your cpuinfo says it has CMOV, which is a 686 (PPro/P6) feature, so the Geode qualifies as i686. What's missing compared to a "normal" CPU is SSE2, which is enabled by default for -m32 (32-bit mode) in some recent compiler versions.
Anyway, what you should do is compile with -march=geode -O3, so gcc or clang will use everything your CPU supports, but no more.
-O3 -msse2 -march=geode would tell gcc that it can use everything Geode supports as well as SSE2, so you need to remove any -msse and -msse2 options, or add -mno-sse after them. In node.js, deps/v8/gypfiles/toolchain.gypi was setting -msse2.
Using -march=geode implies -mtune=geode, which affects code-gen choices that don't involve using new instructions, so with luck your binary will run faster than if you'd simply used -mno-sse to control instruction-set stuff without overriding -mtune=generic. (If you're building on the geode, you could use -march=native, which should be identical to using -march=geode.)
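As a hedged sanity check on the final flag set (gcc predefines the __SSE2__ macro whenever SSE2 code generation is enabled), you can make a small translation unit fail loudly if SSE2 is still on; the sse2_check.c name is just illustrative:
/* fails to compile if the flags still allow the compiler to emit SSE2 */
#ifdef __SSE2__
#error "SSE2 code generation is enabled; this may SIGILL on a Geode LX"
#endif

int main(void) { return 0; }
$ gcc -march=geode -O3 sse2_check.c -o sse2_check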
The other possibility is that the problem instructions are in JavaScript functions that were JIT-compiled.
node.js uses V8. I did a quick Google search but didn't find anything about telling V8 not to assume SSE/SSE2. If it doesn't have a fallback code-gen strategy (x87 instructions) for floating point, then you might have to disable JIT altogether and make it run in interpreter mode. (Which is slower, so that may be a problem.)
But hopefully V8 is well-behaved and checks which instruction sets are supported before JITing.
You should check by running gdb /usr/bin/node, and see where it faults. Type run my_program.js on the GDB command line to start the program. (You can't pass args to node.js when you first start gdb. You have to specify args from inside gdb when you run.)
If the address of the instruction that raised SIGILL is in a region of memory that's mapped to a file (look in /proc/pid/maps if gdb doesn't tell you), that tells you which ahead-of-time compiled executable or library is responsible. Recompile it with -march=geode.
If it's in anonymous memory, it's most likely JIT-compiler output.
GDB will print the instruction address when it stops when the program receives SIGILL. You can also print $ip to see the current value of EIP (the 32-bit mode instruction pointer).
I have a "Seagate Central" NAS with an embedded linux on it
$ cat /etc/*release
MontaVista Linux 6, (.dev-snapshot-20130726)
When I try to run my own application on this NAS, it gets "Killed" without any notification in dmesg or /var/log/messages.
$ cat /proc/cpuinfo
Processor : ARMv6-compatible processor rev 4 (v6l)
BogoMIPS : 279.34
Features : swp half thumb fastmult vfp edsp java
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xb02
CPU revision : 4
Hardware : Cavium Networks CNS3420 Validation Board
Revision : 0000
Serial : 0000000000000000
My toolchain is
Sourcery_CodeBench_Lite_for_ARM_GNU_Linux/arm-none-linux-gnueabi
and my compile switches are
-march=armv6k -mcpu=mpcore -mfloat-abi=softfp -mfpu=vfp
How can I find out which process is killing my application, or what setting I have to change?
PS: I have created a simple HelloWorld application, which is also not working!
$ ldd Hello
not a dynamic executable
readelf -a Hello
=> http://pastebin.com/kT9FvkjE
readelf -a zip
=> http://pastebin.com/3V6kqA9b
UPDATE 1
I have compiled a new binary with hard float.
Readelf output
http://pastebin.com/a87bKksY
But no success ;(
I guess it is really a "lock" topic, which is blocking my application to execute. How can I find out what application kills mine ?
Or how can I disable such kind of function ?
Use these compiler switches:
-march=armv6k -Wl,-z,max-page-size=0x10000,-z,common-page-size=0x10000,-Ttext-segment=0x10000
See also this link regarding the toolchain.
You can run readelf -a against one of the built-in binaries (e.g. /usr/bin/nano) to see the proper text-segment offset in the section headers and the page size / alignment in the program headers. The above compiler flags make self-compiled programs match the structure of the built-in binaries, and have been tested to work. It seems the Seagate Central NAS uses a page size / offset of 0x10000, while the default for ARM gcc is 0x8000.
Edit: I see you ran readelf already. Your pastebin shows
HelloWorld:[ 1] .interp PROGBITS 00008134 000134 000013 00 A 0 0 1
zip:[ 1] .interp PROGBITS 00010134 000134 000013 00 A 0 0 1
The value 0x10134 - 0x134 = 0x10000 yields the correct text-segment linker option. Further down (LOAD...) are the alignment specifiers, which are 0x8000 for your HelloWorld but 0x10000 for the zip built-in. In my experience, soft-float has not caused problems.
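For concreteness, a hedged end-to-end sketch using the toolchain prefix from the question (the hello.c name and contents are just placeholders):
/* hello.c: minimal program to test the page-size/text-segment link options */
#include <stdio.h>

int main(void) {
    printf("hello from the Seagate Central\n");
    return 0;
}
$ arm-none-linux-gnueabi-gcc -march=armv6k -Wl,-z,max-page-size=0x10000,-z,common-page-size=0x10000,-Ttext-segment=0x10000 hello.c -o hello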
Do you see any output at all?
Is your application dynamically linked?
If so, run the dynamic linker with the verbose option (you'll have to figure out the name of the dynamic linker on your platform; for Arch Linux, it is ldd):
ldd --verbose 'your_program_name'
That will tell you if you're missing any dependencies (shared libs etc)
Run readelf -a 'your_program_name'
Make sure the file mentioned in Requesting program interpreter: /lib/ld-linux.so.2 exists. In this case, that filename is /lib/ld-linux.so.2
If this fails to help you figure out the problem, post the complete output of ldd --verbose 'your_program_name' and readelf -a 'your_program_name' in your question.
Another issue may be that the NAS software just kills foreign programs. I'm not sure why it would, but we're talking about a big corporation here (Seagate) and they have odd ideas of how the world works at times...
Edit, after looking at the pastebin of readelf:
From what I see, your Hello executable differs in 2 ways from the zip executable:
It is not dynamically linked, so that throws out a whole load of problems to look for.
There's a difference in how the two programs are built: zip does not use soft-float, but Hello does. I suspect the soft-float dependency is due to one or both of these compiler switches: -mfloat-abi=softfp -mfpu=vfp
Hello Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
zip Flags: 0x5000002, has entry point, Version5 EABI
I'd start with either:
removing the soft-float option from the Hello build, or
making sure the soft-float emulation libraries are on the machine. I don't know which libs this would depend on, but I do remember MontaVista supplying them the last time I touched their software. It's been 8+ years since I touched MontaVista, so it's clouded in a bit of old-memory fog.
This is an old thread, but I just wanted to add that I succeeded in compiling a "hello world" for this old NAS today.
Running ld-linux.so.3 <app> told me that
ELF load command alignment not page-aligned
Googling this, I found https://github.com/JuliaLang/julia/issues/33293, which pointed me to linker options:
-Wl,-z,max-page-size=0x10000
Compiling with these options yielded an ELF that actually did work!
Are you sure your compilation options are correct?
Try the following:
strace your application (if strace is present on the NAS)
download one of the NAS binaries and run arm-none-linux-gnueabi-readelf -a on it; do the same on your hello-world program and see if the ABI tags differ.
It looks like an illegal instruction issue, a floating point issue or an incompatible libc issue.
Edit: according to the readelf output, the NAS programs are compiled without soft-float; you should try that.
Actually I have 2 questions:
Is SSE2 compatibility a CPU issue or a compiler issue?
How do I check whether my CPU or compiler supports SSE2?
I am using GCC Version:
gcc (GCC) 4.5.1
When I tried to compile some code, it gave me this error:
$ gcc -O3 -msse2 -fno-strict-aliasing -DHAVE_SSE2=1 -DMEXP=19937 -o test-sse2-M19937 test.c
cc1: error: unrecognized command line option "-msse2"
And cpuinfo showed this:
processor : 0
vendor : GenuineIntel
arch : IA-64
family : 32
model : 1
model name : Dual-Core Intel(R) Itanium(R) Processor 9140M
revision : 1
archrev : 0
features : branchlong, 16-byte atomic ops
cpu number : 0
cpu regs : 4
cpu MHz : 1669.000503
itc MHz : 416.875000
BogoMIPS : 3325.95
siblings : 2
physical id: 0
core id : 0
thread id : 0
The CPU needs to be able to execute SSE2 instructions, and the compiler needs to be able to generate them.
To check if your cpu supports SSE2:
# cat /proc/cpuinfo
It will be somewhere under "flags" if it is supported.
Update: So your CPU doesn't support it.
For the compiler:
# gcc -dumpmachine
# gcc --version
The target of your compiler needs to be some kind of x86*, since only those CPUs support SSE2, which is part of the x86 instruction set,
AND
the gcc version needs to be >= 3.1 (most likely fine, since that release is about 10 years old) to support SSE2.
Update: So your compiler doesn't support it for this target; it will if you use it as a cross-compiler for x86.
It's both. The compiler/assembler need to be able to emit/handle SSE2 instructions, and then the CPU needs to support them. If your binary has SSE2 instructions with no conditions attached and you try to run it on a Pentium II, you are out of luck.
The best way is to check your GCC manual. For example, my GCC manpage refers to the -msse2 option, which allows you to explicitly enable SSE2 instructions in the binaries. Any relatively recent GCC or ICC should support it. As for your CPU, check the flags line in /proc/cpuinfo.
It would be best, though, to have checks in your code using cpuid etc., so that SSE2 sections can be disabled on CPUs that do not support them and your code can fall back to a more common instruction set.
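A hedged sketch of such a runtime check, using GCC's <cpuid.h> helper (only available when the compiler targets x86/x86-64; newer GCC also offers __builtin_cpu_supports("sse2") as a simpler alternative):
/* runtime SSE2 detection via CPUID leaf 1, EDX bit 26 */
#include <stdio.h>
#include <cpuid.h>

static int have_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                  /* CPUID leaf 1 not available */
    return (edx & bit_SSE2) != 0;  /* bit_SSE2 == 1 << 26 */
}

int main(void)
{
    puts(have_sse2() ? "SSE2 supported" : "SSE2 not supported");
    return 0;
}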
EDIT:
Note that your compiler needs to be either a native compiler running on an x86 system or a cross-compiler for x86. Otherwise, it will not have the necessary options to compile binaries for x86 processors, which includes anything with SSE2.
In your case, the CPU does not support x86 at all. Depending on your Linux distribution, there might be packages with the Intel IA32EL emulation layer for x86-software-on-IA64, which may allow you to run x86 software.
Therefore you have the following options:
Use a cross-compiler that will run on IA64 and produce binaries for x86. Cross-compiler toolchains are not an easy thing to setup though, because you need way more than just the compiler (binutils, libraries etc).
Use Intel IA32EL to run a native x86 compiler. I don't know how you would go about installing a native x86 toolchain and all the libraries that your project needs if your distribution does not support it directly. Perhaps a full-blown chroot'ed installation of an x86 distribution?
Then, if you want to test your build on this system, you have to install Intel's IA32EL for Linux.
EDIT2:
I suppose you could also run a full x86 Linux distribution on an emulator like Bochs or QEMU (with no virtualization, of course). You are definitely not going to be dazzled by the resulting speed, though.
Another trick not yet mentioned is to do:
gcc -march=native -dM -E - </dev/null | grep SSE2
and get:
#define __SSE2_MATH__ 1
#define __SSE2__ 1
With -march=native you are checking both your compiler and your CPU. If you give a different -march for a specific CPU, like -march=bonnell, you can check for that CPU.
Consult your gcc docs for the correct version of gcc:
https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Submodel-Options.html
Use asm to check for the existence of SSE2:
/* MSVC-style 32-bit inline asm: CPUID leaf 1, EDX bit 26 indicates SSE2 */
static
bool HaveSSE2()
{
    __asm mov EAX, 1          ; request CPUID leaf 1
    __asm cpuid               ;
    __asm test EDX, 4000000h  ; test whether bit 26 (SSE2) is set
    __asm jnz yes             ; it is
    return false;
yes:
    return true;
}
Try running:
lshw -class processor | grep -w sse2
and look under the processor section.
Is it (easily) possible to use software floating point on i386 Linux without incurring the expense of trapping into the kernel on each call? I've tried -msoft-float, but it seems the normal (Ubuntu) C libraries don't have an FP library included:
$ gcc -m32 -msoft-float -lm -o test test.c
/tmp/cc8RXn8F.o: In function `main':
test.c:(.text+0x39): undefined reference to `__muldf3'
collect2: ld returned 1 exit status
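(The exact test.c isn't shown; a hedged stand-in that triggers the same __muldf3 reference under -msoft-float would be something like the following.)
/* hypothetical test.c: multiplying doubles makes gcc call __muldf3
   when -msoft-float is in effect */
#include <stdio.h>

int main(void) {
    volatile double a = 1.5, b = 2.25;  /* volatile prevents constant folding */
    printf("%f\n", a * b);
    return 0;
}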
It is surprising that gcc doesn't support this natively as the code is clearly available in the source within a directory called soft-fp. It's possible to compile that library manually:
$ svn co svn://gcc.gnu.org/svn/gcc/trunk/libgcc/ libgcc
$ cd libgcc/soft-fp/
$ gcc -c -O2 -msoft-float -m32 -I../config/arm/ -I.. *.c
$ ar -crv libsoft-fp.a *.o
There are a few C files that don't compile due to errors, but the majority do. After copying libsoft-fp.a into the directory with our source files, they now compile fine with -msoft-float:
$ gcc -g -m32 -msoft-float test.c -lsoft-fp -L.
A quick inspection using
$ objdump -D --disassembler-options=intel a.out | less
shows that, as expected, no x87 floating-point instructions are used, and the code runs considerably slower as well: by a factor of 8 in my example, which uses lots of division.
Note: I would've preferred to compile the soft-float library with
$ gcc -c -O2 -msoft-float -m32 -I../config/i386/ -I.. *.c
but that results in loads of error messages like
adddf3.c: In function '__adddf3':
adddf3.c:46: error: unknown register name 'st(1)' in 'asm'
It seems the i386 version is not well maintained, as st(1) refers to one of the x87 registers, which are obviously not available when using -msoft-float.
Strangely, or luckily, the ARM version compiles fine on i386 and seems to work just fine.
Unless you want to bootstrap your entire toolchain by hand, you could start with the uClibc toolchain (the i386 version, I imagine) -- soft-float is (AFAIK) not directly supported for "native" compilation on Debian and derivatives, but it can be used via the "embedded" approach of the uClibc toolchain.
GCC does not support this without some extra libraries. From the i386 documentation:
-msoft-float
Generate output containing library calls for floating point. Warning: the requisite libraries are not part of GCC. Normally the facilities of the machine's usual C compiler are used, but this can't be done directly in cross-compilation. You must make your own arrangements to provide suitable library functions for cross-compilation.
On machines where a function returns floating point results in the 80387 register stack, some floating point opcodes may be emitted even if -msoft-float is used.
Also, you cannot set -mfpmath=unit to "none"; it has to be sse, 387, or both.
However, according to this GNU wiki page, there are fp-soft and ieee. There is also SoftFloat.
(For ARM there is -mfloat-abi=softfp, but it does not seem like something similar is available for 386 SX).
It does not seem like tcc supports software floating point numbers either.
Good luck finding a library that works for you.
G'day,
Unless you're targeting a platform that doesn't have built-in FP support, I can't think of a reason why you'd want to emulate FP support.
Doesn't your 386 platform have external FPU support? Pity it's not a 486 with the FPU built in!
In my experience, any soft emulation is bound to be much slower than its hardware equivalent.
That's why I finished up writing a package in Ada to target the onboard 68k FPU instead of using the soft emulation provided by the compiler manufacturer at the time. They ended up bundling it in their compiler, as a matter of fact.
Edit: Just seen your comment below. Hmmm, if you don't need a full suite of FP support, is it possible to roll your own for the few math functions you do need? That's how the Ada package I mentioned got started.
HTH
cheers,