Can we convert a 32-bit library file to a 64-bit library file? - linux

I have some 32-bit library files (.a files) on Solaris. I am porting my application to a 64-bit Linux environment. Is there any way to convert the 32-bit libraries to 64-bit, or should I regenerate the libraries in 64-bit?

It is not just a question of 32-bit vs 64-bit. It's also a question of Solaris versus Linux. These are two operating systems that have different calling conventions and different ABIs. That means things like sizes of data types can be different, the way the compiler puts stuff in registers and on the stack to do a function call is different, the way system calls are done is different, etc.
It is probably possible to convert a static library in the way you want, in some cases, but you would need to write the tools yourself. Compiling from source is way easier, much more reliable, and also something you need to be able to do at will anyway (otherwise you can't easily fix problems in the library, e.g., security issues).
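To make the data-type point concrete: the same source builds into different layouts under the 32-bit (ILP32) and 64-bit (LP64) models, which is part of why object code cannot simply be re-flagged. A minimal sketch, assuming a GCC toolchain with 32-bit multilib support installed for the -m32 build:

    // sizes.cpp - build twice and compare:
    //   g++ -m32 sizes.cpp -o sizes32   (ILP32, as on 32-bit Solaris/Linux)
    //   g++      sizes.cpp -o sizes64   (LP64, as on 64-bit Linux)
    #include <cstddef>
    #include <cstdio>

    int main() {
        // ILP32: long = 4, void* = 4, size_t = 4.  LP64: all three are 8.
        std::printf("long   : %zu bytes\n", sizeof(long));
        std::printf("void*  : %zu bytes\n", sizeof(void *));
        std::printf("size_t : %zu bytes\n", sizeof(std::size_t));
        return 0;
    }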

No; you have to recompile them for 64-bit, because a lot of necessary information is lost during the compilation.
Good luck.

Related

C++ .a: what affects portability across distros?

I'm building a .a from C++ code. It only depends on the standard library (libc++/libstdc++). From general reading, it seems that portability of binaries depends on
compiler version (because it can affect the ABI). For gcc, the ABI is linked to the major version number.
libc++/libstdc++ versions (because they could pass a vector<T> into the .a and its representation could change).
I.e. someone using the .a needs to use the same (major version of) the compiler + same standard library.
As far as I can see, if compiler and standard library match, a .a should work across multiple distros. Is this right? Or is there gubbins relating to system calls, etc., meaning a .a for Ubuntu should be built on Ubuntu, .a for CentOS should be built on CentOS, and so on?
Edit: see If clang++ and g++ are ABI incompatible, what is used for shared libraries in binary? (though it doesn't answer this question)
Edit 2: I am not accessing any OS features explicitly (e.g. via system calls). My only interaction with the system is to open files and read from them.
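One well-known instance of the compiler/standard-library coupling is libstdc++'s dual ABI (it changes std::string and std::list, not std::vector). A minimal probe, assuming GCC and libstdc++, that both the .a author and its consumer can compile with their own flags and compare:

    // abi_probe.cpp - report the compiler and which libstdc++ ABI is active.
    #include <cstdio>

    int main() {
    #if defined(__GNUC__) && defined(_GLIBCXX_USE_CXX11_ABI)
        // 1 = the new (GCC >= 5) std::string/std::list ABI, 0 = the old one.
        std::printf("g++ %d.%d, _GLIBCXX_USE_CXX11_ABI=%d\n",
                    __GNUC__, __GNUC_MINOR__, _GLIBCXX_USE_CXX11_ABI);
    #else
        std::printf("not GCC/libstdc++, or the dual-ABI macro is not defined\n");
    #endif
        return 0;
    }

If the two sides print different values, the .a and the program should not be mixed.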
It only depends on the standard library
It could also depend implicitly upon other things (think of resources like fonts, configuration files under /etc/, header files under /usr/include/, availability of /proc/, of /sys/, external programs run by system(3) or execvp(3), specific file systems or devices, particular ioctl-s, available or required plugins, etc...)
These are the kind of details that might make the porting difficult. For example, look into nsswitch.conf(5).
The devil is in the details.
(In other words, without a lot more details, your question doesn't make much sense.)
Linux is perceived as a free software ecosystem. The usual way of porting something is to recompile it on -or at least for- the target Linux distribution. When you do that several times (for different and many Linux distros), you'll understand what details are significant in your particular software (and distributions).
Most of the time, recompiling and porting a library on a different distribution is really easy. Sometimes, it might be hard.
For shared libraries, reading Program Library HowTo, C++ dlopen miniHowTo, elf(5), your ABI specification (see here for some incomplete list), Drepper's How To Write Shared Libraries could be useful.
My recommendation is to prepare binary packages for various common Linux distributions. For example, a .deb for Debian & Ubuntu (some particular versions of them).
Of course a .deb for Debian might not work on Ubuntu (sometimes it does).
Look also into things like autoconf (or cmake). You may at least want to have some externally provided #define-d preprocessor strings (often passed via -D to gcc or g++) which would vary from one distribution to the next (e.g. on some distributions you print by popen-ing lp, on others by popen-ing lpr, on others by interacting with some CUPS server, etc.). Details matter.
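As a rough illustration of that last point (PRINT_COMMAND is a made-up macro name here, chosen per distribution at build time):

    // print_page.cpp - build per distribution, e.g.:
    //   g++ -DPRINT_COMMAND='"lpr"' print_page.cpp -o print_page
    #include <stdio.h>    // popen/pclose are POSIX functions declared here on Linux
    #include <stdlib.h>

    #ifndef PRINT_COMMAND
    #define PRINT_COMMAND "lp"   // default when the build system passes nothing
    #endif

    int main() {
        // Hand a trivial job to whichever spooler command this distribution uses.
        FILE *spooler = popen(PRINT_COMMAND, "w");
        if (!spooler) {
            perror("popen");
            return EXIT_FAILURE;
        }
        fputs("hello from a distribution-specific build\n", spooler);
        return pclose(spooler) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
    }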
My only interaction with the system is to open files
But even these vary a lot from one distribution to another one.
It is probable that you won't be able to provide a single (and the same) lib*.a for several distributions.
NB: you probably need to budget more work than what you believe.

How to install a single Perl Crypt::OpenSSL::AES for use by different linux environments

I have a sticky problem that I am not quite sure how to solve. The situation is as follows:
We have a common 32bit perl 5.10.0
It is used by both 32bit and 64bit linux machines
Now the problem is that I need to install the Crypt::OpenSSL::AES module for this Perl; however, since the module builds a shared library, a number of problems appear:
If built on a 64-bit machine, the module is not usable: loading the generated AES.so fails with "wrong ELF class: ELFCLASS64".
If built on a 32-bit machine, the module is not usable on the 64-bit machines: it fails with "undefined symbol: AES_encrypt".
The problem I'm guessing is that the different machines have different versions of OpenSSL installed and they are not compatible with each other.
My question is: given that I cannot change any of the machine configurations, what should I do to get the AES module working on all the machines?
Thanks!
I solved the problem with a combination of staticperl and building a statically linked Crypt::OpenSSL::AES, so that I have a single perl executable that is fully statically linked.
Given that I am not able to modify the environment, this is the best I can come up with.
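For diagnosing the "wrong ELF class" side of such problems, a small sketch that reads the ELF identification bytes of a shared object (file AES.so or readelf -h AES.so tells you the same thing from the shell):

    // elfclass.cpp - print whether a shared object is 32-bit or 64-bit.
    // Usage: ./elfclass /path/to/AES.so
    #include <cstdio>
    #include <fstream>

    int main(int argc, char **argv) {
        if (argc != 2) {
            std::fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 2;
        }
        std::ifstream f(argv[1], std::ios::binary);
        unsigned char ident[5] = {0};
        if (!f.read(reinterpret_cast<char *>(ident), sizeof ident)) {
            std::fprintf(stderr, "cannot read %s\n", argv[1]);
            return 2;
        }
        // Bytes 0-3 are the ELF magic; byte 4 (EI_CLASS) is 1 for ELFCLASS32,
        // 2 for ELFCLASS64.
        if (ident[0] != 0x7f || ident[1] != 'E' || ident[2] != 'L' || ident[3] != 'F') {
            std::printf("%s: not an ELF file\n", argv[1]);
            return 1;
        }
        std::printf("%s: %s\n", argv[1],
                    ident[4] == 1 ? "ELFCLASS32 (32-bit)" :
                    ident[4] == 2 ? "ELFCLASS64 (64-bit)" : "unknown ELF class");
        return 0;
    }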
Perl's default configuration very intentionally puts platform-specific things in a separate directory; you appear to have broken that model. Consider restoring it.
I assume you built your perl on a 32-bit machine, so during the build process, Configure didn't include any of the 'make this 32-bit' compiler switches. If you build a module on a 64-bit machine now, the build process will use exactly the same switches, so you get a 64-bit binary that can't be loaded from 32-bit perl - not even on the 64-bit machines, because the 32-bit perl binary you're running there can't load a 64-bit shared library either.
You might try building your shared perl on a 64 bit machine, explicitly stating you want a 32 bit perl. There should be some configure parameters for this. That way, you have a perl that sets the "use 32 bit" compiler flag when building modules. Then, you can use that version of perl on each of the machines to build the module. The modules won't be identical, but each of them will run on its bit size, and your software distribution process could pull the correct module when distributing to a specific machine.
However, the real problem lies somewhat deeper. I assume someone in your company at some point said "We don't want to be dependent on what the distributions provide, let's build our own perl that we can copy everywhere". This sounds like a good idea, but it is NOT. Different Linux versions use different versions of shared libraries, different default directories for configuration files, different default path variables, etc. The configure process takes care of that and creates a perl binary for exactly your machine. If you copy this to a different machine, it might not find symbols in other versions of shared libraries. It might try to read libs from directories that don't exist there. It might not include a workaround for some bug that was corrected on the machine where you built it, but is needed on the older system you copied it to. Or it might include a workaround for something that has long been fixed on the newer system, thus wasting CPU time.
So, essentially, creating one perl to copy everywhere will ONLY work well if you build a static perl that includes everything and doesn't need any shared libraries. The standard, shared-library-using perl you compile on one machine does NOT meet the "behaves the same everywhere I copy it to" requirement you probably had, because it depends way too much on stuff "around" it.

Using x86 materials to learn assembly on a 64 bit OS?

I am teaching myself/reading up about assembly. Most of the books on assembly refer to x86: all the register names in the code begin with "e" and not "r" (as they would in x86-64). However, I use 64-bit Linux, and I was wondering if these books have any value, given that they are not referring to x86-64.
So, in short: is it really worth using these resources to learn x86-64? Or, put differently: besides the difference in register naming conventions, are there any other differences between the two that could make learning from x86 materials difficult?
64-bit Linux allows running 32-bit applications, so you can still create 32-bit applications on your computer. This way, the books and their example 32-bit code remain fully useful.
The only problem you might have is if your assembly application dynamically links against some 32-bit shared library. To fix this, you should install the 32-bit compatibility layer.
Assembly programs that use only Linux system calls work fine without this layer, which is actually just a set of shared libraries compiled for 32-bit.
BTW, in my opinion, writing 32 bit code is still better if you want your programs to be useful for more people. There are still many 32 bit computers around and they will not disappear soon.
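For example, a trivial program built with -m32 runs directly on a 64-bit kernel; a minimal sketch, assuming the distribution's 32-bit multilib/ia32 packages are installed:

    // hello32.cpp - exercise 32-bit builds on a 64-bit host.
    //   g++ -m32 hello32.cpp -o hello32            (needs the 32-bit runtime libraries)
    //   g++ -m32 -static hello32.cpp -o hello32s   (no 32-bit shared libraries needed at run time)
    #include <cstdio>

    int main() {
        // Prints 32 for the -m32 builds and 64 for a default build.
        std::printf("running as a %zu-bit process\n", sizeof(void *) * 8);
        return 0;
    }

The -static variant mirrors the point above: a 32-bit program that drags in no shared libraries (or only makes raw system calls) does not need the 32-bit compatibility layer at run time.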
It's indeed a bit easier to learn assembly on 32bit since the calling conventions and stack management are simpler.
On 64-bit you need to worry about the ABI. Not only that, but the conventions are not the same for every OS. For instance, the ABI rules on Mac OS X are different from those on Windows (the registers used are not the same, and on Windows only 4 registers are used to pass arguments).
You can compile your assembly code using -arch i386 with the assembler (as). With clang or gcc you can use -m32 (at least on Mac OS X, since I haven't used it on Linux proper). You won't be able to link modules that have different bitness (32bit vs 64bit).
Once you're ready to switch or to compile your program for 64-bit, you will have to make sure that when you handle the stack you push 64-bit words instead of 32-bit ones, but that kind of goes without saying.
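A rough sketch of how the same function receives its arguments under the conventions mentioned above (the comments summarize the common cases; the authoritative rules are in each platform's ABI document):

    // args.cpp - one function, three argument-passing schemes.
    #include <cstdio>

    long add3(long a, long b, long c) {
        // 32-bit x86 cdecl:           a, b, c are pushed on the stack, right to left.
        // x86-64 System V (Linux,
        //   Mac OS X):                a, b, c arrive in rdi, rsi, rdx.
        // x86-64 Windows:             a, b, c arrive in rcx, rdx, r8 (only 4 register slots).
        return a + b + c;
    }

    int main() {
        std::printf("%ld\n", add3(1, 2, 3));
        return 0;
    }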

Is it that hard to port an application to 64 bits?

I was surprised to read that Adobe discontinued the 64-bit version of Flash for Linux, even though there is a new 32-bit version, and that Adobe advises users to use the 32-bit version of Firefox instead.
I was wondering, as I haven't had to do this yet: is it that hard to port an application to 64 bits? Besides the library changes and the recompilation (settings in the Makefile), what makes the port difficult? (Flash is just an example.)
As noted in an Adobe blog post, Flash's ActionScript engine has a JIT compiler that compiles the ActionScript code into native code.
x64 has a very different instruction set from x86. Therefore, making the JIT compiler generate x64 code is a non-trivial task, and is far more complicated than just making all the words 64 bits. :-)
The real kicker with porting apps to 64-bit is that every OS seems to treat the primitive types as it pleases. For example, under most Linux environments a long is 4 bytes on a 32-bit system and 8 bytes on a 64-bit system, while an int stays 4 bytes on both. Under Windows, the long does not grow: both int and long stay 4 bytes on 64-bit, and only pointers (and long long) become 8 bytes (the LLP64 model).
That said, I have ported a medium-sized project at work before from 32-bit to 64-bit Linux (about 25,000 lines of code), only having to make changes in the assembly code (GASM), which made several faulty assumptions about data types being 4 bytes long. Other than this, I had no problems, which suggests that provided you paid strict attention to your data types when you were first developing, porting should be seamless, perhaps only requiring certain compile switches to be changed (like -fpic). There were a few really bizarre corner cases that came up in my porting experience, but I think they were mostly due to undefined behaviour in some of the GASM rather than the porting itself.
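As an illustration, here is a small, hypothetical sketch of the kind of 4-byte assumption that only shows up when long and pointers grow to 8 bytes:

    // assumption.cpp - patterns that happened to work only while everything was 32 bits.
    #include <cstdint>
    #include <cstdio>

    int main() {
        long value = 42;

        // Faulty assumption: a pointer fits in a 32-bit integer.
        // On LP64 the commented line compiles (with a warning) but truncates the address:
        //     unsigned int addr32 = (unsigned int)(uintptr_t)&value;
        // Portable version: use a pointer-sized integer type.
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(&value);

        // Faulty assumption: "all bits of a long" == 0xFFFFFFFF.
        // On LP64 that constant covers only the low 32 bits; ~0UL covers them all.
        unsigned long all_bits = ~0UL;

        std::printf("addr=%#lx all_bits=%#lx\n",
                    static_cast<unsigned long>(addr), all_bits);
        return 0;
    }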
If you use a lot of ints and floats, it can be amazingly complex to get it to work suitably, esp. if it is a networked app.
It took over 2 years to port xMule to 64 bits and I don't believe its parent project, eMule, has 64 bit at all.
Ideally it should be a recompilation, in reality it takes a significant amount of effort. Even if it is simple there still has to be a full sweep by the QA team to prove it works and that always takes a while.
The obvious problems, I guess, would be variable sizes (e.g. longs are 64 bits under most 64-bit compilers); this messes up anything that uses size-related operations such as bit shifting or some pointer arithmetic. I think Adobe just can't be bothered to scan through and ensure cross-compatibility, especially when 90%+ of browser use is on 32-bit versions. I know Flash has never worked in the 64-bit IE, and even 64-bit Windows 7 defaults to the 32-bit version.
A lot of information on it here if you're interested:
http://www.viva64.com/content/articles/64-bit-development/?f=20_issues_of_porting_C++_code_on_the_64-bit_platform.html&lang=en&content=64-bit-development
The coding part of porting to 64-bit generally isn't that hard, but it can require some time and a lot of hair-pulling regarding builds, libs, etc. However, the real problem, especially for a widely deployed project like Flash, is going through proper test coverage for the many code paths and platforms. 64-bit is a horizontal feature, so it can potentially break everything, and everything needs to be tested.
For Flash on Linux in particular, it's probably more of a cost-benefit issue: is catering to the percentage of users who actually run 64-bit Linux worth the development cost for Adobe? Probably not, at this point.

Is assembler portable between Linux distros?

Is a program shipped in assembler format portable between Linux distributions (modulo CPU architecture differences)?
Here's the background to my question: I'm working on a new programming language (named Aklo), whose modus operandi will be the classic compiling to .s and feeding the result to the GNU assembler.
Obviously it would be nice ultimately to have the implementation written in itself, but I had resigned myself to maintaining it in C++ to solve the chicken and egg problem: suppose you download the compiler for the first time and it is itself written in Aklo, how do you compile it? As I understand it, different Linux distributions and other UNIX like systems have different conventions for binary formats.
But it's just occurred to me, a solution might be to ship the .s file (well, one per CPU architecture): it's fair to assume you have or can install the GNU assembler. Of course I'd still need a bootstrap compiler, but that doesn't need to be fast; I can write it in Python.
Is assembler portable in the way that binaries are not? Are there any other stumbling blocks I haven't thought of?
Added in response to one answer:
I had looked wistfully at LLVM, there is certainly a lot of good stuff there and it would make my life easier -- except that it would incur a dependency on the correct version of LLVM being installed. It wouldn't be so bad having that dependency on development machines, but in a world where it's common to ship programs as source, the same dependency would be incurred for every user of every program ever written in Aklo, and I decided that was too high a price to pay.
But if the solution of shipping compiled programs as assembler works... then that solves that problem, and I can use LLVM after all, which would be a big win.
So the question about portability of assembler is even considerably more important than I had first realized.
Conclusion: from answers here and on the LLVM mailing list http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028991.html it seems the bad news is the problem is unsolvable, but the good news is that means using LLVM makes it no worse, so I'm free to do so and obtain all the advantages thereof.
You might want to check out LLVM before going down this particular path. It might make your life a lot easier, as it provides a low level virtual machine that makes a lot of hard stuff just work and has been very popular.
At a very high level, the ABI consists of { instruction set, system calls, binary format, libraries }.
Distribution as .s may free you from the binary format. This is still rather pointless, because you are fixed to a particular ISA and still need to use libraries and/or make system calls. Libraries vary from distribution to distribution (although this isn't really that bad, especially if you just use libc) and syscalls vary from OS to OS.
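A tiny illustration of the last two items, assuming Linux and glibc: even this program, shipped as .s, still needs a compatible libc and kernel syscall interface on the target.

    // deps.cpp - a trivial program is still tied to an ISA plus either libc
    // symbols or the kernel's system-call interface; shipping .s removes
    // neither dependency.
    #include <unistd.h>   // write(2): thin libc wrapper over a Linux syscall
    #include <cstdio>     // printf: resolved against versioned libc symbols

    int main() {
        write(1, "via the syscall interface\n", 26);  // kernel ABI dependency
        std::printf("via libc\n");                    // shared-library dependency
        return 0;
    }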
It's basically 20 years since I last bootstrapped a C compiler. At the level of compilers, the differences between Linux distributions are minimal.
The much more important reason for going LLVM is cross-platform; if you're not writing some intermediate language, your compiler will be extremely difficult to retarget for different processors. And seeing as, on my laptop, I have compilers for x86, x86_64, two kinds of MIPS, PowerPC, ARM and AVR... you see where I'm going? I can compile multiple languages for most of those targets too (only C for AVR).
