I have a binary file in MIPS format. I was able to disassemble it and make the changes I wanted to the MIPS assembly file. Now I would like to assemble it back into a bin file. I am using Cygwin and am trying to do so with the ar utility.
This is the original object dump:
$ objdump -b binary -h test.bin
test.bin:     file format binary
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00200004  00000000  00000000  00000000  2**0
                  CONTENTS, ALLOC, LOAD, DATA
I also have the assembly file (test.asm) which contains the mips instructions from the test.bin file.
I then tried to assemble it by:
ar -q test2.bin test.asm --target=elf32-big
and
ar -cr test2.bin test.asm --target=elf32-big
But in both cases I only get a bin file with the contents of the assembly file. Can anyone tell me what I am missing to assemble this back into an elf32-big binary?
Thanks in advance.
ar only packs files into an archive and does not assemble anything, which is why your output file just contains the text of test.asm. To do this, you'll need a MIPS assembler. If you have a full gcc MIPS cross-compiler toolchain, the assembler should be named something like mips-as or as.
Actually, it might be easier to build it with mips-gcc, which will invoke the assembler and linker for you.
I'm using objcopy in bash (Ubuntu Linux) and I'm trying to copy 2 sections from an ELF file using the following command:
objcopy -j .section1 -j .section2
The problem is that objcopy is adding some padding between the sections. Is there a way (a flag?) to stop objcopy from padding the sections?
The sections are placed one after the other in the file, so there is no need for any kind of padding.
Solved!
The problem was that the sections were one after the other in the file but not in the same segment.
One was in a W E segment and the other was in an R W segment.
That's why objcopy added the padding.
Where does the ELF format store the names of imported functions? Is it always possible to enumerate all imported names, as with PE executables?
For example, if a binary uses printf, is it possible to tell that just by static analysis of the binary itself?
In ELF they're called undefined symbols. You can view the list of undefined symbols by:
nm -D <file>|grep -w U
objdump -T <file>|grep "\*UND\*"
ELF files don't specify which symbols come from which libraries; the static linker just records a list of shared libraries to link against in the ELF binary, and the dynamic linker finds the symbols in those libraries at load time.
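As a rough illustration of what nm -D reads, the sketch below walks the section headers of an in-memory ELF image, finds the dynamic symbol table (section type 11, SHT_DYNSYM), and collects the names of symbols whose section index is 0 (undefined). It is only a sketch under loud assumptions: it handles 64-bit little-endian ELF on a little-endian host, it declares the structs locally instead of using <elf.h>, and find_imports is a hypothetical name, not a function from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 layouts declared locally (assumption: little-endian file and host). */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Ehdr64;

typedef struct {
    uint32_t sh_name, sh_type;
    uint64_t sh_flags, sh_addr, sh_offset, sh_size;
    uint32_t sh_link, sh_info;
    uint64_t sh_addralign, sh_entsize;
} Shdr64;

typedef struct {
    uint32_t st_name;
    unsigned char st_info, st_other;
    uint16_t st_shndx;
    uint64_t st_value, st_size;
} Sym64;

_Static_assert(sizeof(Ehdr64) == 64 && sizeof(Shdr64) == 64 && sizeof(Sym64) == 24,
               "unexpected struct padding");

enum { SKETCH_SHT_DYNSYM = 11, SKETCH_SHN_UNDEF = 0 };

/* True if the byte range [off, off + size) lies inside a buffer of len bytes. */
static int in_bounds(uint64_t off, uint64_t size, size_t len)
{
    return off <= len && size <= len - off;
}

/* Hypothetical helper: stores up to max_names pointers to the names of
   undefined dynamic symbols and returns how many were found in total,
   or -1 if the image is not a well-formed ELF64 little-endian file. */
long find_imports(const unsigned char *img, size_t len,
                  const char **names, size_t max_names)
{
    Ehdr64 eh;
    long count = 0;

    assert(img != NULL);
    if (len < sizeof eh || memcmp(img, "\177ELF", 4) != 0)
        return -1;
    if (img[4] != 2 || img[5] != 1)       /* ELFCLASS64, ELFDATA2LSB */
        return -1;
    memcpy(&eh, img, sizeof eh);
    if (eh.e_shentsize != sizeof(Shdr64) ||
        !in_bounds(eh.e_shoff, (uint64_t)eh.e_shnum * sizeof(Shdr64), len))
        return -1;

    for (size_t i = 0; i < eh.e_shnum; i++) {
        Shdr64 sh, str;
        memcpy(&sh, img + eh.e_shoff + i * sizeof sh, sizeof sh);
        if (sh.sh_type != SKETCH_SHT_DYNSYM)
            continue;
        if (sh.sh_link >= eh.e_shnum)     /* sh_link names the string table */
            return -1;
        memcpy(&str, img + eh.e_shoff + sh.sh_link * sizeof str, sizeof str);
        if (!in_bounds(sh.sh_offset, sh.sh_size, len) ||
            !in_bounds(str.sh_offset, str.sh_size, len) ||
            str.sh_size == 0 || img[str.sh_offset + str.sh_size - 1] != 0)
            return -1;
        for (uint64_t off = 0; off + sizeof(Sym64) <= sh.sh_size; off += sizeof(Sym64)) {
            Sym64 sym;
            memcpy(&sym, img + sh.sh_offset + off, sizeof sym);
            if (sym.st_shndx != SKETCH_SHN_UNDEF || sym.st_name == 0 ||
                sym.st_name >= str.sh_size)
                continue;                 /* defined, unnamed, or bad name offset */
            if (names != NULL && (size_t)count < max_names)
                names[count] = (const char *)img + str.sh_offset + sym.st_name;
            count++;
        }
    }
    return count;
}
```

A caller would read the whole file into a buffer and pass it to find_imports. A binary whose section headers have been stripped defeats this approach, and relocatable .o files keep their undefined symbols in .symtab (section type 2) rather than in the dynamic symbol table.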
In this particular case I'm trying to discover if a mylib.a file is 32 or 64 bit compatible. I'm familiar with ldd for shared objects (mylib.so) but how do I inspect a regular .a archive?
"nm" and "ar" will give you some information about the library archive.
$ objdump -G /usr/lib/libz.a
In archive /usr/lib/libz.a:
adler32.o: file format elf32-i386
...
$ objdump -G /usr/lib64/libz.a
In archive /usr/lib64/libz.a:
adler32.o: file format elf64-x86-64
...
$ objdump -G /ppc-image/usr/lib/libz.a
In archive /ppc-image/usr/lib/libz.a:
adler32.o: file format elf32-powerpc
...
Requires a multilib-capable binutils, but it's pretty straightforward, is it not?
Standard "nm" and "ar" utilities will give you information about the archive.
To learn whether the archive is 32-bit or 64-bit, use "ar" to extract the .o files inside mylib.a, then run "file" on the .o files; it reports their type, including whether they are 32-bit or 64-bit objects.
In the general case, I just use the 'file' utility.
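What "file" reports for an ELF object comes from the first bytes of the ELF header: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit, and byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian. The sketch below shows that check on a buffer holding the start of a file. The function names are hypothetical, and the buffer is assumed to come from an extracted .o member, since a .a archive itself starts with the ar magic rather than an ELF header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer starts with the ar archive magic "!<arch>\n". */
int is_ar_archive(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 8 && memcmp(buf, "!<arch>\n", 8) == 0;
}

/* Returns 32 or 64 for an ELF header, or 0 if the buffer is not ELF
   or carries an unknown class value. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 6 || memcmp(buf, "\177ELF", 4) != 0)
        return 0;
    switch (buf[4]) {                 /* EI_CLASS */
    case 1:  return 32;               /* ELFCLASS32 */
    case 2:  return 64;               /* ELFCLASS64 */
    default: return 0;
    }
}

/* Returns 'L' for little-endian, 'B' for big-endian, '?' otherwise. */
char elf_byte_order(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 6 || memcmp(buf, "\177ELF", 4) != 0)
        return '?';
    if (buf[5] == 1)                  /* ELFDATA2LSB */
        return 'L';
    if (buf[5] == 2)                  /* ELFDATA2MSB */
        return 'B';
    return '?';
}
```

Reading the first 16 bytes of each extracted .o file and passing them to elf_class_bits is enough for the 32/64-bit question; the machine type (i386, x86-64, PowerPC and so on) lives further into the header in e_machine and is not decoded here.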
A quick question about ELF file headers: I can't seem to find anything useful on how to add or change fields in the ELF header. I'd like to be able to change the magic numbers and to add a build date to the header, and probably a few other things.
As I understand it, the linker creates the header information, but I don't see anything in the ld script that refers to it (though I'm new to ld scripts).
I'm using gcc and building for ARM.
thanks!
Updates:
OK, maybe my first question should be: is it possible to create or edit the header at link time?
I don't know of linker script commands that can do this, but you can do it post-link using the objcopy command. The --add-section option can be used to add a section containing arbitrary data to the ELF file. If the ELF header doesn't contain the fields you want, just make a new section and add them there.
This link (teensy elf binary) was someone's answer to another question, but it goes into the intricacies of an ELF header in some detail.
You can create an object file with informative fields like a version number and link that file such that they are included in the resulting ELF binary.
Ident
For example, as part of your build process, you can generate, say, an info.c that contains one or more #ident directives:
#ident "Build: 1.2.3 (Halloween)"
#ident "Environment: example.org"
Compile it:
$ gcc -c info.c
Check if the information is included:
$ readelf -p .comment info.o
String dump of section '.comment':
[ 1] Build: 1.2.3 (Halloween)
[ 1a] Environment: example.org
[ 33] GCC: (GNU) 7.2.1 20170915 (Red Hat 7.2.1-2)
Alternatively, you can use objdump -s --section .comment info.o. Note that GCC also writes its own comment, by default.
Check the information after linking an ELF executable:
$ gcc -o main main.o info.o
$ readelf -p .comment main
String dump of section '.comment':
[ 0] GCC: (GNU) 7.2.1 20170915 (Red Hat 7.2.1-2)
[ 2c] Build: 1.2.3 (Halloween)
[ 45] Environment: example.org
Comment Section
Using #ident in a C translation unit is basically equivalent to creating a .comment section in an assembler file. Example:
$ cat info.s
.section .comment
.string "Build: 1.2.3 (Halloween)"
.string "Environment: example.org"
$ gcc -c info.s
$ readelf -p .comment info.o
String dump of section '.comment':
[ 0] Build: 1.2.3 (Halloween)
[ 19] Environment: example.org
Using an uncommon section name works, as well (e.g. .section .blahblah). But .comment is used and understood by other tools. GNU as also understands the .ident directive, and this is what GCC translates #ident to.
With Symbols
For data that you also want to access from the ELF executable itself you need to create symbols.
Objcopy
Say you want to include some magic bytes stored in a data file:
$ cat magic.bin
2342
Convert it into an object file with GNU objcopy:
$ objcopy -I binary -O elf64-x86-64 -B i386 \
--rename-section .data=.rodata,alloc,load,readonly,data,contents \
magic.bin magic.o
Check for the symbols:
$ nm magic.o
0000000000000005 R _binary_magic_bin_end
0000000000000005 A _binary_magic_bin_size
0000000000000000 R _binary_magic_bin_start
Example usage:
#include <stdio.h>
#include <string.h>

/* Symbols that objcopy generated from magic.bin. */
extern const char _binary_magic_bin_start[];
extern const char _binary_magic_bin_end[];

int main(void)
{
    char s[23];
    /* Take the length from end - start. The address of the absolute
       symbol _binary_magic_bin_size also encodes it, but that value
       can come out wrong in a position-independent executable. */
    size_t n = (size_t)(_binary_magic_bin_end - _binary_magic_bin_start);

    if (n >= sizeof s)
        n = sizeof s - 1;   /* never write past the buffer */
    memcpy(s, _binary_magic_bin_start, n);
    s[n] = 0;
    puts(s);
    return 0;
}
Link everything together:
$ gcc -g -o main_magic main_magic.c magic.o
GNU ld
GNU ld is also able to turn data files into object files using an objcopy compatible naming scheme:
$ ld -r -b binary magic.bin -o magic-ld.o
Unlike objcopy, it places the symbols into the .data instead of the .rodata section, though (cf. objdump -h magic.o).
incbin
In case GNU objcopy isn't available, one can use the GNU as .incbin directive to create the object file (assemble with gcc -c incbin.s):
.section .rodata
.global _binary_magic_bin_start
.type _binary_magic_bin_start, #object
_binary_magic_bin_start:
.incbin "magic.bin"
.size _binary_magic_bin_start, . - _binary_magic_bin_start
.global _binary_magic_bin_size
.type _binary_magic_bin_size, #object
.set _binary_magic_bin_size, . - _binary_magic_bin_start
.global _binary_magic_bin_end
.type _binary_magic_bin_end, #object
.set _binary_magic_bin_end, _binary_magic_bin_start + _binary_magic_bin_size
/* an alternate way to include the size */
.global _binary_magic_bin_len
.type _binary_magic_bin_len, #object
.size _binary_magic_bin_len, 8
_binary_magic_bin_len:
.quad _binary_magic_bin_size
xxd
A more portable alternative that requires neither GNU objcopy nor GNU as is to create an intermediate C file, then compile and link that. For example, with xxd:
$ xxd -i magic.bin | sed 's/\(unsigned\)/const \1/' > magic.c
$ gcc -c magic.c
$ nm magic.o
0000000000000000 R magic_bin
0000000000000008 R magic_bin_len
$ cat magic.c
const unsigned char magic_bin[] = {
0x32, 0x33, 0x34, 0x32, 0x0a
};
const unsigned int magic_bin_len = 5;
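Consuming the xxd-generated symbols from C could look like the sketch below. It repeats the two definitions from magic.c only so that it compiles on its own; in a real build they would come from the generated file through extern declarations. The helper name magic_as_string is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same definitions as the generated magic.c, repeated here so the
   sketch is self-contained. */
const unsigned char magic_bin[] = { 0x32, 0x33, 0x34, 0x32, 0x0a };
const unsigned int magic_bin_len = 5;

/* Copies the embedded bytes into dst as a NUL-terminated string,
   dropping one trailing newline if present. Returns the string length,
   or -1 if dst (cap bytes) is too small. */
long magic_as_string(char *dst, size_t cap)
{
    size_t n = magic_bin_len;

    assert(dst != NULL);
    if (n > 0 && magic_bin[n - 1] == '\n')
        n--;
    if (n + 1 > cap)
        return -1;
    memcpy(dst, magic_bin, n);
    dst[n] = '\0';
    return (long)n;
}
```

Because the length is an ordinary unsigned int object rather than a symbol address, this variant does not depend on how the linker treats absolute symbols.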
I'm fairly sure that a sufficiently complex ld script can do what you want. However, I have no idea how.
On the other hand, elfsh can easily do all sorts of manipulations to elf objects, so give it a whirl.
You might be able to use libmelf, a dead project on freshmeat, but available from LOPI - http://www.ipd.bth.se/ska/lopi.html
Otherwise, you can get the spec and (over)write the header yourself.
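As one possible reading of "(over)write the header yourself", the sketch below stamps a short tag into the unused tail of e_ident. In the current generic ABI the bytes from offset 9 (EI_PAD) to 15 are reserved, set to zero, and meant to be ignored by readers; whether your loader and tools really tolerate non-zero values there is an assumption you would have to verify on your target. The function name and the 7-byte limit handling are illustrative only. Overwriting the magic itself (bytes 0 to 3) would make the file unrecognizable as ELF, so the sketch leaves it alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    SKETCH_EI_PAD    = 9,   /* first reserved byte of e_ident */
    SKETCH_EI_NIDENT = 16   /* total size of e_ident */
};

/* Writes tag_len bytes (at most 7) into the reserved e_ident padding of
   an in-memory ELF image and zero-fills the rest of the padding.
   Returns 0 on success, -1 if the image is not ELF or the tag is too long. */
int stamp_ident_pad(unsigned char *image, size_t len,
                    const unsigned char *tag, size_t tag_len)
{
    assert(image != NULL && tag != NULL);
    if (len < SKETCH_EI_NIDENT || memcmp(image, "\177ELF", 4) != 0)
        return -1;
    if (tag_len > SKETCH_EI_NIDENT - SKETCH_EI_PAD)
        return -1;
    memset(image + SKETCH_EI_PAD, 0, SKETCH_EI_NIDENT - SKETCH_EI_PAD);
    memcpy(image + SKETCH_EI_PAD, tag, tag_len);
    return 0;
}
```

Seven bytes is not much, so anything larger than a compact version or date code is better placed in a section of its own.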
I haven't done this in a while, but can't you just append arbitrary data to an executable? If you always append fixed-size data, it would be trivial to recover anything you append. Variable-size data wouldn't be much harder. That is probably easier than messing with ELF headers and potentially ruining your executables.
I didn't finish the book, but IIRC Linkers and Loaders by John Levine has the gory details that you would need to be able to do this.
In Solaris you can use elfedit, but I think you are really asking for solutions for Linux. Linux Is Not UniX :P
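The fixed-size variant of that idea can be sketched as follows: one function appends a zero-padded 32-byte block to the end of a file, and another reads the last 32 bytes back. The block size, the function names, and the plain-text payload are all assumptions made for illustration. Keep in mind that tools which rewrite the file from its sections, such as strip or objcopy, will typically drop anything appended this way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRAILER_SIZE 32   /* assumed fixed trailer length, including the NUL */

/* Appends text, zero-padded to TRAILER_SIZE bytes, to the end of path.
   Returns 0 on success, -1 on error or if text does not fit. */
int append_trailer(const char *path, const char *text)
{
    char block[TRAILER_SIZE] = {0};
    FILE *f;
    size_t written;

    assert(path != NULL && text != NULL);
    if (strlen(text) >= TRAILER_SIZE)
        return -1;
    strcpy(block, text);              /* fits: length was checked above */
    f = fopen(path, "ab");
    if (f == NULL)
        return -1;
    written = fwrite(block, 1, TRAILER_SIZE, f);
    if (fclose(f) != 0 || written != TRAILER_SIZE)
        return -1;
    return 0;
}

/* Reads the last TRAILER_SIZE bytes of path into out and NUL-terminates.
   Returns 0 on success, -1 on error (including a file that is too short). */
int read_trailer(const char *path, char out[TRAILER_SIZE])
{
    FILE *f;
    size_t got;

    assert(path != NULL && out != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    if (fseek(f, -(long)TRAILER_SIZE, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    got = fread(out, 1, TRAILER_SIZE, f);
    fclose(f);
    if (got != TRAILER_SIZE)
        return -1;
    out[TRAILER_SIZE - 1] = '\0';
    return 0;
}
```

A build step would call append_trailer on the finished executable with something like a date string, and the program (or any other tool) can later call read_trailer on the same path. For variable-size data, one common extension is to end the trailer with a fixed-size length field and read that first.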
In Linux Console:
$ man ld
$ ld --verbose
HTH