Header contents of a .bin file in Linux - linux

What does the header of an executable .bin file in Linux contain?
I found information for .exe files in Windows but I can't find any information for .bin files.
TIA

Just to be clear, in Linux, executable files may be called "binaries", but they don't need an explicit ".bin" extension.
Linux generally uses the ELF format. The first byte is 0x7F, followed by the ASCII characters E, L, F. This is easy to see if you load the binary into a text editor or print it at the command line using 'cat' or 'less'. After that... well, I'm rusty, but the details are easily found on the web.
Try http://www.thegeekstuff.com/2012/07/elf-object-file-format/ and http://www.acsu.buffalo.edu/~charngda/elf.html for starters. (I found these with a superficial quick search, and do not claim these are the best. Happy hunting!)
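As a rough illustration of the magic bytes described above, here is a minimal Python sketch that checks for the 0x7F 'E' 'L' 'F' prefix and decodes the two identification bytes that follow it (word size and byte order). The function names are made up for this example, and it only looks at the first few bytes of the header, not the full ELF layout.

```python
ELF_MAGIC = b"\x7fELF"  # 0x7F followed by ASCII 'E', 'L', 'F'


def describe_elf_ident(header):
    """Decode the start of an ELF header; return None if the magic is missing.

    `header` is the first bytes of a file (16 bytes is plenty here).
    """
    if len(header) < 6 or header[:4] != ELF_MAGIC:
        return None
    # Byte 4 (EI_CLASS): 1 = 32-bit objects, 2 = 64-bit objects.
    ei_class = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")
    # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian.
    ei_data = {1: "little-endian", 2: "big-endian"}.get(header[5], "unknown")
    return {"class": ei_class, "data": ei_data}


def describe_elf_file(path):
    """Hypothetical convenience wrapper: read a file's first 16 bytes and decode them."""
    with open(path, "rb") as f:
        return describe_elf_ident(f.read(16))
```

For example, describe_elf_file("/bin/ls") should return a small dict on a typical Linux system, and None for a file that is not ELF (a shell script, say).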

I'm not 100% sure what you are asking. But most Linux executables use the ELF format.
The readelf utility can read metadata from ELF executables.
ELF on Wikipedia

Related

How does the Linux command `file` recognize the encoding of my files?

How does the Linux command file recognize the encoding of my files?
zell@ubuntu:~$ file examples.desktop
examples.desktop: UTF-8 Unicode text
zell@ubuntu:~$ file /etc/services
/etc/services: ASCII text
The man page is pretty clear:
The filesystem tests are based on examining the return from a stat(2)
system call...
The magic tests are used to check for files with data in particular
fixed formats. The canonical example of this is a binary executable
(compiled program) a.out file, whose format is defined in #include <a.out.h>
and possibly #include <exec.h> in the standard include
directory. These files have a 'magic number' stored in a particular
place near the beginning of the file that tells the UNIX operating
system that the file is a binary executable, and which of several
types thereof. The concept of a 'magic' has been applied by extension
to data files. Any file with some invariant identifier at a small
fixed offset into the file can usually be described in this way. The
information identifying these files is read from the compiled magic
file /usr/share/misc/magic.mgc, or the files in the directory
/usr/share/misc/magic if the compiled file does not exist. In
addition, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used
in preference to the system magic files. If /etc/magic exists, it will
be used together with other magic files.
If a file does not match any of the entries in the magic file, it is
examined to see if it seems to be a text file. ASCII, ISO-8859-x,
non-ISO 8-bit extended-ASCII character sets (such as those used on
Macintosh and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded
Unicode, and EBCDIC character sets can be distinguished by the
different ranges and sequences of bytes that constitute printable text
in each set. If a file passes any of these tests, its character set is
reported.
In short, for regular files, their magic values are tested. If there's no match, then file checks whether it's a text file, making an educated guess about the specific encoding by looking at the actual values of bytes in the file.
Oh, and you can also download the source code and look at the implementation for yourself.
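The two-stage check summarised above (magic values first, then a text/encoding guess) can be sketched in a few lines of Python. This is only an illustration of the idea, not how file is actually implemented: the three-entry MAGIC_TABLE, the function names, and the simplified text heuristics are all assumptions made for this example, and the real tool knows far more formats and character sets.

```python
# Tiny stand-in for the magic database: (offset, signature, description).
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (0, b"%PDF-", "PDF document"),
]

# Control bytes that commonly appear in text: tab, LF, FF, CR.
TEXT_CONTROL = {0x09, 0x0A, 0x0C, 0x0D}


def guess_type(data):
    """Classify a byte string: magic test first, then a crude text test."""
    if not data:
        return "empty"
    # Stage 1: look for an invariant signature at a fixed offset.
    for offset, magic, description in MAGIC_TABLE:
        if data.startswith(magic, offset):
            return description
    # Stage 2: no magic matched, so decide whether the bytes look like text.
    if any((b < 0x20 and b not in TEXT_CONTROL) or b == 0x7F for b in data):
        return "data"  # unexpected control bytes: treat as binary
    if all(b < 0x80 for b in data):
        return "ASCII text"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "data"  # the real tool would try other 8-bit charsets here
    return "UTF-8 Unicode text"
```

Reading the first few kilobytes of a file in binary mode and passing them to guess_type is enough to try this out.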
TL;DR: Magic File Doesn't Support UTF-8 BOMs
(and that's the main charset you need to care about)
The source code is on GitHub, so anyone can search it. A quick search shows that strings like BOM, ef bb bf, and feff do not appear at all, which suggests that reading the UTF-8 byte order mark is not supported. Files made in other applications that use or preserve the BOM will all be reported as "charset=unknown" when using file.
In addition, none of the config files mentioned in the magic file man page are part of magic file v. 4.17. In fact, /etc/magicfile/ doesn't exist at all, so I don't see any way to configure it.
If you're stuck trying to get the actual charset encoding and magic file is all you have, you can check whether you have a BOM-marked UTF-8 file at the Linux CLI with:
hexdump -n 3 -C "$path_to_filename"
If the output shows the byte sequence ef bb bf, then you are 99% likely to be in possession of a BOM-marked UTF-8 file. This is not a 100% certainty, but it is far more useful than magic file, which has no handling whatsoever for byte order marks.
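The same three-byte check as the hexdump command can be done from a script. A minimal Python sketch follows; the function names are invented for this example, and like the hexdump approach it only tells you that the file starts with ef bb bf, not that the rest of the file is valid UTF-8.

```python
import codecs


def starts_with_utf8_bom(data):
    """Return True if the bytes begin with the UTF-8 byte order mark (ef bb bf)."""
    return data[:3] == codecs.BOM_UTF8


def file_has_utf8_bom(path):
    """Hypothetical helper: read only the first three bytes of a file and test them."""
    with open(path, "rb") as f:
        return starts_with_utf8_bom(f.read(3))
```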

how to convert .txt files to ReST in linux

In the Documentation directory of the Linux kernel tree, I need to convert .txt files to ReST. Is it as simple as renaming the extension to .rst?
Having had a cursory glance over a few files in the Documentation directory of the kernel source tree, I'd say nothing needs doing: they already seem to be in rst markup, and file extensions don't matter in Linux.

Minidump symbols module id and Build id from ELF on Linux

I'm generating a symbols file for an executable, for use with minidumps. The first line of the minidump symbols file contains a specific id of the executable for which the file was generated. How can I find that id inside the executable? When I use readelf to check the build id, I get something different (even the length is different).
How can I find that id inside the executable?
Which is that id? Presumably the one your tool uses, except you didn't tell us which tool you actually use.
Most Linux tools (such as GDB) use a special NT_GNU_BUILD_ID note in the ELF binary to associate debug info with the binary. You can see that build-id in the output of readelf -n a.out.
When I use readelf to check build id then it's something different
Again, what exactly do you see? What command do you run?
Maybe they are one and the same, and you are just "holding it wrong". Or they are encoded differently, or you are looking at the wrong thing. We can't tell.
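To make the NT_GNU_BUILD_ID note mentioned above a little more concrete, here is a small Python sketch that walks the raw bytes of an ELF note section and returns the build-id as hex, which is the form readelf -n prints. It assumes you already have the note bytes in hand (locating the section inside the file is not shown), that the notes are little-endian by default, and that entries are padded to 4 bytes; the function name is made up. It says nothing about how any particular minidump tool derives its own id from this value.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for the GNU build-id


def find_gnu_build_id(notes, byteorder="<"):
    """Scan raw ELF note bytes and return the GNU build-id as a hex string.

    Each note is: namesz, descsz, type (three 32-bit words), then the name
    and the descriptor, each padded to a 4-byte boundary (assumed here).
    Returns None if no build-id note is found.
    """
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, note_type = struct.unpack_from(byteorder + "III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # skip name plus padding
        desc = notes[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # skip descriptor plus padding
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None
```

Comparing this hex string with the id in the symbols file, byte by byte, is one way to see whether the two are the same value in a different encoding or unrelated.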

How to dump a core file TEXT pages as disassembled text file?

I have a core file and I want to dump all the executable memory pages it contains to an ASCII file, so that I can follow the assembly that gets executed. How can I do so?
I found it. objdump is the tool I need. The option -d in particular disassembles the core file contents. I didn't know objdump also handled core files.

Linux file utility magic.mgc database get content

I'm writing a project where I need to identify certain file formats.
For some formats (mp3, ogg) I have found signatures that make identification easy, but others (like MPEG ADTS) are a big problem: I just cannot find what kind of signature can be used for them.
I found out that the file utility on Linux can do it.
I tried to find the signature in its source code, but found nothing.
I found that the file utility holds its database in the magic.mgc file, but it's stored in binary form.
Does someone perhaps know how to find that database in plain text format?
That utility isn't a Linux-specific utility; it's the version of the UN*X file command originally written by Ian Darwin. The binary .mgc file is generated from a bunch of source files.
Your Linux distribution probably has a source code package for it; where you get that package, and how you install it, depends on which distribution you're using.
The source files from which the .mgc file was generated might also be available on your distribution without installing the source package for file; if so, you could use the file command to generate it, using the -C flag. I don't see them anywhere obvious on my Ubuntu 12.04 virtual machine, so that might require some other package to be installed (file itself is installed). (On OS X, they're in the directory /usr/share/file/magic.)
Alternatively, you could download the standard version of that file (which might have been modified by your distribution, so you might not want that version) and modify and build it.
Note that, on some versions of UN*X systems, the bulk of the work done by the file command is done in library routines in the "libmagic" library; see whether your distribution has that or can install it (try, for example, man libmagic) and whether it can do the job for you.
