I am interested in how the binary utilities of Linux are coded and how they work. Where can I find the source code for them?
strings is usually part of binutils, and since binutils is maintained by the Free Software Foundation and licensed under the GNU General Public License, the source code is available here:
http://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git
or as version-specific snapshot packages here:
ftp://sourceware.org/pub/binutils/snapshots
If you want to start with a general overview, try the Wikipedia page for binutils or this explanation of the toolchain (more of a general description).
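To get a feel for how one of these utilities works before reading the real source, here is a minimal sketch of the core idea behind strings: scan the file for runs of printable characters. This is not the binutils implementation. The function name extract_strings is made up for illustration, and the minimum length of 4 and the treatment of tab as printable are assumptions modelled on the tool's usual default behaviour.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long (hypothetical helper)."""
    # Printable ASCII (space through tilde) plus tab, repeated min_len or more times.
    # Assumption: this character set approximates what the real tool treats as printable.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

if __name__ == "__main__":
    sample = b"\x00\x01hello world\x00ab\x00\x7fELF\xffGNU binutils\n"
    for s in extract_strings(sample):
        print(s)
```

The real program also handles other encodings, section-aware scanning and offset printing, so treat this only as a starting point for reading the source.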
I am currently writing a paper about open source and GNU. I need to run some tests to simulate how the GNU pioneers developed the GNU build system from scratch in the early days. But I found something strange in the README, which states:
If GNU 'm4' is meant to serve GNU 'autoconf', beware that 'm4' should be fully installed prior to configuring 'autoconf' itself.
Likewise, if you intend on hacking GNU 'm4' from git, the bootstrap process requires that you first install a released copy of GNU 'm4'.
If we follow this logic, what about the first released copy of GNU m4? Does anybody have a clue or a hint? Thank you.
If we follow this logic, what about the first released copy of GNU m4?
Retrocomputing SE would be a better forum for questions about computing history. From a technical perspective, however, it is obvious that the first version of GNU m4 could not have depended on an Autoconf build system if Autoconf also depended on GNU m4, not even if they were part of the same package. In fact, GNU's m4 is not the first or only m4, and it did not always have an Autoconf-based build system. For its part, Autoconf did not always depend specifically on GNU m4.
Bear in mind, too, that
configure scripts similar to those produced by Autoconf were originally written by hand or generated by other tools, and other, less automated approaches predated that.
an Autoconf configure script itself does not ordinarily depend on Autoconf or m4. This is by design. As long as you have a complete distribution of an Autotools-based project (which, by definition, includes a configure script) you do not need to be able to run Autoconf or m4 to build the project.
The Autoconf manual has a chapter on the tool's history that might interest you.
I have a third-party library which depends on libgcc_s_sjlj-1.dll.
My own program is compiled under MSYS2 (mingw-w64) and it depends on libgcc_s_dw2-1.dll.
Please note that the third-party library is pure binaries (no source). Please also note that both libgcc_s_sjlj-1.dll and libgcc_s_dw2-1.dll are 32-bit, so I don't think it's an issue related to architecture.
The outcome is predictable: programs compiled against libgcc_s_dw2-1.dll can't work with third-party libraries built against libgcc_s_sjlj-1.dll. What I get is a missing entry point, __gxx_personality_sj0.
I can certainly try to adapt my toolchain to match the third party's libgcc_s_sjlj-1.dll, but I do not know how much effort that would take. I cannot find a variant of the libgcc DLL under MSYS2 that uses the setjmp/longjmp model. I am even afraid that I would need to replace the entire toolchain, because all the binaries I have under MSYS2 sit atop this libgcc_s_dw2-1.dll module.
My goal is straightforward: I would like to find a solution so that my code will sit on top of libgcc_s_sjlj-1.dll instead of libgcc_s_dw2-1.dll. But I don't know if I am asking a stupid question simply because this is just not possible.
The terms dw2 and sjlj refer to two different types of exception handling that GCC can use on Windows. I don't know the details, but I wouldn't try to link binaries using the different types. Since MSYS2 does not provide an sjlj toolchain, you'll have to find one somewhere else. I would recommend downloading one from the "MingW-W64-builds" project, which you can find listed on this page:
https://mingw-w64.org/doku.php/download
You could still use MSYS2 as a Bash shell, but you probably cannot link to any of its libraries in your program; you would need to recompile all libraries yourself (except for this closed-source third-party one).
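Before switching toolchains, it can help to confirm which exception-handling flavor each binary actually references. The sketch below is only a heuristic: it searches the raw bytes of a file for libgcc_s_*-1.dll names, which normally appear as plain ASCII in a PE import table. It does not parse the PE format, and the function name libgcc_flavors is made up for illustration.

```python
import re

def libgcc_flavors(data: bytes) -> set[str]:
    """Return the flavors (dw2, sjlj, seh) named in libgcc_s_*-1.dll references.

    Heuristic byte scan, not a real PE import-table parser (assumption:
    the DLL names are stored as plain ASCII somewhere in the file).
    """
    pattern = re.compile(rb"libgcc_s_(dw2|sjlj|seh)-1\.dll", re.IGNORECASE)
    return {m.group(1).decode("ascii").lower() for m in pattern.finditer(data)}

if __name__ == "__main__":
    # Stand-in bytes; for a real check, read the DLL or EXE with open(path, "rb").
    fake_image = b"\x00KERNEL32.dll\x00libgcc_s_sjlj-1.dll\x00msvcrt.dll\x00"
    print(libgcc_flavors(fake_image))
```

If your own program reports dw2 and the third-party library reports sjlj, you have the mismatch described above.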
I'm looking for software for Linux that can compare source code packages (tarballs, SRPMs, etc.) and display the differences in the source code. Can you recommend some good software?
Best wishes
Try gendiff (usually packaged with RPM).
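If you would rather script the comparison of two tarballs yourself, here is a minimal sketch using only the Python standard library. It is not how gendiff works. The names read_members and diff_tarballs and the strip parameter are made up for illustration, and the code assumes the files are text and that both archives use the same relative paths once the leading directory is stripped.

```python
import difflib
import io
import tarfile

def read_members(fileobj, strip: int = 0) -> dict[str, str]:
    """Map each regular file in a tarball to its text, dropping `strip` leading path parts."""
    files = {}
    with tarfile.open(fileobj=fileobj) as tar:  # compression is auto-detected
        for member in tar.getmembers():
            if member.isfile():
                name = member.name.split("/", strip)[-1]
                data = tar.extractfile(member).read()
                files[name] = data.decode("utf-8", errors="replace")
    return files

def diff_tarballs(old, new, strip: int = 0) -> str:
    """Return a unified diff of every file that differs between two tarballs."""
    old_files = read_members(old, strip)
    new_files = read_members(new, strip)
    out = []
    for name in sorted(set(old_files) | set(new_files)):
        a = old_files.get(name, "").splitlines(keepends=True)
        b = new_files.get(name, "").splitlines(keepends=True)
        out.extend(difflib.unified_diff(a, b, fromfile="old/" + name, tofile="new/" + name))
    return "".join(out)
```

Usage would look like diff_tarballs(open("pkg-1.0.tar.gz", "rb"), open("pkg-1.1.tar.gz", "rb"), strip=1), where the file names are placeholders. An SRPM would first have to be unpacked to get at the tarball inside.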
I was reading an article about cross-compiling for OS X on Linux, but it was quite hard to understand.
What tools do I need? And what configurations are necessary?
Are there any tools for creating packages too?
First you need odcctools, which contains an assembler, a linker and so on (like binutils, but capable of handling the Mach-O object format). Then you need the system libraries from the official SDK. You can download it from Apple, but you must agree to some terms and become a member to do so. And finally, good old gcc. Quite easy in theory, but in reality a horrible mess. The easiest way to go (that I know of) is to use I'm Cross!.
Update: I found a newer, better-maintained method called xchain. It requires more manual work than I'm Cross!, though.
How can I write a simple disassembler for Linux from scratch?
Are there any libraries I can use? I need something that "just works".
Instead of writing one, try objdump.
Based on your comment, and your desire to implement from scratch, I take it this is a school project. You could get the source for objdump and see what libraries and techniques it uses.
The BFD library might be of use.
You have to understand the ELF file format first. Then you can start decoding the code sections according to the opcodes of your architecture.
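As a first step in that direction, here is a minimal sketch that decodes only the fixed-size ELF file header. The field names follow the ELF specification, but the function name parse_elf_header and the extra "bits" key are made up for illustration, and the code does not go on to read section headers or decode instructions.

```python
import struct

# Header fields that follow the 16-byte e_ident array, in file order.
FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff", "e_shoff",
          "e_flags", "e_ehsize", "e_phentsize", "e_phnum", "e_shentsize",
          "e_shnum", "e_shstrndx")

def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF file header from the start of `data` (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] not in (1, 2) or data[5] not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big endian
    # e_entry, e_phoff and e_shoff are 8 bytes wide in ELF64, 4 bytes in ELF32.
    fmt = endian + ("HHIQQQIHHHHHH" if is_64 else "HHIIIIIHHHHHH")
    header = dict(zip(FIELDS, struct.unpack_from(fmt, data, 16)))
    header["bits"] = 64 if is_64 else 32
    return header
```

For example, parse_elf_header(open("/bin/ls", "rb").read(64)) returns the header of that binary, assuming the path exists on your system. From there, e_shoff, e_shentsize and e_shnum tell you where the section header table is, which is what you need to locate the code section and feed its bytes to an instruction decoder.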
You can use libbfd and libopcodes, which are libraries distributed as part of binutils.
http://www.gnu.org/software/binutils/
As an example of the power of these libraries, check out the Online Disassembler (ODA).
http://www.onlinedisassembler.com
ODA supports a myriad of architectures and provides a basic feature set. You can enter binary data in the Live View and watch the disassembly appear as you type, or you can upload a file to disassemble. A nice feature of this site is that you can share the link to the disassembly with others.
You can take a look at the code of ERESI:
The ERESI Reverse Engineering Software Interface is a multi-architecture binary analysis framework with a tailored domain specific language for reverse engineering and program manipulation.