How to strip single name from object file on OS X - linux

I am following some Linux instructions on OS X and am stuck on one line:
strip -N main my_file.o
The OS X version of strip doesn't have an -N option and I've read through the man page twice but am just not sure how to do this.
So how do I strip a single name from the symbol table on OS X?

As you say, the OSX version of strip doesn't allow this; the only way therefore is to limit its visibility in code using this on the declaration:
__attribute__((visibility("hidden"))) void MyFunction1();
Alternatively you could compile with -fvisibility=hidden and use "default" in the above __attribute__ to only expose the symbols you want.
This is a better approach anyway, as it does not require an external build step.
Note: I have found this doesn't work as expected when Objective-C code is introduced into the code base...
Reference

Related

How to add own custom function in standard shared library (C) in linux?

I have downloaded libgcrypt library source code and I want to customize this standard shared library by adding my function inside one particular
source code file .
Although compilation/build process of customized shared library is successful, but it shows error at linking time.
Here is what I have done .
inside /src/visibility.c file, I have added my custom function,
void MyFunction(void)
{
printf("This is added just for testing purpose");
}
I have also include function prototype inside /src/gcrypt.h
void MyFunction(void);
#build process
./configure --prefix=/usr
sudo make install
nm command find this custom function.
nm /usr/lib/libgcrypt.so | grep MyFunction
000000000000dd70 t MyFunction
Here is my sample code to access my custom function.
//aes_gcrypt_example.c
#include <stdio.h>
#include <gcrypt.h>
#include <assert.h>
int main()
{
MyFunction();
return 0;
}
gcc aes_gcrypt_example.c -o aes -lgcrypt
/tmp/ccA0qgAB.o: In function `main':
aes_gcrypt_example.c:(.text+0x3a2): undefined reference to `MyFunction'
collect2: error: ld returned 1 exit status
I also tried by making MyFunction as extern inside gcrypt.h, but in that case also I am getting same error.
Why is this happening ?
Is the customization of standard library is not allowed ?
If YES, then is there any FLAG to disable to allow customization ?
If NO, what mistake I am making ?
It would be great help if someone provide some useful link/solution for the above mentioned problem. I am using Ubuntu16.04 , gcc 4.9.
Lower-case t for the symbol type?
nm /usr/lib/libgcrypt.so | grep MyFunction
000000000000dd70 t MyFunction
Are you sure that's a visible function? On my Ubuntu 16.04 VM, the linkable functions defined in an object file have T (not t) as the symbol type. Is there a stray static kicking around and causing confusion? Check a couple of other functions defined in libgcrypt.so (and documented in gcrypt.h) and see whether they have a t or a T. They will have a T and not t. You'll need to work out why your function gets a t — it is not clear from the code you show.
The (Ubuntu) man page for nm includes:
The symbol type. At least the following types are used; others
are, as well, depending on the object file format. If lowercase,
the symbol is usually local; if uppercase, the symbol is global
(external).
The line you show says that MyFunction is not visible outside its source file, and the linker agrees because it is not finding it.
Your problem now is to check that the object file containing MyFunction has a symbol type T — if it doesn't the problem is in the source code.
Assuming that the object file shows symbol type T but the shared object shows symbol type t, you have to find what happens during the shared object creation phase to make the symbol invisible outside the shared object. This is probably because of a 'linker script' that controls which symbols are visible outside the library (or maybe just compilation options). You can search on Google with 'linker script' and various extra words ('tutorial', 'provide', 'example', etc) and come up with links to the relevant documentation.
You may need to research documentation for LibTool, or for the linker BinUtils. LibTool provides ways of manipulating shared libraries. In a compilation command line that you show in a comment, there is the option -fvisibility=hidden. I found (mostly by serendipitous accident) a GCC Wiki on visibility. See also visibility attribute and code generation options.

Difference in md5sums in two object files

I compile twice the same .c and .h files and get object files with the same size but different md5sums.
Here is the only difference from objdump -d:
1) cpcidskephemerissegment.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_ZN68_GLOBAL__N_sdk_segment_cpcidskephemerissegment.cpp_00000000_B8B9E66611MinFunctionEii>:
2) cpcidskephemerissegment.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_ZN68_GLOBAL__N_sdk_segment_cpcidskephemerissegment.cpp_00000000_8B65537811MinFunctionEii>:
What can be the reason? Thanks!
I guess, the compiler didn't know how to name this namespace and used path to the source file plus some random number.
The compiler must guarantee that a symbol in unnamed namespace does not conflict with any other symbol in your program. By default this is achieved by taking full filename of the source, and appending a random hash value to it (it's legal to compile the same source twice (e.g. with different macros) and link the two objects into a single program, and the unnamed namespace symbols must still be distinct, so using just the source filename without the seed is not enough).
If you know that you are not linking the same source file more than once, and want to have a bit-identical object file on re-compile, the solution is to add -frandom-seed="abcd" to your compile line (replace "abcd" with anything you want; it's common to use the filename as the value of random seed). Documentation here.
The reasons can be many:
Using macros like __DATE__ and __TIME__
Embedding counters that are incremented for each build (the Linux kernel does this)
Timestamps (or similarly variable quantities) embedded in the .comments ELF section. One example of a compiler that does this is the xlC compiler on AIX.
Different names as a result of name mangling (e.g. C++)
Changes in environment variables which are affecting the build process.
Compiler bug(s) (however unlikely)
To produce bit identical builds, you can use GCC's -frandom-seed parameter. There were situations where it could break things before GCC 4.3, but GCC now turns functions defined in anonymous namespaces into static symbols. However, you will always be safe if you compile each file using a different value for -frandom-seed, the simplest way being to use
the filename itself as the seed.
Finally I've found the answer!
c++filt command gave the original name of the function:
{unnamed namespace}: MinFunction(int, int)
In the source was:
namespace
{
MinFunction(int a, int b) { ... }
}
I named the namespace and got stable checksum of object file!
As I guess, the compiler didn't know how to name this namespace and used path to the source file plus some random number.

Error linking module in ocaml

I am a complete beginner with Ocaml programming and I am having trouble linking a module into my program. Actually I am doing some regular expression checking and I have written a function that basically tokenizes a string based on a separator string using the Str module . So i use the functions defined in the library like this:
Str.regexp_string /*and so on*/
However, when I try to compile the ml file, I get an error suggesting that I have an undefined global Str . We use List functions by typing in List.length and so on just like I did for Str without having to explicitly include the specific module. I tried
open Str;;
include Str;; /*None of these work and I still get the same error*/
However if in the toplevel I use
load "str.cma" /*Then the program works without problems*/
I want to include the module in the ml file because I have to in the end link 3 cmo's to get the final executable(which is not run in the toplevel). I know this is a really basic question but I am having trouble solving it. Thanks in advance.
You don't need to add anything in your file foo.ml.
You do need to tell the compiler where to find the Str module when compiling foo.ml . To do so, add it to the command line used to compile foo.ml:
ocamlc str.cma foo.ml
or
ocamlopt str.cmxa foo.ml
List and other modules from the standard library are accessible by default, so you don't need to tell the compiler about those often used modules.
Just add str to the libraries field of your dune file.
I think you need to use '-cclib ' compiler directive.
The module name shouldn't include the file ending like .cma.
Below is what I did when trying to use the unix and threads modules.
I think you need to use some combination of the 'custom' and 'cclib' compiler directives.
ocamlc -custom unix.cma threa.ml -cclib -lunix
Look at chapter 7 of this book for help:
http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora063.html
And look at coverage of compiler directives here:
http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual022.html#c:camlc
ocamlc calc.ml str.cma -o calc
File "calc.ml", line 1:
Error: Error while linking calc.cmo:
Reference to undefined global `Str'
Code is very simple, to cut down scruff.
let split_into_words s =
Str.split ( Str.regexp "[ \n\t]+") s ;;
let _ =
split_into_words "abc def ghi" ;;
On ocaml 4.0.2. Obviously, there is a problem here, but I am too much of a beginner to understand what it is. From toplevel it seems work fine with #load "str.cma", so there is something here we don't understand. Anyone know what it is?

How to embed version information into shared library and binary?

On Linux, is there a way to embed version information into an ELF binary? I would like to embed this info at compile time so it can then be extract it using a script later. A hackish way would be to plant something that can be extracted using the strings command. Is there a more conventional method, similar to how Visual Studio plant version info for Windows DLLs (note version tab in DLL properties)?
One way to do it if using cvs or subversion is to have a special id string formatted specially in your source file. Then add a pre-commit hook to cvs or svn that updates that special variable with the new version of the file when a change is committed. Then, when the binary is built, you can use ident to extract that indformation. For example:
Add something like this to your cpp file:
static char fileid[] = "$Id: fname.cc,v 1.124 2010/07/21 06:38:45 author Exp $";
And running ident (which you can find by installing rcs) on the program should show the info about the files that have an id string in them.
ident program
program:
$Id: fname.cc,v 1.124 2010/07/21 06:38:45 author Exp $
Note As people have mentioned in the comments this technique is archaic. Having the source control system automatically change your source code is ugly and the fact that source control has improved since the days when cvs was the only option means that you can find a better way to achieve the same goals.
To extend the #sashang answer, while avoiding the "$Id:$" issues mentioned by #cdunn2001, ...
You can add a file "version_info.h" to your project that has only:
#define VERSION_MAJOR "1"
#define VERSION_MINOR "0"
#define VERSION_PATCH "0"
#define VERSION_BUILD "0"
And in your main.c file have the line:
static char version[] = VERSION_MAJOR "." VERSION_MINOR "." VERSION_PATCH "." VERSION_BUILD;
static char timestamp[] = __DATE__ " " __TIME__;
(or however you want to use these values in your program)
Then set up a pre-build step which reads the version_info.h file, bumps the numbers appropriately, and writes it back out again. A daily build would just bump the VERSION_BUILD number, while a more serious release would bump other numbers.
If your makefile lists this on your object's prerequisite list, then the build will recompile what it needs to.
The Intel Fortran and C++ compilers can certainly do this, use the -sox option. So, yes there is a way. I don't know of any widespread convention for embedding such information in a binary and I generally use Emacs in hexl-mode for reading the embedded information, which is quite hackish.
'-sox' also embeds the compiler options used to build an executable, which is very useful.
If you declare a variable called program_version or similar you can find out at which address the variable is stored at and then proceed to extract its value. E.g.
objdump -t --demangle /tmp/libtest.so | grep program_version
0000000000600a24 g O .data 0000000000000004 program_version
tells me that program_version resides at address 0000000000600a24 and is of size 4. Then just read the value at that address in the file.
Or you could just write a simple program that links the library in questions and prints the version, defined either as an exported variable or a function.

Platform independent resource management [duplicate]

This question already has answers here:
Is there a Linux equivalent of Windows' "resource files"?
(2 answers)
Closed 4 years ago.
I'm looking for a way to embed text files in my binaries (like windows resource system). I need something thats also platform independent (works in windows and linux). I found Qt resource management to be what I need but I'm not keen on my app depending on Qt for this alone. I also found this tool at http://www.taniwha.com/~paul/res/ .. but it is too platform specific.
The xxd utility can be used to create a C source file, containing your binary blobs as an array (with the -i command line option). You can compile that to an object which is linked into your executable.
xxd should be portable to most platforms.
If you're using QT 4.5, you can make sure that program is only dependent on one small piece of QT, such as libqtcore. QResource is a part of libqtcore.
You can simlpy append all kinds of data to your normal binary. Works in both Windows and Linux. You'll have to open your own binary at runtime and read the data from there.
However, I have to agree that embedding data in binaries is a strange idea. It's common practice to include such data as separate files packaged with the application.
That is not such a great idea. On Linux, for example, data is expected to be installed in a subdirectory of "$datadir" which is, by default, defined to be "$prefix/share", where "$prefix" is the installation prefix. On Mac OS X, resources are expected to be installed in $appbundle/Contents/Resources, where $appbundle is the name of the folder ending in ".app". On Windows, installing data in a folder that is a sibling of the executable is not an uncommon practice. You may be better off using the CMake build system, and using its CPack packaging features for installing/bundling in the default, preferred platform-specific manner.
Although bundling your resources into the executable, itself, may seem cool, it is actually a dangerous idea... for example, will the embedded data be allocated in an executable page? What will happen if you attempt to overwrite or modify the data? What if you want to tweak or modify the data at runtime? Things to think about.
This looks very promising: https://github.com/cyrilcode/embed-resource
CMake based and platform-independent.
As I also do not like the idea of converting files into C arrays only to have them converted back to binaries, I created my own resource compiler using LLVM and Clang:
https://github.com/nohajc/resman
I tested it on Windows, Linux and macOS but it can potentially be run on any platform supported by LLVM.
It is used like this:
Create header file, e.g. res_list.h
#pragma once
#include "resman.h"
// Define a global variable for each file
// It will be used to refer to the resource
constexpr resman::Resource<1> gRes1("resource_file1.jpg"); // resource with ID 1
constexpr resman::Resource<2> gRes2("resource_file2.txt"); // resource with ID 2
constexpr resman::Resource<3> gRes3("resource_file3.mp3"); // resource with ID 3
...
Run resource compiler
$ rescomp res_list.h -o res_bundle.o
Link res_bundle.o to your project
Use the resource files
#include "res_list.h"
...
resman::ResourceHandle handle{gRes1};
// ResourceHandle provides convenient interface to do things like:
// iterate over bytes
for (char c : handle) { ... }
// convert bytes to string
std::string str{handle.begin(), handle.end()};
// query size and id
unsigned size = handle.size();
unsigned id = handle.id();
The resource compiler parses res_list.h (using Clang) but instead of generating cpp files, it goes straight to the native object file (or static library) format (using LLVM).

Resources