How to check the availability of, and load, dynamic libraries in FORTRAN on Linux

I have written a FORTRAN library "B" which, depending on the way it is called, may or may not call routines in another library "C". The intent is for "B" to be used in applications "A".
So far, B and C are compiled as static libraries (.a files).
This means that C.a must be available and linked to when compiling B.a, which is okay.
This also means that C.a must be available when compiling the application A, even if A has no intention of using the functionality in B which depends on C. This is annoying and seems unnecessary, as one has to distribute C.a to users who will never use it.
Ideally I would want to have C as a dynamic/shared library, and in B do some run-time availability-check like this (pseudo-code):
if (requested feature from C)
    if (is_available(libC.so))
        call routine_from_C()
        (Go on...)
    else
        call Error("You need to install C")
else
    (We don't need C. Go on...)
Is something like this possible with FORTRAN on Linux?
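One way such a run-time check can be done on Linux is via the POSIX dl API (dlopen / dlsym / dlerror), which Fortran code can reach through ISO_C_BINDING or a small C shim. The following is only a minimal sketch of the idea in C; the library name libC.so and the symbol routine_from_C are the hypothetical names from the pseudo-code above.

/* Minimal sketch: probe for libC.so at run time and call a routine from it.
 * Compile with: cc probe.c -ldl
 * "libC.so" and "routine_from_C" are placeholder names from the question. */
#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    /* Try to load the optional library at run time. */
    void *handle = dlopen("libC.so", RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "You need to install C: %s\n", dlerror());
        return 1;
    }

    /* Look up the routine by name and call it through a function pointer. */
    void (*routine_from_C)(void) = (void (*)(void)) dlsym(handle, "routine_from_C");
    if (routine_from_C != NULL)
        routine_from_C();

    dlclose(handle);
    return 0;
}

The same two calls can also be declared with ISO_C_BINDING interfaces if the check should live entirely on the Fortran side of B.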

Related

Compile part of all dependencies as shared libraries

Say I got (regular source) libraries A and B and executable E which depends on both.
Now, I want E to include the object files of A directly, whereas B should be added as a shared library (concrete use: B contains shared types of a plugin architecture). How would I do that with existing tools, preferably stack?
Is that possible or is it rather an all-or-nothing choice (use only shared libraries or link everything into the same binary)?
Optimally, I'd like to specify for each dependency if it should be linked statically or dynamically. Also, that should probably go into the .cabal file, but we have to work with what we got...
(Well, technically that's both statically linked, but in the second case the object code is split up into different files; you get the idea.)

Why does NPM's policy of duplicated dependencies work?

When I use NPM to manage a package depending on foo and bar, both of which depend on corelib, NPM will by default install corelib twice (once for foo, and once for bar). They might even be different versions.
Now, let's suppose that corelib defined some data structure (e.g. a URL object) which is passed between foo, bar and the main application. Now, what I would expect, is if there was ever a backwards incompatible change to this object (e.g. one of the field names changed), and foo depended on corelib-1.0 and bar depended on corelib-2.0, I'd be a very sad panda: bar's version of corelib-2.0 might see a data structure created by the old version of corelib-1.0 and things would not work very well.
I was really surprised to discover that this situation basically never happens (I trawled Google, Stack Overflow, etc, looking for examples of people whose applications had stopped working, but who could have fixed it by running dedupe.) So my question is, why is this the case? Is it because node.js libraries never define data structures that are shared outside of the programmers? Is it because node.js developers never break backwards compatibility of their data structures? I'd really like to know!
this situation basically never happens
Yes, my experience is indeed that that is not a problem in the Node/JS ecosystem. And I think it is, in part, thanks to the robustness principle.
Below is my view on why and how.
Primitives, the early days
I think the first and foremost reason is that the language provides a common basis for primitive types (Number, String, Bool, Null, Undefined) and some basic compound types (Object, Array, RegExp, etc...).
So if I receive a String from one of the libs' APIs I use, and pass it to another, it cannot go wrong because there is just a single String type.
This is what used to happen, and still happens to some extent to this day: Library authors try to rely on the built-ins as much as possible and only diverge when there is sufficient reason to, and with sufficient care and thought.
Not so in Haskell. Before I started using stack, I ran into the following situation quite a few times with Text and ByteString:
Couldn't match type ‘T.Text’ with ‘Text’
NB: ‘T.Text’ is defined in ‘Data.Text.Internal’ in package ‘text-1.2.2.1’
    ‘Text’ is defined in ‘Data.Text.Internal’ in package ‘text-1.2.2.0’
Expected type: String -> Text
  Actual type: String -> T.Text
This is quite frustrating, because in the above example only the patch version is different. The two data types may only be different nominally, and the ADT definition and the underlying memory representation may be completely identical.
As an example, it could have been a minor bugfix to the intersperse function that warranted the release of 1.2.2.1. Which is completely irrelevant to me if all I care about, in this hypothetical example, is concatenating some Texts and comparing their lengths.
Compound types, objects
Sometimes there is sufficient reason to diverge in JS from the built in data types: Take Promises as an example. It's such a useful abstraction over async computations compared to callbacks that many APIs started using them. What now? How come we don't run into many incompatibilities when different versions of these {then(), fail(), ...} objects are being passed up, down and around the dependency tree?
I think it's thanks to the robustness principle.
Be conservative in what you send, be liberal in what you accept.
So if I am authoring a JS library which I know returns promises and takes promises as part of its API, I'll be very careful how I interact with the received objects. E.g. I won't be calling fancy .success(), .finally(), ['catch']() methods on them, since I want to be as compatible as possible with different users and different implementations of Promises. So, very conservatively, I may just use .then(done, fail), and nothing more. At this point, it doesn't matter whether the user uses the promises my lib returns, or Bluebird's, or even hand-writes their own, so long as those adhere to the most basic Promise 'laws' -- the most basic API contracts.
Can this still lead to breakage at runtime? Yes, it can. If even the most basic API contract is not fulfilled, you may get an exception saying "Uncaught TypeError: promise.then is not a function". I think the trick here is that library authors are explicit about what their API needs: e.g. a .then method on the supplied object. And then it's up to whoever is building on top of that API to make damn sure that that method is available on the object they pass in.
I'd like to also point out here that this is also the case for Haskell, isn't it? Should I be so foolish as to write an instance for a typeclass that still type-checks without following its laws, I'll get runtime errors, won't I?
Where do we go from here?
Having thought through all this just now, I think we might be able to have the benefits of the robustness principle even in Haskell, with much less (or even no(?)) risk of runtime exceptions/errors compared to JavaScript: we just need the type system to be granular enough that it can distinguish what we want to do with the data we manipulate, and determine whether that is still safe. E.g. the hypothetical Text example above is, I would wager, still safe; the compiler should only complain if I try to use intersperse, and ask me to qualify it (e.g. as T.intersperse) so it can be sure which one I want to use.
How do we do this in practice? Do we need extra support, e.g. language extension flags from GHC? We might not.
Just recently I found bookkeeper, which is a compile-time type-checked anonymous records implementation.
Please note: the following is conjecture on my part; I haven't taken much time to experiment with Bookkeeper. But I intend to try it in my Haskell projects, to see whether what I write about below could really be achieved with an approach such as this.
With Bookkeeper I could define an API like so:
emptyBook & #then =: id & #fail =: const
  :: Bookkeeper.Internal.Book'
       '["fail" 'Data.Type.Map.:-> (a -> b -> a),
         "then" 'Data.Type.Map.:-> (a1 -> a1)]
This works because functions are also first-class values. And whichever API takes this Book as an argument can be very specific about what it demands from it: namely the #then function, and that it has to match a certain type signature. It cares not for any other function that may or may not be present, with whatever signature. All this is checked at compile time.
Prelude Bookkeeper> let f o = (o ?: #foo) "a" "b" in f $ emptyBook & #foo =: (++)
"ab"
Conclusion
Maybe Bookkeeper or something similar will turn out to be useful in my experiments. Maybe Backpack will rush to the rescue with its common interface definitions. Or some other solution comes along. But either way, I hope we can move towards being able to take advantage of the robustness principle. And that Haskell's dependency management can also "just work" most of the time and fail with type errors only when it is truly warranted.
Does the above make sense? Anything unclear? Does it answer your question? I'd be curious to hear.
Further possibly relevant discussion may be found in this /r/haskell reddit thread, where this topic came up not long ago; I thought to post this answer in both places.
If I understand correctly, the supposed problem might be:
Module A
module.exports = require("c") // v0.1
Module B
console.log(require("a"))
console.log(require("c")) // v0.2
Module C
v0.1: module.exports = "hello";
v0.2: module.exports = "world";
By copying C v0.2 into node_modules and C v0.1 into node_modules/a/node_modules, and creating dummy package.json files, I think I created the case you're talking about.
Will B have two different, conflicting versions of C's data?
Short answer: it does. So Node does not handle conflicting versions.
The reason you don't see it on the internet is, as gustavohenke explained, that Node naturally does not encourage you to pollute the global scope or to pass structures along a chain of modules.
In other words, it's not often that you'll see a module export another module's structure.
I don't have first-hand experience with this kind of situation in a large JS program, but I would guess that it has to do with the OO style of bundling data together with the functions that act on that data into a single object. Effectively the "ABI" of an object is to pull public methods by name out of a dictionary, and then invoke them by passing the object as the first argument. (Or perhaps the dictionary contains closures that are already partially applied to the object itself; it doesn't really matter.)
In Haskell we do encapsulation at a module level. For example, take a module that defines a type T and a bunch of functions, and exports the type constructor T (but not its definition) and some of the functions. The normal way to use such a module (and the only way that the type system will permit) is to use one exported function create to create a value of type T, and another exported function consume to consume the value of type T: consume (create a b c) x y z.
If I had two different versions of the module with different definitions of T and I was able to use the create from version 1 together with the consume from version 2 then I'd likely get a crash or wrong answer. Note that this is possible even if the public API and externally observable behavior of the two versions is identical; perhaps version 2 has a different representation of T that allows for a more efficient implementation of consume. Of course, GHC's type system stops you from doing this, but there are no such safeguards in a dynamic language.
You can translate this style of programming directly into a language like JavaScript or Python:
import M
result = M.consume(M.create(a, b, c), x, y, z)
and it would have exactly the same kind of problem that you are talking about.
However, it's far more common to use the OO style:
import M
result = M.create(a, b, c).consume(x, y, z)
Note that only create is imported from the module. consume is in a sense imported from the object we got back from create. In your foo/bar/corelib example, let's say that foo (which depends on corelib-1.0) calls create and passes the result to bar (which depends on corelib-2.0) which will call consume on it. Actually, while foo needs a dependency on corelib to call create, bar does not need a dependency on corelib to call consume at all. It's only using the base language notions to invoke consume (what we could spell getattr in Python). In this situation, bar will end up invoking the version of consume from corelib-1.0 regardless of what version of corelib bar "depends on".
Of course, for this to work the public API of corelib must not have changed too much between corelib-1.0 and corelib-2.0. If bar wants to use a method fancyconsume which is new in corelib-2.0, then it won't be present on an object created by corelib-1.0. Still, this situation is much better than the original Haskell version, where even changes that do not affect the public API at all can cause breakage. And perhaps bar depends on corelib-2.0 features for the objects it creates and consumes itself, but only uses the corelib-1.0 API to consume objects it receives externally.
To achieve something similar in Haskell, you could use this translation. Rather than directly using the underlying implementation
data TImpl = TImpl ... -- private
create_ :: A -> B -> C -> TImpl
consume_ :: TImpl -> X -> Y -> Z -> R
...
we wrap up the consumer interface with an existential in an API package corelib-api:
module TInterface where
data T = forall a. T { impl :: a,
                       _consume :: a -> X -> Y -> Z -> R,
                       ... } -- Or use a type class if preferred.
consume :: T -> X -> Y -> Z -> R
consume t = (_consume t) (impl t)
and then the implementation in a separate package corelib:
module T where
import TInterface
data TImpl = TImpl ... -- private
create_ :: A -> B -> C -> TImpl
consume_ :: TImpl -> X -> Y -> Z -> R
...
create :: A -> B -> C -> T
create a b c = T { impl = create_ a b c,
                   _consume = consume_ }
Now foo uses corelib-1.0 to call create, but bar only needs corelib-api to call consume. The type T lives in corelib-api, so if the public API version does not change, then foo and bar can interoperate even if bar is linked against a different version of corelib.
(I know Backpack has a lot to say about this kind of thing; I'm offering this translation as a way to explain what is happening in the OO programs, not as a style one should seriously adopt.)
Here is an answer that mostly covers the same thing: https://stackoverflow.com/a/15948590/2083599
Node.js modules don't pollute the global scope, so when they're required they'll be private to the module that required them - and this is a great feature.
When 2 or more packages require different versions of the same lib, NPM will install a copy for each package, so no conflicts will ever happen.
When they don't, NPM will install that lib only once.
On the other hand, Bower, a package manager for the browser, installs only flat dependencies, because the libs go into the global scope; you can't install jquery 1.x.x and 2.x.x - they'd both export the same jQuery and $ vars.
About the backwards compatibility problems:
All developers do break backwards compatibility at least once! The only difference between Node developers and developers on other platforms is that we have been taught to always use semver.
Considering that most packages out there have not reached v2.0.0 yet, I believe that they have kept the same API in the switch from v0.x.x to v1.0.0.

code segment referenced again with second plugin crashes

I'd like to understand the dynamic linker/loader behaviour on a Linux box in the problematic case I am working on.
Our code that crashes is loaded as a plugin (dlopen(libwrapper.so, RTLD_GLOBAL)). libwrapper.so is just a thin layer that loads other plugins that do the real job. Call these plugins P1 and P2; each of them depends on a common library F (all of this very much simplified).
The wrapper (libwrapper.so) is introduced so that the Pn can be loaded without RTLD_GLOBAL, since that flag leads to obvious linkage problems when loading the Pn (they have the same API). RTLD_DEEPBIND is not an option since the target platform is too old and does not support it.
To our surprise, the problem manifests in the F library when P2 is loaded (with P1 already loaded and initialized, and F loaded as its implicit dependency). At the time P2 is explicitly loaded (dlopen(libP2.so, RTLD_LOCAL | RTLD_NOW)), the dynamic linker reports no problems, but calling code within F to instantiate some types defined in F (again) leads to segmentation faults in various places - if one such place is skipped or commented out, it crashes somewhere else, so we did not spend time investigating a particular troublesome code pattern, since a more general problem or misunderstanding is suspected. No inlined functions are used, the code is linked with -Wl,-E with default visibility, and GCC is 3.4.4. The F code is very stable and has been used in standalone apps and as part of plugins in the past.
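For clarity, a minimal sketch of the load sequence as described above; library names and flags are placeholders taken from the description, not our real code:

/* Sketch of the described load sequence; names are placeholders. */
#include <dlfcn.h>

void load_plugins(void)
{
    /* Done by the host application: the wrapper's symbols are made global. */
    void *wrapper = dlopen("libwrapper.so", RTLD_NOW | RTLD_GLOBAL);

    /* Done inside libwrapper.so: each plugin is loaded with local binding so
     * that P1 and P2 (which have the same API) do not clash; both of them
     * pull in libF.so as an implicit dependency. */
    void *p1 = dlopen("libP1.so", RTLD_LOCAL | RTLD_NOW);
    void *p2 = dlopen("libP2.so", RTLD_LOCAL | RTLD_NOW);  /* crash happens after this */

    (void)wrapper; (void)p1; (void)p2;
}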
I thought of linking F as a static library to work around any problem there might be with the dynamic linker, but the result is the same.
My view on the topic:
Linking F as a dynamic library leads the dynamic linker to "know" that F is referenced a second time when loading P2; it just increments the reference counter and does not call static initializers again (which is OK), but it does perform relocations again, and this seems to be problematic.
Linking F as a static library leads the dynamic linker to load the F code as a statically linked part of P2 (P2F) and to perform relocations within P2F. However, "somehow" common symbols from F get mixed up with the P1F code instance.
Assumed workaround to at least make the code work:
Link P1 ... Pn into a single shared library (a single plugin); whether F is shared or static then doesn't matter. This way any relocation is done only once.
I'd appreciate any feedback: is my view on the topic wrong, too simplified, or missing an important part? Is this some known GCC / binutils bug from the past?
My view on the topic:
Your view on the topic is wrong; but there is no way to prove that to you.
Write a minimal test case that simulates what your system does, and still crashes in a similar way. Update your question with actual broken code; then we can tell you exactly what the problem is.
There is also a very good chance that in reducing the problem to the minimal example, you'll discover what the problem is yourself.
Either way you'll understand the problem, and will learn something new.

Why should I recompile an entire program just for a library update?

With respect to the following link:
http://www.archlinux.org/news/libpnglibtiff-rebuilds-move-from-testing/
Could someone explain to me why a program should be rebuilt after one of its libraries has been updated?
How does that make any sense since the "main" file is not changed at all?
If the signatures of the functions involved haven't changed, then "rebuilding" the program means that the object files must be linked again. You shouldn't need to compile them again.
An API is a contract that describes the interface to the public functions in a library. When the compiler generates code, it needs to know what types of variables to pass to each function, and in what order. It also needs to know the return type, so it knows the size and format of the data that will be returned from the function. When your code is compiled, the address of a library function may be represented as "start of the library, plus 140 bytes." The compiler doesn't know the absolute address, so it simply specifies an offset from the beginning of the library.
But within the library, the contents (that is, the implementations) of the functions may change. When that happens, the length of the code may change, so the addresses of the functions may shift. It's the job of the linker to understand where the entry points of each function reside, and to fill those addresses into the object code to create the executable.
On the other hand, if the data structures in the library have changed and the library requires the callers to manage memory (a bad practice, but unfortunately common), then you will need to recompile the code so it can account for the changes. For example, if your code uses malloc(sizeof(dataStructure)) to allocate memory for a library data structure that's doubled in size, you need to recompile your code because sizeof(dataStructure) will have a larger value.
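To make that last point concrete, here is a small hedged sketch; the structure and function names are invented for illustration, not taken from any real library:

/* Version 1 of the hypothetical library header defined:
 *     typedef struct { int width; int height; } dataStructure;
 * and callers compiled against it baked that smaller sizeof into their code.
 * Version 2 doubles the structure: */
#include <stdlib.h>

typedef struct {
    int width;
    int height;
    int depth;
    int stride;
} dataStructure;

/* Library function from version 2: writes all four fields. */
void init_dataStructure(dataStructure *d)
{
    d->width = d->height = d->depth = d->stride = 0;
}

int main(void)
{
    /* A caller still compiled against the version-1 header allocates the old,
     * smaller size here; the version-2 library then writes past the end of
     * the allocation. Recompiling picks up the new sizeof and fixes this. */
    dataStructure *d = malloc(sizeof(dataStructure));
    init_dataStructure(d);
    free(d);
    return 0;
}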
There are two kinds of compatibility: API and ABI.
API compatibility is about functions and data structures which other programs may rely on. For instance if version 0.1 of libfoo defines an API function called "hello_world()", and version 0.2 removes it, any programs relying on "hello_world()" need updating to work with the new version of libfoo.
ABI compatibility is about the assumptions of how functions and, in particular, data structures are represented in the binaries. If for example libfoo 0.1 also defined a data structure recipe with two fields: "instructions" and "ingredients" and libfoo 0.2 introduces "measurements" before the "ingredients" field then programs based on libfoo 0.1 recipes must be recompiled because the "instructions" and "ingredients" fields will likely be at different positions in the 0.2 version of the libfoo.so binary.
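A quick sketch of how the recipe example plays out at the binary level; the struct definitions below are illustrative, not real libfoo code:

/* Field offsets shift between the two hypothetical versions. */
#include <stdio.h>
#include <stddef.h>

/* libfoo 0.1 layout */
struct recipe_v1 {
    char *instructions;
    char *ingredients;
};

/* libfoo 0.2 inserts "measurements" before "ingredients" */
struct recipe_v2 {
    char *instructions;
    char *measurements;
    char *ingredients;
};

int main(void)
{
    /* A program compiled against 0.1 reads "ingredients" at the old offset,
     * which in a 0.2 recipe is where "measurements" now lives. */
    printf("ingredients offset in 0.1: %zu\n", offsetof(struct recipe_v1, ingredients));
    printf("ingredients offset in 0.2: %zu\n", offsetof(struct recipe_v2, ingredients));
    return 0;
}

A program compiled against 0.1 has the old offsets baked into its object code, which is exactly why it must be recompiled, not just relinked, against 0.2.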
What is a "library"?
If a "library" is only a binary (e.g. a dynamically linked library aka ".dll", ".dylib" or ".so"; or a statically linked library aka ".lib" or ".a") then there is no need to recompile, re-linking should be enough (and even that can be avoided in some special cases)
On the other hand, libraries often consist of more than just the binary object - e.g. the header files might include some inline (or macro) logic.
If so, re-linking is not enough, and you might have to re-compile in order to make use of the newest version of the lib.
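As a hedged illustration of such header-baked logic (the header name, macro, and inline function are invented):

/* Hypothetical libfoo.h, version 1.
 * Both the macro value and the inline body below are compiled into the
 * application itself, not into libfoo.so. */
#define FOO_MAX_NAME 16

static inline int foo_name_fits(const char *s)
{
    /* Count the characters and compare against the version-1 limit. */
    int n = 0;
    while (s[n] != '\0')
        n++;
    return n < FOO_MAX_NAME;
}

If version 2 of the header raises FOO_MAX_NAME or rewrites foo_name_fits, an application keeps the old constant and the old inline body until it is recompiled; dropping a new libfoo.so next to it changes nothing in the code it already contains.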

Extraneous Library Linkage

I have a question which may be somewhat silly because I'm pretty sure I may know the answer already.
Suppose you have static library A, and dynamic shared object library B and your program C under linux. Suppose that library A calls functions from library B and your program calls functions from library A. Now suppose that all functions that C calls in A make no use of functions in B.
To compile C, will it be enough to link just A and omit B? And furthermore, can your program C be run on a system without library B installed?
If your program calls functions in A that don't reference B then B is not required either at link or load time, assuming that the functions in A are in separate compilation units, which is usually the case for a library.
The linker will pull the functions from the library that C uses and since none of them call functions in B, B will not be needed.
Holy placeholder name overload, Batman! Let's first replace A, B, and C with libstatic, libshared, and myapp to make things a little more legible:
Suppose you have static library libstatic, and dynamic shared object library libshared, and your program myapp under Linux. Suppose that library libstatic calls functions from library libshared and your program (myapp) calls functions from library libstatic. Now suppose that all functions that myapp calls in libstatic make no use of functions in libshared.
To compile myapp, will it be enough to link just libstatic and omit libshared, and furthermore can your program myapp be run on a system without library libshared installed?
So the way I understand your question, there is a library libstatic, some functions in which make use of libshared. You want to know: if I don't use any of the libstatic functions that are dependent on libshared, will myapp link and run without libshared?
The answer is yes, so long as two things are true:
The calls you make into libstatic do not depend on libshared directly or indirectly. Meaning that if myapp calls a function in libstatic which calls another function in libstatic which calls a function in libshared, then myapp is now dependent on libshared.
The calls you make into libstatic do not depend on any function in libstatic whose implementation appears in the same compilation unit (object file) with a call to libshared. The linker brings in code from the static library at the level of object files, not at the level of individual functions. And remember, this dependency is similarly chained, so if you call a function in foo.o, and something else in foo.o calls a function in bar.o, and something in bar.o depends on libshared, you're toast.
When you link in a static library into an application, only the object files that contain the symbols used (directly or indirectly) are linked. So if it turns out that none of the object files that myapp ends up needing from libstatic depend on libshared, then myapp doesn't depend on libshared.
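To make the object-file granularity concrete, here is a hedged sketch; the file, function, and library names are invented, and the build commands in the comments assume GCC and GNU ar:

/* safe.c - part of libstatic; no reference to libshared in this object file. */
int safe_add(int a, int b)
{
    return a + b;
}

/* needs_shared.c - also part of libstatic; this object file calls libshared. */
extern int shared_helper(int x);     /* defined in libshared.so */

int fancy_compute(int x)
{
    return shared_helper(x) + 1;
}

/* Build the static library from both object files:
 *     cc -c safe.c needs_shared.c
 *     ar rcs libstatic.a safe.o needs_shared.o
 *
 * myapp.c calls only safe_add(), so only safe.o is pulled out of the archive
 * and the link succeeds without libshared:
 *     cc myapp.c -L. -lstatic
 *
 * If myapp.c also called fancy_compute(), needs_shared.o would be pulled in
 * and the link would fail unless -lshared were added as well. */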

Resources