Distributing CUDA runtime to customers but it's too big

Distributing CUDA runtime to customers but it's too big - pytorch

At my company, we are building software that we need to push to customers when we update software (It's being pushed to custom hardware).
We have a GPU on that custom hardware that is fixed, but sometimes, we might need to upgrade the CUDA and CUDNN runtime if we upgrade things in our software (such as libtorch).
The problem now is that because of this, we have to ship CUDA and CUDNN together, which bloats the size of the binaries to over 2GB.
While the actual size of our executable is only 100MB. Is there any smart way around this?

https://pytorch.org doesn't advertize it, but there is a static version of libtorch available (replace 'shared' with 'static' in the URL).
Link against those libraries instead. Your binary will be a bit bigger (depending on how much of the library your code is using), but on the plus side you'll be saving 1.2GB there, because you don't have to ship the libraries.
CUDA and cuDNN should also have static versions available, although they might be missing in some re-distributions (like in Anaconda).

Related

How to find in Linux system, which suitable version of glibc can be installed

I'm trying to update glibc 2.19-r1 to newer version 2.23-r1 in order to overcome some security vulnerabilities. I generated a new binary package (tbz2) using Gentoo system, but now I'm having problems with installing it to my system.
my question is: How can I know if there is anther feature/application that also needs to be updated? Which dependencies does glibc has?
Thank you,
Sami

Which dependencies does glibc has?
It doesn't have any.
When configured, it may require a minimal kernel version on which it would run. Usually it supports kernels that are newer than at least 5 years; on x86 often much older kernels are also supported.
To build it, you also need sufficiently recent versions of gcc, make and some other tools (but these dependencies don't transitively apply to the system on which you want to install it).

Can I run a binary compiled from higher version of gcc?

For example, my local machine only has gcc 4.4. Can I run a binary compiled from gcc 4.8?

Running a binary is dependent on your libraries, not your compiler, so it doesn't matter which gcc is installed. Check your library versions.

There are several possible answers here, depending on the particular circumstances:
First, an important factor whether the binary in question is C or C++. The C++ ABI is tightly coupled to the compiler. It's highly unlikely that you will be able to run C++ code built by a higher version of gcc. The C++ standard library comes with the compiler. If you do not have gcc 4.8 installed, you won't have that version's libstdc++ installed either, and it's unlikely that any gcc 4.8 C++ code will run. Generally, it's far more likely that libstdc++ will have forward, than backward, compatibility.
The C ABI, rather, is less tightly married to the compiler, and there is a moderate chance that pure C code will run, as long as it doesn't use any symbols from the C standard library that are specific to a higher version of libc. Same goes for every other shared library used by the particular code in question.
But there is also one other answer:
Simply "no", that's it. This question is often asked in the context of having an established seat of an older version of a commercial Linux distribution which ships with a stale version of some popular package, often Apache, or it could also be Mysql, perhaps PHP or something similar; and there is a requirement to install another product that requires a newer version of the component, that's available in the current version of the commercial Linux distribution, but not in the older version being used.
So this kind of question is an attempt to figure out whether it's possible to simply grab Apache, or whatnot, from the current version of the distro, and somehow cram it inside the older version. When I hear the question being asked here, it's been my experience that this kind of a question is really a manifestation of this class of an XY problem.
And the proper answer is to either update your commercial Linux distro to the current version, or grab the source code to the newer version of the software package in question, and build it using your commercial Linux distro's packaging tool, then proceed to execute a standard update of this system component using the commercial Linux distro's regular package update process; rather than attempting to execute this kind of a half-hearted transplant from a newer version of the Linux distro, due to lack of local developer and/or sysadmin resource which has the necessary knowledge to prepare a proper software update build. This kind of a brain transplant rarely works, at best it results in random instability and crashes, and becomes a perpetual drain of support resources.
The answer here needs to be the right answer: either update your Linux distro properly, or build the newer version of the software in question from source, using the system compiler.

Linux distro/version to support when releasing a software on Linux

We are about to release a couple of softwares with Linux support.
As for Mac and Windows, the number of version to support is quite limited (xp, 2000, vista, 7 for win, 10.4-6 for Mac). But for linux it's another story.
We'd like to support as many Linux as possible, but the choice is large.
The questions are:
Which distribution format (binaries) to use to support as many Linux as possible?
For testing, what "base linux" can we test on and extend our results to other linuxes.
According we provide statically linked binary with all the dependencies, what do we need to check? I assume kernel version and libc version, but I'm wondering.
Our software is written in ANSI compliant C with a bit of BSD and POSIX (gettimeofday, pthreads).

So you think three versions each for Mac and Windows is normal, but you shy away from Linux? Hm.
Just make sure it builds using the standard tool chains -- configure, make and make install traditionally. The rest should take care of itself.
Else, pick what you are comfortable with. For me that would be Debian/Ubuntu, others prefer Fedora. Look at the Linux Standards Base and things like FreeDesktop.org for other standards. Kernel and libc should not matter unless you are doing something very hardware or driver-specific.

The kernel strives to maintain a backwards-compatible binary API. Statically linked binaries built against 1.0 series kernels are supposed to still run fine to this day on the latest 2.6 series kernels.
If you are statically linking with everything (including libc), then the major problem you are likely to face is different filesystem arrangements, which may not even be a great issue for you. (Testing is the only way to find out, though).

An idea is to survey your proposed customer base so see which linux version they run and make a short list from their feedback. However from what I know (which is subjective!) ...
I would suggest running two different distribution types -- rpm and .tar.gz. With rpm you cater for the latest Fedora/openSUSE/RHEL/SLES (and derived distros, which is a fair chunk of the corporate market). You are already handing a lot of dependency problem by static linking, so kernel version should be sufficient.
With .tar.gz distribution you cater for 'all others' but watch support and configuration problems as they quickly become a time sink.
For testing, have virtual machines of each version you choose to support. These can also be used for product support (I assume you will need to provide product support??) I wouldn't try to extrapolate results between linux versions because there a too many hidden 'gotchas'.

You can release statically compiled Linux binaries against the kernel & version of glibc. You really only need worry about compatibility-breaking revisions. If you have some time, you can setup everything to cross-compile on the same host. The kernel is backward compatible. glibc is more temperamental.
File paths can be assumed to be Linux Standard Base, if you want to package it with an installer. The more flexible you can be here, the better. I've never heard a customer complain about receiving a tarball of binaries, which I'd recommend offering. I have had customers complain about incorrect assumptions.
Your best bet for a formal package format is probably between DEB (Debian Linux & derivatives, like Ubuntu) and RPM (Red-Hat & derivatives, like Cent-OS). Packages are nice to have, but are just a headache if you don't plan on utilizing the native update manager.
For test & build, I'd personally recommend Gentoo. It's pretty raw, however, so you might want to look into Ubuntu as a distant second choice.

This is an issue for your product management team. Once they have determined that producing a Linux version is a desirable idea (i.e. on a cost-benefit basis), then you will need to find out what distros your customers use or want supported.
In principle you can support any but the more you support the more of a headache it will be, so you want as FEW as possible.
Support as few OS / architecture combinations as your PM thinks you can get away with
Deprecate OSs / architectures as soon as you can
Only take on new ones if premium support customers demand it, or to get big deals, as per your PM's decision.
How hard it is to support them is largely dependent on how complex your product is (esp. dependencies) and how complete its auto-test suite is. Adding more supported OSs ties your hands with respect to library usage, kernel feature usage etc as well as testing, so it's not something you want to be lumbered with long-term.
So in short, it's not a software engineering issue, but a product management one.

How do I build an app for an old linux distribution, and avoid the FATAL: kernel too old error?

I distribute a statically linked binary version of my application on linux. However, on systems with the 2.4 kernel, I get a segfault on startup, and the message: "FATAL: kernel too old."
How can I easily get a version up and running with a 2.4 kernel? Some of the libraries I need aren't even available on old linux distributions circa 2003. Is there an apt-get install or something that will allow me to easily target older kernels?

The easiest way is to simply install VirtualBox (or something similar, e.g. VMWare),Install CentOS 3 or any suitable old distro with a 2.4 kernel and build/test your app on that.
Since you're getting a "kernel too old", chances are you're relying on some features not present in 2.4 kernels so you'll have to trace down and rework that.
The error might simply be caused by linking statically to glibc, you could try linking to glibc dynamically and all your other libs statically, though to be backwards compatible you'd have to build your app on an old glibv system. Using the lsb tools to build could help too

For my use case, I can't statically link my supporting libraries. Also, current Linux distributions seem to make this difficult to accomplish for certain situations. But I needed my application binaries to run on 10-year old Linux systems.
I also didn't want to limit myself to an ancient 10-year old C/C++ compiler. I also found that the hardware I needed to use prevented me from installing a 10 year old Linux distribution for some reason.
So, I did this:
Installed docker.
Within a docker instance, install a 10-year old Linux system (I used Debian's Lenny distribution). This has the added advantage of making this build system available to any other machine that can run docker.
Within the docker instance, build the current GNU compilers (8.3.0 when I did this).
This gave me a modern compiler that compiled binaries that would run on very old Linux systems. I did this for both 32-bit and 64-bit processors.
From there, I created a series of scripts that allowed me to use the docker-contained cross-compiler to build all my supporting libraries. I made sure to set the rpath to my compiled binaries to a path relative to my binaries (using -Wl,-rpath,$ORIGIN/../lib), and a built a script to retrieve any supporting libraries from the compiler, using g++ -print-search-dirs to get the paths, ldd to get the supporting libraries I needed from my binaries, and some aggressive bash scripting to find the supporting libraries existing within the search-dirs from g++, dropping these libs into the rpath I set up.
From there, I package my binary accordingly, with all supporting libs.
Yeah, this is somewhat painful, but it results in a fully functioning binary capable of working on ridiculously old linux systems without having to install different Linux distributions on multiple virtual machines.
I tried creating a proper cross-compiler (native to the current Linux distribution hosting my docker images), but found it too difficult to work with, even with the best tools I could find to help me. Compiling the compiler within a docker image took far less of my time, and worked rather smoothly.

Why I need to re-compile vmware kernel module after a linux kernel upgrade?

After a linux kernel upgrade, my VMWare server cannot start until using vmware-config.pl to do some re-config work (including build some kernel modules).
If I update my windows VMWare host with latest Windows Service Pack, I usually not need to do anything to run VMWare.
Why VMWare works differently between Linux and Windows? Does this re-compile action brings any benifits on Linux platform over Windows?

Go read The Linux Kernel Driver Interface.
This is being written to try to explain why Linux does not have a binary kernel interface, nor does it have a stable kernel interface. Please realize that this article describes the _in kernel_ interfaces, not the kernel to userspace interfaces. The kernel to userspace interface is the one that application programs use, the syscall interface. That interface is _very_ stable over time, and will not break. I have old programs that were built on a pre 0.9something kernel that still works just fine on the latest 2.6 kernel release. This interface is the one that users and application programmers can count on being stable.
It reflects the view of a large portion of Linux kernel developers:
the freedom to change in-kernel implementation details and APIs at any time allows them to develop much faster and better.
Without the promise of keeping in-kernel interfaces identical from release to release, there is no way for a binary kernel module like VMWare's to work reliably on multiple kernels.
As an example, if some structures change on a new kernel release (for better performance or more features or whatever other reason), a binary VMWare module may cause catastrophic damage using the old structure layout. Compiling the module again from source will capture the new structure layout, and thus stand a better chance of working -- though still not 100%, in case fields have been removed or renamed or given different purposes.
If a function changes its argument list, or is renamed or otherwise made no longer available, not even recompiling from the same source code will work. The module will have to adapt to the new kernel. Since everybody (should) have source and (can find somebody who) is able to modify it to fit. "Push work to the end-nodes" is a common idea in both networking and free software: since the resources [at the fringes]/[of the developers outside the Linux kernel] are larger than the limited resources [of the backbone]/[of the Linux developers], the trade-off to make the former do more of the work is accepted.
On the other hand, Microsoft has made the decision that they must preserve binary driver compatibility as much as possible -- they have no choice, as they are playing in a proprietary world. In a way, this makes it much easier for outside developers who no longer face a moving target, and for end-users who never have to change anything. On the downside, this forces Microsoft to maintain backwards-compatibility, which is (at best) time-consuming for Microsoft's developers and (at worst) is inefficient, causes bugs, and prevents forward progress.

Linux does not have a stable kernel ABI - things like the internal layout of datastructures, etc changes from version to version. VMWare needs to be rebuilt to use the ABI in the new kernel.
On the other hand, Windows has a very stable kernel ABI that does not change from service pack to service pack.

To add to bdonlan's answer, ABI compatibility is a mixed bag. On one hand, it allows you to distribute binary modules and drivers which will work with newer versions of the kernel. On the other hand, it forces kernel programmers to add a lot of glue code to retain backwards compatibility. Because Linux is open-source, and because kernel developers even whether they're even allowed, the ability to distribute binary modules isn't considered that important. On the upside, Linux kernel developers don't have to worry about ABI compatibility when altering datastructures to improve the kernel. In the long run, this results in cleaner kernel code.

It's a consequence of Linux and Windows being developed in different cultural environments and expectations: http://www.joelonsoftware.com/articles/Biculturalism.html. In short: Windows is designed to be suitable for users, whereas Linux evolves to be suitable for open source developers.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string