Enormous appimage created by appimage-builder - linux

I'm packaging an application I have written into an AppImage so that it can be delivered to Linux users.
One of the key features of the GUI toolkit I'm using is that it is small and lightweight, allowing me to compile a build which is statically linked to the GUI library of around 6Mb.
However, after building the AppImage - where I do what the instructions say - use all the functionality (which basically includes only using file browser dialogues to load files) - it generates an absolutely enormous AppImage of around 200Mb!
I know that AppImages are supposed to be a "little bit" big, but this is completely mad as a proposed solution for portability when the natively compiled binary including a statically linked GUI toolkit is only 6Mb.
However, I'm not convinced at all that I need all of that 200Mb. A very similar piece of software to mine, but that additionally uses Qt (which is pretty bloated in comparison) is only about 30Mb. I actually suspect that appimage-builder is doing something very wrong - I think it is listing the files in the directory I explore when using the file browser dialogue as dependencies (they are big files). I have no real other explanation. But if so how do I stop it doing that?
Why is mine so big? What can I do about it?
For the record I am using this method for building my AppImage
Building my binary separately
Running appimage-builder --generate and completing the form
Running appimage-builder --recipe AppImageBuilder.yml --skip-tests
Edit: Removing the obviously not needed files that were being packaged have reduced the size of the appimage to just 140Mb, but this is still almost 5 times bigger than equivalent appimages I've seen. Are there some tricks/options I'm not aware of?

In few recent days got started with AppImage and faced the same problem.
Shortly: check dependencies of your app by any possible ways and configure recipe to include only concrete dependencies and avoid includings of any theme/icon/etc packages which are not realy used :)
My case is a small app, written in Dart (with Flutter). The built project itself weights about 22MB (du -sh . in output directory). My host os is Linux Mint (Cinnamon).
First time I run appimage-builder --generate it generated me the "recipe" with 17 packages to be installed and bunch of libraries to be copied from /lib/x86_64-linux-gnu/. When I generated AppImage from this recipe, result was about 105MB, which are extremely large in my opinion for small app.
My first experiments was to cleanup included files section, as I guess "all necessary" libraries should be installed from apt. I referred to a few configs from network where were marked only few libraries for include and was exclude section, which contains some DE related files (themes, fonts, icons and etc.)
Using that I got result about 50MB (which are still large enough).
Next experiments were referred to from this issue - https://github.com/AppImageCrafters/appimage-builder/issues/130#issuecomment-843288012
Shortly - after generating an AppImage file, there appeared file .bundle.yml file inside AppDir folder, which contains deployed libraries. Advice is to try exclude something from that. May be it's a good enough advice, but it takes too long time to check for each package/library if it breaks resulted AppImage file at least with official tests of appimage-builder (docker containers). I faced more broken results than any sane size reduction.
My next experiment was to reduce dependencies which should be installed from package manager and use files from host system. I deleted AppDir and appimage-builder-cache folders and regenerated the recipe. At next step I commented all packages which should be installed from package manager and leaved only included files. Result was fail, because of needing one package, but after adding it I got AppImage result in 36MB. That sounds much better than starting 105MB or even previous result of 50MB.
Here I got small "boost" - Flutter project built into AOT binaries, without runtime. So I checked output of ldd for my app, and then mapped list of required libraries to list of library files which were detected by appimage-builder. Finally some of them was correct, some not found in ldd output and some was in ldd output, but were not detected by appimage-builder. I added all undetected, removed all unused. My final result is 26MB and it passed all appimage-builder tests (running in docker images of fedora, cent, debian, ubuntu and arch)
I understand that it's bad enough for continuous building, because it will require to always check for used libraries and adapt config if something changed, but for rare enough builds I guess it's has some kind of balance between good and bad.

Related

Understanding Yocto Project sstate-cache functioning

I'm newbie in Yocto Project. I'm managing several projects which all of them have a version for each: development/debug and field/industrialization. While working with the build system, I've noticed the recurrent following scenario.
Let's assume workspace is clean, a fresh build.
Launch bitbake, minimal-image with certain linux-kernel device tree and defconfig parameters. Bitbake will take some time and output files will be created.
Now, change parameters in previously mentioned device tree and defconfig (imagine new peripherals are added). Relaunch bitbake, output files wil be created for this new compilation.
Now, here comes the trick. Reset device tree and defconfig files to the setting before compiling in step 2. Relaunch bitbake and it will be nearly instant. Output files are replaced by the ones created in step 2.
So, I know this is possible due to bitbake and the use of the sstate-cache, or that's what I suppose. I've been googling around for a while but info is not too clear. How does this exactly work? Is there any kind of signature created in compilation time with the inputs of config files and stored associated to the compilation? I'm concerned about this becase I really need to trust that what I'm sending to field is precisely the correct compilation and not the unsafe development version.
And related to this, which is the difference between launching a bitbake -c cleanall or hard deleting deploy and sstate-cache dirs?
Thanks in advance.
The information is available in the official manual:
https://www.yoctoproject.org/docs/3.0/overview-manual/overview-manual.html#shared-state-cache

Meaning warning "File is touched by more than one package"

I am creating a simple linux kernel with buildroot and I am adding a small driver I've done myself, I created the Config.in file and drivername.mk to be able to select the driver in make menuconfig succesfully.
When executing make to build the image, the compilation goes correctly until my driver starts to compile, it looks to compile and create the image right but I get loooots of warnings saying that different files in ./lib/gcc/arm-buildroot-linux-uclibcgnueabihf/ are touched by more than one package: [u'host-gcc-initial', u'host-gcc-final'].
Anyone can explain me a bit about this issue and what is causing it? Do you need any more info to know what is happening? Is it safe to ignore them?
Thanks beforehand
Actually, doing a search on 'touched by more than one package', I found http://lists.busybox.net/pipermail/buildroot/2017-October/205602.html, where we find that this warning can safely be ignored if you're not doing a parallel build and aren't a kernel maintainer.
That said, if you're submitting code for inclusion in the Linux kernel, please be a good citizen and make sure you identify all of the things your code is dependent upon. (I'm not actually an active kernel hacker, so I don't know what method they're using for this right now.)
The basic idea is that there are a bunch of steps in compiling things that need to be done in a logical order. In a small project, we simply use dependencies that we know to put in because we also coded in that dependency. But with a project the size of the kernel, you can guarantee that not everyone does this. Some of them instead just specify dependencies if they're needed for things to build properly - if the default order works, things could go years before someone figures out that there was a missing dependency, causing them grief when they were trying to update just the one thing that was a missing dependency, and the other code not getting updated as a result.
When you're doing things in parallel, on the other hand, it becomes a lot more complicated. Now you really need to have every dependency specified, because there is no longer any inherent dependable order. Some people will probably still build serially, while others use two processing threads. I'll use 8. I've worked in groups that would be inclined to do 30, because they're on a 32 processor machine, and don't really need all of those during the off hours. Suddenly the fact that the file you needed from a directory that normally got processed 30 directories before yours is now getting processed at the same time as your file that needed it, because you didn't list the dependency and everything in those 30 directories that hasn't already been processed and isn't being processed has a dependency that's not yet finished its processing.

Remove package sources from cabal directory

Having installed a few packages with a large number of dependencies via cabal install, I now have several hundred megabytes of source files in my ~/.cabal/packages/hackage.haskell.org directory. As I'm trying to work on a small SSD, space is at a premium for me. Can I safely remove these, or will doing so cause failures later on?
Remove ~/.cabal/packages/hackage.haskell.org won't cause any failure, but cabal-install will redownload the huge 00-index.tar next time when you try compiling something and this single file is 80+% the size of the folder. It's the index of the whole haskell universe, now around 200MB and hopefully will grow without bound in future.
Compiled libraries and executables won't be affected, so if you are not going to build anything more it's fine to remove the whole folder.

Is there any JIT pre-caching support in NodeJS?

I am using a rather large and performance-intensive nodejs program to generate hinting data for CJK fonts (sfdhanautohint), and for some better dependency tracking I had to end up calling the nodejs program tens of thousands of times from a makefile like this.
This immediately brought me to the concern that doing such is actually putting a lot of overhead in starting and pre-heating the JIT engine, so I decided to find something like ngen.exe for nodejs. It appears that V8 already has some support for code caching, but is there anything I can do to use it in NodeJS?
Searching for kProduceCodeCache in NodeJS's GitHub repo doesn't return any non-bundled-v8 results. Perhaps it's time for a feature request…
Yes, this happens automatically. Node 5.7.0+ automatically pre-caches (pre-heats the JIT engine for your source) the first time you run your code (since PR #4845 / January 2016 here: https://github.com/nodejs/node/pull/4845).
It's important to note you can even pre-heat the pre-heat (before your code is ever even run on a machine, you can pre-cache your code and tell Node to load it).
Andres Suarez, a Facebook developer who works on Yarn, Atom and Babel created v8-compile-cache, which is a tiny little module that will JIT your code and require()s, and save your Node cache into your $TMP folder, and then use it if it's found. Check out the source for how it's done to suit other needs.
You can, if you'd like, have a little check that runs on start, and if the machine architecture is in your set of cache files, just load the cached files instead of letting Node JIT everything. This can cut your load time in half or more for a real-world large project with tons of requires, and it can do it on the very first run
Good for speeding up containers and getting them under that 500ms "microservice" boot time.
It's important to note:
Caches are binaries; they contain machine-executable code. They aren't your original JS code.
Node cache binaries are different for each target CPU you intend to run on (IA-32, IA-64, ARM etc). If you want to pre-cache pre-caches for your users, you must make cache targets for each target architecture you want to support.
Enjoy a ridiculous speed boost :)

Implementing an update/upgrade system for embedded Linux devices

I have an application that runs on an embedded Linux device and every now and then changes are made to the software and occasionally also to the root file system or even the installed kernel.
In the current update system the contents of the old application directory are simply deleted and the new files are copied over it. When changes to the root file system have been made the new files are delivered as part of the update and simply copied over the old ones.
Now, there are several problems with the current approach and I am looking for ways to improve the situation:
The root file system of the target that is used to create file system images is not versioned (I don't think we even have the original rootfs).
The rootfs files that go into the update are manually selected (instead of a diff)
The update continually grows and that becomes a pita. There is now a split between update/upgrade where the upgrade contains larger rootfs changes.
I have the impression that the consistency checks in an update are rather fragile if at all implemented.
Requirements are:
The application update package should not be too large and it must also be able to change the root file system in the case modifications have been made.
An upgrade can be much larger and only contains the stuff that goes into the root file system (like new libraries, kernel, etc.). An update can require an upgrade to have been installed.
Could the upgrade contain the whole root file system and simply do a dd on the flash drive of the target?
Creating the update/upgrade packages should be as automatic as possible.
I absolutely need some way to do versioning of the root file system. This has to be done in a way, that I can compute some sort of diff from it which can be used to update the rootfs of the target device.
I already looked into Subversion since we use that for our source code but that is inappropriate for the Linux root file system (file permissions, special files, etc.).
I've now created some shell scripts that can give me something similar to an svn diff but I would really like to know if there already exists a working and tested solution for this.
Using such diff's I guess an Upgrade would then simply become a package that contains incremental updates based on a known root file system state.
What are your thoughts and ideas on this? How would you implement such a system? I prefer a simple solution that can be implemented in not too much time.
I believe you are looking wrong at the problem - any update which is non atomic (e.g. dd a file system image, replace files in a directory) is broken by design - if the power goes off in the middle of an update the system is a brick and for embedded system, power can go off in the middle of an upgrade.
I have written a white paper on how to correctly do upgrade/update on embedded Linux systems [1]. It was presented at OLS. You can find the paper here: https://www.kernel.org/doc/ols/2005/ols2005v1-pages-21-36.pdf
[1] Ben-Yossef, Gilad. "Building Murphy-compatible embedded Linux systems." Linux Symposium. 2005.
I absolutely agree that an update must be atomic - I have started recently a Open Source project with the goal to provide a safe and flexible way for software management, with both local and remote update. I know my answer comes very late, but it could maybe help you on next projects.
You can find sources for "swupdate" (the name of the project) at github.com/sbabic/swupdate.
Stefano
Currently, there are quite a few Open Source embedded Linux update tools growing, with different focus each.
Another one that is worth being mentioned is RAUC, which focuses on handling safe and atomic installations of signed update bundles on your target while being really flexible in the way you adapt it to your application and environment. The sources are on GitHub: https://github.com/rauc/rauc
In general, a good overview and comparison of current update solutions you might find on the Yocto Project Wiki page about system updates:
https://wiki.yoctoproject.org/wiki/System_Update
Atomicity is critical for embedded devices, one of the reasons highlighted is power loss; but there could be others like hardware/network issues.
Atomicity is perhaps a bit misunderstood; this is a definition I use in the context of updaters:
An update is always either completed fully, or not at all
No software component besides the updater ever sees a half installed update
Full image update with a dual A/B partition layout is the simplest and most proven way to achieve this.
For Embedded Linux there are several software components that you might want to update and different designs to choose from; there is a newer paper on this available here: https://mender.io/resources/Software%20Updates.pdf
File moved to: https://mender.io/resources/guides-and-whitepapers/_resources/Software%2520Updates.pdf
If you are working with the Yocto Project you might be interested in Mender.io - the open source project I am working on. It consists of a client and server and the goal is to make it much faster and easier to integrate an updater into an existing environment; without needing to redesign too much or spend time on custom/homegrown coding. It also will allow you to manage updates centrally with the server.
You can journal an update and divide your update flash into two slots. Power failure always returns you to the currently executing slot. The last step is to modify the journal value. Non atomic and no way to make it brick. Even it if fails at the moment of writing the journal flags. There is no such thing as an atomic update. Ever. Never seen it in my life. Iphone, adroid, my network switch -- none of them are atomic. If you don't have enough room to do that kind of design, then fix the design.

Resources