Why is a fresh install of Haskell-Stack and GHC so large/big? - haskell

When doing a fresh install of Haskell Stack through the install script from here:
wget -qO- https://get.haskellstack.org/ | sh
Followed by:
stack setup
you will end up with a $HOME/.stack/ directory of 1.5 GB size (from just a 120+ MB download). Further if you run:
stack update
the size increases to 2.5 GB.
I am used to Java which is usually considered large/big (covers pretty much everything and has deprecated alternatives for backwards compatibility), but as a comparison: an IDE including a JDK, a stand alone JDK, and the JDK source is probably around 1.5 GB in size.
On the other hand, that Haskell which is a "small beautiful" language (from what I have heard and read, this is probably referring mostly to the syntax and semantics, but still), is that large/big, seems strange to me.
Why is it so big (is it related to this question?)?
Is this size normal or have I installed something extra?
If there are several (4?, 5?) flavors of everything, then can I remove all but one?
Are some of the data cache/temporary that can be removed?
The largest directories are: .stack/programs/x86_64-linux/ghc-tinfo6-nopie-8.2.2/lib/ghc-8.2.2 (1.3 GB) and .stack/indices/Hackage (980 MB). I assume the first one are installed packages (and related to stack setup) and the latter is some index over the Hackage package archive (and related to stack update)? Can these be reduced (as above in 3 or grabbing needed Hackage information online)?

As you can probably see by inspection, it is a combination of:
three flavors (static, dynamic, and profiled) of the GHC runtime (about 400 megs total) and the core GHC libraries (another 700 megs total) plus 100 megs of interface files and another 200 megs of documentation and 120 megs of compressed source (1.5 gigs total, all under programs/x86_64-linux/ghc-8.2.2* or similar)
two identical copies of the uncompressed Hackage index 00-index.tar and 01-index.tar, each containing the .cabal file for every version of every package ever published in the Hackage database, each about 457 megs, plus a few other files to bring the total up to 1.0 gigs
The first of these is installed when you run stack setup; the second when you run stack update.
To answer your questions:
It's so big because clearly no one has made any effort to make it smaller, as evidenced by the whole 00-index.tar, 00-index.tar.gz, and 01-index.tar situation.
That's a normal size for a minimum install.
You can remove the profile versions (the *_p.a files) if you never want to compile a program with profiling. I haven't tested this extensively, but it seems to work. I guess this'll save you around 800 megs. You can also remove the static versions (all *.a files) if you only want to dynamically link programs (i.e., using ghc -dynamic). Again, I haven't tested this extensively, but it seems to work. Removing the dynamic versions would be very difficult -- you'd have to find a way to remove only those *.so files that GHC itself doesn't need, and anything you did remove would no longer be loadable in the interpreter.
Several things are cached and you can remove them. For example, you can remove 00-index.tar and 00-index.tar.gz (saving about half a gigabyte), and Stack seems to run fine. It'll recreate them the next time you run stack update, though. I don't think this is documented anywhere, so it'll be a lot of trial and error determining what can be safely removed.
I think this question has already been covered above.
A propos of nothing, the other day, I saw a good deal on some 3-terabyte drives, and in my excitement I ordered two before realizing I didn't really have anything to put on them. It kind of puts a few gigabytes in perspective, doesn't it?
I guess I wouldn't expend a lot of effort trying to trim down your .stack directory, at least on a beefy desktop machine. If you're working on a laptop with a relatively small SSD, think about maybe putting your .stack directory on a filesystem that supports transparent compression (e.g., Btrfs), if you think it's likely to get out of hand.

Related

Should I use git-lfs for packages info files?

As a developer working with several languages, I notice that in most modern languages, dependencies metadata files can change a lot.
For instance, in NodeJS (which in my opinion is the worst when it comes to package management), a change of dependencies or in the version of NPM (respectively yarn) version can lead to huge changes in package-lock.json (respectively yarn.lock), sometimes with tens of thousands of modified lines.
In Golang for instance, this would be go.sum which can have important changes (in smaller magnitude when compared to Node of course) when modifying dependencies or running go mod tidy at times.
Would it be more efficient to track these dependencies files with git-lfs? Is there a reason not to do it?
Even if they are text files, I know that it is advised to push SVG files with git-lfs, because they are mostly generated files and their diff has no reason to be small when regenerating them after a change.
Are there studies about what language and what size/age of a project that makes git-lfs become profitable?
Would it be more efficient to track these dependencies files with git-lfs?
Git does a pretty good job at compressing text files, so initially you probably wouldn't see much gains. If the file gets heavily modified often, then over time the total cloneable repo size would increase by less if you use Git LFS, but it may not be worth it compared to the total repo size increases, which could make the size savings negligible as a percentage. The primary use case for LFS is largish binary files that change often.
Is there a reason not to do it?
If you aren't already using Git LFS, I wouldn't recommend starting for this reason. Also, AFAIK there isn't native built in support for diffing versions of files stored in LFS, though workarounds exist. If you often find yourself diffing the files you are considering moving into LFS, the nominal storage size gain may not be worth it.

Enormous appimage created by appimage-builder

I'm packaging an application I have written into an AppImage so that it can be delivered to Linux users.
One of the key features of the GUI toolkit I'm using is that it is small and lightweight, allowing me to compile a build which is statically linked to the GUI library of around 6Mb.
However, after building the AppImage - where I do what the instructions say - use all the functionality (which basically includes only using file browser dialogues to load files) - it generates an absolutely enormous AppImage of around 200Mb!
I know that AppImages are supposed to be a "little bit" big, but this is completely mad as a proposed solution for portability when the natively compiled binary including a statically linked GUI toolkit is only 6Mb.
However, I'm not convinced at all that I need all of that 200Mb. A very similar piece of software to mine, but that additionally uses Qt (which is pretty bloated in comparison) is only about 30Mb. I actually suspect that appimage-builder is doing something very wrong - I think it is listing the files in the directory I explore when using the file browser dialogue as dependencies (they are big files). I have no real other explanation. But if so how do I stop it doing that?
Why is mine so big? What can I do about it?
For the record I am using this method for building my AppImage
Building my binary separately
Running appimage-builder --generate and completing the form
Running appimage-builder --recipe AppImageBuilder.yml --skip-tests
Edit: Removing the obviously not needed files that were being packaged have reduced the size of the appimage to just 140Mb, but this is still almost 5 times bigger than equivalent appimages I've seen. Are there some tricks/options I'm not aware of?
In few recent days got started with AppImage and faced the same problem.
Shortly: check dependencies of your app by any possible ways and configure recipe to include only concrete dependencies and avoid includings of any theme/icon/etc packages which are not realy used :)
My case is a small app, written in Dart (with Flutter). The built project itself weights about 22MB (du -sh . in output directory). My host os is Linux Mint (Cinnamon).
First time I run appimage-builder --generate it generated me the "recipe" with 17 packages to be installed and bunch of libraries to be copied from /lib/x86_64-linux-gnu/. When I generated AppImage from this recipe, result was about 105MB, which are extremely large in my opinion for small app.
My first experiments was to cleanup included files section, as I guess "all necessary" libraries should be installed from apt. I referred to a few configs from network where were marked only few libraries for include and was exclude section, which contains some DE related files (themes, fonts, icons and etc.)
Using that I got result about 50MB (which are still large enough).
Next experiments were referred to from this issue - https://github.com/AppImageCrafters/appimage-builder/issues/130#issuecomment-843288012
Shortly - after generating an AppImage file, there appeared file .bundle.yml file inside AppDir folder, which contains deployed libraries. Advice is to try exclude something from that. May be it's a good enough advice, but it takes too long time to check for each package/library if it breaks resulted AppImage file at least with official tests of appimage-builder (docker containers). I faced more broken results than any sane size reduction.
My next experiment was to reduce dependencies which should be installed from package manager and use files from host system. I deleted AppDir and appimage-builder-cache folders and regenerated the recipe. At next step I commented all packages which should be installed from package manager and leaved only included files. Result was fail, because of needing one package, but after adding it I got AppImage result in 36MB. That sounds much better than starting 105MB or even previous result of 50MB.
Here I got small "boost" - Flutter project built into AOT binaries, without runtime. So I checked output of ldd for my app, and then mapped list of required libraries to list of library files which were detected by appimage-builder. Finally some of them was correct, some not found in ldd output and some was in ldd output, but were not detected by appimage-builder. I added all undetected, removed all unused. My final result is 26MB and it passed all appimage-builder tests (running in docker images of fedora, cent, debian, ubuntu and arch)
I understand that it's bad enough for continuous building, because it will require to always check for used libraries and adapt config if something changed, but for rare enough builds I guess it's has some kind of balance between good and bad.

Remove package sources from cabal directory

Having installed a few packages with a large number of dependencies via cabal install, I now have several hundred megabytes of source files in my ~/.cabal/packages/hackage.haskell.org directory. As I'm trying to work on a small SSD, space is at a premium for me. Can I safely remove these, or will doing so cause failures later on?
Remove ~/.cabal/packages/hackage.haskell.org won't cause any failure, but cabal-install will redownload the huge 00-index.tar next time when you try compiling something and this single file is 80+% the size of the folder. It's the index of the whole haskell universe, now around 200MB and hopefully will grow without bound in future.
Compiled libraries and executables won't be affected, so if you are not going to build anything more it's fine to remove the whole folder.

Orchard CMS Disk Space

Evening all,
I am playing with Orchard CMS and I have a quick question. I keep my source code on a 10GB partition on my PC; I downloaded the source for orchard (~40MB) and placed it on that drive.
Started visual studio, opened the solution and started a build, realised quite quickly that it was going to take some time so I went off and got a drink, came back to find it had errored out of the build and that the last 3GB of disk space on my dev drive had been filled. This can't be normal, can it?
Does anyone know how much free disk space I'll need to build orchard from the source? I am limited by the size of the SSD in my laptop and I'm not going to upgrade just so I can use orchard!
Problem is that vanilla source projects don't disable "copy local/private" for references. Therefore every project in the solution creates copies of all references in it's bin folders. This obviously isn't necessary here and increases size exponentially (since these references are shipped together anyway so better if they are included just once).
You have 2 options:
(Recommended) Don't compile the source, I've been writing modules on top of precompiled version and never needed to make changes to the core source, that may do more harm than good. But if you really need to compile >
Force references to not copy locally, either manually for every single reference in every single project or find macro or maybe some VS magic to enforce it globally.

What files can I safely delete from the Vim install directory?

I have an old 256mb usb drive that I'm using as a place to store text files. Scratch notes, links, todo lists, and a personal journal logfile.
I installed vim onto the drive in order to always have a decent text editor, even if I'm switching between computers, quickly editing something on someone else's machine, etc.
I'm not worried about hitting the drive's capacity any time soon: the vim install directory itself takes up more space than all the other text, and vim's pretty small.
But if I was to somehow fill the thing with 2 million or so characters of text, what could I safely delete from the vim install directory to clear up some space?
(Of course this isn't an urgent question, just a fun exercise in minimalism.)
If you want absolute minimalism, you can delete everything and just keep vim.exe (and any required DLLs). You can also reduce vim.exe's size from 1.6 MB to 0.8 MB by using UPX. If you want to reduce vim.exe's size further, compile it yourself removing any flags that you don't need.
You can additionally include your .vimrc, some syntax files, your favourite plugins, etc. The size of this depends on you.
So theoretically you could get your vim installation below 0.8 MB.
I'd start by removing syntax, filetype, and compiler plugins you never intend to use. All the colorschemes could also go, to be replaced with just the one you use (if any).
Assuming you are on Windows, you are going to notice that installation from vim.org uses around 30 mb, which leaves space for plenty text.
If you still believe it is necessary to have Vim use less space, you could try the portable version from PortableApps, which is very stable and uses ~15 mb.
Edit:
Actually the latest gVim Portable (7.4) doesn't save much space. The ~15 mb mentioned refers to version 7.3, which was compiled with a reduced set of features (e.g.: -perl/dyn).
Instead of deleting files as filetype plugins and colorschemes, which are usually very small, you should consider compile Vim enabling only the features you actually use. There are a large number of such features, as can be seem with :version command.

Resources