Do I ever want to install a package outside of a sandbox?

After 3 months of using Haskell, I just realized that I shouldn't use cabal as a package manager.
Now my question is do I ever want to install a package outside of a sandbox? If yes, why?

I often muck around with ideas that aren't full packages. They're often a single file, around 30-100 lines, don't have a main, aren't libraries, and are never intended to be used as anything other than toys to load in ghci. Occasionally, they depend on libraries from packages that don't come with ghc. I'll just install packages they depend on in my user db, because who cares? Worst case, I'll nuke ~/.ghc and reinstall what I need for whatever I work on next.

Related

What does cabal mean when it says "The following packages are likely to be broken by the reinstalls"

I've seen this message pop up a couple times when running cabal v1-install with a suggestion to use --force-reinstalls to install anyway. As I don't know that much about cabal, I'm not sure why a package would break due to a reinstall. Could someone please fill me in on the backstory behind this message?
Note for future readers: this discussion is about historical matters. For practical purposes, you can safely ignore all of that if you are using Cabal 3.
The problem had to do with transitive dependencies. For instance, suppose we had the following three packages installed at specific versions:
A-1.0;
B-1.0, which depends on A; and
C-1.0, which depends on B, but not explicitly on A.
Then, we would install A-1.1, which seemingly would work fine:
A-1.1 would be installed, but the older A-1.0 version would be kept around, solely for the sake of other packages built using it;
B-1.0 would keep using A-1.0; and
C-1.0 would keep using B-1.0.
However, there would be trouble if we, for whatever reason, attempted to reinstall B-1.0 (as opposed to, say, update to B-1.1):
A-1.1 and A-1.0 would still be available for other packages needing them;
B-1.0, however, would be rebuilt against A-1.1, there being no way of keeping around a second installation of the same version of B; and
C-1.0, which was built against the replaced B-1.0 (which depended on A-1.0), would now be broken.
v1-install provided a safeguard against this kind of dangerous reinstall. Using --force-reinstalls would disable that safeguard.
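The scenario above can be replayed with a small toy model. This is only a sketch under simplifying assumptions: `Installed`, `buildTag`, `register`, and `broken` are hypothetical names made up for illustration, not Cabal's or ghc-pkg's API. The one rule it encodes is that the database holds at most one build per (name, version) pair.

```haskell
-- A toy model of a classic (pre-store) package database.
-- All names here are hypothetical; this is not Cabal's or ghc-pkg's API.

type PkgId = (String, String)    -- (package name, version)

data Installed = Installed
  { pkgId    :: PkgId
  , buildTag :: Int               -- stand-in for the identity of one particular build
  , deps     :: [(PkgId, Int)]    -- each dependency, with the build it was compiled against
  }

type PackageDb = [Installed]

-- Registering a build replaces any existing build with the same name and version.
register :: Installed -> PackageDb -> PackageDb
register new db = new : filter ((/= pkgId new) . pkgId) db

-- A package is broken if a build it was compiled against is no longer registered.
broken :: PackageDb -> [PkgId]
broken db = [ pkgId p | p <- db, any missing (deps p) ]
  where
    missing (d, tag) = null [ () | q <- db, pkgId q == d, buildTag q == tag ]

a10, a11, b10, c10 :: PkgId
a10 = ("A", "1.0")
a11 = ("A", "1.1")
b10 = ("B", "1.0")
c10 = ("C", "1.0")

initialDb, afterInstallA, afterReinstallB :: PackageDb
initialDb =
  [ Installed c10 3 [(b10, 2)]
  , Installed b10 2 [(a10, 1)]
  , Installed a10 1 []
  ]
-- Installing A-1.1 adds a new entry and leaves A-1.0 in place.
afterInstallA   = register (Installed a11 4 []) initialDb
-- Reinstalling B-1.0 against A-1.1 replaces the build that C-1.0 was compiled against.
afterReinstallB = register (Installed b10 5 [(a11, 4)]) afterInstallA
```

Loaded in ghci, `broken afterInstallA` evaluates to `[]`, while `broken afterReinstallB` evaluates to `[("C","1.0")]`, which is the breakage the safeguard warned about.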
For a detailed explanation of the surrounding issues, see Albert Y. C. Lai's Storage and Identification of Cabalized Packages (in particular, the example I used here is essentially a summary of its Corollary: The Pigeon Drop Con section).
In its later versions, Cabal 1 could detect, in the scenario above, that the reinstall changed B even though its version number stayed the same (which is what made the safeguard possible), but it could not keep the two variants of B-1.0 installed side by side. Cabal 3 can, which eliminates the problem.

What is Cabal Hell?

I am a little bit confused while reading about Cabal Hell, as the term is overloaded. I guess Cabal Hell originally referred to the diamond dependency problem, which was solved by restricting each build plan to a single version of any package (two different versions of a package can't coexist in one build plan), as explained in this answer.
However, the term is also used in various other contexts, such as destructive re-installations, incorrect package dependency bounds (lower/upper version bounds), inconsistent environments, and so on (or any other error reported by Cabal).
Among these, I am particularly confused about 1) destructive re-installations and 2) inconsistent environments. What do they mean, and how does cabal new-build solve these problems (is it just sandboxing like cabal sandbox)? And what role does ghc-pkg play here?
Any references or a simple example where these problems could be reproduced would be very appreciated.
Regarding "destructive re-installations": If I am not wrong, GHC has a package manager of its own (ghc-pkg), and packages are installed as dynamically linkable libraries. For example, base depends on ghc-prim, so if ghc-prim is removed it will break base, am I right? And since GHC only allows one instance of a package with the same version, cabal install might register a newer build of the same (package, version) pair, breaking the dependents of the build it unregistered. If this understanding of "destructive re-installations" is correct, how does cabal new-build help here?
The only meaningful use of the term is the one given in the linked answer. Related are the follow-on problems from having lots of different packages in the global database, which can make encountering diamond dependencies more common, requiring destructive reinstalls to resolve, etc.
The other usages of the term are not helpful and just mean "problems somehow involving cabal."
That said, let me answer your other questions.
1) ghc-pkg is not a package manager, but rather a tool for managing ghc package databases. It is used by cabal to register packages into databases, and can be used by end-users to inspect the contents of the databases. Think of it as part of the underlying substrate provided by ghc, not a competing tool.
2) new-build replaces the standard notion of a package db entirely. Instead of a db consisting of packages and versions, with at most one entry per (package, version) pair, a db consists of potentially many copies of a package at any given version, each built against potentially different versions of its dependencies. These copies are managed in part by hash-addressing, so each is marked by a unique "fingerprint". This is called the store. When you new-build, cabal calculates a build plan from scratch, irrespective of any previously installed dependencies. If a particular fingerprint (consisting of a package, its version, the versions of all its dependencies, certain flags, etc.) already exists in the store, cabal makes use of it. If it does not, cabal builds it.
As such, the only "diamond dependencies" that can occur are the truly insoluble ones, and not the ones occasioned by having fixed too-early (due to already-installed deps) some portion of the dependency tree.
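A minimal sketch of the store idea, assuming a drastically simplified model: `Unit`, `fingerprint`, and `ensureBuilt` are hypothetical names, and the "fingerprint" here is just a readable string built from the same inputs, whereas the real store uses hashes.

```haskell
import Data.List (intercalate)

-- Hypothetical model of a fingerprint-addressed store; not cabal's real data structures.
data Unit = Unit
  { unitName    :: String
  , unitVersion :: String
  , unitFlags   :: [String]
  , unitDeps    :: [Unit]
  }

-- A readable stand-in for the fingerprint: package, version, flags, and the
-- fingerprints of all dependencies. Real tools hash this kind of information.
fingerprint :: Unit -> String
fingerprint u =
  unitName u ++ "-" ++ unitVersion u
    ++ "{" ++ intercalate "," (unitFlags u) ++ "}"
    ++ "[" ++ intercalate "," (map fingerprint (unitDeps u)) ++ "]"

type Store = [(String, Unit)]

-- Return the updated store and whether a build was needed.
ensureBuilt :: Unit -> Store -> (Store, Bool)
ensureBuilt u store =
  case lookup key store of
    Just _  -> (store, False)             -- fingerprint already present: reuse it
    Nothing -> ((key, u) : store, True)   -- not present: "build" it and add it
  where
    key = fingerprint u

a10, a11, bOld, bNew :: Unit
a10  = Unit "A" "1.0" [] []
a11  = Unit "A" "1.1" [] []
bOld = Unit "B" "1.0" [] [a10]   -- B-1.0 built against A-1.0
bNew = Unit "B" "1.0" [] [a11]   -- B-1.0 built against A-1.1
```

Because `bOld` and `bNew` have different fingerprints, both can sit in the store at once, so adding one never overwrites the other. Asking for `bOld` a second time finds the existing entry and skips the build.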
tl;dr: you write "since GHC only allows one instance of a package with the same version", but new-build partially lifts this restriction in the store, which allows the solver to produce better, more reproducible plans more often.

Conflicting versions of Data.Map

I'm working with this module Algorithms.Geometry.LineSegmentIntersection.BentleyOttman using the function "intersections" that returns something of type Intersections which in turn is an alias for Map (Point 2 r) (Associated p r). So, I try to manipulate that result with the corresponding functions of the Data.Map.Lazy module, but I get the following error:
Any ideas on how to fix it? Thanks!
You have two versions of the containers package installed, and have ended up referencing both of them. A Map produced by containers 0.5.7.1 can't be passed to a Map-consuming function from containers 0.5.10.1 (or any mismatched versions), even if their definition of Map in source code is the same.
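As an analogy only (a single file cannot import two versions of containers), here is a sketch with two hypothetical types, `MapA` and `MapB`, whose definitions are identical. The compiler still treats them as unrelated, which is the situation with Map from two different builds of containers.

```haskell
-- Two types with identical definitions, standing in for Map as provided by two
-- different versions of containers. All names here are hypothetical.
newtype MapA k v = MapA [(k, v)]   -- plays the role of Map from containers-0.5.7.1
newtype MapB k v = MapB [(k, v)]   -- plays the role of Map from containers-0.5.10.1

-- A function compiled against the second definition.
sizeB :: MapB k v -> Int
sizeB (MapB kvs) = length kvs

-- A value produced by code compiled against the first definition.
resultA :: MapA String Int
resultA = MapA [("p", 1), ("q", 2)]

-- sizeB resultA   -- rejected by the type checker: MapA and MapB are different types

-- In this toy setting an explicit conversion bridges the two types.
aToB :: MapA k v -> MapB k v
aToB (MapA kvs) = MapB kvs
```

In the real case the cleaner fix is to end up with a single version of containers in the build, rather than converting between the two.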
Without knowing more about your installation history, it's impossible to say exactly why that happened. I would guess you're just using cabal install to install packages as you need them, into the default user-wide package environment? That almost inevitably results in problems like this, eventually.
The easiest immediate solution is to delete your entire store of installed packages and then reinstall everything you need again (preferably all at the same time, not with multiple separate invocations of cabal install).
To prevent this from happening again, you could change your work practices to use tools like cabal sandbox or stack, which facilitate having separate package environments for each project.
Tough to know for sure without more details, but I will assume you are using stack and the latest LTS snapshot (8.6 as of the time I'm writing this).
This could be happening because LTS 8.6 has containers-0.5.7.1, and you are attempting to use a function that is in a newer version (containers-0.5.10.1) which hasn't made its way from Hackage to Stackage yet.
To resolve this, modify your stack.yaml file to include:
extra-deps:
- containers-0.5.10.1

cabal hell with dependencies of ghc-baked in packages

I have the following instance of cabal hell:
(with ghc-7.8.3 built from source on x86_64 GNU/Linux,
and user-install: True in .cabal/config)
1) at some time, transformers-0.4.0.0 was installed (in user space, shadowing (?) transformers-0.3 from the global space)
2) later, several libraries pick transformers-0.4
3) then, I install hint, which depends on ghc, which depends on transformers-0.3, and which cannot be changed, since ghc is hard-wired.
result: I cannot use libraries from 2) and hint in one project.
As a work-around, I am putting constraint: transformers installed in .cabal/config, and rebuilding. Is there a better way to handle this situation - or to avoid it in the first place?
Is there a better way to handle this situation?
No, your approach is sensible.
or to avoid it in the first place?
Tricky. Most people do not build stuff depending on ghc, so for them it makes sense to upgrade transformers etc. Therefore, your constraint is not a suitable default.
As Zeta writes: Sandboxes can help. If you had used sandboxes for your installations in (2), and used another sandbox for whatever tries to use both hint and (2), then cabal would simply build a dedicated copy of these dependencies for whatever you are building.
This comes at the expense of not sharing any space or build-time between the various things you are doing.
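The conflict in the question can be sketched as a check on the combined dependency closure of a project. This is only an illustration: `some-lib` stands for any of the libraries from step 2, the closures are written out by hand, and `conflicts` is a hypothetical helper, not something cabal exposes.

```haskell
import Data.List (nub, sort)

type Dep = (String, String)   -- (package name, version it was built against)

-- Hand-written, hypothetical dependency closures for the two halves of the project.
hintClosure, libsClosure, libsPinned :: [Dep]
hintClosure = [("ghc", "7.8.3"), ("transformers", "0.3")]           -- fixed by the compiler
libsClosure = [("some-lib", "1.0"), ("transformers", "0.4.0.0")]    -- built as in step 2
libsPinned  = [("some-lib", "1.0"), ("transformers", "0.3")]        -- rebuilt under the constraint

-- Packages that the combined closure would need at more than one version.
conflicts :: [Dep] -> [(String, [String])]
conflicts ds =
  [ (name, versions)
  | name <- nub (map fst ds)
  , let versions = sort (nub [ v | (n, v) <- ds, n == name ])
  , length versions > 1
  ]
```

Here `conflicts (hintClosure ++ libsClosure)` reports transformers at both 0.3 and 0.4.0.0, while `conflicts (hintClosure ++ libsPinned)` is empty. Pinning transformers to the installed version, or rebuilding the libraries in a dedicated sandbox, gets you to the second situation.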

How can one make a private copy of Hackage

I'd like to snapshot the global Hackage database into a frozen, smaller one for my company's deploys. How can one most easily copy out some segment of Hackage onto a private server?
Here's one script that does it in just about the simplest way possible: https://github.com/jamwt/mirror-hackage
You can also use the MirrorClient directly from the hackage2 repo: http://code.haskell.org/hackage-server/
This is not an answer to the question in the title but an answer to my interpretation of what the OP wishes to achieve.
Depending on the level of stability you want in your production cycle, you can approach the problem in several ways.
I have split the dependencies into two parts. The first is whatever I can use from the Haskell Platform (keep a copy of every Platform release used in production). The second is a small number of packages outside of it; don't let anyone (including yourself) add more packages to your dependency tree out of mere developer laziness. Collect these extra packages from Hackage with some kind of script that uses cabal fetch, locking each one to a specific version, and keep the downloaded packages safe. Create an install script that uses your saved packages, and when a new machine (or developer) is added to your team, use that script.
yackage is great, but it all comes down to how you ship your product. If you have older versions in production, you need a yackage setup for every version, and that could be quite annoying after a couple of years.
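The version-locked fetch step could be driven from a small list of approved pins. A minimal sketch, assuming made-up package names and versions: `approved`, `pinned`, and `fetchCommand` are hypothetical, and only the cabal fetch command itself comes from the text above.

```haskell
type Pin = (String, String)   -- (package name, exact version)

-- Illustrative pins only; foo and bar are not real recommendations.
approved :: [Pin]
approved = [("foo", "1.0.0"), ("bar", "2.1.3")]

-- Render a pin in the name-version form that cabal accepts.
pinned :: Pin -> String
pinned (name, version) = name ++ "-" ++ version

-- Command line for one "cabal fetch" invocation covering every pin.
fetchCommand :: [Pin] -> [String]
fetchCommand pins = "cabal" : "fetch" : map pinned pins
```

`unwords (fetchCommand approved)` gives `cabal fetch foo-1.0.0 bar-2.1.3`, which can be written into the install script. Keeping the pins in one list means a new dependency has to be added deliberately.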
You can download Hackage with Voker57's hackage-mirror.sh. You'll need 'curl' for it to run. If you're using a Debian based Linux distribution, you can install curl by typing apt-get install curl.
Though it's not a segment of Hackage, I've written a bash script that downloads the whole of Hackage, which can then easily be set up as a mirror using an HTTP server. It also downloads everything else required, such as GHC compilers ready to be used with Stack.
Currently, a complete Hackage mirror occupies ~10GiB (~100000 packages of all versions) and the Stack-related material, such as GHC compilers, ~21GiB (~200 files). Subsequent runs of the script skip what has already been downloaded, so only new files are fetched. That makes it a pretty convenient way to "live offline" and sync up when online.
