What exactly is considered a breaking change to a PureScript library? - semantic-versioning

The Rust community has a fairly detailed description of their interpretation of Semantic Versioning.
The PureScript community has this, which includes:
We should write a semver tutorial for beginners, specifically its use in PureScript and the way we rely on ~-versions.
The odd thing is that looking at an assortment of 65 randomish purescript libraries, they all use ^-versions rather than ~-versions, but I have been unable to find any newer documentation and we recently had our build broken due to a mismatch in expectations.
Does the PureScript community have a reasonably consistent interpretation of semver, specifically regarding what is or isn't considered a breaking change? If so, what is it?

We don't have an exhaustive list anywhere, no. Now's as good a time as any to start one!
Taking advantage of features that require a newer compiler than when the current version was released.
Adding a dependency.
Removing a dependency.
Bumping a dependency's major version.
Deleting or renaming a module.
Removing a member (that means anything - type, value, class, kind, operator) from a module (either by hiding the export or deleting it).
Changing a type signature of an existing function or value in a way that means it won't unify with the previous version (so it is allowable to make types more general, but not less so).
Adding, removing, or altering the kind of type variables for a type.
Adding, removing, or altering data constructors for a type (unless the type does not export its constructors).
Adding or removing members of a type class declaration.
Changing the expected type parameters of a class.
Adding or altering functional dependencies of a class.
Changing the laws of a class.
Removing instances of a class.
Pretty much anything other than adding new members (or re-exports) to a module is considered a breaking change!
Occasionally we've made changes that are technically breaking (due to type signature changes), but done so to fix something that was completely unusable without the fix. In those cases they've gone out as patch bumps, but those cases are very rare. They tend only to occur when the FFI is involved.
Re: ~ vs ^... I think at the time the linked page was made there wasn't the option to use ^ in Bower (or it didn't default to that at least). ^ is the preferred/recommended range to use for libraries now.

Related

What exactly is considered a breaking change to a library crate?

Rust crates use Semantic Versioning. As a consequence, each release with a breaking change should result in a major version bump. A breaking change is commonly considered something that may break downstream crates (code the depends on the library in question).
However, in Rust a whole lot has the potential of breaking downstream crates. For example, changing (including merely adding to) the set of public symbols is possibly a breaking change, because downstream crates can use glob-imports (use foo::*;) to pull symbols of our library into their namespace. Thus, adding symbols can break dependent crates as well; see this example.
Similarly, changing (adding or changing the version) the set of our dependencies can break downstream builds. You can also imagine that the downstream crate relies on a specific size of one of our public types. This is rarely, if at all, useful; I just want to show: everything could be a breaking change, if only the downstream crate tries hard enough.
Is there any guideline about this? What exactly is considered a breaking change and what not (because it's considered "the user's fault")?
There is a Rust RFC on this subject: RFC 1105: API Evolution. It's applicable to any Rust library project, and it covers all kinds of changes (not just breaking changes) and how they impact semantic versioning. I'll try to summarize the key points from the RFC in order to not make this answer a link-only answer. :)
The RFC acknowledges that pretty much any change to a library can cause a client to suddenly stop compiling. As such, it defines a set of major changes, which require a bump of the major version number, and a set of minor changes, which require a bump of the minor version number; not all breaking changes are major changes.
The key attribute of a minor change is that there must be a way that clients can avoid the breakage in advance by altering slightly their source code (e.g. change a glob import to a non-glob import, disambiguate an ambiguous call with UFCS, etc.) in such a way that the code is compatible with the version prior to the change and with the version that includes the change (assuming that it's a minor release). A minor change must also not force downstream crates to make major breaking changes in order to resolve the breakage.
The major changes defined in the RFC (as of commit 721f2d74) are:
Switching your project from being compatible with the stable compiler to only being compatible with the nightly compiler.
Renaming, moving or removing any public item in a module.
Adding a private field to a struct when all current fields are public.
Adding a public field to a struct that has no private fields.
Adding new variants to an enum.
Adding new fields to an enum variant.
Adding a non-defaulted item to a trait.
Any non-trivial change to the signature of a trait item.
Implementing a fundamental trait on an existing type.
Tightening bounds on an existing type parameter.
Adding or removing arguments to a function.
Any other breaking change that is not listed as a minor change in the RFC.
The minor changes defined in the RFC (as of commit 721f2d74, breaking unless specified) are:
Altering the use of Cargo features on a crate.
Adding new public items in a module.
Adding or removing private fields in a struct when at least one already exists (before and after the change) [not breaking].
Turning a tuple struct with all private fields (with at least one field) into a normal struct, or vice versa.
Adding a defaulted item to a trait.
Adding a defaulted type parameter to a trait [not breaking].
Implementing any non-fundamental trait on an existing type.
Adding any item to an inherent impl.
Changing an undocumented behavior of a function.
Loosening bounds on an existing type parameter [not breaking].
Adding defaulted type parameters to a type or trait [not breaking].
Generalizing an existing struct or enum field by replacing its type by a new type parameter that defaults to the previous type [breaking until issue 27336 is fixed].
Introducing a new type parameter to an existing function.
Generalizing a parameter or the return type of an existing function by replacing the type by a new type parameter that can be instantiated to the previous type.
Introducing new lint warnings/errors.
See the RFC for explanations and examples.

Why can't I replace libraries distributed with GHC? What would happen if I did?

I see in this answer and this one that "everything will break horribly" and Stack won't let me replace base, but it will let me replace bytestring. What's the problem with this? Is there a way to do this safely without recompiling GHC? I'm debugging a problem with the base libraries and it'd be very convenient.
N.B. when I say I want to replace base I mean with a modified version of base from the same GHC version. I'm debugging the library, not testing a program against different GHC releases.
Most libraries are collections of Haskell modules containing Haskell code. The meaning of those libraries is determined by the code in the modules.
The base package, though, is a bit different. Many of the functions and data types it offers are not implemented in standard Haskell; their meaning is not given by the code contained in the package, but by the compiler itself. If you look at the source of the base package (and the other boot libraries), you will see many operations whose complete definition is simply undefined. Special code in the compiler's runtime system implements these operations and exposes them.
For example, if the compiler didn't offer seq as a primitive operation, there would be no way to implement seq after-the-fact: no Haskell term that you can write down will have the same type and semantics as seq unless it uses seq (or one of the Haskell extensions defined in terms of seq). Likewise many of the pointer operations, ST operations, concurrency primitives, and so forth are implemented in the compiler themselves.
Not only are these operations typically unimplementable, they also are typically very strongly tied to the compiler's internal data structures, which change from one release to the next. So even if you managed to convince GHC to use the base package from a different (version of the) compiler, the most likely outcome would simply be corrupted internal data structures with unpredictable (and potentially disastrous) results -- race conditions, trashing memory, space leaks, segfaults, that kind of thing.
If you need several versions of base, just install several versions of GHC. It's been carefully architected so that multiple versions can peacefully coexist on a single machine. (And in particular installing multiple versions definitely does not require recompiling GHC or even compiling GHC a first time, which seems to be your main concern.)

Dropping a typeclass instance and the package version policy

I've been asked to drop my dependency on system-filepath.
My package defines a typeclass Arguable, and defines an instance for Filesystem.Path's FilePath type. No system-filepath means no Filesystem.Path means no FilePath, so by dropping this dependency, I'd be changing my API to no longer provide the Arguable instance.
How does that line up with the PVP? Is this a major version change?
Yes, it's a major version change. The Haskell wiki page on the PVP states about A.B.C version numbers (relevant phrase bolded):
If any entity was removed, or the types of any entities or the definitions of datatypes or classes were changed, or orphan instances were added or any instances were removed, then the new A.B must be greater than the previous A.B. Note that modifying imports or depending on a newer version of another package may cause extra orphan instances to be exported and thus force a major version change.
Otherwise, if only new bindings, types, classes, non-orphan instances or modules (but see below) were added to the interface, then A.B may remain the same but the new C must be greater than the old C. Note that modifying imports or depending on a newer version of another package may cause extra non-orphan instances to be exported and thus force a minor version change.
Otherwise, A.B.C may remain the same (other version components may change).

Haskell without types

Is it possible to disable or work around the type system in Haskell? There are situations where it is convenient to have everything untyped as in Forth and BCPL or monotyped as in Mathematica. I'm thinking along the lines of declaring everything as the same type or of disabling type checking altogether.
Edit: In conformance with SO principles, this is a narrow technical question, not a request for discussion of the relative merits of different programming approaches. To rephrase the question, "Can Haskell be used in a way such that avoidance of type conflicts is entirely the responsibility of the programmer?"
Also look at Data.Dynamic which allows you to have dynamically typed values in parts of your code without disabling type-checking throughout.
GHC 7.6 (not released yet) has a similar feature, -fdefer-type-errors:
http://hackage.haskell.org/trac/ghc/wiki/DeferErrorsToRuntime
It will defer all type errors until runtime. It's not really untyped but it allows almost as much freedom.
Even with fdefer-type-errors one wouldn't be avoiding the type system. Nor does it really allow type independence. The point of the flag is to allow code with type errors to compile, so long as the errors are not called by the Main function. In particular, any code with a type error, when actually called by a Haskell interpreter, will still fail.
While the prospect of untyped functions in Haskell might be tempting, it's worth noting that the type system is really at the heart of the language. The code proves its own functionality in compilation, and the rigidity of the type system prevents a large number of errors.
Perhaps if you gave a specific example of the problem you're having, the community could address it. Interconverting between number types is something that I've asked about before, and there are a number of good tricks.
Perhaps fdefer-type-errors combined with https://hackage.haskell.org/package/base-4.14.1.0/docs/Unsafe-Coerce.html offers what you need.

Haskell FFI interlibrary dependencies

I maintain the augeas FFI library at http://hackage.haskell.org/package/augeas
Recently augeas added an aug_to_xml method that includes a parameter with type xmlNode from libmxl2. It looks like libxml is the FFI library for libxml2, but it hasn't been updated in a while, and it doesn't look to have Debian packaging, so I'm hesitant to add it as a dependency to the augeas FFI library.
So my question is when I add FFI support for this function, would it be better to add the dependency to libxml, which might lead to packaging problems later on, or would it be better to use something like a opaque type as per the FFI cookbook so there's no interlibrary dependency ?
If I go with the opaque type approach, and the users want like to use libxml on their own, could they go about casting my type as a Text.XML.LibXML.Node ?
An opaque type is probably the best route here if you want to include the function, but I'd be sceptical of including a function that is only usable with unsafe coercion to another library's type (which would be indeed possible, yes, but would rely on the internal representation of the libxml binding's Node to not change — risky).
I would suggest simply not adding the function; if someone wants to use it, they can easily import it themselves, and if your binding is appropriately direct, it'll probably be easy for them to use it with your binding's types. Of course, if it's likely to be commonly-used, you could easily bundle this up into a package by itself, though I highly doubt a package that was last updated in 2008 and doesn't even build on GHC 6.12 onwards is going to get much use.
So, I'd just omit the function from your binding, or use an opaque type if you really do want to include it.

Resources