Haskell FFI interlibrary dependencies - haskell

I maintain the augeas FFI library at http://hackage.haskell.org/package/augeas
Recently augeas added an aug_to_xml method that includes a parameter with type xmlNode from libmxl2. It looks like libxml is the FFI library for libxml2, but it hasn't been updated in a while, and it doesn't look to have Debian packaging, so I'm hesitant to add it as a dependency to the augeas FFI library.
So my question is when I add FFI support for this function, would it be better to add the dependency to libxml, which might lead to packaging problems later on, or would it be better to use something like a opaque type as per the FFI cookbook so there's no interlibrary dependency ?
If I go with the opaque type approach, and the users want like to use libxml on their own, could they go about casting my type as a Text.XML.LibXML.Node ?

An opaque type is probably the best route here if you want to include the function, but I'd be sceptical of including a function that is only usable with unsafe coercion to another library's type (which would be indeed possible, yes, but would rely on the internal representation of the libxml binding's Node to not change — risky).
I would suggest simply not adding the function; if someone wants to use it, they can easily import it themselves, and if your binding is appropriately direct, it'll probably be easy for them to use it with your binding's types. Of course, if it's likely to be commonly-used, you could easily bundle this up into a package by itself, though I highly doubt a package that was last updated in 2008 and doesn't even build on GHC 6.12 onwards is going to get much use.
So, I'd just omit the function from your binding, or use an opaque type if you really do want to include it.

Related

Severity of GHC multiple package version warning

GHC warns when a package depends on different instances of the same package via dependencies, e.g.:
Configuring tasty-hspec-1.1.5.1...
Warning:
This package indirectly depends on multiple versions of the same package.
This is very likely to cause a compile failure.
package hspec-core (hspec-core-2.5.5-H06vLnMfEeIEsZFdji6h0O) requires
clock-0.7.2-9qwmBbNbGzEOSffjlyarp
package tasty (tasty-1.1.0.3-I8Vu9v0lHj8Jlg3jpKXavp) requires
clock-0.7.2-Cf9UTsaN2AjEpBnoMpmgkU
Two things are unclear to me with respect to this warning:
If GHC warns, and the compile doesn't fail, is everything fine? That is, could subtly conflicting instances of the same package still cause bad behaviour? (I'm imagining something like a type (Int, Int) in the public interface, with both instances of the package switching the order of the fields.)
Is there a way to make GHC fail for this warning?
That's not GHC what's warning you about multiple package-versions. GHC just compiles the packages that were specified... which hardly anybody ever does by hand, but let Stack or Cabal do it for them. And in this case it's Cabal giving the warning message.
If different versions cause a problem, you will in practice almost always see it at compile-time. It's most often a missing-instance error, because you're e.g. trying to use the class Foo from pkg-1.0 with the type Bar from pkg-2.0. Direct version mismatch of a data type in public interfaces can also happen.
Theoretically, I think it would also be possible to have an error like (Int,Int) meaning two different things, which the compiler would not catch. However, this kind of change is asking for trouble anyways. Whenever the order of some data fields is not completely obvious and might change in the future, a data record should be used to make sure the compiler can catch it. (This is largely orthogonal to the different-versions-of-same-package issue.)
If you want to be safe from any version-mismatch issues, you can use Stack instead of Cabal. I think this is a good part of the reason why many Haskellers prefer Stack.

What are module signatures in Haskell?

I have recently found Haskell's feature called "module signatures". As I have discovered they are put in .hsig files and begin with signature keyword instead of module.
The example syntax of such a file may look like
signature Str where
data Str
empty :: Str
append :: Str -> Str -> Str
However, I cannot imagine how and why one would use them. Could you explain me which problems do they solve and how to properly make use of them?
They strongly remind me the module system that one can see in OCaml (link), which also has modules signatures and separate implementations, but I can't decide how close are these two concepts. Is it somehow related?
They are related to the OCaml module system, but with some important differences:
Signatures are defined within the language (in .hsig files) but unlike in OCaml they are not instantiated within the language. Instead, the package manager controls instantiation (currently, only Cabal provides that). Modules never know if they are importing an abstract signature or an actual module.
Implementation modules know nothing about signatures and do not not reference them directly. Any existing module can implement a signature if the definitions happen to be compatible.
Instantiation is triggered by a coincidence of module name and signature name in the dependencies of some compilation unit (executable, library, test suite...) When the names coincide, a process called "signature matching" takes place that verifies that the types and definitions are compatible.
The "happy path" is that in your program you depend on some library having a signature "hole", and also on another library that provides an implementation module with the same name. Then signature matching happens automatically. When the names don't match, or we need multiple instantiations of the signature-using library, we have to rename signatures and/or modules in the mixins section of the Cabal file.
As for why module signatures might be useful, consider bytestring, the most popular library by far for handling binary data in Haskell. But there are others, for example stdio with its Bytes type.
Suppose you are writing your own library that uses binary data, and you don't want to force your users into either stdio or bytestring. What are your choices?
One would be to create something like a Bytelike class and parameterize all your functions with it. You would also need to add a type parameter to every data type that contains bytes.
Another would be to create a signature that defines an abstract binary data type and all the operations that are required of it. Your library would make use of the signature, and remain "indefinite" until the user depends both on your library and a suitable implementation when creating his own libraries.
From the perspective of the user, the typeclass solution is unsatisfactory. The user knows that he wants to use either ByteString or Bytes, just one of them. The decision will not depend on some runtime flag and will remain constant across the extent of his program. And yet he has to deal with a more complex API that reminds him of that already decided issue at every turn.
It's better if he makes the decision once, writes it in his .cabal file, and deals with a simpler API from then onwards.
As described here, they're quite closely related to OCaml module signatures. They allow you to create a package missing some modules and say these modules, containing such and such types and values, should be delivered by package's user. I haven't tested it myself, but I imagine that such a package works very much like an OCaml functor.

What exactly is considered a breaking change to a PureScript library?

The Rust community has a fairly detailed description of their interpretation of Semantic Versioning.
The PureScript community has this, which includes:
We should write a semver tutorial for beginners, specifically its use in PureScript and the way we rely on ~-versions.
The odd thing is that looking at an assortment of 65 randomish purescript libraries, they all use ^-versions rather than ~-versions, but I have been unable to find any newer documentation and we recently had our build broken due to a mismatch in expectations.
Does the PureScript community have a reasonably consistent interpretation of semver, specifically regarding what is or isn't considered a breaking change? If so, what is it?
We don't have an exhaustive list anywhere, no. Now's as good a time as any to start one!
Taking advantage of features that require a newer compiler than when the current version was released.
Adding a dependency.
Removing a dependency.
Bumping a dependency's major version.
Deleting or renaming a module.
Removing a member (that means anything - type, value, class, kind, operator) from a module (either by hiding the export or deleting it).
Changing a type signature of an existing function or value in a way that means it won't unify with the previous version (so it is allowable to make types more general, but not less so).
Adding, removing, or altering the kind of type variables for a type.
Adding, removing, or altering data constructors for a type (unless the type does not export its constructors).
Adding or removing members of a type class declaration.
Changing the expected type parameters of a class.
Adding or altering functional dependencies of a class.
Changing the laws of a class.
Removing instances of a class.
Pretty much anything other than adding new members (or re-exports) to a module is considered a breaking change!
Occasionally we've made changes that are technically breaking (due to type signature changes), but done so to fix something that was completely unusable without the fix. In those cases they've gone out as patch bumps, but those cases are very rare. They tend only to occur when the FFI is involved.
Re: ~ vs ^... I think at the time the linked page was made there wasn't the option to use ^ in Bower (or it didn't default to that at least). ^ is the preferred/recommended range to use for libraries now.

Why can't I replace libraries distributed with GHC? What would happen if I did?

I see in this answer and this one that "everything will break horribly" and Stack won't let me replace base, but it will let me replace bytestring. What's the problem with this? Is there a way to do this safely without recompiling GHC? I'm debugging a problem with the base libraries and it'd be very convenient.
N.B. when I say I want to replace base I mean with a modified version of base from the same GHC version. I'm debugging the library, not testing a program against different GHC releases.
Most libraries are collections of Haskell modules containing Haskell code. The meaning of those libraries is determined by the code in the modules.
The base package, though, is a bit different. Many of the functions and data types it offers are not implemented in standard Haskell; their meaning is not given by the code contained in the package, but by the compiler itself. If you look at the source of the base package (and the other boot libraries), you will see many operations whose complete definition is simply undefined. Special code in the compiler's runtime system implements these operations and exposes them.
For example, if the compiler didn't offer seq as a primitive operation, there would be no way to implement seq after-the-fact: no Haskell term that you can write down will have the same type and semantics as seq unless it uses seq (or one of the Haskell extensions defined in terms of seq). Likewise many of the pointer operations, ST operations, concurrency primitives, and so forth are implemented in the compiler themselves.
Not only are these operations typically unimplementable, they also are typically very strongly tied to the compiler's internal data structures, which change from one release to the next. So even if you managed to convince GHC to use the base package from a different (version of the) compiler, the most likely outcome would simply be corrupted internal data structures with unpredictable (and potentially disastrous) results -- race conditions, trashing memory, space leaks, segfaults, that kind of thing.
If you need several versions of base, just install several versions of GHC. It's been carefully architected so that multiple versions can peacefully coexist on a single machine. (And in particular installing multiple versions definitely does not require recompiling GHC or even compiling GHC a first time, which seems to be your main concern.)

Importing modules as a function, with string as input

I want to make a function called 'load' which imports definitions of functions from another file. I know how to import modules, but in my program I want the definitions of the functions to change depending on which module is 'loaded' with this new function. Is there a way to do this? Is there a better way to write my program so that this is not necessary?
I think it's type signature would look something like:
load :: String -> IO ()
where the string is the name of the module to be loaded (and the module is in the same directory).
Edit: Thanks for all the replies. Most people agree that this is not the best way to do what I want. Instead, is there a way to declare a global variable from within an I/O program. That is, I want it so that if I type (function "thing") into a function of type String -> IO(), I can still type 'thing' into GHCi to get the value assigned to it... Any suggestions?
There is almost certainly a better way to write your program so that this is not necessary. It's hard to say what without knowing more details about your situation, though. You could, for instance, represent the generic interface each module implements as a data-type, and have each module export a value of that type with the implementation.
Basically, the set of loaded modules is a static, compile-time property, so it makes no sense to want your program's behaviour to change based on its contents. Are you trying to write a library? Your users probably won't appreciate it doing such evil magic to their import lists :) (And it probably isn't possible without Template Haskell in that case, anyway.)
The exception is if you're trying to implement a Haskell tool (e.g. REPL, IDE, etc.) or trying to do plugins; i.e. dynamically-loaded modules of Haskell source code to integrate into your Haskell program. The first thing to try for those should be hint, but you may find you need something more advanced; in that case, the GHC API is probably your best bet. plugins used to be the de-facto standard in this area, but it doesn't seem to compile with GHC 7; you might want to check out direct-plugins, a simplified implementation of a similar interface that does.
mueval might be relevant; it's designed for executing short (one-line) snippets of Haskell code in a safe sandbox, as used by lambdabot.
Unless you're building a Haskell IDE or something like that, you most likely don't need this (^1).
But, in the case you do, there is always the hint-package, which allows you to embed a haskell interpreter into your program. This allows you to both load haskell modules and to convert strings into haskell values at runtime. There is a nice example of how to use it here
^1: If you're looking for a way to make things polymorphic, i.e. changing some, but not all definitions of in your code, you're probably looking for typeclasses.
With regards to your edit, perhaps you might be interested in IORef.

Resources