Haskell: Coherence rules wrt. external code

In Rust, I am not allowed (for good reasons) to, for example, implement libA::SomeTrait for libB::SomeType.
Are there similar rules enforced in Haskell, or does Haskell just sort of unify the whole program and make sure it's coherent as a whole?
Edit: To phrase the question more clearly in Haskell terms, can you have an orphan instance where the class and the type are defined in modules from external packages (say, from Hackage)?

It looks like you can have an orphan instance wrt. modules. Can the same be done wrt. packages?
Yes. In Haskell, the compiler doesn't really concern itself with packages at all, except for looking up which modules are available. (Basically, Cabal or Stack tells GHC which ones are available, and GHC just uses them.) And an available module in another package is treated exactly the same as a module in your own package.
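As a concrete (if ill-advised) illustration, a single module can define an instance for a class and a type that both live in another package. In this sketch, both the class (Show) and the type (the function type) come from base:

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# OPTIONS_GHC -Wno-orphans #-}
module Main where

-- An orphan instance: both the class (Show) and the type (the function
-- type) come from base, another package, yet GHC accepts the instance
-- here, normally with just a warning (silenced above).
instance Show (Int -> Int) where
  show _ = "<Int -> Int function>"

main :: IO ()
main = print ((+ 1) :: Int -> Int)
```

GHC only warns about the orphan; nothing about the package boundary makes it an error.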
Most often, this is not a problem – people are expected to be sensible regarding what instances they define. If they do define a bad instance, it only matters for people who use both libA and libB, and they should know.
If you do have a particular reason why you want to prevent somebody from instantiating a class, you should just not export it. (You can still export functions with that class in their constraint.)
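A minimal sketch of that technique (all names here are invented): the class stays out of the export list, so downstream code can call the exported function but cannot define new instances.

```haskell
module Main (main, double) where
-- Note: the class Sealed is deliberately absent from the export list,
-- so importers cannot write new instances of it.

class Sealed a where
  val :: a -> Int

instance Sealed Int where
  val = id

-- Functions may still mention the hidden class in their constraints.
double :: Sealed a => a -> Int
double x = val x * 2

main :: IO ()
main = print (double (21 :: Int))  -- prints 42
```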

Related

Runtime "type terms" in LiquidHaskell vs. Idris

I have been playing around with LiquidHaskell and Idris lately, and I have a pretty specific question that I could not find a definitive answer to anywhere.
Idris is a dependently typed language, which is great for the most part. However, I read that some type terms can "leak" from compile time to run time, even though Idris does its best to erase those terms (this is even advertised as a special feature). This erasure is not perfect, and sometimes terms do survive. If, why, and when this happens is not immediately clear from the code, and it sometimes affects run-time performance.
I have seen people prefer Haskell's type system because this cannot happen there: when type checking is done, it is done. Types are "thrown away" and not used at run time.
What is the story for LiquidHaskell? It enhances the type system's capabilities quite a bit over traditional Haskell. Does LiquidHaskell also inject run-time bits for certain type "constellations", or does it (as I suspect) just add another layer of "better" types over Haskell that does not affect the run time in any shape or form?
Meaning, if one removes the special LiquidHaskell type annotations and compiles with standard GHC, is the code produced always the same? In other words: is the LiquidHaskell extension compile-time only?
If yes, this seems like the best of both worlds. Or is LiquidHaskell just not as expressive as Idris's type system, and therefore manages without run-time terms?
To answer your question as asked: Liquid Haskell allows you to provide annotations that a separate tool from the compiler verifies. Code is still compiled exactly the same way.
However, I have quibbles with your question as asked. It can be argued that there are cases in which some residue of the type must survive at run time in Haskell - especially when polymorphic recursion is involved. Consider this function:
lots :: Show a => Int -> a -> String
lots 0 x = show x
lots n x = lots (n-1) (x,x)
There is no way to statically determine the exact type involved in the use of show. Something derived from the types must survive to runtime. In practice, this is easy to do by using type class dictionaries. The theoretically important detail is that there's still type-directed behavior being chosen at runtime.
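Under the usual dictionary-passing model, that run-time residue can be made explicit by hand. This is only a sketch of what GHC arranges; ShowDict and lotsD are invented names:

```haskell
-- A hand-rolled Show dictionary, sketching what GHC's dictionary passing
-- does behind the scenes:
data ShowDict a = ShowDict { showD :: a -> String }

showInt :: ShowDict Int
showInt = ShowDict show

showPair :: ShowDict a -> ShowDict (a, a)
showPair d = ShowDict (\(x, y) -> "(" ++ showD d x ++ "," ++ showD d y ++ ")")

-- Each recursive call builds a bigger dictionary at run time: a value
-- derived from the type genuinely survives past type checking.
lotsD :: ShowDict a -> Int -> a -> String
lotsD d 0 x = showD d x
lotsD d n x = lotsD (showPair d) (n - 1) (x, x)

main :: IO ()
main = putStrLn (lotsD showInt 2 (1 :: Int))  -- prints ((1,1),(1,1))
```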

Should I make my Haskell modules Safe by default?

Is there any downside to marking the modules Safe? Should it be the assumed default?
As you can read in the GHC user manual, if you mark your modules Safe, you are restricted to a certain "safe" subset of the Haskell language.
You can only import modules which are also marked Safe (which rules out functions like unsafeCoerce or unsafePerformIO).
You can't use Template Haskell.
You can't use FFI outside of IO.
You can't use GeneralisedNewtypeDeriving.
You can't manually implement the Generic typeclass for the types you define in your module.
etc.
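For instance, a minimal module opting into the safe subset looks like this; adding, say, an import of System.IO.Unsafe to it would be rejected at compile time:

```haskell
{-# LANGUAGE Safe #-}
-- A minimal module compiled in the safe subset of the language.
module Main where

main :: IO ()
main = print (map (* 2) [1, 2, 3 :: Int])  -- prints [2,4,6]
```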
In exchange, the users of the module gain a bunch of guarantees (quoting from the manual):
Referential transparency — The types can be trusted. Any pure function is guaranteed to be pure; evaluating it is deterministic and won't cause any side effects. Functions in the IO monad are still allowed and behave as usual. So, for example, the unsafePerformIO :: IO a -> a function is disallowed in the safe language to enforce this property.
Module boundary control — Only symbols that are publicly available through other modules' export lists can be accessed in the safe language. Values using data constructors not exported by the defining module cannot be examined or created. As such, if a module M establishes some invariants through careful use of its export list, then code written in the safe language that imports M is guaranteed to respect those invariants.
Semantic consistency — For any module that imports a module written in the safe language, expressions that compile both with and without the safe import have the same meaning in both cases. That is, importing a module written in the safe language cannot change the meaning of existing code that isn't dependent on that module. So, for example, there are some restrictions placed on the use of OverlappingInstances, as these can violate this property.
Strict subset — The safe language is strictly a subset of Haskell as implemented by GHC. Any expression that compiles in the safe language has the same meaning as it does when compiled in normal Haskell.
Note that safety is inferred. If your module is not marked with any of Safe, Trustworthy, or Unsafe, GHC will infer whether it is Safe or Unsafe. When you set the Safe flag, GHC will issue an error if it decides the module is not actually safe. You can also set -Wunsafe, which emits a warning if a module is inferred to be unsafe. If you let safety be inferred, your module will continue to compile even if your dependencies' safety statuses change; if you write it out, you promise your users that the safety status is stable and dependable.
One use case described in the manual refers to running "untrusted" code. If you provide extension points of any kind in your product and you want to make sure that those capabilities aren't used to attack your product, you can require the code for the extension points to be marked Safe.
You can mark your module Trustworthy, and this doesn't restrict you in any way while implementing it. Your module may then be used from Safe code, and it is your responsibility not to violate the guarantees that Safe code is supposed to give. So this is a promise that you, the author of the module, make. You can use the flag -fpackage-trust to enable extra checks while compiling modules marked as Trustworthy, as described here.
So, if you write normal libraries and don't have a good reason to care about Safe Haskell, then you probably shouldn't care. If your modules are safe, that's fine and can be inferred. If not, there is probably a reason, such as a use of unsafePerformIO, which is also fine. If you know your module will be used in a way that requires it to compile under -XSafe (e.g. plugins, as above), you should mark it. In all other cases, don't bother.

Haskell compiler magic: what requires a special treatment from the compiler?

When trying to learn Haskell, one of the difficulties that arises is telling when something requires special magic from the compiler. One example that comes to mind is the seq function, which can't be defined in ordinary Haskell: you can't write a seq2 function behaving exactly like the built-in seq. Consequently, when teaching someone about seq, you need to mention that it is special because the compiler treats it specially.
Another example would be the do-notation which only works with instances of the Monad class.
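For example, a do-block is rewritten mechanically into a chain of >>= calls and lambdas, which is why it works for any Monad instance:

```haskell
-- The do-block is pure syntax: GHC mechanically rewrites it into the
-- chain of >>= calls below.
sugared :: Maybe Int
sugared = do
  x <- Just 2
  y <- Just 3
  pure (x + y)

desugared :: Maybe Int
desugared = Just 2 >>= \x -> Just 3 >>= \y -> pure (x + y)

main :: IO ()
main = print (sugared == desugared)  -- prints True (both are Just 5)
```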
Sometimes, it's not always obvious. For instance, continuations: does the compiler know about Control.Monad.Cont, or is it plain old Haskell that you could have invented yourself? In this case, I think nothing special is required from the compiler, even if continuations are a very strange kind of beast.
Language extensions set aside, what other compiler magic should Haskell learners be aware of?
Nearly all the GHC primitives that cannot be implemented in userland are in the ghc-prim package (it even has a module called GHC.Magic!), so browsing it will give a good sense.
Note that you should not use this module in userland code unless you know exactly what you are doing. Most of the usable stuff from it is exported in downstream modules in base, sometimes in modified form. Those downstream locations and APIs are considered more stable, while ghc-prim makes no guarantees as to how it will act from version to version.
The GHC-specific stuff is reexported in GHC.Exts, but plenty of other things go into the Prelude (such as basic data types, as well as seq) or the concurrency libraries, etc.
Polymorphic seq is definitely magic. You can implement seq for any specific type, but only the compiler can implement one function for all possible types [and avoid optimising it away even though it looks no-op].
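For instance, a seq restricted to Int can be written with ordinary pattern matching, which forces the argument to weak head normal form (seqInt is an invented name):

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A seq that works only at Int: matching against the literal 0 forces
-- the first argument, whichever equation ends up applying.
seqInt :: Int -> b -> b
seqInt 0 y = y
seqInt _ y = y

main :: IO ()
main = do
  -- Demonstrate the forcing: the error in the first argument is raised.
  r <- try (evaluate (seqInt (error "boom") "unreached"))
         :: IO (Either SomeException String)
  putStrLn (either (const "first argument was forced") id r)
```

Unlike the built-in seq, this trick must be repeated per type; only the compiler can provide one function for all types.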
Obviously the entire IO monad is deeply magic, as is everything to do with concurrency and parallelism (par, forkIO, MVar), mutable storage, exception throwing and catching, querying the garbage collector and run-time stats, etc.
The IO monad can be considered a special case of the ST monad, which is also magic. (It allows truly mutable storage, which requires low-level stuff.)
The State monad, on the other hand, is completely ordinary user-level code that anybody can write. So is the Cont monad. So are the various exception / error monads.
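As a sketch, here is a minimal State monad written with nothing beyond the Prelude; this is roughly what the transformers package provides, not its exact code:

```haskell
-- A State monad in plain user-level Haskell: no compiler support needed.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State (\s -> let (a, s') = g s in (f a, s'))

instance Applicative (State s) where
  pure a = State (\s -> (a, s))
  State f <*> State g =
    State (\s -> let (h, s') = f s; (a, s'') = g s' in (h a, s''))

instance Monad (State s) where
  State g >>= k = State (\s -> let (a, s') = g s in runState (k a) s')

get :: State s s
get = State (\s -> (s, s))

put :: s -> State s ()
put s = State (\_ -> ((), s))

main :: IO ()
main = print (runState (do { n <- get; put (n + 1); pure n }) (41 :: Int))
-- prints (41,42)
```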
Anything to do with syntax (do-blocks, list comprehensions) is hard-wired into the language definition. (Note, though, that some of these respond to LANGUAGE RebindableSyntax, which lets you change what functions it binds to.) Also the deriving stuff; the compiler "knows about" a handful of special classes and how to auto-generate instances for them. Deriving for newtype works for any class though. (It's just copying an instance from one type to another identical copy of that type.)
Arrays are hard-wired. Much like every other programming language.
All of the foreign function interface is clearly hard-wired.
STM can be implemented in user code (I've done it), but it's currently hard-wired. (I imagine this gives a significant performance benefit. I haven't tried actually measuring it.) But, conceptually, that's just an optimisation; you can implement it using the existing lower-level concurrency primitives.

Is there a Data.Binary instance for Data.Time.Calendar.Day?

Is there a Data.Binary instance for Data.Time.Calendar.Day?
More generally, what is one supposed to do if a Binary instance has not been provided for a particular datatype in a widely used library?
If you just create a Binary instance and place it in your own module, you'll be creating an orphan instance, which can cause a lot of confusion later: anyone who imports your module will drag along that orphan instance, and it may conflict with their understanding of how dates should be made Binary.
If you have a very canonical instance, try pushing it to the library author. It's easy to add the instance if it's a good idea and it can benefit anyone who uses that library.
If that isn't an option (or if you have a non-canonical instance) then you probably want to create a newtype wrapper. They're "free" in that the compiler deletes them automatically, but they allow a type to take on an entirely new identity with new typeclass instances.
I've done this before to handle particular parses of, for instance, "dates in this format" as opposed to Day broadly.
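A sketch of the newtype trick, using an invented stand-in class Serialize instead of Data.Binary's Binary (and a tuple instead of Day) to stay self-contained:

```haskell
-- Serialize stands in for Data.Binary's Binary class, and the tuple
-- stands in for Data.Time.Calendar.Day, to keep the sketch dependency-free.
class Serialize a where
  encode :: a -> String

-- The newtype is ours, so the instance lives with its type: no orphan.
newtype WireDay = WireDay (Integer, Int, Int)

instance Serialize WireDay where
  encode (WireDay (y, m, d)) = show y ++ "-" ++ show m ++ "-" ++ show d

main :: IO ()
main = putStrLn (encode (WireDay (2024, 1, 2)))  -- prints 2024-1-2
```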

How to know when an apparently pure Haskell interface hides unsafe operations?

I have been reading about unsafePerformIO lately, and I would like to ask you something. I'm OK with the fact that a real language should be able to interact with the external environment, so unsafePerformIO is somewhat justified.
However, to the best of my knowledge, there is no quick way to know whether an apparently pure (judging from the types) interface/library is really pure, short of inspecting the code in search of calls to unsafePerformIO (the documentation might not mention it).
I know it should be used only when you're sure that referential transparency is guaranteed, but I would like to know about it nevertheless.
There's no way without checking the source code. But that isn't too difficult, as Haddock gives a nice link directly to syntax-highlighted definitions right in the documentation. See the "Source" links to the right of the definitions on this page for an example.
Safe Haskell is relevant here; it's used to compile Haskell code in situations where you want to disallow usage of unsafe functionality. If a module uses an unsafe module (such as System.IO.Unsafe) and isn't specifically marked as Trustworthy, it'll inherit its unsafe status. But modules that use unsafePerformIO will generally be using it safely, and thus declare themselves Trustworthy.
In the case you have in mind, the use of unsafePerformIO is unjustified. The documentation for unsafePerformIO explains this: it is only meant for cases where the implementer can prove that there is no way of breaking referential transparency, i.e., "purely functional" semantics. In other words, if anybody uses unsafePerformIO in a way that a purely functional program can detect (e.g., to write a function whose result depends on more than just its arguments), then that is a disallowed usage.
If you run into such a case, the most likely possibility is that you've found a bug.
