Suppose I have the following class:
class P a where
  nameOf :: a -> String
I would like to declare that all instances of this class are automatically instances of Show. My first attempt would be the following:
instance P a => Show a where
  show = nameOf
My first attempt to go this way yesterday led me down a rabbit warren of language extensions: I was first told to switch on flexible instances, then undecidable instances, then overlapping instances, and finally got an error about overlapping instance declarations. I gave up and returned to repeating the code. However, this fundamentally seems like a very simple demand, and one that should be easily satisfied.
So, two questions:
Is there a trivially easy way to do this that I've just missed?
Why do I get an overlapping instances problem? I can see why I might need UndecidableInstances, since I seem to be violating the Paterson condition, but there are no overlapping instances around here: there are no instances of P, even. Why does the typechecker believe there are multiple instances for Show Double (as seems to be the case in this toy example)?
You get the overlapping instances error because some of your instances of P may have other instances of Show, and then the compiler won't be able to decide which one to use. If you have an instance of P for Double, then there you go, you get two instances of Show for Double: your general one and the one already declared in Haskell's base library. Note that GHC looks only at the instance head when matching, not at the context, so instance P a => Show a matches every type, Double included, whether or not that type has an instance of P. How this error is triggered is correctly stated by @augustss in the comments to your question. For more info see the specs.
As you already know, there is no way to achieve what you're trying without UndecidableInstances. When you enable that flag you must understand that you're taking over the compiler's responsibility to ensure that no conflicting instances arise. This means that, of course, there mustn't be any other instances of Show produced in your library. This also means that your library won't export the P class, which rules out users of the library declaring conflicting instances.
If your case somehow conflicts with the above, it's a reliable sign that something is wrong with it. And in fact there is...
More fundamentally, what you're trying to achieve is incorrect. You are missing several important points about the Show typeclass that distinguish it from constructs like the toString method of popular OO languages:
From Show's haddock:
The result of show is a syntactically correct Haskell expression containing only constants, given the fixity declarations in force at the point where the type is declared. It contains only the constructor names defined in the data type, parentheses, and spaces. When labelled constructor fields are used, braces, commas, field names, and equal signs are also used.
In other words, declaring an instance of Show, which does not produce a valid Haskell expression, is incorrect per se.
Given the above, it just doesn't make sense to declare a custom instance of Show when the type allows you to simply derive it.
When a type does not allow deriving it (e.g., a GADT), you'll generally still have to stick to type-specific instances to produce correct results.
So, if you need a custom representation function, you shouldn't use Show for that. Just declare a custom class, e.g.:
class Repr a where
  repr :: a -> String
and approach the instances declaration responsibly.
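As a rough sketch of how that separation might look in practice (the Planet type and its strings are hypothetical, made up purely for illustration), the type keeps a derived Show and gets its human-oriented text through Repr:

```haskell
-- Custom representation class, as suggested in the answer.
class Repr a where
  repr :: a -> String

-- Hypothetical example type. Show is derived, so `show` yields
-- a syntactically valid Haskell expression.
data Planet = Mercury | Venus
  deriving Show

-- Type-specific instance with free-form, human-oriented text.
instance Repr Planet where
  repr Mercury = "the planet Mercury"
  repr Venus   = "the planet Venus"

main :: IO ()
main = do
  print Mercury          -- uses the derived Show instance
  putStrLn (repr Venus)  -- uses the custom Repr instance
```

Each type gets its own Repr instance, so no blanket instance and no extensions are needed.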
Related
In Haskell, it's possible to add constraints to a type parameter.
For example:
foo :: Functor f => f a
The question: is it possible to negate a constraint?
I want to say that f can be anything except Functor for example.
UPD:
So it comes from the idea of how to map over the innermost nested Functor.
Let's say I have Functor a where a can be a Functor b or not, and the same rule applies to b.
Reasons why this is not possible: (basically all the same reason, just different aspects of it)
There is an open-world assumption about type classes. It isn't possible to prove that a type is not an instance of a class because even if during compilation of a module, the instance isn't there, that doesn't mean somebody doesn't define it in a module “further down the road”. This could in principle be in a separate package, such that the compiler can't possibly know whether or not the instance exists.
(Such orphan instances are generally quite frowned upon, but there are use cases for them and the language doesn't attempt to prevent this.)
Membership in a class is an intuitionistic property, meaning that you shouldn't think of it as a classical boolean value “instance or not instance” but rather, if you can prove that a type is an instance then this gives you certain features for the type (specified by the class methods). If you can't prove that the type is an instance then this doesn't mean there is no instance, but perhaps just that you're not smart enough to prove it. (Read, “perhaps that nobody is smart enough”.) This ties back to the first point: the compiler not yet having the instance available is one case of “not being smart enough”.
A class isn't supposed to be used for making dispatch on whether or not a type is in it, but for enabling certain polymorphic functions even if they require ad-hoc conditions on the types. That's what class methods do, and they can come from a class instance, but how could they come from a “not-in-class instance”?
Now, all that said, there is a way you can kind of fake this: with an overlapping instance. Don't do it, it's a bad idea, but... that's the closest you can get.
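For illustration only, here is a minimal sketch of what such a fake might look like, assuming GHC 7.10 or later for the OVERLAPPABLE pragma. The Describe class is hypothetical. Note that it dispatches on the shape of the type (list or not), not on the absence of a class instance, and it stops working as soon as the type is not known concretely at the call site:

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical class, used only for this illustration.
class Describe a where
  describe :: a -> String

-- Catch-all instance: chosen for any type that has no more specific instance.
instance {-# OVERLAPPABLE #-} Describe a where
  describe _ = "not a list"

-- More specific instance: chosen whenever the type is known to be a list.
instance Describe [b] where
  describe _ = "a list"

main :: IO ()
main = do
  putStrLn (describe (3 :: Int))  -- falls through to the catch-all instance
  putStrLn (describe "abc")       -- String is [Char], so the list instance is used
```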
The Haskell tutorial states that:
by looking at the type signature of read
read :: Read a => String -> a
it follows that GHCI has no way of knowing which type we want in return when running
ghci> read "4"
Why is it necessary to provide a second value from which GHCI can extract a type to compare with?
Wouldn't it be feasible to check a single value against all possible types of the Read typeclass?
Reference:
http://learnyouahaskell.com/types-and-typeclasses
I think you have a (rather common among beginners - I had it myself) misunderstanding of what type classes are. The way Haskell works is logically incompatible with "check[ing] a single value against all possible types of the Read typeclass". Instance selection is based on types. Only types.
You should not think of read as a magical function that can return many types. It's actually a huge family of functions, and the type is used to select which member of the family to use. It's that direction of dependence that matters. Classes create a case where values (usually functions, but not always) - the things that exist at run time - are chosen based on types - the things that exist at compile time.
You're asking "Why not the other direction? Why can't the type depend on the value?", and the answer to that is that Haskell just doesn't work that way. It wasn't designed to, and the theory it was based on doesn't allow it. There is a theory for that (dependent types), and there are extensions being added to GHC that support an increasing set of features that do some aspect of dependent typing, but it's not there yet.
And even if it was, this example would still not work the way you want. Dependent types still need to know what type something is. You couldn't write a magical "returns anything" version of read. Instead, the type for read would have to involve some function that calculates the type from the value, and inherently only works for the closed set of types that function can return.
Those last two paragraphs are kind of an aside, though. The important part is that classes are ways to go from types to values, with handy compiler support to automatically figure it out for you most of the time. That's all they were designed to do, and it's all that they can do. There are advantages to this design, in terms of ease of compilation, predictability of behavior (open world assumption), and ability to optimize at compile time.
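A small sketch of that types-to-values direction: the same string is handed to read three times, and only the type annotation decides which parser (which member of the family) runs.

```haskell
main :: IO ()
main = do
  -- The annotation selects the Read instance; the string never does.
  print (read "4" :: Int)      -- prints 4
  print (read "4" :: Double)   -- prints 4.0
  print (read "[4]" :: [Int])  -- prints [4]
```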
Wouldn't it be feasible to check a single value against all possible types of the Read typeclass?
Doing that would yield the same result; read "4" can potentially be anything that can be read from a String, and that's what ghci reports:
Prelude> :t read "4"
read "4" :: Read a => a
Until you actually do the parsing, the Read a => a represents a potential parsing result. Remember that typeclasses being open means that this could potentially be any type, depending on the presence of the instances.
It's also entirely possible that multiple types could share the same Show/Read textual representation, which brings me to my next point...
If you wanted to check what type the string can be parsed as, that would at the very least require resolving the ambiguity between multiple types that could accept the given input; which means you'd need to know those types beforehand, which Read can't do. And even if you did that, how do you propose such value be then used? You'd need to pack it into something, which implies that you need a closed set again.
All in all, read's signature is as precise as it can be, given the circumstances.
Not meant as an answer, but this wouldn't fit into a comment cleanly.
In ghci, if you simply do a read "5", then ghci is going to need some help figuring out what you want it to be. However, if that result is being used somewhere, ghci (and Haskell in general) can figure out the type. For (a silly) example:
add1 :: Int -> Int
add1 i = i + 1
five = read "5"
six = add1 five
In that case, there's no need to annotate the read with a type signature, because ghc can infer it from the fact that five is being used in a function that only takes an Int. If you added another function with a different signature that also tried to use five, you'd end up with a compile error:
-- Adding this to our code above
-- Fails to compile
add1Integer :: Integer -> Integer
add1Integer i = i + 1
sixAsInteger = add1Integer five
In object-oriented languages (e.g., Java and Python) we can make objects/instances from classes.
In Haskell we can make instances from type-classes, e.g.:
data ShirtSize = S | M | L -- Here ShirtSize is an enum data-type
class MyEq a where
  (==) :: a -> a -> Bool
instance MyEq ShirtSize where -- Here ShirtSize is an instance of the MyEq type-class
  S == S = True
  M == M = True
  L == L = True
  _ == _ = False
My question is:
What does instance mean in Haskell?
In Java we can make instances from classes, but in Haskell it seems like instances are types (like ShirtSize) which you can apply type-class functions on (e.g. the (==) function from MyEq). Am I right? Also, what is an instance in Haskell compared to an instance/object in Java?
In Java, the class system is a way to group similar objects. An instance of a class is an individual object which belongs to that class.
In Haskell, the class system is (roughly speaking) a way to group similar types. (This is the reason we call them "type classes"). An instance of a class is an individual type which belongs to that class. (That is, until you start considering multiparametric type classes).
Incidentally, a Haskell (monoparametric) class somewhat resembles a Java interface and, by extension, a Java class. Or perhaps a Haskell instance resembles a Java class. It's better to view this as a coincidence. Approach the term keeping its mathematical origins in mind. A class is just a bunch of things that belong together, and an instance is one of these things.
If you're interested in an explanation of type classes and how they differ from Java interfaces, you should read this post by <❤>. It also explains instances.
As for me, I look at an instance as the connection between a data type and an interface. data contains some information, class contains methods. data is about data (sorry for the tautology) and class is about behavior. When you look at a data type you don't see what you can do with it; instead you see what it stores. When you look at a class you see what a type should be able to do, and you don't care what it stores internally. In real programming you actually care about the details of the implementation and how methods are implemented using specific data. So an instance just shows you the relation between some data and some behavior: how this behavior is implemented using the given data.
In case you're interested more in model of type classes then read this blog post: http://www.haskellforall.com/2012/05/scrap-your-type-classes.html
You can look at instances as values! It might blow your mind if you're seeing such a definition for the first time.
In some dependently typed languages instances really are values that you can pass to other functions. Take a look at this question:
In Idris, is "Eq a" a type, and can I supply a value for it?
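The same view can be sketched in plain Haskell by writing the "instance" by hand as an ordinary record of functions and passing it explicitly. The names EqDict, shirtEq and member below are hypothetical, chosen only for this illustration:

```haskell
-- A hand-written "instance": a record holding the method implementation.
newtype EqDict a = EqDict { eq :: a -> a -> Bool }

data ShirtSize = S | M | L

-- The value that plays the role of `instance MyEq ShirtSize`.
shirtEq :: EqDict ShirtSize
shirtEq = EqDict sameSize
  where
    sameSize S S = True
    sameSize M M = True
    sameSize L L = True
    sameSize _ _ = False

-- A function that would normally carry a class constraint
-- takes the dictionary as an ordinary argument instead.
member :: EqDict a -> a -> [a] -> Bool
member dict x = any (eq dict x)

main :: IO ()
main = do
  print (member shirtEq M [S, M])  -- True
  print (member shirtEq L [S, M])  -- False
```

With a real type class the compiler finds and passes this dictionary for you, based on the type.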
it seems like instances are types (like ShirtSize) which you
can apply type-class functions on (e.g the (==) function from MyEq)
Absolutely correct.
In Haskell a type is a defined structure of data. Every value that exists in Haskell code has a defined type. And a type can be made an instance of a class, which means... actually, hold that thought. I want to talk about functions.
Functions have type signatures, defining which type(s) they can be used on. If a function is defined to work on a particular type, then the function can be used on any value that has that type. If a function is defined to work on a particular class, then it can be used on any value of any of the types that are instances of that class.
When you define a class you describe some minimal set of functions (e.g. == in your example) which have to be implemented for all the types that want to be instances of that class. The class defines names and signatures for those functions, and that definition means those names and signatures are fixed. They will be the same for every instance of the class.
But the implementations aren't fixed by the class. They can be different for different types. We make a type into an instance of a class by writing an instance statement, in which we can define how those functions will work. If the class provides a default implementation of a function, then the different instance types can override the default and have their own definitions. And if there is no default, then the instance types must have their own definitions.
Now you have a minimal set of functions which can be called with any value of any of the types. And you can write more functions that work by calling those functions, and build up from there.
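A minimal sketch of those pieces together, using a hypothetical Shape class: one method without a default, one with a default that an instance may override, and a function built on top that works for every instance.

```haskell
-- Hypothetical class: `area` has no default, `name` has one.
class Shape a where
  area :: a -> Double
  name :: a -> String
  name _ = "shape"  -- default implementation

newtype Square = Square Double
newtype Circle = Circle Double

instance Shape Square where
  area (Square s) = s * s
  name _ = "square"  -- overrides the default

instance Shape Circle where
  area (Circle r) = pi * r * r  -- keeps the default `name`

-- Built only from class methods, so it works for any instance.
summary :: Shape a => a -> String
summary x = name x ++ " with area " ++ show (area x)

main :: IO ()
main = do
  putStrLn (summary (Square 2))  -- square with area 4.0
  putStrLn (name (Circle 1))     -- shape
```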
The idea is really useful, but IMHO the terminology is awful. Saying that these types are instances of a class makes it sound as if they're subtypes or child types, inheriting properties from parent types. But it's not like that at all. Being an instance of a class is like being a member of a club. Lots of different, unrelated types can all be instances of the same class. And one type can be an instance of lots of different unrelated classes, all at the same time.
In Rust, they have the same idea, but with the word 'trait' instead of 'class'. Instead of saying "this type is an instance of that class", they would say "this type implements that trait". I think that gets the idea across much better.
Apparently it's a bad idea to put a typeclass constraint on a data declaration [src], [src].
I haven't personally come across a desire to constrain the types within data types I've created, but it's not obvious to me why the language designers "decided it was a bad idea to allow". Why is that?
I haven't personally come across a desire to constrain the types within data types I've created, but it's not obvious to me why the language designers "decided it was a bad idea to allow". Why is that?
Because it was misleading and worked completely backwards from what would actually be useful.
In particular, it didn't actually constrain the types within the data type in the way you're probably expecting. What it did do was put a class constraint on the data constructor itself, which meant that you needed to satisfy the instance when constructing a value... but that was all.
So, for instance, you couldn't simply define a binary search tree with an Ord constraint and then know that any tree has sortable elements; the lookup and insert functions would still need an Ord constraint themselves. All you'd prevent would be constructing an empty tree that "contains" values of some non-ordered type. As far as pattern matching was concerned, there was no constraint on the contained type at all.
On the other hand, the people working on Haskell didn't think that the sensible version (that people tended to assume data type contexts provided) was a bad idea at all! In fact, class constraints on a data type declared with GADT syntax (generalized algebraic data types, enabled in GHC with the GADTs language pragma) do work in the obvious way--you need a constraint to construct the value, and the instance in question also gets stored in the GADT, so that you don't need a constraint to work with values, and pattern matching on the GADT constructor lets you use the instance it captured.
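A minimal sketch of the GADT behaviour described above, using a hypothetical Sorted wrapper around a list rather than a full search tree. The sketch assumes the caller passes an already sorted list; the constructor does not check that.

```haskell
{-# LANGUAGE GADTs #-}

import Data.List (insert)

-- The constructor demands Ord a and stores the instance inside the value.
data Sorted a where
  Sorted :: Ord a => [a] -> Sorted a

-- No Ord constraint in the signature: pattern matching on Sorted
-- brings the captured Ord instance into scope.
insertSorted :: a -> Sorted a -> Sorted a
insertSorted x (Sorted xs) = Sorted (insert x xs)

toList :: Sorted a -> [a]
toList (Sorted xs) = xs

main :: IO ()
main = print (toList (insertSorted 2 (Sorted [1, 3 :: Int])))  -- [1,2,3]
```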
It's not actually a bad idea to add a typeclass constraint on a
data type - it can be very useful, and doesn't break your other code.
The badness is all about the fact that often people expect that they can then
use the data type to excuse them from putting a constraint on functions
that use the data type, but that's not the case.
(You could argue that an implicit constraint can cause problems.)
Putting a constraint on a datatype actually puts it on all the constructors
that mention the constrained type.
Just as with an ordinary function with a constraint, if you use the constructor,
you must add the constraint. I think that's healthy and above board.
It does ensure you can't put data in your data type unless you can do
certain things with it. It's useful. You won't be creating a programming
faux pas by using one, and it's not bad practice, it's just not as lovely as
they wanted.
The "bad idea to allow" is probably because GADTs are really what they would like.
If GADTs had been around first, they wouldn't have done this.
I don't think it's such a bad thing to have both. If you want a state
manipulating function, you can use a permanently explicit parameter you pass around,
or you can use a monad and make it implicit. If you want a constraint on
data you can use a permanently explicit one on a data declaration or an implicit one
with a GADT. GADTs and monads are more sophisticated, but it doesn't make
explicit parameters or data type constraints wrong.
I have a (fairly) legitimate case where there are two type instance implementations, and I want to specify a default one. After noting that doing modular arithmetic with Int types resulted in lots of hash collisions, I want to try GHC's Int64. I have the following code:
class Hashable64 a where
  hash64 :: a -> Int64
instance Hashable64 a => Hashable a where
  hash = fromInteger . toInteger . hash64
instance Hashable64 a => Hashable64 [a] where
  hash64 = foldl1 (\x y -> x + 22636946317 * y) . map hash64
and an instance Hashable64 Char, which thus results in two implementations for Hashable String, namely:
The one defined in Data.Hashable.
Noting that it is a Hashable64 instance, then converting to a regular Int for an instance of Data.Hashable.
The second code path may be better because it performs hashing with Int64s. Can I specify to use this derivation of the instance Hashable String?
Edit 1
Sorry, I forgot to add I have already tried the overlapping instances thing; perhaps I'm just not implementing it correctly? The documentation for overlapping instances says it works when one instance is more specific. But when I try to add a specific instance for Hashable String, the situation doesn't improve. Full code at [ http://pastebin.com/9fP6LUX2 ] (sorry for the superfluous default header).
instance Hashable String where
  hash x = hash (hash64 x)
I get
Matching instances:
instance (Hashable a) => Hashable [a] -- Defined in Data.Hashable
instance [overlap ok] Hashable String
-- Defined at Hashable64.hs:70:9-23
Edit 2
Any other solutions to this specific problem are welcome. A good solution might provide insight into this overlapping instances problem.
This sort of situation is handled by GHC's OverlappingInstances extension. Roughly speaking, this extension allows instances to coexist despite the existence of some type(s) to which both could apply. For such types, GHC will select the "most specific" instance, which is a little fuzzy in some cases but usually does what you'd want it to.
This sort of situation, where you have one or more specialized instances and a single catch-all instance Foo a as the "default", usually works pretty well.
The main stumbling blocks to be aware of with overlapping instances are:
If something forces GHC to select an instance on a polymorphic type that's ambiguous, it will refuse with potentially cryptic compiler errors.
The context of an instance is ignored until after it's been selected, so don't try to distinguish between instances that way; there are workarounds but they're annoying.
The latter point would be relevant here if, for example, you have a list of some type that's not an instance of Hashable64; GHC will select the more specific second instance, then fail because of the context, even if the full type (the list, not the element type) is an instance of Hashable64 and thus would work with the first, generic instance.
Edit: Oh, I see, I misinterpreted the situation slightly, regarding where the instances are coming from. Quoth the GHC User's Guide:
The willingness to be overlapped or incoherent is a property of the instance declaration itself (...). Neither flag is required in a module that imports and uses the instance declaration.
(...)
These rules make it possible for a library author to design a library that relies on overlapping instances without the library client having to know.
If an instance declaration is compiled without -XOverlappingInstances, then that instance can never be overlapped. This could perhaps be inconvenient. (...) We are interested to receive feedback on these points.
...in other words, overlapping is only allowed if the less specific instance was built with OverlappingInstances enabled, which the instance for Hashable [a] was not. So your instance for Hashable a is allowed, but one for Hashable [Char] fails, as observed.
This is a tidy illustration of why the User's Guide finds the current rules unsatisfactory (other rules would have their own problems, so it's not clear what the best approach, if any, would be).
Back in the here-and-now, you have multiple options, which are slightly less convenient than what you'd hoped for. Off the top of my head:
Alternate class: Define your equivalent of the Hashable class, write the overlapped instances you want, and use generic instances with Hashable in the context to fall back to the original as needed. This is problematic if you're using another library that expects Hashable instances, rather than pre-hashed values or an explicit hash function.
Type wrapper: newtype wrappers are something of a "standard" way to disambiguate instances (cf. Monoid). By using such a wrapper around your values, you'll be able to write whatever instances you please because none of the pre-defined instances will match. This becomes problematic if you have a lot of functions that would need to wrap/unwrap the newtype, though keep in mind that you can define other instances (e.g., Num, Show, etc.) for the wrapper easily and there's no overhead at run time.
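A rough sketch of the type-wrapper option. To keep it self-contained it does not import Data.Hashable: MyHashable is a hypothetical stand-in for that class, and the Wide wrapper, the hash64 function and its constants are likewise made up for illustration.

```haskell
import Data.Char (ord)
import Data.Int (Int64)

-- Hypothetical stand-in for the library's Hashable class.
class MyHashable a where
  myHash :: a -> Int

instance MyHashable Char where
  myHash = ord

-- Plays the role of the pre-defined list instance you cannot overlap.
instance MyHashable a => MyHashable [a] where
  myHash = sum . map myHash

-- Wrapper type: none of the existing instances match it,
-- so it can carry its own instance without any overlap.
newtype Wide = Wide String

-- Illustrative 64-bit hash; the constants 7 and 31 are arbitrary.
hash64 :: String -> Int64
hash64 = foldl (\acc c -> acc * 31 + fromIntegral (ord c)) 7

instance MyHashable Wide where
  myHash (Wide s) = fromIntegral (hash64 s)

main :: IO ()
main = do
  print (myHash "ab")         -- uses the list instance
  print (myHash (Wide "ab"))  -- uses the wrapper's instance
```

The cost is the wrapping and unwrapping at call sites; the wrapper itself has no run-time representation.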
There are other, more arcane, workarounds, but I can't offer too much explicit guidance because which is the least awkward tends to be very situation-dependent.
It's worth noting that you're definitely pushing the edges of what can be sensibly expressed with type classes, so it's not surprising that things are awkward. It's not a very satisfying situation, but there's little you can do when constrained to adding instances for a class defined elsewhere.