I have two types, assume they both have monoid implementations. Is there a way to have another type that will be specified to contain an X or a Y? Or is this not the right way to go about this?
data X = X [Int]
data Y = Y Double
The OP has clarified in the comments that they want 'instance behaviour' for some type Either X Y. Typically, you'd use a newtype in this situation:
newtype EitherXY = EitherXY (Either X Y)
In case you're not already aware, newtypes can have a record-like unwrapping function.
newtype EitherXY = EitherXY { unwrap :: Either X Y } deriving (...)
You may also auto-derive certain type classes (as with data declarations). The set of derivable classes depends on the compiler version and the set of enabled extensions. I won't elaborate on it here.
It's probably better to just do
data X_Or_Y = InX X | InY Y
This type is isomorphic to Either X Y, but it's easier to work with/pattern match on than a newtype, since it only has 2 layers of nested constructors
Related
Suppose I have defined some data type that derives Eq but want to insert my own definition of (==) for some pattern. Is there any way to do this or do I have to define (==) for every pattern?
e.g.
data Asdf = One Char | Two Char Char --(deriving Eq)
instance Eq Asdf where
(==) (One _) (One _) = True
--otherwise use what the derived definition would have done
--can I do this without defining these patterns myself?
To do what you're trying to do, you have to define it yourself, and that means you have to define it for every pattern.
Basically data MyType x = A x | B x x deriving (Eq) will add a default derivation equivalent to,
instance Eq x => Eq (MyType x) where
A x1 == A x2 = x1 == x2
B x1 x2 == B x3 x4 = x1 == x3 && x2 == x4
_ == _ = False
Note that it figures out the necessary dependencies (the Eq x => part above) as well as fills in the diagonal cases -- the special cases among the n2 possible matches where the same constructor was used.
As far as I know, it does this definition all-at-once and there is no way to dig into an existing instance declaration to mess with it -- and there is a good reason for this; if they let you do this then that would mean that as codebases grow, you could not look at an instance derivation or a deriving (Eq) clause and be confident that you know exactly what it means, since some other part of the code might monkey-patch that Eq instance to do something nefarious.
So one way is to redefine the diagonal yourself. But that's not the only way. There is at least one alternative which may work if it's easier to modify several usage sites than to shove all n constructors into a single thing:
newtype EverythingIsEqual x = E x deriving (Show)
instance Eq (EverythingIsEqual x) where
_ == _ = True
data MyType x = A (EverythingIsEqual x) | B x x deriving (Show, Eq, Ord)
This newtype allows you to strategically modify certain terms to have a different Eq relation at no runtime cost -- in fact this is pretty much one of the two central arguments for newtypes; aside from the lesser one where "I want to have a type-level difference between these two Strings but they ARE just strings and I don't want to pay any performance penalty," there is the greater argument of "sometimes we want to tell Haskell to use a different Ord dictionary without messing with any of the values that this dictionary acts upon, we just want to swap out the functions."
This question discusses how to do something very similar for the Show instance, using the https://hackage.haskell.org/package/generic-deriving package: Accessing the "default show" in Haskell?
See this answer in particular: https://stackoverflow.com/a/35385768/936310
I recently used it for the Show instance recently, and it worked wonderfully. You can similarly derive Eq as well for your type, assuming it's regular enough.
I understand that when having
instance (Foo a) => Bar a
instance (Xyy a) => Bar a
GHC doesn't consider the contexts, and the instances are reported as duplicate.
What is counterintuitive, that (I guess) after selecting an instance, it still needs to check if the context matches, and if not, discard the instance. So why not reverse the order, and discard instances with non-matching contexts, and proceed with the remaining set.
Would this be intractable in some way? I see how it could cause more constraint resolution work upfront, but just as there is UndecidableInstances / IncoherentInstances, couldn't there be a ConsiderInstanceContexts when "I know what I am doing"?
This breaks the open-world assumption. Assume:
class B1 a
class B2 a
class T a
If we allow constraints to disambiguate instances, we may write
instance B1 a => T a
instance B2 a => T a
And may write
instance B1 Int
Now, if I have
f :: T a => a
Then f :: Int works. But, the open world assumption says that, once something works, adding more instances cannot break it. Our new system doesn't obey:
instance B2 Int
will make f :: Int ambiguous. Which implementation of T should be used?
Another way to state this is that you've broken coherence. For typeclasses to be coherent means that there is only one way to satisfy a given constraint. In normal Haskell, a constraint c has only one implementation. Even with overlapping instances, coherence generally holds true. The idea is that instance T a and instance {-# OVERLAPPING #-} T Int do not break coherence, because GHC can't be tricked into using the former instance in a place where the latter would do. (You can trick it with orphans, but you shouldn't.) Coherence, at least to me, seems somewhat desirable. Typeclass usage is "hidden", in some sense, and it makes sense to enforce that it be unambiguous. You can also break coherence with IncoherentInstances and/or unsafeCoerce, but, y'know.
In a category theoretic way, the category Constraint is thin: there is at most one instance/arrow from one Constraint to another. We first construct two arrows a : () => B1 Int and b : () => B2 Int, and then we break thinness by adding new arrows x_Int : B1 Int => T Int, y_Int : B2 Int => T Int such that x_Int . a and y_Int . b are both arrows () => T Int that are not identical. Diamond problem, anyone?
This does not answer you question as to why this is the case. Note, however, that you can always define a newtype wrapper to disambiguate between the two instances:
newtype FooWrapper a = FooWrapper a
newtype XyyWrapper a = XyyWrapper a
instance (Foo a) => Bar (FooWrapper a)
instance (Xyy a) => Bar (XyyWrapper a)
This has the added advantage that by passing around either a FooWrapper or a XyyWrapper you explicitly control which of the two instances you'd like to use if your a happens to satisfy both.
Classes are a bit weird. The original idea (which still pretty much works) is a sort of syntactic sugar around what would otherwise be data statements. For example you can imagine:
data Num a = Num {plus :: a -> a -> a, ... , fromInt :: Integer -> a}
numInteger :: Num Integer
numInteger = Num (+) ... id
then you can write functions which have e.g. type:
test :: Num x -> x -> x -> x -> x
test lib a b c = a + b * (abs (c + b))
where (+) = plus lib
(*) = times lib
abs = absoluteValue lib
So the idea is "we're going to automatically derive all of this library code." The question is, how do we find the library that we want? It's easy if we have a library of type Num Int, but how do we extend it to "constrained instances" based on functions of type:
fooLib :: Foo x -> Bar x
xyyLib :: Xyy x -> Bar x
The present solution in Haskell is to do a type-pattern-match on the output-types of those functions and propagate the inputs to the resulting declaration. But when there's two outputs of the same type, we would need a combinator which merges these into:
eitherLib :: Either (Foo x) (Xyy x) -> Bar x
and basically the problem is that there is no good constraint-combinator of this kind right now. That's your objection.
Well, that's true, but there are ways to achieve something morally similar in practice. Suppose we define some functions with types:
data F
data X
foobar'lib :: Foo x -> Bar' x F
xyybar'lib :: Xyy x -> Bar' x X
bar'barlib :: Bar' x y -> Bar x
Clearly the y is a sort of "phantom type" threaded through all of this, but it remains powerful because given that we want a Bar x we will propagate the need for a Bar' x y and given the need for the Bar' x y we will generate either a Bar' x X or a Bar' x y. So with phantom types and multi-parameter type classes, we get the result we want.
More info: https://www.haskell.org/haskellwiki/GHC/AdvancedOverlap
Adding backtracking would make instance resolution require exponential time, in the worst case.
Essentially, instances become logical statements of the form
P(x) => R(f(x)) /\ Q(x) => R(f(x))
which is equivalent to
(P(x) \/ Q(x)) => R(f(x))
Computationally, the cost of this check is (in the worst case)
c_R(n) = c_P(n-1) + c_Q(n-1)
assuming P and Q have similar costs
c_R(n) = 2 * c_PQ(n-1)
which leads to exponential growth.
To avoid this issue, it is important to have fast ways to choose a branch, i.e. to have clauses of the form
((fastP(x) /\ P(x)) \/ (fastQ(x) /\ Q(x))) => R(f(x))
where fastP and fastQ are computable in constant time, and are incompatible so that at most one branch needs to be visited.
Haskell decided that this "fast check" is head compatibility (hence disregarding contexts). It could use other fast checks, of course -- it's a design decision.
AFAIK there is no way to do heterogeneous arrays in Haskell or extend data type.
However it seems it could be achieved easily by using nested pair (like CONS are).
For example
data Point2D a = Point2D a a
data Point3D a = Point3D a a a
Could be written using nested pair like this;
type Point2D = (a, (a, ())
type Point3D = (a, (a, (a, ()))
That way accessors can be common for Point2D and Point3D
x = fst
y = fst.snd
z = fst.snd.snd
This technique can also be used to extend record like this
type Person = (String, ())
name = fst
type User = (String, (String, ())
email = fst.snd
etc ...
Is it a good idea and if so , why is there no built-in support for such thing in Haskell ?
Is it what GADT is about ?
It can be a good idea, if that's what you need.
GADTs are not directly to do with this idea (if you want to know more about GADTs, probably better to google GADTs and read some of the links, and/or ask a separate Stackoverflow question).
There's no built in support for the same reason there's no built in support for graphics, matrix operations, or most other things; there's no need for support to be built in, because it can be added perfectly well by libraries, such as the HList package (indeed, even most of the functionality that is "built in" to Haskell is actually merely implemented in libraries; just libraries that happen to be always distributed with Haskell).
HList has fancy types for representing "heterogenous lists" and convenient functions for performing operations on them, and also uses these to develop "extensible records". HLists basically are equivalent to what you could do by developing your nested tuple idea further, so your basic idea is good enough that someone's already thought of it. :)
Here is a way to have arbitrarily nested pairs:
data Y f = In (f (Y f))
data P a = Pair Int a | STOP
type PY = Y P
a, b :: PY
a = In (Pair 23 (In (Pair 24 (In STOP))))
b = In (Pair 23 (In (Pair 24 (In (Pair 25 (In STOP))))))
c = [a,b] -- a and b have same type
In Haskell, you can use unsafeCoerce to override the type system. How to do the same in F#?
For example, to implement the Y-combinator.
I'd like to offer a different solution, based on embedding the untyped lambda calculus in a typed functional language. The idea is to create a data type that allows us to change between types α and α → α, which subsequently allows to escape the restrictions of a type system. I'm not very familiar with F# so I'll give my answer in Haskell, but I believe it could be adapted easily (perhaps the only complication could be F#'s strictness).
-- | Roughly represents morphism between #a# and #a -> a#.
-- Therefore we can embed a arbitrary closed λ-term into #Any a#. Any time we
-- need to create a λ-abstraction, we just nest into one #Any# constructor.
--
-- The type parameter allows us to embed ordinary values into the type and
-- retrieve results of computations.
data Any a = Any (Any a -> a)
Note that the type parameter isn't significant for combining terms. It just allows us to embed values into our representation and extract them later. All terms of a particular type Any a can be combined freely without restrictions.
-- | Embed a value into a λ-term. If viewed as a function, it ignores its
-- input and produces the value.
embed :: a -> Any a
embed = Any . const
-- | Extract a value from a λ-term, assuming it's a valid value (otherwise it'd
-- loop forever).
extract :: Any a -> a
extract x#(Any x') = x' x
With this data type we can use it to represent arbitrary untyped lambda terms. If we want to interpret a value of Any a as a function, we just unwrap its constructor.
First let's define function application:
-- | Applies a term to another term.
($$) :: Any a -> Any a -> Any a
(Any x) $$ y = embed $ x y
And λ abstraction:
-- | Represents a lambda abstraction
l :: (Any a -> Any a) -> Any a
l x = Any $ extract . x
Now we have everything we need for creating complex λ terms. Our definitions mimic the classical λ-term syntax, all we do is using l to construct λ abstractions.
Let's define the Y combinator:
-- λf.(λx.f(xx))(λx.f(xx))
y :: Any a
y = l (\f -> let t = l (\x -> f $$ (x $$ x))
in t $$ t)
And we can use it to implement Haskell's classical fix. First we'll need to be able to embed a function of a -> a into Any a:
embed2 :: (a -> a) -> Any a
embed2 f = Any (f . extract)
Now it's straightforward to define
fix :: (a -> a) -> a
fix f = extract (y $$ embed2 f)
and subsequently a recursively defined function:
fact :: Int -> Int
fact = fix f
where
f _ 0 = 1
f r n = n * r (n - 1)
Note that in the above text there is no recursive function. The only recursion is in the Any data type, which allows us to define y (which is also defined non-recursively).
In Haskell, unsafeCoerce has the type a -> b and is generally used to assert to the compiler that the thing being coerced actually has the destination type and it's just that the type-checker doesn't know it.
Another, less common use, is to reinterpret a pattern of bits as another type. For example an unboxed Double# could be reinterpreted as an unboxed Int64#. You have to be sure about the underlying representations for this to be safe.
In F#, the first application can be achieved with box |> unbox as John Palmer said in a comment on the question. If possible use explicit type arguments to make sure that you don't accidentally have the wrong coercion inferred, e.g. box<'a> |> unbox<'b> where 'a and 'b are type variables or concrete types that are already in scope in your code.
For the second application, look at the BitConverter class for specific conversions of bit-patterns. In theory you could also do something like interfacing with unmanaged code to achieve this, but that seems very heavyweight.
These techniques won't work for implementing the Y combinator because the cast is only valid if the runtime objects actually do have the target type, but with the Y combinator you actually need to call the same function again but with a different type. For this you need the kinds of encoding tricks mentioned in the question John Palmer linked to.
This is a followup question to Inconsistent Eq and Ord instances?.
The question there is essentially: when declaring Eq and Ord instances for a type, must one ensure that compare x y returns EQ if and only if x == y returns True? Is it dangerous to create instances that break this assumption? It seems like a natural law one might assume, but it doesn’t appear to be explicitly stated in the Prelude, unlike e.g. the monad or functor laws.
The basic response was: it is a bit dangerous to do this, since libraries may assume that this law holds.
My question, now, is: do any of the standard libraries (in particular, Set or Map) make this assumption? Is it dangerous to have a type with incompatible Eq and Ord, so long as I am only relying on the standard libraries supplied with GHC? (If big-list questions were still acceptable, I would be asking: which commonly used libraries assume this law?)
Edit. My use-case is similar to that of the original question. I have a type with a custom instance of Eq, that I use quite a bit. The only reason I want Ord is so that I can use it as the domain of a Map; I don’t care about the specific order, and will never use it explicitly in code. So if I can use the derived instance of Ord, then my life will be easier and my code clearer.
The definition of Ord itself in the standard prelude requires there already be an Eq instance:
class (Eq a) => Ord a where
...
So it would be just as wrong to violate
x == y = compare x y == EQ
x /= y = compare x y /= EQ
As it would be to violate (from the default definitions for these operators in Ord).
x <= y = compare x y /= GT
x < y = compare x y == LT
x >= y = compare x y /= LT
x > y = compare x y == GT
Edit: Use in libraries
I would be quite surprised if standard libraries didn't make use of Ord's == and /= operators. The specific purpose operators (==, /=, <=, <, >=, >) are frequently more convenient than compare, so I'd expect to see them used in code for maps or filters.
You can see == being used in guards on keys in Data.Map in fromAscListWithKey. This specific function only calls out for the Eq class, but if the key is also an Ord instance, Ord's compare will be used for other functions of the resulting Map, which is an assumption that Eq's == is the same as Ord's compare and testing for EQ.
As a library programmer, I wouldn't be surprised if any of the special purpose operators outperformed compare for the specific purpose. After all, that's why they are part of the Eq and Ord classes instead of being defined as polymorphic for all Eq or Ord instances. I might make a point of using them even when compare is more convenient. If I did, I'd probably define something like:
compareOp :: (Ord a) => Ordering -> Bool -> a -> a -> Bool
compareOp EQ True = (==)
compareOp EQ False = (/=)
compareOp LT True = (<)
compareOp LT False = (>=)
compareOp GT True = (>)
compareOp GT False = (<=)
To extend Cirdec's answer, typeclass instances should only be made if the operation being defined is somehow canonical. If there is a reasonable Eq which doesn't extend to a reasonable Ord, then it's best practice to pick either the other Eq or to not define an Ord. It's easy enough to create a non-polymorphic function for the "other" equality.
A great example of this tension is the potential Monoid instance
instance Monoid Int where
mzero = 0
mappend = (+)
which contests with the other "obvious" Monoid instance
instance Monoid Int where
mzero = 1
mappend = (*)
In this case the chosen path was to instantiate neither because it's not clear that one is "canonical" over the other. This typically conforms best to a user's expectation and which prevent bugs.
I've read through this and your original question, so I will address your general problem....
You want this-
Map BigThing OtherType
and this-
(==)::BigThing->BigThing->Bool
One of these cases has to be exact, the other case should ignore some of its data, for performance reasons. (it was (==) that needed to be exact in the first question, but it looks like you might be addressing the reverse in this question.... Same answer either way).
For instance, you want the map to only store the result based on some label, like a
`name::BigThing->String`
but (==) should do a deep compare. One way to do this would be to define incompatible compare and (==) functions. However....
in this case, this is unnecessary. Why not just instead use the map
Map String OtherThing
and do a lookup like this-
lookup (name obj) theMap
It is pretty rare to index directly on very large document data....