Design options for constructor constraints: GADT compare PatternSynonym Required


(This is a follow-up to this answer, trying to get the q more precise.)
Use Case Constructors to build/access a Set datatype. Being a set, the invariant is 'no duplicates'. To implement that I need an Eq constraint on the element type. (More realistically, the set might be implemented as a BST or hash-index, which'll need a more restrictive constraint; using Eq here to keep it simple.)
I want to disallow building even an empty set with an unacceptable type.
" it's now considered bad practice to require a constraint for an operation (data type construction or destruction) that does not need the constraint. Instead, the constraints should be moved closer to the "usage site".", to quote that answer.
OK so there's no need to get the constraint 'built into' the data structure. And operations that access/deconstruct (like showing or counting elements) won't necessarily need Eq.
Then consider two (or rather five) possible designs:
(I'm aware some of the constraints could be achieved via deriving, esp Foldable to get elem. But I'll hand-code here, so I can see what minimal constraints GHC wants.)
Option 1: No constraints on datatype
data NoCSet a where -- no constraints on the datatype
NilSet_ :: NoCSet a
ConsSet_ :: a -> NoCSet a -> NoCSet a
Option 1a. use PatternSynonym as 'smart constructor'
pattern NilSet :: (Eq a) => () => NoCSet a
pattern NilSet = NilSet_
pattern ConsSet x xs <- ConsSet_ x xs where
ConsSet x xs | not (elemS x xs) = ConsSet_ x xs
elemS x NilSet_ = False
elemS x (ConsSet_ y ys) | x == y = True -- infers (Eq a) for elemS
| otherwise = elemS x ys
GHC infers the constraint elemS :: Eq t => t -> NoCSet t -> Bool. But it doesn't infer a constraint for ConsSet that uses it. Rather, it rejects that definition:
* No instance for (Eq a) arising from a use of `elemS'
Possible fix:
add (Eq a) to the "required" context of
the signature for pattern synonym `ConsSet'
Ok I'll do that, with an explicitly empty 'Provided' constraint:
pattern ConsSet :: (Eq a) => () => ConsType a -- Req => Prov'd => type; Prov'd is empty, so omittable
Consequently inferred type (\(ConsSet x xs) -> x) :: Eq a => NoCSet a -> a, so the constraint 'escapes' from the destructor (also from elemS), whether or not I need it at the "usage site".
Option 1b. Pattern synonym wrapping a GADT constructor as 'smart constructor'
data CSet a where CSet :: Eq a => NoCSet a -> CSet a -- from comments to the earlier q
pattern NilSetC = CSet NilSet_ -- inferred Eq a provided
pattern ConsSetC x xs <- CSet (ConsSet_ x xs) where -- also inferred Eq a provided
ConsSetC x xs | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, and does infer pattern ConsSetC :: () => Eq a => a -> NoCSet a -> CSet a, that is, a Provided constraint but an empty Required.
Inferred (\(ConsSetC x xs) -> x) :: CSet p -> p, so the constraint doesn't escape from a "usage site".
But there's a bug: to Cons an element, I need to unwrap the NoCSet inside the CSet in the tail then re-wrap a CSet. And trying to do that with the ConsSetC pattern alone is ill typed. Instead:
insertCSet x (CSet xs) = ConsSetC x xs -- ConsSetC on rhs, to check for duplicates
As 'smart constructors' go, that's dumb. What am I doing wrong?
Inferred insertCSet :: a -> CSet a -> CSet a, so again the constraint doesn't escape.
Option 1c. Pattern synonym wrapping a GADT constructor as 'smarter constructor'
Same setup as option 1b, except this monster as ViewPattern for the Cons pattern
pattern ConsSetC2 x xs <- ((\(CSet (ConsSet_ x' xs')) -> (x', CSet xs')) -> (x, xs)) where
ConsSetC2 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer pattern ConsSetC2 :: a -> CSet a -> CSet a with no constraint at all. I'm nervous. But it does correctly reject attempts to build a set with duplicates.
Inferred (\(ConsSetC2 x xs) -> x) :: CSet a -> a, so the constraint that isn't there doesn't escape from a "usage site".
Edit: ah, I can get a somewhat less monstrous ViewPattern expression to work
pattern ConsSetC3 x xs <- (CSet (ConsSet_ x (CSet -> xs))) where
ConsSetC3 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
Curiously inferred pattern ConsSetC3 :: () => Eq a => a -> CSet a -> CSet a -- so the Provided constraint is visible, unlike with ConsSetC2, even though they're morally equivalent. It does reject attempts to build a set with duplicates.
Inferred (\(ConsSetC3 x xs) -> x) :: CSet p -> p, so the constraint that is there doesn't escape from "usage sites".
Option 2: GADT constraints on datatype
data GADTSet a where
GADTNilSet :: Eq a => GADTSet a
GADTConsSet :: Eq a => a -> GADTSet a -> GADTSet a
elemG x GADTNilSet = False
elemG x (GADTConsSet y ys) | x == y = True -- no (Eq a) 'escapes'
| otherwise = elemG x ys
GHC infers no visible constraint elemG :: a -> GADTSet a -> Bool; (\(GADTConsSet x xs) -> x) :: GADTSet p -> p.
Option 2a. use PatternSynonym as 'smart constructor' for the GADT
pattern ConsSetG x xs <- GADTConsSet x xs where
ConsSetG x xs | not (elemG x xs) = GADTConsSet x xs -- does infer Provided (Eq a) for ConsSetG
GHC doesn't complain about the lack of signature, and does infer pattern ConsSetG :: () => Eq a => a -> GADTSet a -> GADTSet a, that is, a Provided constraint but an empty Required.
Inferred (\(ConsSetG x xs) -> x) :: GADTSet p -> p, so the constraint doesn't escape from a "usage site".
Option 2b. define an insert function
insertGADTSet x xs | not (elemG x xs) = GADTConsSet x xs -- (Eq a) inferred
GHC infers insertGADTSet :: Eq a => a -> GADTSet a -> GADTSet a; so the Eq has escaped, even though it doesn't escape from elemG.
Questions
With insertGADTSet, why does the constraint escape? It's only needed for the elemG check, but elemG's type doesn't expose the constraint.
With constructors GADTConsSet, GADTNilSet, there's a constraint wrapped 'all the way down' the data structure. Does that mean the data structure has a bigger memory footprint than with ConsSet_, NilSet_?
With constructors GADTConsSet, GADTNilSet, it's the same type a 'all the way down'. Is the same Eq a dictionary repeated at each node? Or shared?
By comparison, pattern ConsSetC/constructor CSet/Option 1b wraps only a single dictionary(?), so it'll have a smaller memory footprint than a GADTSet structure(?)
insertCSet has a performance hit of unwrapping and wrapping CSets?
ConsSetC2 in the build direction seems to work; there's a performance hit in unwrapping and wrapping CSets? But worse there's a unwrapping/wrapping performance hit in accessing/walking the nodes?
(I'm feeling there's no slam-dunk winner amongst these options for my use case.)

I don't think there is any realistic scenario where it is important that you disallow the creation of an empty set with an "unacceptable" type. It is at least partly because of lack of such realistic scenarios that DatatypeContexts is considered bad practice. Seriously, try to imagine how such a restriction could possibly help avoid real-world programming errors. That is, try to imagine how someone using your types and functions might (1) write an erroneous program that (2) "accidentally" uses a set of, say, functions and yet (3) somehow gets it to type check in a manner that (4) could have been caught if only there'd been an extra Eq constraint on NilSet. As soon as a programmer tries to do anything with that set that makes it non-empty (i.e., anything that needs Eq functionality), it won't type check, so what exactly are you trying to prevent? You want to stop someone who only needs empty sets of functions from using your types? Why? Is it spite? ... It is spite, isn't it?
Getting down to your various options, putting the constraint in a GADT is inappropriate and unnecessary to my mind. The point of constraints in GADTs is to allow destruction to dynamically and/or conditionally bring an instance dictionary into scope, based on a runtime case match. You do not need this functionality. In particular, you do not need the overhead of a dictionary in every one of your Cons nodes as per option 2. However, you also don't need a dictionary in the data type as per option 1(b). It's better to use the normal non-GADT mechanisms of passing dictionaries to functions instead of carrying them around in the data types. I expect you'll be missing many opportunities for specialization and optimization if you try option 1(b). Partly this may be because GADTs are intrinsically harder to optimize, but there's also much less work that's been put into optimizing code using GADTs than code using non-GADTs. Some of your questions suggest you're very concerned about small performance gains. If so, it's generally a good idea to stay well away from GADTs!
Option 1(a) is a reasonable solution. Without the unnecessary Eq constraint on Nil, and folding the insert function into the pattern definition, you get something like:
{-# LANGUAGE PatternSynonyms #-}
data Set a = Nil | Cons_ a (Set a) deriving (Show)
pattern Cons :: (Eq a) => a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
where Cons x xs = go xs
where go Nil = Cons_ x xs
go (Cons_ y ys) | x == y = xs
| otherwise = go ys
which seems like a perfectly idiomatic, straightforward, smart constructor implementation using patterns, as designed.
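For reference, here is a minimal, self-contained sketch of that smart constructor in use. The helper names fromList, toList and firstViaSynonym are hypothetical additions for illustration only; they are not part of the question or of the code above.

```haskell
{-# LANGUAGE PatternSynonyms #-}
module Main where

data Set a = Nil | Cons_ a (Set a) deriving (Show)

-- The single context is the Required constraint: it is demanded
-- both when building with Cons and when matching on Cons.
pattern Cons :: Eq a => a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
  where
    Cons x xs = go xs
      where
        go Nil = Cons_ x xs          -- x not found: prepend it
        go (Cons_ y ys)
          | x == y    = xs           -- duplicate: return the set unchanged
          | otherwise = go ys

-- Hypothetical helper: build a set through the smart constructor.
fromList :: Eq a => [a] -> Set a
fromList = foldr Cons Nil

-- Hypothetical helper: matching on the raw constructor needs no constraint.
toList :: Set a -> [a]
toList Nil          = []
toList (Cons_ x xs) = x : toList xs

-- Hypothetical helper: matching through the synonym demands Eq a,
-- although nothing is compared here.
firstViaSynonym :: Eq a => Set a -> Maybe a
firstViaSynonym (Cons x _) = Just x
firstViaSynonym _          = Nothing

main :: IO ()
main = print (toList (fromList [1, 2, 1, 3, 2 :: Int]))  -- prints [1,3,2]
```

Note that firstViaSynonym cannot be given the type Set a -> Maybe a: the Required constraint follows the synonym into every pattern match.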
Indeed, it's unfortunate that the constraint applies to destruction, when it isn't really needed. Ideally, GHC would allow constraints for use of Cons as a constructor to be specified separately, instead of assuming they're the combination of the required and provided constraints for destruction.
This would allow us to write something like:
pattern Cons :: a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
where Cons :: Eq a => a -> Set a -> Set a -- <== only need Eq here
Cons x Nil = Cons_ x Nil
Cons x rest@(Cons_ y xs) | x == y = rest
| otherwise = Cons_ y (Cons x xs)
and then usages of Cons as a destructor could be constraint-free while uses as a constructor could take advantage of the Eq a constraint. I see this as a limitation of PatternSynonyms rather than an indication that adding unnecessary constraints a la DatatypeContexts is actually good programming practice. It looks like at least a few other people agree that this is a bug not a feature.
To your first four questions:
In option 2(b), insertGADTSet needs an Eq a dictionary to insert into the dictionary slot in the GADTConsSet constructor on the RHS. So, the Eq a constraint comes from the use of the GADTConsSet constructor.
Yes, GADT constraints become additional fields in the data type. In option 2, a pointer to the Eq a dictionary is added to every node in your set.
The dictionary itself is shared, but each node includes its own pointer to the dictionary.
Yes, for the CSet type, there's only one pointer to the dictionary per CSet value.
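To make the first point concrete, here is a small sketch with explicit signatures. insertNoEq and sizeG are hypothetical names introduced only for this illustration: insertNoEq shows that if you pattern match on the tail first, the dictionary stored in that node comes into scope, and the Eq a constraint no longer has to be supplied by the caller (at the cost of forcing the tail).

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data GADTSet a where
  GADTNilSet  :: Eq a => GADTSet a
  GADTConsSet :: Eq a => a -> GADTSet a -> GADTSet a

-- No Eq a in the signature: x == y is only used after matching on
-- GADTConsSet, which brings the stored dictionary into scope.
elemG :: a -> GADTSet a -> Bool
elemG _ GADTNilSet = False
elemG x (GADTConsSet y ys)
  | x == y    = True
  | otherwise = elemG x ys

-- Eq a is needed here to fill the dictionary field of the new
-- GADTConsSet node; nothing has been pattern matched to provide it.
insertGADTSet :: Eq a => a -> GADTSet a -> GADTSet a
insertGADTSet x xs
  | elemG x xs = xs
  | otherwise  = GADTConsSet x xs

-- Hypothetical variant: matching on the tail first provides Eq a,
-- so the caller does not have to.
insertNoEq :: a -> GADTSet a -> GADTSet a
insertNoEq x xs = case xs of
  GADTNilSet      -> GADTConsSet x xs
  GADTConsSet _ _ -> if elemG x xs then xs else GADTConsSet x xs

-- Hypothetical helper for the demo.
sizeG :: GADTSet a -> Int
sizeG GADTNilSet           = 0
sizeG (GADTConsSet _ rest) = 1 + sizeG rest

main :: IO ()
main = print (sizeG (insertNoEq 1 (insertGADTSet 1 (insertGADTSet (2 :: Int) GADTNilSet))))
-- prints 2
```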

I believe that option 1b is your best bet. Of course, I may be biased, as I'm the one who suggested it on your other question.
To address the issue you've pointed out with your pattern synonym, let us imagine we don't have pattern synonyms. How might we deconstruct a Cons set?
One way is to write a method with this signature:
openSetC :: CSet a -> (Eq a => a -> CSet a -> r) -> (Eq a => r) -> r
Which says that given:
a CSet a,
a function which takes: a proof that Eq a, an a, and another CSet a, and returns some arbitrary type,
and a function which takes a proof that Eq a and returns the same arbitrary type,
we can produce a value of that arbitrary type. Since the type is arbitrary, we know that the value comes from calling the function, or from the given value of that type. The contract of this function is that it invokes the first function and returns its result if and only if the set is ConsSet_; otherwise, if it is NilSet_, it returns the second argument.
If you squint a little, you can see this function is in a sense "equivalent" to pattern matching on CSet. You don't need pattern matching at all anymore; you can do everything with this function that you can do with pattern matching.
Its implementation is quite trivial:
openSetC (CSet (ConsSet_ x xs)) k _ = k x (CSet xs)
openSetC (CSet NilSet_) _ z = z
Consider now a different form of this function, which accomplishes all the same things, but is maybe a bit easier to use.
Note that the type forall r . (a -> r) -> (b -> r) -> r is isomorphic to Either a b. Also note that x0 -> y0 -> r is isomorphic (or close enough) to (x0, y0) -> r. And finally note that C a => r is isomorphic to Dict (C a) -> r where:
data Dict c where Dict :: c => Dict c
If we exploit these isomorphisms, we can write openSetC differently as:
openSetC' :: CSet a -> Either (a, CSet a, Dict (Eq a)) (Dict (Eq a))
openSetC' (CSet (ConsSet_ x xs)) = Left (x, CSet xs, Dict)
openSetC' (CSet NilSet_) = Right Dict
Now the fun part: using ViewPatterns, we can use this function directly to easily write the pattern with the signature you want. It's easy only because we've set up the type of openSetC' to match with the type of the pattern you want:
pattern ConsSetC :: () => Eq a => a -> CSet a -> CSet a
pattern ConsSetC x xs <- (openSetC' -> Left (x, xs, Dict))
-- included for completeness, but the expression form of the pattern synonym is not at issue here
where
ConsSetC x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
| otherwise = CSet xs
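Putting the pieces together, the following is a self-contained sketch that should compile as a single module. The extension list, the explicit kind annotation on Dict, the NilSetC synonym with a signature, and the toListC helper are my additions for the sake of a runnable example; the rest follows the definitions above.

```haskell
{-# LANGUAGE GADTs, PatternSynonyms, ViewPatterns, ConstraintKinds, KindSignatures #-}
module Main where

import Data.Kind (Constraint)

data NoCSet a = NilSet_ | ConsSet_ a (NoCSet a)

-- A single Eq dictionary is stored in the wrapper, not in every node.
data CSet a where
  CSet :: Eq a => NoCSet a -> CSet a

data Dict (c :: Constraint) where
  Dict :: c => Dict c

elemS :: Eq a => a -> NoCSet a -> Bool
elemS _ NilSet_         = False
elemS x (ConsSet_ y ys) = x == y || elemS x ys

openSetC' :: CSet a -> Either (a, CSet a, Dict (Eq a)) (Dict (Eq a))
openSetC' (CSet (ConsSet_ x xs)) = Left (x, CSet xs, Dict)
openSetC' (CSet NilSet_)         = Right Dict

pattern NilSetC :: () => Eq a => CSet a
pattern NilSetC = CSet NilSet_

-- Empty Required, Provided Eq a: matching needs no constraint.
pattern ConsSetC :: () => Eq a => a -> CSet a -> CSet a
pattern ConsSetC x xs <- (openSetC' -> Left (x, xs, Dict))
  where
    ConsSetC x (CSet xs)
      | not (elemS x xs) = CSet (ConsSet_ x xs)
      | otherwise        = CSet xs

-- Hypothetical helper: walks the set with the synonym, no Eq a required.
toListC :: CSet a -> [a]
toListC (ConsSetC x xs) = x : toListC xs
toListC _               = []

main :: IO ()
main = print (toListC (ConsSetC 1 (ConsSetC 2 (ConsSetC (1 :: Int) NilSetC))))
-- prints [2,1]
```

Both the head and the tail of ConsSetC are now at type CSet a, so the same synonym works for building and for walking the set.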
As for the rest of your questions, I would strongly suggest splitting them up into different posts so they could all have focused answers.

Related

Confused about GADTs and propagating constraints

There's plenty of Q&A about GADTs being better than DatatypeContexts, because GADTs automagically make constraints available in the right places. For example here, here, here. But sometimes it seems I still need an explicit constraint. What's going on? Example adapted from this answer:
{-# LANGUAGE GADTs #-}
import Data.Maybe -- fromJust
data GADTBag a where
MkGADTBag :: Eq a => { unGADTBag :: [a] } -> GADTBag a
baz (MkGADTBag x) (Just y) = x == y
baz2 x y = unGADTBag x == fromJust y
-- unGADTBag :: GADTBag a -> [a] -- inferred, no Eq a
-- baz :: GADTBag a -> Maybe [a] -> Bool -- inferred, no Eq a
-- baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool -- inferred, with Eq a
Why can't the type for unGADTBag tell us Eq a?
baz and baz2 are morally equivalent, yet have different types. Presumably because unGADTBag has no Eq a, then the constraint can't propagate into any code using unGADTBag.
But with baz2 there's an Eq a constraint hiding inside the GADTBag a. Presumably baz2's Eq a will want a duplicate of the dictionary already there(?)
Is it that potentially a GADT might have many data constructors, each with different (or no) constraints? That's not the case here, or with typical examples for constrained data structures like Bags, Sets, Ordered Lists.
The equivalent for a GADTBag datatype using DatatypeContexts infers baz's type same as baz2.
Bonus question: why can't I get an ordinary ... deriving (Eq) for GADTBag? I can get one with StandaloneDeriving, but it's blimmin obvious, why can't GHC just do it for me?
deriving instance (Eq a) => Eq (GADTBag a)
Is the problem again that there might be other data constructors?
(Code exercised at GHC 8.6.5, if that's relevant.)
Addit: in light of @chi's and @leftroundabout's answers -- neither of which I find convincing. All of these give *** Exception: Prelude.undefined:
*DTContexts> unGADTBag undefined
*DTContexts> unGADTBag $ MkGADTBag undefined
*DTContexts> unGADTBag $ MkGADTBag (undefined :: String)
*DTContexts> unGADTBag $ MkGADTBag (undefined :: [a])
*DTContexts> baz undefined (Just "hello")
*DTContexts> baz (MkGADTBag undefined) (Just "hello")
*DTContexts> baz (MkGADTBag (undefined :: String)) (Just "hello")
*DTContexts> baz2 undefined (Just "hello")
*DTContexts> baz2 (MkGADTBag undefined) (Just "hello")
*DTContexts> baz2 (MkGADTBag (undefined :: String)) (Just "hello")
Whereas these two give the same type error at compile time * Couldn't match expected type ``[Char]'* No instance for (Eq (Int -> Int)) arising from a use of ``MkGADTBag'/ ``baz2' respectively [Edit: my initial Addit gave the wrong expression and wrong error message]:
*DTContexts> baz (MkGADTBag (undefined :: [Int -> Int])) (Just [(+ 1)])
*DTContexts> baz2 (MkGADTBag (undefined :: [Int -> Int])) (Just [(+ 1)])
So baz, baz2 are morally equivalent not just in that they return the same result for the same well-defined arguments; but also in that they exhibit the same behaviour for the same ill-defined arguments. Or they differ only in where the absence of an Eq instance gets reported?
@leftroundabout Before you've actually deconstructed the x value, there's no way of knowing that the MkGADTBag constructor indeed applies.
Yes there is: field label unGADTBag is defined if and only if there's a pattern match on MkGADTBag. (It would maybe be different if there were other constructors for the type -- especially if those also had a label unGADTBag.) Again, being undefined/lazy evaluation doesn't postpone the type-inference.
To be clear, by "[not] convincing" I mean: I can see the behaviour and the inferred types I'm getting. I don't see that laziness or potential undefinedness gets in the way of type inference. How could I expose a difference between baz, baz2 that would explain why they have different types?

Function calls never bring type class constraints in scope, only (strict) pattern matching does.
The comparison
unGADTBag x == fromJust y
is essentially a function call of the form
foo (unGADTBag x) (fromJust y)
where foo requires Eq a. That would morally be provided by unGADTBag x, but that expression is not yet evaluated! Because of laziness, unGADTBag x will be evaluated only when (and if) foo demands its first argument.
So, in order to call foo in this example we need its argument to be evaluated in advance. While Haskell could work like this, it would be a rather surprising semantics, where arguments are evaluated or not depending on whether they provide a type class constraint which is needed. Imagine more general cases like
foo (if cond then unGADTBag x else unGADTBag z) (fromJust y)
What should be evaluated here? unGADTBag x? unGADTBag z? Both? cond as well? It's hard to tell.
Because of these issues, Haskell was designed so that we need to manually require the evaluation of a GADT value like x using pattern matching.
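As a small sketch of that rule (bazCase is a hypothetical name, and the selector is written by hand here instead of using record syntax): an explicit case on the constructor brings Eq a into scope for its branch, whereas going through the selector function leaves the constraint to the caller.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

import Data.Maybe (fromJust)

data GADTBag a where
  MkGADTBag :: Eq a => [a] -> GADTBag a

-- Hand-written selector: an ordinary function, so its type carries no Eq a.
unGADTBag :: GADTBag a -> [a]
unGADTBag (MkGADTBag xs) = xs

-- The case expression forces the bag and makes its dictionary available
-- inside the branch, so no Eq a appears in the signature.
bazCase :: GADTBag a -> Maybe [a] -> Bool
bazCase bag my = case bag of
  MkGADTBag xs -> xs == fromJust my

-- Only function calls here, so Eq a must be supplied by the caller.
baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool
baz2 bag my = unGADTBag bag == fromJust my

main :: IO ()
main = do
  print (bazCase (MkGADTBag "abc") (Just "abc"))           -- prints True
  print (baz2 (MkGADTBag [1, 2 :: Int]) (Just [1, 3]))     -- prints False
```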

Why can't the type for unGADTBag tell us Eq a?
Before you've actually deconstructed the x value, there's no way of knowing that the MkGADTBag constructor indeed applies. Sure, if it doesn't then you have other problems (bottom), but those might conceivably not surface. Consider
ignore :: a -> b -> b
ignore _ = id
baz2' :: GADTBag a -> Maybe [a] -> Bool
baz2' x y = ignore (unGADTBag x) (y==y)
Note that I could now invoke the function with, say, undefined :: GADTBag (Int->Int). Shouldn't be a problem since the undefined is ignored, right★? Problem is, despite Int->Int not having an Eq instance, I was able to write y==y, which y :: Maybe [Int->Int] can't in fact support.
So, we can't have that only mentioning unGADTBag is enough to spew the Eq a constraint into its surrounding scope. Instead, we must clearly delimit the scope of that constraint to where we've confirmed that the MkGADTBag constructor does apply, and a pattern match accomplishes that.
★If you're annoyed that my argument relies on undefined, note that the same issue arises also when there are multiple constructors which would bring different constraints into scope.
An alternative to a pattern-match that does work is this:
{-# LANGUAGE RankNTypes #-}
withGADTBag :: GADTBag a -> (Eq a => [a] -> b) -> b
withGADTBag (MkGADTBag x) f = f x
baz3 :: GADTBag a -> Maybe [a] -> Bool
baz3 x y = withGADTBag x (== fromJust y)
Response to edits
All of these give *** Exception: Prelude.undefined:
Yes of course they do, because you actually evaluate x == y in your function. So the function can only possibly yield non-⟂ if the inputs have a NF. But that's by no means the case for all functions.
Whereas these two give the same type error at compile time
Of course they do, because you're trying to wrap a value of non-Eq type in the MkGADTBag constructor, which explicitly requires that constraint (and allows you to explicitly unwrap it again!), whereas the GADTBag type doesn't require that constraint. (Which is kind of the whole point about this sort of encapsulation!)
Before you've actually deconstructed the x value, there's no way of knowing that the `MkGADTBag` constructor indeed applies.
Yes there is: field label `unGADTBag` is defined if and only if there's a pattern match on `MkGADTBag`.
Arguably, that's the way field labels should work, but they don't, in Haskell. A field label is nothing but a function from the data type to the field type, and a nontotal function at that if there are multiple constructors.
Yeah, Haskell records are one of the worst-designed features of the language. I personally tend to use field labels only for big, single-constructor, plain-old-data types (and even then I prefer using not the field labels directly but lenses derived from them).
Anyway though, I don't see how “field label is defined iff there's a pattern match” could even be implemented in a way that would allow your code to work the way you think it should. The compiler would have to insert the step of confirming that the constructor applies (and extracting its GADT-encapsulated constraint) somewhere. But where? In your example it's reasonably obvious, but in general x could inhabit a vast scope with lots of decision branches and you really don't want it to get evaluated in a branch where the constraint isn't actually needed.
Also keep in mind that when we argue with undefined/⟂ it's not just about actually diverging computations, more typically you're worried about computations that would simply take a long time (just, Haskell doesn't actually have a notion of “taking a long time”).

The way to think about this is OutsideIn(X) ... with local assumptions. It's not about undefinedness or lazy evaluation. A pattern match on a GADT constructor is outside, the RHS of the equation is inside. Constraints from the constructor are made available only locally -- that is only inside.
baz (MkGADTBag x) (Just y) = x == y
Has an explicit data constructor MkGADTBag outside, supplying an Eq a. The x == y raises a wanted Eq a locally/inside, which gets discharged from the pattern match. OTOH
baz2 x y = unGADTBag x == fromJust y
Has no explicit data constructor outside, so no context is supplied. unGADTBag has an Eq a, but that is deeper inside the l.h. argument to ==; type inference doesn't go looking deeper inside. It just doesn't. Then in the effective definition for unGADTBag
unGADTBag (MkGADTBag x) = x
there is an Eq a made available from the outside, but it cannot escape from the RHS into the type environment at a usage site for unGADTBag. It just doesn't. Sad!
The best I can see for an explanation is towards the end of the OutsideIn paper, Section 9.7 Is the emphasis on principal types well-justified? (A rhetorical question but my answer would be: of course we must emphasise principal types; type inference could get better principled under some circumstances.) That last section considers this example
data R a where
RInt :: Int -> R Int
RBool :: Bool -> R Bool
RChar :: Char -> R Char
flop1 (RInt x) = x
there is a third type that is arguably more desirable [for flop1], and that type is R Int -> Int.
flop1's definition is of the same form as unGADTBag, with a constrained to be Int.
flop2 (RInt x) = x
flop2 (RBool x) = x
Unfortunately, ordinary polymorphic types are too weak to express this restriction [that a must be only Int or Bool] and we can only get ∀a. R a -> a for flop2, which does not rule the application of flop2 to values of type R Char.
So at that point the paper seems to give up trying to refine better principal types:
In conclusion, giving up on some natural principal types in favor of more specialized types that eliminate more pattern match errors at runtime is appealing but does not quite work unless we consider a more expressive syntax of types. Furthermore it is far from obvious how to specify these typings in a high-level declarative specification.
"is appealing". It just doesn't.
I can see a general solution is difficult/impossible. But for use-cases of constrained Bags/Lists/Sets, the specification is:
All data constructors have the same constraint(s) on the datatype's parameters.
All constructors yield the same type (... -> T a or ... -> T [a] or ... -> T Int, etc).
Datatypes with a single constructor satisfy that trivially.
To satisfy the first bullet, for a Set type using a binary balanced tree, there'd be a non-obvious definition for the Nil constructor:
data OrdSet a where
SNode :: Ord a => OrdSet a -> a -> OrdSet a -> OrdSet a
SNil :: Ord a => OrdSet a -- seemingly redundant Ord constraint
Even so, repeating the constraint on every node and every terminal seems wasteful: it's the same constraint all the way down (which is unlike GADTs for EDSL abstract syntax trees); presumably each node carries a copy of exactly the same dictionary.
The best way to ensure same constraint(s) on every constructor could just be prefixing the constraint to the datatype:
data Ord a => OrdSet a where ...
And perhaps the constraint could go 'OutsideOut' to the environment that's accessing the tree.

Another possible approach is to use a PatternSynonym with an explicit signature giving a Required constraint.
pattern EqGADTBag :: Eq a => [a] -> GADTBag a -- that Eq a is the *Required*
pattern EqGADTBag{ unEqGADTBag } = MkGADTBag unEqGADTBag -- without sig infers Eq a only as *Provided*
That is, without that explicit sig:
*> :i EqGADTBag
pattern EqGADTBag :: () => Eq a => [a] -> GADTBag a
The () => Eq a => ... shows Eq a is Provided, arising from the GADT constructor.
Now we get both inferred baz, baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool:
baz (EqGADTBag x) (Just y) = x == y
baz2 x y = unEqGADTBag x == fromJust y
As a curiosity: it's possible to give those equations for baz, baz2 as well as those in the O.P. using the names from the GADT decl. GHC warns of overlapping patterns [correctly]; and does infer the constrained sig for baz.
I wonder if there's a design pattern here? Don't put constraints on the data constructor -- that is, don't make it a GADT. Instead declare a 'shadow' PatternSynonym with the Required/Provided constraints.

You can capture the constraint in a fold function. The argument type (Eq a => ..) says you can assume Eq a, but only within the function next (which is applied after a pattern match). If you instantiate next as = fromJust maybe == as, it uses this constraint to witness equality:
-- local constraint
-- |
-- vvvvvvvvvvvvvvvvvv
foldGADTBag :: (Eq a => [a] -> res) -> GADTBag a -> res
foldGADTBag next (MkGADTBag as) = next as
baz3 :: GADTBag a -> Maybe [a] -> Bool
baz3 gadtBag maybe = foldGADTBag (fromJust maybe ==) gadtBag
type Ty :: Type -> Type
data Ty a where
TyInt :: Int -> Ty Int
TyUnit :: Ty ()
-- locally assume Int locally assume unit
-- | |
-- vvvvvvvvvvvvvvvvvvv vvvvvvvvvvvvv
foldTy :: (a ~ Int => a -> res) -> (a ~ () => res) -> (Ty a -> res)
foldTy int unit (TyInt i) = int i
foldTy int unit TyUnit = unit
eval :: Ty a -> a
eval = foldTy id ()

Generics : run-time ADT for types with instances

Is it possible with Haskell / GHC, to extract an algebraic data type representing all types with Eq and Ord instances ? This would probably need Generics, Typeable, etc.
What I would like is something like :
data Data_Eq_Ord = Data_String String
| Data_Int Int
| Data_Bool Bool
| ...
deriving (Eq, Ord)
For all types known to have instances for Eq and Ord. If it makes the solution easier, we can limit our scope to Ord instances, since Eq is implied by Ord. But it would be interesting to know if constraints intersection is possible.
This data type would be useful because it gives the possibility to use it where Eq and Ord constraints are required, and pattern-match at runtime to refine on types.
I would need this to implement a generic Map Key Value, where Key would be this type, in a Document Indexing library, where the keys and their type is known at run-time. This library is here. For the moment I worked around the issue by defining a data DocIndexKey, and a FieldKey class, but this is not fully satisfactory since it requires boilerplate and can't cover all legit candidates.
Any good alternative approach to this situation is welcome. For some reasons, I prefer to avoid Template Haskell.

Well, it's not an ADT, but this definitely works:
data Satisfying c = forall a. c a => Satisfy a
class (l a, r a) => And l r a where
instance (l a, r a) => And l r a where
ex :: [Satisfying (Typeable `And` Show `And` Ord)]
ex = [ Satisfy (7 :: Int)
, Satisfy "Hello"
, Satisfy (5 :: Int)
, Satisfy [10..20 :: Int]
, Satisfy ['a'..'z']
, Satisfy ((), 'a')]
-- An example of use, with "complicated" logic
data With f c = forall a. c a => With (f a)
-- vvvvvvvvvvvvvvvvvvvvvvvvvv QuantifiedConstraints chokes on this, which is probably a bug...
partitionTypes :: (forall a. c a => TypeRep a) -> [Satisfying c] -> [[] `With` c]
partitionTypes rep = foldr go []
where go (Satisfy x) [] = [With [x]]
go x'@(Satisfy (x :: a)) (xs'@(With (xs :: [b])) : xss) =
case testEquality rep rep :: Maybe (a :~: b) of
Just Refl -> With (x : xs) : xss
Nothing -> xs' : go x' xss
main :: IO ()
main = traverse_ (\(With xs) -> print (sort xs)) $ partitionTypes typeRep ex
Exhaustivity is much harder. Perhaps with a plugin, you could get GHC to do it, but why bother? I don't believe GHC actually tries to keep track of what types it has seen. In particular, you'd have to scan all modules in the project and its dependencies, even those that haven't been loaded by the module containing the type definition. You'd have to implement it from the ground-up. And, as this answer shows, I very much doubt you would actually be able to use such exhaustivity for anything that you can't already do by just taking the open universe as it is.

Eq definition for alternative version numbering approach

I am trying to define Eq operator for alternative version numbering approach.
type VersionCompound = Maybe Int -- x, 0, 1, 2, ...
type VersionNumber = [VersionCompound] -- x.x, x.0, x.1, x.2, ... , 1.0, 1.1, 1.2, ... 1.x.x, 2.x.x, 3.x.x, ...
instance Eq VersionNumber where
[] == [] = True
(x:[]) == (y:[]) = x == y
(Nothing:xs) == ys = (xs == ys)
xs == (Nothing:ys) = (xs == ys)
It is expected that it returns True for following cases: x.x.x == x, 1.x.x == x.1.x.x, x.1 == 1, etc. But instead it returns an error:
VersionNumber.hs:58:34:
Overlapping instances for Eq [VersionCompound]
arising from a use of ‘==’
Matching instances:
instance Eq a => Eq [a] -- Defined in ‘GHC.Classes’
instance Eq VersionNumber -- Defined at VersionNumber.hs:55:10
In the expression: (xs == ys)
In an equation for ‘==’: (Nothing : xs) == ys = (xs == ys)
In the instance declaration for ‘Eq VersionNumber’
Any ideas how to fix it?
EDIT: My approach to this problem via pattern matching on lists turned out to be incomplete. I wanted to disregard any arbitrary list of x's (or Nothings) on the left hand side of the given version. So, for example, x.x.x.x.x would be equal to x.x.x and to x. Similarly, x.x.x.1 would be equal to x.x.1 and to 1. If there's an x in the middle, it won't be thrown away. So, for this case, x.x.1.x.0 would be equal to x.1.x.0 and 1.x.0. Yet another example: x.1.x.x.0.x is equal to 1.x.x.0.x and x.1.x.0.x is equal to 1.x.0.x (You just remove x's on the left side).
What I was struggling with after fixing an error Overlapping instances for Eq [VersionCompound] is how to get x.x.x == x -> True with pattern matching. But, as @WillemVanOnsem brilliantly noted, it should be achieved not via pattern matching, but with function composition.
PS. I personally encourage you to upvote the answer by @WillemVanOnsem because his solution is really elegant, required some effort to come up with and represents the essence of Haskell power.

You use type aliases. This means that you have not defined a separate type VersionNumber or VersionCompound; you simply have constructed an alias. Behind the curtains, Haskell sees VersionNumber as simply [Maybe Int].
Now if we look at Haskell's library, we see that:
instance Eq Int where
-- ...
instance Eq a => Eq (Maybe a) where
-- ...
instance Eq a => Eq [a] where
-- ...
So that means that Eq Int is defined, that Eq (Maybe Int) is defined as well, and thus that Eq [Maybe Int] is defined by the Haskell library as well. So you actually have already constructed an Eq VersionNumber without writing one. Now you try to write an additional one and, of course, the compiler gets confused with which one to pick.
There are ways to resolve overlapping instances, but this will probably only generate more trouble.
It is therefore better to define a data type with a single constructor. For example:
type VersionCompound = Maybe Int
data VersionNumber = VersionNumber [VersionCompound]
Now, since there is only one constructor, it is better to use a newtype:
type VersionCompound = Maybe Int
newtype VersionNumber = VersionNumber [VersionCompound]
and now we can define our special instance like:
instance Eq VersionNumber where
  (VersionNumber a) == (VersionNumber b) = a =*= b
    where [] =*= [] = True
          (x:[]) =*= (y:[]) = x == y
          (Nothing:xs) =*= ys = (xs =*= ys)
          xs =*= (Nothing:ys) = (xs =*= ys)
We thus unwrap the constructors and then use another locally defined function =*= that works like the one you defined in your question.
Mind, however, that you forgot a few patterns in your program, for instance Just x : xs on both the left and the right side. You should fix these first. If I run your code through a compiler, I get the following warnings:
Pattern match(es) are non-exhaustive
In an equation for ‘=*=’:
Patterns not matched:
[] (Just _:_)
[Just _] []
[Just _] (Just _:_:_)
(Just _:_:_) []
How you want to handle these cases is of course up to you. @DanielWagner suggests using:
import Data.Function(on)
import Data.Maybe(catMaybes)
instance Eq VersionNumber where
  (VersionNumber a) == (VersionNumber b) = on (==) catMaybes a b
This will filter out the Nothing values of both VersionNumbers and then check whether the remaining Just values form the same list. So 3.x.2.x.x.1 will be equal to 3.2.1 and x.x.x.x.x.3.2.1.
EDIT: based on your specification in the comments, you are probably looking for:
import Data.Function(on)
instance Eq VersionNumber where
  (VersionNumber a) == (VersionNumber b) = on (==) (dropWhile (Nothing ==)) a b
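Putting the pieces of this answer together, a minimal self-contained sketch might look as follows. It only combines the newtype and the dropWhile-based instance shown above; nothing beyond the question's stated rule (leading x's are ignored, x's in the middle are kept) is assumed.

```haskell
import Data.Function (on)

-- A single version component: Nothing stands for "x".
type VersionCompound = Maybe Int

-- A newtype, so that our instance does not overlap with Eq [Maybe Int].
newtype VersionNumber = VersionNumber [VersionCompound]

-- Two versions are equal when they agree after dropping the leading x's.
instance Eq VersionNumber where
  VersionNumber a == VersionNumber b =
    on (==) (dropWhile (Nothing ==)) a b
```

With this instance, x.x.x == x, 1.x.x == x.1.x.x and x.1 == 1 all hold, while 1.x.0 and 1.0 remain different because an x in the middle is not dropped.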

Infinite (finally-periodic) HList in Haskell

Let's say I have an infinite sequence of actions, each of which returns a result of a certain type. Something like:
newtype Stream a = Stream (IO (a, Stream a))
But with a varying over time. I want to strongly type this sequence. This obviously does not make sense for an arbitrary infinite sequence of types, and a naive approach such as:
data HStream :: [u] -> * where
  Cons :: Proxy x -> HStream xs -> HStream (x ': xs)
infiniteInt = Cons (Proxy :: Proxy Int) infiniteInt
will lead to an infinite type, which is not supported by Haskell's type system. But I don't see anything wrong with finally-periodic HLists (i.e. ones whose type sequence repeats itself from some point on: [Bool, Int, Int, String, Int, String, Int, String ... ]). I also think that if we had some strongly normalizing way to describe an infinite type, or some way to provide evidence of infinite type equality that can be checked in a finite number of steps, it should be possible to typecheck programs with such infinite types.
Does anyone have any idea how such types can be represented and used in Haskell? Let's start with infinite finally-periodic HLists for now, but I would also appreciate ideas on how this can be generalized to a wider class of infinite types and where the limits of that generalization lie.

Make HLists infinite and periodic with this One Cool Trick!
When you add an element to your periodic heterogeneous stream, don't extend the list of types by which it's indexed. Rotate it.
type family Append x xs where
  Append x '[] = '[x]
  Append x (y ': xs) = y ': Append x xs
infixr 5 :::
data HStream as where
  (:::) :: { headHS :: a, tailHS :: HStream (Append a as) } -> HStream (a ': as)
myHStream :: HStream '[Char, Bool, Int]
myHStream = 'c' ::: True ::: 3 ::: 'x' ::: False ::: -5 ::: myHStream
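To see how the rotation plays out when consuming such a stream, here is a hedged sketch. It restates the definitions above without the record syntax and with explicit kind annotations (both are my simplifications), and adds two hypothetical helpers, firstPeriod and nextPeriod, that are not part of the original answer.

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

-- Put x at the end of the type-level list.
type family Append (x :: Type) (xs :: [Type]) :: [Type] where
  Append x '[] = '[x]
  Append x (y ': xs) = y ': Append x xs

infixr 5 :::

-- Consing an element rotates the index list instead of extending it.
data HStream (as :: [Type]) where
  (:::) :: a -> HStream (Append a as) -> HStream (a ': as)

myHStream :: HStream '[Char, Bool, Int]
myHStream = 'c' ::: True ::: 3 ::: 'x' ::: False ::: (-5) ::: myHStream

-- Hypothetical helper: read one full period of a period-3 stream.
firstPeriod :: HStream '[a, b, c] -> (a, b, c)
firstPeriod (x ::: y ::: z ::: _) = (x, y, z)

-- Hypothetical helper: skip one period; after three rotations the
-- index list is back to '[a, b, c], so the type is unchanged.
nextPeriod :: HStream '[a, b, c] -> HStream '[a, b, c]
nextPeriod (_ ::: _ ::: _ ::: rest) = rest
```

Under these assumptions, firstPeriod myHStream gives ('c', True, 3) and firstPeriod (nextPeriod myHStream) gives ('x', False, -5).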

One general option is to switch from an HList, which encodes the types of all the elements, to a type-aligned list (or, more generally, a type-aligned sequence), which only ensures transitions along valid paths.
data TAList c x z where
  Nil :: TAList c x x
  Cons :: c x y -> TAList c y z -> TAList c x z
So you could encode your transitions, with some care, using a possibly large GADT for c and an appropriate kind of your choice for x and z. Infinite type-aligned lists are no problem, because they're polymorphic in their final type argument.
You could probably use a McBride-style indexing scheme instead of an Atkey one to get more flexibility, at the cost of more complexity.
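As a small illustration of what "type-aligned" means, the sketch below instantiates c with ordinary functions, so each step's output type must match the next step's input type. The choice of (->) for c and the helper runAll are my assumptions for the example; the answer itself suggests a custom GADT for c.

```haskell
{-# LANGUAGE GADTs #-}

-- A list of steps where neighbouring steps agree on the middle type.
data TAList c x z where
  Nil  :: TAList c x x
  Cons :: c x y -> TAList c y z -> TAList c x z

-- Hypothetical helper: run a type-aligned list of plain functions
-- from left to right.
runAll :: TAList (->) x z -> x -> z
runAll Nil x = x
runAll (Cons f rest) x = runAll rest (f x)
```

For example, runAll (Cons show (Cons length Nil)) (123 :: Int) first turns the number into the string "123" and then takes its length, giving 3. Only the endpoints Int and Int appear in the list's type; the intermediate String is hidden.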

Choosing the non-empty Monoid

I need a function which will choose a non-empty monoid. For a list this will mean the following behaviour:
> [1] `mor` []
[1]
> [1] `mor` [2]
[1]
> [] `mor` [2]
[2]
Now, I've actually implemented it, but I am wondering whether there is some standard alternative, because this seems to be a fairly common case. Unfortunately, Hoogle doesn't help.
Here's my implementation:
mor :: (Eq a, Monoid a) => a -> a -> a
mor a b = if a /= mempty then a else b

If your lists contain at most one element, they're isomorphic to Maybe, and for that there's the "first non empty" monoid: First from Data.Monoid. It's a wrapper around Maybe a values, and mappend returns the first Just value:
import Data.Monoid
main = do
  print $ (First $ Just 'a') <> (First $ Just 'b')
  print $ (First $ Just 'a') <> (First Nothing)
  print $ (First Nothing) <> (First $ Just 'b')
  print $ (First Nothing) <> (First Nothing :: First Char)
==> Output:
First {getFirst = Just 'a'}
First {getFirst = Just 'a'}
First {getFirst = Just 'b'}
First {getFirst = Nothing}
Conversion [a] -> Maybe a is achieved using Data.Maybe.listToMaybe.
On a side note: this one does not constrain the typeclass of the wrapped type; in your question, you need an Eq instance to compare for equality with mempty. This comes at the cost of having the Maybe type, of course.
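The conversion through listToMaybe can be combined with First as in the following sketch. The helper name firstHead is hypothetical, and the sketch assumes, as this answer does, that only the first element of each list matters.

```haskell
import Data.Maybe (listToMaybe)
import Data.Monoid (First (..))

-- Hypothetical helper: head of the first non-empty list, if any.
-- No Eq constraint is needed, because First only inspects the Maybe.
firstHead :: [a] -> [a] -> Maybe a
firstHead xs ys = getFirst (First (listToMaybe xs) <> First (listToMaybe ys))
```

Here firstHead [1] [] and firstHead [1] [2] both give Just 1, firstHead [] [2] gives Just 2, and two empty lists give Nothing.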

[This is really a long comment rather than an answer]
In my comment, when I said "monoidal things have no notion of introspection", I meant that you can't perform analysis (pattern matching, equality, <, >, etc.) on monoids. This is obvious of course: the API of Monoid is only a unit (mempty) and an operation mappend (more abstractly <>) that takes two monoidal things and returns one. The definition of mappend for a type is free to use case analysis, but afterwards all you can do with monoidal things is use the Monoid API.
It's something of a folklore in the Haskell community to avoid inventing things, preferring instead to use objects from mathematics and computer science (including functional programming history). Combining Eq (which needs analysis of its arguments) and Monoid introduces a new class of things: monoids that support enough introspection to allow equality. At this point there is a reasonable argument that an Eq-Monoid thing goes against the spirit of its Monoid superclass (Monoids are opaque). As this is both a new class of objects and potentially contentious, a standard implementation won't exist.

First, your mor function looks rather suspicious because it requires a Monoid but never uses mappend, and so it is significantly more constrained than necessary.
mor :: (Eq a, Monoid a) => a -> a -> a
mor a b = if a /= mempty then a else b
You could accomplish the same thing with merely a Default constraint:
import Data.Default
mor :: (Eq a, Default a) => a -> a -> a
mor a b = if a /= def then a else b
and I think that any use of Default should also be viewed warily because, as I believe many Haskellers complain, it is a class without principles.
My second thought is that it seems that the data type you're really dealing with here is Maybe (NonEmpty a), not [a], and the Monoid you're actually talking about is First.
import Data.Monoid
morMaybe :: Maybe a -> Maybe a -> Maybe a
morMaybe x y = getFirst (First x <> First y)
And so then we could use that with lists, as in your example, under the (nonEmpty, maybe [] toList) isomorphism between [a] and Maybe (NonEmpty a):
import Data.List.NonEmpty (nonEmpty, toList)
morList :: [t] -> [t] -> [t]
morList x y = maybe [] toList (nonEmpty x `morMaybe` nonEmpty y)
λ> morList [1] []
[1]
λ> morList [] [2]
[2]
λ> morList [1] [2]
[1]
(I'm sure that somebody more familiar with the lens library could provide a more impressive concise demonstration here, but I don't immediately know how.)
You could extend Monoid with a predicate to test whether an element is an identity.
class Monoid a => TestableMonoid a
  where
    isMempty :: a -> Bool
    morTestable :: a -> a -> a
    morTestable x y = if isMempty x then y else x
Not every monoid can have an instance of TestableMonoid, but plenty (including list) can.
instance TestableMonoid [a]
  where
    isMempty = null
We could even then write a newtype wrapper with a Monoid:
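newtype Mor a = Mor { unMor :: a }
-- Since GHC 8.4, Semigroup is a superclass of Monoid, so (<>) is defined there.
instance TestableMonoid a => Semigroup (Mor a)
  where
    Mor x <> Mor y = Mor (morTestable x y)
instance TestableMonoid a => Monoid (Mor a)
  where
    mempty = Mor mempty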
λ> unMor (Mor [1] <> Mor [])
[1]
λ> unMor (Mor [] <> Mor [2])
[2]
λ> unMor (Mor [1] <> Mor [2])
[1]
So that leaves open the question of whether the TestableMonoid class deserves to exist. It certainly seems like a more "algebraically legitimate" class than Default, at least, because we can give it a law that relates it to Monoid:
isMempty x iff mappend x = id
But I do question whether this actually has any common use cases. As I said earlier, the Monoid constraint is superfluous for your use case because you never mappend. So we should ask, then, can we envision a situation in which one might need both mappend and isMempty, and thus have a legitimate need for a TestableMonoid constraint? It's possible I'm being shortsighted here, but I can't envision a case.
I think this is because of something Stephen Tetley touched on when he said that this "goes against the spirit of its Monoid." Tilt your head at the type signature of mappend with a slightly different parenthesization:
mappend :: a -> (a -> a)
mappend is a mapping from members of a set a to functions a -> a. A monoid is a way of viewing values as functions over those values. The monoid is a view of the world of a only through the window of what these functions let us see. And functions are very limited in what they let us see. The only thing we ever do with them is apply them. We never ask anything else of a function (as Stephen said, we have no introspection into them). So although, yes, you can bolt anything you want onto a subclass, in this case the thing we're bolting on feels very different in character from the base class we are extending, and it feels unlikely that there would be much intersection between the use cases of functions and the use cases of things that have general equality or a predicate like isMempty.
So finally I want to come back around to the simple and precise way to write this: write code at the value level and stop worrying about classes. You don't need Monoid and you don't need Eq; all you need is an additional argument:
morSimple :: (t -> Bool) -- ^ Determine whether a value should be discarded
          -> t -> t -> t
morSimple f x y = if f x then y else x
λ> morSimple null [1] []
[1]
λ> morSimple null [1] [2]
[1]
λ> morSimple null [] [2]
[2]
