Haskell Ord instance with a Set - haskell

I have some code that I would like to use to append an edge to a Node data structure:
import Data.Set (Set)
import qualified Data.Set as Set
data Node = Vertex String (Set Node)
  deriving Show
addEdge :: Node -> Node -> Node
addEdge (Vertex name neighbors) destination
  | Set.null neighbors = Vertex name (Set.singleton destination)
  | otherwise = Vertex name (Set.insert destination neighbors)
However when I try to compile I get this error:
No instance for (Ord Node)
arising from a use of `Set.insert'
As far as I can tell, Set.insert expects nothing but a value and a set to insert it into. What is this Ord?

In GHCi:
> import Data.Set
> :t insert
insert :: (Ord a) => a -> Set a -> Set a
So yes, it does expect Ord. As for what Ord means, it's a type class for ordered values. It's required in this case because Data.Set uses a search tree, and so needs to be able to compare values to see which is larger or if they're equal.
Nearly all of the standard built-in data types are instances of Ord, and types such as lists, tuples, and Maybe are instances of Ord whenever their type parameters are. The most notable exception is functions, for which no sensible concept of ordering (or even equality) can be defined.
In many cases, you can automatically create instances of type classes for your own data types using a deriving clause after the declaration:
data Foo a = Foo a a Int deriving (Eq, Ord, Show, Read)
For parameterized types, the automatic derivation depends on the type parameter also being an instance, as is the case with lists, tuples, and such.
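As a minimal sketch of what the derived instances provide, assuming nothing beyond the Foo declaration above (the names largest and parsed are made up for illustration): derived Ord compares constructor fields from left to right, and each instance exists only when the parameter type has the matching instance.

```haskell
-- Deriving works because every field type (a and Int) supports the
-- class in question once `a` itself has the matching instance.
data Foo a = Foo a a Int
  deriving (Eq, Ord, Show, Read)

-- Derived Ord compares fields left to right (lexicographically).
-- `largest` is a hypothetical helper, not a library function.
largest :: Ord a => [Foo a] -> Foo a
largest = maximum

-- Derived Read parses the same syntax that derived Show prints.
-- `parsed` is likewise only an illustration.
parsed :: Foo Int
parsed = read "Foo 1 2 3"
```

Note that `largest` needs the `Ord a` constraint for the same reason the derived instance does: comparing two Foo values means comparing the values inside them.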
Besides Ord, some important type classes are Eq (equality comparisons, but not less/greater than), Enum (types you can enumerate values of, such as counting Integers), and Read/Show (simple serialization/deserialization with strings). To learn more about type classes, try this chapter in Real World Haskell or, for a more general overview, there's a Wikipedia article.

Haskell sets are based on a search tree. In order to put an element in a search tree an ordering over the elements must be given. You can derive Ord just like you are deriving Show by adding it to your data declaration, i.e.:
data Node = Vertex String (Set Node)
  deriving (Show, Eq, Ord)
You can see the requirement of Ord by the signature of Data.Set.insert
(Ord a) => a -> Set a -> Set a
The part (Ord a) => establishes a constraint: there must be an instance of the type class Ord for a. The section on type classes in the Haskell tutorial gives a more thorough explanation.
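Putting the pieces together, here is a sketch of the question's code with the derived instances added. The guard on Set.null is dropped on the assumption that it was only there for safety, since Set.insert already handles the empty set; neighborNames is a hypothetical helper added for demonstration.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Deriving Eq and Ord gives Set the comparison it needs to place
-- Node values in its search tree.
data Node = Vertex String (Set Node)
  deriving (Show, Eq, Ord)

-- Set.insert works on empty and non-empty sets alike, so no special
-- case is needed.
addEdge :: Node -> Node -> Node
addEdge (Vertex name neighbors) destination =
  Vertex name (Set.insert destination neighbors)

-- Hypothetical helper: names of the direct neighbors, in set order.
neighborNames :: Node -> [String]
neighborNames (Vertex _ neighbors) =
  [name | Vertex name _ <- Set.toList neighbors]
```

Because a Set holds each element once, adding the same destination twice leaves the node unchanged.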

Related

True isomorphisms in Haskell

Are the following assertions true:
The only real isomorphism, accessible programmatically to the user, verified by the Haskell type system, and that the Haskell compiler is/can be made aware of, is between:
the set of values of a Haskell datatype
the set of values of the types required by its constructors
Even generic programming can't produce a "true" isomorphism, whose composition results at run time in an identity (thus staged-sop - and similarly in OCaml)
Haskell itself produces only one kind of isomorphism, Coercible, but those isomorphisms are restricted to the identity isomorphism
By "real isomorphism, accessible programmatically to the user, verified by the Haskell type system, and that the Haskell compiler is/can be made aware of" I mean a pair of functions u : a -> b and v : b -> a such that Haskell knows (by being informed or otherwise) that u.v=id and v.u=id. This is just like how it knows (at compile time) how to rewrite some code to do "fold fusion", which amounts to recognizing the pattern and applying the rewrite at once.
Look into Homotopy Type Theory/Cubical Agda, where "equality is isomorphism". I am not familiar enough with it to know what happens operationally; even if Agda knows isomorphic types are equal, I still think your "true isomorphism" (i.e. with a proof and fusion) is too tall an order.
In GHC it is possible to derive via "isomorphisms", but we need to wait for dependent types to properly verify isomorphisms in the type system. Even so, they can be used to produce bona fide code, even if you have to do some work operationally.
You already mentioned "representational equality" (Coercible) but it is worth discussing it. It underpins the two coerce-based deriving strategies: GeneralizedNewtypeDeriving and DerivingVia which generalizes GND.
GND is the simplest way to turn an isomorphism (Coercible USD Int) into code:
type USD :: Type
newtype USD = MkUSD Int
  deriving
  newtype (Eq, Ord, Show, Num)
Operationally, coerce is zero-cost, so these derived instances incur no cost at run time. This is the only way you will get what you want in Haskell.
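A self-contained sketch of the same idea, assuming only base: the instances are borrowed from Int through the newtype strategy, and coerce converts whole containers without traversing them. The functions total and toInts are hypothetical names introduced for illustration.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Coerce (coerce)

-- USD reuses Int's instances via the `newtype` deriving strategy.
newtype USD = MkUSD Int
  deriving newtype (Eq, Ord, Show, Num)

-- Hypothetical helper: Num USD comes from Int, so sum just works.
total :: [USD] -> USD
total = sum

-- Hypothetical helper: coerce re-types the list in place; no
-- per-element work is done at run time.
toInts :: [USD] -> [Int]
toInts = coerce
```

Since Show is also derived with the newtype strategy, a USD value prints exactly like the underlying Int, without the MkUSD constructor.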
Isomorphisms can also be done through user-defined type classes.
An instance of Representable f means f is (naturally) isomorphic to functions from its representing object (Rep f ->). The newtype Co uses this isomorphism to derive the function instances for representable functors. A Pair a of two values is represented by Bool, and is thus isomorphic to Bool -> a.
This isomorphism lets Pair derive Functor, Applicative and Monad by roundtripping through (Bool ->):
type Pair :: Type -> Type
data Pair a = a :# a
  deriving (Functor, Applicative, Monad)
  via Co Pair
instance Distributive Pair where
  distribute :: Functor f => f (Pair a) -> Pair (f a)
  distribute = distributeRep
instance Representable Pair where
  type Rep Pair = Bool
  index :: Pair a -> (Bool -> a)
  index (false :# true) = \case
    False -> false
    True -> true
  tabulate :: (Bool -> a) -> Pair a
  tabulate make = make False :# make True
When you derive Generic/Generic1 the compiler generates an isomorphism between a generic type and its generic representation Rep/Rep1 (not to be confused with the representing object Rep from the above example).
The class laws state that to/from and to1/from1 witness that isomorphism. The type system does not enforce these laws but if you derive them they should hold.
They are the main way to define generic implementations in Haskell. I recently introduced two newtypes Generically and Generically1 to base, as standard names for generic behaviour (use generic-data until the next GHC release). You can derive a generic isomorphism and programmatically use it in the next line without leaving the data declaration:
type Lists :: Type -> Type
data Lists a = Lists [a] [a] [a]
  deriving
  stock (Show, Generic, Generic1)
  deriving (Semigroup, Monoid)
  via Generically (Lists a)
  deriving (Functor, Applicative, Alternative)
  via Generically1 Lists
>> mempty @(Lists _)
Lists [] [] []
>> empty @Lists
Lists [] [] []
>> Lists "a" "b" "c" <> Lists "!" "." "?"
Lists "a!" "b." "c?"
>> pure @Lists 'a'
Lists "a" "a" "a"
You will, however, have to pay the conversion cost. It's not as simple as adding {-# Rules "to/from" to . from = id #-}, because the actual instances will contain intermediate terms like to (from a <> from b). Even with your "true isomorphisms", GHC could not fuse away the conversion, since it's not of the form to . from.
There is also a library iso-deriving (blog) that allows deriving via arbitrary isomorphisms.

Can trees be generalized to allow any traversable sub-tree?

Data.Tree uses a list to represent the subtree rooted at a particular node. Is it possible to have two tree types, for example one which uses a list and another which uses a vector? I want to be able to write functions which don't care how the sub-tree is represented concretely, only that the subtree is traversable, as well as functions which take advantage of a particular subtree type, e.g. fast indexing into vectors.
It seems like type families would be the right tool for the job, though I've never used them before and I have no idea how to actually define the right family.
If it matters, I'm not using the containers library tree instance, but instead I have types
data Tree a b = Node a b [Tree a b] deriving (Show, Foldable, Generic)
and
data MassivTree a b = V a b (Array B Ix1 (MassivTree a b))
where the latter uses vectors from massiv.
You could use a typeclass - in fact the typeclass you need probably already exists.
Consider this:
data Tree t a = Tree a (t (Tree t a))
The argument t is a higher-kinded type that represents the container holding the subtrees.
Now define a set of Tree operations, constrained on a class such as Foldable, like so:
:: (Foldable t) => Tree t a -> b
And you can now create and manipulate trees that use any Foldable. You would need to choose the right typeclass for the set of operations you want - Functor may be enough, or you may want Traversable if you are doing anything with monadic actions. You can choose the typeclass on a per-function basis, depending on what it does.
You can now define Tree types like so:
type ListTree a = Tree [] a
type MassivTree r ix a = Tree (Array r ix) a
You can also define instance-specific functions, with access to a full range of functionality:
:: ListTree a -> b
-- or
:: Tree [] a -> b
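The approach above can be sketched as follows. To keep the example free of external packages, Maybe stands in for a second container type where the question would use a massiv array; size, flatten, mapTree, childAt and MaybeTree are hypothetical names introduced for illustration.

```haskell
-- A rose tree whose children live in an arbitrary container t.
data Tree t a = Tree a (t (Tree t a))

type ListTree a = Tree [] a
-- Stand-in for a differently backed tree (at most one child here).
type MaybeTree a = Tree Maybe a

-- Needs only Foldable on the child container.
size :: Foldable t => Tree t a -> Int
size (Tree _ kids) = 1 + foldr (\kid n -> size kid + n) 0 kids

-- Pre-order listing of all labels; again only Foldable is required.
flatten :: Foldable t => Tree t a -> [a]
flatten (Tree x kids) = x : concatMap flatten kids

-- Relabelling needs Functor rather than Foldable.
mapTree :: Functor t => (a -> b) -> Tree t a -> Tree t b
mapTree f (Tree x kids) = Tree (f x) (fmap (mapTree f) kids)

-- Container-specific function: uses list operations directly.
childAt :: Int -> ListTree a -> Maybe (ListTree a)
childAt i (Tree _ kids)
  | i >= 0, (kid : _) <- drop i kids = Just kid
  | otherwise = Nothing
```

Each function asks for the weakest class it needs, so the generic ones work for both ListTree and MaybeTree, while childAt is free to rely on list-specific behaviour.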
Happy Haskelling!

Is it possible to establish Coercible instances between custom types and standard library ones?

For a simple example, say I want a type to represent tic-tac-toe marks:
data Mark = Nought | Cross
Which is the same as Bool
Prelude> :info Bool
data Bool = False | True -- Defined in ‘GHC.Types’
But there's no Coercible Bool Mark instance between them, not even if I import GHC.Types (I first thought GHC might need Bool's defining module to be visible). The only way to get this instance seems to be through a newtype.
I could probably define newtype Mark = Mark Bool and define Nought and Cross as bidirectional patterns, but I wish there were something simpler than that.
Unfortunately, you're out of luck. As the documentation for Data.Coerce explains, "one can pretend that the following three kinds of instances exist:"
Self-instances, as in instance Coercible a a,
Instances for coercing between two versions of a data type that differ by representational or phantom type parameters, as in instance Coercible a a' => Coercible (Maybe a) (Maybe a'), and
Instances between a newtype and the type it wraps.
Furthermore, "Trying to manually declare an instance of Coercible is an error", so that's all you get. There are no instances between arbitrarily different data types, even if they look similar.
This may seem frustratingly limiting, but consider this: if there were a Coercible instance between Bool and Mark, what's stopping it from coercing Nought to True and Cross to False? It may be that Bool and Mark are represented in memory the same way, but there is no guarantee that they are semantically similar enough to warrant a Coercible instance.
Your solution of using a newtype and pattern synonyms is a great, safe way to get around the problem, even if it is a little annoying.
Another option is to consider using Generic. For instance, check out the idea of genericCoerce from this other question
This isn’t possible yet, and pattern synonyms are a good solution for now. I often use code like this to derive useful instances for a type that happens to be isomorphic to an existing primitive type.
module Mark
  ( Mark(Nought, Cross)
  ) where
newtype Mark = Mark Bool
  deriving stock (…)
  deriving newtype (…)
  deriving (…) via Any
  …
pattern Nought = Mark False
pattern Cross = Mark True
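For a version that compiles on its own, here is a minimal sketch of the newtype-plus-pattern-synonyms approach. It does not try to fill in the elided deriving clauses above: the choice of derived classes, the hand-written Show instance, and the helpers marksToBools, boolsToMarks and other are all assumptions made for illustration.

```haskell
{-# LANGUAGE PatternSynonyms #-}

import Data.Coerce (coerce)

-- Mark is a newtype over Bool, so Coercible Mark Bool holds wherever
-- the Mark constructor is in scope.
newtype Mark = Mark Bool
  deriving (Eq, Ord)

pattern Nought :: Mark
pattern Nought = Mark False

pattern Cross :: Mark
pattern Cross = Mark True

-- Tells the exhaustiveness checker that these two patterns cover Mark.
{-# COMPLETE Nought, Cross #-}

-- Written by hand so values print with the synonym names.
instance Show Mark where
  show Nought = "Nought"
  show Cross  = "Cross"

-- Hypothetical helpers: zero-cost conversions of whole lists.
marksToBools :: [Mark] -> [Bool]
marksToBools = coerce

boolsToMarks :: [Bool] -> [Mark]
boolsToMarks = coerce

-- Hypothetical helper showing ordinary pattern matching on the synonyms.
other :: Mark -> Mark
other Nought = Cross
other Cross  = Nought
```

Exporting only the pattern synonyms and not the Mark constructor, as in the module header above, keeps the Bool representation hidden from users while coerce remains available inside the module.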
Coercion between unrelated ADTs is also not on the list of permitted unsafe coercions. Last I knew, in practice in GHC, coercions between Mark and Bool will work only if the values in question are fully evaluated, because they have a small number of constructors, so the constructor index is stored in the tag bits of the pointer at runtime. But an arbitrary thunk of type Mark or Bool can’t be coerced reliably, and the method doesn’t generalise to types with more than {4, 8} constructors (on resp. {32, 64}-bit systems).
Moreover, the code generator and runtime representation of objects both change periodically, so even if this works now (I don’t know), it will probably break in the future anyway.
My hope is that we get a generalised Coercible in the future that can accommodate more coercions than just newtype-of-T ↔ T, or even better, that allows us to specify a stable ABI for a data type. To my knowledge, no one is actively working on that in Haskell, although there is some similar work going on in Rust for safe transmute, so maybe someone will smuggle it back over to functional-land.
(Speaking of ABI, you could use the FFI for this, and I’ve done so in circumstances where I was already writing foreign code and knew the Storable instances matched. alloca a suitably sized buffer, poke a value of type Bool into it, castPtr the Ptr Bool into a Ptr Mark, peek the Mark out of it, and unsafePerformIO the whole shebang.)
Coercible Bool Mark is not required. Mark-instances can be derived via Bool without it.
Generic types whose generic representations (Rep) are Coercible can be converted to each other:
   from           coerce              to
A -----> Rep A () -----> Rep Via () -----> Via
For the datatype Mark this means instances (Eq, ..) can be derived via instances of Bool.
type Mark :: Type
data Mark = Nought | Cross
  deriving
  stock Generic
  deriving Eq
  via Bool <-> Mark
How does Bool <-> Mark work?
type (<->) :: Type -> Type -> Type
newtype via <-> a = Via a
First we capture the constraint that we can coerce between the generic representation of two types:
type CoercibleRep :: Type -> Type -> Constraint
type CoercibleRep via a = (Generic via, Generic a, Rep a () `Coercible` Rep via ())
Given this constraint we can move from a to its via type, creating intermediate Reps:
translateTo :: forall b a. CoercibleRep a b => a -> b
translateTo = from @a @() >>> coerce >>> to @b @()
Now we can easily write an Eq instance for this type. We assume an Eq via instance for the via type (Bool in our case):
instance (CoercibleRep via a, Eq via) => Eq (via <-> a) where
  (==) :: (via <-> a) -> (via <-> a) -> Bool
  Via a1 == Via a2 = translateTo @via a1 == translateTo @via a2
The instance for Semigroup requires translating via back to a:
instance (CoercibleRep via a, Semigroup via) => Semigroup (via <-> a) where
  (<>) :: (via <-> a) -> (via <-> a) -> (via <-> a)
  Via a1 <> Via a2 = Via do
    translateTo @a do
      translateTo @via a1 <> translateTo @via a2
Now we can derive Eq and Semigroup!
-- >> V4 "a" "b" "c" "d" <> V4 "!" "!" "!" "!"
-- V4 "a!" "b!" "c!" "d!"
type V4 :: Type -> Type
data V4 a = V4 a a a a
  deriving
  stock Generic
  deriving (Eq, Semigroup)
  via (a, a, a, a) <-> V4 a
Using a newtype from the beginning avoids this boilerplate but once it's up it can be reused. It is simple to write a newtype and use pattern synonyms to cover it up.

Retaining list-ness operations for a data type

I want to define a custom data type, as opposed to using a type alias, to ensure that proper values are being passed around. Below is a sketch of how that might look:
module Example (fromList) where
import Data.Ord (comparing, Down(..))
import Data.List (sort)
data DictEntry = DictEntry (String, Integer) deriving (Show, Eq)
instance Ord DictEntry where
  (DictEntry (word1, freq1)) `compare` (DictEntry (word2, freq2))
    | freq1 == freq2 = word1 `compare` word2
    | otherwise = comparing Down freq1 freq2
data Dictionary = Dictionary [DictEntry] deriving (Show)
fromList :: [(String, Integer)] -> Dictionary
fromList l = Dictionary $ sort $ map DictEntry l
However, I'd also like to retain the "list-ness" of the underlying type without having to unwrap and re-wrap [DictEntry], and without having to define utility functions such as head :: Dictionary -> DictEntry and tail :: Dictionary -> Dictionary. Is that possible? Is there some type class that I could define an instance of or a language extension that enables this?
Never use head and avoid using tail, for lists or anything else. These functions are partial (they crash on an empty list) and can always easily be replaced with pattern matching.
But yes, there is a type class that supports list-like operations, or rather multiple classes. The simplest of these is Monoid, which just implements concatenation and empty-initialisation. Foldable allows you to deconstruct containers as if they were lists. Traversable additionally allows you to assemble them again as you go over the data.
The latter two won't quite work with Dictionary because it's not parametric on the contained type. You can circumvent that by switching to the “monomorphic version”.
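As a rough sketch of the Monoid route and of pattern matching in place of head and tail, under a few assumptions: the wrappers are newtypes here rather than data declarations, the merge simply re-sorts the combined list, and unconsDict is a hypothetical name.

```haskell
import Data.List (sort)
import Data.Ord (Down (..), comparing)

newtype DictEntry = DictEntry (String, Integer)
  deriving (Show, Eq)

-- Higher frequency first; ties are broken alphabetically by word.
instance Ord DictEntry where
  compare (DictEntry (w1, f1)) (DictEntry (w2, f2)) =
    comparing Down f1 f2 <> compare w1 w2

newtype Dictionary = Dictionary [DictEntry]
  deriving (Show, Eq)

-- Concatenation that keeps the entries sorted.
instance Semigroup Dictionary where
  Dictionary xs <> Dictionary ys = Dictionary (sort (xs ++ ys))

instance Monoid Dictionary where
  mempty = Dictionary []

fromList :: [(String, Integer)] -> Dictionary
fromList = Dictionary . sort . map DictEntry

-- Total replacement for head/tail: Nothing on an empty dictionary.
unconsDict :: Dictionary -> Maybe (DictEntry, Dictionary)
unconsDict (Dictionary [])       = Nothing
unconsDict (Dictionary (e : es)) = Just (e, Dictionary es)
```

Returning a Maybe forces callers to handle the empty case explicitly, which is exactly what head and tail fail to do.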
However, I frankly don't think you should do any of this – just use the standard Map type to store key-value associative data, instead of rolling your own dictionary type.

Equal class misunderstanding

I have my own data type to represent nodes and edges of a graph as follows:
data Node a = Node a deriving (Show, Eq)
data Label a = Label a deriving (Show)
data Cost = CostI Int | CostF Float deriving (Show)
data Edge label node = Edge (Label label, (Node node,Node node), Cost) deriving (Show)
Now, I create a function to check whether an edge contains 2 nodes or not as follows:
isEdge :: (Eq n) => (Edge l n) -> (Node n, Node n) -> Bool
isEdge (Edge (_, (n1,n2), _)) (n3, n4) = result
  where result = (n1 == n3) && (n2 == n4)
The function works well. The problem is that if I remove (Eq n) from its signature, it fails to compile. Why is that, even though in the declaration above I declared Node as deriving from the Eq class?
data Node a = Node a deriving (Show, Eq)
The Eq instance GHC derives for Node a is something like this:
instance Eq a => Eq (Node a) where
  (Node x) == (Node y) = x == y
  (Node x) /= (Node y) = x /= y
You can view the generated code by compiling with -ddump-deriv. The Eq a constraint is needed because comparing two Nodes means comparing the values they wrap. So, GHC couldn't infer an instance of Eq for, say, Node (a -> b), since functions can't be compared.
However, the fact that GHC can't infer an instance of Eq for Node a for some a doesn't mean it will stop you from constructing values of type Node a where a isn't an equality type.
If you wanted to stop people from constructing non-comparable Nodes, you could try putting a constraint like this:
data Eq a => Node a = Node a deriving (Eq, Show)
But now GHC tells us we need a compiler pragma:
Illegal datatype context (use -XDatatypeContexts): Eq a =>
OK, let's add it to the top of our file:
{-# LANGUAGE DatatypeContexts #-}
Now compile:
/tmp/foo.hs:1:41: Warning: -XDatatypeContexts is deprecated: It was widely
considered a misfeature, and has been removed from the Haskell language.
The problem is that now every function using Nodes will need an Eq class constraint, which is annoying (your functions still need the constraint!). (Also, if your user wants to create Nodes using a non-equality type but never tests them for equality, what's the problem?)
There's actually a way to get GHC to do what you want, however: Generalized Algebraic Data Types (GADTs):
{-# LANGUAGE GADTs, StandaloneDeriving #-}
data Node a where
  Node :: Eq a => a -> Node a
This looks just like your original definition, except that it emphasizes the Node value constructor (the one formerly on the right hand side of the data declaration) is just a function, which you can add constraints to. Now GHC knows that only equality types can be put into Nodes, and unlike our earlier attempted solution, we can make new functions that don't need a constraint:
fromNode :: Node a -> a
fromNode (Node x) = x
We can still derive Eq and Show instances, but with a slightly different syntax:
deriving instance Eq (Node a)
deriving instance Show (Node a)
(Hence the StandaloneDeriving pragma above.)
For this to work, GHC also requires us to add a Show constraint to our GADT (if you look at the generated code again, you'll see the constraints are now gone):
data Node a where
  Node :: (Eq a, Show a) => a -> Node a
And now we can take the Eq constraint off isEdge, since GHC can infer it!
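Assembled into one file, the GADT version might look like the sketch below. It reuses the question's Label, Cost and Edge types; leaving the Show instance off Edge is a simplification made here, not something the answer prescribes.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

-- The constructor carries the Eq and Show dictionaries, so pattern
-- matching on Node brings both constraints into scope.
data Node a where
  Node :: (Eq a, Show a) => a -> Node a

deriving instance Eq (Node a)
deriving instance Show (Node a)

data Label a = Label a deriving (Show)
data Cost = CostI Int | CostF Float deriving (Show)
data Edge label node = Edge (Label label, (Node node, Node node), Cost)

-- No (Eq n) constraint: Eq (Node n) now holds for every n.
isEdge :: Edge l n -> (Node n, Node n) -> Bool
isEdge (Edge (_, (n1, n2), _)) (n3, n4) = n1 == n3 && n2 == n4
```

The constraint has not disappeared; it has moved to the point where a Node is constructed, so something like Node id (wrapping a function) is now rejected by the type checker.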
(This is definitely overkill for such a simple situation -- again, if people want to construct nodes with functions inside them, why shouldn't they? However, GADTs are extremely useful in pretty similar situations when you want to enforce certain properties of your data types. See a cool example).
EDIT (from the future): you can also write
data Node a = (Eq a, Show a) => Node a
but you still need to enable GADT extensions and derive instances separately. See this thread.
When you add a deriving clause to a data declaration, the derived instance will include any necessary constraints for the type variables in scope at the declaration. In this case, deriving Eq will create essentially the following instance:
instance Eq a => Eq (Node a) where
  (Node a) == (Node b) = a == b
  (Node a) /= (Node b) = a /= b
Any derived Eq instance will depend upon the Eq instance of types that appear to the right of the data constructor.
This is because there's really no other way to derive an Eq instance automatically. Two values are equal if they have the same type and all their components are equal. So you need to be able to test the components for equality. In order to generically test a polymorphic component for equality, you need an Eq instance.
This is true not just for Eq, but for all the derived classes. For example this code
toStr :: Edge l n -> String
toStr = show
won't work without adding the constraint (Show l, Show n). Without that constraint, the function to show an Edge doesn't know what to call to show its internal Labels and Nodes.
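For comparison, a sketch of the version that does compile, using the question's original (non-GADT) declarations with the two Show constraints added to the signature:

```haskell
data Node a = Node a deriving (Show, Eq)
data Label a = Label a deriving (Show)
data Cost = CostI Int | CostF Float deriving (Show)
data Edge label node = Edge (Label label, (Node node, Node node), Cost)
  deriving (Show)

-- The constraints tell the compiler which show functions to use for
-- the label and node payloads inside the Edge.
toStr :: (Show l, Show n) => Edge l n -> String
toStr = show
```

Callers supply the concrete instances: applying toStr to an Edge String Int picks up Show String and Show Int automatically.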

Resources