Confused about custom data types in Haskell

Confused about custom data types in Haskell - haskell

The task: I am trying to create a custom data type and have it able to print to the console. I also want to be able to sort it using Haskell's natural ordering.
The issue: Write now, I can't get this code to compile. It throws the following error: No instance for (Show Person) arising from a use of 'print'.
What I have so far:
-- Omitted working selection-sort function
selection_sort_ord :: (Ord a) => [a] -> [a]
selection_sort_ord xs = selection_sort (<) xs
data Person = Person {
first_name :: String,
last_name :: String,
age :: Int }
main :: IO ()
main = print $ print_person (Person "Paul" "Bouchon" 21)

You need a Show instance to convert the type to a printable representation (a String). The easiest way to obtain one is to add
deriving Show
to the type definition.
data Person = Person {
first_name :: String,
last_name :: String,
age :: Int }
deriving (Eq, Ord, Show)
to get the most often needed instances.
If you want a different Ord instance, as suggested in the comments, instead of deriving that (keep deriving Eq and Show unless you want different behaviour for those), provide an instance like
instance Ord Person where
compare p1 p2 = case compare (age p1) (age p2) of
EQ -> case compare (last_name p1) (last_name p2) of
EQ -> compare (first_name p1) (first_name p2)
other -> other
unequal -> unequal
or use pattern matching in the definition of compare if you prefer,
compare (Person first1 last1 age1) (Person first2 last2 age2) =
case compare age1 age2 of
EQ -> case compare last1 last2 of
EQ -> compare first1 first2
other -> other
unequal -> unequal
That compares according to age first, then last name, and finally, if needed, first name.

Related

Deriving Eq and Show for an ADT that contains fields that can't have Eq or Show

I'd like to be able to derive Eq and Show for an ADT that contains multiple fields. One of them is a function field. When doing Show, I'd like it to display something bogus, like e.g. "<function>"; when doing Eq, I'd like it to ignore that field. How can I best do this without hand-writing a full instance for Show and Eq?
I don't want to wrap the function field inside a newtype and write my own Eq and Show for that - it would be too bothersome to use like that.

One way you can get proper Eq and Show instances is to, instead of hard-coding that function field, make it a type parameter and provide a function that just “erases” that field. I.e., if you have
data Foo = Foo
{ fooI :: Int
, fooF :: Int -> Int }
you change it to
data Foo' f = Foo
{ _fooI :: Int
, _fooF :: f }
deriving (Eq, Show)
type Foo = Foo' (Int -> Int)
eraseFn :: Foo -> Foo' ()
eraseFn foo = foo{ fooF = () }
Then, Foo will still not be Eq- or Showable (which after all it shouldn't be), but to make a Foo value showable you can just wrap it in eraseFn.

Typically what I do in this circumstance is exactly what you say you don’t want to do, namely, wrap the function in a newtype and provide a Show for that:
data T1
{ f :: X -> Y
, xs :: [String]
, ys :: [Bool]
}
data T2
{ f :: OpaqueFunction X Y
, xs :: [String]
, ys :: [Bool]
}
deriving (Show)
newtype OpaqueFunction a b = OpaqueFunction (a -> b)
instance Show (OpaqueFunction a b) where
show = const "<function>"
If you don’t want to do that, you can instead make the function a type parameter, and substitute it out when Showing the type:
data T3' a
{ f :: a
, xs :: [String]
, ys :: [Bool]
}
deriving (Functor, Show)
newtype T3 = T3 (T3' (X -> Y))
data Opaque = Opaque
instance Show Opaque where
show = const "..."
instance Show T3 where
show (T3 t) = show (Opaque <$ t)
Or I’ll refactor my data type to derive Show only for the parts I want to be Showable by default, and override the other parts:
data T4 = T4
{ f :: X -> Y
, xys :: T4' -- Move the other fields into another type.
}
instance Show T4 where
show (T4 f xys) = "T4 <function> " <> show xys
data T4' = T4'
{ xs :: [String]
, ys :: [Bool]
}
deriving (Show) -- Derive ‘Show’ for the showable fields.
Or if my type is small, I’ll use a newtype instead of data, and derive Show via something like OpaqueFunction:
{-# LANGUAGE DerivingVia #-}
newtype T5 = T5 (X -> Y, [String], [Bool])
deriving (Show) via (OpaqueFunction X Y, [String], [Bool])
You can use the iso-deriving package to do this for data types using lenses if you care about keeping the field names / record accessors.
As for Eq (or Ord), it’s not a good idea to have an instance that equates values that can be observably distinguished in some way, since some code will treat them as identical and other code will not, and now you’re forced to care about stability: in some circumstance where I have a == b, should I pick a or b? This is why substitutability is a law for Eq: forall x y f. (x == y) ==> (f x == f y) if f is a “public” function that upholds the invariants of the type of x and y (although floating-point also violates this). A better choice is something like T4 above, having equality only for the parts of a type that can satisfy the laws, or explicitly using comparison modulo some function at use sites, e.g., comparing someField.

The module Text.Show.Functions in base provides a show instance for functions that displays <function>. To use it, just:
import Text.Show.Functions
It just defines an instance something like:
instance Show (a -> b) where
show _ = "<function>"
Similarly, you can define your own Eq instance:
import Text.Show.Functions
instance Eq (a -> b) where
-- all functions are equal...
-- ...though some are more equal than others
_ == _ = True
data Foo = Foo Int Double (Int -> Int) deriving (Show, Eq)
main = do
print $ Foo 1 2.0 (+1)
print $ Foo 1 2.0 (+1) == Foo 1 2.0 (+2) -- is True
This will be an orphan instance, so you'll get a warning with -Wall.
Obviously, these instances will apply to all functions. You can write instances for a more specialized function type (e.g., only for Int -> String, if that's the type of the function field in your data type), but there is no way to simultaneously (1) use the built-in Eq and Show deriving mechanisms to derive instances for your datatype, (2) not introduce a newtype wrapper for the function field (or some other type polymorphism as mentioned in the other answers), and (3) only have the function instances apply to the function field of your data type and not other function values of the same type.
If you really want to limit applicability of the custom function instances without a newtype wrapper, you'd probably need to build your own generics-based solution, which wouldn't make much sense unless you wanted to do this for a lot of data types. If you go this route, then the Generics.Deriving.Show and Generics.Deriving.Eq modules in generic-deriving provide templates for these instances which could be modified to treat functions specially, allowing you to derive per-datatype instances using some stub instances something like:
instance Show Foo where showsPrec = myGenericShowsPrec
instance Eq Foo where (==) = myGenericEquality

I proposed an idea for adding annotations to fields via fields, that allows operating on behaviour of individual fields.
data A = A
{ a :: Int
, b :: Int
, c :: Int -> Int via Ignore (Int->Int)
}
deriving
stock GHC.Generic
deriving (Eq, Show)
via Generically A -- assuming Eq (Generically A)
-- Show (Generically A)
But this is already possible with the "microsurgery" library, but you might have to write some boilerplate to get it going. Another solution is to write separate behaviour in "sums-of-products style"
data A = A Int Int (Int->Int)
deriving
stock GHC.Generic
deriving
anyclass SOP.Generic
deriving (Eq, Show)
via A <-𝈖-> '[ '[ Int, Int, Ignore (Int->Int) ] ]

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives (no nested or recursive types, just different combinations and orderings of primitives) :
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell, and actually have a type-- Can NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.

You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance
for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
parseFields con [a] = con <$> readMaybe a
parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB ::Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
"Test1" -> case parseFields Test1 fields of
Just (a :: Test1) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
"Test2" -> case parseFields Test2 fields of
Just (a :: Test2) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
= NetworkMsg1 {fieldA :: Int}
| NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
instance Read a => MessageParser (a -> NetworkMsg) where
parseMsg con [a] = con <$> readMaybe a
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
"NetworkMsg1" -> parseMsg NetworkMsg1 fields
"NetworkMsg2" -> parseMsg NetworkMsg2 fields
_ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.

Basic Haskell: problems with a function

me again with another basic problem I have. I'm using ghci.
I (with help) created this working code:
newtype Name = Name String deriving (Show)
newtype Age = Age Int deriving (Show)
newtype Weight = Weight Int deriving (Show)
newtype Person = Person (Name, Age, Weight) deriving (Show)
isAdult :: Person -> Bool
isAdult (Person(_, Age a, _)) = a > 18
However problems occur when I tried making a more complex function updateWeight that allows the user to change a Person's weight from it's previous value. Can you point out where I have gone wrong?
updateWeight :: Person -> Int -> Person
updateWeight (Person(_,_,Weight w) b = (Person(_,_,w+b))

The problem is that you can't use the _ placeholder on the right hand side of an expression. You'll have to pass through the unchanged values. Also, you must wrap the result of w + b with a Weight again. This should work:
updateWeight :: Person -> Int -> Person
updateWeight (Person(n, a, Weight w) b = (Person(n, a, Weight (w + b)))
You can get rid of the boilerplate of passing through the unchanged values by using record syntax for the Person type.

Haskell: Typeclass implies other typeclass

Is it possible to have a typeclass imply another typeclass in Haskell? For example let's say there is a bunch of "things" that can be ordered by an "attribute":
data Person = Person { name :: String, age :: Int }
Person p1 <= Person p1 = (age p1) <= (age p2)
To avoid repetition one could define a "orderable by key" type class
class OrdByKey o where
orderKey :: (Ord r) => o -> r
x <= y = (orderKey x) <= (orderKey y)
Then the instance declaration for Person could look like this
instance OrdByKey Person where
orderKey Person p = age p
Now this does obviously not work for multiple reasons. I wonder if it's possible at all?

As you have specified it, the OrdByKey class can only have one instance
per type, when it sounds like you would like to be able to declare an instance
for each field in your record type.
To accomplish that, you will have to put the field type into the class
definition as well. This lets you do something like the following:
{-# LANGUAGE MultiParamTypeClasses #-}
data Person = Person { name :: String, age :: Int }
class (Ord r) => OrdByKey o r where
orderKey :: o -> r
instance OrdByKey Person Int where
orderKey p = age p
x <=? y = (orderKey x :: Int) <= (orderKey y :: Int)
However, you can only have one instance per field type, so if your
Person type looks like
data Person = Person { name :: String, age :: Int, ssn :: String}
you will not be able to have a version to compare on both the name and
the ssn fields. You could get around this by wrapping each field in a
newtype so each field has a unique type. So your Person type would look
like
data Person = Person { name :: Name, age :: Age, ssn :: SSN}
That would lead to a lot of newtypes floating around though.
The real downside of this is the need to specify the return type for the
orderKey function. I would recommend using the on function from
Data.Function to write the appropriate comparison functions. I think a
function like
compareByKey :: (Ord b) => (a -> b) -> a -> a -> Bool
compareByKey = on (<=)
generalizes your idea of "can be compared by some key". You just have to give
it the function that extracts that key, which would be exactly the accessor
functions for your Person type, in this case.
I can't think of an instance where the OrdByKey class would be useful and trying to overload the <= with multiple versions for the same type seems like it would be down right
confusing in practice.

You could do this:
instance Ord Person where
compare p1 p2 = compare (age p1) (age p2)
Now the standard <= operator will work on Persons and compare their ages.

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containg the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.

You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.

To make it easier to build comparison function for ADTs with large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2

Edit:After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
(A _ _) <= (B _) = True
(A s _) <= (A s' _) = s <= s'
(B _) <= (B _) = True
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data your representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.

You could use some template haskell to fill in the missing transitive cases. The mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
case (a, b) of
(A _ _, B _) -> True
(B _, C) -> True
(C, D) -> True
(D, E) -> True
(A s _, A s' _) -> s < s'
otherwise -> False|])
lt2Ord f a b =
case (f a b, f b a) of
(True, _) -> LT
(_, True) -> GT
otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
CaseE e ms <- eq
return . CaseE e . reverse . foldl go [] $ ms
where
go ms m#(Match (TupP [a, b]) body decls) = (m:ms) ++
[Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
go ms m = m:ms

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Confused about custom data types in Haskell - haskell

Related

Deriving Eq and Show for an ADT that contains fields that can't have Eq or Show

Deserializing many network messages without using an ad-hoc parser implementation

Basic Haskell: problems with a function

Haskell: Typeclass implies other typeclass

Sort by constructor ignoring (part of) value

Categories

Resources