Haskell: How to put multiple instances in the same module? - haskell

Let's say that I have the following code:
import Data.List.Ordered
data Person = Person String String
deriving (Show, Eq)
main :: IO ()
main = print . show . sort $ [(Person "Isaac" "Newton"), (Person "Johannes" "Kepler")]
And in the same module, I want to be able to sort the list by both first name and by last name. Obviously I can't do this:
instance Ord Person where
compare (Person _ xLast) (Person _ yLast) = compare xLast yLast
instance Ord Person where
compare (Person xFirst _) (Person yFirst _) = compare xFirst yFirst
So, what are my options?
This page mentions "You can achieve this by wrapping the type in a newtype and lift all required instances to that new type." Can someone give an example of that?
Is there a better way?

The newtype method would be:
newtype ByFirstname = ByFirstname { unByFirstname :: Person }
instance Ord ByFirstname where
-- pattern matching on (ByFirstname (Person xFirst _)) etc
compare [...] = [...]
newtype ByLastname = ByLastname { unByLastname :: Person }
instance Ord ByLastname where
-- as above
Then the sorting function would be something like:
sortFirstname = map unByFirstname . sort . map ByFirstname
and similarly for ByLastname.
A better way is to use sortBy, compare and on, with functions for retrieving the first and last names. i.e.
sortFirstname = sortBy (compare `on` firstName)
(On that note, it might be worth using a record type for Person, i.e. data Person = Person { firstName :: String, lastName :: String }, and one even gets the accessor functions for free.)

You don't want to define multiple Ord instances just to sort by different orders. You just need to use the sortBy function, which takes an explicit comparison function as its argument.
Neat trick: if you use record types to define your Person type, import Data.Function (which gives you the on function) and Data.Monoid, you can use some neat tricks to make this much briefer and easy:
import Data.Function (on)
import Data.Monoid (mappend)
import Data.List (sortBy)
data Person = Person { firstName :: String, lastName :: String }
instance Show Person where
show p = firstName p ++ " " ++ lastName p
exampleData = [ Person "Mary" "Smith"
, Person "Joe" "Smith"
, Person "Anuq" "Katig"
, Person "Estanislao" "Martínez"
, Person "Barack" "Obama" ]
--
-- The "on" function allows you to construct comparison functions out of field
-- accessors:
--
-- compare `on` firstName :: Person -> Person -> Ordering
--
-- That expression evaluates to something equivalent to this:
--
-- (\x y -> compare (firstName x) (firstName y))
--
sortedByFirstName = sortBy (compare `on` firstName) exampleData
sortedByLastName = sortBy (compare `on` lastName) exampleData
--
-- The "mappend" function allows you to combine comparison functions into a
-- composite one. In this one, the comparison function sorts first by lastName,
-- then by firstName:
--
sortedByLastNameThenFirstName = sortBy lastNameFirstName exampleData
where lastNameFirstName =
(compare `on` lastName) `mappend` (compare `on` firstName)

The reason that page mentions newtype is that you cannot have two instances of the same class for a given type. Well, you can but not in the same scope and it's generally a really bad idea. See Orphan Instances for more information (as C. A. McCann pointed out this isn't about orphan instances. I included the link because it has a good description of why duplicate instances, even in cases that superficially work, is a bad idea.).
An example of newtype would be:
newtype SortFirst = SF Person
instance Ord Person where
compare (Person _ xLast) (Person _ yLast) = compare xLast yLast
instance Ord SortFirst where
compare (SF (Person xFirst _)) (SF (Person yFirst _)) = compare xFirst yFirst
Then when you want to sort by first name you would have to do:
sort (map SF persons)
It's not very convenient so it may be better to use sortBy which takes an comparison function as an argument.

It seems that all the functions in Data.List.Ordered have "by" variants that let you provide a comparison function rather than using an Ord instance. If there's no single obvious "standard" ordering, that's not a bad option.
It then becomes your responsibility to ensure any given list is always used with a consistent comparison function, however, which is a potential source of all kinds of fun bugs.
If I needed to work with many ordered lists using a variety of orderings, rather than fussing with newtype wrapping and unwrapping--which is dreadfully tedious--I'd consider a data type to bundle a list with its comparison function, then define proxy functions that use the "by" functions with the associated comparison.

Related

What's a better way of managing large Haskell records?

Replacing fields names with letters, I have cases like this:
data Foo = Foo { a :: Maybe ...
, b :: [...]
, c :: Maybe ...
, ... for a lot more fields ...
} deriving (Show, Eq, Ord)
instance Writer Foo where
write x = maybeWrite a ++
listWrite b ++
maybeWrite c ++
... for a lot more fields ...
parser = permute (Foo
<$?> (Nothing, Just `liftM` aParser)
<|?> ([], bParser)
<|?> (Nothing, Just `liftM` cParser)
... for a lot more fields ...
-- this is particularly hideous
foldl1 merge [foo1, foo2, ...]
merge (Foo a b c ...seriously a lot more...)
(Foo a' b' c' ...) =
Foo (max a a') (b ++ b') (max c c') ...
What techniques would allow me to better manage this growth?
In a perfect world a, b, and c would all be the same type so I could keep them in a list, but they can be many different types. I'm particularly interested in any way to fold the records without needing the massive patterns.
I'm using this large record to hold the different types resulting from permutation parsing the vCard format.
Update
I've implemented both the generics and the foldl approaches suggested below. They both work, and they both reduce three large field lists to one.
Datatype-generic programming techniques can be used to transform all the fields of a record in some "uniform" sort of way.
Perhaps all the fields in the record implement some typeclass that we want to use (the typical example is Show). Or perhaps we have another record of "similar" shape that contains functions, and we want to apply each function to the corresponding field of the original record.
For these kinds of uses, the generics-sop library is a good option. It expands the default Generics functionality of GHC with extra type-level machinery that provides analogues of functions like sequence or ap, but which work over all the fields of a record.
Using generics-sop, I tried to create a slightly less verbose version of your merge funtion. Some preliminary imports:
{-# language TypeOperators #-}
{-# language DeriveGeneric #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}
import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP
A helper function that lifts a binary operation to a form useable by the functions of generics-sop:
fn_2' :: (a -> a -> a) -> (I -.-> (I -.-> I)) a -- I is simply an Identity functor
fn_2' = fn_2 . liftA2
A general merge function that takes a vector of operators and works on any single-constructor record that derives Generic:
merge :: (Generic a, Code a ~ '[ xs ]) => NP (I -.-> (I -.-> I)) xs -> a -> a -> a
merge funcs reg1 reg2 =
case (from reg1, from reg2) of
(SOP (Z np1), SOP (Z np2)) ->
let npResult = funcs `hap` np1 `hap` np2
in to (SOP (Z npResult))
Code is a type family that returns a type-level list of lists describing the structure of a datatype. The outer list is for constructors, the inner lists contain the types of the fields for each constructor.
The Code a ~ '[ xs ] part of the constraint says "the datatype can only have one constructor" by requiring the outer list to have exactly one element.
The (SOP (Z _) pattern matches extract the (heterogeneus) vector of field values from the record's generic representation. SOP stands for "sum-of-products".
A concrete example:
data Person = Person
{
name :: String
, age :: Int
} deriving (Show,GHC.Generic)
instance Generic Person -- this Generic is from generics-sop
mergePerson :: Person -> Person -> Person
mergePerson = merge (fn_2' (++) :* fn_2' (+) :* Nil)
The Nil and :* constructors are used to build the vector of operators (the type is called NP, from n-ary product). If the vector doesn't match the number of fields in the record, the program won't compile.
Update. Given that the types in your record are highly uniform, an alternative way of creating the vector of operations is to define instances of an auxiliary typeclass for each field type, and then use the hcpure function:
class Mergeable a where
mergeFunc :: a -> a -> a
instance Mergeable String where
mergeFunc = (++)
instance Mergeable Int where
mergeFunc = (+)
mergePerson :: Person -> Person -> Person
mergePerson = merge (hcpure (Proxy :: Proxy Mergeable) (fn_2' mergeFunc))
The hcliftA2 function (that combines hcpure, fn_2 and hap) could be used to simplify things further.
Some suggestions:
(1) You can use the RecordWildCards extension to automatically
unpack a record into variables. Doesn't help if you need to unpack
two records of the same type, but it's a useful to keep in mind.
Oliver Charles has a nice blog post on it: (link)
(2) It appears your example application is performing a fold over the records.
Have a look at Gabriel Gonzalez's foldl package. There is also a blog post: (link)
Here is a example of how you might use it with a record like:
data Foo = Foo { _a :: Int, _b :: String }
The following code computes the maximum of the _a fields and the
concatenation of the _b_ fields.
import qualified Control.Foldl as L
import Data.Profunctor
data Foo = Foo { _a :: Int, _b :: String }
deriving (Show)
fold_a :: L.Fold Foo Int
fold_a = lmap _a (L.Fold max 0 id)
fold_b :: L.Fold Foo String
fold_b = lmap _b (L.Fold (++) "" id)
fold_foos :: L.Fold Foo Foo
fold_foos = Foo <$> fold_a <*> fold_b
theFoos = [ Foo 1 "a", Foo 3 "b", Foo 2 "c" ]
test = L.fold fold_foos theFoos
Note the use of the Profunctor function lmap to extract out
the fields we want to fold over. The expression:
L.Fold max 0 id
is a fold over a list of Ints (or any Num instance), and therefore:
lmap _a (L.Fold max 0 id)
is the same fold but over a list of Foo records where we use _a
to produce the Ints.

Haskell get types of Data Constructor

I was wondering if given a constructor, such as:
data UserType = User
{ username :: String
, password :: String
} -- deriving whatever necessary
What the easiest way is for me to get something on the lines of [("username", String), ("password", String)], short of just manually writing it. Now for this specific example it is fine to just write it but for a complex database model with lots of different fields it would be pretty annoying.
So far I have looked through Typeable and Data but so far the closest thing I have found is:
user = User "admin" "pass"
constrFields (toConstr user)
But that doesn't tell me the types, it just returns ["username", "password"] and it also requires that I create an instance of User.
I just knocked out a function using Data.Typeable that lets you turn a constructor into a list of the TypeReps of its arguments. In conjunction with the constrFields you found you can zip them together to get your desired result:
{-# LANGUAGE DeriveDataTypeable #-}
module Foo where
import Data.Typeable
import Data.Typeable.Internal(funTc)
getConsArguments c = go (typeOf c)
where go x = let (con, rest) = splitTyConApp x
in if con == funTc
then case rest of (c:cs:[]) -> c : go cs
_ -> error "arrows always take two arguments"
else []
given data Foo = Foo {a :: String, b :: Int} deriving Typeable, we get
*> getConsArguments Foo
[[Char],Int]
As one would hope.
On how to get the field names without using a populated data type value itself, here is a solution:
constrFields . head . dataTypeConstrs $ dataTypeOf (undefined :: Foo)

How do I let a function in Haskell depend on the type of its argument?

I tried to write a variation on show that treats strings differently from other instances of Show, by not including the " and returning the string directly. But I don't know how to do it. Pattern matching? Guards? I couldn't find anything about that in any documentation.
Here is what I tried, which doesn't compile:
show_ :: Show a => a -> String
show_ (x :: String) = x
show_ x = show x
If possible, you should wrap your values of type String up in a newtype as #wowofbob suggests.
However, sometimes this isn't feasible, in which case there are two general approaches to making something recognise String specifically.
The first way, which is the natural Haskell approach, is to use a type class just like Show to get different behaviour for different types. So you might write
class Show_ a where
show_ :: a -> String
and then
instance Show_ String where
show_ x = x
instance Show_ Int where
show_ x = show x
and so on for any other type you want to use. This has the disadvantage that you need to explicitly write out Show_ instances for all the types you want.
#AndrewC shows how you can cut each instance down to a single line, but you'll still have to list them all explicitly. You can in theory work around this, as detailed in this question, but it's not pleasant.
The second option is to get true runtime type information with the Typeable class, which is quite short and simple in this particular situation:
import Data.Typeable
[...]
show_ :: (Typeable a, Show a) => a -> String
show_ x =
case cast x :: Maybe String of
Just s -> s
Nothing -> show x
This is not a natural Haskell-ish approach because it means callers can't tell much about what the function will do from the type.
Type classes in general give constrained polymorphism in the sense that the only variations in behaviour of a particular function must come from the variations in the relevant type class instances. The Show_ class gives some indication what it's about from its name, and it might be documented.
However Typeable is a very general class. You are delegating everything to the specific function you are calling; a function with a Typeable constraint might have completely different implementations for lots of different concrete types.
Finally, a further elaboration on the Typeable solution which gets closer to your original code is to use a couple of extensions:
{-# LANGUAGE ViewPatterns, ScopedTypeVariables #-}
import Data.Typeable
[...]
show_ :: (Typeable a, Show a) => a -> String
show_ (cast -> Just (s :: String)) = s
show_ x = show x
The use of ViewPatterns allows us to write the cast inside a pattern, which may fit in more nicely with more complicated examples. In fact we can omit the :: String type constraint because the body of this cases forces s to be the result type of show_, i.e. String, anyway. But that's a little obscure so I think it's better to be explicit.
You can wrap it into newtype and make custom Show instance for it:
newtype PrettyString = PrettyString { toString :: String }
instance Show PrettyString where
show (PrettyString s) = "$$" ++ s ++ "$$" -- for example
And then use it like below:
main = getLine >>= print . PrettyString
TL;DR:
Copy the prelude's way and use showList_ as a class function to generate instances for lists so you can override the definition for String.
Caveat
For the record, wowofbob's answer using a newtype wrapper is the simple, clean solution I would use in real life, but I felt it was instructive to also look at some of how the Prelude does this.
intercalate commas by default
The way this is done in the prelude is to make the Show class have a function for showing lists, with a default definition that you can override.
import Data.List (intercalate)
I'll use intercalate :: [a] -> [[a]] -> [a] to put commas in between stuff:
ghci> intercalate "_._" ["intercalate","works","like","this"]
"intercalate_._works_._like_._this"
Make a showList_ class function, and default to show and comma-separated lists.
So now the class with the default implementation of the showList function and, importantly, a default show_ implementation that just uses the ordinary show function. To be able to use that, we have to insist that the type is already in the Show typeclass, but that's OK as far as I understand you.
class Show a => Show_ a where
show_ :: a -> String
showList_ :: [a] -> String
show_ = show
showList_ xs = '[' : intercalate ", " (map show_ xs) ++ "]"
The real Show class uses functions of type String -> String instead of String directly for efficiency reasons, and a precedence argument to control the use of brackets, but I'll skip all that for simplicity.
Automatically make instances for lists
Now we can use the showList function to provide an instance for lists:
instance Show_ a => Show_ [a] where
show_ xs = showList_ xs
The Show a => superclass makes instances super-easy
Now we come to some instances. Because of our default show_ implementation, we don't need to do any actual programming unless we want to override the default, which we'll do for Char, because String ~ [Char].
instance Show_ Int
instance Show_ Integer
instance Show_ Double
instance Show_ Char where
show_ c = [c] -- so show_ 'd' = "d". You can put show_ = show if you want "'d'"
showList_ = id -- just return the string
In practice:
Now that's not much use to hide " from your output in ghci, because the default show function is used for that, but if we use putStrLn, the quotes disappear:
put :: Show_ a => a -> IO ()
put = putStrLn . show_
ghci> show "hello"
"\"hello\""
ghci> show_ "hello"
"hello"
ghci> put "hello"
hello
ghci> put [2,3,4]
[2, 3, 4]
ghci>

What's the difference between makeLenses and makeFields?

Pretty self-explanatory. I know that makeClassy should create typeclasses, but I see no difference between the two.
PS. Bonus points for explaining the default behaviour of both.
Note: This answer is based on lens 4.4 or newer. There were some changes to the TH in that version, so I don't know how much of it applies to older versions of lens.
Organization of the lens TH functions
The lens TH functions are all based on one function, makeLensesWith (also named makeFieldOptics inside lens). This function takes a LensRules argument, which describes exactly what is generated and how.
So to compare makeLenses and makeFields, we only need to compare the LensRules that they use. You can find them by looking at the source:
makeLenses
lensRules :: LensRules
lensRules = LensRules
{ _simpleLenses = False
, _generateSigs = True
, _generateClasses = False
, _allowIsos = True
, _classyLenses = const Nothing
, _fieldToDef = \_ n ->
case nameBase n of
'_':x:xs -> [TopName (mkName (toLower x:xs))]
_ -> []
}
makeFields
defaultFieldRules :: LensRules
defaultFieldRules = LensRules
{ _simpleLenses = True
, _generateSigs = True
, _generateClasses = True -- classes will still be skipped if they already exist
, _allowIsos = False -- generating Isos would hinder field class reuse
, _classyLenses = const Nothing
, _fieldToDef = camelCaseNamer
}
What do these mean?
Now we know that the differences are in the simpleLenses, generateClasses, allowIsos and fieldToDef options. But what do those options actually mean?
makeFields will never generate type-changing optics. This is controlled by the simpleLenses = True option. That option doesn't have haddocks in the current version of lens. However, lens HEAD added documentation for it:
-- | Generate "simple" optics even when type-changing optics are possible.
-- (e.g. 'Lens'' instead of 'Lens')
So makeFields will never generate type-changing optics, while makeLenses will if possible.
makeFields will generate classes for the fields. So for each field foo, we have a class:
class HasFoo t where
foo :: Lens' t <Type of foo field>
This is controlled by the generateClasses option.
makeFields will never generate Iso's, even if that would be possible (controlled by the allowIsos option, which doesn't seem to be exported from Control.Lens.TH)
While makeLenses simply generates a top-level lens for each field that starts with an underscore (lowercasing the first letter after the underscore), makeFields will instead generate instances for the HasFoo classes. It also uses a different naming scheme, explained in a comment in the source code:
-- | Field rules for fields in the form # prefixFieldname or _prefixFieldname #
-- If you want all fields to be lensed, then there is no reason to use an #_# before the prefix.
-- If any of the record fields leads with an #_# then it is assume a field without an #_# should not have a lens created.
camelCaseFields :: LensRules
camelCaseFields = defaultFieldRules
So makeFields also expect that all fields are not just prefixed with an underscore, but also include the data type name as a prefix (as in data Foo = { _fooBar :: Int, _fooBaz :: Bool }). If you want to generate lenses for all fields, you can leave out the underscore.
This is all controlled by the _fieldToDef (exported as lensField by Control.Lens.TH).
As you can see, the Control.Lens.TH module is very flexible. Using makeLensesWith, you can create your very own LensRules if you need a pattern not covered by the standard functions.
Disclaimer: this is based on experimenting with the working code; it gave me enough information to proceed with my project, but I'd still prefer a better-documented answer.
data Stuff = Stuff {
_foo
_FooBar
_stuffBaz
}
makeLenses
Will create foo as a lens accessor to Stuff
Will create fooBar (changing the capitalized name to lowercase);
makeFields
Will create baz and a class HasBaz; it will make Stuff an instance of that class.
Normal
makeLenses creates a single top-level optic for each field in the type. It looks for fields that start with an underscore (_) and it creates an optic that is as general as possible for that field.
If your type has one constructor and one field you'll get an Iso.
If your type has one constructor and multiple fields you'll get many Lens.
If your type has multiple constructors you'll get many Traversal.
Classy
makeClassy creates a single class containing all the optics for your type. This version is used to make it easy to embed your type in another larger type achieving a kind of subtyping. Lens and Traversal optics will be created according to the rules above (Iso is excluded because it hinders the subtyping behavior.)
In addition to one method in the class per field you'll get an extra method that makes it easy to derive instances of this class for other types. All of the other methods have default instances in terms of the top-level method.
data T = MkT { _field1 :: Int, _field2 :: Char }
class HasT a where
t :: Lens' a T
field1 :: Lens' a Int
field2 :: Lens' a Char
field1 = t . field1
field2 = t . field2
instance HasT T where
t = id
field1 f (MkT x y) = fmap (\x' -> MkT x' y) (f x)
field2 f (MkT x y) = fmap (\y' -> MkT x y') (f y)
data U = MkU { _subt :: T, _field3 :: Bool }
instance HasT U where
t f (MkU x y) = fmap (\x' -> MkU x' y) (f x)
-- field1 and field2 automatically defined
This has the additional benefit that it is easy to export/import all the lenses for a given type. import Module (HasT(..))
Fields
makeFields creates a single class per field which is intended to be reused between all types that have a field with the given name. This is more of a solution to record field names not being able to be shared between types.

Haskell introspecting a record's field names and types

Based on a recent exchange, I've been convinced to use Template Haskell to generate some code to ensure compile-time type safety.
I need to introspect record field names and types. I understand I can get field names by using constrFields . toConstr :: Data a => a -> [String]. But I need more than the field names, I need to know their type. For example, I need to know the names of fields that are of type Bool.
How do I construct a function f :: a -> [(String, xx)] where a is the record, String is the field name and xx is the field type?
The type should be available, along with everything else, in the Info value provided by reify. Specifically, you should get a TyConI, which contains a Dec value, from which you can get the list of Con values specifying the constructors. A record type should then use RecC, which will give you a list of fields described by a tuple containing the field name, whether the field is strict, and the type.
Where you go from there depends on what you want to do with all this.
Edit: For the sake of actually demonstrating the above, here's a really terrible quick and dirty function that finds record fields:
import Language.Haskell.TH
test :: Name -> Q Exp
test n = do rfs <- fmap getRecordFields $ reify n
litE . stringL $ show rfs
getRecordFields :: Info -> [(String, [(String, String)])]
getRecordFields (TyConI (DataD _ _ _ cons _)) = concatMap getRF' cons
getRecordFields _ = []
getRF' :: Con -> [(String, [(String, String)])]
getRF' (RecC name fields) = [(nameBase name, map getFieldInfo fields)]
getRF' _ = []
getFieldInfo :: (Name, Strict, Type) -> (String, String)
getFieldInfo (name, _, ty) = (nameBase name, show ty)
Importing that in another module, we can use it like so:
data Foo = Foo { foo1 :: Int, foo2 :: Bool }
foo = $(test ''Foo)
Loading that in GHCi, the value in foo is [("Foo",[("foo1","ConT GHC.Types.Int"),("foo2","ConT GHC.Types.Bool")])].
Does that give you the rough idea?

Resources