Haskell record syntax and type classes - haskell

Suppose that I have two data types Foo and Bar. Foo has fields x and y. Bar has fields x and z. I want to be able to write a function that takes either a Foo or a Bar as a parameter, extracts the x value, performs some calculation on it, and then returns a new Foo or Bar with the x value set accordingly.
Here is one approach:
class HasX a where
getX :: a -> Int
setX :: a -> Int -> a
data Foo = Foo Int Int deriving Show
instance HasX Foo where
getX (Foo x _) = x
setX (Foo _ y) val = Foo val y
getY (Foo _ z) = z
setY (Foo x _) val = Foo x val
data Bar = Bar Int Int deriving Show
instance HasX Bar where
getX (Bar x _) = x
setX (Bar _ z) val = Bar val z
getZ (Bar _ z) = z
setZ (Bar x _) val = Bar x val
modifyX :: (HasX a) => a -> a
modifyX hasX = setX hasX $ getX hasX + 5
The problem is that all those getters and setters are painful to write, especially if I replace Foo and Bar with real-world data types that have lots of fields.
Haskell's record syntax gives a much nicer way of defining these records. But, if I try to define the records like this
data Foo = Foo {x :: Int, y :: Int} deriving Show
data Bar = Foo {x :: Int, z :: Int} deriving Show
I'll get an error saying that x is defined multiple times. And, I'm not seeing any way to make these part of a type class so that I can pass them to modifyX.
Is there a nice clean way of solving this problem, or am I stuck with defining my own getters and setters? Put another way, is there a way of connecting the functions created by record syntax up with type classes (both the getters and setters)?
EDIT
Here's the real problem I'm trying to solve. I'm writing a series of related programs that all use System.Console.GetOpt to parse their command-line options. There will be a lot of command-line options that are common across these programs, but some of the programs may have extra options. I'd like each program to be able to define a record containing all of its option values. I then start with a default record value that is then transformed through a StateT monad and GetOpt to get a final record reflecting the command-line arguments. For a single program, this approach works really well, but I'm trying to find a way to re-use code across all of the programs.

You want extensible records which, I gather, is one of the most talked about topics in Haskell. It appears that there is not currently much consensus on how to implement it.
In your case it seems like maybe instead of an ordinary record you could use a heterogeneous list like those implemented in HList.
Then again, it seems you only have two levels here: common and program. So maybe you should just define a common record type for the common options and a program-specific record type for each program, and use StateT on a tuple of those types. For the common stuff you can add aliases that compose fst with the common accessors so it's invisible to callers.

You could use code such as
data Foo = Foo { fooX :: Int, fooY :: Int } deriving (Show)
data Bar = Bar { barX :: Int, barZ :: Int } deriving (Show)
instance HasX Foo where
getX = fooX
setX r x' = r { fooX = x' }
instance HasX Bar where
getX = barX
setX r x' = r { barX = x' }
What are you modeling in your code? If we knew more about the problem, we could suggest something less awkward than this object-oriented design shoehorned into a functional language.

Seems to me like a job for generics. If you could tag your Int with different newtypes, then you would be able to write (with uniplate, module PlateData):
data Foo = Foo Something Another deriving (Data,Typeable)
data Bar = Bar Another Thing deriving (Data, Typerable)
data Opts = F Foo | B Bar
newtype Something = S Int
newtype Another = A Int
newtype Thing = T Int
getAnothers opts = [ x | A x <- universeBi opts ]
This would extract all Another's from anywhere inside the Opts.
Modification is possible as well.

If you make the types instances of Foldable you get a toList function that you can use as the basis of your accessor.
If Foldable doesn't by you anything, then maybe the right approach is to define the interface you want as a type class and figure out a good way to autogenerate the derived values.
Perhaps by deriving from doing
deriving(Data)
you could use gmap combinators to base your access off.

Related

Deriving Eq and Show for an ADT that contains fields that can't have Eq or Show

I'd like to be able to derive Eq and Show for an ADT that contains multiple fields. One of them is a function field. When doing Show, I'd like it to display something bogus, like e.g. "<function>"; when doing Eq, I'd like it to ignore that field. How can I best do this without hand-writing a full instance for Show and Eq?
I don't want to wrap the function field inside a newtype and write my own Eq and Show for that - it would be too bothersome to use like that.
One way you can get proper Eq and Show instances is to, instead of hard-coding that function field, make it a type parameter and provide a function that just “erases” that field. I.e., if you have
data Foo = Foo
{ fooI :: Int
, fooF :: Int -> Int }
you change it to
data Foo' f = Foo
{ _fooI :: Int
, _fooF :: f }
deriving (Eq, Show)
type Foo = Foo' (Int -> Int)
eraseFn :: Foo -> Foo' ()
eraseFn foo = foo{ fooF = () }
Then, Foo will still not be Eq- or Showable (which after all it shouldn't be), but to make a Foo value showable you can just wrap it in eraseFn.
Typically what I do in this circumstance is exactly what you say you don’t want to do, namely, wrap the function in a newtype and provide a Show for that:
data T1
{ f :: X -> Y
, xs :: [String]
, ys :: [Bool]
}
data T2
{ f :: OpaqueFunction X Y
, xs :: [String]
, ys :: [Bool]
}
deriving (Show)
newtype OpaqueFunction a b = OpaqueFunction (a -> b)
instance Show (OpaqueFunction a b) where
show = const "<function>"
If you don’t want to do that, you can instead make the function a type parameter, and substitute it out when Showing the type:
data T3' a
{ f :: a
, xs :: [String]
, ys :: [Bool]
}
deriving (Functor, Show)
newtype T3 = T3 (T3' (X -> Y))
data Opaque = Opaque
instance Show Opaque where
show = const "..."
instance Show T3 where
show (T3 t) = show (Opaque <$ t)
Or I’ll refactor my data type to derive Show only for the parts I want to be Showable by default, and override the other parts:
data T4 = T4
{ f :: X -> Y
, xys :: T4' -- Move the other fields into another type.
}
instance Show T4 where
show (T4 f xys) = "T4 <function> " <> show xys
data T4' = T4'
{ xs :: [String]
, ys :: [Bool]
}
deriving (Show) -- Derive ‘Show’ for the showable fields.
Or if my type is small, I’ll use a newtype instead of data, and derive Show via something like OpaqueFunction:
{-# LANGUAGE DerivingVia #-}
newtype T5 = T5 (X -> Y, [String], [Bool])
deriving (Show) via (OpaqueFunction X Y, [String], [Bool])
You can use the iso-deriving package to do this for data types using lenses if you care about keeping the field names / record accessors.
As for Eq (or Ord), it’s not a good idea to have an instance that equates values that can be observably distinguished in some way, since some code will treat them as identical and other code will not, and now you’re forced to care about stability: in some circumstance where I have a == b, should I pick a or b? This is why substitutability is a law for Eq: forall x y f. (x == y) ==> (f x == f y) if f is a “public” function that upholds the invariants of the type of x and y (although floating-point also violates this). A better choice is something like T4 above, having equality only for the parts of a type that can satisfy the laws, or explicitly using comparison modulo some function at use sites, e.g., comparing someField.
The module Text.Show.Functions in base provides a show instance for functions that displays <function>. To use it, just:
import Text.Show.Functions
It just defines an instance something like:
instance Show (a -> b) where
show _ = "<function>"
Similarly, you can define your own Eq instance:
import Text.Show.Functions
instance Eq (a -> b) where
-- all functions are equal...
-- ...though some are more equal than others
_ == _ = True
data Foo = Foo Int Double (Int -> Int) deriving (Show, Eq)
main = do
print $ Foo 1 2.0 (+1)
print $ Foo 1 2.0 (+1) == Foo 1 2.0 (+2) -- is True
This will be an orphan instance, so you'll get a warning with -Wall.
Obviously, these instances will apply to all functions. You can write instances for a more specialized function type (e.g., only for Int -> String, if that's the type of the function field in your data type), but there is no way to simultaneously (1) use the built-in Eq and Show deriving mechanisms to derive instances for your datatype, (2) not introduce a newtype wrapper for the function field (or some other type polymorphism as mentioned in the other answers), and (3) only have the function instances apply to the function field of your data type and not other function values of the same type.
If you really want to limit applicability of the custom function instances without a newtype wrapper, you'd probably need to build your own generics-based solution, which wouldn't make much sense unless you wanted to do this for a lot of data types. If you go this route, then the Generics.Deriving.Show and Generics.Deriving.Eq modules in generic-deriving provide templates for these instances which could be modified to treat functions specially, allowing you to derive per-datatype instances using some stub instances something like:
instance Show Foo where showsPrec = myGenericShowsPrec
instance Eq Foo where (==) = myGenericEquality
I proposed an idea for adding annotations to fields via fields, that allows operating on behaviour of individual fields.
data A = A
{ a :: Int
, b :: Int
, c :: Int -> Int via Ignore (Int->Int)
}
deriving
stock GHC.Generic
deriving (Eq, Show)
via Generically A -- assuming Eq (Generically A)
-- Show (Generically A)
But this is already possible with the "microsurgery" library, but you might have to write some boilerplate to get it going. Another solution is to write separate behaviour in "sums-of-products style"
data A = A Int Int (Int->Int)
deriving
stock GHC.Generic
deriving
anyclass SOP.Generic
deriving (Eq, Show)
via A <-𝈖-> '[ '[ Int, Int, Ignore (Int->Int) ] ]

What is the idiomatic way to access part of Algebraic data type in Haskell?

There is an easier way to call function test with value x than using case expression?
data FooBar = Foo Int | Bar String
test :: Maybe Int -> Bool -- Int from Foo constructor
x :: FooBar
One easier way is to define a helper to get you part of the way there:
data FooBar = Foo Int | Bar String
foo :: FooBar -> Maybe Int
foo (Foo x) = Just x
foo _ = Nothing
test :: Maybe Int -> Bool
x :: FooBar
result :: Bool
result = test . foo $ x
If you're the one defining test, you could also just define it differently to make things easier on yourself:
test' :: FooBar -> Bool
test' (Foo x) = (some logic)
test' _ = (the default value)
There is a neat concept called a "prism" that models this general concept -- extracting pieces of data from sum types -- elegantly. But they're... kind of hard to understand, and whether or not they can be considered "idiomatic" is pretty controversial.
You could use guards or pattern matching in the functions you are handing a FooBar as an argument.

data type with a default field and that needs a function that works with it

Say, I have a data type
data FooBar a = Foo String Char [a]
| Bar String Int [a]
I need to create values of this type and give empty list as the second field:
Foo "hello" 'a' []
or
Bar "world" 1 []
1) I do this everywhere in my code and I think it would be nice if I could omit the empty list part somehow and have the empty list assigned implicitly. Is this possible? Something similar to default function arguments in other languages.
2) Because of this [] "default" value, I often need to have a partial constructor application that results in a function that takes the first two values:
mkFoo x y = Foo x y []
mkBar x y = Bar x y []
Is there a "better" (more idiomatic, etc) way to do it? to avoid defining new functions?
3) I need a way to add things to the list:
add (Foo u v xs) x = Foo u v (x:xs)
add (Bar u v xs) x = Bar u v (x:xs)
Is this how it is done idiomatically? Just a general purpose function?
As you see I am a beginner, so maybe these questions make little sense. Hope not.
I'll address your questions one by one.
Default arguments do not exist in Haskell. They are simply not worth the added complexity and loss of compositionally. Being a functional language, you do a lot more function manipulation in Haskell, so funkiness like default arguments would be tough to handle.
One thing I didn't realize when I started Haskell is that data constructors are functions just like everything else. In your example,
Foo :: String -> Char -> [a] -> FooBar a
Thus you can write functions for filling in various arguments of other functions, and then those functions will work with Foo or Bar or whatever.
fill1 :: a -> (a -> b) -> b
fill1 a f = f a
--Note that fill1 = flip ($)
fill2 :: b -> (a -> b -> c) -> (a -> c)
--Equivalently, fill2 :: b -> (a -> b -> c) -> a -> c
fill2 b f = \a -> f a b
fill3 :: c -> (a -> b -> c -> d) -> (a -> b -> d)
fill3 c f = \a b -> f a b c
fill3Empty :: (a -> b -> [c] -> d) -> (a -> b -> d)
fill3Empty f = fill3 [] f
--Now, we can write
> fill3Empty Foo x y
Foo x y []
The lens package provides elegant solutions to questions like this. However, you can tell at a glance that this package is enormously complicated. Here is the net result of how you would call the lens package:
_list :: Lens (FooBar a) (FooBar b) [a] [b]
_list = lens getter setter
where getter (Foo _ _ as) = as
getter (Bar _ _ as) = as
setter (Foo s c _) bs = Foo s c bs
setter (Bar s i _) bs = Bar s i bs
Now we can do
> over _list (3:) (Foo "ab" 'c' [2,1])
Foo "ab" 'c' [3,2,1]
Some explanation: the lens function produces a Lens type when given a getter and a setter for some type. Lens s t a b is a type that says "s holds an a and t holds a b. Thus, if you give me a function a -> b, I can give you a function s -> t". That is exactly what over does: you provide it a lens and a function (in our case, (3:) was a function that adds 3 to the front of a List) and it applies the function "where the lens indicates". This is very similar to a functor, however, we have significantly more freedom (in this example, the functor instance would be obligated to change every element of the lists, not operate on the lists themselves).
Note that our new _list lens is very generic: it works equally well over Foo and Bar and the lens package provides many functions other than over for doing magical things.
The idiomatic thing is to take those parameters of a function or constructor that you commonly want to partially apply, and move them toward the beginning:
data FooBar a = Foo [a] String Char
| Bar [a] String Int
foo :: String -> Char -> FooBar a
foo = Foo []
bar :: String -> Int -> FooBar a
bar = Bar []
Similarly, reordering the parameters to add lets you partially apply add to get functions of type FooBar a -> FooBar a, which can be easily composed:
add :: a -> FooBar a -> FooBar a
add x (Foo xs u v) = Foo (x:xs) u v
add123 :: FooBar Int -> FooBar Int
add123 = add 1 . add 2 . add 3
add123 (foo "bar" 42) == Foo [1, 2, 3] "bar" 42
(2) and (3) are perfectly normal and idiomatic ways of doing such things. About (2) in particular, one expression you will occasionally hear is "smart constructor". That just means a function like your mkFoo/mkBar that produces a FooBar a (or a Maybe (FooBar a) etc.) with some extra logic to ensure only reasonable values can be constructed.
Here are some additional tricks that might (or might not!) make sense, depending on what you are trying to do with FooBar.
If you use Foo values and Barvalues in similar ways most of the time (i.e. the difference between having the Char field and the Int one is a minor detail), it makes sense to factor out the similarities and use a single constructor:
data FooBar a = FooBar String FooBarTag [a]
data FooBarTag = Foo Char | Bar Int
Beyond avoiding case analysis when you don't care about the FooBarTag, that allows you to safely use record syntax (records and types with multiple constructors do not mix well).
data FooBar a = FooBar
{ fooBarName :: String
, fooBarTag :: FooBarTag
, fooBarList :: [a]
}
Records allow you to use the fields without having to pattern match the whole thing.
If there are sensible defaults for all fields in a FooBar, you can go one step beyond mkFoo-like constructors and define a default value.
defaultFooBar :: FooBar a
defaultFooBar = FooBar
{ fooBarName = ""
, fooBarTag = Bar 0
, fooBarList = []
}
You don't need records to use a default, but they allow overriding default fields conveniently.
myFooBar = defaultFooBar
{ fooBarTag = Foo 'x'
}
If you ever get tired of typing long names for the defaults over and over, consider the data-default package:
instance Default (FooBar a) where
def = defaultFooBar
myFooBar = def { fooBarTag = Foo 'x' }
Do note that a significant number of people do not like the Default class, and not without reason. Still, for types which are very specific to your application (e.g. configuration settings) Default is perfectly fine IMO.
Finally, updating record fields can be messy. If you end up annoyed by that, you will find lens very useful. Note that it is a big library, and it might be a little overwhelming to a beginner, so take a deep breath beforehand. Here is a small sample:
{-# LANGUAGE TemplateHaskell #-} -- At the top of the file. Needed for makeLenses.
import Control.Lens
-- Note the underscores.
-- If you are going to use lenses, it is sensible not to export the field names.
data FooBar a = FooBar
{ _fooBarName :: String
, _fooBarTag :: FooBarTag
, _fooBarList :: [a]
}
makeLenses ''FooBar -- Defines lenses for the fields automatically.
defaultFooBar :: FooBar a
defaultFooBar = FooBar
{ _fooBarName = ""
, _fooBarTag = Bar 0
, _fooBarList = []
}
-- Using a lens (fooBarTag) to set a field without record syntax.
-- Note the lack of underscores in the name of the lens.
myFooBar = set fooBarTag (Foo 'x') defaultFooBar
-- Using a lens to access a field.
myTag = view fooBarTag myFooBar -- Results in Foo 'x'
-- Using a lens (fooBarList) to modify a field.
add :: a -> FooBar a -> FooBar a
add x fb = over fooBarList (x :) fb
-- set, view and over have operator equivalents, (.~). (^.) and (%~) respectively.
-- Note that (^.) is flipped with respect to view.
Here is a gentle introduction to lens which focuses on aspects I have not demonstrated here, specially in how nicely lenses can be composed.

Is it possible to define a function in Haskell that has an input argument of two possible types?

For my own understanding, I want to define a function in Haskell that takes two arguments- either both Integers, or both Chars. It does some trivial examination of the arguments, like so:
foo 1 2 = 1
foo 2 1 = 0
foo 'a' 'b' = -1
foo _ _ = -10
This I know won't compile, because it doesn't know whether its args are of type Num or Char. But I can't make its arguments polymorphic, like:
foo :: a -> a -> Int
Because then we are saying it must be a Char (or Int) in the body.
Is it possible to do this in Haskell? I thought of maybe creating a custom type? Something like:
data Bar = Int | Char
foo :: Bar -> Bar -> Int
But I don't think this is valid either. In general, I'm confused about if there's a middle ground between a function in Haskell being either explicitly of ONE type, or polymorphic to a typeclass, prohibiting any usage of a specific type in the function body.
You can use the Either data type to store two different types. Something like this should work:
foo :: Either (Int, Int) (Char, Char) -> Int
foo (Right x) = 3
foo (Left y) = fst y
So, for it's Left data constructor you pass two Int to it and for it's Right constructor you pass two Char to it. Another way would be to define your own algebric data type like this:
data MyIntChar = MyInt (Int, Int) | MyChar (Char, Char) deriving (Show)
If you observe, then you can see that the above type is isomorphic to Either data type.
I'm not sure I would necessarily recommend using typeclasses for this, but they do make something like this possible at least.
class Foo a where
foo :: a -> a -> Int
instance Foo Int where
foo 1 2 = 1
foo 2 1 = 0
foo _ _ = -10
instance Foo Char where
foo 'a' 'b' = -1
foo _ _ = -10
You can do
type Bar = Either Int Char
foo :: Bar -> Bar -> Int
foo (Left 1) (Left 2) = 1
foo (Right 'a') (Right 'b') = -1
foo (Left 3) (Right 'q') = 42
foo _ _ = 10
and things like that - the Either data type is precisely for mixing two types together. You can roll your own similar type like
data Quux = AnInt Int | AChar Char | ThreeBools Bool Bool Bool
It's called an Algebraic Data Type.
(I struggle to think of circumstances when it's useful to mix specifically characters and integers together - mainly it's very helpful to know where your data is and what type it is.)
That said, I write algebraic data types a lot, but I give them meaningful names that represent actual things rather than just putting random stuff together because I don't like to be specific. Being very specific or completely general is useful. In between there are typeclasses like Eq. You can have a function with type Eq a => a -> [a] -> Bool which means it has type a -> [a] -> Bool for any type that has == defined, and I leave it open for people to use it for data types I never thought of as long as they define an equality function.

Pattern matching against record syntax

Consider the following data definition:
data Foo = A{field::Int}
| B{field::Int}
| C
| D
Now let's say we want to write a function that takes a Foo and increases field if it exists, and leave it unchanged otherwise:
incFoo :: Foo -> Foo
incFoo A{field=n} = A{field=n+1}
incFoo B{field=n} = B{field=n+1}
incFoo x = x
This naive approach leads to some code duplication. But the fact that both A and B shares field allows us to rewrite it:
incFoo :: Foo -> Foo
incFoo x | hasField x, n <- field x = x{field=n+1}
incFoo x = x
hasField A{} = True
hasField B{} = True
hasField _ = False
Less elegant, but that's defiantly easier to maintain when the actual manipulation is complex. The key feature here is x{field=n+1} - record syntax allows us to "update" field without specifying x's type. Considering this, I'd expect something similar to the following syntax (which is not supported):
incFoo :: Foo -> Foo
incFoo x{field=n} = x{field=n+1}
incFoo x = x
I've considered using View Patterns, but since field is a partial function (field C raises an error) it'll require wrapping it in more boilerplate code.
So my question is: why there's no support for the above syntax, and is there any elegant way of implementing a similar behavior?
Thanks!
The reason why you can't do this is because in Haskell, records are inherently second class. They always must be wrapped in a constructor. So in order to have this work as intended you either must use an individual case for each constructor, or use a record substitute.
One possible solution is to use lenses. I'll use the implementation of lenses lens since lens-family-th doesn't seem to handle duplicate field names.
import Control.Lens
data Foo = A {_f :: Int}
| B {_f :: Int}
deriving Show
makeLenses ''Foo
foo :: Foo -> Foo
foo = f %~ (+1)
And then we can use it like this
> foo (A 1)
A{_f = 1}

Resources