I can create and reference relative pointers to struct members in C++ using the ::*, .*, and ->* syntax, like this:
char* fstab_t::*field = &fstab_t::fs_vfstype;
my_fstab.*field = ...
In Haskell, I can easily create temporary labels for record getters, like this:
(idxF_s,idxL_s) = swap_by_sign sgn (idxF,idxL) ;
As far as I know, however, I cannot then update records using these getters as labels, like this:
a { idxF_s = idxL_s b }
Is there an easy way to do this without coding for each record setter?
A getter and setter bundled together in a first-class value is referred to as a lens. There are quite a few packages for doing this; the most popular are data-lens and fclabels. This previous SO question is a good introduction.
Both of those libraries support deriving lenses from record definitions using Template Haskell (with data-lens, it's provided as an additional package for portability). Your example would be expressed as (using data-lens syntax):
setL idxF_s (b ^. idxL_s) a
(or equivalently: idxF_s ^= (b ^. idxL_s) $ a)
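To make "a getter and setter bundled together in a first-class value" concrete, here is a rough hand-rolled sketch that uses only the Prelude. It is not how data-lens or fclabels are implemented; the names SimpleLens, Span, swapBySign and copyField are hypothetical, and swapBySign is only a guess at what the question's swap_by_sign does.

```haskell
-- A hand-rolled lens: a getter and a setter bundled in one value.
-- SimpleLens, Span, swapBySign and copyField are hypothetical names.
data SimpleLens s a = SimpleLens
  { getL :: s -> a
  , setL :: a -> s -> s
  }

data Span = Span { _idxF :: Int, _idxL :: Int } deriving (Show, Eq)

idxF, idxL :: SimpleLens Span Int
idxF = SimpleLens _idxF (\v s -> s { _idxF = v })
idxL = SimpleLens _idxL (\v s -> s { _idxL = v })

-- Assumed behaviour: swap the pair when the sign is negative.
swapBySign :: Int -> (l, l) -> (l, l)
swapBySign sgn (x, y)
  | sgn < 0   = (y, x)
  | otherwise = (x, y)

-- Copy one field of b into another field of a, where the two labels
-- are picked at run time.
copyField :: Int -> Span -> Span -> Span
copyField sgn a b = setL idxF_s (getL idxL_s b) a
  where
    (idxF_s, idxL_s) = swapBySign sgn (idxF, idxL)
```

Because the labels are ordinary values, they can be passed through swapBySign like any other pair and then used for both reading and updating.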
You can, of course, transform lenses in a generic way by transforming their getter and setter together:
-- I don't know what swap_by_sign is supposed to do.
negateLens :: (Num b) => Lens a b -> Lens a b
negateLens l = lens get set
  where
    get = negate . getL l
    set = setL l . negate
(or equivalently: negateLens l = iso negate negate . l) [1]
In general, I would recommend using lenses whenever you have to deal with any kind of non-trivial record handling; not only do they vastly simplify pure transformation of records, but both packages contain convenience functions for accessing and modifying a state monad's state using lenses, which is incredibly useful. (For data-lens, you'll want to use the data-lens-fd package to use these convenience functions in any MonadState; again, they're in a separate package for portability.)
[1] When using either package, you should start your modules with:
import Prelude hiding (id, (.))
import Control.Category
This is because they use generalised forms of the Prelude's id and (.) functions — id can be used as the lens from any value to itself (not all that useful, admittedly), and (.) is used to compose lenses (e.g. getL (fieldA . fieldB) a is the same as getL fieldA . getL fieldB $ a). The shorter negateLens definition uses this.
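As an illustration of why the generalised id and (.) are useful, the sketch below gives a hand-rolled lens type a Category instance, so that lenses compose with (.) in the order described above. This is an assumption-laden toy, not the definition used by either package; SimpleLens, Inner, Outer, count, inner and nestedCount are hypothetical names.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))

-- Hypothetical hand-rolled lens type (not the one from data-lens).
data SimpleLens s a = SimpleLens
  { getL :: s -> a
  , setL :: a -> s -> s
  }

instance Category SimpleLens where
  -- The lens from any value to itself.
  id = SimpleLens (\s -> s) (\a _ -> a)
  -- bc . ab focuses first with ab, then with bc.
  bc . ab = SimpleLens
    (\s -> getL bc (getL ab s))
    (\v s -> setL ab (setL bc v (getL ab s)) s)

data Inner = Inner { _count :: Int } deriving (Show, Eq)
data Outer = Outer { _inner :: Inner } deriving (Show, Eq)

count :: SimpleLens Inner Int
count = SimpleLens _count (\v s -> s { _count = v })

inner :: SimpleLens Outer Inner
inner = SimpleLens _inner (\v s -> s { _inner = v })

-- Reads and writes the Int nested inside an Outer.
nestedCount :: SimpleLens Outer Int
nestedCount = count . inner
```

With this instance, getL (count . inner) o agrees with getL count (getL inner o), which is the law quoted above.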
What you want here is first-class record labels, and while this does not exist in the language, there are several packages on Hackage which implement this pattern. One of these is fclabels, which can use Template Haskell to generate the required boilerplate for you. Here's an example:
{-# LANGUAGE TemplateHaskell #-}
import Control.Category
import Data.Label
import Prelude hiding ((.))
data Foo = Foo { _fieldA :: Int, _fieldB :: Int }
  deriving (Show)
$(mkLabels [''Foo])
main = do
  let foo = Foo 2 3
  putStrLn "Pick a field, A or B"
  line <- getLine
  let field = (if line == "A" then fieldA else fieldB)
  print $ modify field (*10) foo
Related
I'm fairly new to Haskell, and one thing I've been struggling with is writing readable code using records.
My specific problems are:
I haven't found an effective strategy for dealing with name conflicts between fields in different record types. I'm finding I want the same field in multiple different record types, and the name conflict issue is really annoying. I end up choosing some prefix to put on all of my fields, which adds to verbosity and hinders readability.
Using nested records results in really verbose code. I find
someFunction(foo.bar, 2 * foo.bar.baz)
in a language like Java or C++ to be pretty readable. In Haskell I find myself writing this to accomplish the same thing
someFunction (fooBar foo) (2 * barBaz (fooBar foo))
which is a lot harder to visually parse, and calls to functions with multiple arguments quickly become unreadable. In order to make this more readable, I find myself defining intermediate values which are to extract fields from records, which is more readable, but adds more lines of code, so it hurts readability in a different way.
Is there a better way to use records that is more readable, or is there something I should be doing instead? Just using tuples? Writing functions with tons of parameters instead of grouping related values into records? Something else?
One solution (as suggested in the comments) to the problem is to use lenses. Using the microlens and microlens-th packages (these might be simpler when you're getting started):
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances #-}
import Data.List (nub)
import Lens.Micro ((^.), (^..))
import Lens.Micro.TH (makeFields)
newtype Name = Name String
  deriving Eq
data Person = Person { _personName :: Name }
makeFields ''Person
data Species = Dog | Cat
  deriving Eq
data Pet = Pet { _petName :: Name, _petSpecies :: Species }
makeFields ''Pet
-- ^. is an infix operator for view
uniquePersonNames :: [Person] -> [Name]
uniquePersonNames ps = nub (map (\p -> p ^. name) ps)
dogs :: [Pet] -> [Pet]
dogs ps = filter (\p -> p ^. species == Dog) ps
data Concert = Concert
  { _concertPerformers :: [Person]
  , _concertAttendees :: [Person]
  }
makeFields ''Concert
-- ^.. is an infix operator for toListOf
performerNames :: Concert -> [Name]
performerNames c = c ^.. performers . traverse . name
data House = House { _housePeople :: [Person], _housePet :: Pet}
makeFields ''House
houseSound :: House -> String
houseSound h = case h ^. pet . species of
  Dog -> "Woof!"
  Cat -> "Meow!"
There are several resources out there to learn more about lenses and other kinds of optics. One particularly beginner-friendly resource is Control.Lens.Tutorial.
Be warned that this approach can lead to type errors that are hard to understand (I believe the generic-lens library has better error messages, but I have not used it), especially if you start using things blindly. I suggest sticking to the basics (as presented in the linked tutorial) -- this will cover a large portion of your use cases.
Replacing field names with letters, I have cases like this:
data Foo = Foo { a :: Maybe ...
               , b :: [...]
               , c :: Maybe ...
               , ... for a lot more fields ...
               } deriving (Show, Eq, Ord)
instance Writer Foo where
  write x = maybeWrite a ++
            listWrite b ++
            maybeWrite c ++
            ... for a lot more fields ...
parser = permute (Foo
                   <$?> (Nothing, Just `liftM` aParser)
                   <|?> ([], bParser)
                   <|?> (Nothing, Just `liftM` cParser)
                   ... for a lot more fields ...
-- this is particularly hideous
foldl1 merge [foo1, foo2, ...]
merge (Foo a b c ...seriously a lot more...)
      (Foo a' b' c' ...) =
        Foo (max a a') (b ++ b') (max c c') ...
What techniques would allow me to better manage this growth?
In a perfect world a, b, and c would all be the same type so I could keep them in a list, but they can be many different types. I'm particularly interested in any way to fold the records without needing the massive patterns.
I'm using this large record to hold the different types resulting from permutation parsing the vCard format.
Update
I've implemented both the generics and the foldl approaches suggested below. They both work, and they both reduce three large field lists to one.
Datatype-generic programming techniques can be used to transform all the fields of a record in some "uniform" sort of way.
Perhaps all the fields in the record implement some typeclass that we want to use (the typical example is Show). Or perhaps we have another record of "similar" shape that contains functions, and we want to apply each function to the corresponding field of the original record.
For these kinds of uses, the generics-sop library is a good option. It expands the default Generics functionality of GHC with extra type-level machinery that provides analogues of functions like sequence or ap, but which work over all the fields of a record.
Using generics-sop, I tried to create a slightly less verbose version of your merge function. Some preliminary imports:
{-# language TypeOperators #-}
{-# language DeriveGeneric #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}
import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP
A helper function that lifts a binary operation to a form useable by the functions of generics-sop:
fn_2' :: (a -> a -> a) -> (I -.-> (I -.-> I)) a -- I is simply an Identity functor
fn_2' = fn_2 . liftA2
A general merge function that takes a vector of operators and works on any single-constructor record that derives Generic:
merge :: (Generic a, Code a ~ '[ xs ]) => NP (I -.-> (I -.-> I)) xs -> a -> a -> a
merge funcs reg1 reg2 =
  case (from reg1, from reg2) of
    (SOP (Z np1), SOP (Z np2)) ->
      let npResult = funcs `hap` np1 `hap` np2
      in to (SOP (Z npResult))
Code is a type family that returns a type-level list of lists describing the structure of a datatype. The outer list is for constructors, the inner lists contain the types of the fields for each constructor.
The Code a ~ '[ xs ] part of the constraint says "the datatype can only have one constructor" by requiring the outer list to have exactly one element.
The SOP (Z _) pattern matches extract the (heterogeneous) vector of field values from the record's generic representation. SOP stands for "sum-of-products".
A concrete example:
data Person = Person
  {
    name :: String
  , age :: Int
  } deriving (Show,GHC.Generic)
instance Generic Person -- this Generic is from generics-sop
mergePerson :: Person -> Person -> Person
mergePerson = merge (fn_2' (++) :* fn_2' (+) :* Nil)
The Nil and :* constructors are used to build the vector of operators (the type is called NP, from n-ary product). If the vector doesn't match the number of fields in the record, the program won't compile.
Update. Given that the types in your record are highly uniform, an alternative way of creating the vector of operations is to define instances of an auxiliary typeclass for each field type, and then use the hcpure function:
class Mergeable a where
  mergeFunc :: a -> a -> a
-- String is a type synonym, so this instance needs FlexibleInstances
instance Mergeable String where
  mergeFunc = (++)
instance Mergeable Int where
  mergeFunc = (+)
mergePerson :: Person -> Person -> Person
mergePerson = merge (hcpure (Proxy :: Proxy Mergeable) (fn_2' mergeFunc))
The hcliftA2 function (that combines hcpure, fn_2 and hap) could be used to simplify things further.
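For comparison, here is a sketch of the same idea using only GHC.Generics from base instead of generics-sop. This is a swapped-in technique, not the approach shown above: it merges every field with its Semigroup instance, so the choice of operation moves into the field types (for example Max for "keep the larger value"). The names GMerge, gmerge, genericMerge and Card are hypothetical.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}
import Data.Semigroup (Max (..))
import GHC.Generics

-- Merge two generic representations field by field.
class GMerge f where
  gmerge :: f p -> f p -> f p

-- A constructor without fields.
instance GMerge U1 where
  gmerge _ _ = U1

-- Several fields: merge each side of the product.
instance (GMerge f, GMerge g) => GMerge (f :*: g) where
  gmerge (a :*: b) (a' :*: b') = gmerge a a' :*: gmerge b b'

-- A single field: combine with its Semigroup instance.
instance Semigroup c => GMerge (K1 i c) where
  gmerge (K1 a) (K1 b) = K1 (a <> b)

-- Metadata wrappers are passed through.
instance GMerge f => GMerge (M1 i t f) where
  gmerge (M1 a) (M1 b) = M1 (gmerge a b)

-- There is deliberately no instance for (:+:), so types with more than
-- one constructor are rejected at compile time.
genericMerge :: (Generic a, GMerge (Rep a)) => a -> a -> a
genericMerge x y = to (gmerge (from x) (from y))

-- Hypothetical record: Maybe (Max Int) keeps the larger value when both
-- are present, and lists are concatenated.
data Card = Card
  { a :: Maybe (Max Int)
  , b :: [String]
  } deriving (Show, Eq, Generic)
```

A list of records can then be combined with foldl1 genericMerge, without writing a pattern that names every field.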
Some suggestions:
(1) You can use the RecordWildCards extension to automatically
unpack a record into variables. Doesn't help if you need to unpack
two records of the same type, but it's useful to keep in mind.
Oliver Charles has a nice blog post on it: (link)
(2) It appears your example application is performing a fold over the records.
Have a look at Gabriel Gonzalez's foldl package. There is also a blog post: (link)
Here is an example of how you might use it with a record like:
data Foo = Foo { _a :: Int, _b :: String }
The following code computes the maximum of the _a fields and the
concatenation of the _b fields.
import qualified Control.Foldl as L
import Data.Profunctor
data Foo = Foo { _a :: Int, _b :: String }
  deriving (Show)
fold_a :: L.Fold Foo Int
fold_a = lmap _a (L.Fold max 0 id)
fold_b :: L.Fold Foo String
fold_b = lmap _b (L.Fold (++) "" id)
fold_foos :: L.Fold Foo Foo
fold_foos = Foo <$> fold_a <*> fold_b
theFoos = [ Foo 1 "a", Foo 3 "b", Foo 2 "c" ]
test = L.fold fold_foos theFoos
Note the use of the Profunctor function lmap to extract out
the fields we want to fold over. The expression:
L.Fold max 0 id
is a fold over a list of Ints (or of any type with Num and Ord instances), and therefore:
lmap _a (L.Fold max 0 id)
is the same fold but over a list of Foo records where we use _a
to produce the Ints.
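To show why folds built this way can be combined with <$> and <*> and still traverse the list only once, here is a simplified re-implementation of the idea using only base. It is a sketch under my own assumptions, not the foldl package's actual source; premap and runFold here are local stand-ins for the lmap and L.fold calls used above.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (foldl')

-- A fold is a step function, an initial accumulator and a final
-- extraction function; the accumulator type x is hidden.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

instance Applicative (Fold a) where
  pure b = Fold (\() _ -> ()) () (\() -> b)
  -- Run both folds side by side by pairing their accumulators.
  Fold stepL beginL doneL <*> Fold stepR beginR doneR =
    Fold step (beginL, beginR) done
    where
      step (xL, xR) a = (stepL xL a, stepR xR a)
      done (xL, xR) = doneL xL (doneR xR)

-- Feed a fold with a projection of each input (the role lmap plays above).
premap :: (a -> b) -> Fold b r -> Fold a r
premap f (Fold step begin done) = Fold (\x a -> step x (f a)) begin done

-- Run a fold over a list in a single pass.
runFold :: Fold a b -> [a] -> b
runFold (Fold step begin done) as = done (foldl' step begin as)

data Foo = Foo { _a :: Int, _b :: String } deriving (Show, Eq)

-- Maximum of the _a fields and concatenation of the _b fields.
mergeFoos :: Fold Foo Foo
mergeFoos =
  Foo <$> premap _a (Fold max 0 id)
      <*> premap _b (Fold (++) "" id)
```

Adding another field to the record then means adding one more <*> line rather than extending a large pattern match.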
I love the lens library and how it works, but sometimes it introduces so many problems that I regret ever starting to use it. Let's look at this simple example:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data Data = A { _x :: String, _y :: String }
          | B { _x :: String }
makeLenses ''Data
main = do
  let b = B "x"
  print $ view y b
it outputs:
""
And now imagine we've got a datatype and we refactor it by changing some names. Instead of getting an error (at runtime, as with normal accessors) that this name no longer applies to a particular data constructor, lenses use mempty from Monoid to create a default object, so we get strange results instead of an error. Debugging something like this is almost impossible.
Is there any way to fix this behaviour? I know there are some special operators that give the behaviour I want, but all the "normal"-looking functions from lens are just horrible. Should I just override them with my own module, or is there a nicer method?
As a side note: I want to be able to read and set the fields using lens syntax, but without the automatic creation of a result when the field is missing.
It sounds like you just want to recover the exception behavior. I vaguely recall that this is how view once worked. If so, I expect a reasonable choice was made with the change.
Normally I end up working with (^?) in the cases you are talking about:
> b ^? y
Nothing
If you want the exception behavior you can use ^?!
> b ^?! y
"*** Exception: (^?!): empty Fold
I prefer to use ^? to avoid partial functions and exceptions, similar to how it is commonly advised to stay away from head, last, !! and other partial functions.
Yes, I too have found it a bit odd that view works for Traversals by concatenating the targets. I think this is because of the instance Monoid m => Applicative (Const m). You can write your own view equivalent that doesn't have this behaviour by writing your own Const equivalent that doesn't have this instance.
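The mechanism can be reproduced without the lens package. The sketch below hand-writes a traversal for the _y field and a view-like function on top of Const; because Applicative (Const m) requires Monoid m, the missing target in B turns into mempty. The names Traversal' (defined locally here), y and viewish are assumptions for illustration and are not taken from lens itself.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))

data Data = A { _x :: String, _y :: String }
          | B { _x :: String }

-- Local stand-in for the traversal type used by lens.
type Traversal' s a = forall f. Applicative f => (a -> f a) -> s -> f s

-- Hand-written traversal for _y: one target in A, none in B.
y :: Traversal' Data String
y f (A x' y') = A x' <$> f y'
y _ b@(B _)   = pure b

-- A view-like function: collects targets in Const. With no target,
-- pure gives Const mempty, so the result is the empty value.
viewish :: Monoid a => Traversal' s a -> s -> a
viewish l = getConst . l Const
```

Here viewish y (B "x") evaluates to the empty string, which is the behaviour described in the question.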
Perhaps one workaround would be to provide a type signature for y, so you know exactly what it is. If you had this then your "pathological" use of view wouldn't compile.
data Data = A { _x :: String, _y' :: String }
          | B { _x :: String }
makeLenses ''Data
y :: Lens' Data String
y = y'
You can do this by defining your own view1 operator. It doesn't exist in the lens package, but it's easy to define locally.
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data Data = A { _x :: String, _y :: String }
          | B { _x :: String }
makeLenses ''Data
newtype Get a b = Get { unGet :: a }
instance Functor (Get a) where
  fmap _ (Get x) = Get x
view1 :: LensLike' (Get a) s a -> s -> a
view1 l = unGet . l Get
works :: Data -> String
works = view1 x
-- fails :: Data -> String
-- fails = view1 y
-- Bug.hs:23:15:
-- No instance for (Control.Applicative.Applicative (Get String))
-- arising from a use of ‘y’
Pretty self-explanatory. I know that makeClassy should create typeclasses, but I see no difference between the two.
PS. Bonus points for explaining the default behaviour of both.
Note: This answer is based on lens 4.4 or newer. There were some changes to the TH in that version, so I don't know how much of it applies to older versions of lens.
Organization of the lens TH functions
The lens TH functions are all based on one function, makeLensesWith (also named makeFieldOptics inside lens). This function takes a LensRules argument, which describes exactly what is generated and how.
So to compare makeLenses and makeFields, we only need to compare the LensRules that they use. You can find them by looking at the source:
makeLenses
lensRules :: LensRules
lensRules = LensRules
  { _simpleLenses = False
  , _generateSigs = True
  , _generateClasses = False
  , _allowIsos = True
  , _classyLenses = const Nothing
  , _fieldToDef = \_ n ->
      case nameBase n of
        '_':x:xs -> [TopName (mkName (toLower x:xs))]
        _ -> []
  }
makeFields
defaultFieldRules :: LensRules
defaultFieldRules = LensRules
  { _simpleLenses = True
  , _generateSigs = True
  , _generateClasses = True -- classes will still be skipped if they already exist
  , _allowIsos = False -- generating Isos would hinder field class reuse
  , _classyLenses = const Nothing
  , _fieldToDef = camelCaseNamer
  }
What do these mean?
Now we know that the differences are in the simpleLenses, generateClasses, allowIsos and fieldToDef options. But what do those options actually mean?
makeFields will never generate type-changing optics. This is controlled by the simpleLenses = True option. That option doesn't have haddocks in the current version of lens. However, lens HEAD added documentation for it:
-- | Generate "simple" optics even when type-changing optics are possible.
-- (e.g. 'Lens'' instead of 'Lens')
So makeFields will never generate type-changing optics, while makeLenses will if possible.
makeFields will generate classes for the fields. So for each field foo, we have a class:
class HasFoo t where
  foo :: Lens' t <Type of foo field>
This is controlled by the generateClasses option.
makeFields will never generate Isos, even if that would be possible (controlled by the allowIsos option, which doesn't seem to be exported from Control.Lens.TH).
While makeLenses simply generates a top-level lens for each field that starts with an underscore (lowercasing the first letter after the underscore), makeFields will instead generate instances for the HasFoo classes. It also uses a different naming scheme, explained in a comment in the source code:
-- | Field rules for fields in the form # prefixFieldname or _prefixFieldname #
-- If you want all fields to be lensed, then there is no reason to use an #_# before the prefix.
-- If any of the record fields leads with an #_# then it is assume a field without an #_# should not have a lens created.
camelCaseFields :: LensRules
camelCaseFields = defaultFieldRules
So makeFields also expects that all fields are not just prefixed with an underscore, but also include the data type name as a prefix (as in data Foo = Foo { _fooBar :: Int, _fooBaz :: Bool }). If you want to generate lenses for all fields, you can leave out the underscore.
This is all controlled by the _fieldToDef (exported as lensField by Control.Lens.TH).
As you can see, the Control.Lens.TH module is very flexible. Using makeLensesWith, you can create your very own LensRules if you need a pattern not covered by the standard functions.
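To give a feel for the per-field classes described above, here is a hand-written approximation of what makeFields might produce for two records that both have a name field. It uses only base and a locally defined Lens' synonym; the real generated code may well differ in detail, and view and set here are minimal local definitions rather than the ones from lens.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Local stand-in for the simple lens type.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- One class per field name, shared by every record that has the field.
class HasName s a | s -> a where
  name :: Lens' s a

data Person = Person { _personName :: String } deriving (Show, Eq)
data Pet = Pet { _petName :: String, _petAge :: Int } deriving (Show, Eq)

instance HasName Person String where
  name f (Person n) = fmap Person (f n)

instance HasName Pet String where
  name f (Pet n a) = fmap (\n' -> Pet n' a) (f n)

-- Minimal local getter and setter built on the lens.
view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l v = runIdentity . l (const (Identity v))
```

The same name lens now works on both Person and Pet, which is how the prefix-based naming scheme sidesteps field name clashes between record types.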
Disclaimer: this is based on experimenting with the working code; it gave me enough information to proceed with my project, but I'd still prefer a better-documented answer.
data Stuff = Stuff {
_foo
_FooBar
_stuffBaz
}
makeLenses
Will create foo as a lens accessor to Stuff
Will create fooBar (changing the capitalized name to lowercase);
makeFields
Will create baz and a class HasBaz; it will make Stuff an instance of that class.
Normal
makeLenses creates a single top-level optic for each field in the type. It looks for fields that start with an underscore (_) and it creates an optic that is as general as possible for that field.
If your type has one constructor and one field you'll get an Iso.
If your type has one constructor and multiple fields you'll get many Lens.
If your type has multiple constructors you'll get many Traversal.
Classy
makeClassy creates a single class containing all the optics for your type. This version is used to make it easy to embed your type in another larger type achieving a kind of subtyping. Lens and Traversal optics will be created according to the rules above (Iso is excluded because it hinders the subtyping behavior.)
In addition to one method in the class per field you'll get an extra method that makes it easy to derive instances of this class for other types. All of the other methods have default instances in terms of the top-level method.
data T = MkT { _field1 :: Int, _field2 :: Char }
class HasT a where
  t :: Lens' a T
  field1 :: Lens' a Int
  field2 :: Lens' a Char
  field1 = t . field1
  field2 = t . field2
instance HasT T where
  t = id
  field1 f (MkT x y) = fmap (\x' -> MkT x' y) (f x)
  field2 f (MkT x y) = fmap (\y' -> MkT x y') (f y)
data U = MkU { _subt :: T, _field3 :: Bool }
instance HasT U where
  t f (MkU x y) = fmap (\x' -> MkU x' y) (f x)
  -- field1 and field2 automatically defined
This has the additional benefit that it is easy to export/import all the lenses for a given type: import Module (HasT(..))
Fields
makeFields creates a single class per field which is intended to be reused between all types that have a field with the given name. This is more of a solution to record field names not being able to be shared between types.
Is there a short circuit built in to GHC's (and Haskell's in general) derived Eq instance that will fire when I compare the same instance of a data type?
-- will this fire?
let same = complex == complex
My plan is to read in a lazy datastructure (let's say a tree), change some values and then compare the old and the new version to create a diff that will then be written back to the file.
If there were a short circuit built in, the compare step would stop as soon as it finds that the new structure references old values. At the same time this wouldn't read in more than necessary from the file in the first place.
I know I'm not supposed to worry about references in Haskell, but this seems to be a nice way to handle lazy file changes. If there is no short circuit built in, would there be a way to implement this? Suggestions for different schemes are welcome.
StableNames are specifically designed to solve problems like yours.
Note that StableNames can only be created in the IO monad. So you have two choices: either create your objects in the IO monad, or use unsafePerformIO in your (==) implementation (which is more or less fine in this situation).
But I should stress that it is possible to do this in a totally safe way (without unsafe* functions): only creation of stable names should happen in IO; after that, you may compare them in a totally pure way.
E.g.
import System.Mem.StableName (StableName, makeStableName)
data SNWrapper a = SNW !a !(StableName a)
snwrap :: a -> IO (SNWrapper a)
snwrap a = SNW a <$> makeStableName a
instance Eq a => Eq (SNWrapper a) where
  SNW a sna == SNW b snb = sna == snb || a == b
Notice that if stable name comparison says "no", you still need to perform full value comparison to get a definitive answer.
In my experience that worked pretty well when you have lots of sharing and for some reason are not willing to use other methods to indicate sharing.
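A small self-contained check of the shortcut, as a sketch: the wrapper is repeated here so the block stands alone, and demo is a hypothetical name. Comparing a wrapper with itself succeeds via the stable names, so the comparison terminates even though the wrapped list is infinite and a structural comparison would never finish.

```haskell
import System.Mem.StableName (StableName, makeStableName)

-- A value paired with the stable name taken when it was wrapped.
data SNWrapper a = SNW !a !(StableName a)

snwrap :: a -> IO (SNWrapper a)
snwrap a = SNW a <$> makeStableName a

instance Eq a => Eq (SNWrapper a) where
  -- Equal stable names mean the same object; otherwise fall back to
  -- the structural comparison.
  SNW a sna == SNW b snb = sna == snb || a == b

-- The wrapped list is infinite, so only the stable-name shortcut can
-- answer this comparison.
demo :: IO Bool
demo = do
  w <- snwrap (repeat 'x')
  pure (w == w)
```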
(Speaking of other methods, you could, for example, replace the IO monad with a State Integer monad and generate unique integers in that monad as an equivalent of "stable names".)
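A rough sketch of that alternative, hand-rolling a tiny state monad so that only base is needed (a State Integer from a library would work the same way). Supply, Tagged and tag are hypothetical names. Note the assumption it relies on: equal tags imply equal values only if all tags come from the same supply and a tagged value is never modified while keeping its tag.

```haskell
-- A minimal state monad that hands out fresh integers.
newtype Supply a = Supply { runSupply :: Int -> (a, Int) }

instance Functor Supply where
  fmap f (Supply g) = Supply $ \n -> let (a, n') = g n in (f a, n')

instance Applicative Supply where
  pure a = Supply $ \n -> (a, n)
  Supply f <*> Supply g = Supply $ \n ->
    let (h, n')  = f n
        (a, n'') = g n'
    in (h a, n'')

instance Monad Supply where
  Supply g >>= k = Supply $ \n -> let (a, n') = g n in runSupply (k a) n'

-- A value together with the unique integer it was given.
data Tagged a = Tagged Int a

-- Attach the next unused integer to a value.
tag :: a -> Supply (Tagged a)
tag a = Supply $ \n -> (Tagged n a, n + 1)

instance Eq a => Eq (Tagged a) where
  -- Same tag: same value by construction, so skip the structural check.
  Tagged i a == Tagged j b = i == j || a == b
```

Unlike stable names this needs no IO, at the cost of threading the supply through the code that builds the structure.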
Another trick is, if you have a recursive data structure, make the recursion go through SNWrapper. E.g. instead of
data Tree a = Bin (Tree a) (Tree a) | Leaf a
type WrappedTree a = SNWrapper (Tree a)
use
data Tree a = Bin (WrappedTree a) (WrappedTree a) | Leaf a
type WrappedTree a = SNWrapper (Tree a)
This way, even if short-circuiting doesn't fire at the topmost layer, it might fire somewhere in the middle and still save you some work.
There's no short-circuiting when both arguments of (==) are the same object. The derived Eq instance will do a structural comparison, and in the case of equality, of course needs to traverse the entire structure. You can build in a possible shortcut yourself using
GHC.Prim.reallyUnsafePtrEquality# :: a -> a -> GHC.Prim.Int#
but that will in fact fire only rarely:
Prelude GHC.Base> let x = "foo"
Prelude GHC.Base> I# (reallyUnsafePtrEquality# x x)
1
Prelude GHC.Base> I# (reallyUnsafePtrEquality# True True)
1
Prelude GHC.Base> I# (reallyUnsafePtrEquality# 3 3)
0
Prelude GHC.Base> I# (reallyUnsafePtrEquality# (3 :: Int) 3)
0
And if you read a structure from a file, it will certainly not be the same object as one that was already in memory.
You can use rewrite rules to avoid the comparison of lexically identical objects
module Equal where
{-# RULES
"==/same" forall x. x == x = True
#-}
main :: IO ()
main = let x = [1 :: Int .. 10] in print (x == x)
which leads to
$ ghc -O -ddump-rule-firings Equal.hs
[1 of 1] Compiling Equal ( Equal.hs, Equal.o )
Rule fired: Class op enumFromTo
Rule fired: ==/same
Rule fired: Class op show
the rule firing (note: it didn't fire with let x = "foo", but with user-defined types, it should).