How do I model a record's fields as data? - haskell

Let's say I have a Person record with some fields:
data Person = Person
{ name :: String
, age :: Int
, id :: Int
}
and I want to be able to search a list of Persons by a given field:
findByName :: String -> [Person] -> Maybe Person
findByName s = find (\p -> name p == s)
Now let's say I want to be able to model and store these searches/queries as data, for instance for logging purposes, or to batch execute them, or whatever.
How would I go about representing a search over a given field (or set of fields) as data?
My intuition says to model it as a map of fields to string values (Map (RecordField) (Maybe String)), but I can't do that, because record fields are functions.
Is there a better way to do this than, say, the following?
data PersonField = Name | Age | Int
type Search = Map PersonField (Maybe String)
This could technically work but it decouples PersonField from Person in an ugly way.

I want to be able to model and store these searches/queries as data
Let's assume we want to store them as JSON. We could define a type like
data Predicate record = Predicate {
runPredicate :: record -> Bool ,
storePredicate :: Value
}
Where storePredicate would return a JSON representation of the "reference value" inside the predicate. For example, the value 77 for "age equals 77".
For each record, we would like to have a collection like this:
type FieldName = String
type FieldPredicates record = [(FieldName, Value -> Maybe (Predicate record))]
That is: for each field, we can supply a JSON value encoding the "reference value" of the predicate and, if it parses successfully, we get a Predicate. Otherwise we get Nothing. This would allows us to serialize and deserialize predicates.
We could define FieldPredicates manually for each record, but is there a more automated way? We could try generating field equality predicates using a typeclass. But first, the extensions and imports dance:
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE BlockArguments #-}
import Data.Functor ( (<&>) )
import Data.Kind ( Type, Constraint )
import Data.Proxy
import GHC.Records ( HasField(..) )
import GHC.TypeLits ( KnownSymbol, Symbol, symbolVal )
import Data.Aeson ( FromJSON(parseJSON), Value, ToJSON(toJSON) )
import Data.Aeson.Types (parseMaybe)
import Data.List ( lookup )
Now we define the helper typeclass:
type HasEqFieldPredicates :: [Symbol] -> Type -> Constraint
class HasEqFieldPredicates fieldNames record where
eqFieldPredicates :: FieldPredicates record
instance HasEqFieldPredicates '[] record where
eqFieldPredicates = []
instance
( KnownSymbol fieldName,
HasField fieldName record v,
Eq v,
FromJSON v,
ToJSON v,
HasEqFieldPredicates fieldNames record
) =>
HasEqFieldPredicates (fieldName ': fieldNames) record
where
eqFieldPredicates =
let current =
( symbolVal (Proxy #fieldName),
\j ->
parseMaybe (parseJSON #v) j <&> \v ->
Predicate (\record -> getField #fieldName record == v) (toJSON v))
in current : eqFieldPredicates #fieldNames #record
An example with Person:
personEqPredicates :: [(FieldName, Value -> Maybe (Predicate Person))]
personEqPredicates = eqFieldPredicates #["name", "age", "id"] #Person
personAgeEquals :: Value -> Maybe (Predicate Person)
personAgeEquals = let Just x = Data.List.lookup "age" personEqPredicates in x
Putting it to work:
ghci> let Just p = personAgeEquals (toJSON (77::Int)) in runPredicate p Person { name = "John", age = 78, id = 3 }
False
ghci> let Just p = personAgeEquals (toJSON (78::Int)) in runPredicate p Person { name = "John", age = 78, id = 3 }
True

If you don't need to serialize these query objects to disk, then your "field" type is Person -> a. A record accessor is just a function from Person to some type a. Or if you end up outgrowing basic accessors and need to work with a lot of nested data, you can look into lenses.
However, it sounds like you want to be able to write these queries to disk. In that case, you can't easily serialize functions (or lenses, for that matter). I don't know of a way built-in to Haskell to do all of that automatically and still have it be serializable. So my recommendation would be to roll your own datatypes.
data PersonField = Name | Age | Id
or, even better, you can use GADTs to keep type safety.
data PersonField a where
Name :: PersonField String
Age :: PersonField Int
Id :: PersonField Int
getField :: PersonField a -> Person -> a
getField Name = name
getField Age = age
getField Id = id
Then you have total control over this concrete type and can write your own serialization logic for it. I think Map PersonField (Maybe String) is a good start, and you can refine the Maybe String part if you end up doing more complex queries (like "contains" or "case insensitive comparison", for instance).

Related

Filter a list by returning only one kind of data constructor instead of the type of the type constructor

So let's say I have the following data type :
data CommandRequest = CreateWorkspace {commandId :: UUID , workspaceId ::UUID }
| IntroduceIdea {commandId :: UUID , workspaceId ::UUID , ideaContent :: String} deriving (Show,Eq)
with the {-# LANGUAGE DataKinds #-}
I want to implement the following function (in pseudocode) :
filter :: [CommandRequest] -> [CreateWorkspace] (promoting the data constructor to a type level)
can you help me with the implementation of that function ?... Thank you !
Given a Haskell type like:
data Foo = Bar Int | Baz String
there is no direct way of writing down a new type that represents the subset of values of type Foo that are constructed with Bar, even using the DataKinds extension.
In particular, when you turn on DataKinds, the Bar type that you get is not the type of the values Bar 1 and Bar 2. In fact, the new lifted Bar type doesn't really have anything to do with the values Bar 1 and Bar 2, except for the fact that they share the name Bar. It's not that different than explicitly defining:
data True = TrueThing
This new type True has nothing to do with the value True of type Bool, except they happen to have the same name.
Presuming that what you are trying to do is find a type-safe way of representing the result of filtering CommandRequest for just those values that were constructed with the CreateWorkspace constructor so that you can't "accidentally" let an IntroduceIdea sneak in to your list, you'll have to take another approach. There are several possibilities.
The most straightforward way, which doesn't require any special type-level programming at all, is to represent CreateWorkspace and IntroduceIdea as separate types:
{-# LANGUAGE DuplicateRecordFields #-}
data CreateWorkspace = CreateWorkspace
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data IntroduceIdea = IntroduceIdea
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
and then create a new algebraic sum type to represent the disjoint union of those separate types:
data CommandRequest
= CreateWorkspace' CreateWorkspace
| IntroduceIdea' IntroduceIdea
deriving (Show)
Note we've used the ticks to differentiate these constructors from those used in the underlying component types. A simple variant of this would be to move common fields (like commandId, and perhaps workSpaceId) into the CommandRequest type. This might or might not make sense, depending on what you're trying to accomplish.
Anyway, this adds a little syntactic fluff, but it's straightforward to define:
filterCreateWorkspace :: [CommandRequest] -> [CreateWorkspace]
filterCreateWorkspace crs = [ cw | CreateWorkspace' cw <- crs ]
and with some additional "constructors":
createWorkspace :: UUID -> UUID -> CommandRequest
createWorkspace u1 u2 = CreateWorkspace' (CreateWorkspace u1 u2)
introduceIdea :: UUID -> UUID -> String -> CommandRequest
introduceIdea u1 u2 s = IntroduceIdea' (IntroduceIdea u1 u2 s)
it's not too hard to create and filter [CommandRequest] lists:
type UUID = Int
testdata1 :: [CommandRequest]
testdata1
= [ createWorkspace 1 2
, createWorkspace 3 4
, introduceIdea 5 6 "seven"
]
test1 = filterCreateWorkspace testdata1
giving:
> test1
[CreateWorkspace {commandId = 1, workspaceId = 2}
,CreateWorkspace {commandId = 3, workspaceId = 4}]
This is almost certainly the correct approach for doing what you want to do. I mean, this is exactly what algebraic data types are for. This is what a Haskell program is supposed to look like.
"But no," I hear you say! "I want to spend endless hours fighting confusing type errors! I want to crawl down the dependent type rabbit hole. You know, for 'reasons'." Should I stand in your way? Can one man stand against the ocean?
If you really want to do this at the type level, you still want to define separate types for your two constructors:
data CreateWorkspace = CreateWorkspace
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data IntroduceIdea = IntroduceIdea
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
As before, this makes it easy to represent a list of type [CreateWorkspace]. Now, the key to working at the type level will be finding a way to make it as difficult as possible to represent a list of type [CommandRequest]. A standard method would be to introduce a CommandRequest type class with instances for our two types, together with an existential type to represent an arbitrary type belonging to that class:
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
type UUID = Int -- for the sake of examples
data CreateWorkspace = CreateWorkspace
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data IntroduceIdea = IntroduceIdea
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
class CommandRequest a where
maybeCreateWorkspace :: a -> Maybe CreateWorkspace
instance CommandRequest CreateWorkspace where
maybeCreateWorkspace c = Just c
instance CommandRequest IntroduceIdea where
maybeCreateWorkspace _ = Nothing
data SomeCommandRequest = forall t . CommandRequest t => SomeCommandRequest t
Now we can define:
import Data.Maybe
filterCreateWorkspace :: [SomeCommandRequest] -> [CreateWorkspace]
filterCreateWorkspace = catMaybes . map getCW
where getCW (SomeCommandRequest cr) = maybeCreateWorkspace cr
which works fine, though the syntax is still a bit cumbersome:
testdata2 :: [SomeCommandRequest]
testdata2 = [ SomeCommandRequest (CreateWorkspace 1 2)
, SomeCommandRequest (CreateWorkspace 3 4)
, SomeCommandRequest (IntroduceIdea 5 6 "seven")
]
test2 = print $ filterCreateWorkspace testdata2
The test gives:
> test2
[CreateWorkspace {commandId = 1, workspaceId = 2}
,CreateWorkspace {commandId = 3, workspaceId = 4}]
The awkward thing about this solution is that we need a type class method for identifying the CreateWorkspace type. If we wanted to construct lists of each possible constructor, we'd need to add a new type class method for every single one, and we need to give a definition for the method for every instance (though we can get away with a default definition that returns Nothing for all but one instance, I guess). Anyway, that's nuts!
The mistake we made was making it difficult to represent a list of type [CreateWorkspace] instead of absurdly difficult. To make it absurdly difficult, we'll still want to represent our two constructors as separate types, but we'll make them instances of a data family keyed by constructor names lifted to the type level by the DataKinds extension. Now this is starting to look like a Haskell program!
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE TypeFamilies #-}
data CommandRequestC = CreateWorkspace | IntroduceIdea
data family CommandRequest (c :: CommandRequestC)
type UUID = Int -- for the sake of examples
data instance CommandRequest CreateWorkspace
= CreateWorkspaceCR
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data instance CommandRequest IntroduceIdea
= IntroduceIdeaCR
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
What's going on here? Well, we introduced a new type CommandRequestC (the trailing C stands for "constructor") with two constructors CreateWorkspace and IntroduceIdea. The only purpose of these constructors was to lift them to the type level using DataKinds in order to use them as type-level tags for the CommandRequest data family. This is a very common way of using DataKinds, maybe the most common. In fact, the example you gave of the type ReadResult 'RegularStream StreamSlice was exactly this kind of usage. The type:
data StreamType = All | RegularStream
carries no useful data. The whole point of its existence is to lift the constructors All and RegularStream to type-level tags, so that ReadResult 'All StreamSlice and ReadResult 'RegularStream StreamSlice can be used to name two different related types, just like CommandRequest 'CreateWorkspace and CommandRequest 'IntroduceIdea name two different related types.
At this point, we have two separate types for our two constructors that happen to be related via a tagged data family, rather than via a type class.
testdata3 :: [CommandRequest 'CreateWorkspace]
testdata3 = [CreateWorkspaceCR 1 2, CreateWorkspaceCR 3 4]
testdata4 :: [CommandRequest 'IntroduceIdea]
testdata4 = [IntroduceIdeaCR 5 6 "seven"]
Note that even though we can write the type [CommandRequest c], leaving the constructor tag as an unspecified type variable c, we still can't write a list that mixes these constructors:
testdata5bad :: [CommandRequest c]
testdata5bad = [CreateWorkspaceCR 1 2, CreateWorkspaceCR 3 4,
IntroduceIdeaCR 5 6 "seven"] -- **ERROR**
We still need our existential type:
{-# LANGUAGE ExistentialQuantification #-}
data SomeCommandRequest = forall c . SomeCommandRequest (CommandRequest c)
and the extra existential syntax:
testdata6 :: [SomeCommandRequest]
testdata6 = [ SomeCommandRequest (CreateWorkspaceCR 1 2)
, SomeCommandRequest (CreateWorkspaceCR 3 4)
, SomeCommandRequest (IntroduceIdeaCR 5 6 "seven")]
Worse yet, if we try to write a filter function, it's not clear how to implement it. One reasonable first attempt is:
filterCreateWorkspace :: [SomeCommandRequest] -> [CommandRequest 'CreateWorkspace]
filterCreateWorkspace (SomeCommandRequest cr : rest)
= case cr of cw#(CreateWorkspaceCR _ _) -> cw : filterCreateWorkspace rest
_ -> filterCreateWorkspace rest
but this fails with an error about failing to match to the CreateWorkspace tag.
The problem is that data families aren't powerful enough to allow you to infer which member of a family you actually have (i.e., whether cr is a CreateWorkspaceCR or IntroduceIdeaCR). At this point, we could go back to working with a type class or maybe introduce proxies or singletons to maintain a value-level representation of the constructors in the existential type, but there's a more straightforward solution.
GADTs are powerful enough to infer the type of cr, and we can rewrite our data family as a GADT. Not only is the syntax simpler:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
data CommandRequestC = CreateWorkspace | IntroduceIdea
type UUID = Int
data CommandRequest c where
CreateWorkspaceCR :: UUID -> UUID -> CommandRequest 'CreateWorkspace
IntroduceIdeaCR :: UUID -> UUID -> String -> CommandRequest 'IntroduceIdea
deriving instance Show (CommandRequest c)
data SomeCommandRequest = forall c . SomeCommandRequest (CommandRequest c)
but we can implement our filtering function without fuss:
filterCreateWorkspace :: [SomeCommandRequest] -> [CommandRequest 'CreateWorkspace]
filterCreateWorkspace crs
= [ cw | SomeCommandRequest cw#(CreateWorkspaceCR _ _) <- crs ]
define some helpful "constructors":
createWorkspace :: UUID -> UUID -> SomeCommandRequest
createWorkspace u1 u2 = SomeCommandRequest (CreateWorkspaceCR u1 u2)
introduceIdea :: UUID -> UUID -> String -> SomeCommandRequest
introduceIdea u1 u2 s = SomeCommandRequest (IntroduceIdeaCR u1 u2 s)
and test it:
testdata7 :: [SomeCommandRequest]
testdata7 = [ createWorkspace 1 2
, createWorkspace 3 4
, introduceIdea 5 6 "seven"]
test7 = filterCreateWorkspace testdata7
like so:
> test4
[CreateWorkspaceCR 1 2,CreateWorkspaceCR 3 4]
>
Does any of this look familiar? It should, because it's #chi's solution. And it's the only type-level solution that really makes sense, giving what you're trying to do.
Now, with a couple of type aliases and some clever renaming, you can technically get the type signature you want, like so:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
data CommandRequestC = CreateWorkspaceC | IntroduceIdeaC
type CreateWorkspace = ACommandRequest 'CreateWorkspaceC
type IntroduceIdea = ACommandRequest 'IntroduceIdeaC
type UUID = Int
data ACommandRequest c where
CreateWorkspaceCR :: UUID -> UUID -> CreateWorkspace
IntroduceIdeaCR :: UUID -> UUID -> String -> IntroduceIdea
deriving instance Show (ACommandRequest c)
data CommandRequest = forall c . CommandRequest (ACommandRequest c)
filterCreateWorkspace :: [CommandRequest] -> [CreateWorkspace]
filterCreateWorkspace crs
= [ cw | CommandRequest cw#(CreateWorkspaceCR _ _) <- crs ]
createWorkspace :: UUID -> UUID -> CommandRequest
createWorkspace u1 u2 = CommandRequest (CreateWorkspaceCR u1 u2)
introduceIdea :: UUID -> UUID -> String -> CommandRequest
introduceIdea u1 u2 s = CommandRequest (IntroduceIdeaCR u1 u2 s)
testdata8 :: [CommandRequest]
testdata8 = [ createWorkspace 1 2
, createWorkspace 3 4
, introduceIdea 5 6 "seven"]
test8 = filterCreateWorkspace testdata8
but this is just a trick, so I wouldn't take it too seriously.
And, if all this seems like a lot of work and leaves you feeling dissatisfied with the resulting solution, then welcome to the world of type-level programming! (Actually, it is all kind of fun, but try not to expect too much.)
You can use a list comprehension to filter only those values obtained through a specific constructor. Note that the type of the list does not change.
filter :: [CommandRequest] -> [CommandRequest]
filter xs = [ x | x#(CreateWorkspace{}) <- xs ]
If you want a more precise type, you need more complex type-level machinery, like GADTs.
Here's an untested approach. You'll need a few extensions to be turned on.
data CR = CW | II -- to be promoted to "kinds"
-- A more precise, indexed type
data CommandRequestP (k :: CR) where
CreateWorkspace :: {commandId :: UUID, workspaceId ::UUID }
-> CommandRequestP 'CW
IntroduceIdea :: {commandId :: UUID, workspaceId ::UUID, ideaContent :: String}
-> CommandRequestP 'II
-- Existential wrapper, so that we can build lists
data CommandRequest where
C :: CommandRequestP k -> CommandRequest
filter :: [CommandRequest] -> [CommandRequestP 'CW]
filter xs = [ x | C (x#(CreateWorkspace{})) <- xs ]

Disambiguate record update with DuplicateRecordFields

I'm using the DuplicateRecordFields (+OverloadedLabels) extension, and I've run into a situation where I can't figure out how to disambiguate in a record update.
Here is a simplified example:
data A = A { name :: String }
data B = B { name :: String }
combine :: A -> B -> A
combine a b = a { name = name b }
Is there any way to make this work?
I answered in one of the previous questions about -XDuplicateRecordFields that currently GHC doesn't infer type of record field from its argument:
Silly duplicated record fields error
What you can do now is to specify type of name extractor explicitly, like this:
{-# LANGUAGE DuplicateRecordFields #-}
data A = A { name :: String }
data B = B { name :: String }
combine :: A -> B -> A
combine a b = a { name = (name :: B -> String) b }
Alternatively, you can mechanically use getField from GHC.Records to disambiguate, like this:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE TypeApplications #-}
module DRF where
import GHC.Records (getField)
data A = A { name :: String } deriving Show
data B = B { name :: String }
combine :: A -> B -> A
combine a b = a { name = getField #"name" b }
{- in ghci
Prelude DRF> a = A "Alice"
Prelude DRF> b = B "Bob"
Prelude DRF> DRF.combine a b
A {name = "Bob"}
-}
References:
https://www.reddit.com/r/haskell/comments/8693a3/usefulness_of_duplicaterecordfields/dw3fe8u/
https://hackage.haskell.org/package/base-4.12.0.0/docs/GHC-Records.html
You could match the name from a pattern:
data A = A { name :: String }
data B = B { name :: String }
combine :: A -> B -> A
combine a B{name = nb} = a { name = nb }
I'm not a fan of DuplicateRecordFields though. Why not instead go the lens route?
{-# LANGUAGE TemplateHaskell, FlexibleInstances, FunctionalDependencies #-}
import Control.Lens
import Control.Lens.TH
data A = A { _aName :: String }
makeFields ''A
data B = B { _bName :: String }
makeFields ''B
combine :: A -> B -> A
combine a b = a & name .~ b^.name

Haskell: refer to type of value at compile time

I'm wondering if there's a nice way to refer to types of values without explicitly aliasing them using type in code (not at runtime - there is no reification going on here).
Take the following code (using Data.Vinyl):
{-# LANGUAGE DataKinds, TypeOperators #-}
import Data.Vinyl
name = Field :: "name" ::: String
age = Field :: "age" ::: Int
type Person = ["name" ::: String, "age" ::: Int]
Here we have the types "name" ::: String and "age" ::: Int repeated in two places. If we reuse fields in multiple records, this can become multiple places. Despite the fact that the Person type is really referring to the constituent fields, the type declarations are independent. So changing age to be represented by Float, say, requires changes in various places.
Obviously it's not necessary to explicitly type things, since they will be inferred. However, in my case the record types are being returned from an options parser, and thus exported. Likewise, one could write the following:
type Name = "name" ::: String
name = Field :: Name
type Age = "age" ::: Int
age = Field :: Age
type Person = [Name, Age]
However, this then involves another load of type aliases and double the number of lines. What I would like to be able to write is the following:
name = Field :: "name" ::: String
age = Field :: "age" ::: Int
type Person = [typeof name, typeof age]
This explicitly links the type of a Person to the types of its fields.
Is there a way (preferably sans-TH, but I'd be interested even involving TH) to do this?
It should be easy enough to make a String -> [Name] -> DecsQ function out of
the following. Too bad with ghc7.6 (at least), the check for cycles in type
synonyms seems to stop the prettier type Person = $(listOfT ['name, 'age]) from
working out.
{-# LANGUAGE DataKinds, TemplateHaskell, TypeOperators #-}
import Language.Haskell.TH
import Control.Applicative
import Data.Vinyl
name = Field :: "name" ::: String
age = Field :: "age" ::: Int
let listOfT (n:ns) = do
VarI _ ty _ _ <- reify n
(appT promotedConsT) (return ty) `appT` listOfT ns
listOfT [] = promotedNilT
in return <$> tySynD (mkName "Person") [] (listOfT ['name, 'age])

Name conflicts in Haskell records

Haskell doesn't have dot notation for record members. For each record member a compiler creates a function with the same name with a type RecType -> FieldType. This leads to name conflicts. Are there any ways to work around this, i.e. how can I have several records with the same field names?
For large projects, I prefer to keep each type in its own module and use Haskell's module system to namespace accessors for each type.
For example, I might have some type A in module A:
-- A.hs
data A = A
{ field1 :: String
, field2 :: Double
}
... and another type B with similarly-named fields in module B:
-- B.hs
data B = B
{ field1 :: Char
, field2 :: Int
}
Then if I want to use both types in some other module C I can import them qualified to distinguish which accessor I mean:
-- C.hs
import A as A
import B as B
f :: A -> B -> (Double, Int)
f a b = (A.field2 a, B.field2 b)
Unfortunately, Haskell does not have a way to define multiple name-spaces within the same module, otherwise there would be no need to split each type in a separate module to do this.
Another way to avoid this problem is to use the lens package. It provides a makeFields template haskell function, which you can use like this:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Control.Lens
data A = A
{ _aText :: String
}
makeFields ''A -- Creates a lens x for each record accessor with the name _aX
data B = B
{ _bText :: Int
, _bValue :: Int
}
-- Creates a lens x for each record accessor with the name _bX
makeFields ''B
main = do
let a = A "hello"
let b = B 42 1
-- (^.) is a function of lens which accesses a field (text) of some value (a)
putStrLn $ "Text of a: " ++ a ^. text
putStrLn $ "Text of b: " ++ show (b ^. text)
If you don't want to use TemplateHaskell and lens, you can also do manually what lens automates using TemplateHaskell:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
data A = A
{ aText :: String
}
data B = B
{ bText :: Int
, bValue :: Int
}
-- A class for types a that have a "text" field of type t
class HasText a t | a -> t where
-- An accessor for the text value
text :: a -> t
-- Make our two types instances of those
instance HasText A String where text = aText
instance HasText B Int where text = bText
main = do
let a = A "hello"
let b = B 42 1
putStrLn $ "Text of a: " ++ text a
putStrLn $ "Text of b: " ++ show (text b)
But I can really recommend learning lens, as it also provides lots of other utilities, like modifying or setting a field.
The GHC developers developed a couple of extensions to help with this issue . Check out this ghc wiki page. Initially a single OverloadedRecordFields extension was planned, but instead two extensions were developed. The extensions are OverloadedLabels and DuplicateRecordFields. Also see that reddit discussion.
The DuplicateRecordFields extensions makes this code legal in a single module:
data Person = MkPerson { personId :: Int, name :: String }
data Address = MkAddress { personId :: Int, address :: String }
As of 2019, I'd say these two extensions didn't get the adoption I thought they would have (although they did get some adoption) and the status quo is probably still ongoing.

How to "newtype" IntSet?

Thanks to newtype and the GeneralizedNewtypeDeriving extension, one can define distinct lightweight types with little effort:
newtype PersonId = PersonId Int deriving (Eq, Ord, Show, NFData, ...)
newtype GroupId = GroupId Int deriving (Eq, Ord, Show, NFData, ...)
which allows the type-system to make sure a PersonId is not used by accident where a GroupId was expected, but still inherit selected typeclass instances from Int.
Now one could simply define PersonIdSet and GroupIdSet as
import Data.Set (Set)
import qualified Data.Set as Set
type PersonIdSet = Set PersonId
type GroupIdSet = Set GroupId
noGroups :: GroupIdSet
noGroups = Set.empty
-- should not type-check
foo = PersonId 123 `Set.member` noGroups
-- should type-check
bar = GroupId 123 `Set.member` noGroups
which is type safe, since map is parametrized by the key-type, and also, the Set.member operation is polymorphic so I don't need to define per-id-type variants such as personIdSetMember and groupIdSetMember (and all other set-operations I might want to use)
...but how can I use the more efficient IntSets instead for PersonIdSet and GroupIdSet respectively in a similiar way to the example above? Is there a simple way w/o having to wrap/replicate the whole Data.IntSet API as a typeclass?
I think you have to wrap IntSet as you said. However, rather than defining each ID type separately, you can introduce a phantom type to create a family of IDs and IDSets that are compatible with one another:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)
newtype ID a = ID { unID :: Int }
deriving ( Eq, Ord, Show, Num )
newtype IDSet a = IDSet { unIDSet :: IntSet }
deriving ( Eq, Ord, Show )
null :: IDSet a -> Bool
null = IntSet.null . unIDSet
member :: ID a -> IDSet a -> Bool
member i = IntSet.member (unID i) . unIDSet
empty :: IDSet a
empty = IDSet $ IntSet.empty
singleton :: ID a -> IDSet a
singleton = IDSet . IntSet.singleton . unID
insert :: ID a -> IDSet a -> IDSet a
insert i = IDSet . IntSet.insert (unID i) . unIDSet
delete :: ID a -> IDSet a -> IDSet a
delete i = IDSet . IntSet.delete (unID i) . unIDSet
So, assuming you have a Person type, and a Group type, you can do:
type PersonID = ID Person
type PersonIDSet = IDSet Person
type GroupID = ID Group
type GroupIDSet = IDSet Group
The enummapset package implements one approach to newtype-safe IntMap/IntSets.
An example for its usage based on the types from the original question:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.EnumSet (EnumSet)
import qualified Data.EnumSet as ES
newtype PersonId = PersonId Int deriving Enum
newtype GroupId = GroupId Int deriving Enum
type PersonIdSet = EnumSet PersonId
type GroupIdSet = EnumSet GroupId
noGroups :: GroupIdSet
noGroups = ES.empty
-- fails type-check: Couldn't match expected type `PersonId' with actual type `GroupId'
foo = PersonId 123 `ES.member` noGroups
-- passes type-check
bar = GroupId 123 `ES.member` noGroups
The usage of Data.EnumMap is similar.
I am under the impression you assume it is less efficient to use a type instead of a newtype. That is not true, newtypes are usually more efficiently implemented than datas.
So, your definition of PersonIdSet is perfectly safe and as efficient as you might want.

Resources