Filter a list by returning only one kind of data constructor instead of the type of the type constructor - haskell

So let's say I have the following data type :
data CommandRequest = CreateWorkspace {commandId :: UUID , workspaceId ::UUID }
| IntroduceIdea {commandId :: UUID , workspaceId ::UUID , ideaContent :: String} deriving (Show,Eq)
with the {-# LANGUAGE DataKinds #-}
I want to implement the following function (in pseudocode) :
filter :: [CommandRequest] -> [CreateWorkspace] (promoting the data constructor to a type level)
can you help me with the implementation of that function ?... Thank you !

Given a Haskell type like:
data Foo = Bar Int | Baz String
there is no direct way of writing down a new type that represents the subset of values of type Foo that are constructed with Bar, even using the DataKinds extension.
In particular, when you turn on DataKinds, the Bar type that you get is not the type of the values Bar 1 and Bar 2. In fact, the new lifted Bar type doesn't really have anything to do with the values Bar 1 and Bar 2, except for the fact that they share the name Bar. It's not that different than explicitly defining:
data True = TrueThing
This new type True has nothing to do with the value True of type Bool, except they happen to have the same name.
Presuming that what you are trying to do is find a type-safe way of representing the result of filtering CommandRequest for just those values that were constructed with the CreateWorkspace constructor so that you can't "accidentally" let an IntroduceIdea sneak in to your list, you'll have to take another approach. There are several possibilities.
The most straightforward way, which doesn't require any special type-level programming at all, is to represent CreateWorkspace and IntroduceIdea as separate types:
{-# LANGUAGE DuplicateRecordFields #-}
data CreateWorkspace = CreateWorkspace
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data IntroduceIdea = IntroduceIdea
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
and then create a new algebraic sum type to represent the disjoint union of those separate types:
data CommandRequest
= CreateWorkspace' CreateWorkspace
| IntroduceIdea' IntroduceIdea
deriving (Show)
Note we've used the ticks to differentiate these constructors from those used in the underlying component types. A simple variant of this would be to move common fields (like commandId, and perhaps workSpaceId) into the CommandRequest type. This might or might not make sense, depending on what you're trying to accomplish.
Anyway, this adds a little syntactic fluff, but it's straightforward to define:
filterCreateWorkspace :: [CommandRequest] -> [CreateWorkspace]
filterCreateWorkspace crs = [ cw | CreateWorkspace' cw <- crs ]
and with some additional "constructors":
createWorkspace :: UUID -> UUID -> CommandRequest
createWorkspace u1 u2 = CreateWorkspace' (CreateWorkspace u1 u2)
introduceIdea :: UUID -> UUID -> String -> CommandRequest
introduceIdea u1 u2 s = IntroduceIdea' (IntroduceIdea u1 u2 s)
it's not too hard to create and filter [CommandRequest] lists:
type UUID = Int
testdata1 :: [CommandRequest]
testdata1
= [ createWorkspace 1 2
, createWorkspace 3 4
, introduceIdea 5 6 "seven"
]
test1 = filterCreateWorkspace testdata1
giving:
> test1
[CreateWorkspace {commandId = 1, workspaceId = 2}
,CreateWorkspace {commandId = 3, workspaceId = 4}]
This is almost certainly the correct approach for doing what you want to do. I mean, this is exactly what algebraic data types are for. This is what a Haskell program is supposed to look like.
"But no," I hear you say! "I want to spend endless hours fighting confusing type errors! I want to crawl down the dependent type rabbit hole. You know, for 'reasons'." Should I stand in your way? Can one man stand against the ocean?
If you really want to do this at the type level, you still want to define separate types for your two constructors:
data CreateWorkspace = CreateWorkspace
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data IntroduceIdea = IntroduceIdea
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
As before, this makes it easy to represent a list of type [CreateWorkspace]. Now, the key to working at the type level will be finding a way to make it as difficult as possible to represent a list of type [CommandRequest]. A standard method would be to introduce a CommandRequest type class with instances for our two types, together with an existential type to represent an arbitrary type belonging to that class:
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
type UUID = Int -- for the sake of examples
data CreateWorkspace = CreateWorkspace
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data IntroduceIdea = IntroduceIdea
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
class CommandRequest a where
maybeCreateWorkspace :: a -> Maybe CreateWorkspace
instance CommandRequest CreateWorkspace where
maybeCreateWorkspace c = Just c
instance CommandRequest IntroduceIdea where
maybeCreateWorkspace _ = Nothing
data SomeCommandRequest = forall t . CommandRequest t => SomeCommandRequest t
Now we can define:
import Data.Maybe
filterCreateWorkspace :: [SomeCommandRequest] -> [CreateWorkspace]
filterCreateWorkspace = catMaybes . map getCW
where getCW (SomeCommandRequest cr) = maybeCreateWorkspace cr
which works fine, though the syntax is still a bit cumbersome:
testdata2 :: [SomeCommandRequest]
testdata2 = [ SomeCommandRequest (CreateWorkspace 1 2)
, SomeCommandRequest (CreateWorkspace 3 4)
, SomeCommandRequest (IntroduceIdea 5 6 "seven")
]
test2 = print $ filterCreateWorkspace testdata2
The test gives:
> test2
[CreateWorkspace {commandId = 1, workspaceId = 2}
,CreateWorkspace {commandId = 3, workspaceId = 4}]
The awkward thing about this solution is that we need a type class method for identifying the CreateWorkspace type. If we wanted to construct lists of each possible constructor, we'd need to add a new type class method for every single one, and we need to give a definition for the method for every instance (though we can get away with a default definition that returns Nothing for all but one instance, I guess). Anyway, that's nuts!
The mistake we made was making it difficult to represent a list of type [CreateWorkspace] instead of absurdly difficult. To make it absurdly difficult, we'll still want to represent our two constructors as separate types, but we'll make them instances of a data family keyed by constructor names lifted to the type level by the DataKinds extension. Now this is starting to look like a Haskell program!
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE TypeFamilies #-}
data CommandRequestC = CreateWorkspace | IntroduceIdea
data family CommandRequest (c :: CommandRequestC)
type UUID = Int -- for the sake of examples
data instance CommandRequest CreateWorkspace
= CreateWorkspaceCR
{ commandId :: UUID
, workspaceId ::UUID
} deriving (Show)
data instance CommandRequest IntroduceIdea
= IntroduceIdeaCR
{ commandId :: UUID
, workspaceId ::UUID
, ideaContent :: String
} deriving (Show)
What's going on here? Well, we introduced a new type CommandRequestC (the trailing C stands for "constructor") with two constructors CreateWorkspace and IntroduceIdea. The only purpose of these constructors was to lift them to the type level using DataKinds in order to use them as type-level tags for the CommandRequest data family. This is a very common way of using DataKinds, maybe the most common. In fact, the example you gave of the type ReadResult 'RegularStream StreamSlice was exactly this kind of usage. The type:
data StreamType = All | RegularStream
carries no useful data. The whole point of its existence is to lift the constructors All and RegularStream to type-level tags, so that ReadResult 'All StreamSlice and ReadResult 'RegularStream StreamSlice can be used to name two different related types, just like CommandRequest 'CreateWorkspace and CommandRequest 'IntroduceIdea name two different related types.
At this point, we have two separate types for our two constructors that happen to be related via a tagged data family, rather than via a type class.
testdata3 :: [CommandRequest 'CreateWorkspace]
testdata3 = [CreateWorkspaceCR 1 2, CreateWorkspaceCR 3 4]
testdata4 :: [CommandRequest 'IntroduceIdea]
testdata4 = [IntroduceIdeaCR 5 6 "seven"]
Note that even though we can write the type [CommandRequest c], leaving the constructor tag as an unspecified type variable c, we still can't write a list that mixes these constructors:
testdata5bad :: [CommandRequest c]
testdata5bad = [CreateWorkspaceCR 1 2, CreateWorkspaceCR 3 4,
IntroduceIdeaCR 5 6 "seven"] -- **ERROR**
We still need our existential type:
{-# LANGUAGE ExistentialQuantification #-}
data SomeCommandRequest = forall c . SomeCommandRequest (CommandRequest c)
and the extra existential syntax:
testdata6 :: [SomeCommandRequest]
testdata6 = [ SomeCommandRequest (CreateWorkspaceCR 1 2)
, SomeCommandRequest (CreateWorkspaceCR 3 4)
, SomeCommandRequest (IntroduceIdeaCR 5 6 "seven")]
Worse yet, if we try to write a filter function, it's not clear how to implement it. One reasonable first attempt is:
filterCreateWorkspace :: [SomeCommandRequest] -> [CommandRequest 'CreateWorkspace]
filterCreateWorkspace (SomeCommandRequest cr : rest)
= case cr of cw#(CreateWorkspaceCR _ _) -> cw : filterCreateWorkspace rest
_ -> filterCreateWorkspace rest
but this fails with an error about failing to match to the CreateWorkspace tag.
The problem is that data families aren't powerful enough to allow you to infer which member of a family you actually have (i.e., whether cr is a CreateWorkspaceCR or IntroduceIdeaCR). At this point, we could go back to working with a type class or maybe introduce proxies or singletons to maintain a value-level representation of the constructors in the existential type, but there's a more straightforward solution.
GADTs are powerful enough to infer the type of cr, and we can rewrite our data family as a GADT. Not only is the syntax simpler:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
data CommandRequestC = CreateWorkspace | IntroduceIdea
type UUID = Int
data CommandRequest c where
CreateWorkspaceCR :: UUID -> UUID -> CommandRequest 'CreateWorkspace
IntroduceIdeaCR :: UUID -> UUID -> String -> CommandRequest 'IntroduceIdea
deriving instance Show (CommandRequest c)
data SomeCommandRequest = forall c . SomeCommandRequest (CommandRequest c)
but we can implement our filtering function without fuss:
filterCreateWorkspace :: [SomeCommandRequest] -> [CommandRequest 'CreateWorkspace]
filterCreateWorkspace crs
= [ cw | SomeCommandRequest cw#(CreateWorkspaceCR _ _) <- crs ]
define some helpful "constructors":
createWorkspace :: UUID -> UUID -> SomeCommandRequest
createWorkspace u1 u2 = SomeCommandRequest (CreateWorkspaceCR u1 u2)
introduceIdea :: UUID -> UUID -> String -> SomeCommandRequest
introduceIdea u1 u2 s = SomeCommandRequest (IntroduceIdeaCR u1 u2 s)
and test it:
testdata7 :: [SomeCommandRequest]
testdata7 = [ createWorkspace 1 2
, createWorkspace 3 4
, introduceIdea 5 6 "seven"]
test7 = filterCreateWorkspace testdata7
like so:
> test4
[CreateWorkspaceCR 1 2,CreateWorkspaceCR 3 4]
>
Does any of this look familiar? It should, because it's #chi's solution. And it's the only type-level solution that really makes sense, giving what you're trying to do.
Now, with a couple of type aliases and some clever renaming, you can technically get the type signature you want, like so:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
data CommandRequestC = CreateWorkspaceC | IntroduceIdeaC
type CreateWorkspace = ACommandRequest 'CreateWorkspaceC
type IntroduceIdea = ACommandRequest 'IntroduceIdeaC
type UUID = Int
data ACommandRequest c where
CreateWorkspaceCR :: UUID -> UUID -> CreateWorkspace
IntroduceIdeaCR :: UUID -> UUID -> String -> IntroduceIdea
deriving instance Show (ACommandRequest c)
data CommandRequest = forall c . CommandRequest (ACommandRequest c)
filterCreateWorkspace :: [CommandRequest] -> [CreateWorkspace]
filterCreateWorkspace crs
= [ cw | CommandRequest cw#(CreateWorkspaceCR _ _) <- crs ]
createWorkspace :: UUID -> UUID -> CommandRequest
createWorkspace u1 u2 = CommandRequest (CreateWorkspaceCR u1 u2)
introduceIdea :: UUID -> UUID -> String -> CommandRequest
introduceIdea u1 u2 s = CommandRequest (IntroduceIdeaCR u1 u2 s)
testdata8 :: [CommandRequest]
testdata8 = [ createWorkspace 1 2
, createWorkspace 3 4
, introduceIdea 5 6 "seven"]
test8 = filterCreateWorkspace testdata8
but this is just a trick, so I wouldn't take it too seriously.
And, if all this seems like a lot of work and leaves you feeling dissatisfied with the resulting solution, then welcome to the world of type-level programming! (Actually, it is all kind of fun, but try not to expect too much.)

You can use a list comprehension to filter only those values obtained through a specific constructor. Note that the type of the list does not change.
filter :: [CommandRequest] -> [CommandRequest]
filter xs = [ x | x#(CreateWorkspace{}) <- xs ]
If you want a more precise type, you need more complex type-level machinery, like GADTs.
Here's an untested approach. You'll need a few extensions to be turned on.
data CR = CW | II -- to be promoted to "kinds"
-- A more precise, indexed type
data CommandRequestP (k :: CR) where
CreateWorkspace :: {commandId :: UUID, workspaceId ::UUID }
-> CommandRequestP 'CW
IntroduceIdea :: {commandId :: UUID, workspaceId ::UUID, ideaContent :: String}
-> CommandRequestP 'II
-- Existential wrapper, so that we can build lists
data CommandRequest where
C :: CommandRequestP k -> CommandRequest
filter :: [CommandRequest] -> [CommandRequestP 'CW]
filter xs = [ x | C (x#(CreateWorkspace{})) <- xs ]

Related

How do I model a record's fields as data?

Let's say I have a Person record with some fields:
data Person = Person
{ name :: String
, age :: Int
, id :: Int
}
and I want to be able to search a list of Persons by a given field:
findByName :: String -> [Person] -> Maybe Person
findByName s = find (\p -> name p == s)
Now let's say I want to be able to model and store these searches/queries as data, for instance for logging purposes, or to batch execute them, or whatever.
How would I go about representing a search over a given field (or set of fields) as data?
My intuition says to model it as a map of fields to string values (Map (RecordField) (Maybe String)), but I can't do that, because record fields are functions.
Is there a better way to do this than, say, the following?
data PersonField = Name | Age | Int
type Search = Map PersonField (Maybe String)
This could technically work but it decouples PersonField from Person in an ugly way.
I want to be able to model and store these searches/queries as data
Let's assume we want to store them as JSON. We could define a type like
data Predicate record = Predicate {
runPredicate :: record -> Bool ,
storePredicate :: Value
}
Where storePredicate would return a JSON representation of the "reference value" inside the predicate. For example, the value 77 for "age equals 77".
For each record, we would like to have a collection like this:
type FieldName = String
type FieldPredicates record = [(FieldName, Value -> Maybe (Predicate record))]
That is: for each field, we can supply a JSON value encoding the "reference value" of the predicate and, if it parses successfully, we get a Predicate. Otherwise we get Nothing. This would allows us to serialize and deserialize predicates.
We could define FieldPredicates manually for each record, but is there a more automated way? We could try generating field equality predicates using a typeclass. But first, the extensions and imports dance:
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE BlockArguments #-}
import Data.Functor ( (<&>) )
import Data.Kind ( Type, Constraint )
import Data.Proxy
import GHC.Records ( HasField(..) )
import GHC.TypeLits ( KnownSymbol, Symbol, symbolVal )
import Data.Aeson ( FromJSON(parseJSON), Value, ToJSON(toJSON) )
import Data.Aeson.Types (parseMaybe)
import Data.List ( lookup )
Now we define the helper typeclass:
type HasEqFieldPredicates :: [Symbol] -> Type -> Constraint
class HasEqFieldPredicates fieldNames record where
eqFieldPredicates :: FieldPredicates record
instance HasEqFieldPredicates '[] record where
eqFieldPredicates = []
instance
( KnownSymbol fieldName,
HasField fieldName record v,
Eq v,
FromJSON v,
ToJSON v,
HasEqFieldPredicates fieldNames record
) =>
HasEqFieldPredicates (fieldName ': fieldNames) record
where
eqFieldPredicates =
let current =
( symbolVal (Proxy #fieldName),
\j ->
parseMaybe (parseJSON #v) j <&> \v ->
Predicate (\record -> getField #fieldName record == v) (toJSON v))
in current : eqFieldPredicates #fieldNames #record
An example with Person:
personEqPredicates :: [(FieldName, Value -> Maybe (Predicate Person))]
personEqPredicates = eqFieldPredicates #["name", "age", "id"] #Person
personAgeEquals :: Value -> Maybe (Predicate Person)
personAgeEquals = let Just x = Data.List.lookup "age" personEqPredicates in x
Putting it to work:
ghci> let Just p = personAgeEquals (toJSON (77::Int)) in runPredicate p Person { name = "John", age = 78, id = 3 }
False
ghci> let Just p = personAgeEquals (toJSON (78::Int)) in runPredicate p Person { name = "John", age = 78, id = 3 }
True
If you don't need to serialize these query objects to disk, then your "field" type is Person -> a. A record accessor is just a function from Person to some type a. Or if you end up outgrowing basic accessors and need to work with a lot of nested data, you can look into lenses.
However, it sounds like you want to be able to write these queries to disk. In that case, you can't easily serialize functions (or lenses, for that matter). I don't know of a way built-in to Haskell to do all of that automatically and still have it be serializable. So my recommendation would be to roll your own datatypes.
data PersonField = Name | Age | Id
or, even better, you can use GADTs to keep type safety.
data PersonField a where
Name :: PersonField String
Age :: PersonField Int
Id :: PersonField Int
getField :: PersonField a -> Person -> a
getField Name = name
getField Age = age
getField Id = id
Then you have total control over this concrete type and can write your own serialization logic for it. I think Map PersonField (Maybe String) is a good start, and you can refine the Maybe String part if you end up doing more complex queries (like "contains" or "case insensitive comparison", for instance).

Polymorphic lens without template haskell

I am trying to create a polymorphic lens decleration (without template haskell) for multiple types.
module Sample where
import Control.Lens
data A = A {_value:: Int}
data B = B {_value:: Int}
data C = C {_value:: String}
value = lens _value (\i x -> i{_value=x}) -- <<< ERROR
But I get following error:
Ambiguous occurrence ‘_value’
It could refer to either the field ‘_value’,
defined at library/Sample.hs:5:13
or the field ‘_value’, defined at
library/Sample.hs:4:13
or the field ‘_value’, defined at
library/Sample.hs:3:13
|
6 | value = lens _value (\i x -> i{_value=x}) -- <<< ERROR
| ^^^^^^
So, goal is to have value lens to work on all three types A, B and C. Is there a way to achieve that?
Thanks.
Lenses can be derived without TH by generic-lens. You can specialize the generic field lens to a specific field and give it a name as follows.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeApplications #-}
import GHC.Generics (Generic)
import Control.Lens (Lens, (^.))
import Data.Generics.Product (HasField(field))
data A = A { _value :: Int } deriving Generic
data B = B { _value :: Int } deriving Generic
data C = C { _value :: String } deriving Generic
value :: HasField "_value" s t a b => Lens s t a b
value = field #"_value"
main :: IO ()
main = print (A 0 ^. value, B 0 ^. value, C "0" ^. value)
Haskell does not support overloading functions the way a language like Java or C++ does. To do what you want you need to use a typeclass like so.
class HasValue a b where
value :: Lens' a b
data A = A {_valueA:: Int}
data B = B {_valueB:: Int}
data C = C {_valueC:: String}
instance HasValue A Int where
value = lens _valueA (\i x -> i{_valueA=x})
instance HasValue B Int where
value = lens _valueB (\i x -> i{_valueB=x})
instance HasValue C String where
value = lens _valueC (\i x -> i{_valueC=x}
You'll need to enable multiparameter typeclasses to do this.
If you want to avoid DuplicateRecordFields, another option is to define everything in its own module, though this will require qualified imports for them to be imported into the same module.
module Sample.A where
import Control.Lens
data A = A {_value:: Int}
value = lens _value (\i x -> i{_value=x})
module Sample.B where
import Control.Lens
data B = B {_value:: Int}
value = lens _value (\i x -> i{_value=x})
module Sample.C where
import Control.Lens
data C = C {_value:: String}
value = lens _value (\i x -> i{_value=x})
module Main where
import qualified Sample.A as A
import qualified Sample.B as B
import qualified Sample.C as C
main :: IO ()
main = print (A.A 0 ^. A.value, B.B 0 ^. B.value, C.C "0" ^. C.value)

Lenses and prisms with sum types

This code won't compile.
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data A = A { _a1 :: B, _a2 :: Int }
makeLenses ''A
data B = B1 { _b1 :: Int } | B2
makeLenses ''B
The error is amy.hs:5:21: Not in scope: type constructor or class ‘B’. I have two questions.
Is there a way to do something like this, or do I need to write my own lenses for B?
Given an A, I would like to apply a function to the b1 field, if that field exists. I think this is a job for prisms, but I haven't figured out how to do it.
Rearrange your program as follows
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data A = A { _a1 :: B, _a2 :: Int }
data B = B1 { _b1 :: Int } | B2
makeLenses ''A
makeLenses ''B
The issue has to do with staging order of Template Haskell (and in this case it's possibly a bug).
Observe that makeLenses ''B creates a Traversal for the _b1 field because it only appears in one of the two constructors.
b1 :: Traversal' B Int
If you were to use Prisms, as well, you'd add
makePrisms ''B
which would produce
_B1 :: Prism' B Int
_B2 :: Prism' B ()

Name conflicts in Haskell records

Haskell doesn't have dot notation for record members. For each record member a compiler creates a function with the same name with a type RecType -> FieldType. This leads to name conflicts. Are there any ways to work around this, i.e. how can I have several records with the same field names?
For large projects, I prefer to keep each type in its own module and use Haskell's module system to namespace accessors for each type.
For example, I might have some type A in module A:
-- A.hs
data A = A
{ field1 :: String
, field2 :: Double
}
... and another type B with similarly-named fields in module B:
-- B.hs
data B = B
{ field1 :: Char
, field2 :: Int
}
Then if I want to use both types in some other module C I can import them qualified to distinguish which accessor I mean:
-- C.hs
import A as A
import B as B
f :: A -> B -> (Double, Int)
f a b = (A.field2 a, B.field2 b)
Unfortunately, Haskell does not have a way to define multiple name-spaces within the same module, otherwise there would be no need to split each type in a separate module to do this.
Another way to avoid this problem is to use the lens package. It provides a makeFields template haskell function, which you can use like this:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Control.Lens
data A = A
{ _aText :: String
}
makeFields ''A -- Creates a lens x for each record accessor with the name _aX
data B = B
{ _bText :: Int
, _bValue :: Int
}
-- Creates a lens x for each record accessor with the name _bX
makeFields ''B
main = do
let a = A "hello"
let b = B 42 1
-- (^.) is a function of lens which accesses a field (text) of some value (a)
putStrLn $ "Text of a: " ++ a ^. text
putStrLn $ "Text of b: " ++ show (b ^. text)
If you don't want to use TemplateHaskell and lens, you can also do manually what lens automates using TemplateHaskell:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
data A = A
{ aText :: String
}
data B = B
{ bText :: Int
, bValue :: Int
}
-- A class for types a that have a "text" field of type t
class HasText a t | a -> t where
-- An accessor for the text value
text :: a -> t
-- Make our two types instances of those
instance HasText A String where text = aText
instance HasText B Int where text = bText
main = do
let a = A "hello"
let b = B 42 1
putStrLn $ "Text of a: " ++ text a
putStrLn $ "Text of b: " ++ show (text b)
But I can really recommend learning lens, as it also provides lots of other utilities, like modifying or setting a field.
The GHC developers developed a couple of extensions to help with this issue . Check out this ghc wiki page. Initially a single OverloadedRecordFields extension was planned, but instead two extensions were developed. The extensions are OverloadedLabels and DuplicateRecordFields. Also see that reddit discussion.
The DuplicateRecordFields extensions makes this code legal in a single module:
data Person = MkPerson { personId :: Int, name :: String }
data Address = MkAddress { personId :: Int, address :: String }
As of 2019, I'd say these two extensions didn't get the adoption I thought they would have (although they did get some adoption) and the status quo is probably still ongoing.

How to "newtype" IntSet?

Thanks to newtype and the GeneralizedNewtypeDeriving extension, one can define distinct lightweight types with little effort:
newtype PersonId = PersonId Int deriving (Eq, Ord, Show, NFData, ...)
newtype GroupId = GroupId Int deriving (Eq, Ord, Show, NFData, ...)
which allows the type-system to make sure a PersonId is not used by accident where a GroupId was expected, but still inherit selected typeclass instances from Int.
Now one could simply define PersonIdSet and GroupIdSet as
import Data.Set (Set)
import qualified Data.Set as Set
type PersonIdSet = Set PersonId
type GroupIdSet = Set GroupId
noGroups :: GroupIdSet
noGroups = Set.empty
-- should not type-check
foo = PersonId 123 `Set.member` noGroups
-- should type-check
bar = GroupId 123 `Set.member` noGroups
which is type safe, since map is parametrized by the key-type, and also, the Set.member operation is polymorphic so I don't need to define per-id-type variants such as personIdSetMember and groupIdSetMember (and all other set-operations I might want to use)
...but how can I use the more efficient IntSets instead for PersonIdSet and GroupIdSet respectively in a similiar way to the example above? Is there a simple way w/o having to wrap/replicate the whole Data.IntSet API as a typeclass?
I think you have to wrap IntSet as you said. However, rather than defining each ID type separately, you can introduce a phantom type to create a family of IDs and IDSets that are compatible with one another:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)
newtype ID a = ID { unID :: Int }
deriving ( Eq, Ord, Show, Num )
newtype IDSet a = IDSet { unIDSet :: IntSet }
deriving ( Eq, Ord, Show )
null :: IDSet a -> Bool
null = IntSet.null . unIDSet
member :: ID a -> IDSet a -> Bool
member i = IntSet.member (unID i) . unIDSet
empty :: IDSet a
empty = IDSet $ IntSet.empty
singleton :: ID a -> IDSet a
singleton = IDSet . IntSet.singleton . unID
insert :: ID a -> IDSet a -> IDSet a
insert i = IDSet . IntSet.insert (unID i) . unIDSet
delete :: ID a -> IDSet a -> IDSet a
delete i = IDSet . IntSet.delete (unID i) . unIDSet
So, assuming you have a Person type, and a Group type, you can do:
type PersonID = ID Person
type PersonIDSet = IDSet Person
type GroupID = ID Group
type GroupIDSet = IDSet Group
The enummapset package implements one approach to newtype-safe IntMap/IntSets.
An example for its usage based on the types from the original question:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.EnumSet (EnumSet)
import qualified Data.EnumSet as ES
newtype PersonId = PersonId Int deriving Enum
newtype GroupId = GroupId Int deriving Enum
type PersonIdSet = EnumSet PersonId
type GroupIdSet = EnumSet GroupId
noGroups :: GroupIdSet
noGroups = ES.empty
-- fails type-check: Couldn't match expected type `PersonId' with actual type `GroupId'
foo = PersonId 123 `ES.member` noGroups
-- passes type-check
bar = GroupId 123 `ES.member` noGroups
The usage of Data.EnumMap is similar.
I am under the impression you assume it is less efficient to use a type instead of a newtype. That is not true, newtypes are usually more efficiently implemented than datas.
So, your definition of PersonIdSet is perfectly safe and as efficient as you might want.

Resources