Name conflicts in Haskell records - haskell

Haskell doesn't have dot notation for record members. For each record member a compiler creates a function with the same name with a type RecType -> FieldType. This leads to name conflicts. Are there any ways to work around this, i.e. how can I have several records with the same field names?

For large projects, I prefer to keep each type in its own module and use Haskell's module system to namespace accessors for each type.
For example, I might have some type A in module A:
-- A.hs
data A = A
{ field1 :: String
, field2 :: Double
}
... and another type B with similarly-named fields in module B:
-- B.hs
data B = B
{ field1 :: Char
, field2 :: Int
}
Then if I want to use both types in some other module C I can import them qualified to distinguish which accessor I mean:
-- C.hs
import A as A
import B as B
f :: A -> B -> (Double, Int)
f a b = (A.field2 a, B.field2 b)
Unfortunately, Haskell does not have a way to define multiple name-spaces within the same module, otherwise there would be no need to split each type in a separate module to do this.

Another way to avoid this problem is to use the lens package. It provides a makeFields template haskell function, which you can use like this:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Control.Lens
data A = A
{ _aText :: String
}
makeFields ''A -- Creates a lens x for each record accessor with the name _aX
data B = B
{ _bText :: Int
, _bValue :: Int
}
-- Creates a lens x for each record accessor with the name _bX
makeFields ''B
main = do
let a = A "hello"
let b = B 42 1
-- (^.) is a function of lens which accesses a field (text) of some value (a)
putStrLn $ "Text of a: " ++ a ^. text
putStrLn $ "Text of b: " ++ show (b ^. text)
If you don't want to use TemplateHaskell and lens, you can also do manually what lens automates using TemplateHaskell:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
data A = A
{ aText :: String
}
data B = B
{ bText :: Int
, bValue :: Int
}
-- A class for types a that have a "text" field of type t
class HasText a t | a -> t where
-- An accessor for the text value
text :: a -> t
-- Make our two types instances of those
instance HasText A String where text = aText
instance HasText B Int where text = bText
main = do
let a = A "hello"
let b = B 42 1
putStrLn $ "Text of a: " ++ text a
putStrLn $ "Text of b: " ++ show (text b)
But I can really recommend learning lens, as it also provides lots of other utilities, like modifying or setting a field.

The GHC developers developed a couple of extensions to help with this issue . Check out this ghc wiki page. Initially a single OverloadedRecordFields extension was planned, but instead two extensions were developed. The extensions are OverloadedLabels and DuplicateRecordFields. Also see that reddit discussion.
The DuplicateRecordFields extensions makes this code legal in a single module:
data Person = MkPerson { personId :: Int, name :: String }
data Address = MkAddress { personId :: Int, address :: String }
As of 2019, I'd say these two extensions didn't get the adoption I thought they would have (although they did get some adoption) and the status quo is probably still ongoing.

Related

How do I model a record's fields as data?

Let's say I have a Person record with some fields:
data Person = Person
{ name :: String
, age :: Int
, id :: Int
}
and I want to be able to search a list of Persons by a given field:
findByName :: String -> [Person] -> Maybe Person
findByName s = find (\p -> name p == s)
Now let's say I want to be able to model and store these searches/queries as data, for instance for logging purposes, or to batch execute them, or whatever.
How would I go about representing a search over a given field (or set of fields) as data?
My intuition says to model it as a map of fields to string values (Map (RecordField) (Maybe String)), but I can't do that, because record fields are functions.
Is there a better way to do this than, say, the following?
data PersonField = Name | Age | Int
type Search = Map PersonField (Maybe String)
This could technically work but it decouples PersonField from Person in an ugly way.
I want to be able to model and store these searches/queries as data
Let's assume we want to store them as JSON. We could define a type like
data Predicate record = Predicate {
runPredicate :: record -> Bool ,
storePredicate :: Value
}
Where storePredicate would return a JSON representation of the "reference value" inside the predicate. For example, the value 77 for "age equals 77".
For each record, we would like to have a collection like this:
type FieldName = String
type FieldPredicates record = [(FieldName, Value -> Maybe (Predicate record))]
That is: for each field, we can supply a JSON value encoding the "reference value" of the predicate and, if it parses successfully, we get a Predicate. Otherwise we get Nothing. This would allows us to serialize and deserialize predicates.
We could define FieldPredicates manually for each record, but is there a more automated way? We could try generating field equality predicates using a typeclass. But first, the extensions and imports dance:
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE BlockArguments #-}
import Data.Functor ( (<&>) )
import Data.Kind ( Type, Constraint )
import Data.Proxy
import GHC.Records ( HasField(..) )
import GHC.TypeLits ( KnownSymbol, Symbol, symbolVal )
import Data.Aeson ( FromJSON(parseJSON), Value, ToJSON(toJSON) )
import Data.Aeson.Types (parseMaybe)
import Data.List ( lookup )
Now we define the helper typeclass:
type HasEqFieldPredicates :: [Symbol] -> Type -> Constraint
class HasEqFieldPredicates fieldNames record where
eqFieldPredicates :: FieldPredicates record
instance HasEqFieldPredicates '[] record where
eqFieldPredicates = []
instance
( KnownSymbol fieldName,
HasField fieldName record v,
Eq v,
FromJSON v,
ToJSON v,
HasEqFieldPredicates fieldNames record
) =>
HasEqFieldPredicates (fieldName ': fieldNames) record
where
eqFieldPredicates =
let current =
( symbolVal (Proxy #fieldName),
\j ->
parseMaybe (parseJSON #v) j <&> \v ->
Predicate (\record -> getField #fieldName record == v) (toJSON v))
in current : eqFieldPredicates #fieldNames #record
An example with Person:
personEqPredicates :: [(FieldName, Value -> Maybe (Predicate Person))]
personEqPredicates = eqFieldPredicates #["name", "age", "id"] #Person
personAgeEquals :: Value -> Maybe (Predicate Person)
personAgeEquals = let Just x = Data.List.lookup "age" personEqPredicates in x
Putting it to work:
ghci> let Just p = personAgeEquals (toJSON (77::Int)) in runPredicate p Person { name = "John", age = 78, id = 3 }
False
ghci> let Just p = personAgeEquals (toJSON (78::Int)) in runPredicate p Person { name = "John", age = 78, id = 3 }
True
If you don't need to serialize these query objects to disk, then your "field" type is Person -> a. A record accessor is just a function from Person to some type a. Or if you end up outgrowing basic accessors and need to work with a lot of nested data, you can look into lenses.
However, it sounds like you want to be able to write these queries to disk. In that case, you can't easily serialize functions (or lenses, for that matter). I don't know of a way built-in to Haskell to do all of that automatically and still have it be serializable. So my recommendation would be to roll your own datatypes.
data PersonField = Name | Age | Id
or, even better, you can use GADTs to keep type safety.
data PersonField a where
Name :: PersonField String
Age :: PersonField Int
Id :: PersonField Int
getField :: PersonField a -> Person -> a
getField Name = name
getField Age = age
getField Id = id
Then you have total control over this concrete type and can write your own serialization logic for it. I think Map PersonField (Maybe String) is a good start, and you can refine the Maybe String part if you end up doing more complex queries (like "contains" or "case insensitive comparison", for instance).

Haskell not using the more specific instance of a typeclass

I've been having trouble the past few days figuring out whether something I'm trying to do is actually feasible in Haskell.
Here is some context:
I am trying to code a little markup language (akin to ReST) where the syntax already enables custom extensions through directives.
For users to implement new directives, they should be able to add new semantic constructs inside the document datatype. For exemple if one wants to add a directive for displaying math, they might want to have a MathBlock String constructor inside the ast.
Obviously data types are not extensible, and a solution where there is a generic constructor DirectiveBlock String containing the name of the directive (here, "math") is undesirable as we would like to have in our ast only well-formed constructs (so only directives with well-formed arguments).
Using type families, I prototyped something like:
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}
-- Arguments for custom directives.
data family Args :: * -> *
data DocumentBlock
= Paragraph String
| forall a. Block (Args a)
Sure enough, if someone wishes to define a new directive for math display, they can do it as such:
data Math
-- The expected arguments for the math directive.
data instance Args Math = MathArgs String
doc :: [DocumentBlock]
doc =
[ Paragraph "some text"
, Block (MathArgs "x_{n+1} = x_{n} + 3")
]
So far so good, we can only construct documents where directive blocks receive the correct arguments.
The problem arises when one user wants to convert the internal representation of a document to some custom output, say, String.
The user needs to provide a default output for all directives, since there will be many and some of them cannot be converted to the target.
Furthermore, the user may wish to provide a more specific output for some directives:
class StringWriter a where
write :: Args a -> String
-- User defined generic conversion for all directives.
instance StringWriter a where
write _ = "Directive"
-- Custom way of showing the math directive.
instance StringWriter Math where
write (MathArgs raw) = "Math(" ++ raw ++ ")"
-- Then to display a DocumentBlock
writeBlock :: DocumentBlock -> String
writeBlock (Paragraph t) = "Paragraph(" ++ t ++ ")"
writeBlock (Block args) = write args
main :: IO ()
main = putStrLn $ writeBlock (Block (MathArgs "a + b"))
With this example, the output is Block and not Math(a+b), so the generic instance for StringWriter is always chosen. Even when playing with {-# OVERLAPPABLE #-}, nothing succeeds.
Is the kind of behavior I'm describing possible at all in Haskell?
When trying to include a generic Writer inside the Block definition, it also fails to compile.
-- ...
class Writer a o where
write :: Args a -> o
data DocumentBlock
= Paragraph String
| forall a o. Writer a o => Block (Args a)
instance {-# OVERLAPPABLE #-} Writer a String where
write _ = "Directive"
instance {-# OVERLAPS #-} Writer Math String where
write (MathArgs raw) = "Math(" ++ raw ++ ")"
-- ...
Your code does not compile, since Block something has type DocumentBlock, while write expects an Args a argument, and the two types are different.
Did you mean writeBlock instead? I'll assume so.
What you might want to try is to add a constraint in your existential type, e.g.:
data DocumentBlock
= Paragraph String
| forall a. StringWriter a => Block (Args a)
-- ^^^^^^^^^^^^^^ --
This has the following effect. Operationally, every time Block something is used, the instance is remembered (a pointer is implicitly stored along the Args a value). That will be a pointer to the catch-all instance, or to the specific one, whichever is the best fit.
When the constructor is then pattern-matched later on, the instance can then be used. Full working code:
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}
-- Arguments for custom directives.
data family Args :: * -> *
data DocumentBlock
= Paragraph String
| forall a. StringWriter a => Block (Args a)
data Math
-- The expected arguments for the math directive.
data instance Args Math = MathArgs String
doc :: [DocumentBlock]
doc =
[ Paragraph "some text"
, Block (MathArgs "x_{n+1} = x_{n} + 3")
]
class StringWriter a where
write :: Args a -> String
-- User defined generic conversion for all directives.
instance {-# OVERLAPPABLE #-} StringWriter a where
write _ = "Directive"
-- Custom way of showing the math directive.
instance StringWriter Math where
write (MathArgs raw) = "Math(" ++ raw ++ ")"
-- Then to display a DocumentBlock
writeBlock :: DocumentBlock -> String
writeBlock (Paragraph t) = "Paragraph(" ++ t ++ ")"
writeBlock (Block args) = write args
main :: IO ()
main = putStrLn $ writeBlock (Block (MathArgs "a + b"))
This prints Math(a + b).
A final note: for this to work it is crucial that all the relevant instances are in scope when Block is used. Otherwise, GHC might choose the wrong instance, causing some unintended output. This is the main limitation, making overlapping instances a bit fragile in general.
As long as there are no orphan instances, this should work.
Also note that, if using other existential types, a user can (intentionally or accidentally) cause GHC to pick the wrong instance anyway. For instance, if we use
data SomeArgs = forall a. SomeArgs (Args a)
toGenericInstance :: DocumentBlock -> DocumentBlock
toGenericInstance (Block a) = case SomeArgs a of
SomeArgs a' -> Block a' -- this will always pick the generic instance
toGenericInstance db = db
then, writeBlock (toGenericInstance (Block (MathArgs "a + b")))
will produce Directive instead.

Haskell Export Record for Read Access Only

I have a Haskell type that uses record syntax.
data Foo a = Foo { getDims :: (Int, Int), getData :: [a] }
I don't want to export the Foo value constructor, so that the user can't construct invalid objects. However, I would like to export getDims, so that the user can get the dimensions of the data structure. If I do this
module Data.ModuleName(Foo(getDims)) where
then the user can use getDims to get the dimensions, but the problem is that they can also use record update syntax to update the field.
getDims foo -- This is allowed (as intended)
foo { getDims = (999, 999) } -- But this is also allowed (not intended)
I would like to prevent the latter, as it would put the data in an invalid state. I realize that I could simply not use records.
data Foo a = Foo { getDims_ :: (Int, Int), getData :: [a] }
getDims :: Foo a -> (Int, Int)
getDims = getDims_
But this seems like a rather roundabout way to work around the problem. Is there a way to continue using record syntax while only exporting the record name for read access, not for write access?
Hiding the constructor and then defining new accessor functions for each field is a solution, but it can get tedious for records with a large number of fields.
Here's a solution with the new HasField typeclass in GHC 8.2.1 that avoids having to define functions for each field.
The idea is to define an auxiliary newtype like this:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE PolyKinds #-} -- Important, obscure errors happen without this.
import GHC.Records (HasField(..))
-- Do NOT export the actual constructor!
newtype Moat r = Moat r
-- Export this instead.
moat :: r -> Moat r
moat = Moat
-- If r has a field, Moat r also has that field
instance HasField s r v => HasField s (Moat r) v where
getField (Moat r) = getField #s r
Every field in a record r will be accesible from Moat r, with the following syntax:
λ :set -XDataKinds
λ :set -XTypeApplications
λ getField #"getDims" $ moat (Foo (5,5) ['s'])
(5,5)
The Foo constructor should be hidden from clients. However, the field accessors for Foo should not be hidden; they must be in scope for the HasField instances of Moat to kick in.
Every function in your public-facing api should return and receive Moat Foos instead of Foos.
To make the accessor syntax slightly less verbose, we can turn to OverloadedLabels:
import GHC.OverloadedLabels
newtype Label r v = Label { field :: r -> v }
instance HasField l r v => IsLabel l (Label r v) where
fromLabel = Label (getField #l)
In ghci:
λ :set -XOverloadedLabels
λ field #getDims $ moat (Foo (5,5) ['s'])
(5,5)
Instead of hiding the Foo constructor, another option would be to make Foo completely public and define Moat inside your library, hiding any Moat constructors from clients.

(Generically) Build Parsers from custom data types?

I'm working on a network streaming client that needs to talk to the server. The server encodes the responses in bytestrings, for example, "1\NULJohn\NULTeddy\NUL501\NUL", where '\NUL' is the separator. The above response translates to "This is a message of type 1(hard coded by the server), which tells the client what the ID of a user is(here, the user id of "John Teddy" is "501").
So naively I define a custom data type
data User
{ firstName :: String
, lastName :: String
, id :: Int
}
and a parser for this data type
parseID :: Parser User
parseID = ...
Then one just writes a handler to do some job(e.g., write to a database) after the parser succesfully mathes a response like this. This is very straightforward.
However, the server has almost 100 types of different responses like this that the client needs to parse. I suspect that there must be a much more elegant way to do the job rather than writing 100 almost identical parsers like this, because, after all, all haksell coders are lazy. I am a total newbie to generic programming so can some one tell me if there is a package that can do this job?
For these kinds of problems I turn to generics-sop instead of using generics directly. generics-sop is built on top of Generics and provides functions for manipulating all the fields in a record in a uniform way.
In this answer I use the ReadP parser which comes with base, but any other Applicative parser would do. Some preliminary imports:
{-# language DeriveGeneric #-}
{-# language FlexibleContexts #-}
{-# language FlexibleInstances #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}
{-# language TypeApplications #-} -- for the Proxy
import Text.ParserCombinators.ReadP (ReadP,readP_to_S)
import Text.ParserCombinators.ReadPrec (readPrec_to_P)
import Text.Read (readPrec)
import Data.Proxy
import qualified GHC.Generics as GHC
import Generics.SOP
We define a typeclass that can produce an Applicative parser for each of its instances. Here we define only the instances for Int and Bool:
class HasSimpleParser c where
getSimpleParser :: ReadP c
instance HasSimpleParser Int where
getSimpleParser = readPrec_to_P readPrec 0
instance HasSimpleParser Bool where
getSimpleParser = readPrec_to_P readPrec 0
Now we define a generic parser for records in which every field has a HasSimpleParser instance:
recParser :: (Generic r, Code r ~ '[xs], All HasSimpleParser xs) => ReadP r
recParser = to . SOP . Z <$> hsequence (hcpure (Proxy #HasSimpleParser) getSimpleParser)
The Code r ~ '[xs], All HasSimpleParser xs constraint means "this type has only one constructor, the list of field types is xs, and all the field types have HasSimpleParser instances".
hcpure constructs an n-ary product (NP) where each component is a parser for the corresponding field of r. (NP products wrap each component in a type constructor, which in our case is the parser type ReadP).
Then we use hsequence to turn a n-ary product of parsers into the parser of an n-ary product.
Finally, we fmap into the resulting parser and turn the n-ary product back into the original r record using to. The Z and SOP constructors are required for turning the n-ary product into the sum-of-products the to function expects.
Ok, let's define an example record and make it an instance of Generics.SOP.Generic:
data Foo = Foo { x :: Int, y :: Bool } deriving (Show, GHC.Generic)
instance Generic Foo -- Generic from generics-sop
Let's check if we can parse Foo with recParser:
main :: IO ()
main = do
print $ readP_to_S (recParser #Foo) "55False"
The result is
[(Foo {x = 55, y = False},"")]
You can write your own parser - but there is already a package that can do the parsing for you: cassava and while SO is usually not a place to search for library recommendations, I want to include this answer for people looking for a solution, but not having the time to implement this themselves and looking for a solution that works out of the box.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Csv
import Data.Vector
import Data.ByteString.Lazy as B
import GHC.Generics
data Person = P { personId :: Int
, firstName :: String
, lastName :: String
} deriving (Eq, Generic, Show)
-- the following are provided by friendly neighborhood Generic
instance FromRecord Person
instance ToRecord Person
main :: IO ()
main = do B.writeFile "test" "1\NULThomas\NULof Aquin"
Right thomas <- decodeWith (DecodeOptions 0) NoHeader <$>
B.readFile "test"
print (thomas :: Vector Person)
Basically cassava allows you to parse all X-separated structures into a Vector, provided you can write down a FromRecord instance (which needs a parseRecord :: Parser … function to work.
Side note on Generic until recently I thought - EVERYTHING - in haskell has a Generic instance, or can derive one. Well this is not the case I wanted to serialize some ThreadId to CSV/JSON and happened to find out unboxed types are not so easily "genericked"!
And before I forget it - when you speak of streaming and server and so on there is cassava-conduit that might be of help.

Haskell instance signatures

I'm a complete newbie in Haskell so please be patient.
Let's say I've got this class
class Indexable i where
at :: i a p -> p -> a
Now let's say I want to implement that typeclass for this data type:
data Test a p = Test [a]
What I tried is:
instance Indexable Test where
at (Test l) p = l `genericIndex` p
However it didn't compile, because p needs to be an Integral, however as far as I understand, it's impossibile to add the type signature to instances. I tried to use InstanceSigs, but failed.
Any ideas?
here is a version where you add the index-type to the class using MultiParamTypeClasses
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE RankNTypes #-}
module Index where
import Data.List (genericIndex)
class Indexable i f where
at :: forall a . f a -> i -> a
data Test a = Test [a]
instance Integral i => Indexable i Test where
at (Test as) i = as `genericIndex` i
here I need the FlexibleInstances because of the way the instance is declared and RankNTypes for the forall a . ;)
assuming this is your expected behavior:
λ> let test = Test [1..5]
λ> test `at` 3
4
λ> test `at` 0
1
λ> test `at` (0 :: Int)
1
λ> test `at` (1 :: Integer)
2
Just for fun, here's a very different solution which doesn't require any changes to your class declaration. (N.B. This answer is for fun only! I do not advocate keeping your class as-is; it seems a strange class definition to me.) The idea here is to push the burden of proof off from the class instance to the person constructing a value of type Test p a; we will demand that constructing such a value will require an Integral p instance in scope.
All this code stays exactly the same (but with a new extension turned on):
{-# LANGUAGE GADTs #-}
import Data.List
class Indexable i where
at :: i a p -> p -> a
instance Indexable Test where
at (Test l) p = l `genericIndex` p
But the declaration of your data type changes just slightly to demand an Integral p instance:
data Test a p where
Test :: Integral p => [a] -> Test a p
You are actually trying to do something fairly advanced. If I understand what you want, you actually need a multiparameter typeclass here, because your type parameter "p" depends on "i": for a list indexed by integer you need "p" to be integral, but for a table indexed by strings you need it to be "String", or at least an instance of "Ord".
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-} -- Enable the language extensions.
class Indexable i p | i -> p where
at :: i a -> p -> a
This says that the class is for two types, "i" and "p", and if you know "i" then "p" follows automatically. So if "i" is a list the "p" has to be Int, and if "i" is a "Map String a" then "p" has to be "String".
instance Indexable [a] Int where
at = (!!)
This declares the combination of [a] and Int as being an instance of Indexable.
user2407038 has provided an alternative approach using "type families", which is a more recent and sophisticated version of multiparameter type classes.
You can use associated type families and constraint kinds:
import GHC.Exts(Constraint)
class Indexable i where
type IndexableCtr i :: * -> Constraint
at :: IndexableCtr i p => i a p -> p -> a
instance Indexable Test where
type IndexableCtr Test = Integral
at (Test l) p = l `genericIndex` p
This defines the class Indexable with an associated type IndexableCtr which
is used to constraint the type of at.

Resources