Haskell -- any way to qualify or disambiguate record names?

I have two data types, which are used for hastache templates. It makes sense in my code to have two different types, both with a field named "name". This, of course, causes a conflict. It seems that there's a mechanism to disambiguate any calls to "name", but the actual definition causes problems. Is there any workaround, say letting the record field name be qualified?
data DeviceArray = DeviceArray
  { name :: String,
    bytes :: Int }
  deriving (Eq, Show, Data, Typeable)
data TemplateParams = TemplateParams
  { arrays :: [DeviceArray],
    input :: DeviceArray }
  deriving (Eq, Show, Data, Typeable)
data MakefileParams = MakefileParams
  { name :: String }
  deriving (Eq, Show, Data, Typeable)
i.e. if the fields are now used in code, they will be "DeviceArray.name" and "MakefileParams.name"?

As already noted, this isn't directly possible, but I'd like to say a couple things about proposed solutions:
If the two fields are clearly distinct, you'll want to always know which you're using anyway. By "clearly distinct" here I mean that there would never be a circumstance where it would make sense to do the same thing with either field. Given this, extra disambiguation isn't really unwelcome, so you'd want either qualified imports as the standard approach, or the field disambiguation extension if that's more to your taste. Or, as a very simplistic (and slightly ugly) option, just manually prefix the fields, e.g. deviceArrayName instead of just name.
If the two fields are in some sense the same thing, it makes sense to be able to treat them in a homogeneous way; ideally you could write a function polymorphic in choice of name field. In this case, one option is using a type class for "named things", with functions that let you access the name field on any appropriate type. A major downside here, besides a proliferation of trivial type constraints and possible headaches from the Dreaded Monomorphism Restriction, is that you also lose the ability to use the record syntax, which begins to defeat the whole point.
The other major option for similar fields, which I didn't see suggested yet, is to extract the name field out into a single parameterized type, e.g. data Named a = Named { name :: String, item :: a }. GHC itself uses this approach for source locations in syntax trees, and while it doesn't use record syntax the idea is the same. The downside here is that if you have a Named DeviceArray, accessing the bytes field now requires going through two layers of records. If you want to update the bytes field with a function, you're stuck with something like this:
addBytes b na = na { item = (item na) { bytes = b + bytes (item na) } }
Ugh. There are ways to mitigate the issue a bit, but they're still not ideal, to my mind. Cases like this are why I don't like record syntax in general. So, as a final option, some Template Haskell magic and the fclabels package:
{-# LANGUAGE TemplateHaskell, DeriveDataTypeable #-}
import Control.Category ((>>>))
import Data.Data (Data, Typeable)
import Data.List (nub)
import Data.Record.Label
data Named a = Named
  { _name :: String,
    _namedItem :: a }
  deriving (Eq, Show, Data, Typeable)
data DeviceArray = DeviceArray { _bytes :: Int }
  deriving (Eq, Show, Data, Typeable)
data MakefileParams = MakefileParams { _makefileParams :: [MakeParam] }
  deriving (Eq, Show, Data, Typeable)
data MakeParam = MakeParam { paramText :: String }
  deriving (Eq, Show, Data, Typeable)
$(mkLabels [''Named, ''DeviceArray, ''MakefileParams, ''MakeParam])
Don't mind the MakeParam business, I just needed a field on there to do something with. Anyway, now you can modify fields like this:
addBytes b = modL (namedItem >>> bytes) (b +)
nubParams = modL (namedItem >>> makefileParams) nub
You could also name bytes something like bytesInternal and then export an accessor bytes = namedItem >>> bytesInternal if you like.
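For comparison, the same nested update can be sketched with hand-written modifier functions and no extra packages. This is only an illustration of the composition idea; overItem and overBytes are names made up for this sketch, not part of fclabels or any other library.

```haskell
-- Hand-written modifier functions; no Template Haskell or extra packages.
-- overItem and overBytes are hypothetical names chosen for this sketch.
data Named a = Named { name :: String, item :: a }
  deriving (Eq, Show)

data DeviceArray = DeviceArray { bytes :: Int }
  deriving (Eq, Show)

-- Apply a function to the wrapped item, leaving the name untouched.
overItem :: (a -> b) -> Named a -> Named b
overItem f na = na { item = f (item na) }

-- Apply a function to the bytes field of a DeviceArray.
overBytes :: (Int -> Int) -> DeviceArray -> DeviceArray
overBytes f d = d { bytes = f (bytes d) }

-- Composing the two modifiers reaches through both record layers.
addBytes :: Int -> Named DeviceArray -> Named DeviceArray
addBytes b = overItem (overBytes (b +))
```

The cost is one small function per field, which is exactly the boilerplate that label or lens libraries generate for you.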

Record field names are in the same scope as the data type, so you cannot do this directly.
The common ways to work around this are to either add prefixes to the field names, e.g. daName, mpName, or put them in separate modules which you then import qualified.

What you can do is to put each data type in its own module, then you can use qualified imports to disambiguate. It's a little clunky, but it works.

There are several GHC extensions which may help. The linked one is applicable in your case.
Or, you could refactor your code and use typeclasses for the common fields in records. Or you could manually add a prefix to each record selector.
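As one illustration of the extension route: newer GHC versions (8.0 and later, as far as I know, so possibly newer than some of the answers here) provide DuplicateRecordFields, which lets both declarations coexist in one module. The sketch below assumes that extension is available; the accessor names deviceArrayName and makefileName are made up for the example.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data DeviceArray = DeviceArray
  { name  :: String
  , bytes :: Int
  } deriving (Eq, Show)

data MakefileParams = MakefileParams
  { name :: String
  } deriving (Eq, Show)

-- Matching on the constructor tells GHC which "name" field is meant.
deviceArrayName :: DeviceArray -> String
deviceArrayName DeviceArray { name = n } = n

makefileName :: MakefileParams -> String
makefileName MakefileParams { name = n } = n
```

Note that a bare use of name as a selector function is still ambiguous under this extension, so pattern matching (or one of the other workarounds) is needed at use sites.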

If you want to use name with both, you can use a class that defines the name function. E.g.:
class Named a where
  name :: a -> String
data DeviceArray = DeviceArray
  { deviceArrayName :: String,
    bytes :: Int }
  deriving (Eq, Show, Data, Typeable)
instance Named DeviceArray where
  name = deviceArrayName
data MakefileParams = MakefileParams
  { makefileParamsName :: String }
  deriving (Eq, Show, Data, Typeable)
instance Named MakefileParams where
  name = makefileParamsName
And then you can use name on both types.

Related

How do I create several related data types in haskell?

I have a User type that represents a user saved in the database. However, when displaying users, I only want to return a subset of these fields so I made a different type without the hash. When creating a user, a password will be provided instead of a hash, so I made another type for that.
This is clearly the worst, because there is tons of duplication between my types. Is there a better way to create several related types that all share some fields, but add some fields and remove others?
{-# LANGUAGE DeriveGeneric #-}
data User = User {
  id :: String,
  email :: String,
  hash :: String,
  institutionId :: String
  } deriving (Show, Generic)
data UserPrintable = UserPrintable {
  email :: String,
  id :: String,
  institutionId :: String
  } deriving (Generic)
data UserCreatable = UserCreatable {
  email :: String,
  hash :: String,
  institutionId :: String
  } deriving (Generic)
data UserFromRequest = UserFromRequest {
  email :: String,
  institutionId :: String,
  password :: String
  } deriving (Generic)
-- UGHHHHHHHHHHH
In this case, I think you can replace your various User types with functions. So instead of UserFromRequest, have:
userFromRequest :: Email -> InstitutionId -> String -> User
Note how you can also make separate types for Email and InstitutionId, which will help you avoid a bunch of annoying mistakes. This serves the same purpose as taking a record with labelled fields as an argument, while also adding a bit of extra static safety. You can just implement these as newtypes:
newtype Email = Email String deriving (Show, Eq)
Similarly, we can replace UserPrintable with showUser.
UserCreatable might be a bit awkward however, depending on how you need to use it. If all you ever do with it is take it as an argument and create a database row, then you can refactor it into a function the same way. But if you actually need the type for a bunch of things, this isn't a good solution.
In this second case, you have a couple of decent options. One would be to just make id a Maybe and check it each time. A better one would be to create a generic type WithId a which just adds an id field to anything:
data WithId a = WithId { id :: DatabaseId, content :: a }
Then have a User type with no id and have your database functions work with a WithId User.
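A minimal sketch of how the WithId idea might fit together, under some assumptions: DatabaseId is taken to wrap an Int, the id field is renamed rowId to avoid clashing with Prelude's id, and stored is a hypothetical stand-in for whatever your database layer returns after an insert.

```haskell
-- All concrete representations below are assumptions for illustration.
newtype DatabaseId    = DatabaseId Int       deriving (Eq, Show)
newtype Email         = Email String         deriving (Eq, Show)
newtype InstitutionId = InstitutionId String deriving (Eq, Show)

-- A user as the application sees it, with no database identity.
data User = User
  { email         :: Email
  , institutionId :: InstitutionId
  } deriving (Eq, Show)

-- Adds an identifier to any payload. The field is called rowId rather
-- than id so that it does not clash with Prelude's id function.
data WithId a = WithId
  { rowId   :: DatabaseId
  , content :: a
  } deriving (Eq, Show)

-- Mapping over the payload keeps the identifier.
instance Functor WithId where
  fmap f (WithId i x) = WithId i (f x)

-- Hypothetical stand-in for the result of a database insert.
stored :: DatabaseId -> User -> WithId User
stored = WithId
```

With this shape, code that only needs user data takes a User, and code that needs the row identity takes a WithId User and reads email (content row).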

Hide fields from deriving (Show)

Imagine I have a data record with many fields:
data DataRecord = DataRecord {
  field1 :: String,
  field2 :: String,
  ...
  } deriving (Show)
Is it possible to hide some fields from the deriving (Show), or do I have to implement my own show function for DataRecord?
Reason for my question: When I have cyclic dependencies between two data records both using deriving (Show) the show function would generate an infinite string.
The Haskell 2010 report mentions your cyclic dependencies as an unsuitable case:
The derived Read and Show instances may be unsuitable for some uses. Some problems include:
Circular structures cannot be printed or read by these instances.
So you need to specify the instance by hand.
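A hand-written instance might look like the sketch below. The Department and Employee types are invented for illustration; the point is only that each instance prints a finite summary of the field that points back, instead of recursing into it.

```haskell
-- Two hypothetical records that refer to each other.
data Department = Department
  { deptName :: String
  , members  :: [Employee]
  }

data Employee = Employee
  { empName :: String
  , dept    :: Department
  }

-- Show only the department's name, not the whole (cyclic) department.
instance Show Employee where
  show e =
    "Employee {empName = " ++ show (empName e)
      ++ ", dept = <" ++ deptName (dept e) ++ ">}"

-- Show only the members' names, not the full employee records.
instance Show Department where
  show d =
    "Department {deptName = " ++ show (deptName d)
      ++ ", members = " ++ show (map empName (members d)) ++ "}"

-- A cyclic pair of values: sales contains alice, alice points at sales.
sales :: Department
sales = Department "Sales" [alice]

alice :: Employee
alice = Employee "Alice" sales
```

Here show alice terminates, whereas derived instances would loop forever on the same values.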

Why doesn't GHC Haskell support overloaded record parameter names?

What I am talking about is that it is not possible to define:
data A = A {name :: String}
data B = B {name :: String}
I know that the GHC just desugars this to plain functions and the idiomatic way to solve this would be:
data A = A {aName :: String}
data B = B {bName :: String}
class Name a where
  name :: a -> String
instance Name A where
  name = aName
instance Name B where
  name = bName
After having written this out I don't like it that much ... couldn't this typeclassing be part of the desugaring process?
The thought came to me when I was writing some Aeson JSON parsing. Where it would have been too easy to just derive the FromJSON instances for every data type I had to write everything out by hand (currently >1k lines and counting).
Having names like name or simply value in a data record is not that uncommon.
http://www.haskell.org/haskellwiki/Performance/Overloading mentions that function overloading introduces some runtime overhead. But I actually don't see why the compiler wouldn't be able to resolve this at compile time and give them different names internally.
This SO question from 2012 more or less states historical reasons and points to a mail thread from 2006. Has anything changed recently?
Even if there would be some runtime overhead most people wouldn't mind cause most code hardly is performance critical.
Is there some hidden language extension that actually allows this? Again I am not sure ... but I think Idris actually does this?
Many, mostly minor reasons. One is the problem raised by a better answer: overloading just on the first argument is insufficient to handle all the useful cases.
You could "desugar"
data A = A { name :: String }
data B = B { name :: Text }
into
class Has'name a b | a -> b where
  name :: a -> b
data A = A { aName :: String }
instance Has'name A String where
  name = aName
data B = B { bName :: Text }
instance Has'name B Text where
  name = bName
but that would require GHC extensions (Functional Dependencies) that haven't made it into the standard yet. It would preclude using just 'name' for record creation, updates, and pattern matching (view patterns might help there), since 'name' isn't "just" a function in those cases. You can probably pull off something very similar with Template Haskell.
Using the record syntax
data A = A { name :: String }
implicitly defines a function
name :: A -> String
If we define both A and B with a { name :: String } field, we have conflicting type definitions for name:
name :: A -> String
name :: B -> String
It's not clear how your proposed implicit type classes would work because if we define two types
data A = A { name :: String }
data B = B { name :: Text }
then we have just shifted the problem to conflicting type class definitions:
class Has'name a where
  name :: a -> String
class Has'name a where
  name :: a -> Text
In principle this could be resolved one way or another, but this is just one of several tricky conflicting desirable properties for records. When Haskell was defined, it was decided that it was better to have simple if limited support rather than to try to design something more ambitious and complicated. Several improvements to records have been discussed at various times and there are perennial discussions, e.g. this Haskell Cafe thread. Perhaps something will be worked out for Haskell Prime.
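For what it's worth, later GHC releases did gain something along these lines. The sketch below assumes a GHC new enough to ship GHC.Records in base (8.2 or later, to the best of my knowledge) together with DuplicateRecordFields; describe and nameOfB are names made up for the example. The HasField class carries the field type as a functional dependency, which is how the String-versus-other-type conflict described above is avoided.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (..))

-- Two records sharing a field name, with different field types.
data A = A { name :: String }
data B = B { name :: Int }

-- Works for any record whose "name" field is a String.
describe :: HasField "name" r String => r -> String
describe r = "name = " ++ getField @"name" r

-- The field type is determined by the record type, so this is an Int.
nameOfB :: B -> Int
nameOfB = getField @"name"
```

GHC solves the HasField constraints for ordinary record fields by itself, so no instances are written by hand. Record construction and pattern matching still use the plain field syntax.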
The best way I found is to use a preprocessor to solve this, frankly, rather stupid problem.
Haskell and GHC make this easy, because the whole Haskell parser is available as a normal library. You could just parse all the files, apply a renaming scheme (e.g. « data A = A { name :: String } » and « let a = A "Betty" in name a » become « data A = A { aName :: String } » and « let a = A "Betty" in aName a ») depending on the type of data the name function is applied to, using the type resolver, and write them out for compilation.
But honestly, that should be integrated into GHC. You’re right: It’s silly that this isn’t included.

How to define a class that allows uniform access to different records in Haskell?

I have two records that both have a field I want to extract for display. How do I arrange things so they can be manipulated with the same functions? Since they have different fields (in this case firstName and buildingName) that are their name fields, they each need some "adapter" code to map firstName to name. Here is what I have so far:
class Nameable a where
  name :: a -> String
data Human = Human {
  firstName :: String
  }
data Building = Building {
  buildingName :: String
  }
instance Nameable Human where
  name x = firstName x
instance Nameable Building where
  -- I think the x is redundant here, i.e the following should work:
  -- name = buildingName
  name x = buildingName x
main :: IO ()
main = do
  putStr $ show (map name items)
  where
    items :: (Nameable a) => [a]
    items = [ Human{firstName = "Don"}
              -- Ideally I want the next line in the array too, but that gives an
              -- obvious type error at the moment.
              --, Building{buildingName = "Empire State"}
            ]
This does not compile:
TypeTest.hs:23:14:
Couldn't match expected type `a' against inferred type `Human'
`a' is a rigid type variable bound by
the type signature for `items' at TypeTest.hs:22:23
In the expression: Human {firstName = "Don"}
In the expression: [Human {firstName = "Don"}]
In the definition of `items': items = [Human {firstName = "Don"}]
I would have expected the instance Nameable Human section would make this work. Can someone explain what I am doing wrong, and for bonus points what "concept" I am trying to get working, since I'm having trouble knowing what to search for.
This question feels similar, but I couldn't figure out the connection with my problem.
Consider the type of items:
items :: (Nameable a) => [a]
It's saying that for any Nameable type, items will give me a list of that type. It does not say that items is a list that may contain different Nameable types, as you might think. You want something like items :: [exists a. Nameable a => a], except that you'll need to introduce a wrapper type and use forall instead. (See: Existential type)
{-# LANGUAGE ExistentialQuantification #-}
data SomeNameable = forall a. Nameable a => SomeNameable a
[...]
items :: [SomeNameable]
items = [ SomeNameable $ Human {firstName = "Don"},
          SomeNameable $ Building {buildingName = "Empire State"} ]
The quantifier in the data constructor of SomeNameable basically allows it to forget everything about exactly which a is used, except that it is Nameable. Therefore, you will only be allowed to use functions from the Nameable class on the elements.
To make this nicer to use, you can make an instance for the wrapper:
instance Nameable SomeNameable where
  name (SomeNameable x) = name x
Now you can use it like this:
Main> map name items
["Don", "Empire State"]
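Putting the pieces of this answer together, a complete version of the existential approach could look like the following sketch. It only assembles the fragments shown above into one module; nothing beyond the ExistentialQuantification extension is assumed.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Nameable a where
  name :: a -> String

data Human = Human { firstName :: String }
data Building = Building { buildingName :: String }

instance Nameable Human where
  name = firstName

instance Nameable Building where
  name = buildingName

-- The wrapper hides the concrete type and keeps only the Nameable instance.
data SomeNameable = forall a. Nameable a => SomeNameable a

-- Unwrapping and delegating lets name work directly on wrapped values.
instance Nameable SomeNameable where
  name (SomeNameable x) = name x

-- A heterogeneous list: every element is wrapped to the same type.
items :: [SomeNameable]
items =
  [ SomeNameable (Human { firstName = "Don" })
  , SomeNameable (Building { buildingName = "Empire State" })
  ]
```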
Everybody is reaching for either existential quantification or algebraic data types. But these are both overkill (well depending on your needs, ADTs might not be).
The first thing to note is that Haskell has no downcasting. That is, if you use the following existential:
data SomeNameable = forall a. Nameable a => SomeNameable a
then when you create an object
foo :: SomeNameable
foo = SomeNameable $ Human { firstName = "John" }
the information about which concrete type the object was made with (here Human) is forever lost. The only things we know are: it is some type a, and there is a Nameable a instance.
What is it possible to do with such a pair? Well, you can get the name of the a you have, and... that's it. That's all there is to it. In fact, there is an isomorphism. I will make a new data type so you can see how this isomorphism arises in cases when all your concrete objects have more structure than the class.
data ProtoNameable = ProtoNameable {
  -- one field for each typeclass method
  protoName :: String
  }
instance Nameable ProtoNameable where
  name = protoName
toProto :: SomeNameable -> ProtoNameable
toProto (SomeNameable x) = ProtoNameable { protoName = name x }
fromProto :: ProtoNameable -> SomeNameable
fromProto = SomeNameable
As we can see, this fancy existential type SomeNameable has the same structure and information as ProtoNameable, which is isomorphic to String, so when you are using this lofty concept SomeNameable, you're really just saying String in a convoluted way. So why not just say String?
Your items definition has exactly the same information as this definition:
items = [ "Don", "Empire State" ]
I should add a few notes about this "protoization": it is only as straightforward as this when the typeclass you are existentially quantifying over has a certain structure: namely when it looks like an OO class.
class Foo a where
  method1 :: ... -> a -> ...
  method2 :: ... -> a -> ...
  ...
That is, each method only uses a once as an argument. If you have something like Num
class Num a where
  (+) :: a -> a -> a
  ...
which uses a in multiple argument positions, or as a result, then eliminating the existential is not as easy, but still possible. However my recommendation to do this changes from a frustration to a subtle context-dependent choice, because of the complexity and distant relationship of the two representations. However, every time I have seen existentials used in practice it is with the Foo kind of typeclass, where it only adds needless complexity, so I quite emphatically consider it an antipattern. In most of these cases I recommend eliminating the entire class from your codebase and exclusively using the protoized type (after you give it a good name).
Also, if you do need to downcast, then existentials aren't your man. You can either use an algebraic data type, as other people have answered, or you can use Data.Dynamic (which is basically an existential over Typeable). But don't do that; a Haskell programmer resorting to Dynamic is ungentlemanlike. An ADT is the way to go, where you characterize all the possible types it could be in one place (which is necessary so that the functions that do the "downcasting" know that they handle all possible cases).
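To illustrate the "protoized" style with more than one method, here is a small sketch. The Shape type and its fields are invented for the example and do not come from the question; the idea is that each "method" becomes a record field and each "class instance" becomes an ordinary function that builds the record.

```haskell
-- A record with one field per "method". Shape is a hypothetical example.
data Shape = Shape
  { shapeName :: String
  , area      :: Double
  , scaleBy   :: Double -> Shape  -- a "method" that returns a new object
  }

-- Each constructor function plays the role of a class instance.
circle :: Double -> Shape
circle r = Shape
  { shapeName = "circle"
  , area      = pi * r * r
  , scaleBy   = \k -> circle (k * r)
  }

square :: Double -> Shape
square s = Shape
  { shapeName = "square"
  , area      = s * s
  , scaleBy   = \k -> square (k * s)
  }

-- A heterogeneous collection needs no wrapper and no extensions.
shapes :: [Shape]
shapes = [circle 1, square 2]
```

For example, map shapeName shapes works on the mixed list directly, and area (scaleBy (square 2) 3) evaluates the area of the scaled square.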
I like @hammar's answer, and you should also check out this article which provides another example.
But, you might want to think differently about your types. The boxing of Nameable into the SomeNameable data type usually makes me start thinking about whether a union type for the specific case is meaningful.
data Entity = H Human | B Building
instance Nameable Entity where ...
items = [H (Human "Don"), B (Building "Town Hall")]
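Filled in, the union-type version might look like this sketch. The instance body and the buildings helper are my additions for illustration; the helper shows the one thing the existential wrapper cannot do, namely recovering the concrete type by pattern matching.

```haskell
class Nameable a where
  name :: a -> String

data Human = Human { firstName :: String }
data Building = Building { buildingName :: String }

instance Nameable Human where
  name = firstName

instance Nameable Building where
  name = buildingName

-- A closed union of the types we want to mix in one list.
data Entity = H Human | B Building

-- Delegate to the instance of whichever type is inside.
instance Nameable Entity where
  name (H h) = name h
  name (B b) = name b

items :: [Entity]
items = [H (Human "Don"), B (Building "Town Hall")]

-- Hypothetical helper: pick out only the buildings again.
buildings :: [Entity] -> [Building]
buildings es = [b | B b <- es]
```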
I'm not sure why you want to use the same function for getting the name of a Human and the name of a Building. If their names are used in fundamentally different ways, except maybe for simple things like printing them, then you probably want two different functions for that. The type system will automatically guide you to choose the right function to use in each situation.
But if having a name is something significant about the whole purpose of your program, and a Human and a Building are really pretty much the same thing in that respect as far as your program is concerned, then you would define their type together:
data NameableThing =
  Human { name :: String } |
  Building { name :: String }
That gives you a single function name that works for whatever particular flavor of NameableThing you happen to have, without needing to get into type classes.
Usually you would use a type class for a different kind of situation: if you have some kind of non-trivial operation that has the same purpose but a different implementation for several different types. Even then, it's often better to use some other approach instead, like passing a function as a parameter (a "higher order function", or "HOF").
Haskell type classes are a beautiful and powerful tool, but they are totally different than what is called a "class" in object-oriented languages, and they are used far less often.
And I certainly don't recommend complicating your program by using an advanced extension to Haskell like Existential Quantification just to fit into an object-oriented design pattern.
You can try to use Existentially Quantified types and do it like this:
data T = forall a. Nameable a => MkT a
items = [MkT (Human "bla"), MkT (Building "bla")]
I've just had a look at the code that this question is abstracting from. For this, I would recommend merging the Task and RecurringTaskDefinition types:
data Task
  = Once
    { name :: String
    , scheduled :: Maybe Day
    , category :: TaskCategory
    }
  | Recurring
    { name :: String
    , nextOccurrence :: Day
    , frequency :: RecurFrequency
    }
type ProgramData = [Task] -- don't even need a new data type for this any more
Then, the name function works just fine on either type, and the functions you were complaining about like deleteTask and deleteRecurring don't even need to exist -- you can just use the standard delete function as usual.
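A cut-down sketch of that suggestion follows. I don't have the original definitions of Day, TaskCategory or RecurFrequency at hand, so Day is replaced by a String stand-in and the other two fields are left out; removeTask is a hypothetical name for the single deletion function.

```haskell
import Data.List (delete)

-- Stand-in for a real date type; an assumption made for this sketch.
type Day = String

data Task
  = Once
      { name      :: String
      , scheduled :: Maybe Day
      }
  | Recurring
      { name           :: String
      , nextOccurrence :: Day
      }
  deriving (Eq, Show)

type ProgramData = [Task]

-- One deletion function covers both kinds of task.
removeTask :: Task -> ProgramData -> ProgramData
removeTask = delete

tasks :: ProgramData
tasks =
  [ Once "Buy milk" Nothing
  , Recurring "Pay rent" "2024-01-01"
  ]
```

Because name is declared in both constructors, it is a total function on Task, while scheduled and nextOccurrence remain partial.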

When should I use record syntax for data declarations in Haskell?

Record syntax seems extremely convenient compared to having to write your own accessor functions. I've never seen anyone give any guidelines as to when it's best to use record syntax over normal data declaration syntax, so I'll just ask here.
You should use record syntax in two situations:
The type has many fields
The type declaration gives no clue about its intended layout
For instance a Point type can be simply declared as:
data Point = Point Int Int deriving (Show)
It is obvious that the first Int denotes the x coordinate and the second stands for y. But the case with the following type declaration is different (taken from Learn You a Haskell for Great Good):
data Person = Person String String Int Float String String deriving (Show)
The intended type layout is: first name, last name, age, height, phone number, and favorite ice-cream flavor. But this is not evident in the above declaration. Record syntax comes in handy here:
data Person = Person { firstName :: String
                     , lastName :: String
                     , age :: Int
                     , height :: Float
                     , phoneNumber :: String
                     , flavor :: String
                     } deriving (Show)
The record syntax made the code more readable, and saved a great deal of typing by automatically defining all the accessor functions for us!
In addition to complex multi-fielded data, newtypes are often defined with record syntax. In either of these cases, there aren't really any downsides to using record syntax, but in the case of sum types, record accessors usually don't make sense. For example:
data Either a b = Left { getLeft :: a } | Right { getRight :: b }
is valid, but the accessor functions are partial – it is an error to write getLeft (Right "banana"). For that reason, such accessors are generally speaking discouraged; something like getLeft :: Either a b -> Maybe a would be more common, and that would have to be defined manually. However, note that accessors can share names:
data Item = Food { description :: String, tastiness :: Integer }
          | Wand { description :: String, magic :: Integer }
Now description is total, although tastiness and magic both still aren't.
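The manually defined, total alternatives mentioned above could be sketched as follows. getLeft here works on the standard Either from the Prelude, and tastinessOf is a name made up for the example.

```haskell
-- Total replacement for a partial getLeft accessor.
getLeft :: Either a b -> Maybe a
getLeft (Left x) = Just x
getLeft _        = Nothing

data Item
  = Food { description :: String, tastiness :: Integer }
  | Wand { description :: String, magic :: Integer }
  deriving (Eq, Show)

-- Total replacement for the partial tastiness accessor.
tastinessOf :: Item -> Maybe Integer
tastinessOf (Food _ t) = Just t
tastinessOf _          = Nothing
```

Callers then have to handle the Nothing case explicitly instead of risking a runtime error from a missing field.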
