How do I create several related data types in haskell? - haskell

I have a User type that represents a user saved in the database. However, when displaying users, I only want to return a subset of these fields so I made a different type without the hash. When creating a user, a password will be provided instead of a hash, so I made another type for that.
This is clearly the worst, because there is tons of duplication between my types. Is there a better way to create several related types that all share some fields, but add some fields and remove others?
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
data User = User {
id :: String,
email :: String,
hash :: String,
institutionId :: String
} deriving (Show, Generic)
data UserPrintable = UserPrintable {
email :: String,
id :: String,
institutionId :: String
} deriving (Generic)
data UserCreatable = UserCreatable {
email :: String,
hash :: String,
institutionId :: String
} deriving (Generic)
data UserFromRequest = UserFromRequest {
email :: String,
institutionId :: String,
password :: String
} deriving (Generic)
-- UGHHHHHHHHHHH

In this case, I think you can replace your various User types with functions. So instead of UserFromRequest, have:
userFromRequest :: Email -> InstitutionId -> String -> User
Note how you can also make separate types for Email and InstitutionId, which will help you avoid a bunch of annoying mistakes. This serves the same purpose as taking a record with labelled fields as an argument, while also adding a bit of extra static safety. You can just implement these as newtypes:
newtype Email = Email String deriving (Show, Eq)
Similarly, we can replace UserPrintable with showUser.
UserCreatable might be a bit awkward, however, depending on how you need to use it. If all you ever do with it is take it as an argument and create a database row, then you can refactor it into a function the same way. But if you actually need the type for a bunch of things, this isn't a good solution.
In this second case, you have a couple of decent options. One would be to just make id a Maybe and check it each time. A better one would be to create a generic type WithId a which just adds an id field to anything:
data WithId a = WithId { id :: DatabaseId, content :: a }
Then have a User type with no id and have your database functions work with a WithId User.
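Putting the pieces of this answer together, a minimal sketch might look like the following. It is only an illustration: DatabaseId, hashPassword, and the rowId field (used instead of id to avoid clashing with the Prelude) are hypothetical names, and hashPassword is a placeholder rather than a real hash.

```haskell
-- Sketch only: DatabaseId, rowId and hashPassword are hypothetical names.
newtype Email         = Email String         deriving (Show, Eq)
newtype InstitutionId = InstitutionId String deriving (Show, Eq)
newtype DatabaseId    = DatabaseId String    deriving (Show, Eq)

-- A user as the application sees it, with no database id attached.
data User = User
  { email         :: Email
  , hash          :: String
  , institutionId :: InstitutionId
  } deriving (Show, Eq)

-- Adds an id to any payload; database functions work with 'WithId User'.
data WithId a = WithId
  { rowId   :: DatabaseId
  , content :: a
  } deriving (Show, Eq)

-- Placeholder, NOT a real hash: substitute a proper password-hashing library.
hashPassword :: String -> String
hashPassword = reverse

-- Takes the place of the UserFromRequest record.
userFromRequest :: Email -> InstitutionId -> String -> User
userFromRequest e i password = User e (hashPassword password) i

-- Takes the place of the UserPrintable record: renders everything but the hash.
showUser :: WithId User -> String
showUser (WithId (DatabaseId i) (User (Email e) _ (InstitutionId inst))) =
  "User " ++ i ++ " <" ++ e ++ "> @ " ++ inst
```

Because Email and InstitutionId are distinct types, swapping the first two arguments of userFromRequest is a compile-time error rather than a silent bug.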

Related

Haskell: Create a list of only certain "kind" of type?

I've been working through both Learn You a Haskell and Beginning Haskell and have come on an interesting problem. To preface, I'm normally a C++ programmer, so forgive me if I have no idea what I'm talking about.
One of the exercises in Beginning Haskell has me create a type Client, which can be a Government organization, Company, or Individual. I decided to try out record syntax for this.
data Client = GovOrg { name :: String }
| Company { name :: String,
id :: Integer,
contact :: String,
position :: String
}
| Individual { fullName :: Person,
offers :: Bool
}
deriving Show
data Person = Person { firstName :: String,
lastName :: String,
gender :: Gender
}
deriving Show
data Gender = Male | Female | Unknown
deriving Show
This is used for an exercise where given a list of Clients, I have to find how many of each gender are in the list. I started by filtering to get a list of just Individuals since only they have the Gender type, but my method seems to be completely wrong:
listIndividuals :: [Client] -> [Client]
listIndividuals xs = filter (\x -> x == Individual) xs
How would I get this functionality, where I can check what "kind" of Client something is? Also, for the record syntax, how is my coding style? Too inconsistent?
First of all, I would recommend not using record syntax in types with several constructors, because you end up with partial accessor functions. For example, the expression position (Individual (Person "John" "Doe" Male) True) type checks, but it will throw a runtime error. Instead, consider something more like
data GovClient = GovClient {
govName :: String
} deriving Show
data CompanyClient = CompanyClient {
companyName :: String,
companyID :: Integer, -- Also, don't shadow existing names; `id` is a built-in function
companyContact :: String,
companyPosition :: String
} deriving Show
data IndividualClient = IndividualClient {
indvFullName :: Person,
indvOffers :: Bool
} deriving Show
Then you can have
data Client
= GovOrg GovClient
| Company CompanyClient
| Individual IndividualClient
deriving (Show)
Now you can also define your function as
isIndividualClient :: Client -> Bool
isIndividualClient (Individual _) = True
isIndividualClient _ = False
listIndividuals :: [Client] -> [Client]
listIndividuals clients = filter isIndividualClient clients
Or the more point-free form of
listIndividuals = filter isIndividualClient
Here, in order to make the decision I've simply used pattern matching in a separate function to determine which of Client's constructors was used. Now you get the full power of record and algebraic types, with just a hair more code to worry about, but a lot more safety. You'll never accidentally call a function expecting a government client on an individual client, for example, because it wouldn't type check, whereas with your current implementation it would be more than possible.
If you're concerned with the longer names, I would recommend eventually looking into the lens library that is designed to help you manipulate complex trees of record types with relative ease.
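If you want listIndividuals to hand back the unwrapped IndividualClient values rather than Clients, a list comprehension with a pattern on the left of the arrow filters and unwraps in one step. This is a sketch based on the types above; countGender is a made-up helper for the original exercise, and Gender gains an Eq instance so it can be compared.

```haskell
data Gender = Male | Female | Unknown deriving (Show, Eq)

data Person = Person
  { firstName :: String
  , lastName  :: String
  , gender    :: Gender
  } deriving Show

newtype GovClient = GovClient { govName :: String } deriving Show

data CompanyClient = CompanyClient
  { companyName     :: String
  , companyID       :: Integer
  , companyContact  :: String
  , companyPosition :: String
  } deriving Show

data IndividualClient = IndividualClient
  { indvFullName :: Person
  , indvOffers   :: Bool
  } deriving Show

data Client
  = GovOrg GovClient
  | Company CompanyClient
  | Individual IndividualClient
  deriving Show

-- Keep only the Individual payloads; values built with the other
-- constructors fail the pattern match and are silently skipped.
listIndividuals :: [Client] -> [IndividualClient]
listIndividuals clients = [i | Individual i <- clients]

-- Hypothetical helper: how many individuals have the given gender?
countGender :: Gender -> [Client] -> Int
countGender g =
  length . filter ((== g) . gender . indvFullName) . listIndividuals
```

With this version the result type itself records that only individuals remain, so gender . indvFullName can be applied without any further case analysis.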
With your current implementation, you could also do something pretty similar to the final solution:
isIndividualClient :: Client -> Bool
isIndividualClient (Individual _ _) = True
isIndividualClient _ = False
listIndividuals :: [Client] -> [Client]
listIndividuals clients = filter isIndividualClient clients
The main difference here is that Individual takes two fields, so I have two _ wildcard matches in the pattern, and listIndividuals has type [Client] -> [Client].

Why doesn't Haskell/GHC support record name overloading

I am a Haskell newbie. I have noticed that Haskell does not support record name overloading:
-- Records.hs
data Employee = Employee
{ firstName :: String
, lastName :: String
, ssn :: String
} deriving (Show, Eq)
data Manager = Manager
{ firstName :: String
, lastName :: String
, ssn :: String
, subordinates :: [Employee]
} deriving (Show, Eq)
When I compile this I get:
[1 of 1] Compiling Main ( Records.hs, Records.o )
Records.hs:10:5:
Multiple declarations of `firstName'
Declared at: Records.hs:4:5
Records.hs:10:5
Records.hs:11:5:
Multiple declarations of `lastName'
Declared at: Records.hs:5:5
Records.hs:11:5
Records.hs:12:5:
Multiple declarations of `ssn'
Declared at: Records.hs:6:5
Records.hs:12:5
Given the "strength" of the Haskell type system, it seems like it should be easy for the compiler to determine which field to access in
emp = Employee "Joe" "Smith" "111-22-3333"
man = Manager "Mary" "Jones" "333-22-1111" [emp]
firstName man
firstName emp
Is there some issue that I am not seeing? I know that the Haskell Report does not allow this, but why not?
Historical reasons. There have been many competing designs for better record systems for Haskell -- so many in fact, that no consensus could be reached. Yet.
The current record system is not very sophisticated. It's mostly some syntactic sugar for things you could do with boilerplate if there was no record syntax.
In particular, this:
data Employee = Employee
{ firstName :: String
, lastName :: String
, ssn :: String
} deriving (Show, Eq)
generates (among other things) a function firstName :: Employee -> String.
If you also allow in the same module this type:
data Manager = Manager
{ firstName :: String
, lastName :: String
, ssn :: String
, subordinates :: [Employee]
} deriving (Show, Eq)
then what would be the type of the firstName function?
It would have to be two separate functions overloading the same name, which Haskell does not allow. Unless you imagine that this would implicitly generate a typeclass and make instances of it for everything with a field named firstName (gets messy in the general case, when the fields could have different types), then Haskell's current record system isn't going to be able to support multiple fields with the same name in the same module. Haskell doesn't even attempt to do any such thing at present.
It could, of course, be done better. But there are some tricky problems to solve, and essentially no one's come up with solutions to them that have convinced everyone that there is a most promising direction to move in yet.
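For readers on a newer compiler: GHC has since gained the DuplicateRecordFields extension (available from GHC 8.0), which accepts the two declarations from the question in one module. The sketch below assumes such a GHC. The helper names employeeFirstName and managerFirstName are made up for illustration. It sticks to pattern matching, where the constructor disambiguates the field, because a bare selector call such as firstName emp can still be rejected as ambiguous.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data Employee = Employee
  { firstName :: String
  , lastName  :: String
  , ssn       :: String
  } deriving (Show, Eq)

data Manager = Manager
  { firstName    :: String
  , lastName     :: String
  , ssn          :: String
  , subordinates :: [Employee]
  } deriving (Show, Eq)

-- The constructor in the pattern tells GHC which firstName is meant.
employeeFirstName :: Employee -> String
employeeFirstName Employee { firstName = n } = n

managerFirstName :: Manager -> String
managerFirstName Manager { firstName = n } = n
```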
One option to avoid this is to put your data types in different modules and use qualified imports. That way you can use the same field names on different records and keep your code clean and more readable.
You can create one module for the employee, for example
module Model.Employee where
data Employee = Employee
{ firstName :: String
, lastName :: String
, ssn :: String
} deriving (Show, Eq)
And one module for the Manager, for example:
module Model.Manager where
import Model.Employee (Employee)
data Manager = Manager
{ firstName :: String
, lastName :: String
, ssn :: String
, subordinates :: [Employee]
} deriving (Show, Eq)
And then wherever you want to use these two data types you can import them qualified and access them as follows:
import Model.Employee (Employee (Employee))
import qualified Model.Employee as Employee
import Model.Manager (Manager (Manager))
import qualified Model.Manager as Manager
emp = Employee "Joe" "Smith" "111-22-3333"
man = Manager "Mary" "Jones" "333-22-1111" [emp]
name1 = Manager.firstName man
name2 = Employee.firstName emp
Keep in mind that you are still using two different data types, so Manager.firstName is a different function from Employee.firstName, even though you know that both data types represent a person and each person has a first name. How far you go in abstracting the data types is up to you; for example, you could factor those "attribute collections" out into a Person data type as well.

Haskell -- any way to qualify or disambiguate record names?

I have two data types, which are used for hastache templates. It makes sense in my code to have two different types, both with a field named "name". This, of course, causes a conflict. It seems that there's a mechanism to disambiguate any calls to "name", but the actual definition causes problems. Is there any workaround, say letting the record field name be qualified?
data DeviceArray = DeviceArray
{ name :: String,
bytes :: Int }
deriving (Eq, Show, Data, Typeable)
data TemplateParams = TemplateParams
{ arrays :: [DeviceArray],
input :: DeviceArray }
deriving (Eq, Show, Data, Typeable)
data MakefileParams = MakefileParams
{ name :: String }
deriving (Eq, Show, Data, Typeable)
i.e. if the fields are now used in code, they will be "DeviceArray.name" and "MakefileParams.name"?
As already noted, this isn't directly possible, but I'd like to say a couple things about proposed solutions:
If the two fields are clearly distinct, you'll want to always know which you're using anyway. By "clearly distinct" here I mean that there would never be a circumstance where it would make sense to do the same thing with either field. Given this, excess disambiguity isn't really unwelcome, so you'd want either qualified imports as the standard approach, or the field disambiguation extension if that's more to your taste. Or, as a very simplistic (and slightly ugly) option, just manually prefix the fields, e.g. deviceArrayName instead of just name.
If the two fields are in some sense the same thing, it makes sense to be able to treat them in a homogeneous way; ideally you could write a function polymorphic in choice of name field. In this case, one option is using a type class for "named things", with functions that let you access the name field on any appropriate type. A major downside here, besides a proliferation of trivial type constraints and possible headaches from the Dreaded Monomorphism Restriction, is that you also lose the ability to use the record syntax, which begins to defeat the whole point.
The other major option for similar fields, which I didn't see suggested yet, is to extract the name field out into a single parameterized type, e.g. data Named a = Named { name :: String, item :: a }. GHC itself uses this approach for source locations in syntax trees, and while it doesn't use record syntax the idea is the same. The downside here is that if you have a Named DeviceArray, accessing the bytes field now requires going through two layers of records. If you want to update the bytes field with a function, you're stuck with something like this:
addBytes b na = na { item = (item na) { bytes = b + bytes (item na) } }
Ugh. There are ways to mitigate the issue a bit, but they're still not ideal, to my mind. Cases like this are why I don't like record syntax in general. So, as a final option, some Template Haskell magic and the fclabels package:
{-# LANGUAGE TemplateHaskell #-}
import Control.Category
import Data.Record.Label
data Named a = Named
{ _name :: String,
_namedItem :: a }
deriving (Eq, Show, Data, Typeable)
data DeviceArray = DeviceArray { _bytes :: Int }
deriving (Eq, Show, Data, Typeable)
data MakefileParams = MakefileParams { _makefileParams :: [MakeParam] }
deriving (Eq, Show, Data, Typeable)
data MakeParam = MakeParam { paramText :: String }
deriving (Eq, Show, Data, Typeable)
$(mkLabels [''Named, ''DeviceArray, ''MakefileParams, ''MakeParam])
Don't mind the MakeParam business, I just needed a field on there to do something with. Anyway, now you can modify fields like this:
addBytes b = modL (namedItem >>> bytes) (b +)
nubParams = modL (namedItem >>> makefileParams) nub
You could also name bytes something like bytesInternal and then export an accessor bytes = namedItem >>> bytesInternal if you like.
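One lighter-weight mitigation for the nested update on Named, without Template Haskell, is to give Named a Functor instance so that the wrapped item can be modified with fmap. This is only a sketch of that idea, not something the answer above proposes; it reuses the plain (unprefixed) field names from the earlier Named example.

```haskell
data Named a = Named
  { name :: String
  , item :: a
  } deriving (Eq, Show)

-- fmap leaves the name alone and transforms the wrapped item.
instance Functor Named where
  fmap f (Named n x) = Named n (f x)

data DeviceArray = DeviceArray { bytes :: Int }
  deriving (Eq, Show)

-- Only one layer of record-update syntax is left.
addBytes :: Int -> Named DeviceArray -> Named DeviceArray
addBytes b = fmap (\d -> d { bytes = b + bytes d })
```

This covers the common "modify the payload" case only; it gives no composable getters or setters the way the label-based approach does.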
Record field names are in the same scope as the data type, so you cannot do this directly.
The common ways to work around this are to either add prefixes to the field names, e.g. daName, mpName, or put them in separate modules which you then import qualified.
What you can do is to put each data type in its own module, then you can use qualified imports to disambiguate. It's a little clunky, but it works.
There are several GHC extensions which may help. The linked one is applicable in your case.
Or you could refactor your code and use typeclasses for the common fields in records. Or you could manually prefix each record selector.
If you want to use the name in both, you can use a class that defines the name function. E.g.:
class Named a where
name :: a -> String
data DeviceArray = DeviceArray
{ deviceArrayName :: String,
bytes :: Int }
deriving (Eq, Show, Data, Typeable)
instance Named DeviceArray where
name = deviceArrayName
data MakefileParams = MakefileParams
{ makefileParamsName :: String }
deriving (Eq, Show, Data, Typeable)
instance Named MakefileParams where
name = makefileParamsName
And then you can use name on both types.
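To show what the class buys you, here is a compilable version of the same idea with one function that works on any Named type. The describe function is a made-up example, and the Data and Typeable instances are dropped to keep the sketch free of extensions.

```haskell
class Named a where
  name :: a -> String

data DeviceArray = DeviceArray
  { deviceArrayName :: String
  , bytes           :: Int
  } deriving (Eq, Show)

newtype MakefileParams = MakefileParams
  { makefileParamsName :: String
  } deriving (Eq, Show)

instance Named DeviceArray where
  name = deviceArrayName

instance Named MakefileParams where
  name = makefileParamsName

-- Hypothetical helper: works for every type with a Named instance.
describe :: Named a => a -> String
describe x = "<" ++ name x ++ ">"
```

Note that name is now an ordinary class method, so it can be used for reading but not in record-update syntax.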

Modeling domain data in Haskell [closed]

I'm working on designing a larger-ish web application using Haskell. This is purely for my education and interest.
I'm starting by writing out my domain/value objects. One example is a User. Here's what I've come up with so far
module Model (User) where
class Audited a where
creationDate :: a -> Integer
lastUpdatedDate :: a -> Integer
creationUser :: a -> User
lastUpdatedUser :: a -> User
class Identified a where
id :: a -> Integer
data User = User { userId :: Integer
, userEmail :: String
, userCreationDate :: Integer
, userLastUpdatedDate :: Integer
, userCreationUser :: User
, userLastUpdatedUser :: User
}
instance Identified User where
id u = userId u
instance Audited User where
creationDate u = userCreationDate u
lastUpdatedDate u = userLastUpdatedDate u
creationUser u = userCreationUser u
lastUpdatedUser u = userLastUpdatedUser u
My application will have roughly 20 types like the above type. When I say "like the above type", I mean they will have an id, audit information, and some type-specific information (like email in the case of User).
The thing I can't wrap my mind around is the fact that each of my fields (e.g. User.userEmail) creates a new function fieldName :: Type -> FieldType. With 20 different types, the namespace seems like it'll get pretty full pretty fast. Also, I don't like having to name my User ID field userId. I'd rather name it id. Is there any way around this?
Maybe I should mention that I'm coming from the imperative world, so this FP stuff is pretty new (yet pretty exciting) for me.
Yeah, namespacing can be kind of a pain in Haskell. I usually end up tightening up my abstractions until there are not so many names. It also allows for more reuse. For yours, I would make a data type rather than a class for the audit information:
data Audit = Audit {
creationDate :: Integer,
lastUpdatedDate :: Integer,
creationUser :: User,
lastUpdatedUser :: User
}
And then pair that up with the type-specific data:
data User = User {
userAudit :: Audit,
userId :: Integer,
userEmail :: String
}
You can still use those typeclasses if you want:
class Audited a where
audit :: a -> Audit
class Identified a where
ident :: a -> Integer
However as your design develops, be open to the possibility of those typeclasses dissolving into thin air. Object-like typeclasses -- ones where every method takes a single parameter of type a -- have a way of simplifying themselves away.
Another way to approach this might be to classify your objects with a parametric type:
data Object a = Object {
objId :: Integer,
objAudit :: Audit,
objData :: a
}
Check it out, Object is a Functor!
instance Functor Object where
fmap f (Object id audit dta) = Object id audit (f dta)
I would be more inclined to do it this way, based on my design hunch. It is hard to say which way is better without knowing more about your plans. And look, the need for those typeclasses dissolved away. :-)
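To make the parametric version concrete, here is a small sketch of how a User could be expressed as an Object carrying user-specific data. Audit is trimmed to its two date fields to avoid the circular reference to User, and UserData and emailOf are made-up names for illustration.

```haskell
-- Simplified: the creationUser / lastUpdatedUser fields are left out.
data Audit = Audit
  { creationDate    :: Integer
  , lastUpdatedDate :: Integer
  } deriving (Show, Eq)

data Object a = Object
  { objId    :: Integer
  , objAudit :: Audit
  , objData  :: a
  } deriving (Show, Eq)

instance Functor Object where
  fmap f (Object i a d) = Object i a (f d)

-- Only the type-specific part of a user lives here.
newtype UserData = UserData { userEmail :: String }
  deriving (Show, Eq)

type User = Object UserData

-- Project out one field while keeping the id and audit wrapper intact.
emailOf :: User -> Object String
emailOf = fmap userEmail
```

Each of the roughly 20 domain types then only needs its own small payload type; objId and objAudit are shared by all of them.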
This is a known problem with Haskell's records. There have been some suggestions (notably TDNR) to mitigate the effects, but no solutions have emerged yet.
If you don't mind putting each of your data objects in a separate module, then you can use namespaces to differentiate between the functions:
import qualified Model.User as U
import qualified Model.Privileges as P
someUserId user = U.id user
somePrivId priv = P.id priv
As to using id instead of userId; it's possible if you hide the id which is imported from the Prelude by default. Use the following as your first import statement:
import Prelude hiding (id)
and now the usual id function won't be in scope. If you need it for some reason, also add import qualified Prelude and refer to it by its qualified name, i.e. Prelude.id (hiding a name removes its qualified form too).
Think carefully before creating a name that clashes with the Prelude. It can often be confusing for the programmer, and it's slightly awkward to work with. You may be better off using a short, generic name, such as oId.
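A minimal sketch of the hiding approach follows; the User type and userLabel are made up for illustration. The second import is there because hiding id removes both the unqualified and the qualified name, so Prelude.id needs a separate qualified import.

```haskell
import Prelude hiding (id)
import qualified Prelude

data User = User
  { id    :: Integer
  , email :: String
  } deriving Show

-- 'id' is the record field here; the Prelude function needs its qualified name.
userLabel :: User -> String
userLabel u = show (id u) ++ ": " ++ Prelude.id (email u)
```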
One simple option is, rather than going all out on type classes, make all your types varieties of a single algebraic data type:
data DomainObject = User {
objectID :: Int,
objectCreationDate :: Date
...
}
| SomethingElse {
objectID :: Int,
objectCreationDate :: Date,
somethingProperty :: Foo
...
}
| AnotherThing {
objectID :: Int,
objectCreationDate :: Date,
anotherThingProperty :: Bar
...
}
It's clunky because it requires having all your data structures in a single file, but it does at least allow you to use the same function (objectID) to get the ID of an object.

When should I use record syntax for data declarations in Haskell?

Record syntax seems extremely convenient compared to having to write your own accessor functions. I've never seen anyone give any guidelines as to when it's best to use record syntax over normal data declaration syntax, so I'll just ask here.
You should use record syntax in two situations:
The type has many fields
The type declaration gives no clue about its intended layout
For instance a Point type can be simply declared as:
data Point = Point Int Int deriving (Show)
It is obvious that the first Int denotes the x coordinate and the second stands for y. But the case with the following type declaration is different (taken from Learn You a Haskell for Great Good):
data Person = Person String String Int Float String String deriving (Show)
The intended type layout is: first name, last name, age, height, phone number, and favorite ice-cream flavor. But this is not evident in the above declaration. Record syntax comes in handy here:
data Person = Person { firstName :: String
, lastName :: String
, age :: Int
, height :: Float
, phoneNumber :: String
, flavor :: String
} deriving (Show)
The record syntax made the code more readable, and saved a great deal of typing by automatically defining all the accessor functions for us!
In addition to complex multi-fielded data, newtypes are often defined with record syntax. In either of these cases, there aren't really any downsides to using record syntax, but in the case of sum types, record accessors usually don't make sense. For example:
data Either a b = Left { getLeft :: a } | Right { getRight :: b }
is valid, but the accessor functions are partial – it is an error to write getLeft (Right "banana"). For that reason, such accessors are generally speaking discouraged; something like getLeft :: Either a b -> Maybe a would be more common, and that would have to be defined manually. However, note that accessors can share names:
data Item = Food { description :: String, tastiness :: Integer }
| Wand { description :: String, magic :: Integer }
Now description is total, although tastiness and magic both still aren't.
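For the fields that remain partial, the manually defined Maybe-returning accessors mentioned above might look like this sketch. The names tastinessMaybe and getLeftMaybe are made up, and getLeftMaybe is written against the Prelude's own Either.

```haskell
data Item
  = Food { description :: String, tastiness :: Integer }
  | Wand { description :: String, magic :: Integer }
  deriving Show

-- Total replacement for the partial 'tastiness' selector.
tastinessMaybe :: Item -> Maybe Integer
tastinessMaybe (Food _ t) = Just t
tastinessMaybe _          = Nothing

-- The same idea for the Prelude's Either.
getLeftMaybe :: Either a b -> Maybe a
getLeftMaybe = either Just (const Nothing)
```

Callers are then forced to handle the Nothing case instead of hitting a runtime error.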
