Is there an idiomatic way to deal with the situation where two structures share some content? - haskell

I'm making a toy forum to gain familiarity with Haskell and Servant.
My API looks something like this:
type UserAPI = "messages" :> ReqBody '[JSON] Msg :> Header "X-Real-IP" String :> Post '[JSON] APIMessage
  :<|> "messages" :> ReqBody '[JSON] Int :> Get '[JSON] [Msg']
My types look something like this:
data Msg = Msg
  { thread :: Int
  , dname :: String
  , contents :: String
  } deriving (Eq, Show, Generic)
data Msg' = Msg'
  { thread' :: Int
  , stamp' :: UTCTime
  , dname' :: String
  , contents' :: String
  , ip' :: String
  } deriving (Eq, Show, Generic)
and they derive ToJSON / FromJSON / FromRow instances, which is very convenient.
Msg represents the data the API expects when receiving a message, and Msg' the data it sends back when queried for messages; the latter has two additional fields that are added by the server. This doesn't feel right, and there has to be a cleaner way to achieve it.
Any insight on an idiomatic way to deal with this sort of problem is appreciated.

I will assume here that your question is more a conceptual one ("What can I do when I have two data types that share some structure?") than a simple "How do I model inheritance in Haskell?", which has already been answered here.
To answer your question, you need to consider more than just the structure of your data. For example, if I give you A and B and state that
data A = A Int String
data B = B Int
I doubt that you will automatically assume that an A is a B with an extra String. You will probably try to figure out the exact relation between these two data structures, and that is the right thing to do.
If each instance of A can actually be seen as an instance of B, then it can be relevant to provide a way to represent that in your code. You could do this in plain Haskell with
data A = A { super :: B, theString :: String }
data B = B { bId :: Int }
(the field is called bId rather than id so that it does not clash with the Prelude's id function).
Obviously, these datatypes will not be easy to work with unless you write some helper functions. For example, fromB and toB functions could be relevant:
fromB :: B -> String -> A
fromB b s = A { super = b, theString = s }
toB :: A -> B
toB = super
You can also use a typeclass to access the identifier:
class HasId a where
  getId :: a -> Int
instance HasId A where
  getId = bId . super
This is where some help from Lens can be useful, and the answer to the question "How do I model inheritance in Haskell?" is a good start. The Lens package provides object-oriented-style syntactic sugar for handling such inheritance relationships.
However, you may also find that an A is not exactly a B but that they both share the same ancestor. In that case you could prefer to create something like
data A = A { aShare :: C, theString :: String }
data B = B { bShare :: C }
data C = C Int
(the two fields need distinct names, because standard Haskell does not allow two record types in one module to share a field name).
This is the case when you do not want to use an A as a B, but there exist functions that can be used by both. The implementation will be close to the previous case, so I will not explain it.
Finally, you could find that there is no relation between them that is really useful (and, therefore, no function that would really be shared between A and B). Then you would prefer to keep your code as it is.
In your specific case, I think that there is no direct "is a" relation between Msg and Msg', since one is for receiving and the other is for sending. But they could share a common ancestor, since both are messages. So they will probably have some constructors and accessors in common (in OO programming terms).
Try never to forget that a structure is always bound to some functions. What category theory teaches us is that you cannot look at the structures alone; you also have to consider their functions to see how they relate to each other.
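One way to read the "common ancestor" suggestion for the forum types is sketched below. The names MsgBody, Stored, Timestamp, store and preview are illustrative assumptions, not part of the question's code, and Timestamp stands in for UTCTime so that the sketch needs nothing beyond base.

```haskell
-- Placeholder for UTCTime so the sketch depends only on base (assumption).
type Timestamp = Integer

-- The part shared by incoming and outgoing messages.
data MsgBody = MsgBody
  { thread   :: Int
  , dname    :: String
  , contents :: String
  } deriving (Eq, Show)

-- A stored message is the shared body plus the fields the server adds.
data Stored = Stored
  { body  :: MsgBody
  , stamp :: Timestamp
  , ip    :: String
  } deriving (Eq, Show)

-- Attach the server-side fields to a freshly received body.
store :: Timestamp -> String -> MsgBody -> Stored
store t addr b = Stored { body = b, stamp = t, ip = addr }

-- Functions written against the shared part work for both shapes.
preview :: MsgBody -> String
preview = take 20 . contents

main :: IO ()
main = do
  let received = MsgBody 1 "anon" "hello"
      stored   = store 0 "127.0.0.1" received
  putStrLn (preview received)
  putStrLn (preview (body stored))
```

Whether this is worth it depends on how many functions really operate on the shared part; if only the JSON instances touch these types, keeping two flat records may be just as reasonable.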

Related

Writing an OOP-style "setter" function in Haskell using record-syntax

I'm reading a tutorial on lenses and, in the introduction, the author motivates the lens concept by showing a few examples of how we might implement OOP-style "setter"/"getter" using standard Haskell. I'm confused by the following example.
Let's say we define a User algebraic data types as per Figure 1 (below). The tutorial states (correctly) that we can implement "setter" functionality via the NaiveLens data type and the nameLens function (also in Figure 1). An example usage is given in Figure 2.
I'm perplexed as to why we need such an elaborate construct (i.e., a NaiveLens datatype and a nameLens function) in order to implement "setter" functionality, when the following (somewhat obvious) function seems to do the job equally well: set' a s = s {name = a}.
HOWEVER, given that my "obvious" function is none other than the lambda function that's part of nameLens, I suspect there is indeed an advantage to using the construct below but that I'm too dense to see what that advantage is. Am hoping one of the Haskell wizards can help me understand.
Figure 1 (definitions):
data User = User { name :: String
                 , age :: Int
                 } deriving Show
data NaiveLens s a = NaiveLens { view :: s -> a
                               , set :: a -> s -> s
                               }
nameLens :: NaiveLens User String
nameLens = NaiveLens name (\a s -> s {name = a})
Figure 2 (example usage):
λ: let john = User {name="John",age=30}
john :: User
λ: set nameLens "Bob" john
User {name = "Bob", age = 30}
it :: User
The main advantage of lenses is that they compose, so they can be used for accessing and updating fields in nested records. Writing this sort of nested update manually using record update syntax gets tedious quite quickly.
Say you added an Email data type:
data Email = Email
  { _handle :: String
  , _domain :: String
  } deriving (Eq, Show)
handle :: NaiveLens Email String
handle = NaiveLens _handle (\h e -> e { _handle = h })
And added this as a field to your User type:
data User = User
  { _name :: String
  , _age :: Int
  , _userEmail :: Email
  } deriving (Eq, Show)
email :: NaiveLens User Email
email = NaiveLens _userEmail (\e u -> u { _userEmail = e })
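For comparison, this is roughly what changing the handle looks like with plain record update syntax and no lenses at all. The function name setUserHandle is made up for this sketch; the types repeat the ones above.

```haskell
data Email = Email
  { _handle :: String
  , _domain :: String
  } deriving (Eq, Show)

data User = User
  { _name      :: String
  , _age       :: Int
  , _userEmail :: Email
  } deriving (Eq, Show)

-- Every level of nesting has to be unpacked and rebuilt by hand.
setUserHandle :: String -> User -> User
setUserHandle h u = u { _userEmail = (_userEmail u) { _handle = h } }

main :: IO ()
main = print (setUserHandle "NOTBOB" (User "Bob" 30 (Email "bob" "gmail")))
```

With two levels this is still bearable, but each extra level adds another nested update of the same shape.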
The real power of lenses comes from being able to compose them, but this is a bit of a tricky step. We would like some function that looks like
(...) :: NaiveLens s b -> NaiveLens b a -> NaiveLens s a
NaiveLens viewA setA ... NaiveLens viewB setB
  = NaiveLens (viewB . viewA) (\c a -> setA (setB c (viewA a)) a)
For an explanation of how this was written, I'll defer to this post, where I shamelessly lifted it from. The resulting set field of this new lens can be thought of as taking a new value and a top-level record, looking up the lower record and setting its value to c, then setting that new record for the top-level record.
Now we have a convenient function for composing our lenses:
> let bob = User "Bob" 30 (Email "bob" "gmail")
> view (email...handle) bob
"bob"
> set (email...handle) "NOTBOB" bob
User {_name = "Bob", _age = 30, _userEmail = Email {_handle = "NOTBOB", _domain = "gmail"}}
I've used ... as the composition operator here because I think it's rather easy to type and still is similar to the . operator. This now gives us a way to drill down into a structure, getting and setting values fairly arbitrarily. If we had a domain lens written similarly, we could get and set that value in much the same way. This is what makes it look like it's OOP member access, even when it's simply fancy function composition.
If you look at the lens library (my choice for lenses), you get some nice tools to automatically build the lenses for you using Template Haskell, and there's some extra machinery behind the scenes that lets you use the normal function composition operator . instead of a custom one.
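To give a feel for why ordinary . can compose lenses, here is a minimal sketch of the function-based encoding usually called a van Laarhoven lens, using only base. Treat it as an illustration of the idea rather than the lens library's actual source; mkLens is a name chosen for this sketch, and view and set are re-defined locally.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A lens as a function: given a way to act on the field, act on the whole.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Build a lens from a getter and a setter (hypothetical helper name).
mkLens :: (s -> a) -> (a -> s -> s) -> Lens s a
mkLens get put f s = fmap (\a -> put a s) (f (get s))

-- Reading picks the Const functor, which just carries the field out.
view :: Lens s a -> s -> a
view l = getConst . l Const

-- Writing picks Identity and ignores the old field value.
set :: Lens s a -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

data Email = Email { _handle :: String, _domain :: String }
  deriving (Eq, Show)

data User = User { _name :: String, _age :: Int, _userEmail :: Email }
  deriving (Eq, Show)

handle :: Lens Email String
handle = mkLens _handle (\h e -> e { _handle = h })

email :: Lens User Email
email = mkLens _userEmail (\e u -> u { _userEmail = e })

main :: IO ()
main = do
  let bob = User "Bob" 30 (Email "bob" "gmail")
  -- Lenses are plain functions here, so the Prelude's (.) composes them.
  putStrLn (view (email . handle) bob)
  print (set (email . handle) "NOTBOB" bob)
```

Because each lens is just a function from "an action on the part" to "an action on the whole", composing two of them is function composition, with the outer lens written first.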

Why doesn't GHC Haskell support overloaded record parameter names?

What I am talking about is that it is not possible to define:
data A = A {name :: String}
data B = B {name :: String}
I know that the GHC just desugars this to plain functions and the idiomatic way to solve this would be:
data A = A {aName :: String}
data B = B {bName :: String}
class Name a where
  name :: a -> String
instance Name A where
  name = aName
instance Name B where
  name = bName
After having written this out I don't like it that much ... couldn't this typeclassing be part of the desugaring process?
The thought came to me when I was writing some Aeson JSON parsing. Where it would have been too easy to just derive the FromJSON instances for every data type I had to write everything out by hand (currently >1k lines and counting).
Having names like name or simply value in a data record is not that uncommon.
http://www.haskell.org/haskellwiki/Performance/Overloading mentions that function overloading introduces some runtime overhead. But I actually don't see why the compiler wouldn't be able to resolve this at compile time and give them different names internally.
This SO question from 2012 more or less states historical reasons and points to a mail thread from 2006. Has anything changed recently?
Even if there were some runtime overhead, most people wouldn't mind, because most code is hardly performance critical.
Is there some hidden language extension that actually allows this? Again I am not sure ... but I think Idris actually does this?
Many, mostly minor reasons. One is the problem raised by a better answer: overloading just on the first argument is insufficient to handle all the useful cases.
You could "desugar"
data A = A { name :: String }
data B = B { name :: Text }
into
class Has'name a b | a -> b where
  name :: a -> b
data A = A { aName :: String }
instance Has'name A String where
  name = aName
data B = B { bName :: Text }
instance Has'name B Text where
  name = bName
but that would require GHC extensions (Functional Dependencies) that haven't made it into the standard, yet. It would preclude using just 'name' for record creation, updates, and pattern matching (view patterns might help there), since 'name' isn't "just" a function in those cases. You can probably pull off something very similar with template Haskell.
Using the record syntax
data A = A { name :: String }
implicitly defines a function
name :: A -> String
If we define both A and B with a { name :: String } field, we have conflicting type signatures for name:
name :: A -> String
name :: B -> String
It's not clear how your proposed implicit type classes would work, because if we define two types
data A = A { name :: String }
data B = B { name :: Text }
then we have just shifted the problem to conflicting type class definitions:
class Has'name a where
  name :: a -> String
class Has'name a where
  name :: a -> Text
In principle this could be resolved one way or another, but this is just one of several tricky conflicting desirable properties for records. When Haskell was defined, it was decided that it was better to have simple if limited support rather than to try to design something more ambitious and complicated. Several improvements to records have been discussed at various times and there are perennial discussions, e.g. this Haskell Cafe thread. Perhaps something will be worked out for Haskell Prime.
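As a hedged aside on the "has anything changed" part of the question: GHC releases from 8.0 onward include a DuplicateRecordFields extension that accepts the declaration from the question. The sketch below assumes such a compiler; nameOfA and nameOfB are names made up here. It sticks to construction and pattern matching, where the constructor says which name is meant, since a bare name a used as a function can still be rejected as ambiguous.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data A = A { name :: String } deriving Show
data B = B { name :: String } deriving Show

-- The constructor in the pattern tells GHC which `name` field is meant.
nameOfA :: A -> String
nameOfA (A { name = n }) = n

nameOfB :: B -> String
nameOfB (B { name = n }) = n

main :: IO ()
main = do
  putStrLn (nameOfA (A { name = "Betty" }))
  putStrLn (nameOfB (B { name = "Tower" }))
```

This does not by itself give a single overloaded name function, so the type class approach from the question is still relevant when that is what is wanted.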
The best way I have found is to use a preprocessor to solve this rather stupid problem.
Haskell and GHC make this easy, because the whole Haskell parser is available as a normal library. You could parse all the files, apply a renaming scheme depending on the type of the value the name function is applied to (e.g. turning « data A = A { name :: String } » and « let a = A "Betty" in name a » into « data A = A { aName :: String } » and « let a = A "Betty" in aName a »), using the type resolver, and write the results out for compilation.
But honestly, that should be integrated into GHC. You’re right: It’s silly that this isn’t included.

How to define a class that allows uniform access to different records in Haskell?

I have two records that both have a field I want to extract for display. How do I arrange things so they can be manipulated with the same functions? Since they have different fields (in this case firstName and buildingName) that are their name fields, they each need some "adapter" code to map firstName to name. Here is what I have so far:
class Nameable a where
  name :: a -> String
data Human = Human {
    firstName :: String
  }
data Building = Building {
    buildingName :: String
  }
instance Nameable Human where
  name x = firstName x
instance Nameable Building where
  -- I think the x is redundant here, i.e the following should work:
  -- name = buildingName
  name x = buildingName x
main :: IO ()
main = do
  putStr $ show (map name items)
  where
    items :: (Nameable a) => [a]
    items = [ Human{firstName = "Don"}
            -- Ideally I want the next line in the array too, but that gives an
            -- obvious type error at the moment.
            --, Building{buildingName = "Empire State"}
            ]
This does not compile:
TypeTest.hs:23:14:
Couldn't match expected type `a' against inferred type `Human'
`a' is a rigid type variable bound by
the type signature for `items' at TypeTest.hs:22:23
In the expression: Human {firstName = "Don"}
In the expression: [Human {firstName = "Don"}]
In the definition of `items': items = [Human {firstName = "Don"}]
I would have expected the instance Nameable Human section would make this work. Can someone explain what I am doing wrong, and for bonus points what "concept" I am trying to get working, since I'm having trouble knowing what to search for.
This question feels similar, but I couldn't figure out the connection with my problem.
Consider the type of items:
items :: (Nameable a) => [a]
It's saying that for any Nameable type, items will give me a list of that type. It does not say that items is a list that may contain different Nameable types, as you might think. You want something like items :: [exists a. Nameable a => a], except that you'll need to introduce a wrapper type and use forall instead. (See: Existential type)
{-# LANGUAGE ExistentialQuantification #-}
data SomeNameable = forall a. Nameable a => SomeNameable a
[...]
items :: [SomeNameable]
items = [ SomeNameable $ Human {firstName = "Don"},
SomeNameable $ Building {buildingName = "Empire State"} ]
The quantifier in the data constructor of SomeNameable basically allows it to forget everything about exactly which a is used, except that it is Nameable. Therefore, you will only be allowed to use functions from the Nameable class on the elements.
To make this nicer to use, you can make an instance for the wrapper:
instance Nameable SomeNameable where
  name (SomeNameable x) = name x
Now you can use it like this:
Main> map name items
["Don", "Empire State"]
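Putting the pieces of this answer together, a complete module might look like the following sketch. It only combines the definitions already shown, with the wrapper instance written for SomeNameable itself (the wrapper type takes no type parameter).

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Nameable a where
  name :: a -> String

data Human = Human { firstName :: String }
data Building = Building { buildingName :: String }

instance Nameable Human where
  name = firstName

instance Nameable Building where
  name = buildingName

-- The wrapper hides the concrete type and keeps only the Nameable instance.
data SomeNameable = forall a. Nameable a => SomeNameable a

instance Nameable SomeNameable where
  name (SomeNameable x) = name x

items :: [SomeNameable]
items =
  [ SomeNameable (Human { firstName = "Don" })
  , SomeNameable (Building { buildingName = "Empire State" })
  ]

main :: IO ()
main = print (map name items)
```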
Everybody is reaching for either existential quantification or algebraic data types. But these are both overkill (well depending on your needs, ADTs might not be).
The first thing to note is that Haskell has no downcasting. That is, if you use the following existential:
data SomeNameable = forall a. Nameable a => SomeNameable a
then when you create an object
foo :: SomeNameable
foo = SomeNameable $ Human { firstName = "John" }
the information about which concrete type the object was made with (here Human) is forever lost. The only things we know are: it is some type a, and there is a Nameable a instance.
What is it possible to do with such a pair? Well, you can get the name of the a you have, and... that's it. That's all there is to it. In fact, there is an isomorphism. I will make a new data type so you can see how this isomorphism arises in cases when all your concrete objects have more structure than the class.
data ProtoNameable = ProtoNameable {
    -- one field for each typeclass method
    protoName :: String
  }
instance Nameable ProtoNameable where
  name = protoName
toProto :: SomeNameable -> ProtoNameable
toProto (SomeNameable x) = ProtoNameable { protoName = name x }
fromProto :: ProtoNameable -> SomeNameable
fromProto = SomeNameable
As we can see, this fancy existential type SomeNameable has the same structure and information as ProtoNameable, which is isomorphic to String, so when you are using this lofty concept SomeNameable, you're really just saying String in a convoluted way. So why not just say String?
Your items definition has exactly the same information as this definition:
items = [ "Don", "Empire State" ]
I should add a few notes about this "protoization": it is only as straightforward as this when the typeclass you are existentially quantifying over has a certain structure: namely when it looks like an OO class.
class Foo a where
method1 :: ... -> a -> ...
method2 :: ... -> a -> ...
...
That is, each method only uses a once as an argument. If you have something like Num
class Num a where
(+) :: a -> a -> a
...
which uses a in multiple argument positions, or as a result, then eliminating the existential is not as easy, but still possible. However my recommendation to do this changes from a frustration to a subtle context-dependent choice, because of the complexity and distant relationship of the two representations. However, every time I have seen existentials used in practice it is with the Foo kind of typeclass, where it only adds needless complexity, so I quite emphatically consider it an antipattern. In most of these cases I recommend eliminating the entire class from your codebase and exclusively using the protoized type (after you give it a good name).
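A sketch of what the protoized type can look like when the would-be class has more than one method. The Named record, its greet field, and the fromHuman and fromBuilding helpers are invented for this example; the point is only that each method becomes a field, with any extra arguments kept as function arguments.

```haskell
-- One field per would-be class method; no class and no existential.
data Named = Named
  { name  :: String
  , greet :: String -> String  -- a "method" that takes an extra argument
  }

data Human = Human { firstName :: String }
data Building = Building { buildingName :: String }

-- Each concrete type gets a plain conversion function instead of an instance.
fromHuman :: Human -> Named
fromHuman h = Named
  { name  = firstName h
  , greet = \visitor -> "Hi " ++ visitor ++ ", I'm " ++ firstName h
  }

fromBuilding :: Building -> Named
fromBuilding b = Named
  { name  = buildingName b
  , greet = \visitor -> "Welcome to " ++ buildingName b ++ ", " ++ visitor
  }

items :: [Named]
items = [fromHuman (Human "Don"), fromBuilding (Building "Empire State")]

main :: IO ()
main = do
  print (map name items)
  mapM_ (putStrLn . (`greet` "Ann")) items
```

The list is homogeneous, needs no language extension, and new kinds of named things can be added by writing another conversion function.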
Also, if you do need to downcast, then existentials aren't your man. You can either use an algebraic data type, as other people have answered, or you can use Data.Dynamic (which is basically an existential over Typeable). But don't do that; a Haskell programmer resorting to Dynamic is ungentlemanlike. An ADT is the way to go, where you characterize all the possible types it could be in one place (which is necessary so that the functions that do the "downcasting" know that they handle all possible cases).
I like @hammar's answer, and you should also check out this article, which provides another example.
But, you might want to think differently about your types. The boxing of Nameable into the SomeNameable data type usually makes me start thinking about whether a union type for the specific case is meaningful.
data Entity = H Human | B Building
instance Nameable Entity where ...
items = [H (Human "Don"), B (Building "Town Hall")]
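One possible way to fill in the elided instance body is to dispatch on the constructor, as in this sketch. The humans helper is an addition made here to show what the union type offers that the existential wrapper does not: the concrete type can be recovered by pattern matching.

```haskell
class Nameable a where
  name :: a -> String

data Human = Human { firstName :: String }
data Building = Building { buildingName :: String }

instance Nameable Human where
  name = firstName

instance Nameable Building where
  name = buildingName

data Entity = H Human | B Building

-- One possible body for the instance: delegate to the wrapped value.
instance Nameable Entity where
  name (H h) = name h
  name (B b) = name b

-- Pattern matching recovers the concrete type (hypothetical helper).
humans :: [Entity] -> [Human]
humans es = [h | H h <- es]

items :: [Entity]
items = [H (Human "Don"), B (Building "Town Hall")]

main :: IO ()
main = do
  print (map name items)
  print (length (humans items))
```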
I'm not sure why you want to use the same function for getting the name of a Human and the name of a Building. If their names are used in fundamentally different ways, except maybe for simple things like printing them, then you probably want two different functions for that. The type system will automatically guide you to choose the right function to use in each situation.
But if having a name is something significant about the whole purpose of your program, and a Human and a Building are really pretty much the same thing in that respect as far as your program is concerned, then you would define their type together:
data NameableThing =
  Human { name :: String } |
  Building { name :: String }
That gives you a single function, name, that works for whatever particular flavor of NameableThing you happen to have, without needing to get into type classes.
Usually you would use a type class for a different kind of situation: if you have some kind of non-trivial operation that has the same purpose but a different implementation for several different types. Even then, it's often better to use some other approach instead, like passing a function as a parameter (a "higher order function", or "HOF").
Haskell type classes are a beautiful and powerful tool, but they are totally different from what is called a "class" in object-oriented languages, and they are used far less often.
And I certainly don't recommend complicating your program by using an advanced extension to Haskell like Existential Quantification just to fit into an object-oriented design pattern.
You can try to use existentially quantified types and do it like this:
data T = forall a. Nameable a => MkT a
items = [MkT (Human "bla"), MkT (Building "bla")]
I've just had a look at the code that this question is abstracting from. For this, I would recommend merging the Task and RecurringTaskDefinition types:
data Task
  = Once
    { name :: String
    , scheduled :: Maybe Day
    , category :: TaskCategory
    }
  | Recurring
    { name :: String
    , nextOccurrence :: Day
    , frequency :: RecurFrequency
    }
type ProgramData = [Task] -- don't even need a new data type for this any more
Then, the name function works just fine on either type, and the functions you were complaining about like deleteTask and deleteRecurring don't even need to exist -- you can just use the standard delete function as usual.

Modeling domain data in Haskell [closed]

I'm working on designing a larger-ish web application using Haskell. This is purely for my education and interest.
I'm starting by writing out my domain/value objects. One example is a User. Here's what I've come up with so far
module Model (User) where
class Audited a where
  creationDate :: a -> Integer
  lastUpdatedDate :: a -> Integer
  creationUser :: a -> User
  lastUpdatedUser :: a -> User
class Identified a where
  id :: a -> Integer
data User = User { userId :: Integer
                 , userEmail :: String
                 , userCreationDate :: Integer
                 , userLastUpdatedDate :: Integer
                 , userCreationUser :: User
                 , userLastUpdatedUser :: User
                 }
instance Identified User where
  id u = userId u
instance Audited User where
  creationDate u = userCreationDate u
  lastUpdatedDate u = userLastUpdatedDate u
  creationUser u = userCreationUser u
  lastUpdatedUser u = userLastUpdatedUser u
My application will have roughly 20 types like the above type. When I say "like the above type", I mean they will have an id, audit information, and some type-specific information (like email in the case of User).
The thing I can't wrap my mind around is the fact that each of my fields (e.g. User.userEmail) creates a new function fieldName :: Type -> FieldType. With 20 different types, the namespace seems like it'll get pretty full pretty fast. Also, I don't like having to name my User ID field userId. I'd rather name it id. Is there any way around this?
Maybe I should mention that I'm coming from the imperative world, so this FP stuff is pretty new (yet pretty exciting) for me.
Yeah, namespacing can be kind of a pain in Haskell. I usually end up tightening up my abstractions until there are not so many names. It also allows for more reuse. For yours, I would make a data type rather than a class for the audit information:
data Audit = Audit {
    creationDate :: Integer,
    lastUpdatedDate :: Integer,
    creationUser :: User,
    lastUpdatedUser :: User
  }
And then pair that up with the type-specific data:
data User = User {
    userAudit :: Audit,
    userId :: Integer,
    userEmail :: String
  }
You can still use those typeclasses if you want:
class Audited a where
  audit :: a -> Audit
class Identified a where
  ident :: a -> Integer
However as your design develops, be open to the possibility of those typeclasses dissolving into thin air. Object-like typeclasses -- ones where every method takes a single parameter of type a -- have a way of simplifying themselves away.
Another way to approach this might be to classify your objects with a parametric type:
data Object a = Object {
    objId :: Integer,
    objAudit :: Audit,
    objData :: a
  }
Check it out, Object is a Functor!
instance Functor Object where
  fmap f (Object id audit dta) = Object id audit (f dta)
I would be more inclined to do it this way, based on my design hunch. It is hard to say which way is better without knowing more about your plans. And look, the need for those typeclasses dissolved away. :-)
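Here is a small self-contained sketch of the parametric Object approach in use. To keep it short, Audit carries only the two dates (an assumption of this sketch; the user fields from above are left out), and UserData and touch are names invented here.

```haskell
-- Simplified audit record: only the dates, to keep the sketch small.
data Audit = Audit
  { creationDate    :: Integer
  , lastUpdatedDate :: Integer
  } deriving (Eq, Show)

data Object a = Object
  { objId    :: Integer
  , objAudit :: Audit
  , objData  :: a
  } deriving (Eq, Show)

instance Functor Object where
  fmap f (Object i audit dta) = Object i audit (f dta)

-- Type-specific payload for users (hypothetical name).
newtype UserData = UserData { userEmail :: String }
  deriving (Eq, Show)

type User = Object UserData

-- Works for every domain type, because id and audit live in the wrapper.
touch :: Integer -> Object a -> Object a
touch now o = o { objAudit = (objAudit o) { lastUpdatedDate = now } }

main :: IO ()
main = do
  let u = Object 1 (Audit 100 100) (UserData "a@example.com") :: User
  print (objId u)
  print (lastUpdatedDate (objAudit (touch 200 u)))
  print (objData (fmap userEmail u))
```

The accessors objId and objAudit are defined once and shared by all of the roughly twenty types, which is what makes the Identified and Audited classes unnecessary in this design.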
This is a known problem with Haskell's records. There have been some suggestions (notably TDNR) to mitigate the effects, but no solutions have emerged yet.
If you don't mind putting each of your data objects in a separate module, then you can use namespaces to differentiate between the functions:
import qualified Model.User as U
import qualified Model.Privileges as P
someUserId user = U.id user
somePrivId priv = P.id priv
As to using id instead of userId; it's possible if you hide the id which is imported from the Prelude by default. Use the following as your first import statement:
import Prelude hiding (id)
and now the usual id function won't be in scope. If you need it for some reason, you can access it with a fully-qualified name, i.e. Prelude.id.
Think carefully before creating a name that clashes with the Prelude. It can often be confusing for the programmer, and it's slightly awkward to work with. You may be better off using a short, generic name, such as oId.
One simple option is, rather than going all out on type classes, make all your types varieties of a single algebraic data type:
data DomainObject = User {
    objectID :: Int,
    objectCreationDate :: Date,
    ...
  }
  | SomethingElse {
    objectID :: Int,
    objectCreationDate :: Date,
    somethingProperty :: Foo
    ...
  }
  | AnotherThing {
    objectID :: Int,
    objectCreationDate :: Date,
    anotherThingProperty :: Bar
    ...
  }
It's clunky because it requires having all your data structures in a single file, but it does at least allow you to use the same function (objectID) to get the ID of an object.

When should I use record syntax for data declarations in Haskell?

Record syntax seems extremely convenient compared to having to write your own accessor functions. I've never seen anyone give any guidelines as to when it's best to use record syntax over normal data declaration syntax, so I'll just ask here.
You should use record syntax in two situations:
The type has many fields
The type declaration gives no clue about its intended layout
For instance a Point type can be simply declared as:
data Point = Point Int Int deriving (Show)
It is obvious that the first Int denotes the x coordinate and the second stands for y. But the case with the following type declaration is different (taken from Learn You a Haskell for Great Good):
data Person = Person String String Int Float String String deriving (Show)
The intended type layout is: first name, last name, age, height, phone number, and favorite ice-cream flavor. But this is not evident in the above declaration. Record syntax comes in handy here:
data Person = Person { firstName :: String
                     , lastName :: String
                     , age :: Int
                     , height :: Float
                     , phoneNumber :: String
                     , flavor :: String
                     } deriving (Show)
The record syntax made the code more readable, and saved a great deal of typing by automatically defining all the accessor functions for us!
In addition to complex multi-fielded data, newtypes are often defined with record syntax. In either of these cases, there aren't really any downsides to using record syntax, but in the case of sum types, record accessors usually don't make sense. For example:
data Either a b = Left { getLeft :: a } | Right { getRight :: b }
is valid, but the accessor functions are partial – it is an error to write getLeft (Right "banana"). For that reason, such accessors are generally speaking discouraged; something like getLeft :: Either a b -> Maybe a would be more common, and that would have to be defined manually. However, note that accessors can share names:
data Item = Food { description :: String, tastiness :: Integer }
          | Wand { description :: String, magic :: Integer }
Now description is total, although tastiness and magic both still aren't.
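As a sketch of the manually defined, Maybe-returning accessor mentioned above, applied to the Item type: magicOf is a name chosen here, and it is a total replacement for calling the partial magic selector directly.

```haskell
import Data.Maybe (mapMaybe)

data Item
  = Food { description :: String, tastiness :: Integer }
  | Wand { description :: String, magic :: Integer }
  deriving Show

-- Total alternative to the partial `magic` selector.
magicOf :: Item -> Maybe Integer
magicOf (Wand _ m) = Just m
magicOf _          = Nothing

items :: [Item]
items = [Food "apple" 7, Wand "oak wand" 3]

main :: IO ()
main = do
  -- `description` is safe on every constructor.
  print (map description items)
  -- `magicOf` simply skips the items that have no magic field.
  print (mapMaybe magicOf items)
```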
