Why doesn't GHC Haskell support overloaded record parameter names? - haskell

What I am talking about is that it is not possible to define:
data A = A {name :: String}
data B = B {name :: String}
I know that the GHC just desugars this to plain functions and the idiomatic way to solve this would be:
data A = A {aName :: String}
data B = B {bName :: String}
class Name a where
name :: a -> String
instance Name A where
name = aName
instance Name B where
name = bName
After having written this out I don't like it that much ... couldn't this typeclassing be part of the desugaring process?
The thought came to me when I was writing some Aeson JSON parsing. Where it would have been too easy to just derive the FromJSON instances for every data type I had to write everything out by hand (currently >1k lines and counting).
Having names like name or simply value in a data record is not that uncommon.
http://www.haskell.org/haskellwiki/Performance/Overloading mentions that function overloading introduces some runtime overhead. But I actually don't see why the compiler wouldn't be able to resolve this at compile time and give them different names internally.
This SO question from 2012 more or less states historical reasons and points to a mail thread from 2006. Has anything changed recently?
Even if there would be some runtime overhead most people wouldn't mind cause most code hardly is performance critical.
Is there some hidden language extension that actually allows this? Again I am not sure ... but I think Idris actually does this?

Many, mostly minor reasons. One is the problem raised by a better answer, overloading just on the first argument is insufficient to handle all the useful cases.
You could "desugar"
data A { name :: String }
data B { name :: Text }
into
class Has'name a b | a -> b where
name :: a -> b
data A { aName :: String }
instance Has'name A String where
name :: aName
data B { bName :: Text }
instance Has'name B Text where
name :: bName
but that would require GHC extensions (Functional Dependencies) that haven't made it into the standard, yet. It would preclude using just 'name' for record creation, updates, and pattern matching (view patterns might help there), since 'name' isn't "just" a function in those cases. You can probably pull off something very similar with template Haskell.

Using the record syntax
data A { name :: String }
implicitly defines a function
name :: A -> String
If define both A and B with a { name :: String }, we have conflicting type definitions for name:
name :: A -> String
name :: B -> String
It's not clear how your proposed implicit type classes would work because if we define two types
data A { name :: String }
data B { name :: Text }
then we have just shifted the problem to conflicting type class definitions:
class Has'name a where
name :: a -> String
class Has'name a where
name :: a -> Text
In principle this could be resolved one way or another, but this is just one of several tricky conflicting desirable properties for records. When Haskell was defined, it was decided that it was better to have simple if limited support rather than to try to design something more ambitious and complicated. Several improvements to records have been discussed at various times and there are perennial discussions, e.g. this Haskell Cafe thread. Perhaps something will be worked out for Haskell Prime.

The best way I found, is to use a preprocessor to solve this definitely rather stupid problem.
Haskell and GHC make this easy, because the whole Haskell parser is available as a normal library. You could just parse all the files, do that renaming scheme (e.g. « data A { name :: String } » and « let a = A "Betty" in name a » into « data A { a_Name :: String } » and « let a = A "Betty" in aName a ») depending on the type of data the name function is applied to, using the type resolver, and write them out for compilation.
But honestly, that should be integrated into GHC. You’re right: It’s silly that this isn’t included.

Related

Simple Refinement Types in Haskell

From Scott Wlaschin's blog post and Book "Domain Modeling made Functional", and from Alexis King's post, I take that a domain model should encode as much information about the domain as is practical in the types, so as to "make illegal states unrepresentable" and to obtain strong guarantees that allows me to write domain logic functions that are total.
In basic enterprise applications, we have many basic domain types like street names, company names, cities and the like. To represent them in a form that prevents most errors later on, I would like to use a type that lets me
restrict the maximum and minimum number of characters.
specify the subset of characters that may be used,
add additional constraints, like no leading or trailing whitespace.
I can think of two ways to implement such types: As custom abstract data types with smart constructors and hidden data constructors, or via some type-level machinery (I vaguely read about refinement types? Can such types be represented via some of the newer language extensions? via LiquidHaskell?). Which is the sensible way to go? Which approach most easily works with all the functions that operate on regular Text, and how can I most easily combine two or more values of the same refined type, map over them, etc.?
Ideally, there would be a library to help me create such custom types. Is there?
following Alexis King's blog, I'd say that a suitable solution would be something like below. Of course, other solutions are posible.
import Control.Monad (>=>)
newtype StreetName = StreetName {getStreetName :: String}
-- compose all validations and wrap them in new constructor.
asStreetName :: String -> Maybe StreetName
asStreetName = StreetName <$> rightNumberOfChars >=> rightCharSubset >=> noTrailingWhiteSpace
-- This funcs will process the string, and produce another validated string or nothing.
rightNumberOfChars :: String -> Maybe String
rightNumberOfChars s = {- ... -}
rightCharSubset :: String -> Maybe String
rightCharSubset s = {- ... -}
noTrailingWhiteSpace :: String -> Maybe String
noTrailingWhiteSpaces = {- ... -}
main = do
street <- readStreetAsString
case asStreetName street of
Just s -> {- s is now validated -}
Nothing -> {- handle the error -}
make StreetName a hiden constructor, as use asStreetName as a smart constructor. Remember that other functions should use StreetName instead of String in the type, to make sure that data is validated.

understanding data structure in Haskell

I have a problem with a homework (the topic is : "functional data structures").
Please understand that I don't want anyone to solve my homework.
I just have a problem with understanding the structure of this :
data Heap e t = Heap {
empty :: t e,
insert :: e -> t e -> t e,
findMin :: t e -> Maybe e,
deleteMin :: t e -> Maybe (t e),
merge :: t e -> t e -> t e,
contains :: e -> t e -> Maybe Int
}
In my understanding "empty" "insert" and so on are functions which can applied to "Heap"-type data.
Now I just want to understand how that "Heap"thing looks like.
So I was typing things like :
a = Heap 42 42
But I get errors I can't really work with.
Maybe it is a dumb question and I'm just stuck at this point for no reason, but it is killing me at the moment.
Thankful to any help
If you truly wish to understand that type, you need to understand a few requisites first.
types and values (and functions)
Firstly, you need to understand what types and values are. I'm going to assume you understand this. You understand, for example, the separation between "hello" as a value and its type, String and you understand clearly what it means when I say a = "hello" :: String and:
a :: String
a = "hello"
If you don't understand that, then you need to research values and types in Haskell. There are a myriad of books that can help here, such as this one, which I helped to author: http://happylearnhaskelltutorial.com
I'm also going to assume you understand what functions and currying are, and how to use both of them.
polymorphic types
Secondly, as your example contains type variables, you'll need to understand what they are. That is, you need to understand what polymoprhic types are. So, for example, Maybe a, or Either a b, and you'll need to understand how Maybe String is different to Maybe Int and what Num a => [a] and even things like what Num a => [Maybe a] is.
Again, there are many free or paid books that can help, the example above covers this, too.
algebraic data types
Next up is algebraic data types. This is a pretty amazingly cool feature that Haskell has. Haskell-like languages such as Elm and Idris have it as well as others like Rust, too. It lets you define your own data types. These aren't just things like Structs in other languages, and yeah, they can even contain functions.
Maybe is actually an example of an algebraic data types. If you understand these, you'll know that:
data Direction = North | South | East | West
defines a data type called Direction whose values can only be one of North, South, East or West, and you'll know that you can also use the polymorhpic type variables above to parameterise your types like so:
data Tree a = EmptyNode | Node (Tree a) (Tree a)
which uses both optionality (as in the sum type of Direction above) as well as parameterization.
In addition to this, you can also have multiple types in each value. These are called product types, and Haskell's algebraic datatypes can be expressed as a combination of Sum types that can contain Product types. For example:
type Location = (Float, Float)
data ShapeNode = StringNode Location String | CircleNode Location Float | SquareNode Location Float Float
That is, each value can be one of StringNode, CircleNode or SquareNode, and in each case there are a different set of fields given to each value. To create a StringNode, for example, you'd need to pass the values of it constructor like this: StringNode (10.0, 5.3) "A String".
Again, the freely available books will go into much more detail about these things, but we're moving in the direction of getting more than a basic understanding of Haskell now.
Finally, in order to fully understand your example, you'll need to know about...
record types
Record types are the same as product types above, except that the fields are labelled rather than being anonymous. So, you could define the shape node data type like this, instead:
type Location = (Float, Float)
data ShapeNode
= StringNode { stringLocation :: Location, stringData :: String }
| CircleNode { circleLocation :: Location, radius :: Float }
| SquareNode { squareLocation :: Location, length :: Float, height :: Float }
Each field is named, and you can't repeat the same name inside data values.
All that you need in addition to this to understand the above example is to realise your example contains all of these things together, along with the fact that you have functions as your record field values in the data type you have.
It's a good idea to thoroughly flesh out your understanding and not skip any steps, then you'll be able to follow these kinds of things much more easily in the future. :) I wish you luck!
Heap is a record with six elements. In order to create a value of that type, you must supply all six elements. Assuming that you have appropriate values and functions, you can create a value like this:
myHeap = Heap myEmpty myInsert myFindMin myDeleteMin myMerge myContains
The doesn't seem like idiomatic Haskell design, however. Why not define generic functions independent of the data, or, if they must be bundled together, a typeclass?

StateT with Q monad from template haskell

I would like to create a function that takes some declarations of type Dec (which I get from [d| ... |]) and modify them. Modifications will depend on previous declarations so I would like to be able to store them in map hidden in State monad - mainly I create records and class instances and add to them fields from previous records. (This way I want to mimic OOP but this is probably not relevant for my question). I would like to splice results of my computations to code after I processed (and changed) all declarations.
I tried composing StatT with Q every possible way but I can't get it right.
Edit
My idea was to create functions collecting classes declarations (I am aware that Haskell is not object oriented language, I read some papers about how you can encode OOP in Haskell and I try to implement it with Template Haskell as small assignment).
So I would like to be able to write something like this:
declare [d| class A a where
attr1 :: a -> Int
attr2 :: a -> Int |]
declareInherit [d| class B b where
attr3 :: b -> Int |] ''A
This is encoding of (for example c++ code)
struct A{
int attr1;
int attr2;
}
struct B : A {
int attr3;
}
And I would like to generate two records with template haskell:
data A_data = A_data { attr1' :: Int, attr2' :: Int}
data B_data = B_data { attr1'' :: Int, attr2'' :: Int, attr3'' :: Int}
and instances of classes:
instance A A_data where
attr1 = attr1'
attr2 = attr2'
instance A B_data where
attr1 = attr1''
attr2 = attr2''
instance B B_data where
attr3 = attr3''
This is how my encoding of OO works, but I would like to be able to generate it automatically, not write it by hand.
I had a problem with interacting with DecsQ in declare function, probably I would like it to exist in something like this:
data Env = Env {classes :: Map.Map Name Dec }
type QS a = (StateT Env Q) a
I also has problem how to run computation in QS.
A problem with Template Haskell is that its API is not as polished as in most of other Haskell libraries. The Q monad is overloaded with problems: it reifies, it renders, it manages the state of local names. But it is never a good idea to interleave problem domains, since it's been proven that human beings can only think about one thing at a time (in other words, we're "single core"). This means that mixing problems together is not scalable. That is why it is hard to reason about Q already, and you want to add yet another problem on top of it: your state. Bad idea.
As well as any other problem you should approach this with separation of concerns. I.e., you should extract smaller problem domains from your main one and work with each of them in isolation. Concerning Template Haskell there are several evident domains: reification of existing ASTs, their analysis and rendering of new ASTs. For communication between those domains you may also need some "lingua franca" data model. Kinda reminds of something, doesn't it? Yep, it's MVC.
There are certain properties of the extracted domains, which you can then exploit to your benefits: you only need to remain in the Q monad for reification, the rendering and analysis can be done in the pure environment, granting you with all of its benefits. You can safely purify quasi-quotes using unsafePerformIO . runQ.
For real-life examples I can refer you to some of my projects, in which I apply this approach:
https://github.com/nikita-volkov/type-structure/blob/404017df89d3432481ed118776713cbd660ba220/src/TypeStructure/TH.hs
https://github.com/nikita-volkov/graph-db/blob/ba6ca0343ce73011b1c801d74aa3e17e0250d8a4/library/GraphDB/Macros.hs
At least within one file, you can pass state from one DecsQ splice to the one lower down in the file by storing it in:
{-# NOINLINE env #-}
env :: IORef Env
env = unsafePerformIO (newIORef (Env mempty))
and then do things like runIO (readIORef env) :: Q Env.

How to define a class that allows uniform access to different records in Haskell?

I have two records that both have a field I want to extract for display. How do I arrange things so they can be manipulated with the same functions? Since they have different fields (in this case firstName and buildingName) that are their name fields, they each need some "adapter" code to map firstName to name. Here is what I have so far:
class Nameable a where
name :: a -> String
data Human = Human {
firstName :: String
}
data Building = Building {
buildingName :: String
}
instance Nameable Human where
name x = firstName x
instance Nameable Building where
-- I think the x is redundant here, i.e the following should work:
-- name = buildingName
name x = buildingName x
main :: IO ()
main = do
putStr $ show (map name items)
where
items :: (Nameable a) => [a]
items = [ Human{firstName = "Don"}
-- Ideally I want the next line in the array too, but that gives an
-- obvious type error at the moment.
--, Building{buildingName = "Empire State"}
]
This does not compile:
TypeTest.hs:23:14:
Couldn't match expected type `a' against inferred type `Human'
`a' is a rigid type variable bound by
the type signature for `items' at TypeTest.hs:22:23
In the expression: Human {firstName = "Don"}
In the expression: [Human {firstName = "Don"}]
In the definition of `items': items = [Human {firstName = "Don"}]
I would have expected the instance Nameable Human section would make this work. Can someone explain what I am doing wrong, and for bonus points what "concept" I am trying to get working, since I'm having trouble knowing what to search for.
This question feels similar, but I couldn't figure out the connection with my problem.
Consider the type of items:
items :: (Nameable a) => [a]
It's saying that for any Nameable type, items will give me a list of that type. It does not say that items is a list that may contain different Nameable types, as you might think. You want something like items :: [exists a. Nameable a => a], except that you'll need to introduce a wrapper type and use forall instead. (See: Existential type)
{-# LANGUAGE ExistentialQuantification #-}
data SomeNameable = forall a. Nameable a => SomeNameable a
[...]
items :: [SomeNameable]
items = [ SomeNameable $ Human {firstName = "Don"},
SomeNameable $ Building {buildingName = "Empire State"} ]
The quantifier in the data constructor of SomeNameable basically allows it to forget everything about exactly which a is used, except that it is Nameable. Therefore, you will only be allowed to use functions from the Nameable class on the elements.
To make this nicer to use, you can make an instance for the wrapper:
instance Nameable (SomeNameable a) where
name (SomeNameable x) = name x
Now you can use it like this:
Main> map name items
["Don", "Empire State"]
Everybody is reaching for either existential quantification or algebraic data types. But these are both overkill (well depending on your needs, ADTs might not be).
The first thing to note is that Haskell has no downcasting. That is, if you use the following existential:
data SomeNameable = forall a. Nameable a => SomeNameable a
then when you create an object
foo :: SomeNameable
foo = SomeNameable $ Human { firstName = "John" }
the information about which concrete type the object was made with (here Human) is forever lost. The only things we know are: it is some type a, and there is a Nameable a instance.
What is it possible to do with such a pair? Well, you can get the name of the a you have, and... that's it. That's all there is to it. In fact, there is an isomorphism. I will make a new data type so you can see how this isomorphism arises in cases when all your concrete objects have more structure than the class.
data ProtoNameable = ProtoNameable {
-- one field for each typeclass method
protoName :: String
}
instance Nameable ProtoNameable where
name = protoName
toProto :: SomeNameable -> ProtoNameable
toProto (SomeNameable x) = ProtoNameable { protoName = name x }
fromProto :: ProtoNameable -> SomeNameable
fromProto = SomeNameable
As we can see, this fancy existential type SomeNameable has the same structure and information as ProtoNameable, which is isomorphic to String, so when you are using this lofty concept SomeNameable, you're really just saying String in a convoluted way. So why not just say String?
Your items definition has exactly the same information as this definition:
items = [ "Don", "Empire State" ]
I should add a few notes about this "protoization": it is only as straightforward as this when the typeclass you are existentially quantifying over has a certain structure: namely when it looks like an OO class.
class Foo a where
method1 :: ... -> a -> ...
method2 :: ... -> a -> ...
...
That is, each method only uses a once as an argument. If you have something like Num
class Num a where
(+) :: a -> a -> a
...
which uses a in multiple argument positions, or as a result, then eliminating the existential is not as easy, but still possible. However my recommendation to do this changes from a frustration to a subtle context-dependent choice, because of the complexity and distant relationship of the two representations. However, every time I have seen existentials used in practice it is with the Foo kind of tyepclass, where it only adds needless complexity, so I quite emphatically consider it an antipattern. In most of these cases I recommend eliminating the entire class from your codebase and exclusively using the protoized type (after you give it a good name).
Also, if you do need to downcast, then existentials aren't your man. You can either use an algebraic data type, as others people have answered, or you can use Data.Dynamic (which is basically an existential over Typeable. But don't do that; a Haskell programmer resorting to Dynamic is ungentlemanlike. An ADT is the way to go, where you characterize all the possible types it could be in one place (which is necessary so that the functions that do the "downcasting" know that they handle all possible cases).
I like #hammar's answer, and you should also check out this article which provides another example.
But, you might want to think differently about your types. The boxing of Nameable into the SomeNameable data type usually makes me start thinking about whether a union type for the specific case is meaningful.
data Entity = H Human | B Building
instance Nameable Entity where ...
items = [H (Human "Don"), B (Building "Town Hall")]
I'm not sure why you want to use the same function for
getting the name of a Human and the name of a Building.
If their names are used in fundamentally different ways,
except maybe for simple things like printing them,
then you probably want two
different functions for that. The type system
will automatically guide you to choose the right function
to use in each situation.
But if having a name is something significant about the
whole purpose of your program, and a Human and a Building
are really pretty much the same thing in that respect as far as your program
is concerned, then you would define their type together:
data NameableThing =
Human { name :: String } |
Building { name :: String }
That gives you a polymorphic function name that works for
whatever particular flavor of NameableThing you happen to have,
without needing to get into type classes.
Usually you would use a type class for a different kind of situation:
if you have some kind of non-trivial operation that has the same purpose
but a different implementation for several different types.
Even then, it's often better to use some other approach instead, like
passing a function as a parameter (a "higher order function", or "HOF").
Haskell type classes are a beautiful and powerful tool, but they are totally
different than what is called a "class" in object-oriented languages,
and they are used far less often.
And I certainly don't recommend complicating your program by using an advanced
extension to Haskell like Existential Qualification just to fit into
an object-oriented design pattern.
You can try to use Existentially Quanitified types and do it like this:
data T = forall a. Nameable a => MkT a
items = [MkT (Human "bla"), MkT (Building "bla")]
I've just had a look at the code that this question is abstracting from. For this, I would recommend merging the Task and RecurringTaskDefinition types:
data Task
= Once
{ name :: String
, scheduled :: Maybe Day
, category :: TaskCategory
}
| Recurring
{ name :: String
, nextOccurrence :: Day
, frequency :: RecurFrequency
}
type ProgramData = [Task] -- don't even need a new data type for this any more
Then, the name function works just fine on either type, and the functions you were complaining about like deleteTask and deleteRecurring don't even need to exist -- you can just use the standard delete function as usual.

When should I use record syntax for data declarations in Haskell?

Record syntax seems extremely convenient compared to having to write your own accessor functions. I've never seen anyone give any guidelines as to when it's best to use record syntax over normal data declaration syntax, so I'll just ask here.
You should use record syntax in two situations:
The type has many fields
The type declaration gives no clue about its intended layout
For instance a Point type can be simply declared as:
data Point = Point Int Int deriving (Show)
It is obvious that the first Int denotes the x coordinate and the second stands for y. But the case with the following type declaration is different (taken from Learn You a Haskell for Great Good):
data Person = Person String String Int Float String String deriving (Show)
The intended type layout is: first name, last name, age, height, phone number, and favorite ice-cream flavor. But this is not evident in the above declaration. Record syntax comes handy here:
data Person = Person { firstName :: String
, lastName :: String
, age :: Int
, height :: Float
, phoneNumber :: String
, flavor :: String
} deriving (Show)
The record syntax made the code more readable, and saved a great deal of typing by automatically defining all the accessor functions for us!
In addition to complex multi-fielded data, newtypes are often defined with record syntax. In either of these cases, there aren't really any downsides to using record syntax, but in the case of sum types, record accessors usually don't make sense. For example:
data Either a b = Left { getLeft :: a } | Right { getRight :: b }
is valid, but the accessor functions are partial – it is an error to write getLeft (Right "banana"). For that reason, such accessors are generally speaking discouraged; something like getLeft :: Either a b -> Maybe a would be more common, and that would have to be defined manually. However, note that accessors can share names:
data Item = Food { description :: String, tastiness :: Integer }
| Wand { description :: String, magic :: Integer }
Now description is total, although tastiness and magic both still aren't.

Resources