When should I use record syntax for data declarations in Haskell? - haskell

Record syntax seems extremely convenient compared to having to write your own accessor functions. I've never seen anyone give any guidelines as to when it's best to use record syntax over normal data declaration syntax, so I'll just ask here.

You should use record syntax in two situations:
The type has many fields
The type declaration gives no clue about its intended layout
For instance a Point type can be simply declared as:
data Point = Point Int Int deriving (Show)
It is obvious that the first Int denotes the x coordinate and the second stands for y. But the case with the following type declaration is different (taken from Learn You a Haskell for Great Good):
data Person = Person String String Int Float String String deriving (Show)
The intended type layout is: first name, last name, age, height, phone number, and favorite ice-cream flavor. But this is not evident in the above declaration. Record syntax comes in handy here:
data Person = Person { firstName :: String
                     , lastName :: String
                     , age :: Int
                     , height :: Float
                     , phoneNumber :: String
                     , flavor :: String
                     } deriving (Show)
The record syntax made the code more readable, and saved a great deal of typing by automatically defining all the accessor functions for us!
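As a minimal sketch of what the generated accessors give you, the following builds a Person with named fields, reads fields back, and uses record update syntax. The sample value guy and its field contents are made up for illustration.

```haskell
data Person = Person
  { firstName   :: String
  , lastName    :: String
  , age         :: Int
  , height      :: Float
  , phoneNumber :: String
  , flavor      :: String
  } deriving (Show)

-- Hypothetical sample value; naming the fields makes the layout explicit.
guy :: Person
guy = Person
  { firstName   = "Buddy"
  , lastName    = "Finklestein"
  , age         = 43
  , height      = 184.2
  , phoneNumber = "526-2928"
  , flavor      = "Chocolate"
  }

-- Record update syntax: copy guy, changing only the age field.
older :: Person
older = guy { age = age guy + 1 }

main :: IO ()
main = do
  putStrLn (firstName guy)  -- accessor generated by the record declaration
  print (age older)
```

With the positional declaration, the same reads would need either pattern matching on all six fields or hand-written accessor functions.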

In addition to complex multi-fielded data, newtypes are often defined with record syntax. In either of these cases, there aren't really any downsides to using record syntax, but in the case of sum types, record accessors usually don't make sense. For example:
data Either a b = Left { getLeft :: a } | Right { getRight :: b }
is valid, but the accessor functions are partial – it is an error to write getLeft (Right "banana"). For that reason, such accessors are generally speaking discouraged; something like getLeft :: Either a b -> Maybe a would be more common, and that would have to be defined manually. However, note that accessors can share names:
data Item = Food { description :: String, tastiness :: Integer }
          | Wand { description :: String, magic :: Integer }
Now description is total, although tastiness and magic both still aren't.
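A minimal sketch of the manual, total alternative mentioned above. It uses the Prelude's own Either rather than redefining it, and the helper magicOf is a hypothetical name introduced here to show the same idea for Item.

```haskell
-- Total accessor for the Prelude's Either, written by hand.
getLeft :: Either a b -> Maybe a
getLeft (Left x)  = Just x
getLeft (Right _) = Nothing

data Item
  = Food { description :: String, tastiness :: Integer }
  | Wand { description :: String, magic :: Integer }

-- Hypothetical total replacement for the partial accessor magic.
magicOf :: Item -> Maybe Integer
magicOf (Wand _ m) = Just m
magicOf _          = Nothing

main :: IO ()
main = do
  print (getLeft (Left 3 :: Either Int String))
  print (getLeft (Right "banana" :: Either Int String))
  -- description is shared by every constructor, so it is safe on any Item.
  putStrLn (description (Wand "oak wand" 7))
  print (magicOf (Food "apple" 5))
```

Calling magic (Food "apple" 5) directly would compile but fail at runtime, which is the reason partial accessors are discouraged.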

Related

What is the difference between the single and double quote/apostrophe in template-haskell?

When learning about Haskell lenses with the Optics package, I encountered the following example:
data Person = Person
  { _name :: String
  , _age :: Int
  }
makeLenses ''Person
makePrisms 'Person
What does a value of type Name represent, and what is the difference between the single and the double quote/apostrophe?
Both seem to have the same type:
makeLenses, makePrisms :: Name -> DecsQ
The template-haskell documentation is incomprehensible to me. It focuses on syntax and lacks examples:
* 'f has type Name, and names the function f. Similarly 'C has type Name and names the data constructor C. In general '⟨thing⟩ interprets ⟨thing⟩ in an expression context.
* ''T has type Name, and names the type constructor T. That is, ''⟨thing⟩ interprets ⟨thing⟩ in a type context.
We have two forms of quoting to distinguish between the data constructor and the type constructor.
Consider this variant:
data Person = KPerson
  { _name :: String
  , _age :: Int
  }
makeLenses ''Person -- the type constructor
makePrisms 'KPerson -- the data constructor
Here it is clear that in one case we use a Name for the type constructor while in the other case we refer to a Name for the data constructor.
In principle, Haskell could have used a single form of quoting, provided that the names of constructors such as Person and KPerson are always kept distinct. Since this is not the case, we need to disambiguate between naming the type and data constructors.
Note that, in practice, it is customary to use the same name for both constructors, so this disambiguation is often needed in actual code.
Type constructors and term constructors can have the same name in Haskell, so you use double and single ticks, respectively, to indicate the difference. Here is that example from Optics with distinct names:
data Person = P
  { _name :: String
  , _age :: Int
  }
makeLenses ''Person
makePrisms 'P
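To see the two quote forms without pulling in a lens library, here is a small sketch that only inspects the resulting Name values. It assumes the template-haskell package that ships with GHC; typeName and conName are names introduced here for illustration.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH (Name, nameBase)

data Person = P
  { _name :: String
  , _age  :: Int
  }

-- Double tick: look the identifier up in the type namespace.
typeName :: Name
typeName = ''Person

-- Single tick: look the identifier up in the expression namespace.
conName :: Name
conName = 'P

main :: IO ()
main = do
  putStrLn (nameBase typeName)  -- unqualified name of the type constructor
  putStrLn (nameBase conName)   -- unqualified name of the data constructor
```

Writing 'Person here would be rejected, because no data constructor called Person is in scope; that is exactly the ambiguity the two forms resolve when both constructors share a name.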

Pass no char to a function that is expecting it in Haskell

I am working with Haskell and I have defined the following type
--Build type Transition--
data Transition = Transition {
    start_state :: Int,
    symbol :: Char,
    end_state :: Int
} deriving Show
and I would like to be able to define the following Transition
Transition 0 '' 1
which would mean "a transition given by no symbol" (I need it to compute the epsilon closure of an NFA). How can I do this?
Thank you!
The idea of defining a type is that every value you pass to that field is a "member" of that type. Char contains only characters (the empty string is not a character), plus undefined, which is inadvisable to use here.
Usually in case you want to make values optional, you can use a Maybe a type instead, so:
data Transaction = Transaction {
    start_state :: Int,
    symbol :: Maybe Char,
    end_state :: Int
} deriving Show
Now the field accepts two kinds of values: Nothing, which should be interpreted as "no character", or Just x, where x is a character. In your case, that would be:
Transaction 0 Nothing 1
Maybe is also an instance of Functor, Applicative and Monad, which should make working with Maybe types quite convenient (yes it can sometimes introduce some extra work, but by using fmap, etc. the amount of pattern matching shifting to Maybe Char should be rather low).
Note: as @amalloy says, an NFA (and DFA) has Transitions, not Transactions.
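As a rough sketch of how the Maybe Char field could be used for the epsilon closure, the following collects the states reachable from one state by a single symbol-less transition. It uses the name Transition as in the question; epsilonTargets and the sample list nfa are hypothetical.

```haskell
import Data.Maybe (isNothing)

data Transition = Transition
  { start_state :: Int
  , symbol      :: Maybe Char  -- Nothing marks an epsilon transition
  , end_state   :: Int
  } deriving Show

-- Hypothetical helper: states reachable from s in one epsilon step.
epsilonTargets :: [Transition] -> Int -> [Int]
epsilonTargets ts s =
  [ end_state t | t <- ts, start_state t == s, isNothing (symbol t) ]

-- Made-up transition table for illustration.
nfa :: [Transition]
nfa =
  [ Transition 0 Nothing 1
  , Transition 0 (Just 'a') 2
  , Transition 1 Nothing 3
  ]

main :: IO ()
main = print (epsilonTargets nfa 0)
```

A full epsilon closure would repeat this step until no new states appear; that part is left out of the sketch.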

New type declaring functions?

I'm familiar with the newtype declaration:
newtype MyAge = Age {age :: Int} deriving (Show, Eq, Ord)
In this instance Age is an Int, however I've come across the code below and I can't understand it:
newtype Ages a = Ages {age :: String -> [(a,String)]}
This appears to be a function declaration? (takes string, returns list of tuples containing 'a' and string) - is this correct?
N.B I've just realized this is just basic record syntax to declare a function.
Additionally, I've tried to implement this type, but I must be doing something wrong:
newtype Example a = Example {ex :: Int -> Int}
myexample = Example {ex = (\x -> x + 1)}
This compiles, however I don't understand why as I haven't passed the 'a' parameter?
This appears to be a function declaration?
Yes. Specifically, String -> [(a,String)] is a function type. A newtype declaration is analogous to a simple wrapper around any given type. There's no restriction that says you can't make it based on a function type, and it works in exactly the same way.
Also remember that you can always replace newtype with data; in this case, thinking about the resulting type as a record type that has a field that is a function might be helpful; newtype is just a special, optimized case.
One other thing to mention is that your two lines also differ in that the second one is parametrized over a. This can of course be used with regular types:
newtype MyWrapper a = MyWrapper a
or a function type can be newtype-d without parametrisation
newtype MyFunction = MyFunction (Float -> Float)
You can also write the above using the record syntax that gives you the "getter" function as well.
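A small sketch of both declarations from the question in use. The value firstChar is a hypothetical example of the kind of function Ages can wrap. It also shows why myexample compiles: the parameter a of Example never appears on the right-hand side, so nothing in the wrapped function needs it, and any type can be chosen for it at the use site.

```haskell
newtype Ages a = Ages { age :: String -> [(a, String)] }

-- Hypothetical value: split off the first character, if there is one.
firstChar :: Ages Char
firstChar = Ages $ \s -> case s of
  []     -> []
  (c:cs) -> [(c, cs)]

-- a is unused in the field type, so it is left completely unconstrained.
newtype Example a = Example { ex :: Int -> Int }

myexample :: Example a
myexample = Example { ex = \x -> x + 1 }

main :: IO ()
main = do
  -- The accessor unwraps the newtype, giving back the plain function.
  print (age firstChar "abc")
  -- Bool is an arbitrary choice for a; any other type works equally well.
  print (ex (myexample :: Example Bool) 41)
```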

Haskell generic data structure

I want to create a type to store some generic information. In my case this type is Molecule, where I store the chemical graph and molecular properties.
data Molecule = Molecule {
    name :: Maybe String,
    graph :: Gr Atom Bond,
    property :: Maybe [Property] -- that's a question
} deriving(Show)
I want to represent properties as tuples
type Property a = (String,a)
because a property may have any type: Float, Int, String, etc.
The question is how to form Molecule data structure, so I will be able to collect any numbers of any types of properties in Molecule. If I do
data Molecule a = Molecule {
    name :: Maybe String,
    graph :: Gr Atom Bond,
    property :: Maybe [Property a]
} deriving(Show)
I have to directly assign one type when I create a molecule.
If you know in advance the set of properties a molecule might have, you could define a sum type:
data Property = Mass Float | CatalogNum Int | Comment String
If you want this type to be extensible, you could use Data.Dynamic as another answer suggests. For instance:
data Molecule = Molecule { name :: Maybe String,
                           graph :: Gr Atom Bond,
                           property :: [(String,Dynamic)]
                         } deriving (Show)
mass :: Molecule -> Maybe Float
mass m = case lookup "mass" (property m) of
  Nothing -> Nothing
  Just i -> fromDynamic i
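A self-contained sketch of the Data.Dynamic approach. To keep it runnable without the graph library, the graph field is left out here; that omission, the sample molecule water and the helper comment are assumptions made for illustration.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)

-- Simplified Molecule: the graph field is omitted in this sketch.
data Molecule = Molecule
  { name     :: Maybe String
  , property :: [(String, Dynamic)]
  } deriving (Show)

-- fromDynamic returns Nothing when the stored type is not the one asked for.
mass :: Molecule -> Maybe Float
mass m = lookup "mass" (property m) >>= fromDynamic

-- Hypothetical second accessor with a different property type.
comment :: Molecule -> Maybe String
comment m = lookup "comment" (property m) >>= fromDynamic

water :: Molecule
water = Molecule
  { name     = Just "water"
  , property = [ ("mass", toDyn (18.0 :: Float))
               , ("comment", toDyn "solvent")
               ]
  }

main :: IO ()
main = do
  print (mass water)
  print (comment water)
```

The type annotation on 18.0 matters: without it the literal would default to Double, and mass would then return Nothing because the stored type would not match Float.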
You could also get rid of the "stringly-typed" (String,a) pairs, say:
-- in Molecule:
-- property :: [Dynamic]
data Mass = Mass Float
mass :: Molecule -> Maybe Mass
mass m = ...
Neither of these attempts gives much type safety over just parsing out of (String,String) pairs since there is no way to enforce the invariant that the user creates well-formed properties (short of wrapping properties in a new type and hiding the constructors in another module, which again breaks extensibility).
What you might want are OCaml-style polymorphic variants. You could look at Vinyl, which provides type-safe extensible records.
As an aside, you might want to get rid of the Maybe wrapper around the list of properties, since the empty list already encodes the case of no properties.
You might want to look at Data.Dynamic for a pseudo-dynamic typing solution.

Why doesn't GHC Haskell support overloaded record parameter names?

What I am talking about is that it is not possible to define:
data A = A {name :: String}
data B = B {name :: String}
I know that the GHC just desugars this to plain functions and the idiomatic way to solve this would be:
data A = A {aName :: String}
data B = B {bName :: String}
class Name a where
  name :: a -> String
instance Name A where
  name = aName
instance Name B where
  name = bName
After having written this out I don't like it that much ... couldn't this typeclassing be part of the desugaring process?
The thought came to me when I was writing some Aeson JSON parsing. Where it would have been easy to simply derive the FromJSON instances for every data type, I instead had to write everything out by hand (currently >1k lines and counting).
Having names like name or simply value in a data record is not that uncommon.
http://www.haskell.org/haskellwiki/Performance/Overloading mentions that function overloading introduces some runtime overhead. But I actually don't see why the compiler wouldn't be able to resolve this at compile time and give them different names internally.
This SO question from 2012 more or less states historical reasons and points to a mail thread from 2006. Has anything changed recently?
Even if there were some runtime overhead, most people wouldn't mind, because most code is hardly performance critical.
Is there some hidden language extension that actually allows this? Again I am not sure ... but I think Idris actually does this?
Many, mostly minor reasons. One is the problem raised by a better answer, overloading just on the first argument is insufficient to handle all the useful cases.
You could "desugar"
data A = A { name :: String }
data B = B { name :: Text }
into
class Has'name a b | a -> b where
  name :: a -> b
data A = A { aName :: String }
instance Has'name A String where
  name = aName
data B = B { bName :: Text }
instance Has'name B Text where
  name = bName
but that would require GHC extensions (Functional Dependencies) that haven't made it into the standard yet. It would preclude using just 'name' for record creation, updates, and pattern matching (view patterns might help there), since 'name' isn't "just" a function in those cases. You can probably pull off something very similar with Template Haskell.
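A compilable sketch of this hand-written desugaring. Two liberties are taken and should be read as assumptions: the class is called HasName, and B's field is an Int instead of Text so that the example needs nothing beyond base.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances #-}

-- The dependency a -> b says the record type determines the field type.
class HasName a b | a -> b where
  name :: a -> b

data A = A { aName :: String }
data B = B { bName :: Int }  -- Int stands in for Text in this sketch

instance HasName A String where
  name = aName

instance HasName B Int where
  name = bName

main :: IO ()
main = do
  putStrLn (name (A "Betty"))
  -- The functional dependency lets GHC infer that this result is an Int.
  print (name (B 7))
```

As the answer notes, name is now only a function: construction, update and pattern matching still have to use aName and bName.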
Using the record syntax
data A = A { name :: String }
implicitly defines a function
name :: A -> String
If we define both A and B with a { name :: String } field, we have conflicting type definitions for name:
name :: A -> String
name :: B -> String
It's not clear how your proposed implicit type classes would work because if we define two types
data A = A { name :: String }
data B = B { name :: Text }
then we have just shifted the problem to conflicting type class definitions:
class Has'name a where
  name :: a -> String
class Has'name a where
  name :: a -> Text
In principle this could be resolved one way or another, but this is just one of several tricky conflicting desirable properties for records. When Haskell was defined, it was decided that it was better to have simple if limited support rather than to try to design something more ambitious and complicated. Several improvements to records have been discussed at various times and there are perennial discussions, e.g. this Haskell Cafe thread. Perhaps something will be worked out for Haskell Prime.
The best way I found, is to use a preprocessor to solve this definitely rather stupid problem.
Haskell and GHC make this easy, because the whole Haskell parser is available as a normal library. You could parse all the files, apply a renaming scheme (e.g. turn « data A = A { name :: String } » and « let a = A "Betty" in name a » into « data A = A { aName :: String } » and « let a = A "Betty" in aName a »), choosing the new name from the type of the value that name is applied to (using the type resolver), and write the results out for compilation.
But honestly, that should be integrated into GHC. You’re right: It’s silly that this isn’t included.
