Redundancy regarding product types and tuples in Haskell - haskell

In Haskell you have product types and you have tuples.
You use tuples if you don't want to associate a dedicated type with the value, and you can use product types if you wish to do so.
However I feel there is redundancy in the notation of product types
data Foo = Foo (String, Int, Char)
data Bar = Bar String Int Char
Why are there both kinds of notations? Is there any case where you would prefer one the other?
I guess you can't use record notation when using tuples, but that's just a convenience problem. Another thing might be the notion of order in tuples, as opposed to product types, but I think that's just due to the naming of the functions fst and snd.

#chi's answer is about the technical differences in terms of Haskell's evaluation model. I hope to give you some insight into the philosophy of this sort of typed programming.
In category theory we generally work with objects "up to isomorphism". Your Bar is of course isomorphic to (String, Int, Char), so from a categorical perspective they're the same thing.
bar_tuple :: Iso' Bar (String, Int, Char)
bar_tuple = iso to from
where to (Bar s i c) = (s, i, c)
from (s, i, c) = Bar s i c
In some sense tuples are a Platonic form of product type, in that they have no meaning beyond being a collection of disparate values. All the other product types can be mapped to and from a plain old tuple.
So why not use tuples everywhere, when all Haskell types ultimately boil down to a sum of products? It's about communication. As Martin Fowler says,
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
Names are important! Writing down a custom product type like
data Customer = Customer { name :: String, address :: String }
imbues the type Customer with meaning to the person reading the code, unlike (String, String) which just means "two strings".
Custom types are particularly useful when you want to enforce invariants by hiding the representation of your data and using smart constructors:
newtype NonEmpty a = NonEmpty [a]
nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty [] = Nothing
nonEmpty xs = Just (NonEmpty xs)
Now, if you don't export the NonEmpty constructor, you can force people to go through the nonEmpty smart constructor. If someone hands you a NonEmpty value you may safely assume that it has at least one element.
You can of course represent Customer as a tuple under the hood and expose evocatively-named field accessors,
newtype Customer = Bar (String, String)
name, address :: Customer -> String
name (Customer (n, a)) = n
address (Customer (n, a)) = a
but this doesn't really buy you much, except that it's now cheaper to convert Customer to a tuple (if, say, you're writing performance-sensitive code that works with a tuple-oriented API).
If your code is intended to solve a particular problem - which of course is the whole point of writing code - it pays to not just solve the problem, but make it look like you've solved it too. Someone - maybe you in a couple of years - is going to have to read this code and understand it with no a priori knowledge of how it works. Custom types are a very important communication tool in this regard.

The type
data Foo = Foo (String, Int, Char)
represents a double-lifted tuple. It values comprise
undefined
Foo undefined
Foo (undefined, undefined, undefined)
etc.
This is usually troublesome. Because of this, it's rare to see such definitions in actual code. We either have plain data types
data Foo = Foo String Int Char
or newtypes
newtype Foo = Foo (String, Int, Char)
The newtype can be just as inconvenient to use, but at least it
does not double-lift the tuple: undefined and Foo undefined are now equal values.
The newtype also provides zero-cost conversion between a plain tuple and Foo, in both directions.
You can see such newtypes in use e.g. when the programmer needs a different instance for some type class, than the one already associated with the tuple. Or, perhaps, it is used in a "smart constructor" idiom.

I would not expect the pattern used in Foo to be frequent. There is slight difference in what the constructor acts like: Foo :: (String, Int, Char) -> Foo as opposed to Bar :: String -> Int -> Char -> Bar. Then Foo undefined and Foo (undefined, ..., ...) are strictly speaking different things, whereas you miss one level of undefinedness in Bar.

Related

What does a stand for in a data type declaration?

Normally when using type declarations we do:
function_name :: Type -> Type
However in an exercise I am trying to solve there is the following structure:
function_name :: Type a -> Type a
or explicitly as in the exercise
alphabet :: DFA a -> Alphabet a
alphabet = undefined
What does a stand for?
Short answer: it's a type variable.
At the computation level, the way we define functions is to use variables to refer to their arguments. Like this:
f x = x + 3
Here x is a variable, and its value will be chosen when the function is called. Haskell has a similar (but not identical...) mechanism in its type sublanguage. For example, you can write things like:
type F x = (x, Int, x)
type Endo a = a -> a -> a
Here again x is a variable in the first one (and a in the second), and its value will be chosen at use sites. One can also use this mechanism when defining new types. (The previous two examples just give new names to existing types, but the following does more.) One of the most basic nontrivial examples of this is the Maybe family of types:
data Maybe a = Nothing | Just a
The things on the right of the = are computation-level, so you can mostly ignore them for now, but on the left we are declaring a new family of types Maybe which accepts other types as an argument. For example, Maybe Int, Maybe (Bool, String), Maybe (Endo Char), and even passing in expressions that have variables like Maybe (x, Int, x) are all possible.
Syntactically, type constructors (things which are defined as part of the program text and that we expect the compiler to look up the definition for) start with an upper case letter and type variables (things which will be instantiated later and so don't currently have a concrete definition) start with lower case letters.
So, in the type signature you showed:
alphabet :: DFA a -> Alphabet a
I suspect there are actually two constructs new to you, not just one: first, the type variable a that you asked about, and second, the concept of type application, where we apply at the type level one "function-like" type to another. (Outside of this answer, people say "parameterized" instead of "function-like".)
...and, believe it or not, there is even a type system for types that makes sure you don't write things like these:
Int a -- Int is not parameterized, so shouldn't be applied to arguments
Int Char -- ditto
Maybe -> String -- Maybe is parameterized, so should be applied to
-- arguments, but isn't

Why aren't there existentially quantified type variables in GHC Haskell

There are universally quantified type variables, and there are existentially quantified data types. However, despite that people give pseudocode of the form exists a. Int -> a to help explain concepts sometimes, it doesn't seem like a compiler extension that there's any real interest in. Is this just a "there isn't much value in adding this" kind of thing (because it does seem valuable to me), or is there a problem like undecidability that's makes it truly impossible.
EDIT:
I've marked viorior's answer as correct because it seems like it is probably the actual reason why this was not included. I'd like to add some additional commentary though just in case anyone would want to help clarify this more.
As requested in the comments, I'll give an example of why I would consider this useful. Suppose we have a data type as follows:
data Person a = Person
{ age: Int
, height: Double
, weight: Int
, name: a
}
So we choose parameterize over a, which is a naming convention (I know that it probably makes more sense in this example to make a NamingConvention ADT with appropriate data constructors for the American "first,middle,last", the hispanic "name,paternal name,maternal name", etc. But for now, just go with this).
So, there are several functions we see that basically ignore the type that Person is parameterized over. Examples would be
age :: Person a -> Int
height :: Person a -> Double
weight :: Person a -> Int
And any function built on top of these could similarly ignore the a type. For example:
atRiskForDiabetes :: Person a -> Bool
atRiskForDiabetes p = age p + weight p > 200
--Clearly, I am not actually a doctor
Now, if we have a heterogeneous list of people (of type [exists a. Person a]), we would like to be able to map some of our functions over the list. Of course, there are some useless ways to map:
heteroList :: [exists a. Person a]
heteroList = [Person 20 30.0 170 "Bob Jones", Person 50 32.0 140 3451115332]
extractedNames = map name heteroList
In this example, extractedNames is of course useless because it has type [exists a. a]. However, if we use our other functions:
totalWeight :: [exists a. Person a] -> Int
totalWeight = sum . map age
numberAtRisk :: [exists a. Person a] -> Int
numberAtRisk = length . filter id . map atRiskForDiabetes
Now, we have something useful that operates over a heterogeneous collection (And, we didn't even involve typeclasses). Notice that we were able to reuse our existing functions. Using an existential data type would go as follows:
data SomePerson = forall a. SomePerson (Person a) --fixed, thanks viorior
But now, how can we use age and atRiskForDiabetes? We can't. I think that you would have to do something like this:
someAge :: SomePerson -> Int
someAge (SomePerson p) = age p
Which is really lame because you have to rewrite all of your combinators for a new type. It gets even worse if you want to do this with a data type that's parameterized over several type variables. Imagine this:
somewhatHeteroPipeList :: forall a b. [exists c d. Pipe a b c d]
I won't explain this line of thought any further, but just notice that you'd be rewriting a lot of combinators to do anything like this using just existential data types.
That being said, I hope I've give a mildly convincing use that this could be useful. If it doesn't seem useful (or if the example seems too contrived), feel free to let me know. Also, since I am firstly a programmer and have no training in type theory, it's a little difficult for me to see how to use Skolem's theorum (as posted by viorior) here. If anyone could show me how to apply it to the Person a example I gave, I would be very grateful. Thanks.
It is unnecessary.
By Skolem's Theorem we could convert existential quantifier into universal quantifier with higher rank types:
(∃b. F(b)) -> Int <===> ∀b. (F(b) -> Int)
Every existentially quantified type of rank n+1 can be encoded as a universally quantified type of rank n
Existentially quantified types are available in GHC, so the question is predicated on a false assumption.

Why does Haskell not have records with structural typing?

I have heard Haskell described as having structural typing. Records are an exception to that though as I understand. For example foo cannot be called with something of type HRec2 even though HRec and HRec2 are only nominally different in their fields.
data HRec = HRec { x :: Int, y :: Bool }
data HRec2 = HRec2 { p :: Int, q :: Bool }
foo :: HRec -> Bool
Is there some explanation for rejecting extending structural typing to everything including records?
Are there statically typed languages with structural typing even for records? Is there maybe some debate on this I can read about for all statically typed languages in general?
Haskell has structured types, but not structural typing, and that's not likely to change.*
The refusal to permit nominally different but structurally similar types as interchangeable arguments is called type safety. It's a good thing. Haskell even has a newtype declaration to provide types which are only nominally different, to allow you to enforce more type safety. Type safety is an easy way to catch bugs early rather than permit them at runtime.
In addition to amindfv's good answer which includes ad hoc polymorphism via typeclasses (effectively a programmer-declared feature equivalence), there's parametric polymorphism where you allow absolutely any type, so [a] allows any type in your list and BTree a allows any type in your binary tree.
This gives three answers to "are these types interchangeable?".
No; the programmer didn't say so.
Equivalent for a specific purpose because the programmer said so.
Don't care - I can do the same thing to this collection of data because it doesn't use any property of the data itself.
There's no 4: compiler overrules programmer because they happened to use a couple of Ints and a String like in that other function.
*I said Haskell is unlikely to change to structural typing. There is some discussion to introduce some form of extensible records, but no plans to make (Int,(Int,Int)) count as the same as (Int, Int, Int) or Triple {one::Int, two::Int, three::Int} the same as Triple2 {one2::Int, two2::Int, three2::Int}.
Haskell records aren't really "less structural" than the rest of the type system. Every type is either completely specified, or "specifically vague" (i.e. defined with a typeclass).
To allow both HRec and HRec2 to f, you have a couple of options:
Algebraic types:
Here, you define HRec and HRec2 records to both be part of the HRec type:
data HRec = HRec { x :: Int, y :: Bool }
| HRec2 { p :: Int, q :: Bool }
foo :: HRec -> Bool
(alternately, and maybe more idiomatic:)
data HRecType = Type1 | Type2
data HRec = HRec { hRecType :: HRecType, x :: Int, y :: Bool }
Typeclasses
Here, you define foo as able to accept any type as input, as long as a typeclass instance has been written for that type:
data HRec = HRec { x :: Int, y :: Bool }
data HRec2 = HRec2 { p :: Int, q :: Bool }
class Flexible a where
foo :: a -> Bool
instance Flexible HRec where
foo (HRec a _) = a == 5 -- or whatever
instance Flexible HRec2 where
foo (HRec2 a _) = a == 5
Using typeclasses allows you to go further than regular structural typing -- you can accept types that have the necessary information embedded in them, even if the types don't superficially look similar, e.g.:
data Foo = Foo { a :: String, b :: Float }
data Bar = Bar { c :: String, d :: Integer }
class Thing a where
doAThing :: a -> Bool
instance Thing Foo where
doAThing (Foo x y) = (x == "hi") && (y == 0)
instance Thing Bar where
doAThing (Bar x y) = (x == "hi") && ((fromInteger y) == 0)
We can run fromInteger (or any arbitrary function) to get the data we need from what we have!
I'm aware of two library implementations of structurally typed records in Haskell:
HList is older, and is explained in an excellent paper: Haskell's overlooked object system (free online, but SO won't let me include more links)
vinyl is newer, and uses fancy new GHC features. There's at least one library, vinyl-gl, using it.
I cannot answer the language-design part of your question, though.
To answer your last question, Go and Scalas definitely have structural typing. Some people (including me) would call that "statically unsafe typing", since it implicitly declares all samely- named methods in a program to have the same semantics, which implies "spooky action at a distance", relating code in on source file to code in some library that the program has never seen.
IMO, better to require that the same-named methods to explicitly declare that they conform to a named semantic "model" of behavior.
Yes, the compiler would guarantee that the method is callable, but it isn't much safer than saying:
f :: [a] -> Int
And letting the compiler choose an arbitrary implementation that may or may not be length.
(A similar idea can be made safe with Scala "implicits" or Haskell (GHC?) "reflection" package.)

How to define a class that allows uniform access to different records in Haskell?

I have two records that both have a field I want to extract for display. How do I arrange things so they can be manipulated with the same functions? Since they have different fields (in this case firstName and buildingName) that are their name fields, they each need some "adapter" code to map firstName to name. Here is what I have so far:
class Nameable a where
name :: a -> String
data Human = Human {
firstName :: String
}
data Building = Building {
buildingName :: String
}
instance Nameable Human where
name x = firstName x
instance Nameable Building where
-- I think the x is redundant here, i.e the following should work:
-- name = buildingName
name x = buildingName x
main :: IO ()
main = do
putStr $ show (map name items)
where
items :: (Nameable a) => [a]
items = [ Human{firstName = "Don"}
-- Ideally I want the next line in the array too, but that gives an
-- obvious type error at the moment.
--, Building{buildingName = "Empire State"}
]
This does not compile:
TypeTest.hs:23:14:
Couldn't match expected type `a' against inferred type `Human'
`a' is a rigid type variable bound by
the type signature for `items' at TypeTest.hs:22:23
In the expression: Human {firstName = "Don"}
In the expression: [Human {firstName = "Don"}]
In the definition of `items': items = [Human {firstName = "Don"}]
I would have expected the instance Nameable Human section would make this work. Can someone explain what I am doing wrong, and for bonus points what "concept" I am trying to get working, since I'm having trouble knowing what to search for.
This question feels similar, but I couldn't figure out the connection with my problem.
Consider the type of items:
items :: (Nameable a) => [a]
It's saying that for any Nameable type, items will give me a list of that type. It does not say that items is a list that may contain different Nameable types, as you might think. You want something like items :: [exists a. Nameable a => a], except that you'll need to introduce a wrapper type and use forall instead. (See: Existential type)
{-# LANGUAGE ExistentialQuantification #-}
data SomeNameable = forall a. Nameable a => SomeNameable a
[...]
items :: [SomeNameable]
items = [ SomeNameable $ Human {firstName = "Don"},
SomeNameable $ Building {buildingName = "Empire State"} ]
The quantifier in the data constructor of SomeNameable basically allows it to forget everything about exactly which a is used, except that it is Nameable. Therefore, you will only be allowed to use functions from the Nameable class on the elements.
To make this nicer to use, you can make an instance for the wrapper:
instance Nameable (SomeNameable a) where
name (SomeNameable x) = name x
Now you can use it like this:
Main> map name items
["Don", "Empire State"]
Everybody is reaching for either existential quantification or algebraic data types. But these are both overkill (well depending on your needs, ADTs might not be).
The first thing to note is that Haskell has no downcasting. That is, if you use the following existential:
data SomeNameable = forall a. Nameable a => SomeNameable a
then when you create an object
foo :: SomeNameable
foo = SomeNameable $ Human { firstName = "John" }
the information about which concrete type the object was made with (here Human) is forever lost. The only things we know are: it is some type a, and there is a Nameable a instance.
What is it possible to do with such a pair? Well, you can get the name of the a you have, and... that's it. That's all there is to it. In fact, there is an isomorphism. I will make a new data type so you can see how this isomorphism arises in cases when all your concrete objects have more structure than the class.
data ProtoNameable = ProtoNameable {
-- one field for each typeclass method
protoName :: String
}
instance Nameable ProtoNameable where
name = protoName
toProto :: SomeNameable -> ProtoNameable
toProto (SomeNameable x) = ProtoNameable { protoName = name x }
fromProto :: ProtoNameable -> SomeNameable
fromProto = SomeNameable
As we can see, this fancy existential type SomeNameable has the same structure and information as ProtoNameable, which is isomorphic to String, so when you are using this lofty concept SomeNameable, you're really just saying String in a convoluted way. So why not just say String?
Your items definition has exactly the same information as this definition:
items = [ "Don", "Empire State" ]
I should add a few notes about this "protoization": it is only as straightforward as this when the typeclass you are existentially quantifying over has a certain structure: namely when it looks like an OO class.
class Foo a where
method1 :: ... -> a -> ...
method2 :: ... -> a -> ...
...
That is, each method only uses a once as an argument. If you have something like Num
class Num a where
(+) :: a -> a -> a
...
which uses a in multiple argument positions, or as a result, then eliminating the existential is not as easy, but still possible. However my recommendation to do this changes from a frustration to a subtle context-dependent choice, because of the complexity and distant relationship of the two representations. However, every time I have seen existentials used in practice it is with the Foo kind of tyepclass, where it only adds needless complexity, so I quite emphatically consider it an antipattern. In most of these cases I recommend eliminating the entire class from your codebase and exclusively using the protoized type (after you give it a good name).
Also, if you do need to downcast, then existentials aren't your man. You can either use an algebraic data type, as others people have answered, or you can use Data.Dynamic (which is basically an existential over Typeable. But don't do that; a Haskell programmer resorting to Dynamic is ungentlemanlike. An ADT is the way to go, where you characterize all the possible types it could be in one place (which is necessary so that the functions that do the "downcasting" know that they handle all possible cases).
I like #hammar's answer, and you should also check out this article which provides another example.
But, you might want to think differently about your types. The boxing of Nameable into the SomeNameable data type usually makes me start thinking about whether a union type for the specific case is meaningful.
data Entity = H Human | B Building
instance Nameable Entity where ...
items = [H (Human "Don"), B (Building "Town Hall")]
I'm not sure why you want to use the same function for
getting the name of a Human and the name of a Building.
If their names are used in fundamentally different ways,
except maybe for simple things like printing them,
then you probably want two
different functions for that. The type system
will automatically guide you to choose the right function
to use in each situation.
But if having a name is something significant about the
whole purpose of your program, and a Human and a Building
are really pretty much the same thing in that respect as far as your program
is concerned, then you would define their type together:
data NameableThing =
Human { name :: String } |
Building { name :: String }
That gives you a polymorphic function name that works for
whatever particular flavor of NameableThing you happen to have,
without needing to get into type classes.
Usually you would use a type class for a different kind of situation:
if you have some kind of non-trivial operation that has the same purpose
but a different implementation for several different types.
Even then, it's often better to use some other approach instead, like
passing a function as a parameter (a "higher order function", or "HOF").
Haskell type classes are a beautiful and powerful tool, but they are totally
different than what is called a "class" in object-oriented languages,
and they are used far less often.
And I certainly don't recommend complicating your program by using an advanced
extension to Haskell like Existential Qualification just to fit into
an object-oriented design pattern.
You can try to use Existentially Quanitified types and do it like this:
data T = forall a. Nameable a => MkT a
items = [MkT (Human "bla"), MkT (Building "bla")]
I've just had a look at the code that this question is abstracting from. For this, I would recommend merging the Task and RecurringTaskDefinition types:
data Task
= Once
{ name :: String
, scheduled :: Maybe Day
, category :: TaskCategory
}
| Recurring
{ name :: String
, nextOccurrence :: Day
, frequency :: RecurFrequency
}
type ProgramData = [Task] -- don't even need a new data type for this any more
Then, the name function works just fine on either type, and the functions you were complaining about like deleteTask and deleteRecurring don't even need to exist -- you can just use the standard delete function as usual.

What's the recommended way of handling complexly composed POD(plain-old-data in OO) in Haskell?

I'm a Haskell newbie.
In statically typed OO languages (for instance, Java), all complex data structures are presented as class and instances. An object can have many attributes (fields). And another object can be a value of the field. Those fields can be accessed with their names, and statically typed by class. Finally, those objects construct huge graph of object which linked each other. Most program uses data graph like this.
How can I archive these functionality in Haskell?
If you really do have data without behavior, this maps nicely to a Haskell record:
data Person = Person { name :: String
, address :: String }
deriving (Eq, Read, Show)
data Department = Management | Accounting | IT | Programming
deriving (Eq, Read, Show)
data Employee = Employee { identity :: Person
, idNumber :: Int
, department :: Department }
| Contractor { identity :: Person
, company :: String }
deriving (Eq, Read, Show)
This says that a Person is a Person who has a name and address (both Strings); a Department is either Management, Accounting, IT, or Programming; and an Employee is either an Employee who has an identity (a Person), an idNumber (an Int), and a department (a Department), or is a Contractor who has an identity (a Person) and a company (a String). The deriving (Eq, Read, Show) lines enable you to compare these objects for equality, read them in, and convert them to strings.
In general, a Haskell data type is a combination of unions (also called sums) and tuples (also called products).1 The |s denote choice (a union): an Employee is either an Employee or a Contractor, a Department is one of four things, etc. In general, tuples are written something like the following:
data Process = Process String Int
This says that Process (in addition to being a type name) is a data constructor with type String -> Int -> Process. Thus, for instance, Process "init" 1, or Process "ls" 57300. A Process has to have both a String and an Int to exist. The record notation used above is just syntactic sugar for these products; I could also have written data Person = Person String String, and then defined
name :: Person -> String
name (Person n _) = n
address :: Person -> String
address (Person _ a) = a
Record notation, however, can be nice for complex data structures.
Also note that you can parametrize a Haskell type over other types; for instance, a three-dimensional point could be data Point3 a = Point3 a a a. This means that Point3 :: a -> a -> a -> Point3 a, so that one could write Point3 (3 :: Int) (4 :: Int) (5 :: Int) to get a Point3 Int, or Point3 (1.1 :: Double) (2.2 :: Double) (3.3 :: Double) to get a Point3 Double. (Or Point3 1 2 3 to get a Num a => Point3 a, if you've seen type classes and overloaded numeric literals.)
This is what you need to represent a data graph. However, take note: one problem for people transitioning from imperative languages to functional ones—or, really, between any two different paradigms (C to Python, Prolog to Ruby, Erlang to Java, whatever)—is to continue to try to solve problems the old way. The solution you're trying to model may not be constructed in a way amenable to easy functional programming techniques, even if the problem is. For instance, in Haskell, thinking about types is very important, in a way that's different from, say, Java. At the same time, implementing behaviors for those types is done very differently: higher-order functions capture some of the abstractions you've seen in Java, but also some which aren't easily expressible (map :: (a -> b) -> [a] -> [b], filter :: (a -> Bool) -> [a] -> [a], and foldr :: (a -> b -> b) -> b -> [a] -> b come to mind). So keep your options open, and consider addressing your problems in a functional way. Of course, maybe you are, in which case, full steam ahead. But do keep this in mind as you explore a new language. And have fun :-)
1: And recursion: you can represent a binary tree, for instance, with data Tree a = Leaf a | Branch a (Tree a) (Tree a).
Haskell has algebraic data types, which can describe structures or unions of structures such that something of a given type can hold one of a number of different sets of fields. These fields can set and accessed both positionally or via names with record syntax.
See here: http://learnyouahaskell.com/making-our-own-types-and-typeclasses

Resources