Confusion about "type" and "data" in Haskell

data MoneyAmount = Amount Float Currency
  deriving (Show, Eq)
data Currency = EUR | GBP | USD | CHF
  deriving (Show, Eq)
type Account = (Integer, MoneyAmount)
putAmount :: MoneyAmount -> Account -> Account
putAmount mon acc = undefined
I need to write a function that adds money to an account (and reports an error if the money added is in a different currency from the account's).
I know how to create an Amount:
let moni = Amount 6.6 EUR
but I have no idea what to write to create an Account. I also don't know how to take the input apart so that I can add the amount to the account.
I've tried things like
let acc = Account 1 moni
My question is more how to manipulate the Account so I can write the function.

type creates a type synonym; an Account is exactly the same as an (Integer, MoneyAmount), and you write it the same way:
let acc = (1, moni)

A type is just an alias. It doesn't define a new type but instead a new name for an existing type. So you could do
type Money = Float
And you can use Money wherever you can use a Float, and vice versa. If you had
foo :: Float -> Float
foo x = 2 * x
Then
> foo (1 :: Float)
2.0
> foo (1 :: Money)
2.0
Both work fine. In your case, Account is just an alias for (Integer, MoneyAmount), so you would construct one just as you would any other tuple.
A data defines an entirely new type, and this requires new constructors. For example:
data Bool = False | True
defines the Bool type with the constructors False and True. A more complicated example would be
data Maybe a = Nothing | Just a
which defines the Maybe a polymorphic type with constructors Nothing :: Maybe a and Just :: a -> Maybe a. I've included the types of these constructors to highlight that they exist as normal values and functions. The difference between a function and a constructor is that you can do anything you want in a function, but a constructor is only allowed to take existing values and make a value of another type without performing any transformations to it. Constructors are just wrappers around values.
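Putting the two answers together, here is a minimal sketch of the function from the question. It takes the Account tuple apart by pattern matching and rebuilds it. The name putAmountSafe and the Either String result are assumptions made for illustration; the question's own signature returns a plain Account, in which case the mismatch branch could call error instead.

```haskell
data Currency = EUR | GBP | USD | CHF
  deriving (Show, Eq)

data MoneyAmount = Amount Float Currency
  deriving (Show, Eq)

-- A synonym: an Account is just a tuple, so it is built and matched as one.
type Account = (Integer, MoneyAmount)

-- Hypothetical variant of putAmount that reports a currency mismatch
-- as a Left value instead of crashing.
putAmountSafe :: MoneyAmount -> Account -> Either String Account
putAmountSafe (Amount x cur) (accId, Amount balance accCur)
  | cur == accCur = Right (accId, Amount (balance + x) accCur)
  | otherwise     = Left ("currency mismatch: account holds "
                          ++ show accCur ++ ", got " ++ show cur)
```

For example, putAmountSafe (Amount 5 EUR) (1, Amount 10 EUR) yields a Right with the updated tuple, while adding USD to that account yields a Left with the message.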

Related

Clarifying Data Constructor in Haskell

In the following:
data DataType a = Data a | Datum
I understand that data constructors are value-level functions, and that the declaration above defines their types. They can be functions of any arity, or constants. That's fine. I'm OK with saying that Datum constructs Datum. What is not so clear to me is the difference between the constructor function and what it produces. Please let me know if I have understood this correctly:
1 - a) Basically, is writing Data a defining both a data structure and its constructor function (as in Scala or Java, where the class and the constructor usually have the same name)?
2 - b) So, if I unpack this and make an analogy: with Data a we are defining both a structure of object (I don't want to say class, because I think a class already implies a type, but maybe we could), i.e. a data structure, and the constructor function (data constructor/value constructor), and the latter returns an object with that structure. Finally, the type of that structure of object is given by the type constructor. An object structure, in a sense, is just a tag surrounding a bunch of values of some type. Is my understanding correct?
3 - c) Can I formally say:
Data constructors that are nullary represent constant values -> they return the constant value itself, whose type is given by the type constructor at the definition site.
Data constructors that take an argument represent classes of values, where the class is a tag? -> they return an infinite number of objects of that class, whose type is given by the type constructor at the definition site.
Another way of writing this:
data DataType a = Data a | Datum
Is with generalised algebraic data type (GADT) syntax, using the GADTSyntax extension, which lets us specify the types of the constructors explicitly:
{-# LANGUAGE GADTSyntax #-}
data DataType a where
  Data :: a -> DataType a
  Datum :: DataType a
(The GADTs extension would work too; it would also allow us to specify constructors with different type arguments in the result, like DataType Int vs. DataType Bool, but that’s a more advanced topic, and we don’t need that functionality here.)
These are exactly the types you would see in GHCi if you asked for the types of the constructor functions with :type / :t:
> :{
| data DataType a where
|   Data :: a -> DataType a
|   Datum :: DataType a
| :}
> :type Data
Data :: a -> DataType a
> :t Datum
Datum :: DataType a
With ExplicitForAll we can also specify the scope of the type variables explicitly, and make it clearer that the a in the data definition is a separate variable from the a in the constructor definitions by also giving them different names:
data DataType a where
  Data :: forall b. b -> DataType b
  Datum :: forall c. DataType c
Some more examples of this notation with standard prelude types:
data Either a b where
  Left :: forall a b. a -> Either a b
  Right :: forall a b. b -> Either a b
data Maybe a where
  Nothing :: Maybe a
  Just :: a -> Maybe a
data Bool where
  False :: Bool
  True :: Bool
data Ordering where
  LT, EQ, GT :: Ordering -- Shorthand for repeated ‘:: Ordering’
I understand that data constructors are value-level functions, and that the declaration above defines their types. They can be functions of any arity, or constants. That's fine. I'm OK with saying that Datum constructs Datum. What is not so clear to me is the difference between the constructor function and what it produces.
Datum and Data are both “constructors” of DataType a values; neither Datum nor Data is a type! These are just “tags” that select between the possible varieties of a DataType a value.
What is produced is always a value of type DataType a for a given a; the constructor selects which “shape” it takes.
A rough analogue of this is a union in languages like C or C++, plus an enumeration for the “tag”. In pseudocode:
enum Tag {
  DataTag,
  DatumTag,
}
// A single anonymous field.
struct DataFields<A> {
  A field1;
}
// No fields.
struct DatumFields<A> {};
// A union of the possible field types.
union Fields<A> {
  DataFields<A> data;
  DatumFields<A> datum;
}
// A pair of a tag with the fields for that tag.
struct DataType<A> {
  Tag tag;
  Fields<A> fields;
}
The constructors are then just functions returning a value with the appropriate tag and fields. Pseudocode:
<A> DataType<A> newData(A x) {
  DataType<A> result;
  result.tag = DataTag;
  result.fields.data.field1 = x;
  return result;
}
<A> DataType<A> newDatum() {
  DataType<A> result;
  result.tag = DatumTag;
  // No fields.
  return result;
}
Unions are unsafe, since the tag and fields can get out of sync, but sum types are safe because they couple these together.
A pattern-match like this in Haskell:
case someDT of
  Datum -> f
  Data x -> g x
Is a combination of testing the tag and extracting the fields. Again, in pseudocode:
if (someDT.tag == DatumTag) {
  f();
} else if (someDT.tag == DataTag) {
  var x = someDT.fields.data.field1;
  g(x);
}
Again this is coupled in Haskell to ensure that you can only ever access the fields if you have checked the tag by pattern-matching.
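As a small Haskell counterpart to the pseudocode, the sketch below wraps that case expression in a reusable function. The names dataType and describe are hypothetical, introduced only for this illustration. The field x is in scope only in the Data branch, which is the coupling of tag check and field access described above.

```haskell
data DataType a = Data a | Datum
  deriving (Show, Eq)

-- Hypothetical eliminator: one result for the Datum case,
-- one function for the Data case.
dataType :: r -> (a -> r) -> DataType a -> r
dataType onDatum onData dt =
  case dt of
    Datum  -> onDatum   -- no field to read here
    Data x -> onData x  -- x is only available in this branch

-- Example use: turn a DataType Int into a description.
describe :: DataType Int -> String
describe = dataType "no payload" (\n -> "payload " ++ show n)
```

Both describe Datum and describe (Data 42) take an argument of the same type, DataType Int; only the tag differs.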
So, in answer to your questions:
1 - a) Basically, is writing Data a defining both a data structure and its constructor function (as in Scala or Java, where the class and the constructor usually have the same name)?
Data a in your original code is not defining a data structure, in that Data is not a separate type from DataType a, it’s just one of the possible tags that a DataType a value may have. Internally, a value of type DataType Int is one of the following:
The tag for Data (in GHC, a pointer to an “info table” for the constructor), and a reference to a value of type Int.
x = Data (1 :: Int) :: DataType Int
+----------+----------------+ +---------+----------------+
x ---->| Data tag | pointer to Int |---->| Int tag | unboxed Int# 1 |
+----------+----------------+ +---------+----------------+
The tag for Datum, and no other fields.
y = Datum :: DataType Int
+-----------+
y ----> | Datum tag |
+-----------+
In a language with unions, the size of a union is the maximum of all its alternatives, since the type must support representing any of the alternatives with mutation. In Haskell, since values are immutable, they don’t require any extra “padding” since they can’t be changed.
It’s a similar situation for standard data types, e.g., a product or sum type:
(x :: X, y :: Y) :: (X, Y)
+---------+--------------+--------------+
| (,) tag | pointer to X | pointer to Y |
+---------+--------------+--------------+
Left (m :: M) :: Either M N
+-----------+--------------+
| Left tag | pointer to M |
+-----------+--------------+
Right (n :: N) :: Either M N
+-----------+--------------+
| Right tag | pointer to N |
+-----------+--------------+
2 - b) So, if I unpack this and make an analogy: with Data a we are defining both a structure of object (I don't want to say class, because I think a class already implies a type, but maybe we could), i.e. a data structure, and the constructor function (data constructor/value constructor), and the latter returns an object with that structure. Finally, the type of that structure of object is given by the type constructor. An object structure, in a sense, is just a tag surrounding a bunch of values of some type. Is my understanding correct?
This is sort of correct, but again, the constructors Data and Datum aren’t “data structures” by themselves. They’re just the names used to introduce (construct) and eliminate (match) values of type DataType a, for some type a that is chosen by the caller of the constructors to fill in the forall.
data DataType a = Data a | Datum says:
If some term e has type T, then the term Data e has type DataType T
Inversely, if some value of type DataType T matches the pattern Data x, then x has type T in the scope of the match (case branch or function equation)
The term Datum has type DataType T for any type T
3 - c) Can I formally say:
Data constructors that are nullary represent constant values -> they return the constant value itself, whose type is given by the type constructor at the definition site.
Data constructors that take an argument represent classes of values, where the class is a tag? -> they return an infinite number of objects of that class, whose type is given by the type constructor at the definition site.
Not exactly. A type constructor like DataType :: Type -> Type, Maybe :: Type -> Type, or Either :: Type -> Type -> Type, or [] :: Type -> Type (list), or a polymorphic data type, represents an “infinite” family of concrete types (Maybe Int, Maybe Char, Maybe (String -> String), …) but only in the same way that id :: forall a. a -> a represents an “infinite” family of functions (id :: Int -> Int, id :: Char -> Char, id :: String -> String, …).
That is, the type a here is a parameter filled in with an argument value given by the caller. Usually this is implicit, through type inference, but you can specify it explicitly with the TypeApplications extension:
-- Akin to: \ (a :: Type) -> \ (x :: a) -> x
id :: forall a. a -> a
id x = x
id @Int :: Int -> Int
id @Int 1 :: Int
Data :: forall a. a -> DataType a
Data @Char :: Char -> DataType Char
Data @Char 'x' :: DataType Char
The data constructors of each instantiation don’t really have anything to do with each other. There’s nothing in common between the instantiations Data :: Int -> DataType Int and Data :: Char -> DataType Char, apart from the fact that they share the same tag name.
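A compilable sketch of explicit type application on the constructors follows. The binding names intWrapped, charWrapped and emptyBool are made up for the example. Each @T picks the instantiation of a that type inference would otherwise choose.

```haskell
{-# LANGUAGE TypeApplications #-}

data DataType a = Data a | Datum
  deriving (Show, Eq)

-- Data instantiated at Int: Data @Int :: Int -> DataType Int
intWrapped :: DataType Int
intWrapped = Data @Int 1

-- Data instantiated at Char: Data @Char :: Char -> DataType Char
charWrapped :: DataType Char
charWrapped = Data @Char 'x'

-- Even the nullary constructor has a type argument to fill in.
emptyBool :: DataType Bool
emptyBool = Datum @Bool
```

Dropping the @Int, @Char and @Bool annotations gives the same values here, since the signatures already determine a.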
Another way of thinking about this in Java terms is with the visitor pattern. DataType would be represented as a function that accepts a “DataType visitor”, and then the constructors don’t correspond to separate data types, they’re just the methods of the visitor which accept the fields and return some result. Writing the equivalent code in Java is a worthwhile exercise, but here it is in Haskell:
{-# LANGUAGE RankNTypes #-}
-- (Allows passing polymorphic functions as arguments.)
type DataType a
  = forall r.    -- A visitor with a generic result type
     r           -- With one “method” for the ‘Datum’ case (no fields)
  -> (a -> r)    -- And one for the ‘Data’ case (one field)
  -> r           -- Returning the result
newData :: a -> DataType a
newData field = \ _visitDatum visitData -> visitData field
newDatum :: DataType a
newDatum = \ visitDatum _visitData -> visitDatum
Pattern-matching is simply running the visitor:
matchDT :: DataType a -> b -> (a -> b) -> b
matchDT dt visitDatum visitData = dt visitDatum visitData
-- Or: matchDT dt = dt
-- Or: matchDT = id
-- case someDT of { Datum -> f; Data x -> g x }
-- f :: r
-- g :: a -> r
-- someDT :: DataType a
-- :: forall r. r -> (a -> r) -> r
someDT f (\ x -> g x)
Similarly, in Haskell, data constructors are just the ways of introducing and eliminating values of a user-defined type.
What is not so clear to me is the difference between the constructor function and what it produces.
I'm having trouble following your question, but I think you are complicating things. I would suggest not thinking too deeply about the "constructor" terminology.
But hopefully the following helps:
Starting simple:
data DataType = Data Int | Datum
The above reads "Declare a new type named DataType, which has the possible values Datum or Data <some_number> (e.g. Data 42)"
So e.g. Datum is a value of type DataType.
Going back to your example with a type parameter, I want to point out what the syntax is doing:
data DataType a = Data a | Datum
     ^        ^        ^            These things appear in type signatures (type level)
                  ^        ^        These things appear in code (value level stuff)
There's a bit of punning happening here. So in the data declaration you might see "Data Int", which mixes type-level and value-level stuff in a way that you wouldn't see in code. In code you'd see e.g. Data 42 or Data someVal.
I hope that helps a little...

Retrieve hidden type of a phantom type

I declared a phantom type like this in Haskell.
newtype Length (a::UnitLength) b = Length b deriving (Eq,Show)
data UnitLength = Meter
                | KiloMeter
                | Miles
  deriving (Eq,Show)
Now, I would like to write some functions that use this type, but I haven't found a way to see and use the hidden type.
Is it possible to retrieve the hidden type a of the phantom type Length, in order to perform tests, pattern matching, and so on?
If you want a runtime representation of the phantom type you have used, you have to use what we call a singleton. It has precisely one constructor for each of the constructors in UnitLength, and their types say precisely which constructor we are considering:
data SUnitLength (a :: UnitLength) where
  SMeter :: SUnitLength Meter
  SKiloMeter :: SUnitLength KiloMeter
  SMiles :: SUnitLength Miles
Now that you have this you can for instance write a display function picking the right unit abbreviation depending on the phantom parameter:
display :: Show b => SUnitLength a -> Length a b -> String
display sa l = show (payload l) ++
  case sa of
    SMeter     -> "m"
    SKiloMeter -> "km"
    SMiles     -> "mi"
Now, that does not really match your demand: the parameter a is available in the type Length a b but we somehow still have to manufacture the witness by hand. That's annoying. One way to avoid this issue is to define a type class doing that work for us. CUnitLength a tells us that provided a value of type Length a b, we can get a witness SUnitLength a of the shape a has.
class CUnitLength (a :: UnitLength) where
  getUnit :: Length a b -> SUnitLength a
It is easy for us to write instances of CUnitLength for the various UnitLength constructors: getUnit can even ignore its argument!
instance CUnitLength Meter where
  getUnit _ = SMeter
instance CUnitLength KiloMeter where
  getUnit _ = SKiloMeter
instance CUnitLength Miles where
  getUnit _ = SMiles
So why bother with getUnit's argument? Well, if we remove it, getUnit needs to somehow magically guess which a it is supposed to describe. Sometimes it's possible to infer that a based on the expected type at the call site, but sometimes it's not. Having the Length a b argument guarantees that all calls will be unambiguous. We can always recover a simpler getUnit' anyway:
getUnit' :: CUnitLength a => SUnitLength a
getUnit' = getUnit (undefined :: Length a ())
Which leads us to the last definition display' which has the same role as display but does not require the extra argument:
display' :: (CUnitLength a, Show b) => Length a b -> String
display' = display getUnit'
I have put everything (including the LANGUAGE extensions, and the definition of payload to extract a b from Length a b) in a self-contained gist in case you want to play with the code.
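For readers without access to the gist, here is a self-contained sketch of the same idea. The LANGUAGE pragmas, the definition of payload, and the helper unitSuffix are assumptions reconstructed from the description above, not a copy of the gist, and the "mi" abbreviation for Miles is likewise a choice made for this example.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

data UnitLength = Meter | KiloMeter | Miles
  deriving (Eq, Show)

newtype Length (a :: UnitLength) b = Length b
  deriving (Eq, Show)

-- Assumed definition: unwrap the value carried by a Length.
payload :: Length a b -> b
payload (Length b) = b

-- Singleton: one constructor per UnitLength constructor.
data SUnitLength (a :: UnitLength) where
  SMeter     :: SUnitLength 'Meter
  SKiloMeter :: SUnitLength 'KiloMeter
  SMiles     :: SUnitLength 'Miles

-- The class recovers the runtime witness from the phantom parameter.
class CUnitLength (a :: UnitLength) where
  getUnit :: Length a b -> SUnitLength a

instance CUnitLength 'Meter     where getUnit _ = SMeter
instance CUnitLength 'KiloMeter where getUnit _ = SKiloMeter
instance CUnitLength 'Miles     where getUnit _ = SMiles

-- Hypothetical helper: pattern match on the witness to pick a suffix.
unitSuffix :: SUnitLength a -> String
unitSuffix SMeter     = "m"
unitSuffix SKiloMeter = "km"
unitSuffix SMiles     = "mi"

display' :: (CUnitLength a, Show b) => Length a b -> String
display' l = show (payload l) ++ unitSuffix (getUnit l)
```

With these definitions, display' (Length 5 :: Length 'KiloMeter Int) selects the suffix purely from the type annotation; no witness is passed by hand.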

Why `Just String` will be wrong in Haskell

Hi, I have a trivial but exhausting question that came up while teaching myself the parameterized types topic in Haskell. Here is my question:
This is the definition of Maybe:
data Maybe a = Just a | Nothing
And we use this like:
Just "hello world"
Just 100
But why can't Just take a type variable?
For example:
Just String
Just Int
I know this question is quite foolish, but I still can't figure it out...
Well, first note that String and Int aren't type variables, but types (type constants, if you will). But that doesn't really matter for the purpose of your question.
What matters is the distinction between Haskell's type language and value language. These are generally kept apart. String and Int and Maybe live in the type language, while "hello world" and 100 and Just and Nothing live in the value language. Each knows nothing about the other side. Only the compiler knows "this description of a value belongs to that type", but really types exist only at compile-time and values exist only at runtime.
Two things that are a bit confusing:
It's allowed to have names that exist in both the type language and the value language. The best-known examples are () and simple wrapper types like
newtype Endo a = Endo { runEndo :: a -> a }
but really these are two separate entities: the type constructor Endo :: * -> * (see below for these * thingies) and the value constructor Endo :: (a -> a) -> Endo a. They just happen to share the same name, but in completely different scopes, much like when you declare both addTwo x = x + 2 and greet x = "Hello "++x, where both uses of the x symbol have nothing to do with each other.
The data syntax seems to intermingle types and values. Everywhere else, types and values must always be separated by a ::, most typically in signatures
"hello world" :: String
100 :: Int
Just :: Int -> Maybe Int
{-hence-}Just 100 :: Maybe Int
Nothing :: Maybe Int
foo :: (Num a, Ord a) => a -> Maybe a -- this really means `forall a . (Num a, Ord a) => a -> Maybe a`
foo n | n <= 0 = Nothing
      | otherwise = Just $ n - 1
and indeed that syntax can be used to define data in a more distinctive way too, if you enable -XGADTs:
data Maybe a where
  Just :: a -> Maybe a
  Nothing :: Maybe a
Now we have the :: again as a clear distinction between value-level (left) and type-level.
You can actually take it up one more level: the above declaration can also be written
data Maybe :: * -> * where
  Just :: a -> Maybe a
  Nothing :: Maybe a
Here Maybe :: * -> * means, "Maybe is a type-level thing that has kind * -> *", i.e. it takes a type-level argument of kind * (such as Int) and returns another type-level thing of kind * (here, Maybe Int). Kinds are to types as types are to values.
You can certainly declare data Maybe a = Just String | Nothing, and you can declare data Maybe a = Just Int | Nothing, but only one of them at a time. Using a type variable lets you declare how the type of the contents of the constructed values changes with the type variable. So data Maybe a = Just a | Nothing tells us that the contents "inside" Just are exactly of the type passed to Maybe. That way Maybe String means that "inside" Just there is a value of type String, and Maybe Int means that "inside" Just there is a value of type Int.
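To make the split concrete, here is a minimal sketch using the Prelude's Maybe. The binding names greeting, answer and greetingLength are invented for the example. Types appear only to the right of ::, values only to the left.

```haskell
-- Value level on the left of ::, type level on the right.
greeting :: Maybe String
greeting = Just "hello world"

answer :: Maybe Int
answer = Just 100

-- The following line is left commented out because it does not compile:
-- String names a type, so it cannot be passed to Just as a value.
-- broken = Just String

-- The type argument tells us what is "inside" Just, so functions on
-- the contents can be applied through fmap.
greetingLength :: Maybe Int
greetingLength = fmap length greeting
```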

Haskell generic data structure

I want to create a type to store some generic information. In my case, this type is Molecule, where I store the chemical graph and molecular properties.
data Molecule = Molecule {
  name :: Maybe String,
  graph :: Gr Atom Bond,
  property :: Maybe [Property] -- that's a question
} deriving(Show)
I want to represent properties as tuples
type Property a = (String,a)
because a property may have any type: Float, Int, String, etc.
The question is how to form the Molecule data structure so that I can collect any number of properties of any types in a Molecule. If I do
data Molecule a = Molecule {
  name :: Maybe String,
  graph :: Gr Atom Bond,
  property :: Maybe [Property a]
} deriving(Show)
I have to directly assign one type when I create a molecule.
If you know in advance the set of properties a molecule might have, you could define a sum type:
data Property = Mass Float | CatalogNum Int | Comment String
If you want this type to be extensible, you could use Data.Dynamic as another answer suggests. For instance:
data Molecule = Molecule { name :: Maybe String,
                           graph :: Gr Atom Bond,
                           property :: [(String,Dynamic)]
                         } deriving (Show)
mass :: Molecule -> Maybe Float
mass m = case lookup "mass" (property m) of
  Nothing -> Nothing
  Just i -> fromDynamic i
You could also get rid of the "stringly-typed" (String,a) pairs, say:
-- in Molecule:
-- property :: [Dynamic]
data Mass = Mass Float
mass :: Molecule -> Maybe Mass
mass m = ...
Neither of these attempts gives much type safety over just parsing out of (String,String) pairs since there is no way to enforce the invariant that the user creates well-formed properties (short of wrapping properties in a new type and hiding the constructors in another module, which again breaks extensibility).
What you might want are OCaml-style polymorphic variants. You could look at Vinyl, which provides type-safe extensible records.
As an aside, you might want to get rid of the Maybe wrapper around the list of properties, since the empty list already encodes the case of no properties.
You might want to look at Data.Dynamic for a pseudo-dynamic typing solution.
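A runnable sketch of the Data.Dynamic approach is shown below. To keep it self-contained, the graph field is left out, and the names lookupProperty and water are hypothetical. toDyn wraps a value of any Typeable type, and fromDynamic returns Nothing when the requested type does not match the stored one.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- Simplified Molecule: the graph field from the question is omitted
-- so that the example needs no graph library.
data Molecule = Molecule
  { name     :: Maybe String
  , property :: [(String, Dynamic)]
  } deriving (Show)

-- Hypothetical helper: look a property up by key and try to read it
-- at whatever type the caller asks for.
lookupProperty :: Typeable a => String -> Molecule -> Maybe a
lookupProperty key m = lookup key (property m) >>= fromDynamic

-- Example molecule with properties of two different types.
water :: Molecule
water = Molecule
  { name     = Just "water"
  , property = [ ("mass", toDyn (18.015 :: Float))
               , ("formula", toDyn "H2O")
               ]
  }
```

Asking for lookupProperty "mass" water :: Maybe Float succeeds, while asking for the same key as Maybe Int gives Nothing, which illustrates the limited type safety the answer above mentions: the mismatch is only detected at runtime.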

What's the recommended way of handling complexly composed POD(plain-old-data in OO) in Haskell?

I'm a Haskell newbie.
In statically typed OO languages (for instance, Java), all complex data structures are represented as classes and instances. An object can have many attributes (fields), and another object can be the value of a field. Those fields can be accessed by name and are statically typed by the class. Finally, those objects form a huge graph of objects linked to each other. Most programs use a data graph like this.
How can I achieve this functionality in Haskell?
If you really do have data without behavior, this maps nicely to a Haskell record:
data Person = Person { name :: String
                     , address :: String }
  deriving (Eq, Read, Show)
data Department = Management | Accounting | IT | Programming
  deriving (Eq, Read, Show)
data Employee = Employee { identity :: Person
                         , idNumber :: Int
                         , department :: Department }
              | Contractor { identity :: Person
                           , company :: String }
  deriving (Eq, Read, Show)
This says that a Person is a Person who has a name and address (both Strings); a Department is either Management, Accounting, IT, or Programming; and an Employee is either an Employee who has an identity (a Person), an idNumber (an Int), and a department (a Department), or is a Contractor who has an identity (a Person) and a company (a String). The deriving (Eq, Read, Show) lines enable you to compare these objects for equality, read them in, and convert them to strings.
In general, a Haskell data type is a combination of unions (also called sums) and tuples (also called products) [1]. The |s denote choice (a union): an Employee is either an Employee or a Contractor, a Department is one of four things, etc. In general, tuples are written something like the following:
data Process = Process String Int
This says that Process (in addition to being a type name) is a data constructor with type String -> Int -> Process. Thus, for instance, Process "init" 1, or Process "ls" 57300. A Process has to have both a String and an Int to exist. The record notation used above is just syntactic sugar for these products; I could also have written data Person = Person String String, and then defined
name :: Person -> String
name (Person n _) = n
address :: Person -> String
address (Person _ a) = a
Record notation, however, can be nice for complex data structures.
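As a brief illustration of what record notation buys you, the sketch below builds a Person with named fields and then uses record update syntax to make a changed copy. The values alice and moved are made up for the example; the original value is left untouched, since Haskell values are immutable.

```haskell
data Person = Person
  { name    :: String
  , address :: String
  } deriving (Eq, Read, Show)

-- Construction with named fields; the order of fields does not matter.
alice :: Person
alice = Person { name = "Alice", address = "1 Main St" }

-- Record update: a new Person that differs from alice in one field.
moved :: Person
moved = alice { address = "2 Side St" }
```

The field names double as accessor functions, so name moved and address alice work exactly like the hand-written name and address functions shown above.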
Also note that you can parametrize a Haskell type over other types; for instance, a three-dimensional point could be data Point3 a = Point3 a a a. This means that Point3 :: a -> a -> a -> Point3 a, so that one could write Point3 (3 :: Int) (4 :: Int) (5 :: Int) to get a Point3 Int, or Point3 (1.1 :: Double) (2.2 :: Double) (3.3 :: Double) to get a Point3 Double. (Or Point3 1 2 3 to get a Num a => Point3 a, if you've seen type classes and overloaded numeric literals.)
This is what you need to represent a data graph. However, take note: one problem for people transitioning from imperative languages to functional ones—or, really, between any two different paradigms (C to Python, Prolog to Ruby, Erlang to Java, whatever)—is to continue to try to solve problems the old way. The solution you're trying to model may not be constructed in a way amenable to easy functional programming techniques, even if the problem is. For instance, in Haskell, thinking about types is very important, in a way that's different from, say, Java. At the same time, implementing behaviors for those types is done very differently: higher-order functions capture some of the abstractions you've seen in Java, but also some which aren't easily expressible (map :: (a -> b) -> [a] -> [b], filter :: (a -> Bool) -> [a] -> [a], and foldr :: (a -> b -> b) -> b -> [a] -> b come to mind). So keep your options open, and consider addressing your problems in a functional way. Of course, maybe you are, in which case, full steam ahead. But do keep this in mind as you explore a new language. And have fun :-)
1: And recursion: you can represent a binary tree, for instance, with data Tree a = Leaf a | Branch a (Tree a) (Tree a).
Haskell has algebraic data types, which can describe structures or unions of structures, such that something of a given type can hold one of a number of different sets of fields. These fields can be set and accessed either positionally or by name with record syntax.
See here: http://learnyouahaskell.com/making-our-own-types-and-typeclasses

Resources