What are the various use cases for union types and intersection types? There has been lately a lot of buzz about these type system features, yet somehow I have never felt need for either of these!
Union Types
To quote Robert Harper, "Practical Foundations for Programming
Languages", ch 15:
Most data structures involve
alternatives such as the distinction
between a leaf and an interior node in
a tree, or a choice in the outermost
form of a piece of abstract syntax.
Importantly, the choice determines the
structure of the value. For example,
nodes have children, but leaves do
not, and so forth. These concepts are
expressed by sum types, specifically
the binary sum, which offers a choice
of two things, and the nullary sum,
which offers a choice of no things.
Booleans
The simplest sum type is the Boolean,
data Bool = True
| False
Booleans have only two valid values, T or F. So instead of representing them as numbers, we can instead use a sum type to more accurately encode the fact there are only two possible values.
Enumerations
Enumerations are examples of more general sum types: ones with many, but finite, alternative values.
Sum types and null pointers
The best practically motivating example for sum types is discriminating between valid results and error values returned by functions, by distinguishing the failure case.
For example, null pointers and end-of-file characters are hackish encodings of the sum type:
data Maybe a = Nothing
| Just a
where we can distinguish between valid and invalid values by using the Nothing or Just tag to annotate each value with its status.
By using sum types in this way we can rule out null pointer errors entirely, which is a pretty decent motivating example. Null pointers are entirely due to the inability of older languages to express sum types easily.
Intersection Types
Intersection types are much newer, and their applications are not as widely understood. However, Benjamin Pierce's thesis ("Programming with Intersection Types
and Bounded Polymorphism") gives a good overview:
The most intriguing and potentially
useful property of intersection types
is their ability to express an
essentially unbounded (though of
course finite) amount of information
about the components of a program.
For
example, the addition function (+) can be
given the type Int -> Int -> Int ^ Real -> Real -> Real, capturing both the
general fact that the sum of two real
numbers is always a real and the more
specialized fact that the sum of two
integers is always an integer. A
compiler for a language with
intersection types might even provide
two different object-code sequences
for the two versions of (+), one using a
floating point addition instruction and
one using integer addition. For each
instance of+ in a program, the
compiler can decide whether both
arguments are integers and generate
the more efficient object code sequence
in this case.
This kind of finitary
polymorphism or coherent overloading
is so expressive, that ... the set of
all valid typings for a program
amounts to a complete characterization
of the program’s behavior
They let us encode a lot of information in the type, explaining via type theory what multiple inheritance means, giving types to type classes,
Union types are useful for typing dynamic languages or otherwise allowing more flexibility in the types passed around than most static languages allow. For example, consider this:
var a;
if (condition) {
a = "string";
} else {
a = 123;
}
If you have union types, it's easy to type a as int | string.
One use for intersection types is to describe an object that implements multiple interfaces. For example, C# allows multiple interface constraints on generics:
interface IFoo {
void Foo();
}
interface IBar {
void Bar();
}
void Method<T>(T arg) where T : IFoo, IBar {
arg.Foo();
arg.Bar();
}
Here, arg's type is the intersection of IFoo and IBar. Using that, the type-checker knows both Foo() and Bar() are valid methods on it.
If you want a more practice-oriented answer:
With union and recursive types you can encode regular tree types and therefore XML types.
With intersection types you can type BOTH overloaded functions and refinement types (what in a previous post is called coherent overloading)
So for instance you can write the function add (that overloads integer sum and string concatenation) as follows
let add ( (Int,Int)->Int ; (String,String)->String )
| (x & Int, y & Int) -> x+y
| (x & String, y & String) -> x#y ;;
Which has the intersection type
(Int,Int)->Int & (String,String)->String
But you can also refine the type above and type the function above as
(Pos,Pos) -> Pos &
(Neg,Neg) -> Neg &
(Int,Int)->Int &
(String,String)->String.
where Pos and Neg are positive and negative integer types.
The code above is executable in the language CDuce ( http://www.cduce.org ) whose type system includes union, intersections, and negation types (it is mainly targeted at XML transformations).
If you want to try it and you are on Linux, then it is probably included in your distribution (apt-get install cduce or yum install cduce should do the work) and you can use its toplevel (a la OCaml) to play with union and intersection types. On the CDuce site you will find a lot of practical examples of use of union and intersection types. And since there is a complete integration with OCaml libraries (you can import OCaml libraries in CDuce and export CDuce modules to OCaml) you can also check the correspondence with ML sum types (see here).
Here you are a complex example that mix union and intersection types (explained in the page "http://www.cduce.org/tutorial_overloading.html#val"), but to understand it you need to understand regular expression pattern matching, which requires some effort.
type Person = FPerson | MPerson
type FPerson = <person gender = "F">[ Name Children ]
type MPerson = <person gender = "M">[ Name Children ]
type Children = <children>[ Person* ]
type Name = <name>[ PCDATA ]
type Man = <man name=String>[ Sons Daughters ]
type Woman = <woman name=String>[ Sons Daughters ]
type Sons = <sons>[ Man* ]
type Daughters = <daughters>[ Woman* ]
let fun split (MPerson -> Man ; FPerson -> Woman)
<person gender=g>[ <name>n <children>[(mc::MPerson | fc::FPerson)*] ] ->
(* the above pattern collects all the MPerson in mc, and all the FPerson in fc *)
let tag = match g with "F" -> `woman | "M" -> `man in
let s = map mc with x -> split x in
let d = map fc with x -> split x in
<(tag) name=n>[ <sons>s <daughters>d ] ;;
In a nutshell it transforms values of type Person into values of type (Man | Women) (where the vertical bar denotes a union type) but keeping the correspondence between genres: split is a function with intersection type
MPerson -> Man & FPerson -> Woman
For instance with union types one could describe json domain model without introducing actual new classes but using only type aliases.
type JObject = Map[String, JValue]
type JArray = List[JValue]
type JValue = String | Number | Bool | Null | JObject | JArray
type Json = JObject | JArray
def stringify(json: JValue): String = json match {
case String | Number | Bool | Null => json.toString()
case JObject => "{" + json.map(x y => x + ": " + stringify(y)).mkStr(", ") + "}"
case JArray => "[" + json.map(stringify).mkStr(", ") + "]"
}
Related
Haskell enables one to construct algebraic data types using type constructors and data constructors. For example,
data Circle = Circle Float Float Float
and we are told this data constructor (Circle on right) is a function that constructs a circle when give data, e.g. x, y, radius.
Circle :: Float -> Float -> Float -> Circle
My questions are:
What is actually constructed by this function, specifically?
Can we define the constructor function?
I've seen Smart Constructors but they just seem to be extra functions that eventually call the regular constructors.
Coming from an OO background, constructors, of course, have imperative specifications. In Haskell, they seem to be system-defined.
In Haskell, without considering the underlying implementation, a data constructor creates a value, essentially by fiat. “ ‘Let there be a Circle’, said the programmer, and there was a Circle.” Asking what Circle 1 2 3 creates is akin to asking what the literal 1 creates in Python or Java.
A nullary constructor is closer to what you usually think of as a literal. The Boolean type is literally defined as
data Boolean = True | False
where True and False are data constructors, not literals defined by Haskell grammar.
The data type is also the definition of the constructor; as there isn't really anything to a value beyond the constructor name and its arguments, simply stating it is the definition. You create a value of type Circle by calling the data constructor Circle with 3 arguments, and that's it.
A so-called "smart constructor" is just a function that calls a data constructor, with perhaps some other logic to restrict which instances can be created. For example, consider a simple wrapper around Integer:
newtype PosInteger = PosInt Integer
The constructor is PosInt; a smart constructor might look like
mkPosInt :: Integer -> PosInteger
mkPosInt n | n > 0 = PosInt n
| otherwise = error "Argument must be positive"
With mkPosInt, there is no way to create a PosInteger value with a non-positive argument, because only positive arguments actually call the data constructor. A smart constructor makes the most sense when it, and not the data constructor, is exported by a module, so that a typical user cannot create arbitrary instances (because the data constructor does not exist outside the module).
Good question. As you know, given the definition:
data Foo = A | B Int
this defines a type with a (nullary) type constructor Foo and two data constructors, A and B.
Each of these data constructors, when fully applied (to no arguments in the case of A and to a single Int argument in the case of B) constructs a value of type Foo. So, when I write:
a :: Foo
a = A
b :: Foo
b = B 10
the names a and b are bound to two values of type Foo.
So, data constructors for type Foo construct values of type Foo.
What are values of type Foo? Well, first of all, they are different from values of any other type. Second, they are wholly defined by their data constructors. There is a distinct value of type Foo, different from all other values of Foo, for each combination of a data constructor with a set of distinct arguments passed to that data constructor. That is, two values of type Foo are identical if and only if they were constructed with the same data constructor given identical sets of arguments. ("Identical" here means something different from "equality", which may not necessarily be defined for a given type Foo, but let's not get into that.)
That's also what makes data constructors different from functions in Haskell. If I have a function:
bar :: Int -> Bool
It's possible that bar 1 and bar 2 might be exactly the same value. For example, if bar is defined by:
bar n = n > 0
then it's obvious that bar 1 and bar 2 (and bar 3) are identically True. Whether the value of bar is the same for different values of its arguments will depend on the function definition.
In contrast, if Bar is a constructor:
data BarType = Bar Int
then it's never going to be the case that Bar 1 and Bar 2 are the same value. By definition, they will be different values (of type BarType).
By the way, the idea that constructors are just a special kind of function is a common viewpoint. I personally think this is inaccurate and causes confusion. While it's true that constructors can often be used as if they are functions (specifically that they behave very much like functions when used in expressions), I don't think this view stands up under much scrutiny -- constructors are represented differently in the surface syntax of the language (with capitalized identifiers), can be used in contexts (like pattern matching) where functions cannot be used, are represented differently in compiled code, etc.
So, when you ask "can we define the constructor function", the answer is "no", because there is no constructor function. Instead, a constructor like A or B or Bar or Circle is what it is -- something different from a function (that sometimes behaves like a function with some special additional properties) which is capable of constructing a value of whatever type the data constructor belongs to.
This makes Haskell constructors very different from OO constructors, but that's not surprising since Haskell values are very different from OO objects. In an OO language, you can typically provide a constructor function that does some processing in building the object, so in Python you might write:
class Bar:
def __init__(self, n):
self.value = n > 0
and then after:
bar1 = Bar(1)
bar2 = Bar(2)
we have two distinct objects bar1 and bar2 (which would satify bar1 != bar2) that have been configured with the same field values and are in some sense "equal". This is sort of halfway between the situation above with bar 1 and bar 2 creating two identical values (namely True) and the situation with Bar 1 and Bar 2 creating two distinct values that, by definition, can't possibly be the "same" in any sense.
You can never have this situation with Haskell constructors. Instead of thinking of a Haskell constructor as running some underlying function to "construct" an object which might involve some cool processing and deriving of field values, you should instead think of a Haskell constructor as a passive tag attached to a value (which may also contain zero or more other values, depending on the arity of the constructor).
So, in your example, Circle 10 20 5 doesn't "construct" an object of type Circle by running some function. It directly creates a tagged object that, in memory, will look something like:
<Circle tag>
<Float value 10>
<Float value 20>
<Float value 5>
(or you can at least pretend that's what it looks like in memory).
The closest you can come to OO constructors in Haskell is using smart constructors. As you note, eventually a smart constructor just calls a regular constructor, because that's the only way to create a value of a given type. No matter what kind of bizarre smart constructor you build to create a Circle, the value it constructs will need to look like:
<Circle tag>
<some Float value>
<another Float value>
<a final Float value>
which you'll need to construct with a plain old Circle constructor call. There's nothing else the smart constructor could return that would still be a Circle. That's just how Haskell works.
Does that help?
I’m going to answer this in a somewhat roundabout way, with an example that I hope illustrates my point, which is that Haskell decouples several distinct ideas that are coupled in OOP under the concept of a “class”. Understanding this will help you translate your experience from OOP into Haskell with less difficulty. The example in OOP pseudocode:
class Person {
private int id;
private String name;
public Person(int id, String name) {
if (id == 0)
throw new InvalidIdException();
if (name == "")
throw new InvalidNameException();
this.name = name;
this.id = id;
}
public int getId() { return this.id; }
public String getName() { return this.name; }
public void setName(String name) { this.name = name; }
}
In Haskell:
module Person
( Person
, mkPerson
, getId
, getName
, setName
) where
data Person = Person
{ personId :: Int
, personName :: String
}
mkPerson :: Int -> String -> Either String Person
mkPerson id name
| id == 0 = Left "invalid id"
| name == "" = Left "invalid name"
| otherwise = Right (Person id name)
getId :: Person -> Int
getId = personId
getName :: Person -> String
getName = personName
setName :: String -> Person -> Either String Person
setName name person = mkPerson (personId person) name
Notice:
The Person class has been translated to a module which happens to export a data type by the same name—types (for domain representation and invariants) are decoupled from modules (for namespacing and code organisation).
The fields id and name, which are specified as private in the class definition, are translated to ordinary (public) fields on the data definition, since in Haskell they’re made private by omitting them from the export list of the Person module—definitions and visibility are decoupled.
The constructor has been translated into two parts: one (the Person data constructor) that simply initialises the fields, and another (mkPerson) that performs validation—allocation & initialisation and validation are decoupled. Since the Person type is exported, but its constructor is not, this is the only way for clients to construct a Person—it’s an “abstract data type”.
The public interface has been translated to functions that are exported by the Person module, and the setName function that previously mutated the Person object has become a function that returns a new instance of the Person data type that happens to share the old ID. The OOP code has a bug: it should include a check in setName for the name != "" invariant; the Haskell code can avoid this by using the mkPerson smart constructor to ensure that all Person values are valid by construction. So state transitions and validation are also decoupled—you only need to check invariants when constructing a value, because it can’t change thereafter.
So as for your actual questions:
What is actually constructed by this function, specifically?
A constructor of a data type allocates space for the tag and fields of a value, sets the tag to which constructor was used to create the value, and initialises the fields to the arguments of the constructor. You can’t override it because the process is completely mechanical and there’s no reason (in normal safe code) to do so. It’s an internal detail of the language and runtime.
Can we define the constructor function?
No—if you want to perform additional validation to enforce invariants, you should use a “smart constructor” function which calls the lower-level data constructor. Because Haskell values are immutable by default, values can be made correct by construction; that is, when you don’t have mutation, you don’t need to enforce that all state transitions are correct, only that all states themselves are constructed correctly. And often you can arrange your types so that smart constructors aren’t even necessary.
The only thing you can change about the generated data constructor “function” is making its type signature more restrictive using GADTs, to help enforce more invariants at compile-time. And as a side note, GADTs also let you do existential quantification, which lets you carry around encapsulated/type-erased information at runtime, exactly like an OOP vtable—so this is another thing that’s decoupled in Haskell but coupled in typical OOP languages.
Long story short (too late), you can do all the same things, you just arrange them differently, because Haskell provides the various features of OOP classes under separate orthogonal language features.
I'm a beginner in haskell and I wonder about the right way to define a new type. Suppose I want to define a Point type. In an imperative language, it's usually the equivalent of:
data Point = Int Int
However in haskell I usually see definitions such as:
data Point = Point Int Int
What are the differences and when should each approach be used?
In OO languages you can define a class with something like this
class Point {
int x,y;
Point(int x, int y) {...
}
it's similar
data Point = ...
is the type definition (similar to class Point above , and
... = Point Int Int
is the constructor, you can also define the constructor with a different name, but you need a name regardless.
data Point = P Int Int
The data definitions are, ultimately, tagged unions. For example:
data Maybe a = Nothing | Just a
Now how would you write this type using your syntax?
Moreover it remains the fact that in Haskell you can pattern match over this values and see which constructor was used to build a value. The name of the constructor is needed for pattern matching, and if the type has just one constructor it often re-uses the same name as the type.
For example:
let x = someOperationReturningMaybe
in case x of
Nothing -> 0
Just y -> y+5
This is different from plain union type, such as C's union where you can say "this thing is etiher an int or a float" but you have no way to know which one it actually is (except by keeping track of the state by hand).
Writing the code above using a C union you have no way to use a case to perform different actions depending on the constructor used, and you have to keep track explicitly what type is contained in that x and use an if.
I am learning Haskell from learnyouahaskell.com. I am having trouble understanding type constructors and data constructors. For example, I don't really understand the difference between this:
data Car = Car { company :: String
, model :: String
, year :: Int
} deriving (Show)
and this:
data Car a b c = Car { company :: a
, model :: b
, year :: c
} deriving (Show)
I understand that the first is simply using one constructor (Car) to built data of type Car. I don't really understand the second one.
Also, how do data types defined like this:
data Color = Blue | Green | Red
fit into all of this?
From what I understand, the third example (Color) is a type which can be in three states: Blue, Green or Red. But that conflicts with how I understand the first two examples: is it that the type Car can only be in one state, Car, which can take various parameters to build? If so, how does the second example fit in?
Essentially, I am looking for an explanation that unifies the above three code examples/constructs.
In a data declaration, a type constructor is the thing on the left hand side of the equals sign. The data constructor(s) are the things on the right hand side of the equals sign. You use type constructors where a type is expected, and you use data constructors where a value is expected.
Data constructors
To make things simple, we can start with an example of a type that represents a colour.
data Colour = Red | Green | Blue
Here, we have three data constructors. Colour is a type, and Green is a constructor that contains a value of type Colour. Similarly, Red and Blue are both constructors that construct values of type Colour. We could imagine spicing it up though!
data Colour = RGB Int Int Int
We still have just the type Colour, but RGB is not a value – it's a function taking three Ints and returning a value! RGB has the type
RGB :: Int -> Int -> Int -> Colour
RGB is a data constructor that is a function taking some values as its arguments, and then uses those to construct a new value. If you have done any object-oriented programming, you should recognise this. In OOP, constructors also take some values as arguments and return a new value!
In this case, if we apply RGB to three values, we get a colour value!
Prelude> RGB 12 92 27
#0c5c1b
We have constructed a value of type Colour by applying the data constructor. A data constructor either contains a value like a variable would, or takes other values as its argument and creates a new value. If you have done previous programming, this concept shouldn't be very strange to you.
Intermission
If you'd want to construct a binary tree to store Strings, you could imagine doing something like
data SBTree = Leaf String
| Branch String SBTree SBTree
What we see here is a type SBTree that contains two data constructors. In other words, there are two functions (namely Leaf and Branch) that will construct values of the SBTree type. If you're not familiar with how binary trees work, just hang in there. You don't actually need to know how binary trees work, only that this one stores Strings in some way.
We also see that both data constructors take a String argument – this is the String they are going to store in the tree.
But! What if we also wanted to be able to store Bool, we'd have to create a new binary tree. It could look something like this:
data BBTree = Leaf Bool
| Branch Bool BBTree BBTree
Type constructors
Both SBTree and BBTree are type constructors. But there's a glaring problem. Do you see how similar they are? That's a sign that you really want a parameter somewhere.
So we can do this:
data BTree a = Leaf a
| Branch a (BTree a) (BTree a)
Now we introduce a type variable a as a parameter to the type constructor. In this declaration, BTree has become a function. It takes a type as its argument and it returns a new type.
It is important here to consider the difference between a concrete type (examples include Int, [Char] and Maybe Bool) which is a type that can be assigned to a value in your program, and a type constructor function which you need to feed a type to be able to be assigned to a value. A value can never be of type "list", because it needs to be a "list of something". In the same spirit, a value can never be of type "binary tree", because it needs to be a "binary tree storing something".
If we pass in, say, Bool as an argument to BTree, it returns the type BTree Bool, which is a binary tree that stores Bools. Replace every occurrence of the type variable a with the type Bool, and you can see for yourself how it's true.
If you want to, you can view BTree as a function with the kind
BTree :: * -> *
Kinds are somewhat like types – the * indicates a concrete type, so we say BTree is from a concrete type to a concrete type.
Wrapping up
Step back here a moment and take note of the similarities.
A data constructor is a "function" that takes 0 or more values and gives you back a new value.
A type constructor is a "function" that takes 0 or more types and gives you back a new type.
Data constructors with parameters are cool if we want slight variations in our values – we put those variations in parameters and let the guy who creates the value decide what arguments they are going to put in. In the same sense, type constructors with parameters are cool if we want slight variations in our types! We put those variations as parameters and let the guy who creates the type decide what arguments they are going to put in.
A case study
As the home stretch here, we can consider the Maybe a type. Its definition is
data Maybe a = Nothing
| Just a
Here, Maybe is a type constructor that returns a concrete type. Just is a data constructor that returns a value. Nothing is a data constructor that contains a value. If we look at the type of Just, we see that
Just :: a -> Maybe a
In other words, Just takes a value of type a and returns a value of type Maybe a. If we look at the kind of Maybe, we see that
Maybe :: * -> *
In other words, Maybe takes a concrete type and returns a concrete type.
Once again! The difference between a concrete type and a type constructor function. You cannot create a list of Maybes - if you try to execute
[] :: [Maybe]
you'll get an error. You can however create a list of Maybe Int, or Maybe a. That's because Maybe is a type constructor function, but a list needs to contain values of a concrete type. Maybe Int and Maybe a are concrete types (or if you want, calls to type constructor functions that return concrete types.)
Haskell has algebraic data types, which very few other languages have. This is perhaps what's confusing you.
In other languages, you can usually make a "record", "struct" or similar, which has a bunch of named fields that hold various different types of data. You can also sometimes make an "enumeration", which has a (small) set of fixed possible values (e.g., your Red, Green and Blue).
In Haskell, you can combine both of these at the same time. Weird, but true!
Why is it called "algebraic"? Well, the nerds talk about "sum types" and "product types". For example:
data Eg1 = One Int | Two String
An Eg1 value is basically either an integer or a string. So the set of all possible Eg1 values is the "sum" of the set of all possible integer values and all possible string values. Thus, nerds refer to Eg1 as a "sum type". On the other hand:
data Eg2 = Pair Int String
Every Eg2 value consists of both an integer and a string. So the set of all possible Eg2 values is the Cartesian product of the set of all integers and the set of all strings. The two sets are "multiplied" together, so this is a "product type".
Haskell's algebraic types are sum types of product types. You give a constructor multiple fields to make a product type, and you have multiple constructors to make a sum (of products).
As an example of why that might be useful, suppose you have something that outputs data as either XML or JSON, and it takes a configuration record - but obviously, the configuration settings for XML and for JSON are totally different. So you might do something like this:
data Config = XML_Config {...} | JSON_Config {...}
(With some suitable fields in there, obviously.) You can't do stuff like this in normal programming languages, which is why most people aren't used to it.
Start with the simplest case:
data Color = Blue | Green | Red
This defines a "type constructor" Color which takes no arguments - and it has three "data constructors", Blue, Green and Red. None of the data constructors takes any arguments. This means that there are three of type Color: Blue, Green and Red.
A data constructor is used when you need to create a value of some sort. Like:
myFavoriteColor :: Color
myFavoriteColor = Green
creates a value myFavoriteColor using the Green data constructor - and myFavoriteColor will be of type Color since that's the type of values produced by the data constructor.
A type constructor is used when you need to create a type of some sort. This is usually the case when writing signatures:
isFavoriteColor :: Color -> Bool
In this case, you are calling the Color type constructor (which takes no arguments).
Still with me?
Now, imagine you not only wanted to create red/green/blue values but you also wanted to specify an "intensity". Like, a value between 0 and 256. You could do that by adding an argument to each of the data constructors, so you end up with:
data Color = Blue Int | Green Int | Red Int
Now, each of the three data constructors takes an argument of type Int. The type constructor (Color) still doesn't take any arguments. So, my favorite color being a darkish green, I could write
myFavoriteColor :: Color
myFavoriteColor = Green 50
And again, it calls the Green data constructor and I get a value of type Color.
Imagine if you don't want to dictate how people express the intensity of a color. Some might want a numeric value like we just did. Others may be fine with just a boolean indicating "bright" or "not so bright". The solution to this is to not hardcode Int in the data constructors but rather use a type variable:
data Color a = Blue a | Green a | Red a
Now, our type constructor takes one argument (another type which we just call a!) and all of the data constructors will take one argument (a value!) of that type a. So you could have
myFavoriteColor :: Color Bool
myFavoriteColor = Green False
or
myFavoriteColor :: Color Int
myFavoriteColor = Green 50
Notice how we call the Color type constructor with an argument (another type) to get the "effective" type which will be returned by the data constructors. This touches the concept of kinds which you may want to read about over a cup of coffee or two.
Now we figured out what data constructors and type constructors are, and how data constructors can take other values as arguments and type constructors can take other types as arguments. HTH.
As others pointed out, polymorphism isn't that terrible useful here. Let's look at another example you're probably already familiar with:
Maybe a = Just a | Nothing
This type has two data constructors. Nothing is somewhat boring, it doesn't contain any useful data. On the other hand Just contains a value of a - whatever type a may have. Let's write a function which uses this type, e.g. getting the head of an Int list, if there is any (I hope you agree this is more useful than throwing an error):
maybeHead :: [Int] -> Maybe Int
maybeHead [] = Nothing
maybeHead (x:_) = Just x
> maybeHead [1,2,3] -- Just 1
> maybeHead [] -- None
So in this case a is an Int, but it would work as well for any other type. In fact you can make our function work for every type of list (even without changing the implementation):
maybeHead :: [t] -> Maybe t
maybeHead [] = Nothing
maybeHead (x:_) = Just x
On the other hand you can write functions which accept only a certain type of Maybe, e.g.
doubleMaybe :: Maybe Int -> Maybe Int
doubleMaybe Just x = Just (2*x)
doubleMaybe Nothing= Nothing
So long story short, with polymorphism you give your own type the flexibility to work with values of different other types.
In your example, you may decide at some point that String isn't sufficient to identify the company, but it needs to have its own type Company (which holds additional data like country, address, back accounts etc). Your first implementation of Car would need to change to use Company instead of String for its first value. Your second implementation is just fine, you use it as Car Company String Int and it would work as before (of course functions accessing company data need to be changed).
The second one has the notion of "polymorphism" in it.
The a b c can be of any type. For example, a can be a [String], b can be [Int]
and c can be [Char].
While the first one's type is fixed: company is a String, model is a String and year is Int.
The Car example might not show the significance of using polymorphism. But imagine your data is of the list type. A list can contain String, Char, Int ... In those situations, you will need the second way of defining your data.
As to the third way I don't think it needs to fit into the previous type. It's just one other way of defining data in Haskell.
This is my humble opinion as a beginner myself.
Btw: Make sure that you train your brain well and feel comfortable to this. It is the key to understand Monad later.
It's about types: In the first case, your set the types String (for company and model) and Int for year. In the second case, your are more generic. a, b, and c may be the very same types as in the first example, or something completely different. E.g., it may be useful to give the year as string instead of integer. And if you want, you may even use your Color type.
Consider the following simple variant of the Address Book example
sig Name, Addr {}
sig Book { addr : Name -> Addr } // no lone on Addr
pred show(b:Book) { some n : Name | #addr[b,n] > 1 }
run show for exactly 2 Book, exactly 2 Addr, exactly 2 Name
In some model instances, I can get the following results in the evaluator
all b:Book | show[b]
--> yields false
some b:Book | show[b]
--> yields true
show[Book]
--> yields true
If show was a relation, then one might expect to get an answer like: { true, false }. Given that it is a predicate, a single Boolean value is returned. I would have expected show[Book] to be a shorthand for the universally quantified expression above it. Instead, it seems to be using existential quantification to fold the results. Anyone know what might be the rational for this, or have another explanation for the meaning of show[Book]?
(I'm not sure I have the correct words for this, so bear with me if this seems fuzzy.)
Bear in mind that all expressions in Alloy that denote individuals denote sets of individuals, and that there is no distinction available in the language between 'individual X' and 'the singleton set whose member is the individual X'. ([Later addendum:] In the terms more usually used: the general rule in Alloy's logic is that all values are relations. Binary relations are sets of pairs, n-ary relations sets of n-tuples, sets are unary relations, and scalars are singleton sets. See the discussion in sec. 3.2.2 of Software Abstractions, or the slide "Everything's a relation" in the Alloy Analyzer 4 tutorial by Greg Dennis and Rob Seater.)
Given the declaration you give of the 'show' predicate, it's easy to expect that the argument of 'show' should be a single Book -- or more correctly, a singleton set of Book --, and then to expect further that if the argument is not actually a singleton set (as in the expression show[Book] here) then the system will coerce it to being a singleton set, or interpret it with some sort of implicit existential or universal quantification. But in the declaration pred show(b:Book) ..., the expression b:Book just names an object b which will be a set of objects in the signature Book. (To require that b be a singleton set, write pred show(one b: Book) ....) The expression which constitutes the body of show is evaluated for b = Book just as readily as for b = Book$0.
The appearance of existential quantification is a consequence of the way the dot operator at the heart of the expression addr[b,n] (or equivalently n.(b.addr) is defined. Actually, if you experiment you'll find that show[Book] is true whenever there is any name for which the set of all books contains a mapping to two different addresses, even in cases where an existential interpretation would fail. Try adding this to your model, for example:
pred hmmmm { show[Book] and no b: Book | show[b] }
run hmmmm for exactly 2 Book, exactly 2 Addr, exactly 2 Name
I like reading about programming theories, so could you tell me if there is any object-oriented static typed language that allow variables to have a few types?
Example in pesudocode:
var value: BigInteger | Double | Nil
I think about way of calling methods on this object. If object value have type BigInteger | Double language could allow user to call only shared methods (lake plus, minus) but when the type is BigInteger | Double | Nil then object of Nil hasn't methods plus and minus, so we can't do anything usefull with this object, because it has only few shared methods (like toString).
So is there any idea how should work calling methods on variable with few types in static typed object-oriented language?
What you are describing is an intersection type. They do exist in Java, for example, but they only arise within the type-checker as the result of capture conversion and type-inference. You cannot write one yourself.
I don't know of any language which uses them directly, but they are often used to describe or analyze type systems of languages, espececially languages which don't actually have a type system. For example, Diamondback Ruby, which is a static type system and type-inferencer for the dynamically typed Ruby programming language, uses both union and intersection types.
Note that the syntax you are using is generally used to denote union types, which are the dual of intersection types. Intersection types are generally written A & B & C.
I am not aware of any language that does this... sadly, I'd love to play around with it (but first, they should adopt type inference and parametric polymorphism ;) ).
Although it is alreapossible: Relatively elegantly in a structural type system (type a is a subtype of type b if a has everything b has), simply by specifying a type for value that is a structural subtype of BigInteger and of Double and of Nil and slightly less elegantly in a nominative type system (type a is a subtype of type b if and only if it inherits from it, directly or indirectly) by specifying a common ancestor of all three (if all else fails, object). Of course we'd need to go recursive - what is the type of toString? And what's the typ of (Integer | Double | BigInteger).+?!? This is far from trivial (in fact, looking for a solution made my head hurt a bit). I can't say if it is impossible, but no mainly-OO-language's type system is anywhere sophisticated enough for a possible solution.
The bottom line is: It'd be really cool if some whizz came along and sorted out the issues it raises. Propably not worth the effort...
Edit: Do you know algebraic data types? They are similar to your idea (but much older ;) ) in that an algebraic data type is composed of several types and can therefore contain e.g. a BigInteger, a Double and Nil - the actual value is one of these and a tag (as in tagged union) says which. But to use the value stored in an algebraic data type, you have to use pattern matching to extract it safely. This concept is very powerful, and still "simple" enough to be understood tools - e.g. type inference and static typechecking work.
It has not much to do with OO but (as far as I understand it) what you describe looks much like polymorphism as implemented by C++.
Yes, OCaml has these in the form of polymorphic variants:
type my_var = Integer of int | Float of float;;
let x = Integer(10);;
let y = Float(3.14);;
Pike has them, as does Magpie, an optionally-typed language I'm working on. Google's Closure compiler for Javascript allows you to annotate types in Javascript using |.
They crop up frequently in languages that bridge static and dynamic typing because a lot of expressions in a dynamic language can yield one of a couple of types:
var a = 123;
if (foo) { a = "string"; }
bar(a);
The statically-determined type being passed to bar() is Number | String.
I'm not so sure if we really have a complete definition of what a static typed language is but I also hope that the language you describe wouldn't qualify as one.
One of my concerns is that if you add type T1 and T2 to be a part of your BigInteger | Double | Nil, how would they know about each other and how to handle the operations you defined? Now I realize you never said that the language would allow expanding the "implicit" conversion definition.
Come to think of it, C# does something that resembles this in its string handling
string s = -42 + '+' + "+" + -0.1 / -0.1 + "=" + (7 ^ 5) +
" is " + true + " and not " + AddressFamily.Unknown;
=> "1+1=2 is True and not Unknown"
string str = 1 + 2 + "!=" + 1 + 2;
=> "3!=12"
And I do not like it.