Haskell Type Variable - haskell

Learn You a Haskell creates a Tree Algebraic Data Type:
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
My understanding is that a can be any type.
So I tried to create a treeInsert function that puts a Tree on the left or right side depending upon its Side value:
data Side = L | R
singletonTree :: a -> Tree a
singletonTree x = Node x EmptyTree EmptyTree
treeInsert :: a -> Tree a -> Tree a
treeInsert x EmptyTree = singletonTree x
treeInsert x@(L, _) (Node a left right) = Node a (treeInsert x left) right -- ERROR
treeInsert x@(R, _) (Node a left right) = Node a left (treeInsert x right)
But I got a compile-time error:
Couldn't match expected type `a' with actual type `(Side, t0)'
`a' is a rigid type variable bound by
the type signature for treeInsert :: a -> Tree a -> Tree a
at File.hs:10:15
In the pattern: (L, _)
In an equation for `treeInsert':
treeInsert x@(L, _) (Node a left right)
= Node a (treeInsert x left) right
Failed, modules loaded: none.
Perhaps a is still any type, but my pattern matching is invalid?

Just because a can be any type in a Tree a in general, doesn't mean it can be any type in a tree passed to treeInsert. You need to refine it to a type that actually allows the (L, _) and (R, _) pattern matches on it.
In fact, you can delete the type annotation on treeInsert and it will compile, after which you can ask GHCi with :t for the correct type (and then re-add that annotation if you want):
treeInsert :: (Side, t) -> Tree (Side, t) -> Tree (Side, t)
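For reference, here is a minimal sketch of the whole thing with the refined signature in place. The deriving clause on Side is my addition (it is not in the question) so that the resulting trees can be shown and compared.

```haskell
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

-- The deriving clause is an addition for illustration; the question's Side has none.
data Side = L | R deriving (Show, Eq)

singletonTree :: a -> Tree a
singletonTree x = Node x EmptyTree EmptyTree

-- The element type is now pinned to a pair whose first component is a Side,
-- so matching on (L, _) and (R, _) is well-typed.
treeInsert :: (Side, t) -> Tree (Side, t) -> Tree (Side, t)
treeInsert x EmptyTree = singletonTree x
treeInsert x@(L, _) (Node a left right) = Node a (treeInsert x left) right
treeInsert x@(R, _) (Node a left right) = Node a left (treeInsert x right)
```

Loaded in GHCi, treeInsert (L, 'b') (singletonTree (R, 'a')) puts the new element in the left subtree of the root.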

treeInsert :: a -> Tree a -> Tree a
So treeInsert is going to get a value of any type, and a tree containing the same type, and result in a tree of the same type.
Your equations try to match x against the patterns (L, _) and (R, _). But wait: the type said I could call it with any type I liked. I should be able to call it like treeInsert ["this", "is", "a", "list", "of", "String"] EmptyTree. How can I match something of type [String] against a pattern like (L, _)? That only works for types of the form (Side, t0) (for any type t0).
Type variables like a are extremely flexible when you call the function; you can use any value at all of any type you like, and the function will just work.
But you're not calling treeInsert, you're implementing it. For the implementer, type variables like a are extremely restrictive rather than flexible. This is an inevitable trade-off (see note 1 below). The caller gets the freedom to pick anything they want; the implementer has to provide code that will work for anything the caller might choose (even types that the caller's program made up long after you finish implementing your function).
So you can't test it against a pattern like (L, _). Or any other meaningful pattern. Nor can you pass it to any "concrete" function, only to other functions which will accept any possible type. So in fact in treeInsert with type a -> Tree a -> Tree a, there's no way to use any property at all of the values you're inserting to decide where they'll go; for any property you might like to use to decide whether to put it in the left or right tree, there's no guarantee that the property will be meaningfully defined.
You'll either need to do your insertion based on something you can examine (like the structure of the tree that's also passed in), or use a type that provides less flexibility to the caller and actually gives you some information about the values you're inserting, such as treeInsert :: (Side, t) -> Tree (Side, t) -> Tree (Side, t), as Ørjan Johansen suggested.
Note 1: That very restrictiveness is actually where Haskell as a whole gets a lot of its power, since a lot of really useful and generic code relies on what functions with certain types can't do.
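To illustrate the first option (deciding based on the structure of the tree rather than on the inserted value), here is a rough sketch that keeps the fully polymorphic type. The names size and insertBalanced are made up for this example, and "insert into the smaller subtree" is just one possible policy, not something from the question.

```haskell
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

-- Number of nodes in a tree; this inspects only the structure, never the values.
size :: Tree a -> Int
size EmptyTree = 0
size (Node _ l r) = 1 + size l + size r

-- Insert into whichever subtree is currently smaller (ties go left).
-- Nothing here needs to know anything about the type a.
insertBalanced :: a -> Tree a -> Tree a
insertBalanced x EmptyTree = Node x EmptyTree EmptyTree
insertBalanced x (Node a l r)
  | size l <= size r = Node a (insertBalanced x l) r
  | otherwise        = Node a l (insertBalanced x r)
```

Recomputing size on every step is wasteful; it is only there to keep the sketch short.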

Related

How to Access Fields of Custom Data Types without Record Syntax in Haskell?

I'd like to understand how to access fields of custom data types without using the record syntax. In LYAH it is proposed to do it like this:
-- Example
data Person = Subject String String Int Float String String deriving (Show)
guy = Subject "Buddy" "Finklestein" 43 184.2 "526-2928" "Chocolate"
firstName :: Person -> String
firstName (Subject firstname _ _ _ _ _) = firstname
I tried applying this way of accessing data by getting the value of a node of a BST:
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
singleton :: a -> Tree a
singleton x = Node x EmptyTree EmptyTree
treeInsert :: (Ord a) => a -> Tree a -> Tree a
treeInsert x EmptyTree = singleton x
treeInsert x (Node a left right)
| x == a = Node x left right
| x < a = Node a (treeInsert x left) right
| x > a = Node a left (treeInsert x right)
getValue :: Tree -> a
getValue (Node a _ _) = a
But I got the following error:
Could someone explain how to access the field correctly without using the record syntax and what the error message means? Please note that I'm a beginner in Haskell. My purpose is to understand why this particular case throws an error and how to do it correctly. I'm not asking this to be pointed to more convenient ways of accessing fields (like the record syntax). If someone asked a similar question before: Sorry! I really tried finding an answer here but could not. Any help is appreciated!
You forgot to add the type parameter to Tree in your function's type signature
getValue :: Tree -> a
should be
getValue :: Tree a -> a
Expecting one more argument to `Tree'
means that Tree in the type signature is expecting a type argument, but wasn't provided one
Expected a type, but Tree has a kind `* -> *'
Tree a is a type, but Tree isn't (?) because it is expecting a type argument.
A kind is like a type signature for a type.
A kind of * means the type constructor does not expect any type argument.
data Tree = EmptyTree | Tree Int Tree Tree
has a kind of *
It is like the type signature of a no-argument function (technically this is not really called a function, I think)
f :: Tree Int
f = Node 0 EmptyTree EmptyTree
A kind of * -> * means the type constructor expects an argument.
Your Tree type has a kind of * -> *, because it takes one type argument, the a on the left hand side of =.
A kind of * -> * is sort of like a function that takes one argument:
f :: Int -> Tree Int
f x = Node x EmptyTree EmptyTree
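Putting the fix together, a small sketch: getValue with the corrected signature, plus a total variant that returns Maybe. The name getValueMaybe is my own and not from the question; I added it because getValue as written has no equation for EmptyTree.

```haskell
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

-- Tree has kind * -> *, so it must be applied to a type argument (Tree a)
-- before it can appear in a signature.
-- Note: this is partial; calling it on EmptyTree is a runtime pattern-match failure.
getValue :: Tree a -> a
getValue (Node a _ _) = a

-- Hypothetical total variant: an empty tree simply has no value.
getValueMaybe :: Tree a -> Maybe a
getValueMaybe EmptyTree    = Nothing
getValueMaybe (Node a _ _) = Just a
```

In GHCi, :kind Tree reports * -> * and :kind Tree Int reports *, which matches the explanation above.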

Haskell type checking in code

Could you please show me how I can check whether the type of func is Tree or not, in code rather than at the command prompt?
data Tree = Leaf Float | Gate [Char] Tree Tree deriving (Show, Eq, Ord)
func a = Leaf a
Well, there are a few answers, and they zigzag on the question of whether this is possible.
You could ask ghci
ghci> :t func
func :: Float -> Tree
which tells you the type.
But you said in your comment that you are wanting to write
if func == Tree then 0 else 1
which is not possible. In particular, you can't write any function like
isTree :: a -> Bool
isTree x = if x :: Tree then True else False
because it would violate parametricity, which is a neat property that all polymorphic functions in Haskell have, which is explored in the paper Theorems for Free.
But you can write such a function with some simple generic mechanisms that have popped up; essentially, if you want to know the type of something at runtime, it needs to have a Typeable constraint (from the module Data.Typeable). Almost every type is Typeable -- we just use the constraint to indicate the violation of parametricity and to indicate to the compiler that it needs to pass runtime type information.
import Data.Typeable
import Data.Maybe (isJust)
data Tree = Leaf Float | ...
deriving (Typeable) -- we need Trees to be typeable for this to work
isTree :: (Typeable a) => a -> Bool
isTree x = isJust (cast x :: Maybe Tree)
But from my experience, you probably don't actually need to ask this question. In Haskell this question is a lot less necessary than in other languages. But I can't be sure unless I know what you are trying to accomplish by asking.
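For completeness, a self-contained sketch of the Typeable approach, using the Tree type from the question. I am assuming a reasonably recent GHC (7.10 or later), where Typeable instances are generated automatically, so the deriving (Typeable) clause is left out here.

```haskell
import Data.Typeable (Typeable, cast)
import Data.Maybe (isJust)

data Tree = Leaf Float | Gate [Char] Tree Tree deriving (Show, Eq, Ord)

-- cast returns Just only when the runtime type of x is exactly Tree.
isTree :: Typeable a => a -> Bool
isTree x = isJust (cast x :: Maybe Tree)
```

Note that isTree (Leaf 1.0) is True, while isTree Leaf is False, because Leaf on its own is a function of type Float -> Tree and not a Tree.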
Here's how to determine what the type of a binding is in Haskell: take something like f a1 a2 a3 ... an = someExpression and turn it into f = \a1 -> \a2 -> \a3 -> ... \an -> someExpression. Then find the type of the expression on the right hand side.
To find the type of an expression, simply add a SomeType -> for each lambda, where SomeType is whatever the appropriate type of the bound variable is. Then use the known types in the remaining (lambda-less) expression to find its actual type.
For your example: func a = Leaf a turns into func = \a -> Leaf a. Now to find the type of \a -> Leaf a, we add a SomeType -> for the lambda, where SomeType is Float in this case. (because Leaf :: Float -> Tree, so if Leaf is applied to a, then a :: Float) This gives us Float -> ???
Now we find the type of the lambda-less expression Leaf (a :: Float), which is Tree because Leaf :: Float -> Tree. Now we can substitute Tree for ??? to get Float -> Tree, the actual type of func.
As you can see, we did that all by just looking at the source code. This means that no matter what, func will always have that type, so there is no need to check whether or not it does. In fact, the compiler will throw out all information about the type of func when it compiles your code, and your code will still work properly because of type-checking. (The caveat to this (Typeable) is pointed out in the other answer)
TL;DR: Haskell is statically typed, so func always has the type Float -> Tree, so asking how to check whether that is true doesn't make sense.
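If the goal is to have the check written down in the code itself, the usual way is a type signature. A small sketch using the definitions from the question; the signature line is the only thing added:

```haskell
data Tree = Leaf Float | Gate [Char] Tree Tree deriving (Show, Eq, Ord)

-- The compiler verifies this signature against the definition below.
-- If func did not have this type, the module would be rejected at compile time.
func :: Float -> Tree
func a = Leaf a
```

Changing the signature to something false, say func :: Int -> Tree, makes compilation fail, which is the in-code check that the type is what you think it is.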

What is the difference between value constructors and tuples?

It's written that Haskell tuples are simply a different syntax for algebraic data types. Similarly, there are examples of how to redefine value constructors with tuples.
For example, a Tree data type in Haskell might be written as
data Tree a = EmptyTree | Node a (Tree a) (Tree a)
which could be converted to "tuple form" like this:
data Tree a = EmptyTree | Node (a, Tree a, Tree a)
What is the difference between the Node value constructor in the first example, and the actual tuple in the second example? i.e. Node a (Tree a) (Tree a) vs. (a, Tree a, Tree a) (aside from just the syntax)?
Under the hood, is Node a (Tree a) (Tree a) just a different syntax for a 3-tuple of the appropriate types at each position?
I know that you can partially apply a value constructor, such as Node 5 which will have type: (Node 5) :: Num a => Tree a -> Tree a -> Tree a
You sort of can partially apply a tuple too, using (,,) as a function ... but this doesn't know about the potential types for the un-bound entries, such as:
Prelude> :t (,,) 5
(,,) 5 :: Num a => b -> c -> (a, b, c)
unless, I guess, you explicitly declare a type with ::.
Aside from syntactical specialties like this, plus this last example of the type scoping, is there a material difference between whatever a "value constructor" thing actually is in Haskell, versus a tuple used to store positional values of the same types are the value constructor's arguments?
Well, conceptually there indeed is no difference and in fact other languages (OCaml, Elm) present tagged unions exactly that way - i.e., tags over tuples or first class records (which Haskell lacks). I personally consider this to be a design flaw in Haskell.
There are some practical differences though:
Laziness. Haskell's tuples are lazy and you can't change that. You can however mark constructor fields as strict:
data Tree a = EmptyTree | Node !a !(Tree a) !(Tree a)
Memory footprint and performance. Circumventing intermediate types reduces the footprint and raises the performance. You can read more about it in this fine answer.
You can also mark the strict fields with the UNPACK pragma to reduce the footprint even further. Alternatively you can use the -funbox-strict-fields compiler option. Concerning the last one, I simply prefer to have it on by default in all my projects. See Hasql's Cabal file for example.
Considering the above, if it's a lazy type that you're looking for, then the following snippets should compile to the same thing:
data Tree a = EmptyTree | Node a (Tree a) (Tree a)
data Tree a = EmptyTree | Node {-# UNPACK #-} !(a, Tree a, Tree a)
So I guess you can say that it's possible to use tuples to store lazy fields of a constructor without a penalty. Though it should be mentioned that this pattern is kinda unconventional in the Haskell community.
If it's the strict type and footprint reduction that you're after, then there's no other way than to denormalize your tuples directly into constructor fields.
They're what's called isomorphic, meaning "to have the same shape". You can write something like
data Option a = None | Some a
And this is isomorphic to
data Maybe a = Nothing | Just a
meaning that you can write two functions
f :: Maybe a -> Option a
g :: Option a -> Maybe a
Such that f . g == id == g . f for all possible inputs. We can then say that (,,) is a data constructor isomorphic to the constructor
data Triple a b c = Triple a b c
Because you can write
f :: (a, b, c) -> Triple a b c
f (a, b, c) = Triple a b c
g :: Triple a b c -> (a, b, c)
g (Triple a b c) = (a, b, c)
And Node as a constructor is a special case of Triple, namely Triple a (Tree a) (Tree a). In fact, you could even go so far as to say that your definition of Tree could be written as
newtype Tree' a = Tree' (Maybe (a, Tree' a, Tree' a))
The newtype is required since you can't have a type alias be recursive. All you have to do is say that EmptyTree == Tree' Nothing and Node a l r == Tree' (Just (a, l, r)). You could pretty simply write functions that convert between the two.
Note that this is all from a mathematical point of view. The compiler can add extra metadata and other information to be able to identify a particular constructor making them behave slightly differently at runtime.
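As a sketch of those conversion functions (the names toTree' and fromTree' are made up for this example), the two directions could look like this:

```haskell
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Eq)

newtype Tree' a = Tree' (Maybe (a, Tree' a, Tree' a)) deriving (Show, Eq)

-- Constructor form to tuple form.
toTree' :: Tree a -> Tree' a
toTree' EmptyTree    = Tree' Nothing
toTree' (Node a l r) = Tree' (Just (a, toTree' l, toTree' r))

-- Tuple form back to constructor form.
fromTree' :: Tree' a -> Tree a
fromTree' (Tree' Nothing)          = EmptyTree
fromTree' (Tree' (Just (a, l, r))) = Node a (fromTree' l) (fromTree' r)
```

For fully defined finite trees, fromTree' . toTree' and toTree' . fromTree' both behave like id, which is what the isomorphism claim amounts to.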

Binding together data, types and functions

I want to model a large tree (or forest) with some regular structure: the tree can be decomposed into a small tree (the irregular part) and, for example, a large list of params, where each param paired with each node makes a node of the big tree.
So, I want a data structure, where each node in a tree is representing many nodes. And real node is of type (node,param).
For algorithms that work on this kind of tree, the type of that param does not matter. They are just placeholders. But some data should be extractable from the plain param or from the combination of node and param, and all possible params should be iterable. All these kinds of data are known a priori; they reflect the semantics of that tree.
So, actual type, semantics and stuff of param is up to implementation of tree.
I model it in C++ using nested typedefs for params type, fixed method names for all kind of stuff that should be available to algorithm (this two together making a concept) and templates for algorithm itself.
I.e. if I want to associate with each node of big tree an integer, I would provide a function int data(const node& n, const param& p), where param is available as nested typedef, and algorithm could get list of all available params, and call data with nodes of interest and each of params
I have some plain data type, i.e. tree data, like this
data Tree = Node [Tree] | Leaf
Now I want to package up:
concrete tree
some type
some values of that type
some functions operating on (that concrete) tree nodes and (that) values
So that one can write functions that use these packaged-up types and functions in a generic way.
How to achieve that?
With type families I came to
class PackagedUp t where
type Value t
tree :: Tree t
values :: [Value t]
f :: Tree t -> Value t -> Int
Tree now becomes Tree t because type families want the types of their members to depend on the typeclass argument.
Also, as in https://stackoverflow.com/a/16927632/1227578 type families to deal with injectivity will be needed.
With this I can
instance PackagedUp MyTree where
type Value MyTree = (Int,Int)
tree = Leaf
values = [(0,0),(1,1)]
f t v = fst v
And how to write such a function now? I.e. a function that will take root of a tree, all of values and make a [Int] of all f tree value.
First of all, your tree type should be defined like this:
data Tree a = Node a [Tree a] | Leaf
The type above is polymorphic. As far as semantics go that resembles what we would call a generic type in OO parlance (in C# or Java we might write Tree<A> instead). A node of a Tree a holds a value of type a and a list of subtrees.
Next, we come to PackagedUp. Classes in Haskell have little to do with the OO concept of the same name; they are not meant to package data and behaviour together. Things are actually much simpler: all you need to do is define the appropriate functions for your tree type
getRoot :: Tree a -> Maybe a
getRoot Leaf = Nothing
getRoot (Node x _) = Just x
(Returning Maybe a is a simple way to handle failure with type safety. Think of the Nothing value as a polite cousin of null that doesn't explode with null reference exceptions.)
One thing that type classes are good at is in expressing data structure algorithm interfaces such as the ones you allude to. One of the most common classes is Functor, which provides a general interface for mapping over data structures.
instance Functor Tree where
fmap f Leaf = Leaf
fmap f (Node x ts) = Node (f x) (fmap (fmap f) ts)
fmap has the following polymorphic type:
fmap :: Functor f => (a -> b) -> f a -> f b
With your tree, it specialises to
fmap :: (a -> b) -> Tree a -> Tree b
and with lists (as in the outer fmap of fmap (fmap f) ts) it becomes
fmap :: (a -> b) -> [a] -> [b]
Finally, the Data.Tree module provides a data structure which looks a lot like what you want to define.
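To tie this back to the question's "take the root of a tree and all the values, and make a [Int]", here is one possible sketch with plain functions and no class. The name applyAll and its argument order are assumptions of mine; the scoring function plays the role of f from the question.

```haskell
data Tree a = Node a [Tree a] | Leaf deriving (Show, Eq)

instance Functor Tree where
  fmap _ Leaf        = Leaf
  -- Inner fmap maps over each subtree, outer fmap maps over the list of subtrees.
  fmap f (Node x ts) = Node (f x) (fmap (fmap f) ts)

getRoot :: Tree a -> Maybe a
getRoot Leaf       = Nothing
getRoot (Node x _) = Just x

-- Hypothetical helper: combine the root value with every param.
-- An empty tree has no root, so it yields no results.
applyAll :: (a -> p -> Int) -> Tree a -> [p] -> [Int]
applyAll f t params = case getRoot t of
  Nothing -> []
  Just x  -> map (f x) params
```

The params and the scoring function are ordinary arguments here, so the caller decides what "param" means without any type family being involved.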

How do I use Haskell's type system to enforce correctness while still being able to pattern-match?

Let's say that I have an adt representing some kind of tree structure:
data Tree = ANode (Maybe Tree) (Maybe Tree) AValType
| BNode (Maybe Tree) (Maybe Tree) BValType
| CNode (Maybe Tree) (Maybe Tree) CValType
As far as I know there's no way of pattern matching against type constructors (or the matching functions itself wouldn't have a type?) but I'd still like to use the compile-time type system to eliminate the possibility of returning or parsing the wrong 'type' of Tree node. For example, it might be that CNode's can only be parents to ANodes. I might have
parseANode :: Parser (Maybe Tree)
as a Parsec parsing function that gets used as part of my CNode parser:
parseCNode :: Parser (Maybe Tree)
parseCNode = try (
string "<CNode>" >>
parseANode >>= \maybeanodel ->
parseANode >>= \maybeanoder ->
parseCValType >>= \cval ->
string "</CNode>" >>
return (Just (CNode maybeanodel maybeanoder cval))
) <|> return Nothing
According to the type system, parseANode could end up returning a Maybe CNode, a Maybe BNode, or a Maybe ANode, but I really want to make sure that it only returns a Maybe ANode. Note that this isn't a schema-value of data or runtime-check that I want to do - I'm actually just trying to check the validity of the parser that I've written for a particular tree schema. IOW, I'm not trying to check parsed data for schema-correctness, what I'm really trying to do is check my parser for schema correctness - I'd just like to make sure that I don't botch-up parseANode someday to return something other than an ANode value.
I was hoping that maybe if I matched against the value constructor in the bind variable, that the type-inferencing would figure out what I meant:
parseCNode :: Parser (Maybe Tree)
parseCNode = try (
string "<CNode>" >>
parseANode >>= \(Just (ANode left right avall)) ->
parseANode >>= \(Just (ANode left right avalr)) ->
parseCValType >>= \cval ->
string "</CNode>" >>
return (Just (CNode (Just (ANode left right avall)) (Just (ANode left right avalr)) cval))
) <|> return Nothing
But this has a lot of problems, not the least of which that parseANode is no longer free to return Nothing. And it doesn't work anyways - it looks like that bind variable is treated as a pattern match and the runtime complains about non-exhaustive pattern matching when parseANode either returns Nothing or Maybe BNode or something.
I could do something along these lines:
data ANode = ANode (Maybe BNode) (Maybe BNode) AValType
data BNode = BNode (Maybe CNode) (Maybe CNode) BValType
data CNode = CNode (Maybe ANode) (Maybe ANode) CValType
but that kind of sucks because it assumes that the constraint is applied to all nodes - I might not be interested in doing that - indeed it might just be CNodes that can only be parenting ANodes. So I guess I could do this:
data AnyNode = AnyANode ANode | AnyBNode BNode | AnyCNode CNode
data ANode = ANode (Maybe AnyNode) (Maybe AnyNode) AValType
data BNode = BNode (Maybe AnyNode) (Maybe AnyNode) BValType
data CNode = CNode (Maybe ANode) (Maybe ANode) CValType
but then this makes it much harder to pattern-match against *Node's - in fact it's impossible because they're just completely distinct types. I could make a typeclass wherever I wanted to pattern-match I guess
class Node t where
matchingFunc :: t -> Bool
instance Node ANode where
matchingFunc (ANode left right val) = testA val
instance Node BNode where
matchingFunc (BNode left right val) = val == refBVal
instance Node CNode where
matchingFunc (CNode left right val) = doSomethingWithACValAndReturnABool val
At any rate, this just seems kind of messy. Can anyone think of a more succinct way of doing this?
I don't understand your objection to your final solution. You can still pattern match against AnyNodes, like this:
f (AnyANode (ANode x y z)) = ...
It's a little more verbose, but I think it has the engineering properties you want.
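A fuller sketch of what that looks like, using the types from the question. The three value types are placeholders I picked so the example compiles, and describe is a made-up function name.

```haskell
-- Placeholder value types (assumptions; the question does not define them).
type AValType = Int
type BValType = Bool
type CValType = String

data AnyNode = AnyANode ANode | AnyBNode BNode | AnyCNode CNode deriving (Show, Eq)
data ANode = ANode (Maybe AnyNode) (Maybe AnyNode) AValType deriving (Show, Eq)
data BNode = BNode (Maybe AnyNode) (Maybe AnyNode) BValType deriving (Show, Eq)
-- Only ANodes are allowed as children of a CNode.
data CNode = CNode (Maybe ANode) (Maybe ANode) CValType deriving (Show, Eq)

-- One function, pattern matching through the wrapper into each node type.
describe :: AnyNode -> String
describe (AnyANode (ANode _ _ v)) = "A:" ++ show v
describe (AnyBNode (BNode _ _ v)) = "B:" ++ show v
describe (AnyCNode (CNode _ _ v)) = "C:" ++ v
```

Trying to build CNode (Just (BNode Nothing Nothing True)) Nothing "x" is a type error, which is the compile-time guarantee asked for.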
I'd still like to use the compile-time type system to eliminate the possibility of returning or parsing the wrong 'type' of Tree node
This sounds like a use case for GADTs.
{-# LANGUAGE GADTs, EmptyDataDecls #-}
data ATag
data BTag
data CTag
data Tree t where
ANode :: Maybe (Tree t) -> Maybe (Tree t) -> AValType -> Tree ATag
BNode :: Maybe (Tree t) -> Maybe (Tree t) -> BValType -> Tree BTag
CNode :: Maybe (Tree t) -> Maybe (Tree t) -> CValType -> Tree CTag
Now you can use Tree t when you don't care about the node type, or Tree ATag when you do.
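Here is a sketch of how that could be used. Two things are assumptions on my part: the value types are placeholders, and I tightened CNode's children to Tree ATag to encode the "CNodes can only be parents to ANodes" rule from the question (the declaration above leaves them as Tree t). The function names are made up.

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}

-- Placeholder value types (assumptions; the question does not define them).
type AValType = Int
type BValType = Bool
type CValType = String

data ATag
data BTag
data CTag

data Tree t where
  ANode :: Maybe (Tree t) -> Maybe (Tree t) -> AValType -> Tree ATag
  BNode :: Maybe (Tree t) -> Maybe (Tree t) -> BValType -> Tree BTag
  -- Variation on the declaration above: children must be ANodes.
  CNode :: Maybe (Tree ATag) -> Maybe (Tree ATag) -> CValType -> Tree CTag

-- Only ANode can produce a Tree ATag, so this single equation covers every case.
aValue :: Tree ATag -> AValType
aValue (ANode _ _ v) = v

-- The children of a CNode are statically known to be ANodes.
leftAValue :: Tree CTag -> Maybe AValType
leftAValue (CNode l _ _) = fmap aValue l

-- When the node type does not matter, stay polymorphic in the tag.
nodeName :: Tree t -> String
nodeName (ANode _ _ _) = "ANode"
nodeName (BNode _ _ _) = "BNode"
nodeName (CNode _ _ _) = "CNode"
```

With this, a parser declared as Parser (Maybe (Tree ATag)) cannot return a BNode or CNode without a type error.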
An extension of keegan's answer: encoding the correctness properties of red/black trees is sort of a canonical example. This thread has code showing both the GADT and nested data type solution: http://www.reddit.com/r/programming/comments/w1oz/how_are_gadts_useful_in_practical_programming/cw3i9
