Why can't I compare tuples of arbitrary length in Haskell? - haskell

I know that there are predefined Eq instances for tuples of lengths 2 to 15.
Why aren't tuples defined as some kind of recursive datatype such that they can be decomposed, allowing a definition of a function for a compare that works with arbitrary length tuples?
After all, the compiler does support arbitrary length tuples.

You might ask yourself what the type of that generalized comparison function would be. First of all we need a way to encode the component types:
data Tuple ??? = Nil | Cons a (Tuple ???)
There is really nothing valid we can replace the question marks with. The conclusion is that a regular ADT is not sufficient, so we need our first language extension, GADTs:
data Tuple :: ??? -> * where
Nil :: Tuple ???
Cons :: a -> Tuple ??? -> Tuple ???
Yet we end up with question marks. Filling in the holes requires another two extensions, DataKinds and TypeOperators:
data Tuple :: [*] -> * where
Nil :: Tuple '[]
Cons :: a -> Tuple as -> Tuple (a ': as)
As you see we needed three type system extensions just to encode the type. Can we compare now? Well, it's not that straightforward to answer, because it's actually far from obvious how to write a standalone comparison function. Luckily the type class mechanism allows us to take a simple recursive approach. However, this time we are not just recursing on the value level, but also on the type level. Obviously empty tuples are always equal:
instance Eq (Tuple '[]) where
_ == _ = True
But the compiler complains again. Why? We need another extension, FlexibleInstances, because '[] is a concrete type. Now we can compare empty tuples, which isn't that compelling. What about non-empty tuples? We need to compare the heads as well as the rest of the tuple:
instance (Eq a, Eq (Tuple as)) => Eq (Tuple (a ': as)) where
Cons x xs == Cons y ys = x == y && xs == ys
Seems to make sense, but boom! We get another complaint. Now the compiler wants FlexibleContexts, because we have a not-fully-polymorphic type in the context, Tuple as.
That's a total of five type system extensions, three of them just to express the tuple type, and they didn't exist before GHC 7.4. The other two are needed for comparison. Of course there is a payoff. We get a very powerful tuple type, but because of all those extensions, we obviously can't put such a tuple type into the base library.

You can always rewrite any n-tuple in terms of binary tuples. For example, given the following 4-tuple:
(1, 'A', "Hello", 20)
You can rewrite it as:
(1, ('A', ("Hello", (20, ()))))
Think of it as a list, where (,) plays the role of (:) (i.e. "cons") and () plays the role of [] (i.e. "nil"). Using this trick, as long as you formulate your n-tuple in terms of a "list of binary tuples", then you can expand it indefinitely and it will automatically derive the correct Eq and Ord instances.

A type of compare is a -> a -> Ordering, which suggests that both of the inputs must be of the same type. Tuples of different arities are by definition different types.
You can however solve your problem by approaching it either with HLists or GADTs.

I just wanted to add to ertes' answer that you don't need a single extension to do this. The following code should be haskell98 as well as 2010 compliant. And the datatypes therein can be mapped one on one to tuples with the exception of the singleton tuple. If you do the recursion after the two-tuple you could also achieve that.
module Tuple (
TupleClass,
TupleCons(..),
TupleNull(..)
) where
class (TupleClassInternal t) => TupleClass t
class TupleClassInternal t
instance TupleClassInternal ()
instance TupleClassInternal (TupleCons a b)
data (TupleClassInternal b) => TupleCons a b = TupleCons a !b deriving (Show)
instance (Eq a, Eq b, TupleClass b) => Eq (TupleCons a b) where
(TupleCons a1 b1) == (TupleCons a2 b2) = a1 == a2 && b1 == b2
You could also just derive Eq. Of course it would look a bit cooler with TypeOperators but haskell's list system has syntactical sugar too.

Related

Error matching types: using MultiParamTypeClasses and FunctionalDependencies to define heterogeneous lists and a function that returns first element

There is a relevant question concerning Functional Dependencies used with GADTs. It turns out that the problem is not using those two together, since similar problems arise without the use of GADTs. The question is not definitively answered, and there is a debate in the comments.
The problem
I am trying to make a heterogeneous list type which contains its length in the type (sort of like a tuple), and I am having a compiling error when I define the function "first" that returns the first element of the list (code below). I do not understand what it could be, since the tests I have done have the expected outcomes.
I am a mathematician, and a beginner to programming and Haskell.
The code
I am using the following language extensions:
{-# LANGUAGE GADTs, MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
First, I defined natural numbers in the type level:
data Zero = Zero
newtype S n = S n
class TInt i
instance TInt Zero
instance (TInt i) => TInt (S i)
Then, I defined a heterogeneous list type, along with a type class that provides the type of the first element:
data HList a as where
EmptyList :: HList a as
HLCons :: a -> HList b bs -> HList a (HList b bs)
class First list a | list -> a
instance First (HList a as) a
And finally, I defined the Tuple type:
data Tuple list length where
EmptyTuple :: Tuple a Zero
TCons :: (TInt length) => a -> Tuple list length -> Tuple (HList a list) (S length)
I wanted to have the function:
first :: (First list a) => Tuple list length -> a
first EmptyTuple = error "first: empty Tuple"
first (TCons x _) = x
but it does not compile, with a long error that appears to be that it cannot match the type of x with a.
Could not deduce: a1 ~ a
from the context: (list ~ HList a1 list1, length ~ S length1,
TInt length1)
bound by a pattern with constructor:
TCons :: forall length a list.
TInt length =>
a -> Tuple list length -> Tuple (HList a list) (S length),
in an equation for ‘first’
[...]
Relevant bindings include
x :: a1 (bound at problem.hs:26:14)
first :: Tuple list length -> a (bound at problem.hs:25:1)
The testing
I have tested the type class First by defining:
testFirst :: (First list a) => Tuple list length -> a
testFirst = undefined
and checking the type of (testFirst x). For example:
ghci> x = TCons 'a' (TCons 5 (TCons "lalala" EmptyTuple))
ghci> :t (testFirst x)
(testFirst x) :: Char
Also this works as you would expect:
testMatching :: (Tuple (HList a as) n) -> a
testMatching (TCons x _) = x
"first" is basically these two combined.
The question
Am I attempting to do something the language does not support (maybe?), or have I stumbled on a bug (unlikely), or something else (most likely)?
Oh dear, oh dear. You have got yourself in a tangle. I imagine you've had to battle through several perplexing error messages to get this far. For a (non-mathematician) programmer, there would have been several alarm bells hinting it shouldn't be this complicated. I'll 'patch up' what you've got now; then try to unwind some of the iffy code.
The correction is a one-line change. The signature for first s/b:
first :: Tuple (HList a as) length -> a
As you figured out in your testing. Your testFirst only appeared to make the right inference because the equation didn't try to probe inside a GADT. So no, that's not comparable.
There's a code 'smell' (as us humble coders call it): your class First has only one instance -- and I can't see any reason it would have more than one. So just throw it away. All I've done with the signature for first is put the list's type from the First instance head direct into the signature.
Explanation
Suppose you wrote the equations for first without giving a signature. That gets rejected blah rigid type variable bound by a pattern with constructor ... blah. Yes GHC's error messages are completely baffling. What it means is that the LHS of first's equations pattern match on a GADT constructor -- TCons, and any constraints/types bound into it cannot 'escape' the pattern match.
We need to expose the types inside TCons in order to let the type pattern-matched to variable x escape. So TCons isn't binding just any old list; it's specifically binding something of the form (HList a as).
Your class First and its functional dependency was attempting to drive that typing inside the pattern match. But GADTs have been maliciously designed to not co-operate with FunDeps, so that just doesn't work. (Search SO for '[haskell] functional dependency GADT' if you want to be more baffled.)
What would work is putting an Associated Type inside class First; or building a stand-alone 'type family'. But I'm going to avoid leading a novice into advanced features.
Type Tuple not needed
As #JonPurdy points out, the 'spine' of your HList type -- that is, the nesting of HLConss is doing just the same job as the 'spine' of your TInt data structure -- that is, the nesting of Ss. If you want to know the length of an HList, just count the number of HLCons.
So also throw away Tuple and TInt and all that gubbins. Write first to apply to an HList. (Then it's a useful exercise to write a length-indexed access: nth element of an HList -- not forgetting to fail gracefully if the index points beyond its end.)
I'd avoid the advanced features #Jon talks about, until you've got your arms around GADTs. (It's perfectly possible to program over heterogeneous lists without using GADTs -- as did the pioneers around 2003, but I won't take you backwards.)
I was trying to figure out how to do "calculations" on the type level, there are more things I'd like to implement.
Ok ... My earlier answer was trying to make minimal changes to get your code going. There's quite a bit of duplication/unnecessary baggage in your O.P. In particular:
Both HList and Tuple have constructors for empty lists (also length Zero means empty list). Furthermore both those Emptys allege there's a type (variable) for the head and the tail of those empty (non-)lists.
Because you've used constructors to denote empty, you can't catch at the type level attempts to extract the first of an empty list. You've ended up with a partial function that calls error at run time. Better is to be able to trap first of an empty list at compile time, by rejecting the program.
You want to experiment with FunDeps for obtaining the type of the first (and presumably other elements). Ok then to be type-safe, prevent there being an instance of the class that alleges the non-existent head of an empty has some type.
I still want to have the length as part of the type, it is the whole point of my type.
Then let's express that more directly (I'll use fresh names here, to avoid clashes with your existing code.) (I'm going to need some further extensions, I'll explain those at the end.):
data HTuple list length where
MkHTuple :: (HHList list, TInt length) => list -> length -> HTuple list length
This is a GADT with a single constructor to pair the list with its length. TInt length and types Zero, S n are as you have already. With HHList I've followed a similar structure.
data HNil = HNil deriving (Eq, Show)
data HCons a as = HCons a as deriving (Eq, Show)
class HHList l
instance HHList HNil
instance HHList (HCons a as)
class (HHList list) => HFirst list a | list -> a where hfirst :: list -> a
-- no instance HFirst HNil -- then attempts to call hfirst will be type error
instance HFirst (HCons a as) a where hfirst (HCons x xs) = x
HNil, HCons are regular datatypes, so we can derive useful classes for them. Class HHList groups the two datatypes, as you've done with TInt.
HFirst with its FunDep for the head of an HHList then works smoothly. And no instance HFirst HNil because HNil doesn't have a first. Note that HFirst has a superclass constraint (HHList list) =>, saying that HFirst applies only for HHLists.
We'd like HTuple to be Eqable and Showable. Because it's a GADT, we must go a little around the houses:
{-# LANGUAGE StandaloneDeriving #-}
deriving instance (Eq list, Eq length) => Eq (HTuple list length)
deriving instance (Show list, Show length) => Show (HTuple list length)
Here's a couple of sample tuples; and a function to get the first element, going via the HFirst class and its method:
htEmpty = MkHTuple HNil Zero
htup1 = MkHTuple (HCons (1 :: Int) HNil) (S Zero)
tfirst :: (HFirst list a) => HTuple list length -> a -- using the FunDep
tfirst (MkHTuple list length) = hfirst list
-- > :set -XFlexibleContexts -- need this in your session
-- > tfirst htup1 ===> 1
-- > tfirst htEmpty ===> error: No instance for (HFirst HNil ...
But you don't want to be building tuples with all those explicit constructors. Ok, we could define a cons-like function:
thCons x (MkHTuple li le) = MkHTuple (HCons x li) (S le)
htup2 = thCons "Two" htup1
But that doesn't look like a constructor. Furthermore you really (I suspect) want something to both construct and destruct (pattern-match) a tuple into a head and a tail. Then welcome to PatternSynonyms (and I'm afraid quite a bit of ugly declaration syntax, so the rest of your code can be beautiful). I'll put the beautiful bits first: THCons looks just like a constructor; you can nest multiple calls; you can pattern match to get the first element.
htupA = THCons 'A' THEmpty
htup3 = THCons True $ THCons "bB" $ htupA
htfirst (THCons x xs) = x
-- > htfirst htup3 ===> True
{-# LANGUAGE PatternSynonyms, ViewPatterns, LambdaCase #-}
pattern THEmpty = MkHTuple HNil Zero -- simple pattern, spelled upper-case like a constructor
-- now the ugly
pattern THCons :: (HHList list, TInt length)
=> a -> HTuple list length -> HTuple (HCons a list) (S length)
pattern THCons x tup <- ((\case
{ (MkHTuple (HCons x li) (S le) )
-> (x, (MkHTuple li le)) } )
-> (x, tup) )
where
THCons x (MkHTuple li le) = MkHTuple (HCons x li) (S le)
Look first at the last line (below the where) -- it's just the same as for function thCons, but spelled upper case. The signature (two lines starting pattern THCons ::) is as inferred for thCons; but with explicit class constraints -- to make sure we can build a HTuple only from valid components; which we need to have guaranteed when deconstructing the tuple to get its head and a properly formed tuple for the rest -- which is all that ugly code in the middle.
Question to the floor: can that pattern decl be made less ugly?
I won't try to explain the ugly code; read ViewPatterns in the User Guide. That's the last -> just above the where.

Flattening tuples in Haskell

In Haskell we can flatten a list of lists Flatten a list of lists
For simple cases of tuples, I can see how we would flatten certain tuples, as in the following examples:
flatten :: (a, (b, c)) -> (a, b, c)
flatten x = (fst x, fst(snd x), snd(snd x))
flatten2 :: ((a, b), c) -> (a, b, c)
flatten2 x = (fst(fst x), snd(fst x), snd x)
However, I'm after a function that accepts as input any nested tuple and which flattens that tuple.
Can such a function be created in Haskell?
If one cannot be created, why is this the case?
No, it's not really possible. There are two hurdles to clear.
The first is that all the different sizes of tuples are different type constructors. (,) and (,,) are not really related to each other at all, except in that they happen to be spelled with a similar sequence of characters. Since there are infinitely many such constructors in Haskell, having a function which did something interesting for all of them would require a typeclass with infinitely many instances. Whoops!
The second is that there are some very natural expectations we naively have about such a function, and these expectations conflict with each other. Suppose we managed to create such a function, named flatten. Any one of the following chunks of code seems very natural at first glance, if taken in isolation:
flattenA :: ((Int, Bool), Char) -> (Int, Bool, Char)
flattenA = flatten
flattenB :: ((a, b), c) -> (a, b, c)
flattenB = flatten
flattenC :: ((Int, Bool), (Char, String)) -> (Int, Bool, Char, String)
flattenC = flatten
But taken together, they seem a bit problematic: flattenB = flatten can't possibly be type-correct if both flattenA and flattenC are! Both of the input types for flattenA and flattenC unify with the input type to flattenB -- they are both pairs whose first component is itself a pair -- but flattenA and flattenC return outputs with differing numbers of components. In short, the core problem is that when we write (a, b), we don't yet know whether a or b is itself a tuple and should be "recursively" flattened.
With sufficient effort, it is possible to do enough type-level programming to put together something that sometimes works on limited-size tuples. But it is 1. a lot of up-front effort, 2. very little long-term programming efficiency payoff, and 3. even at use sites requires a fair amount of boilerplate. That's a bad combo; if there's use-site boilerplate, then you might as well just write the function you cared about in the first place, since it's generally so short to do so anyway.

Simple dependent type example in Haskell for Dummies. How are they useful in practice in Haskell? Why should I care about dependent types ?

I hear a lot about dependent types nowadays and I heard that DataKinds is somehow related to dependent typing (but I am not sure about this... just heard it on a Haskell Meetup).
Could someone illustrate with a super simple Haskell example what dependent typing is and what is it good for ?
On wikipedia it is written that dependent types can help prevent bugs. Could you give a simple example about how dependent types in Haskell can prevent bugs?
Something that I could start using in five minutes right now to prevent bugs in my Haskell code?
Dependent types are basically functions from values to types, how can this be used in practice? Why is that good ?
Late to the party, this answer is basically a shameless plug.
Sam Lindley and I wrote a paper about Hasochism, the pleasure and pain of dependently typed programming in Haskell. It gives plenty of examples of what's possible now in Haskell and draws points of comparison (favourable as well as not) with the Agda/Idris generation of dependently typed languages.
Although it is an academic paper, it is about actual programs, and you can grab the code from Sam's repo. We have lots of little examples (e.g. orderedness of mergesort output) but we end up with a text editor example, where we use indexing by width and height to manage screen geometry: we make sure that components are regular rectangles (vectors of vectors, not ragged lists of lists) and that they fit together exactly.
The key power of dependent types is to maintain consistency between separate data components (e.g., the head vector in a matrix and every vector in its tail must all have the same length). That's never more important than when writing conditional code. The situation (which will one day come to be seen as having been ridiculously naïve) is that the following are all type-preserving rewrites
if b then t else e => if b then e else t
if b then t else e => t
if b then t else e => e
Although we are presumably testing b because it gives us some useful insight into what would be appropriate (or even safe) to do next, none of that insight is mediated via the type system: the idea that b's truth justifies t and its falsity justifies e is missing, despite being critical.
Plain old Hindley-Milner does give us one means to ensure some consistency. Whenever we have a polymorphic function
f :: forall a. r[a] -> s[a] -> t[a]
we must instantiate a consistently: however the first argument fixes a, the second argument must play along, and we learn something useful about the result while we are at it. Allowing data at the type level is useful because some forms of consistency (e.g. lengths of things) are more readily expressed in terms of data (numbers).
But the real breakthrough is GADT pattern matching, where the type of a pattern can refine the type of the argument it matches. You have a vector of length n; you look to see whether it's nil or cons; now you know whether n is zero or not. This is a form of testing where the type of the code in each case is more specific than the type of the whole, because in each case something which has been learned is reflected at the type level. It is learning by testing which makes a language dependently typed, at least to some extent.
Here's a silly game to play, whatever typed language you use. Replace every type variable and every primitive type in your type expressions with 1 and evaluate types numerically (sum the sums, multiply the products, s -> t means t-to-the-s) and see what you get: if you get 0, you're a logician; if you get 1, you're a software engineer; if you get a power of 2, you're an electronic engineer; if you get infinity, you're a programmer. What's going on in this game is a crude attempt to measure the information we're managing and the choices our code must make. Our usual type systems are good at managing the "software engineering" aspects of coding: unpacking and plugging together components. But as soon as a choice has been made, there is no way for types to observe it, and as soon as there are choices to make, there is no way for types to guide us: non-dependent type systems approximate all values in a given type as the same. That's a pretty serious limitation on their use in bug prevention.
The common example is to encode the length of a list in it's type, so you can do things like (pseudo code).
cons :: a -> List a n -> List a (n+1)
Where n is an integer. This let you specify that adding an object to list increment its length by one.
You can then prevent head (which give you the first element of a list) to be ran on empty list
head :: n > 0 => List a n -> a
Or do things like
to3uple :: List a 3 -> (a,a,a)
The problem with this type of approach is you then can't call head on a arbitrary list without having proven first that the list is not null.
Sometime the proof can be done by the compiler, ex:
head (a `cons` l)
Otherwise, you have to do things like
if null list
then ...
else (head list)
Here it's safe to call head, because you are in the else branch and therefore guaranteed that the length is not null.
However, Haskell doesn't do dependent type at the moment, all the examples have given won't work as nicely, but you should be able to declare this type of list using DataKind because you can promote a int to type which allow to instanciate List a b with List Int 1. (b is a phantom type taking a literal).
If you are interested in this type of safety, you can have a look a liquid Haskell.
Here is a example of such code
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}
import GHC.TypeLits
data List a (n:: Nat) = List [a] deriving Show
cons :: a -> List a n -> List a (n + 1)
cons x (List xs) = List (x:xs)
singleton :: a -> List a 1
singleton x = List [x]
data NonEmpty
data EmptyList
type family ListLength a where
ListLength (List a 0) = EmptyList
ListLength (List a n) = NonEmpty
head' :: (ListLength (List a n) ~ NonEmpty) => List a n -> a
head' (List xs) = head xs
tail' :: (ListLength (List a n) ~ NonEmpty) => List a n -> List a (n-1)
tail' (List xs) = List (tail xs)
list = singleton "a"
head' list -- return "a"
Trying to do head' (tail' list) doesn't compile and give
Couldn't match type ‘EmptyList’ with ‘NonEmpty’
Expected type: NonEmpty
Actual type: ListLength (List [Char] 0)
In the expression: head' (tail' list)
In an equation for ‘it’: it = head' (tail' list)
Adding to #mb14's example, here's some simpler working code.
First, we need DataKinds, GADTs, and KindSignatures to really make it clear:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTS #-}
{-# LANGUAGE KindSignatures #-}
Now let's define a Nat type, and a Vector type based on it:
data Nat :: * where
Z :: Nat
S :: Nat -> Nat
data Vector :: Nat -> * -> * where
Nil :: Vector Z a
(:-:) :: a -> Vector n a -> Vector (S n) a
And voila, lists using dependent types that can be called safe in certain circumstances.
Here are the head and tail functions:
head' :: Vector (S n) a -> a
head' (a :-: _) = a
-- The other constructor, Nil, doesn't apply here because of the type signature!
tail' :: Vector (S n) a -> Vector n a
tail (_ :-: xs) = xs
-- Ditto here.
This is a more concrete and understandable example than above, but does the same sort of thing.
Note that in Haskell, Types can influence values, but values cannot influence types in the same dependent ways. There are languages such as Idris that are similar to Haskell but also support value-to-type dependent typing, which I would recommend looking into.
The machines package lets users define machines that can request values. Many machines request only one type of value, but it's also possible to define machines that sometimes ask for one type and sometimes ask for another type. The requests are values of a GADT type, which allows the value of the request to determine the type of the response.
Step k o r = ...
| forall t . Await (t -> r) (k t) r
The machine provides a request of type k t for some unspecified type t, and a function to deal with the result. By pattern matching on the request, the machine runner learns what type it must supply the machine. The machine's response handler doesn't need to check that it got the right sort of response.

Is it possible to define a list recursively in Haskell?

In several programming languages (including JavaScript, Python, and Ruby), it's possible to place a list inside itself, which can be useful when using lists to represent infinitely-detailed fractals. However, I tried doing this in Haskell, and it did not work as I expected:
--aList!!0!!0!!1 should be 1, since aList is recursively defined: the first element of aList is aList.
main = putStrLn $ show $ aList!!0!!0!!1
aList = [aList, 1]
Instead of printing 1, the program produced this compiler error:
[1 of 1] Compiling Main ( prog.hs, prog.o )
prog.hs:3:12:
Occurs check: cannot construct the infinite type: t0 = [t0]
In the expression: aList
In the expression: [aList, 1]
In an equation for `aList': aList = [aList, 1]
Is it possible to put an list inside itself in Haskell, as I'm attempting to do here?
No, you can't. First off, there's a slight terminological confusion: what you have there are lists, not arrays (which Haskell also has) , although the point stands either way. So then, as with all things Haskell, you must ask yourself: what would the type of aList = [aList, 1] be?
Let's consider the simpler case of aList = [aList]. We know that aList must be a list of something, so aList :: [α] for some type α. What's α? As the type of the list elements, we know that α must be the type of aList; that is, α ~ [α], where ~ represents type equality. So α ~ [α] ~ [[α]] ~ [[[α]]] ~ ⋯ ~ [⋯[α]⋯] ~ ⋯. This is, indeed, an infinite type, and Haskell forbids such things.
In the case of the value aList = [aList, 1], you also have the restriction that 1 :: α, but all that that lets us conclude is that there must be a Num α constraint (Num α => [⋯[α]⋯]), which doesn't change anything.
The obvious next three questions are:
Why do Haskell lists only contain one type of element?
Why does Haskell forbid infinite types?
What can I do about this?
Let's tackle those in order.
Number one: Why do Haskell lists only contain one type of element? This is because of Haskell's type system. Suppose you have a list of values of different types: [False,1,2.0,'c']. What's the type of the function someElement n = [False,1,2.0,'c'] !! n? There isn't one, because you couldn't know what type you'd get back. So what could you do with that value, anyway? You don't know anything about it, after all!
Number two: Why does Haskell forbid infinite types? The problem with infinite types is that they don't add many capabilities (you can always wrap them in a new type; see below), and they make some genuine bugs type-check. For example, in the question "Why does this Haskell code produce the ‘infinite type’ error?", the non-existence of infinite types precluded a buggy implementation of intersperse (and would have even without the explicit type signature).
Number three: What can I do about this? If you want to fake an infinite type in Haskell, you must use a recursive data type. The data type prevents the type from having a truly infinite expansion, and the explicitness avoids the accidental bugs mentioned above. So we can define a newtype for an infinitely nested list as follows:
Prelude> newtype INL a = MkINL [INL a] deriving Show
Prelude> let aList = MkINL [aList]
Prelude> :t aList
aList :: INL a
Prelude> aList
MkINL [MkINL [MkINL [MkINL ^CInterrupted.
This got us our infinitely-nested list that we wanted—printing it out is never going to terminate—but none of the types were infinite. (INL a is isomorphic to [INL a], but it's not equal to it. If you're curious about this, the difference is between isorecursive types (what Haskell has) and equirecursive types (which allow infinite types).)
But note that this type isn't very useful; the only lists it contains are either infinitely nested things like aList, or variously nested collections of the empty list. There's no way to get a base case of a value of type a into one of the lists:
Prelude> MkINL [()]
<interactive>:15:8:
Couldn't match expected type `INL a0' with actual type `()'
In the expression: ()
In the first argument of `MkINL', namely `[()]'
In the expression: MkINL [()]
So the list you want is an arbitrarily nested list. The 99 Haskell Problems has a question about these, which requires defining a new data type:
data NestedList a = Elem a | List [NestedList a]
Every element of NestedList a is either a plain value of type a, or a list of more NestedList as. (This is the same thing as an arbitrarily-branching tree which only stores data in its leaves.) Then you have
Prelude> data NestedList a = Elem a | List [NestedList a] deriving Show
Prelude> let aList = List [aList, Elem 1]
Prelude> :t aList
aList :: NestedList Integer
Prelude> aList
List [List [List [List ^CInterrupted.
You'll have to define your own lookup function now, and note that it will probably have type NestedList a -> Int -> Maybe (NestedList a)—the Maybe is for dealing with out-of-range integers, but the important part is that it can't just return an a. After all, aList ! 0 is not an integer!
Yes. If you want a value that contains itself, you'll need a type that contains itself. This is no problem; for example, you might like rose trees, defined roughly like this in Data.Tree:
data Tree a = Node a [Tree a]
Now we can write:
recursiveTree = Node 1 [recursiveTree]
This isn't possible with the list type in Haskell, since each element has to be of the same type, but you could create a data type to do it. I'm not exactly sure why you'd want to, though.
data Nested a
= Value a
| List [Nested a]
deriving (Eq, Show)
nested :: Nested Int
nested = List [nested, Value 1]
(!) :: Nested a -> Int -> Nested a
(!) (Value _) _ = undefined
(!) (List xs) n = xs !! n
main = print $ nested ! 0 ! 0 ! 1
This will print out Value 1, and this structure could be of some use, but I'd imagine it's pretty limited.
There were several answers from "yes you can" to "no you absolutely cannot". Well, both are right, because all of them address different aspects of your question.
One other way to add "array" to itself, is permit a list of anything.
{-# LANGUAGE ExistentialQuantification #-}
data T = forall a. T a
arr :: [T]
arr = [T arr, T 1]
So, this adds arr to itself, but you cannot do anything else with it, except prove it is a valid construct and compile it.
Since Haskell is strongly typed, accessing list elements gives you T, and you could extract the contained value. But what is the type of that value? It is "forall a. a" - can be any type, which in essence means there are no functions at all that can do anything with it, not even print, because that would require a function that can convert any type a to String. Note that this is not specific to Haskell - even in dynamic languages the problem exists; there is no way to figure out the type of arr !! 1, you only assume it is a Int. What makes Haskell different to that other language, is that it does not let you use the function unless you can explain the type of the expression.
Other examples here define inductive types, which is not exactly what you are asking about, but they show the tractable treatment of self-referencing.
And here is how you could actually make a sensible construct:
{-# LANGUAGE ExistentialQuantification #-}
data T = forall a. Show a => T a
instance Show T where -- this also makes Show [T],
-- because Show a => Show [a] is defined in standard library
show (T x) = show x
arr :: [T]
arr = [T arr, T 1]
main = print $ arr !! 1
Now the inner value wrapped by T is restricted to be any instance of Show ("implementation of Show interface" in OOP parlance), so you can at least print the contents of the list.
Note that earlier we could not include arr in itself only because there was nothing common between a and [a]. But the latter example is a valid construct once you can determine what's the common operation that all the elements in the list support. If you can define such a function for [T], then you can include arr in the list of itself - this function determines what's common between certain kinds of a and [a].
No. We could emulate:
data ValueRef a = Ref | Value a deriving Show
lref :: [ValueRef Int]
lref = [Value 2, Ref, Value 1]
getValue :: [ValueRef a] -> Int -> [ValueRef a]
getValue lref index = case lref !! index of
Ref -> lref
a -> [a]
and have results:
>getValue lref 0
[Value 2]
>getValue lref 1
[Value 2,Ref,Value 1]
Sure, we could reuse Maybe a instead of ValueRef a

Type-conditional controls in Haskell

I'm going through the 99 Haskell problems to build my proficiency with the language. On problem 7 ("Flatten a nested list structure"), I found myself wanting to define a conditional behavior based on the type of argument passed to a function. That is, since
*Main> :t 1
1 :: (Num t) => t
*Main> :t [1,2]
[1,2] :: (Num t) => [t]
*Main> :t [[1],[2]]
[[1],[2]] :: (Num t) => [[t]]
(i.e. lists nested at different levels have different data types) it seems like I should be able to write a function that can read the type of the argument, and then behave accordingly. My first attempt was along these lines:
listflatten l = do
if (:t l) /= ((Num t) => [t]) then
listflatten (foldl (++) [] l)
else id l
But when I try to do that, Haskell returns a parse error. Is Haskell flexible enough to allow this sort of type manipulation, do I need to find another way?
1. Use pattern matching instead
You can solve that problem without checking for data types dynamically. In fact, it is very rarely needed in Haskell. Usually you can use pattern matching instead.
For example, if you have a type
data List a = Elem a | Nested [List a]
you can pattern match like
flatten (Elem x) = ...
flatten (Nested xs) = ...
Example:
data List a = Elem a | Nested [List a]
deriving (Show)
nested = Nested [Elem 1, Nested [Elem 2, Elem 3, Nested [Elem 4]], Elem 5]
main = print $ flatten nested
flatten :: List a -> [a]
flatten (Elem x) = [x]
flatten (Nested lists) = concat . map flatten $ lists
map flatten flattens every inner list, thus it behaves like [List a] -> [[a]], and we produce a list of lists here. concat merges all lists together (concat [[1],[2,3],[4]] gives [1,2,3,4]). concat . map flatten is the same as concatMap flatten.
2. To check types dynamically, use Data.Typeable
And if on some rare occasion (not in this problem) you really need to check types dynamically, you can use Data.Typeable type class and its typeOf function. :t works only in GHCI, it is not part of the language.
ghci> :m + Data.Typeable
ghci> typeOf 3 == typeOf "3"
False
ghci> typeOf "a" == typeOf "b"
True
Likely, you will need to use DeriveDataTypeable extension too.
(Sorry about the length—I go a little bit far afield/in excessive depth. The CliffsNotes version is "No, you can't really do what you want because types aren't values and we can't give your function a sensible type; use your own data type.". The first and the fifth paragraph, not counting this one or the code block, explain the core of what I mean by that first part, and the rest of the answer should provide some clarification/detail.)
Roughly speaking, no, this is not possible, for two reasons. The first is the type-dispatch issue. The :t command is a feature (an enormously useful one) of GHCi, and isn't a Haskell function. Think about why: what type would it have? :t :: a -> ?? Types themselves aren't values, and thus don't have a type. It's two different worlds. So the way you're trying to do this isn't possible. Also note that you have a random do. This is bad—do notation is a syntactic sugar for monadic computation, and you aren't doing any of that. Get rid of it!
Why is this? Haskell has two kinds polymorphism, and the one we're concerned with at the moment is parametric polymorphism. This is what you see when you have a type like concat :: [[a]] -> a. That a says that one single definition of concat must be usable for every possible a from now until the end of time. How on earth would you type flatten using this scheme? It's just not possible.
You're trying to call a different function, defined ad-hoc, for different kinds of data. This is called, shockingly, ad-hoc polymorphism. For instance, in C++, you could define the following function:
template <typename T>
void flatten(vector<T>& v) { ... }
template <typename T>
void flatten(vector< vector<T> >& v) { ... }
This would allow you do different things for different types. You could even have template <> void flatten(int) { ... }! You can accomplish this in Haskell by using type classes such as Num or Show; the whole point of a type signature like Show a => a -> String is that a different function can be called for different as. And in fact, you can take advantage of this to get a partial solution to your problem…but before we do, let's look at the second problem.
This issue is with the list you are trying to feed in. Haskell's list type is defined as (roughly) data [a] = [] | a : [a]. In other words, every element of a list must have the same type; a list of ints, [Int], contains only ints, Int; and a list of lists of ints, [[Int]], contains only lists of ints, [Int]. The structure [1,2,[3,4],5] is illegal! Reading your code, I think you understand this; however, there's another ramification. For similar reasons, you can't write a fully-generic flatten function of type flatten :: [...[a]...] -> [a]. Your function also has to be able to deal with arbitrary nesting depth, which still isn't possible with a list. You need [a], [[a]], and so on to all be the same type!
Thus, to get all of the necessary properties, you want a different type. The type you want has a different property: it contains either nothing, a single element followed by the rest of the value, or a nested list of elements followed by the rest of the value. In other words, something like
data NList a = Nil
| a :> NList a
| (NList a) :>> NList a
deriving (Eq, Show)
infixr 5 :>, :>>
Then, instead of the list [1,2,3] == 1 : 2 : 3 : [], you would write 1 :> 2 :> 3 :> Nil; instead of Lisp's (1 (2 3) 4 ()), you would write
1 :> (2 :> 3 :> Nil) :>> 4 :> Nil :>> Nil. You can even begin to define functions to manipulate it:
nhead :: NList a -> Either a [a]
nhead Nil = error "nhead: Empty NList."
nhead (h :> _) = Left a
nhead (h :>> _) = Right a
ntail :: NList a -> NList a
ntail Nil = error "nhead: Empty NList."
ntail (_ :> t) = t
ntail (_ :>> t) = t
Admittedly, you might find this a bit clunky (or perhaps not), so you might try to think about your type differently. Another option, which the Haskell translation of the 99 problems uses, is to realize that everything in a nested list is either a single item or a list of nested lists. This translation gives you
data NestedList a = Elem a
| List [NestedList a]
deriving (Eq, Show)
The two above lists then become List [Elem 1, Elem 2, Elem 3] and List [Elem 1, List [Elem 2, Elem 3], Elem 4, List []]. As for how to flatten them—since you're trying to learn from the 99 problems, that I won't say :) And after all, you seem to have a handle on that part of the problem.
Now, let's return to type classes. I lied a bit when I said that you couldn't write something which took an arbitrarily-nested list—you can, in fact, using type classes and some GHC extensions. Now, before I continue, I should say: don't use this! Seriously. The other technique is almost definitely a better choice. However, this technique is cool, and so I will present it here. Consider the following code:
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, UndecidableInstances #-}
class Flattenable f e where
flatten :: f -> [e]
instance Flattenable a a where
flatten = return
instance Flattenable f e => Flattenable [f] e where
flatten = concatMap flatten
We are creating a type class whose instances are the things we can flatten. If we have Flattenable f e, then f should be a collection, in this case a list, whose elements are ultimately of type e. Any single object is such a collection, and its element type is itself; thus, the first instance declaration allows us to flatten anything into a singleton list. The second instance declaration says that if we can flatten an f into a list of es, then we can also flatten a list of fs into a list of es by flattening each f and sticking the resulting lists together. This recursive class definition defines the function recursively for the nested list types, giving you the ability to flatten a list of any nesting with the single function flatten: [1,2,3], [[4,5],[6]], [[[7,8],[9]],[[10]],[[11],[12]]], and so on.
However, because of the multiple instances and such, it does require a single type annotation: you will need to write, for instance, flatten [[True,False],[True]] :: [Bool]. If you have something that's type class-polymorphic within your lists, then things are a little stricter; you need to write flatten [[1],[2,3 :: Int]] :: [Int], and as far as I can tell, the resulting list cannot be polymorphic itself. (However, I could well be wrong about this last part, as I haven't tried everything by any means.) For a similar reason, this is too open—you could declare instance Flattenable [f] () where flatten = [()] if you wanted too. I tried to get things to work with type families/functional dependencies in order to remove some of these problems, but thanks to the recursive structure, couldn't get it to work (I had no e and a declaration along the lines of type Elem a = a and type Elem [f] = Elem f, but these conflicted since [f] matches a). If anyone knows how, I'd very much like to see it!
Again, sorry about the length—I tend to start blathering when I get tired. Still, I hope this is helpful!
You are confusing the interactive command :t in the interpreter with a built-in function. You cannot query the type at runtime.
Look at the example for that problem:
flatten (List [Elem 1, List [Elem 2, List [Elem 3, Elem 4], Elem 5]])
As you see, the problem wants you to create your own data structure for arbitrarily nested lists.
Normal haskell lists can not be arbitrarily nested. Every element of the list has to have the same type, statically known, which is why it makes no sense to check the type of the elements dynamically.
In general haskell does not allow you to create a list of different types and then check the type at runtime. You could use typeclasses to define different behaviors for flatten with different types of arguments, but that still wouldn't give you arbitrarily nested lists.

Resources