Create type with double symbol - haskell

The List type is created with
data [] a = [] | a : [a]
But I can't create my own type with the same structure:
data %% a = %% | a : %a%
error: parse error on input `%%'

Quoting the question: "The List type is created with
data [] a = [] | a : [a]"
No, it isn't. If you look at the source (for GHC; other compilers may do it differently), it says
data [] a = MkNil
but this is just a marker for the compiler (not even this, see chepner's comment). This is because
data [] a = [] | a : [a]
isn't legal syntax in Haskell.
What is true is that list works as if it were defined this way: it's entirely equivalent to
data List a = Nil | Cons a (List a)
except for the names.
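To make the equivalence concrete, here is a minimal sketch. The names List, Nil, Cons, toBuiltin and fromBuiltin are illustrative choices, not anything defined by GHC. The two functions convert between the user-defined type and the built-in list and are inverses of each other.

```haskell
-- A user-defined list with the same shape as the built-in one.
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- Convert to the built-in list: Nil plays the role of [], Cons of (:).
toBuiltin :: List a -> [a]
toBuiltin Nil         = []
toBuiltin (Cons x xs) = x : toBuiltin xs

-- Convert back from the built-in list.
fromBuiltin :: [a] -> List a
fromBuiltin = foldr Cons Nil
```

For instance, toBuiltin (Cons 1 (Cons 2 Nil)) gives [1,2], and fromBuiltin turns that back into the original value.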

Type and constructor names must either be alphanumeric names, starting with uppercase
data MyType a b = K a | L b a
or be symbolic infix operators, starting with :
data a :** b = K a | b :+-& a
Both types above are perfectly isomorphic: we only replaced MyType with the infix :** and L with the infix :+-&.
Also note that a constructor written in infix position is binary, i.e. it takes two arguments. Alphanumeric names have no such constraint (e.g. K above takes only one argument).
List syntax [] is specially handled by the compiler, similarly to (,),(,,),... for tuples. Only : follows the general rule (perhaps incidentally).
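As a sketch of how the symbolic version is used in practice: in GHC the infix type name needs the TypeOperators extension (an assumption about the compiler setup; newer language editions may enable it by default). The names example1, example2 and firstComponent are hypothetical and only there for illustration.

```haskell
{-# LANGUAGE TypeOperators #-}

-- An infix type constructor (:**) with one prefix and one infix data constructor.
data a :** b = K a | b :+-& a
  deriving (Eq, Show)

-- The prefix constructor K is applied like any function.
example1 :: Int :** Bool
example1 = K 1

-- The infix constructor sits between its two arguments.
example2 :: Int :** Bool
example2 = True :+-& 2

-- Pattern matching uses the same infix syntax.
firstComponent :: (a :** b) -> a
firstComponent (K x)      = x
firstComponent (_ :+-& x) = x
```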

Related

A curried version of union for a fold?

I have this simple version of a set union
union (s, []) = s
union (s, t:ts) | member (t,s) = union (s,ts)
                | otherwise = t : union (s,ts)
where member is a version of elem. But this, which I found in a textbook, says the union must be a curried union
uniteElems [] = []
uniteElems (h:t) = union h (uniteElems t)
This has me confused. As the text says, uniteElems is a sort of proto-fold (foldr?), i.e., it should take a single list and recursively apply union, thereby weeding out the duplicates. So is this my "curried version" of a union
union2 s [] = s
union2 s (t:ts) | elem t s = union2 s ts
                | otherwise = union2 (t:s) ts
In any event, this union2 doesn't work as-is in uniteElems, giving an error.
* Non type-variable argument in the constraint: Num [a]
  (Use FlexibleContexts to permit this)
* When checking the inferred type
    it :: forall a. (Eq a, Num [a]) => [a]
This version, however, does work
uniteElems [] = []
uniteElems (h:t) = union2 [h] (uniteElems t)
but I've kludged in the [h]. What would make uniteElems, a sort of proto-fold, work? A related question, I suppose, is how a union function should look for regular use with foldr.
There seems to be some confusion in your code over the desired meaning (purpose, or semantics) of your various functions. Related to that, there is definitely some confusion over the types of the functions.
Let's start with uniteElems. It seems to have type Eq a => [a] -> [a] along with some documented guarantee that the input list is a regular old list while the output list is actually a set. It relies on there existing a function union :: Eq a => a -> [a] -> [a]. Looking at the definition of uniteElems, it seems that this union function should take an element and a set (represented as a list) and add the element to the set.
Now, let's look at your definition of union2 (or similar for union, just uncurried). It has type Eq a => [a] -> [a] -> [a], and its meaning seems to be that if it is given two lists that represent sets, it will return a list representing the union of the two sets.
Right away, it seems pretty clear what the problem is: the union expected in uniteElems and the union2 you have defined are fundamentally different! One is adding a single element to a set and the other is combining two whole sets.
Now, it also seems clear why your proposed modification to uniteElems works: basically, you're taking each individual element, turning it into a singleton set, and then using your union2 function to combine those sets.
In order to make uniteElems do what you want, you need a simpler definition of union, like:
union :: Eq a => a -> [a] -> [a]
union t s | elem t s = s
          | otherwise = t:s
As for folds and proto-folds, uniteElems is a proto-fold because it is essentially a fold where some of the arguments have been supplied. That is, you could write:
uniteElems = foldr union []
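Putting the pieces together, a minimal self-contained sketch might look like this. The primed name uniteElems' is introduced here only so both versions can live in one file. The explicit signature on the fold version also sidesteps the monomorphism restriction, which would otherwise reject a signature-less point-free definition in a compiled module.

```haskell
-- Add a single element to a set represented as a list without duplicates.
union :: Eq a => a -> [a] -> [a]
union t s
  | t `elem` s = s
  | otherwise  = t : s

-- Explicit recursion, as in the textbook.
uniteElems :: Eq a => [a] -> [a]
uniteElems []      = []
uniteElems (h : t) = union h (uniteElems t)

-- The same function written as a fold over the input list.
uniteElems' :: Eq a => [a] -> [a]
uniteElems' = foldr union []
```

Both versions map [1,2,1,3,2] to [1,3,2]: because the fold works from the right, the last occurrence of each duplicate is the one that is kept.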

Type casting when working with nested data structures

I have the following data structures defined:
data Operator = Plus | Times | Minus deriving (Eq,Show)
data Variable = A | B | C deriving (Eq,Show)
newtype Const = D Numeral deriving (Eq,Show)
data CVO = Const | Variable | Operator deriving (Eq,Show)
type Expr = [CVO]
I have defined the following function:
eval2 :: Expr -> Integer
eval2 x = helper x
I would like to check if an element of the CVO list (Expr) is either an instance of Const, Variable or Operator (this works) and I would like to implement varying code for the specific type of the instance (e.g. Plus, Times, Minus for Operator).
helper :: Expr -> Integer
helper [] = 2
helper (x:xs)
  | x == Operator && x == Plus = 1
I cannot compare x to Plus, because it expects x to be of type CVO.
Couldn't match expected type ‘CVO’ with actual type ‘Operator’
Is it somehow possible to cast x to be an instance of Operator in order to do the comparison?
A value can't have two different types at the same time. If x is a CVO you can't use == to compare it to Plus which is an Operator.
At the moment the type CVO consists of three constant values called Const, Variable and Operator. I'm guessing you actually wanted it to contain values of the type Const, Variable or Operator. You do that by declaring arguments to the constructors.
data CVO = Const Const   -- a constructor whose name is Const and contains a value of type Const
         | Var Variable  -- a constructor named Var containing a Variable
         | Op Operator   -- a constructor named Op containing an Operator
A given value of type CVO must have been built from one of those three constructors, containing a value of the correct type. You can test which constructor was used to create the CVO, and simultaneously unpack the value, using pattern matching. Something like this:
helper :: Expr -> Integer
helper [] = 0
helper (Op o:xs) -- because we matched Op, we know o :: Operator
  | o == Plus = 1
  | otherwise = 2
helper _ = 3
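For reference, here is a self-contained sketch of the whole thing. The question never shows its Numeral type, so Integer stands in for it here (an assumption). This variant also matches on Op Plus directly with a nested pattern instead of using a guard with ==; the result codes are the same arbitrary ones as above.

```haskell
data Operator = Plus | Times | Minus deriving (Eq, Show)
data Variable = A | B | C deriving (Eq, Show)

-- Integer stands in for the question's Numeral type (an assumption).
newtype Const = D Integer deriving (Eq, Show)

-- Each constructor wraps a value of the corresponding type.
data CVO
  = Const Const
  | Var Variable
  | Op Operator
  deriving (Eq, Show)

type Expr = [CVO]

helper :: Expr -> Integer
helper []            = 0
helper (Op Plus : _) = 1  -- nested pattern: constructor Op holding Plus
helper (Op _ : _)    = 2  -- any other operator
helper _             = 3  -- a constant or a variable comes first
```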

What's the difference between algebraic data types and base types?

I understand what G.A.D.T's are, but what is the difference between G.A.D.T's and base types (in Haskell, or elsewhere)?
I'm not sure if you mean regular data declarations vs. types like Int or generalized algebraic data types using the GADTs extension, so if this doesn't answer your question then please clarify.
Normal data declarations let you create types that are a combination of products (this and that) and sums (this or that).
Some examples are:
data Color = Red | Green | Blue -- a sum type
data Person = Person { name :: String, age :: Int } -- a product type
data Address = Mail Street City Country | Email String -- both!
GADTs allow you to be more specific about the type of each constructor. Here's my favorite example:
-- helper types to encode natural numbers
data Z -- zero
data S n -- successor
-- a list that encodes its size in its type
data List a n where
  Nil :: List a Z
  Cons :: a -> List a n -> List a (S n)
-- head that cannot be called on an empty list!
head :: List a (S n) -> a
head (Cons h _) = h
-- tail that cannot be called on an empty list!
tail :: List a (S n) -> List a n
tail (Cons _ t) = t
Note that we cannot do this trick with normal data declarations like
data List a n = Nil | Cons a (List a n)
because there's no way to specify that Nil's type is List a Z and that Cons increments the size of the list by one.
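To try the GADT version out, something like the following should load in GHC. It assumes the GADTs extension is switched on and hides the Prelude's own head and tail so the names do not clash; toList is a hypothetical helper added here just to inspect results.

```haskell
{-# LANGUAGE GADTs #-}

import Prelude hiding (head, tail)

-- Type-level natural numbers; these types have no values.
data Z
data S n

-- A list whose length is tracked in its second type parameter.
data List a n where
  Nil  :: List a Z
  Cons :: a -> List a n -> List a (S n)

head :: List a (S n) -> a
head (Cons h _) = h

tail :: List a (S n) -> List a n
tail (Cons _ t) = t

-- Forget the length index and convert to an ordinary list.
toList :: List a n -> [a]
toList Nil        = []
toList (Cons h t) = h : toList t

-- head Nil  -- rejected by the type checker: Z does not match S n
```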

parse error on input ‘::’ when making an explicit type definition for a function which accepts multiple arguments

I am working on a project in Haskell for the first time and am translating my ADT into code. However, when I write explicit type signatures for my functions and load my code in GHCi, I get the following error:
parse error on input ‘::’
The line in question is for a function called type which accepts a character and a tuple and returns a tuple as shown below:
type :: validChars -> tuple -> tuple
where validChars is the list of valid characters, the definitions for my lists are shown here if this helps:
tuple = (l, r, b, k)
l = [l | l <- validChars]
m = [m | m <- validChars]
b = [b | b <- validChars]
k = [k | k <- validChars]
validChars = [ chr c | c <-alphanumericChars , c >= 32, c <= 122]
alphanumericChars = [ a | a <- [0..255]]
I checked to make sure it wasn't validChars causing the error by replacing it with the Chars type as shown:
type :: Chars -> tuple -> tuple
But I still get the same error. I am a complete beginner at Haskell, so I'm probably missing something important, but I am not sure what that would be exactly. I've looked at answers to similar questions, but have been unsuccessful thus far. Any help with this would be appreciated.
type is a keyword in Haskell, so you can't use it as the name of your function.
Furthermore type names start with a capital letter in Haskell and anything starting with a lower case letter in a type is a type variable. So if you define myFunction :: validChars -> tuple -> tuple, that defines a function that takes two arguments of arbitrary types and produces a result of the same type as the second argument. It's the same as myFunction :: a -> b -> b.
If you write myFunction :: Chars -> tuple -> tuple, you get a function whose first argument needs to be of type Chars (which needs to exist) and the second argument is of an arbitrary type that is also the type of the result. Again it's the same as myFunction :: Chars -> a -> a.
Note that for this to work, you'll actually have to have defined a type named Chars somewhere. If you want to take a list of characters, the type should be [Char] (also known as String) instead.
And if you want the second argument and result to actually be tuples (rather than just a type variable arbitrarily named tuple), you need to specify a tuple type like (a,b,c,d), which would accept arbitrary 4-tuples, or something specific like (Integer, String, String, String), which would accept 4-tuples containing an Integer and three Strings.
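As a rough sketch of what a working signature could look like: typeChar is a hypothetical replacement name for the function, State is a hypothetical synonym for a 4-tuple of strings, and the function body is only a placeholder, since the question does not say what the function should do. Note that the type keyword does have a legitimate use here, namely declaring the synonym.

```haskell
import Data.Char (chr)

-- The character range used in the question (codes 32 to 122).
validChars :: [Char]
validChars = [chr c | c <- [32 .. 122]]

-- A concrete 4-tuple type instead of the type variable `tuple`.
type State = (String, String, String, String)

-- typeChar is a hypothetical name; `type` itself is a reserved word.
-- Placeholder behaviour: append a valid character to the first component.
typeChar :: Char -> State -> State
typeChar c (l, m, b, k)
  | c `elem` validChars = (l ++ [c], m, b, k)
  | otherwise           = (l, m, b, k)
```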

Annotating Nested ADT in Haskell with Additional Data

While writing a compiler in Haskell, I have come across a particular issue a couple times now while working with nested data types. I will have an ADT defined something like
data AST = AST [GlobalDecl]
data GlobalDecl = Func Type Identifier [Stmt] | ...
data Stmt = Assign Identifier Exp | ...
data Exp = Var Identifier | ...
While performing some transformation on the AST, I might want to briefly carry around some extra data with variables that are used within an expression. All of the options for doing this that I have considered so far seem to be fairly awkward. I could make a new data type:
data Exp' = Var' Identifier ExtraInfo | ...
but this means I would need new definitions Stmt', GlobalDecl', and so on in order to form the slightly changed AST'. Another option is to add another data constructor to the original Exp, but only use it in that one particular part of the program:
data Exp = Var Identifier | Var' Identifier ExtraInfo | ...
If you do this, the typechecker can no longer prevent you from mistakenly using Var' in some other part of the program.
A third option would be to simply keep the extra information around all the time, even though it has no relevance to the rest of the program:
data Exp = Var Identifier ExtraInfo | ...
Doable, but it's ugly, particularly if you only need the extra information briefly. For now I have just been putting the extra info in a Map Identifier ExtraInfo and carrying it around with the AST, either explicitly or via the state monad. This can get awkward fast if, for instance, you need to annotate different occurrences of the same Identifier with different info.
Does anyone have any elegant techniques for annotating nested data types?
One option to tag a structure with extra data is to use a higher kinded type parameter. If you only ever need to tag variables, you can do e.g.
data AST f = AST [GlobalDecl f]
data GlobalDecl f = Func Type Identifier [Stmt f] | ...
data Stmt f = Assign Identifier (Exp f) | ...
data Exp f = Var (f Identifier) | ...
This is similar to what Peter suggested, but instead of making the types fully generic it only parameterizes the part you want to vary.
You'll get your original, untagged structure with AST Identity or you can have a type like AST ((,) ExtraInfo) which would turn Var (f Identifier) into Var (ExtraInfo, Identifier).
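A cut-down sketch of how this might be used follows. Only Stmt and Exp are modelled; the Lit constructor, the ExtraInfo definition and the functions annotate and strip are hypothetical stand-ins for whatever the real compiler needs. Identity comes from Data.Functor.Identity in base.

```haskell
import Data.Functor.Identity (Identity (..))

type Identifier = String

-- Hypothetical annotation type, for illustration only.
newtype ExtraInfo = ExtraInfo Int deriving (Eq, Show)

-- Only the variable position is wrapped in f.
data Exp f = Var (f Identifier) | Lit Int
data Stmt f = Assign Identifier (Exp f)

-- Attach information to every variable, using a caller-supplied lookup.
annotate :: (Identifier -> ExtraInfo) -> Stmt Identity -> Stmt ((,) ExtraInfo)
annotate info (Assign x e) = Assign x (go e)
  where
    go (Var (Identity v)) = Var (info v, v)
    go (Lit n)            = Lit n

-- Drop the annotations again to recover the plain tree.
strip :: Stmt ((,) ExtraInfo) -> Stmt Identity
strip (Assign x e) = Assign x (go e)
  where
    go (Var (_, v)) = Var (Identity v)
    go (Lit n)      = Lit n
```

The rest of the program can keep working with Stmt Identity, so the type checker rules out annotated trees leaking into passes that do not expect them.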
If you need to tag every level of the AST with some extra information (e.g. token positions), you could even define the data type as
data AST f = AST [f (GlobalDecl f)]
data GlobalDecl f = Func (f (Type f)) (f (Identifier f)) [f (Stmt f)] | ...
data Stmt f = Assign (f (Identifier f)) (f (Exp f)) | ...
data Exp f = Var (f (Identifier f)) | ...
Now AST ((,) ExtraInfo) would contain extra information at every branching point in the syntax tree (granted, working with the above structure will get a bit cumbersome).
If you make all of your types more polymorphic, like this:
data AST a = AST a
data GlobalDecl t i s = Func t i [s] | ...
data Stmt i e = Assign i e | ...
data Exp a = Var a | ...
then you can temporarily instantiate them with a tuple - e.g. Exp (Int, Identifier) - for intermediate computations. If necessary, you can make newtypes, for the concrete types above, for convenience.
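A small sketch of this fully polymorphic style, restricted to Exp and Stmt: the Lit constructor and the function number are hypothetical, and deriving Functor (via the DeriveFunctor extension) is one possible way to get the re-instantiation for free, not something the answer itself prescribes.

```haskell
{-# LANGUAGE DeriveFunctor #-}

type Identifier = String

-- Expressions are generic in what a variable carries.
data Exp a = Var a | Lit Int
  deriving (Eq, Show, Functor)

data Stmt i e = Assign i e
  deriving (Eq, Show)

-- The concrete, unannotated instantiation used by most of the program.
type PlainStmt = Stmt Identifier (Exp Identifier)

-- Temporarily pair every variable with an Int (here simply its name length).
number :: Exp Identifier -> Exp (Int, Identifier)
number = fmap (\v -> (length v, v))
```

Mapping snd over the result recovers the original expression, so the extra data only exists for the duration of the intermediate computation.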
