Understanding double colon "::" and type variables in Haskell?

Understanding double colon "::" and type variables in Haskell? - haskell

For a uni assignment, a function "lookup" has been defined as:
lookup :: Env e -> String -> Maybe e
The "Env" keyword has been defined as:
import qualified Data.Map as M
newtype Env e = Env (M.Map String e) deriving (Functor, Foldable, Traversable, Show, Eq, Monoid, Semigroup)
I understand that the double colon "::" means "has type of". So the lookup function takes two arguments and returns something. I know the second argument is a string, however, I'm struggling to understand the first argument and the output. What is the "Env" type and what does the "e" after it represent. It feels like e is an argument to "Env" but if that is the case, the output wouldn't make sense.
Any help would be appreciated.

It feels like e is an argument to "Env" but if that is the case, the output wouldn't make sense.
That's the direction the syntax is trying to "feel like", because that's what it means. Env is a parameterised type, and e is being passed as an argument to that parameter.
I'm not 100% sure why you think the output wouldn't make sense, under that interpretation? Maybe is also a parameterised type, and the variable e is also being passed to Maybe; this is no different than using a variable twice in any piece of code. You can think of it as something like this pseudo-code:
lookupType(e) {
argument1_type = Env(e)
argument2_type = String
result_type = Maybe(e)
}
Is your confusion that you are thinking of e in the type signature of lookup not as being passed to Env, but as being defined to receive the argument of Env? That happens in the place where Env is defined, not where it is used in the type of lookup. Again, this is just like what happens at the value level with function arguments; e.g. when you're writing code to define a function like plus:
plus x y = x + y
Then the x and y variables are created to stand for whatever plus is applied to elsewhere. But in another piece of code where you're using plus, something like incr x = plus x 1 here the variable is just being passed to plus as an argument, it is not defined as the parameter of plus (it was in fact defined as the parameter of incr).
Perhaps the thing that you need more explicitly called out is this. lookup :: Env e -> String -> Maybe e is saying:
For any type e that you like, lookup takes an Env e and a String and returns a Maybe e
Thus you could pass lookup an Env Integer and a String, and it will give you back a Maybe Integer. Or you could pass it an Env (Maybe [(String, Float)]) and a String, and it will give you back a Maybe (Maybe [(String, Float)]). This should hopefully be intuitive, because it's just looking up keys in an environment; whatever type of data is stored in the environment you pass to lookup is the type that lookup will maybe-return.
The e is there because in fact lookup is parametrically polymorphic; it's almost like lookup takes a type argument called e, which it can then pass to other things in its type signature.1 That's why I wrote my pseudo-code the way I did, above.
But this is just how variables in type signatures work, in base Haskell2. You simply write variables in your type signature without defining them anyway, and they mean your definition can be used with ANY type at the position you write the variable. The only reason the variables have names like e (rather than being some sort of wildcard symbol like ?) is because you often need to say that you can work with any type, but it has to be the same type in several different places. lookup is like that; it can take any type of Env and return any sort of Maybe, but they have to be consistent. Giving the variable the name e merely allows you to link them to say that they're the same variable.
1 This is in fact exactly how types like this work at a lower level. Haskell normally keeps this kind of type argument invisible though; you just write a type signature containing variables, without defining them, and every time you use the accompanying binding the compiler figures out how the variables should be instantiated.
2 In more advanced Haskell, when you turn on a bunch of extensions, you can actually control exactly where type variables are introduced, rather than it always happening automatically at the beginning of every type signature you use with variables. You don't need to know that yet, and I'm not going to talk about it further.

I'll try to give concrete examples, providing and motivating intuition. With thtat intuition, I think the question has a very natural answer:
e is a type variable, and the lookup function wants to work on all possible environments, regardless of which concrete type e is". The unbound e is a natural way to syntactically express that
Step 1, the Env type
The Env type is a wrapper for Map type in the Data.Map module in the containers package. It is a collection of key-value pairs, and you can insert new key-value pairs and look them up. If the key you are looking up is missing, you must return an error, a null value, a default or something else. Just like hashmaps or dictionaries in other programming languages.
The documentation (linked above) writes
data Map = Map k a
A Map from keys k to values a.
and we will try that out and wee what k and a can be.
I use ghci to get interactive feedback.
Prelude> import Data.Map as M
Prelude M> map1 = M.fromList [("Sweden","SEK"),("Chile","CLP")]
Prelude M> :type map1
map1 :: Map [Char] [Char]
Prelude M> map2 = M.fromList [(1::Integer,"Un"),(2,"deux")]
Prelude M> :type map2
map2 :: Map Integer [Char]
Prelude M> map3 = fromList [("Ludvig",10000::Double),("Mikael",25000)]
Prelude M> :type map3
map3 :: Map [Char] Double
You can see that we create various mappings based on lists of key-value pairs.The type signature in the documentation Map k a correspond to different k and a in the ghci session above. For map2, k correspongs to Integer and a to [Char].
You can also see how I declared manual types at some places, using the double colon syntax.
To reduce the flexibility, we can create a new "wrapping" type that for M.Map. With this wrapper, we make sure that the keys are always Strings.
Prelude M> newtype Env e = Env (M.Map String e) deriving (Show)
this definition says that the type Env e is, for every e an alias for M.Map String e. The newtype declaration further says that we must always be explicit and wrap the mappings in a Env value constructor. Let see if we can do that.
Prelude M> Env map1
Env (fromList [("Chile","CLP"),("Sweden","SEK")])
Prelude M> Env map2
<interactive>:34:6: error:
• Couldn't match type ‘Integer’ with ‘[Char]’
Expected type: Map String [Char]
Actual type: Map Integer [Char]
• In the first argument of ‘Env’, namely ‘(map2)’
In the expression: Env (map2)
In an equation for ‘it’: it = Env (map2)
Prelude M> Env map3
Env (fromList [("Ludvig",10000.0),("Mikael",25000.0)])
In the session above, we see that both map1 and map3 can be wrapped in an Env, since they have an appropriate type (they have k==String), but map2 cannot (having k==Integer).
The logical step from Map to Env is a little tricky, since also renames some variables. What was called a when talking about maps, is called e in the Env case. But variable names are always arbitrary, so that is ok.
Step 2, the lookup
We have established that Env e is a wrapping type that contains a lookup table from strings to values of some type e. How do you lookup things? Let us start with the non-wrapped case, and then the wrapped case. In Data.Map there is a function called lookup. Lets try it!
Prelude M> M.lookup "Ludvig" map3
Just 10000.0
Prelude M> M.lookup "Elias" map3
Nothing
Okay, to look up a value, we supply a key, and get just the corresponding value. If the key is missing, we get nothing. What is the type of the return value?
Prelude M> :type M.lookup "Ludvig" map3
M.lookup "Ludvig" map3 :: Maybe Double
when making a lookup in a Data [Char] Double, we need a key of type [Char] and return a vlue of type Maybe Double. Okay. That sounds reasonable. What about some other example?
Prelude M> :type M.lookup 1 map2
M.lookup 1 map2 :: Maybe [Char]
when making a lookup in a Data Integer [Char], we need a key of type Integer and return a value of type Maybe [Char]. So in general, to lookup we need a key of type k and a map of type M.Map k a and return a Maybe a.
Generally, we think M.lookup :: k -> M.Map k a -> Maybe a. Lets see if ghci agrees.
Prelude M> :t M.lookup
M.lookup :: Ord k => k -> Map k a -> Maybe a
It does! It also requires that the k type must be some type with a defined order on it, such as strings or numbers. That is the meaning of the Ord k => thing in the beginning. Why? Well, it has to do with how M.Map type is defined... do not care too much about it right now.
We can specialize the type signature if we set k to be String. In that special case, it would look like
M.lookup :: String -> Map String a -> Maybe a
hmm. this inspires us to flip the order of argument 1 and 2 to lookup, and replace the variable a with e. It is just a variable, and names on variables are arbitrary anyways... Lets call this new function myLookup
myLookup :: Map String e -> String -> Maybe e
and since Env a is essentially the same as Map String e, if we just unwrap the Env type, we may define
myLookup :: Env a -> String -> Maybe e
how does one implement such a function? One way is to just unwrap the type by pattern matching, and let the library to the heavy lifting.
myLookup (Env map) key = M.lookup key map
Conclusion
I have tried to build up concrete examples using functions and types from the standard library of Haskell. I hope I have illustrated the idea of a polymorhic type (that M.Map k a is a type when supplied with a k and a v) and that Env is a wrapper, specializing to the case when k is a String.
I hope that this concrete example motivates why the type signatures looks like they do and why type variables are useful.
I have not tried to give you the formal treatment with terminology. I have not explained the Maybe type, nor typeclasses and derived typeclasses. I hope you can read up on that elsewhere.
I hope it helps.

Related

How does the :: operator syntax work in the context of bounded typeclass?

I'm learning Haskell and trying to understand the reasoning behind it's syntax design at the same time. Most of the syntax is beautiful.
But since :: normally is like a type annotation, How is it that this works:
Input: minBound::Int
Output: -2147483648

There is no separate operator: :: is a type annotation in that example. Perhaps the best way to understand this is to consider this code:
main = print (f minBound)
f :: Int -> Int
f = id
This also prints -2147483648. The use of minBound is inferred to be an Int because it is the parameter to f. Once the type has been inferred, the value for that type is known.
Now, back to:
main = print (minBound :: Int)
This works in the same way, except that minBound is known to be an Int because of the type annotation, rather than for some more complex reason. The :: isn't some binary operation; it just directs the compiler that the expression minBound has the type Int. Once again, since the type is known, the value can be determined from the type class.

:: still means "has type" in that example.
There are two ways you can use :: to write down type information. Type declarations, and inline type annotations. Presumably you've been used to seeing type declarations, as in:
plusOne :: Integer -> Integer
plusOne = (+1)
Here the plusOne :: Integer -> Integer line is a separate declaration about the identifier plusOne, informing the compiler what its type should be. It is then actually defined on the following line in another declaration.
The other way you can use :: is that you can embed type information in the middle of any expression. Any expression can be followed by :: and then a type, and it means the same thing as the expression on its own except with the additional constraint that it must have the given type. For example:
foo = ('a', 2) :: (Char, Integer)
bar = ('a', 2 :: Integer)
Note that for foo I attached the entire expression, so it is very little different from having used a separate foo :: (Char, Integer) declaration. bar is more interesting, since I gave a type annotation for just the 2 but used that within a larger expression (for the whole pair). 2 :: Integer is still an expression for the value 2; :: is not an operator that takes 2 as input and computes some result. Indeed if the 2 were already used in a context that requires it to be an Integer then the :: Integer annotation changes nothing at all. But because 2 is normally polymorphic in Haskell (it could fit into a context requiring an Integer, or a Double, or a Complex Float) the type annotation pins down that the type of this particular expression is Integer.
The use is that it avoids you having to restructure your code to have a separate declaration for the expression you want to attach a type to. To do that with my simple example would have required something like this:
two :: Integer
two = 2
baz = ('a', two)
Which adds a relatively large amount of extra code just to have something to attach :: Integer to. It also means when you're reading bar, you have to go read a whole separate definition to know what the second element of the pair is, instead of it being clearly stated right there.
So now we can answer your direct question. :: has no special or particular meaning with the Bounded type class or with minBound in particular. However it's useful with minBound (and other type class methods) because the whole point of type classes is to have overloaded names that do different things depending on the type. So selecting the type you want is useful!
minBound :: Int is just an expression using the value of minBound under the constraint that this particular time minBound is used as an Int, and so the value is -2147483648. As opposed to minBound :: Char which is '\NUL', or minBound :: Bool which is False.
None of those options mean anything different from using minBound where there was already some context requiring it to be an Int, or Char, or Bool; it's just a very quick and simple way of adding that context if there isn't one already.
It's worth being clear that both forms of :: are not operators as such. There's nothing terribly wrong with informally using the word operator for it, but be aware that "operator" has a specific meaning in Haskell; it refers to symbolic function names like +, *, &&, etc. Operators are first-class citizens of Haskell: we can bind them to variables1 and pass them around. For example I can do:
(|+|) = (+)
x = 1 |+| 2
But you cannot do this with ::. It is "hard-wired" into the language, just as the = symbol used for introducing definitions is, or the module Main ( main ) where syntax for module headers. As such there are lots of things that are true about Haskell operators that are not true about ::, so you need to be careful not to confuse yourself or others when you use the word "operator" informally to include ::.
1 Actually an operator is just a particular kind of variable name that is applied by writing it between two arguments instead of before them. The same function can be bound to operator and ordinary variables, even at the same time.

Just to add another example, with Monads you can play a little like this:
import Control.Monad
anyMonad :: (Monad m) => Int -> m Int
anyMonad x = (pure x) >>= (\x -> pure (x*x)) >>= (\x -> pure (x+2))
$> anyMonad 4 :: [Int]
=> [18]
$> anyMonad 4 :: Either a Int
=> Right 18
$> anyMonad 4 :: Maybe Int
=> Just 18
it's a generic example telling you that the functionality may change with the type, another example:

Why can't I use the type `Show a => [Something -> a]`?

I have a record type say
data Rec {
recNumber :: Int
, recName :: String
-- more fields of various types
}
And I want to write a toString function for Rec :
recToString :: Rec -> String
recToString r = intercalate "\t" $ map ($ r) fields
where fields = [show . recNumber, show . recName]
This works. fields has type [Rec -> String]. But I'm lazy and I would prefer writing
recToString r = intercalate "\t" $ map (\f -> show $ f r) fields
where fields = [recNumber, recName]
But this doesn't work. Intuitively I would say fields has type Show a => [Rec -> a] and this should be ok. But Haskell doesn't allow it.
I'd like to understand what is going on here. Would I be right if I said that in the first case I get a list of functions such that the 2 instances of show are actually not the same function, but Haskell is able to determine which is which at compile time (which is why it's ok).
[show . recNumber, show . recName]
^-- This is show in instance Show Number
^-- This is show in instance Show String
Whereas in the second case, I only have one literal use of show in the code, and that would have to refer to multiple instances, not determined at compile time ?
map (\f -> show $ f r) fields
^-- Must be both instances at the same time
Can someone help me understand this ? And also are there workarounds or type system expansions that allow this ?

The type signature doesn't say what you think it says.
This seems to be a common misunderstanding. Consider the function
foo :: Show a => Rec -> a
People frequently seem to think this means that "foo can return any type that it wants to, so long as that type supports Show". It doesn't.
What it actually means is that foo must be able to return any possible type, because the caller gets to choose what the return type should be.
A few moments' thought will reveal that foo actually cannot exist. There is no way to turn a Rec into any possible type that can ever exist. It can't be done.
People often try to do something like Show a => [a] to mean "a list of mixed types but they all have Show". That obviously doesn't work; this type actually means that the list elements can be any type, but they still have to be all the same.
What you're trying to do seems reasonable enough. Unfortunately, I think your first example is about as close as you can get. You could try using tuples and lenses to get around this. You could try using Template Haskell instead. But unless you've got a hell of a lot of fields, it's probably not even worth the effort.

The type you actually want is not:
Show a => [Rec -> a]
Any type declaration with unbound type variables has an implicit forall. The above is equivalent to:
forall a. Show a => [Rec -> a]
This isn't what you wan't, because the a must be specialized to a single type for the entire list. (By the caller, to any one type they choose, as MathematicalOrchid points out.) Because you want the a of each element in the list to be able to be instantiated differently... what you are actually seeking is an existential type.
[exists a. Show a => Rec -> a]
You are wishing for a form of subtyping that Haskell does not support very well. The above syntax is not supported at all by GHC. You can use newtypes to sort of accomplish this:
{-# LANGUAGE ExistentialQuantification #-}
newtype Showy = forall a. Show a => Showy a
fields :: [Rec -> Showy]
fields = [Showy . recNumber, Showy . recName]
But unfortunatley, that is just as tedious as converting directly to strings, isn't it?
I don't believe that lens is capable of getting around this particular weakness of the Haskell type system:
recToString :: Rec -> String
recToString r = intercalate "\t" $ toListOf (each . to fieldShown) fields
where fields = (recNumber, recName)
fieldShown f = show (f r)
-- error: Couldn't match type Int with [Char]
Suppose the fields do have the same type:
fields = [recNumber, recNumber]
Then it works, and Haskell figures out which show function instance to use at compile time; it doesn't have to look it up dynamically.
If you manually write out show each time, as in your original example, then Haskell can determine the correct instance for each call to show at compile time.
As for existentials... it depends on implementation, but presumably, the compiler cannot determine which instance to use statically, so a dynamic lookup will be used instead.

I'd like to suggest something very simple instead:
recToString r = intercalate "\t" [s recNumber, s recName]
where s f = show (f r)

All the elements of a list in Haskell must have the same type, so a list containing one Int and one String simply cannot exist. It is possible to get around this in GHC using existential types, but you probably shouldn't (this use of existentials is widely considered an anti-pattern, and it doesn't tend to perform terribly well). Another option would be to switch from a list to a tuple, and use some weird stuff from the lens package to map over both parts. It might even work.

Type of id function in Haskell

In GHCI, the type of id is:
Prelude> :t id
id :: a -> a
But if I define my own id function, why is the name of the type variable t ? Is there a difference between t and a ?
Prelude> let identity x = x
Prelude> :t identity
identity :: t -> t

There is no difference between a and t, they are called Type variables and stand for any type you can have. Note that they are written in lowercase letters where types are written with an uppercase letter at the beginning (except for lists, who have special syntax).
In addition if you write a file and load it into ghci by ghci testmodule.hs
module Testmodule where
identity :: b -> b
identity x = x
then ghci will show you exactly the letter you used in your definition.

This has actually the same answer as if I were asking
If I define my own version of your version as
Prelude> let identity' q = q
why is the name of the value variable q? Is there a difference between q and x?
The crucial thing about parameter variables in general is that their names are basically arbitrary. It's a fundamental property of lambda-calculus: α-equivalence. We merely replace \x -> x with \q -> q (or, in lambda-style, λx.x with λq.q). And indeed type variables are parameters as well, though it doesn't really look like they are. But under the hood, Haskell polymorphic signatures should be read as System F, so really we have Λα . α -> α, usually written forall a . a -> a. Which is obviously equivalent to forall t . t -> t.

Substitution Algorithm in Haskell

I'm trying to write a substitution algorithm in Haskell.
I have defined a polymorphic data type Subst a with a single constructor S::[(String, a)] -> Subst a as so:
data Subst a = S [(String, a)]
I now want to write a function single::String -> a -> Subst a for constructing a substitution for only a single variable
This is what I tried:
single::String -> a -> Subst a
single s1 (Subst a) = s1 a
However, I'm getting this error: Not in scope: data constructor 'Subst'
Does anyone have insight to what i'm doing wrong?

The data constructor is not the same thing as the type constuctor
In your code the type constructor is Subst the data constructor is S
Type constructors are used to create new types, e.g. in data Foo = Foo (Maybe Int) Maybe is a type constructor, Foo is the data constructor (as well as the type constructor, but they can be named differently as you discovered). Data constructors are used to create instances of types (also don't confuse this with creating an instance of a polymorphic type, e.g. Int -> Int is an instance of a -> a).
So you need to use S when you want to pattern match in your single function. Not Subst.
Hopefully that makes sense, if not, please tell me :)
P.S. data constructors are, for all intents and purposes, functions, which means you can do the same things with them that you'd typically do with functions. E.g. you can do map Bar [a,b,c] and it will apply the data constructor to each element.

single :: String -> a -> Subst a
single str a = S [(str, a)]
The [(str, a)] part creates a list with one element. That element is a tuple (or "pair"), with str as the left part of the tuple and a as the right part of the tuple. The above function then wraps that single-element list in the S constructor to create a value of type Subst a.
The result is a list that contains the rule for a single substitution from str to the value a.

Why can't I use record selectors with an existentially quantified type?

When using Existential types, we have to use a pattern-matching syntax for extracting the foralled value. We can't use the ordinary record selectors as functions. GHC reports an error and suggest using pattern-matching with this definition of yALL:
{-# LANGUAGE ExistentialQuantification #-}
data ALL = forall a. Show a => ALL { theA :: a }
-- data ok
xALL :: ALL -> String
xALL (ALL a) = show a
-- pattern matching ok
-- ABOVE: heaven
-- BELOW: hell
yALL :: ALL -> String
yALL all = show $ theA all
-- record selector failed
forall.hs:11:19:
Cannot use record selector `theA' as a function due to escaped type variables
Probable fix: use pattern-matching syntax instead
In the second argument of `($)', namely `theA all'
In the expression: show $ theA all
In an equation for `yALL': yALL all = show $ theA all
Some of my data take more than 5 elements. It's hard to maintain the code if I
use pattern-matching:
func1 (BigData _ _ _ _ elemx _ _) = func2 elemx
Is there a good method to make code like that maintainable or to wrap it up so that I can use some kind of selectors?

Existential types work in a more elaborate manner than regular types. GHC is (rightly) forbidding you from using theA as a function. But imagine there was no such prohibition. What type would that function have? It would have to be something like this:
-- Not a real type signature!
theA :: ALL -> t -- for a fresh type t on each use of theA; t is an instance of Show
To put it very crudely, forall makes GHC "forget" the type of the constructor's arguments; all that the type system knows is that this type is an instance of Show. So when you try to extract the value of the constructor's argument, there is no way to recover the original type.
What GHC does, behind the scenes, is what the comment to the fake type signature above says—each time you pattern match against the ALL constructor, the variable bound to the constructor's value is assigned a unique type that's guaranteed to be different from every other type. Take for example this code:
case ALL "foo" of
ALL x -> show x
The variable x gets a unique type that is distinct from every other type in the program and cannot be matched with any type variable. These unique types are not allowed to escape to the top level—which is the reason why theA cannot be used as a function.

You can use record syntax in pattern matching,
func1 BigData{ someField = elemx } = func2 elemx
works and is much less typing for huge types.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string