Haskell: Understanding the bind operator (>>=) for Monads - haskell

I have the following function:
parse :: String -> Maybe Token
And I am trying to implement the following function:
maketokenlist :: String -> Maybe [Token]
The function returns Nothing if there is a token that was not able to be parsed (i.e. parse returns Nothing if the token is not an integer or an arithmetic operator), otherwise it returns a list of Tokens.
As Maybe is an instance of the Monad type class, I have the following approach:
maketokenlist str = return (words str) >>= parse
I convert the string into a list of individual tokens (e.g. "2 3 +" becomes ["2","3","+"] , and then map the parse function over each string in the list.
Since the Monad instance for lists is defined as:
instance Monad [] where
return x = [x]
xs >>= f = concat (map f xs)
fail _ = []
However, say that I had the list of strings [2, 3, "+", "a"] and after mapping parse over each element using >>= I get [Just 2, Just 3, Just (+), Nothing], as "a" cannot be parsed. Is there a way to make the function maketokenlist return Nothing using just the >>= operator? Any insights are appreciated.

If parse :: String -> Maybe Token, then:
traverse parse :: [String] -> Maybe [Token]
This version of traverse (which I have specialized to act on lists as the Traversable instance and Maybe as the Applicative instance) may indeed be implemented using (>>=):
listMaybeTraverse parse [] = pure []
listMaybeTraverse parse (s:ss) =
parse s >>= \token ->
listMaybeTraverse parse ss >>= \tokens ->
pure (token:tokens)
I have chosen the names parse, s, and token to show the correspondence with your planned usage, but of course it will work for any appropriately-typed functions, not just parse.
The instance of Monad for lists does not make an appearance in this code.

Related

Describing my Parser type as a series of monad transformers

I have to describe the Parser type as a series of monad transformers.
As far as I understand, monad transformers are used to wrap monads into another monad. But I don't understand what is the task here.
Instead of defining a new type for Parser, you can simply define it as a type alias for a type created by one or more monad transformers. That is, you definition would look something like
type Parser a = SomeMonadT <some set of monads and types>
Your task, then, is to determine which monad transformer(s) to use, and what the arguments to the transformer(s) should be.
Before we begin to define combinators that act on parsers, we must choose a representation for a parser first.
A parser takes a string and produces an output that can be just about anything. A list parser will produce a list as it's output, an integer parser will produce Ints, a JSON parser might return a custom ADT representing a JSON.
Therefore, it makes sense to make Parser a polymorphic type. It also makes sense to return a list of results instead of a single result since grammar can be ambiguous, and there may be several ways to parse the same input string.
An empty list, then, implies the parser failed to parse the provided input.
newtype Parser a = Parser { parse :: String -> [(a, String)] }
You might wonder why we return the tuple (a, String), not just a. Well, a parser might not be able to parse the entire input string. Often, a parser is only intended to parse some prefix of the input, and let another parser do the rest of the parsing. Thus, we return a pair containing the parse result a and the unconsumed string subsequent parsers can use.
We can start by describing some basic parsers that do very little work. A result parser always succeeds in parsing without consuming the input string.
result :: a -> Parser a
result val = Parser $ \inp -> [(val, inp)]
item unconditionally accepts the first character of any input string.
item :: Parser Char
item = Parser parseItem
where
parseItem [] = []
parseItem (x:xs) = [(x, xs)]
Let's try some of these parsers in GHCi:
*Main> parse (result 42) "abc"
[(42, "abc")]
*Main> parse item "abc"
[('a', "bc")]
Say we want a parser that consumes a string if its first character satisfies a predicate. We can generalize this idea by writing a function that takes a (Char -> Bool) predicate and returns a parser that only consumes an input string if its first character returns True when supplied to the predicate.
The simplest solution for this would be:
sat :: (Char -> Bool) -> Parser Char
sat p = Parser parseIfSat
where
parseIfSat (x : xs) = if p x then [(x, xs)] else []
Using the previously defined item parser (this requires a Monad instance for the type Parser, which I leave to you as an exercise):
sat p =
-- Apply `item`, if it fails on an empty string, we simply short circuit and get `[]`.
item >>= \x ->
if p x
then result x
else zero
parseIfSat [] = []
Now we can use the sat combinator to describe several useful parsers. For example, parser for ASCII digits:
-- import Data.Char (isDigit, isLower, isUpper)
digit :: Parser Char
digit = sat isDigit
You get the idea. You start by defining elementary parsers, and use those to build more complex parsers. The type Parser shown here is actually StateT Monad Transformer, it combines State and [] in this case.
The code shown was taken from here.

Haskell Monad Beginner

Hello I have a code which I'm trying to understand:
zipM:: Monad m => [m a] -> [m b] -> m [(a,b)]
zipM [] list = return []
zipM list [] = return []
zipM (hFirst:tFirst) (hSecond:tSecond) = do
rest <- zipM tFirst tSecond
headUnpacked <- unpack hFirst hSecond
return (headUnpacked : rest) -- rest = Just []
unpack ::Monad m => m a -> m b -> m (a,b)
unpack first second = do
fi <- first
se <- second
return (fi, se)
Generally I got a point but still have problem at two moments.
How to call 'unpack' in console ? I tried unpack (Just 5) and it's causing an error.
Since 'rest' is 'Just []' how can I perform simple list add?
I tried (Just 4): (Just []) and also (Just [1,2]): (Just [2,3]) both of which result in errors.
I'll be most grateful if someone will show with some basic explanation how to perform this two actions.
You can use unpack in GHCi like this:
Prelude> unpack (Just 5) (Just 42)
Just (5,42)
Prelude> unpack [[42,1337],[123]] [["foo"],["bar","baz"]]
[([42,1337],["foo"]),([42,1337],["bar","baz"]),([123],["foo"]),([123],["bar","baz"])]
Prelude> :m +Data.Monoid
Prelude Data.Monoid> unpack (Product 7, "foo") (Product 6, "bar")
(Product {getProduct = 42},("foo","bar"))
Notice that the function works with any Monad, not only Maybe. The above GHCi session demonstrates that by also calling unpack with two lists, and two tuples (which are Monad instances when the first element is a Monoid).
There's nothing inherently wrong with the expression unpack (Just 5):
Prelude Data.Monoid> :t unpack (Just 5)
unpack (Just 5) :: Num a => Maybe b -> Maybe (a, b)
The reason you see an error is because, as you can see from the above type inquiry, the expression is a function. Again, there's nothing wrong with functions, but GHCi doesn't know how to render functions because it relies on the Show type class to turn expressions into printable values, and functions don't have Show instances.
To address the second question, rest isn't Just []. First, Just isn't a type, but a data constructor, but secondly, even if you meant Maybe [a], that's not necessarily true, because you don't know whether or not the Monad in question is Maybe or some other Monad.
Finally, as far as I can tell, rest is the result of calling unpack, the result of which has the type Monad m => m (a,b). Since rest is bound using do notation, it must, then, have the type (a,b), unless I'm mistaken.

Haskell's (<-) in Terms of the Natural Transformations of Monad

So I'm playing around with the hasbolt module in GHCi and I had a curiosity about some desugaring. I've been connecting to a Neo4j database by creating a pipe as follows
ghci> pipe <- connect $ def {credentials}
and that works just fine. However, I'm wondering what the type of the (<-) operator is (GHCi won't tell me). Most desugaring explanations describe that
do x <- a
return x
desugars to
a >>= (\x -> return x)
but what about just the line x <- a?
It doesn't help me to add in the return because I want pipe :: Pipe not pipe :: Control.Monad.IO.Class.MonadIO m => m Pipe, but (>>=) :: Monad m => m a -> (a -> m b) -> m b so trying to desugar using bind and return/pure doesn't work without it.
Ideally it seems like it'd be best to just make a Comonad instance to enable using extract :: Monad m => m a -> a as pipe = extract $ connect $ def {creds} but it bugs me that I don't understand (<-).
Another oddity is that, treating (<-) as haskell function, it's first argument is an out-of-scope variable, but that wouldn't mean that
(<-) :: a -> m b -> b
because not just anything can be used as a free variable. For instance, you couldn't bind the pipe to a Num type or a Bool. The variable has to be a "String"ish thing, except it never is actually a String; and you definitely can't try actually binding to a String. So it seems as if it isn't a haskell function in the usual sense (unless there is a class of functions that take values from the free variable namespace... unlikely). So what is (<-) exactly? Can it be replaced entirely by using extract? Is that the best way to desugar/circumvent it?
I'm wondering what the type of the (<-) operator is ...
<- doesn't have a type, it's part of the syntax of do notation, which as you know is converted to sequences of >>= and return during a process called desugaring.
but what about just the line x <- a ...?
That's a syntax error in normal haskell code and the compiler would complain. The reason the line:
ghci> pipe <- connect $ def {credentials}
works in ghci is that the repl is a sort of do block; you can think of each entry as a line in your main function (it's a bit more hairy than that, but that's a good approximation). That's why you need (until recently) to say let foo = bar in ghci to declare a binding as well.
Ideally it seems like it'd be best to just make a Comonad instance to enable using extract :: Monad m => m a -> a as pipe = extract $ connect $ def {creds} but it bugs me that I don't understand (<-).
Comonad has nothing to do with Monads. In fact, most Monads don't have any valid Comonad instance. Consider the [] Monad:
instance Monad [a] where
return x = [x]
xs >>= f = concat (map f xs)
If we try to write a Comonad instance, we can't define extract :: m a -> a
instance Comonad [a] where
extract (x:_) = x
extract [] = ???
This tells us something interesting about Monads, namely that we can't write a general function with the type Monad m => m a -> a. In other words, we can't "extract" a value from a Monad without additional knowledge about it.
So how does the do-notation syntax do {x <- [1,2,3]; return [x,x]} work?
Since <- is actually just syntax sugar, just like how [1,2,3] actually means 1 : 2 : 3 : [], the above expression actually means [1,2,3] >>= (\x -> return [x,x]), which in turn evaluates to concat (map (\x -> [[x,x]]) [1,2,3])), which comes out to [1,1,2,2,3,3].
Notice how the arrow transformed into a >>= and a lambda. This uses only built-in (in the typeclass) Monad functions, so it works for any Monad in general.
We can pretend to extract a value by using (>>=) :: Monad m => m a -> (a -> m b) -> m b and working with the "extracted" a inside the function we provide, like in the lambda in the list example above. However, it is impossible to actually get a value out of a Monad in a generic way, which is why the return type of >>= is m b (in the Monad)
So what is (<-) exactly? Can it be replaced entirely by using extract? Is that the best way to desugar/circumvent it?
Note that the do-block <- and extract mean very different things even for types that have both Monad and Comonad instances. For instance, consider non-empty lists. They have instances of both Monad (which is very much like the usual one for lists) and Comonad (with extend/=>> applying a function to all suffixes of the list). If we write a do-block such as...
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Function ((&))
alternating :: NonEmpty Integer
alternating = do
x <- N.fromList [1..6]
-x :| [x]
... the x in x <- N.fromList [1..6] stands for the elements of the non-empty list; however, this x must be used to build a new list (or, more generally, to set up a new monadic computation). That, as others have explained, reflects how do-notation is desugared. It becomes easier to see if we make the desugared code look like the original one:
alternating :: NonEmpty Integer
alternating =
N.fromList [1..6] >>= \x ->
-x :| [x]
GHCi> alternating
-1 :| [1,-2,2,-3,3,-4,4,-5,5,-6,6]
The lines below x <- N.fromList [1..6] in the do-block amount to the body of a lambda. x <- in isolation is therefore akin to a lambda without body, which is not a meaningful thing.
Another important thing to note is that x in the do-block above does not correspond to any one single Integer, but rather to all Integers in the list. That already gives away that <- does not correspond to an extraction function. (With other monads, the x might even correspond to no values at all, as in x <- Nothing or x <- []. See also Lazersmoke's answer.)
On the other hand, extract does extract a single value, with no ifs or buts...
GHCi> extract (N.fromList [1..6])
1
... however, it is really a single value: the tail of the list is discarded. If we want to use the suffixes of the list, we need extend/(=>>)...
GHCi> N.fromList [1..6] =>> product =>> sum
1956 :| [1236,516,156,36,6]
If we had a co-do-notation for comonads (cf. this package and the links therein), the example above might get rewritten as something in the vein of:
-- codo introduces a function: x & f = f x
N.fromList [1..6] & codo xs -> do
ys <- product xs
sum ys
The statements would correspond to plain values; the bound variables (xs and ys), to comonadic values (in this case, to list suffixes). That is exactly the opposite of what we have with monadic do-blocks. All in all, as far as your question is concerned, switching to comonads just swaps which things we can't refer to outside of the context of a computation.

Haskell parser combinator - do notation

I was reading a tutorial regarding building a parser combinator library and i came across a method which i don't quite understand.
newtype Parser a = Parser {parse :: String -> [(a,String)]}
chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a
chainl p op a = (p `chainl1` op) <|> return a
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
p `chainl1` op = do {a <- p; rest a}
where rest a = (do f <- op
b <- p
rest (f a b))
<|> return a
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') $ parse p s
the bind is the implementation of the (>>=) operator. I don't quite get how the chainl1 function works. From what I can see you extract f from op and then you apply it to f a b and you recurse, however I do not get how you extract a function from the parser when it should return a list of tuples?
Start by looking at the definition of Parser:
newtype Parser a = Parser {parse :: String -> [(a,String)]}`
A Parser a is really just a wrapper around a function (that we can run later with parse) that takes a String and returns a list of pairs, where each pair contains an a encountered when processing the string, along with the rest of the string that remains to be processed.
Now look at the part of the code in chainl1 that's confusing you: the part where you extract f from op:
f <- op
You remarked: "I do not get how you extract a function from the parser when it should return a list of tuples."
It's true that when we run a Parser a with a string (using parse), we get a list of type [(a,String)] as a result. But this code does not say parse op s. Rather, we are using bind here (with the do-notation syntactic sugar). The problem is that you're thinking about the definition of the Parser datatype, but you're not thinking much about what bind specifically does.
Let's look at what bind is doing in the Parser monad a bit more carefully.
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') $ parse p s
What does p >>= f do? It returns a Parser that, when given a string s, does the following: First, it runs parser p with the string to be parsed, s. This, as you correctly noted, returns a list of type [(a, String)]: i.e. a list of the values of type a encountered, along with the string that remained after each value was encountered. Then it takes this list of pairs and applies a function to each pair. Specifically, each (a, s') pair in this list is transformed by (1) applying f to the parsed value a (f a returns a new parser), and then (2) running this new parser with the remaining string s'. This is a function from a tuple to a list of tuples: (a, s') -> [(b, s'')]... and since we're mapping this function over every tuple in the original list returned by parse p s, this ends up giving us a list of lists of tuples: [[(b, s'')]]. So we concatenate (or join) this list into a single list [(b, s'')]. All in all then, we have a function from s to [(b, s'')], which we then wrap in a Parser newtype.
The crucial point is that when we say f <- op, or op >>= \f -> ... that assigns the name f to the values parsed by op, but f is not a list of tuples, b/c it is not the result of running parse op s.
In general, you'll see a lot of Haskell code that defines some datatype SomeMonad a, along with a bind method that hides a lot of the dirty details for you, and lets you get access to the a values you care about using do-notation like so: a <- ma. It may be instructive to look at the State a monad to see how bind passes around state behind the scenes for you. Similarly, here, when combining parsers, you care most about the values the parser is supposed to recognize... bind is hiding all the dirty work that involves the strings that remain upon recognizing a value of type a.

What is the intuitive meaning of "join"?

What is the intuitive meaning of join for a Monad?
The monads-as-containers analogies make sense to me, and inside these analogies join makes sense. A value is double-wrapped and we unwrap one layer. But as we all know, a monad is not a container.
How might one write sensible, understandable code using join in normal circumstances, say when in IO?
An action :: IO (IO a) is a way of producing a way of producing an a. join action, then, is a way of producing an a by running the outermost producer of action, taking the producer it produced and then running that as well, to finally get to that juicy a.
join collapses consecutive layers of the type constructor.
A valid join must satisfy the property that, for any number of consecutive applications of the type constructor, it shouldn't matter the order in which we collapse the layers.
For example
ghci> let lolol = [[['a'],['b','c']],[['d'],['e']]]
ghci> lolol :: [[[Char]]]
ghci> lolol :: [] ([] ([] Char)) -- the type can also be expressed like this
ghci> join (fmap join lolol) -- collapse inner layers first
"abcde"
ghci> join (join lolol) -- collapse outer layers first
"abcde"
(We used fmap to "get inside" the outer monadic layer so that we could collapse the inner layers first.)
A small non container example where join is useful: for the function monad (->) a, join is equivalent to \f x -> f x x, a function of type (a -> a -> b) -> a -> b that applies two times the same argument to another function.
For the List monad, join is simply concat, and concatMap is join . fmap.
So join implicitly appears in any list expression which uses concat
or concatMap.
Suppose you were asked to find all of the numbers which are divisors of any
number in an input list. If you have a divisors function:
divisors :: Int -> [Int]
divisors n = [ d | d <- [1..n], mod n d == 0 ]
you might solve the problem like this:
foo xs = concat $ (map divisors xs)
Here we are thinking of solving the problem by first mapping the
divisors function over all of the input elements and then concatenating
all of the resulting lists. You might even think that this is a very
"functional" way of solving the problem.
Another approch would be to write a list comprehension:
bar xs = [ d | x <- xs, d <- divisors x ]
or using do-notation:
bar xs = do x <- xs
d <- divisors
return d
Here it might be said we're thinking a little more
imperatively - first draw a number from the list xs; then draw
a divisors from the divisors of the number and yield it.
It turns out, though, that foo and bar are exactly the same function.
Morever, these two approaches are exactly the same in any monad.
That is, for any monad, and appropriate monadic functions f and g:
do x <- f
y <- g x is the same as: (join . fmap g) f
return y
For instance, in the IO monad if we set f = getLine and g = readFile,
we have:
do x <- getLine
y <- readFile x is the same as: (join . fmap readFile) getLine
return y
The do-block is a more imperative way of expressing the action: first read a
line of input; then treat returned string as a file name, read the contents
of the file and finally return the result.
The equivalent join expression seems a little unnatural in the IO-monad.
However it shouldn't be as we are using it in exactly the same way as we
used concatMap in the first example.
Given an action that produces another action, run the action and then run the action that it produces.
If you imagine some kind of Parser x monad that parses an x, then Parser (Parser x) is a parser that does some parsing, and then returns another parser. So join would flatten this into a Parser x that just runs both actions and returns the final x.
Why would you even have a Parser (Parser x) in the first place? Basically, because fmap. If you have a parser, you can fmap a function that changes the result over it. But if you fmap a function that itself returns a parser, you end up with a Parser (Parser x), where you probably want to just run both actions. join implements "just run both actions".
I like the parsing example because a parser typically has a runParser function. And it's clear that a Parser Int is not an integer. It's something that can parse an integer, after you give it some input to parse from. I think a lot of people end up thinking of an IO Int as being just a normal integer but with this annoying IO bit that you can't get rid of. It isn't. It's an unexecuted I/O operation. There's no integer "inside" it; the integer doesn't exist until you actually perform the I/O.
I find these things easier to interpret by writing out the types and refactoring them a bit to reveal what the functions do.
Reader monad
The Reader type is defined thus, and its join function has the type shown:
newtype Reader r a = Reader { runReader :: r -> a }
join :: Reader r (Reader r a) -> Reader r a
Since this is a newtype, this means that the type Reader r a is isomorphic to r -> a. So we can refactor the type definition to give us this type that, albeit it's not the same, it's really "the same" with scare quotes:
In the (->) r monad, which is isomorphic to Reader r, join is the function:
join :: (r -> r -> a) -> r -> a
So the Reader join is the function that takes a two-place function (r -> r -> a) and applies to the same value at both its argument positions.
Writer monad
Since the Writer type has this definition:
newtype Writer w a = Writer { runWriter :: (a, w) }
...then when we remove the newtype, its join function has a type isomorphic to:
join :: Monoid w => ((a, w), w) -> (a, w)
The Monoid constraint needs to be there because the Monad instance for Writer requires it, and it lets us guess right away what the function does:
join ((a, w0), w1) = (a, w0 <> w1)
State monad
Similarly, since State has this definition:
newtype State s a = State { runState :: s -> (a, s) }
...then its join is like this:
join :: (s -> (s -> (a, s), s)) -> s -> (a, s)
...and you can also venture just writing it directly:
join f s0 = (a, s2)
where
(g, s1) = f s0
(a, s2) = g s1
{- Here's the "map" to the variable names in the function:
f g s2 s1 s0 s2
join :: (s -> (s -> (a, s ), s )) -> s -> (a, s )
-}
If you stare at this type a bit, you might think that it bears some resemblance to both the Reader and Writer's types for their join operations. And you'd be right! The Reader, Writer and State monads are all instances of a more general pattern called update monads.
List monad
join :: [[a]] -> [a]
As other people have pointed out, this is the type of the concat function.
Parsing monads
Here comes a really neat thing to realize. Very often, "fancy" monads turn out to be combinations or variants of "basic" ones like Reader, Writer, State or lists. So often what I do when confronted with a novel monad is ask: which of the basic monads does it resemble, and how?
Take for example parsing monads, which have been brought up in other answers here. A simplistic parser monad (with no support for important things like error reporting) looks like this:
newtype Parser a = Parser { runParser :: String -> [(a, String)] }
A Parser is a function that takes a string as input, and returns a list of candidate parses, where each candidate parse is a pair of:
A parse result of type a;
The leftovers (the suffix of the input string that was not consumed in that parse).
But notice that this type looks very much like the state monad:
newtype Parser a = Parser { runParser :: String -> [(a, String)] }
newtype State s a = State { runState :: s -> (a, s) }
And this is no accident! Parser monads are nondeterministic state monads, where the state is the unconsumed portion of the input string, and parse steps generate alternatives that may be later rejected in light of further input. List monads are often called "nondeterminism" monads, so it's no surprise that a parser resembles a mix of the state and list monads.
And this intuition can be systematized by using monad transfomers. The state monad transformer is defined like this:
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }
Which means that the Parser type from above can be written like this as well:
type Parser a = StateT String [] a
...and its Monad instance follows mechanically from those of StateT and [].
The IO monad
Imagine we could enumerate all of the possible primitive IO actions, somewhat like this:
{-# LANGUAGE GADTs #-}
data Command a where
-- An action that writes a char to stdout
putChar :: Char -> Command ()
-- An action that reads a char from stdin
getChar :: Command Char
-- ...
Then we could think of the IO type as this (which I've adapted from the highly-recommended Operational monad tutorial):
data IO a where
-- An `IO` action that just returns a constant value.
Return :: a -> IO a
-- An action that binds the result of a `Command` to
-- a function that computes the next step after it.
Bind :: Command x -> (x -> IO a) -> IO a
instance Monad IO where ...
Then join action would then look like this:
join :: IO (IO a) -> IO a
-- If the action is just `Return`, then its payload already
-- is what we need to return.
join (Return ioa) = ioa
-- If the action is a `Bind`, then its "next step" function
-- `f` produces `IO (IO a)`, so we can just recursively stick
-- a `join` to its result end.
join (Bind cmd f) = Bind cmd (join . f)
So all that the join does here is "chase down" the IO action until it sees a result that fits the pattern Return (ma :: IO a), and strip out the outer Return.
So what did I do here? Just like for parser monads, I just defined (or rather copied) a toy model of the IO type that has the virtue of being transparent. Then I work out the behavior of join from the toy model.

Resources