Why does this code not parse without whitespace? - haskell

So I am learning Monads and was playing around with the following expression:
[1,2] >>= \x -> ['a','b'] >>= \y -> return (x,y)
The above code produces the result [(1,'a'),(1,'b'),(2,'a'),(2,'b')] as expected.
But since I was just experimenting, I got lazy and I entered:
[1,2]>>=\x->['a','b']>>=\y->return (x,y) (same code as above but without white-spaces)
which doesn't seem to work.
I understand that if I properly bracket out this expression as
[1,2]>>=(\x->(['a','b']>>=(\y->return (x,y))))
it will work (though I would rather add spaces than these monstrous brackets), but I don't get why the expression with whitespace works whereas the other one doesn't.

You need spaces to separate identifier names: foo bar is two separate names, whereas foobar (without the space) is just one name.
The same thing happens with operators. Haskell allows arbitrary user-defined operators; if you want to write a function named ??++!?!, then go for it! But you must use spaces to separate operators from one another.
Just as >>= is not the same thing as >> =, so >>=\ isn't the same as >>= \. You could actually define a function named >>=\ if you wanted. But the space lets the Haskell language parser know this is two things, not one.
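To make this concrete, here is a minimal sketch. The operator >>=\ defined below is purely hypothetical (it is not part of any library); it only shows that the lexer is willing to read those four characters as a single operator name, which is what happens to the space-free version of the question's code.

```haskell
-- A hypothetical operator whose name happens to end in a backslash.
-- It has nothing to do with monads; it only proves the name is legal.
(>>=\) :: Int -> Int -> Int
a >>=\ b = a * 10 + b

-- With spaces, ">>=" and the lambda's backslash are two separate lexemes.
spaced :: [(Int, Char)]
spaced = [1,2] >>= \x -> ['a','b'] >>= \y -> return (x,y)

main :: IO ()
main = do
  print (3 >>=\ 4)  -- uses the hypothetical operator
  print spaced
```

Without the spaces, the lexer takes the longest run of symbol characters it can, so it sees the operator >>=\ rather than >>= followed by a lambda.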

To fully understand this, you need to look at Chapter 2 of the Haskell Report, particularly section 2.4. The lexeme -> is a reservedop, >>= is not.
For example, does this expression require a space, or spaces?
[1,2]>>=return
[1,2] >>=return
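As a rough illustration of where spaces are optional, the sketch below collects two space-free spellings that should lex fine, because every operator is followed by a letter, a bracket or a parenthesis rather than by another symbol character. The names noSpaces and bracketed are made up for this example.

```haskell
-- ">>=" is followed by a letter, so the operator name ends there.
noSpaces :: [Int]
noSpaces = [1,2]>>=return

-- ">>=" is followed by "(", which is not a symbol character, so the
-- backslash of the lambda is lexed on its own.
bracketed :: [(Int, Char)]
bracketed = [1,2]>>=(\x->(['a','b']>>=(\y->return (x,y))))

main :: IO ()
main = do
  print noSpaces
  print bracketed
```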

Related

Understanding Haskell's Laziness

I was reading: Haskell interact function
So I tried
interact (unlines . map (show . length) . lines)
And it worked as I expected. I type something, press enter, then I get the length printed at the prompt.
So then I wanted to try make it simply repeat what I typed, so I tried
interact (unlines . map id . lines)
But now it repeats every character I type in. Why is that? I thought the trick was in the lines followed by unlines - but it's clearly not. lines "a" produces ["a"], so how come in the first function when I start typing my input, it doesn't just immediately give "1" as the output? There's clearly something I misunderstand about "Finding the length of a string is not like this -- the whole string must be known before any output can be produced."
The fact that lines "a" produces ["a"] does not mean that if you are currently entering a, that lines just processes the input to a list ["a"]. You should see the input as a (possibly) infinite list of characters. In case the prompt is waiting for user input, it is thus "blocking" on the next input.
That however does not mean that functions like lines can not partially resolve the result already. lines has been implemented in a lazy manner such that it processes the stream of characters, and each time when it sees a new line character, it starts emitting the next element. This thus means that lines could process an infinite sequence of characters into an infinite list of lines.
If you use length :: Foldable f => f a -> Int however, then this requires the list to be evaluated (not the elements of the list however). So that means length will only emit an answer from the moment lines starts emitting the next item.
You can use seq (and variants) to force the evaluation of a term before a certain action is done. For example seq :: a -> b -> b will evaluate the first parameter to Weak Head Normal Form (WHNF), and then return the second parameter.
Based on seq, other functions have been constructed, like seqList :: [a] -> b -> b in the Data.Lists module of the lists package, which is used with two arguments just like seq.
We can use this to postpone evaluation until the first line is known, like:
-- will echo full lines
import Data.Lists(seqList)
interact (unlines . map (\x -> seqList x x) . lines)
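The points about length and seq can be checked with a small sketch that uses only base. The helper forceSpine is a hypothetical stand-in written for this example (it is not the seqList from the lists package): it walks the whole list with length before returning its second argument.

```haskell
-- Hypothetical helper: evaluate the spine of the list (not its elements),
-- then return the second argument.
forceSpine :: [a] -> b -> b
forceSpine xs y = length xs `seq` y

-- Only hands a line on once the whole line is known.
echoLines :: String -> String
echoLines = unlines . map (\l -> forceSpine l l) . lines

main :: IO ()
main = do
  -- length needs the list structure, but never touches the elements.
  print (length [undefined, undefined :: Int])
  -- seq stops at the outermost constructor (WHNF), here the cons cell.
  putStrLn (((undefined : undefined) :: [Int]) `seq` "cons cell is in WHNF")
  -- Pure demonstration on a fixed input.
  putStr (echoLines "first\nsecond\n")
```

In a real program one would write interact echoLines instead of the fixed input. How the output interleaves with typing also depends on the terminal's and the handles' buffering modes.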
This is to do with lazy evaluation. I'll try to explain this in as intuitive a way as possible.
When you write interact (unlines . map (show . length) . lines), every time a character is input, we don't actually know what the next output character can be until you press enter. So, you get the behaviour you expected.
However, interact (unlines . map id . lines) behaves essentially like interact id: every time you enter a character, it's guaranteed that that character is included in the output. So, if you input a character, that character is also output immediately.
This is one of the reasons that the word "lazy" is a bit of a misnomer. It's true that Haskell will only evaluate something when it needs to, but the flipside of that is that when it needs to, it'll do so as soon as possible. Here Haskell needs to evaluate the output since you want to print it, so it evaluates it as much as it can—one character at a time—ironically making it seem eager!
More specifically, interact isn't intended for real time user input—it's intended for file input, in which you pipe a file into an executable with bash. It should be run something like this:
$ runhaskell Interactor.hs < my_big_file.txt > list_of_lengths.txt
If you want line-by-line buffering, you'll probably have to do it manually, unless you want to 'trick' the compiler as Willem does. Here's some very simple code that works as you expect. Note that, unlike interact, which terminates cleanly at EOF, this loop has no exit condition of its own: at EOF getLine throws an exception.
main = do
  ln <- getLine -- Blocks until you press enter
  putStrLn ln   -- Print the line we just got
  main          -- Loop forever

Can anything done in a Haskell script be reproduced in a GHCi session?

I want to run the function
act :: IO (Char, Char)
act = do x <- getChar
         getChar
         y <- getChar
         return (x,y)
interactively in a GHCi session. I've seen elsewhere that you can define a function in a session by using the semi-colon to replace a line-break. However, when I write
act :: IO(Char, Char); act = do x <- getChar; getChar; y <- getChar; return (x,y)
it doesn't compile, saying
parse error on input ‘;’
I've elsewhere seen that :{ ... }: can be used for multiple line commands, but typing
:{ act :: IO(Char, Char)
and then hitting enter causes an error--perhaps I'm misunderstanding how to use them.
Besides just getting this particular case to work, is there a generic way of taking code that would run in a Haskell script and making it run in an interactive session?
You can't just insert semicolons to replace each line break. Doing stuff on one line means opting out of the layout rule, so you have to insert your own semicolons and braces. This means you need to know where those braces and semicolons would be required without the layout rule. For this case in particular, each do block needs braces around the whole block, and semicolons between each operation. The layout rule normally inserts these for you based on indentation.
So to write this specific example on one line, you can do this:
let act :: IO(Char, Char); act = do {x <- getChar; getChar; y <- getChar; return (x,y)}
On a new enough version of ghci you can omit the let as well.
For simple enough do blocks you might even get away with omitting the braces. In your example there's only one place the { and } could possibly go, and so GHCi inserts them even when you do everything on one line. But for an expression with multiple do blocks or other multiline constructs, you will need to insert them explicitly if you want them on one line.
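As a sketch of what the layout rule does for you, the following file writes the same do block with layout and with explicit braces and semicolons, plus a nested case where the braces are needed to keep everything on one line. The Maybe monad and all the names here are just illustrative choices, not anything from the question.

```haskell
-- Layout version: indentation decides where the block and statements end.
pairLayout :: Maybe (Int, Int)
pairLayout = do
  x <- Just 1
  y <- Just 2
  return (x, y)

-- Explicit version: braces delimit the block, semicolons the statements.
pairExplicit :: Maybe (Int, Int)
pairExplicit = do { x <- Just 1; y <- Just 2; return (x, y) }

-- Two do blocks on one line: each block gets its own pair of braces.
nested :: Maybe Int
nested = do { x <- Just 1; y <- do { z <- Just 2; return (z * 10) }; return (x + y) }

main :: IO ()
main = do { print pairLayout; print pairExplicit; print nested }
```

The one-line forms can be pasted into GHCi as they stand (prefixed with let on older versions).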
On the broader question:
Besides just getting this particular case to work, is there a generic way of taking code that would run in a Haskell script and making it run in an interactive session?
The closest thing I know of is using the multiline delimiters, :{ and :} (each on a line of its own). They can handle almost anything you can throw at them. They can't handle imports (GHCi does support the full import syntax, but each import must be on a line of its own) or pragmas (the only alternative is :set, which also needs a line all of its own), which means you have to separate them from the rest of the code and enter them beforehand.
(You can always save the code somewhere and load the file with :l, and that will often turn out to be the more convenient option. Still, I have a soft spot for :{ and :} -- if I want no more than trying out half a dozen lines of impromptu code with no context, I tend to open a text editor window, write the little snippet and paste it directly in GHCi.)

Meaning of single colon in Haskell :t

As far as I've been able to gather, the single colon in Haskell is used in list comprehension. Why then does it show up in the :t command? Also in the :quit command? There isn't any list comprehension being done, is there?
The :t (short for :type) syntax is special to GHCi, and is not part of the Haskell language syntax. This is similar to how the SQLite interpreter accepts .tables as a command, even though this isn't a valid SQL statement. If you type :?, you can see a complete list of all the commands GHCi understands.
As for using the colon in actual Haskell code:
A colon by itself is a list constructor. This is a reserved name, and can never be redefined.
You should know that function names always start lowercase, while constructor names always start uppercase. Well, in a similar way, an infix constructor must start with a colon, whereas a normal infix operator must not start with a colon (but may contain colons elsewhere).
So, for example, ?:? is a legal operator name, and :?? is a legal constructor operator name.
x ?:? y = ...whatever...
data Foobar = Int :?? Bool
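A complete, compilable version of those two declarations might look like the sketch below. The body given to ?:? and the describe function are arbitrary choices made for this example, since the original leaves the definition open.

```haskell
-- An infix constructor: its name must start with a colon.
data Foobar = Int :?? Bool
  deriving (Show, Eq)

-- An ordinary infix operator: its name must not start with a colon.
-- The body is an arbitrary placeholder.
(?:?) :: Int -> Int -> Int
x ?:? y = x * 10 + y

-- Infix constructors can be used in patterns, just like (:) on lists.
describe :: Foobar -> String
describe (n :?? b) = show n ++ " / " ++ show b

main :: IO ()
main = do
  print (3 ?:? 4)
  putStrLn (describe (5 :?? True))
  print (1 : 2 : [] :: [Int])  -- the built-in list constructor
```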

Haskell Parsec strange issue with multiple expression occurrences

Here is the code, which to my mind shouldn't cause any issue but for some reason does:
program = expr8
<|> seqOfStmt
seqOfStmt =
do list <- (sepBy1 expr8 whiteSpace)
return $ if length list == 1 then head list else Seq list
I get 3 errors all in respect to 'list' not being in scope?
It's probably blatantly obvious what is going wrong but I can't figure out why
If there are any alternatives to this I would greatly like to hear them !
Thanks in advance,
Seán
Your final line uses a tab character for indentation, while the other lines use spaces only.
You have tabs set to four spaces in your editor, but GHC uses eight-character tab stops (just as terminals do).
Therefore your return line is parsed as a continuation of the previous line, and list is not yet in scope.
One easy way to fix this is to refrain from using tabs: use spaces only.
Once you've fixed that, your next error will probably be a type error: head list and Seq list have different types (unless perhaps you have redefined head for some reason). It's not clear why you want to treat the list differently if it contains only a single element.
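To illustrate the typing point without Parsec, here is a small sketch. The Expr type is hypothetical, since the question does not show the real AST. Both branches type-check only because Seq builds a value of the same type as the list's elements; if Seq produced a different type (say, a statement type), the if expression would be rejected.

```haskell
-- Hypothetical AST: Seq wraps a list of Expr and is itself an Expr.
data Expr = Num Int | Seq [Expr]
  deriving (Show, Eq)

-- Pattern matching replaces the "length list == 1 ... head list" test.
collapse :: [Expr] -> Expr
collapse [single] = single
collapse list     = Seq list

main :: IO ()
main = do
  print (collapse [Num 1])
  print (collapse [Num 1, Num 2])
```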

How to make Haskell or ghci able to show Chinese characters and run Chinese characters named scripts?

I want to make a Haskell script to read files in my /home folder. However there are many files named with Chinese characters, and Haskell and Ghci cannot manage it. It seems Haskell and Ghci aren't good at displaying UTF-8 characters.
Here is what I encountered:
Prelude> "让Haskell或者Ghci能正确显示汉字并且读取汉字命名的文档"
"\35753Haskell\25110\32773Ghci\33021\27491\30830\26174\31034\27721\23383\24182\19988\35835\21462\27721\23383\21629\21517\30340\25991\26723"
Prelude> putStrLn "\35753Haskell\25110\32773Ghci\33021\27491\30830\26174\31034\27721\23383\24182\19988\35835\21462\27721\23383\21629\21517\30340\25991\26723"
让Haskell或者Ghci能正确显示汉字并且读取汉字命名的文档
GHC handles Unicode just fine. These are the things you should know about it:
It uses your system encoding for converting from bytes to characters and back when reading from or writing to the console. Since it did the conversion from bytes to characters properly in your example, I'd say your system encoding is set properly.
The show function on String has a limited output character set. The show function is used by GHCi to print the result of evaluating an expression, and by the print function to convert the value passed in to a String representation.
The putStr and putStrLn functions are for actually writing a String to the console exactly as it was provided to them.
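The difference between show and putStrLn can be seen in a short sketch. The call to hSetEncoding is an addition of this example, not something the answer requires: it forces UTF-8 on stdout so the program does not depend on the locale it runs under. The names sample and escaped are made up.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

sample :: String
sample = "汉字"

-- show escapes non-ASCII characters as decimal code points and adds quotes.
escaped :: String
escaped = show sample

main :: IO ()
main = do
  hSetEncoding stdout utf8  -- assumption: override the system encoding
  putStrLn escaped          -- what print or GHCi would display
  putStrLn sample           -- the characters themselves
```

The string itself is unchanged in both cases; only the rendering chosen by show differs.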
Thanks to Carl, I used putStrLn as a wrapper around my function:
ghci> let removeNonUppercase st = [c | c <- st, c `elem` ['А'..'Я']]
ghci> putStrLn (removeNonUppercase "Ха-ха-ха! А-ха-ха!")
ХА
Everything works fine!
