Meaning of single colon in Haskell :t - haskell

As far as I've been able to gather, the single colon in Haskell is used in list comprehension. Why then does it show up in the :t command? Also in the :quit command? There isn't any list comprehension being done, is there?

The :t (short for :type) syntax is special to GHCi, and is not part of the Haskell language syntax. This is similar to how the SQLite interpreter accepts .tables as a command, even though this isn't a valid SQL statement. If you type :?, you can see a complete list of all the commands GHCi understands.
As for using the colon in actual Haskell code:
A colon by itself is a list constructor. This is a reserved name, and can never be redefined.
You should know that function names always start lowercase, while constructor names always start uppercase. Well, in a similar way, an infix constructor must start with a colon, whereas a normal infix operator must not start with a colon (but may contain colons elsewhere).
So, for example, "?:?" is a legal operator name, and :?? is a legal constructor operator name.
x ?:? y = ...whatever...
data Foobar = Int :?? Bool
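As a small, self-contained illustration of the naming rule above, here is a sketch you can load into GHCi or compile. The operator ?:?, the type Foobar, and the helper describe are made-up names, and defining ?:? as addition is an arbitrary choice for the example:

```haskell
-- An ordinary infix operator: its name starts with '?', not ':',
-- so it is a plain function (hypothetical name and behaviour).
infixl 6 ?:?
(?:?) :: Int -> Int -> Int
x ?:? y = x + y

-- An infix data constructor: its name must start with ':'.
data Foobar = Int :?? Bool
  deriving (Show, Eq)

-- The built-in list constructor ':' prepends an element to a list.
numbers :: [Int]
numbers = 1 : 2 : 3 : []

-- Constructors, including infix ones, can be used in patterns.
describe :: Foobar -> String
describe (n :?? flag) = show n ++ (if flag then " is set" else " is unset")

main :: IO ()
main = do
  print (1 ?:? 2)                   -- 3
  print numbers                     -- [1,2,3]
  putStrLn (describe (5 :?? True))  -- 5 is set
```

Swapping the names around (for example, trying to declare a constructor called ?:?) should be rejected by the compiler, which is the rule described above.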

Related

Understanding Haskell's Laziness

I was reading: Haskell interact function
So I tried
interact (unlines . map (show . length) . lines)
And it worked as I expected. I type something, press enter, then I get the length printed at the prompt.
So then I wanted to try make it simply repeat what I typed, so I tried
interact (unlines . map id . lines)
But now it repeats every character I type in. Why is that? I thought the trick was in the lines followed by unlines - but it's clearly not. lines "a" produces ["a"], so how come in the first function when I start typing my input, it doesn't just immediately give "1" as the output? There's clearly something I misunderstand about "Finding the length of a string is not like this -- the whole string must be known before any output can be produced."
The fact that lines "a" produces ["a"] does not mean that if you are currently entering a, that lines just processes the input to a list ["a"]. You should see the input as a (possibly) infinite list of characters. In case the prompt is waiting for user input, it is thus "blocking" on the next input.
That however does not mean that functions like lines can not partially resolve the result already. lines has been implemented in a lazy manner such that it processes the stream of characters, and each time when it sees a new line character, it starts emitting the next element. This thus means that lines could process an infinite sequence of characters into an infinite list of lines.
However, length :: Foldable f => f a -> Int has to walk the entire list it is given (its spine, though not the elements themselves). Applied to a line, length can therefore only produce an answer once lines has seen the newline that ends that line and moves on to the next item.
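The two points above can be checked without a keyboard. In this sketch the endless input is simulated with cycle; endlessInput, firstLines and firstLengths are names invented for the example:

```haskell
-- An endless character stream standing in for keyboard input
-- (endlessInput is a made-up name for this sketch).
endlessInput :: String
endlessInput = cycle "hello\nhi\n"

-- lines yields each line as soon as its newline is seen, so taking a
-- finite prefix of the result terminates even though the input never ends.
firstLines :: [String]
firstLines = take 3 (lines endlessInput)

-- length must reach the end of a line before it can answer, but it only
-- needs that one line, not the rest of the stream.
firstLengths :: [Int]
firstLengths = take 3 (map length (lines endlessInput))

main :: IO ()
main = do
  print firstLines    -- ["hello","hi","hello"]
  print firstLengths  -- [5,2,5]
```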
You can use seq (and variants) to force the evaluation of a term before a certain action is done. For example seq :: a -> b -> b will evaluate the first parameter to Weak Head Normal Form (WHNF), and then return the second parameter.
Based on seq, other functions have been constructed, like seqList :: [a] -> b -> b in the Data.Lists module of the lists package.
We can use this to postpone evaluation until the first line is known, like:
-- will echo full lines
import Data.Lists(seqList)
interact (unlines . map (\x -> seqList x x) . lines)
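If you would rather not depend on an extra package, roughly the same effect can be had with seq and length from the Prelude. This is a substitute sketch, not the seqList function itself; forceLine and echoFullLines are made-up names:

```haskell
-- Return the line unchanged, but only after its whole spine has been
-- walked (forceLine is a made-up helper, not a library function).
forceLine :: String -> String
forceLine l = length l `seq` l

-- Echo the input line by line: no character of a line appears in the
-- output until that line is complete.
echoFullLines :: String -> String
echoFullLines = unlines . map forceLine . lines

main :: IO ()
main = interact echoFullLines
```

Because length has to reach the end of the line, the first character of each output line cannot be produced until the newline (or end of input) has been read.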
This is to do with lazy evaluation. I'll try to explain this in as intuitive a way as possible.
When you write interact (unlines . map (show . length) . lines), every time a character is input, we don't actually know what the next output character can be until you press enter. So, you get the behaviour you expected.
However, at every point in interact (unlines . map id . lines) = interact id, every time you enter a character, it's guaranteed that that character is included in the output. So, if you input a character, that character is also output immediately.
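One way to see this without typing anything is to model "the part of the input that has not been typed yet" with undefined. This is only a sketch; echoChars, lineLengths and typedSoFar are names invented for the example:

```haskell
-- The echoing pipeline from the question.
echoChars :: String -> String
echoChars = unlines . map id . lines

-- The length-printing pipeline from the question.
lineLengths :: String -> String
lineLengths = unlines . map (show . length) . lines

-- Input where only "ab" has been typed so far; the rest is modelled by
-- undefined, which would crash the program if it were ever inspected.
typedSoFar :: String
typedSoFar = "ab" ++ undefined

main :: IO ()
main = do
  -- The first two output characters do not depend on the untyped rest.
  putStrLn (take 2 (echoChars typedSoFar))  -- ab
  -- With complete lines, the length pipeline produces one number per line.
  putStr (lineLengths "ab\nhello\n")        -- 2, then 5
```

Asking lineLengths for even one character of output on typedSoFar would hit the undefined, because length needs to see the end of the line first.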
This is one of the reasons that the word "lazy" is a bit of a misnomer. It's true that Haskell will only evaluate something when it needs to, but the flipside of that is that when it needs to, it'll do so as soon as possible. Here Haskell needs to evaluate the output since you want to print it, so it evaluates it as much as it can—one character at a time—ironically making it seem eager!
More specifically, interact isn't intended for real time user input—it's intended for file input, in which you pipe a file into an executable with bash. It should be run something like this:
$ runhaskell Interactor.hs < my_big_file.txt > list_of_lengths.txt
If you want line-by-line buffering, you'll probably have to do it manually, unless you want to 'trick' the compiler as Willem does. Here's some very simple code that works as you expect—but note that it has no exit state unlike interact, which will terminate at the EOF.
main = do
  ln <- getLine -- Buffers until you press enter
  putStrLn ln -- Print the line we just got
  main -- Loop forever

Why does this code not parse without whitespace?

So I am learning Monads and was playing around with the following expression:
[1,2] >>= \x -> ['a','b'] >>= \y -> return (x,y)
The above code produces the result [(1,'a'),(1,'b'),(2,'a'),(2,'b')] as expected.
But since I was just experimenting, I got lazy and I entered:
[1,2]>>=\x->['a','b']>>=\y->return (x,y) (same code as above but without white-spaces)
which doesn't seem to work.
I understand that if I properly bracket out this expression as
[1,2]>>=(\x->(['a','b']>>=(\y->return (x,y))))
it will work (better I just put spaces than these monstrous brackets) but I don't get why the expression with white-space works whereas the other one doesn't.
You need spaces to separate identifier names: foo bar is two separate names, whereas foobar (without the space) is just one name.
The same thing happens with operators. Haskell allows arbitrary user-defined operators; if you want to write a function named ??++!?!, then go for it! But you must use spaces to separate operators from one another.
Just as >>= is not the same thing as >> =, so >>=\ isn't the same as >>= \. You could actually define a function named >>=\ if you wanted. But the space lets the Haskell language parser know this is two things, not one.
To fully understand this, you need to look at Chapter 2 of the Haskell Report, particularly section 2.4. The lexeme -> is a reservedop, >>= is not.
For example, does this expression require a space, or spaces?
[1,2]>>=return
[1,2] >>=return
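To make the lexing rule concrete, here is a sketch that compiles as a single file. The operator >>=\ is defined purely to show that it is a legal name (its zip behaviour is an arbitrary choice), and spaced, tight and noSpaces are made-up names:

```haskell
-- A made-up operator whose name is the four characters > > = \ run together,
-- defined only to show that the lexer accepts it as one operator symbol.
(>>=\) :: [a] -> [b] -> [(a, b)]
xs >>=\ ys = zip xs ys

-- The original expression with generous spacing.
spaced :: [(Int, Char)]
spaced = [1,2] >>= \x -> ['a','b'] >>= \y -> return (x,y)

-- The only spaces that matter are the ones between >>= and the backslash.
tight :: [(Int, Char)]
tight = [1,2]>>= \x->['a','b']>>= \y->return(x,y)

-- No space is needed here: ']' and letters cannot be part of an operator.
noSpaces :: [Int]
noSpaces = [1,2]>>=return

main :: IO ()
main = do
  print spaced
  print (tight == spaced)             -- True
  print noSpaces                      -- [1,2]
  print ([1,2 :: Int] >>=\ "ab")      -- uses the made-up operator
```

The -> in \x->[ needs no spaces because -> is a reserved operator followed by a non-symbol character, so the lexer cannot extend it into a longer operator name.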

Compiled vs Interpreted: To Let or Not To Let

Why does the Haskell interpreter (GHCi 7.10.3) need function definitions to be in a let expression, but the Haskell compiler (GHC 7.10.3) throws a parse error if a function definition is within a let expression?
I'm working through "Learn You a Haskell for Great Good!" Baby's first function is doubleMe:
doubleMe x = x + x
Why does the interpreter accept this definition if it is within a let expression and otherwise throw a parse error on input '='? Meanwhile, if I'm compiling the same function from a file, why does GHC throw a parse error if the function definition is within a let expression and compile the definition if it is not within a let expression? Coming from a Lisp background, I'm surprised that interactive Haskell and file loading and compilation Haskell treat these definitions differently.
The reasoning behind this is that GHCi (in 7.10.3) expects at the prompt only
commands (type in :h to list the commands available)
declarations (things like data, type, newtype, class, instance, deriving, and foreign but not a regular definition)
imports
expressions (things like 1+1 or let x = 3 in x*x)
I/O actions / do statements (things like print "hi" or x <- getLine OR let doubleMe x = x + x)
If this seems surprising to you, remember that the evaluation of Lisp and Haskell is very different - Lisp just gets interpreted, while Haskell is being compiled.
As you can tell, top-level definitions are not part of this list. Thankfully this got fixed in GHCi 8.0.1, which now supports raw top-level function declarations. The following works (in 8.0.1):
ghci> doubleMe x = x + x
ghci> doubleMe 1
2
The GHCi interpreter command line treats its input as if it were in a do clause. So you can type this:
:module + System.Random
v <- getStdRandom $ randomR (1,10)
Apart from the :module directive this is exactly how it would be in a do clause.
Likewise you can write
let f x = 2 * x
because that is how it would be in a do clause.
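The same contrast can be seen inside one source file. In this sketch (f and the sample arguments are arbitrary choices for the example), the top-level definition is written without let, while the local definition inside the do block needs it, just like at the GHCi 7.10 prompt:

```haskell
-- In a source file, definitions go at the top level without let.
doubleMe :: Int -> Int
doubleMe x = x + x

main :: IO ()
main = do
  -- Inside a do block, a local definition needs let, which is also
  -- what the GHCi 7.10 prompt expects.
  let f x = 2 * x
  print (doubleMe 1)      -- 2
  print (f (21 :: Int))   -- 42
```

Writing let doubleMe x = x + x at the top level of the file, outside any do block, is what triggers the compiler's parse error mentioned in the question.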
Modern Lisp implementations compile to native code, often by default even when code is entered at the prompt. Lisp's prompt isn't just a place to enter commands, it's a place to interact with the language because the entire language is made available by the Read-Evaluate-Print Loop. This means that Lisp reads the text into symbolic expressions, which it then evaluates, printing any print output and any returned values. For example,
? (defun a-fun () nil)
A-FUN
? (compiled-function-p #'a-fun)
T
Compiled-Function-P
Clozure Common Lisp
In Lisp, any code that you can bring into the Lisp image by compiling and loading a file can also be entered by typing it at the REPL. So it turns out I was surprised because I was expecting the GHCi prompt to be a REPL, but as @Alec describes, it is not, because it doesn't read text into Haskell expressions that it would then evaluate, as Lisp does. As @dfeuer says, the issue isn't about compilation versus interpretation. The issue is that GHCi's prompt offers limited interaction with a Haskell compiler, rather than interaction with Haskell itself as Lisp's REPL does.

Why are unpack and show defined differently in Data.Text (and behave differently for non-ASCII characters?)

unpack and show are two ways to convert Text to a String. They, however, behave and are defined differently for non-ASCII characters:
Prelude Data.Text> putStrLn $ unpack $ pack "你好我的朋友"
你好我的朋友
Prelude Data.Text> putStrLn $ show $ pack "你好我的朋友"
"\20320\22909\25105\30340\26379\21451"
I believe show returns a string of escaped code points, while unpack returns the actual characters. I have found this to be a nuisance while coding, as I had defined functions that take a Show instance and wanted to pass in Text, and expected them to return the actual non-ASCII characters as a String.
What was the design intent for this behavior? Why were show and unpack defined differently?
The source can be found at http://hackage.haskell.org/packages/archive/text/0.11.1.5/doc/html/src/Data-Text.html.
This is a general thing about Show: it is intended to produce a kind of preview of objects that can double as a portable serialisation, readable as Haskell code. Obviously, 你好我的朋友 is not a valid Haskell expression (unless you define it as a variable, which you actually can!), so it would not be acceptable as output of show. It would be quite OK if show produced "你好我的朋友" (in fact, I would prefer that), but this might cause platform-dependent problems when you are not consistently using full UTF-8 throughout your tool chain, so the safer expansion to ASCII was chosen.
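The "readable as Haskell code" property can be checked with plain String, which needs nothing beyond the Prelude. This sketch assumes, as the output in the question suggests, that Text's Show instance escapes characters the same way String's does; greeting, shown and roundTrip are made-up names:

```haskell
-- A short non-ASCII string (the first two characters from the question).
greeting :: String
greeting = "你好"

-- show escapes non-ASCII characters as decimal code points and adds
-- quotes, so the result is an ASCII-only Haskell string literal.
shown :: String
shown = show greeting

-- Reading the literal back recovers the original characters.
roundTrip :: String
roundTrip = read shown

main :: IO ()
main = do
  putStrLn shown                 -- "\20320\22909"
  print (roundTrip == greeting)  -- True
```

So show trades readability for a literal that survives any ASCII-only channel, while unpack (or using the String directly) gives you the characters themselves.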
If you want the nice non-escaped plain-string output as the GHCi echo, you can use the new custom-pretty-printer feature. I already wrote something about that here.

Haskell Parsec strange issue with multiple expression occurrences

Here is the code, which to my mind shouldn't cause any issue but for some reason does:
program = expr8
<|> seqOfStmt
seqOfStmt =
do list <- (sepBy1 expr8 whiteSpace)
return $ if length list == 1 then head list else Seq list
I get 3 errors all in respect to 'list' not being in scope?
It's probably blatantly obvious what is going wrong but I can't figure out why
If there are any alternatives to this I would greatly like to hear them !
Thanks in advance,
Seán
Your final line uses a tab character for indentation, while the other lines use spaces only.
You have tabs set to four spaces in your editor, but ghc uses eight character tab stops (just as terminals do).
Therefore your return line is parsed as a continuation of the previous line, and list is not yet in scope.
One easy way to fix this is to refrain from using tabs: use spaces only.
Once you've fixed that, your next error will probably be a type error: head list and Seq list have different types (unless perhaps you have redefined head for some reason). It's not clear why you want to treat the list differently if it contains only a single element.
