When to use tabs and when to use spaces in Haskell? - haskell

I am wondering when I should use tabs and when I should use spaces?
Especially in guards: I'm working through the Learn You a Haskell book, and it said I should always use spaces.
The book itself seems to use 4 spaces in definition with guards though.
For example this function:
replicate' :: (Num i, Ord i) => i -> a -> [a]
replicate' n x
    | n <= 0 = []
    | otherwise = x:replicate' (n-1) x
When I replace the 4 spaces/tabs with a single space I get an indentation / missed brackets error in the otherwise case.
However, if I use a tab or 4 spaces this works.
Did I misunderstand something about using spaces over tabs? Should it be 4 spaces each time?
Because often 1 space does work, just with guards ghci is almost always (infuriatingly not always) complaining here.
I'm using sublime btw, in case there is an issue there.
Thanks a lot in advance.
For example:
maximum' [] = error "maximum of empty list"
maximum' [x] = x
maximum' (x:xs)
    | x > maxTail = x
    | otherwise = maxTail
    where maxTail = maximum' xs
throws an indentation error

It sounds like the OP's actual problem was a weird state in a particular file, but I thought I'd provide an answer to the general question here.
Most parts of Haskell syntax are completely insensitive to indentation (most of the common practices about laying out Haskell code are stylistic, rather than necessary). For example, all of these ways of writing the last equation in the OP's example work just fine:
-- guards indented more than where
maximum' (x:xs)
    | x > maxTail = x
    | otherwise = maxTail
  where maxTail = maximum' xs
-- where indented more than guards
maximum' (x:xs)
  | x > maxTail = x
  | otherwise = maxTail
    where maxTail = maximum' xs
-- all on one line, no indentation at all!
maximum' (x:xs) | x > maxTail = x | otherwise = maxTail where maxTail = maximum' xs
Even something horrifying like this works:
-- please, no
maximum' (x:xs) |
      x
 > maxTail = x
       | otherwise
  = maxTail where
    maxTail
     = maximum' xs
There are exactly 2 things you can do to mess up indentation in the code shown:
1. Use more than one line to define the equation, with any of the continuation lines not starting with at least one whitespace character.
2. Use more than one line to define the where clause, with any of the continuation lines starting at a character position less than or equal to that of the first character after the where keyword (i.e. the m in maxTail).
Otherwise, the whitespace in this example does not matter at all (apart from separating identifiers and keywords).
There is basically only one general way in which indentation matters in Haskell. And it's actually not indentation as such, but alignment that matters. That happens in the context of "blocks" containing a variable number of entries:
a let <decls> in <expr> expression contains 1 or more declarations in the <decls> part
a where clause introduces 1 or more declarations
an instance definition's where part has zero or more method definitions
a do block has 1 or more statements
a case expression has 1 or more cases (zero or more with the EmptyCase extension)
etc, etc
The variable number of entries in these blocks are the only places where alignment matters. There is always a keyword introducing the block, and the character position of the first entry in the block sets the alignment. After that, every line that starts exactly at this character position is taken as the beginning of the next entry in the block, every line that starts past this position is taken as a continuation line of the previous entry, and the first line that starts before the alignment position is taken as ending the block (and the contents of this line are not part of the block). Sometimes there is also a keyword that will indicate the end of the block, regardless of any indentation (e.g. the in part of let <decls> in <expr> is indicated by the in keyword even if it's on the same line as part or all of the <decls>).
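To make the alignment rule concrete, here is a minimal sketch (the names viaLayout and viaBraces are made up for illustration). The first function relies on alignment inside a let block; the second writes the same block with explicit braces and semicolons, which is roughly what the layout rule produces behind the scenes:

```haskell
module Main where

-- Layout version: the first declaration after `let` (the `a`) fixes the
-- alignment column for the block.
viaLayout :: Int -> Int
viaLayout x =
  let a = x + 1
      b = a * 2      -- starts at the same column as `a`: a new declaration
            + 1      -- starts further right: continues the `b` declaration
  in a + b           -- the `in` keyword ends the block

-- The same function with the block delimited explicitly; inside the
-- braces, alignment no longer matters.
viaBraces :: Int -> Int
viaBraces x = let { a = x + 1; b = a * 2
 + 1 } in a + b

main :: IO ()
main = print (viaLayout 3, viaBraces 3)
```

Both functions compute the same value; only the way the block's entries are delimited differs.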
As an aside, you may at this point be wondering why it's possible to get an error with code like this:
bar x y
= x + y
I haven't used any where, let, etc blocks above, but it's possible to get an alignment error here by continuing the bar definition onto a new line without indentation? Didn't I promise indentation only matters in blocks? Well, actually the entire global scope of a module is an aligned block! We just usually don't notice it because it's conventional to use alignment position 0 for this block. But technically, that's what's going on (thus you can't have a continuation line for one of the declarations in the global block that starts at alignment 0).
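One way to see that the top level really is an ordinary block is to delimit it explicitly. In this sketch (a standalone module, not part of the OP's code), the module body is wrapped in braces with semicolons between declarations, and the unindented continuation line that normally fails is accepted:

```haskell
module Main where {

bar :: Int -> Int -> Int;
bar x y
= x + y;

main :: IO ();
main = print (bar 1 2)
}
```

Nobody writes modules this way in practice; it only illustrates that the error comes from the alignment rule applied to the global block, not from a special rule about top-level definitions.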
This layout based on alignment rather than indentation is why tabs are often considered difficult to use to layout Haskell code. As an example, consider this:
foo x y z = xy + yz
    where xy = x * y
          yz = y * z
Here I have used 4 spaces to indent the where part, and this is one of those places where the whitespace is completely irrelevant, so I could have used anything I like. Therefore, if I'm accustomed to using tabs as indentation in other programming languages, I might have been tempted to use a tab rather than 4 spaces.
Where things get nasty is that the correct indentation of the yz = y * z line is not "2 indent levels in", but rather "lining up exactly with the xy = x * y definition". So if I had used a tab to indent the where, the only correct way to indent the following line is to use a tab followed by 6 spaces. In my experience this is something that even smart formatting code editors never get right (let alone humans doing it manually); it is far more likely that if my view settings have a tab take up less space than the where keyword (such as the common 4 spaces) I will get at least 2 leading tabs, followed by enough spaces to make the yz = y * z line appear to line up with the definition above.
Haskell compilers, by the spec, treat tab stops as eight spaces apart. So the situation I described above (where the first definition in the where is at 1 tab plus 6 normal characters and the second is at 2 tabs plus 2 normal characters) results in an invisible error. The compiler thinks these definitions are at positions 14 and 18, but to me they look the same. This sort of problem is not fun. Hence the upvoted comment "When to use tabs? Never! That was an easy one."
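The arithmetic behind those two positions can be checked with a small sketch. The helper columnAfter is hypothetical (it is not part of any compiler); it just computes the 0-based column reached after a run of leading characters for a given tab width:

```haskell
module Main where

import Data.List (foldl')

-- Column (0-based) reached after reading a run of characters, with tab
-- stops every `tabWidth` columns. A tab jumps to the next tab stop; any
-- other character advances by one.
columnAfter :: Int -> String -> Int
columnAfter tabWidth = foldl' step 0
  where
    step col '\t' = (col `div` tabWidth + 1) * tabWidth
    step col _    = col + 1

main :: IO ()
main = do
  let firstDef  = "\twhere "   -- one tab, then "where "
      secondDef = "\t\t  "     -- two tabs, then two spaces
  -- As an editor with 4-column tabs shows it: both definitions line up.
  print (columnAfter 4 firstDef, columnAfter 4 secondDef)
  -- As the compiler sees it, with 8-column tab stops: they do not.
  print (columnAfter 8 firstDef, columnAfter 8 secondDef)
```

With a tab width of 4 both prefixes end at column 10, while with a tab width of 8 they end at columns 14 and 18, which is the mismatch described above.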
Technically you can set your editor to show tab stops at 8 spaces, and then it doesn't matter whether a given amount of indentation is all spaces or any mix of tabs and spaces that looks the same. However, most people don't like to have their editor set to show tabs as 8 spaces, and fixing any particular number defeats the entire point of indenting using tabs (having the visual appearance of "indent levels" be something that each user can configure independently in their editor).
It is also possible to adopt a code style that avoids the problem. Basically: always end the line immediately after a keyword introducing a block, so that the block starts on a new line (which you bump up the next indent level). You would then write (for the OP's example):
maximum' (x:xs)
    | x > maxTail = x
    | otherwise = maxTail
    where
        maxTail = maximum' xs
If you do that then your alignment positions will always be an exact number of tabs and zero normal characters, so you will not end up forced to use leading space that is a mix of tabs and spaces. In fact Haskell's alignment rules become extremely similar to Python's indentation rules if you code like this (the major reason they are different is that Haskell allows you to start an aligned block on the same line as preceding code, whereas Python's blocks are always preceded by a line ending in a colon).
But by far the most common approach to using tabs in Haskell is: simply don't do it. Configure your editor to insert spaces up to the next "tab-stop" when you press the tab key, if you like. But make sure the physical source code file is saved with spaces.
Gratuitous soapbox time!
For me personally, the reasoning above is why I don't like to use tabs in any language. Sooner or later someone always ends up wanting to make something on one line visually align with something on another line, and this often needs indentation that is a mix of tabs and spaces. The tab and space mix is almost never correctly handled by the editor (to do so in general requires the editor to be able to tell when a line is a continuation line or the start of a new syntactic construct, before the coder has finished typing it, which is at best language-dependent and at worst just impossible). So they write code that is simply incorrectly formatted as soon as someone uses a different tab-width preference than they used.
An example would be this fairly common layout (in no particular language):
class Foo {
    public int foo(int x, char y, long listOfParameters,
                   bool z, double ooopsRanOutOfLetters) {
        codeStartsHere();
If indent levels are tabs, then the correct indentation for the continuation of the parameter list is 1 tab and 15 spaces, but someone is just as likely to get 4 tabs and 3 spaces, which throws off the alignment completely at any other tab-width setting. Basically, if indent levels are to be configured for each coder's preference (by setting the tab-width), then there is a fundamental difference between inserting an indent level and inserting a visually-equivalent number of spaces, requiring you to think about which you intend every time you hit the tab key. Even if the formatting is purely a visual aid to human readers and causes no change in how the compiler/interpreter will read the code, aiding human readers is arguably more important than merely writing something that the machine will accept.
And again, this problem can be addressed by rigidly adhering to a style guide that is carefully constructed to avoid layouts like the above ever happening. But I just don't want to have to think about that when I'm designing or evaluating a style guide, nor when I'm writing code. "Always indent with spaces" is an incredibly simple rule to put in a style guide, and then it's never an issue regardless of the other rules you adopt (and regardless of whether those other rules are strictly followed or there are exceptions).

Only use spaces, because Haskell is indentation-sensitive and implicit layout blocks must start on a column greater than their layout keywords, so it's important to keep track of columns exactly.
Furthermore, tab stops are 8 columns apart according to Haskell2010, which is huge by today's indentation standards which are usually at most 4 spaces.


Selecting multiple text spots with visual mode?

I'm doing some markdown editing in vim on a file. I'm trying to convert it to markdown with code highlighting, etc.
- Arithmetic operators:`+,−,*, /`
- Constants: `e`, `pi`
- Functions: (abs x), (max x y... ), (ceiling x) (expt x y), (exp x),
(cos x), ...
I want to select only the things that are in parentheses (including the parentheses) in the following using visual mode (so they would be disjoint, separated by the commas):
(abs x), (max x y... ), (ceiling x) (expt x y), (exp x),
(cos x), ...
And then do S` to surround each piece of text with backticks. How can I do this without selecting each one, then doing S` repeatedly?
How can I do this without selecting each one, then doing S` repeatedly?
This is actually what works best in Vim. With a bit of help from macros:
Interactive version:
/(.\{-})<CR>
qqysa)`nnq
@q
@@
@@
... till you do them all and wrap around to where you started.
Non-interactive "just do it" version:
:set nows<CR>
gg
/(.\{-})<CR>
qqqqqysa)`nn@qq@q
You'll probably want to go back to :set ws afterwards.
Of course, if you know that there are no nested parentheses, then the simplest answer is using :s, like the other answerer suggested.
EDIT with the explanation of the macro:
qqqqq...@qq@q is a loop. Here's how it works:
qq followed by q clears the q register. This will be important later.
qq starts the macro recording.
ysa) surrounds around the parentheses with `.
nn goes to the next match. We have to do it twice, because surround jumps to before the paren, and n will match the same parentheses again.
@q invokes the q macro. It is empty, so this does nothing... now. However, read further...
q stops the macro recording, and stores it to the q register.
Now that q is not empty any more, we can execute it with @q. However, during the execution q will still not be empty, so when we get to the point in the macro that did nothing during the recording, the macro will relaunch, giving us a primitive, but functioning, recursion loop.
The loop stops when something in it breaks: for example, not being able to go to the next match. Usually, you'd just change all the matches so no more matches remain; however, the edit this macro does does not make the match fail, so we have to rely on :set nows to make sure we don't continue infinitely adding backticks to all parentheses.
After a bit of thought, you can actually rewrite the pattern so that :set nows (and additional n) is not needed:
/`\@<!(.\{-})<CR>
qqqqqysa)`n@qq@q
This matches a pair of parentheses not preceded by a backtick, so that after all matches have been dealt with there is no match for n, naturally breaking the loop.
If anyone thinks this is complex... note that most editors plain can't do it (since this takes into account proper parenthesis nesting, whereas I haven't yet seen an editor with search-replace robust enough to be able to pull it off).
Using a global command (assuming `S`` comes from surround.vim):
:global/(/normal f(ysab`
(This affects the whole file, and may only do one change at a time. Repeat with @:)
With a macro:
qqf(ysab`q
Repeat with @q and then @@
Or with substitute:
:substitute/([^)]\+)/`&`/g

Can anything done in a Haskell script be reproduced in a GHCi session?

I want to run the function
act :: IO(Char, Char)
act = do x <- getChar
         getChar
         y <- getChar
         return (x,y)
interactively in a GHCi session. I've seen elsewhere that you can define a function in a session by using the semi-colon to replace a line-break. However, when I write
act :: IO(Char, Char); act = do x <- getChar; getChar; y <- getChar; return (x,y)
it doesn't compile, saying
parse error on input ‘;’
I've elsewhere seen that :{ ... }: can be used for multiple line commands, but typing
:{ act :: IO(Char, Char)
and then hitting enter causes an error--perhaps I'm misunderstanding how to use them.
Besides just getting this particular case to work, is there a generic way of taking code that would run in a Haskell script and making it run in an interactive session?
You can't just insert semicolons to replace each line break. Doing stuff on one line means opting out of the layout rule, so you have to insert your own semicolons and braces. This means you need to know where those braces and semicolons would be required without the layout rule. For this case in particular, each do block needs braces around the whole block, and semicolons between each operation. The layout rule normally inserts these for you based on indentation.
So to write this specific example on one line, you can do this:
let act :: IO(Char, Char); act = do {x <- getChar; getChar; y <- getChar; return (x,y)}
On a new enough version of ghci you can omit the let as well.
For simple enough do blocks you might even get away with omitting the braces. In your example there's only one place the { and } could possibly go, and so GHCi inserts them even when you do everything on one line. But for an expression with multiple do blocks or other multiline constructs, you will need to insert them explicitly if you want them on one line.
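As a sketch of what "insert them explicitly" looks like with more than one block involved, here is a made-up function (describe is not from the question) written once with layout and once on a single line. The one-line version needs braces around the do, let and case blocks and semicolons between their entries:

```haskell
module Main where

-- Layout version, as it might appear in a source file.
describe :: Int -> IO String
describe n = do
  let parity = case even n of
        True  -> "even"
        False -> "odd"
  return (show n ++ " is " ++ parity)

-- One-line version: every block (do, let, case) gets explicit braces and
-- semicolons, so no line breaks or alignment are needed.
describe' :: Int -> IO String
describe' n = do { let { parity = case even n of { True -> "even"; False -> "odd" } }; return (show n ++ " is " ++ parity) }

main :: IO ()
main = do
  a <- describe 3
  b <- describe' 3
  putStrLn a
  putStrLn b
```

The body of describe' is the form that can be pasted into GHCi as a single line.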
On the broader question:
Besides just getting this particular case to work, is there a generic way of taking code that would run in a Haskell script and making it run in an interactive session?
The closest thing I know of is using the multiline delimiters, :{ and :} (each on a line of its own). They can handle almost anything you can throw at them. They can't handle imports (GHCi does support the full import syntax, but each import must be on a line of its own) and pragmas (the only alternative is :set, which also needs a line all of its own), which means you can't help but separate them from the rest of the code and enter them beforehand.
(You can always save the code somewhere and load the file with :l, and that will often turn out to be the more convenient option. Still, I have a soft spot for :{ and :} -- if I want no more than trying out half a dozen lines of impromptu code with no context, I tend to open a text editor window, write the little snippet and paste it directly in GHCi.)

Replacing in Vim every fourth occurrence

I have a file like this:
Question 1
b) answer b
c) answer c
a) answer a
d) answer d
Question 2
a) answer a
d) answer d
b) answer b
c) answer c
All answers are unsorted. I need to set all the first answers (no matter what letter, just that they are the first answer) to v) and the rest to x), so the output will be:
Question 1
v) answer b
x) answer c
x) answer a
x) answer d
Question 2
v) answer a
x) answer d
x) answer b
x) answer c
Is it possible with a one-line command?
If you really want a one-liner, sure. Anything can be made into a one-liner if you use enough | to chain commands together.
You can do the iteration yourself and run a different substitution command depending on how many matches of the pattern you have seen.
:let i = 0 | g/^[abcd])/if i % 4 == 0 | s//v)/ | else | s//x)/ | endif | let i = i + 1
If all the questions have the same number of lines, you could do this.
Starting with the cursor on the first column of the first line (in normal mode), you could record a macro (here, into a) that will do this for one block:
qajrvj^V}0rx'>)q
In more detail:
qa record a macro in register a
jrv move down one line and replace the first character with v
j^V} (type ^V as Ctrl-V) visually block select the first characters of the rest of the lines of the block
0 make sure we are on column 0 (this is important if we run into the end of the file)
rx replace these characters with x
'> go to the end of the last visual selection
) go to the next sentence (the beginning of the next question)
q stop recording the macro
This assumes all questions are on one line (this would be easy to change), that we are at the first column of the question line when the macro is run (also a pretty small change to generalize this) and that all answers are one line (the approach would need to be changed a bit, but generalizing this wouldn't be too much harder).
Now you can run this again with @a (and after that, @@ repeats the last run macro). If you want to turn this into a recursive macro (which will keep running until one of the actions, probably a motion, fails), you can use this to append the recursive call to the end of the macro:
qA@aq
Using a capital register name means append to that register, rather than overwrite it. If you do this, and all the questions follow the restrictions, you can apply this replacement to an entire file by running the macro only once (manually).
You could also turn this into a function using the :normal command, if you wanted to keep it around for later use. If you do this, I would suggest spreading it out over several :normal commands and commenting it similar to how I did above.
You could use something like this:
:%s/\d\+\_s\zs\l\ze)/v/ | %s/\D\_s\zs\l\ze)/x/
But this is only for your special case.
Little bit of explanation:
%s substitute in all lines
\d\+ one or more numbers
\_s white space or end-of-line
\zs and \ze start and end of the match
\l lowercase character: [a-z]
\D non-digit: [^0-9]
| can be used to separate commands, so you can give multiple commands in one line.

Haskell Parsec strange issue with multiple expression occurrences

Here is the code, which to my mind shouldn't cause any issue but for some reason does:
program = expr8
      <|> seqOfStmt
seqOfStmt =
    do list <- (sepBy1 expr8 whiteSpace)
       return $ if length list == 1 then head list else Seq list
I get 3 errors all in respect to 'list' not being in scope?
It's probably blatantly obvious what is going wrong but I can't figure out why
If there are any alternatives to this I would greatly like to hear them !
Thanks in advance,
Seán
Your final line uses a tab character for indentation, while the other lines use spaces only.
You have tabs set to four spaces in your editor, but ghc uses eight character tab stops (just as terminals do).
Therefore your return line is parsed as a continuation of the previous line, and list is not yet in scope.
One easy way to fix this is to refrain from using tabs: use spaces only.
Once you've fixed that, your next error will probably be a type error: head list and Seq list have different types (unless perhaps you have redefined head for some reason). It's not clear why you want to treat the list differently if it contains only a single element.

Moving between lines in VIM

Let's say I have a file with N lines. I'm at line X and I'd like to move to line Y, where both X and Y are visible on screen. I can do that by typing :Y<cr>, but if Y>99 that's a lot of typing. I can also do abs(Y-X)[kj] (move up or down by abs(Y-X)), but for big X,Y computing this difference mentally isn't so easy.
Is there a way to exploit the fact, that both X,Y are visible on screen and move between X and Y fast?
You can :set relativenumber which does that Y-X computing for you (only in Vim >= 7.3).
You can use H, M or L to go to the top, middle and bottom of the screen.
Perhaps you can make use of H, M, or L.
These keys jump the cursor to display lines:
H "Home" top of screen
M "Middle" middle of screen
L "Last" last line of screen
With a count, they offset: 4L would go to the third line above the last (1L is the same as just L).
Personally, I make heavy use of the m command to mark a line for navigation. From where I am now, hit mq to mark the position with label q; then navigate to another line, and ma to mark it with label a; and from then on I can hit 'q to jump to position q and 'a to jump to position a. (q and a are arbitrary; I use those mostly due to their position on a QWERTY keyboard.)
Once you have the marks, you can use them for commands. To delete from the current position to the line marked with q, you just use: d'q
There is a variant, where instead of single quote you use back quote. This takes you to the exact position on the line where you placed the mark; the single quote uses the start of the line.
Those marks work even for ex (command line) commands. To limit search and replace to a specific set of lines, I mark the beginning and end lines respectively with labels b and e, and then do my search and replace like so:
:'b,'es/foo/bar/g
Dropping my dime in the pond:
I find that traversing code is exceptionally easy with text objects. I rarely do use jk/JK for larger jumps any more. Instead I navigate for whitespace lines using { and }
Since on any one screen there are usually only so-many whitespace delineations (and they are very easily visually recognized and counted), I find that e.g.
3}j
lands me on the intended line a lot more often than, e.g., a guesstimated
27j
To top it all, many 'brace-full' programming languages have opening braces at the start of functions. These can be reached with [[ resp. ]]. So sometimes it is just a matter of doing, e.g.
2[[}
(meaning: go to start of previous function, after the first contiguous block of lines)
Vim lets you type a guesstimated number immediately before hitting j or k to go that many lines.
15k goes up 15 lines
The more experienced a Vimmer you become, the more lines you can count at first glance.
Don't know, maybe there are some clever techniques, but I just type something like 17k/23j and so on.
Also, searching for some word on the line you want to jump to works.
Also, zz (center screen) is sometimes helpful in these cases.
