Background
Most style guides recommend keeping lines to 79 characters or fewer. In Haskell, the indentation rules mean that expressions frequently need to be broken across several lines.
Questions:
Within expressions, where is it legal to place a newline?
Is this documented somewhere?
Extended question: I see GHC formatting my code when it reports an error, so someone has figured out how to automate the process of breaking long lines. Is there a utility that I can feed Haskell code into and have it return that code nicely formatted?
You can place a newline anywhere between lexical tokens of an expression. However, there are constraints on how much indentation may follow the newline. The easy rule of thumb is to indent the continuation line so that it starts to the right of the start of the line containing the expression. Beyond that, some points of style:
If you are indenting an expression that appears in a definition name = expression, it's good style to indent to the right of the = sign.
If you are indenting an expression that appears on the right-hand side of a do binding or a list comprehension, it's good style to indent to the right of the <- sign.
The authoritative documentation is probably the Haskell 98 Report (Chapter 2 on lexical structure), but personally I don't find this material very easy to read.
Related
The exact problem: I have a C++ source file and I need to rename a symbol. However, the replacement must affect only the symbol itself, not a word that merely looks the same inside comments or string literals ("").
The information about which language section a given position belongs to is defined well enough by the syntax highlighting rules. I know they can fail sometimes, but let's say this isn't a problem. I need some way to walk through all occurrences of the phrase and check which section each one is in; if it's in a string or a comment, it should be skipped. Otherwise the replacement should be done either immediately or after asking first, depending on the well-known c flag.
What I imagine would be at least theoretically possible is:
Having a kinda "callback" when doing substitution (called for each phrase found, and requesting the answer whether to substitute or not), or extract the list of positions where the phrase has been found, then iterate through all of them
Extract the name of the current "hi-linked" syntax highlighting rule, which is used to color the text at given position
Is it at all possible within the current features of vim?
Yes. With a sub-replace-expression (see :help sub-replace-expression), you can evaluate arbitrary expressions in the replacement part of :substitute. Vim's synID() and synstack() functions let you look up the syntax element at a given position.
Luc Hermitte has an implementation that omits replacement inside strings, here. You can easily adapt this to your use case.
With the help of my ingo-library plugin, you can define a short predicate function, e.g. matching comments and constants (strings, numbers, etc.):
function! CommentOrConstant()
    return ingo#syntaxitem#IsOnSyntax(getpos('.'), '^\%(Comment\|Constant\)$')
endfunction
My PatternsOnText plugin now provides a :SubstituteIf command that works like :substitute, but also takes a predicate expression. With that, it's very easy to do a replacement anywhere except in comments or constants:
:%SubstituteIf/pattern/replacement/g !CommentOrConstant()
I wrote a Vim script for the autocompletion of Fortran program units, type definitions and so on, taking a cue from the vim-latex plugin.
At the moment, if I press <F5> while the cursor is on the word program, I get the following
PROGRAM <+program_name+>
USE <+used_module_name+>
IMPLICIT NONE
<++>
END PROGRAM <+program_name+>
with the first <+program_name+> visually selected and Vim in select mode. And this is perfect for me.
The problem arises when I use such a placeholder as a label for the IF construct. When I expand if I get
<+name+>: IF (<+logical expression+>) THEN
<++> ! this line is not indented => in turn the following are negative indented
ELSE IF (<+logical expression+>) THEN
<++>
ELSE
<++>
END IF <+name+>
where the second line is not indented due to the fact (at least I suppose!) that the string <+name+> is not a valid name. As a consequence, the following lines move back (obviously when the if is in the first column, the second line is the only one to be wrong).
This also happens for the DO construct, but, strangely, doesn't happen for the SELECT CASE construct:
<+name+>: SELECT CASE (<+case expression+>)
CASE (<+case selector+>)
<++>
CASE DEFAULT
<++>
END SELECT <+name+>
And this is why I think a solution must exist and not be too complicated.
I decided to solve the problem the "dirty" way, that is, by inserting spaces at the proper positions in the command sequences that generate the IF...THEN...ELSE...END IF and DO...END DO constructs. This is not an elegant solution, but I don't think it has many drawbacks. The only thing that would need changing is the number of spaces added manually to the command sequences, according to shiftwidth.
As @SatoKatsura suggested in a comment, it would be better to abandon this road and use an existing snippet solution.
I'm using Vim's SmartTabs plugin to align C code with tabs up to the indentation level, then spaces for alignment after that. It works great for things like
void fn(int a,
________int b) {
--->...
Tabs are --->, spaces are _. But it doesn't seem to work so well for cases like
--->if(some_variable >
--->--->some_other_variable) {
--->...
In the case above, Vim inserts tabs on the second line inside the parentheses. Is there a way I can modify what Vim sees as a continuation line to include cases like this, so I get:
--->if(some_variable >
--->___some_other_variable) {
--->...
If there's an indentation style that would both allow flexible indentation width according to one's preferences, and consistent alignment, your suggested scheme would be it. Unfortunately, this style requires some basic understanding of the underlying syntax (e.g. whether some_other_variable is part of the line-broken conditional (→ Spaces) or a function call within the conditional (→ Tab)), and this makes implementing it difficult.
I'm not aware of any existing Vim plugin. The 'copyindent' and 'preserveindent' options help a bit, but essentially you have to explicitly insert the non-indent with Space yourself (and probably :set list to verify).
I don't know about that other Editor, but the situation is similar for most other inferior code editors. Without good automatic support, this otherwise elegant style will have a hard time gaining acceptance. I would love to see such a plugin for Vim.
OK, so here's an unusual one. Every time you see an example of Haskell's record syntax, it always looks like
Sphere {center = 0, radius = 2}
or similar. My question is... are those curly brackets actually part of the record syntax? Or are they actually shorthand for layout? In other words, can you actually write something like
Sphere
  center = 0
  radius = 2
and have it work?
I doubt it would be very useful to do this - it takes up a lot of visual space - but I'm just curious as to whether this is syntactically valid or not.
Layout is an alternative to explicit braces and semicolons.
Record syntax uses explicit braces and commas.
So no, you can't use layout as part of record syntax.
Haskell Report 2010 §2.7 Layout:
Haskell permits the omission of the braces and semicolons used in several grammar productions, by using layout to convey the same information.
OK, well I thought I'd put this question here in case anybody was interested. Having consulted the Haskell Report itself, it appears that the braces are literally a formal part of the record construct:
http://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-690004.2.1
That means that these tokens actually have two distinct meanings in Haskell - as declaration delimiters when layout is not being used, and as record delimiters. I bet that leads to some interesting parser edge-cases!
(I also note in passing that EmptyDataDecls appears to be on by default in Haskell 2010, which is worth knowing...)
After Sphere, the lexer won't insert a brace. Why should it? You don't expect a brace to be inserted in code like:
z = x
  + y
either, do you?
Vim's errorformat (for parsing compile/build errors) uses an arcane format from C for parsing errors.
Trying to set up an errorformat for NAnt seems almost impossible; I've tried for many hours and can't get it to work. I also see from my searches that a lot of people seem to be having the same problem. A regex to solve this would take minutes to write.
So why does vim still use this format? It's quite possible that the C parser is faster but that hardly seems relevant for something that happens once every few minutes at most. Is there a good reason or is it just an historical artifact?
It's not that Vim uses an arcane format from C. Rather it uses the ideas from scanf, which is a C function. This means that the string that matches the error message is made up of 3 parts:
whitespace
characters
conversion specifications
Whitespace is your tabs and spaces. Characters are the letters, numbers and other normal stuff. Conversion specifications are sequences that start with a '%' (percent) character. In scanf you would typically match an input string against %d or %f to convert to integers or floats. With Vim's error format, you are searching the input string (error message) for files, lines and other compiler specific information.
If you were using scanf to extract an integer from the string "99 bottles of beer", then you would use:
int i;
scanf("%d bottles of beer", &i); // i would be 99, string read from stdin
Vim's error format is a bit trickier, but it is designed to make more complex patterns easy to match: multi-line error messages, file names, directory changes, and so on. One of the examples in the help for errorformat is useful:
Error 275
line 42
column 3
' ' expected after '--'
The appropriate error format string has to look like this:
:set efm=%EError\ %n,%Cline\ %l,%Ccolumn\ %c,%Z%m
Here %E tells Vim that it is the start of a multi-line error message. %n is an error number. %C is the continuation of a multi-line message, with %l being the line number, and %c the column number. %Z marks the end of the multiline message and %m matches the error message that would be shown in the status line. You need to escape spaces with backslashes, which adds a bit of extra weirdness.
While it might initially seem easier with a regex, this mini-language is specifically designed to help with matching compiler errors. It has a lot of shortcuts in there. I mean you don't have to think about things like matching multiple lines, multiple digits, matching path names (just use %f).
Another thought: how would you map numbers to line numbers, or strings to files or error messages, if you were to use just a normal regexp? By group position? That might work, but it wouldn't be very flexible. Another way would be named capture groups, but then this syntax looks a lot like a shorthand for that anyway. You can actually use regexp wildcards such as .* (in this language it is written %.%#).
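To illustrate the named-capture-group alternative mentioned above, here is a minimal Python sketch (not Vim script). The single-line message layout file:line:col: message, the name ERROR_RE and the helper parse_error are assumptions made up for this example; the named groups play roughly the role that %f, %l, %c and %m play in errorformat.

```python
import re

# Assumed (hypothetical) message layout: "<file>:<line>:<col>: <message>".
# Each named group stands in for one errorformat item: %f, %l, %c, %m.
ERROR_RE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+):\s*(?P<message>.*)$"
)


def parse_error(text):
    """Return a dict of the named fields, or None if the text does not match."""
    match = ERROR_RE.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so callers get integers rather than strings.
    fields["line"] = int(fields["line"])
    fields["col"] = int(fields["col"])
    return fields


if __name__ == "__main__":
    print(parse_error("main.c:42:3: ';' expected"))
```

The fields are looked up by name rather than by group position, which is the flexibility the paragraph above is asking for.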
OK, so it is not perfect. But it's not impossible either and makes sense in its own way. Get stuck in, read the help and stop complaining! :-)
I would recommend writing a post-processing filter for your compiler that uses regular expressions or whatever, and outputs messages in a simple format that is easy to write an errorformat for. Why learn some new, baroque, single-purpose language unless you have to?
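As a rough sketch of such a filter, assume the compiler emits the four-line layout from the errorformat help example (Error <n>, line <l>, column <c>, then the message text). The Python script below folds each such error into a single line. The output layout <line>:<col>: error <n>: <message> and the name flatten_errors are hypothetical choices, not something Vim or any particular compiler prescribes.

```python
import re
import sys

# Assumed input layout (taken from the errorformat help example):
#   Error <n> / line <l> / column <c> / <message text>
ERROR_START = re.compile(r"^Error (\d+)$")
LINE_NO = re.compile(r"^line (\d+)$")
COLUMN_NO = re.compile(r"^column (\d+)$")


def flatten_errors(lines):
    """Fold each four-line error into one line; pass all other lines through."""
    rows = [line.rstrip("\n") for line in lines]
    out = []
    i = 0
    while i < len(rows):
        if i + 3 < len(rows):
            start = ERROR_START.match(rows[i])
            line_no = LINE_NO.match(rows[i + 1])
            column = COLUMN_NO.match(rows[i + 2])
            if start and line_no and column:
                # Hypothetical output layout: "<line>:<col>: error <n>: <message>"
                out.append(
                    f"{line_no.group(1)}:{column.group(1)}: "
                    f"error {start.group(1)}: {rows[i + 3]}"
                )
                i += 4
                continue
        out.append(rows[i])
        i += 1
    return out


if __name__ == "__main__":
    # Read compiler output on stdin, write the flattened form to stdout.
    for flat in flatten_errors(sys.stdin):
        print(flat)
```

Run as a pipe stage between the compiler and Vim (for example through 'makeprg'), the flattened lines should then presumably need only a short errorformat along the lines of %l:%c:%m.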
According to :help quickfix,
it is also possible to specify (nearly) any Vim supported regular
expression in format strings.
However, the documentation is confusing and I didn't put much time into verifying how well it works and how useful it is. You would still need to use the scanf-like codes to pull out file names, etc.
They are a pain to work with, but to be clear: you can use regular expressions (mostly).
From the docs:
Pattern matching
The scanf()-like "%*[]" notation is supported for backward-compatibility
with previous versions of Vim. However, it is also possible to specify
(nearly) any Vim supported regular expression in format strings.
Since meta characters of the regular expression language can be part of
ordinary matching strings or file names (and therefore internally have to
be escaped), meta symbols have to be written with leading '%':
%\ The single '\' character. Note that this has to be
escaped ("%\\") in ":set errorformat=" definitions.
%. The single '.' character.
%# The single '*'(!) character.
%^ The single '^' character. Note that this is not
useful, the pattern already matches start of line.
%$ The single '$' character. Note that this is not
useful, the pattern already matches end of line.
%[ The single '[' character for a [] character range.
%~ The single '~' character.
When using character classes in expressions (see |/\i| for an overview),
terms containing the "\+" quantifier can be written in the scanf() "%*"
notation. Example: "%\\d%\\+" ("\d\+", "any number") is equivalent to "%*\\d".
Important note: The \(...\) grouping of sub-matches can not be used in format
specifications because it is reserved for internal conversions.
lol try looking at the actual vim source code sometime. It's a nest of C code so old and obscure you'll think you're on an archaeological dig.
As for why vim uses the C parser, there are plenty of good reasons starting with that it's pretty universal. But the real reason is that sometime in the past 20 years someone wrote it to use the C parser and it works. No one changes what works.
If it doesn't work for you the vim community will tell you to write your own. Stupid open source bastards.