Haskell and Vim: Proper Indentation - haskell

Search for "vim haskell indent" on SO. There are a lot of answers on how to configure Vim for Haskell indentation. None of them really "work": they don't produce code laid out as the Haskell indentation wiki page recommends, for example the alignment of statements in a do or let block, the = and | of a data type, etc.
Does a Vim solution exist that generates code like the wiki?

This might not be the answer you are looking for, but there is a way to follow the indentation wiki guide and stay compatible with most editors.
For example, do-blocks
Instead of
myFunc x = do y <- bar
              return $ x + y
You can indent it like this
myFunc x = do
    y <- bar
    return $ x + y
This is explicitly mentioned as an acceptable alternative in the indentation wiki.
In the same way, you can format data types
data FooBar
    = Foo
    | Bar
    | Asdf
Guards
myFunc x
    | x < 0 = 0
    | otherwise = x
Where-clauses
myFunc x = x + y + c where
    y = x + 5
    c = x * y
And so on...
I personally started to use this kind of style because, like you said, no editor could reliably indent the code otherwise. This works better in all editors, as the indentation is always a multiple of four (or whatever else you pick for your base indentation level). As I used this style, I also started to prefer this consistent indentation level visually, so I wouldn't go back at this point even if editors got smarter.
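Put together, the style might look like this in a small file that loads in GHCi. This is only a sketch: clamp, offset and sumWith are made-up names for the myFunc variants above, and bar is passed in as an argument so the snippet is self-contained.

```haskell
-- Data type: constructors on their own lines, one indentation level in.
data FooBar
    = Foo
    | Bar
    | Asdf
    deriving (Show, Eq)

-- Guards: one indentation level in.
clamp :: Int -> Int
clamp x
    | x < 0 = 0
    | otherwise = x

-- Where-clause: "where" ends the line, bindings one level in.
offset :: Int -> Int
offset x = x + y + c where
    y = x + 5
    c = x * y

-- Do-block: "do" ends the line, statements one level in.
sumWith :: IO Int -> Int -> IO Int
sumWith bar x = do
    y <- bar
    return $ x + y
```

Every continuation line sits at exactly one indentation level, so an editor only needs to know "indent by four", not which token to align under.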

Related

Haskell "parse error on input ‘=’"

I am doing some haskell exercises to learn the language and I have a syntax error I was hoping someone could help me with:
-- Split a list l at element k into a tuple: The first part up to and including k, the second part after k
-- For example "splitAtIndex 3 [1,1,1,2,2,2]" returns ([1,1,1],[2,2,2])
splitAtIndex k l = ([l !! x | x <- firstHalfIndexes], [l !! x | x <- secondHalfIndexes])
    where   firstHalfIndexes = [0..k-1]
            secondHalfIndexes = [k..(length l-1)]
The syntax error is "parse error on input ‘=’" and seems to be coming from my second where clause, but I can't work out why the first where clause is ok but not the second?
The Haskell Report specifies that tab characters flesh out text to the next multiple of eight. Your code appears to assume that it gets fleshed out to the next multiple of four. (My best guess. Might also be configured to be five or six, but those settings seem less popular than four.)
See my page on tabs for ideas on how to safely use tabs in Haskell code; or else do what most other folks do and configure your editor to expand tabs to spaces.
For an example of the style I use, your current code looks like this to the compiler (using > to mark tabs and _ for spaces):
splitAtIndex_..._=_...
> where_> firstHalfIndexes_=_...
> > > secondHalfIndexes_=_...
I would write it to look like this to the compiler:
splitAtIndex_..._=_...
> where_> firstHalfIndexes_=_...
> ______> secondHalfIndexes_=_...
This also looks correct with four-space tabstops (and indeed any size tabstop):
splitAtIndex_..._=_...
> where_> firstHalfIndexes_=_...
> ______> secondHalfIndexes_=_...
(Actually, I would probably just use one space after where rather than a space and a tab, but that's an aesthetics thing, not really a technical one.)
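To check how the columns come out under different tab widths, here is a small sketch that expands tabs the way the Report describes (each tab advances to the next multiple of the tab width). The names expandTabs and prefixWidth are made up for this illustration.

```haskell
-- Expand each tab to the next multiple of `width` columns.
expandTabs :: Int -> String -> String
expandTabs width = go 0
  where
    go _ [] = []
    go col ('\t' : rest) =
        let pad = width - col `mod` width
        in replicate pad ' ' ++ go (col + pad) rest
    go col (c : rest) = c : go (col + 1) rest

-- Column at which the text following `prefix` would start.
prefixWidth :: Int -> String -> Int
prefixWidth width prefix = length (expandTabs width prefix)
```

With the question's layout, prefixWidth 8 "\twhere\t" is 16 but prefixWidth 8 "\t\t\t" is 24, so the compiler sees the second binding further right than the first, while with a width of 4 both come out at 12. With the layout suggested above, "\twhere \t" and "\t      \t" (six spaces) both give 16 at width 8 and 12 at width 4.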

G-machine, (non-)strict contexts - why case expressions need special treatment

I'm currently reading Implementing functional languages: a tutorial by SPJ and the (sub)chapter I'll be referring to in this question is 3.8.7 (page 136).
The first remark there is that a reader following the tutorial has not yet implemented C scheme compilation (that is, of expressions appearing in non-strict contexts) of ECase expressions.
The solution proposed is to transform a Core program so that ECase expressions simply never appear in non-strict contexts. Specifically, each such occurrence creates a new supercombinator with exactly one variable, whose body corresponds to the original ECase expression, and the occurrence itself is replaced with a call to that supercombinator.
Below I present a (slightly modified) example of such a transformation from the tutorial:
t a b = Pack{2,1} ;
f x = Pack{2,2} (case t x 7 6 of
                    <1> -> 1;
                    <2> -> 2) Pack{1,0} ;
main = f 3
== transformed into ==>
t a b = Pack{2,1} ;
f x = Pack{2,2} ($Case1 (t x 7 6)) Pack{1,0} ;
$Case1 x = case x of
            <1> -> 1;
            <2> -> 2 ;
main = f 3
I implemented this solution and it works like a charm: the output is Pack{2,2} 2 Pack{1,0}.
However, what I don't understand is: why all that trouble? I hope it's not just me, but my first thought for solving the problem was simply to implement compilation of ECase expressions in the C scheme. I did it by mimicking the rule for compilation in the E scheme (page 134 of the tutorial; I present that rule here for completeness). So I used
E[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
and wrote
C[[case e of alts]] p = C[[e]] p ++ [Eval] ++ [Casejump D[[alts]] p]
I added [Eval] because Casejump needs an argument on top of the stack in weak head normal form (WHNF) and C scheme doesn't guarantee that, as opposed to E scheme.
But then the output changes to the enigmatic Pack{2,2} 2 6.
The same applies when I use the same rule as for E scheme, i.e.
C[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
So I guess that my "obvious" solution is inherently wrong - and I can see that from outputs. But I'm having trouble stating formal arguments as to why that approach was bound to fail.
Can someone provide me with such argument/proof or some intuition as to why the naive approach doesn't work?
The purpose of the C scheme is to not perform any computation, but just delay everything until an EVAL happens (which it might or might not). What are you doing in your proposed code generation for case? You're calling EVAL! And the whole purpose of C is to not call EVAL on anything, so you've now evaluated something prematurely.
The only way you could generate code directly for case in the C scheme would be to add some new instruction to perform the case analysis once it's evaluated.
But we (Thomas Johnsson and I) decided it was simpler to just lift out such expressions. The exact historical details are lost in time though. :)
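The "don't evaluate in a non-strict context" point can be observed in Haskell itself, where constructor arguments are also non-strict. A rough sketch, not taken from the tutorial (Pair, lazyPair and second are made-up names): the first field is a case expression whose scrutinee would crash if it were forced while the constructor is being built.

```haskell
-- A constructor with two lazy fields, loosely standing in for Pack{2,2}.
data Pair = Pair Int Int

-- The first field is a case whose scrutinee is undefined. Building the
-- Pair must only store a suspension for it, never evaluate it.
lazyPair :: Pair
lazyPair = Pair (case (undefined :: Bool) of { True -> 1; False -> 2 }) 6

-- Only the second field is ever demanded here.
second :: Pair -> Int
second (Pair _ y) = y
```

Here second lazyPair returns 6 without touching the case. A code generator that emitted an EVAL while building the first argument would force the undefined scrutinee instead, which is the kind of premature evaluation described above.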

Function definition with guards

The classic way to define Haskell functions is
f1 :: String -> Int
f1 ('-' : cs) = f1 cs + 1
f1 _ = 0
I'm kind of unsatisfied with writing the function name on every line. I now usually write it the following way, using the pattern guards extension, and consider it more readable and modification-friendly:
f2 :: String -> Int
f2 s
    | '-' : cs <- s = f2 cs + 1
    | otherwise = 0
Do you think that second example is more readable, modifiable and elegant? What about generated code? (I haven't had time to look at the desugared output yet, sorry!) What are cons? The only one I see is the extension usage.
Well, you could always write it like this:
f3 :: String -> Int
f3 s = case s of
    ('-' : cs) -> f3 cs + 1
    _ -> 0
Which means the same thing as the f1 version. If the function has a lengthy or otherwise hard-to-read name, and you want to match against lots of patterns, this probably would be an improvement. For your example here I'd use the conventional syntax.
There's nothing wrong with your f2 version, as such, but it seems a slightly frivolous use of a syntactic GHC extension that's not common enough to assume everyone will be familiar with it. For personal code it's not a big deal, but I'd stick with the case expression for anything you expect other people to be reading.
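For comparison, here are the three spellings side by side in a form that can be loaded into GHCi to check that they agree. Pattern guards became part of the standard with Haskell 2010, so a current GHC should accept f2 without a LANGUAGE pragma; on an older setup, treat that as an assumption.

```haskell
-- Count the leading '-' characters of a string, three ways.

-- One equation per pattern.
f1 :: String -> Int
f1 ('-' : cs) = f1 cs + 1
f1 _ = 0

-- A pattern guard on the argument.
f2 :: String -> Int
f2 s
    | '-' : cs <- s = f2 cs + 1
    | otherwise = 0

-- An explicit case expression.
f3 :: String -> Int
f3 s = case s of
    ('-' : cs) -> f3 cs + 1
    _ -> 0
```

All three return 2 for "--x" and 0 for "a-".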
I prefer writing the function name when I am pattern matching on something, as in your case. I find it more readable.
I prefer using guards when I have conditions on the function arguments, which helps avoid the if-else chains I would need if I followed the first pattern.
So to answer your questions
Do you think that second example is more readable, modifiable and elegant?
No, I prefer the first one which is simple and readable. But more or less it depends on your personal taste.
What about generated code?
I don't think there will be any difference in the generated code. Both are just pattern matching.
What are cons?
Well, pattern guards are useful for pattern matching more cleanly than with let or something similar.
addLookup env var1 var2
    | Just val1 <- lookup env var1
    , Just val2 <- lookup env var2
    = val1 + val2
The con, of course, is that you need to use an extension, and it is not Haskell 98 (which you might not consider much of a con).
On the other hand for trivial pattern matching on function arguments I will just use the first method, which is simple and readable.
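A self-contained variant of addLookup, as a sketch: it assumes the environment is a plain association list and uses the Prelude's lookup (which takes the key first), and it returns Nothing when either variable is missing. Env, the Maybe result and addLookupCase are additions for illustration only.

```haskell
-- Hypothetical environment type: an association list from names to values.
type Env = [(String, Int)]

-- Both lookups must succeed for the first guard to fire.
addLookup :: Env -> String -> String -> Maybe Int
addLookup env var1 var2
    | Just val1 <- lookup var1 env
    , Just val2 <- lookup var2 env
    = Just (val1 + val2)
    | otherwise = Nothing

-- The same logic with nested case expressions, for comparison.
addLookupCase :: Env -> String -> String -> Maybe Int
addLookupCase env var1 var2 =
    case lookup var1 env of
        Just val1 -> case lookup var2 env of
            Just val2 -> Just (val1 + val2)
            Nothing -> Nothing
        Nothing -> Nothing
```

With env = [("a", 1), ("b", 2)], both versions give Just 3 for "a" and "b", and Nothing if either name is absent.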

How to nest let statements in Haskell?

I'm trying to nest a couple let statements, but I'm getting syntax errors that don't make sense to me. I'm really new to Haskell programming so I'm sure it's something I just don't understand (probably having to do with the spacing). I understand that let and in must be in the same column.
Why is it that:
aaa = let y = 1+2
          z = 4+6
      in y+z
Works perfectly fine, whereas
aaa = let y = 1+2
        z = 4+6
      in let f = 3
             e = 3
         in e+f
gives me the error: "Syntax error in expression (unexpected `=')"
In the second example, the z = ... isn't aligned with the y = .... In a let block, every definition has to be aligned.
I suspect you're indenting with tab characters, and have your editor set to display tabs as less than 8 spaces, making it look like it's aligned to you. You should replace the tab with spaces, and preferably set your editor to expand tabs into spaces to avoid problems like this in the future.
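For reference, a version of the nested let indented with spaces only, plus a variant with explicit braces and semicolons, which does not depend on columns at all. This is just a sketch; aaaBraces is a made-up name.

```haskell
-- Layout version: bindings in each let block start in the same column.
aaa :: Int
aaa =
    let y = 1 + 2
        z = 4 + 6
    in  let f = 3
            e = 3
        in  e + f

-- Explicit braces: whitespace no longer affects how the blocks are parsed.
aaaBraces :: Int
aaaBraces = let { y = 1 + 2; z = 4 + 6 } in let { f = 3; e = 3 } in e + f
```

Both evaluate to 6. As in the question, y and z are bound but unused, so GHC may warn about them if warnings are enabled.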

What characters are allowed in Haskell function names?

What is a valid name for a function?
Examples
-- works
let µ x = x * x
let ö x = x * x
-- doesn't work
let € x = x * x
let § x = x * x
I am not sure, but my hunch is that Haskell doesn't allow Unicode function names, does it?
(Unicode like in http://www.cse.chalmers.se/~nad/listings/lib-0.4/Data.List.html)
From the Haskell report:
Haskell uses the Unicode character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell.
Recent versions of GHC seem to be fine with Unicode (at least in the form of UTF-8):
Prelude> let пять=5; два=2; умножить=(*); на=id in пять `умножить` на два
10
(In case you wonder, «пять `умножить` на два» means «five `multiplied` by two» in Russian.)
Your examples do not work because those characters are «symbols», which can be used in infix operators but not in function names. See the "uniSymbol" category in the report.
Prelude> let x € y = x * y in 2 € 5
10
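To check which bucket a character falls into, Data.Char can be used as a rough guide. This is only an approximation of the Report's lexical rules (it ignores reserved characters and the details of the ASCII symbol list), and the function names are made up.

```haskell
import Data.Char (isLower, isPunctuation, isSymbol)

-- Approximation: a function or variable name starts with a lowercase
-- letter (ASCII or Unicode) or an underscore.
canStartVarName :: Char -> Bool
canStartVarName c = isLower c || c == '_'

-- Approximation: Unicode symbols and punctuation may appear in operators.
looksLikeOperatorChar :: Char -> Bool
looksLikeOperatorChar c = isSymbol c || isPunctuation c

-- The four characters from the question, with both classifications.
examples :: [(Char, Bool, Bool)]
examples = [(c, canStartVarName c, looksLikeOperatorChar c) | c <- "µö€§"]
```

For µ and ö the first check is True and the second False; for € and § it is the other way round, which matches what GHCi accepts in the question.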

Resources