Preprocessor for haskell source: is cpp the only option? - haskell

I can see from plenty of Q&As that cpp is the usual preprocessor for Haskell source; but that it isn't a good fit for the job. What other options are there?
Specifically:
Haskell syntax is newline-sensitive and space/indent-sensitive -- unlike C, so cpp just tramples on whitespace;
' in Haskell might surround a character literal, but also might be part of an identifier (in which case it won't be paired) -- but cpp complains if not a char literal;
\ gets a trailing space inserted -- which is not a terrible inconvenience, but I'd prefer not.
I'm trying to produce a macro that generates an instance from two parameters: a newtype and its corresponding data constructor. It needs to generate the instance head, the constraints, and a method binding, just by slotting the constructors into an instance skeleton.
(Probably Template Haskell could do this; but it seems rather a large hammer.)

cpphs seems to be just about enough for my (limited) purposes. I'm adding this answer for the record; an answer suggesting cpphs (and some sensible advice to prefer Template Haskell) was here and then gone.
But there are some gotchas, which meant that at first sight I'd overlooked how it helped.
Without setting any options, it behaves too much like cpp to be helpful. At least:
It doesn't complain about unpaired '. Indeed you can #define dit ' and that will expand happily.
More generally, it doesn't complain about any nonsense input: it grimly carries on and produces some sort of output file without warning you about ill-formed macro calls.
It doesn't insert space after \.
By default, it smashes together multiline macro expansions, so tramples on whitespace just as much.
Its tokenisation seems to get easily confused between Haskell and C. Specifically, using C-style comments /* ... */ seems to upset not only those lines, but a few lines below. (I had a #define I wanted to comment out; should have used Haskell style comments {- ... -} -- but then that appears in the output.)
The calling convention for macros is C style, not Haskell. myMacro(someArg) -- or myMacro (someArg) seems to work; but not myMacro someArg. So to embed a macro call inside a Haskell expression probably needs surrounding the lot in extra parens. Looks like (LISP).
A bare macro call on a line by itself myInstance(MyType, MyConstr) would not be valid Haskell. The dear beastie seems to get easily confused, and fails to recognise that's a macro call.
I'm nervous about # and ## -- because in cpp they're for stringisation and catenation. I did manage to define (##) = (++) and it seemed to work; magicHash# identifiers seemed ok; but I didn't try those inside macro expansion.
Remedies
(The docos don't make this at all obvious.)
To get multi-line output from a multi-line macro def'n, and preserving spaces/indentation (yay!) needs option --layout. So I have my instance definition validly expanded and indented.
If your tokenisation is getting confused, maybe --text will help: this will "treat input as plain text, not Haskell code" -- although it does still tolerate ' and \ better. (I didn't encounter any downsides from using --text -- the Haskell code seemed to get through unscathed, and the macros expanded.)
If you have a C-style comment that you don't want to appear in output, use --strip.
There's an option --hashes, which I imagine might interact badly with magicHash#.
The output file starts with a header line #line .... The compiler won't like that; suppress with --noline.
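For comparison, here is a sketch of a workaround that needs no preprocessor options at all. It isn't from the experiments above: it assumes GHC's built-in CPP extension rather than cpphs, and the macro name SHOW_VIA and the Age/MkAge newtype are made up for illustration. Writing the instance body with explicit braces means it no longer matters that the preprocessor joins the continuation lines into one:

```haskell
{-# LANGUAGE CPP #-}
module Main where

-- Hypothetical macro: T is the newtype, C its data constructor.
-- The explicit braces keep the expansion valid Haskell even though
-- the preprocessor collapses the continuation lines into one line.
#define SHOW_VIA(T, C) \
instance Show a => Show (T a) where { \
  show (C x) = show x }

-- Made-up newtype used only to exercise the macro.
newtype Age a = MkAge a

SHOW_VIA(Age, MkAge)

main :: IO ()
main = putStrLn (show (MkAge (42 :: Int)))
```

The trade-off is readability: the macro body has to be written in brace style rather than layout style, which is exactly what --layout in cpphs lets you avoid.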

I would say that Template Haskell is the most perfect tool for this purpose. It is the standard set of combinators for constructing correct Haskell source code. After that there is GHC.Generics, which might allow you to write a single instance that would cover any type which is an instance of Generic.
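As a rough sketch of what the Template Haskell route could look like (the names mkShowInstance, Wrap and MkWrap are hypothetical, and Show is only a stand-in for whatever class the asker has in mind), the generator below builds an instance from a type name and a constructor name, and main just pretty-prints the generated declaration instead of splicing it:

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- Hypothetical newtype standing in for the asker's type.
newtype Wrap a = MkWrap a

-- Builds the declaration
--   instance Show a => Show (T a) where show (C x) = show x
-- for the given type name T and constructor name C.
mkShowInstance :: Name -> Name -> Q [Dec]
mkShowInstance tyName conName = do
  a <- newName "a"
  x <- newName "x"
  inst <- instanceD
    (cxt [ [t| Show $(varT a) |] ])
    [t| Show ($(conT tyName) $(varT a)) |]
    [ funD 'show
        [ clause [conP conName [varP x]]
                 (normalB [| show $(varE x) |])
                 []
        ]
    ]
  pure [inst]

-- Pretty-print the generated code rather than splicing it.
main :: IO ()
main = runQ (mkShowInstance ''Wrap 'MkWrap) >>= putStrLn . pprint
```

To splice the result into a module as a real instance, mkShowInstance would have to live in a separate module, since GHC's stage restriction stops a top-level splice from running a function defined in the same module.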

Related

Recognise #define compiler directive in fortran with ctags

I would like to configure ctags to recognize compiler directives in a fortran code. More specifically, I would like to match the following vim search result
/\v[ \t]*#define[ \t]+([-[:alnum:]*+!_:\/.?]+)/
where \v induces the very magic level (see Can I turn on extended regular expressions support in Vim?). Alternatively, the search using a normal regular expression
/[ \t]*#define[ \t][ \t]*\([-[:alnum:]*+!_:\/.?][-[:alnum:]*+!_:\/.?]*\)/
could be used. If general compiler directives are found, that would help me too. A practical application: when I press <C-]> with my cursor on _ABORT in the following piece of code
_ABORT("delta_time is too small")
I would be redirected to the corresponding definition
#define _ABORT(msg) call abimem_abort(msg, __FILE__, __LINE__)
Based on https://andrew.stwrt.ca/posts/vim-ctags/, I tried to add either
--regex-fortran=/[ \t]*#define[ \t]+([-[:alnum:]*+!_:\/.?]+)/\1/d,directives/
or
--regex-fortran=/[ \t]*#define[ \t][ \t]*\([-[:alnum:]*+!_:\/.?][-[:alnum:]*+!_:\/.?]*\)/\1/d,directives/
to ~/.ctags. Based on http://ctags.sourceforge.net/ctags.html, I also tried to add --line-directives=yes, but in neither case did I succeed in the practical application I gave as an example above. I can already see the additional kind when using
ctags --list-kinds
but that's all. What went wrong?
It seems that it works now. My current ~/.ctags is
--fortran-kinds=+i
--recurse=yes
--exclude=.git
--regex-fortran=/[ \t]*#define[ \t]+([-[:alnum:]*+!_:\/.?]+)/\1/d,directives/
Probably this has to do with the fact that I had previously put a '\v' in ~/.ctags (and did not copy this properly to the question). Could someone explain why this '\v' must not be present there, though vim is configured to need it for extended regexes?
Another thing that happened between previous try and now is a reboot (clean-up of temporary space etc.), which might help if still stuck.
Moreover, one should remark that the extra regex is not always necessary. The following macro definition was found without the regex:
# define MSG_ERROR(msg) call libpaw_msg_hndl(msg,"ERROR" ,"PERS")

Where are line breaks allowed within Haskell expressions?

Background
Most style guides recommend keeping line lengths to 79 characters or less. In Haskell, indentation rules mean that expressions frequently need to be broken up with new lines.
Questions:
Within expressions, where is it legal to place a new line?
Is this documented somewhere?
Extended question: I see GHC formatting my code when it reports an error so someone has figured out how to automate the process of breaking long lines. Is there a utility that I can put haskell code into and have it spit that code back nicely formatted?
You can place a newline anywhere between lexical tokens of an expression. However, there are constraints on how much indentation must follow the newline. The easy rule of thumb is to indent the continuation line so that it starts to the right of where the line containing the expression starts. Beyond that, some style things:
If you are indenting an expression that appears in a definition name = expression, it's good style to indent to the right of the = sign.
If you are indenting an expression that appears on the right-hand side of a do binding or a list comprehension, it's good style to indent to the right of the <- sign.
The authoritative documentation is probably the Haskell 98 Report (Chapter 2 on lexical structure), but personally I don't find this material very easy to read.
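As a small illustration of the rule of thumb and the two style points above (the names total and line are made up for the example), both continuation lines below start to the right of the line they continue:

```haskell
module Main where

-- Continuation indented to the right of the '=' sign.
total :: Int
total = 1 + 2
      + 3 + 4

main :: IO ()
main = do
  -- Continuation indented to the right of the '<-' sign.
  line <- pure
            "hello"
  putStrLn line
  print total
```

Moving either continuation line back to the column where its declaration or do statement starts (or further left) would make the layout algorithm treat it as the start of a new item, and the code would no longer parse.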

Haskell Guards and SublimeText 3

I switched over to Sublime Text 3 but now that I was coding some Haskell in ST3 I noticed something quite odd, which is the syntax highlighting logic for guards.
As you can see, when I write it this way, it highlights the first guard in white colour and the different sign in a mix of white/magenta:
Only when I use this wrong syntax (with an equal sign after the argument) it displays correctly.
Does anyone know how to fix this?
You're probably using the default Haskell syntax highlighting. I would recommend disabling the Haskell package and installing SublimeHaskell. Its syntax highlighting is much better, and it recognizes things like otherwise as being a "built-in" (it's mainly Prelude functions that are considered built-in).
If you're using the built-in Haskell highlighting, you can check that it's buggy by using the Ctrl+Alt+Shift+P shortcut. Highlight each guard pipe individually and then hit this shortcut. In the status bar it'll briefly show the syntax scope names associated with the region. For the first pipe, you'll get source.haskell meta.function.type-declaration.haskell, and for the second you'll get source.haskell keyword.operator.haskell. Using SublimeHaskell's syntax you'll get source.haskell keyword.operator.haskell for both pipes. I won't say that SublimeHaskell's is perfect (try indenting an entire file after module Name where), but it's definitely better. Since the syntaxes have the same name and because SublimeHaskell comes with snippets and whatnot that cover everything that the built-in does, I recommend disabling the Haskell plugin and only leaving SublimeHaskell's syntax selectable.
(NOT SURE!!!)
I now believe this is not a bug; instead, I believe this is actually ST3's way of telling you that you have non-exhaustive patterns in that function.
Non-exhaustive: http://i.imgur.com/74o4sgp.png
Exhaustive: http://i.imgur.com/M9a4TTL.png

Record syntax clarification

OK, so here's an unusual one. Every time you see an example of Haskell's record syntax, it always looks like
Sphere {center = 0, radius = 2}
or similar. My question is... are those curly brackets actually part of the record syntax? Or are they actually shorthand for layout? In other words, can you actually write something like
Sphere
  center = 0
  radius = 2
and have it work?
I doubt it would be very useful to do this - it takes up a lot of visual space - but I'm just curious as to whether this is syntactically valid or not.
Layout is an alternative to explicit braces and semicolons.
Record syntax uses explicit braces and commas.
So no, you can't use layout as part of record syntax.
Haskell Report 2010 §2.7 Layout:
Haskell permits the omission of the braces and semicolons used in several grammar productions, by using layout to convey the same information.
OK, well I thought I'd put this question here in case anybody was interested. Having consulted the Haskell Report itself, it appears that the braces are literally a formal part of the record construct:
http://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-690004.2.1
That means that these tokens actually have two distinct meanings in Haskell - as declaration delimiters when layout is not being used, and as record delimiters. I bet that leads to some interesting parser edge-cases!
(I also note in passing that EmptyDataDecls appears to be on by default in Haskell 2010, which is worth knowing...)
After Sphere, the lexer won't insert a brace. Why should it? You don't expect a brace inserted in code like:
z = x
  + y
either, do you?
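To make the distinction concrete, here is a minimal sketch (the field types are an assumption; the question doesn't give them, and the name unitish is made up). Record braces are ordinary tokens, so line breaks may go between them, but the braces and commas themselves can't be dropped:

```haskell
module Main where

-- Field types are assumed for the sake of the example.
data Sphere = Sphere { center :: Double, radius :: Double }
  deriving Show

-- Newlines between tokens are fine; the braces and commas must stay.
unitish :: Sphere
unitish = Sphere
  { center = 0
  , radius = 2
  }

main :: IO ()
main = print unitish
```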

Treat macro arguments in Common Lisp as (case-sensitive) strings

(This is one of those things that seems like it should be so simple that I imagine there may be a better approach altogether)
I'm trying to define a macro (for CLISP) that accepts a variable number of arguments as symbols (which are then converted to case-sensitive strings).
(defmacro symbols-to-words (&body body)
  `(join-words (mapcar #'symbol-name '(,@body))))
converts the symbols to uppercase strings, whereas
(defmacro symbols-to-words (&body body)
  `(join-words (mapcar #'symbol-name '(|,@body|))))
treats ,@body as a single symbol, with no expansion.
Any ideas? I'm thinking there's probably a much easier way altogether.
The symbol names are uppercased during the reader step, which occurs before macroexpansion, so there is nothing you can do with macros to affect that. You can globally set READTABLE-CASE, but that will affect all code; in particular, you will have to write all standard symbols in uppercase in your source. There is also a '-modern' option for CLISP, which provides lowercased versions of the names in the standard library and sets the reader to be case-preserving, but it is itself non-standard. I have never used it myself, so I am not sure what caveats actually apply.
The other way to control the reader is through reader macros. Common Lisp already has a reader macro implementing a syntax for case-sensitive strings: the double quote. It is hard to offer more advice without knowing why you are not just using it.
As Ramarren correctly says, the case of symbols is determined during read time. Not at macro expansion time.
Common Lisp has a syntax for specifying symbols without changing the case:
|This is a symbol| - using the vertical bar as multiple escape character.
and there is also a backslash - a single escape character:
CL-USER > 'foo\bar
|FOObAR|
Other options are:
using a different global readtable case
using a read macro which reads and preserves case
using a read macro which uses its own reader
Also note that a syntax for something like |,@body| (where body is spliced in) does not exist in Common Lisp. Splicing only works for lists, not symbol names. |, the vertical bar, surrounds character elements of a symbol. The explanation in the Common Lisp Hyperspec is a bit cryptic: Multiple Escape Characters.
