options library like Google GFlags for Haskell


I'm interested in having something very similar to Google's flags library for Haskell.
Here is the small introduction to gflags that demonstrates why I love it: http://gflags.googlecode.com/svn/trunk/doc/gflags.html
I looked into the various getopt-like libraries on Hackage and I haven't found one that matches the simplicity and flexibility of gflags.
Namely, I'd like to have these features:
generates --help (with default values mentioned in the help),
besides parsing the options given by the user, it should also err on unmatched options, so the user has a chance to note typos,
flags can be declared in any module easily (hopefully at the top-level, Template Haskell hackery acceptable if needed),
no need in main to call out to all the modules where I declared flags, instead the flags somehow register themselves at startup/linking/whatever time,
it's OK if main has to call a general initialization function, like in gflags' google::ParseCommandLineFlags(&argc, &argv, true);
flags can be used purely (yeah, I think this is an appropriate usage of unsafePerformIO to make the API more simple).
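For comparison, here is a minimal sketch of how the first two items (a generated --help and an error on unmatched options) look with System.Console.GetOpt from base. The Flag type, the option names and the usage header are made up for illustration, and note that the defaults only show up in the help text because they are typed into the descriptions by hand, which is part of what the question wants to avoid.

```haskell
import System.Console.GetOpt
  ( ArgDescr (NoArg, ReqArg)
  , ArgOrder (Permute)
  , OptDescr (Option)
  , getOpt
  , usageInfo
  )

-- Hypothetical flags, for illustration only.
data Flag
  = Verbose
  | Name String
  deriving (Eq, Show)

-- Default values appear in the help text only because they are written
-- into the description strings by hand.
options :: [OptDescr Flag]
options =
  [ Option "v" ["verbose"] (NoArg Verbose) "chatty output (default: off)"
  , Option "n" ["name"] (ReqArg Name "NAME") "who to greet (default: world)"
  ]

-- The text a --help flag would print.
helpText :: String
helpText = usageInfo "Usage: prog [OPTION...]" options

-- Left carries the error messages followed by the help text. getOpt
-- reports unrecognised options (for example a typo such as --nmae) as errors.
parseFlags :: [String] -> Either String [Flag]
parseFlags args =
  case getOpt Permute options args of
    (flags, _nonOptions, []) -> Right flags
    (_, _, errs) -> Left (concat errs ++ helpText)
```

All flags still have to be collected in one options list here, so this does not address the "declare flags in any module" requirement.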
After looking around without success, I played with the idea of doing this myself (and of course sharing it on Hackage). However, I have no idea how to implement the registration part. I need something similar to GCC's __attribute__((constructor)) or to C++'s static initialization, but in Haskell. A standard top-level unsafePerformIO is not enough, because it is lazy, so it won't be called before main starts to run.
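To make the laziness point concrete, here is a minimal sketch of the "flags can be used purely" part on its own. Every name in it (flagStore, initFlags, lookupFlag, flagName) is hypothetical and it is not the gflags or HFlags API. It only works because the flag value is a lazy top-level thunk that is first demanded after main has filled the store, which is exactly why the same trick cannot be used to run registration code before main.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Maybe (fromMaybe)
import System.IO.Unsafe (unsafePerformIO)

-- Global store of parsed flag values. NOINLINE keeps GHC from
-- duplicating the IORef. (Hypothetical names throughout.)
{-# NOINLINE flagStore #-}
flagStore :: IORef [(String, String)]
flagStore = unsafePerformIO (newIORef [])

-- The general initialization function that main calls once, before any
-- flag value is demanded.
initFlags :: [(String, String)] -> IO ()
initFlags = writeIORef flagStore

-- Looks a flag up in the store, falling back to the given default.
{-# NOINLINE lookupFlag #-}
lookupFlag :: String -> String -> String
lookupFlag name def =
  unsafePerformIO (fromMaybe def . lookup name <$> readIORef flagStore)

-- A flag usable from pure code. It is a lazy top-level value, so the
-- lookup runs the first time it is demanded, which must be after initFlags.
{-# NOINLINE flagName #-}
flagName :: String
flagName = lookupFlag "name" "world"

-- Pure code that uses the flag.
greeting :: String
greeting = "Hello, " ++ flagName
```

If main calls initFlags [("name", "Haskell")] first, greeting evaluates to "Hello, Haskell". If greeting is forced before initFlags, it silently sticks with the default.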

After investigating all the solutions on Hackage (thanks for all the tips!), I went ahead with Don's typeclass implementation idea and created a library named HFlags.
It's on Hackage: http://hackage.haskell.org/package/hflags
I also have a blog post, describing it: http://blog.risko.hu/2012/04/ann-hflags-0.html

You might like CmdArgs, though I haven't used it enough to tell if it satisfies all your constraints.

Related

Relation between MSVC Compiler & linker option for COMDAT folding

This question has some answers on SO but mine is slightly different. Before marking as duplicate, please give it a shot.
MSVC has always provided the /Gy compiler option to enable identical functions to be folded into COMDAT sections. At the same time, the linker also provides the /OPT:ICF option. Is my understanding right that these two options must be used in conjunction? That is, while the former packages functions into COMDAT, the latter eliminates redundant COMDATs. Is that correct?
If so, should we either use both or turn off both?

Answer from someone who communicated with me off-line. Helped me understand these options a lot better.
===================================
That is essentially true. Suppose we talk just C, or C++ but with no member functions. Without /Gy, the compiler creates object files that are in some sense irreducible. If the linker wants just one function from the object, it gets them all. This is especially a consideration in programming for libraries, such that if you mean to be kind to the library's users, you should write your library as lots of small object files, typically one non-static function per object, so that the user of the library doesn't bloat from having to carry code that actually never executes.
With /Gy, the compiler creates object files that have COMDATs. Each function is in its own COMDAT, which is to some extent a mini-object. If the linker wants just one function from the object, it can pick out just that one. The linker's /OPT switch gives you some control over what the linker does with this selectivity - but without /Gy there's nothing to select.
Or very little. It's at least conceivable that the linker could, for instance, fold functions that are each the whole of the code in an object file and happen to have identical code. It's certainly conceivable that the linker could eliminate a whole object file that contains nothing that's referenced. After all, it does this with object files in libraries. The rule in practice, however, used to be that if you add a non-COMDAT object file to the linker's command line, then you're saying you want that in the binary even if unreferenced. The difference between what's conceivable and what's done is typically huge.
Best, then, to stick with the quick answer. The linker options benefit from being able to separate functions (and variables) from inside each object file, but the separation depends on the code and data to have been organised into COMDATs, which is the compiler's work.
===================================

As answered by Raymond Chen in Jan 2013
As explained in the documentation for /Gy, function-level linking
allows functions to be discardable during the "unused function" pass,
if you ask for it via /OPT:REF. It does not alter the actual classical
model for linking. The flag name is misleading. It's not "perform
function-level linking". It merely enables it by telling the linker
where functions begin and end. And it's not so much function-level
linking as it is function-level unlinking. -Raymond
(This snippet might make more sense with some further context: here are the posts about the classical linking model: 1, 2.)
So in a nutshell - yes. If you activate one switch without the other, there would be no observable impact.

How to use / where to find Haskell API?

Is this the documentation I should use for learning about the various haskell functions: https://www.haskell.org/platform/doc/2013.2.0.0/ghc-api/GHC.html ?
Can I search from here, or from other Haskell documentation, for how each type should be used? For example, if I wanted to learn more about the Int type (without typing :t on the command line), can this be searched for?
If the above is the API, it seems much more minimalist than, say, a Java/Scala API. But perhaps this is one of the strong points of Haskell: to provide a succinct yet very powerful set of base functions to build my abstractions upon?

ghc-api is the API for interacting with the GHC compiler. The "standard library" is documented at http://hackage.haskell.org/package/base.

Can search from here or other Haskell doc for how each type should be
used?
In Haskell you usually think functions first, i.e. you may want to know what functions are available, what input they accept and what output they produce. This is typically expressed by class constraints ("must be a list of something") and sometimes as concrete types.
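As a small illustration of the difference, here are two made-up functions (total and countWords are not from any library): the first signature uses a class constraint, the second uses concrete types only.

```haskell
-- Class constraint: works for a list of any numeric type.
total :: Num a => [a] -> a
total = foldr (+) 0

-- Concrete types: takes a String and returns an Int, nothing else.
countWords :: String -> Int
countWords = length . words
```

Reading the signature tells you most of what you need: total accepts [Int], [Double] and so on, while countWords is tied to String.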
If you want to find particular functions, then try Hoogle. If you enter Prelude into the search field, you will find module Prelude among the answers. You may see the Prelude as the "Haskell API", i.e. the functions which are always available.
Then there is the base package, which consists of a number of additional modules (as already pointed out by Rein Henrichs), which are also widely used.
Finally there is all the rest, i.e. special purpose modules. Many of them can also be found on Hoogle.
But frankly, I don't think that "learning the API" is a good way to approach Haskell. This may work in Java, where you're dealing with Classes, Objects and Methods all the time.
In Haskell you are at a much higher level of abstraction. In Haskell you may find ways to implement Classes, Objects and Methods as one example showcase for a certain abstraction. However, it would not be obvious from reading the API that you can simulate OOP with it.

Hoogle is a Haskell API search engine, which allows you to search many standard Haskell libraries by either function name, or by approximate type signature. The Hoogle manual contains more details, including further details on search queries, how to install Hoogle as a command line application and how to integrate Hoogle with Firefox/Emacs/Vim etc.

What would be involved in calling ARPACK++ (a C++ library) from Haskell?

I've spent a couple of days developing a program in Haskell, while learning the language. Now I realize that I'll need to call Arpack (a Fortran library) or Arpack++ (a C++ wrapper to Arpack) -- I can't find a good implementation of the Lanczos method with Haskell bindings. Do any more experienced Haskell programmers have an opinion of how difficult this would be?
I've been able to get ".so" ("shared object") versions of libarpack and libarpack++ installed through Ubuntu's repository, but I'm not sure that will suffice. I suspect I'm going to ultimately need to build Arpack++ from source code, which is possible, but I'm getting a lot of build errors, so it will take time. Is there any way to use just the ".so" files, without knowing exactly which version of the header files were used to generate them?
I'm considering using GreenCard, because it looks like the most well maintained Haskell/C bridge. I can't find much documentation though, so I'm wondering whether it will support C++ too.
I'm also starting to wonder whether I should rewrite my program in Python, and use scipy to call Arpack, but I've already sunk a couple of days into writing Haskell. I really like Haskell too, so I'm hoping I can make this work. I guess my overall question is this: What would be involved in making this work with Haskell?
Thanks much.

ELF is the standard format for executables and shared libraries on Linux, so accessing the code in these compiled modules is only a matter of knowing the function names. If I understand correctly, Fortran is interoperable with C. As a consequence, Fortran should be interoperable with any language that can use C bindings, including Haskell. FYI, you can find all names exported by a module (an executable, a shared object, or a simple object archive) using the nm tool (it is usually available in all Linux distros by default). This of course only works if the binary file was not "stripped", but AFAIK stripping is not common practice.
However, Haskell cannot use C++ bindings in a sane way, since C++'s polymorphic features require name mangling, and the method of this name transformation is highly compiler-dependent. It is a well-known problem that is not specific to Haskell. Of course, you could try to get a list of exported symbols from the C++ shared object and then bind them using the FFI, but... it isn't worth it.
As dsign said, you can use GHC's Foreign Function Interface feature to create bindings to foreign code. All you would require is the library headers (and the library itself, of course). In the case of C, those would be header files (*.h), but since your library is written in Fortran, you have to find the analogue of header files in the library sources, refer to this page to match Fortran and C types, and then use this information to write FFI bindings. It would be helpful to write C bindings first, i.e. to write a C header. Then you can even use automatic FFI binding programs like c2hs.
It may also be helpful to look through the C++ bindings. It is possible that they include the header file I've described above. If they do, then writing FFI bindings will be no more difficult than writing them for any other library.
So, it is not entirely impossible, but it may require some thorough work. Writing bindings to scientific/pure computational libraries is way easier than writing them for some system library which does a lot of IO and keeps its own internal state, but since this library is not written in C... Well, it may be advisable to invest your time in easier alternatives. I cannot say anything about scipy, as I've never used it, but since Python as a language is much simpler than Haskell, it may be a good alternative.

I can tell you that using a C/Fortran library from Haskell with the help of the Foreign Function Interface would certainly be possible and not terribly complicated. Here is an introduction. In my understanding, you should be able to call anything with a C calling convention, and perhaps even Fortran, without needing to recompile the code. The only exception is with things that look like function calls but are in fact macros, in which case you will have to figure out what the macros do and reproduce them in Haskell.
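To give a feel for what such a binding looks like, here is a minimal sketch that binds sin from the C math library instead of an ARPACK routine (the wrapper name sinViaC is made up). An ARPACK binding would follow the same pattern, but its routines take many arguments and, as far as I know, Fortran passes them by reference, so expect pointer types in the signatures.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- Bind the C function "double sin(double)" declared in math.h.
-- It is pure and does not call back into Haskell, so an unsafe
-- (cheaper) call without IO is acceptable here.
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

-- Thin wrapper that hides the C types from the rest of the program.
sinViaC :: Double -> Double
sinViaC = realToFrac . c_sin . realToFrac
```

No C code has to be recompiled for this; only the already-built library is needed at link time.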
As for GreenCard, I have never used it, so I cannot vouch for it.
Your second idea of using Python could potentially save you more than a couple of days. Sad as it is, I have never managed to get Haskell code to adapt easily to my changing requirements, while I find that trivial in Python. Of course, that could be a limitation of my skills with Haskell or of my thinking process rather than something to blame on the language.

How are the Haddock module fields Portability, Stability and Maintainer used?

In lots of Haddock-generated module documentation (e.g. Prelude), a small box in the top-right can be seen, containing portability, stability and maintainer information:
From looking at the source code to such modules and experimentation, I confirmed that this information is generated from lines like the following in the module description:
-- Maintainer : libraries@haskell.org
-- Stability : stable
-- Portability : portable
There are several strange things about this:
The fields only seem to work in this order; any fields put out of order are simply treated as part of the module description itself. This is despite the fact that the order in the source file is the opposite of the order in the generated documentation!
I have been unable to find any official documentation of these fields. There is a Cabal package property named stability, the example values of which match the values I've seen in the equivalent Haddock fields, but beyond that, I've found nothing.
So: How are these fields intended to be used, and are they documented anywhere?
In particular, I'd like to know:
The full list of commonly-used values for Portability and Stability. This HaskellWiki page has a list, but I'd like to know where this list originated from.
The criteria for deciding whether a module is portable or non-portable. In particular, the package I would like the answers to these questions for, acme-strfry, is an FFI binding to strfry, a function only available in glibc. Is the package non-portable, because it only works on glibc systems, or portable, because it does not use any Haskell language extensions? The common usage seems to imply the latter.
Why a specific order of fields is required in the source file, and why it's the opposite of the ordering in the generated documentation.

Oh, I thought those fields were from the cabal package description. They don't seem to be documented at all on Haddock's docs. I've found this, which doesn't really answer your question but:
http://trac.haskell.org/haddock/ticket/71
So if it's freeform anyway, why not just write "non-portable (depends on glibc)"? I've seen even "portable (depends on ghc)", which is odd. I also wonder what happens with modules that were non-portable due to non-Haskell98 extension Foo, after Foo was added to Haskell 2010.
Note that the Cabal documentation you link to also says stability is freeform. Of course, even if haddock or cabal were to define what are the acceptable values, it'd still be up to the maintainer to subjectively select one.
About the specific order, you should probably just ask at the haddock mailing list, or check the source and file a bug.
PS: strfry is an invaluable contribution to the Haskell community, but it should be pure and portable, don't you think?

Ah yes, one of the more obscure and crufty features of Haddock.
As best as I can tell, it's just an undocumented hack. There's no sane reason why the order of the fields should matter, but it does. The specific choice of formatting (i.e., as a special form inside the module comment rather than as a separate block of some kind) isn't the best either. My guess is that somebody wanted to quickly add this feature one day, so they hacked up something minimal but functioning, and left it at that. (Without bothering to document it.)
Personally, I just don't bother with these fields at all. The information is available from Cabal, so I don't bother duplicating it in Haddock as well. Perhaps some day Cabal will pass this information to Haddock automatically...

How reasonable/possible/difficult is it to write a tsocks clone in Haskell

I'm a reasonably competent programmer who knows Haskell, but who hasn't used it in any major projects. I know enough about C and systems and network programming that I believe I can pick apart tsocks from the source code.
I don't have any experience with the low-level systems interfaces Haskell provides. I'm looking for any advice people can offer me on the topic, including, "Don't do it; you'll hate yourself for it," provided there is an explanation.

I really wouldn't do this, except as an experiment. I'm a Haskell guy, but not a deep systems guy, so there's a caveat there. But nonetheless, I see the following on the tsocks page:
tsocks is based on the 'shared library
interceptor' concept. Through use of
the LD_PRELOAD environment variable or
the /etc/ld.so.preload file tsocks is
automatically loaded into the process
space of every executed program. From
there it overrides the normal
connect() function by providing its
own. Thus when an application calls
connect() to establish a TCP
connection it instead passes control
to tsocks. tsocks determines if the
connection needs to be made via a
SOCKS server (by checking
/etc/tsocks.conf) and negotiates the
connection if so (through use of the
real connect() function )
It is possible to call Haskell from C, and vice versa. And it's relatively easy, in fact. For shared libraries, see this: http://www.haskell.org/ghc/docs/6.12.1/html/users_guide/using-shared-libs.html.
But when you invoke Haskell from C, you need to A) link in the runtime and B) invoke the runtime.
So that works when the C knows that it's calling Haskell. But it's trickier when the C doesn't know that it's calling Haskell, so you'd need to wrap the Haskell shared library with a C library that invokes and manages the runtime transparently to the program that is preloading the haskell-tsocks library to intercept its normal connect functions.
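For reference, the Haskell half of "calling Haskell from C" is small. Here is a minimal sketch with a made-up module and function (Adder, addInts), meant to be compiled with ghc into an object file or shared library. The painful part is on the C side: the caller has to start the GHC runtime with hs_init before the first call and shut it down with hs_exit afterwards, and a preloaded library would have to do that behind the host program's back.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- Hypothetical module and function names, for illustration only.
module Adder (addInts) where

import Foreign.C.Types (CInt (..))

-- An ordinary Haskell function over C-compatible types.
addInts :: CInt -> CInt -> CInt
addInts x y = x + y

-- Ask GHC to generate a C-callable symbol named addInts. The C caller
-- must have initialised the runtime (hs_init) before calling it.
foreign export ccall addInts :: CInt -> CInt -> CInt
```

A real connect() replacement would need the same kind of export, with the socket descriptor and address pointer arriving as C types.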
So I'm sure this can be done -- but it sounds rather painful and complicated, and somewhat expensive in terms of having to link the whole ghc runtime in for this one feature. And frankly, I imagine the code you'd be writing (I haven't inspected the tsocks code itself yet) would largely be FFI calls anyway.
So a Haskell implementation of some element of socks -- a proxy, a client, etc. sounds interesting and potentially useful. But the exact preload magic that tsocks does sounds like a perhaps poor fit.
Bear in mind that there are Haskell hackers that are much better at this stuff than me, more knowledgeable, and more experienced. So they might say otherwise.

(Posting as a separate answer, since this is advice unrelated to the FFI)
You probably know this stuff, but in case it's useful for anyone...
Read up on the Network.Socket module
Search the Haskell Wiki for pages that might help you (like Applications and libraries/Network)
Check out System Programming in Haskell and other chapters from RWH
Ignore the people that say "Haskell is terrible for I/O" - protip: you can just scare them away by saying fancy words like "endofunctor"

This may not be exactly the answer you were looking for, but instead of re-writing it in Haskell, you could just use the Foreign Function Interface to wrap the already-existing C implementation in Haskell types.
Note, one of the few major changes in Haskell 2010 was to officially include the FFI as a language feature. Link: Haskell 2010 FFI
