What are the tradeoffs between the Haskell SQLite packages? - haskell

There are many Haskell SQLite bindings, which implies to me that there are many different tradeoffs on using building/using a SQLite binding. I've tried to read through the documentation of many of these packages but it became a blur after a while, and I was unable to really identify the primary tradeoffs of choosing one over another.
A search on Hackage finds:
direct-sqlite
HDBC-sqlite3
hdbi-sqlite
hsql-sqlite3
hsSqlite3
persistent-sqlite
simplest-sqlite
sql-simple-sqlite
sqlite
sqlite-simple
sqlite-simple-typed
bindings-sqlite3
Nevermind some "meta" SQLite packages. haskelldb-hdbc-sqlite3, haskelldb-hsql-sqlite3, language-sqlite, opaleye-sqlite
Hoping that someone has been able to do this successfully and can help me understand how to choose.

I looked at the package mentioned. Some of these package are a dependency of another package (like opaleye-sqlite and sqlite-simple) depend on direct-sqlite.
Therefore, let's first look at the package that provide the actual driver. Most of them are outdated. There seem to be 3 that still have recent updates:
https://hackage.haskell.org/package/simplest-sqlite https://github.com/YoshikuniJujo/test_haskell/tree/master/features/ffi/sqlite3/simplest-sqlite i wouldn't use it because the repository says "It's just my private Haskell learning/testing repository."
https://hackage.haskell.org/package/persistent-sqlite this one is based on direct-sqlite (seems like part of the direct-sqlite has been forked)
The last one being the direct-sqlite package. I used this website to find which package depend on direct-sqlite. Now leaving out package that don't have the purpose of working with sqlite (such as bake: Continuous integration system). And also leaving out packages that haven't seen updates in a long time.
That leaves us with the following package that provide extra functionality based on direct-sqlite. This list includes more levels of reverse lookup to see which other package make use of the package listed below.
persistent-sqlite as mentioned before
esqueleto
eventful
groundhog-sqlite
opaleye-sqlite
selda-sqlite
sqlite-simple
beam-sqlite

I've had very good experiences with the ...-simple family of libraries. They are very full-featured and sit at a nice medium level of abstraction where you get a large amount of flexibility over how you intereact with the database.
I'm the author of opaleye-sqlite. It is a somewhat experimental version of Opaleye for SQLite. The Postgres version of Opaleye is very solid and used in production in several places, but I only know of one person who has used opaleye-sqlite in production.

Related

Type hints for lxml?

New to Python and come from a statically typed language background. I want type hints for https://lxml.de just for ease of development (mypy flagging issues and suggesting methods would be nice!)
To my knowledge, this is a python 2.0 module and doesn’t have types. Currently I’ve used https://mypy.readthedocs.io/en/stable/stubgen.html to create stub type definitions and filling in “any”-types I’m using with more information, but it’s really hacky. Are there any safer ways to get type hints?
There is an official stubs package for lxml now called lxml-stubs:
$ pip install lxml-stubs
Note, however, that the stubs are still in development and are not 100% complete yet (although very much usable from my experience). These stubs were once part of typeshed, then curated by Jelle Zijlstra after removal and now are developed as part of the lxml project.
If you want the development version of the stubs, install via
$ pip install git+https://github.com/lxml/lxml-stubs.git
(the project's readme installation command is missing the git+ prefix in URL's scheme and won't work).
Recently I have done much more gap filling based on lxml-stubs with some good progress.
Welcome to check out types-lxml if any late comer are still interested. For most people I think lxml.objectify is the only missing piece lacking from the stubs, which is planned immediately after current release.
I’ve used stubgen to create stub type definitions and filling in “any”-types
This is actually the correct approach if it's not lxml; creating template from mypy stubgen is the starting point for many stub files. But lxml is mostly written in Cython, for which stubgen do not have perfect support yet. Besides as OP noted, this is a python 2.0 era module, and author uses function arguments in a quite polymorphous way. There are lots of unique challenges annotating lxml, as lxml is essentially a python interface for libxml and libxslt in its core.
As an example, the support of both unicode and bytes input complicates matter too; this is the same difficulty found when annotating xml.etree bundled with python, but in a much greater magnitude.
I would not call this "hacky", rather it is gradual typing.
You can take a closer look at lxml-stubs repository. From about:
This repository contains external type annotations (see PEP 484) for the lxml package. Such type annotations are normally included in typeshed, but lxml's annotations were frequently problematic and have therefore been deleted from typeshed. In particular, the stubs are incomplete and it has been difficult to provide complete stubs.
Perhaps it will be useful to you

What is Cabal Hell?

I am a little bit confused while reading about Cabal Hell, as the term is overloaded. I guess originally Cabal Hell referred to the diamond dependency problem, which was solved by restricting the build plan to have only a single version of any package in each build plan (two different versions of a package can't exist in a single build plan) as explained in this answer.
However, the term is also used in various other contexts. Such as destructive re-installations, incorrect package dependency boundaries (lower/upper version bounds), inconsistent environments ... (or any other error reported by Cabal).
Particular among these, I am confused about 1) destructive re-installations and 2) inconsistent environments? What do they mean, and how cabal new-build solves these problems (is it just sandboxing like cabal sandbox)? And what role ghc-pkg has to play here?
Any references or a simple example where these problems could be reproduced would be very appreciated.
Regarding "destructive re-installations": If I am not wrong, GHC has a package manager of itself (ghc-pkg), and the packages are installed as dynamically linkable libraries i.e: base depends on ghc-prim, so if ghc-prim is removed it will break base, am I right? And since GHC only allows one instance of a package with the same version, cabal install might register a newer build of the same (package, version) such that it breaks the dependents of the unregistered package. If the above understanding regarding "destructive re-installations" are correct; how does cabal new-build help here?
The only meaningful use of the term is the one given in the linked answer. Related are the follow-on problems from having lots of different packages in the global database, which can make encountering diamond dependencies more common, requiring destructive reinstalls to resolve, etc.
The other usages of the term are not helpful and just mean "problems somehow involving cabal."
That said, let me answer your other questions.
1) ghc-pkg is not a package manager, but rather a tool for managing ghc package databases. It is used by cabal to register packages into databases, and can be used by end-users to inspect the contents of the databases. Think of it as part of the underlying substrate provided by ghc, not a competing tool.
2) new-build eliminates and replaces the standard notion of a packagedb entirely. Instead of saying that a db consists of packages and versions, with at most one of each pair, instead a db consists of potentially many copies of packages at any given version, each with potentially different versions of its dependencies, all of which are managed in part by hash-addressing, so marked by a unique "fingerprint". This is called the store. When you new-build, cabal calculates a build plan irrespective of any previously installed dependencies, from scratch. If a particular fingerprint (consisting of a package, version, and the versions of all its dependencies, certain flags, etc) already exists in the store, then it makes use of it. If it does not, it calculates it.
As such, the only "diamond dependencies" that can occur are the truly insoluble ones, and not the ones occasioned by having fixed too-early (due to already-installed deps) some portion of the dependency tree.
tldr; you write "since GHC only allows one instance of a package with the same version" but new-build partially lifts this restriction in the store which allows the solver to produce better, more reproducible plans more often.

Parsing GHC Core in ghc-7.10

I am trying to parse some GHC Core to extract name information and other bits needed.
I am currently using the GHC API given that I haven't found other useful packages help with it.
I've looked through some packages like ghc-core, ghc-core-html and extcore but they seem slightly outdated and I haven't managed to use extcore with ghc-7.10.3.
I have also tried to look for up to date documentation on Core without luck. The best post I've come across is this one, but the discussion is slightly outdated (e.g. compiling the example from these slides, gives a different core dump using the latest ghc.
The question
Having said all this, do you guys know of any recent package that can help in parsing Core? Is there any new documentation regarding CORE manipulation?
Thanks!
The external core feature was removed because it was buggy and a hassle to maintain and if people were using it they didn't speak up. So there is no longer any textual representation of Core intended for machine consumption. Only the internal (AST) representation is available. Of course, I'm sure you'd be welcome to revive the external representation if you want to maintain it.

Haskell, Hackage, GHC and productivity. How to solve a real example?

I don't know the best way to solve a simple (probably) problems (hackage related).
I asked for help about it (http://stackoverflow.com/questions/12841599/haskell-hackage-ghc-and-productivity-what-to-do) but I knew not explain well.
Today, I'm with a this kin problem.
The concrete problem isn't relevant, but is it:
`Write a function that, given a string, remove diacritics.`
Example:
`simpleWord "Cigüeñal" <-> "Ciguenal"
The correct way (I think) is to use the standard Unicode normalization. In some languages/frameworks (.Net, PHP, Python, ...) exist some related function.
In Haskell, thanks to hackage community exist too:
`Text.Unicode.Normalization.normalize`
But, I couldn't install with (eg) ghc-7.4 but compact-string (that depends of) fail.
A fix for compact-string exists (compact-string-fix) then: can't I use cabal to install (directly)?, should I download and patch it?, should I look for another alternative to function about?
I explained a concrete real case (simple or complex, don't care), the question (that I ask help for) is how can, a novice haskeller, know the best way to select correct libraries, ghc correct (balanced) version, without hit a wall.
I'm really lost about it.
Really, thank you very much for any suggestion.
Best regards.
The documentation for compact-string says, "This package is obsolete. Use text instead.".
The documentation for text says, "To use an extended and very rich family of functions for working with Unicode text (including normalization, regular expressions, non-standard encodings, text breaking, and locales), see the text-icu package.".
The documentation for text-icu shows that it successfully builds on GHC 7.4 and has support for Unicode normalization.
Here's the general process I follow when deciding which packages to use. First, I try to identify multiple packages that meet my needs. Then I look more closely at each package to try to determine which ones are the best for me, according to the criteria listed below.
It's usually better to use packages that are currently maintained. To determine if a package is currently maintained, I check the "Upload date" link on the package description page. (Of course, there are some old tried-and-true packages that haven't been modified in ages because they don't need modification.)
It's usually better to use packages that are mature, so I check the version number on the package description page. A package with a version number of 7.3.5 is probably more mature than a version 0.1 package.
It's usually better to use packages that are well documented. Sometimes there's a nice example of how to use the package in the Haddock documentation (yay!). I'll also check the "Home page" link on the package description page, because often there will be more documentation there.
It's usually better to use packages that are popular, because any problems will probably be addressed quickly, and other users can answer questions. I'll usually do a Google search and see whch packages are mentioned most often on Haskell mailing lists and StackOverflow.
It's usually better to use packages that don't require a lot of packages I don't already have, so I check the "Dependencies" section on the package description page.
I tend to follow this procedure when choosing a package for any programming language, not just Haskell.

Non-maintainer uploads to Hackage

I have a package on Hackage which depends on third-party package, which doesn't build on newer versions of GHC (>= 7.2). The problem with the other package can be solved with just a one-line patch (a LANGUAGE pragma). I sent the patch to the upstream twice, but didn't receive any feedback. The problem is that my package is not installable neither until the dependency is fixed.
I could have uploaded the fixed version of depenency package (with a minor version bump), but I'd like to hear what is the attitude of the community about such non-maintainer uploads. Again, I don't want to change the library interface, I only add a new compilation flag to make it buildable again.
Are non-maintainer uploads to Hackage allowed and tolerated?
When a fork of the package on Hackage is a better approach?
Package uploads by non-maintainers are allowed (there may be license issues, but most packages if not all on hackage have licenses permitting this), but of course they are not usually done. They are tolerated if done in good faith and with reasonable procedure. If you contact the maintainer and don't get any response within n weeks (where I'm not sure what the appropriate value of n is, not less than 3, I'd say), uploading a new version yourself becomes an option, discussing that on the mailing lists seems however more prudent. If the package looks like it is abandoned, even taking over maintainership - of course after again contacting the maintainer, giving her/him time to respond - may be the appropriate action, but that should definitely be discussed with the community (haskell-cafe or mailing list, for example). Whether to prefer a non-maintainer upload or a fork must be left to your judgment, personally I tend to believe forks step on fewer people's toes.
But a better founded reply would be possible if we knew which package is concerned and could look at the concrete situation.
A forking is intrusive for a package that you suspect is still maintained but the author is temporarily missing. By intrusive, I mean that other programmers might pick up your fork then not go back to the mainline once the original author has resumed work on the mainline.
For packages where the original author has left the Haskell community, my personal opinion is that its better to fork the package if you are going to develop it further. Forking prevents succession problems, such as those that happened with Parsec where many developers didn't want to update because the successor was slower and less well documented than the original for some time.
In all cases asking on the Cafe is best, regardless of whether people have chosen not to follow it, it is still the center of the Haskell community.
For the particular case in the question, while it is nice if things on Hackage compile, there is no rule that says they have to. A package that depends on a broken package could simply put change instructions for the broken dependency on its front page, i.e. "This package depends on LambdaThing-0.2.0 which is broken, to fix LambdaThing add ... to the file Lambda.hs"
I would say, it's a very good idea to consult the mailing lists regarding the specific package and the specific person who is missing. I took control of the haskell-src-meta package from its original owner, but only after consulting with the lists and IRC, who assured me that Matt Morrow had been missing for months and no-one knew why.
In my opinion, package ownership should probably only be changed where there is a consensus to do so, or at the very least there should be efforts made to find one. In the development version of the Hackage software, it's my understanding there are access controls so that only administrators can make this kind of intervention.

Resources