Is it possible to profile only certain packages using GHC's profiler? - haskell

Is it possible to only generate cost-centres for functions defined in certain packages? I'm not sure if this question actually makes sense from a technical viewpoint, in that case please enlighten me. What I'm interested in, essentially, is to get profiler output that only shows cost-centres for bindings in my own packages. Is this possible?
Alternatively I'd like to be able exclude cost-centres from some packages. Is this possible?

Related

Haskell, Hackage, GHC and productivity. How to solve a real example?

I don't know the best way to solve a simple (probably) problems (hackage related).
I asked for help about it (http://stackoverflow.com/questions/12841599/haskell-hackage-ghc-and-productivity-what-to-do) but I knew not explain well.
Today, I'm with a this kin problem.
The concrete problem isn't relevant, but is it:
`Write a function that, given a string, remove diacritics.`
Example:
`simpleWord "Cigüeñal" <-> "Ciguenal"
The correct way (I think) is to use the standard Unicode normalization. In some languages/frameworks (.Net, PHP, Python, ...) exist some related function.
In Haskell, thanks to hackage community exist too:
`Text.Unicode.Normalization.normalize`
But, I couldn't install with (eg) ghc-7.4 but compact-string (that depends of) fail.
A fix for compact-string exists (compact-string-fix) then: can't I use cabal to install (directly)?, should I download and patch it?, should I look for another alternative to function about?
I explained a concrete real case (simple or complex, don't care), the question (that I ask help for) is how can, a novice haskeller, know the best way to select correct libraries, ghc correct (balanced) version, without hit a wall.
I'm really lost about it.
Really, thank you very much for any suggestion.
Best regards.
The documentation for compact-string says, "This package is obsolete. Use text instead.".
The documentation for text says, "To use an extended and very rich family of functions for working with Unicode text (including normalization, regular expressions, non-standard encodings, text breaking, and locales), see the text-icu package.".
The documentation for text-icu shows that it successfully builds on GHC 7.4 and has support for Unicode normalization.
Here's the general process I follow when deciding which packages to use. First, I try to identify multiple packages that meet my needs. Then I look more closely at each package to try to determine which ones are the best for me, according to the criteria listed below.
It's usually better to use packages that are currently maintained. To determine if a package is currently maintained, I check the "Upload date" link on the package description page. (Of course, there are some old tried-and-true packages that haven't been modified in ages because they don't need modification.)
It's usually better to use packages that are mature, so I check the version number on the package description page. A package with a version number of 7.3.5 is probably more mature than a version 0.1 package.
It's usually better to use packages that are well documented. Sometimes there's a nice example of how to use the package in the Haddock documentation (yay!). I'll also check the "Home page" link on the package description page, because often there will be more documentation there.
It's usually better to use packages that are popular, because any problems will probably be addressed quickly, and other users can answer questions. I'll usually do a Google search and see whch packages are mentioned most often on Haskell mailing lists and StackOverflow.
It's usually better to use packages that don't require a lot of packages I don't already have, so I check the "Dependencies" section on the package description page.
I tend to follow this procedure when choosing a package for any programming language, not just Haskell.

How are the Haddock module fields Portability, Stability and Maintainer used?

In lots of Haddock-generated module documentation (e.g. Prelude), a small box in the top-right can be seen, containing portability, stability and maintainer information:
From looking at the source code to such modules and experimentation, I confirmed that this information is generated from lines like the following in the module description:
-- Maintainer : libraries#haskell.org
-- Stability : stable
-- Portability : portable
There are several strange things about this:
The fields only seem to work in this order — any fields put out of order are simply treat as part of the module description itself. This is despite the fact that the order in the source file is the opposite of the order in the generated documentation!
I have been unable to find any official documentation of these fields. There is a Cabal package property named stability, the example values of which match the values I've seen in the equivalent Haddock fields, but beyond that, I've found nothing.
So: How are these fields intended to be used, and are they documented anywhere?
In particular, I'd like to know:
The full list of commonly-used values for Portability and Stability. This HaskellWiki page has a list, but I'd like to know where this list originated from.
The criteria for deciding whether a module is portable or non-portable. In particular, the package I would like the answers to these questions for, acme-strfry, is an FFI binding to strfry, a function only available in glibc. Is the package non-portable, because it only works on glibc systems, or portable, because it does not use any Haskell language extensions? The common usage seems to imply the latter.
Why a specific order of fields is required in the source file, and why it's the opposite of the ordering in the generated documentation.
Oh, I thought those fields were from the cabal package description. They don't seem to be documented at all on Haddock's docs. I've found this, which doesn't really answer your question but:
http://trac.haskell.org/haddock/ticket/71
So if it's freeform anyway, why not just write "non-portable (depends on glibc)"? I've seen even "portable (depends on ghc)", which is odd. I also wonder what happens with modules that were non-portable due to non-Haskell98 extension Foo, after Foo was added to Haskell 2010.
Note that the Cabal documenation you link to also says stability is freeform. Of course, even if haddock or cabal were to define what are the acceptable values, it'd still be up to the maintainer to subjectively select one.
About the specific order, you should probably just ask at the haddock mailing list, or check the source and file a bug.
PS: strfry is an invaluable contribution to the Haskell community, but it should be pure and portable, don't you think?
Ah yes, one of the more obscure and crufty features of Haddock.
As best as I can tell, it's just an undocumented hack. There's no sane reason why the order of the fields should matter, but it does. The specific choice of formatting (i.e., as a special form inside the module comment rather than as a separate block of some kind) isn't the best either. My guess is that somebody wanted to quickly add this feature one day, so they hacked up something minimal but functioning, and left it at that. (Without bothering to document it.)
Personally, I just don't bother with these fields at all. The information is available from Cabal, so I don't bother duplicating it in Haddock as well. Perhaps some day Cabal will pass this information to Haddock automatically...

Where do QuickCheck instances belong in a cabal package?

I have a cabal package that exports a type NBT which might be useful for other developers. I've gone through the trouble of defining an Arbitrary instance for my type, and it would be a shame to not offer it to other developers for testing their code that integrates my work.
However, I want to avoid situations where my instance might get in the way. Perhaps the other developer has a different idea for what the Arbitrary instance should be. Perhaps my package's dependency on a particular version of QuickCheck might interfere with or be unwanted in the dependencies of the client project.
My ideas, in no particular order, are:
Leave the Arbitrary instance next to the definition of the type, and let clients deal with shadowing the instance or overriding the QuickCheck version number.
Make the Arbitrary instance an orphan instance in a separate module within the same package, say Data.NBT.Arbitrary. The dependency on QuickCheck for the overall package remains.
Offer the Arbitrary instance in a totally separate package, so that it can be listed as a separate test dependency for client projects.
Conditionally include both the Arbitrary instance and the QuickCheck dependency in the main package, but only if a flag like -ftest is set.
I've seen combinations of all of these used in other libraries, but haven't found any consensus on which works best. I want to try and get it right before uploading to Hackage.
On the basis of not much specific experience, but a general desire for robustness, the guiding principle for package dependencies should perhaps be
From each according to their ability; to each according to their need.
It's good to keep the dependencies of a package to the minimum needed for its essential functionality. That suggests option 3 or option 4 to me. Of course, it's a pain to chop the package up so much. If options are capable of expressing the conditionality involved, then option 4 sounds like a sensible approach, based on using language effectively to say what you mean.
It would be really good if a consensus emerged about which one switch we need to flick to get the testing kit as well as the basic functionality.
It's also clear that there's room for refinement here. It's amazing that Cabal works as well as it does, but it could allow for more sophisticated notions of "package", perhaps after the manner of the SML module system. Translating dependencies into function types, we basically get to write
simplePackage :: (Dependency1, .., Dependencyn) -> Deliverable
but one could imagine more elaborate combinations of products and functions, like
fancyPackage :: BasicDependency -> (BasicDeliverable, HelpfulExtras -> Gravy)
Until then, pick the option that most accurately reflects the actual deal. And tell us about it, so we can build that consensus.
The problem comes down to: how likely is it that someone using your library will be wanting to run QuickCheck tests using your NBT type?
If it is likely, and the Arbitrary instance is detailed (and thus not likely to change for different people), it would probably be best to ship it with your package, especially if you're going to make sure you keep updating the package (as for using a flag or not, that comes down to a bit of personal preference). If the instance is relatively simple however (and thus more likely that people would want to customise it), then it might be an idea to just provide a sample instance in the documentation.
If the type is primarily internal in nature and not likely to be used by others wanting to run tests, then using a flag to conditionally bring in QuickCheck is probably the best way to go to avoid unnecessary dependencies (i.e. the test suite is there just so you can test the package).
I'm not a fan of having QuickCheck-only packages in general, though it might be useful in some situations.

I cannot find the packer of the executable

There is this executable that is packed however neither peid nor protection_id nor RDG tell me what it is, as they dont know.
How do i go about finding the packer?
Or what if its' custom made?
It could easily have been derived from another packer in such a way as to destroy the signature by which the packer is recognized by those tools. Someone with experience looking at packed binaries might be able to detect obvious signs that it originated from a specific tool, but if all three tools fail to detect it, there's a good chances that it's custom made. A sign that it's custom made would be if the unpacking code is fairly simple and doesn't go through more than a few KB of code before executing the payload. Also look for signs that it doesn't look like it could pack generic program binaries.

Is there a typical config or property file format and library in Haskell?

I need a set of key-value pairs for configuration read in from a file. I tried using show on a Data.Map and it doesn't look at all like what I want. It seems this is something many others might have already done so I'm wondering if there is a standard way to do it and what library to use.
Go to hackage.
Click on "packages"
Search for "config".
Notice ConfigFile(TH), EEConfig, and tconfig.
Read the Haddock documentation
Select a couple and implement your task.
Blog about your findings so the rest of us can learn from your new found expertise (thanks!).
EDIT:
I've recently used configurator - which was easy enough. I suggest you try that one!
(Yes, yes. If I took my own advice I would have made a blog for you all)
The configuration category on Hackage should list all relevant libraries:
http://hackage.haskell.org/packages/#cat:Configuration
I have researched the topic myself now, and my conclusion is:
configurator is very good, but it's currently only for user-edited configurations. The application only reads the configuration and cannot modify it. So it's more for server-side applications.
tconfig has a a simple API and looked like it was what I wanted, maybe a bit raw, until I realized it's unmaintained and that some commits which are really important to use the app are applied on github but the hackage package was not updated
Other solutions didn't look like they'd work for me, I didn't like the API, but every application (and tastes) are different.
I think using JSON for instance is not a good solution because at least with Aeson when you add new settings in a new release, the old JSON without the new member from the previous version won't load. Also, i find that solution a bit verbose.
The conclusion of my research is that I wrote my own library, app-settings, which aims to be key-value, read-write, with a as succint and type-safe API as possible. And you'll find it also in the hackage links for the configurations category that I gave.
So to summarize, I think configurator is the standard for read-only configurations (and it's very powerful too, you can split the configuration file with imports for instance). For read-write there are many small libraries, some unmaintained, and no real standard I think.
UPDATE 2018 be sure to look at dhall
I'd also suggest just using Text.JSON or one of the yaml libraries available (I prefer JSON myself, but...).
The configfile package looks like what you want.

Resources