When running hlint over my program it reported an error for
\x -> [x]
and suggested the alternative form
(: [])
What is there erroneous according to hlint about the first form, and thus why should I use the (less readable) second option?
Edit
(added hlint explicitly to the question)
My question lies not so much with what the difference is (I do understand both of them) in lexical point of view. My problem is that I do not understand why hlint is marking it as an error. Is there for example a difference in laziness? Furthermore why is the previous thought of as erroneous by hlint while \x -> Just x raises only a warning.
A common question, to which I've just added an answer in the HLint manual. It says:
Every hint has a severity level:
Error - for example concat (map f x) suggests concatMap f x as an "error" severity hint. From a style point of view, you should always replace a combination of concat and map with concatMap. Note that both expressions are equivalent - HLint is reporting an error in style, not an actual error in the code.
Warning - for example x !! 0 suggests head x as a "warning" severity hint. Typically head is a simpler way of expressing the first element of a list, especially if you are treating the list inductively. However, in the expression f (x !! 4) (x !! 0) (x !! 7), replacing the middle argument with head makes it harder to follow the pattern, and is probably a bad idea. Warning hints are often worthwhile, but should not be applied blindly.
The difference between error and warning is one of personal taste, typically my personal taste. If you already have a well developed sense of Haskell style, you should ignore the difference. If you are a beginner Haskell programmer you may wish to focus on error hints before warning hints.
While the difference is personal taste, sometimes I change my mind. Looking at the two examples in this thread, (:[]) seems a relatively "complex" hint - you are breaking down the syntactic sugar of [x] to x:[], which in some ways is peeling through the abstraction of a list as a generic container, if you never pattern match on it. In contrast \x -> Just x to Just always seems like a good idea. Therefore, in HLint-1.8.43 (just released) I have made the first a warning, and the second an error.
There is no real difference. HLint concerns itself with style issues; ultimately, they are just hints on how to make your code look better.
In general, using a lambda with a constructor or function like that is redundant and makes the code harder to read. As an extreme example, take a constructor like Just as an example: compare Just to \ x -> Just x. These are equivalent but the second version certainly makes things more confusing! As a closer example, most people would choose (+ 1) over \ x -> x + 1.
In your particular case, it's a different story because lists have special syntax. So if you like the \ x -> [x] version better, just keep it. However, once you become used to operator sections, it's likely you'll find the (: []) version as easy--if not easier--to read, so consider using it even now.
I might consider using return or pure for this:
ghci> return 0 :: [Int]
[0]
ghci> import Control.Applicative
ghci> pure 0 :: [Int]
[0]
I needed to include the type annotation (:: [Int]) because I was working in GHCi. In the middle of a bunch of other code you probably wouldn't need it.
Related
In prelude, head has the following signature : head :: [a] -> a which makes it unsafe on empty list, which is not good !! (head :: [a] -> Maybe a is the good way :-) )
this applies for a couple of other functions on list : last, tail, init minimum, maximum, cycle, last, init, foldl1, cycle... there are actually a lot of these calling errorEmptyList
Quoting Stephen Diehl from his website :
"Safe provides Maybe versions of many of the various partial functions
(head, tail) that are shipped by default. Wrapping it up in a Maybe is
widely considered the right approach and if Haskell were designed
today, they would not be present."
I would love to see these unsafe functions tagged somehow with some convention at least because I don't think any of us like when an exception blows up in production :-)
What prevents the community from fixing these issues in the prelude ?
The community has been fixing this issue in the custom preludes that are distributed on Hackage.
But it can't fix the prelude itself, it's up to the Haskell Committee in charge. For matters of backward compatibility, it has never been fixed.
(I personally prefer Relude's approach on that matter. This prelude's head function is typed as NonEmpty a -> a.)
The real question is why there is a function head at all. This isn't needed for working with lists. To case-separate (x:_) and [], as “safe head” would allow you, the best option is to just use pattern matching, or suitable higher-order functions / lens operators.
In practice, the only use for head is when you're in some big function, want at the head of some list xs (among many other parameters) where you have already checked that it is nonempty, e.g. because some other clause has already covered that. In that case, the safety would be redundant.
Arguably, this is still a kludge and would better be expressed with pattern-matching. But the point is, the unsafe head has at least some ugly-but-pragmatic uses, whereas a safe head would mostly add extra noise to pattern matching that needs to be done anyway.
IMO, head, tail and !! should just be deprecated, not changed. They keep tripping up beginners coming from Python etc. thinking that they need such functions, but actually this style is inherently against the grain of Haskell.
As a newcomer to Haskell I'm reading StackOverflow top rated questions, new questions, etc. and today there was this one:
Haskell: min distance between neighbor numbers on a list
I thought "well, I'll try that without looking at the answers". For starters, I wrote:
neighborsDistance [] = []
neighborsDistance [a] = []
neighborsDistance (x:y:xs) = abs (x - y) : neighborsDistance (y:xs)
Then I could do minimum neighborsDistance [2,3,6,2,0,1,9,8] => 1.
But I didn't much care for how the edge cases worked, so I thought perhaps I'd try using Maybe. I'd need a way to adapt the recursive rule to tolerate a "maybe" value so I looked on Hoogle at Data.Maybe and found fromMaybe...which looked like it did what I wanted:
neighborsDistance [] = Nothing
neighborsDistance [a] = Just []
neighborsDistance (x:y:xs) = Just (abs (x - y) : fromMaybe [] neighborsDistance (y:xs))
But that gave me a not in scope: fromMaybe error. Main question would be "why didn't that work?"
Another question is just generally about Haskell mindsets when looking at something like this. Is this a bad use of Maybe? Why does head throw an exception when called on an empty list vs return a Maybe type?
The question I got the problem from was trying to unify distance calculation with the minimum operation. I'm assuming that you can break it out like this without losing significant efficiency (such that it will compose with minimum, or maximum, or whatever)?
But that gave me a not in scope: fromMaybe error. Main question would be "why didn't that work?"
You need to import Data.Maybe (fromMaybe) at the top of your file.
The Prelude itself does do an import Data.Maybe to get the definitions of Maybe, Just, and Nothing. But it does the import inside a module 'where' clause, and exports only those three definitions. So the less common functions have to be manually imported.
Is this a bad use of Maybe? Why does head throw an exception when called on an empty list vs return a Maybe type?
Usage of head is generally regarded as non-ideal. The Safe package provides headMay, which does return a Maybe.
On the other hand, I think returning a Maybe [Int] is unnecessary here. The empty list already encodes the idea of there being no valid result. In fact, it's used commonly enough as a Maybe that we have maybeToList and listToMaybe. Wrapping it in a Maybe means you need to spend more effort 'unwrapping' the value.
I'm assuming that you can break it out like this without losing significant efficiency (such that it will compose with minimum, or maximum, or whatever)?
In most cases, yes. Haskell has short-cut fusion.
You main question can be answered in one line, you are probably missing:
import Data.Maybe
But why the whole stress with using a Maybe value? Because this problem has a clear answer for any list of length >= 2, you do not need to use a maybe type. A maybe type is generally only used when it is not sure whether the function will return a result or not, such as in this case.
Another question is just generally about Haskell mindsets when looking at something like this. Is this a bad use of Maybe? Why does head throw an exception when called on an empty list vs return a Maybe type?
Think about it: If you had a safe head which is implemented as the listToMaybe function in the Data.Maybe package, you would have to handle the empty case too which is not what head is intended to do (which is extract the first element of a list).
For further information about head and empty lists, you should look here as it already has been discussed.
I'm assuming that you can break it out like this without losing significant efficiency (such that it will compose with minimum, or maximum, or whatever)?
Yes you can do that in Haskell, its pretty neat. As a hint: the ($) and (.) operators are you best friends regarding this.
This has been a question I've been wondering for a while. if statements are staples in most programming languages (at least then ones I've worked with), but in Haskell it seems like it is quite frowned upon. I understand that for complex situations, Haskell's pattern matching is much cleaner than a bunch of ifs, but is there any real difference?
For a simple example, take a homemade version of sum (yes, I know it could just be foldr (+) 0):
sum :: [Int] -> Int
-- separate all the cases out
sum [] = 0
sum (x:xs) = x + sum xs
-- guards
sum xs
| null xs = 0
| otherwise = (head xs) + sum (tail xs)
-- case
sum xs = case xs of
[] -> 0
_ -> (head xs) + sum (tail xs)
-- if statement
sum xs = if null xs then 0 else (head xs) + sum (tail xs)
As a second question, which one of these options is considered "best practice" and why? My professor way back when always used the first method whenever possible, and I'm wondering if that's just his personal preference or if there was something behind it.
The problem with your examples is not the if expressions, it's the use of partial functions like head and tail. If you try to call either of these with an empty list, it throws an exception.
> head []
*** Exception: Prelude.head: empty list
> tail []
*** Exception: Prelude.tail: empty list
If you make a mistake when writing code using these functions, the error will not be detected until run time. If you make a mistake with pattern matching, your program will not compile.
For example, let's say you accidentally switched the then and else parts of your function.
-- Compiles, throws error at run time.
sum xs = if null xs then (head xs) + sum (tail xs) else 0
-- Doesn't compile. Also stands out more visually.
sum [] = x + sum xs
sum (x:xs) = 0
Note that your example with guards has the same problem.
I think the Boolean Blindness article answers this question very well. The problem is that boolean values have lost all their semantic meaning as soon as you construct them. That makes them a great source for bugs and also makes the code more difficult to understand.
Your first version, the one preferred by your prof, has the following advantages compared to the others:
no mention of null
list components are named in the pattern, so no mention of head and tail.
I do think that this one is considered "best practice".
What's the big deal? Why would we want to avoid especially head and tail? Well, everybody knows that those functions are not total, so one automatically tries to make sure that all cases are covered. A pattern match on [] not only stands out more than null xs, a series of pattern matches can be checked by the compiler for completeness. Hence, the idiomatic version with complete pattern match is easier to grasp (for the trained Haskell reader) and to proof exhaustive by the compiler.
The second version is slightly better than the last one because one sees at once that all cases are handled. Still, in the general case the RHS of the second equation could be longer and there could be a where clauses with a couple of definitions, the last of them could be something like:
where
... many definitions here ...
head xs = ... alternative redefnition of head ...
To be absolutly sure to understand what the RHS does, one has to make sure common names have not been redefined.
The 3rd version is the worst one IMHO: a) The 2nd match fails to deconstruct the list and still uses head and tail. b) The case is slightly more verbose than the equivalent notation with 2 equations.
In many programming languages, if-statements are fundamental primitives, and things like switch-blocks are just syntax sugar to make deeply-nested if-statements easier to write.
Haskell does it the other way around. Pattern matching is the fundamental primitive, and an if-expression is literally just syntax sugar for pattern matching. Similarly, constructs like null and head are simply user-defined functions, which are all ultimately implemented using pattern matching. So pattern matching is the thing at the bottom of it all. (And therefore potentially more efficient than calling user-defined functions.)
In many cases - such as the ones you list above - it's simply a matter of style. The compiler can almost certainly optimise things to the point where all versions are roughly equal in performance. But generally [not always!] pattern matching makes it clearer exactly what you're trying to achieve.
(It's annoyingly easy to write an if-expression where you get the two alternatives the wrong way around. You'd think it would be a rare mistake, but it's surprisingly common. With a pattern match, there's little chance of making that specific mistake, although there's still plenty of other things to screw up.)
Each call to null, head and tail entails a pattern match. But the 1st version in your answer does just one pattern match, and reuses its results through named components of the pattern.
Just for that, it is better. But it is also more visually apparent, more readable.
Pattern matching is better than a string of if-then-else statements for (at least) the following reasons:
it is more declarative
it interacts well with sum-types
Pattern matching helps to reduce the amount of "accidental complexity" in your code - that is, code that is really more about implementation details rather than the essential logic of your program.
In most other languages when the compier/run-time sees a string of if-then-else statements it has no choice but to test the conditions in exactly the order the programmer specified them. But pattern matching encourages the programmer to focus more on describing what should happen versus how things should be performed. Due to purity and immutability of values in Haskell the compiler can consider the collection of patterns as a whole and decide the how best to implement them.
An analogy would be C's switch statement. If you dump the assembly code for various switch statements you will see that sometimes the compiler will generate a chain/tree of comparisons and in other cases it will generate a jump table. The programmer uses the same syntax in both cases - the compiler chooses the implementation based on what the comparison values are. If they form a contiguous block of values the jump table method is used, otherwise a comparison tree is used. And this separation of concerns allows the compiler to implement even more strategies in the future if other patterns among the comparison values are detected.
Note to other potential contributors: Please don't hesitate to use abstract or mathematical notations to make your point. If I find your answer unclear, I will ask for elucidation, but otherwise feel free to express yourself in a comfortable fashion.
To be clear: I am not looking for a "safe" head, nor is the choice of head in particular exceptionally meaningful. The meat of the question follows the discussion of head and head', which serve to provide context.
I've been hacking away with Haskell for a few months now (to the point that it has become my main language), but I am admittedly not well-informed about some of the more advanced concepts nor the details of the language's philosophy (though I am more than willing to learn). My question then is not so much a technical one (unless it is and I just don't realize it) as it is one of philosophy.
For this example, I am speaking of head.
As I imagine you'll know,
Prelude> head []
*** Exception: Prelude.head: empty list
This follows from head :: [a] -> a. Fair enough. Obviously one cannot return an element of (hand-wavingly) no type. But at the same time, it is simple (if not trivial) to define
head' :: [a] -> Maybe a
head' [] = Nothing
head' (x:xs) = Just x
I've seen some little discussion of this here in the comment section of certain statements. Notably, one Alex Stangl says
'There are good reasons not to make everything "safe" and to throw exceptions when preconditions are violated.'
I do not necessarily question this assertion, but I am curious as to what these "good reasons" are.
Additionally, a Paul Johnson says,
'For instance you could define "safeHead :: [a] -> Maybe a", but now instead of either handling an empty list or proving it can't happen, you have to handle "Nothing" or prove it can't happen.'
The tone that I read from that comment suggests that this is a notable increase in difficulty/complexity/something, but I am not sure that I grasp what he's putting out there.
One Steven Pruzina says (in 2011, no less),
"There's a deeper reason why e.g 'head' can't be crash-proof. To be polymorphic yet handle an empty list, 'head' must always return a variable of the type which is absent from any particular empty list. It would be Delphic if Haskell could do that...".
Is polymorphism lost by allowing empty list handling? If so, how so, and why? Are there particular cases which would make this obvious? This section amply answered by #Russell O'Connor. Any further thoughts are, of course, appreciated.
I'll edit this as clarity and suggestion dictates. Any thoughts, papers, etc., you can provide will be most appreciated.
Is polymorphism lost by allowing empty
list handling? If so, how so, and why?
Are there particular cases which would
make this obvious?
The free theorem for head states that
f . head = head . $map f
Applying this theorem to [] implies that
f (head []) = head (map f []) = head []
This theorem must hold for every f, so in particular it must hold for const True and const False. This implies
True = const True (head []) = head [] = const False (head []) = False
Thus if head is properly polymorphic and head [] were a total value, then True would equal False.
PS. I have some other comments about the background to your question to the effect of if you have a precondition that your list is non-empty then you should enforce it by using a non-empty list type in your function signature instead of using a list.
Why does anyone use head :: [a] -> a instead of pattern matching? One of the reasons is because you know that the argument cannot be empty and do not want to write the code to handle the case where the argument is empty.
Of course, your head' of type [a] -> Maybe a is defined in the standard library as Data.Maybe.listToMaybe. But if you replace a use of head with listToMaybe, you have to write the code to handle the empty case, which defeats this purpose of using head.
I am not saying that using head is a good style. It hides the fact that it can result in an exception, and in this sense it is not good. But it is sometimes convenient. The point is that head serves some purposes which cannot be served by listToMaybe.
The last quotation in the question (about polymorphism) simply means that it is impossible to define a function of type [a] -> a which returns a value on the empty list (as Russell O'Connor explained in his answer).
It's only natural to expect the following to hold: xs === head xs : tail xs - a list is identical to its first element, followed by the rest. Seems logical, right?
Now, let's count the number of conses (applications of :), disregarding the actual elements, when applying the purported 'law' to []: [] should be identical to foo : bar, but the former has 0 conses, while the latter has (at least) one. Uh oh, something's not right here!
Haskell's type system, for all its strengths, is not up to expressing the fact that you should only call head on a non-empty list (and that the 'law' is only valid for non-empty lists). Using head shifts the burden of proof to the programmer, who should make sure it's not used on empty lists. I believe dependently typed languages like Agda can help here.
Finally, a slightly more operational-philosophical description: how should head ([] :: [a]) :: a be implemented? Conjuring a value of type a out of thin air is impossible (think of uninhabited types such as data Falsum), and would amount to proving anything (via the Curry-Howard isomorphism).
There are a number of different ways to think about this. So I am going to argue both for and against head':
Against head':
There is no need to have head': Since lists are a concrete data type, everything that you can do with head' you can do by pattern matching.
Furthermore, with head' you're just trading off one functor for another. At some point you want to get down to brass tacks and get some work done on the underlying list element.
In defense of head':
But pattern matching obscures what's going on. In Haskell we are interested in calculating functions, which is better accomplished by writing them in point-free style using compositions and combinators.
Furthermore, thinking about the [] and Maybe functors, head' allows you to move back and forth between them (In particular the Applicative instance of [] with pure = replicate.)
If in your use case an empty list makes no sense at all, you can always opt to use NonEmpty instead, where neHead is safe to use. If you see it from that angle, it's not the head function that is unsafe, it's the whole list data-structure (again, for that use case).
I think this is a matter of simplicity and beauty. Which is, of course, in the eye of the beholder.
If coming from a Lisp background, you may be aware that lists are built of cons cells, each cell having a data element and a pointer to next cell. The empty list is not a list per se, but a special symbol. And Haskell goes with this reasoning.
In my view, it is both cleaner, simpler to reason about, and more traditional, if empty list and list are two different things.
...I may add - if you are worried about head being unsafe - don't use it, use pattern matching instead:
sum [] = 0
sum (x:xs) = x + sum xs
Could someone provide a link to a good coding standard for Haskell? I've found this and this, but they are far from comprehensive. Not to mention that the HaskellWiki one includes such "gems" as "use classes with care" and "defining symbolic infix identifiers should be left to library writers only."
Really hard question. I hope your answers turn up something good. Meanwhile, here is a catalog of mistakes or other annoying things that I have found in beginners' code. There is some overlap with the Cal Tech style page that Kornel Kisielewicz points to. Some of my advice is every bit as vague and useless as the HaskellWiki "gems", but I hope at least it is better advice :-)
Format your code so it fits in 80 columns. (Advanced users may prefer 87 or 88; beyond that is pushing it.)
Don't forget that let bindings and where clauses create a mutually recursive nest of definitions, not a sequence of definitions.
Take advantage of where clauses, especially their ability to see function parameters that are already in scope (nice vague advice). If you are really grokking Haskell, your code should have a lot more where-bindings than let-bindings. Too many let-bindings is a sign of an unreconstructed ML programmer or Lisp programmer.
Avoid redundant parentheses. Some places where redundant parentheses are particularly offensive are
Around the condition in an if expression (brands you as an unreconstructed C programmer)
Around a function application which is itself the argument of an infix operator (Function application binds tighter than any infix operator. This fact should be burned into every Haskeller's brain, in much the same way that us dinosaurs had APL's right-to-left scan rule burned in.)
Put spaces around infix operators. Put a space following each comma in a tuple literal.
Prefer a space between a function and its argument, even if the argument is parenthesized.
Use the $ operator judiciously to cut down on parentheses. Be aware of the close relationship between $ and infix .:
f $ g $ h x == (f . g . h) x == f . g . h $ x
Don't overlook the built-in Maybe and Either types.
Never write if <expression> then True else False; the correct phrase is simply <expression>.
Don't use head or tail when you could use pattern matching.
Don't overlook function composition with the infix dot operator.
Use line breaks carefully. Line breaks can increase readability, but there is a tradeoff: Your editor may display only 40–50 lines at once. If you need to read and understand a large function all at once, you mustn't overuse line breaks.
Almost always prefer the -- comments which run to end of line over the {- ... -} comments. The braced comments may be appropriate for large headers—that's it.
Give each top-level function an explicit type signature.
When possible, align -- lines, = signs, and even parentheses and commas that occur in adjacent lines.
Influenced as I am by GHC central, I have a very mild preference to use camelCase for exported identifiers and short_name with underscores for local where-bound or let-bound variables.
Some good rules of thumbs imho:
Consult with HLint to make sure you don't have redundant braces and that your code isn't pointlessly point-full.
Avoid recreating existing library functions. Hoogle can help you find them.
Often times existing library functions are more general than what one was going to make. For example if you want Maybe (Maybe a) -> Maybe a, then join does that among other things.
Argument naming and documentation is important sometimes.
For a function like replicate :: Int -> a -> [a], it's pretty obvious what each of the arguments does, from their types alone.
For a function that takes several arguments of the same type, like isPrefixOf :: (Eq a) => [a] -> [a] -> Bool, naming/documentation of arguments is more important.
If one function exists only to serve another function, and isn't otherwise useful, and/or it's hard to think of a good name for it, then it probably should exist in it's caller's where clause instead of in the module's scope.
DRY
Use Template-Haskell when appropriate.
Bundles of functions like zip3, zipWith3, zip4, zipWith4, etc are very meh. Use Applicative style with ZipLists instead. You probably never really need functions like those.
Derive instances automatically. The derive package can help you derive instances for type-classes such as Functor (there is only one correct way to make a type an instance of Functor).
Code that is more general has several benefits:
It's more useful and reusable.
It is less prone to bugs because there are more constraints.
For example if you want to program concat :: [[a]] -> [a], and notice how it can be more general as join :: Monad m => m (m a) -> m a. There is less room for error when programming join because when programming concat you can reverse the lists by mistake and in join there are very few things you can do.
When using the same stack of monad transformers in many places in your code, make a type synonym for it. This will make the types shorter, more concise, and easier to modify in bulk.
Beware of "lazy IO". For example readFile doesn't really read the file's contents at the moment the file is read.
Avoid indenting so much that I can't find the code.
If your type is logically an instance of a type-class, make it an instance.
The instance can replace other interface functions you may have considered with familiar ones.
Note: If there is more than one logical instance, create newtype-wrappers for the instances.
Make the different instances consistent. It would have been very confusing/bad if the list Applicative behaved like ZipList.
I like to try to organize functions
as point-free style compositions as
much as possible by doing things
like:
func = boo . boppity . bippity . snd
where boo = ...
boppity = ...
bippity = ...
I like using ($) only to avoid nested parens or long parenthesized expressions
... I thought I had a few more in me, oh well
I'd suggest taking a look at this style checker.
I found good markdown file covering almost every aspect of haskell code style. It can be used as cheat sheet. You can find it here: link