What makes a good name for a helper function? - haskell

Consider the following problem: given a list of length three of tuples (String,Int), is there a pair of elements having the same "Int" part? (For example, [("bob",5),("gertrude",3),("al",5)] contains such a pair, but [("bob",5),("gertrude",3),("al",1)] does not.)
This is how I would implement such a function:
import Data.List (sortBy)
import Data.Function (on)
hasPair::[(String,Int)]->Bool
hasPair = napkin . sortBy (compare `on` snd)
where napkin [(_, a),(_, b),(_, c)] | a == b = True
| b == c = True
| otherwise = False
I've used pattern matching to bind names to the "Int" part of the tuples, but I want to sort first (in order to group like members), so I've put the pattern-matching function inside a where clause. But this brings me to my question: what's a good strategy for picking names for functions that live inside where clauses? I want to be able to think of such names quickly. For this example, "hasPair" seems like a good choice, but it's already taken! I find that pattern comes up a lot - the natural-seeming name for a helper function is already taken by the outer function that calls it. So I have, at times, called such helper functions things like "op", "foo", and even "helper" - here I have chosen "napkin" to emphasize its use-it-once, throw-it-away nature.
So, dear Stackoverflow readers, what would you have called "napkin"? And more importantly, how do you approach this issue in general?

General rules for locally-scoped variable naming.
f , k, g, h for super simple local, semi-anonymous things
go for (tail) recursive helpers (precedent)
n , m, i, j for length and size and other numeric values
v for results of map lookups and other dictionary types
s and t for strings.
a:as and x:xs and y:ys for lists.
(a,b,c,_) for tuple fields.
These generally only apply for arguments to HOFs. For your case, I'd go with something like k or eq3.
Use apostrophes sparingly, for derived values.

I tend to call boolean valued functions p for predicate. pred, unfortunately, is already taken.

In cases like this, where the inner function is basically the same as the outer function, but with different preconditions (requiring that the list is sorted), I sometimes use the same name with a prime, e.g. hasPairs'.
However, in this case, I would rather try to break down the problem into parts that are useful by themselves at the top level. That usually also makes naming them easier.
hasPair :: [(String, Int)] -> Bool
hasPair = hasDuplicate . map snd
hasDuplicate :: Ord a => [a] -> Bool
hasDuplicate = not . isStrictlySorted . sort
isStrictlySorted :: Ord a => [a] -> Bool
isStrictlySorted xs = and $ zipWith (<) xs (tail xs)

My strategy follows Don's suggestions fairly closely:
If there is an obvious name for it, use that.
Use go if it is the "worker" or otherwise very similar in purpose to the original function.
Follow personal conventions based on context, e.g. step and start for args to a fold.
If all else fails, just go with a generic name, like f
There are two techniques that I personally avoid. One is using the apostrophe version of the original function, e.g. hasPair' in the where clause of hasPair. It's too easy to accidentally write one when you meant the other; I prefer to use go in such cases. But this isn't a huge deal as long as the functions have different types. The other is using names that might connote something, but not anything that has to do with what the function actually does. napkin would fall into this category. When you revisit this code, this naming choice will probably baffle you, as you will have forgotten the original reason that you named it napkin. (Because napkins have 4 corners? Because they are easily folded? Because they clean up messes? They're found at restaurants?) Other offenders are things like bob and myCoolFunc.
If you have given a function a name that is more descriptive than go or h, then you should be able to look at either the context in which it is used, or the body of the function, and in both situations get a pretty good idea of why that name was chosen. This is where my point #3 comes in: personal conventions. Much of Don's advice applies. If you are using Haskell in a collaborative situation, then coordinate with your team and decide on certain conventions for common situations.

Related

Haskell what does the ' symbol do?

As the title states, I see pieces of code online where the variables/functions have ' next to it, what does this do/mean?
ex:
function :: [a] -> [a]
function ...
function' :: ....
The notation comes from mathematics. It is read x prime. In pretty much any math manual you can find something like let x be a number and x' be the projection of ... (math stuff).
Why not using another convention? well, in mathematics It makes a lot of sense because It can be very pedagogical... In programming we aren't used to this convention so I don't see the point of using it, but I am not against it neither.
Just to give you an example of its use in mathematics so you can understand why It is used in Haskell. Below, the same triangle concept but one using prime convention and other not using it. It is pretty clear in the first picture that pairs (A, A'), (B, B'), ... are related by one being the vertex and the prime version being the midpoint of the oposite edge. Whereas in the second example, you just have to remember that A is the midpoint of the oposite edge of vertex P. First is easier and more pedagogical:
As the other answers said, function' is just another variable name. So,
don'tUse :: Int -> IO ()
don'tUse won'tBe''used'' = return ()
is just like
dontUse :: Int -> IO ()
dontUse wontBeUsed = return ()
with slightly different names. The only requirement is that the name starts with a lowercase-letter or underscore, after that you can have as many single-quote characters as you want.
Prelude> let _' = 1
Prelude> let _'' = 2
Prelude> let _''''''''' = 9
Prelude> _' + _'' * _'''''''''
19
...Of course it's not necessarily a good idea to name variables like that; normally such prime-names are used when making a slightly different version of an already named thing. For example, foldl and foldl' are functions with the same signature that do essentially the same thing, only with different strictness (which often affects performance memory usage and whether infinite inputs are allowed, but not the actual results).
That said, to the question
Haskell what does the ' symbol do?
– the ' symbol does in fact do various other things as well, but only when it appears not as a non-leading character in a name.
'a' is a character literal.
'Foo is a constructor used on the type level. See DataKinds.
'bar and ''Baz are quoted names. See TemplateHaskell.

Haskell. It is required to flatten the same elements in the list in the order they follow

How to remove identical duplicate items adjacent to the items in the list?
flattlst "aaaasssaaabbbssaaa"
--"asabsa"
I tried to do it through the "nub" function in the Data.List,
import Data.List
flattlst = nub
--flattlst"aaaasssaaabbbssaaa"
"asb" -- !!! It is wrong!!
"asabsa" -- !!The answer should be like this.
but that is not what I need. The "Nub" removes all the same elements, but I only need those that go in a row.Help me solve this problem.
nub will only yield an element, if it did not yield an equivalent element first. So nub indeed will emit "asb", since the 'a's, 's's and 'b's after that are already emitted.
What you can use is group :: (Eq a, Foldable f) => f a -> [NonEmpty a]. This will yield a non-empty list for each subsequence of equal elements. For example:
Prelude Data.List.NonEmpty> group "aaaasssaaabbbssaaa"
['a' :| "aaa",'s' :| "ss",'a' :| "aa",'b' :| "bb",'s' :| "s",'a' :| "aa"]
You still need to post-process this, such that for each NonEmpty element, you return the first element. I leave that as an exercise. Hint: you can use pattern matching for that.
Here's a naive approach that should work fine. It uses group from Data.List:
flattlst = map head . group
Most Haskell programmers, including me, generally warn against using head, as it will crash the program if used on an empty list. However, group is guaranteed to result in a list of non-empty lists*, so this should be OK.
*at least if the input is non-empty. I'm not sure what group does when used on an empty list - I suspect one way or another this will blowup, so add a special case for the empty list/string if that's important to avoid.

Haskell # type annotation [duplicate]

I've come across a piece of Haskell code that looks like this:
ps#(p:pt)
What does the # symbol mean in this context? I can't seem to find any info on Google (it's unfortunately hard to search for symbols on Google), and I can't find the function in the Prelude documentation, so I imagine it must be some sort of syntactic sugar instead.
Yes, it's just syntactic sugar, with # read aloud as "as". ps#(p:pt) gives you names for
the list: ps
the list's head : p
the list's tail: pt
Without the #, you'd have to choose between (1) or (2):(3).
This syntax actually works for any constructor; if you have data Tree a = Tree a [Tree a], then t#(Tree _ kids) gives you access to both the tree and its children.
The # Symbol is used to both give a name to a parameter and match that parameter against a pattern that follows the #. It's not specific to lists and can also be used with other data structures.
This is useful if you want to "decompose" a parameter into it's parts while still needing the parameter as a whole somewhere in your function. One example where this is the case is the tails function from the standard library:
tails :: [a] -> [[a]]
tails [] = [[]]
tails xxs#(_:xs) = xxs : tails xs
I want to add that # works at all levels, meaning you can do this:
let a#(b#(Just c), Just d) = (Just 1, Just 2) in (a, b, c, d)
Which will then produce this: ((Just 1, Just 2), Just 1, 1, 2)
So basically it's a way for you to bind a pattern to a value. This also means that it works with any kind of pattern, not just lists, as demonstrated above. This is a very useful thing to know, as it means you can use it in many more cases.
In this case, a is the entire Maybe Tuple, b is just the first Just in the tuple, and c and d are the values contained in the first and second Just in the tuple respectively
To add to what the other people have said, they are called as-patterns (in ML the syntax uses the keyword "as"), and are described in the section of the Haskell Report on patterns.

Writing a function to get all subsequences of size n in Haskell?

I was trying to write a function to get all subsequences of a list that are of size n, but I'm not sure how to go about it.
I was thinking that I could probably use the built-in Data.List.subsequences and just filter out the lists that are not of size n, but it seems like a rather roundabout and inefficient way of doing it, and I'd rather not do that if I can avoid it, so I'm wondering if you have any ideas?
I want it to be something like this type
subseqofsize :: Int -> [a] -> [[a]]
For further clarification, here's an example of what I'm looking for:
subseqofsize 2 [1,2,3,3]
[[1,2],[1,3],[2,3],[1,3],[2,3],[3,3]]
Also, I don't care about the order of anything.
I'm assuming that this is homework, or that you are otherwise doing this as an exercise to learn, so I'll give you an outline of what the solution looks like instead of spoon-feeding you the correct answer.
Anyway, this is a recursion question.
There are two base cases:
sublistofsize 0 _ = ...
sublistofsize _ [] = ...
Then there are two recursive steps:
sublistofsize n (x : xs) = sublistsThatStartWithX ++ sublistsThatDon'tStartWithX
where sublistsThatStartWithX = ...
sublistsThatDon'tStartWithX = ...
Remember that the definitions of the base cases need to work appropriately with the definitions in the recursive cases. Think carefully: don't just assume that the base cases both result in an empty list.
Does this help?
You can think about this mathematically: to compute the sublists of size k, we can look at one element x of the list; either the sublists contain x, or they don't. In the former case, the sublist consists of x and then k-1 elements chosen from the remaining elements. In the latter case, the sublists consist of k elements chosen from the elements that aren't x. This lends itself to a (fairly) simple recursive definition.
(There are very very strong similarities to the recursive formula for binomial coefficients, which is expected.)
(Edit: removed code, per dave4420's reasons :) )

What does the "#" symbol mean in reference to lists in Haskell?

I've come across a piece of Haskell code that looks like this:
ps#(p:pt)
What does the # symbol mean in this context? I can't seem to find any info on Google (it's unfortunately hard to search for symbols on Google), and I can't find the function in the Prelude documentation, so I imagine it must be some sort of syntactic sugar instead.
Yes, it's just syntactic sugar, with # read aloud as "as". ps#(p:pt) gives you names for
the list: ps
the list's head : p
the list's tail: pt
Without the #, you'd have to choose between (1) or (2):(3).
This syntax actually works for any constructor; if you have data Tree a = Tree a [Tree a], then t#(Tree _ kids) gives you access to both the tree and its children.
The # Symbol is used to both give a name to a parameter and match that parameter against a pattern that follows the #. It's not specific to lists and can also be used with other data structures.
This is useful if you want to "decompose" a parameter into it's parts while still needing the parameter as a whole somewhere in your function. One example where this is the case is the tails function from the standard library:
tails :: [a] -> [[a]]
tails [] = [[]]
tails xxs#(_:xs) = xxs : tails xs
I want to add that # works at all levels, meaning you can do this:
let a#(b#(Just c), Just d) = (Just 1, Just 2) in (a, b, c, d)
Which will then produce this: ((Just 1, Just 2), Just 1, 1, 2)
So basically it's a way for you to bind a pattern to a value. This also means that it works with any kind of pattern, not just lists, as demonstrated above. This is a very useful thing to know, as it means you can use it in many more cases.
In this case, a is the entire Maybe Tuple, b is just the first Just in the tuple, and c and d are the values contained in the first and second Just in the tuple respectively
To add to what the other people have said, they are called as-patterns (in ML the syntax uses the keyword "as"), and are described in the section of the Haskell Report on patterns.

Resources