Function definition with guards - haskell

Classic way to define Haskell functions is
f1 :: String -> Int
f1 ('-' : cs) -> f1 cs + 1
f1 _ = 0
I'm kinda unsatisfied writing function name at every line. Now I usually write in the following way, using pattern guards extension and consider it more readable and modification friendly:
f2 :: String -> Int
f2 s
| '-' : cs <- s = f2 cs + 1
| otherwise = 0
Do you think that second example is more readable, modifiable and elegant? What about generated code? (Haven't time to see desugared output yet, sorry!). What are cons? The only I see is extension usage.

Well, you could always write it like this:
f3 :: String -> Int
f3 s = case s of
('-' : cs) -> f3 cs + 1
_ -> 0
Which means the same thing as the f1 version. If the function has a lengthy or otherwise hard-to-read name, and you want to match against lots of patterns, this probably would be an improvement. For your example here I'd use the conventional syntax.
There's nothing wrong with your f2 version, as such, but it seems a slightly frivolous use of a syntactic GHC extension that's not common enough to assume everyone will be familiar with it. For personal code it's not a big deal, but I'd stick with the case expression for anything you expect other people to be reading.

I prefer writing function name when I am pattern matching on something as is shown in your case. I find it more readable.
I prefer using guards when I have some conditions on the function arguments, which helps avoiding if else, which I would have to use if I was to follow the first pattern.
So to answer your questions
Do you think that second example is more readable, modifiable and elegant?
No, I prefer the first one which is simple and readable. But more or less it depends on your personal taste.
What about generated code?
I dont think there will be any difference in the generated code. Both are just patternmatching.
What are cons?
Well patternguards are useful to patternmatch instead of using let or something more cleanly.
addLookup env var1 var2
| Just val1 <- lookup env var1
, Just val2 <- lookup env var2
= val1 + val2
Well the con is ofcourse you need to use an extension and also it is not Haskell98 (which you might not consider much of a con)
On the other hand for trivial pattern matching on function arguments I will just use the first method, which is simple and readable.

Related

Haskell what does the ' symbol do?

As the title states, I see pieces of code online where the variables/functions have ' next to it, what does this do/mean?
ex:
function :: [a] -> [a]
function ...
function' :: ....
The notation comes from mathematics. It is read x prime. In pretty much any math manual you can find something like let x be a number and x' be the projection of ... (math stuff).
Why not using another convention? well, in mathematics It makes a lot of sense because It can be very pedagogical... In programming we aren't used to this convention so I don't see the point of using it, but I am not against it neither.
Just to give you an example of its use in mathematics so you can understand why It is used in Haskell. Below, the same triangle concept but one using prime convention and other not using it. It is pretty clear in the first picture that pairs (A, A'), (B, B'), ... are related by one being the vertex and the prime version being the midpoint of the oposite edge. Whereas in the second example, you just have to remember that A is the midpoint of the oposite edge of vertex P. First is easier and more pedagogical:
As the other answers said, function' is just another variable name. So,
don'tUse :: Int -> IO ()
don'tUse won'tBe''used'' = return ()
is just like
dontUse :: Int -> IO ()
dontUse wontBeUsed = return ()
with slightly different names. The only requirement is that the name starts with a lowercase-letter or underscore, after that you can have as many single-quote characters as you want.
Prelude> let _' = 1
Prelude> let _'' = 2
Prelude> let _''''''''' = 9
Prelude> _' + _'' * _'''''''''
19
...Of course it's not necessarily a good idea to name variables like that; normally such prime-names are used when making a slightly different version of an already named thing. For example, foldl and foldl' are functions with the same signature that do essentially the same thing, only with different strictness (which often affects performance memory usage and whether infinite inputs are allowed, but not the actual results).
That said, to the question
Haskell what does the ' symbol do?
– the ' symbol does in fact do various other things as well, but only when it appears not as a non-leading character in a name.
'a' is a character literal.
'Foo is a constructor used on the type level. See DataKinds.
'bar and ''Baz are quoted names. See TemplateHaskell.

Haskell conversion between types

Again stuck on something probably theoretical. There are many libraries in Haskell, i'd like to use less as possible. If I have a type like this:
data Note = Note { _noteID :: Int
, _noteTitle :: String
, _noteBody :: String
, _noteSubmit :: String
} deriving Show
And use that to create a list of [Note {noteID=1...}, Note {noteID=2...}, ] et cetera. I now have a list of type Note. Now I want to write it to a file using writeFile. Probably it ghc will not allow it considering writeFile has type FilePath -> String -> IO (). But I also want to avoid deconstructing (writeFile) and constructing (readFile) the types all the time, assuming I will not leave the Haskell 'realm'. Is there a way to do that, without using special libs? Again: thanks a lot. Books on Haskell are good, but StackOverflow is the glue between the books and the real world.
If you're looking for a "quick fix", for a one-off script or something like that, you can derive Read in addition to Show, and then you'll be able to use show to convert to String and read to convert back, for example:
data D = D { x :: Int, y :: Bool }
deriving (Show, Read)
d1 = D 42 True
s = show d1
-- s == "D {x = 42, y = True}"
d2 :: D
d2 = read s
-- d2 == d1
However, please, please don't put this in production code. First, you're implicitly relying on how the record is coded, and there are no checks to protect from subtle changes. Second, the read function is partial - that is, it will crash if it can't parse the input. And finally, if you persist your data this way, you'll be stuck with this record format and can never change it.
For a production-quality solution, I'm sorry, but you'll have to come up with an explicit, documented serialization format. No way around it - in any language.

Words from string to list

I'm trying to write a Haskell function which would read a string and return a list with the words from the string saved in it.
Here's how I did it:
toWordList :: String -> [String]
toWordList = do
[ toLower x | x <- str ]
let var = removePunctuation(x)
return (words var)
But I get this error:
Test1.hs:13:17: error: parse error on input 'let'
|
13 | let var = removePunctuation(x)
| ^^^
I'm new to Haskell so I don't have the grasp over its syntax so thanks in advance for the help.
There's quite a few mistakes here, you should spend more time reading over some tutorials (learn you a Haskell, Real World Haskell). You're pretty close though, so I'll try to do a break-down here.
do is special - it doesn't switch Haskell into "imperative mode", it lets you write clearer code when using Monads - if you don't yet know what Monads are, stay away from do! Keywords like return also don't behave the same as in imperative languages. Try to approach Haskell with a completely fresh mind.
Also in Haskell, indentation is important - see this link for a good explanation. Essentially, you want all the lines in the same "block" to have the same indentation.
Okay, let's strip out the do and return keywords, and align the indentation. We'll also name the parameter to the function str - in your original code, you missed this bit out.
toWordList :: String -> [String]
toWordList str =
[toLower x | x <- str]
let var = removePunctuation(x)
words var
The syntax for let is let __ = __ in __. There's different notation when using do, but forget about that for now. We also don't name the result of the list comprehension, so let's do that:
toWordList str =
let lowered = [toLower x | x <- str] in
let var = removePunctuation lowered in
words var
And this works! We just needed to get some syntax right and avoid the monadic syntactic sugar of do/return.
It's possible (and easy) to make it nicer though. Those let blocks are kinda ugly, we can strip those away. We can also replace the list comprehension with map toLower, which is a bit more elegant and is equivalent to your comprehension:
toWordList str = words (removePunctuation (map toLower str))
Nice, that's down to a single line now! But all those brackets are also a bit of an eyesore, how about we use the $ function?
toWordList str = words $ removePunctuation $ map toLower str
Looking good. There's another improvement we can make, which is to convert this into point-free style, where we don't explicitly name our parameter - instead we express this function as the composition of other functions.
toWordList = words . removePunctuation . (map toLower)
And we're done! Hopefully the first two code snippets make it clearer how the Haskell syntax works, and the last few might show you some nice examples of how you can make fairly verbose code much much cleaner.

Is there a (Template) Haskell library that would allow me to print/dump a few local bindings with their respective names?

For instance:
let x = 1 in putStrLn [dump|x, x+1|]
would print something like
x=1, (x+1)=2
And even if there isn't anything like this currently, would it be possible to write something similar?
TL;DR There is this package which contains a complete solution.
install it via cabal install dump
and/or
read the source code
Example usage:
{-# LANGUAGE QuasiQuotes #-}
import Debug.Dump
main = print [d|a, a+1, map (+a) [1..3]|]
where a = 2
which prints:
(a) = 2 (a+1) = 3 (map (+a) [1..3]) = [3,4,5]
by turnint this String
"a, a+1, map (+a) [1..3]"
into this expression
( "(a) = " ++ show (a) ++ "\t " ++
"(a+1) = " ++ show (a + 1) ++ "\t " ++
"(map (+a) [1..3]) = " ++ show (map (+ a) [1 .. 3])
)
Background
Basically, I found that there are two ways to solve this problem:
Exp -> String The bottleneck here is pretty-printing haskell source code from Exp and cumbersome syntax upon usage.
String -> Exp The bottleneck here is parsing haskell to Exp.
Exp -> String
I started out with what #kqr put together, and tried to write a parser to turn this
["GHC.Classes.not x_1627412787 = False","x_1627412787 = True","x_1627412787 GHC.Classes.== GHC.Types.True = True"]
into this
["not x = False","x = True","x == True = True"]
But after trying for a day, my parsec-debugging-skills have proven insufficient to date, so instead I went with a simple regular expression:
simplify :: String -> String
simplify s = subRegex (mkRegex "_[0-9]+|([a-zA-Z]+\\.)+") s ""
For most cases, the output is greatly improved.
However, I suspect this to likely mistakenly remove things it shouldn't.
For example:
$(dump [|(elem 'a' "a.b.c", True)|])
Would likely return:
["elem 'a' \"c\" = True","True = True"]
But this could be solved with proper parsing.
Here is the version that works with the regex-aided simplification: https://github.com/Wizek/kqr-stackoverflow/blob/master/Th.hs
Here is a list of downsides / unresolved issues I've found with the Exp -> String solution:
As far as I know, not using Quasi Quotation requires cumbersome syntax upon usage, like: $(d [|(a, b)|]) -- as opposed to the more succinct [d|a, b|]. If you know a way to simplify this, please do tell!
As far as I know, [||] needs to contain fully valid Haskell, which pretty much necessitates the use of a tuple inside further exacerbating the syntactic situation. There is some upside to this too, however: at least we don't need to scratch our had where to split the expressions since GHC does that for us.
For some reason, the tuple only seemed to accept Booleans. Weird, I suspect this should be possible to fix somehow.
Pretty pretty-printing Exp is not very straight-forward. A more complete solution does require a parser after all.
Printing an AST scrubs the original formatting for a more uniform looks. I hoped to preserve the expressions letter-by-letter in the output.
The deal-breaker was the syntactic over-head. I knew I could get to a simpler solution like [d|a, a+1|] because I have seen that API provided in other packages. I was trying to remember where I saw that syntax. What is the name...?
String -> Exp
Quasi Quotation is the name, I remember!
I remembered seeing packages with heredocs and interpolated strings, like:
string = [qq|The quick {"brown"} $f {"jumps " ++ o} the $num ...|]
where f = "fox"; o = "over"; num = 3
Which, as far as I knew, during compile-time, turns into
string = "The quick " ++ "brown" ++ " " ++ $f ++ "jumps " ++ o ++ " the" ++ show num ++ " ..."
where f = "fox"; o = "over"; num = 3
And I thought to myself: if they can do it, I should be able to do it too!
A bit of digging in their source code revealed the QuasiQuoter type.
data QuasiQuoter = QuasiQuoter {quoteExp :: String -> Q Exp}
Bingo, this is what I want! Give me the source code as string! Ideally, I wouldn't mind returning string either, but maybe this will work. At this point I still know quite little about Q Exp.
After all, in theory, I would just need to split the string on commas, map over it, duplicate the elements so that first part stays string and the second part becomes Haskell source code, which is passed to show.
Turning this:
[d|a+1|]
into this:
"a+1" ++ " = " ++ show (a+1)
Sounds easy, right?
Well, it turns out that even though GHC most obviously is capable to parse haskell source code, it doesn't expose that function. Or not in any way we know of.
I find it strange that we need a third-party package (which thankfully there is at least one called haskell-src-meta) to parse haskell source code for meta programming. Looks to me such an obvious duplication of logic, and potential source of mismatch -- resulting in bugs.
Reluctantly, I started looking into it. After all, if it is good enough for the interpolated-string folks (those packaged did rely on haskell-src-meta) then maybe it will work okay for me too for the time being.
And alas, it does contain the desired function:
Language.Haskell.Meta.Parse.parseExp :: String -> Either String Exp
Language.Haskell.Meta.Parse
From this point it was rather straightforward, except for splitting on commas.
Right now, I do a very simple split on all commas, but that doesn't account for this case:
[d|(1, 2), 3|]
Which fails unfortunatelly. To handle this, I begun writing a parsec parser (again) which turned out to be more difficult than anticipated (again). At this point, I am open to suggestions. Maybe you know of a simple parser that handles the different edge-cases? If so, tell me in a comment, please! I plan on resolving this issue with or without parsec.
But for the most use-cases: it works.
Update at 2015-06-20
Version 0.2.1 and later correctly parses expressions even if they contain commas inside them. Meaning [d|(1, 2), 3|] and similar expressions are now supported.
You can
install it via cabal install dump
and/or
read the source code
Conclusion
During the last week I've learnt quite a bit of Template Haskell and QuasiQuotation, cabal sandboxes, publishing a package to hackage, building haddock docs and publishing them, and some things about Haskell too.
It's been fun.
And perhaps most importantly, I now am able to use this tool for debugging and development, the absence of which has been bugging me for some time. Peace at last.
Thank you #kqr, your engagement with my original question and attempt at solving it gave me enough spark and motivation to continue writing up a full solution.
I've actually almost solved the problem now. Not exactly what you imagined, but fairly close. Maybe someone else can use this as a basis for a better version. Either way, with
{-# LANGUAGE TemplateHaskell, LambdaCase #-}
import Language.Haskell.TH
dump :: ExpQ -> ExpQ
dump tuple =
listE . map dumpExpr . getElems =<< tuple
where
getElems = \case { TupE xs -> xs; _ -> error "not a tuple in splice!" }
dumpExpr exp = [| $(litE (stringL (pprint exp))) ++ " = " ++ show $(return exp)|]
you get the ability to do something like
λ> let x = True
λ> print $(dump [|(not x, x, x == True)|])
["GHC.Classes.not x_1627412787 = False","x_1627412787 = True","x_1627412787 GHC.Classes.== GHC.Types.True = True"]
which is almost what you wanted. As you see, it's a problem that the pprint function includes module prefixes and such, which makes the result... less than ideally readable. I don't yet know of a fix for that, but other than that I think it is fairly usable.
It's a bit syntactically heavy, but that is because it's using the regular [| quote syntax in Haskell. If one wanted to write their own quasiquoter, as you suggest, I'm pretty sure one would also have to re-implement parsing Haskell, which would suck a bit.

What makes a good name for a helper function?

Consider the following problem: given a list of length three of tuples (String,Int), is there a pair of elements having the same "Int" part? (For example, [("bob",5),("gertrude",3),("al",5)] contains such a pair, but [("bob",5),("gertrude",3),("al",1)] does not.)
This is how I would implement such a function:
import Data.List (sortBy)
import Data.Function (on)
hasPair::[(String,Int)]->Bool
hasPair = napkin . sortBy (compare `on` snd)
where napkin [(_, a),(_, b),(_, c)] | a == b = True
| b == c = True
| otherwise = False
I've used pattern matching to bind names to the "Int" part of the tuples, but I want to sort first (in order to group like members), so I've put the pattern-matching function inside a where clause. But this brings me to my question: what's a good strategy for picking names for functions that live inside where clauses? I want to be able to think of such names quickly. For this example, "hasPair" seems like a good choice, but it's already taken! I find that pattern comes up a lot - the natural-seeming name for a helper function is already taken by the outer function that calls it. So I have, at times, called such helper functions things like "op", "foo", and even "helper" - here I have chosen "napkin" to emphasize its use-it-once, throw-it-away nature.
So, dear Stackoverflow readers, what would you have called "napkin"? And more importantly, how do you approach this issue in general?
General rules for locally-scoped variable naming.
f , k, g, h for super simple local, semi-anonymous things
go for (tail) recursive helpers (precedent)
n , m, i, j for length and size and other numeric values
v for results of map lookups and other dictionary types
s and t for strings.
a:as and x:xs and y:ys for lists.
(a,b,c,_) for tuple fields.
These generally only apply for arguments to HOFs. For your case, I'd go with something like k or eq3.
Use apostrophes sparingly, for derived values.
I tend to call boolean valued functions p for predicate. pred, unfortunately, is already taken.
In cases like this, where the inner function is basically the same as the outer function, but with different preconditions (requiring that the list is sorted), I sometimes use the same name with a prime, e.g. hasPairs'.
However, in this case, I would rather try to break down the problem into parts that are useful by themselves at the top level. That usually also makes naming them easier.
hasPair :: [(String, Int)] -> Bool
hasPair = hasDuplicate . map snd
hasDuplicate :: Ord a => [a] -> Bool
hasDuplicate = not . isStrictlySorted . sort
isStrictlySorted :: Ord a => [a] -> Bool
isStrictlySorted xs = and $ zipWith (<) xs (tail xs)
My strategy follows Don's suggestions fairly closely:
If there is an obvious name for it, use that.
Use go if it is the "worker" or otherwise very similar in purpose to the original function.
Follow personal conventions based on context, e.g. step and start for args to a fold.
If all else fails, just go with a generic name, like f
There are two techniques that I personally avoid. One is using the apostrophe version of the original function, e.g. hasPair' in the where clause of hasPair. It's too easy to accidentally write one when you meant the other; I prefer to use go in such cases. But this isn't a huge deal as long as the functions have different types. The other is using names that might connote something, but not anything that has to do with what the function actually does. napkin would fall into this category. When you revisit this code, this naming choice will probably baffle you, as you will have forgotten the original reason that you named it napkin. (Because napkins have 4 corners? Because they are easily folded? Because they clean up messes? They're found at restaurants?) Other offenders are things like bob and myCoolFunc.
If you have given a function a name that is more descriptive than go or h, then you should be able to look at either the context in which it is used, or the body of the function, and in both situations get a pretty good idea of why that name was chosen. This is where my point #3 comes in: personal conventions. Much of Don's advice applies. If you are using Haskell in a collaborative situation, then coordinate with your team and decide on certain conventions for common situations.

Resources