I'm reading Real World Haskell, and I tried to implement the splitLines code myself, and I came up with more or less the same implementation (Chapter 4, page 73):
splitLines :: String -> [String]
splitLines [] = []
splitLines ('\r':a) = splitLines a
splitLines ('\n':a) = splitLines a
splitLines a = let (l,r) = break isCRorNL a
               in l:splitLines r
  where isCRorNL e = ???
--the book defines isCRorNL c = c == '\n' || c == '\r'
However, I've spent far too much time trying to write isCRorNL in the most functional and readable way I could think of, so that I could get rid of the where and turn the last definition of splitLines into an almost-English sentence (just like compare `on` length and the like), without success.
Some sparse thoughts I have been going through:
A lambda, (\c -> c == '\n' || c == '\r'), is just too much power and too little expressiveness for such a simple and specific task;
furthermore, it contains a fair amount of duplicated code and/or it is uselessly verbose.
Whatever I have to put in isCRorNL has to have type Char -> Bool,
therefore it can have any type a1 -> a2 -> ... -> an -> Char -> Bool if I provide it with the first n arguments.
The any function can help me check whether a given character is either '\n' or '\r' or, in other words, whether it is in the list of Chars "\n\r".
Since I want to check for equality, I can pass (==) to my function.
Therefore isCRorNL can be built from a function of type (Char -> Char -> Bool) -> [Char] -> Char -> Bool (or with the first two arguments swapped), to which I pass (==) as the first argument and "\n\r" as the second.
So I was looking for some standard functions I could compose to get such a function.
Finally I gave up and defined it this way: isCRorNL e = any (== e) "\n\r". I think this is quite good as regards extensibility, since I can add as many characters as I like to the "…" and I can change the operator ==. Sadly, I cannot put the function directly where it is used, as I am not able to write it as a partially applied function.
How would you do it?
As soon as I looked for the link in the question and visited it (for the first time), I realized that the code chunks are commented by readers, and the first comment under splitLines reads:
augustss 2008-04-23
[...] If you're making a point about functional
style maybe you should use
isLineSeparator = (`elem` "\r\n")
So it turns out I was thinking too much about composing functions, while the easiest solution lies in partially applying a very simple function, elem. The drawback is that the operator used to check for equality is built into elem and cannot be changed. Nonetheless, I feel dumb for not having thought of elem myself.
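As a rough sketch of how that suggestion fits into the function, the section can be passed straight to break. The name matchesAnyBy is a hypothetical helper made up for illustration, not a library function; it shows one way to get the comparison operator back as a parameter.

```haskell
-- splitLines with the operator section inlined; no where clause needed.
splitLines :: String -> [String]
splitLines [] = []
splitLines ('\r':a) = splitLines a
splitLines ('\n':a) = splitLines a
splitLines a = let (l, r) = break (`elem` "\r\n") a
               in l : splitLines r

-- Hypothetical helper: behaves like (`elem` xs), but the comparison is a
-- parameter instead of the built-in (==).
matchesAnyBy :: (a -> a -> Bool) -> [a] -> a -> Bool
matchesAnyBy eq xs x = any (eq x) xs

-- Partially applied, so it can also be written directly at the use site.
isCRorNL :: Char -> Bool
isCRorNL = matchesAnyBy (==) "\r\n"
```

Like the version in the question, this drops empty lines, because every leading '\r' or '\n' is skipped.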
There is a technique I've seen a few times with foldr. It involves using a function in place of the accumulator in a foldr. I'm wondering when it is necessary to do this, as opposed to using an accumulator that is just a regular value.
Most people have seen this technique before when using foldr to define foldl:
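{-# LANGUAGE ScopedTypeVariables #-}
-- ScopedTypeVariables lets the local signature of f refer to the a and b bound by forall.
myFoldl :: forall a b. (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil as = foldr f id as nil
  where
    f :: a -> (b -> b) -> b -> b
    f a continuation b = continuation $ accum b a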
Here, the type of the combining function f is not just a -> b -> b like normal, but a -> (b -> b) -> b -> b. It takes not only an a and b, but a continuation (b -> b) that we need to pass the b to in order to get the final b.
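To see the continuation at work, here is a hand expansion on a small input, written as comments. This is only a sketch; the local type signature is dropped so that the block needs no language extensions.

```haskell
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil as = foldr f id as nil
  where
    f a continuation b = continuation (accum b a)

-- Step-by-step expansion of myFoldl (-) 10 [1, 2, 3]:
--   foldr f id [1, 2, 3] 10
-- = f 1 (f 2 (f 3 id)) 10
-- = f 2 (f 3 id) (10 - 1)
-- = f 3 id ((10 - 1) - 2)
-- = id (((10 - 1) - 2) - 3)
-- = 4
example :: Int
example = myFoldl (-) 10 [1, 2, 3]
```

Each call to f hands the updated accumulator to the continuation built from the rest of the list, which is why the subtraction nests to the left, exactly as with foldl.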
I most recently saw an example of using this trick in the book Parallel and Concurrent Programming in Haskell. Here is a link to the source code of the example using this trick. Here is a link to the chapter of the book explaining this example.
I've taken the liberty of simplifying the source code into a similar (but shorter) example. Below is a function that takes a list of Strings, prints out whether each string's length is greater than five, then prints the full list of only the Strings that have a length greater than five:
import Text.Printf
stringsOver5 :: [String] -> IO ()
stringsOver5 strings = foldr f (print . reverse) strings []
  where
    f :: String -> ([String] -> IO ()) -> [String] -> IO ()
    f str continuation strs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then continuation $ str : strs
        else continuation strs
Here's an example of using it from GHCi:
> stringsOver5 ["subdirectory", "bye", "cat", "function"]
Working on "subdirectory", greater than 5? True
Working on "bye", greater than 5? False
Working on "cat", greater than 5? False
Working on "function", greater than 5? True
["subdirectory","function"]
Just like in the myFoldl example, you can see that the combining function f is using the same trick.
However, it occurred to me that this stringsOver5 function could probably be written without this trick:
stringsOver5PlainFoldr :: [String] -> IO ()
stringsOver5PlainFoldr strings = foldr f (pure []) strings >>= print
  where
    f :: String -> IO [String] -> IO [String]
    f str ioStrs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then fmap (str :) ioStrs
        else ioStrs
(Although maybe you could make the argument that IO [String] is a continuation?)
I have two questions regarding this:
Is it ever absolutely necessary to use this trick of passing a continuation to foldr, instead of using foldr with a normal value as an accumulator? Is there an example of a function that absolutely can't be written using foldr with a normal value? (Aside from foldl and functions like that, of course.)
When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?
Is there any sort of performance considerations to take into account when using this trick? (Or, well, when not using this trick?)
I have two questions regarding this:
For some large value of "two" :-P
Is it ever absolutely necessary to use this trick of passing a continuation to foldr? Is there an example of a function that absolutely can't be written without this trick? (Aside from foldl and functions like that, of course.)
No, never. Each foldr invocation can always be replaced by explicit recursion.
One should use foldr and other well-known library functions when they make the code simpler. When they do not, one should not shoehorn the code so that it fits the foldr pattern.
There is no shame in using plain recursion, when there is no obvious replacement.
Compare your code with this, for instance:
stringsOver5 :: [String] -> IO ()
stringsOver5 strings = go strings []
  where
    go :: [String] -> [String] -> IO ()
    go [] acc = print (reverse acc)
    go (s:ss) acc = do
      let isGreaterThan5 = length s > 5
      printf "Working on \"%s\", greater than 5? %s\n" s (show isGreaterThan5)
      if isGreaterThan5
        then go ss (s:acc)
        else go ss acc
When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?
In my humble opinion, almost never.
Personally, I find "calling foldr with four (or more) arguments" to be an anti-pattern in most cases. It is not much shorter than explicit recursion, and it has the potential to be much less readable.
I would argue that this "idiom" is quite puzzling to any Haskeller who has not seen it before. It is something of an acquired taste, so to speak.
Perhaps it could be a good idea to use this style when the continuation functions are meaningful on their own. For example, when lists are represented as difference lists, concatenating a regular list of difference lists can be quite elegant:
foldr (.) id listOfDLists []
This is beautiful, even if the final [] might be puzzling at first.
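A minimal sketch of that difference-list idea, for readers who have not met it. The names DList, fromList and concatDLists are chosen here for illustration and are not taken from a particular library.

```haskell
-- A difference list represents a list by the function that prepends it.
type DList a = [a] -> [a]

-- Hypothetical helper: turn an ordinary list into a difference list.
fromList :: [a] -> DList a
fromList xs = (xs ++)

-- Compose all the prepending functions, then apply the result to [].
concatDLists :: [DList a] -> [a]
concatDLists listOfDLists = foldr (.) id listOfDLists []
```

The trailing [] is simply the argument that turns the composed function back into an ordinary list.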
Is there any sort of performance considerations to take into account when using this trick? (Or, well, when not using this trick?)
Performance should be essentially the same as using explicit recursion. GHC could even generate the exact same code.
Perhaps using foldr could help GHC fire some fold/build optimization rules, but I'm unsure about the need to do that when using continuations.
I currently have the following code in Haskell
splitStringOnDelimeter :: String -> Char -> [String]
splitStringOnDelimeter "" delimeter = return [""]
splitStringOnDelimeter string delimeter = do
    let split = splitStringOnDelimeter (tail string) delimeter
    if head string == delimeter
        then return ([""] ++ split)
        else return ( [( [(head string)] ++ (head split) )] ++ (tail split))
If I evaluate sample values for the returned expressions in a Haskell terminal (i.e. https://www.tryhaskell.org), such as ( [( [(head "ZZZZ")] ++ (head ["first", "second", "third"]) )] ++ (tail ["first", "second", "third"])) or [""] ++ ["first", "second", "third"] or [""], then the terminal reports the types I expect, which differs from what my local stack compiler reports. Furthermore, if I change the top return statement to return "", the compiler no longer complains about that statement, which I'm pretty sure is incorrect.
My local compiler works fine with the rest of my Haskell codebase which is why I think it might be something wrong with my code...
One of the unfortunate things in the design of the Monad typeclass is that it introduced a function called return. In many imperative programming languages return is a keyword that hands a value back to the caller, but in Haskell return has a totally different meaning: it does not really return anything.
You can solve the problem by dropping the return:
splitStringOnDelimeter :: String -> Char -> [String]
splitStringOnDelimeter "" delimeter = [""]
splitStringOnDelimeter string delimeter =
    let split = splitStringOnDelimeter (tail string) delimeter in
    if head string == delimeter
        then ([""] ++ split)
        else ( [( [(head string)] ++ (head split) )] ++ (tail split))
The function return :: Monad m => a -> m a wraps a value (of type a) in a monad. Since your signature says the result is a list, Haskell will assume that you mean the list monad. That means your return wraps [""] in another list, so with return [""] you have implicitly written (in this context) [[""]], and this of course does not match [String].
The same goes for do: it again makes the function monadic, but your function has little to do with monads.
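The wrapping done by return in the list monad can be checked directly. A small sketch, with the names wrapped and plain made up for illustration:

```haskell
-- In the list monad, return puts its argument into a one-element list.
wrapped :: [[String]]
wrapped = return [""]          -- the same value as [[""]]

-- The plain value already has the type the function promises.
plain :: [String]
plain = [""]
```

Trying to give wrapped the type [String] instead is exactly the mismatch the compiler complains about.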
Note that the name return is not per se bad, but since nearly all imperative languages attach an (almost) equivalent meaning to it, most people assume that it works the same way in functional languages, but it does not.
Mind that you use functions like head, tail, etc. These are usually seen as anti-patterns: you can use pattern matching instead. We can rewrite this to:
splitStringOnDelimeter :: String -> Char -> [String]
splitStringOnDelimeter "" delimeter = [""]
splitStringOnDelimeter (h:t) delimeter | h == delimeter = "" : split
                                       | otherwise = (h : sh) : st
    where split@(sh:st) = splitStringOnDelimeter t delimeter
By using pattern matching, we know for sure that the string has a head h and a tail t, and we can use these directly in the expression. This makes the expression shorter as well as more readable. Although if-then-else clauses are not per se anti-patterns, personally I think guards are syntactically cleaner. We thus use a where clause in which we call splitStringOnDelimeter t delimeter, and we pattern match the result with split (as well as with (sh:st)). We know that this will always match, since both the base case and the inductive case always produce a list with at least one element. This again allows us to write a neat expression where we can use sh and st directly, instead of calling head and tail.
If I test this function locally, I get:
Prelude> splitStringOnDelimeter "foo!bar!!qux" '!'
["foo","bar","","qux"]
As a take-away message, I think you had better avoid return and do unless you know what this function and this keyword (do is a keyword) really mean. In Haskell they have a different meaning than in imperative languages.
return has type forall m a. Monad m => a -> m a.
The output type of the function splitStringOnDelimeter is [String], so if you try to produce an output value using return, the compiler will infer that you want to provide some m a, thus instantiating m to [] (which is indeed an instance of the Monad typeclass) and a to String. It follows that the compiler will now expect a String as the argument of return. This expectation is violated in, for example, return ([""] ++ split), because here the argument of return, namely [""] ++ split, has type [String] rather than String.
do is used as a convenient notation for monadic code, so you should rely on it only if you are interested in using the monadic operations of the output type. In this case, you really just want to manipulate lists using pure functions.
I'll add my 2 cents and suggest a solution. I used a foldr, that is a simple instance of a recursion scheme. Recursion schemes like foldr capture common patterns of computation; they make recursive definitions clear, easy to reason about, and total by construction.
I also took advantage of the fact that the output list is always non-empty, so I wrote that in the type. By being more precise about my intentions, I now know that split, the result of the recursive call, is a NonEmpty String, so I can use the total functions head and tail (from Data.List.NonEmpty), because a non-empty list always has a head and a tail.
import Data.List.NonEmpty as NE (NonEmpty(..), (<|), head, tail)
splitStringOnDelimeter :: String -> Char -> NonEmpty String
splitStringOnDelimeter string delimiter = foldr f (pure "") string
  where f h split = if h == delimiter
                      then ("" <| split)
                      else (h : NE.head split) :| NE.tail split
If I want to add a space at the end of a character to return a list, how would I accomplish this with partial application if I am passing no arguments?
Also, what would the type be?
space :: Char -> [Char]
I'm having trouble adding a space at the end due to a 'parse error' by using the ++ and the : operators.
What I have so far is:
space :: Char -> [Char]
space = ++ ' '
Any help would be much appreciated! Thanks
Doing what you want is so common in Haskell it's got its own syntax, but being Haskell, it's extraordinarily lightweight. For example, this works:
space :: Char -> [Char]
space = (:" ")
so you weren't far off a correct solution. ([Char] is the same as String. " " is the string containing the character ' '.) Let's look at using a similar function first to get the hang of it. There's a function in a library called equalFilePath :: FilePath -> FilePath -> Bool, which is used to test whether two filenames or folder names represent the same thing. (This solves the problem that on unix, mydir isn't the same as MyDir, but on Windows it is.) Perhaps I want to check a list to see if it's got the file I want:
isMyBestFile :: FilePath -> Bool
isMyBestFile fp = equalFilePath "MyBestFile.txt" fp
but since functions gobble their first argument first, then return a new function to gobble the next, etc, I can write that shorter as
isMyBestFile = equalFilePath "MyBestFile.txt"
This works because equalFilePath "MyBestFile.txt" is itself a function that takes one argument: its type is FilePath -> Bool. This is partial application, and it's super-useful. Maybe I don't want to bother writing a separate isMyBestFile function, but want to check whether any of my list has it:
hasMyBestFile :: [FilePath] -> Bool
hasMyBestFile fps = any (equalFilePath "MyBestFile.txt") fps
or just the partially applied version again:
hasMyBestFile = any (equalFilePath "MyBestFile.txt")
Notice how I need to put brackets round equalFilePath "MyBestFile.txt", because if I wrote any equalFilePath "MyBestFile.txt", then any would try to use just equalFilePath without the "MyBestFile.txt", because functions gobble their first argument first. (For reference, any :: (a -> Bool) -> [a] -> Bool.)
Now some functions are infix operators - taking their arguments from before and after, like == or <. In Haskell these are just regular functions, not hard-wired into the compiler (but have precedence and associativity rules specified). What if I was a unix user who never heard of equalFilePath and didn't care about the portability problem it solves, then I would probably want to do
hasMyBestFile = any ("MyBestFile.txt" ==)
and it would work, just the same, because == is a regular function. When you do that with an operator function, it's called an operator section.
It can work at the front or the back:
hasMyBestFile = any (== "MyBestFile.txt")
and you can do it with any operator you like:
hassmalls = any (< 5)
and a handy operator for lists is :. : takes an element on the left and a list on the right, making a new list of the two after each other, so 'Y':"es" gives you "Yes". (Secretly, "Yes" is actually just shorthand for 'Y':'e':'s':[] because : is a constructor/elemental-combiner-of-values, but that's not relevant here.) Using : we can define
space c = c:" "
and we can get rid of the c as usual
space = (:" ")
which hopefully makes more sense to you now.
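One detail worth a quick sketch: for operators that are not commutative, the side on which the argument is left out matters. The names below are made up for illustration.

```haskell
-- (2 ^) fixes the left operand: each list element becomes the exponent.
powersOfTwo :: [Integer]
powersOfTwo = map (2 ^) [1, 2, 3 :: Int]

-- (^ 2) fixes the right operand: each list element becomes the base.
squares :: [Integer]
squares = map (^ (2 :: Int)) [1, 2, 3]

-- The same applies to (:): (: " ") supplies the tail, ('>' :) supplies the head.
quoted :: String
quoted = ('>' :) "text"
```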
What you want here is an operator section. For that, you'll need to surround the application with parentheses, i.e.
space = (: " ")
which is syntactic sugar for
space = (\x -> x : " ")
(++) won't work here because it expects a list, not a single Char, as its first argument; compare:
(:) :: a -> [a] -> [a]
(++) :: [a] -> [a] -> [a]
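If you do want to use (++), one way to make the types line up is to turn the character into a one-element list first. This is only a sketch, and space' is a name chosen here for illustration.

```haskell
space :: Char -> String
space = (: " ")

-- (: []) wraps the character in a list, then (++ " ") appends the space.
space' :: Char -> String
space' = (++ " ") . (: [])
```

Both definitions give the same result for every character; the section with (:) is simply the shorter of the two.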
I have a function that takes an Int and returns its integer square root. However, I now want to modify it so that it takes an array of integers and gives back an array with the square roots of the elements of the first array. I know Haskell does not use loops, so how can this modification be done? Thanks.
intSquareRoot :: Int -> Int
intSquareRoot n = try n where
    try i | i*i > n = try (i - 1)
          | i*i <= n = i
Don't.
The idea of “looping through some collection”, putting each result in the corresponding slot of its input, is a somewhat trivial, extremely common pattern. Patterns are for OO programmers. In Haskell, when there's a pattern, we want to abstract over it, i.e. give it a simple name that we can always re-use without extra boilerplate.
This particular “pattern” is the functor operation [1]. For lists it's called
map :: (a->b) -> [a]->[b]
more generally (e.g. it'll also work with real arrays; lists aren't actually arrays),
class Functor f where
    fmap :: (a->b) -> f a->f b
So instead of defining an extra function
intListSquareRoot :: [Int] -> [Int]
intListSquareRoot = ...
you simply use map intSquareRoot right where you wanted to use that function.
Of course, you could also define that “lifted” version of intSquareRoot,
intListSquareRoot = map intSquareRoot
but that gains you practically nothing over simply inlining the map call right where you need it.
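A short usage sketch, assuming the intSquareRoot from the question (with its second guard written as otherwise). The names roots and maybeRoot are made up for illustration; the second one shows fmap doing the same job for a container that is not a list.

```haskell
intSquareRoot :: Int -> Int
intSquareRoot n = try n where
    try i | i*i > n   = try (i - 1)
          | otherwise = i

-- map applies the function to every element of the list.
roots :: [Int]
roots = map intSquareRoot [0, 1, 8, 9, 26]

-- fmap is the same idea for any Functor, here Maybe.
maybeRoot :: Maybe Int
maybeRoot = fmap intSquareRoot (Just 17)
```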
If you insist
That said... it's of course valid to wonder how map itself works. Well, you can manually “loop” through a list by recursion:
map' :: (a->b) -> [a]->[b]
map' _ [] = []
map' f (x:xs) = f x : map' f xs
Now, you could inline your specific function here
intListSquareRoot' :: [Int] -> [Int]
intListSquareRoot' [] = []
intListSquareRoot' (x:xs) = intSquareRoot x : intListSquareRoot' xs
This is not only much more clunky and awkward than quickly inserting the map magic word, it will also often be slower: compilers such as GHC can make better optimisations when they work on higher-level concepts [2] such as folds than when they have to work again and again with manually defined recursion.
[1] Not to be confused with what many C++ programmers call a “functor”. Haskell uses the word in the correct mathematical sense, which comes from category theory.
[2] This is why languages such as Matlab and APL actually achieve decent performance for special applications, although they are dynamically-typed, interpreted languages: they have this special case of “vector looping” hard-coded into their very syntax. (Unfortunately, this is pretty much the only thing they can do well...)
You can use map:
arraySquareRoot = map intSquareRoot
I've just started trying to learn Haskell and functional programming. I'm trying to write a function that converts a binary string into its decimal equivalent. Could someone please point out why I keep getting this error:
"BinToDecimal.hs":19 - Syntax error in expression (unexpected `}', possibly due to bad layout)
module BinToDecimal where
total :: [Integer]
total = []
binToDecimal :: String -> Integer
binToDecimal a = if (null a) then (sum total)
                 else if (head a == "0") then binToDecimal (tail a)
                 else if (head a == "1") then total ++ (2^((length a)-1))
                 binToDecimal (tail a)
So, total may not be doing what you think it is. total isn't a mutable variable that you're changing, it will always be the empty list []. I think your function should include another parameter for the list you're building up. I would implement this by having binToDecimal call a helper function with the starting case of an empty list, like so:
binToDecimal :: String -> Integer
binToDecimal s = binToDecimal' s []
binToDecimal' :: String -> [Integer] -> Integer
-- implement binToDecimal' here
In addition to what @Sibi has said, I would highly recommend using pattern matching rather than nested if-else. For example, I'd implement the base case of binToDecimal' like so:
binToDecimal' :: String -> [Integer] -> Integer
binToDecimal' "" total = sum total -- when the first argument is the empty string, just sum total. Equivalent to `if (null a) then (sum total)`
-- Include other pattern matching statements here to handle your other if/else cases
If you think it'd be helpful, I can provide the full implementation of this function instead of giving tips.
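For reference, here is one way the helper could be completed with pattern matching. This is a sketch, not the only possible design, and it assumes that any character other than '1' should be treated like '0', which the question does not specify.

```haskell
binToDecimal :: String -> Integer
binToDecimal s = binToDecimal' s []

binToDecimal' :: String -> [Integer] -> Integer
-- No digits left: add up the collected powers of two.
binToDecimal' "" total = sum total
-- A '1' contributes 2 to the power of the number of digits after it.
binToDecimal' ('1':rest) total = binToDecimal' rest (2 ^ length rest : total)
-- Assumption: every other character contributes nothing.
binToDecimal' (_:rest) total = binToDecimal' rest total
```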
Ok, let me give you hints to get you started:
You cannot write head a == "0" because "0" is a String. Since the type of a is [Char], the type of head a is Char, and you have to compare it with a Char. You can fix this with head a == '0'. Note that "0" and '0' are different.
Similarly, rectify your type error in head a == "1"
This won't typecheck: total ++ (2^((length a)-1)), because the type of total is [Integer] and the type of (2^((length a)-1)) is Integer. For ++ to typecheck, both arguments must be lists of the same type.
You are possibly missing an else at the end (before the code binToDecimal (tail a)).
That being said, instead of using nested if else expression, try to use guards as they will increase the readability greatly.
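A sketch of the same function written with guards, after applying the fixes above. It drops the total list entirely and adds the contributions directly; treating every character other than '1' as a zero is an assumption made here, not something the question states.

```haskell
binToDecimal :: String -> Integer
binToDecimal a
  | null a        = 0
  -- A leading '1' is worth 2 to the power of the number of digits after it.
  | head a == '1' = 2 ^ (length a - 1) + binToDecimal (tail a)
  -- Assumption: '0' and any other character add nothing.
  | otherwise     = binToDecimal (tail a)
```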
There are many things we can improve here (but no worries, this is perfectly normal in the beginning, there is so much to learn when we start Haskell!!!).
First of all, a string is definitely not an appropriate way to represent a binary number, because nothing prevents us from writing "éaldkgjasdg" in place of a proper binary number. So the first thing is to define our binary type:
data Binary = Zero | One deriving (Show)
We just say that it can be Zero or One. The deriving (Show) will allow us to have the result displayed when run in GHCI.
In Haskell, to solve a problem we tend to start with a more general case and then specialise it to our particular case. What we need here is a function with an additional argument that holds the running total. Note the use of pattern matching instead of ifs, which makes the function easier to read.
binToDecimalAcc :: [Binary] -> Integer -> Integer
binToDecimalAcc [] acc = acc
binToDecimalAcc (Zero:xs) acc = binToDecimalAcc xs acc
binToDecimalAcc (One:xs) acc = binToDecimalAcc xs $ acc + 2^(length xs)
Finally, since we want to pass only a single parameter, we define our specific function, where the acc value is 0:
binToDecimal :: [Binary] -> Integer
binToDecimal binaries = binToDecimalAcc binaries 0
We can run a test in GHCI:
test1 = binToDecimal [One, Zero, One, Zero, One, Zero]
> 42
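Calling length xs at every step walks the remaining list again each time. As an alternative sketch, using a different technique from the one above (a left fold that doubles the running total, often called Horner's method), the conversion can be done in a single pass. The names bitValue and binToDecimalFold are made up for illustration.

```haskell
data Binary = Zero | One deriving (Show)

-- Numeric value of a single binary digit.
bitValue :: Binary -> Integer
bitValue Zero = 0
bitValue One  = 1

-- Each step shifts the running total left by one bit and adds the new digit.
binToDecimalFold :: [Binary] -> Integer
binToDecimalFold = foldl (\acc b -> 2 * acc + bitValue b) 0
```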
OK, all fine, but what if you really need to convert a string to a decimal? Then, we need a function able to convert this string to a binary. The problem as seen above is that not all strings are proper binaries. To handle this, we will need to report some sort of error. The solution I will use here is very common in Haskell: it is to use "Maybe". If the string is correct, it will return "Just result" else it will return "Nothing". Let's see that in practice!
The first function we will write is to convert a char to a binary. As discussed above, Nothing represents an error.
charToBinary :: Char -> Maybe Binary
charToBinary '0' = Just Zero
charToBinary '1' = Just One
charToBinary _ = Nothing
Then, we can write a function for a whole string (which is a list of Char). So [Char] is equivalent to String. I used it here to make clearer that we are dealing with a list.
stringToBinary :: [Char] -> Maybe [Binary]
stringToBinary [] = Just []
stringToBinary chars = mapM charToBinary chars
The function mapM is a kind of variation of map which acts on monads (Maybe is actually a monad). To learn about monads I recommend reading Learn You a Haskell for Great Good!
http://learnyouahaskell.com/a-fistful-of-monads
We can notice once more that if there are any errors, Nothing will be returned.
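To make the behaviour of mapM with Maybe concrete, here is a small standalone sketch. The helper digit and the two example values are hypothetical names introduced for illustration only.

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical helper: succeeds only for decimal digit characters.
digit :: Char -> Maybe Int
digit c
  | isDigit c = Just (digitToInt c)
  | otherwise = Nothing

-- Every call succeeds, so the results are collected inside a single Just.
allGood :: Maybe [Int]
allGood = mapM digit "123"

-- 'x' fails, so the whole result is Nothing.
oneBad :: Maybe [Int]
oneBad = mapM digit "1x3"
```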
A dedicated function to convert strings holding binaries can now be written.
binStringToDecimal :: [Char] -> Maybe Integer
binStringToDecimal = fmap binToDecimal . stringToBinary
The "." function (function composition) allows us to define this function as the composition of two other functions, so we do not need to mention the parameter (point-free notation).
The fmap function allows us to run binToDecimal (which expects a [Binary] as its argument) on the result of stringToBinary (which is of type "Maybe [Binary]"). Once again, Learn You a Haskell... is a very good reference to learn more about fmap:
http://learnyouahaskell.com/functors-applicative-functors-and-monoids
Now, we can run a second test:
test2 = binStringToDecimal "101010"
> Just 42
And finally, we can test our error handling system with a mistake in the string:
test3 = binStringToDecimal "102010"
> Nothing