Haskell - Couldn't match type `[Char]' with `Char' - haskell

I currently have the following code in Haskell
splitStringOnDelimeter :: String -> Char -> [String]
splitStringOnDelimeter "" delimeter = return [""]
splitStringOnDelimeter string delimeter = do
    let split = splitStringOnDelimeter (tail string) delimeter
    if head string == delimeter
        then return ([""] ++ split)
        else return ( [( [(head string)] ++ (head split) )] ++ (tail split))
If I evaluate the expressions passed to return in a Haskell terminal (e.g. https://www.tryhaskell.org), such as ( [( [(head "ZZZZ")] ++ (head ["first", "second", "third"]) )] ++ (tail ["first", "second", "third"])) or [""] ++ ["first", "second", "third"] or [""], then the terminal reports the types I expect, which differs from what my local stack compiler reports. Furthermore, if I change the top return statement to return "" then the compiler doesn't complain about that statement, which I'm pretty sure is incorrect.
My local compiler works fine with the rest of my Haskell codebase which is why I think it might be something wrong with my code...

One of the unfortunate things in the design of the Monad typeclass, is that they introduced a function called return. But although in many imperative programming languages return is a keyword to return content, in Haskell return has a totally different meaning, it does not really return something.
You can solve the problem by dropping the return:
splitStringOnDelimeter :: String -> Char -> [String]
splitStringOnDelimeter "" delimeter = [""]
splitStringOnDelimeter string delimeter =
    let split = splitStringOnDelimeter (tail string) delimeter in
    if head string == delimeter
        then ([""] ++ split)
        else ( [( [(head string)] ++ (head split) )] ++ (tail split))
The function return :: Monad m => a -> m a is used to wrap a value (of type a) in a monad. Since your signature here mentions a list, Haskell will assume that you mean the list monad. So your return would wrap [""] in another list: with return [""] you have implicitly written (in this context) [[""]], and this of course does not match [String].
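As a small illustration of that point (the names wrapped and plain are made up for this sketch), the two definitions below only typecheck with the signatures shown, which suggests that return adds an extra list layer:

```haskell
-- In the list monad, return wraps its argument in a singleton list,
-- so return [""] has type [[String]], not [String].
wrapped :: [[String]]
wrapped = return [""]

-- The plain value, without return, is what a [String] signature asks for.
plain :: [String]
plain = [""]
```

Changing the signature of wrapped to [String] should reproduce a "Couldn't match type" error of the kind in the question.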
The same goes for do: again you then make a monadic function, but here your function has not much to do with monads.
Note that the name return is not per se bad, but since nearly all imperative languages attach an (almost) equivalent meaning to it, most people assume that it works the same way in functional languages, but it does not.
Mind that you use functions like head, tail, etc. These are usually seen as anti-patterns: you can use pattern matching instead. We can rewrite this to:
splitStringOnDelimeter :: String -> Char -> [String]
splitStringOnDelimeter "" delimeter = [""]
splitStringOnDelimeter (h:t) delimeter | h == delimeter = "" : split
                                       | otherwise = (h : sh) : st
    where split@(sh:st) = splitStringOnDelimeter t delimeter
By using pattern matching, we know for sure that the string has a head h and a tail t, and we can use these directly in the expression. This makes the expression shorter as well as more readable. Although if-then-else clauses are not per se anti-patterns, personally I think guards are syntactically cleaner. We thus use a where clause here in which we call splitStringOnDelimeter t delimeter, and we pattern match the result with split (as well as with (sh:st)). We know that this will always match, since both the base case and the inductive case always produce a list with at least one element. This again allows us to write a neat expression where we can use sh and st directly, instead of calling head and tail.
If I test this function locally, I got:
Prelude> splitStringOnDelimeter "foo!bar!!qux" '!'
["foo","bar","","qux"]
As take-away message, I think you better avoid using return, and do, unless you know what this function and keyword (do is a keyword) really mean. In the context of functional programming these have a different meaning.

return has type forall m a. Monad m => a -> m a.
The output type of the function splitStringOnDelimeter is [String], so if you try to write some output value using return, the compiler will infer that you want to provide some m a, thus instantiating m to [] (which is indeed an instance of the Monad typeclass), and a to String. It follows that the compiler will now expect some String to be used as the argument of return. This expectation is violated in, for example, return ([""] ++ split), because here the argument of return, namely [""] ++ split, has type [String] rather than String.
do is used as a convenient notation for monadic code, so you should rely on it only if you are interested in using the monadic operations of the output type. In this case, you really just want to manipulate lists using pure functions.
I'll add my 2 cents and suggest a solution. I used a foldr, that is a simple instance of a recursion scheme. Recursion schemes like foldr capture common patterns of computation; they make recursive definitions clear, easy to reason about, and total by construction.
I also took advantage of the fact that the output list is always non-empty, so I wrote it in the type. By being more precise about my intentions, I now know that split, the result of the recursive call, is a NonEmpty String, so I can use the total functions head and tail (from Data.List.NonEmpty), because a non-empty list always has a head and a tail.
import Data.List.NonEmpty as NE (NonEmpty(..), (<|), head, tail)
splitStringOnDelimeter :: String -> Char -> NonEmpty String
splitStringOnDelimeter string delimiter = foldr f (pure "") string
  where f h split = if h == delimiter
                      then ("" <| split)
                      else (h : NE.head split) :| NE.tail split
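If a plain [String] is still needed at the call site, the NonEmpty result can be converted back with NE.toList. The sketch below restates the fold under the hypothetical name splitNE (with guards instead of if-then-else) and adds a hypothetical wrapper splitToList; it is one possible arrangement, not the only one:

```haskell
import Data.List.NonEmpty (NonEmpty (..), (<|))
import qualified Data.List.NonEmpty as NE

-- Same fold as above: the accumulator is never empty,
-- so NE.head and NE.tail are total here.
splitNE :: String -> Char -> NonEmpty String
splitNE string delimiter = foldr f (pure "") string
  where
    f h split
      | h == delimiter = "" <| split
      | otherwise = (h : NE.head split) :| NE.tail split

-- Hypothetical wrapper that forgets the non-emptiness guarantee.
splitToList :: String -> Char -> [String]
splitToList string delimiter = NE.toList (splitNE string delimiter)
```

For example, splitToList "foo!bar!!qux" '!' evaluates to ["foo","bar","","qux"], matching the output of the list-based version.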

Related

Unary predicate to check for a character being in string

I'm reading Real World Haskell, and I tried to implement the splitLines code myself, and I came up with more or less the same implementation (Chapter 4, page 73):
splitLines :: String -> [String]
splitLines [] = []
splitLines ('\r':a) = splitLines a
splitLines ('\n':a) = splitLines a
splitLines a = let (l,r) = break isCRorNL a
               in l:splitLines r
  where isCRorNL e = ???
        --the book defines isCRorNL c = c == '\n' || c == '\r'
However, I've been spending definitely too much time trying to write isCRorNL in the most functional and readable way I could think of, so that I can get rid of the where and turn the last definition of splitLines into an almost-English sentence (just like compare `on` length and the like), without success.
Some sparse thoughts I have been going through:
A lambda, (\c -> c == '\n' || c == '\r'), is just too much power and too little expressiveness for such a simple and specific task;
furthermore, it contains a fair amount of duplicated code and/or it is uselessly verbose.
Whatever I have to put in isCRorNL has to have type Char -> Bool,
therefore it can have any type a1 -> a2 -> ... -> an -> Char -> Bool if I provide it with the first n arguments.
The any function can help me checking if a given character is either '\n' or '\r' or, in other words, if it is in the list of Chars "\n\r".
Since I want to check for equality, I can pass (==) to my function.
Therefore isCRorNL can have type (Char -> Char -> Bool) -> [Char] -> Char -> Bool (or with the first two arguments swapped), and I can pass to it (==) as the first argument and "\n\r" as the second argument.
So I was looking for some standard functions I could compose to get such a function.
Finally I gave up and defined it this way: isCRorNL e = any (== e) "\n\r"; I think this is quite good as regards extensibility, as I can add as many characters in the "…", and I can change the operator ==; sadly I cannot put the function directly where it is used, as I am not able to write it as a partially applied function.
How would you do it?
As soon as I looked for the link in the question and visited it (for the first time), I realized that the code chunks are commented by readers, and the first comment under splitLines reads:
augustss 2008-04-23
[...] If you're making a point about functional
style maybe you should use
isLineSeparator = (`elem` "\r\n")
So it turns out I was thinking too much about composition of functions, while the easiest solution lies in the partial application of such a simple function, elem. The drawback here is that the operator used to check for equality is built into elem and cannot be changed. Nonetheless I feel dumb for not having thought of elem myself.
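For completeness, here is how the definition from the question might look with the elem section inlined, so that no where clause is needed. This is only a sketch of the question's variant (which skips empty lines), not the book's code:

```haskell
-- splitLines with the separator test written as an operator section on elem.
splitLines :: String -> [String]
splitLines [] = []
splitLines ('\r' : rest) = splitLines rest
splitLines ('\n' : rest) = splitLines rest
splitLines text =
  let (line, rest) = break (`elem` "\r\n") text
   in line : splitLines rest
```

With this definition, splitLines "a\nb\r\nc" evaluates to ["a","b","c"].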

Haskell: Why ++ is not allowed in pattern matching?

Suppose we want to write our own sum function in Haskell:
sum' :: (Num a) => [a] -> a
sum' [] = 0
sum' (x:xs) = x + sum' xs
Why can't we do something like:
sum' :: (Num a) => [a] -> a
sum' [] = 0
sum' (xs++[x]) = x + sum' xs
In other words why can't we use ++ in pattern matching ?
This is a deserving question, and it has so far received sensible answers (mutter only constructors allowed, mutter injectivity, mutter ambiguity), but there's still time to change all that.
We can say what the rules are, but most of the explanations for why the rules are what they are start by over-generalising the question, addressing why we can't pattern match against any old function (mutter Prolog). This is to ignore the fact that ++ isn't any old function: it's a (spatially) linear plugging-stuff-together function, induced by the zipper-structure of lists. Pattern matching is about taking stuff apart, and indeed, notating the process in terms of the plugger-togetherers and pattern variables standing for the components. Its motivation is clarity. So I'd like
lookup :: Eq k => k -> [(k, v)] -> Maybe v
lookup k (_ ++ [(k, v)] ++ _) = Just v
lookup _ _ = Nothing
and not only because it would remind me of the fun I had thirty years ago when I implemented a functional language whose pattern matching offered exactly that.
The objection that it's ambiguous is a legitimate one, but not a dealbreaker. Plugger-togetherers like ++ offer only finitely many decompositions of finite input (and if you're working on infinite data, that's your own lookout), so what's involved is at worst search, rather than magic (inventing arbitrary inputs that arbitrary functions might have thrown away). Search calls for some means of prioritisation, but so do our ordered matching rules. Search can also result in failure, but so, again, can matching.
We have a sensible way to manage computations offering alternatives (failure and choice) via the Alternative abstraction, but we are not used to thinking of pattern matching as a form of such computation, which is why we exploit Alternative structure only in the expression language. The noble, if quixotic, exception is match-failure in do-notation, which calls the relevant fail rather than necessarily crashing out. Pattern matching is an attempt to compute an environment suitable for the evaluation of a 'right-hand side' expression; failure to compute such an environment is already handled, so why not choice?
(Edit: I should, of course, add that you only really need search if you have more than one stretchy thing in a pattern, so the proposed xs++[x] pattern shouldn't trigger any choices. Of course, it takes time to find the end of a list.)
Imagine there was some sort of funny bracket for writing Alternative computations, e.g., with (|) meaning empty, (|a1|a2|) meaning (|a1|) <|> (|a2|), and a regular old (|f s1 .. sn|) meaning pure f <*> s1 .. <*> sn. One might very well also imagine (|case a of {p1 -> a1; .. pn->an}|) performing a sensible translation of search-patterns (e.g. involving ++) in terms of Alternative combinators. We could write
lookup :: (Eq k, Alternative a) => k -> [(k, v)] -> a v
lookup k xs = (|case xs of _ ++ [(k, v)] ++ _ -> pure v|)
We may obtain a reasonable language of search-patterns for any datatype generated by fixpoints of differentiable functors: symbolic differentiation is exactly what turns tuples of structures into choices of possible substructures. Good old ++ is just the sublists-of-lists example (which is confusing, because a list-with-a-hole-for-a-sublist looks a lot like a list, but the same is not true for other datatypes).
Hilariously, with a spot of LinearTypes, we might even keep hold of holey data by their holes as well as their root, then plug away destructively in constant time. It's scandalous behaviour only if you don't notice you're doing it.
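The funny-bracket syntax above is imaginary, but the search reading of the _ ++ [(k, v)] ++ _ pattern can be approximated in ordinary Haskell. The sketch below (lookupA is a made-up name) enumerates every suffix of the list with tails, keeps those whose first entry has the wanted key, and combines the candidates with asum, so the choice of Alternative decides whether you get the first hit or all of them:

```haskell
import Control.Applicative (Alternative)
import Data.Foldable (asum)
import Data.List (tails)

-- Each suffix of the form (k', v) : _ corresponds to one way of
-- decomposing xs as  prefix ++ [(k', v)] ++ rest.
lookupA :: (Eq k, Alternative f) => k -> [(k, v)] -> f v
lookupA k xs = asum [pure v | (k', v) : _ <- tails xs, k' == k]
```

At type Maybe this returns the leftmost match (or Nothing); at the list type it returns every match in order.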
You can only pattern match on constructors, not on general functions.
Mathematically, a constructor is an injective function: each combination of arguments gives one unique value, in this case a list. Because that value is unique, the language can deconstruct it again into the original arguments. I.e., when you pattern match on :, you essentially use the function
uncons :: [a] -> Maybe (a, [a])
which checks if the list is of a form you could have constructed with : (i.e., if it is non-empty), and if yes, gives you back the head and tail.
++ is not injective though, for example
Prelude> [0,1] ++ [2]
[0,1,2]
Prelude> [0] ++ [1,2]
[0,1,2]
Neither of these representations is the right one, so how should the list be deconstructed again?
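To make the ambiguity concrete, the following sketch (decompositions is a hypothetical helper name) lists every pair (ys, zs) with ys ++ zs equal to the input. A list of length n has n + 1 such pairs, so a pattern like ys ++ zs has no single answer:

```haskell
import Data.List (inits, tails)

-- All ways of writing xs as ys ++ zs, from the shortest prefix to the longest.
decompositions :: [a] -> [([a], [a])]
decompositions xs = zip (inits xs) (tails xs)
```

For [0,1,2] this yields [([],[0,1,2]),([0],[1,2]),([0,1],[2]),([0,1,2],[])].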
What you can do however is define a new, “virtual” constructor that acts like : in that it always separates exactly one element from the rest of the list (if possible), but does so on the right:
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}
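pattern (:>) :: [a] -> a -> [a]
pattern xs :> ω <- (unsnoc -> Just (xs, ω))
  where xs :> ω = xs ++ [ω]
-- Split off the last element, keeping everything before it.
unsnoc :: [a] -> Maybe ([a], a)
unsnoc [] = Nothing
unsnoc [x] = Just ([], x)
unsnoc (x:xs) = fmap (\(ys, y) -> (x : ys, y)) (unsnoc xs)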
Then
sum' :: Num a => [a] -> a
sum' (xs:>x) = x + sum' xs
sum' [] = 0
Note that this is very inefficient though, because the :> pattern-synonym actually needs to dig through the entire list, so sum' has quadratic rather than linear complexity.
A container that allows pattern matching on both the left and right end efficiently is Data.Sequence, with its :<| and :|> pattern synonyms.
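A minimal sketch of the Data.Sequence alternative, assuming a containers version recent enough to export the Empty and :|> pattern synonyms bundled with Seq (sumFromRight is a made-up name):

```haskell
import Data.Sequence (Seq (..), fromList)

-- Peel elements off the right end; each :|> match is cheap on a Seq,
-- unlike the list-based :> synonym above.
sumFromRight :: Num a => Seq a -> a
sumFromRight Empty = 0
sumFromRight (xs :|> x) = x + sumFromRight xs
```

For instance, sumFromRight (fromList [1,2,3,4]) evaluates to 10.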
You can only pattern-match on data constructors, and ++ is a function, not a data constructor.
Data constructors are persistent; a value like 'c':[] cannot be simplified further, because it is a fundamental value of type [Char]. An expression like "c" ++ "d", however, can be replaced with its equivalent "cd" at any time, and thus couldn't reliably be counted on to be present for pattern matching.
(You might argue that "cd" could always be replaced by "c" ++ "d", but in general there isn't a one-to-one mapping between a list and a decomposition via ++. Is "cde" equivalent to "c" ++ "de" or "cd" ++ "e" for pattern matching purposes?)
++ isn't a constructor, it's just a plain function. You can only match on constructors.
You can use ViewPatterns or PatternSynonyms to augment your ability to pattern match (thanks @luqui).

composition and partial application on haskell [duplicate]

If I want to add a space after a character and return the result as a list, how would I accomplish this with partial application, without naming the argument?
Also, would the type be the following?
space :: Char -> [Char]
I'm having trouble adding a space at the end due to a 'parse error' by using the ++ and the : operators.
What I have so far is:
space :: Char -> [Char]
space = ++ ' '
Any help would be much appreciated! Thanks
Doing what you want is so common in Haskell it's got its own syntax, but being Haskell, it's extraordinarily lightweight. For example, this works:
space :: Char -> [Char]
space = (:" ")
so you weren't far off a correct solution. ([Char] is the same as String. " " is the string containing the character ' '.) Let's look at using a similar function first to get the hang of it. There's a function in a library called equalFilePath :: FilePath -> FilePath -> Bool, which is used to test whether two filenames or folder names represent the same thing. (This solves the problem that on unix, mydir isn't the same as MyDir, but on Windows it is.) Perhaps I want to check a list to see if it's got the file I want:
isMyBestFile :: FilePath -> Bool
isMyBestFile fp = equalFilePath "MyBestFile.txt" fp
but since functions gobble their first argument first, then return a new function to gobble the next, etc, I can write that shorter as
isMyBestFile = equalFilePath "MyBestFile.txt"
This works because equalFilePath "MyBestFile.txt" is itself a function that takes one argument: its type is FilePath -> Bool. This is partial application, and it's super-useful. Maybe I don't want to bother writing a separate isMyBestFile function, but want to check whether any of my list has it:
hasMyBestFile :: [FilePath] -> Bool
hasMyBestFile fps = any (equalFilePath "MyBestFile.txt") fps
or just the partially applied version again:
hasMyBestFile = any (equalFilePath "MyBestFile.txt")
Notice how I need to put brackets round equalFilePath "MyBestFile.txt", because if I wrote any equalFilePath "MyBestFile.txt", then any would try and use just equalFilePath without the "MyBestFile.txt", because functions gobble their first argument first. (Recall that any :: (a -> Bool) -> [a] -> Bool.)
Now some functions are infix operators - taking their arguments from before and after, like == or <. In Haskell these are just regular functions, not hard-wired into the compiler (but have precedence and associativity rules specified). What if I was a unix user who never heard of equalFilePath and didn't care about the portability problem it solves, then I would probably want to do
hasMyBestFile = any ("MyBestFile.txt" ==)
and it would work, just the same, because == is a regular function. When you do that with an operator function, it's called an operator section.
It can work at the front or the back:
hasMyBestFile = any (== "MyBestFile.txt")
and you can do it with any operator you like:
hassmalls = any (< 5)
and a handy operator for lists is :. : takes an element on the left and a list on the right, making a new list of the two after each other, so 'Y':"es" gives you "Yes". (Secretly, "Yes" is actually just shorthand for 'Y':'e':'s':[] because : is a constructor/elemental-combiner-of-values, but that's not relevant here.) Using : we can define
space c = c:" "
and we can get rid of the c as usual
space = (:" ")
which hopefully makes more sense to you now.
What you want here is an operator section. For that, you'll need to surround the application with parentheses, i.e.
space = (: " ")
which is syntactic sugar for
space = (\x -> x : " ")
(++) won't work here because it expects a string as the first argument, compare:
(:) :: a -> [a] -> [a]
(++) :: [a] -> [a] -> [a]
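To see the two type signatures at work, here is a short sketch. The names spaceViaAppend and addSpace are invented for illustration; they show what (++) would need in order to do the same job:

```haskell
-- Section of (:): the missing left operand is the Char.
space :: Char -> String
space = (: " ")

-- (++) needs a list on both sides, so the Char must first be
-- wrapped in a singleton list.
spaceViaAppend :: Char -> String
spaceViaAppend c = [c] ++ " "

-- A section of (++) is fine when the argument is already a String.
addSpace :: String -> String
addSpace = (++ " ")
```

Here space 'a' and spaceViaAppend 'a' both give "a ", and addSpace "ab" gives "ab ".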

Why am I receiving this syntax error - possibly due to bad layout?

I've just started trying to learn haskell and functional programming. I'm trying to write this function that will convert a binary string into its decimal equivalent. Please could someone point out why I am constantly getting the error:
"BinToDecimal.hs":19 - Syntax error in expression (unexpected `}', possibly due to bad layout)
module BinToDecimal where
total :: [Integer]
total = []
binToDecimal :: String -> Integer
binToDecimal a = if (null a) then (sum total)
                 else if (head a == "0") then binToDecimal (tail a)
                 else if (head a == "1") then total ++ (2^((length a)-1))
                 binToDecimal (tail a)
So, total may not be doing what you think it is. total isn't a mutable variable that you're changing, it will always be the empty list []. I think your function should include another parameter for the list you're building up. I would implement this by having binToDecimal call a helper function with the starting case of an empty list, like so:
binToDecimal :: String -> Integer
binToDecimal s = binToDecimal' s []
binToDecimal' :: String -> [Integer] -> Integer
-- implement binToDecimal' here
In addition to what @Sibi has said, I would highly recommend using pattern matching rather than nested if-else. For example, I'd implement the base case of binToDecimal' like so:
binToDecimal' :: String -> [Integer] -> Integer
binToDecimal' "" total = sum total -- when the first argument is the empty string, just sum total. Equivalent to `if (null a) then (sum total)`
-- Include other pattern matching statements here to handle your other if/else cases
If you think it'd be helpful, I can provide the full implementation of this function instead of giving tips.
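One way the helper sketched above might be completed is shown below. The handling of characters other than '0' and '1' is not specified in the question, so calling error there is purely an assumption of this sketch:

```haskell
binToDecimal :: String -> Integer
binToDecimal s = binToDecimal' s []

-- The second argument collects the place values of the '1' digits seen so far.
binToDecimal' :: String -> [Integer] -> Integer
binToDecimal' "" total = sum total
binToDecimal' ('0' : rest) total = binToDecimal' rest total
binToDecimal' ('1' : rest) total = binToDecimal' rest (2 ^ length rest : total)
-- Assumption: anything else is treated as invalid input.
binToDecimal' (c : _) _ = error ("binToDecimal: not a binary digit: " ++ show c)
```

With this, binToDecimal "101010" evaluates to 42.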
Ok, let me give you hints to get you started:
You cannot do head a == "0" because "0" is a String. Since the type of a is [Char], the type of head a is Char and you have to compare it with a Char. You can solve it using head a == '0'. Note that "0" and '0' are different.
Similarly, rectify your type error in head a == "1"
This won't typecheck: total ++ (2^((length a)-1)), because the type of total is [Integer] and the type of (2^((length a)-1)) is Integer. For the function ++ to typecheck, both arguments passed to it should be lists of the same type.
You are possibly missing an else branch at the end (before the code binToDecimal (tail a)).
That being said, instead of using nested if else expression, try to use guards as they will increase the readability greatly.
There are many things we can improve here (but no worries, this is perfectly normal in the beginning, there is so much to learn when we start Haskell!!!).
First of all, a string is definitely not an appropriate way to represent a binary number, because nothing prevents us from writing "éaldkgjasdg" in place of a proper binary number. So, the first thing is to define our binary type:
data Binary = Zero | One deriving (Show)
We just say that it can be Zero or One. The deriving (Show) will allow us to have the result displayed when run in GHCI.
In Haskell, to solve a problem we tend to start with a more general case and then specialise it to our particular case. The thing we need here is a function with an additional argument which holds the total. Note the use of pattern matching instead of ifs, which makes the function easier to read.
binToDecimalAcc :: [Binary] -> Integer -> Integer
binToDecimalAcc [] acc = acc
binToDecimalAcc (Zero:xs) acc = binToDecimalAcc xs acc
binToDecimalAcc (One:xs) acc = binToDecimalAcc xs $ acc + 2^(length xs)
Finally, since we only want to have to pass a single parameter, we define our specific function where the acc value is 0:
binToDecimal :: [Binary] -> Integer
binToDecimal binaries = binToDecimalAcc binaries 0
We can run a test in GHCI:
test1 = binToDecimal [One, Zero, One, Zero, One, Zero]
> 42
OK, all fine, but what if you really need to convert a string to a decimal? Then, we need a function able to convert this string to a binary. The problem as seen above is that not all strings are proper binaries. To handle this, we will need to report some sort of error. The solution I will use here is very common in Haskell: it is to use "Maybe". If the string is correct, it will return "Just result" else it will return "Nothing". Let's see that in practice!
The first function we will write is to convert a char to a binary. As discussed above, Nothing represents an error.
charToBinary :: Char -> Maybe Binary
charToBinary '0' = Just Zero
charToBinary '1' = Just One
charToBinary _ = Nothing
Then, we can write a function for a whole string (which is a list of Char). So [Char] is equivalent to String. I used it here to make clearer that we are dealing with a list.
stringToBinary :: [Char] -> Maybe [Binary]
stringToBinary [] = Just []
stringToBinary chars = mapM charToBinary chars
The function mapM is a kind of variation of map which acts on monads (Maybe is actually a monad). To learn about monads I recommend reading Learn You a Haskell for Great Good!
http://learnyouahaskell.com/a-fistful-of-monads
We can notice once more that if there are any errors, Nothing will be returned.
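To isolate how mapM behaves in the Maybe monad, here is a small separate sketch. The functions readDigit and readDigits are invented for this illustration and are unrelated to the Binary type:

```haskell
import Data.Char (digitToInt, isDigit)

-- Convert a single decimal digit, failing with Nothing on anything else.
readDigit :: Char -> Maybe Int
readDigit c
  | isDigit c = Just (digitToInt c)
  | otherwise = Nothing

-- mapM runs readDigit on every character; a single Nothing makes
-- the whole result Nothing.
readDigits :: String -> Maybe [Int]
readDigits = mapM readDigit
```

So readDigits "123" gives Just [1,2,3], while readDigits "1x3" gives Nothing.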
A dedicated function to convert strings holding binaries can now be written.
binStringToDecimal :: [Char] -> Maybe Integer
binStringToDecimal = fmap binToDecimal . stringToBinary
The use of the "." function allows us to define this function as an equality with another function, so we do not need to mention the parameter (point-free notation).
The fmap function allows us to run binToDecimal (which expects a [Binary] as argument) on the result of stringToBinary (which is of type "Maybe [Binary]"). Once again, Learn You a Haskell... is a very good reference to learn more about fmap:
http://learnyouahaskell.com/functors-applicative-functors-and-monoids
Now, we can run a second test:
test2 = binStringToDecimal "101010"
> Just 42
And finally, we can test our error handling system with a mistake in the string:
test3 = binStringToDecimal "102010"
> Nothing

One interesting pattern

I'm solving 99 Haskell Problems. I've successfully solved problem No. 21, and when I opened the solution page, the following solution was proposed:
Insert an element at a given position into a list.
insertAt :: a -> [a] -> Int -> [a]
insertAt x xs (n+1) = let (ys,zs) = split xs n in ys++x:zs
I found the pattern (n + 1) interesting, because it seems to be an elegant way to convert the 1-based argument of insertAt into the 0-based argument of split (it's a function from previous exercises, essentially the same as splitAt). The problem is that GHC did not find this pattern that elegant; in fact it says:
Parse error in pattern: n + 1
I don't think that the guy who wrote the answer was dumb and I would like to know if this kind of patterns is legal in Haskell, and if it is, how to fix the solution.
I believe it has been removed from the language, and so was likely around when the author of 99 Haskell Problems wrote that solution, but it is no longer in Haskell.
The problem with n+k patterns goes back to a design decision in Haskell, to distinguish between constructors and variables in patterns by the first character of their names. If you go back to ML, a common function definition might look like (using Haskell syntax)
map f nil = nil
map f (x:xn) = f x : map f xn
As you can see, syntactically there's no difference between f and nil on the LHS of the first line, but they have different roles; f is a variable that needs to be bound to the first argument to map while nil is a constructor that needs to be matched against the second. Now, ML makes this distinction by looking each name up in the surrounding scope, and assuming names are variables when the look-up fails. So nil is recognized as a constructor because the lookup succeeds. But consider what happens when there's a typo in the pattern:
map f niil = nil
(note the two i's in niil). niil isn't a constructor name in scope, so it gets treated as a variable, and the definition is silently interpreted incorrectly.
Haskell's solution to this problem is to require constructor names to begin with uppercase letters, and variable names to begin with lowercase letters. And, for infix operators / constructors, constructor names must begin with : while operator names may not begin with :. This also helps distinguish between deconstructing bindings:
x:xn = ...
is clearly a deconstructing binding, because you can't define a function named :, while
n - m = ...
is clearly a function definition, because - can't be a constructor name.
But allowing n+k patterns, like n+1, means that + is both a valid function name, and something that works like a constructor in patterns. Now
n + 1 = ...
is ambiguous again; it could be part of the definition of a function named (+), or it could be a deconstructing pattern match definition of n. In Haskell 98, this ambiguity was solved by declaring
n + 1 = ...
a function definition, and
(n + 1) = ...
a deconstructing binding. But that obviously was never a satisfactory solution.
Note that you can now use view patterns instead of n+1.
For example:
{-# LANGUAGE ViewPatterns #-}
module Temp where
import Data.List (splitAt)
split :: [a] -> Int -> ([a], [a])
split = flip splitAt
insertAt :: a -> [a] -> Int -> [a]
insertAt x xs (subtract 1 -> n) = let (ys,zs) = split xs n in ys++x:zs
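If you would rather avoid language extensions, a plain guard can stand in for the n+1 pattern. An n+k pattern only matched when the argument was at least k, which the view pattern above does not check, so this sketch adds that test explicitly. The error message and the use of splitAt instead of the exercise's split are assumptions of this example:

```haskell
-- 1-based insertion: position 1 puts x at the front of the list.
insertAt :: a -> [a] -> Int -> [a]
insertAt x xs pos
  | pos >= 1 = let (ys, zs) = splitAt (pos - 1) xs in ys ++ x : zs
  | otherwise = error "insertAt: position must be at least 1"
```

For example, insertAt 'X' "abcd" 2 evaluates to "aXbcd".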

Resources