if-statement with elem results in wrong output - haskell

I have written a program whose purpose is to replace every consonant in a given word with the consonant itself, an 'o', and the consonant again. If the character is a vowel, the program should leave it unchanged and move on.
For example, the string "progp" should result in "poprorogogpop". The only vowel in the string "progp" is the 'o', so it should not be repeated.
This basically means that I want to construct a program which replaces a char with a String.
This is what I have so far:
rovarsprak :: String -> String -- Definition of our function, which receives a String as input and returns a String
isVocal :: Char -> String -- Function which determines whether a letter is a vocal or not
vocals = ["aeiouy"]; -- List with relevant vocals
rovarsprak [] = []; -- Case for empty input
rovarsprak (x:xs) = isVocal(x) ++ rovarsprak(xs)
isVocal x = if elem [x] vocals
  then [x]
  else [x] ++ "o" ++ [x]
If I compile and run this with the input "progp", I receive:
"poprorooogogpop"
Everything in the output is correct up to the 'o' in the middle of "progp": since 'o' is a vowel, it should not be expanded like that.
My suspicion is that the error lies within the elem part of the if-statement, or that it is related to the recursion.
A word of notice: I am extremely new to Haskell programming and have searched for issues related to elem, but without success.

vocals is intended to be a String, not a [String]:
vocals = "aeiouy" --List ['a','e','i','o','u','y'] with relevant vocals--
isVocal would be a good name if it returned a Bool; you are actually returning a possibly different string, so something like consonantToSyllable would be better.
Notice that it's simpler to just return the correct list of letters rather than building a bunch of short lists and concatenating them.
consonantToSyllable x = if elem x vocals then [x] else [x,'o',x]
or
consonantToSyllable x | elem x vocals = [x]
                      | otherwise = [x,'o',x]
rovarsprak [] = []
rovarsprak (x:xs) = consonantToSyllable x ++ rovarsprak xs
The above recursion is the pattern captured by concatMap: map a function over a list, then concatenate the resulting lists into one list.
rovarsprak word = concatMap consonantToSyllable word
or just
rovarsprak = concatMap consonantToSyllable
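For reference, the pieces of this answer can be assembled into one small file. This is only a sketch: the type signatures are added for illustration, and it keeps the answer's assumption that everything outside "aeiouy" counts as a consonant (so uppercase vowels, digits and spaces would be expanded too).

```haskell
-- Vowels that are left unchanged; everything else is treated as a consonant.
vocals :: String
vocals = "aeiouy"

-- Expand a consonant c to c, 'o', c; leave vowels as a one-letter string.
consonantToSyllable :: Char -> String
consonantToSyllable x
  | x `elem` vocals = [x]
  | otherwise       = [x, 'o', x]

-- Map every character to its syllable and concatenate the results.
rovarsprak :: String -> String
rovarsprak = concatMap consonantToSyllable
```

With these definitions, rovarsprak "progp" evaluates to "poprorogogpop".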

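You have 2 errors in your code. First, a String is already a list of characters, so vocals = "aeiouy" is enough.
Second, with that definition you can test elem x vocals instead of elem [x] vocals.
Besides, your looping function rovarsprak is not tail recursive. foldl makes it so by threading an accumulator through the loop (and it's also clearer that it's a loop). Be aware, though, that every acc ++ ... copies the accumulator built so far, so this version takes quadratic time on long inputs: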
rovarsprakBis :: String -> String
rovarsprakBis = foldl (\acc x -> acc ++ isVocal x) ""

Related

Remove inputted char from String recursively in Haskell

I'm writing a recursive function that takes a char as input, and removes the char from a string on output.
Eg: INPUT: abbacysa | OUTPUT: bbcys
I'm a beginner in Haskell so just trying to wrap my head around recursion still with practice.
I start out by creating the function with an empty list
I then select the elements within the list and begin the condition guard.
I looked into using drop but I think maybe there is a better way of doing this?
removeChar [] = []
removeChar (x:xs)
  | x `elem` "a" = removeChar drop "a"
You start with the base case of the recursion:
removeChar [] = []
If the string is not empty, there are two possibilities:
It starts with 'a', in which case you want to skip that letter. The result is just the rest of the string with every 'a' removed.
removeChar ('a':xs) = removeChar xs
It doesn't, in which case you want to keep the first letter. The result is that letter followed by the rest of the string with every 'a' removed.
removeChar (x:xs) = x: removeChar xs
Here is a solution where you put in a string and a char you want to remove from the string.
Recursively:
removeChar :: String -> Char -> String
removeChar [] _ = []
removeChar (x:xs) n = if (x == n) then removeChar xs n
                      else x : removeChar xs n
First we check whether the char x at the head of the string (x:xs) equals the char n. If it does, we drop it; otherwise we keep it. In both cases we continue recursively with the rest of the string.
Input: removeChar "abba baby" 'b'
Output: "aa ay"
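The explicit recursion above is the pattern captured by filter. A minimal sketch, using the made-up name removeChar' so it does not clash with the definitions above, and keeping the same argument order (string first, then the char to remove):

```haskell
-- Keep every character that differs from the one to remove.
removeChar' :: String -> Char -> String
removeChar' s c = filter (/= c) s
```

For the inputs used in this thread, removeChar' "abba baby" 'b' gives "aa ay" and removeChar' "abbacysa" 'a' gives "bbcys".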

Haskell filter out circular permutations

You have a list with N elements.
You only want to print the elements that are not circular permutations of other elements of the same list.
To check whether two strings are circular permutations of each other I do this, which works fine:
string1 = "abc"
string2 = "cab"
stringconc = string1 ++ string1
if string2 `isInfixOf` stringconc
  then -- it's a circular permutation
  else -- it's not
Edit: As one comment pointed out, this test only works for strings of the same length
Back to the real use case :
checkClean :: [String] -> [String] -> IO String
checkClean [] list = return ""
checkClean (x:xs) list = do
  let sequence = cleanInfix x list
  if sequence /= "abortmath"
    then putStr sequence
    else return ()
  checkClean xs list
cleanInfix :
cleanInfix :: String -> [String] -> String
cleanInfix seq [] = seq
cleanInfix seq (x:xs) = do
  let seqconc = x ++ x
  if seq `isInfixOf` seqconc && seq /= x
    then "abortmath"
    else cleanInfix seq xs
However this just outputs... nothing
With some research I found out that sequence in checkClean is always "abortmath"
Also I'm not quite comfortable with this "flag" abortmath, because if by any chance one element of the list is "abortmath", well..
For example :
if I have a list composed of :
NUUNNFFUF
FFUFNUUNN
I should write
NUUNNFFUF
I guess you call your initial code (from the question) with something like this:
result = ["NUUNNFFUF", "FFUFNUUNN"]
main = do
  checkClean result result
It won't print anything because:
the first call of cleanInfix has the following arguments: "NUUNNFFUF" and ["NUUNNFFUF", "FFUFNUUNN"]
in cleanInfix, since seq == x, you have a recursive call with the following arguments: "NUUNNFFUF" and ["FFUFNUUNN"]
now, "NUUNNFFUF" is a genuine circular permutation of "FFUFNUUNN": cleanInfix returns "abortmath", and checkClean prints nothing
then you have a recursive call of checkClean with the following arguments: "FFUFNUUNN" and ["NUUNNFFUF", "FFUFNUUNN"]
again, "FFUFNUUNN" is a genuine circular permutation of "NUUNNFFUF": cleanInfix returns "abortmath", and checkClean prints nothing
this is the end.
Basically, x is a circular permutation of y and y is a circular permutation of x, thus both x and y are discarded.
Your answer works, but it is horribly complicated.
I won't try to improve either version of your code, but I will make a general comment: you should (you really should) avoid returning a monadic value when you don't need to. In the question, checkClean just needs to remove duplicates (or "circular duplicates") from a list. That's a pure computation: you have all the information you need. Thus, remove those dos, lets and returns!
Now, let's try to focus on this:
You have a list with N elements. You only want to print elements that are not circular permutations of other elements of the same list.
Why don't you use your initial knowledge on circular permutations?
isCircPermOf x y = x `isInfixOf` (y ++ y)
Now, you need a function that takes a sequence and a list of sequences, and returns only the elements of the list that are not circular permutations of the sequence:
filterCircDuplicates :: String -> [String] -> [String]
filterCircDuplicates seq [] = []
filterCircDuplicates seq (x:xs) =
  if seq `isCircPermOf` x
    then filterCircDuplicates seq xs
    else x:filterCircDuplicates seq xs
This pattern is well known, and you can use filter to simplify it:
filterCircDuplicates seq l = filter (\x -> not (seq `isCircPermOf` x)) l
Or better:
filterCircDuplicates seq = filter (not.isCircPermOf seq)
Note the signature: not.isCircPermOf seq :: String -> Bool. It returns True if the current element is not a circular permutation of seq. (You don't have to add the list argument.)
Final step: you need a function that takes a list and returns this list without (circular) duplicates.
removeCircDuplicates :: [String] -> [String]
removeCircDuplicates [] = []
removeCircDuplicates (x:xs) = x:filterCircDuplicates x (removeCircDuplicates xs)
When your list has a head and a tail, you clean the tail, then remove the circular duplicates of the head from the cleaned tail, and keep the head.
Again, you have a well-known pattern, a fold:
removeCircDuplicates = foldr (\x acc -> x:filterCircDuplicates x acc) []
It removes the duplicates from right to left.
And if you want a one-liner:
Prelude Data.List> foldr (\x -> ((:) x).filter(not.(flip isInfixOf (x++x)))) [] ["abcd", "acbd", "cdab", "abdc", "dcab"]
["abcd","acbd","abdc"]
The wonders you can make with a pen and some paper...
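As noted in the question's edit, the isInfixOf test is only meaningful for strings of equal length. Here is a hedged sketch of the same fold with a length check added; the length comparison and the helper name keep are additions for illustration, not part of the answer above.

```haskell
import Data.List (isInfixOf)

-- x is a circular permutation of y when both have the same length
-- and x occurs somewhere inside y ++ y.
isCircPermOf :: String -> String -> Bool
isCircPermOf x y = length x == length y && x `isInfixOf` (y ++ y)

-- Walk the list from the right; each element is kept and then used to
-- filter its circular permutations out of the already cleaned tail.
removeCircDuplicates :: [String] -> [String]
removeCircDuplicates = foldr keep []
  where
    keep x acc = x : filter (not . (`isCircPermOf` x)) acc
```

On the question's example, removeCircDuplicates ["NUUNNFFUF", "FFUFNUUNN"] gives ["NUUNNFFUF"]. Without the length check, a short string such as "a" would be dropped as a "permutation" of "ab".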
So if anyone is interested, here is how I solved it. It's probably badly optimised, but at least it works (I'm just trying to learn Haskell, so it's good enough for now):
-- cleanInfix function
cleanInfix :: String -> [String] -> [String] -> [String]
cleanInfix sequence [] cleanlist = cleanlist
cleanInfix sequence (x:xs) cleanlist = do
  -- this is where I check for the circular permutation
  let sequenceconc = x ++ x
  if sequence `isInfixOf` sequenceconc
    then cleanInfix sequence xs (delete x cleanlist)
    else cleanInfix sequence xs cleanlist
-- checkClean function
checkClean :: [String] -> [String] -> [String] -> [String]
checkClean [] listesend cleanlist = cleanlist
checkClean (x:xs) listesend cleanlist = do
  -- The first delete is to avoid checking if an element is the circular permutation of... itself, because it obviously is... in some way
  let liste2 = cleanInfix x (delete x listesend) cleanlist
  checkClean xs (delete x listesend) liste2
-- Clean function; first, second and third are the command line arguments, don't worry about them
clean first second third = do
  -- creation of the result list by asking the user for input
  let printlist = checkClean result result result -- yes, it's the same list, three times
  print printlist -- print the list

Haskell: Pattern Matching to combine String

I'm trying to write a function which adds single characters from a string to a list of strings, for instance
combine ", !" ["Hello", "", "..."] = ["Hello,", " ", "...!"]
I've tried this:
combine :: String -> [String] -> [String]
combine (y:ys) (x:xs) =
  [x:y, combine ys xs]
A simple one would be
combine :: [Char] -> [String] -> [String]
combine [] _ = []
combine _ [] = []
combine (c:cs) (x:xs) = (x ++ [c]) : combine cs xs
Or even more simply using zipWith
combine :: [Char] -> [String] -> [String]
combine = zipWith (\c x -> x ++ [c])
I had to do a bit extra to get this to work. I'll break it down for you.
First, I specified the type of the function as [Char] -> [String] -> [String]. I could have used String for the first argument, but what you're operating on conceptually is a list of characters and a list of strings, not a string and a list of strings.
Next, I had to specify the edge cases for this function. What happens when either argument is the empty list []? The easy answer is to just end the computation then, so we can write
combine [] _ = []
combine _ [] = []
Here the _ is matching anything, but throwing it away because it isn't used in the return value.
Next, for the actual body of the function, we want to take the first character and the first string, then append that character to the end of the string:
combine (c:cs) (x:xs) = x ++ [c]
But this doesn't do anything with cs or xs, the rest of our lists (and won't even compile with the type signature above). We need to keep going, and since we're generating a list, this is normally done with the prepend operator (:). The parentheses around x ++ [c] are required, because (++) and (:) have the same precedence and both associate to the right:
combine (c:cs) (x:xs) = (x ++ [c]) : combine cs xs
However, this is such a common pattern that there is a helper function called zipWith that handles the edge cases for us. Its type signature is
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
It walks down both input lists simultaneously, passing the corresponding elements into the provided function. Since the function we want to apply is \c x -> x ++ [c] (turned into a lambda function), we can drop it in to zipWith as
combine cs xs = zipWith (\c x -> x ++ [c]) cs xs
But Haskell will let us drop arguments when possible, so we can eta reduce this to
combine :: [Char] -> [String] -> [String]
combine = zipWith (\c x -> x ++ [c])
And that's it!
When you want to combine lists element by element, it is usually a zip you are looking at. In this case, you know exactly how you want to combine the elements – that makes it a zipWith.
zipWith takes a "combining function" and then creates a function that combines two lists using said combining function. Let's call your "combining" function append, because it adds a character to the end of a string. You can define it like this:
append char string = string ++ [char]
Do you see how this works? For example,
append 'e' "nic" = "nice"
or
append '!' "Hello" = "Hello!"
Now that we have that, recall that zipWith takes a "combining function" and then creates a function that combines two lists using that function. So your function is then easily implemented as
combine = zipWith append
and it will do append on each of the elements in order in the lists you supply, like so:
combine ", !" ["Hello", "", "..."] = ["Hello,", " ", "...!"]
You are close. There are a couple issues with what you have.
y has type Char, and x has type String which is an alias for [Char]. This means that you can add y to the top of a list with y : x, but you can't add y to the end of a list using the same : operator. Instead, you make y into a list and join the lists.
x ++ [y]
There must also be a base case, or this recursion will continue until it has no elements in either list and crash. In this case, we likely don't have anything we want to add.
combine [] [] = []
Finally, once we create the new element (written y ++ [x] in the code below, where x is now the character and y the string) we want to add it to the top of the rest of the items we have computed. So we use : to cons it to our list.
combine :: String -> [String] -> [String]
combine [] [] = []
combine (x : xs) (y : ys) = (y ++ [x]) : (combine xs ys)
One note about this code: if the number of characters in your string ever differs from the number of strings in your list, this will crash with a non-exhaustive pattern error. You can handle that case in a number of ways; bheklilr's answer addresses this.
kqr's answer also works perfectly and is probably the best one to use in practice.
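One difference between the answers is worth checking in GHCi: the zipWith version simply stops at the shorter of the two lists, whereas the version with only the combine [] [] base case fails on inputs of unequal length. A small sketch of the zipWith behaviour (the name combineZip is made up here to avoid clashing with the definitions above):

```haskell
-- Append the i-th character to the end of the i-th string.
-- zipWith stops as soon as either input list runs out.
combineZip :: [Char] -> [String] -> [String]
combineZip = zipWith (\c s -> s ++ [c])
```

combineZip ", !" ["Hello", "", "..."] gives ["Hello,", " ", "...!"], and combineZip "ab" ["x"] gives ["xa"], with the unused 'b' ignored.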

I need convert this string in Char List

I'm learning Haskell. I'm reading a string from a text file and need to turn this string into a list of Char.
The input file is this:
Individuo A; TACGATCAAAGCT
Individuo B; AATCGCAT
Individuo C; TAAATCCGATCAAAGAGAGGACTTA
I need to convert this string
S1 = "AAACCGGTTAAACCCGGGG"
into
S1 = ["A","A","A","C","C","G","G","T","T","A","A","A","C","C","C","G","G","G","G"]
or
S1 = ['A','A','A','C','C','G','G','T','T','A','A','A','C','C','C','G','G','G','G']
but in the file the name and the sequence are separated by ";".
What should I do?
after getting two lists, I send them to this code:
lcsList :: Eq a => [a] -> [a] -> [a]
lcsList [] _ = []
lcsList _ [] = []
lcsList (x:xs) (y:ys) = if x == y
  then x : lcsList xs ys
  else
    let lcs1 = lcsList (x:xs) ys
        lcs2 = lcsList xs (y:ys)
    in if (length lcs1) > (length lcs2)
      then lcs1
      else lcs2
A rough and ready way to split out each of those strings is with something like this - which you can try in ghci
let a = "Individuo A; TACGATCAAAGCT"
tail $ dropWhile (/= ' ') $ dropWhile (/= ';') a
which gives you:
"TACGATCAAAGCT"
And since a String is just a list of Char, this is the same as:
['T', 'A', 'C', 'G', ...
If your file consists of several lines, it is quite simple: you just need to skip everything until you find “;”. If your file consists of just one line, you’ll have to look for sequences’ beginnings and endings separately (hint: sequence ends with space). Write a recursive function to do the task, and use functions takeWhile, dropWhile.
A String is already a list of Char (it is even defined like this: type String = [Char]), so you don’t have to do anything else. If you need a list of Strings, where every String consists of just one char, then use map to wrap every char (once again, every String is a list, so you are allowed to use map on these). To wrap a char, there are three alternatives:
Use lambda function: map (\c -> [c]) s
Use operator section: map (:[]) s
Define a new function: wrap x = [x]
Good luck!
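Putting the hints from both answers together, a possible sketch for the multi-line input format shown in the question. The function names sequenceOf, sequencesFrom and singletons are invented for this example, and it assumes every line has the form "name; SEQUENCE":

```haskell
-- Skip everything up to and including the first ';', then drop leading spaces.
sequenceOf :: String -> String
sequenceOf = dropWhile (== ' ') . drop 1 . dropWhile (/= ';')

-- One sequence per input line.
sequencesFrom :: String -> [String]
sequencesFrom = map sequenceOf . lines

-- Only needed if a list of one-character Strings is really wanted;
-- a String is already a list of Char.
singletons :: String -> [String]
singletons = map (:[])
```

For example, sequenceOf "Individuo A; TACGATCAAAGCT" gives "TACGATCAAAGCT", which can be passed to lcsList directly because it is already a [Char]. A line without any ';' yields the empty string in this sketch.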

Why does this first Haskell function FAIL to handle infinite lists, while this second snippet SUCCEEDS with infinite lists?

I have two Haskell functions, both of which seem very similar to me. But the first one FAILS against infinite lists, and the second one SUCCEEDS against infinite lists. I have been trying for hours to nail down exactly why that is, but to no avail.
Both snippets are a re-implementation of the "words" function in Prelude. Both work fine against finite lists.
Here's the version that does NOT handle infinite lists:
myWords_FailsOnInfiniteList :: String -> [String]
myWords_FailsOnInfiniteList string = foldr step [] (dropWhile charIsSpace string)
  where
    step space ([]:xs) | charIsSpace space = []:xs
    step space (x:xs) | charIsSpace space = []:x:xs
    step space [] | charIsSpace space = []
    step char (x:xs) = (char : x) : xs
    step char [] = [[char]]
Here's the version that DOES handle infinite lists:
myWords_anotherReader :: String -> [String]
myWords_anotherReader xs = foldr step [""] xs
  where
    step x result | not . charIsSpace $ x = [x:(head result)]++tail result
                  | otherwise = []:result
Note: "charIsSpace" is merely a renaming of Char.isSpace.
The following interpreter session illustrates that the first one fails against an infinite list while the second one succeeds.
*Main> take 5 (myWords_FailsOnInfiniteList (cycle "why "))
*** Exception: stack overflow
*Main> take 5 (myWords_anotherReader (cycle "why "))
["why","why","why","why","why"]
EDIT: Thanks to the responses below, I believe I understand now. Here are my conclusions and the revised code:
Conclusions:
The biggest culprit in my first attempt were the 2 equations that started with "step space []" and "step char []". Matching the second parameter of the step function against [] is a no-no, because it forces the whole 2nd arg to be evaluated (but with a caveat to be explained below).
At one point, I had thought (++) might evaluate its right-hand argument later than cons would, somehow. So, I thought I might fix the problem by changing " = (char:x):xs" to "= [char : x] ++ xs". But that was incorrect.
At one point, I thought that pattern matching the second arg against (x:xs) would cause the function to fail against infinite lists. I was almost right about this, but not quite! Evaluating the second arg against (x:xs), as I do in a pattern match above, WILL cause some recursion. It will "turn the crank" until it hits a ":" (aka, "cons"). If that never happened, then my function would not succeed against an infinite list. However, in this particular case, everything is OK because my function will eventually encounter a space, at which point a "cons" will occur. And the evaluation triggered by matching against (x:xs) will stop right there, avoiding the infinite recursion. At that point, the "x" will be matched, but the xs will remain a thunk, so there's no problem. (Thanks to Ganesh for really helping me grasp that).
In general, you can mention the second arg all you want, as long as you don't force evaluation of it. If you've matched against x:xs, then you can mention xs all you want, as long as you don't force evaluation of it.
So, here's the revised code. I usually try to avoid head and tail, merely because they are partial functions, and also because I need practice writing the pattern matching equivalent.
myWords :: String -> [String]
myWords string = foldr step [""] (dropWhile charIsSpace string)
  where
    step space acc | charIsSpace space = "":acc
    step char (x:xs) = (char:x):xs
    step _ [] = error "this should be impossible"
This correctly works against infinite lists. Note there's no head, tail or (++) operator in sight.
Now, for an important caveat:
When I first wrote the corrected code, I did not have the 3rd equation, which matches against "step _ []". As a result, I received the warning about non-exhaustive pattern matches. Obviously, it is a good idea to avoid that warning.
But I thought I was going to have a problem. I already mentioned above that it is not OK to pattern match the second arg against []. But I would have to do so in order to get rid of the warning.
However, when I added the "step _ []" equation, everything was fine! There was still no problem with infinite lists!. Why?
Because the 3rd equation in the corrected code IS NEVER REACHED!
In fact, consider the following BROKEN version. It is EXACTLY the SAME as the correct code, except that I have moved the pattern for empty list up above the other patterns:
myWords_brokenAgain :: String -> [String]
myWords_brokenAgain string = foldr step [""] (dropWhile charIsSpace string)
  where
    step _ [] = error "this should be impossible"
    step space acc | charIsSpace space = "":acc
    step char (x:xs) = (char:x):xs
We're back to stack overflow, because the first thing that happens when step is called is that the interpreter checks to see if equation number one is a match. To do so, it must see if the second arg is []. To do that, it must evaluate the second arg.
Moving the equation down BELOW the other equations ensures that the 3rd equation is never attempted, because either the first or the second pattern always matches. The 3rd equation is merely there to dispense with the non-exhaustive pattern warning.
This has been a great learning experience. Thanks to everyone for your help.
Others have pointed out the problem, which is that step always evaluates its second argument before producing any output at all, yet its second argument will ultimately depend on the result of another invocation of step when the foldr is applied to an infinite list.
It doesn't have to be written this way, but your second version is kind of ugly because it relies on the initial argument to step having a particular format and it's quite hard to see that the head/tail will never go wrong. (I'm not even 100% certain that they won't!)
What you should do is restructure the first version so it produces output without depending on the input list in at least some situations. In particular we can see that when the character is not a space, there's always at least one element in the output list. So delay the pattern-matching on the second argument until after producing that first element. The case where the character is a space will still be dependent on the list, but that's fine because the only way that case can infinitely recurse is if you pass in an infinite list of spaces, in which case not producing any output and going into a loop is the expected behaviour for words (what else could it do?)
Try expanding the expression by hand:
take 5 (myWords_FailsOnInfiniteList (cycle "why "))
take 5 (foldr step [] (dropWhile charIsSpace (cycle "why ")))
take 5 (foldr step [] (dropWhile charIsSpace ("why " ++ cycle "why ")))
take 5 (foldr step [] ("why " ++ cycle "why "))
take 5 (step 'w' (foldr step [] ("hy " ++ cycle "why ")))
take 5 (step 'w' (step 'h' (foldr step [] ("y " ++ cycle "why "))))
What's the next expansion? You should see that in order to pattern match for step, you need to know whether it's the empty list or not. In order to find that out, you have to evaluate it, at least a little bit. But that second term happens to be a foldr reduction by the very function you're pattern matching for. In other words, the step function cannot look at its arguments without calling itself, and so you have an infinite recursion.
Contrast that with an expansion of your second function:
myWords_anotherReader (cycle "why ")
foldr step [""] (cycle "why ")
foldr step [""] ("why " ++ cycle "why ")
step 'w' (foldr step [""] ("hy " ++ cycle "why "))
let result = foldr step [""] ("hy " ++ cycle "why ") in
  ['w':(head result)] ++ tail result
let result = step 'h' (foldr step [""] ("y " ++ cycle "why ")) in
  ['w':(head result)] ++ tail result
You can probably see that this expansion will continue until a space is reached. Once a space is reached, "head result" will obtain a value, and you will have produced the first element of the answer.
I suspect that this second function will overflow for infinite strings that don't contain any spaces. Can you see why?
The second version does not actually evaluate result until after it has started producing part of its own answer. The first version evaluates result immediately by pattern matching on it.
The key with these infinite lists is that you have to produce something before you start demanding list elements so that the output can always "stay ahead" of the input.
(I feel like this explanation is not very clear, but it's the best I can do.)
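A stripped-down illustration of the same point, independent of words. The two step functions below are made up for this sketch: one produces a cons cell before touching the accumulator, the other pattern matches on the accumulator first.

```haskell
-- Produces output first; the accumulator is only a thunk in the tail.
lazyStep :: Int -> [Int] -> [Int]
lazyStep x acc = (x * 2) : acc

-- Must evaluate the accumulator to choose an equation, so foldr has to
-- reach the end of the input before anything is produced.
strictStep :: Int -> [Int] -> [Int]
strictStep x []       = [x * 2]
strictStep x (y : ys) = (x * 2) : y : ys

-- Works on an infinite list because lazyStep never inspects acc.
firstThree :: [Int]
firstThree = take 3 (foldr lazyStep [] [1 ..])
```

Both steps compute the same result on finite lists, but replacing lazyStep with strictStep in firstThree would never produce a value, for the reason described in the answers above.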
The library function foldr has this implementation (or similar):
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f k (x:xs) = f x (foldr f k xs)
foldr _ k _ = k
The result of myWords_FailsOnInfiniteList depends on the result of foldr, which depends on the result of step, which depends on the result of the inner foldr, which depends on ... and so on. On an infinite list, myWords_FailsOnInfiniteList will use up an infinite amount of space and time before producing its first word.
The step function in myWords_anotherReader does not require the result of the inner foldr until after it has produced the first letter of the first word. Unfortunately, as Apocalisp says, it uses O(length of first word) space before it produces the next word, because as the first word is being produced, the tail thunk keeps growing tail ([...] ++ tail ([...] ++ tail (...))).
In contrast, compare to
myWords :: String -> [String]
myWords = myWords' . dropWhile isSpace where
  myWords' [] = []
  myWords' string =
    let (part1, part2) = break isSpace string
    in part1 : myWords part2
using library functions which may be defined as
break :: (a -> Bool) -> [a] -> ([a], [a])
break p = span $ not . p
span :: (a -> Bool) -> [a] -> ([a], [a])
span p xs = (takeWhile p xs, dropWhile p xs)
takeWhile :: (a -> Bool) -> [a] -> [a]
takeWhile p (x:xs) | p x = x : takeWhile p xs
takeWhile _ _ = []
dropWhile :: (a -> Bool) -> [a] -> [a]
dropWhile p (x:xs) | p x = dropWhile p xs
dropWhile _ xs = xs
Notice that producing the intermediate results is never held up by future computation, and only O(1) space is needed as each element of the result is made available for consumption.
Addendum
So, here's the revised code. I usually try to avoid head and tail, merely because they are partial functions, and also because I need practice writing the pattern matching equivalent.
myWords :: String -> [String]
myWords string = foldr step [""] (dropWhile charIsSpace string)
  where
    step space acc | charIsSpace space = "":acc
    step char (x:xs) = (char:x):xs
    step _ [] = error "this should be impossible"
(Aside: you may not care, but words "" == [] in the library, whereas your myWords "" == [""]. There is a similar issue with trailing spaces.)
Looks much-improved over myWords_anotherReader, and is pretty good for a foldr-based solution.
\n -> tail $ myWords $ replicate n 'a' ++ " b"
It's not possible to do better than O(n) time, but both myWords_anotherReader and myWords take O(n) space here. This may be inevitable given the use of foldr.
Worse,
\n -> head $ head $ myWords $ replicate n 'a' ++ " b"
myWords_anotherReader was O(1) but the new myWords is O(n), because pattern matching (x:xs) requires the further result.
You can work around this with
myWords :: String -> [String]
myWords = foldr step [""] . dropWhile isSpace
  where
    step space acc | isSpace space = "":acc
    step char ~(x:xs) = (char:x):xs
The ~ introduces an "irrefutable pattern". Irrefutable patterns never fail and do not force immediate evaluation.
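To try this out, here is the same definition as a self-contained file. It is renamed to myWordsLazy for this sketch so that it can sit next to the earlier versions; the import is the only other addition.

```haskell
import Data.Char (isSpace)

-- foldr-based words; the lazy pattern ~(x:xs) lets step emit the cons
-- cell and the current character before the rest of the fold is evaluated.
myWordsLazy :: String -> [String]
myWordsLazy = foldr step [""] . dropWhile isSpace
  where
    step c acc | isSpace c = "" : acc
    step c ~(x : xs)       = (c : x) : xs
```

With this version, take 5 (myWordsLazy (cycle "why ")) gives ["why","why","why","why","why"], and head (head (myWordsLazy (replicate n 'a' ++ " b"))) returns 'a' without walking the whole first word. The aside about empty input still applies: myWordsLazy "" is [""], not [].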
