Making a takeUntil function in Haskell

I want to make a function that, given two strings such as "ab" and "cdabd", returns "cd".
This is what I have so far:
takeUntil :: String -> String -> String
takeUntil [] [] = []
takeUntil xs [] = []
takeUntil [] ys = []
takeUntil xs ys = if contains xs ys then -- ???? I get stuck here.
The contains function is one I defined previously (the whole function should be case-insensitive).
contains function:
contains :: String -> String -> Bool
contains _ [] = True
contains [] _ = False
contains xs ys = isPrefixOf (map toLower ys) (map toLower xs) || contains (tail (map toLower xs)) (map toLower ys)

There are many ways to do this, but continuing along your path, try the following:
import Data.List
takeUntil :: String -> String -> String
takeUntil [] [] = [] --don't need this
takeUntil xs [] = []
takeUntil [] ys = []
takeUntil xs (y:ys) = if isPrefixOf xs (y:ys)
                        then []
                        else y : takeUntil xs ys
Some outputs:
takeUntil "ab" "cdabd"
"cd"
takeUntil "b" "cdabd"
"cda"
takeUntil "d" "cdabd"
"c"
takeUntil "c" "cdabd"
""
takeUntil "xxx" "cdabd"
"cdabd"
EDIT:
The OP wants the function to be case-insensitive.
Well, again, you can do that in many ways. For example, you can write a lowerCase function like this (I think Data.Text already has one, for Text values):
import qualified Data.Char as Char
lowerCase :: String -> String
lowerCase [] = []
lowerCase (x:xs) = (Char.toLower x):(lowerCase xs)
And then use it like (maybe ugly and not very practical):
takeUntil (lowerCase "cd") (lowerCase "abcDe")
"ab"
That is the result you expect.
Also, you can use that lowerCase function inside takeUntil:
-- ...
takeUntil xs (y:ys) = if isPrefixOf (lowerCase xs) (lowerCase (y:ys))
-- ...
So, you can just do:
takeUntil "cd" "abcDe"
"ab"
Anyway, I think the best option is the one @bheklilr suggested: make your own isPrefixOfCaseless function.
I hope this helps.
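A minimal sketch of that suggestion, assuming a plain String implementation. The answer above only names isPrefixOfCaseless, so the definition here is hypothetical. Because only the comparison is lower-cased, the returned prefix keeps its original case.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)

-- Hypothetical helper: a case-insensitive variant of isPrefixOf.
-- Both sides are lower-cased for the comparison only.
isPrefixOfCaseless :: String -> String -> Bool
isPrefixOfCaseless xs ys = map toLower xs `isPrefixOf` map toLower ys

-- Same recursion as the answer above, with the caseless prefix test.
takeUntil :: String -> String -> String
takeUntil _  [] = []
takeUntil [] _  = []
takeUntil xs s@(y:ys)
  | xs `isPrefixOfCaseless` s = []                   -- separator found: stop
  | otherwise                 = y : takeUntil xs ys  -- keep the original character
```

With this version:
takeUntil "cd" "abcDe"
"ab"
takeUntil "E" "abcDe"
"abcD"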

Among the numerous approaches to defining takeUntil, consider using Data.Text functions as follows:
import Data.Text (pack, unpack, breakOn, toCaseFold)
takeUntil :: String -> String -> String
takeUntil sep txt = unpack $ fst $ breakOn (toCaseFold $ pack sep) (toCaseFold $ pack txt)
Note that pack converts a String to a Text, whereas unpack does the inverse; toCaseFold normalizes case for case-insensitive comparison; breakOn returns a pair whose first element is the text up to the first match (or the whole text if there is no match).
Update
This approach covers the tests already suggested, yet it does not preserve the case of the original String; for instance:
takeUntil "e" "abcDe"
"abcd"
A workaround is, for instance, to split the original string by index at the break point.
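One way that index-based workaround might look, as a sketch only. The name takeUntilKeepCase is hypothetical, and the code assumes case folding does not change the length of the text, which holds for ASCII but not for every Unicode character.

```haskell
import qualified Data.Text as T

-- Hypothetical name. Finds the break point on case-folded copies,
-- then takes that many characters from the ORIGINAL string.
-- Assumption: toCaseFold preserves length (true for ASCII input).
takeUntilKeepCase :: String -> String -> String
takeUntilKeepCase sep txt
  | null sep  = []          -- T.breakOn does not accept an empty pattern
  | otherwise = take n txt
  where
    folded      = T.toCaseFold (T.pack txt)
    (before, _) = T.breakOn (T.toCaseFold (T.pack sep)) folded
    n           = T.length before  -- index of the first match, or the full length
```

With this version:
takeUntilKeepCase "e" "abcDe"
"abcD"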

Related

Capitalizing first letter of words while removing spaces (Haskell)

I'm just starting out in Haskell and this is like the third thing I'm writing, so, naturally, I'm finding myself a little stumped.
I'm trying to write a bit of code that will take a string, delete the spaces, and capitalize each letter of that string.
For example, if I input "this is a test", I would like to get back something like: "thisIsATest"
import qualified Data.Char as Char
toCaps :: String -> String
toCaps [] = []
toCaps xs = filter(/=' ') xs
toCaps (_:xs) = map Char.toUpper xs
I think the method I'm using is wrong. With my code in this order, I am able to remove all the spaces using the filter function, but nothing becomes capitalized.
When I move the filter bit to the very end of the code, I am able to use the map Char.toUpper bit. When I map that function Char.toUpper, it just capitalizes everything "HISISATEST", for example.
I was trying to make use of an if function to say something similar to
if ' ' then map Char.toUpper xs else Char.toLower xs, but that didn't work out for me. I haven't utilized if in Haskell yet, and I don't think I'm doing it correctly. I also know using "xs" is wrong, but I'm not sure how to fix it.
Can anyone offer any pointers on this particular problem?
I think it might be better if you split the problem into smaller subproblems. First we can make a function that, for a given word will capitalize the first character. For camel case, we thus can implement this as:
import Data.Char(toUpper)
capWord :: String -> String
capWord "" = ""
capWord (c:cs) = toUpper c : cs
We can then use words to obtain the list of words:
toCaps :: String -> String
toCaps = go . words
  where go [] = ""
        go (w:ws) = concat (w : map capWord ws)
For example:
Prelude Data.Char> toCaps "this is a test"
"thisIsATest"
For Pascal case, we can make use of concatMap instead:
toCaps :: String -> String
toCaps = concatMap capWord . words
Inspired by this answer from Will Ness, here's a way to do it that avoids unnecessary Booleans and comparisons:
import qualified Data.Char as Char
toCaps :: String -> String
toCaps = flip (foldr go (const [])) id
  where go ' ' acc _ = acc Char.toUpper
        go x acc f = f x:acc id
Or more understandably, but perhaps slightly less efficient:
import qualified Data.Char as Char
toCaps :: String -> String
toCaps = go id
  where go _ [] = []
        go _ (' ':xs) = go Char.toUpper xs
        go f (x :xs) = f x:go id xs
There are a number of ways of doing it, but if I were trying to keep it as close to how you've set up your example, I might do something like:
import Data.Char (toUpper)
toCaps :: String -> String
toCaps [] = [] -- base case
toCaps (' ':c:cs) = toUpper c : toCaps cs -- throws out the space and capitalizes next letter
toCaps (c:cs) = c : toCaps cs -- anything else is left as is
This is just using basic recursion, dealing with a character (element of the list) at a time, but if you wanted to use higher-order functions such as map or filter that work on the entire list, then you would probably want to compose them (the way that Willem suggested is one way) and in that case you could probably do without using recursion at all.
It should be noted that this solution is brittle in the sense that it assumes the input string does not contain leading, trailing, or multiple consecutive spaces.
Inspired by Joseph Sible's answer, a coroutine-style solution using two mutually recursive functions:
import Data.Char
toCamelCase :: String -> String
toCamelCase [] = []
toCamelCase (' ': xs) = toPascalCase xs
toCamelCase (x : xs) = x : toCamelCase xs
toPascalCase :: String -> String
toPascalCase [] = []
toPascalCase (' ': xs) = toPascalCase xs
toPascalCase (x : xs) = toUpper x : toCamelCase xs
Be careful to not start the input string with a space, or you'll get the first word capitalized as well.

Haskell Newbie - Occurs check: cannot construct the infinite type: a ~ [a]

I am trying to make a function "tokenize" that takes three arguments: the main String, a String of characters that should each end up in their own String, and a String of characters to remove from the result.
tokenize :: String -> String -> String -> [String]
tokenize [] imp remm = []
tokenize str imp remm = let chr = (head str) in
    if elem chr imp then ([chr] : (tokenize (tail str) imp remm))
    else if (elem chr remm ) then (tokenize (tail str) imp remm)
    else chr: (tokenize (tail str) imp remm)
I get this error message :
Occurs check:
cannot construct the infinite type: a ~ [a]
Expected type: [a]
Actual type: [[a]]
In your expression, you use two subexpressions:
[chr] : (tokenize (tail str) imp remm)
and
chr : (tokenize (tail str) imp remm)
The two cannot both type-check: each conses an element onto the result of tokenize, so [chr] and chr would need to have the same type, which is impossible, hence the error.
Usually in functional programming the parameters are written in a different order. Indeed, it makes more sense to write it as tokenize imp remm str with imp the important characters, remm the characters to remove and str the string to process.
We can implement the function by using a helper function go. Here go should consider four cases:
we reached the end of the list, and thus return a singleton list with an empty list;
the first character is something to eliminate from the output, we recurse on the tail of the string;
the character is important, we yield an empty list, the character wrapped in a list, and recurse on the tail; and
if all above is not applicable, we prepend the character to the head of the list we retrieve when we recurse.
We filter out empty lists, that can occur when we have for example two consecutive characters that are in imp.
For example:
tokenize :: [Char] -> [Char] -> String -> [String]
tokenize imp remm = filter (not . null) . go
  where go [] = [[]]
        go (x:xs) | elem x remm = go xs
                  | elem x imp = [] : [x] : go xs
                  | otherwise = let (y:ys) = go xs in (x:y) : ys
We then yield:
Prelude> tokenize "abc" "def" "defaabyesays"
["a","a","b","ys","a","ys"]
It might, however, be better to solve separate problems with separate functions. For example, first write a function that removes the characters in remm, and so on. This makes your function easier to understand and debug.
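As a rough sketch of that split, assuming the same argument order as above: one function removes unwanted characters and another does the grouping. The helper names removeChars and splitOnImportant are hypothetical, and the fold is just one possible way to write the second step.

```haskell
-- Step 1 (hypothetical helper): drop every character that appears in remm.
removeChars :: [Char] -> String -> String
removeChars remm = filter (`notElem` remm)

-- Step 2 (hypothetical helper): give each character from imp its own token
-- and group all remaining characters into runs.
splitOnImportant :: [Char] -> String -> [String]
splitOnImportant imp = filter (not . null) . foldr step [[]]
  where
    step x acc@(y:ys)
      | x `elem` imp = [] : [x] : acc   -- close the current run, emit [x]
      | otherwise    = (x : y) : ys     -- extend the current run
    step _ []        = [[]]             -- unreachable: the accumulator is never empty

-- The two steps composed.
tokenize :: [Char] -> [Char] -> String -> [String]
tokenize imp remm = splitOnImportant imp . removeChars remm
```

This should give the same result as the single-function version on the example above:
tokenize "abc" "def" "defaabyesays"
["a","a","b","ys","a","ys"]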

How do I extend this mergeWords function to any number of strings?

Following is the mergeWords function.
mergeWords [] [] = []
mergeWords [] (y:ys) = y:'\n':(mergeWords [] ys)
mergeWords (x:xs) [] = x:'\n':(mergeWords xs [])
mergeWords (x:xs) (y:ys) = x:y:'\n':(mergeWords xs ys)
If applied on mergeWords "hello" "world" it gives
"hw\neo\nlr\nll\nod\n"
I can't figure out how to extend this to list of strings. Like applying it to 3 strings should first take first character of each of the strings and then put a '\n' and then the second character and so on.
The puzzle is effectively to merge a list of words, a character at a time, into lines with trailing newline characters.
mergeWords :: [String] -> String
We need to take a list like
[ "hello"
, "jim"
, "nice"
, "day"
]
and rearrange it into the lists of things at a given position
[ "hjnd"
, "eiia"
, "lmcy"
, "le"
, "o"
]
That's what the library function transpose (from Data.List) does.
And then we need to make a single string which treats those as lines, separated by newlines. Which is what unlines does.
So
mergeWords = unlines . transpose
and we're done.
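For completeness, here is the same definition as a self-contained sketch with the import it needs. Note that transpose skips strings that have run out of characters, which is why words of different lengths work.

```haskell
import Data.List (transpose)

-- transpose regroups the characters by position;
-- unlines joins the groups, ending each with a newline.
mergeWords :: [String] -> String
mergeWords = unlines . transpose
```

For the list above:
mergeWords ["hello", "jim", "nice", "day"]
"hjnd\neiia\nlmcy\nle\no\n"
and for the two words from the question it matches the original two-argument function:
mergeWords ["hello", "world"]
"hw\neo\nlr\nll\nod\n"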
Sounds reasonably easy if you do it in steps:
cutWords :: [String] -> [[String]] -- ["ab", "cd", "e"] -> [["a", "c", "e"], ["b", "d", ""]]
concatWord :: [String] -> String -- ["a", "c", "e"] -> "ace\n"
concatWords :: [[String]] -> String -- use concatWord on all of them
The most interesting part is of course the cutWords part. What you want there is zip-like behaviour, and for that it helps to define "safe" versions of head and tail:
head' (x:xs) = [x]
head' "" = ""
tail' (x:xs) = xs
tail' "" = ""
Now we can implement our cutWords, making sure we stop in time:
cutWords xs = heads : rest
  where
    heads = map head' xs
    tails = map tail' xs
    rest = if any (/= "") tails then cutWords tails
                                else []
Then the remaining part is trivial:
concatWord word = concat word ++ "\n"
concatWords ws = concatMap concatWord ws

replace character to number in haskell

I have a function change that replaces some characters with numbers. Here it is:
change [] = []
change (x:xs) | x == 'A' = '9':'9':change xs
              | x == 'B' = '9':'8':change xs
              | otherwise = change xs
and the output is:
Main> change "aAB11s"
"9998"
but I need this:
Main> change "aAB11s"
"a999811s"
How can I do this?
Try this:
change [] = []
change (x:xs) | x == 'A' = '9':'9':change xs
              | x == 'B' = '9':'8':change xs
              | otherwise = x:change xs
The only change is in otherwise.
In addition to @kostya's answer, you don't need to write the recursive part yourself. Try this out:
change :: String -> String
change xs = concatMap chToStr xs
  where chToStr 'A' = "99"
        chToStr 'B' = "98"
        chToStr x = [x]
or, more point-freely (actually this is preferred if the point-free refactoring doesn't hurt the readability):
change :: String -> String
change = concatMap chToStr
  where chToStr 'A' = "99"
        chToStr 'B' = "98"
        chToStr x = [x]
And you can test the result:
λ> change "aAB11s"
"a999811s"
Some explanation:
It's tempting to do an elementwise replacement by passing map a function
f :: Char -> Char. But you can't do that here because for 'A' you want two characters, "99", so the function you need has type Char -> String (String and [Char] are the same type in Haskell), and mapping it would produce a [String] rather than a String.
So the solution is to also wrap the characters we don't care about into single-element lists, and afterwards perform a concatenation (this function is called concat in Haskell) to get a string back.
Further, concatMap f xs is just a shorthand for concat (map f xs)
λ> map (\x -> [x,x]) [1..10]
[[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7],[8,8],[9,9],[10,10]]
λ> concat (map (\x -> [x,x]) [1..10])
[1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10]
λ> concatMap (\x -> [x,x]) [1..10]
[1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10]

Telling if a list contains some information in sequential order

Given a list like:
let list = [1,2,3,4,5,6,7,8,9,10]
I am trying to come up with a way to detect if 7,8,9 exists in the list in sequential order and simply printout 'success' if it does and 'fail' otherwise.
I am trying to accomplish this using zip for the index. Can someone advise if I am on the right track or if there is a better way to accomplish this?
zip [0..] list
And then something like:
[if (snd x)==
7 && let index = (fst x)
&& (snd x)==8 && (fst x)==(index+1)
&& (snd x)==9 && (fst x)==(index+2)
then "success"
else "fail" | x <- list]
When trying to figure out a list algorithm it's usually best to start by
thinking about the special case in the list head. In this case, how would you
test that the list begins with [7,8,9]?
beginsWith789 :: [Int] -> Bool
beginsWith789 (7:8:9:_) = True
beginsWith789 _ = False
I.e. we can just pattern match to the first three elements. Now to generalize
this, if we don't find the subsequence in the list head, we recursively check
the tail of the list
contains789 :: [Int] -> Bool
contains789 (7:8:9:_) = True
contains789 (_:xs) = contains789 xs
contains789 _ = False
Now if we want to further generalize this to find any subsequence, we can use
the isPrefixOf function from Data.List:
import Data.List (isPrefixOf)
contains :: Eq a => [a] -> [a] -> Bool
contains sub lst | sub `isPrefixOf` lst = True
contains sub (_:xs) = contains sub xs
contains _ _ = False
We can tidy this up by using any and tails to check if any successively
shorter tail of the list begins with the given subsequence:
import Data.List (isPrefixOf, tails)
contains :: Eq a => [a] -> [a] -> Bool
contains sub = any (sub `isPrefixOf`) . tails
Or, we can simply use the standard library function isInfixOf. ;)
> import Data.List
> [7,8,9] `isInfixOf` [1,2,3,4,5,6,7,8,9]
True
I would recommend the isInfixOf function from Data.List. Finding the sequence is as easy as
let hasSequence = [7,8,9] `isInfixOf` list
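To get the "success" or "fail" output the question asks for, the check can be wrapped in a small function. This is only a sketch; the name report is hypothetical.

```haskell
import Data.List (isInfixOf)

-- Hypothetical helper: turns the isInfixOf check into the requested message.
report :: [Int] -> String
report list
  | [7, 8, 9] `isInfixOf` list = "success"  -- 7,8,9 occur consecutively
  | otherwise                  = "fail"
```

For example, putStrLn (report [1..10]) prints success, while report [7,0,8,9] gives "fail" because the three values are not adjacent.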
How about this?
elemIndex (7,8,9) $ zip3 list (tail list) (tail (tail list))
elemIndex is from Data.List.
If you don't care about what index it's at, you can use this
elem (7,8,9) $ zip3 list (tail list) (tail (tail list))
I'm not sure if you also wanted to count the cases where some elements appear in the given order, but with some other elements between them. Seems like the other answers only count the cases where [7, 8, 9] is a sublist of the original. If you want let's say [7,8,9] be found in [7,0,8,0,9,0], you could use
appears :: (Eq a) => [a] -> [a] -> Bool
appears [] _ = True
appears _ [] = False
appears ns@(n : ns') (h : hs')
| n == h = appears ns' hs'
| otherwise = appears ns hs'
Now
appears [7,8,9] [7,0,8,0,9] --> True
appears [7,8,9] [1..10] --> True
appears [3,8,9] [1..10] --> True
appears [10,8,9] [1..10] --> False

Resources