Capitalizing first letter of words while removing spaces (Haskell) - haskell

I'm just starting out in Haskell and this is like the third thing I'm writing, so, naturally, I'm finding myself a little stumped.
I'm trying to write a bit of code that will take a string, delete the spaces, and capitalize the first letter of each word after the first.
For example, if I input "this is a test", I would like to get back something like: "thisIsATest"
import qualified Data.Char as Char
toCaps :: String -> String
toCaps [] = []
toCaps xs = filter(/=' ') xs
toCaps (_:xs) = map Char.toUpper xs
I think the method I'm using is wrong. With my code in this order, I am able to remove all the spaces using the filter function, but nothing gets capitalized.
When I move the filter bit to the very end of the code, I am able to use the map Char.toUpper bit. When I map that function Char.toUpper, it just capitalizes everything "HISISATEST", for example.
I was trying to make use of an if function to say something similar to
if ' ' then map Char.toUpper xs else Char.toLower xs, but that didn't work out for me. I haven't utilized if in Haskell yet, and I don't think I'm doing it correctly. I also know using "xs" is wrong, but I'm not sure how to fix it.
Can anyone offer any pointers on this particular problem?

I think it might be better if you split the problem into smaller subproblems. First we can make a function that, for a given word will capitalize the first character. For camel case, we thus can implement this as:
import Data.Char(toUpper)
capWord :: String -> String
capWord "" = ""
capWord (c:cs) = toUpper c : cs
We can then use words to obtain the list of words:
toCaps :: String -> String
toCaps = go . words
  where go [] = ""
        go (w:ws) = concat (w : map capWord ws)
For example:
Prelude Data.Char> toCaps "this is a test"
"thisIsATest"
For Pascal case, we can make use of concatMap instead:
toCaps :: String -> String
toCaps = concatMap capWord . words

Inspired by this answer from Will Ness, here's a way to do it that avoids unnecessary Booleans and comparisons:
import qualified Data.Char as Char
toCaps :: String -> String
toCaps = flip (foldr go (const [])) id
  where go ' ' acc _ = acc Char.toUpper
        go x   acc f = f x : acc id
Or more understandably, but perhaps slightly less efficient:
import qualified Data.Char as Char
toCaps :: String -> String
toCaps = go id
  where go _ []       = []
        go _ (' ':xs) = go Char.toUpper xs
        go f (x  :xs) = f x : go id xs

There are a number of ways of doing it, but if I were trying to keep it as close to how you've set up your example, I might do something like:
import Data.Char (toUpper)
toCaps :: String -> String
toCaps [] = [] -- base case
toCaps (' ':c:cs) = toUpper c : toCaps cs -- throws out the space and capitalizes next letter
toCaps (c:cs) = c : toCaps cs -- anything else is left as is
This is just using basic recursion, dealing with a character (element of the list) at a time, but if you wanted to use higher-order functions such as map or filter that work on the entire list, then you would probably want to compose them (the way that Willem suggested is one way) and in that case you could probably do without using recursion at all.
It should be noted that this solution is brittle in the sense that it assumes the input string does not contain leading, trailing, or multiple consecutive spaces.
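To illustrate the brittleness mentioned above, here is a minimal sketch of the same character-at-a-time recursion that tolerates leading, trailing, and repeated spaces. The name toCapsRobust and the Boolean flag are my own additions for illustration, not part of the answer; the flag simply remembers whether a space has just been skipped.

```haskell
import Data.Char (toUpper)

-- Hypothetical variant of toCaps: skips leading spaces, then capitalizes
-- the first character that follows any run of one or more spaces.
toCapsRobust :: String -> String
toCapsRobust = go False . dropWhile (== ' ')
  where
    go _  []       = []
    go _  (' ':cs) = go True cs          -- drop the space, remember we saw it
    go up (c:cs)   = (if up then toUpper c else c) : go False cs
```

With this version, an input such as "  this  is a test " should come out the same as "this is a test" does.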

Inspired by Joseph Sible's answer, a solution using two mutually recursive functions that behave like coroutines:
import Data.Char
toCamelCase :: String -> String
toCamelCase [] = []
toCamelCase (' ': xs) = toPascalCase xs
toCamelCase (x : xs) = x : toCamelCase xs
toPascalCase :: String -> String
toPascalCase [] = []
toPascalCase (' ': xs) = toPascalCase xs
toPascalCase (x : xs) = toUpper x : toCamelCase xs
Be careful to not start the input string with a space, or you'll get the first word capitalized as well.

Related

How can I split a string on two conditions?

So basically I want to split my string on two conditions: where there is a space, or where a letter differs from the next one.
An example:
if I have this string ,"AAA ADDD DD", I want to split to this, ["AAA","A","DDD","DD"]
So I made this code:
sliceIt :: String -> [String]
sliceIt xs = words xs
But it only splits the initial string where a space exists.
How can I also split where a character is next to a different one?
Can this problem be solved more easily with recursion?
So you want to split by words and then group equal elements in each split. You have the functions for doing so,
import Data.List
sliceIt :: String -> [String]
sliceIt s = concatMap group $ words s
sliceItPointFree = concatMap group . words -- Point free notation. Same but cooler
split :: String -> [String]
split [] = []
split (' ':xs) = split xs
split (x:xs) = (takeWhile (== x) (x:xs)) : (split $ dropWhile (== x) (x:xs))
So this is a recursive definition where there are 2 cases:
If head is a space then ignore it.
Otherwise, take as many of the same characters as you can, then call the function on the remaining part of the string.
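The two cases above can also be written with span, which returns the matching prefix and the remainder in one pass instead of calling takeWhile and dropWhile separately. This is only a sketch; the name splitRuns is hypothetical and chosen to avoid clashing with split above.

```haskell
-- Hypothetical rewrite of split using span from the Prelude.
splitRuns :: String -> [String]
splitRuns []       = []
splitRuns (' ':xs) = splitRuns xs                   -- ignore spaces
splitRuns (x:xs)   = (x : same) : splitRuns rest    -- one run of equal characters
  where (same, rest) = span (== x) xs
```

For the example in the question, splitRuns "AAA ADDD DD" gives ["AAA","A","DDD","DD"].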

I need to convert this string into a list of Char

I'm learning Haskell. I'm reading a string from a text file and need to turn this string into a list of Char.
The input file is this:
Individuo A; TACGATCAAAGCT
Individuo B; AATCGCAT
Individuo C; TAAATCCGATCAAAGAGAGGACTTA
I need to convert this string
S1 = "AAACCGGTTAAACCCGGGG" in S1 =
["A","A","A","C","C","G","G","T","T","A","A","A","C","C","C","G","G","G","G"]
or S1 =
['A','A','A','C','C','G','G','T','T','A','A','A','C','C','C','G','G','G','G']
but they are separated by ";"
What should I do?
What can I do?
After getting two lists, I pass them to this code:
lcsList :: Eq a => [a] -> [a] -> [a]
lcsList [] _ = []
lcsList _ [] = []
lcsList (x:xs) (y:ys) = if x == y
  then x : lcsList xs ys
  else
    let lcs1 = lcsList (x:xs) ys
        lcs2 = lcsList xs (y:ys)
    in if (length lcs1) > (length lcs2)
         then lcs1
         else lcs2
A rough and ready way to split out each of those strings is with something like this - which you can try in ghci
let a = "Individuo A; TACGATCAAAGCT"
tail $ dropWhile (/= ' ') $ dropWhile (/= ';') a
which gives you:
"TACGATCAAAGCT"
And since a String is just a list of Char, this is the same as:
['T', 'A', 'C', 'G', ...
If your file consists of several lines, it is quite simple: you just need to skip everything until you find “;”. If your file consists of just one line, you’ll have to look for sequences’ beginnings and endings separately (hint: sequence ends with space). Write a recursive function to do the task, and use functions takeWhile, dropWhile.
A String is already a list of Char (it is even defined like this: type String = [Char]), so you don’t have to do anything else. If you need a list of Strings, where every String consists of just one char, then use map to wrap every char (once again, every String is a list, so you are allowed to use map on these). To wrap a char, there are three alternatives:
Use lambda function: map (\c -> [c]) s
Use operator section: map (:[]) s
Define a new function: wrap x = [x]
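Putting these hints together, a rough sketch for the multi-line file could look like the following. It assumes every line has the form "name; SEQUENCE" as in the question; the names sequenceOf, sequences, and singletons are made up for this illustration.

```haskell
import Data.Char (isSpace)

-- Hypothetical helper: keep what follows the first ';' on a line,
-- without the ';' itself or the whitespace right after it.
sequenceOf :: String -> String
sequenceOf = dropWhile isSpace . drop 1 . dropWhile (/= ';')

-- One sequence per input line; each result is already a [Char].
sequences :: String -> [String]
sequences = map sequenceOf . lines

-- Only needed if a list of one-character Strings is really wanted.
singletons :: String -> [String]
singletons = map (:[])
```

The contents read from the file (for example with readFile) can be passed to sequences, and any two of the resulting strings can go straight into lcsList.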
Good luck!

Complex pattern matching with strings

I have a list of strings that looks like this:
xs = ["xabbaua", "bbbaacv", "ggfeehhaa", "uyyttaccaa", "ibbatb"]
I would like to find only the strings in the list that contain a vowel followed by two b's followed by any character followed by a vowel. How are simple matches like this done in Haskell? Is there a better solution than regular expressions? Can anyone help me with an example? Thanks.
You could just use the classic filter function in conjunction with any regexp library. Your pattern is simple enough that this would work with any regexp library :
filter (=~ "[aeiou]bb.[aeiou]") xs
The confusing part of regexps in Haskell is that there is a very powerful generic API (in regex-base) that lets you use them in the same way across all the specific libraries and with whichever result type you could wish for (Bool, String, Int...). For basic usage it should mostly do what you mean. For your specific need, regex-posix should be sufficient (and it comes with the Haskell Platform, so there is normally no need to install it). So don't forget to import it:
import Text.Regex.Posix
This tutorial should show you the basics of the regex API if you have other needs. It is a bit outdated now, but the fundamentals remain the same; only details of regex-base have changed.
One approach would be to build a small pattern-matching language and to embed it in Haskell.
In your example, a pattern is basically a list of character specifications. Let's define a type of abstract characters the values of which will serve as such specifications,
data AbsChar = Exactly Char | Vowel | Any
together with an "interpreter" that tells us whether a character matches a specification:
(=?) :: AbsChar -> Char -> Bool
Exactly c' =? c = c == c'
Vowel =? c = c `elem` "aeiou"
Any =? c = True
For example, Vowel =? 'x' will produce False, while Vowel =? 'a' will produce True.
Then, indeed, a pattern is just a list of abstract characters:
type Pattern = [AbsChar]
Next, we write a function that tests whether the prefix of a string matches a given pattern:
matchesPrefix :: Pattern -> String -> Bool
matchesPrefix [] _ = True
matchesPrefix (a : as) (c : cs) = a =? c && matchesPrefix as cs
matchesPrefix _ _ = False
For example:
> matchesPrefix [Vowel, Exactly 'v'] "eva"
True
> matchesPrefix [Vowel, Exactly 'v'] "era"
False
As we do not want to restrict ourselves to matching prefixes, but rather match anywhere within a word, our next function matches the prefixes of every end segment of a string:
containsMatch :: Pattern -> String -> Bool
containsMatch pat = any (matchesPrefix pat) . tails
It uses the function tails which can be found in the module Data.List, but which we can, to make this explanation self-contained, easily define ourselves as well:
tails :: [a] -> [[a]]
tails [] = [[]]
tails l@(_ : xs) = l : tails xs
For example:
> tails "xabbaua"
["xabbaua","abbaua","bbaua","baua","aua","ua","a",""]
Now, finally, the function you were looking for, that selects all strings from a list that contain a matching segment, is written simply as:
select :: Pattern -> [String] -> [String]
select = filter . containsMatch
Let's test it on your example:
> let pat = [Vowel, Exactly 'b', Exactly 'b', Any, Vowel]
> select pat ["xabbaua", "bbbaacv", "ggfeehhaa", "uyyttaccaa", "ibbatb"]
["xabbaua"]
Well, you can try this function, although this may not be a best method:
elem' :: String -> String -> Bool
elem' p xs = any (p==) $ map (take $ length p) $ tails xs
Usage:
filter (elem' "bb") ["xxbbaua", "bbbaacv", "ggfeehhaa", "uyyttaccaa", "bbbaab"]
or
bbFilter = filter (elem' "bb")
Well if you're absolutely opposed to doing it with Regexs you could do it with just pattern matching and recursion, although it is ugly.
xs = ["xabbaua", "bbbaacv", "ggfeehhaa", "uyyttaccaa", "ibbatb"]
vowel = "aeiou"
filter' strs = filter matches strs
matches [] = False
matches str@(x:'b':'b':_:y:xs)
  | x `elem` vowel && y `elem` vowel = True
  | otherwise = matches $ tail str
matches (x:xs) = matches xs
Calling filter' xs will return ["xabbaua"] which I believe is the required result.
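As a variation on the same idea, the explicit recursion can be replaced by a list comprehension over tails: a pattern that fails to match in a generator is simply skipped. This is a sketch only; isVowel and hasMatch are names invented for the example.

```haskell
import Data.List (tails)

-- Hypothetical helper: the question only lists plain lowercase vowels.
isVowel :: Char -> Bool
isVowel c = c `elem` "aeiou"

-- True if some suffix starts with: vowel, 'b', 'b', any character, vowel.
-- Suffixes that do not fit the pattern are dropped by the comprehension.
hasMatch :: String -> Bool
hasMatch s = or [ isVowel x && isVowel y | (x:'b':'b':_:y:_) <- tails s ]
```

filter hasMatch applied to the list from the question keeps only "xabbaua".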

Haskell Assignment - direction needed to split a String into words

we started a paper on Haskell a few weeks ago and just received our first assignment. I'm aware that SO doesn't like homework questions, so I'm not going to ask how to do it. Instead, it would be very much appreciated if anyone could push me in the right direction with this. Seeing as it might not be a specific question, would it be more appropriate in a discussion / community wiki?
Question: Tokenize a String, that is: "Hello, World!" -> ["Hello", "World"]
Coming from a Java background, I have to forget everything about the usual way to go about this. The problem is that I am still very clueless with Haskell. This is what I've come up with:
module Main where
main :: IO()
main = do putStrLn "Type in a string:\n"
          x <- getLine
          putStrLn "The string entered was:"
          putStrLn x
          putStrLn "\n"
          print (tokenize x)
tokenize :: String -> [String]
tokenize [] = []
tokenize l = token l ++ tokenize l
token :: String -> String
token [] = []
token l = takeWhile (isAlphaNum) l
What would be the first glaring mistake?
Thank you.
The first glaring mistake is
tokenize l = token l ++ tokenize l
(++) :: [a] -> [a] -> [a] appends two lists of the same type. Since token :: String -> String (and type String = [Char]), the type of tokenize that is inferred from that line is tokenize :: String -> String.
You should use (:) :: a -> [a] -> [a] here.
The next mistake in that line is that in the recursive call, you pass the same input l once again, so you have an infinite recursion, always doing the same without change. You have to remove the first token (and a bit more) from the input for the argument to the recursive call.
Another problem is that your token supposes that the input begins with alphanumeric characters.
You also need a function that ensures that condition for what you pass to token.
This line results in an infinite list (which is OK, since Haskell is lazy, so the list only gets constructed "on demand"), because it is recurring with no change in the arguments:
tokenize l = token l ++ tokenize l
We can visualise what is happening when tokenize is called as:
tokenize l = token l ++ tokenize l
= token l ++ (token l ++ tokenize l)
= token l ++ (token l ++ (token l ++ tokenize l))
= ...
To stop this from happening, you need to change the argument to tokenize so that it recurs sensibly:
tokenize l = token l ++ tokenize <something goes here>
As others already pointed out your mistake, just a little hint: While you found already the very useful takeWhile function, you should have a look at span, as this could be even more helpful here.
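For readers who want to see where these hints lead, here is one possible sketch that combines them: skip non-alphanumeric characters first, use span to cut off one token, cons it with (:), and recurse on the remainder only. It is one way to write it under those assumptions, not necessarily what the assignment expects.

```haskell
import Data.Char (isAlphaNum)

-- Sketch: each recursive call receives a strictly shorter remainder,
-- so the recursion terminates.
tokenize :: String -> [String]
tokenize s =
  case dropWhile (not . isAlphaNum) s of
    []   -> []
    rest -> let (tok, rest') = span isAlphaNum rest
            in tok : tokenize rest'
```

With this definition, tokenize "Hello, World!" yields ["Hello","World"].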
This has something in it that feels similar to a parser monad. However, as you're a newcomer to Haskell, it's unlikely that you're in a position to understand how parsing monads work (or use them in your code) quite yet. To give you the basics, consider what you want:
tokenize :: String -> [String]
This takes a String, chomps it up into more pieces, and generates a list of strings corresponding to the words in the input string. How might we represent this? What we want to do is find a function that processes a single string, and at the first sign of whitespace, adds that string on to the sequence of words. But then you have to process what's left over. (I.e., the rest of the string.) For example, let's say you want to tokenize:
The brown fox jumped
You first pull out "The" and then continue processing " brown fox jumped" (note the space at the beginning of the second string). You will do this recursively, so naturally you will need a recursive function.
The natural solution that sticks out is to take something where you accumulate a set of strings you've tokenized so far, keep munching on the current input until you hit whitespace, then also accumulate what you've seen in the current string (this leads to an implementation where you're mostly consing stuff, and then occasionally reversing stuff).
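The accumulator approach described here might be sketched as follows. It splits on whitespace only, as in the "The brown fox jumped" example; the names tokenizeAcc, go, and flush are invented for this illustration. Both the finished words and the current word are built by consing, so each is reversed once when it is completed.

```haskell
import Data.Char (isSpace)

-- Sketch of the accumulate-and-reverse style described above.
tokenizeAcc :: String -> [String]
tokenizeAcc = go [] []
  where
    -- done: finished words, newest first; cur: current word, reversed
    go done cur [] = reverse (flush done cur)
    go done cur (c:cs)
      | isSpace c = go (flush done cur) [] cs
      | otherwise = go done (c : cur) cs
    -- Move the current word (if any) onto the list of finished words.
    flush done []  = done
    flush done cur = reverse cur : done
```

Applied to "The brown fox jumped", this gives ["The","brown","fox","jumped"].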
Your exercise seemed a bit challenging to me so I decided to solve it just for self-training. Here's what I came up with:
import Data.List
import Data.Maybe
splitByAnyOf yss xs =
  foldr (\ys acc -> concat $ map (splitBy ys) acc) [xs] yss
splitBy ys xs =
  case (precedingElements ys xs, succeedingElements ys xs) of
    (Just "", Just s) -> splitBy ys s
    (Just p, Just "") -> [p]
    (Just p, Just s) -> p : splitBy ys s
    _ -> [xs]
succeedingElements ys xs =
  fromMaybe Nothing . find isJust $ map (stripPrefix ys) $ tails xs
precedingElements ys xs =
  fromMaybe Nothing . find isJust $ map (stripSuffix ys) $ inits xs
  where
    stripSuffix ys xs =
      if ys `isSuffixOf` xs then Just $ take (length xs - length ys) xs
      else Nothing
main = do
  print $ splitBy "!" "Hello, World!"
  print $ splitBy ", " "Hello, World!"
  print $ splitByAnyOf [", ", "!"] "Hello, World!"
outputs:
["Hello, World"]
["Hello","World!"]
["Hello","World"]

haskell splitting a string

I have a string like this:
"Some sentence1. Some sentence2, Some sentence3, Some sentence4....."
and I would like to turn it into a list of string
["Some sentence1", "Some sentence2", "Some sentence3.".......]
so far this is what I have:
foo :: String -> [String]
foor [] = []
| x /= ',' = [[x]] ++ foo xs
| otherwise = [[x] ++ foo xs]
which doesn't compile, much less work.
Help!
There is no x in your pattern match. That is in large part why it neither compiles nor works. But your code also does not handle periods, so it won't work correctly even once it compiles, until you fix that.
Your function doesn't define x or xs. That is why it won't compile.
foo [] = []
foo(x:xs) = something
will get you a function that will iterate over the characters of the list. You probably want to look at the standard library some more, though: there is a function in there that makes this trivial. (Sorry for being vague; this seems like homework, so I am trying not to give it away.)
import Data.Char
import Data.List
sentences :: String -> [String]
sentences str =
  case break isPunctuation str of
    (s1, _:s2) -> s1 : sentences (dropWhile isSpace s2)
    (s1, _)    -> take 1 s1 >> [s1]

Resources