Haskell how to remove excessive spaces in a String - haskell

Trying to create a function that removes all double/triple etc spaces in a
string (and merges them into one single space)
So far I have been able to get double spaces to remove, but not sure how to go about triple and more.
i.e. "a  b   c    d e" -> "a b c d e"
formatSpace :: String -> String
formatSpace [] = []
formatSpace (' ':' ':xs) = ' ': formatSpace xs
formatSpace (x:xs) = x: formatSpace xs
Thought about trying to turn all spaces into say '-' and then turn all those into a single space.
I've been able to remove leading and trailing whitespace, but can't do this one.

I'd do it like this:
formatSpace :: String -> String
formatSpace = foldr go ""
  where
    go x acc = x : if x == ' ' then dropWhile (' ' ==) acc else acc
This is maximally lazy and doesn't create any unnecessary intermediate data structures.
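To make the difference concrete, here is a small sketch comparing the two-space pattern from the question with the fold above. The name formatSpaceNaive is made up for this comparison and is not part of the original post; the example inputs are assumptions too.

```haskell
-- The question's version, renamed (hypothetical name) for comparison.
-- It consumes spaces two at a time, so a third space slips through.
formatSpaceNaive :: String -> String
formatSpaceNaive [] = []
formatSpaceNaive (' ':' ':xs) = ' ' : formatSpaceNaive xs
formatSpaceNaive (x:xs) = x : formatSpaceNaive xs

-- The fold from the answer: after emitting a space, drop any spaces
-- sitting at the front of the already-processed rest.
formatSpace :: String -> String
formatSpace = foldr go ""
  where
    go x acc = x : if x == ' ' then dropWhile (' ' ==) acc else acc

-- formatSpaceNaive "a   b"            == "a  b"  (three spaces become two)
-- formatSpace      "a   b"            == "a b"
-- take 7 (formatSpace (cycle "a  ")) == "a a a a"
```

The last line is what "lazy" buys here: the fold produces output for an infinite input as long as only a finite prefix is demanded.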

Related

Converts a double space into single one anywhere in a String in Haskell

I've been trying to complete a function that converts
double space in a String into a single space in Haskell.
normaliseSpace:: String -> String
normaliseSpace (x:y:xs) = if x == ' ' && y == ' ' then y:xs
                          else xs
The problem with my code is that it only converts double spaces at the beginning of a String. I suppose it has something to do with pattern matching, but as I am completely new to Haskell I really don't have an idea how to do it. Any help will be appreciated!
This happens because y:xs and xs do not recurse on the rest of the string, so only the first two characters are ever inspected.
You should therefore call normaliseSpace on the tail xs. For example:
normaliseSpace:: String -> String
normaliseSpace "" = ""
normaliseSpace (' ' : ' ' : xs) = ' ' : normaliseSpace xs
normaliseSpace (x:xs) = x : normaliseSpace xs
Note that you need to add a pattern for the empty string (list) as well, since otherwise the recursion will eventually reach the end of the list and raise an error because there is no clause that can "fire".
If you want to reduce a sequence of spaces (two or more) to one, then we even need to pass ' ' : xs through normaliseSpace, as @leftroundabout says:
normaliseSpace :: String -> String
normaliseSpace "" = ""
normaliseSpace (' ' : ' ' : xs) = normaliseSpace (' ':xs)
normaliseSpace (x:xs) = x : normaliseSpace xs
We can use an as-pattern here, as @JosephSible suggests:
normaliseSpace :: String -> String
normaliseSpace "" = ""
normaliseSpace (' ' : xs@(' ' : _)) = normaliseSpace xs
normaliseSpace (x:xs) = x : normaliseSpace xs
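One thing worth checking when weighing this explicit recursion against a words-based one-liner such as unwords . words is that the two do not behave identically. The sketch below is only an illustration; the name normSpaces and the example inputs are assumptions chosen for the comparison.

```haskell
-- Explicit recursion with an as-pattern: collapses runs of ' ' only
-- and keeps a single leading or trailing space.
normaliseSpace :: String -> String
normaliseSpace "" = ""
normaliseSpace (' ' : xs@(' ' : _)) = normaliseSpace xs
normaliseSpace (x : xs) = x : normaliseSpace xs

-- words splits on any whitespace and discards empty pieces, so
-- leading/trailing whitespace, tabs and newlines disappear as well.
normSpaces :: String -> String
normSpaces = unwords . words

-- normaliseSpace "  hello   world  " == " hello world "
-- normSpaces     "  hello   world  " == "hello world"
-- normSpaces     "a\t\nb"            == "a b"
-- normaliseSpace "a\t\nb"            == "a\t\nb"
```

Which behaviour is wanted depends on whether tabs, newlines and the ends of the string should be touched at all.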
Maybe you can simply do
*Main> let normSpaces = unwords . words
*Main> normSpaces "hello     world"
"hello world"

Insert space after every punctuation sign in a String Haskell

I have this function that checks if a character is one of these punctuation signs.
checkpunctuation:: Char -> Bool
checkpunctuation c = c `elem` ['.', ',', '?', '!', ':', ';', '(', ')']
I have to write another function that adds a space after every punctuation sign:
format :: String -> String
I know how to add a space after a given number of characters but don't know how to add one after specific characters.
Simple recursive option:
format :: String -> String
format [] = []
format (x:xs) | checkpunctuation x = x : ' ' : format xs
              | otherwise = x : format xs
Another option is to use foldr with a helper function:
helper :: Char -> String -> String
helper x xs | checkpunctuation x = x : ' ' : xs
            | otherwise = x : xs
The helper checks whether the current character x is a punctuation sign. If so, it inserts a space after it; otherwise it does not.
We then define format as:
format :: String -> String
format = foldr helper []
A sample call:
*Main> format "Hello?Goodbye!You say goodbye!!(and I say Hello)"
"Hello? Goodbye! You say goodbye! ! ( and I say Hello) "
This function works also on "infinite strings":
*Main> take 50 $ format $ cycle "Hello?Goodbye!"
"Hello? Goodbye! Hello? Goodbye! Hello? Goodbye! He"
So although we feed it a string that keeps cycle-ing, and thus never ends, we can derive the first 50 characters of the result.
There's probably a more elegant way to do it, but
format :: String -> String
format s = concat [if (checkpunctuation c) then (c:" ") else [c] | c <- s]
will work (thanks, @Shou Ya!).
Edit based on comment
To count the total length of post-formatted punctuation characters, you can use
sumLength :: [String] -> Int
sumLength strings = 2 * (sum $ fmap length (fmap (filter checkpunctuation) strings))
as it is twice the number of punctuation characters: after formatting, each one is followed by an inserted space.
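As a quick sanity check of that factor of two, here is a minimal sketch. It assumes the checkpunctuation predicate from the question and a format written with concatMap; the sample strings are made up for illustration.

```haskell
checkpunctuation :: Char -> Bool
checkpunctuation c = c `elem` ['.', ',', '?', '!', ':', ';', '(', ')']

-- Each punctuation character is emitted together with one extra space.
format :: String -> String
format = concatMap (\c -> if checkpunctuation c then [c, ' '] else [c])

-- Twice the number of punctuation characters across all strings.
sumLength :: [String] -> Int
sumLength strings = 2 * sum (map (length . filter checkpunctuation) strings)

-- "Hi!" has one punctuation character and "a,b." has two, so:
-- sumLength ["Hi!", "a,b."] == 6
-- format "a,b."             == "a, b. "
```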

Haskell replace space with Char

I'm trying to come up with a function that will replace all the blank spaces in a string with "%50" or similar. I know I'm messing up something with my types but can't seem to figure it out. I have been trying the following (yes, I have imported Data.Char):
newLine :: String -> String
newLine xs = if x `elem` " " then "%50"
I also tried the if then else statement but really didn't know what to do with the else, so I figured I would just lowercase all the letters with
newLine xs = [if x `elem` ' ' then '%50' else toLower x | x<-xs]
I would like the else branch to simply do nothing, but I have searched and found no way of doing that, so I figured that if everything was lowercase it wouldn't really matter. I'm just trying to get this to work.
Try a simple solution:
newLine :: String -> String
newLine "" = ""
newLine (' ':xs) = '%':'5':'0': newLine xs
newLine (x:xs) = x: newLine xs
or use library function
You're running into type mismatch issues. The approach you're currently using would work if you were replacing a Char with another Char. For example, to replace spaces with asterisks:
newLine xs = [if x == ' ' then '*' else toLower x | x<-xs]
Or if you wanted to replace both spaces and newlines with asterisks, you could use the elem function. But note that the elem function takes a list (or a String, which is the same as [Char]). In your example, you were trying to pass it a single element, ' '. This should work:
newLine xs = [if x `elem` " \n" then '*' else toLower x | x<-xs]
However, you want to replace a Char with a String ([Char]). So you need a different approach. The solution suggested by viorior looks good to me.
Well, the list comprehension is almost correct. Problem is:
"%50" is not a valid character literal, so you can't have '%50'. If you actually mean the three characters %, 5 and 0, it needs to be a String instead.
' ' is a correct character literal, but the character x can't be element of another char. You certainly mean simply x == ' '.
Now that would suggest the solution
[if x == ' ' then "%50" else toLower x | x<-xs]
but this doesn't quite work because you're mixing strings ("%50") and single-characters in the same list. That can easily be fixed though, by "promoting" x to a single-char string:
[if x == ' ' then "%50" else [toLower x] | x<-xs]
The result has then type [String], which can be "flattened" to a single string with the prelude concat function.
concat [if x == ' ' then "%50" else [toLower x] | x<-xs]
An alternative way of writing this is
concatMap (\x -> if x == ' ' then "%50" else [toLower x]) xs
or – exactly the same with more general infix operators
xs >>= \x -> if x == ' ' then "%50" else [toLower x]
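The three spellings above can be checked against each other with a small sketch. The function names viaConcat, viaConcatMap and viaBind are made up for this comparison, and "%50" is kept only because it is the string used in the question.

```haskell
import Data.Char (toLower)

viaConcat, viaConcatMap, viaBind :: String -> String

-- List comprehension producing a [String], flattened with concat.
viaConcat xs = concat [if x == ' ' then "%50" else [toLower x] | x <- xs]

-- The same mapping and flattening in one step.
viaConcatMap xs = concatMap (\x -> if x == ' ' then "%50" else [toLower x]) xs

-- For lists, (>>=) is concatMap with its arguments flipped.
viaBind xs = xs >>= \x -> if x == ' ' then "%50" else [toLower x]

-- All three map "Hello World" to "hello%50world".
```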
To replace characters with possibly longer strings, one can follow this approach:
-- replace single characters
replace :: Char -> String
replace ' ' = "%50"
replace '+' = "Hello"
replace c | isAlpha c = someStringFunctionOf c
replace _ = "DEFAULT"
-- extend to strings
replaceString :: String -> String
replaceString s = concat (map replace s)
The last line can also be written as
replaceString s = concatMap replace s
or even
replaceString s = s >>= replace
or even
replaceString = (>>= replace)
import Data.List
newLine :: String -> String
newLine = intercalate "%50" . words
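Note that the words-based version is not a drop-in replacement for the character-by-character ones. A rough sketch of the difference follows; the names perChar and viaWords and the sample input are assumptions for illustration.

```haskell
import Data.List (intercalate)

-- Replaces every single space, one by one.
perChar :: String -> String
perChar = concatMap (\c -> if c == ' ' then "%50" else [c])

-- Splits into words first, so runs of whitespace collapse into one
-- separator and leading/trailing whitespace is dropped.
viaWords :: String -> String
viaWords = intercalate "%50" . words

-- perChar  " a  b " == "%50a%50%50b%50"
-- viaWords " a  b " == "a%50b"
```

For inputs with exactly one space between words and none at the ends, both give the same result.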

Having trouble with isUpper function in use

Is it OK to write the otherwise part this way? The function should lower the uppercase letters and put the space in front. It keeps giving an error.
functionl s
  | s==[] = error "empty"
  | otherwise = [ if isUpper c then (" " ++ toLower c) else c | c <-read s::[Char] ]
First, note that (" " ++ toLower c) would have type String ([Char]) if it were written correctly, but it isn't: toLower c is a single Char, so it cannot be appended with ++. I'll show a fix below.
But before that, note that in this specific list comprehension, you have else c which is a single Char.
Your return types must match.
This might be a suitable replacement: concat [ if isUpper c then " " ++ [toLower c] else [c] | c <- s ]
Your list comprehension is almost right, as @Arnon has shown, but you could definitely implement this function more easily using recursion:
-- A descriptive name and a type signature help
-- tell other programmers what this function does
camelCaseToWords :: String -> String
camelCaseToWords [] = []
camelCaseToWords (c:cs)
  | isUpper c = ' ' : toLower c : camelCaseToWords cs
  | otherwise = c : camelCaseToWords cs
Now, this pattern can be abstracted to use a fold, which is Haskell's equivalent of a basic for-loop:
camelCaseToWords cs = foldr replacer [] cs
  where
    replacer c xs
      | isUpper c = ' ' : toLower c : xs
      | otherwise = c : xs
Here each step of the iteration is performed by replacer, which takes the current character c, an accumulated value xs and returns a new value to be used in the next iteration. The fold is seeded with an initial value of [], and then performed over the entire string.
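To see the fold at work, here is a sketch that puts the recursive and the foldr version side by side and unrolls one small input by hand. The primed name camelCaseToWords' is only there so both definitions can live in one file; the inputs are illustrative.

```haskell
import Data.Char (isUpper, toLower)

-- Explicit recursion.
camelCaseToWords :: String -> String
camelCaseToWords [] = []
camelCaseToWords (c:cs)
  | isUpper c = ' ' : toLower c : camelCaseToWords cs
  | otherwise = c : camelCaseToWords cs

-- The same logic as a right fold (primed name is hypothetical).
camelCaseToWords' :: String -> String
camelCaseToWords' = foldr replacer []
  where
    replacer c xs
      | isUpper c = ' ' : toLower c : xs
      | otherwise = c : xs

-- Unrolling the fold on "aBc":
--   foldr replacer [] "aBc"
--     = replacer 'a' (replacer 'B' (replacer 'c' []))
--     = replacer 'a' (replacer 'B' "c")
--     = replacer 'a' " bc"
--     = "a bc"
```

Both versions also put a space in front of a leading capital, so "HelloWorld" becomes " hello world".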

How do I replace space characters in a string with "%20"?

I wanted to write a Haskell function that takes a string, and replaces any space characters with the special code %20. For example:
sanitize "http://cs.edu/my homepage/I love spaces.html"
-- "http://cs.edu/my%20homepage/I%20love%20spaces.html"
I am thinking of using the concat function, so I can concatenate a list of lists into a plain list.
The higher-order function you are looking for is
concatMap :: (a -> [b]) -> [a] -> [b]
In your case, choosing a ~ Char, b ~ Char (and observing that String is just a type synonym for [Char]), we get
concatMap :: (Char -> String) -> String -> String
So once you write a function
escape :: Char -> String
escape ' ' = "%20"
escape c = [c]
you can lift that to work over strings by just writing
sanitize :: String -> String
sanitize = concatMap escape
Using a comprehension also works, as follows,
changer :: [Char] -> [Char]
changer xs = [ c | v <- xs , c <- if (v == ' ') then "%20" else [v] ]
changer :: [Char] -> [Char] -> [Char]
changer [] res = res
changer (x:xs) res = changer xs (res ++ (if x == ' ' then "%20" else [x]))
sanitize :: [Char] -> [Char]
sanitize xs = changer xs ""
main = print $ sanitize "http://cs.edu/my homepage/I love spaces.html"
-- "http://cs.edu/my%20homepage/I%20love%20spaces.html"
The purpose of sanitize function is to just invoke changer, which does the actual work. Now, changer recursively calls itself, till the current string is exhausted.
changer xs (res ++ (if x == ' ' then "%20" else [x]))
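It takes the first character x and checks whether it is equal to ' '. If so, it yields "%20"; otherwise it yields the character itself as a one-element string, which we then concatenate with the accumulated string.
Note: This may not be the optimal solution, since each res ++ ... copies the accumulated string, which makes the whole run quadratic in the length of the input.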
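You can use the intercalate function from the Data.List module. It intersperses the given separator between the lists, then concatenates the result. Note that words splits on runs of whitespace, so consecutive spaces collapse into a single %20 and leading or trailing spaces are dropped.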
sanitize = intercalate "%20" . words
or using pattern matching:
sanitize [] = []
sanitize (x:xs) = go x xs
  where
    go ' ' [] = "%20"
    go y [] = [y]
    go ' ' (x:xs) = '%':'2':'0': go x xs
    go y (x:xs) = y : go x xs
Another expression of Shanth's pattern-matching approach:
sanitize = foldr go []
  where
    go ' ' r = '%':'2':'0':r
    go c r = c:r
