Split a string while keeping delimiters in Haskell

Basically I'm trying to split a String into a [String] and then concat the results back, while keeping the delimiters in the resulting list (even when they repeat in a row).
Something like the below kind of works, but the delimiters get crunched into one space instead of retaining all three spaces:
unwords . map (\x -> "|" ++ x ++"|") . words $ "foo   bar"
-- "|foo| |bar|"
Ideally I could get something like:
"|foo||   ||bar|" -- or
"|foo|   |bar|"
I just can't figure out how to preserve the delimiters. All the split functions I've seen drop the delimiters from the resulting lists. I could write one myself, but it seems like something that would be in a standard-ish library, and at this point I'm looking to learn more than the basics, which includes getting familiar with more idiomatic ways of doing things.
I think I'm looking for some function like:
splitWithDelim :: Char -> String -> [String]
splitWithDelim ' ' "foo   bar" -- ["foo", " ", " ", " ", "bar"]
or maybe it's best to use regexes here?

You can split a list while keeping the delimiters by using the keepDelimsL and keepDelimsR functions from the Data.List.Split module (in the split package), like this:
split (keepDelimsL $ oneOf "xyz") "aazbxyzcxd" == ["aa","zb","x","y","zc","xd"]
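If pulling in the split package is not an option, the splitWithDelim function imagined in the question can also be written by hand with break from the Prelude. This is only a minimal sketch: the name and type are taken from the question, while emitting every delimiter as its own one-character string and dropping empty chunks is an assumption about the desired behaviour. wrapPieces is a hypothetical helper added for illustration.

```haskell
-- Split on a single delimiter character, keeping every delimiter
-- as its own one-character element of the result.
-- Assumption: empty chunks between adjacent delimiters are dropped.
splitWithDelim :: Char -> String -> [String]
splitWithDelim d s = case break (== d) s of
  (chunk, [])       -> [chunk | not (null chunk)]
  (chunk, _ : rest) -> [chunk | not (null chunk)] ++ [[d]] ++ splitWithDelim d rest

-- Hypothetical helper: wrap every piece in pipes and glue them back together.
wrapPieces :: Char -> String -> String
wrapPieces d = concatMap (\x -> "|" ++ x ++ "|") . splitWithDelim d
```

With these definitions, splitWithDelim ' ' "foo   bar" evaluates to ["foo"," "," "," ","bar"], so all three spaces survive the round trip.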

Related

Writing newlines to files

How can I write a string that contains newlines ("\n") to a file so that each string is on a new line in the file?
I have an accumulator function that iterates over some data and incrementally constructs a string (containing some information) for each element of the data. I don't want to write to the file at every step, so I append the strings at each step instead. That way I can write the string in one go and limit the amount of IO.
Adding a newline to the string via str ++ "\n" doesn't work, hPrint h str will just print "\n" instead of starting on a new line.
I've tried accumulating a list of strings, instead of one big string, and iterating over the list and printing each string via hPrint. This works for the newlines but it also prints the quotation marks around each string on every line.
Don't use hPrint to write the strings to the file. Just like regular print it outputs the result of show, which produces a debugging-friendly version of the string with control characters and line endings escaped (and the surrounding quotes).
Use hPutStr or hPutStrLn instead. They will write the string to the file as-is (well, the latter adds a newline at the end).
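To make the difference concrete, here is a minimal sketch. The names printed and writeLines are made up for this example; show, unlines, withFile and hPutStr are the standard functions from the Prelude and System.IO.

```haskell
import System.IO (IOMode (WriteMode), hPutStr, withFile)

-- What hPrint would write for a string: the output of show,
-- i.e. surrounded by quotes and with the newline escaped.
printed :: String
printed = show "first\n"

-- Hypothetical helper: write all accumulated lines in a single call.
-- unlines terminates every element with '\n'.
writeLines :: FilePath -> [String] -> IO ()
writeLines path ls = withFile path WriteMode $ \h -> hPutStr h (unlines ls)
```

Here printed contains the quote characters and a literal backslash followed by n, which is exactly what ended up in the file with hPrint, whereas writeLines puts each string on its own line.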
The idiomatic solution to what you are trying to do is probably to simply collect the resulting strings in a list. Then use the Prelude function unlines, which has the signature unlines :: [String] -> String and takes care of the \n business for you.
Then, writing the string to disk can be done with the help of writeFile, which has the signature writeFile :: FilePath -> String -> IO ().
Haskell is lazy. As such, it sometimes helps to think of Haskell lists as enumerators (like C#'s IEnumerable). Here this means that computing line by line, building the string manually, and writing it out line by line is not really necessary. Just as readFile works lazily, so does e.g. lines. In other words, you gain nothing by trying to "optimize" code which, in its natural form, looks similar to this:
main = do
  input <- readFile "infile"
  writeFile "outfile" ((unlines . process) (lines input))
  where
    process inputLines = -- whatever you do
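For a version that actually compiles, the placeholder has to be filled with something. In the sketch below, process is a made-up example that numbers each line; transformText and transformFile are likewise hypothetical names, and only lines, unlines, readFile and writeFile come from the Prelude.

```haskell
-- Made-up stand-in for "whatever you do": prefix each line with its number.
process :: [String] -> [String]
process = zipWith (\n l -> show n ++ ": " ++ l) [1 :: Int ..]

-- The pure part: split into lines, transform them, join with newlines again.
transformText :: String -> String
transformText = unlines . process . lines

-- Read the input file, transform its contents, write the result.
transformFile :: FilePath -> FilePath -> IO ()
transformFile inPath outPath = readFile inPath >>= writeFile outPath . transformText
```

Keeping the transformation pure (transformText) and doing the IO only at the edges also makes the interesting part easy to try out in ghci without touching any files.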

Creating an array of possible string variations

I'm trying to figure out how I would create variations of a string, by replacing one character at a time in the string with a different character from another array.
For example:
variations = "abc"
getVariations "xyz" variations
Should return:
["xbc", "ybc", "zbc", "axc", "ayc", "azc", "abx", "aby", "abz"]
I'm not quite sure how to go about this. I tried iterating through the string, and then using list comprehension to add the possible characters but I end up losing characters.
[c ++ xs | c <- splitOn "" variations]
Where xs is the tail of the string.
Would someone be able to point me in the right direction please?
You can define getVariations replacements input recursively:
if input is empty, the result is ...
if input is (a:as), combine the results of:
  1. replacing a with each character from replacements
  2. keeping a the same and performing getVariations on as
This means the definition of getVariations could look like:
getVariations replacements [] = ...
getVariations replacements (a:as) = ...#1... ++ ...#2...
It might also help to decide what the type of getVariations is:
getVariations :: String -> String -> ???
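One way to fill in the blanks, offered as a sketch rather than the only possible answer: it assumes the result type is [String], that the empty string has no variations, and that the unchanged a is put back in front of every variation of as.

```haskell
-- First argument: the replacement characters; second: the input string.
getVariations :: String -> String -> [String]
getVariations _ [] = []
getVariations replacements (a : as) =
  -- part 1: replace the head with each replacement, keep the tail as it is
  [r : as | r <- replacements]
    -- part 2: keep the head, vary exactly one character of the tail
    ++ map (a :) (getVariations replacements as)
```

With this definition, getVariations "xyz" "abc" yields ["xbc","ybc","zbc","axc","ayc","azc","abx","aby","abz"], the list asked for in the question.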

Filtering the words ending with "ed" or "ing" using Haskell

Hi, I am new to Haskell and functional programming.
I want to pass in a string and find the words ending with "ed" or "ing".
E.g., if the string is "he is playing and he played well",
the answer should be: playing, played
Does anyone know how to do this in Haskell?
You can build this using standard Haskell functions. Start by importing Data.List:
import Data.List
Use isSuffixOf to determine if one list ends with another. Below endings could be ["ed","ing"] and w would be the word you're testing, such as "played".
hasEnding endings w = any (`isSuffixOf` w) endings
Assuming you have split the string into a list of individual words (ws below), use filter to eliminate the words you don't want:
wordsWithEndings endings ws = filter (hasEnding endings) ws
Use words to get the list of words from the original string. Use intercalate to join the filtered words back into the final comma-separated string (or leave this off if you want the result as a list of words). Use . to chain these functions together.
wordsEndingEdOrIng ws = intercalate ", " . wordsWithEndings ["ed","ing"] . words $ ws
And you're done.
wordsEndingEdOrIng "he is playing and he played well"
If you're typing into an older ghci (before GHC 8.0), put let in front of each of the function definitions (all lines but the last one); newer versions accept the definitions as they are.
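Put together as a source file, the pieces above might look like this. The type signatures are additions that the answer does not spell out.

```haskell
import Data.List (intercalate, isSuffixOf)

-- True if the word ends with at least one of the given endings.
hasEnding :: [String] -> String -> Bool
hasEnding endings w = any (`isSuffixOf` w) endings

-- Keep only the words that have one of the endings.
wordsWithEndings :: [String] -> [String] -> [String]
wordsWithEndings endings ws = filter (hasEnding endings) ws

-- Split into words, filter, and join the survivors with ", ".
wordsEndingEdOrIng :: String -> String
wordsEndingEdOrIng ws = intercalate ", " . wordsWithEndings ["ed", "ing"] . words $ ws
```

For the sentence from the question, wordsEndingEdOrIng "he is playing and he played well" evaluates to "playing, played".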
contain w end = take (length end) (reverse w) == reverse end
-- findAll endings text = the words of text that end with one of the endings
findAll :: [String] -> String -> [String]
findAll ends txt = filter (\w -> any (contain w) ends) (words txt)
main = getLine >>= print . findAll ["ing","ed"]

How to add characters to the output of a permutation in Haskell?

I want to write a Haskell application that generates all possibilities from a handful of characters. That works with the permutations function. But now I want to add a prefix and a suffix to every word in the output list. Like:
Input:
combinations "prefix" "suffix" "randomletters"
Output (something like this)
["prefixrandomletters", "prefixrandomletters", "prefixrandomletters", "prefixrandomletters", "suffixrandomletters", "suffixrandomletters", "suffixrandomletters", "suffixrandomletters", "suffixrandomletters"]
Background of the application:
Like scrabble. First the prefix is like 2 letters that the word can start with. Then 2 letters that the word can end with. Then the letters you have in your hand.
You can map a function that adds the prefix:
combinations pre suf letters = prefixed ++ suffixed
  where
    perms = permutations letters
    prefixed = map (\x -> pre ++ x) $ perms
    suffixed = ...
The way to solve this is to break the problem down, as you have started doing:
create a function to give every permutation (permutations)
create functions to add the prefix & suffix (\x -> pre ++ x etc)
apply these functions to every permutation (map) to create two lists of words
combine the two lists of words (++)
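The steps above can be sketched as a complete function. Note the assumption: the answer leaves suffixed elided, and this sketch guesses that the suffix is appended to each permutation (as the Scrabble description suggests), which the question's sample output does not confirm.

```haskell
import Data.List (permutations)

combinations :: String -> String -> String -> [String]
combinations pre suf letters = prefixed ++ suffixed
  where
    -- every ordering of the letters in hand
    perms = permutations letters
    -- words that start with the prefix
    prefixed = map (pre ++) perms
    -- ASSUMPTION: words that end with the suffix (not spelled out in the answer)
    suffixed = map (++ suf) perms
```

For n distinct letters this produces 2 * n! words, so it is only practical for a small hand of letters.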

Wrapping strings, but not substrings in quotes, using R

This question is related to my question about Roxygen.
I want to write a new function that does word wrapping of strings, similar to strwrap or stringr::str_wrap, but with the following twist: Any elements (substrings) in the string that are enclosed in quotes must not be allowed to wrap.
So, for example, using the following sample data
test <- "function(x=123456789, y=\"This is a long string argument\")"
cat(test)
function(x=123456789, y="This is a long string argument")
strwrap(test, width=40)
[1] "function(x=123456789, y=\"This is a long"
[2] "string argument\")"
I want the desired output of a newWrapFunction(x, width=40, ...) to be:
desired <- c("function(x=123456789, ", "y=\"This is a long string argument\")")
desired
[1] "function(x=123456789, "
[2] "y=\"This is a long string argument\")"
identical(desired, newWrapFunction(test, width=40))
[1] TRUE
Can you think of a way to do this?
PS. If you can help me solve this, I will propose this code as a patch to roxygen2. I have identified where this patch should be applied and will acknowledge your contribution.
Here's what I did to get strwrap so it would not break single quoted sections on spaces:
A) Pre-process the "even" sections after splitting by the single-quotes by substituting "~|~" for the spaces:
Define new function strwrapqt
....
zz <- strsplit(x, "\'") # will be only working on even numbered sections
for (i in seq_along(zz)) {
  for (evens in seq(2, length(zz[[i]]), by=2)) {
    zz[[i]][evens] <- gsub("[ ]", "~|~", zz[[i]][evens])
  }
}
zz <- unlist(zz)
.... insert just before
z <- lapply(strsplit) ...........
Then at the end replace all the "~|~" with spaces. It might be necessary to do a lot more thinking about the other sorts of whitespace "events" to get a fully regular treatment.
....
y <- gsub("~\\|~", " ", y)
....
Edit: Tested @joran's suggestion. Matching single and double quotes would be a difficult task with the methods I am using, but if one were willing to treat any quote as an equally valid separator, one could just use zz <- strsplit(x, "\'|\"") as the splitting criterion in the code above.
