Haskell: splitting a string

I have a string like this:
"Some sentence1. Some sentence2, Some sentence3, Some sentence4....."
and I would like to turn it into a list of strings:
["Some sentence1", "Some sentence2", "Some sentence3.".......]
So far this is what I have:
foo :: String -> [String]
foor [] = []
| x /= ',' = [[x]] ++ foo xs
| otherwise = [[x] ++ foo xs]
which doesn't compile, much less work.
Help!

There is no x in your pattern match. That is in large part why it neither compiles nor works. Your code also does not handle periods, so even once it compiles it won't work correctly until you fix that.

Your function doesn't define x or xs. That is why it won't compile.
foo [] = []
foo(x:xs) = something
will get you a function that iterates over the characters of the string. You probably want to look at the standard library some more, though: there is a function in there that makes this trivial. (Sorry for being vague; this seems like homework, so I am trying not to give it away.)

import Data.Char
import Data.List

sentences :: String -> [String]
sentences str =
  case break isPunctuation str of
    (s1, _:s2) -> s1 : sentences (dropWhile isSpace s2)
    -- no punctuation left: [] if s1 is empty, [s1] otherwise
    (s1, _)    -> take 1 s1 >> [s1]
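One thing to watch with the function above: on input that ends in a run of periods, like the sample string in the question, it appears to produce an empty string for each extra period. A minimal sketch of a variant that skips empty pieces follows; the name sentencesNonEmpty is hypothetical and not from the answer.

```haskell
import Data.Char (isPunctuation, isSpace)

-- Hypothetical variant: split on punctuation, strip leading spaces
-- from each piece, and never emit an empty piece.
sentencesNonEmpty :: String -> [String]
sentencesNonEmpty str =
  case break isPunctuation (dropWhile isSpace str) of
    ("", [])       -> []                            -- nothing left
    ("", _ : rest) -> sentencesNonEmpty rest        -- skip a lone punctuation mark
    (s, rest)      -> s : sentencesNonEmpty (drop 1 rest)
```

It still treats every punctuation character as a separator, so an apostrophe inside a sentence would split it too.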

Related

Capitalizing first letter of words while removing spaces (Haskell)

I'm just starting out in Haskell and this is like the third thing I'm writing, so, naturally, I'm finding myself a little stumped.
I'm trying to write a bit of code that will take a string, delete the spaces, and capitalize the first letter of each word after the first.
For example, if I input "this is a test", I would like to get back something like: "thisIsATest"
import qualified Data.Char as Char
toCaps :: String -> String
toCaps [] = []
toCaps xs = filter(/=' ') xs
toCaps (_:xs) = map Char.toUpper xs
I think the method I'm using is wrong. With my code in this order, I am able to remove all the spaces using the filter function, but nothing becomes capitalized.
When I move the filter bit to the very end of the code, I am able to use the map Char.toUpper bit. When I map that function Char.toUpper, it just capitalizes everything "HISISATEST", for example.
I was trying to make use of an if function to say something similar to
if ' ' then map Char.toUpper xs else Char.toLower xs, but that didn't work out for me. I haven't utilized if in Haskell yet, and I don't think I'm doing it correctly. I also know using "xs" is wrong, but I'm not sure how to fix it.
Can anyone offer any pointers on this particular problem?
I think it might be better to split the problem into smaller subproblems. First we can write a function that capitalizes the first character of a given word:
import Data.Char(toUpper)
capWord :: String -> String
capWord "" = ""
capWord (c:cs) = toUpper c : cs
We can then use words to obtain the list of words:
toCaps :: String -> String
toCaps = go . words
  where go [] = ""
        go (w:ws) = concat (w : map capWord ws)
For example:
Prelude Data.Char> toCaps "this is a test"
"thisIsATest"
For Pascal case, we can make use of concatMap instead:
toCaps :: String -> String
toCaps = concatMap capWord . words
Inspired by this answer from Will Ness, here's a way to do it that avoids unnecessary Booleans and comparisons:
import qualified Data.Char as Char
toCaps :: String -> String
toCaps = flip (foldr go (const [])) id
  where go ' ' acc _ = acc Char.toUpper
        go x   acc f = f x : acc id
Or more understandably, but perhaps slightly less efficient:
import qualified Data.Char as Char
toCaps :: String -> String
toCaps = go id
  where go _ []       = []
        go _ (' ':xs) = go Char.toUpper xs
        go f (x  :xs) = f x : go id xs
There are a number of ways of doing it, but if I were trying to keep it as close to how you've set up your example, I might do something like:
import Data.Char (toUpper)
toCaps :: String -> String
toCaps [] = [] -- base case
toCaps (' ':c:cs) = toUpper c : toCaps cs -- throws out the space and capitalizes next letter
toCaps (c:cs) = c : toCaps cs -- anything else is left as is
This is just using basic recursion, dealing with a character (element of the list) at a time, but if you wanted to use higher-order functions such as map or filter that work on the entire list, then you would probably want to compose them (the way that Willem suggested is one way) and in that case you could probably do without using recursion at all.
It should be noted that this solution is brittle in the sense that it assumes the input string does not contain leading, trailing, or multiple consecutive spaces.
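To illustrate the composed, non-recursive style mentioned above, here is a small sketch built on words, which discards leading, trailing, and repeated spaces, so the brittleness noted here does not arise. The names toCamel and capitalize are hypothetical.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: camel-case a space-separated phrase.
-- `words` splits on runs of whitespace, so extra spaces are harmless.
toCamel :: String -> String
toCamel s = case words s of
  []       -> ""
  (w : ws) -> w ++ concatMap capitalize ws
  where
    capitalize ""       = ""
    capitalize (c : cs) = toUpper c : cs
```

For example, toCamel "  this  is a test " evaluates to "thisIsATest".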
Inspired by Joseph Sible's answer, a solution with two mutually recursive functions that hand control back and forth like coroutines:
import Data.Char
toCamelCase :: String -> String
toCamelCase [] = []
toCamelCase (' ': xs) = toPascalCase xs
toCamelCase (x : xs) = x : toCamelCase xs
toPascalCase :: String -> String
toPascalCase [] = []
toPascalCase (' ': xs) = toPascalCase xs
toPascalCase (x : xs) = toUpper x : toCamelCase xs
Be careful to not start the input string with a space, or you'll get the first word capitalized as well.

concatenation of string within lists

I have to concatenate two strings given as input into one single string and put it in a list as output:
type Language = [String]
cat :: Language -> Language -> Language
cat l1 l2 =
  case l1 of
    [""] -> l2
    (x:xs) -> case l2 of
      [""] -> l1
      (y:ys) -> xs ++ ys
and the output should be:
["string1string2"]
Any idea how to do this in Haskell?
Given your exact problem specification, it is solved by
concatWithinLists :: [String] -> [String] -> [String]
concatWithinLists [x] [y] = [x ++ y]
This is bad in all kinds of ways. All of them stem from your insistence that you will only ever have lists of exactly length 1, completely missing the point of lists.
I strongly recommend reconsidering everything that led you to this issue. The real problem isn't here - it's somewhere higher up in your design. It will continue to be a problem as long as you lie to the type system about the contents of your data. You aren't working with [String], you're working with String and have attached some noise for no benefit.
Why are you passing your strings through in a list? Doing so opens problems, like your code crashing should empty lists be given as an argument (with the exception of cat [""] []). Plus, your pattern matching is off: xs ++ ys becomes [] ++ [] when singleton lists are passed as arguments. This is because [x] = x:[]. A simpler solution would be:
cat :: String -> String -> [String]
cat s1 s2 = [s1 ++ s2]
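If the list arguments have to stay, one total alternative is to concatenate every pairing of the two lists. This is only a sketch under the assumption that Language is meant to hold a set of strings, which the question does not state; the name catAll is hypothetical.

```haskell
type Language = [String]

-- Hypothetical total version: every string of l1 followed by every string of l2.
-- An empty list on either side gives an empty result instead of a crash.
catAll :: Language -> Language -> Language
catAll l1 l2 = [x ++ y | x <- l1, y <- l2]
```

For singleton inputs this agrees with the requested output: catAll ["string1"] ["string2"] evaluates to ["string1string2"].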

Haskell List Comprehension - Ineffective Predicate

I'm pretty brand new to Haskell (only written a fizzbuzz program before the current one) and am trying to write a program that takes the unix wordlist ('/usr/share/dict/words') and prints out the list of anagrams for each word, with any direct palindromes starred. I have the meat of this summed up into one function:
findAnagrams :: [String] -> [(String, [String])]
findAnagrams d =
[x | x <- map (\s -> (s, [if reverse s == t then t ++ "*" else t | t <- d, s /= t && null (t \\ s)])) d, not (null (snd x))]
However, when I run the program I get this output:
abase: babes, bases
abased: debase
abasement: basements
abasements: abatements
abases: basses
And so on, so clearly it isn't working properly. My intention is for the list comprehension to read as follows: for all t in d such that t is not equal to s and there is no difference between t and s other than order, if t is the reverse of s include as t*, otherwise include as t. The problem seems to be with the "no difference between t and s other than order" part, which I'm trying to accomplish by using "null (t \\ s)". It seems like it should work. Testing in GHCi gives:
Prelude Data.List> null ("abatements" \\ "abasements")
False
And yet it passes the predicate test. My assumption is that I'm missing something simple here, but I've looked at it a while and can't quite come up with it.
In addition, any notes regarding best practice would be greatly appreciated.
If you break it out into multiple functions (remember, source code size is not really that important), you could do something like:
import Data.List
isPalindrome :: String -> Bool
isPalindrome s = s == reverse s
flagPalins :: [String] -> [String]
flagPalins [] = []
flagPalins (x:xs)
  | isPalindrome x = (x ++ "*") : flagPalins xs
  | otherwise      = x : flagPalins xs
isAnagram :: String -> String -> Bool
isAnagram s t = (isPalindrome s || s /= t) && ??? -- test for anagram
findAnagrams :: String -> [String] -> [String]
findAnagrams s ws = flagPalins $ filter (isAnagram s) ws
findAllAnagrams :: [String] -> [(String, [String])]
findAllAnagrams ws = filter (not . null . snd) ??? -- words paired with their anagrams
I've intentionally left some holes for you to fill in, I'm not going to give you all the answers ;)
There are only two spots for you to do yourself. The one in findAllAnagrams should be pretty easy to figure out, you're already doing something pretty similar with your map (\s -> ...) part. I intentionally structured isAnagram so it'll return True if it's a palindrome or if it's just an anagram, and you only need one more check to determine if t is an anagram of s. Look at the comment I made on your question for a hint about what to do there. If you get stuck, comment and ask for an additional hint, I'll give you the name of the function I think you should use to solve this problem.
If you really want to make a list comprehension, I would recommend solving it this way, then converting back to a comprehension. In general you should write more verbose code, then compress it once you understand it fully.
Think of a \\ b as "items in a that are not in b."
Consider the implications.
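To make the hint concrete, here is a small sketch of what the (\\) test actually checks, next to one possible symmetric check. The names lettersWithin and sameLetters are hypothetical, and comparing sorted strings is just one option, not necessarily the function the other answer had in mind.

```haskell
import Data.List (sort, (\\))

-- What `null (t \\ s)` tests: every letter of t can be matched
-- against a distinct letter of s. s may still have letters left over.
lettersWithin :: String -> String -> Bool
lettersWithin t s = null (t \\ s)

-- Hypothetical symmetric check: both words use exactly the same
-- letters the same number of times.
sameLetters :: String -> String -> Bool
sameLetters s t = sort s == sort t
```

With these, lettersWithin "base" "abased" is True even though the words are not anagrams, while sameLetters "base" "abased" is False and sameLetters "listen" "silent" is True.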

I need to convert this string into a list of Char

I'm learning Haskell. I'm reading a string from a text file and need to turn it into a list of Char.
The input file is this:
Individuo A; TACGATCAAAGCT
Individuo B; AATCGCAT
Individuo C; TAAATCCGATCAAAGAGAGGACTTA
I need to convert this string
S1 = "AAACCGGTTAAACCCGGGG" in S1 =
["A","A","A","C","C","G","G","T","T","A","A","A","C","C","C","G","G","G","G"]
or S1 =
['A','A','A','C','C','G','G','T','T','A','A','A','C','C','C','G','G','G','G']
but they are separated by ";"
What should I do?
What can I do?
After getting two lists, I send them to this code:
lcsList :: Eq a => [a] -> [a] -> [a]
lcsList [] _ = []
lcsList _ [] = []
lcsList (x:xs) (y:ys) =
  if x == y
    then x : lcsList xs ys
    else
      let lcs1 = lcsList (x:xs) ys
          lcs2 = lcsList xs (y:ys)
      in if length lcs1 > length lcs2
           then lcs1
           else lcs2
A rough and ready way to split out each of those strings is with something like this - which you can try in ghci
let a = "Individuo A; TACGATCAAAGCT"
tail $ dropWhile (/= ' ') $ dropWhile (/= ';') a
which gives you:
"TACGATCAAAGCT"
And since a String is just a list of Char, this is the same as:
['T', 'A', 'C', 'G', ...
If your file consists of several lines, it is quite simple: you just need to skip everything until you find “;”. If your file consists of just one line, you’ll have to look for sequences’ beginnings and endings separately (hint: sequence ends with space). Write a recursive function to do the task, and use functions takeWhile, dropWhile.
A String is already a list of Char (it is even defined like this: type String = [Char]), so you don’t have to do anything else. If you need a list of Strings, where every String consists of just one char, then use map to wrap every char (once again, every String is a list, so you are allowed to use map on these). To wrap a char, there are three alternatives:
Use lambda function: map (\c -> [c]) s
Use operator section: map (:[]) s
Define a new function: wrap x = [x]
Good luck!
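Putting the pieces above together, a minimal sketch for the multi-line case could look like this. It assumes each line has the form "label; SEQUENCE" as in the sample file; the names sequenceOf, parseSequences, and wrapChars are hypothetical.

```haskell
-- Hypothetical helper: keep what follows the first ';' and drop leading spaces.
-- A line without ';' yields the empty string.
sequenceOf :: String -> String
sequenceOf = dropWhile (== ' ') . drop 1 . dropWhile (/= ';')

-- One sequence per input line.
parseSequences :: String -> [String]
parseSequences = map sequenceOf . lines

-- Only needed if one-character Strings are wanted instead of Chars.
wrapChars :: String -> [String]
wrapChars = map (: [])
```

Since each result is already a [Char], two of them can be passed to lcsList directly.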

Haskell Assignment - direction needed to split a String into words

We started a paper on Haskell a few weeks ago and just received our first assignment. I'm aware that SO doesn't like homework questions, so I'm not going to ask how to do it. Instead, it would be very much appreciated if anyone could push me in the right direction with this. Seeing as it might not be a specific question, would it be more appropriate in a discussion / community wiki?
Question: Tokenize a String, that is: "Hello, World!" -> ["Hello", "World"]
Coming from a Java background, I have to forget everything about the usual way to go about this. The problem is that I am still very clueless with Haskell. This is what I've come up with:
module Main where
main :: IO()
main = do putStrLn "Type in a string:\n"
          x <- getLine
          putStrLn "The string entered was:"
          putStrLn x
          putStrLn "\n"
          print (tokenize x)
tokenize :: String -> [String]
tokenize [] = []
tokenize l = token l ++ tokenize l
token :: String -> String
token [] = []
token l = takeWhile (isAlphaNum) l
What would be the first glaring mistake?
Thank you.
The first glaring mistake is
tokenize l = token l ++ tokenize l
(++) :: [a] -> [a] -> [a] appends two lists of the same type. Since token :: String -> String (and type String = [Char]), the type of tokenize that is inferred from that line is tokenize :: String -> String.
You should use (:) :: a -> [a] -> [a] here.
The next mistake in that line is that in the recursive call, you pass the same input l once again, so you have an infinite recursion, always doing the same without change. You have to remove the first token (and a bit more) from the input for the argument to the recursive call.
Another problem is that your token supposes that the input begins with alphanumeric characters.
You also need a function that ensures that condition for what you pass to token.
This line describes an infinite list (which is fine in itself, since Haskell is lazy and the list only gets constructed "on demand"), because it recurses with no change in the argument:
tokenize l = token l ++ tokenize l
We can visualise what is happening when tokenize is called as:
tokenize l = token l ++ tokenize l
           = token l ++ (token l ++ tokenize l)
           = token l ++ (token l ++ (token l ++ tokenize l))
           = ...
To stop this happening, you need to change the argument to tokenize so that the recursion makes progress:
tokenize l = token l ++ tokenize <something goes here>
As others already pointed out your mistake, just a little hint: While you found already the very useful takeWhile function, you should have a look at span, as this could be even more helpful here.
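As a rough illustration of the span hint, here is one possible shape for the function. It assumes tokens are maximal runs of alphanumeric characters, which matches the "Hello, World!" example but may not be exactly what the assignment wants.

```haskell
import Data.Char (isAlphaNum)

-- Sketch: skip separators, then use span to cut off one token
-- and recurse on whatever is left over.
tokenize :: String -> [String]
tokenize s =
  case dropWhile (not . isAlphaNum) s of
    ""   -> []
    rest -> let (tok, remaining) = span isAlphaNum rest
            in tok : tokenize remaining
```

The recursive call receives remaining, which is strictly shorter than the input, so the recursion terminates.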
This has something in it that feels similar to a parser monad. However, as you're a newcomer to Haskell, it's unlikely that you're in a position to understand how parsing monads work (or use them in your code) quite yet. To give you the basics, consider what you want:
tokenize :: String -> [String]
This takes a String, chomps it up into more pieces, and generates a list of strings corresponding to the words in the input string. How might we represent this? What we want to do is find a function that processes a single string, and at the first sign of whitespace, adds that string on to the sequence of words. But then you have to process what's left over. (I.e., the rest of the string.) For example, let's say you want to tokenize:
The brown fox jumped
You first pull out "The" and then continue processing " brown fox jumped" (note the space at the beginning of the second string). You will do this recursively, so naturally you will need a recursive function.
The natural solution that sticks out is to take something where you accumulate a set of strings you've tokenized so far, keep munching on the current input until you hit whitespace, then also accumulate what you've seen in the current string (this leads to an implementation where you're mostly consing stuff, and then occasionally reversing stuff).
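The accumulating approach described above can be sketched as follows. The name tokenizeAcc and its helpers are hypothetical, and only whitespace is treated as a separator here, so punctuation stays attached to words.

```haskell
import Data.Char (isSpace)

-- Sketch of the accumulator style: `done` holds finished tokens (newest
-- first), `cur` holds the current token in reverse. Both get reversed
-- when they are handed over, which is the consing-then-reversing
-- pattern mentioned above.
tokenizeAcc :: String -> [String]
tokenizeAcc = go [] []
  where
    go done cur [] = reverse (flush done cur)
    go done cur (c : cs)
      | isSpace c = go (flush done cur) [] cs
      | otherwise = go done (c : cur) cs
    -- Move the current token (if any) onto the finished list.
    flush done []  = done
    flush done cur = reverse cur : done
```

On "The brown fox jumped" this yields ["The","brown","fox","jumped"].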
Your exercise seemed a bit challenging to me so I decided to solve it just for self-training. Here's what I came up with:
import Data.List
import Data.Maybe
splitByAnyOf yss xs =
  foldr (\ys acc -> concat $ map (splitBy ys) acc) [xs] yss

splitBy ys xs =
  case (precedingElements ys xs, succeedingElements ys xs) of
    (Just "", Just s) -> splitBy ys s
    (Just p, Just "") -> [p]
    (Just p, Just s)  -> p : splitBy ys s
    _                 -> [xs]

succeedingElements ys xs =
  fromMaybe Nothing . find isJust $ map (stripPrefix ys) $ tails xs

precedingElements ys xs =
  fromMaybe Nothing . find isJust $ map (stripSuffix ys) $ inits xs
  where
    stripSuffix ys xs =
      if ys `isSuffixOf` xs then Just $ take (length xs - length ys) xs
      else Nothing

main = do
  print $ splitBy "!" "Hello, World!"
  print $ splitBy ", " "Hello, World!"
  print $ splitByAnyOf [", ", "!"] "Hello, World!"
outputs:
["Hello, World"]
["Hello","World!"]
["Hello","World"]

Resources