Haskell String substring function


My function takes 2 strings and determines if the first string is a substring of the second input string. For instance:
isSubb "abc" "abcmhk" -- True
isSubb "abc" "uyabcmhk" -- True
isSubb "abc" "okaibcmhk" -- False
isSubb "abc" "amnabkaaabcmhk" -- gives True
So far I have:
isSubb :: [Char] -> [Char] -> Bool
isSubb sub str = auxx sub sub str
auxx :: [Char] -> [Char] -> [Char] -> Bool
auxx safe (s:ub) (st:r)
  | s:ub == [] = True
  | st:r == [] = False
  | s == st = auxx safe ub r
  | otherwise = auxx safe safe r
But it's giving me a non-exhaustive patterns error in the auxx function.
Any help is greatly appreciated!
Thank you!

In Data.List there is the isInfixOf function.
isInfixOf :: Eq a => [a] -> [a] -> Bool
The isInfixOf function takes two lists and returns True iff the first list is contained, wholly and intact, anywhere within the second.
Prelude Data.List> isInfixOf "abc" "abcmhk"
True
Prelude Data.List> isInfixOf "abc" "uyabcmhk"
True
Prelude Data.List> isInfixOf "abc" "okaibcmhk"
False
Prelude Data.List> isInfixOf "abc" "amnabkaaabcmhk"
True
You could write your function like
import Data.List (isInfixOf)
isSubb :: [Char] -> [Char] -> Bool
isSubb sub str = isInfixOf sub str

Your auxx function needs to handle the cases where the second or the third parameter is [] (the recursion eventually reaches them).
The s:ub == [] and st:r == [] will never be True since pattern matching happens before guard evaluation.
A sane equivalent of your function would be
auxx safe sub str
  | sub == [] = True
  | str == [] = False
  | head sub == head str = auxx safe (tail sub) (tail str)
  | otherwise = auxx safe safe (tail str)
This reads more cleanly when the empty cases are handled by pattern matching:
auxx _ [] _ = True
auxx _ _ [] = False
auxx safe (s:ub) (st:r)
  | s == st = auxx safe ub r
  | otherwise = auxx safe safe r

Your definition is not correct! The clause
| s == st = auxx safe ub r
causes problems. Determining whether "abc" is in "afkjskgsbc" is not the same as determining whether "bc" is in "fkjskgsbc". You need to consider that a matching first letter may or may not be the start of the string you're looking for.
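One way to act on that observation is to test the whole pattern against every suffix of the input, so that a failed partial match never consumes characters that could begin a later match. The following is only a sketch: isSubbRec and its local helper prefixOf are names made up for illustration, and the first definition is equivalent to Data.List's isInfixOf.

```haskell
import Data.List (isPrefixOf, tails)

-- Check the pattern against every suffix of the input.
isSubb :: String -> String -> Bool
isSubb sub str = any (sub `isPrefixOf`) (tails str)

-- The same idea with explicit recursion (hypothetical name).
-- On a mismatch only ONE character of the input is dropped,
-- and the full pattern is tried again from there.
isSubbRec :: String -> String -> Bool
isSubbRec sub str
  | sub `prefixOf` str = True
  | otherwise = case str of
      []       -> False
      (_:rest) -> isSubbRec sub rest
  where
    -- prefixOf p s: is p a prefix of s?
    prefixOf [] _ = True
    prefixOf _ [] = False
    prefixOf (p:ps) (c:cs) = p == c && prefixOf ps cs
```

An input such as isSubb "abc" "aabc" is the kind of case where the single-pass auxx goes wrong: after the first 'a' matches, the mismatch on the second 'a' throws that character away instead of retrying the pattern on it.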

Related

Haskell: How to remove a non-number from a String

I have to write code that returns only the numbers from a String.
I know how to remove a number from a String:
numbersInString = map removeNumbers . words
  where
    removeNumbers "" = ""
    removeNumbers (s:ss)
      | isNumber s = removeNumbers ss
      | otherwise = s : removeNumbers ss
But I need the opposite: deleting the non-numbers.
For example:
removeNonNum :: String -> [String]
....
removeNonNum "234+8" == ["234", "8"]

words only splits on whitespace, so make your own recursive removeNonNum function (this code uses span :: (a -> Bool) -> [a] -> ([a], [a])):
import Data.Char
removeNonNum :: String -> [String]
removeNonNum "" = []
removeNonNum s@(x:_)
  -- we have numbers at the start, extract them and add them to the list
  | isNumber x =
      let (numbers, rest) = span isNumber s
      -- recurse onto the remaining parts
      in numbers : removeNonNum rest
  -- we don't have numbers at the start, skip them and continue recursing
  | otherwise = removeNonNum $ dropWhile (not . isNumber) s

Is there any way to shorten these lines

type Parser a = String -> Maybe (a, String)
parseChar :: Char -> Parser Char
parseChar _ "" = Nothing
parseChar ch (x:xs) | x == ch = Just (ch, xs)
                    | otherwise = Nothing
parseAnyChar :: String -> Parser Char
parseAnyChar "" _ = Nothing
parseAnyChar _ "" = Nothing
parseAnyChar (x:xs) str | isJust res = res
                        | otherwise = parseAnyChar xs str
  where res = parseChar x str
I'm a beginner in Haskell and I would like to know how I can use parseChar in parseAnyChar in a more "Haskell way" than looping recursively, for example with a map or something similar. I can't find a way to do it.

Yes. For one thing, you can just use a standard function instead of manual recursion, to try parseChar on all the specified alternatives:
import Data.List (find)
import Data.Maybe (isJust)
parseAnyChar cs str = case find isJust ress of
    Just res -> res
    Nothing -> Nothing
  where ress = [parseChar c str | c<-cs]
...or shorter
import Data.Maybe (listToMaybe, mapMaybe)
parseAnyChar cs str = listToMaybe $ mapMaybe (`parseChar` str) cs
A more efficient solution is to not attempt all the options in the first place, but instead use a Set of options:
import qualified Data.Set as Set
parseOneCharOf :: Set.Set Char -> Parser Char
parseOneCharOf _ "" = Nothing
parseOneCharOf cs (x:xs)
  | x `Set.member` cs = Just (x, xs)
  | otherwise = Nothing
...or shorter (keeping the clause for the empty string)
import Control.Monad (guard)
parseOneCharOf cs (x:xs) = guard (x `Set.member` cs) >> Just (x, xs)

You can make use of Alternative instance of Maybe:
import Control.Applicative((<|>))
parseAnyChar :: Foldable f => f Char -> Parser Char
parseAnyChar xs str = foldr ((<|>) . (`parseChar` str)) Nothing xs
This works because (<|>) for Maybe is implemented as:
instance Alternative Maybe where
  empty = Nothing
  Nothing <|> r = r
  l <|> _ = l
So if the left operand is Nothing, we pick the right one; if the left one is a Just …, we pick the left one.
We thus can rewrite the parseAnyChar to:
parseAnyChar :: String -> Parser Char
parseAnyChar "" _ = Nothing
parseAnyChar _ "" = Nothing
parseAnyChar (x:xs) str = parseChar x str <|> parseAnyChar xs str
We do not need a special case for the second parameter being "" however, since this is already covered by the logic in parseChar. We thus can drop the second clause:
parseAnyChar :: String -> Parser Char
parseAnyChar [] _ = Nothing
parseAnyChar (x:xs) str = parseChar x str <|> parseAnyChar xs str
and now we can rewrite this to a foldr pattern.
This also makes it possible to use any Foldable as the source of characters.
If the number of items is large, however, you might want to use a more efficient data structure, as @leftroundabout suggests.
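The fold over (<|>) can also be spelled with asum from Data.Foldable, which combines a container of alternatives in exactly this left-biased way. The block below is a self-contained sketch that repeats the question's Parser and parseChar so it can be loaded on its own; using asum here is a suggestion, not something the answers above prescribe.

```haskell
import Data.Foldable (asum)

type Parser a = String -> Maybe (a, String)

parseChar :: Char -> Parser Char
parseChar _ "" = Nothing
parseChar ch (x:xs)
  | x == ch   = Just (ch, xs)
  | otherwise = Nothing

-- Try parseChar for each candidate character and keep the first success.
-- asum combines the results with (<|>), starting from empty (Nothing).
parseAnyChar :: String -> Parser Char
parseAnyChar cs str = asum [parseChar c str | c <- cs]
```

Because Maybe's (<|>) ignores its right operand once the left one is a Just, laziness means the remaining candidates are not evaluated after the first match.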

Haskell - Pattern Matching form (x:y:zs)

I am having difficulty understanding how to use pattern matching in guards.
I have this sample function, whose purpose is to return the last character in a string.
myFun :: [Char] -> Char
myFun str@(f:s:rst)
  | str == "" = error "0 length string"
  | length str == 1 = head str
  | rst == "" = s
  | otherwise = myFun (s:rst)
It is failing with "Non-exhaustive patterns in function" when passed a string with a single character.
I assume that Haskell realizes it can't use the form (f:s:rst) to match a single element list, and then fails prior to trying to evaluate the call to length.
How do I make a guard that will tell Haskell what to do when there is only a single element?

You are pattern matching at the function definition level. The way you have described it, you are only covering the case where the string is at least two characters long:
myFun str@(f:s:rst)
You need to handle other cases as well. You can have a catch-all handler like this (needs to go as the last pattern):
myFun _ = ...
Or if you want to handle, for instance, the empty string, like this (prior to the catch-all):
myFun [] = ...
As to the purpose of your function, you are probably better off just using pattern matching and not using guards.
myFun :: [Char] -> Char
myFun [] = error "empty string"
myFun [x] = x
myFun (x:xs) = myFun xs
(Note that it would be more idiomatic to return a Maybe Char instead of crashing your program)
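A minimal sketch of that Maybe-returning variant follows. The name lastChar is made up for illustration; the clauses mirror the pattern-matching version above.

```haskell
-- Hypothetical total version: Nothing for the empty string
-- instead of a runtime error.
lastChar :: String -> Maybe Char
lastChar []     = Nothing
lastChar [x]    = Just x
lastChar (_:xs) = lastChar xs
```

Callers then have to handle the empty case explicitly, for example with maybe or a case expression, rather than risk a crash.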

Based on the particularly helpful answer from Chad Gilbert, and some additional tinkering,
I have found a way to have my cake and eat it too.
In case anyone has a similar stumbling block, here is a way to specify uncovered cases prior to declaring your guards:
myFun :: [Char] -> Char
myFun "" = error "empty string"
myFun str@(s:rst)
  | rst == "" = s
  | otherwise = myFun rst
This also works with multiple args :
strSplit :: [Char] -> [[Char]] -> [[Char]]
strSplit str [] = strSplit str [""]
strSplit "" _ = [""]
strSplit str@(s1:ns) list@(x:xs)
  | s1 == '|' = strSplit ns ("":list)
  | ns == "" = map reverse $ ((s1 : x) : xs)
  | otherwise = strSplit ns ((s1 : x) : xs)
Or, using the original pattern@(first:second:rest) idea:
lastTwo :: [Char]->[Char]
lastTwo "" = ""
lastTwo [x] = [x]
lastTwo str@(f:s:rst)
  | rst == "" = [f,s]
  | otherwise = lastTwo (s:rst)
This is probably super obvious to folks more familiar with Haskell, but I didn't realize that you were "allowed" to just declare the function multiple times using different syntax to cover different cases.

How to stop recursing and produce the list in memory

(1.) The function "sameString" returns a Boolean value for whether two strings are the same regardless of capitalisation.
-- *Main> sameString "HeLLo" "HElLo"
-- True
-- *Main> sameString "Hello" "Hi there"
-- False
sameString :: String -> String -> Bool
sameString str1 str2
  | length str1 == length str2 = and [ a == b | (a,b) <- zip (capitalise str1) (capitalise str2) ]
  | otherwise = False
(1) Helper function "capitalise" does the capitalising.
capitalise :: String -> String
capitalise str = [ toUpper x | x <- str ]
(2) Function "prefix" returns a Boolean value which states whether the first string is a prefix of the second, regardless of capitalisation.
-- *Main> prefix "bc" "abCDE"
-- False
-- *Main> prefix "Bc" "bCDE"
-- True
prefix :: String -> String -> Bool
prefix [] [] = True
prefix substr str
  | sameString string' substr == True = True
  | otherwise = False
  where chop_str :: String -> String -> String
        chop_str str substr = (take (length substr) str)
        string' = chop_str str substr
(3.) Function "dropUntil" returns the contents of the second string after the first occurrence of the first string. If the second string does not contain the first as a substring, it should return the empty string.
*Main> dropUntil "cd" "abcdef"
"ef"
dropUntil :: String -> String -> String
dropUntil substr [] = ""
dropUntil substr (s:tr)
  | prefix substr (s:tr) == False = drop 1 s : dropUntil substr tr
  | prefix substr (s:tr) == True =
So now the question. I was thinking of doing dropUntil with recursion.
What I think the function above should do is:
1) Given a string and a substring (for which the substring is not the prefix of the string) ...
it should drop the head of the string ...
and cons the empty list "" to
... a recursive call on the remaining tail and the same substring.
The idea behind it is to keep dropping the head of the list until the substring becomes the prefix of the list, where the function should then produce the remainder of the string as a result.
However, I have no idea how to do this. What I essentially want to do is make
| prefix substr (s:tr) == True = "leftover_string"
Where "leftover_string" is what remains after the recursive call drops the elements until the condition is met that the substring is the prefix of the remainder.
Is this possible to do the way I started?

We've had a lot of questions in the past couple of days which involve running down an equatable [x] with some other [x] looking for isPrefixOf. One primitive which has emerged in my thought process since then is really valuable here:
import Data.List
splitAtSublist :: ([x] -> Bool) -> [x] -> Maybe ([x], [x])
splitAtSublist pred list = find (pred . snd) $ zip (inits list) (tails list)
The splittings zip (inits list) (tails list) for a string "abcd" look like [("", "abcd"), ("a", "bcd"), ("ab", "cd"), ("abc", "d"), ("abcd", "")]. This finds the first element where the "tail end" of the splitting satisfies the predicate pred.
To get dropUntil s from this basis we can just do:
dropUntil p ls =
  maybe "" (drop (length p) . snd) $
    splitAtSublist (p `isPrefixOf`) ls
where isPrefixOf is from Data.List too. Substituting in, we find out that we don't use the inits list at all and it just becomes:
dropUntil p = maybe "" (drop $ length p) . find (p `isPrefixOf`) . tails
which is as simple as I can get it.
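As a self-contained sketch, here is that one-liner with its imports. Note that isPrefixOf compares characters exactly, whereas the question's prefix ignores capitalisation, so a case-insensitive variant is included as well; dropUntilCI and its local names are hypothetical and not part of the answer above.

```haskell
import Data.Char (toLower)
import Data.List (find, isPrefixOf, tails)

-- Case-sensitive: find the first suffix that starts with p,
-- then drop p itself from the front of that suffix.
dropUntil :: String -> String -> String
dropUntil p = maybe "" (drop (length p)) . find (p `isPrefixOf`) . tails

-- Hypothetical case-insensitive variant: both sides are lower-cased
-- for the comparison only, so the result keeps its original casing.
dropUntilCI :: String -> String -> String
dropUntilCI p = maybe "" (drop (length p)) . find matches . tails
  where
    lowerP = map toLower p
    matches suffix = lowerP `isPrefixOf` map toLower suffix
```

If no suffix matches, find returns Nothing and maybe falls back to the empty string, which is the behaviour the question asks for.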

Is this what you want?
| prefix substr (s:tr) == True = drop (length substr) (s:tr)
Some notes:
You have a type error here (s is a single Char, so drop 1 s does not typecheck):
| prefix substr (s:tr) == False = drop 1 s : dropUntil substr tr
                                  ^^^^^^^^
I think you just want:
| prefix substr (s:tr) == False = dropUntil substr tr
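Putting both guards together gives a complete recursive definition. The sketch below is an assumption about how the pieces fit: prefix is rewritten as a compact stand-in for the question's case-insensitive helper, and the `== True` / `== False` comparisons are replaced by the Boolean itself and otherwise.

```haskell
import Data.Char (toUpper)

-- Stand-in for the question's prefix: compare the first
-- (length substr) characters of str, ignoring capitalisation.
prefix :: String -> String -> Bool
prefix substr str =
  map toUpper substr == map toUpper (take (length substr) str)

dropUntil :: String -> String -> String
dropUntil _ [] = ""
dropUntil substr str@(_:rest)
  -- match found: return what follows the match
  | prefix substr str = drop (length substr) str
  -- no match here: discard one character and keep looking
  | otherwise         = dropUntil substr rest
```

Nothing is consed onto the recursive call, because the characters before the match are simply discarded.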

Haskell - please help me simplify these two functions

I am trying to work through the exercises in Write Yourself a Scheme in 48 Hours. I need help simplifying a couple of functions.
data LispVal = Number Integer
             | String String
             | Bool Bool
isNumber :: [LispVal] -> LispVal
isNumber [] = Bool False
isNumber [(Number _)] = Bool True
isNumber ((Number _):xs) = isNumber xs
isNumber _ = Bool False
isString :: [LispVal] -> LispVal
isString [] = Bool False
isString [(String _)] = Bool True
isString ((String _):xs) = isString xs
isString _ = Bool False
The isNumber and isString functions have a lot of common structure. How do I go about factoring out this common structure?

While you can't parameterize the pattern match itself, you can write yourself small helper functions so you at least don't have to repeat the list handling for every function:
isString (String _) = True
isString _ = False
isNumber (Number _) = True
isNumber _ = False
all1 _ [] = False
all1 f xs = all f xs
isListOfStrings = Bool . all1 isString
isListOfNumbers = Bool . all1 isNumber
In my opinion, the special case handling of the empty list isn't consistent here. You should consider just using all instead (so that the empty list can be a list of any kind, similar to how Haskell's lists work).
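For comparison, a self-contained sketch of the plain all version suggested in the last paragraph might look like this. The deriving clause is added only so that results can be compared and printed; it is not part of the question's data type.

```haskell
data LispVal
  = Number Integer
  | String String
  | Bool Bool
  deriving (Eq, Show)  -- added for illustration only

-- Per-element predicates: one short pattern match each.
isNumber, isString :: LispVal -> Bool
isNumber (Number _) = True
isNumber _          = False
isString (String _) = True
isString _          = False

-- The list handling is shared through `all`; note that with plain
-- `all`, the empty list counts as a list of numbers AND of strings.
isListOfNumbers, isListOfStrings :: [LispVal] -> LispVal
isListOfNumbers = Bool . all isNumber
isListOfStrings = Bool . all isString
```

Swapping all back to the all1 helper above restores the question's behaviour of returning Bool False for an empty list.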
