Split string/list with a char or int - haskell

I'm trying to make a function in Haskell to split a string at a certain char and a list at a certain number.
For doing this the splitAt function is exactly what I need for numbers, but I can't give a char with this function.
E.g.
splitAt 5 [1,2,3,4,5,6,7,8,9,10]
gives
([1,2,3,4,5],[6,7,8,9,10])
that is exactly what I needed with the 5 in the left side of the tuple.
But now I want to do this with a char and a string. But splitAt only takes an Int as its first argument. I want
splitAt 'c' "abcde"
resulting in
("abc", "de")
I'm looking for something in the direction of
splitAt (findIndex 'c' "abcde") "abcde"
but the function findIndex returns something of the type Maybe Int and splitAt needs an Int. Then I tried the following
splitAt (head (findIndices (== 'c') "abcde")) "abcde"
This is a possible solution but it returns the following
("ab","cde")
with the c on the wrong side of the tuple. You could apply succ to the char and search for the next character instead, but what would the result be if the char is a Z?
Is there an easy way to modify this to make
splitAt (findIndex 'c' "abcde") "abcde"
work?

You can use findIndex, just unwrap the Maybe and add one:
import Data.List
splitAfter :: (a -> Bool) -> [a] -> ([a], [a])
splitAfter this xs = case findIndex this xs of
  Nothing -> (xs, [])
  Just n  -> splitAt (n + 1) xs
giving, for example
*Main> splitAfter (=='c') "abcde"
("abc","de")
Maybe is a handy datatype for encoding failure in a way that's easy to recover from. There's even a function maybe :: b -> (a -> b) -> Maybe a -> b that takes a default value and a function, so the two cases are handled separately:
splitAfter' :: (a -> Bool) -> [a] -> ([a], [a])
splitAfter' this xs = maybe (xs, [])
                            (\n -> splitAt (n + 1) xs)
                            (findIndex this xs)
which also works. For example
*Main> splitAfter' (==5) [1..10]
([1,2,3,4,5],[6,7,8,9,10])

You can use the fromMaybe function (from Data.Maybe) to extract the result from a Maybe, for example:
splitlist = splitAt (fromMaybe 0 (findIndex (== 'c') "abcde")) "abcde"
fromMaybe :: a -> Maybe a -> a
The fromMaybe function takes a default value and Maybe value. If
the Maybe is Nothing, it returns the default values; otherwise, it
returns the value contained in the Maybe. (source).
With the default value set to 0, if findIndex returns Nothing the result of splitAt will be ("", list); with the default value set to length list, the result will be (list, "").
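The effect of the two defaults can be checked with a small sketch. The helper name splitWithDefault is hypothetical (it is not part of the answer); it only wraps the corrected expression above so that the default index becomes a parameter.

```haskell
import Data.List (findIndex)
import Data.Maybe (fromMaybe)

-- Hypothetical helper (not from the answer): split at the index of the
-- first occurrence of c, using def as the index when c is absent.
splitWithDefault :: Int -> Char -> String -> (String, String)
splitWithDefault def c s = splitAt (fromMaybe def (findIndex (== c) s)) s

-- Default 0: a missing character leaves everything in the second component.
missingAtZero :: (String, String)
missingAtZero = splitWithDefault 0 'z' "abcde"

-- Default length s: a missing character leaves everything in the first component.
missingAtEnd :: (String, String)
missingAtEnd = splitWithDefault (length "abcde") 'z' "abcde"
```

Note that when the character is found, this still splits before it (splitWithDefault 0 'c' "abcde" gives ("ab","cde")); adding one to the found index, as in the other answers, moves the match to the left component.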

Given c :: Char and s :: String, you can write something like
splitAt ((1+) $ fromJust $ findIndex (==c) s) s
but
you get an exception if c is not in s
you traverse s twice
A Maybe alternative is
maybe Nothing (\x -> Just (splitAt (1+x) s)) (findIndex (==c) s)
Here you can choose the "else" value (Nothing in my example).
You can write your own function as
splitAt' :: Char -> String -> (String, String)
splitAt' _ [] = ("", "")
splitAt' c (x:xs)
  | c == x    = ([c], xs)
  | otherwise = (x:cs, ys)
  where (cs, ys) = splitAt' c xs
Then you get (s, "") if c is not in s.

Here's a different way, that doesn't involve messing around with list indices.
break is nearly what you want. Let's reuse it. You want the matching element to be included at the end of the first output list, instead of at the start of the second.
import Control.Arrow ((***))
breakAfter :: (a -> Bool) -> [a] -> ([a], [a])
breakAfter p xs = map fst *** map fst $ break snd (zip xs $ False : map p xs)
How this works:
Transform our input list into a list of pairs (zip). The first element of each pair is taken from the original list. The second element of the pair is a Bool stating whether the previous element of the list is the one we're looking for. This is why we say False : map p xs --- if we just said map p xs, we would reproduce exactly the behaviour of break. Sticking the extra False in at the start is the important bit.
Reuse break. Our condition is encoded in the second element of each pair.
Throw away all those Bools. We don't need them any more.
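The three steps can also be spelled out one at a time. The following is only a sketch of the same idea with hypothetical names (flagPrevious, breakAfterSteps), written without Control.Arrow so that the intermediate list of pairs can be inspected.

```haskell
-- Step 1: pair each element with "did the previous element match?".
flagPrevious :: (a -> Bool) -> [a] -> [(a, Bool)]
flagPrevious p xs = zip xs (False : map p xs)

-- Steps 2 and 3: break on the flag, then drop the flags again.
breakAfterSteps :: (a -> Bool) -> [a] -> ([a], [a])
breakAfterSteps p xs = (map fst before, map fst after)
  where
    (before, after) = break snd (flagPrevious p xs)
```

For the input "abcde" and the predicate (== 'c'), the only pair flagged True is the one holding 'd', so break stops there and 'c' ends up at the end of the first list.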

Related

Remove first element that fulfills predicate (Haskell)

I want to make a function that removes the first element that fulfills the predicate given in the second argument. Something like this:
removeFirst "abab" (< 'b') = "bab"
removeFirst "abab" (== 'b') = "aab"
removeFirst "abab" (> 'b') = "abab"
removeFirst [1,2,3,4] even = [1,3,4]
I wanted to do it recursively, and came up with this:
removeFirst :: [a] -> (a -> Bool) -> [a]
removeFirst [] _ = []
rremoveFirst (x:xs) p = if p x then x : removeFirst xs p else removeFirst xs p
(Inspired by this question)
But I get a type-error, like this:
Couldn't match type ‘a’ with ‘Bool’
Expected: [Bool]
Actual: [a]
‘a’ is a rigid type variable bound by
the type signature for:
removeFirst :: forall a. [a] -> (a -> Bool) -> [a]
or this:
ghci> removeFirst [1,2,3,4] even
<interactive>:25:1: error:
* Variable not in scope: removeFirst :: [a0] -> (a1 -> Bool) -> t
* Perhaps you meant `rem' (imported from Prelude)
I know this is a relatively simple thing to program, I am just not familiar enough with Haskell yet. How can I do this "Haskell-style" (in one line)?
Before doing it "in style", why not first simply do it, so it works. This is how we learn.
"Variable not in scope: removeFirst ..." simply means you haven't defined the function named removeFirst.
So it seems you first tried to define it (and the error you show does not go with the code you show), then you got errors so it didn't get defined, and then you tried calling it and got the error saying it's not defined yet, naturally.
So, save your program in a source file, then load that file in GHCi. Then if you get any errors please copy-paste the full code from your file into your question (do not re-type it by hand). Also please specify what is it you do when you get the error messages, precisely. And be sure to include the error messages in full by copy-pasting them as well.
Then the logic of your code can be addressed.
Since others have posted working code, here's how I'd code this as a one-liner of sorts:
remFirst :: [a] -> (a -> Bool) -> [a]
remFirst xs p = foldr g z xs xs
  where
    g x r ~(_:tl)      -- "r" for recursive result
      | p x            -- we've found it, then
      = tl             -- just return the tail
      | otherwise
      = x : r tl       -- keep x and continue
    z _ = []           -- none were found
Shortened, it becomes
remFirst xs p =
  foldr (\x r ~(_:tl) -> if p x then tl else x : r tl)
        (const []) xs xs
Not one line, but it works.
removeFirst :: [a] -> (a -> Bool) -> [a]
removeFirst [] _ = []
removeFirst (x:xs) pred
  | pred x    = xs
  | otherwise = x : removeFirst xs pred
For a one-liner, I imagine you'd want to use foldl to walk across the list from the left.
EDIT
This solution uses guards. It first checks whether the first element of the list satisfies the predicate; if so, it returns the tail, and if not, it keeps that element and recursively processes the tail of the list. The empty-list clause covers the case where no element matches.
Using manual recursion does not lead to a one-liner solution, so let's try using some pre-built recursion scheme from the library.
Function scanl :: (b -> a -> b) -> b -> [a] -> [b] looks handy. It produces a succession of states, one state per input item.
Testing under the ghci interpreter:
$ ghci
λ>
λ> p = (=='b')
λ>
λ> xs = "ababcdab"
λ> ss = tail $ scanl (\(s,n) x -> if (p x) then (x,n+1) else (x,n)) (undefined,0) xs
λ>
λ> ss
[('a',0),('b',1),('a',1),('b',2),('c',2),('d',2),('a',2),('b',3)]
λ>
At that point, it is easy to spot and get rid of the one unwanted element, through some simple data massaging:
λ>
λ> filter (\(x,n) -> (n /= 1) || (not $ p x)) ss
[('a',0),('a',1),('b',2),('c',2),('d',2),('a',2),('b',3)]
λ>
λ> map fst $ filter (\(x,n) -> (n /= 1) || (not $ p x)) ss
"aabcdab"
λ>
Let's now write our removeFirst function. I take the liberty to have the predicate as leftmost argument; this is what all library functions do.
removeFirst :: (a -> Bool) -> [a] -> [a]
removeFirst p =
    let
        stepFn = \(s,n) x -> if (p x) then (x,n+1) else (x,n)
        p2     = \(x,n) -> (n /= 1) || (not $ p x)
    in
        map fst . filter p2 . tail . scanl stepFn (undefined,0)
If required, this version can be changed into a one-liner solution, just by expanding the values of stepFn and p2 into the last line. Left as an exercise for the reader. It makes for a long line, so it is debatable whether that improves readability.
Addendum:
Another approach consists in trying to find a library function, similar to splitAt :: Int -> [a] -> ([a], [a]) but taking a predicate instead of the list position.
So we submit the (a -> Bool) -> [a] -> ([a],[a]) type signature into the Hoogle specialized search engine.
This readily finds the break library function. It is exactly what we require.
λ>
λ> break (=='b') "zqababcdefab"
("zqa","babcdefab")
λ>
So we can write our removeFirst function like this:
removeFirst :: (a -> Bool) -> [a] -> [a]
removeFirst p xs = let (ys,zs) = break p xs in ys ++ drop 1 zs
The source code for break simply uses manual recursion.
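To give an idea of what such manual recursion looks like, here is a hand-written sketch with the same behaviour on the example above. The name myBreak is hypothetical, and this is not a copy of the library source.

```haskell
-- A hand-rolled, break-like function: collect elements until the
-- predicate holds, then return the rest (starting with the match) untouched.
myBreak :: (a -> Bool) -> [a] -> ([a], [a])
myBreak _ [] = ([], [])
myBreak p xs@(x : rest)
  | p x = ([], xs)
  | otherwise = let (ys, zs) = myBreak p rest in (x : ys, zs)
```

When nothing matches, the second component is empty, which is why the removeFirst definition has to cope with an empty zs.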

Is there a straight-forward solution to receiving the element *prior* to hitting the dropWhile predicate?

Given a condition, I want to search through a list of elements and return the first element that reaches the condition, and the previous one.
In C/C++ this is easy :
int i = 0;
for(;;i++) if (arr[i] == 0) break;
After we get the index where the condition is met, getting the previous element is easy, through "arr[i-1]"
In Haskell:
dropWhile (/=0) list gives us the last element I want
takeWhile (/=0) list gives us the first element I want
But I don't see a way of getting both in a simple manner. I could enumerate the list and use indexing, but that seems messy. Is there a proper way of doing this, or a way of working around this?
I would zip the list with its tail so that you have pairs of elements
available. Then you can just use find on the list of pairs:
f :: [Int] -> Maybe (Int, Int)
f xs = find ((>3) . snd) (zip xs (tail xs))
> f [1..10]
Just (3,4)
If the first element matches the predicate this will return
Nothing (or the second match if there is one) so you might need to special-case that if you want something
different.
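The edge case is easy to see with a generalized sketch. Both names below are hypothetical, and the "strict" variant is only one possible reading of what the special case might look like.

```haskell
import Data.List (find)

-- Same idea as f, for any predicate: search the (previous, current) pairs.
pairSearch :: (a -> Bool) -> [a] -> Maybe (a, a)
pairSearch p xs = find (p . snd) (zip xs (drop 1 xs))

-- One possible special case (an assumption, not part of the answer):
-- give up when the very first element already matches.
pairSearchStrict :: (a -> Bool) -> [a] -> Maybe (a, a)
pairSearchStrict p xs = case xs of
  (x : _) | p x -> Nothing
  _ -> pairSearch p xs
```

With the predicate (> 3) and the list [5,1,6], pairSearch skips the leading 5 and reports the second match as Just (1,6), while pairSearchStrict returns Nothing.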
As Robin Zigmond says break can also work:
g :: [Int] -> (Int, Int)
g xs = case break (>3) xs of
  (_, [])   -> error "not found"
  ([], _)   -> error "first element"
  (ys, z:_) -> (last ys, z)
(Or have this return a Maybe as well, depending on what you need.)
But this will, I think, keep the whole prefix ys in memory until it
finds the match, whereas f can start garbage-collecting the elements
it has moved past. For small lists it doesn't matter.
I would use a zipper-like search:
type ZipperList a = ([a], [a])
toZipperList :: [a] -> ZipperList a
toZipperList = (,) []
moveUntil' :: (a -> Bool) -> ZipperList a -> ZipperList a
moveUntil' _ (xs, []) = (xs, [])
moveUntil' f (xs, (y:ys))
  | f y       = (xs, (y:ys))
  | otherwise = moveUntil' f (y:xs, ys)
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = moveUntil' f . toZipperList
example :: [Int]
example = [2,3,5,7,11,13,17,19]
result :: ZipperList Int
result = moveUntil (>10) example -- ([7,5,3,2], [11,13,17,19])
The good thing about zippers is that they are efficient, you can access as many elements near the index you want, and you can move the focus of the zipper forwards and backwards. Learn more about zippers here:
http://learnyouahaskell.com/zippers
Note that my moveUntil function is like break from the Prelude but the initial part of the list is reversed. Hence you can simply get the head of both lists.
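Taking the head of both lists can be wrapped in a small total function. This sketch restates the search under a hypothetical name (searchZipper) so that it compiles on its own; aroundMatch is likewise an illustrative helper, not part of the answer.

```haskell
-- Reversed prefix paired with the remaining suffix.
type ZipperList a = ([a], [a])

-- Same search as moveUntil, restated so this sketch is self-contained.
searchZipper :: (a -> Bool) -> [a] -> ZipperList a
searchZipper p = go []
  where
    go seen [] = (seen, [])
    go seen (y : ys)
      | p y = (seen, y : ys)
      | otherwise = go (y : seen) ys

-- The element just before the match and the match itself, if both exist.
aroundMatch :: ZipperList a -> Maybe (a, a)
aroundMatch (prev : _, cur : _) = Just (prev, cur)
aroundMatch _ = Nothing
```

For the example list and the predicate (> 10), aroundMatch yields Just (7,11); it yields Nothing when nothing matches or when the first element matches.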
A non-awkward way of implementing this as a fold is making it a paramorphism. For general explanatory notes, see this answer by dfeuer (I took foldrWithTails from it):
-- The extra [a] argument f takes with respect to foldr
-- is the tail of the list at each step of the fold.
foldrWithTails :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldrWithTails f n = go
  where
    go (a : as) = f a as (go as)
    go [] = n
boundary :: (a -> Bool) -> [a] -> Maybe (a, a)
boundary p = foldrWithTails findBoundary Nothing
  where
    findBoundary x (y : _) bnd
      | p y = Just (x, y)
      | otherwise = bnd
    findBoundary _ [] _ = Nothing
Notes:
If p y is true we don't have to look at bnd to get the result. That makes the solution adequately lazy. You can check that by trying out boundary (> 1000000) [0..] in GHCi.
This solution gives no special treatment to the edge case of the first element of the list matching the condition. For instance:
GHCi> boundary (<1) [0..9]
Nothing
GHCi> boundary even [0..9]
Just (1,2)
There are several alternatives; either way, you'll have to implement this yourself. You could use explicit recursion:
getLastAndFirst :: (a -> Bool) -> [a] -> Maybe (a, a)
getLastAndFirst p (x : xs@(y:ys))
  | p y = Just (x, y)
  | otherwise = getLastAndFirst p xs
getLastAndFirst _ _ = Nothing
Alternately, you could use a fold, but that would look fairly similar to the above, except less readable.
A third option is to use break, as suggested in the comments:
getLastAndFirst' :: (a -> Bool) -> [a] -> Maybe (a,a)
getLastAndFirst' p l =
  case break p l of
    (xs@(_:_), (y:_)) -> Just (last xs, y)
    _ -> Nothing
(\(xs, ys) -> [last xs, head ys]) $ break (==0) list
Using break as Robin Zigmond suggested ended up short and simple, not using Maybe to catch edge-cases, but I could replace the lambda with a simple function that used Maybe.
I toyed a bit more with the solution and came up with
breakAround :: Int -> Int -> (a -> Bool) -> [a] -> [a]
breakAround m n cond list = (\(xs, ys) -> reverse (take m (reverse xs)) ++ take n ys) $ break cond list
which takes two integers, a predicate, and a list, and returns a single list made of the last m elements before the first match followed by n elements starting at the match.
Example: breakAround 3 2 (==0) [3,2,1,0,10,20,30] would return [3,2,1,0,10]

How can I drop nth element of a list using foldl?

dropnth' :: [a] -> Int -> [a]
dropnth' xs n = foldl (\a b -> if (last a) == xs!!n then a else b ++ []) [head xs] xs
I was trying to solve this "dropping every nth element of a list" question using foldl, but I'm getting an error. How can I do that?
Error:
a are presumably the elements that you have already decided not to drop. You should then decide whether to drop, not the last element of a, but the next element in xs, which is presumably b.
b ++ [] is presumably meant to express that you have decided not to drop the element b, instead adding it to the list a. This is actually written a ++ [b].
This allows me to write this piece of code, which at least compiles:
dropnth' :: Eq a => [a] -> Int -> [a]
dropnth' xs n = foldl (\a b -> if b == xs!!n then a else a ++ [b]) [head xs] xs
xs!!n finds the nth element of xs, and comparing with it decides whether something's value is equal to that element, not whether its position is n. Note the Eq a, which tells us that we are comparing list values. foldl will have to get the positions of the entries from somewhere, such as from zip [0..].
dropnth' :: [a] -> Int -> [a]
dropnth' xs n = foldl (\a (i, b) -> if mod i n == 0 then a else a ++ [b]) [head xs] (zip [0..] xs)
Adding an element to the end of a list has to rebuild the whole list. Building the list up from its end would be much more efficient. But in this case, we can even use more specialized list operations for our use case.
dropnth' :: [a] -> Int -> [a]
dropnth' xs n = [b | (i, b) <- zip [0..] xs, mod i n > 0]
Note that we now drop the initial element as well. Perhaps that is what you want? Or you could zip with [1..] instead to shift all the crosshairs one to the left.
Usually, type signatures like Int -> [a] -> [a] compose better.
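As a sketch of that last remark, here is the list-comprehension version with the arguments flipped and the positions counted from 1. The names are hypothetical, and the code assumes n > 0 (mod by zero would raise an exception).

```haskell
-- Drop every nth element, counting positions from 1. Assumes n > 0.
dropEvery :: Int -> [a] -> [a]
dropEvery n xs = [x | (i, x) <- zip [1 ..] xs, i `mod` n /= 0]

-- With the list as the last argument, partial applications compose directly.
dropThirdThenSecond :: [a] -> [a]
dropThirdThenSecond = dropEvery 2 . dropEvery 3
```

With the [a] -> Int -> [a] argument order, the same pipeline would need explicit lambdas or flip.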

Apply a function to every element in a list to every element in another list - Haskell

My ultimate goal is to find if a list y contains all the elements of list x (I'm checking if x is a subset of y sort of thing)
subset x y =
and [out | z <- x
, out <- filter (==z) y ]
This doesn't work, and I know it's because z is a list still. I'm trying to make sense of this.
I think I may have to use the elem function, but I'm not sure how to split x into chars that I can compare separately through y.
I'm ashamed to say that I've been working on this simple problem for an hour and a half.
Checking whether all elements of xs are elements of ys is very straightforward. Loop through xs, and for each element, check if it is in ys:
subset xs ys = all (\x -> elem x ys) xs
You could also use the list difference function (\\) from Data.List. If you have list y and list x, and you want to check that all elements of x are in y, then x \\ y will return a new list with the elements of x that are not in y (each element of y removes at most one occurrence from x, so duplicates in x can survive). If all the elements of x are in y and x has no duplicates, the returned list will be empty.
For example, if your list y is [1,2,3,4,5] and your list x is [2,4], you can do:
Prelude> [2,4] \\ [1,2,3,4,5]
[]
If list y is [1,2,3,4,5] and list x is [2,4,6], then:
Prelude> [2,4,6] \\ [1,2,3,4,5]
[6]
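Turned into a Boolean test, this could look like the sketch below. The name subsetByDifference is hypothetical, and the call to nub is an addition to the answer: since (\\) removes only one occurrence per element of its second argument, duplicates in the first list would otherwise be left over.

```haskell
import Data.List (nub, (\\))

-- xs is a subset of ys when nothing is left after removing ys from
-- the distinct elements of xs.
subsetByDifference :: Eq a => [a] -> [a] -> Bool
subsetByDifference xs ys = null (nub xs \\ ys)
```

Without nub, [2,2] \\ [1,2] evaluates to [2], so a plain null (xs \\ ys) would reject [2,2] as a subset of [1,2].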
An easy way to reason about subsets is to use sets as the data type.
import qualified Data.Set as S
subset :: Ord a => [a] -> [a] -> Bool
subset xs ys = S.isSubsetOf (S.fromList xs) (S.fromList ys)
Then it's as simple as:
*Main> subset [1..5] [1..10]
True
*Main> subset [0..5] [1..10]
False
Let's break this down into two subproblems:
Find if a value is a member of a list;
Use the solution to #1 to test whether every value in a list is in the second one.
For the first subproblem there is a library function already:
elem :: (Eq a, Foldable t) => a -> t a -> Bool
Lists are a Foldable type, so you can use this function with lists for t and it would have the following type:
elem :: (Eq a) => a -> [a] -> Bool
EXERCISE: Write your own version of elem, specialized to work with lists (don't worry about the Foldable stuff now).
So now, to tackle #2, one first step would be this:
-- For each element of `xs`, test whether it's an element of `ys`.
-- Return a list of the results.
notYetSubset :: Eq a => [a] -> [a] -> [Bool]
notYetSubset xs ys = map (\x -> elem x ys) xs
After that, we need to go from the list of individual boolean results to just one boolean. There's a standard library function that does that as well:
-- Return True if and only if every element of the argument collection
-- is True.
and :: Foldable t => t Bool -> Bool
EXERCISE: write your own version of and, specialized to lists:
myAnd :: [Bool] -> Bool
myAnd [] = _fillMeIn
myAnd (x:xs) = _fillMeIn
With these tools, now we can write subset:
subset :: Eq a => [a] -> [a] -> Bool
subset xs ys = and (map (\x -> elem x ys) xs)
Although a more experienced Haskeller would probably write it like this:
subset :: Eq a => [a] -> [a] -> Bool
subset xs ys = all (`elem` ys) xs
{- This:
(`elem` ys)
...is a syntactic shortcut for this:
\x -> x `elem` ys
-}
...where all is another standard library function that is just a shortcut for the combination of map and and (shown here specialized to lists):
-- Apply a boolean test to every element of the list, and
-- return `True` if and only if the test succeeds for all elements.
all :: (a -> Bool) -> [a] -> Bool
all p = and . map p

Lack of understanding infinite lists and seq operator

The code below retains, for a given integer n, the first n items from a list, drops the following n items, keeps the following n and so on. It works correctly for any finite list.
In order to make it usable with infinite lists, I used the 'seq' operator to force evaluation of the accumulator before the recursive step, as foldl' does, for example.
I tested by tracing the accumulator's value, and it seems that it is computed as desired with finite lists.
Nevertheless, it doesn't work when applied to an infinite list. The "take" in the main function is only executed once the inner calculation has terminated, which, of course, never happens with an infinite list.
Can someone please tell me where my mistake is?
main :: IO ()
main = print (take 2 (foo 2 [1..100]))
foo :: Show a => Int -> [a] -> [a]
foo l lst = inFoo keepOrNot 1 l lst []
inFoo :: Show a => (Bool -> Int -> [a] -> [a] -> [a]) -> Int -> Int -> [a] -> [a] -> [a]
inFoo keepOrNot i l [] lstOut = lstOut
inFoo keepOrNot i l lstIn lstOut = let lstOut2 = (keepOrNot (odd i) l lstIn lstOut) in
    lstOut2 `seq` (inFoo keepOrNot (i+1) l (drop l lstIn) lstOut2)
keepOrNot :: Bool -> Int -> [a] -> [a] -> [a]
keepOrNot b n lst1 lst2 = case b of
    True -> lst2 ++ (take n lst1)
    False -> lst2
Here's how list concatenation is implemented:
(++) :: [a] -> [a] -> [a]
(++) [] ys = ys
(++) (x:xs) ys = x : xs ++ ys
Note that
the right hand list structure is reused as is (even if it's not been evaluated yet, so lazily)
the left hand list structure is rewritten (copied)
This means that if you're using ++ to build up a list, you want the accumulator to be on the right hand side. (For finite lists, merely for efficiency reasons --- if the accumulator is on the left hand side, it will be repeatedly copied and this is inefficient. For infinite lists, the caller can't look at the first element of the result until it's been copied for the last time, and there won't be a last time because there's always something else to concatenate onto the right of the accumulator.)
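A small sketch of the productive direction, using hypothetical names: with foldr the appends nest to the right, so each chunk is copied once and the rest of the result stays a lazy tail.

```haskell
-- Right-nested appends: c1 ++ (c2 ++ (c3 ++ ...)).
concatRight :: [[a]] -> [a]
concatRight = foldr (++) []

-- Works even though the list of chunks is infinite.
firstThree :: [Int]
firstThree = take 3 (concatRight (map (\x -> [x]) [1 ..]))

-- By contrast, foldl (++) [] builds ((([] ++ c1) ++ c2) ++ ...), with the
-- accumulator on the left; on an infinite list of chunks it never returns.
```

Here firstThree evaluates to [1,2,3] because the first element is available as soon as the first chunk has been inspected.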
The True case of keepOrNot has the accumulator on the left of the ++. You need to use a different data structure.
The usual idiom in this case is to use difference lists. Instead of using type [a] for your accumulator, use [a] -> [a]. Your accumulator is now a function that prepends a list to the list it's given as input. This avoids repeated copying, and the list can be built lazily.
keepOrNot :: Bool -> Int -> [a] -> ([a] -> [a]) -> ([a] -> [a])
keepOrNot b n lst1 acc = case b of
    True -> acc . (take n lst1 ++)
    False -> acc
The initial value of the accumulator should be id. When you want to convert it to a conventional list, call it with [] (i.e., acc []).
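In isolation, the idiom might look like the following sketch. The names DList, emptyD, appendChunk and toList are chosen for illustration only; this shows the accumulator on its own and is not a rewrite of inFoo.

```haskell
-- A difference list: a function that prepends its contents to any list.
type DList a = [a] -> [a]

-- The empty accumulator.
emptyD :: DList a
emptyD = id

-- Add a chunk at the end of the accumulator without copying what is
-- already there.
appendChunk :: DList a -> [a] -> DList a
appendChunk acc chunk = acc . (chunk ++)

-- Convert back to an ordinary list.
toList :: DList a -> [a]
toList acc = acc []
```

Because the result is built as chunk1 ++ (chunk2 ++ ...), take 3 (toList (appendChunk (appendChunk emptyD [1,2]) [5..])) can return [1,2,5] even though the last chunk is infinite.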
seq is a red herring here. seq does not force the entire list. seq only determines whether it is of the form [] or x : xs.
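A one-line experiment makes this visible. In the sketch below (headOnly is a hypothetical name) the tail of the list is undefined, yet seq succeeds because it only looks at the outermost constructor.

```haskell
-- The outermost (:) is available, so seq succeeds even though the tail
-- of the list is undefined and is never evaluated.
headOnly :: String
headOnly = ((1 :: Int) : undefined) `seq` "ok"
```

Forcing the whole list instead, for example with length, would hit the undefined tail.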
You're learning Haskell, yes? So it would be a good idea as an exercise to modify your code to use a difference list accumulator. Possibly the use of infinite lists will burn you in a different part of your code; I don't know.
But there is a better approach to writing foo.
foo c xs = map snd . filter fst . zipWith f [0..] $ xs
  where f i x = (even (i `div` c), x)
So you want to group a list into groups of n elements, and drop every other group. We can write this down directly:
import Data.List (unfoldr)
groups n xs = takeWhile (not.null) $ unfoldr (Just . splitAt n) xs
foo c xs = concatMap head . groups 2 . groups c $ xs
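The intermediate values show why this works. The sketch below restates the two definitions under hypothetical names (groupsOf, keepAlternate) and names each stage for a short input.

```haskell
import Data.List (unfoldr)

-- Cut a list into chunks of n elements (the last chunk may be shorter).
groupsOf :: Int -> [a] -> [[a]]
groupsOf n = takeWhile (not . null) . unfoldr (Just . splitAt n)

-- Pair up the chunks and keep only the first chunk of each pair.
keepAlternate :: Int -> [a] -> [a]
keepAlternate c = concatMap head . groupsOf 2 . groupsOf c

stage1 :: [[Int]]
stage1 = groupsOf 2 [1 .. 7] -- [[1,2],[3,4],[5,6],[7]]

stage2 :: [[[Int]]]
stage2 = groupsOf 2 stage1 -- [[[1,2],[3,4]],[[5,6],[7]]]
```

Every stage is lazy, so take 4 (keepAlternate 2 [1..]) also terminates and gives [1,2,5,6].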
dave4420 already explained what is wrong with your code, but I'd like to comment on how you got there, IMO. Your keepOrNot :: Bool -> Int -> [a] -> [a] -> [a] function is too general. It works according to the received Bool, any Bool; but you know that you will feed it a succession of alternating True and False values. Programming with functions is like plugging a pipe into a funnel - output of one function serves as input to the next - and the funnel is too wide here, so the contact is loose.
A minimal re-write of your code along these lines could be
foo n lst = go lst
  where
    go [] = []
    go lst = let (a,b) = splitAt n lst
                 (c,d) = splitAt n b
             in
                 a ++ go d
The contact is "tight", there's no "information leakage" here. We just do the work twice (*) ourselves, and "connect the pipes" explicitly, in this code, grabbing one result (a) and dropping the other (c).
--
(*) twice, reflecting the two Boolean values, True and False, alternating in a simple fashion one after another. Thus this is captured frozen in the code's structure, not hanging loose as a parameter able to accommodate an arbitrary Boolean value.
Like dave4420 said, you shouldn't be using (++) to accumulate from the left. But perhaps you shouldn't be accumulating at all! In Haskell, laziness often makes straightforward "head-construction" more efficient than the tail recursion you'd need to use in e.g. Lisp. For example:
foo :: Int -> [a] -> [a] -- why would you give this a Show constraint?
foo ℓ = foo' True
  where foo' _ [] = []
        foo' keep lst
          | keep      = firstℓ ++ foo' False other
          | otherwise = foo' True other
          where (firstℓ, other) = splitAt ℓ lst

Resources