How to implement a list splitter based on predicate


I am trying to implement a list splitter based on a given predicate. The functionality is similar to a recursive break. I take a list as input and append elements to a small list as long as the predicate holds for the current element. When the predicate is false, I append the small list to a bigger list and continue from where I stopped, dropping the element for which the predicate failed.
P.S.: I am trying not to use any Data.List functions, only self-implemented ones (I'm in the learning phase).
Example:
input
predicate: (\x->x<3)
list : [1,2,3,2,3,1,1,1,3]
output: [[1,2],[2],[1,1,1]]
I have tried so far with the following approach:
I use a function that, given a list, breaks it into a pair: the first component is the list up to the first element that fails the predicate, and the second is the rest. I feed this pair to a function that repeats the process.
splt::(a->Bool)->[a]->[[a]]
splt p []=[]
splt p (x:xs)=go p [] (x:xs) where
    go p rez []=rez
    go p rez ls=rez:process . brk $ ls [] where
        process (x,[])=rez:x
        process ([],x)=go p rez x
brk::(a->Bool)->[a]-> ([a],[a])
brk p []= ([],[])
brk p (x:xs)=go p ([],x:xs) where
    go p (accu,[])= (accu,[])
    go p (accu,x:xs)|not (p x) =(accu,xs)
                    |otherwise = go p (x:accu,xs)
I get the following error : cannot produce infinite type.
I have also tried a simpler solution:
format::(a->Bool)->[a]->[[a]]
format p []=[]
format p ls=go p [] [] ls where
    go p small big []=small:big
    go p small big (x:xs)=case p x of
        True ->go p x:small big xs
        False ->go p [] small:big xs
But it does not work, and I also have to reverse the results in both cases. The error I get is:
    * Couldn't match type `a' with `[a0]'
      `a' is a rigid type variable bound by
        the type signature for:
          wrds :: forall a. (a -> Bool) -> [a] -> [[a]]
        at Ex84.hs:12:5-31
      Expected type: [a0] -> Bool
        Actual type: a -> Bool
    * In the first argument of `go', namely `p'
      In the expression: go p [] [] ls

The second one is just a bracketing issue and can be fixed by parenthesising the arguments correctly:
case p x of
    True ->go p (x:small) big xs
    False ->go p [] (small:big) xs
Without the parentheses, function application binds tighter than (:), so go p x:small big xs parses as (go p x) : (small big xs) and small is treated as a function. Because both accumulators are built by prepending, you will also have to reverse the outer list and map reverse over the inner lists.
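For completeness, here is a minimal sketch of the whole accumulator version with the parentheses in place. The helper push is a name introduced here for illustration and is not part of the question; it skips empty groups and restores element order, which is an assumption about the output you want (it matches your example).

```haskell
format :: (a -> Bool) -> [a] -> [[a]]
format p = go [] []
  where
    -- small: the current group, reversed; big: the finished groups, reversed
    go small big [] = reverse (push small big)
    go small big (x:xs)
      | p x       = go (x : small) big xs
      | otherwise = go [] (push small big) xs

    -- push (hypothetical helper): store a finished group unless it is empty
    push [] big    = big
    push small big = reverse small : big
```

With the predicate (\x->x<3) and the list from the question, this should give [[1,2],[2],[1,1,1]].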
You can also try using the span function for this:
splt :: (a -> Bool) -> [a] -> [[a]]
splt _ [] = []
splt p xs = let (xs', xs'') = span p xs
                remaining = dropWhile (not . p) xs''
            in case xs' of
                 [] -> splt p remaining
                 xs' -> xs' : splt p xs''

This is a way of implementing what you need.
What I did is to use the function span, that breaks the list in two, returning a tuple. The first element is the list of initial elements satisfying p. The second element of the tuple is the remainder of the list.
I used this recursively, while discarding the elements that do not satisfy p.
splt :: (a -> Bool) -> [a] -> [[a]]
splt p [] = []
splt p ls = let (a,rs) = span p ls
                (b,rs') = span (not . p) rs
            in if null a then splt p rs' else a : splt p rs'
You can also try to define span yourself
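A minimal sketch of such a hand-written span, assuming you only need it for lists. The name spanList is made up here so that it does not clash with the Prelude's span:

```haskell
-- spanList (hypothetical name): longest prefix satisfying p, plus the rest.
spanList :: (a -> Bool) -> [a] -> ([a], [a])
spanList _ [] = ([], [])
spanList p xs@(x:rest)
  | p x       = let (ys, zs) = spanList p rest in (x : ys, zs)
  | otherwise = ([], xs)   -- stop at the first element that fails p
```

Swapping span for spanList in the splt definition above should then remove the last dependency on a library function.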

Related

Understanding non-strictness in Haskell with a recursive example

What is the difference between these two in terms of evaluation?
Why does this one "obey" (how should I put it?) non-strictness
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []
recFilter p (h:tl) = if (p h)
    then h : recFilter p tl
    else recFilter p tl
while this doesn't?
recFilter :: (a -> Bool) -> [a] -> Int -> [a]
recFilter _ xs 0 = xs
recFilter p (h:tl) len
    | p(h) = recFilter p (tl ++ [h]) (len-1)
    | otherwise = recFilter p tl (len-1)
Is it possible to write a tail-recursive function non-strictly?
To be honest I also don't understand the call stack of the first example, because I can't see where that h: goes. Is there a way to see this in ghci?

The non-tail recursive function roughly consumes a portion of the input (the first element) to produce a portion of the output (well, if it's not filtered out at least). Then recursion handles the next portion of the input, and so on.
Your tail recursive function will recurse until len becomes zero, and only at that point it will output the whole result.
Consider this pseudocode:
def rec1(p,xs):
   case xs:
      []     -> []
      (y:ys) -> if p(y): print y
                rec1(p,ys)
and compare it with this accumulator-based variant. I'm not using len since I use a separate accumulator argument, which I assume to be initially empty.
def rec2(p,xs,acc):
   case xs:
      []     -> print acc
      (y:ys) -> if p(y):
                   rec2(p,ys,acc++[y])
                else:
                   rec2(p,ys,acc)
rec1 prints before recursing: it does not need to inspect the whole input list to start printing its output. It works in a "streaming" fashion, in a sense. Instead, rec2 will only start to print at the very end, after the input list has been completely scanned.
In your Haskell code there are no prints, of course, but you can think of returning x : function call as "printing x", since x is made available to the caller of our function before function call is actually made. (Well, to be pedantic this depends on how the caller will consume the output list, but I'll neglect this.)
Hence the non-tail recursive code can also work on infinite lists. Even on finite inputs, performance is improved: if we call head (rec1 p xs), we only evaluate xs until the first non-discarded element. By contrast, head (rec2 p xs []) would filter the whole list xs, even if we don't need that.
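To make this concrete in Haskell, here is a small sketch of both styles side by side. The names lazyFilter, accFilter and firstEvens are chosen for this example only; lazyFilter is the first function from the question, and accFilter is an accumulator-based variant along the lines of rec2.

```haskell
-- Non-tail-recursive filter: yields each kept element before recursing.
lazyFilter :: (a -> Bool) -> [a] -> [a]
lazyFilter _ [] = []
lazyFilter p (h:tl)
  | p h       = h : lazyFilter p tl
  | otherwise = lazyFilter p tl

-- Accumulator-based variant: nothing is returned until the input ends.
accFilter :: (a -> Bool) -> [a] -> [a]
accFilter p = go []
  where
    go acc []     = acc
    go acc (y:ys)
      | p y       = go (acc ++ [y]) ys
      | otherwise = go acc ys

-- Works on an infinite list because only three results are demanded.
firstEvens :: [Integer]
firstEvens = take 3 (lazyFilter even [1 ..])
```

Replacing lazyFilter with accFilter in firstEvens would never terminate, since go only returns once it reaches the end of the list.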

The second implementation does not make much sense: nothing guarantees that the variable named len actually holds the length of the list. The caller has to pass it in, and for infinite lists this cannot work, since there is no length at all.
You likely want to produce something like:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter p = go []
    where go ys [] = ys  -- (1)
          go ys (x:xs) | p x = go (ys ++ [x]) xs
                       | otherwise = go ys xs
where we thus have an accumulator to which we append the items in the list, and then eventually return the accumulator.
The problem with the second approach is that, as long as the accumulator is not returned, Haskell has to keep recursing before the result reaches weak head normal form (WHNF). If we pattern match the result against [] or (_:_), evaluation must recurse all the way to case (1), since the other cases only produce a new call to go and never yield a data constructor to match on.
This is in contrast to the first filter: if we pattern match on [] or (_:_), it is enough to reach case (1) or case (2), where the expression produces a list data constructor. Only if the pattern demands more elements, for example (_:_:_), does the recFilter p tl in case (2) of the first implementation have to be evaluated:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = [] -- (1)
recFilter p (h:tl) = if (p h)
    then h : recFilter p tl -- (2)
    else recFilter p tl
For more information, see the Laziness section of the Wikibook on Haskell that describes how laziness works with thunks.

Haskell don't really know what to name this

I'm trying to make it so that, given a tuple (n,m) and a list of tuples xs, every tuple in xs whose first item lies in the range from n to m is kept as is in the new list; for every other value k from n to m, a tuple with k as its first element and 0 as its second element should be added. My question is: how can I say "repeat 0" using guards? Clearly my code won't run, since it effectively says "repeat = 0".
expand :: (Int,Int) -> Profile ->Profile
expand (n,m) [] = zip [n..m] (repeat 0)
expand (n,m) (x:xs) = zip [n..m] (repeat (|(fst (x) `elem` [n..m]) == False = 0
|otherwise = snd (x))

You can use a helper function here that converts a number in the [ n .. m ] range to a 2-tuple. Here we thus try to find an element in the list xs that matches with the first item of that tuple, if we do not find such element, we use 0:
import Data.List(find)
expand :: (Int,Int) -> Profile -> Profile
expand (n,m) xs = map go [n .. m]
    where go i | Just l <- find (\(f, _) -> f == i) xs = l
               | otherwise = (i, 0)
For a list, find was implemented as:
find :: (a -> Bool) -> [a] -> Maybe a
find p = listToMaybe . filter p
filter thus will make a list that contains the elements that satisfy the predicate p, and listToMaybe :: [a] -> Maybe a will convert an empty list [] to Nothing, and for a non-empty list (x:_) it will wrap the first element x in a Just data constructor. Due to Haskell's laziness, it will thus look for the first element that satisfies the predicate.
This thus gives us:
Prelude Data.List> expand (2,7) [(4, 2.3), (6, 3)]
[(2,0.0),(3,0.0),(4,2.3),(5,0.0),(6,3.0),(7,0.0)]
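If you would rather stay within the Prelude, the same idea can be sketched with lookup instead of find. This assumes Profile is a list of (Int, Double) pairs, which the question does not state; expandLookup and valueAt are names chosen for this example.

```haskell
-- Assumption: the question does not show Profile, so this alias is a guess.
type Profile = [(Int, Double)]

expandLookup :: (Int, Int) -> Profile -> Profile
expandLookup (n, m) xs = [ (i, valueAt i) | i <- [n .. m] ]
  where
    -- lookup yields the value of the first pair whose key is i, if any;
    -- keys that are missing from xs default to 0
    valueAt i = maybe 0 id (lookup i xs)
```

For expandLookup (2,7) [(4, 2.3), (6, 3)] this should produce the same list as the GHCi session above.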

Remove the First Value in a List that Meets a Criterion

I'm trying to solve this problem. This function takes two parameters. The first is a function that returns a boolean value, and the second is a list of numbers. The function is supposed to remove the first value in the second parameter that returns true when run with the first parameter.
There's a second function, which does the same thing, except it removes the last value that satisfies it, instead of the first.
I'm fairly certain I have the logic down, as I tested it in another language and it worked, my only problem is translating it into Haskell syntax. Here's what I have:
removeFirst :: (t -> Bool) -> [t] -> [t]
removeFirst p xs = []
removeFirst p xs
    | p y = ys
    | otherwise = y:removeFirst p ys
    where
        y:ys = xs
removeLast :: (t -> Bool) -> [t] -> [t]
removeLast p xs = []
removeLast p xs = reverse ( removeFirst p ( reverse xs ) )
I ran:
removeFirst even [1..10]
But instead of getting [1,3,4,5,6,7,8,9,10] as expected, I get [].
What am I doing wrong?

removeFirst p xs = []
This always returns the empty list and it matches all arguments. I think you mean this.
removeFirst _ [] = []

Your first equation,
removeFirst p xs = []
says „Whatever my arguments are, just return []“, and the rest of the code is ignored.
You probably mean
removeFirst p [] = []
saying „When the list is already empty, return the empty list.“
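Putting the fix together, a sketch of both functions could look as follows. Matching on (y:ys) directly instead of the where binding is a stylistic choice made here, and removeLast no longer needs a separate empty-list equation because reverse already handles it:

```haskell
removeFirst :: (t -> Bool) -> [t] -> [t]
removeFirst _ [] = []                 -- nothing left to remove
removeFirst p (y:ys)
  | p y       = ys                    -- drop the first match, keep the rest
  | otherwise = y : removeFirst p ys

-- The last match is the first match of the reversed list.
removeLast :: (t -> Bool) -> [t] -> [t]
removeLast p = reverse . removeFirst p . reverse
```

With this, removeFirst even [1..10] should give [1,3,4,5,6,7,8,9,10].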

How do I split a list into sublists at certain points?

How do I manually split [1,2,3,4,5,6,7] into [[1],[2],[3],[4],[5],[6],[7]]? Manually means without using break.
Then, how do I split a list into sublists according to a predicate? Like so
f even [[1],[2],[3],[4],[5],[6],[7]] == [[1],[2,3],[4,5],[6,7]]
PS: this is not homework, and I've tried for hours to figure it out on my own.

To answer your first question, this is rather an element-wise transformation than a split. The appropriate function to do this is
map :: (a -> b) -> [a] -> [b]
Now, you need a function (a -> b) where b is [a], as you want to transform an element into a singleton list containing the same type. Here it is:
mkList :: a -> [a]
mkList a = [a]
so
map mkList [1,2,3,4,5,6,7] == [[1],[2],...]
As for your second question: if you are not allowed (homework?) to use break, are you allowed to use takeWhile and dropWhile, which together form the two halves of the result of break?
Anyway, for a solution without them ("manually"), just use simple recursion with an accumulator:
f p [] = []
f p (x:xs) = go [x] xs
    where go acc [] = [acc]
          go acc (y:ys) | p y = acc : go [y] ys
                        | otherwise = go (acc++[y]) ys
This will traverse your entire list recursively, always remembering what the current sublist is; when it reaches an element for which p holds, it outputs the current sublist and starts a new one.
Note that go first receives [x] instead of [] to provide for the case where the first element already satisfies p x and we don't want an empty first sublist to be output.
Also, this operates on the original list ([1..7]) instead of [[1],[2]...]. But you can use it on the transformed one as well:
> map concat $ f (odd . head) [[1],[2],[3],[4],[5],[6],[7]]
[[1,2],[3,4],[5,6],[7]]
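If break turns out to be allowed after all, the accumulator can be avoided. The following is only a sketch under that assumption, and splitBefore is a name invented for this example:

```haskell
-- splitBefore (hypothetical name): start a new sublist in front of every
-- element that satisfies p; the first element always opens a sublist.
splitBefore :: (a -> Bool) -> [a] -> [[a]]
splitBefore _ [] = []
splitBefore p (x:xs) = (x : ys) : splitBefore p zs
  where
    -- ys: elements up to the next match; zs: the rest, starting at the match
    (ys, zs) = break p xs
```

Here splitBefore even [1..7] should give [[1],[2,3],[4,5],[6,7]], and because each sublist is produced before the recursion continues, it can also be consumed lazily.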

For the first, you can use a list comprehension:
>>> [[x] | x <- [1,2,3,4,5,6]]
[[1],[2],[3],[4],[5],[6]]
For the second problem, you can use the Data.List.Split module provided by the split package:
import Data.List.Split
f :: (a -> Bool) -> [[a]] -> [[a]]
f predicate = split (keepDelimsL $ whenElt predicate) . concat
This first concatenates the input, because the functions from split work on lists, not on lists of lists. The resulting single list is then split again using functions from the split package.

First:
map (: [])
Second:
f p xs =
  let rs = foldr (\[x] ~(a:r) -> if (p x) then ([]:(x:a):r) else ((x:a):r))
                 [[]] xs
  in case rs of ([]:r) -> r ; _ -> rs
foldr's operation is easy enough to visualize:
foldr g z [a,b,c, ...,x] = g a (g b (g c (.... (g x z) ....)))
So when writing the combining function, it is expecting two arguments, 1st of which is "current element" of a list, and 2nd is "result of processing the rest". Here,
g [x] ~(a:r) | p x = ([]:(x:a):r)
             | otherwise = ((x:a):r)
So visualizing it working from the right, it just adds into the most recent sublist, and opens up a new sublist if it must. But since lists are actually accessed from the left, we keep it lazy with the lazy pattern, ~(a:r). Now it works even on infinite lists:
Prelude> take 9 $ f odd $ map (:[]) [1..]
[[1,2],[3,4],[5,6],[7,8],[9,10],[11,12],[13,14],[15,16],[17,18]]
The pattern for the 1st argument reflects the peculiar structure of your expected input lists.

Multiple Statements In Haskell

How do you have multiple statements in haskell?
Here's what I'm trying to do: given a list such as [a,b,c,d], return every other element, so you get [a,c]. I can see the solution, and here's what I have so far:
fact (xs) | length( xs ) `mod` 2 == 1 = head( xs )
| otherwise = fact(tail( xs ))
This works fine the first time around, but then it quits. What I want to be able to say is: return the head, and then call fact(tail(xs)). How do I do that?

The function you specified returns only a single element. You'd need to change it to something like:
fact [] = [] -- can't call tail on a list of length 0!
fact (xs) | length( xs ) `mod` 2 == 1 = head( xs ) : fact(tail(xs))
          | otherwise = fact(tail( xs ))
You may find it helpful to write out type signatures to help figure out thinkos like this:
fact :: [a] -> [a] -- convert a list of anything to another (shorter) list
However note that this is very slow - O(n^2) in fact, since it's taking length at each step. A much more haskelly solution would use pattern matching to process two elements at a time:
fact :: [a] -> [a]
-- Take the first element of each two-element pair...
fact (x:_:xs) = x:fact xs
-- If we have only one element left, we had an odd-length list.
-- So grab the last element too.
fact [x] = [x]
-- Return nothing if we have an empty list
fact _ = []

There are no statements in Haskell.
You should not abuse parentheses in Haskell. Rather, you should accustom yourself to the language. So your original code should look like
fact xs | length xs `mod` 2 == 1 = head xs
        | otherwise = fact (tail xs)
As bdonlan notes, the function you are looking for is really
fact [] = []
fact [x] = [x]
fact (x:_:xs) = x : fact xs
Suppose we have the list [a, b, c, d]. Let us apply the function and fully evaluate the result.
fact [a, b, c, d] = a : fact [c, d]
                  = a : c : fact []
                  = a : c : []
                  = [a, c]
Note that [a, b, c, d] is exactly the same as a : b : c : d : [] because the two ways of representing lists are interpreted interchangeably by the compiler.

Swapping a semaphore
In fact, we can do it following two possible patterns:
[1,2,3,4,..] becomes [1,3,5,7...]
[1,2,3,4,..] becomes [2,4,6,8...]
Both do the same thing, but they "begin the counting" in opposite ways. Let us implement both of them with the same function! Of course, this function must be parametrized by the "pattern". Two possible patterns exist, so a boolean type suffices for the parametrization. Implementation: let us use a boolean parameter as a "flag" or "semaphore":
module Alternation where
every_second :: [a] -> [a]
every_second = every_second_at False
every_second_at :: Bool -> [a] -> [a]
every_second_at _ [] = []
every_second_at True (x : xs) = x : every_second_at False xs
every_second_at False (x : xs) = every_second_at True xs
We have used an auxiliary function that keeps track of the "flag/semaphore" and swaps it at each step. In fact, this auxiliary function can be regarded as a generalization of the original task. I think that is what is called a "worker wrapper" function.
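The same flag idea can also be sketched without explicit recursion, by pairing every element with an alternating flag. This is an alternative formulation rather than the code above, and the name everyOther and its keepFirst parameter are made up for this example:

```haskell
-- everyOther (hypothetical name): keepFirst selects which of the two
-- patterns is used (True keeps the 1st, 3rd, ...; False the 2nd, 4th, ...).
everyOther :: Bool -> [a] -> [a]
everyOther keepFirst xs = [x | (x, True) <- zip xs flags]
  where
    -- an infinite alternating list of flags, starting with keepFirst
    flags = cycle (if keepFirst then [True, False] else [False, True])
```

For instance, everyOther True "abcd" should give "ac", while everyOther False "abcd" should give "bd".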
Countdown with an index
The task can be generalized even further. Why not write a much more general function, parametrized by a "modulus" m, that "harvests" every mth element of a list?
every_mth 1 [1,2,3,4,...] yields [1,2,3,4...]
every_mth 2 [1,2,3,4,...] yields [2,4,6...]
every_mth 3 [1,2,3,4,...] yields [3,6,9...]
We can use the same ideas as before, just we have to use more complicated a "semaphore": a natural number instead of a boolean. This is a "countdown" parameter, an index i bookkeeping when it is our turn:
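{-# LANGUAGE NPlusKPatterns #-}
-- n+k patterns such as (i + 1) are not part of Haskell 2010; GHC needs the extension above.
module Cycle where
import Nat (Nat)
every_mth :: Nat -> [a] -> [a]
every_mth 0 = undefined
every_mth m@(i + 1) = every_mth_at m i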
We use an auxiliary function (worker wrapper), bookkeeping the countdown index i:
every_mth_at :: Nat -> Nat -> [a] -> [a]
every_mth_at _ _ [] = []
every_mth_at m 0 (x : xs) = x : every_mth m xs
every_mth_at m (i + 1) (x : xs) = every_mth_at m i xs
For simplicity's sake, natural number type is "implemented" here as a mere alias:
module Nat (Nat) where
type Nat = Integer
Perhaps, in a number-theoretic sense, there are also cleaner alternative approaches that are not exactly equivalent to the task you specified, although adjusting them seems straightforward:
let every_mth 1 [0,1,2,3,4,...] yield [0,1,2,3,4,...]
let every_mth 2 [0,1,2,3,4,...] yield [0,2,4,6,8...]
let every_mth 3 [0,1,2,3,4,...] yield [0,3,6,9...]
Thus, it is specified here so that it "incidentally" produces the list of multiples of the parameter when applied to the lazy list of all natural numbers.
In its implementation, it is worth using numbers as a "zero-based" index. Instead of "every mth", we say: use i as an index ranging over 0, 1, ..., u = m-1, where u denotes the upper limit of the possible indices. This upper index can be a useful parameter in the auxiliary function, which counts down the index.
{-# LANGUAGE NPlusKPatterns #-}
module Multiple where
import Nat (Nat)
every_mth :: Nat -> [a] -> [a]
every_mth 0 = undefined
every_mth (u + 1) = countdown u
countdown :: Nat -> [a] -> [a]
countdown = countdown_at 0
countdown_at :: Nat -> Nat -> [a] -> [a]
countdown_at _ _ [] = []
countdown_at 0 u (x : xs) = x : countdown_at u u xs
countdown_at (i + 1) u (x : xs) = countdown_at i u xs
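Since n+k patterns such as (u + 1) are not part of Haskell 2010, here is a sketch of the same countdown idea written with ordinary guards and Int. The name everyMth and the error message are choices made for this example, not part of the code above:

```haskell
-- everyMth (hypothetical name): keep the elements at indices 0, m, 2m, ...
everyMth :: Int -> [a] -> [a]
everyMth m xs
  | m <= 0    = error "everyMth: modulus must be positive"
  | otherwise = go 0 xs
  where
    -- the first argument counts down to the next element to keep
    go _ []     = []
    go 0 (y:ys) = y : go (m - 1) ys
    go i (_:ys) = go (i - 1) ys
```

Applied to the lazy list [0 ..], everyMth 3 should start with 0, 3, 6, 9, that is, the multiples of the parameter.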
