Boolean selection of list - haskell

Suppose we want those elements of list x for which the corresponding element of list y is strictly positive. Any of the three solutions below work:
let x = [1..4]
let y = [1, -1, 2, -2]
[ snd both | both <- zip (map (> 0) y) x, fst both ]
map snd $ filter fst $ zip (map (>0) y) x
sel :: [Bool] -> [a] -> [a]
sel [] _ = []
sel (True : xs) (y : ys) = y : sel xs ys
sel (False : xs) (y : ys) = sel xs ys
sel (map (> 0) y) x
However, what prompted this was that in the R language this can be written compactly like this:
x[y > 0]
and given how much shorter that is I was wondering if there is a shorter/better way to do this in Haskell?

I'm not a haskell specialist, but why not use list comprehension?
[i | (i,j) <- zip x y, j > 0 ]

If you are willing to use a language extension, I can offer the alternative
{-# LANGUAGE ParallelListComp #-}
bfilter :: (b -> Bool) -> [a] -> [b] -> [a]
bfilter cond xs ys = [x | x <- xs | y <- ys, cond y]
Nothing in Haskell will be nearly as short as the R version, because in R, it's a language built-in, but in Haskell it isn't. Apparently whoever designed R found there to be good reasons to include such a primitive, but none of the Haskell designers found there to be convincing reasons to include such a construct in the language (and it wouldn't fit in nicely, so I fully endorse that decision - it may fit in well in R, I don't know that language).

zip x y >>= \(a,b) -> filter(const(b>0)) [a]
Or pointlessly using Applicative...
import Control.Applicative
zip x y >>= filter <$> const.(>0).snd <*> (:[]).fst

As Daniel Fischer says, there isn't any special syntax for this.
If you're going to be doing this operation often, it's best to define your own single reusable function, instead of having to assemble the list comprehension or map/filter chain manually every time. (Your sel doesn't pass this test because the caller has to apply the map separately.)
selectWhere :: [a] -> (a -> Bool) -> [b] -> [b]
selectWhere ys pred = map snd . filter (pred . fst) . zip ys
-- call it like this: selectWhere y (> 0) x
or whichever clearer definition you prefer. The important thing is that you wrap it up inside a function.
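For reference, here is a minimal self-contained sketch of such a wrapper, written as a list comprehension instead of a map/filter chain. The name selectWhere and its argument order follow the definition above; the example value is only an illustration using the data from the question.

```haskell
-- Keep each element of xs whose partner in ys satisfies the predicate p.
selectWhere :: [a] -> (a -> Bool) -> [b] -> [b]
selectWhere ys p xs = [x | (y, x) <- zip ys xs, p y]

-- Illustrative data taken from the question (x = [1..4], y = [1, -1, 2, -2]).
example :: [Int]
example = selectWhere [1, -1, 2, -2] (> 0) [1 .. 4]
```

Because zip stops at the shorter list, this version silently ignores any unmatched tail, which may or may not be what you want.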


How to filter a list by another list in Haskell?

Suppose I have two lists A and B of the same length. I want to keep elements in A which are greater than corresponding elements in B. Let A=[1,5,8], B=[2,4,9], the result should be [5] because 1<2, 5>4, 8<9.
I came up with a solution: let C = zip A B, filter C, then get the result by taking fst of each remaining pair. It's not very elegant. Is there a simpler way?
map fst (filter (\ x-> (fst x) > (snd x)) (zip a b))
Your described solution looks fine to me.
An alternative which is not necessarily better:
import Data.Maybe
import Control.Monad
catMaybes $ zipWith (\a b -> guard (a>b) >> return a) list1 list2
According to the desugaring of monad comprehensions this should also work
{-# LANGUAGE MonadComprehensions #-}
[ a | ( a <- list1 | b <- list2 ), a > b ]
... but in practice it does not. It is a pity because I find it quite elegant.
I wonder whether I got it wrong or it is a GHC bug.
I was working on something similar and as a newbie this is the best I came up with:
filterGreaterThan xs ys = do (x,y) <- zip xs ys
                             guard (x > y)
                             return x
This solution is easier to reason about than the others. The do notation really shines here.
I'm not sure how your code looks but the following function looks quite elegant to me:
greater :: Ord a => [a] -> [a] -> [a]
greater xs = map fst . filter ((>) <$> fst <*> snd) . zip xs
example :: [Int]
example = greater [1,5,8] [2,4,9] -- result is [5]
This pattern is well known in the Lisp community as the decorate-process-undecorate pattern.
A recursive approach, not as elegant as any of the other approaches. It relies on no explicit zipping and produces the result in one pass:
greater :: Ord a => [a] -> [a] -> [a]
greater [] [] = []
greater (x:xs) (y:ys)
  | x > y = x : greater xs ys
  | otherwise = greater xs ys
If you want to generalize this idea nicely, I would recommend looking to mapMaybe:
mapMaybe
  :: (a -> Maybe b)
  -> [a] -> [b]
Applying that idea to zipWith yields
zipWithMaybe
  :: (a -> b -> Maybe c)
  -> [a] -> [b] -> [c]
zipWithMaybe f xs ys =
  [c | Just c <- zipWith f xs ys]
Now you can write your function
keepGreater :: Ord a => [a] -> [a] -> [a]
keepGreater = zipWithMaybe $
  \x y -> x <$ guard (x > y)
Is it really worth the trouble? For lists, probably not. But something like this turns out to be useful in the context of merges for Data.Map.
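Put together with the imports it needs, the idea above might look like the following sketch. It uses catMaybes in place of the pattern-matching comprehension, which is an equivalent substitution and not the answer's exact code.

```haskell
import Control.Monad (guard)
import Data.Maybe (catMaybes)

-- Zip two lists with a function that may reject a pair, keeping only the hits.
zipWithMaybe :: (a -> b -> Maybe c) -> [a] -> [b] -> [c]
zipWithMaybe f xs ys = catMaybes (zipWith f xs ys)

-- Keep the elements of the first list that exceed their partner in the second.
keepGreater :: Ord a => [a] -> [a] -> [a]
keepGreater = zipWithMaybe (\x y -> x <$ guard (x > y))
```

With the lists from the question, keepGreater [1,5,8] [2,4,9] evaluates to [5].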
Pretty similar to @chi's solution, using list concatenation:
concat $ zipWith (\a b -> last $ []:[[a] | a > b]) as bs

How to have multiple infinite ranges in list comprehensions?

In haskell I have a list comprehension like this:
sq = [(x,y,z) | x <- v, y <- v, z <- v, x*x + y*y == z*z, x < y, y < z]
  where v = [1..]
However when I try take 10 sq, it just freezes...
Is there a way to handle multiple infinite ranges?
In addition to the other answers explaining the problem, here is an alternative solution, generalized to work with level-monad and stream-monad that lend themselves for searches over infinite search spaces (It is also compatible with the list monad and logict, but those won't play nicely with infinite search spaces, as you already found out):
{-# LANGUAGE MonadComprehensions #-}
module Triples where
import Control.Monad
sq :: MonadPlus m => m (Int, Int, Int)
sq = [(x, y, z) | x <- v, y <- v, z <- v, x*x + y*y == z*z, x < y, y < z]
  where v = return 0 `mplus` v >>= (return . (1+))
Now, for a fast breadth first search:
*Triples> :m +Control.Monad.Stream
*Triples Control.Monad.Stream> take 10 $ runStream sq
*Triples> :m +Control.Monad.Levels
*Triples Control.Monad.Levels> take 5 $ bfs sq -- larger memory requirements
*Triples Control.Monad.Levels> take 5 $ idfs sq -- constant space, slower, lazy
List comprehensions are translated into nested applications of the concatMap function:
concatMap :: (a -> [b]) -> [a] -> [b]
concatMap f xs = concat (map f xs)
concat :: [[a]] -> [a]
concat [] = []
concat (xs:xss) = xs ++ concat xss
-- Shorter definition:
-- > concat = foldr (++) []
Your example is equivalent to this:
sq = concatMap (\x -> concatMap (\y -> concatMap (\z -> test x y z) v) v) v
  where v = [1..]
        test x y z =
          if x*x + y*y == z*z
          then if x < y
               then if y < z
                    then [(x, y, z)]
                    else []
               else []
          else []
This is basically a "nested loops" approach; it'll first try x = 1, y = 1, z = 1, then move on to x = 1, y = 1, z = 2 and so on, until it tries all of the list's elements as values for z; only then can it move on to try combinations with y = 2.
But of course you can see the problem—since the list is infinite, we never run out of values to try for z. So the combination (3, 4, 5) can only occur after infinitely many other combinations, which is why your code loops forever.
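A two-variable version makes the effect easy to observe. This is only an illustrative sketch (the name stuck is made up for it): the outer variable never moves past its first value.

```haskell
-- x never advances past 1, because the inner generator for y never ends.
stuck :: [(Int, Int)]
stuck = take 3 [(x, y) | x <- [1 ..], y <- [1 ..]]
-- stuck == [(1,1),(1,2),(1,3)]
```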
To solve this, we need to generate the triples in a smarter way, such that for any possible combination, the generator reaches it after some finite number of steps. Study this code (which handles only pairs, not triples):
-- | Take the Cartesian product of two lists, but in an order that guarantees
-- that all combinations will be tried even if one or both of the lists is
-- infinite:
cartesian :: [a] -> [b] -> [(a, b)]
cartesian [] _ = []
cartesian _ [] = []
cartesian (x:xs) (y:ys) =
    [(x, y)] ++ interleave3 vertical horizontal diagonal
  where
    -- The trick is to split the problem into these four pieces:
    --
    -- |(x0,y0)| (x0,y1) ... horiz
    -- +-------+------------
    -- |(x1,y0)|     .
    -- |   .   |     .
    -- |   .   |     .
    -- |   .   |     .
    --   vert       diag
    vertical = map (\x -> (x,y)) xs
    horizontal = map (\y -> (x,y)) ys
    diagonal = cartesian xs ys
interleave3 :: [a] -> [a] -> [a] -> [a]
interleave3 xs ys zs = interleave xs (interleave ys zs)
interleave :: [a] -> [a] -> [a]
interleave xs [] = xs
interleave [] ys = ys
interleave (x:xs) (y:ys) = x : y : interleave xs ys
To understand this code (and fix it if I messed up!) look at this blog entry on how to count infinite sets, and at the fourth diagram in particular—the function is an algorithm based on that "zigzag"!
I just tried a simple version of your sq using this; it finds (3,4,5) almost instantly, but then takes very long to get to any other combination (in GHCI at least). But I think the key lessons to take away from this are:
List comprehensions just don't work well for nested infinite lists.
Don't spend too much time playing around with list comprehensions. Everything that they can do, functions like map, filter and concatMap can do—plus there are many other useful functions in the list library, so concentrate your effort on that.
Your code freezes because your predicate will never be satisfied.
Why?
Let's take an example without any predicate to understand.
>>> let v = [1..] in take 10 $ [ (x, y, z) | x <- v, y <- v, z <- v ]
As you can see, x and y always evaluate to 1, because z never stops increasing.
So your predicate can never be satisfied.
Any workaround?
Try a "nested list" comprehension.
>>> [[ fun x y | x <- rangeX, predXY] | y <- rangeY, predY ]
Or a parallel list comprehension, which can be enabled with
>>> :set -XParallelListComp
Look it up in the documentation.
This is possible, but you'll have to come up with an order in which to generate the numbers. The following generates the numbers you want; note that the x < y test can be replaced by generating only y that are >x and similarly for z (which is determined once x and y are bound):
[(x, y, z) | total <- [1..]
           , x <- [1..total-2]
           , y <- [x+1..total-1]
           , z <- [total - x - y]
           , x*x + y*y == z*z]
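Another ordering that seems to work, sketched here as an alternative and not taken from the answers: make the largest value the only infinite generator and bound the other two by it. The name triples is hypothetical.

```haskell
-- Only z ranges over an infinite list; x and y are bounded by z,
-- so every triple is reached after finitely many steps.
triples :: [(Int, Int, Int)]
triples =
  [ (x, y, z)
  | z <- [1 ..]
  , y <- [1 .. z - 1]
  , x <- [1 .. y - 1]
  , x * x + y * y == z * z
  ]
```

The bounds also encode x < y < z, so the explicit comparisons are no longer needed; take 3 triples gives [(3,4,5),(6,8,10),(5,12,13)].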

Combining foldl and foldr

I've figured out for myself that foldl (or foldl') is the best approach when you want to summarise a list into one result (e.g. sum), and foldr is the best approach when you want to produce another (perhaps even infinite) list (e.g. filter).
So I was considering processing that combines these two, and wrote the function sum_f. sum_f is fairly simple: it adds up the elements of a list, but whenever it finds an element x such that f x is true, it emits the current sum as an element of the output list and starts summing again from zero.
The code is here:
sum_f :: (Num a) => (a -> Bool) -> [a] -> [a]
sum_f f =
  let
    sum_f_worker s (x:xs) =
      let
        rec_call z = sum_f_worker z xs
        next_sum = s + x
      in
        next_sum `seq` if (f x) then next_sum : (rec_call 0) else rec_call next_sum
    sum_f_worker _ [] = []
  in
    sum_f_worker 0
Now, for example, let's sum the positive integers in groups that each end at a power of two. This should output the following:
[1, 2, 3+4, 5+6+7+8, 9+10+11+12+13+14+15+16, ...]
[1, 2, 7, 26, 100, ...]
We can do this like the following:
import Data.Bits
main =
  let
    power_of_two x = (x .&. (x - 1)) == 0 -- .&. is bitwise and
  in
    print $ take 25 $ sum_f power_of_two [(1::Integer)..]
Now this above function (I believe) runs in constant space (like foldl'), even though the groups grow exponentially. Also, it works on infinite lists (like foldr).
I was wondering whether I could write the above using prelude functions without explicit recursion (i.e. only the recursion inside prelude functions). Or does combining the ideas of foldl and foldr here mean that the recursion here can't be done with standard prelude functions and needs to be explicit?
What you want can be expressed using only a right fold as follows:
{-# LANGUAGE BangPatterns #-}
sum_f :: (Num a) => (a -> Bool) -> [a] -> [a]
sum_f p xs = foldr g (const []) xs 0
  where
    g x f !a = if p x then x+a:f 0 else f (x+a)
Prelude Data.Bits> sum_f (\x -> x .&. pred x == 0) [1..10]
And it works on infinite lists:
Prelude Data.Bits> take 10 . sum_f (\x -> x .&. pred x == 0) $ [1..]
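The trick in the right-fold version is that foldr does not build the output list directly; it builds a function that is still waiting for the running sum. The same idea on a simpler problem may make this easier to see. This is only a sketch, and runningTotals is a name made up for the illustration.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Running totals of a list, written as a right fold that threads an accumulator.
runningTotals :: Num a => [a] -> [a]
runningTotals xs = foldr step (const []) xs 0
  where
    -- k is the function built from the rest of the list; acc is the sum so far.
    step x k !acc = let acc' = acc + x in acc' : k acc'
```

Each output element is produced before the rest of the list is inspected, so take 5 (runningTotals [1..]) terminates and yields [1,3,6,10,15].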

Generating a lists of a specific length with Haskell's QuickCheck

-- 3 (find k'th element of a list)
element_at xs x = xs !! x
prop_3a xs x = (x < length xs && x >= 0) ==> element_at xs (x::Int) == (xs !! x::Int)
When prop_3a is run through QuickCheck, it gives up, because it won't generate long enough lists.
How can I write a generator that will generate lists with length longer than the random integer?
hammar's answer is perfectly adequate for the problem. But for the sake of answering the precise question asked, I couldn't help but investigate a bit. Let's use forAll.
prop_bang x = x >= 0 ==> forAll (listLongerThan x) $ \xs ->
  element_at xs x == xs !! x
So now we need a function, listLongerThan :: Int -> Gen [Int]. It takes a length, x, and produces a generator which will produce lists of length greater than x.
listLongerThan :: Int -> Gen [Int]
listLongerThan x = replicateM (x+1) arbitrary
It's rather straightforward: we simply take advantage of the Monad instance of Gen. If you run quickCheck prop_bang, you'll notice it starts taking quite a long time, because it begins testing absurdly long lists. Let's limit the length of the list, to make it go a bit faster. Also, right now listLongerThan only generates a list that is exactly x+1 long; let's mix that up a bit, again utilizing the Monad instance of Gen.
prop_bang =
  forAll smallNumber $ \x ->
  forAll (listLongerThan x) $ \xs ->
  element_at xs x == xs !! x
smallNumber :: Gen Int
smallNumber = fmap ((`mod` 100) . abs) arbitrary
listLongerThan :: Int -> Gen [Int]
listLongerThan x = do
  y <- fmap (+1) smallNumber -- y > 0
  replicateM (x+y) arbitrary
You can use sample smallNumber or sample (listLongerThan 3) in ghci to make sure it is generating the correct stuff.
How about going the other way? First we let QuickCheck pick a list and then we constrain what indices we allow. This works, and does not throw away any test cases.
prop_3a (NonEmpty xs) = forAll (choose (0, length xs - 1)) $ \i ->
  element_at xs i == (xs !! i :: Int)
Here, I use forAll to use a specific generator for the indices, in this case using choose which picks an element from a specified range, and I also use the NonEmptyList type to ensure that we don't try to index into an empty list.
This works:
import Test.QuickCheck
element_at :: [a] -> Int -> a
element_at xs i = xs !! i
prop_3a :: [Int] -> Int -> Property
prop_3a xs i = (i >= 0) ==> (length xs > i) ==> element_at xs i == xs !! i
However, the problem with this is that a lot of sample values are discarded. You could use things like Positive to help with ensuring that the index is valid.
If you want to be more complex, you can use more newtype wrappers to try to generate values of sufficient length (possibly using sized), or generate the list and the index together: generate the list, and then generate the index based upon the length of the list.
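The last suggestion, generating the list and the index together, could be sketched roughly as follows. The names listAndIndex, elementAt and prop_elementAt are hypothetical; listOf1, choose and forAll are the standard Test.QuickCheck combinators.

```haskell
import Test.QuickCheck

elementAt :: [a] -> Int -> a
elementAt xs i = xs !! i

-- Generate a non-empty list first, then an index that is valid for it.
listAndIndex :: Gen ([Int], Int)
listAndIndex = do
  xs <- listOf1 arbitrary
  i <- choose (0, length xs - 1)
  return (xs, i)

-- No test cases are discarded, because every generated index is in range.
prop_elementAt :: Property
prop_elementAt = forAll listAndIndex $ \(xs, i) -> elementAt xs i == xs !! i
```

Running quickCheck prop_elementAt in ghci should then exercise the property without giving up.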

Recursively sort non-contiguous list to list of contiguous lists

I've been trying to learn a bit of functional programming (with Haskell & Erlang) lately and I'm always amazed at the succinct solutions people can come up with when they can think recursively and know the tools.
I want a function to convert a list of sorted, unique, non-contiguous integers into a list of contiguous lists, i.e:
[1,2,3,6,7,8,10,11]
to:
[[1,2,3], [6,7,8], [10,11]]
This was the best I could come up with in Haskell (two functions):
make_ranges :: [[Int]] -> [Int] -> [[Int]]
make_ranges ranges [] = ranges
make_ranges [] (x:xs)
  | null xs = [[x]]
  | otherwise = make_ranges [[x]] xs
make_ranges ranges (x:xs)
  | (last (last ranges)) + 1 == x =
    make_ranges ((init ranges) ++ [(last ranges ++ [x])]) xs
  | otherwise = make_ranges (ranges ++ [[x]]) xs
rangify :: [Int] -> [[Int]]
rangify lst = make_ranges [] lst
It might be a bit subjective but I'd be interested to see a better, more elegant, solution to this in either Erlang or Haskell (other functional languages too but I might not understand it.) Otherwise, points for just fixing my crappy beginner's Haskell style!
Most straightforward way in my mind is a foldr:
ranges = foldr step []
  where step x [] = [[x]]
        step x acc@((y:ys):zs) | y == x + 1 = (x:y:ys):zs
                               | otherwise  = [x]:acc
Or, more concisely:
ranges = foldr step []
  where step x ((y:ys):zs) | y == x + 1 = (x:y:ys):zs
        step x acc = [x]:acc
But wait, there's more!
abstractRanges f = foldr step []
  where step x ((y:ys):zs) | f x y = (x:y:ys):zs
        step x acc = [x]:acc
ranges = abstractRanges (\x y -> y == x + 1)
powerRanges = abstractRanges (\x y -> y == x*x) -- mighty morphin
By turning the guard function into a parameter, you can group more interesting things than just +1 sequences.
*Main> powerRanges [1,1,1,2,4,16,3,9,81,5,25]
The utility of this particular function is questionable...but fun!
I can't believe I got the shortest solution. I know this is no code golf, but I think it is still quite readable:
import GHC.Exts
range xs = map (map fst) $ groupWith snd $ zipWith (\a b -> (a, a-b)) xs [0..]
or pointfree
range = map (map snd) . groupWith fst . zipWith (\a b -> (b-a, b)) [0..]
BTW, groupWith snd can be replaced with groupBy (\a b -> snd a == snd b) if you prefer Data.List over GHC.Exts
BTW: Is there a nicer way to get rid of the lambda (\a b -> (b-a, b)) than (curry $ (,) <$> ((-) <$> snd <*> fst) <*> snd) ?
[Edit 2]
Yeah, I forgot (,) is a functor. So here is the obfuscated version:
range = map (map fst) . groupWith snd . (flip $ zipWith $ curry $ fmap <$> (-).fst <*> id) [0..]
Suggestions are welcome...
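The subtraction trick in this answer works because, in a sorted list of unique integers, every element of a contiguous run has the same value of (element - position). A spelled-out sketch using only Data.List and Data.Function, with the hypothetical name runs:

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Group a sorted list of unique integers into contiguous runs.
runs :: [Int] -> [[Int]]
runs xs = map (map fst) (groupBy ((==) `on` snd) keyed)
  where
    -- Pair every value with (value - position); the key is constant within a run.
    keyed = zip xs (zipWith (-) xs [0 ..])
```

For the list from the question, runs [1,2,3,6,7,8,10,11] gives [[1,2,3],[6,7,8],[10,11]].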
import Data.List (groupBy)
ranges xs = (map . map) snd
          . groupBy (const fst)
          . zip (True : zipWith ((==) . succ) xs (tail xs))
          $ xs
As to how to come up with such a thing: I started with the zipWith f xs (tail xs), which is a common idiom when you want to do something on consecutive elements of a list. Likewise is zipping up a list with information about the list, and then acting (groupBy) upon it. The rest is plumbing.
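As a small illustration of the zipWith f xs (tail xs) idiom on its own, the sketch below marks, for each element after the first, whether it directly follows its predecessor. The name followsPrev is made up, and drop 1 stands in for tail so the helper is obviously total on the empty list.

```haskell
-- One flag per adjacent pair: True where the second element is the successor
-- of the first.
followsPrev :: [Int] -> [Bool]
followsPrev xs = zipWith (\a b -> b == a + 1) xs (drop 1 xs)
-- followsPrev [1,2,3,6,7] == [True,True,False,True]
```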
Then, of course, you can feed it through @pl and get:
import Data.List (groupBy)
import Control.Monad (ap)
import Control.Monad.Instances()
ranges = (((map . map) snd)
         . groupBy (const fst))
         .) =<< zip
         . (True:)
         . ((zipWith ((==) . succ)) `ap` tail)
, which, by my authoritative definition, is evil due to Monad ((->) a). Twice, even. The data flow is meandering too much to lay it out in any sensible way. zipaptail is an Aztec god, and Aztec gods aren't to be messed with.
Another version in Erlang:
part(List) -> part(List,[]).
part([H1,H2|T],Acc) when H1 =:= H2 - 1 ->
    part([H2|T],[H1|Acc]);
part([H1|T],Acc) ->
    [lists:reverse([H1|Acc]) | part(T,[])];
part([],Acc) -> Acc.
k z = map (fst <$>) . groupBy (const snd) .
  zip z . (False:) . (zipWith ((==) . succ) <*> tail) $ z
Try reusing standard functions.
import Data.List (groupBy)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify l = map (map fst) $ groupBy (const snd) $ zip l contigPoints
  where contigPoints = False : zipWith (==) (map (+1) l) (drop 1 l)
Or, following (mixed) advice to use unfoldr, stop abusing groupBy, and be happy using partial functions when it doesn't matter:
import Control.Arrow ((***))
import Data.List (unfoldr)
spanContig :: (Eq a, Num a) => [a] -> ([a], [a])
spanContig l =
  map fst *** map fst $ span (\(a, b) -> a == b + 1) $ zip l (head l - 1 : l)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify = unfoldr $ \l -> if null l then Nothing else Just $ spanContig l
Erlang using foldr:
ranges(List) ->
    lists:foldr(fun (X, [[Y | Ys] | Acc]) when Y == X + 1 ->
                        [[X, Y | Ys] | Acc];
                    (X, Acc) ->
                        [[X] | Acc]
                end, [], List).
This is my v0.1 and I can probably make it better:
makeCont :: [Int] -> [[Int]]
makeCont [] = []
makeCont [a] = [[a]]
makeCont (a:b:xs) = if b - a == 1
  then (a : head next) : tail next
  else [a] : next
  where
    next :: [[Int]]
    next = makeCont (b:xs)
And I will try and make it better. Edits coming I think.
As a comparison, here's an implementation in Erlang:
partition(L) -> [lists:reverse(T) || T <- lists:reverse(partition(L, {[], []}))].
partition([E|L], {R, [EL|_] = T}) when E == EL + 1 -> partition(L, {R, [E|T]});
partition([E|L], {R, []}) -> partition(L, {R, [E]});
partition([E|L], {R, T}) -> partition(L, {[T|R], [E]});
partition([], {R, []}) -> R;
partition([], {R, T}) -> [T|R].
The standard paramorphism recursion scheme isn't in Haskell's Data.List module, though I think it should be. Here's a solution using a paramorphism, because you are building a list-of-lists from a list, the cons-ing is a little tricksy:
contig :: (Eq a, Num a) => [a] -> [[a]]
contig = para phi [] where
  phi x ((y:_),(a:acc)) | x + 1 == y = (x:a):acc
  phi x (_, acc) = [x]:acc
Paramorphism is general recursion or a fold with lookahead:
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para phi b [] = b
para phi b (x:xs) = phi x (xs, para phi b xs)
It can be pretty clear and simple in the Erlang:
partition([]) -> [];
partition([A|T]) -> partition(T, [A]).
partition([A|T], [B|_]=R) when A =:= B+1 -> partition(T, [A|R]);
partition(L, P) -> [lists:reverse(P)|partition(L)].
Edit: Out of curiosity I compared my version with Lukas's. Mine seems about 10% faster in both native and bytecode builds on a test set generated by lists:usort([random:uniform(1000000)||_<-lists:seq(1,1000000)]) on R14B01 64-bit on my notebook. (The test set is 669462 elements long and was partitioned into 232451 sublists.)
Edit2: Another test set, lists:usort([random:uniform(1000000)||_<-lists:seq(1,10000000)]), with length 999963 and 38 partitions, shows a bigger difference in native code: my version finishes in less than half the time. The bytecode version is only about 20% faster.
Edit3: Some micro-optimizations that give additional performance but lead to uglier and less maintainable code:
part4([]) -> [];
part4([A|T]) -> part4(T, A, []).
part4([A|T], B, R) when A =:= B+1 -> part4(T, A, [B|R]);
part4([A|T], B, []) -> [[B]|part4(T, A, [])];
part4([A|T], B, R) -> [lists:reverse(R, [B])|part4(T, A, [])];
part4([], B, R) -> [lists:reverse(R,[B])].
Here's an attempt from a haskell noob
ranges ls = let (a, r) = foldl (\(r, a@(h:t)) e -> if h + 1 == e then (r, e:a) else (a:r, [e])) ([], [head ls]) (tail ls)
            in reverse . map reverse $ r : a
