I have generalized the existing Data.List.partition implementation
partition :: (a -> Bool) -> [a] -> ([a],[a])
partition p xs = foldr (select p) ([],[]) xs
  where
    -- select :: (a -> Bool) -> a -> ([a], [a]) -> ([a], [a])
    select p x ~(ts,fs) | p x       = (x:ts,fs)
                        | otherwise = (ts, x:fs)
to a "tri-partition" function
ordPartition :: (a -> Ordering) -> [a] -> ([a],[a],[a])
ordPartition cmp xs = foldr select ([],[],[]) xs
  where
    -- select :: a -> ([a], [a], [a]) -> ([a], [a], [a])
    select x ~(lts,eqs,gts) = case cmp x of
      LT -> (x:lts,eqs,gts)
      EQ -> (lts,x:eqs,gts)
      GT -> (lts,eqs,x:gts)
But now I'm facing confusing behaviour: when compiled with ghc -O1, the foo and bar functions work in constant space, but the doo function leads to a space leak.
foo xs = xs1
  where
    (xs1,_,_) = ordPartition (flip compare 0) xs
bar xs = xs2
  where
    (_,xs2,_) = ordPartition (flip compare 0) xs
-- pass-thru "least" non-empty partition
doo xs | null xs1  = if null xs2 then xs3 else xs2
       | otherwise = xs1
  where
    (xs1,xs2,xs3) = ordPartition (flip compare 0) xs
main :: IO ()
main = do
  print $ foo [0..100000000::Integer] -- results in []
  print $ bar [0..100000000::Integer] -- results in [0]
  print $ doo [0..100000000::Integer] -- results in [0] with space-leak
So my question now is,
What is the reason for the space leak in doo, which seems surprising to me, since foo and bar don't exhibit such a space leak? and
Is there a way to implement ordPartition such that, when used in the context of functions such as doo, it runs in constant space?
It's not a space leak. To find out whether a component list is empty, the entire input list has to be traversed and the other component lists constructed (as thunks) if it is. In the doo case, xs1 is empty, so the entire thing has to be built before deciding what to output.
That is a fundamental property of all partitioning algorithms: if one of the results is empty and you check for its emptiness as a condition, that check cannot be completed before the entire list has been traversed.
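As a small illustration of this point, here is a sketch that reuses the question's ordPartition. The names firstSmall and noneGreater are made up for the example. It contrasts a demand that can be answered from a prefix of the input with one that cannot:

```haskell
-- ordPartition as defined in the question.
ordPartition :: (a -> Ordering) -> [a] -> ([a], [a], [a])
ordPartition cmp = foldr select ([], [], [])
  where
    select x ~(lts, eqs, gts) = case cmp x of
      LT -> (x : lts, eqs, gts)
      EQ -> (lts, x : eqs, gts)
      GT -> (lts, eqs, x : gts)

-- Productive: the first few LT elements are available without
-- traversing the rest of the (here infinite) input.
firstSmall :: [Integer]
firstSmall = take 3 lts
  where (lts, _, _) = ordPartition (`compare` 5) [0 ..]

-- Not answerable early: deciding that a component is empty requires
-- walking the whole (finite) input.
noneGreater :: Bool
noneGreater = null gts
  where (_, _, gts) = ordPartition (`compare` 100) [0 .. 50 :: Integer]
```

In doo, the test null xs1 is of the second kind whenever xs1 turns out to be empty, and by the time it is answered the other components have been accumulated.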
I would like to define a greaters function, which selects from a list items that are larger than the one before it.
For instance:
greaters [1,3,2,4,3,4,5] == [3,4,4,5]
greaters [5,10,6,11,7,12] == [10,11,12]
The definition I came up with is this :
greaters :: Ord a => [a] -> [a]
Things I tried so far:
greaters (x:xs) = group [ d | d <- xs, x < xs ]
Any tips?
We can derive a foldr-based solution by a series of re-writes starting from the hand-rolled recursive solution in the accepted answer:
greaters :: Ord a => [a] -> [a]
greaters [] = []
greaters (x:xs) = go x xs          -- let's re-write this clause
  where
    go _ [] = []
    go last (act:xs)
      | last < act = act : go act xs
      | otherwise  =       go act xs

greaters (x:xs) = go xs x          -- swap the arguments
  where
    go [] _ = []
    go (act:xs) last
      | last < act = act : go xs act
      | otherwise  =       go xs act

greaters (x:xs) = foldr g z xs x   -- go ==> foldr g z
  where
    foldr g z [] _ = []
    foldr g z (act:xs) last
      | last < act = act : foldr g z xs act
      | otherwise  =       foldr g z xs act

greaters (x:xs) = foldr g z xs x
  where                            -- simplify according to
    z _ = []                       --   foldr's definition
    g act (foldr g z xs) last
      | last < act = act : foldr g z xs act
      | otherwise  =       foldr g z xs act
Thus, with one last re-write of foldr g z xs ==> r,
greaters (x:xs) = foldr g z xs x
  where
    z = const []
    g act r last
      | last < act = act : r act
      | otherwise  =       r act
The extra parameter serves as a state being passed forward as we go along the input list, the state being the previous element; thus avoiding the construction by zip of the shifted-pairs list serving the same purpose.
I would start from here:
greaters :: Ord a => [a] -> [a]
greaters [] = []
greaters (x:xs) = greatersImpl x xs
  where
    greatersImpl last [] = <fill this out>
    greatersImpl last (x:xs) = <fill this out>
The following functions are everything you’d need for one possible solution :)
zip :: [a] -> [b] -> [(a, b)]
drop 1 :: [a] -> [a]
filter :: (a -> Bool) -> [a] -> [a]
(<) :: Ord a => a -> a -> Bool
uncurry :: (a -> b -> c) -> (a, b) -> c
map :: (a -> b) -> [a] -> [b]
snd :: (a, b) -> b
Note: drop 1 can be used when you’d prefer a “safe” version of tail.
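For readers who want to check their attempt, one way these pieces could fit together is sketched below. The name greatersZip is chosen only for this example, and other arrangements of the same functions work just as well:

```haskell
-- Pair each element with its successor, keep the pairs whose second
-- component is larger, and return those second components.
greatersZip :: Ord a => [a] -> [a]
greatersZip xs = map snd (filter (uncurry (<)) (zip xs (drop 1 xs)))
```

Because zip stops at the shorter list, the empty and singleton inputs need no special case.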
If you like over-generalization like me, you can use the witherable package.
{-# language ScopedTypeVariables #-}
import Control.Monad.State.Lazy
import Data.Witherable
{-
class (Traversable t, Filterable t) => Witherable t where
  -- `wither` is an effectful version of mapMaybe.
  wither :: Applicative f => (a -> f (Maybe b)) -> t a -> f (t b)
-}
greaters
  :: forall t a. (Ord a, Witherable t)
  => t a -> t a
greaters xs = evalState (wither go xs) Nothing
  where
    go :: a -> State (Maybe a) (Maybe a)
    go curr = do
      st <- get
      put (Just curr)
      pure $ case st of
        Nothing -> Nothing
        Just prev ->
          if curr > prev
            then Just curr
            else Nothing
The state is the previous element, if there is one. Everything is about as lazy as it can be. In particular:
If the container is a Haskell list, then it can be an infinite one and everything will still work. The beginning of the list can be produced without withering the rest.
If the container extends infinitely to the left (e.g., an infinite snoc list), then everything will still work. How can that be? We only need to know what was in the previous element to work out the state for the current element.
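If pulling in witherable is not an option, roughly the same "previous element as state" idea can be sketched for plain lists using only base. This is a list-only approximation of the code above, not an equivalent of the Witherable version, and the name greatersAccum is made up for the example:

```haskell
import Data.List (mapAccumL)
import Data.Maybe (catMaybes)

greatersAccum :: Ord a => [a] -> [a]
greatersAccum = catMaybes . snd . mapAccumL step Nothing
  where
    -- The state is the previous element, if any; the output says
    -- whether the current element should be kept.
    step prev curr =
      ( Just curr
      , if maybe False (curr >) prev then Just curr else Nothing )
```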
"Roll your own recursive function" is certainly an option here, but it can also be accomplished with a fold. filter can't do it because we need some sort of state being passed, but fold can nicely accumulate the result while keeping that state at the same time.
Of course, the key idea is that we keep track of the last element and add the next one to the result if it's greater than the last one.
greaters :: [Int] -> [Int]
greaters [] = []
greaters (h:t) = reverse . snd $ foldl (\(a, r) x -> (x, if x > a then x:r else r)) (h, []) t
I'd really love to eta-reduce it, but since we're dropping the first element and seeding the accumulator with it, it becomes awkward with the empty list; still, this is effectively a one-liner.
So I have come up with a foldr solution. It should be similar to what @Will Ness has demonstrated, but not quite, I suppose, as we don't need a separate empty list check in this one.
The thing is, while folding we need to encapsulate the previous element and also the state (the result) in a function type. So in the go helper function, f is the state (the result), c is the current element of interest, and p is the previous one (next, since we are folding right). While folding from right to left we nest these functions, only to run the result by applying it to the head of the input list.
go :: Ord a => a -> (a -> [a]) -> (a -> [a])
go c f = \p -> let r = f c
               in  if c > p then c:r else r
greaters :: Ord a => [a] -> [a]
greaters = foldr go (const []) <*> head
*Main> greaters [1,3,2,4,3,4,5]
[3,4,4,5]
*Main> greaters [5,10,6,11,7,12]
[10,11,12]
*Main> greaters [651,151,1651,21,651,1231,4,1,16,135,87]
[1651,651,1231,16,135]
*Main> greaters [1]
[]
*Main> greaters []
[]
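The point-free definition relies on the Applicative instance for functions, where (f <*> g) xs = f xs (g xs). Written out, it might look like the sketch below. The name greatersExpanded is used only here:

```haskell
go :: Ord a => a -> (a -> [a]) -> (a -> [a])
go c f = \p -> let r = f c in if c > p then c : r else r

-- Same definition with the function Applicative written out:
-- (f <*> g) xs = f xs (g xs).
greatersExpanded :: Ord a => [a] -> [a]
greatersExpanded xs = foldr go (const []) xs (head xs)
```

For the empty list the fold yields const [], which ignores its argument, so head [] is never forced. The head is also compared against itself, which is harmless only because the comparison is strict; with >= the first element would slip into the result.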
As per the rightful comments of @Will Ness, here is a slightly modified, more general version which hopefully doesn't break suddenly when the comparison changes. Note that const [] :: b -> [a] is the initial function and [] is the terminator applied to the result of foldr. We don't need Maybe since [] can easily do the job of Nothing here.
gs :: Ord a => [a] -> [a]
gs xs = foldr go (const []) xs $ []
  where
    go :: Ord a => a -> ([a] -> [a]) -> ([a] -> [a])
    go c f = \ps -> let r = f [c]
                    in  case ps of
                          []  -> r
                          [p] -> if c > p then c:r else r
I'm imagining a function like
takeChunkUntil :: [a] -> ([a] -> Bool) -> ([a], [a])
Hopefully lazy.
It takes elements from the first list until the group of them satisfies the predicate, then returns that sublist as well as the remaining elements.
TO ANSWER SOME QUESTIONS:
The ultimate goal is to make something that reads Huffman codes lazily. So if you have a string of bits, here represented as Bool, bs, you can write take n $ decode huffmanTree bs to take the first n coded values while consuming only as much of bs as necessary. If you would like I'll post more details and my attempted solutions. This could get long :) (Note I'm a tutor who was given this problem by a student, but I didn't try to help him as it was beyond me, however I'm very curious now.)
CONTINUED: Here goes the whole thing:
Definition of Huffman tree:
data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)
Goal: write a lazy decode function that returns a pair of the decoded values and a Boolean indicating if there were any values left over that were not fully long enough to be decoded into a value. Note: we are using Bool to represent a bit: True =1, False = 0.
decode :: BTree a -> [Bool] -> ([a], Bool)
Here's the essence: The first function I wrote was a function that decodes one value. Returns Nothing if the input list was empty, otherwise returns the decoded value and the remaining "bit".
decode1 :: BTree a -> [Bool] -> Maybe (a, [Bool])
decode1 (Leaf v) bs = Just (v, bs)
decode1 _ [] = Nothing
decode1 (Fork left right) (b:bs)
  | b         = decode1 right bs
  | otherwise = decode1 left bs
First, I figured that I needed some kind of tail recursion to make this lazy. Here's what doesn't work. I think it doesn't, anyway. Notice how it's recursive, but I'm passing a list of "symbols decoded so far" and appending the new one. Inefficient and maybe (if my understanding is right) won't lead to tail recursion.
decodeHelp :: BTree a -> [a] -> [Bool] -> ([a],Bool)
decodeHelp t symSoFar bs = case decode1 t bs of
  Nothing -> (symSoFar,False)
  Just (s,remain) -> decodeHelp t (symSoFar ++ [s]) remain
So I thought, how can I write a better kind of recursion in which I decode a symbol and append it to the next call? The key is to return a list of [Maybe a], in which Just a is a successfully decoded symbol and Nothing means no symbol could be decoded (i.e. remaining booleans were not sufficient)
decodeHelp2 :: BTree a -> [Bool] -> [Maybe a]
decodeHelp2 t bs = case decode1 t bs of
  Nothing -> [Nothing]
  Just (s, remain) -> case remain of
    [] -> []
    -- in the following line I can just cons Just s onto the
    -- recursive call. My understanding is that's what makes tail
    -- recursion work and lazy.
    _ -> Just s : decodeHelp2 t remain
But obviously this is not what the problem set wants out of the result. How can I turn all these [Maybe a] into a ([a], Bool)? My first thought was to apply scanl
Here's the scanning function. It accumulates Maybe a into ([a], Bool)
sFunc :: ([a], Bool) -> Maybe a -> ([a], Bool)
sFunc (xs, _) Nothing = (xs, False)
sFunc (xs, _) (Just x) = (xs ++ [x], True)
Then you can write
decodeSortOf :: BTree a -> [Bool] -> [([a], Bool)]
decodeSortOf t bs = scanl sFunc ([],True) (decodeHelp2 t bs)
I verified this works and is lazy:
take 3 $ decodeSortOf xyz_code [True,False,True,True,False,False,False,error "foo"] gives [("",True),("y",True),("yz",True)]
But this is not the desired result. Help, I'm stuck!
Here's a hint. I've swapped the argument order to get something more idiomatic, and I've changed the result type to reflect the fact that you may not find an acceptable chunk.
import Data.List (inits, tails)
takeChunkUntil :: ([a] -> Bool) -> [a] -> Maybe ([a], [a])
takeChunkUntil p as = _ $ zip (inits as) (tails as)
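One way the hole in the hint might be filled is with find from Data.List. This is a sketch of a possible completion, not necessarily the one the hint had in mind:

```haskell
import Data.List (find, inits, tails)

-- Try every split point from left to right and return the first
-- split whose prefix satisfies the predicate.
takeChunkUntil :: ([a] -> Bool) -> [a] -> Maybe ([a], [a])
takeChunkUntil p as = find (p . fst) (zip (inits as) (tails as))
```

Since inits and tails are both lazy, only as much of the input is inspected as the predicate needs, at the cost of re-running the predicate on each successively longer prefix.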
We can use explicit recursion here, where if the predicate is satisfied, we prepend to the first item of the tuple. If not, we make a 2-tuple where we put the (remaining) list in the second item of the 2-tuple. For example:
import Control.Arrow(first)
takeChunkUntil :: ([a] -> Bool) -> [a] -> ([a], [a])
takeChunkUntil p = go []
  where go _ [] = ([], [])
        go gs xa@(x:xs) | not (p (x:gs)) = first (x:) (go (x:gs) xs)
                        | otherwise = ([], xa)
Here we assume that the order of the elements in the group is not relevant to the predicate (since each time we pass the list in reverse order). If the order is relevant, we can use a difference list, for example. I leave that as an exercise.
This works on an infinite list as well, for example:
Prelude Control.Arrow> take 10 (fst (takeChunkUntil (const False) (repeat 1)))
[1,1,1,1,1,1,1,1,1,1]
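Coming back to the Huffman goal stated in the question, a lazy decode can also be sketched directly on top of decode1, without going through takeChunkUntil. The sketch below makes two assumptions: the Bool is True when undecodable bits are left over (the question does not pin down the polarity), and the root of the tree is a Fork. The tree xyzCode is hypothetical, chosen to be consistent with the question's example output.

```haskell
data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)

decode1 :: BTree a -> [Bool] -> Maybe (a, [Bool])
decode1 (Leaf v) bs = Just (v, bs)
decode1 _ [] = Nothing
decode1 (Fork left right) (b : bs)
  | b = decode1 right bs
  | otherwise = decode1 left bs

-- Assumes the root is a Fork; a bare Leaf consumes no bits and would
-- yield an endless stream of the same symbol.
decode :: BTree a -> [Bool] -> ([a], Bool)
decode t bs = case decode1 t bs of
  Nothing -> ([], not (null bs)) -- True when stray bits remain
  Just (s, rest) ->
    let (ss, leftover) = decode t rest -- lazy pattern: not forced yet
    in (s : ss, leftover)

-- Hypothetical code tree: 0 -> 'x', 10 -> 'y', 11 -> 'z'.
xyzCode :: BTree Char
xyzCode = Fork (Leaf 'x') (Fork (Leaf 'y') (Leaf 'z'))
```

The decoded symbol is consed onto the result before the recursive call is inspected, so take n (fst (decode t bs)) reads only the bits needed for the first n symbols.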
Given a condition, I want to search through a list of elements and return the first element that reaches the condition, and the previous one.
In C/C++ this is easy :
int i = 0;
for(;;i++) if (arr[i] == 0) break;
After we get the index where the condition is met, getting the previous element is easy, through "arr[i-1]"
In Haskell:
dropWhile (/=0) list gives us the last element I want
takeWhile (/=0) list gives us the first element I want
But I don't see a way of getting both in a simple manner. I could enumerate the list and use indexing, but that seems messy. Is there a proper way of doing this, or a way of working around this?
I would zip the list with its tail so that you have pairs of elements
available. Then you can just use find on the list of pairs:
import Data.List (find)
f :: [Int] -> Maybe (Int, Int)
f xs = find ((>3) . snd) (zip xs (tail xs))
> f [1..10]
Just (3,4)
If the first element matches the predicate this will return
Nothing (or the second match if there is one) so you might need to special-case that if you want something
different.
As Robin Zigmond says break can also work:
g :: [Int] -> (Int, Int)
g xs = case break (>3) xs of
  (_, [])   -> error "not found"
  ([], _)   -> error "first element"
  (ys, z:_) -> (last ys, z)
(Or have this return a Maybe as well, depending on what you need.)
But this will, I think, keep the whole prefix ys in memory until it
finds the match, whereas f can start garbage-collecting the elements
it has moved past. For small lists it doesn't matter.
I would use a zipper-like search:
type ZipperList a = ([a], [a])
toZipperList :: [a] -> ZipperList a
toZipperList = (,) []
moveUntil' :: (a -> Bool) -> ZipperList a -> ZipperList a
moveUntil' _ (xs, []) = (xs, [])
moveUntil' f (xs, (y:ys))
  | f y       = (xs, (y:ys))
  | otherwise = moveUntil' f (y:xs, ys)
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = moveUntil' f . toZipperList
example :: [Int]
example = [2,3,5,7,11,13,17,19]
result :: ZipperList Int
result = moveUntil (>10) example -- ([7,5,3,2], [11,13,17,19])
The good thing about zippers is that they are efficient, you can access as many elements near the index you want, and you can move the focus of the zipper forwards and backwards. Learn more about zippers here:
http://learnyouahaskell.com/zippers
Note that my moveUntil function is like break from the Prelude but the initial part of the list is reversed. Hence you can simply get the head of both lists.
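To make the "head of both lists" remark concrete, here is a compact sketch. moveUntil is restated with a local helper so the block is self-contained, and aroundMatch is a hypothetical name introduced for this example:

```haskell
type ZipperList a = ([a], [a])

-- Walk right until the predicate holds, pushing passed elements onto
-- the (reversed) left-hand list.
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = go []
  where
    go xs [] = (xs, [])
    go xs (y : ys)
      | f y = (xs, y : ys)
      | otherwise = go (y : xs) ys

-- Hypothetical helper: the element just before the match, and the match.
aroundMatch :: (a -> Bool) -> [a] -> Maybe (a, a)
aroundMatch f xs = case moveUntil f xs of
  (prev : _, hit : _) -> Just (prev, hit)
  _ -> Nothing
```

Pattern matching on both heads covers the edge cases (no match, or a match on the very first element) by returning Nothing instead of calling head on an empty list.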
A non-awkward way of implementing this as a fold is making it a paramorphism. For general explanatory notes, see this answer by dfeuer (I took foldrWithTails from it):
-- The extra [a] argument f takes with respect to foldr
-- is the tail of the list at each step of the fold.
foldrWithTails :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldrWithTails f n = go
  where
    go (a : as) = f a as (go as)
    go [] = n
boundary :: (a -> Bool) -> [a] -> Maybe (a, a)
boundary p = foldrWithTails findBoundary Nothing
  where
    findBoundary x (y : _) bnd
      | p y = Just (x, y)
      | otherwise = bnd
    findBoundary _ [] _ = Nothing
Notes:
If p y is true we don't have to look at bnd to get the result. That makes the solution adequately lazy. You can check that by trying out boundary (> 1000000) [0..] in GHCi.
This solution gives no special treatment to the edge case of the first element of the list matching the condition. For instance:
GHCi> boundary (<1) [0..9]
Nothing
GHCi> boundary even [0..9]
Just (1,2)
There are several alternatives; either way, you'll have to implement this yourself. You could use explicit recursion:
getLastAndFirst :: (a -> Bool) -> [a] -> Maybe (a, a)
getLastAndFirst p (x : xs@(y:ys))
  | p y = Just (x, y)
  | otherwise = getLastAndFirst p xs
getLastAndFirst _ _ = Nothing -- empty or single-element list
Alternately, you could use a fold, but that would look fairly similar to the above, except less readable.
A third option is to use break, as suggested in the comments:
getLastAndFirst' :: (a -> Bool) -> [a] -> Maybe (a,a)
getLastAndFirst' p l =
  case break p l of
    (xs@(_:_), (y:_)) -> Just (last xs, y)
    _ -> Nothing
(\(xs, ys) -> [last xs, head ys]) $ break (==0) list
Using break as Robin Zigmond suggested ended up short and simple, not using Maybe to catch edge-cases, but I could replace the lambda with a simple function that used Maybe.
I toyed a bit more with the solution and came up with
breakAround :: Int -> Int -> (a -> Bool) -> [a] -> [a]
breakAround m n cond list = (\(xs, ys) -> reverse (take m (reverse xs)) ++ take n ys) $ break cond list
which takes two integers, a predicate, and a list of a, and returns a single list of m elements before the predicate and n elements after.
Example: breakAround 3 2 (==0) [3,2,1,0,10,20,30] would return [3,2,1,0,10]
I'm practicing some Haskell to understand the \, case.. of and Maybe better.
I've got this little function here which should return Nothing if the array is empty, Just y if y is equal to the head of the array xs and Just (tail xs) if y is not equal to the head of the array xs.
I set the return type of the function to Maybe a because in one case it should return an Int and in the other an [Int].
funct :: Int -> [Int] -> Maybe a
funct = \y xs -> case xs of
  [] -> Nothing
  xs -> if ((head xs) == y)
          then Just y
          else Just (tail xs)
What am I missing? I am getting the error that it couldn't match type a with [Int]. Isn't the a in Maybe a generic or is it influenced by the fact that I "used" the a as an Int in the Just y part?
EDIT: Ok my suggestion was bs, I tested it with Just (tail xs) in the then and else part and I'm still getting the same error.
"I set the return type of the function to Maybe a because in one case it should return an Int and in the other an [Int]."
Haskell is statically typed, meaning that a function cannot have a different return type at runtime: it has exactly one return type. a is not an ad hoc type (in the sense that it could be any type at runtime). Rather, a is determined at compile time, based on the types of the other parameters.
For instance you can write: foo :: a -> a -> a to specify that if foo takes two Ints (again known at compile time), the result will be an Int.
You can however use Either a b to say that you will either return a Left a, or a Right b. So you can rewrite it to:
funct :: Int -> [Int] -> Maybe (Either Int [Int])
funct = \y xs -> case xs of
  [] -> Nothing
  xs -> if ((head xs) == y)
          then Just (Left y)
          else Just (Right (tail xs))
Your function however is quite verbose, you can make it more clear and compact as follows:
funct :: Int -> [Int] -> Maybe (Either Int [Int])
funct _ [] = Nothing
funct y (h:t) | h == y    = Just (Left y)
              | otherwise = Just (Right t)
Furthermore we can generalize it to:
funct :: Eq a => a -> [a] -> Maybe (Either a [a])
funct _ [] = Nothing
funct y (h:t) | h == y    = Just (Left y)
              | otherwise = Just (Right t)
Here Eq is a typeclass that specifies that there exists a function (==) :: a -> a -> Bool that we can use. Otherwise using == in the body of the function would not be possible.
Furthermore we use patterns in the head of every clause. [] is a pattern that describes the empty list. (h:t) on the other hand is a pattern describing a list containing at least one element: the head h, followed by a (possibly empty tail t).
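To show how a caller deals with the combined result type, here is a small sketch. The function describe is hypothetical and exists only to illustrate matching on all three outcomes:

```haskell
funct :: Eq a => a -> [a] -> Maybe (Either a [a])
funct _ [] = Nothing
funct y (h : t)
  | h == y = Just (Left y)
  | otherwise = Just (Right t)

-- Hypothetical caller that handles all three outcomes.
describe :: Int -> [Int] -> String
describe y xs = case funct y xs of
  Nothing -> "empty list"
  Just (Left v) -> "head matched: " ++ show v
  Just (Right rest) -> "rest of list: " ++ show rest
```

The compiler knows the full result type up front, and the caller recovers which case occurred by pattern matching rather than by inspecting a type at runtime.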
The code below retains, for a given integer n, the first n items from a list, drops the following n items, keeps the following n and so on. It works correctly for any finite list.
In order to make it usable with infinite lists, I used seq to force evaluation of the accumulator before the recursive step, as foldl' does, for example.
I tested by tracing the accumulator's value and it seems that it is effectively computed as desired with finite lists.
Nevertheless, it doesn't work when applied to an infinite list. The "take" in the main function is only executed once the inner calculation has terminated, which, of course, never happens with an infinite list.
Please, can someone tell me where my mistake is?
main :: IO ()
main = print (take 2 (foo 2 [1..100]))
foo :: Show a => Int -> [a] -> [a]
foo l lst = inFoo keepOrNot 1 l lst []
inFoo :: Show a => (Bool -> Int -> [a] -> [a] -> [a]) -> Int -> Int -> [a] -> [a] -> [a]
inFoo keepOrNot i l [] lstOut = lstOut
inFoo keepOrNot i l lstIn lstOut =
  let lstOut2 = keepOrNot (odd i) l lstIn lstOut
  in  lstOut2 `seq` inFoo keepOrNot (i+1) l (drop l lstIn) lstOut2
keepOrNot :: Bool -> Int -> [a] -> [a] -> [a]
keepOrNot b n lst1 lst2 = case b of
  True  -> lst2 ++ take n lst1
  False -> lst2
Here's how list concatenation is implemented:
(++) :: [a] -> [a] -> [a]
(++) [] ys = ys
(++) (x:xs) ys = x : xs ++ ys
Note that
the right hand list structure is reused as is (even if it's not been evaluated yet, so lazily)
the left hand list structure is rewritten (copied)
This means that if you're using ++ to build up a list, you want the accumulator to be on the right hand side. (For finite lists, merely for efficiency reasons --- if the accumulator is on the left hand side, it will be repeatedly copied and this is inefficient. For infinite lists, the caller can't look at the first element of the result until it's been copied for the last time, and there won't be a last time because there's always something else to concatenate onto the right of the accumulator.)
The True case of keepOrNot has the accumulator on the left of the ++. You need to use a different data structure.
The usual idiom in this case is to use difference lists. Instead of using type [a] for your accumulator, use [a] -> [a]. Your accumulator is now a function that prepends a list to the list it's given as input. This avoids repeated copying, and the list can be built lazily.
keepOrNot :: Bool -> Int -> [a] -> ([a] -> [a]) -> ([a] -> [a])
keepOrNot b n lst1 acc = case b of
  True  -> acc . (take n lst1 ++)
  False -> acc
The initial value of the accumulator should be id. When you want to convert it to a conventional list, call it with [] (i.e., acc []).
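The idiom can be sketched in isolation as follows. The names DList, emptyD, snocChunk, toList and collect are chosen for this example and are not taken from any library:

```haskell
-- A difference list represents a list as "prepend me to whatever follows".
type DList a = [a] -> [a]

emptyD :: DList a
emptyD = id

-- Append a chunk on the right of the accumulator without copying
-- what has been accumulated so far.
snocChunk :: DList a -> [a] -> DList a
snocChunk acc chunk = acc . (chunk ++)

toList :: DList a -> [a]
toList acc = acc []

-- Accumulate from the left, in the style of the question's inFoo.
collect :: [[a]] -> [a]
collect = toList . foldl snocChunk emptyD
```

Each chunk is traversed once when the final list is consumed, instead of the whole accumulator being re-copied on every step. A tail-recursive driver that only returns its accumulator at the end of the input still never returns on an infinite input, so this addresses the copying and not the termination problem.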
seq is a red herring here. seq does not force the entire list. seq only determines whether it is of the form [] or x : xs.
You're learning Haskell, yes? So it would be a good idea as an exercise to modify your code to use a difference list accumulator. Possibly the use of infinite lists will burn you in a different part of your code; I don't know.
But there is a better approach to writing foo.
foo c xs = map snd . filter fst . zipWith f [0..] $ xs
  where f i x = (even (i `div` c), x)
So you want to group a list into groups of n elements, and drop every other group. We can write this down directly:
import Data.List (unfoldr)
groups n xs = takeWhile (not.null) $ unfoldr (Just . splitAt n) xs
foo c xs = concatMap head . groups 2 . groups c $ xs
dave4420 already explained what is wrong with your code, but I'd like to comment on how you got there, IMO. Your keepOrNot :: Bool -> Int -> [a] -> [a] -> [a] function is too general. It works according to the received Bool, any Bool; but you know that you will feed it a succession of alternating True and False values. Programming with functions is like plugging a pipe into a funnel - output of one function serves as input to the next - and the funnel is too wide here, so the contact is loose.
A minimal re-write of your code along these lines could be
foo n lst = go lst
  where
    go []  = []              -- stop once a finite input is exhausted
    go lst = let (a,b) = splitAt n lst
                 (c,d) = splitAt n b
             in
               a ++ go d
The contact is "tight", there's no "information leakage" here. We just do the work twice (*) ourselves, and "connect the pipes" explicitly, in this code, grabbing one result (a) and dropping the other (c).
--
(*) twice, reflecting the two Boolean values, True and False, alternating in a simple fashion one after another. Thus this is captured frozen in the code's structure, not hanging loose as a parameter able to accommodate an arbitrary Boolean value.
Like dave4420 said, you shouldn't be using (++) to accumulate from the left. But perhaps you shouldn't be accumulating at all! In Haskell, laziness often makes straightforward "head-construction" more efficient than the tail recursion you'd need to use in e.g. Lisp. For example:
foo :: Int -> [a] -> [a] -- why would you give this a Show constraint?
foo ℓ = foo' True
  where foo' _ [] = []
        foo' keep lst
          | keep      = firstℓ ++ foo' False other
          | otherwise = foo' True other
          where (firstℓ, other) = splitAt ℓ lst