How do I make this function consume its input bit stream lazily? - haskell

I'm imagining a function like
takeChunkUntil :: [a] -> ([a] -> Bool) -> ([a], [a])
Hopefully lazy.
It takes elements from the first list until the group of them satisfies the predicate, then returns that sublist as well as the remaining elements.
TO ANSWER SOME QUESTIONS:
The ultimate goal is to make something that reads Huffman codes lazily. So if you have a string of bits, here represented as Bool, bs, you can write take n $ decode huffmanTree bs to take the first n coded values while consuming only as much of bs as necessary. If you would like I'll post more details and my attempted solutions. This could get long :) (Note I'm a tutor who was given this problem by a student, but I didn't try to help him as it was beyond me, however I'm very curious now.)
CONTINUED: Here goes the whole thing:
Definition of Huffman tree:
data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)
Goal: write a lazy decode function that returns a pair of the decoded values and a Boolean indicating whether any bits were left over that were too few to be decoded into a value. Note: we are using Bool to represent a bit: True = 1, False = 0.
decode :: BTree a -> [Bool] -> ([a], Bool)
Here's the essence: the first function I wrote decodes one value. It returns Nothing if the bits run out before a value is reached; otherwise it returns the decoded value and the remaining bits.
decode1 :: BTree a -> [Bool] -> Maybe (a, [Bool])
decode1 (Leaf v) bs = Just (v, bs)
decode1 _ [] = Nothing
decode1 (Fork left right) (b:bs)
  | b = decode1 right bs
  | otherwise = decode1 left bs
First, I figured that I needed some kind of tail recursion to make this lazy. Here's what doesn't work. I think it doesn't, anyway. Notice how it's recursive, but I'm passing a list of "symbols decoded so far" and appending the new one. Inefficient and maybe (if my understanding is right) won't lead to tail recursion.
decodeHelp :: BTree a -> [a] -> [Bool] -> ([a],Bool)
decodeHelp t symSoFar bs = case decode1 t bs of
  Nothing -> (symSoFar,False)
  Just (s,remain) -> decodeHelp t (symSoFar ++ [s]) remain
So I thought, how can I write a better kind of recursion in which I decode a symbol and append it to the next call? The key is to return a list of [Maybe a], in which Just a is a successfully decoded symbol and Nothing means no symbol could be decoded (i.e. remaining booleans were not sufficient)
decodeHelp2 :: BTree a -> [Bool] -> [Maybe a]
decodeHelp2 t bs = case decode1 t bs of
  Nothing -> [Nothing]
  Just (s, remain) -> case remain of
    -- keep the last decoded symbol when the input ends exactly here
    [] -> [Just s]
    -- in the following line I can just cons Just s onto the
    -- recursive call. My understanding is that's what makes tail
    -- recursion work and lazy.
    _ -> Just s : decodeHelp2 t remain
But obviously this is not what the problem set wants out of the result. How can I turn all these [Maybe a] into a ([a], Bool)? My first thought was to apply scanl.
Here's the scanning function. It accumulates Maybe a into ([a], Bool):
sFunc :: ([a], Bool) -> Maybe a -> ([a], Bool)
sFunc (xs, _) Nothing = (xs, False)
sFunc (xs, _) (Just x) = (xs ++ [x], True)
Then you can write
decodeSortOf :: BTree a -> [Bool] -> [([a], Bool)]
decodeSortOf t bs = scanl sFunc ([],True) (decodeHelp2 t bs)
I verified this works and is lazy:
take 3 $ decodeSortOf xyz_code [True,False,True,True,False,False,False,error "foo"] gives [("",True),("y",True),("yz",True)]
But this is not the desired result. Help, I'm stuck!

Here's a hint. I've swapped the argument order to get something more idiomatic, and I've changed the result type to reflect the fact that you may not find an acceptable chunk.
import Data.List (inits, tails)
takeChunkUntil :: ([a] -> Bool) -> [a] -> Maybe ([a], [a])
takeChunkUntil p as = _ $ zip (inits as) (tails as)
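One possible way to fill the hole, offered as a sketch rather than as the answerer's intended solution: use find from Data.List to pick the first (prefix, suffix) pair whose prefix satisfies the predicate. Since inits and tails are both lazy, this also works on infinite input, at the cost of re-running the predicate on every prefix.

```haskell
import Data.List (find, inits, tails)

-- Pair every prefix with the matching suffix and keep the first pair
-- whose prefix satisfies the predicate. Returns Nothing when no prefix
-- (including the whole list) is acceptable.
takeChunkUntil :: ([a] -> Bool) -> [a] -> Maybe ([a], [a])
takeChunkUntil p as = find (p . fst) $ zip (inits as) (tails as)
```

For example, takeChunkUntil ((>= 10) . sum) [1..] yields Just ([1,2,3,4], rest), where rest is the still-unevaluated list starting at 5.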

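We can use explicit recursion here. As long as the predicate is not yet satisfied, we prepend the current element to the first item of the tuple returned by the recursive call. Once it is satisfied, we stop and return a 2-tuple whose second item is the remaining list (which still starts with the element that completed the group). For example: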
import Control.Arrow(first)
takeChunkUntil :: ([a] -> Bool) -> [a] -> ([a], [a])
takeChunkUntil p = go []
  where go _ [] = ([], [])
        go gs xa@(x:xs) | not (p (x:gs)) = first (x:) (go (x:gs) xs)
                        | otherwise = ([], xa)
Here we assume that the order of the elements in the group is not relevant to the predicate (since each time we pass the group in reverse order). If the order is relevant, we can use a difference list, for example. I leave that as an exercise.
This works on an infinite list as well, for example:
Prelude Control.Arrow> take 10 (fst (takeChunkUntil (const False) (repeat 1)))
[1,1,1,1,1,1,1,1,1,1]
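Coming back to the Huffman decoder from the question, here is a minimal sketch of a lazy decode built directly on decode1, without going through takeChunkUntil. It is an illustration, not the problem set's reference solution. Two assumptions are mine: the Bool is True when some trailing bits could not be decoded, and the tree contains at least one Fork (a bare Leaf consumes no bits, so decoding non-empty input would never end).

```haskell
data BTree a = Leaf a | Fork (BTree a) (BTree a)

-- Decode a single symbol, as in the question.
decode1 :: BTree a -> [Bool] -> Maybe (a, [Bool])
decode1 (Leaf v) bs = Just (v, bs)
decode1 _ [] = Nothing
decode1 (Fork left right) (b:bs)
  | b = decode1 right bs
  | otherwise = decode1 left bs

-- Assumption: the Bool is True when trailing bits were left undecoded.
decode :: BTree a -> [Bool] -> ([a], Bool)
decode t = go
  where
    go [] = ([], False)
    go bs = case decode1 t bs of
      Nothing -> ([], True)
      Just (v, rest) ->
        -- The let binding is lazy: the recursive call is only run
        -- when a later symbol (or the final Bool) is demanded.
        let (vs, leftover) = go rest
        in (v : vs, leftover)
```

Because each symbol is consed onto a lazily bound recursive result, take n (fst (decode tree bits)) inspects only as many bits as those n symbols need.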

Related

Is there a straight-forward solution to receiving the element *prior* to hitting the dropWhile predicate?

Given a condition, I want to search through a list of elements and return the first element that reaches the condition, and the previous one.
In C/C++ this is easy :
int i = 0;
for(;;i++) if (arr[i] == 0) break;
After we get the index where the condition is met, getting the previous element is easy, through "arr[i-1]"
In Haskell:
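dropWhile (/=0) list gives us a list that starts with the second element I want (the first one that meets the condition)
takeWhile (/=0) list gives us a list that ends with the first element I want (the one just before it)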
But I don't see a way of getting both in a simple manner. I could enumerate the list and use indexing, but that seems messy. Is there a proper way of doing this, or a way of working around this?
I would zip the list with its tail so that you have pairs of elements
available. Then you can just use find on the list of pairs:
import Data.List (find)
f :: [Int] -> Maybe (Int, Int)
f xs = find ((>3) . snd) (zip xs (tail xs))
> f [1..10]
Just (3,4)
If the first element matches the predicate this will return
Nothing (or the second match if there is one) so you might need to special-case that if you want something
different.
As Robin Zigmond says break can also work:
g :: [Int] -> (Int, Int)
g xs = case break (>3) xs of
  (_, []) -> error "not found"
  ([], _) -> error "first element"
  (ys, z:_) -> (last ys, z)
(Or have this return a Maybe as well, depending on what you need.)
But this will, I think, keep the whole prefix ys in memory until it
finds the match, whereas f can start garbage-collecting the elements
it has moved past. For small lists it doesn't matter.
I would use a zipper-like search:
type ZipperList a = ([a], [a])
toZipperList :: [a] -> ZipperList a
toZipperList = (,) []
moveUntil' :: (a -> Bool) -> ZipperList a -> ZipperList a
moveUntil' _ (xs, []) = (xs, [])
moveUntil' f (xs, (y:ys))
  | f y = (xs, (y:ys))
  | otherwise = moveUntil' f (y:xs, ys)
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = moveUntil' f . toZipperList
example :: [Int]
example = [2,3,5,7,11,13,17,19]
result :: ZipperList Int
result = moveUntil (>10) example -- ([7,5,3,2], [11,13,17,19])
The good thing about zippers is that they are efficient, you can access as many elements near the index you want, and you can move the focus of the zipper forwards and backwards. Learn more about zippers here:
http://learnyouahaskell.com/zippers
Note that my moveUntil function is like break from the Prelude but the initial part of the list is reversed. Hence you can simply get the head of both lists.
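Taking the head of both lists fails when either side is empty, so a total accessor may be handy. The sketch below restates moveUntil in a compact form and adds a hypothetical helper, focusPair (my name, not part of the answer), that returns the element before the match together with the match itself.

```haskell
type ZipperList a = ([a], [a])

-- Same behaviour as moveUntil above: the visited prefix is kept reversed.
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = go []
  where
    go before [] = (before, [])
    go before (y:ys)
      | f y = (before, y : ys)
      | otherwise = go (y : before) ys

-- Hypothetical helper: Nothing when there is no match, or when the
-- match is the very first element and so has no predecessor.
focusPair :: ZipperList a -> Maybe (a, a)
focusPair (prev : _, cur : _) = Just (prev, cur)
focusPair _ = Nothing
```

With the example list above, focusPair (moveUntil (>10) example) gives Just (7,11).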
A non-awkward way of implementing this as a fold is making it a paramorphism. For general explanatory notes, see this answer by dfeuer (I took foldrWithTails from it):
-- The extra [a] argument f takes with respect to foldr
-- is the tail of the list at each step of the fold.
foldrWithTails :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldrWithTails f n = go
  where
    go (a : as) = f a as (go as)
    go [] = n
boundary :: (a -> Bool) -> [a] -> Maybe (a, a)
boundary p = foldrWithTails findBoundary Nothing
  where
    findBoundary x (y : _) bnd
      | p y = Just (x, y)
      | otherwise = bnd
    findBoundary _ [] _ = Nothing
Notes:
If p y is true we don't have to look at bnd to get the result. That makes the solution adequately lazy. You can check that by trying out boundary (> 1000000) [0..] in GHCi.
This solution gives no special treatment to the edge case of the first element of the list matching the condition. For instance:
GHCi> boundary (<1) [0..9]
Nothing
GHCi> boundary even [0..9]
Just (1,2)
There are several alternatives; either way, you'll have to implement this yourself. You could use explicit recursion:
getLastAndFirst :: (a -> Bool) -> [a] -> Maybe (a, a)
getLastAndFirst p (x : xs@(y:_))
  | p y = Just (x, y)
  | otherwise = getLastAndFirst p xs
-- covers both the empty list and a single remaining element
getLastAndFirst _ _ = Nothing
Alternately, you could use a fold, but that would look fairly similar to the above, except less readable.
A third option is to use break, as suggested in the comments:
getLastAndFirst' :: (a -> Bool) -> [a] -> Maybe (a,a)
getLastAndFirst' p l =
  case break p l of
    (xs@(_:_), (y:_)) -> Just (last xs, y)
    _ -> Nothing
(\(xs, ys) -> [last xs, head ys]) $ break (==0) list
Using break as Robin Zigmond suggested ended up short and simple, not using Maybe to catch edge-cases, but I could replace the lambda with a simple function that used Maybe.
I toyed a bit more with the solution and came up with
breakAround :: Int -> Int -> (a -> Bool) -> [a] -> [a]
breakAround m n cond list = (\(xs, ys) -> reverse (take m (reverse xs)) ++ take n ys) $ break cond list
which takes two integers, a predicate, and a list of a, and returns a single list made of the last m elements before the first match followed by n elements starting at the match itself.
Example: breakAround 3 2 (==0) [3,2,1,0,10,20,30] would return [3,2,1,0,10]

Changing recursive guards into higher order functions

I'm trying to convert basic functions into higher order functions (specifically map, filter, or foldr). I was wondering if there are any simple concepts to apply where I could see old functions I've written using guards and turn them into higher order.
I'm working on changing a function called filterFirst that removes the first element from the list (second argument) that does not satisfy a given predicate function (first argument).
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst _ [] = []
filterFirst x (y:ys)
  | x y = y : filterFirst x ys
  | otherwise = ys
For an example:
greaterOne :: (Num a, Ord a) => a -> Bool
greaterOne x = x > 1
filterFirst greaterOne [5,-6,-7,9,10]
[5,-7,9,10]
Based on the basic recursion, I was wondering if there might be a way to translate this (and similar functions) to higher order map, filter, or foldr. I'm not very advanced and these functions are new to me.
There is a higher-order function that's appropriate here, but it's not in the base library. What's the trouble with foldr? If you just fold over the list, you'll end up rebuilding the whole thing, including the part after the deletion.
A more appropriate function for the job is para from the recursion-schemes package (I've renamed one of the type variables):
para :: Recursive t => (Base t (t, r) -> r) -> t -> r
In the case of lists, this specializes to
para :: (ListF a ([a], r) -> r) -> [a] -> r
where
data ListF a b = Nil | Cons a b
  deriving (Functor, ....)
This is pretty similar to foldr. The recursion-schemes equivalent of foldr is
cata :: Recursive t => (Base t r -> r) -> t -> r
Which specializes to
cata :: (ListF a r -> r) -> [a] -> r
Take a break here and figure out why the type of cata is basically equivalent to that of foldr.
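If you would rather check your answer against code, here is one way to see the correspondence. This is a sketch that uses a local ListF and a hypothetical cataList helper specialised to lists, not the definitions from recursion-schemes: the Nil case of the algebra plays the role of foldr's base value, and the Cons case plays the role of its combining function.

```haskell
-- Local stand-in for the ListF base functor (an assumption for this sketch).
data ListF a r = Nil | Cons a r

-- A list-only catamorphism: replace [] by alg Nil and (:) by alg . Cons.
cataList :: (ListF a r -> r) -> [a] -> r
cataList alg = go
  where
    go [] = alg Nil
    go (x:xs) = alg (Cons x (go xs))

-- foldr recovered from cataList: the two foldr arguments are just the
-- two cases of the algebra.
foldrViaCata :: (a -> b -> b) -> b -> [a] -> b
foldrViaCata f z = cataList alg
  where
    alg Nil = z
    alg (Cons a r) = f a r
```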
The difference between cata and para is that para passes the folding function not only the result of folding over the tail of the list, but also the tail of the list itself. That gives us an easy and efficient way to produce the rest of the list after we've found the first non-matching element:
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst f = para go
  where
    --go :: ListF a ([a], [a]) -> [a]
    go (Cons a (tl, r))
      | f a = a : r
      | otherwise = tl
    go Nil = []
para is a bit awkward for lists, since it's designed to fit into a more general context. But just as cata and foldr are basically equivalent, we could write a slightly less awkward function specifically for lists.
foldrWithTails
  :: (a -> [a] -> b -> b)
  -> b -> [a] -> b
foldrWithTails f n = go
  where
    go (a : as) = f a as (go as)
    go [] = n
Then
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst f = foldrWithTails go []
  where
    go a tl r
      | f a = a : r
      | otherwise = tl
First, let's flip the argument order of your function. This will make a few steps easier, and we can flip it back when we're done. (I'll call the flipped version filterFirst'.)
filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] _ = []
filterFirst' (y:ys) x
  | x y = y : filterFirst' ys x
  | otherwise = ys
Note that filterFirst' ys (const True) = ys for all ys. Let's substitute that in place:
filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] _ = []
filterFirst' (y:ys) x
  | x y = y : filterFirst' ys x
  | otherwise = filterFirst' ys (const True)
Use if-else instead of a guard:
filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] _ = []
filterFirst' (y:ys) x = if x y then y : filterFirst' ys x else filterFirst' ys (const True)
Move the second argument to a lambda:
filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] = \_ -> []
filterFirst' (y:ys) = \x -> if x y then y : filterFirst' ys x else filterFirst' ys (const True)
And now this is something we can turn into a foldr. The pattern we were going for is that filterFirst' (y:ys) can be expressed in terms of filterFirst' ys, without using ys otherwise, and we're now there.
filterFirst' :: Foldable t => t a -> (a -> Bool) -> [a]
filterFirst' = foldr (\y f -> \x -> if x y then y : f x else f (const True)) (\_ -> [])
Now we just need to neaten it up a bit:
filterFirst' :: Foldable t => t a -> (a -> Bool) -> [a]
filterFirst' = foldr go (const [])
  where go y f x
          | x y = y : f x
          | otherwise = f (const True)
And flip the arguments back:
filterFirst :: Foldable t => (a -> Bool) -> t a -> [a]
filterFirst = flip $ foldr go (const [])
  where go y f x
          | x y = y : f x
          | otherwise = f (const True)
And we're done. filterFirst implemented in terms of foldr.
Addendum: Although filter isn't strong enough to build this, filterM is when used with the State monad:
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst x ys = evalState (filterM go ys) False
  where go y = do
          alreadyDropped <- get
          if alreadyDropped || x y
            then return True
            else do
              put True
              return False
If we really want, we can write filterFirst using foldr, since foldr is kind of "universal" -- it allows any list transformation we can perform using recursion. The main downside is that the resulting code is rather counter-intuitive. In my opinion, explicit recursion is far better in this case.
Anyway here's how it is done. This relies on what I consider to be an antipattern, namely "passing four arguments to foldr". I call this an antipattern since foldr is usually called with three arguments only, and the result is not a function taking a fourth argument.
filterFirst :: (a->Bool)->[a]->[a]
filterFirst p xs = foldr go (\_ -> []) xs True
  where
    go y ys True
      | p y = y : ys True
      | otherwise = ys False
    go y ys False = y : ys False
Clear? Not very much. The trick here is to exploit foldr to build a function Bool -> [a] which returns the original list if called with False, and the filtered-first list if called with True. If we craft that function using
foldr go baseCase xs
the result is then obviously
foldr go baseCase xs True
Now, the base case must handle the empty list, and in such case we must return a function returning the empty list, whatever the boolean argument is. Hence, we arrive at
foldr go (\_ -> []) xs True
Now, we need to define go. This takes as arguments:
a list element y
the result of the "recursion" ys (a function Bool->[a] for the rest of the list)
and must return a function Bool->[a] for the larger list. So let's also consider
a boolean argument
and finally make go return a list. Well, if the boolean is False we must return the list unchanged, so
go y ys False = y : ys False
Note that ys False means "the tail unchanged", so we are really rebuilding the whole list unchanged.
If instead the boolean is true, we query the predicate as in p y. If that is false, we discard y, and return the list tail unchanged
go y ys True
  | p y = -- TODO
  | otherwise = ys False
If p y is true, we keep y and we return the list tail filtered.
go y ys True
  | p y = y : ys True
  | otherwise = ys False
As a final note, we could have used a pair ([a], [a]) instead of a function Bool -> [a], but that approach does not generalize as well to more complex cases.
So, that's all. This technique is something nice to know, but I do not recommend it in real code which is meant to be understood by others.
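For comparison, the pair-based variant mentioned in the final note might look like the sketch below. The name filterFirstPair is mine. The first component is the tail with its first failing element removed and the second is the tail unchanged, which mirrors calling the function from the answer with True and False respectively. The lazy pattern (~) is my addition so that the fold stays usable on infinite lists.

```haskell
filterFirstPair :: (a -> Bool) -> [a] -> [a]
filterFirstPair p = fst . foldr go ([], [])
  where
    -- filtered:  the rest of the list with its first failing element removed
    -- unchanged: the rest of the list as it was
    go y ~(filtered, unchanged)
      | p y = (y : filtered, y : unchanged)
      | otherwise = (unchanged, y : unchanged)
```

On the example from the question, filterFirstPair (> 1) [5,-6,-7,9,10] gives [5,-7,9,10].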
Joseph and chi's answers already show how to derive a foldr implementation, so I'll try to aid intuition.
map is length-preserving, filterFirst is not, so trivially map must be unsuited for implementing filterFirst.
filter (and indeed map) are memoryless - the same predicate/function is applied to each element of the list, regardless of the result on other elements. In filterFirst, behaviour changes once we see the first non-satisfactory element and remove it, so filter (and map) are unsuited.
foldr is used to reduce a structure to a summary value. It's very general, and it might not be immediately obvious without experience what sorts of things this may cover. filterFirst is in fact such an operation, though. The intuition is something like, "can we build it in a single pass through the structure, building it up as we go (with additional state stored as required)?". I fear Joseph's answer obfuscates a little: with foldr taking 4 parameters, it may not be immediately obvious what's going on, so let's try it a little differently.
filterFirst p xs = snd $ foldr (\a (deleted,acc) -> if not deleted && not (p a) then (True,acc) else (deleted,a:acc) ) (False,[]) xs
Here's a first attempt. The "extra state" here is obviously the bool indicating whether or not we've deleted an element yet, and the list accumulates in the second element of the tuple. At the end we call snd to obtain just the list. This implementation has the problem, however, that we delete the rightmost element not satisfying the predicate, because foldr first combines the rightmost element with the neutral element, then the second-rightmost, and so on.
filterFirst p xs = snd $ foldl (\(deleted,acc) a -> if not deleted && not (p a) then (True,acc) else (deleted,a:acc) ) (False,[]) xs
Here, we try using foldl. This does delete the leftmost non-satisfactory element, but has the side-effect of reversing the list. We can stick a reverse at the front, and this would solve the problem, but is somewhat unsatisfactory due to the double-traversal.
Then, if you go back to foldr, having realized that (basically) foldr is the correct variant if you want to transform a list whilst preserving order, you play with it for a while and end up writing what Joseph suggested. I do however agree with chi that straightforward recursion is the best solution here.
Your function can also be expressed as an unfold, or, more specifically, as an apomorphism. Allow me to begin with a brief explanatory note, before the solution itself.
The apomorphism is the recursion scheme dual to the paramorphism (see dfeuer's answer for more about the latter). Apomorphisms are examples of unfolds, which generate a structure from a seed. For instance, Data.List offers unfoldr, a list unfold.
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
The function given to unfoldr takes a seed and either produces a list element and a new seed (if the maybe-value is a Just) or terminates the list generation (if it is Nothing). Unfolds are more generally expressed by the ana function from recursion-schemes ("ana" is short for "anamorphism").
ana :: Corecursive t => (a -> Base t a) -> a -> t
Specialised to lists, this becomes...
ana @[_] :: (b -> ListF a b) -> b -> [a]
... which is unfoldr in different clothing.
An apomorphism is an unfold in which the generation of the structure can be short-circuited at any point of the process, by producing, instead of a new seed, the rest of the structure in a fell swoop. In the case of lists, we have:
apo @[_] :: (b -> ListF a (Either [a] b)) -> b -> [a]
Either is used to trigger the short-circuit: with a Left result, the unfold short-circuits, while with a Right it proceeds normally.
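To make the short-circuit concrete, here is a sketch of a list-only apomorphism written by hand. The local ListF, the helper name apoList, and the insertSorted example are all assumptions of this sketch and do not come from recursion-schemes. A Left result splices in the rest of the list as is, while a Right result continues unfolding from a new seed.

```haskell
-- Local stand-in for the ListF base functor.
data ListF a b = Nil | Cons a b

apoList :: (b -> ListF a (Either [a] b)) -> b -> [a]
apoList coalg = go
  where
    go seed = case coalg seed of
      Nil -> []
      Cons a (Left rest) -> a : rest      -- short-circuit: rest is final
      Cons a (Right next) -> a : go next  -- keep unfolding

-- Insert into a sorted list: once the position is found, the remainder
-- of the input is reused without being walked again.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted x = apoList step
  where
    step [] = Cons x (Left [])
    step (y:ys)
      | x <= y = Cons x (Left (y : ys))
      | otherwise = Cons y (Right ys)
```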
The solution in terms of apo is fairly direct:
{-# LANGUAGE LambdaCase #-}
import Data.Functor.Foldable
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst p = apo go
  where
    go = \case
      [] -> Nil
      a : as
        | p a -> Cons a (Right as)
        | otherwise -> case as of
            [] -> Nil
            b : bs -> Cons b (Left bs)
It is somewhat more awkward than dfeuer's para-based solution, because if we want to short-circuit without an empty list for a tail we are compelled to emit one extra element (the b in the short-circuiting case), and so we have to look one position ahead. This awkwardness would grow by orders of magnitude if, rather than filterFirst, we were to implement plain old filter with an unfold, as beautifully explained in List filter using an anamorphism.
This answer is inspired by a comment from luqui on a now-deleted question.
filterFirst can be implemented in a fairly direct way in terms of span:
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst p = (\(yeas, rest) -> yeas ++ drop 1 rest) . span p
span :: (a -> Bool) -> [a] -> ([a], [a]) splits the list in two at the first element for which the condition doesn't hold. After span, we drop the first element of the second part of the list (with drop 1 rather than tail so that we don't have to add a special case for []), and reassemble the list with (++).
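As an aside, there is a near-pointfree spelling of this implementation which I find too pretty not to mention (second comes from Data.Bifunctor; the one from Control.Arrow works as well):
import Data.Bifunctor (second)
filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst p = uncurry (++) . second (drop 1) . span p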
While span is a higher order function, it would be perfectly understandable if you found this implementation disappointing in the context of your question. After all, span is not much more fundamental than filterFirst itself. Shouldn't we try going a little deeper, to see if we can capture the spirit of this solution while expressing it as a fold, or as some other recursion scheme?
I believe functions like filterFirst can be fine demonstrations of hylomorphisms. A hylomorphism is an unfold (see my other answer for more on that) that generates an intermediate data structure, followed by a fold which turns this data structure into something else. Though it might look like that would require two passes to get a result (one through the input structure, and another through the intermediate one), if the hylomorphism is implemented properly (as done in the hylo function of recursion-schemes) it can be done in a single pass, with the fold consuming pieces of the intermediate structure as they are generated by the unfold (so that we don't have to actually build it all only to tear it down).
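The single-pass idea can be sketched independently of the library. The code below is a hand-written illustration with a local ListF and a helper I call hyloSketch, not the recursion-schemes source: each layer produced by the coalgebra is handed to the algebra right away, so the intermediate list is never built as a whole.

```haskell
-- Local stand-in for the ListF base functor.
data ListF a r = Nil | Cons a r

instance Functor (ListF a) where
  fmap _ Nil = Nil
  fmap f (Cons a r) = Cons a (f r)

-- Unfold one layer with coalg, recurse into it, and fold it with alg.
hyloSketch :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hyloSketch alg coalg = go
  where
    go = alg . fmap go . coalg

-- Example: the sum of the squares n*n + ... + 1*1, with the list of
-- squares existing only one layer at a time.
sumSquaresTo :: Int -> Int
sumSquaresTo = hyloSketch alg coalg
  where
    coalg n
      | n <= 0 = Nil
      | otherwise = Cons (n * n) (n - 1)
    alg Nil = 0
    alg (Cons sq acc) = sq + acc
```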
Before we start, here is the boilerplate needed to run what follows:
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TemplateHaskell #-}
import Data.Functor.Foldable
import Data.Functor.Foldable.TH
The strategy here is picking an intermediate data structure for the hylomorphism that expresses the essence of what we want to achieve. In this case, we will use this cute thing:
data BrokenList a = Broken [a] | Unbroken a (BrokenList a)
    -- I won't actually use those instances here,
    -- but they are nice to have if you want to play with the type.
    deriving (Eq, Show, Functor, Foldable, Traversable)
makeBaseFunctor ''BrokenList
BrokenList is very much like a list (Broken and Unbroken mirror [] and (:), while the makeBaseFunctor incantation generates a BrokenListF base functor analogous to ListF, with BrokenF and UnbrokenF constructors), except that it has another list attached at its end (the Broken constructor). It expresses, in a quite literal way, the idea of a list being divided in two parts.
With BrokenList at hand, we can write the hylomorphism. coalgSpan is the operation used for the unfold, and algWeld, the one used for the fold.
filterFirst p = hylo algWeld coalgSpan
  where
    coalgSpan = \case
      [] -> BrokenF []
      x : xs
        | p x -> UnbrokenF x xs
        | otherwise -> BrokenF xs
    algWeld = \case
      UnbrokenF x yeas -> x : yeas
      BrokenF rest -> rest
coalgSpan breaks the list upon hitting an element x such that p x doesn't hold. Not adding that element to the second part of the list (BrokenF xs rather than BrokenF (x : xs)) takes care of the filtering. As for algWeld, it is used to concatenate the two parts (it is very much like what we would use to implement (++) using cata).
(For a similar example of BrokenList in action, see the breakOn implementation in Note 5 of this older answer of mine. It suggests what it would take to implement span using this strategy.)
There are at least two good things about this hylo-based implementation. Firstly, it has good performance (casual testing suggests that, if compiled with optimisations, it is at least as good as, and possibly slightly faster than, the most efficient implementations in other answers here). Secondly, it reflects very closely your original, explicitly recursive implementation of filterFirst (or, at any rate, more closely than the fold-only and unfold-only implementations).

How to implement the head function using fst function

I admit this is my homework. But I really couldn't find a good solution after working hard on it.
There might be some stupid ways to accomplish this, like:
myHead (x:[]) = x
myHead (x:y:xs) = fst (x, y)
But I don't think that's what the teacher wants.
BTW, error-handling is not required.
Thanks in advance!
There's a very natural function that's not in the prelude called "uncons" which is the inverse of uncurried cons.
cons :: a -> [a] -> [a]
uncurry cons :: (a, [a]) -> [a]
uncons :: [a] -> (a, [a])
uncons (x:xs) = (x, xs)
You can use it to implement head as
head = fst . uncons
Why is uncons natural?
You can think of a list as the datatype that's defined through the use of two constructor functions
nil :: [a]
nil = []
cons :: (a, [a]) -> [a]
cons (a,as) = a:as
You can also think of it as the data type which is deconstructed by a function
destruct :: [a] -> Maybe (a, [a])
destruct [] = Nothing
destruct (a:as) = Just (a, as)
It's well beyond this answer to explain why those are so definitively tied to the list type, but one way to look at it is to try to define
nil :: f a
cons :: (a, f a) -> f a
or
destruct :: f a -> Maybe (a, f a)
for any other container type f. You'll find that they all have very close relationships with lists.
You can almost already see uncons in the second case of the definition of destruct, but there's a Just in the way. This is why uncons is better paired with head and tail, which are not defined on empty lists
head [] = error "Prelude.head"
so we can adjust the previous answer to work for infinite streams. Here we can think of infinite streams as being constructed by one function
data Stream a = Next a (Stream a)
cons :: (a, Stream a) -> Stream a
cons (a, as) = Next a as
and destructed by one function
uncons :: Stream a -> (a, Stream a)
uncons (Next a as) = (a, as)
-- a. k. a.
uncons stream = (head stream, tail stream)
the two being inverses of one another.
Now we can get head for Streams by getting the first element of the return tuple from uncons
head = fst . uncons
And that's what head models in the Prelude, so we can pretend like lists are infinite streams and define head in that way
uncons :: [a] -> (a, [a])
uncons (a:as) = (a, as)
-- a. k. a.
uncons list = (head list, tail list)
head = fst . uncons
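As a side note, current versions of base export an uncons from Data.List with the Maybe-returning type of destruct above. The sketch below uses it to build a total variant of the same idea; the name safeHead is mine and is not something the homework asks for.

```haskell
import qualified Data.List as List

-- Total variant of head = fst . uncons: an empty list gives Nothing
-- instead of a runtime error.
safeHead :: [a] -> Maybe a
safeHead = fmap fst . List.uncons
```

The shape is the same as in the answer: take the list apart into a pair, then project out the first component with fst.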
Perhaps you're expected to write your own cons List type; then it might make more sense. Type synonyms can't be recursive, though, so you end up using a non-tuple data constructor, which makes the tuple superfluous. It would look like:
data List a = Nil | List (a, List a)
  deriving( Show )
head :: List a -> a
head (List c) = fst c
As already said in the comments, this is just a silly task and you won't get something you could call a good implementation of head.
Your solution, for those requirements, is just fine. The only change I would make is to replace (x:y:xs) with (x:y:_), since xs isn't used at all (which would actually cause a compiler warning in some settings). In fact, you could do that with y as well:
myHead (x:_:_) = fst (x, undefined)
There are alternatives in which the use of fst looks perhaps not quite so useless, i.e. which don't just build a tuple by hand and immediately deconstruct it again (the last one needs find from Data.List and fromJust from Data.Maybe):
myHead' [x] = x
myHead' xs = myHead' . fst $ splitAt 1 xs
myHead'' = foldr1 $ curry fst
-- snd is needed to drop the index that zip attached
myHead''' = snd . fromJust . find ((==0) . fst) . zip [0..]
but you could rightfully say that these are just ridiculous.

Lack of understanding infinite lists and seq operator

The code below retains, for a given integer n, the first n items from a list, drops the following n items, keeps the following n and so on. It works correctly for any finite list.
In order to make it usable with infinite lists, I used the 'seq' operator to force the evaluation of the accumulator before the recursive step, as foldl' does, for example.
I tested by tracing the accumulator's value, and it seems that it is effectively computed as desired with finite lists.
Nevertheless, it doesn't work when applied to an infinite list. The "take" in the main function is only executed once the inner calculation has terminated, which, of course, never happens with an infinite list.
Please, can someone tell me where my mistake is?
main :: IO ()
main = print (take 2 (foo 2 [1..100]))
foo :: Show a => Int -> [a] -> [a]
foo l lst = inFoo keepOrNot 1 l lst []
inFoo :: Show a => (Bool -> Int -> [a] -> [a] -> [a]) -> Int -> Int -> [a] -> [a] -> [a]
inFoo keepOrNot i l [] lstOut = lstOut
inFoo keepOrNot i l lstIn lstOut = let lstOut2 = (keepOrNot (odd i) l lstIn lstOut) in
  lstOut2 `seq` (inFoo keepOrNot (i+1) l (drop l lstIn) lstOut2)
keepOrNot :: Bool -> Int -> [a] -> [a] -> [a]
keepOrNot b n lst1 lst2 = case b of
  True -> lst2 ++ (take n lst1)
  False -> lst2
Here's how list concatenation is implemented:
(++) :: [a] -> [a] -> [a]
(++) [] ys = ys
(++) (x:xs) ys = x : xs ++ ys
Note that
the right hand list structure is reused as is (even if it's not been evaluated yet, so lazily)
the left hand list structure is rewritten (copied)
This means that if you're using ++ to build up a list, you want the accumulator to be on the right hand side. (For finite lists, merely for efficiency reasons --- if the accumulator is on the left hand side, it will be repeatedly copied and this is inefficient. For infinite lists, the caller can't look at the first element of the result until it's been copied for the last time, and there won't be a last time because there's always something else to concatenate onto the right of the accumulator.)
The True case of keepOrNot has the accumulator on the left of the ++. You need to use a different data structure.
The usual idiom in this case is to use difference lists. Instead of using type [a] for your accumulator, use [a] -> [a]. Your accumulator is now a function that prepends a list to the list it's given as input. This avoids repeated copying, and the list can be built lazily.
keepOrNot :: Bool -> Int -> [a] -> ([a] -> [a]) -> ([a] -> [a])
keepOrNot b n lst1 acc = case b of
  True -> acc . (take n lst1 ++)
  False -> acc
The initial value of the accumulator should be id. When you want to convert it to a conventional list, call it with [] (i.e., acc []).
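The difference-list idiom on its own can be sketched as follows. The names DList, emptyD, appendChunk and runD are hypothetical and chosen only for this illustration. Appending on the right is just function composition, and the real (++) calls happen once, right-nested, when the accumulator is finally applied to [].

```haskell
-- A list represented as "the function that prepends it".
type DList a = [a] -> [a]

-- The empty accumulator: prepends nothing.
emptyD :: DList a
emptyD = id

-- Append a chunk on the right without copying what is already there.
appendChunk :: DList a -> [a] -> DList a
appendChunk acc chunk = acc . (chunk ++)

-- Convert back to an ordinary list.
runD :: DList a -> [a]
runD acc = acc []
```

For instance, runD (foldl appendChunk emptyD [[1,2],[3],[4,5]]) evaluates to [1,2,3,4,5].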
seq is a red herring here. seq does not force the entire list. seq only determines whether it is of the form [] or x : xs.
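A tiny experiment illustrates how shallow seq is. In this sketch (the name whnfDemo is made up), the tail of the list is undefined, yet seq succeeds because it stops as soon as it sees the outermost (:) constructor.

```haskell
-- seq evaluates its first argument only to weak head normal form, so
-- the undefined tail is never touched.
whnfDemo :: String
whnfDemo = (1 : undefined :: [Int]) `seq` "only the first (:) cell was inspected"
```

Evaluating whnfDemo in GHCi returns the string rather than raising an exception.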
You're learning Haskell, yes? So it would be a good idea as an exercise to modify your code to use a difference list accumulator. Possibly the use of infinite lists will burn you in a different part of your code; I don't know.
But there is a better approach to writing foo.
foo c xs = map snd . filter fst . zipWith f [0..] $ xs
  where f i x = (even (i `div` c), x)
So you want to group a list into groups of n elements, and drop every other group. We can write this down directly:
import Data.List (unfoldr)
groups n xs = takeWhile (not.null) $ unfoldr (Just . splitAt n) xs
foo c xs = concatMap head . groups 2 . groups c $ xs
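For illustration, here are the same two definitions with type signatures added and two example values (finiteExample and infiniteExample are hypothetical names):

```haskell
import Data.List (unfoldr)

-- Definitions repeated from the answer, with type signatures added.
groups :: Int -> [a] -> [[a]]
groups n xs = takeWhile (not . null) $ unfoldr (Just . splitAt n) xs

foo :: Int -> [a] -> [a]
foo c xs = concatMap head . groups 2 . groups c $ xs

-- Finite input: keep [1,2], drop [3,4], keep [5,6], drop [7,8], keep [9,10].
finiteExample :: [Int]
finiteExample = foo 2 [1 .. 10]

-- Infinite input: only as many groups as needed are ever built.
infiniteExample :: [Int]
infiniteExample = take 6 (foo 2 [1 ..])
```

Both example values evaluate to [1,2,5,6,9,10].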
dave4420 already explained what is wrong with your code, but I'd like to comment on how you got there, IMO. Your keepOrNot :: Bool -> Int -> [a] -> [a] -> [a] function is too general. It works according to the received Bool, any Bool; but you know that you will feed it a succession of alternating True and False values. Programming with functions is like plugging a pipe into a funnel - output of one function serves as input to the next - and the funnel is too wide here, so the contact is loose.
A minimal re-write of your code along these lines could be
foo n lst = go lst
  where
    go [] = []  -- stop on exhausted input; without this a finite list never terminates
    go lst = let (a,b) = splitAt n lst
                 (c,d) = splitAt n b
             in
               a ++ go d
The contact is "tight", there's no "information leakage" here. We just do the work twice (*) ourselves, and "connect the pipes" explicitly, in this code, grabbing one result (a) and dropping the other (c).
--
(*) twice, reflecting the two Boolean values, True and False, alternating in a simple fashion one after another. Thus this is captured frozen in the code's structure, not hanging loose as a parameter able to accommodate an arbitrary Boolean value.
Like dave4420 said, you shouldn't be using (++) to accumulate from the left. But perhaps you shouldn't be accumulating at all! In Haskell, laziness often makes straightforward "head-construction" more efficient than the tail recursions you'd need to use in e.g. Lisp. For example:
foo :: Int -> [a] -> [a] -- why would you give this a Show constraint?
foo ℓ = foo' True
  where foo' _ [] = []
        foo' keep lst
          | keep = firstℓ ++ foo' False other
          | otherwise = foo' True other
          where (firstℓ, other) = splitAt ℓ lst
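A sketch of the same shape with ASCII names, to show that the head-first version works on both finite and infinite input. The names keepEveryOther, go, chunk, rest and the two example values are illustrative, and a positive chunk size is assumed:

```haskell
-- Produces its result head-first, with no accumulator.
-- Assumes n > 0; with n <= 0 and a non-empty list it would loop forever.
keepEveryOther :: Int -> [a] -> [a]
keepEveryOther n = go True
  where
    go _ [] = []
    go keep xs
      | keep      = chunk ++ go False rest  -- emit this chunk, continue lazily
      | otherwise = go True rest            -- skip this chunk
      where (chunk, rest) = splitAt n xs

-- Finite input with a short final chunk: keep [1,2], drop [3,4], keep [5].
shortTail :: [Int]
shortTail = keepEveryOther 2 [1 .. 5]

-- Infinite input: the consumer decides how much gets computed.
lazyPrefix :: [Int]
lazyPrefix = take 4 (keepEveryOther 2 [1 ..])
```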

How to implement a lazy constant-space tri-partition function?

I have generalized the existing Data.List.partition implementation
partition :: (a -> Bool) -> [a] -> ([a],[a])
partition p xs = foldr (select p) ([],[]) xs
  where
    -- select :: (a -> Bool) -> a -> ([a], [a]) -> ([a], [a])
    select p x ~(ts,fs) | p x = (x:ts,fs)
                        | otherwise = (ts, x:fs)
to a "tri-partition" function
ordPartition :: (a -> Ordering) -> [a] -> ([a],[a],[a])
ordPartition cmp xs = foldr select ([],[],[]) xs
  where
    -- select :: a -> ([a], [a], [a]) -> ([a], [a], [a])
    select x ~(lts,eqs,gts) = case cmp x of
      LT -> (x:lts,eqs,gts)
      EQ -> (lts,x:eqs,gts)
      GT -> (lts,eqs,x:gts)
But now I'm facing confusing behaviour: when compiled with ghc -O1, the foo and bar functions work in constant space, but the doo function leads to a space leak.
foo xs = xs1
  where
    (xs1,_,_) = ordPartition (flip compare 0) xs
bar xs = xs2
  where
    (_,xs2,_) = ordPartition (flip compare 0) xs
-- pass-thru "least" non-empty partition
doo xs | null xs1 = if null xs2 then xs3 else xs2
       | otherwise = xs1
  where
    (xs1,xs2,xs3) = ordPartition (flip compare 0) xs
main :: IO ()
main = do
  print $ foo [0..100000000::Integer] -- results in []
  print $ bar [0..100000000::Integer] -- results in [0]
  print $ doo [0..100000000::Integer] -- results in [0] with space-leak
So my question now is,
What is the reason for the space leak in doo? It seems surprising to me, since foo and bar don't exhibit such a space leak.
Is there a way to implement ordPartition such that, when used in the context of functions such as doo, it runs in constant space?
It's not a space leak. To find out whether a component list is empty, the entire input list has to be traversed and the other component lists constructed (as thunks) if it is. In the doo case, xs1 is empty, so the entire thing has to be built before deciding what to output.
That is a fundamental property of all partitioning algorithms: if one of the results is empty and you check for its emptiness as a condition, that check cannot be completed before the entire list has been traversed.
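The two situations can be contrasted in a small sketch. ordPartition is repeated from the question; foundEarly and foundNever are hypothetical names, and the finite bound 1000 is chosen only so that the second check terminates:

```haskell
-- ordPartition repeated from the question.
ordPartition :: (a -> Ordering) -> [a] -> ([a], [a], [a])
ordPartition cmp xs = foldr select ([], [], []) xs
  where
    select x ~(lts, eqs, gts) = case cmp x of
      LT -> (x : lts, eqs, gts)
      EQ -> (lts, x : eqs, gts)
      GT -> (lts, eqs, x : gts)

-- An LT element turns up at once, so null can answer False without looking
-- any further, even though the input is infinite.
foundEarly :: Bool
foundEarly = null lts
  where (lts, _, _) = ordPartition (`compare` 0) ((-1) : [1 ..] :: [Integer])

-- No LT element exists, so null has to walk the whole input before it can
-- answer True. On an infinite input such as [0 ..] this would never finish.
foundNever :: Bool
foundNever = null lts
  where (lts, _, _) = ordPartition (`compare` 0) ([0 .. 1000] :: [Integer])
```

In doo the second situation applies to xs1, and xs2 and xs3 are still referenced while the check runs, so what was built for them along the way cannot be collected.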
