Halving a list into sublists using pattern matching - haskell

I am trying to get the output, given an input
> halve [1,2,3,4,5,6]
([1,2,3],[4,5,6])
I have solved this problem using this approach:
halve xs = ((take s xs), (drop s xs))
    where
        s = (length xs) `div` 2
I am a beginner in Haskell and I want to learn how to solve this problem using pattern matching. Thanks

You can make use of a variant of the hare and tortoise algorithm. This algorithm runs over the list with two iterators: the hare takes two hops at a time, and the tortoise takes one hop at a time.
When the hare reaches the end of the list, we know that the tortoise is halfway, and thus we can split the list in half: the list seen thus far is the first half, and the list still to enumerate over is the second half.
An algorithm thus looks like:
half :: [a] -> ([a], [a])
half h = go h h
    where go (_:(_:hs)) (t:ts) = (..., ...)
              where (a, b) = go ...
          go _ (t:ts) = (..., ...)
          go _ [] = (..., ...)
with the ... parts still to fill in.
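For comparison, here is one possible way to fill in the blanks. It is a sketch, not necessarily what the answer had in mind: the name halveHT is made up here, and the last two clauses of the skeleton are merged into a single catch-all clause.

```haskell
-- Hypothetical completion of the hare-and-tortoise skeleton.
halveHT :: [a] -> ([a], [a])
halveHT xs = go xs xs
  where
    -- The hare can still hop twice, so the tortoise's current
    -- element belongs to the first half.
    go (_:_:hs) (t:ts) = (t : a, b)
      where (a, b) = go hs ts
    -- The hare has run out of list: whatever the tortoise has
    -- not visited yet is the second half.
    go _ ts = ([], ts)
```

For an odd-length list the extra element lands in the second half, which agrees with the take/drop version based on length xs `div` 2.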

Related

Generating subsets of set. Laziness?

I have written a function that generates the subsets of a set. It causes a stack overflow when I use it like this: subsets [1..]. That would be "normal" behaviour in "normal" (non-lazy) languages. Now I would like to improve my function so that it is lazy.
P.S. I don't understand laziness (and I am trying to understand it), so perhaps my problem seems strange to you. Please explain. :)
P.S. 2 Feel free to tell me something about my inability in Haskell ;)
subsets :: [a] -> [[a]]
subsets (x:xs) = (map (\ e -> x:e) (subsets xs)) ++ (subsets xs)
subsets [] = [[]]
There are two problems with that function. First, it recurses twice, which makes it exponentially more inefficient than necessary (if we disregard the exponential number of results...), because each subtree is recalculated every time for all overlapping subsets; this can be fixed by letting the recursive call be the same value:
subsets' :: [a] -> [[a]]
subsets' [] = [[]]
subsets' (x:xs) = let s = subsets' xs
                  in map (x:) s ++ s
This will already allow you to calculate length $ subsets' [1..25] in a few seconds, while length $ subsets [1..25] takes... well, I didn't wait ;)
The other issue is that with your version, when you give it an infinite list, it will recurse on the infinite tail of that list first. To generate all finite subsets in a meaningful way, we need to ensure two things: first, we must build up each set from smaller sets (to ensure termination), and second, we should ensure a fair order (ie., not generate the list [[1], [2], ...] first and never get to the rest). For this, we start from [[]] and recursively add the current element to everything we have already generated, and then remember the new list for the next step:
subsets'' :: [a] -> [[a]]
subsets'' l = [[]] ++ subs [[]] l
  where subs previous (x:xs) = let next = map (x:) previous
                               in next ++ subs (previous ++ next) xs
        subs _ [] = []
Which results in this order:
*Main> take 100 $ subsets'' [1..]
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1],[4],[4,1],[4,2],[4,2,1],[4,3],[4,3,1],[4,3,2],[4,3,2,1],[5],[5,1],[5,2],[5,2,1],[5,3],[5,3,1],[5,3,2],[5,3,2,1],[5,4],[5,4,1],[5,4,2],[5,4,2,1],[5,4,3],[5,4,3,1],[5,4,3,2],[5,4,3,2,1],[6],[6,1],[6,2],[6,2,1],[6,3],[6,3,1],[6,3,2],[6,3,2,1],[6,4],[6,4,1],[6,4,2],[6,4,2,1],[6,4,3],[6,4,3,1],[6,4,3,2],[6,4,3,2,1],[6,5],[6,5,1],[6,5,2],[6,5,2,1],[6,5,3],[6,5,3,1],[6,5,3,2],[6,5,3,2,1],[6,5,4],[6,5,4,1],[6,5,4,2],[6,5,4,2,1],[6,5,4,3],[6,5,4,3,1],[6,5,4,3,2],[6,5,4,3,2,1],[7],[7,1],[7,2],[7,2,1],[7,3],[7,3,1],[7,3,2],[7,3,2,1],[7,4],[7,4,1],[7,4,2],[7,4,2,1],[7,4,3],[7,4,3,1],[7,4,3,2],[7,4,3,2,1],[7,5],[7,5,1],[7,5,2],[7,5,2,1],[7,5,3],[7,5,3,1],[7,5,3,2],[7,5,3,2,1],[7,5,4],[7,5,4,1],[7,5,4,2],[7,5,4,2,1],[7,5,4,3],[7,5,4,3,1],[7,5,4,3,2],[7,5,4,3,2,1],[7,6],[7,6,1],[7,6,2],[7,6,2,1]]
You can't generate all the subsets of an infinite set: they form an uncountable set. Cardinality makes it impossible.
At most, you can try to generate all the finite subsets. For that, you can't proceed by induction, from [] onwards, since you'll never reach []. You need to proceed inductively from the beginning of the list, instead of the end.
A right fold solution would be:
powerset :: Foldable t => t a -> [[a]]
powerset xs = []: foldr go (const []) xs [[]]
  where go x f a = let b = (x:) <$> a in b ++ f (a ++ b)
then:
> take 8 $ powerset [1..]
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1]]

Long working of program that count Ints

I want to write a program that takes a list of Ints and a length, and returns a list whose position i holds all the elements equal to i, for example
[0,0,0,1,3,5,3,2,2,4,4,4] 6 -> [[0,0,0],[1],[2,2],[3,3],[4,4,4],[5]]
[0,0,4] 7 -> [[0,0],[],[],[],[4],[],[]]
[] 3 -> [[],[],[]]
[2,2] 3 -> [[],[],[2,2]]
So, that's my solution
import Data.List
import Data.Function
f :: [Int] -> Int -> [[Int]]
f ls len = g 0 ls' [] where
    ls' = group . sort $ ls
    g :: Int -> [[Int]] -> [[Int]] -> [[Int]]
    g val [] accum
        | len == val = accum
        | otherwise = g (val+1) [] (accum ++ [[]])
    g val (x:xs) accum
        | len == val = accum
        | val == head x = g (val+1) xs (accum ++ [x])
        | otherwise = g (val+1) (x:xs) (accum ++ [[]])
But the query f [] 1000000 takes a really long time. Why?
I see we're accumulating over some data structure. I think foldMap. I ask "Which Monoid"? It's some kind of lists of accumulations. Like this
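newtype Bunch x = Bunch {bunch :: [x]}
-- Semigroup has been a superclass of Monoid since GHC 8.4, so the
-- combining operation lives in the Semigroup instance and mappend
-- defaults to (<>).
instance Semigroup x => Semigroup (Bunch x) where
  Bunch xss <> Bunch yss = Bunch (glom xss yss) where
    glom [] yss = yss
    glom xss [] = xss
    glom (xs : xss) (ys : yss) = (xs <> ys) : glom xss yss
instance Semigroup x => Monoid (Bunch x) where
  mempty = Bunch []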
Our underlying elements have some associative operator <>, and we can thus apply that operator pointwise to a pair of lists, just like zipWith does, except that when we run out of one of the lists, we don't truncate, rather we just take the other. Note that Bunch is a name I'm introducing for purposes of this answer, but it's not that unusual a thing to want. I'm sure I've used it before and will again.
If we can translate
0 -> Bunch [[0]] -- single 0 in place 0
1 -> Bunch [[],[1]] -- single 1 in place 1
2 -> Bunch [[],[],[2]] -- single 2 in place 2
3 -> Bunch [[],[],[],[3]] -- single 3 in place 3
...
and foldMap across the input, then we'll get the right number of each in each place. There should be no need for an upper bound on the numbers in the input to get a sensible output, as long as you are willing to interpret [] as "the rest is silence". Otherwise, like Procrustes, you can pad or chop to the length you need.
Note, by the way, that when mappend's first argument comes from our translation, we do a bunch of ([]++) operations, a.k.a. ids, then a single ([i]++), a.k.a. (i:), so if foldMap is right-nested (which it is for lists), then we will always be doing cheap operations at the left end of our lists.
Now, as the question works with lists, we might want to introduce the Bunch structure only when it's useful. That's what Control.Newtype is for. We just need to tell it about Bunch.
instance Newtype (Bunch x) [x] where
  pack = Bunch
  unpack = bunch
And then it's
groupInts :: [Int] -> [[Int]]
groupInts = ala' Bunch foldMap (basis !!) where
  basis = ala' Bunch foldMap id [iterate ([]:) [], [[[i]] | i <- [0..]]]
What? Well, without going to town on what ala' is in general, its impact here is as follows:
ala' Bunch foldMap f = bunch . foldMap (Bunch . f)
meaning that, although f is a function to lists, we accumulate as if f were a function to Bunches: the role of ala' is to insert the correct pack and unpack operations to make that just happen.
We need (basis !!) :: Int -> [[Int]] to be our translation. Hence basis :: [[[Int]]] is the list of images of our translation, computed on demand at most once each (i.e., the translation, memoized).
For this basis, observe that we need these two infinite lists
[ []            [ [[0]]
, [[]]          , [[1]]
, [[],[]]       , [[2]]
, [[],[],[]]    , [[3]]
...             ...
combined Bunchwise. As both lists have the same length (infinity), I could also have written
basis = zipWith (++) (iterate ([]:) []) [[[i]] | i <- [0..]]
but I thought it was worth observing that this also is an example of Bunch structure.
Of course, it's very nice when something like accumArray hands you exactly the sort of accumulation you need, neatly packaging a bunch of grungy behind-the-scenes mutation. But the general recipe for an accumulation is to think "What's the Monoid?" and "What do I do with each element?". That's what foldMap asks you.
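If you would rather not depend on Control.Newtype, the same recipe can be sketched with base alone. This is only an illustration under some assumptions: the names single, groupIntsBase and groupIntsTo are made up here, Bunch is repeated so that the block stands alone, and the inputs are assumed to be non-negative.

```haskell
-- Bunch again, written for GHC 8.4 or later (Semigroup is a
-- superclass of Monoid there).
newtype Bunch x = Bunch { bunch :: [x] }

instance Semigroup x => Semigroup (Bunch x) where
  Bunch xss <> Bunch yss = Bunch (glom xss yss)
    where
      -- Pointwise (<>); when one list runs out, keep the other.
      glom [] ys = ys
      glom xs [] = xs
      glom (x : xs) (y : ys) = (x <> y) : glom xs ys

instance Semigroup x => Monoid (Bunch x) where
  mempty = Bunch []

-- The translation: i becomes i empty places followed by [i].
single :: Int -> Bunch [Int]
single i = Bunch (replicate i [] ++ [[i]])

-- Accumulate in the Bunch monoid, then unwrap.
groupIntsBase :: [Int] -> [[Int]]
groupIntsBase = bunch . foldMap single

-- The Procrustes step: pad with empty places or chop to len places.
groupIntsTo :: Int -> [Int] -> [[Int]]
groupIntsTo len xs = take len (groupIntsBase xs ++ repeat [])
```

Unlike the memoized basis, single rebuilds its padding for every element, so this version trades some speed for having no extra dependency.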
The (++) operator copies the left-hand list. For this reason, adding to the beginning of a list is quite fast, but adding to the end of a list is very slow.
In summary, avoid adding things to the end of a list. Try to always add to the beginning instead. One simple way to do that is to build the list backwards, and then reverse it at the end. A more devious trick is to use "difference lists" (Google it). Another possibility is to use Data.Sequence rather than a list.
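As a rough sketch of the "build it backwards, then reverse" idea applied to the question's function: the traversal is the same, but the accumulator grows with (:) and is reversed once at the end. The name bucketsRev is made up for this example.

```haskell
import Data.List (group, sort)

-- Hypothetical rewrite of the question's f with a reversed accumulator.
bucketsRev :: [Int] -> Int -> [[Int]]
bucketsRev ls len = go 0 (group (sort ls)) []
  where
    go val groups acc
      -- Counted up to len: undo the reversal with a single pass.
      | val == len = reverse acc
      -- The next group holds the current value: push it on the front.
      | (g@(x:_) : gs) <- groups, x == val = go (val + 1) gs (g : acc)
      -- Nothing for this value: push an empty slot.
      | otherwise = go (val + 1) groups ([] : acc)
```

Each step now costs constant time, so bucketsRev [] 1000000 does work proportional to the length instead of to its square.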
The first thing to note is that the most obvious way to implement this is to use a data structure that allows random access; an array is an obvious choice. Note that you need to add elements to the array multiple times and somehow "join them".
accumArray is perfect for this.
So we get:
f l i = elems $ accumArray (\l e -> e:l) [] (0,i-1) (map (\e -> (e,e)) l)
And we're good to go.
This approach does involve converting the final array back into a list, but that step is very likely faster than say sorting the list, which often involves scanning the list at least a few times for a list of decent size.
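For reference, a self-contained sketch of the accumArray approach with its import. The name bucketsArr is made up here, and it assumes every input value lies in the range 0 to len-1, since accumArray raises an error for an index outside the bounds.

```haskell
import Data.Array (accumArray, elems)

-- Hypothetical standalone version of the one-liner above.
-- Assumption: every element of xs is within [0, len - 1].
bucketsArr :: [Int] -> Int -> [[Int]]
bucketsArr xs len =
  elems (accumArray (flip (:)) [] (0, len - 1) [(x, x) | x <- xs])
  -- flip (:) conses each value onto the list already stored at its index
```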
Whenever you use ++ you have to recreate the entire list, since lists are immutable.
A simple solution would be to use :, but that builds a reversed list. However that can be fixed using reverse, which results in only building two lists (instead of 1 million in your case).
Your concept of glomming things onto an accumulator is a very useful one, and both MathematicalOrchid and Guvante show how you can use that concept reasonably efficiently. But in this case, there is a simpler approach that is likely also faster. You started with
group . sort $ ls
and this was a very good place to start! You get a list that's almost the one you want, except that you need to fill in some blanks. How can we figure those out? The simplest way, though probably not quite the most efficient, is to work with a list of all the numbers you want to count up to: [0 .. len-1].
So we start with
f ls len = g [0 .. len-1] (group . sort $ ls)
  where
    ?
How do we define g? By pattern matching!
f ls len = g [0 .. len-1] (group . sort $ ls)
  where
    -- We may or may not have some lists left,
    -- but we counted as high as we decided we
    -- would
    g [] _ = []
    -- We have no lists left, so the rest of the
    -- numbers are not represented
    g ns [] = map (const []) ns
    -- This shouldn't be possible, because group
    -- doesn't make empty lists.
    g _ ([]:_) = error "group isn't working!"
    -- Finally, we have some work to do!
    g (n:ns) xls@(xl@(x:_):xls')
      | n == x = xl : g ns xls'
      | otherwise = [] : g ns xls
That was nice, but making the list of numbers isn't free, so you might be wondering how you can optimize it. One method I invite you to try is using your original technique of keeping a separate counter, but following this same sort of structure.
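If you get stuck on that exercise, here is one way it might turn out. Treat it as a sketch: the name bucketsCounter is made up here, and the counter n simply replaces the explicit list [0 .. len-1].

```haskell
import Data.List (group, sort)

-- Hypothetical counter-based variant of the pattern-matching solution.
bucketsCounter :: [Int] -> Int -> [[Int]]
bucketsCounter ls len = go 0 (group (sort ls))
  where
    -- We counted as high as we decided we would.
    go n _ | n >= len = []
    -- No groups left: the remaining numbers are not represented.
    go n [] = replicate (len - n) []
    -- Either emit the matching group or an empty slot.
    go n xls@(xl@(x:_) : xls')
      | n == x    = xl : go (n + 1) xls'
      | otherwise = [] : go (n + 1) xls
    -- Unreachable in practice, because group never makes empty lists.
    go _ ([] : _) = error "group produced an empty list"
```

Because every clause produces its output with (:) before recursing, the result is also available lazily, one slot at a time.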

List to tuple in Haskell

Let's say I have a list like this:
["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
and want to have this:
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
Can this be done? Or is it bad Haskell practice?
For a simple method (that fails for an odd number of elements) you can use
combine :: [a] -> [(a, a)]
combine (x1:x2:xs) = (x1,x2):combine xs
combine (_:_) = error "Odd number of elements"
combine [] = []
Or you could use some complex method like the one in another answer, which I don't really want to understand.
More generic:
map2 :: (a -> a -> b) -> [a] -> [b]
map2 f (x1:x2:xs) = (f x1 x2) : map2 f xs
map2 _ (_:_) = error "Odd number of elements"
map2 _ [] = []
Here is one way to do it, with the help of a helper function that lets you drop every second element from your target list, and then just use zip. This may not have your desired behavior when the list is of odd length since that's not yet defined in the question.
-- This is just from ghci
let my_list = ["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
-- Both clauses go in one let; a second let would shadow the first clause
let dropEvery [] _ = []; dropEvery list count = (take (count-1) list) ++ dropEvery (drop count list) count
zip (dropEvery my_list 2) $ dropEvery (tail my_list) 2
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
The helper function is taken from question #16 of 99 Questions, where there are many other implementations of the same idea, probably many with better recursion optimization properties.
To understand dropEvery, it's good to remember what take and drop each do. take k some_list takes the first k entries of some_list. Meanwhile drop k some_list drops the first k entries.
If we want to drop every Nth element, it means we want to keep each run of (N-1) elements, then drop one, then do the same thing again until we are done.
The first part of dropEvery does this: it takes the first count-1 entries, which it will then concatenate to whatever it gets from the rest of the list.
After that, it says drop count (forget about the N-1 you kept, and also the 1 (in the Nth spot) that you had wanted to drop all along) -- and after these are dropped, you can just recursively apply the same logic to whatever is leftover.
Using ++ in this manner can be quite expensive in Haskell, so from a performance point of view this is not so great, but it was one of the shorter implementations available at that 99 questions page.
Here's a function to do it all in one shot, which is maybe a bit more readable:
byTwos :: [a] -> [(a,a)]
byTwos [] = []
byTwos xs = zip firsts seconds
    where enumerated = zip xs [1..]
          firsts = [fst x | x <- enumerated, odd $ snd x]
          seconds = [fst x | x <- enumerated, even $ snd x]
In this case, I started out by saying this problem will be easy to solve with zip if I just already had the list of odd-indexed elements and the list of even-indexed elements. So let me just write that down, and then worry about getting them in some where clause.
In the where clause, I say first zip xs [1..] which will make [("Questions", 1), ("that", 2), ...] and so on.
Side note: recall that fst takes the first element of a tuple, and snd takes the second element.
Then firsts says take the first element of all these values if the second element is odd -- these will serve as "firsts" in the final output tuples from zip.
seconds says do the same thing, but only if the second element is even -- these will serve as "seconds" in the final output tuples from zip.
In case the list has odd length, firsts will be one element longer than seconds and so the final zip means that the final element of the list will simply be dropped, and the result will be the same as though you called the function on the front of the list (all but final element).
A simple pattern match could do the trick:
f [] = []
f (x:y:xs) = (x,y):f(xs)
It means that an empty list gives an empty list, and that a list of at least two elements gives a list starting with the pair of those two elements, followed by the same reasoning applied to what follows...
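Note that the two clauses above leave a list with a single leftover element unmatched, so f fails at run time on odd-length input. A small sketch that covers that case by silently dropping the leftover (one possible choice among several; the name pairs is made up here):

```haskell
-- Hypothetical total variant: a trailing unpaired element is dropped.
pairs :: [a] -> [(a, a)]
pairs (x:y:rest) = (x, y) : pairs rest
pairs _          = []   -- empty list, or one element left over
```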
Using chunksOf from Data.List.Split (in the split package; older versions called it chunk) you can get the desired result of pairing every two consecutive items in a list. For a given list xs,
import Data.List.Split
map (\ys -> (ys!!0, ys!!1)) $ chunksOf 2 xs
This solution assumes the given list has an even number of items.

H-99 Problems: #26 Can't Understand The Solution

I am currently working through H-99 Questions after reading Learn You a Haskell. So far I felt like I had a pretty good grasp of the concepts, and I didn't have too much trouble solving or understanding the previous problems. However, this one stumped me and I don't understand the solution.
The problem is:
Generate the combinations of K distinct objects chosen from the N elements of a list
In how many ways can a committee of 3 be chosen from a group of 12 people? We all know that there are C(12,3) = 220 possibilities (C(N,K) denotes the well-known binomial coefficients). For pure mathematicians, this result may be great. But we want to really generate all the possibilities in a list.
The solution provided:
import Data.List
combinations :: Int -> [a] -> [[a]]
combinations 0 _ = [ [] ]
combinations n xs = [ y:ys | y:xs' <- tails xs, ys <- combinations (n-1) xs']
The main point of confusion for me is the y variable. According to how tails works, it should be assigned the entire list at the beginning, and then that list would be prepended to ys after it is generated. However, when the function runs it returns a list of lists, none longer than the n value passed in. Could someone please help me understand exactly how this works?
Variable y is not bound to the whole xs list. For instance, assume xs=[1,2,3]. Then:
y:xs' is matched against [1,2,3] ==> y=1 , xs'=[2,3]
y:xs' is matched against [2,3] ==> y=2 , xs'=[3]
y:xs' is matched against [3] ==> y=3 , xs'=[]
y:xs' is matched against [] ==> pattern match failure
Note that y is an integer above, while xs' is a list of integers.
The Haskell code can be read as a non-deterministic algorithm, as follows. To generate a combination of n elements from xs, get any tail of xs (i.e., drop any number of elements from the beginning). If the tail is empty, ignore it. Otherwise, let the tail be y:xs', where y is the first element of the tail and xs' the remaining (possibly empty) part. Take y and add it to the combination we are generating (as the first element). Then recursively choose the other n-1 elements from the remaining part xs', and add those to the combination as well. When n drops to zero, we know there is only one combination, namely the empty combination [], so take that.
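The non-deterministic reading may be easier to see when the comprehension is spelled out in the list monad. The sketch below should behave the same as the solution in the question; the name combinationsDo is made up here.

```haskell
import Data.List (tails)

-- Hypothetical do-notation rendering of the list comprehension.
combinationsDo :: Int -> [a] -> [[a]]
combinationsDo 0 _  = [[]]
combinationsDo n xs = do
  -- Choose any tail; the empty tail fails the pattern match and
  -- simply contributes no results.
  y:xs' <- tails xs
  -- Choose the remaining n-1 elements from what follows y.
  ys <- combinationsDo (n - 1) xs'
  -- Cons the chosen head onto the smaller combination.
  return (y : ys)
```

With xs = [1,2,3] and n = 2, the choices y = 1 and y = 2 give [1,2], [1,3] and [2,3], while y = 3 leaves nothing to choose from and yields no results.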
y is not appended to ys. That would involve the (++) :: [a] -> [a] -> [a] operator.
For that matter the types would not match if you tried to append y and ys. y has type a, while ys has type [a].
Rather, y is consed to ys using (:) :: a -> [a] -> [a] (the cons operator).
Each list in the result has length n because combinations recurses from n down to 0 and conses exactly one element at each level.

applying my functions?

I made these two haskell functions,
cut :: Int -> [a] -> (Error ([a],[a]))
cut _ [] = Ok([],[])
cut n xs | n>0 && n < length xs = Ok(take n xs, drop n xs)
         | n > length xs = error("Fail")
mix :: [a] -> [a] -> [a]
mix xs [] = xs
mix [] ys = ys
mix (x:xs) (y:ys) = x:y:mix xs ys
And now I wish to make another function in which I can use both of these.
This is what I have:
doboth :: [Int] -> [a] -> Error [a]
doboth (x:xs) list = mix((cut x list)) then send xs back to doboth recursively for the next x element of the list.
The idea of this function is to cut a list and then mix the two lists; it gets the cut points from doboth's list of Ints...
Any ideas?
Since cut returns not a list, you need to pattern match a bit:
case cut x list of
    Ok (as, bs) -> mix ... ... -- and so forth
Shouldn't doboth return Error [[a]]?
Maybe you should use a standard type like Maybe or Either instead of Error.
If I understand what you want correctly, then doboth would be something like
doboth xs list = mapM (\ x -> fmap (uncurry mix) (cut x list)) xs
Assuming you've made Error into a monad (which Maybe and Either already are).
There are several questions you have to ask yourself:
Does cut have to return an Error value? Or can you come up with a reasonable definition that works for all inputs? For example, tail doesn't work for all inputs, whereas take does. Investigate take and see how it handles exceptional inputs. This might lead you to a better definition of cut. The reason this is important is that functions that return uniform results for all values are generally much easier to work with.
What do you expect doboth to do if the result of the cut is an error?
Does doboth operate on elements of a list independently? Or are the results dependent on earlier computations? The first is map-like, the second fold-like. You want to perform a cut and mix for each value in the [Int] input, but should the input to cut be the original list, or the list from the previous step?
Given that you have computed one step of doboth, what should the next step of doboth look like? Try writing out the code that would do two, or even three, steps of doboth at once.
What is the value of doboth if the [Int] argument is empty?
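To make these questions concrete, here is one set of answers written out as a sketch. Everything in it is an assumption rather than the intended design: cut is made total via splitAt (so no Error type is needed), each step works on the result of the previous step (the fold-like reading), and an empty list of cut points returns the list unchanged. The names cutTotal and dobothFold are made up here.

```haskell
import Data.List (foldl')

-- Assumption: a total cut, borrowing splitAt's handling of
-- out-of-range positions instead of reporting an error.
cutTotal :: Int -> [a] -> ([a], [a])
cutTotal = splitAt

-- mix as in the question: interleave two lists.
mix :: [a] -> [a] -> [a]
mix xs [] = xs
mix [] ys = ys
mix (x:xs) (y:ys) = x : y : mix xs ys

-- Fold-like reading: each cut point is applied to the list
-- produced by the previous cut-and-mix step.
dobothFold :: [Int] -> [a] -> [a]
dobothFold cuts list = foldl' step list cuts
  where step acc n = uncurry mix (cutTotal n acc)
```

If you decide instead that every cut should apply to the original list, the fold becomes a map and the result type becomes a list of lists.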

Resources