I want to create a function that returns every third int from a list of ints without using any predefined functions. For example, everyThird [1,2,3,4,5] --> [1,4]
everyThird :: [a] -> [a]
Could I just keep iterating over the list using tail, appending to a new list on every third call? I am new to Haskell and very confused by all of this.
Another way of doing this is to handle three base cases, each covering the end of the list with fewer than three elements left, and one recursive case for a list of at least three elements:
everyThird :: [a] -> [a]
everyThird [] = []
everyThird [x] = [x]
everyThird [x, _] = [x]
everyThird (x:_:_:xs) = x:everyThird xs
You want to do exactly what you said: iterate over the list and include the element only on each third call. However, there's a problem. Haskell is a funny language where the idea of "changing" a variable doesn't make sense, so the usual approach of "have a counter variable i which tells us whether we're on the third element or not" won't work in the usual way. Instead, we'll create a recursive helper function to maintain the count for us.
everyThird :: [Int] -> [Int]
everyThird xs = helper 0 xs
  where helper _ [] = []
        helper 0 (x : xs) = x : helper 2 xs
        helper n (_ : xs) = helper (n - 1) xs
We have three cases in the helper.
If the list is empty, stop and return the empty list.
If the counter is at 0 (that is, if we're on an element we want to keep: the first, the fourth, and so on), make a list starting with the current element and ending with the rest of the computation.
If the counter is not at zero, count down and continue iteration.
Because of the way pattern matching works, Haskell tries these three equations in order, from top to bottom, so the last one only applies when the counter is not zero.
Notice how we use an additional argument to be the counter variable since we can't mutate the variable like we would in an imperative language. Also, notice how we construct the list recursively; we never "append" to an existing list because that would imply that we're mutating the list. We simply build the list up from scratch and end up with the correct result on the first go round.
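The same counter idea can be sketched for any step size. The name everyNth and its treatment of n < 1 are assumptions made for this illustration and are not part of the answer above; otherwise the helper is the one just described, with the reset value n - 1 in place of the hard-coded 2. Nothing in it depends on the element type, so the signature is polymorphic.

```haskell
-- everyNth is a hypothetical generalisation of the helper above.
-- Assumption: for n < 1 there is no sensible step, so the result is empty.
everyNth :: Int -> [a] -> [a]
everyNth n
  | n < 1     = const []
  | otherwise = go 0
  where
    -- The first argument counts how many elements are still to be skipped.
    go _ []       = []
    go 0 (x : xs) = x : go (n - 1) xs   -- keep this element, reset the counter
    go k (_ : xs) = go (k - 1) xs       -- skip this element, count down
```

With this sketch, everyNth 3 [1,2,3,4,5] gives [1,4], the same result as everyThird.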
Haskell doesn't have classical iteration (i.e. no loops), at least not without monads, but you can apply the same logic as in a for loop by zipping your list with the indexes [0..] and then using ordinary list functions such as filter and map.
For example, to keep every third element, filter on the index:
everyThirdWithIndexes list = filter (\x -> snd x `mod` 3 == 0) $ zip list [0..]
Of course you have to get rid of the indexes afterwards; there are two elegant ways you can do this:
everyThird list = map fst $ everyThirdWithIndexes list
-- or:
everyThird list = fst . unzip $ everyThirdWithIndexes list
If you're not familiar with filter and map, you can define a simple recursion that keeps the first element of the list, drops the next two, and then calls itself on whatever is left:
everyThird [] = [] -- covers both an empty input and the end of the recursion
everyThird (x:xs) = x : everyThird (drop 2 xs)
EDIT: If you have any questions about these solutions (e.g. some syntax that you are not familiar with), feel free to ask in the comments. :)
One classic approach:
everyThird xs = [x | (1,x) <- zip (cycle [1..3]) xs]
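A sketch of why this comprehension works: zipping with cycle [1..3] tags the elements 1, 2, 3, 1, 2, ... and the pattern (1, x) in the generator only matches pairs tagged 1, so every other pair is skipped. The names tagged and everyThirdTagged are made up here to expose the intermediate step.

```haskell
-- tagged is a hypothetical helper that shows the intermediate pairs.
tagged :: [a] -> [(Int, a)]
tagged = zip (cycle [1 .. 3])

-- Same comprehension as above: pairs whose tag is not 1 fail the
-- pattern match in the generator and are dropped.
everyThirdTagged :: [a] -> [a]
everyThirdTagged xs = [x | (1, x) <- tagged xs]
```

For instance, tagged "abcde" is [(1,'a'),(2,'b'),(3,'c'),(1,'d'),(2,'e')], and only 'a' and 'd' carry the tag 1.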
You can also use chunksOf from Data.List.Split to separate the list into chunks of 3, then just take the first element of each:
import Data.List.Split
everyThird :: [a] -> [a]
everyThird xs = map head $ chunksOf 3 xs
Which works as follows:
*Main> everyThird [1,2,3,4,5]
[1,4]
Note: You may need to run cabal install split to use chunksOf.
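If installing the split package is not an option, the chunking step can be approximated with splitAt from the Prelude. This is only a sketch: chunks and everyThirdChunks are hypothetical names, and chunks is a simplified stand-in for chunksOf, not its real implementation.

```haskell
-- chunks is a hypothetical stand-in for chunksOf from the split package.
-- Assumption: n >= 1; for n <= 0 it would produce empty chunks forever.
chunks :: Int -> [a] -> [[a]]
chunks _ [] = []
chunks n xs = front : chunks n rest
  where
    (front, rest) = splitAt n xs

-- Keep the first element of every chunk of three. The pattern (x : _)
-- avoids calling head on a chunk.
everyThirdChunks :: [a] -> [a]
everyThirdChunks xs = [x | (x : _) <- chunks 3 xs]
```

Here chunks 3 [1..7] gives [[1,2,3],[4,5,6],[7]], so everyThirdChunks [1..7] gives [1,4,7].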
I have written a function that generates the subsets of a set. It causes a stack overflow when I use it as subsets [1..]. That is "normal" behaviour in "normal" (non-lazy) languages. Now I would like to improve my function so that it is lazy.
P.S. I don't understand laziness yet (I am trying to), so perhaps my problem seems strange to you; please explain. :)
P.S. 2 Feel free to tell me about my shortcomings in Haskell ;)
subsets :: [a] -> [[a]]
subsets (x:xs) = (map (\ e -> x:e) (subsets xs)) ++ (subsets xs)
subsets [] = [[]]
There are two problems with that function. First, it recurses twice, which makes it exponentially more inefficient than necessary (even if we disregard the exponential number of results...), because each subtree is recalculated every time for all overlapping subsets; this can be fixed by letting both uses share the value of a single recursive call:
subsets' :: [a] -> [[a]]
subsets' [] = [[]]
subsets' (x:xs) = let s = subsets' xs
                  in map (x:) s ++ s
This will already allow you to calculate length $ subsets' [1..25] in a few seconds, while length $ subsets [1..25] takes... well, I didn't wait ;)
The other issue is that with your version, when you give it an infinite list, it will recurse on the infinite tail of that list first, so it never reaches the base case and never produces any output. To generate all finite subsets in a meaningful way, we need to ensure two things: first, we must build up each set from smaller sets (to ensure termination), and second, we should ensure a fair order (i.e., not generate the list [[1], [2], ...] first and never get to the rest). For this, we start from [[]] and recursively add the current element to everything we have already generated, and then remember the new list for the next step:
subsets'' :: [a] -> [[a]]
subsets'' l = [[]] ++ subs [[]] l
  where subs previous (x:xs) = let next = map (x:) previous
                               in next ++ subs (previous ++ next) xs
        subs _ [] = []
Which results in this order:
*Main> take 100 $ subsets'' [1..]
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1],[4],[4,1],[4,2],[4,2,1],[4,3],[4,3,1],[4,3,2],[4,3,2,1],[5],[5,1],[5,2],[5,2,1],[5,3],[5,3,1],[5,3,2],[5,3,2,1],[5,4],[5,4,1],[5,4,2],[5,4,2,1],[5,4,3],[5,4,3,1],[5,4,3,2],[5,4,3,2,1],[6],[6,1],[6,2],[6,2,1],[6,3],[6,3,1],[6,3,2],[6,3,2,1],[6,4],[6,4,1],[6,4,2],[6,4,2,1],[6,4,3],[6,4,3,1],[6,4,3,2],[6,4,3,2,1],[6,5],[6,5,1],[6,5,2],[6,5,2,1],[6,5,3],[6,5,3,1],[6,5,3,2],[6,5,3,2,1],[6,5,4],[6,5,4,1],[6,5,4,2],[6,5,4,2,1],[6,5,4,3],[6,5,4,3,1],[6,5,4,3,2],[6,5,4,3,2,1],[7],[7,1],[7,2],[7,2,1],[7,3],[7,3,1],[7,3,2],[7,3,2,1],[7,4],[7,4,1],[7,4,2],[7,4,2,1],[7,4,3],[7,4,3,1],[7,4,3,2],[7,4,3,2,1],[7,5],[7,5,1],[7,5,2],[7,5,2,1],[7,5,3],[7,5,3,1],[7,5,3,2],[7,5,3,2,1],[7,5,4],[7,5,4,1],[7,5,4,2],[7,5,4,2,1],[7,5,4,3],[7,5,4,3,1],[7,5,4,3,2],[7,5,4,3,2,1],[7,6],[7,6,1],[7,6,2],[7,6,2,1]]
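As a sanity check, the lazy version can be compared with subsequences from Data.List on a finite input. This is a sketch rather than a proof: agreesWithSubsequences is a hypothetical helper, and subsets'' is copied from above only so that the block compiles on its own.

```haskell
import Data.List (sort, subsequences)

-- Copied from the answer above so that this block is self-contained.
subsets'' :: [a] -> [[a]]
subsets'' l = [[]] ++ subs [[]] l
  where
    subs previous (x:xs) =
      let next = map (x:) previous
      in  next ++ subs (previous ++ next) xs
    subs _ [] = []

-- Hypothetical helper: do both functions produce the same subsets?
-- Sorting ignores the order of the subsets and of the elements inside each.
agreesWithSubsequences :: Ord a => [a] -> Bool
agreesWithSubsequences xs =
  sort (map sort (subsets'' xs)) == sort (map sort (subsequences xs))
```

On a finite list such as [1..10] the check should hold, and taking a finite prefix of subsets'' [1..] terminates because each new subset is built only from subsets that have already been produced.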
You can't generate all the subsets of an infinite set: they form an uncountable set. Cardinality makes it impossible.
At most, you can try to generate all the finite subsets. For that, you can't recurse all the way down to the base case [] before producing anything, since with an infinite list you'll never reach []. You need to proceed inductively from the beginning of the list, instead of the end.
A right fold solution would be:
powerset :: Foldable t => t a -> [[a]]
powerset xs = [] : foldr go (const []) xs [[]]
  where go x f a = let b = (x:) <$> a in b ++ f (a ++ b)
then:
\> take 8 $ powerset [1..]
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1]]
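One way to see that the finite subsets of [1..] can be enumerated at all: the order shown above looks like counting in binary, where bit k of the position decides whether element k + 1 is in the subset. The sketch below assumes that reading; nthSubset is a hypothetical name and the function expects a non-negative position.

```haskell
-- nthSubset is a hypothetical helper: the subset of [1..] found at position n
-- (counting from 0) when subsets are listed in binary counting order.
-- Assumption: n >= 0; a negative n would never reach the base case.
nthSubset :: Integer -> [Integer]
nthSubset = go 1 []
  where
    -- k is the element that the lowest remaining bit of m stands for.
    go _ acc 0 = acc
    go k acc m
      | odd m     = go (k + 1) (k : acc) (m `div` 2)
      | otherwise = go (k + 1) acc (m `div` 2)
```

Under this reading, map nthSubset [0..7] reproduces the eight subsets listed above, in the same order.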
I'm facing the following problem :
From the initial set [1,2,3,4] compute all possible subsets i.e [[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4],[1,2,3],[1,2,4],[1,3,4],[2,3,4],[1,2,3,4]]
I've written the following Haskell program, generate.hs, which is correct.
-- sets holds the subsets found in the previous round; rsets accumulates all subsets found so far.
generateSets :: Eq a => [a] -> [[a]] -> [[a]] -> [[a]]
generateSets [] _ _ = []
generateSets src [] _ = let isets = growthup [] src in generateSets src isets isets
generateSets src sets rsets = if null sets' then rsets else generateSets src sets' (rsets++sets')
  where sets' = concatMap (flip growthup src) sets
-- Extend the prefix ps with each element of ss that comes after the last element of ps.
growthup :: (Eq a) => [a] -> [a] -> [[a]]
growthup ps ss = map (\suf -> ps++[suf]) ss'
  where ss' = nextoccurence ps ss
-- The elements of ys that follow the last element of xs (all of ys when xs is empty).
nextoccurence :: (Eq a) => [a] -> [a] -> [a]
nextoccurence [] ys = ys
nextoccurence xs ys = tail ys'
  where ys' = dropWhile (/= last xs) ys
While executing it in the GHC interpreter ghci ...
ghci> generateSets [1,2,3,4] [] []
[[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4],[1,2,3],[1,2,4],[1,3,4],[2,3,4],[1,2,3,4]]
everything works fine, but the program takes too long even for small sets, of size 30 for example.
My question is: is it possible to improve my code so that it benefits more from Haskell's laziness, the garbage collector, or something else?
Is my code a good candidate for parallelism ?
Thanks for any reply !
Sets have a lot of subsets. In fact, a set of n elements has 2^n subsets, so a set of 30 elements has over one billion subsets. Whichever method you use to generate them, even iterating over the results is going to take a long time. For larger sets you can pretty much forget about going through them all before the heat death of the universe.
So there's only so much you can do performance-wise, as even doubling the speed of your algorithm will only let you work with lists of one more element in the same time. For most applications, the real solution is to avoid having to enumerate all the subsets in the first place.
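The arithmetic behind this can be checked directly. A minimal sketch, where subsetCount is a hypothetical helper name:

```haskell
-- subsetCount is a hypothetical helper: the number of subsets of a set
-- with n elements. Integer is used so that large n does not overflow.
subsetCount :: Int -> Integer
subsetCount n = 2 ^ n
```

subsetCount 30 is 1073741824, and subsetCount 31 is exactly twice that, which is why doubling the speed of the algorithm only buys one more element.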
That said, there is a simple inductive way of thinking about subsets which makes it easy to define a correct subsets function without any equality comparisons, which solves some of the problems with your implementation.
For the base case, the empty set has one subset: the empty set.
subsets [] = [[]]
For a set with at least one element (x:xs), we have the subsets which contain that element, and the ones that don't. We can get the subsets that don't contain x by recursively calling subsets xs, and we can get the rest by prepending x to those.
subsets (x:xs) = subsets xs ++ map (x:) (subsets xs)
The definition of subsequences in Data.List works on the same principle, but in a slightly more optimized way, which also returns the subsets in a different order and makes better use of sharing. However, as I said, enumerating the subsets of a list of length 30 is going to be slow no matter what, and your best bet is to try to avoid having to do it in the first place.
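For reference, a short sketch of using subsequences directly. The helper nonEmptySubsets is a hypothetical name; it drops the leading empty list because the question lists only non-empty subsets. Note that the order differs from the one in the question.

```haskell
import Data.List (subsequences)

-- nonEmptySubsets is a hypothetical helper: subsequences always returns the
-- empty list first, so dropping one element leaves the non-empty subsets.
nonEmptySubsets :: [a] -> [[a]]
nonEmptySubsets = drop 1 . subsequences
```

For example, subsequences [1,2,3] is [[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]], and nonEmptySubsets [1,2,3,4] has 15 elements, as in the question.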