test :: [String] -> [String]
test = foldr step []
  where step x ys
          | elem x ys = x : ys
          | otherwise = ys
I am trying to build a new list consisting of all the distinct strings being input. My test data is:
test ["one", "one", "two", "two", "three"]
expected result:
["one", "two", "three"]
I am new to Haskell, and I am sure that I am missing something very fundamental and obvious, but have run out of ways to explore this. Could you provide pointers to where my thinking is deficient?
The actual response is []. It seems that the first guard condition is never met (if I replace it with True, the original list is replicated), so the output list is never built.
My understanding was that the fold would accumulate the result of step on each item of the list, adding it to the empty list. I anticipated that step would test each item for its inclusion in the output list (the first element tested not being there) and would add anything that was not already included to the output list. Obviously not :-)
Your reasoning is correct: you just need to switch = x : ys and = ys so that you add the x when it's not an element of ys. Also, Data.List.nub does this exact thing.
Think about it: your code is saying "when x is in the remainder, prepend x to the result", i.e. creating a duplicate. You just need to change it to "when x is not in the remainder, prepend x to the result" and you get the correct function.
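A minimal sketch of the corrected function, with the two guard results swapped as described above. The name distinct is only an illustrative choice so that it does not clash with the original test.

```haskell
-- Corrected version of the question's fold; `distinct` is a made-up name.
distinct :: [String] -> [String]
distinct = foldr step []
  where
    step x ys
      | x `elem` ys = ys        -- x already occurs in the rest: skip this copy
      | otherwise   = x : ys    -- x is new: keep it
```

Because foldr builds the result from the right, this keeps the last occurrence of each duplicate, so distinct ["a", "b", "a"] gives ["b", "a"], whereas Data.List.nub keeps the first occurrence and gives ["a", "b"].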
This function differs from Data.List.nub in an important way: this function is more strict. Thus:
test [1..] = _|_ -- infinite loop (try it)
nub [1..] = [1..]
nub gives the answer correctly for infinite lists -- this means that it doesn't need the whole list to start computing results, and thus it is a nice player in the stream processing game.
The reason it is strict is that elem is strict: it searches the whole list (presuming it doesn't find a match) before it returns a result. You could write nub like this:
nub :: (Eq a) => [a] -> [a]
nub = go []
  where
    go seen [] = []
    go seen (x:xs) | x `elem` seen = go seen xs
                   | otherwise = x : go (x:seen) xs
Notice how seen grows like the output so far, whereas yours grows like the remainder of the output. The former is always finite (starting at [] and adding one element at a time), whereas the latter may be infinite (e.g. [1..]). So this variant can yield elements more lazily.
This would be faster (O(n log n) instead of O(n^2)) if you used a Data.Set instead of a list for seen. But it adds an Ord constraint.
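A sketch of that Set-based variant, assuming the containers package (which ships with GHC) is available. The name nubSet is hypothetical; the structure mirrors the list-based go helper above, with seen held in a Data.Set.

```haskell
import qualified Data.Set as Set

-- Hypothetical name; same shape as the list-based version, but `seen` is a Set,
-- so each membership test and insertion costs O(log n).
nubSet :: Ord a => [a] -> [a]
nubSet = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

It stays lazy in the same way: take 3 (nubSet [1..]) returns [1,2,3] without traversing the whole input.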
I'm trying to solve a problem for a functional programming exercise in Haskell. I have to implement a function such that, given a string with an even number of characters, the function returns the same string with character pairs swapped.
Like this:
"helloworld" -> "ehllworodl"
This is my current implementation:
swap :: String -> String
swap s = swapRec s ""
  where
    swapRec :: String -> String -> String
    swapRec [] result = result
    swapRec (x:y:xs) result = swapRec xs (result++[y]++[x])
My function returns the correct results, however the programming exercise is timed, and it seems like my code is running too slowly.
Is there something I could do to make my code run faster, or am I following the wrong approach to the problem?
Yes. If you use (++) :: [a] -> [a] -> [a], then this takes linear time in the number of elements of the first list you want to concatenate. Since result can be large, this results in an inefficiency: the algorithm is then O(n^2).
You however do not need to construct the result with an accumulator. You can return a list, and do the processing of the remaining elements with a recursive call, like:
swap :: [a] -> [a]
swap [] = []
swap [x] = [x]
swap (x:y:xs) = y : x : swap xs
The above also uncovered a problem with the implementation: if the list had an odd length, then the function would have crashed. Here in the second case, we handle a list with one element by returning that list (perhaps you need to modify this according to the specifications).
Furthermore, here we can benefit from Haskell's laziness: if we have a large list, want to pass it through the swap function, but are only interested in the first five elements, then we will not calculate the entire list.
We can also process all kinds of list with the above function: a list of numbers, of strings, etc.
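The laziness point can be checked on an infinite list. The sketch below repeats the swap definition from above so that it is self-contained; firstFive is just an illustrative name.

```haskell
swap :: [a] -> [a]
swap []       = []
swap [x]      = [x]
swap (x:y:xs) = y : x : swap xs

-- Only the first five elements are ever demanded, so this terminates
-- even though the input list is infinite.
firstFive :: [Int]
firstFive = take 5 (swap [1 ..])
```

Here firstFive evaluates to [2,1,4,3,6]. The accumulator-based version could not do this, because it only returns its result once the whole input has been consumed.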
Note that (++) itself is not inherently bad: if you need to concatenate two lists, it is of course the most efficient way to do so. The problem here is that every recursive step concatenates again, and the left operand grows each time.
Affixing something at the end of the accumulator passed into a recursive call
swapRec (x:y:xs) resultSoFar = swapRec xs (resultSoFar ++ [y] ++ [x])
is the same as prepending it at the start of the result returned from the recursive call:
swapRec (x:y:xs) = [y] ++ [x] ++ swapRec xs
You will have to amend your function accordingly throughout.
This is known as guarded recursion. What you were using is known as tail recursion (a left fold).
The added benefit is that it will now be on-line (i.e., taking O(1) time per processed element). You were creating the (++) nesting on the left, which leads to quadratic behaviour.
I want to create a function that returns every third int from a list of ints without using any predefined functions. For example, everyThird [1,2,3,4,5] --> [1,4]
everyThird:: [a] -> [a]
Could I just continue to iterate over the list using tail, appending to a new list on every third call? I am new to Haskell and very confused by all of this.
One other way of doing this is to handle three different base cases, in all of which we're at the end of the list and the list is less than three elements long, and one recursive case, where the list is at least three elements long:
everyThird :: [a] -> [a]
everyThird [] = []
everyThird [x] = [x]
everyThird [x, _] = [x]
everyThird (x:_:_:xs) = x:everyThird xs
You want to do exactly what you said: iterate over the list and include the element only on each third call. However, there's a problem. Haskell is a funny language where the idea of "changing" a variable doesn't make sense, so the usual approach of "have a counter variable i which tells us whether we're on the third element or not" won't work in the usual way. Instead, we'll create a recursive helper function to maintain the count for us.
everyThird :: [Int] -> [Int]
everyThird xs = helper 0 xs
  where helper _ [] = []
        helper 0 (x : xs) = x : helper 2 xs
        helper n (_ : xs) = helper (n - 1) xs
We have three cases in the helper.
If the list is empty, stop and return the empty list.
If the counter is at 0 (that is, if we're on the third element), make a list starting with the current element and ending with the rest of the computation.
If the counter is not at zero, count down and continue iteration.
Because of the way pattern matching works, it will try these three equations in order.
Notice how we use an additional argument to be the counter variable since we can't mutate the variable like we would in an imperative language. Also, notice how we construct the list recursively; we never "append" to an existing list because that would imply that we're mutating the list. We simply build the list up from scratch and end up with the correct result on the first go round.
Haskell doesn't have classical iteration (i.e. no loops), at least not without monads, but you can use similar logic as you would in a for loop by zipping your list with indexes [0..] and applying appropriate functions from Data.List.
E.g. What you need to do is filter every third element:
everyThirdWithIndexes list = filter (\x -> snd x `mod` 3 == 0) $ zip list [0..]
Of course you have to get rid of the indexes, there are two elegant ways you can do this:
everyThird list = map fst (everyThirdWithIndexes list)
-- or:
everyThird list = fst . unzip $ everyThirdWithIndexes list
If you're not familiar with filter and map, you can define a simple recursion that builds a list from every first element of a list, drops the next two and then adds another from a new function call:
everyThird [] = [] -- both in case if the list is empty and the end case
everyThird (x:xs) = x : everyThird (drop 2 xs)
EDIT: If you have any questions about these solutions (e.g. some syntax that you are not familiar with), feel free to ask in the comments. :)
One classic approach:
everyThird xs = [x | (1,x) <- zip (cycle [1..3]) xs]
You can also use chunksOf from Data.List.Split to separate the list into chunks of 3, then just take the first element of each:
import Data.List.Split
everyThird :: [a] -> [a]
everyThird xs = map head $ chunksOf 3 xs
Which works as follows:
*Main> everyThird [1,2,3,4,5]
[1,4]
Note: You may need to run cabal install split to use chunksOf.
I am having problems trying to separate a list as follows. Suppose we have the following lists:
[[1,2,3,4], [5,6,7,8], [9,10,11,12 ], [13,14,15,16,17]].
The result should be:
[[1,5,9,13], [2,6,10,14], [3,7,11,15], [4,8,12,16]]
I'm trying to do it the following way:
joinHead (x: xs) = map head (x: xs)
separateLists (x: xs) = xs joinHead x ++ separateLists
Obviously this does not work. I hope you can help me. Thanks.
I adapted the functions you wrote, joinHead and separateLists, to make the code work, while preserving the logic you were following. From what I could infer looking at these functions, the idea was to use joinHead to extract the first element of each child list and return a new list. Then, this new list should be inserted in the front of a list of lists returned from calling separateLists recursively.
Here is the new definition of joinHead:
joinHead :: [[a]] -> [a]
joinHead ([]:_) = []
joinHead xs = map head xs
Note that the first line checks, through pattern matching, whether the first list contained in the list of lists is empty and, if so, returns an empty list ([]). There are two reasons for that:
The function head is unsafe. That means that calling head on an empty list will cause an exception to be thrown (try running in GHCi head []);
For simplicity, I'm assuming that all the lists were already checked to have the same length (length (xs !! 0) == length (xs !! 1) ...).
The definition of separateLists is as follows:
separateLists :: [[a]] -> [[a]]
separateLists ([]:_) = []
separateLists ([x]:xs) = [joinHead ([x]:xs)]
separateLists xs = joinHead xs : separateLists (map tail xs)
Again, the first two lines are needed both to stop the recursion and for safety. The first line says: "if the first list is empty, then all the elements of all the lists have already been consumed, so return []". The second line says: "if the first list has exactly one element, then just call joinHead and return the result wrapped in a list". Note that the third line calls tail which, like head, throws an exception when called on []. That is why we need a separate case for lists of length 1. Finally, the third line, which applies to lists of length greater than 1, takes the list from joinHead xs and inserts it (using the "cons" operator (:)) at the front of the list returned by the recursive call to separateLists. In that call we have to drop the first element of every list, which is why we use map tail xs.
Now, running:
λ: let list = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16,17]]
λ: separateLists list
[[1,5,9,13],[2,6,10,14],[3,7,11,15],[4,8,12,16]]
will give you the expected results. I hope it was clear enough. As a final note, I want to point out that this implementation is far from optimal and, as suggested in the comments, you should probably use the standard Data.List.transpose. As an exercise and didactic example, it's fine! ;-)
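For comparison, here is a small sketch of the Data.List.transpose route on the same input. The name columns is only illustrative.

```haskell
import Data.List (transpose)

-- transpose copes with ragged input: a column is built from whichever
-- rows are long enough to reach it.
columns :: [[Int]]
columns = transpose [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16,17]]
```

Note that the behaviour is not identical to separateLists: because the last row has a fifth element, columns ends with an extra [17], giving [[1,5,9,13],[2,6,10,14],[3,7,11,15],[4,8,12,16],[17]]. If that trailing column is unwanted, it has to be trimmed separately.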
Let's say I have a list like this:
["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
and want to have this:
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
Can this be done? Or is it bad Haskell practice?
For a simple method (that fails for an odd number of elements) you can use
combine :: [a] -> [(a, a)]
combine (x1:x2:xs) = (x1,x2):combine xs
combine (_:_) = error "Odd number of elements"
combine [] = []
Or you could use some complex method like the one in another answer that I don't really want to understand.
More generic:
map2 :: (a -> a -> b) -> [a] -> [b]
map2 f (x1:x2:xs) = (f x1 x2) : map2 f xs
map2 _ (_:_) = error "Odd number of elements"
map2 _ [] = []
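As a sketch of how the generic version relates to the simple one, combine can be recovered by passing the tuple constructor (,) to map2. The definition of map2 is repeated so the block is self-contained.

```haskell
map2 :: (a -> a -> b) -> [a] -> [b]
map2 f (x1:x2:xs) = f x1 x2 : map2 f xs
map2 _ (_:_)      = error "Odd number of elements"
map2 _ []         = []

-- Pairing is just map2 with the tuple constructor.
combine :: [a] -> [(a, a)]
combine = map2 (,)
```

For example, combine ["Questions", "that", "may", "already"] gives [("Questions","that"),("may","already")], and map2 (+) [1,2,3,4] gives [3,7].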
Here is one way to do it, with the help of a helper function that lets you drop every second element from your target list, and then just use zip. This may not have your desired behavior when the list is of odd length since that's not yet defined in the question.
-- This is just from ghci
let my_list = ["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
-- both equations go in one let; a second let would shadow the base case
let dropEvery [] _ = []; dropEvery list count = take (count-1) list ++ dropEvery (drop count list) count
zip (dropEvery my_list 2) $ dropEvery (tail my_list) 2
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
The helper function is taken from question #16 from 99 Questions, where there are many other implementations of the same idea, probably many with better recursion optimization properties.
To understand dropEvery, it's good to remember what take and drop each do. take k some_list takes the first k entries of some_list. Meanwhile drop k some_list drops the first k entries.
If we want to drop every Nth element, it means we want to keep each run of (N-1) elements, then drop one, then do the same thing again until we are done.
The first part of dropEvery does this: it takes the first count-1 entries, which it will then concatenate to whatever it gets from the rest of the list.
After that, it says drop count (forget about the N-1 you kept, and also the 1 (in the Nth spot) that you had wanted to drop all along) -- and after these are dropped, you can just recursively apply the same logic to whatever is leftover.
Using ++ in this manner can be quite expensive in Haskell, so from a performance point of view this is not so great, but it was one of the shorter implementations available at that 99 questions page.
Here's a function to do it all in one shot, which is maybe a bit more readable:
byTwos :: [a] -> [(a,a)]
byTwos [] = []
byTwos xs = zip firsts seconds
  where enumerated = zip xs [1..]
        firsts = [fst x | x <- enumerated, odd $ snd x]
        seconds = [fst x | x <- enumerated, even $ snd x]
In this case, I started out by saying this problem will be easy to solve with zip if I just already had the list of odd-indexed elements and the list of even-indexed elements. So let me just write that down, and then worry about getting them in some where clause.
In the where clause, I say first zip xs [1..] which will make [("Questions", 1), ("that", 2), ...] and so on.
Side note: recall that fst takes the first element of a tuple, and snd takes the second element.
Then firsts says take the first element of all these values if the second element is odd -- these will serve as "firsts" in the final output tuples from zip.
seconds says do the same thing, but only if the second element is even -- these will serve as "seconds" in the final output tuples from zip.
In case the list has odd length, firsts will be one element longer than seconds and so the final zip means that the final element of the list will simply be dropped, and the result will be the same as though you called the function on the front of the list (all but final element).
A simple pattern matching could do the trick :
f [] = []
f (x:y:xs) = (x,y):f(xs)
It means that an empty list gives an empty list, and that a list of at least two elements gives a list starting with the pair of those two elements, followed by the result of applying the same reasoning to what follows...
Using chunksOf from Data.List.Split you can get the desired result of pairing every two consecutive items in a list, namely, for a given list named xs,
import Data.List.Split
map (\ys -> (ys!!0, ys!!1)) $ chunksOf 2 xs
This solution assumes the given list has an even number of items.
I am currently working through H-99 Questions after reading Learn You a Haskell. So far I felt like I had a pretty good grasp of the concepts, and I didn't have too much trouble solving or understanding the previous problems. However, this one stumped me and I don't understand the solution.
The problem is:
Generate the combinations of K distinct objects chosen from the N elements of a list
In how many ways can a committee of 3 be chosen from a group of 12 people? We all know that there are C(12,3) = 220 possibilities (C(N,K) denotes the well-known binomial coefficients). For pure mathematicians, this result may be great. But we want to really generate all the possibilities in a list.
The solution provided:
import Data.List
combinations :: Int -> [a] -> [[a]]
combinations 0 _ = [ [] ]
combinations n xs = [ y:ys | y:xs' <- tails xs, ys <- combinations (n-1) xs']
The main point of confusion for me is the y variable. According to how tails works, it should be bound to the entire list at the beginning, and then that list would be prepended to ys after ys is generated. However, when the function runs it returns a list of lists, none longer than the n value passed in. Could someone please help me understand exactly how this works?
Variable y is not bound to the whole xs list. For instance, assume xs=[1,2,3]. Then:
y:xs' is matched against [1,2,3] ==> y=1 , xs'=[2,3]
y:xs' is matched against [2,3] ==> y=2 , xs'=[3]
y:xs' is matched against [3] ==> y=3 , xs'=[]
y:xs' is matched against [] ==> pattern match failure
Note that y is an integer above, while xs' is a list of integers.
The Haskell code can be read as a non-deterministic algorithm, as follows. To generate a combination of n elements from xs, get any tail of xs (i.e., drop any number of elements from the beginning). If the tail is empty, ignore it. Otherwise, let the tail be y:xs', where y is the first element of the tail and xs' the remaining (possibly empty) part. Take y and add it to the combination we are generating (as the first element). Then recursively choose the other n-1 elements from the remaining part xs', and add those to the combination as well. When n drops to zero, we know there is only one combination, namely the empty combination [], so take that.
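The pattern match against each tail can be observed directly. In the sketch below, splits is a hypothetical helper that exposes only the y:xs' <- tails xs generator; combinations is the solution from the question, repeated so that the block is self-contained.

```haskell
import Data.List (tails)

-- Hypothetical helper: every way to pick a head y together with what follows it.
-- The empty tail fails the y:xs' pattern and is silently skipped.
splits :: [a] -> [(a, [a])]
splits xs = [ (y, xs') | y:xs' <- tails xs ]

combinations :: Int -> [a] -> [[a]]
combinations 0 _  = [ [] ]
combinations n xs = [ y:ys | y:xs' <- tails xs, ys <- combinations (n-1) xs' ]
```

Here splits [1,2,3] gives [(1,[2,3]),(2,[3]),(3,[])], and combinations 2 [1,2,3] gives [[1,2],[1,3],[2,3]].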
y is not appended to ys. That would involve the (++) :: [a] -> [a] -> [a] operator.
For that matter the types would not match if you tried to append y and ys. y has type a, while ys has type [a].
Rather, y is consed to ys using (:) :: a -> [a] -> [a] (the cons operator).
Each inner list in the result has length exactly n, because combinations recurses from n down to 0 and conses exactly one element onto the front at each level.