let's say i have a list like this:
["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
and want to have this:
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
can this be done ? or is it a bad Haskell practice ?
For a simple method (that fails for a odd number of elements) you can use
combine :: [a] -> [(a, a)]
combine (x1:x2:xs) = (x1,x2):combine xs
combine (_:_) = error "Odd number of elements"
combine [] = []
Live demo
Or you could use some complex method like in an other answer that I don't really want to understand.
More generic:
map2 :: (a -> a -> b) -> [a] -> [b]
map2 f (x1:x2:xs) = (f x1 x2) : map2 f xs
map2 _ (_:_) = error "Odd number of elements"
map2 _ [] = []
Here is one way to do it, with the help of a helper function that lets you drop every second element from your target list, and then just use zip. This may not have your desired behavior when the list is of odd length since that's not yet defined in the question.
-- This is just from ghci
let my_list = ["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
let dropEvery [] _ = []
let dropEvery list count = (take (count-1) list) ++ dropEvery (drop count list) count
zip (dropEvery my_list 2) $ dropEvery (tail my_list) 2
[("Questions","that"),("may","already"),("have","your"),("correct","answer")
The helper function is taken from question #6 from 99 Questions., where there are many other implementations of the same idea, probably many with better recursion optimization properties.
To understand dropEvery, it's good to remember what take and drop each do. take k some_list takes the first k entries of some_list. Meanwhile drop k some_list drops the first k entries.
If we want to drop every Nth element, it means we want to keep each run of (N-1) elements, then drop one, then do the same thing again until we are done.
The first part of dropEvery does this: it takes the first count-1 entries, which it will then concatenate to whatever it gets from the rest of the list.
After that, it says drop count (forget about the N-1 you kept, and also the 1 (in the Nth spot) that you had wanted to drop all along) -- and after these are dropped, you can just recursively apply the same logic to whatever is leftover.
Using ++ in this manner can be quite expensive in Haskell, so from a performance point of view this is not so great, but it was one of the shorter implementations available at that 99 questions page.
Here's a function to do it all in one shot, which is maybe a bit more readable:
byTwos :: [a] -> [(a,a)]
byTwos [] = []
byTwos xs = zip firsts seconds
where enumerated = zip xs [1..]
firsts = [fst x | x <- enumerated, odd $ snd x]
seconds = [fst x | x <- enumerated, even $ snd x]
In this case, I started out by saying this problem will be easy to solve with zip if I just already had the list of odd-indexed elements and the list of even-indexed elements. So let me just write that down, and then worry about getting them in some where clause.
In the where clause, I say first zip xs [1..] which will make [("Questions", 1), ("that", 2), ...] and so on.
Side note: recall that fst takes the first element of a tuple, and snd takes the second element.
Then firsts says take the first element of all these values if the second element is odd -- these will serve as "firsts" in the final output tuples from zip.
seconds says do the same thing, but only if the second element is even -- these will serve as "seconds" in the final output tuples from zip.
In case the list has odd length, firsts will be one element longer than seconds and so the final zip means that the final element of the list will simply be dropped, and the result will be the same as though you called the function on the front of the list (all but final element).
A simple pattern matching could do the trick :
f [] = []
f (x:y:xs) = (x,y):f(xs)
It means that an empty list gives an empty list, and that a list of a least two elements returns you a list with a couple of these two elements and then application of the same reasoning with what follows...
Using chunk from Data.List.Split you can get the desired result of pairing every two consecutive items in a list, namely for the given list named by xs,
import Data.List.Split
map (\ys -> (ys!!0, ys!!1)) $ chunk 2 xs
This solution assumes the given list has an even number of items.
Related
I want to create a function that returns every third int from a list of ints without using any predefined functions. For example, everyThird [1,2,3,4,5] --> [1,4]
everyThird:: [a] -> [a]
Could I just continue to iterate over the list using tail and appending to a new list every third call? I am new to Haskell and very confused with all of this
One other way of doing this is to handle three different base cases, in all of which we're at the end of the list and the list is less than three elements long, and one recursive case, where the list is at least three elements long:
everyThird :: [a] -> [a]
everyThird [] = []
everyThird [x] = [x]
everyThird [x, _] = [x]
everyThird (x:_:_:xs) = x:everyThird xs
You want to do exactly what you said: iterate over the list and include the element only on each third call. However, there's a problem. Haskell is a funny language where the idea of "changing" a variable doesn't make sense, so the usual approach of "have a counter variable i which tells us whether we're on the third element or not" won't work in the usual way. Instead, we'll create a recursive helper function to maintain the count for us.
everyThird :: [Int] -> [Int]
everyThird xs = helper 0 xs
where helper _ [] = []
helper 0 (x : xs) = x : helper 2 xs
helper n (_ : xs) = helper (n - 1) xs
We have three cases in the helper.
If the list is empty, stop and return the empty list.
If the counter is at 0 (that is, if we're on the third element), make a list starting with the current element and ending with the rest of the computation.
If the counter is not at zero, count down and continue iteration.
Because of the way pattern matching works, it will try these three statements in order.
Notice how we use an additional argument to be the counter variable since we can't mutate the variable like we would in an imperative language. Also, notice how we construct the list recursively; we never "append" to an existing list because that would imply that we're mutating the list. We simply build the list up from scratch and end up with the correct result on the first go round.
Haskell doesn't have classical iteration (i.e. no loops), at least not without monads, but you can use similar logic as you would in a for loop by zipping your list with indexes [0..] and applying appropriate functions from Data.List.
E.g. What you need to do is filter every third element:
everyThirdWithIndexes list = filter (\x -> snd x `mod` 3 == 0) $ zip list [0..]
Of course you have to get rid of the indexes, there are two elegant ways you can do this:
everyThird list = map (fst) . everyThirdWithIndexes list
-- or:
everyThird list = fst . unzip . everyThirdWithIndexes list
If you're not familiar with filter and map, you can define a simple recursion that builds a list from every first element of a list, drops the next two and then adds another from a new function call:
everyThird [] = [] -- both in case if the list is empty and the end case
everyThird (x:xs) = x : everyThird (drop 2 xs)
EDIT: If you have any questions about these solutions (e.g. some syntax that you are not familiar with), feel free to ask in the comments. :)
One classic approach:
everyThird xs = [x | (1,x) <- zip (cycle [1..3]) xs]
You can also use chunksOf from Data.List.Split to seperate the lists into chunks of 3, then just map the first element of each:
import Data.List.Split
everyThird :: [a] -> [a]
everyThird xs = map head $ chunksOf 3 xs
Which works as follows:
*Main> everyThird [1,2,3,4,5]
[1,4]
Note: You may need to run cabal install split to use chunksOf.
I want to write program that takes array of Ints and length and returns array that consist in position i all elements, that equals i, for example
[0,0,0,1,3,5,3,2,2,4,4,4] 6 -> [[0,0,0],[1],[2,2],[3,3],[4,4,4],[5]]
[0,0,4] 7 -> [[0,0],[],[],[],[4],[],[]]
[] 3 -> [[],[],[]]
[2,2] 3 -> [[],[],[2,2]]
So, that's my solution
import Data.List
import Data.Function
f :: [Int] -> Int -> [[Int]]
f ls len = g 0 ls' [] where
ls' = group . sort $ ls
g :: Int -> [[Int]] -> [[Int]] -> [[Int]]
g val [] accum
| len == val = accum
| otherwise = g (val+1) [] (accum ++ [[]])
g val (x:xs) accum
| len == val = accum
| val == head x = g (val+1) xs (accum ++ [x])
| otherwise = g (val+1) (x:xs) (accum ++ [[]])
But query f [] 1000000 works really long, why?
I see we're accumulating over some data structure. I think foldMap. I ask "Which Monoid"? It's some kind of lists of accumulations. Like this
newtype Bunch x = Bunch {bunch :: [x]}
instance Semigroup x => Monoid (Bunch x) where
mempty = Bunch []
mappend (Bunch xss) (Bunch yss) = Bunch (glom xss yss) where
glom [] yss = yss
glom xss [] = xss
glom (xs : xss) (ys : yss) = (xs <> ys) : glom xss yss
Our underlying elements have some associative operator <>, and we can thus apply that operator pointwise to a pair of lists, just like zipWith does, except that when we run out of one of the lists, we don't truncate, rather we just take the other. Note that Bunch is a name I'm introducing for purposes of this answer, but it's not that unusual a thing to want. I'm sure I've used it before and will again.
If we can translate
0 -> Bunch [[0]] -- single 0 in place 0
1 -> Bunch [[],[1]] -- single 1 in place 1
2 -> Bunch [[],[],[2]] -- single 2 in place 2
3 -> Bunch [[],[],[],[3]] -- single 3 in place 3
...
and foldMap across the input, then we'll get the right number of each in each place. There should be no need for an upper bound on the numbers in the input to get a sensible output, as long as you are willing to interpret [] as "the rest is silence". Otherwise, like Procrustes, you can pad or chop to the length you need.
Note, by the way, that when mappend's first argument comes from our translation, we do a bunch of ([]++) operations, a.k.a. ids, then a single ([i]++), a.k.a. (i:), so if foldMap is right-nested (which it is for lists), then we will always be doing cheap operations at the left end of our lists.
Now, as the question works with lists, we might want to introduce the Bunch structure only when it's useful. That's what Control.Newtype is for. We just need to tell it about Bunch.
instance Newtype (Bunch x) [x] where
pack = Bunch
unpack = bunch
And then it's
groupInts :: [Int] -> [[Int]]
groupInts = ala' Bunch foldMap (basis !!) where
basis = ala' Bunch foldMap id [iterate ([]:) [], [[[i]] | i <- [0..]]]
What? Well, without going to town on what ala' is in general, its impact here is as follows:
ala' Bunch foldMap f = bunch . foldMap (Bunch . f)
meaning that, although f is a function to lists, we accumulate as if f were a function to Bunches: the role of ala' is to insert the correct pack and unpack operations to make that just happen.
We need (basis !!) :: Int -> [[Int]] to be our translation. Hence basis :: [[[Int]]] is the list of images of our translation, computed on demand at most once each (i.e., the translation, memoized).
For this basis, observe that we need these two infinite lists
[ [] [ [[0]]
, [[]] , [[1]]
, [[],[]] , [[2]]
, [[],[],[]] , [[3]]
... ...
combined Bunchwise. As both lists have the same length (infinity), I could also have written
basis = zipWith (++) (iterate ([]:) []) [[[i]] | i <- [0..]]
but I thought it was worth observing that this also is an example of Bunch structure.
Of course, it's very nice when something like accumArray hands you exactly the sort of accumulation you need, neatly packaging a bunch of grungy behind-the-scenes mutation. But the general recipe for an accumulation is to think "What's the Monoid?" and "What do I do with each element?". That's what foldMap asks you.
The (++) operator copies the left-hand list. For this reason, adding to the beginning of a list is quite fast, but adding to the end of a list is very slow.
In summary, avoid adding things to the end of a list. Try to always add to the beginning instead. One simple way to do that is to build the list backwards, and then reverse it at the end. A more devious trick is to use "difference lists" (Google it). Another possibility is to use Data.Sequence rather than a list.
The first thing that should be noted is the most obvious way to implement this is use a data structure that allows random access, an array is an obviously choice. Note that you need to add the elements to the array multiple times and somehow "join them".
accumArray is perfect for this.
So we get:
f l i = elems $ accumArray (\l e -> e:l) [] (0,i-1) (map (\e -> (e,e)) l)
And we're good to go (see full code here).
This approach does involve converting the final array back into a list, but that step is very likely faster than say sorting the list, which often involves scanning the list at least a few times for a list of decent size.
Whenever you use ++ you have to recreate the entire list, since lists are immutable.
A simple solution would be to use :, but that builds a reversed list. However that can be fixed using reverse, which results in only building two lists (instead of 1 million in your case).
Your concept of glomming things onto an accumulator is a very useful one, and both MathematicalOrchid and Guvante show how you can use that concept reasonably efficiently. But in this case, there is a simpler approach that is likely also faster. You started with
group . sort $ ls
and this was a very good place to start! You get a list that's almost the one you want, except that you need to fill in some blanks. How can we figure those out? The simplest way, though probably not quite the most efficient, is to work with a list of all the numbers you want to count up to: [0 .. len-1].
So we start with
f ls len = g [0 .. len-1] (group . sort $ ls)
where
?
How do we define g? By pattern matching!
f ls len = g [0 .. len-1] (group . sort $ ls)
where
-- We may or may not have some lists left,
-- but we counted as high as we decided we
-- would
g [] _ = []
-- We have no lists left, so the rest of the
-- numbers are not represented
g ns [] = map (const []) ns
-- This shouldn't be possible, because group
-- doesn't make empty lists.
g _ ([]:_) = error "group isn't working!"
-- Finally, we have some work to do!
g (n:ns) xls#(xl#(x:_):xls')
| n == x = xl : g ns xls'
| otherwise = [] : g ns xls
That was nice, but making the list of numbers isn't free, so you might be wondering how you can optimize it. One method I invite you to try is using your original technique of keeping a separate counter, but following this same sort of structure.
I'm trying to define a functino that finds the minimum distance between to neighbor numbers on a list
something like this:
minNeighborsDistance [2,3,6,2,0,1,9,8] => 1
My code looks like this:
minNeighborsDistance [] = []
minNeighborsDistance (x:xs) = minimum[minNeighborsDistance xs ++ [subtract x (head xs)]]
Although this seems to run, once I enter a list I receive an Exception error.
I'm new to Haskell I would appreciate any help in this matter.
If you pass a singleton list to minNeighborsDistance then
It'll fail to match [] in the first line, then
it'll successfully match (x:xs) assigning the single value to x and the empty like to xs, then
it'll throw an error when you try to access the head of an empty list.
Further, since you call minNeighborsDistance recursively then you'll always eventually call it on a singleton list excepting when you pass it an empty list.
Here's what I came up with:
minDistance l = minimum . map abs . zipWith (-) l $ tail l
Try this:
minDistance list = minimum (distance list)
where
distance list = map abs $ zipWith (-) list (tail list)
distance calculates the absolute value of the list being subtracted with itself shifted by 1 position:
[2,3,6,2,0,1,9,8] -- the 8 is skipped but it does not make a difference
- [3,6,2,0,1,9,8]
= [1,3,4,2,1,8,1]
minDistance now just gets the smallest element of the resulting list.
Your question is a bit unclear (a type signature would really help here), but if you're wanting to calculate the difference between adjacent elements of the list, then find the minimum of those numbers, I would say the most clear way is to use some extra pattern matching:
-- Is this type you want the function to have?
minNeighborsDistance :: [Int] -> Int
minNeighborsDistance list = minimum $ go list
where
go (x:y:rest) = (x - y) : go (y:rest)
go anythingElse = [] -- Or just go _ = []
However, this won't quite give you the answer you want, because the actual minimum for your example list would be -4 when you go from 6 to 2. But this is an easy fix, just apply abs:
minNeighborsDistance :: [Int] -> Int
minNeighborsDistance list = minimum $ go list
where
go (x:y:rest) = abs (x - y) : go (y:rest)
go anythingElse = []
I've used a helper function to calculate the differences from element to element, then the top-level definition calls minimum on that result to get the final answer.
There is an easier way, though, if you exploit a few functions in Prelude, namely zipWith, map, and drop:
minNeighborsDistance :: [Int] -> Int
minNeighborsDistance list
= minimum -- Calculates the minimum of all the distances
$ (maxBound:) -- Ensures we have at least 1 number to pass to
-- minimum by consing the maximum possible Int
$ map abs -- Ensure all differences are non-negative
-- Compute the difference between each element. I use "drop 1"
-- instead of tail because it won't error on an empty list
$ zipWith (-) list (drop 1 list)
So combined into one line without comments:
minNeighborsDistance list = minimum $ (maxBound:) $ map abs $ zipWith (-) list $ drop 1 list
I want to convert [1,2,3,4] to [[1 2] [2 3] [3 4]] or [(1 2) (2 3) (3 4)]. In clojure I have (partition 2 1 [1,2,3,4]). How can I do it in haskell? I suspect there is such function in standard api but I can't find it.
The standard trick for this is to zip the list with it's own tail:
> let xs = [1,2,3,4] in zip xs (tail xs)
[(1,2),(2,3),(3,4)]
To see why this works, line up the list and its tail visually.
xs = 1 : 2 : 3 : 4 : []
tail xs = 2 : 3 : 4 : []
and note that zip is making a tuple out of each column.
There are two more subtle reasons why this always does the right thing:
zip stops when either list runs out of elements. That makes sense here since we can't have an "incomplete pair" at the end and it also ensures that we get no pairs from a single element list.
When xs is empty, one might expect tail xs to throw an exception. However, because zip
checks its first argument first, when it sees that it's the empty list, the second argument
is never evaluated.
Everything above also holds true for zipWith, so you can use the same method whenever you need to apply a function pairwise to adjacent elements.
For a generic solution like Clojure's partition, there is nothing in the standard libraries. However, you can try something like this:
partition' :: Int -> Int -> [a] -> [[a]]
partition' size offset
| size <= 0 = error "partition': size must be positive"
| offset <= 0 = error "partition': offset must be positive"
| otherwise = loop
where
loop :: [a] -> [[a]]
loop xs = case splitAt size xs of
-- If the second part is empty, we're at the end. But we might
-- have gotten less than we asked for, hence the check.
(ys, []) -> if length ys == size then [ys] else []
(ys, _ ) -> ys : loop (drop offset xs)
Just to throw another answer out there using a different approach:
For n=2 you want to simply zip the list with its tail. For n=3 you want to zip the list with its tail and with the tail of its tail. This pattern continues further, so all we have to do is generalise it:
partition n = sequence . take n . iterate tail
But this only works for an offset of 1. To generalise the offsets we just have to look at the genrated list. It will always have the form:
[[1..something],[2..something+1],..]
So all left to do is select every offsetth element and we should be fine. I shamelessy stole this version from #ertes from this question:
everyNth :: Int -> [a] -> [a]
everyNth n = map head . takeWhile (not . null) . iterate (drop n)
The entire function now becomes:
partition size offset = everyNth offset . sequence . take size . iterate tail
Sometimes is best to roll your own. Recursive functions are what gives LisP its power and appeal. Haskell tries to discourage them but too often a solution is best achieved with a recursive function. They are often quite simple as is this one to produce pairs.
Haskell pattern matching reduces code. This could easily be changed by changing only the pattern to (x:y:yys) to produce (a,b), (c,d), (e,f).
> prs (x:yys#(y:_)) = (x,y):prs yys
> prs "abcdefg"
[('a','b'),('b','c'),('c','d'),('d','e'),('e','f'),('f','g')
test :: [String] -> [String]
test = foldr step []
where step x ys
| elem x ys = x : ys
| otherwise = ys
I am trying to build a new list consisting of all the distinct strings being input. My test data is:
test ["one", "one", "two", "two", "three"]
expected result:
["one", "two", "three"]
I am new to Haskell, and I am sure that I am missing something very fundamental and obvious, but have run out of ways to explore this. Could you provide pointers to where my thinking is deficient?
The actual response is []. It seems that the first guard condition is never met (if I replace it with True, the original list is replicated), so the output list is never built.
My understanding was that the fold would accumulate the result of step on each item of the list, adding it to the empty list. I anticipated that step would test each item for its inclusion in the output list (the first element tested not being there) and would add anything that was not already included to the output list. Obviously not :-)
Your reasoning is correct: you just need to switch = x : ys and = ys so that you add the x when it's not an element of ys. Also, Data.List.nub does this exact thing.
Think about it: your code is saying "when x is in the remainder, prepend x to the result", i.e. creating a duplicate. You just need to change it to "when x is not in the remainder, prepend x to the result" and you get the correct function.
This function differs from Data.List.nub in an important way: this function is more strict. Thus:
test [1..] = _|_ -- infinite loop (try it)
nub [1..] = [1..]
nub gives the answer correctly for infinite lists -- this means that it doesn't need the whole list to start computing results, and thus it is a nice player in the stream processing game.
The reason it is strict is because elem is strict: it searches the whole list (presuming it doesn't find a match) before it returns a result. You could write that like this:
nub :: (Eq a) => [a] -> [a]
nub = go []
where
go seen [] = []
go seen (x:xs) | x `elem` seen = go seen xs
| otherwise = x : go (x:seen) xs
Notice how seen grows like the output so far, whereas yours grows like the remainder of the output. The former is always finite (starting at [] and adding one at a time), whereas the latter may be infinite (eg. [1..]). So this variant can yield elements more lazily.
This would be faster (O(n log n) instead of O(n^2)) if you used a Data.Set instead of a list for seen. But it adds an Ord constraint.