Dynamic number of list comprehension items - haskell

I'm trying to get permutations of a variable number of strings in a list. I'm sure this is possible in Haskell; I'm just having a hard time finding a reference for it.
I'm looking to be able to do this: [ [n1] ++ [n2] ++ etc | n1 <- {first string}, n2 <- {second string}, etc ]
Where my list might be ["hey", "now"]
and my output would look like this:
["hn","ho","hw","en","eo","ew","yn","yo","yw"]
How would I go about doing something like that?

> sequence ["hey", "now"]
["hn","ho","hw","en","eo","ew","yn","yo","yw"]
sequence is very general, but on lists you can think of it as if it were defined as follows:
sequence :: [[a]] -> [[a]]
sequence [] = [[]]
sequence (x:xs) = [ y:ys | y <- x, ys <- sequence xs ]
The result above is sometimes called the "cartesian product" of a list of lists, since it is similar to that operation on sets.
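As a rough sketch of the same idea, the recursion above can also be written as a right fold. The name cartesian is made up for this illustration; in real code sequence already does the job.

```haskell
-- Hypothetical helper: the list-specialised behaviour of sequence,
-- written as a right fold over the list of lists.
cartesian :: [[a]] -> [[a]]
cartesian = foldr (\xs rest -> [ y : ys | y <- xs, ys <- rest ]) [[]]
```

cartesian ["hey", "now"] gives the same nine strings as sequence ["hey", "now"], and the same definition handles any number of input strings. If any input string is empty, the result is empty too.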

EDIT: This only works for a list of exactly two strings, but it shows the desugaring of the list comprehension (since, on lists, join is concat and fmap is map, if I recall).
Here's a brute force way of doing it (if you'd like to know a possible approach). If you'd like the clean version, please see chi's answer.
concat $ map (\char1 -> map (\char2 -> char1:[char2]) string2) string1 should do it. There might be a better way with list comprehensions, but this does the job too.
Explanation:
concat $                        -- Flatten the list of lists
  map (\char1 ->                -- Iterate over each character of string1
    map (\char2 ->              -- Iterate over each character of string2
      char1 : [char2]           -- Build the two-character string [char1, char2]
    ) string2
  ) string1
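For comparison, here is a sketch of the same two-string case written with the list monad's bind. The name pairs is hypothetical, and like the expression above it is limited to exactly two strings.

```haskell
-- Hypothetical helper for exactly two strings.
-- On lists, xs >>= f gives the same result as concatMap f xs,
-- so this nests the same way as the concat/map version.
pairs :: String -> String -> [String]
pairs string1 string2 =
  string1 >>= \char1 ->
    string2 >>= \char2 ->
      return [char1, char2]
```

pairs "hey" "now" produces the same list as the concat/map expression applied to the same two strings.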

Related

How to conditionally insert elements into list?

Say I have a list of Strings
["hello", "xbox", "blue"]
And now I want to "insert" (as in create a new immutable list) linefeed characters into the list, but only if the word before ends on a vowel, e.g. a function that creates the following list:
["hello", "\n", "xbox", "blue", "\n"]
What's the most elegant/direct way of doing this in haskell?
One way of doing this would be using do-notation. do-notation on the list monad is a lot like list comprehension, but it also allows you to 'return' multiple elements. Here's my implementation:
solution1 :: [String] -> [String]
solution1 strings = do
  str <- strings             -- Go through each element of the list
  if last str `elem` "aeiou"
    then [str, "\n"]         -- 'Replace' that element with the element then a newline
    else [str]               -- Do nothing.
But this is a bit of a weird way of going about things, especially if you're a beginner. The usual way would be recursion, so let's do that instead:
solution2 :: [String] -> [String]
solution2 [] = [] -- Base case: empty list.
solution2 (x:xs) =                 -- Inductive case: non-empty list.
  if last x `elem` "aeiou"
    then x : "\n" : solution2 xs   -- Recur, reconstructing as we go.
    else x : solution2 xs          -- Recur, this time with no extra newline.
In reality though, these do basically the same thing: do-notation on lists is just an abstraction of the second method.
Something to consider: I used the last function, but this will fail on empty strings. How could you fix that?
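One possible answer, as a minimal sketch: pull the vowel test into a total helper and map it over the list with concatMap. The names endsWithVowel and insertNewlines are made up here, and treating the empty string as "does not end in a vowel" is an assumption the question does not settle.

```haskell
-- Hypothetical total replacement for the partial `last`-based test.
-- Assumption: an empty string counts as not ending in a vowel.
endsWithVowel :: String -> Bool
endsWithVowel [] = False
endsWithVowel s  = last s `elem` "aeiou"

-- Each string is replaced by either itself, or itself plus a newline.
insertNewlines :: [String] -> [String]
insertNewlines = concatMap step
  where
    step s
      | endsWithVowel s = [s, "\n"]
      | otherwise       = [s]
```

With this, insertNewlines ["hello", "xbox", "blue"] gives ["hello", "\n", "xbox", "blue", "\n"], and an empty string in the input passes through unchanged instead of raising an error.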

Generate all changes of signs

I have many lists. Say for short [[1,2],[3,4]].
I need to generate all changes of signs of each element. Thus, for the short example, the result would be
[[1,2],[3,4],[-1,2],[1,-2],[-1,-2],[-3,4],[3,-4],[-3,-4]]
Is there a package to perform such an operation? Otherwise, what algorithm could I use? (I confess I have not thought a lot about it ...).
If this helps, all my lists have the same length.
Edit
Hmm.. maybe an idea like that:
x = [[2*i,2*j] | i <- [1, -1], j <- [-1,1]]
x
[[2,-2],[2,2],[-2,-2],[-2,2]]
The problem can be broken down into 2 steps:
1. For a given list of numbers, generate all the possible sign combinations.
2. For the list of lists, apply the function from (1) to each list, then concat the results.
For 1. you can write a simple recursive function that first processes the tail of the list, then for each resulting combination, it generates two versions for the two signs.
signs :: [Int] -> [[Int]]
signs [] = [[]]
signs (x : xs)
  = let ps = signs xs
    in map (x :) ps ++ map ((-x) :) ps
For 2. simply map the signs function over the input, and concat them. This is what the concatMap function does:
signsAll :: [[Int]] -> [[Int]]
signsAll = concatMap signs
I tried to use an applicative here, because it looks like you are almost there
[id,negate] <*> [3,4]
but it turned out that I need sequence and map, which, in this case, can be combined into a traverse:
traverse (\x->[x,-x]) [3,4]
[[3,4],[3,-4],[-3,4],[-3,-4]]
As others mentioned, now you need concatMap for your function:
concatMap (traverse (\x->[x,-x])) [[3,4],[1,2]]
[[3,4],[3,-4],[-3,4],[-3,-4],[1,2],[1,-2],[-1,2],[-1,-2]]
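To see how the recursive and the traverse-based answers line up, here is a small side-by-side sketch. signsT and signsDistinct are names invented for this comparison; the nub step is an optional extra for inputs containing 0, which the question does not mention.

```haskell
import Data.List (nub)

-- The recursive version from the earlier answer.
signs :: [Int] -> [[Int]]
signs []       = [[]]
signs (x : xs) = let ps = signs xs in map (x :) ps ++ map (negate x :) ps

-- The traverse version: each element independently picks x or -x.
signsT :: [Int] -> [[Int]]
signsT = traverse (\x -> [x, negate x])

-- Hypothetical variant: a 0 in the input yields repeated rows
-- (0 and -0 are equal), so drop the repeats.
signsDistinct :: [Int] -> [[Int]]
signsDistinct = nub . signsT
```

Both functions return the sign combinations in the same order, so concatMap signs and concatMap signsT agree. The rows match the result asked for in the question, just in a different order.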

Program that counts Ints runs for a very long time

I want to write a program that takes a list of Ints and a length, and returns a list whose position i holds all the elements equal to i, for example:
[0,0,0,1,3,5,3,2,2,4,4,4] 6 -> [[0,0,0],[1],[2,2],[3,3],[4,4,4],[5]]
[0,0,4] 7 -> [[0,0],[],[],[],[4],[],[]]
[] 3 -> [[],[],[]]
[2,2] 3 -> [[],[],[2,2]]
So, that's my solution
import Data.List
import Data.Function
f :: [Int] -> Int -> [[Int]]
f ls len = g 0 ls' []
  where
    ls' = group . sort $ ls
    g :: Int -> [[Int]] -> [[Int]] -> [[Int]]
    g val [] accum
      | len == val = accum
      | otherwise  = g (val+1) [] (accum ++ [[]])
    g val (x:xs) accum
      | len == val    = accum
      | val == head x = g (val+1) xs (accum ++ [x])
      | otherwise     = g (val+1) (x:xs) (accum ++ [[]])
But the query f [] 1000000 takes a really long time. Why?
I see we're accumulating over some data structure. I think foldMap. I ask "Which Monoid"? It's some kind of lists of accumulations. Like this
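newtype Bunch x = Bunch {bunch :: [x]}
-- Since GHC 8.4 (base 4.11), Semigroup is a superclass of Monoid,
-- so the combining operation is defined as (<>) and mappend defaults to it.
instance Semigroup x => Semigroup (Bunch x) where
  Bunch xss <> Bunch yss = Bunch (glom xss yss) where
    glom [] yss = yss
    glom xss [] = xss
    glom (xs : xss) (ys : yss) = (xs <> ys) : glom xss yss
instance Semigroup x => Monoid (Bunch x) where
  mempty = Bunch []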
Our underlying elements have some associative operator <>, and we can thus apply that operator pointwise to a pair of lists, just like zipWith does, except that when we run out of one of the lists, we don't truncate, rather we just take the other. Note that Bunch is a name I'm introducing for purposes of this answer, but it's not that unusual a thing to want. I'm sure I've used it before and will again.
If we can translate
0 -> Bunch [[0]] -- single 0 in place 0
1 -> Bunch [[],[1]] -- single 1 in place 1
2 -> Bunch [[],[],[2]] -- single 2 in place 2
3 -> Bunch [[],[],[],[3]] -- single 3 in place 3
...
and foldMap across the input, then we'll get the right number of each in each place. There should be no need for an upper bound on the numbers in the input to get a sensible output, as long as you are willing to interpret [] as "the rest is silence". Otherwise, like Procrustes, you can pad or chop to the length you need.
Note, by the way, that when mappend's first argument comes from our translation, we do a bunch of ([]++) operations, a.k.a. ids, then a single ([i]++), a.k.a. (i:), so if foldMap is right-nested (which it is for lists), then we will always be doing cheap operations at the left end of our lists.
Now, as the question works with lists, we might want to introduce the Bunch structure only when it's useful. That's what Control.Newtype is for. We just need to tell it about Bunch.
instance Newtype (Bunch x) [x] where
  pack = Bunch
  unpack = bunch
And then it's
groupInts :: [Int] -> [[Int]]
groupInts = ala' Bunch foldMap (basis !!) where
  basis = ala' Bunch foldMap id [iterate ([]:) [], [[[i]] | i <- [0..]]]
What? Well, without going to town on what ala' is in general, its impact here is as follows:
ala' Bunch foldMap f = bunch . foldMap (Bunch . f)
meaning that, although f is a function to lists, we accumulate as if f were a function to Bunches: the role of ala' is to insert the correct pack and unpack operations to make that just happen.
We need (basis !!) :: Int -> [[Int]] to be our translation. Hence basis :: [[[Int]]] is the list of images of our translation, computed on demand at most once each (i.e., the translation, memoized).
For this basis, observe that we need these two infinite lists
[ []            [ [[0]]
, [[]]          , [[1]]
, [[],[]]       , [[2]]
, [[],[],[]]    , [[3]]
...             ...
combined Bunchwise. As both lists have the same length (infinity), I could also have written
basis = zipWith (++) (iterate ([]:) []) [[[i]] | i <- [0..]]
but I thought it was worth observing that this also is an example of Bunch structure.
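For anyone who wants to try this without the newtype package, here is a dependency-free sketch that writes out by hand what ala' inserts. The names groupIntsPlain and padTo are invented for this sketch, the Semigroup instance is spelled out so it compiles on current GHC, and inputs are assumed to be non-negative.

```haskell
newtype Bunch x = Bunch { bunch :: [x] }

-- Pointwise (<>) that keeps the longer tail instead of truncating.
instance Semigroup x => Semigroup (Bunch x) where
  Bunch xss <> Bunch yss = Bunch (glom xss yss)
    where
      glom [] ys = ys
      glom xs [] = xs
      glom (x : xs) (y : ys) = (x <> y) : glom xs ys

instance Semigroup x => Monoid (Bunch x) where
  mempty = Bunch []

-- Translation table: basis !! i is i empty slots followed by [i].
basis :: [[[Int]]]
basis = zipWith (++) (iterate ([] :) []) [ [[i]] | i <- [0 ..] ]

-- Assumes every input is non-negative; (!!) fails on a negative index.
groupIntsPlain :: [Int] -> [[Int]]
groupIntsPlain = bunch . foldMap (Bunch . (basis !!))

-- Hypothetical helper: pad with empty buckets, or chop, to exactly len entries.
padTo :: Int -> [[Int]] -> [[Int]]
padTo len xss = take len (xss ++ repeat [])
```

For example, groupIntsPlain [0,0,4] gives [[0,0],[],[],[],[4]], and padTo 7 of that gives the seven-bucket result from the question.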
Of course, it's very nice when something like accumArray hands you exactly the sort of accumulation you need, neatly packaging a bunch of grungy behind-the-scenes mutation. But the general recipe for an accumulation is to think "What's the Monoid?" and "What do I do with each element?". That's what foldMap asks you.
The (++) operator copies the left-hand list. For this reason, adding to the beginning of a list is quite fast, but adding to the end of a list is very slow.
In summary, avoid adding things to the end of a list. Try to always add to the beginning instead. One simple way to do that is to build the list backwards, and then reverse it at the end. A more devious trick is to use "difference lists" (Google it). Another possibility is to use Data.Sequence rather than a list.
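As a sketch of the "build it backwards, then reverse" suggestion applied to the question's code: the walk over the counter is the same, but each bucket is consed onto the front of the accumulator. The name fRev is hypothetical, and the stop test uses >= rather than == so that a negative length terminates; that is a small deviation from the original.

```haskell
import Data.List (group, sort)

-- Hypothetical rewrite of the question's f. Each step is an O(1) cons
-- onto the accumulator, and the accumulator is reversed once at the end.
fRev :: [Int] -> Int -> [[Int]]
fRev ls len = reverse (go 0 (group (sort ls)) [])
  where
    go :: Int -> [[Int]] -> [[Int]] -> [[Int]]
    go val groups acc
      | val >= len = acc                     -- counted high enough: stop
    go val (g@(x : _) : gs) acc
      | val == x = go (val + 1) gs (g : acc) -- this group belongs at val
    go val groups acc = go (val + 1) groups ([] : acc) -- nothing at val
```

With this shape, fRev [] 1000000 does a million cheap conses and one reverse, rather than a million appends that each copy the accumulator.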
The first thing that should be noted is that the most obvious way to implement this is to use a data structure that allows random access; an array is an obvious choice. Note that you need to add elements to the same array slot multiple times and somehow "join them".
accumArray is perfect for this.
So we get:
f l i = elems $ accumArray (\l e -> e:l) [] (0,i-1) (map (\e -> (e,e)) l)
And we're good to go.
This approach does involve converting the final array back into a list, but that step is very likely faster than, say, sorting the list, which often involves scanning the list at least a few times for a list of decent size.
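A self-contained sketch of the accumArray approach, with the import it needs. The name buckets is made up, and filtering out values outside 0 .. len-1 is an assumption added here, since accumArray raises an error on out-of-range indices.

```haskell
import Data.Array (accumArray, elems)

-- Hypothetical name. Each value x is filed under index x; flip (:) conses
-- it onto whatever that slot already holds, starting from [].
-- Assumption: values outside 0 .. len-1 are silently dropped.
buckets :: [Int] -> Int -> [[Int]]
buckets xs len =
  elems (accumArray (flip (:)) [] (0, len - 1) [ (x, x) | x <- xs, x >= 0, x < len ])
```

Consing reverses the order within each slot, but since every element of a slot is the same number, that is not observable here.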
Whenever you use ++ you have to recreate the entire list, since lists are immutable.
A simple solution would be to use :, but that builds a reversed list. However that can be fixed using reverse, which results in only building two lists (instead of 1 million in your case).
Your concept of glomming things onto an accumulator is a very useful one, and both MathematicalOrchid and Guvante show how you can use that concept reasonably efficiently. But in this case, there is a simpler approach that is likely also faster. You started with
group . sort $ ls
and this was a very good place to start! You get a list that's almost the one you want, except that you need to fill in some blanks. How can we figure those out? The simplest way, though probably not quite the most efficient, is to work with a list of all the numbers you want to count up to: [0 .. len-1].
So we start with
f ls len = g [0 .. len-1] (group . sort $ ls)
  where
    ?
How do we define g? By pattern matching!
f ls len = g [0 .. len-1] (group . sort $ ls)
  where
    -- We may or may not have some lists left,
    -- but we counted as high as we decided we
    -- would
    g [] _ = []
    -- We have no lists left, so the rest of the
    -- numbers are not represented
    g ns [] = map (const []) ns
    -- This shouldn't be possible, because group
    -- doesn't make empty lists.
    g _ ([]:_) = error "group isn't working!"
    -- Finally, we have some work to do!
    g (n:ns) xls@(xl@(x:_):xls')
      | n == x = xl : g ns xls'
      | otherwise = [] : g ns xls
That was nice, but making the list of numbers isn't free, so you might be wondering how you can optimize it. One method I invite you to try is using your original technique of keeping a separate counter, but following this same sort of structure.
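One way that exercise might turn out, as a sketch only: an Int counter stands in for the list [0 .. len-1], and the result is still produced lazily from the front. The name fCounter is invented here.

```haskell
import Data.List (group, sort)

-- Hypothetical counter-based variant: no explicit list of numbers and
-- no accumulator; each bucket is emitted as soon as it is known.
fCounter :: [Int] -> Int -> [[Int]]
fCounter ls len = go 0 (group (sort ls))
  where
    go :: Int -> [[Int]] -> [[Int]]
    go n _
      | n >= len = []                  -- counted as high as requested
    go n (xl@(x : _) : xls)
      | n == x = xl : go (n + 1) xls   -- the next group belongs at n
    go n xls = [] : go (n + 1) xls     -- nothing to put at n
```

The third clause also covers the case where no groups are left, so the remaining positions are filled with empty lists.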

Haskell: Split a list using list comprehension

How do you split a list into halves using list comprehension?
e.g. If I have [1,1,2,2,3,3,4,4,5,5] and I only want [1,1,2,2,3]
my attempts so far:
half mylist = [r | mylist!r ; r <- [0..(#mylist div 2)] ] ||does not work
Any thoughts?
(NB: This isn't actually Haskell, but a similar language. ! indexes a list and # gives its length.)
Edit:
Okay so it turns out that
half mylist = [r | r <- [mylist!0..mylist!(#mylist div 2)] ]
works, but only in list of numbers and not strings. Any clues?
This isn't really an appropriate thing to do with a list comprehension. List comprehensions are alternate syntax for maps and filters (and zips). Splitting a list is a fold.
As such, you should consider a different approach. E.g.
halve :: [a] -> [a]
halve [] = []
halve xs = take (n `div` 2) xs
  where n = length xs
Splitting isn't a great operation on large lists, since you take the length first (so it always costs n + n/2 operations on the list). It is more appropriate for array-like types that have O(1) length and split.
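If the separate length call is the concern, one alternative (not what the answer above does) is to walk the list with two cursors in a single pass. This is only a sketch, and firstHalf is a made-up name; the total work is still linear, but the output is produced lazily and length is never called.

```haskell
-- Hypothetical name. The first cursor advances one element per step and
-- the second advances two, so output stops after length xs `div` 2
-- elements, the same count as halve.
firstHalf :: [a] -> [a]
firstHalf xs = go xs xs
  where
    go (y : slow) (_ : _ : fast) = y : go slow fast
    go _ _ = []
```

firstHalf [1,1,2,2,3,3,4,4,5,5] gives [1,1,2,2,3], the same as halve.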
Another possible solution, using a boolean guard:
half xs = [x | (x,i) <- zip xs [1..], let m = length xs `div` 2, i <= m]
But as Don Stewart says, a list comprehension is not really the right tool for this job.

Can you create more than one element of a list at a time with a list comprehension in haskell?

So, for example, say I had a list of numbers and I wanted to create a list that contained each number multiplied by 2 and 3. Is there any way to do something like the following, but get back a single list of numbers instead of a list of lists of numbers?
mult_nums = [ [(n*2),(n*3)] | n <- [1..5]]
-- this returns [[2,3],[4,6],[6,9],[8,12],[10,15]]
-- but we want [2,3,4,6,6,9,8,12,10,15]
I find that extending the list comprehension makes this easier to read:
[ m | n <- [1..5], m <- [2*n,3*n] ]
It might be helpful to examine exactly what this does, and how it relates to other solutions. Let's define it as a function:
mult lst = [ m | n <- lst, m <- [2*n,3*n] ]
After a fashion, this desugars to
mult' lst =
  concatMap (\n -> concatMap (\m -> [m]) [2*n,3*n]) lst
The expression concatMap (\m -> [m]) is wrapping m up in a list in order to immediately flatten it—it is equivalent to map id.
Compare this to @FunctorSalad's answer:
mult1 lst = concatMap (\n -> [n*2,n*3]) lst
We've optimized away concatMap (\m -> [m]).
Now @vili's answer:
mult2 lst = concat [ [(n*2),(n*3)] | n <- lst]
This desugars to:
mult2' lst = concat (concatMap (\n -> [[2*n,3*n]]) lst)
As in the first solution above, we are unnecessarily creating a list of lists that we have to concat away.
I don't think there is a solution that uses list comprehensions, but desugars to mult1. My intuition is that Haskell compilers are generally clever enough that this wouldn't matter (or, alternatively, that unnecessary concats are cheap due to lazy evaluation (whereas they're lethal in eager languages)).
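Not a list comprehension, but possibly the closest relative: the list monad's bind, or do-notation, gives the same result as mult1 without the inner singleton wrapping. The names multBind and multDo are made up for this sketch, and how GHC actually compiles each form is not something this example demonstrates.

```haskell
-- On lists, xs >>= f gives the same result as concatMap f xs.
multBind :: [Int] -> [Int]
multBind lst = lst >>= \n -> [n * 2, n * 3]

-- The same thing in do-notation: the last line is the list of
-- results contributed by each n.
multDo :: [Int] -> [Int]
multDo lst = do
  n <- lst
  [n * 2, n * 3]
```

Both give [2,3,4,6,6,9,8,12,10,15] for [1..5], matching the comprehension and concatMap versions.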
You could use concat:
concat [ [(n*2),(n*3)] | n <- [1..5]]
output: [2,3,4,6,6,9,8,12,10,15]
In some similar cases concatMap can also be convenient, though here it doesn't change much:
concatMap (\n -> [n*2,n*3]) [1..5]
