How to implement the subsequences iterator in Rust?

I want to implement an iterator that produces all the subsequences of an input sequence. Some examples:
subsequences "abc"
["","a","b","ab","c","ac","bc","abc"]
subsequences [1,2]
[[],[1],[2],[1,2]]
subsequences [1,2,3]
[[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]]
subsequences [1,2,3,4]
[[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3],[4],[1,4],[2,4],[1,2,4],[3,4],[1,3,4],[2,3,4],[1,2,3,4]]
The Haskell implementation of this is very straightforward:
subsequences :: [a] -> [[a]]
subsequences xs = [] : nonEmptySubsequences xs
nonEmptySubsequences :: [a] -> [[a]]
nonEmptySubsequences [] = []
nonEmptySubsequences (x:xs) = [x] : foldr f [] (nonEmptySubsequences xs)
  where f ys r = ys : (x : ys) : r
I just cannot seem to figure out how to recreate this in Rust. I figure it should have the following signature so that it can produce very long sequences without unnecessary memory allocations.
fn subsequences<A: Copy>(xs: &[A]) -> impl Iterator<Item=impl Iterator<Item=A>>;
Any guidance?

fn subsequences<A: Copy>(xs: &[A]) -> impl Iterator<Item = impl Iterator<Item = A> + '_> {
    // Return all subsequences of a given sequence.
    // Bit j of the counter i decides whether xs[j] is part of the i-th subsequence.
    let n = xs.len();
    (0..1 << n).map(move |i| {
        (0..n).filter(move |j| i & (1 << j) != 0).map(move |j| xs[j])
    })
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_subseq() {
        let xs = [1, 2, 3];
        let mut it = subsequences(&xs);
        for _ in 0..8 {
            println!("{:?}", it.next().unwrap().collect::<Vec<_>>());
        }
    }
}
Test Output:
[]
[1]
[2]
[1, 2]
[3]
[1, 3]
[2, 3]
[1, 2, 3]
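One caveat with the bitmask approach is the width of the counter. As far as I can tell, the integer literal in 1 << n falls back to i32 here, so the code above only covers fairly short inputs. Below is a minimal sketch of the same idea with an explicit u64 mask; the name subsequences_u64 and the decision to reject inputs of 64 or more elements with an assert are my own assumptions, not part of the answer above.

```rust
/// Same bitmask idea as above, but with an explicit `u64` counter.
/// Assumption: inputs with 64 or more elements are rejected, not supported.
fn subsequences_u64<'a, A: Copy>(
    xs: &'a [A],
) -> impl Iterator<Item = impl Iterator<Item = A> + 'a> + 'a {
    let n = xs.len();
    assert!(n < 64, "a u64 mask covers at most 63 elements");
    (0u64..(1u64 << n)).map(move |mask| {
        // Bit j of `mask` decides whether xs[j] belongs to this subsequence.
        (0..n)
            .filter(move |&j| (mask >> j) & 1 == 1)
            .map(move |j| xs[j])
    })
}
```

Nothing is allocated per subsequence; the caller decides whether to collect each inner iterator into a Vec.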

This looks like what itertools' powerset does, although it yields the subsets ordered by length rather than in the order of the Haskell version (playground):
use itertools::Itertools;
fn main() {
    for s in ['a', 'b', 'c'].into_iter().powerset() {
        println!("{:?}", s);
    }
}
[]
['a']
['b']
['c']
['a', 'b']
['a', 'c']
['b', 'c']
['a', 'b', 'c']
The returned value is a Powerset<I> (I being the input iterator type), which implements Iterator<Item = Vec<<I as Iterator>::Item>>. Skimming through the implementation, it looks like it does not precompute the results but produces them on the fly.

Your Haskell code and what you want seem to differ: isn't the Haskell version missing the empty subset (returning [[a]] rather than [[], [a]])?
Anyway, a more or less direct translation to Rust would be something like this (playground):
fn subsequences<A: Copy>(xs: &[A]) -> Vec<Vec<A>> {
    match xs {
        [] => vec![],
        [x] => vec![vec![*x], vec![]],
        [xs @ .., x] => {
            subsequences(xs).into_iter()
                .fold(vec![],
                    |mut r, ys| {
                        // Emit each shorter subsequence twice: with and without x.
                        let mut ys0 = ys.clone();
                        ys0.push(*x);
                        r.push(ys0);
                        r.push(ys);
                        r
                    })
        }
    }
}
fn main() {
    for s in subsequences(&['a', 'b', 'c']) {
        println!("{:?}", s);
    }
}
['a', 'b', 'c']
['a', 'b']
['a', 'c']
['a']
['b', 'c']
['b']
['c']
[]
Note that Haskell lists are cheapest to extend at the front, while a Rust Vec is cheapest to extend at the end, so the order of the output is inverted.
Avoiding the allocations is not easy, because the values you return have to be stored somewhere. Maybe you could return a value that implements Iterator<Item = impl Iterator<Item = A>> and avoids the allocations, but that would require a very different algorithm.

Related

99 Haskell Questions #9

https://wiki.haskell.org/99_questions/1_to_10
regarding the solution to Question 9
pack :: Eq a => [a] -> [[a]] -- problem 9
pack [] = []
pack (x:xs) = (x:first) : pack rest
  where
    getReps [] = ([], [])
    getReps (y:ys)
      | y == x = let (f,r) = getReps ys in (y:f, r)
      | otherwise = ([], (y:ys))
    (first,rest) = getReps xs
--input
pack ['a', 'a', 'a', 'a', 'b', 'c', 'c', 'a','a', 'd', 'e', 'e', 'e', 'e']
--output
["aaaa","b","cc","aa","d","eeee"]
What's going on here: (x:first)? I can see that rest is being passed to pack, but I don't understand the bit before that.
getReps is a function specific to each iteration of pack, as it closes over the current values of x and xs.
The first call to pack "aaaabccaadeeee" results in a call to getReps "aaabccaadeeee", which returns ("aaa", "bccaadeeee"). Effectively, pack itself splits off the first 'a' (the value of x), and getReps separates the rest of the 'a's from the remainder of xs. Thus, the final "value" of pack "aaaabccaadeeee" is ('a':"aaa") : pack "bccaadeeee".
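For readers coming from the Rust question at the top, the same split can be sketched iteratively. This is only a rough analogue under my own assumptions (the function name pack and the slice-based signature are mine): the first element plays the role of x, counting its repetitions plays the role of getReps, and the loop continues with the remainder, like pack rest.

```rust
/// Groups consecutive equal elements, mirroring the Haskell `pack`.
fn pack<T: PartialEq + Clone>(xs: &[T]) -> Vec<Vec<T>> {
    let mut groups = Vec::new();
    let mut rest = xs;
    while let Some((x, tail)) = rest.split_first() {
        // Count how many of the following elements equal `x` (the job of `getReps`).
        let run = tail.iter().take_while(|y| *y == x).count();
        // `x` plus its repetitions form one group: the `(x:first)` part.
        groups.push(rest[..=run].to_vec());
        // Continue with what is left: the `pack rest` part.
        rest = &tail[run..];
    }
    groups
}
```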

Haskell Recursive HashMap data structure of arbitrary depth

With this Python function:
def mut_add_to_tree(text, tree):
    tree_ = tree
    for i, c in enumerate(text):
        if c in tree_:
            tree_[c][0] += 1
            tree_ = tree_[c][1]
        else:
            for c_ in text[i:]:
                tree_[c_] = [1, {}]
                tree_ = tree_[c_][1]
            break
a data structure of nested dicts is created, like this:
In [15]: tree = {}
In [16]: mut_add_to_tree("cat", tree)
In [17]: tree
Out[17]: {'c': [1, {'a': [1, {'t': [1, {}]}]}]}
In [18]: mut_add_to_tree("car", tree)
In [19]: tree
Out[19]: {'c': [2, {'a': [2, {'t': [1, {}], 'r': [1, {}]}]}]}
In [20]: mut_add_to_tree("bat", tree)
In [21]: tree
Out[21]:
{'c': [2, {'a': [2, {'t': [1, {}], 'r': [1, {}]}]}],
'b': [1, {'a': [1, {'t': [1, {}]}]}]}
In [22]: mut_add_to_tree("bar", tree)
In [23]: tree
Out[23]:
{'c': [2, {'a': [2, {'t': [1, {}], 'r': [1, {}]}]}],
'b': [2, {'a': [2, {'t': [1, {}], 'r': [1, {}]}]}]}
How can this behaviour be replicated in Haskell?
More generally, how are nested HashMaps of arbitrary depth created and inserted into?
I've experimented with the following:
type NestedHashMap k v = HashMap Char (Int,(HashMap Char v))
toNestedHashMap :: String -> HashMap Char (Int, HashMap Char v)
toNestedHashMap [] = fromList []
toNestedHashMap (x:xs) = fromList [(x, (1, toNestedHashMap xs))]
but already here the compiler tells me
Couldn't match type ‘v’ with ‘(Int, HashMap Char v0)’
‘v’ is a rigid type variable bound by
the type signature for:
toNestedHashMap :: forall v.
String -> HashMap Char (Int, HashMap Char v)
at WordFuncs.hs:48:1-63
Expected type: HashMap Char (Int, HashMap Char v)
Actual type: HashMap
Char (Int, HashMap Char (Int, HashMap Char v0))
Any help appreciated. Thanks.
This is basically an infinite type. Map Char (Int, Map Char (Int, Map Char (... ¿()?)...))) is what the type synonym would have to unroll to in order to allow the stuff you're doing in Python.
Haskell doesn't allow infinite types per se, but it does allow you to create the structure of such a type. For this, it's not sufficient to make a type synonym, you need a newtype, which in this case means to the compiler “I shouldn't bother recursing this, it's a known, distinguishable type that has already been checked”.
newtype NestedHashMap k v = NestedHashMap -- N.B. the `v` argument is unused
  { getNestedHashMap :: HashMap k (Int, NestedHashMap k v) }
toNestedHashMap :: String -> NestedHashMap Char ()
toNestedHashMap [] = NestedHashMap $ fromList []
toNestedHashMap (x:xs) = NestedHashMap $ fromList [(x, (1, toNestedHashMap xs))]
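The same point carries over to Rust, the language of the question at the top: a type alias cannot refer to itself, but a named struct can. The sketch below is my own illustration, not part of the answer; Node and add_to_tree are hypothetical names, and the root is kept as a plain map to mirror the Python tree = {}.

```rust
use std::collections::HashMap;

/// A named recursive type: every node stores a count and its children.
#[derive(Debug, Default, PartialEq)]
struct Node {
    count: u32,
    children: HashMap<char, Node>,
}

/// Inserts `text` below `root`, incrementing the count of each node on the path
/// and creating nodes that do not exist yet.
fn add_to_tree(text: &str, root: &mut HashMap<char, Node>) {
    let mut level = root;
    for c in text.chars() {
        let node = level.entry(c).or_default();
        node.count += 1;
        level = &mut node.children;
    }
}
```

Calling add_to_tree with "cat", "car", "bat" and "bar" on an empty map should build the same shape as the Python session in the question, with counts in place of the first list element.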

Clean list comprehension for sampling from list of lists?

I have a list of lists in Haskell. I want to get all the possibilities when taking one element from each list. What I have currently is
a = [ [1,2], [10,20,30], [-1,-2] ] -- as an example
whatIWant = [ [p,q,r] | p <- a!!0, q <- a!!1, r <- a!!2 ]
This does what I want. However, this is obviously not very good code, and I'm looking for a better way of writing the list comprehension so that no index number (0,1,2) shows up in the code... which is where I'm stuck.
How can I do this?
Using a function (which uses a list comprehension inside), my solution is
combinations :: [[a]] -> [[a]]
combinations [] = []
combinations [l] = map (\ x -> [x]) l
combinations (x:xs) = combine (combinations [x]) (combinations xs)
  where combine a b = [ p ++ q | p <- a, q <- b ]
Example:
*Main> combinations [[1, 2, 3], [4, 5, 6]]
[[1,4],[1,5],[1,6],[2,4],[2,5],[2,6],[3,4],[3,5],[3,6]]
*Main> combinations [['a', 'b', 'c'], ['A', 'B', 'C'], ['1', '2']]
["aA1","aA2","aB1","aB2","aC1","aC2","bA1","bA2","bB1",...
"bB2","bC1","bC2","cA1","cA2","cB1","cB2","cC1","cC2"]
Edit: of course you can use the sequence function, as was suggested in the comments:
*Main> sequence [['a', 'b', 'c'], ['A', 'B', 'C'], ['1', '2']]
["aA1","aA2","aB1","aB2","aC1","aC2","bA1","bA2","bB1",...
"bB2","bC1","bC2","cA1","cA2","cB1","cB2","cC1","cC2"]
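For comparison with the Rust question at the top, here is a rough Rust analogue of what sequence does on a list of lists. It is a sketch under my own assumptions (the name cartesian and the Vec-based signature are mine): start from a single empty combination and extend every partial combination with each element of the next list.

```rust
/// Cartesian product of a list of lists, analogous to Haskell's `sequence` on lists.
fn cartesian<T: Clone>(lists: &[Vec<T>]) -> Vec<Vec<T>> {
    // Start with one empty combination, then extend it with each list in turn.
    lists.iter().fold(vec![Vec::new()], |acc, list| {
        acc.iter()
            .flat_map(|prefix| {
                list.iter().map(move |x| {
                    let mut next = prefix.clone();
                    next.push(x.clone());
                    next
                })
            })
            .collect()
    })
}
```

No index appears anywhere, and the number of input lists does not need to be known in advance.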
this is obviously not a good code
This is about the best way you can do it, given your constraint that the input is a list of lists.
If you use a different type, e.g. a triple of lists, then you can index structurally. E.g.
Prelude> let x@(a,b,c) = ( [1,2], [10,20,30], [-1,-2] )
This lets you write:
Prelude> [ (p,q,r) | p <- a , q <- b , r <- c ]
[(1,10,-1),(1,10,-2),(1,20,-1)
,(1,20,-2),(1,30,-1),(1,30,-2)
,(2,10,-1),(2,10,-2),(2,20,-1)
,(2,20,-2),(2,30,-1),(2,30,-2)]
Lesson: to avoid indexing, use a type whose structure captures the invariant you want to hold. Lift the dimension of the data into its type.

Generating list of neighbors of a state

I'm generating neighbours of a state in Haskell.
A state is a list of rows. The actions can be performed independently on a row. A function is called on each row which returns a set of neighbours for that row.
Here's an example (I'll let the rows be chars for simplicity):
state = ['a', 'b', 'c']
rowNeighbours 'a' = ['x', 'y']
rowNeighbours 'c' = ['p', 'q']
rowNeighbours _ = []
neighbours should call rowNeighbours on each row and generate a list of states [['x', 'b', 'c'], ['y', 'b', 'c'], ['a', 'b', 'p'], ['a', 'b', 'q']].
I'm having trouble generating this list. The following is what I came up with as a solution.
neighbours state =
[ [x, y, z] | x <- rowNeighbours (state !! 0), y <- [state !! 1], z <- [state !! 2] ] ++
[ [x, y, z] | x <- [state !! 0], y <- rowNeighbours (state !! 1), z <- [state !! 2] ] ++
[ [x, y, z] | x <- [state !! 0], y <- [state !! 1], z <- rowNeighbours (state !! 2) ]
It works, but my actual problem has '6' rows, so this becomes quite inelegant and looks like a non-functional way to do things. I would appreciate any pointers on how to go about doing this, thank you.
I think this'll do what you want:
neighbors (s:tate) = map (: tate) (rowNeighbors s) ++ map (s :) (neighbors tate)
neighbors [] = []
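The same idea reads naturally as two nested loops. Below is a minimal Rust sketch, Rust being the language of the question at the top; the signature, with the row function passed in as a closure, is my own assumption. Each neighbouring state is the original state with exactly one row replaced.

```rust
/// Builds every state that differs from `state` in exactly one row,
/// where that row is replaced by one of its neighbours.
fn neighbours<T: Clone>(state: &[T], row_neighbours: impl Fn(&T) -> Vec<T>) -> Vec<Vec<T>> {
    let mut result = Vec::new();
    for (i, row) in state.iter().enumerate() {
        for replacement in row_neighbours(row) {
            // Copy the whole state and swap in the new row at position i.
            let mut next = state.to_vec();
            next[i] = replacement;
            result.push(next);
        }
    }
    result
}
```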

Merge multiple lists if condition is true

I've been trying to wrap my head around this for a while now, but it seems like my lack of Haskell experience just won't get me through it. I couldn't find a similar question here on Stack Overflow (most of them are about merging all sublists, without any condition).
So here it goes. Let's say I have a list of lists like this:
[[1, 2, 3], [3, 5, 6], [20, 21, 22]]
Is there an efficient way to merge lists if some condition is true? Let's say I need to merge lists that share at least one element. For the example above, the result would be:
[[1, 2, 3, 3, 5, 6], [20, 21, 22]]
Another example (when all lists can be merged):
[[1, 2], [2, 3], [3, 4]]
And its result:
[[1, 2, 2, 3, 3, 4]]
Thanks for your help!
I don't know what to say about efficiency, but we can break down what's going on and get several different functionalities at least. Particular functionalities might be optimizable, but it's important to clarify exactly what's needed.
Let me rephrase the question: For some set X, some binary relation R, and some binary operation +, produce a set Q = {x+y | x in X, y in X, xRy}. So for your example, we might have X being some set of lists, R being "xRy if and only if there's at least one element in both x and y", and + being ++.
A naive implementation might just copy the set-builder notation itself
shareElement :: Eq a => [a] -> [a] -> Bool
shareElement xs ys = or [x == y | x <- xs, y <- ys]
v1 :: (a -> a -> Bool) -> (a -> a -> b) -> [a] -> [b]
v1 (?) (<>) xs = [x <> y | x <- xs, y <- xs, x ? y]
then p = v1 shareElement (++) :: Eq a => [[a]] -> [[a]] might achieve what you want. Except it probably doesn't.
Prelude> p [[1], [1]]
[[1,1],[1,1],[1,1],[1,1]]
The most obvious problem is that we get four copies: two from merging each list with itself, and two from merging the lists with each other "in both directions". The problem occurs because a list isn't a set, so duplicates aren't eliminated. Of course, that's an easy fix: we'll just use Set everywhere
import qualified Data.Set as Set
v2 :: Ord b => (a -> a -> Bool) -> (a -> a -> b) -> Set.Set a -> Set.Set b
v2 (?) (<>) = Set.fromList . v1 (?) (<>) . Set.toList
So we can try again with p = v2 (shareElement `on` Set.toList) Set.union (on comes from Data.Function):
Prelude Set> p $ Set.fromList $ map Set.fromList [[1,2], [2,1]]
fromList [fromList [1,2]]
which seems to work. Note that we have to "go through" List because Set can't be made an instance of Monad or Applicative due to its Ord constraint.
I'd also note that there's a lot of lost behavior in Set. For instance, we fight either throwing away order information in the list or having to handle both x <> y and y <> x when our relation is symmetric.
Some more convenient versions can be written like
v3 :: (Ord a, Monoid a) => (a -> a -> Bool) -> Set.Set a -> Set.Set a
v3 r = v2 r mappend
and more efficient ones can be built if we assume that the relation is, say, an equivalence relation, since then instead of an O(n^2) operation we can do it in O(nd), where d is the number of partitions (equivalence classes) of the relation.
Generally, it's a really interesting problem.
I just happened to write something similar here: Finding blocks in arrays
You can just modify it so (although I'm not too sure about the efficiency):
import Data.List (delete, intersect)
example1 = [[1, 2, 3], [3, 5, 6], [20, 21, 22]]
example2 = [[1, 2], [2, 3], [3, 4]]
objects zs = map concat . solve zs $ [] where
  areConnected x y = not . null . intersect x $ y
  solve [] result = result
  solve (x:xs) result =
    let result' = solve' xs [x]
    in solve (foldr delete xs result') (result':result) where
      solve' xs result =
        let ys = filter (\y -> any (areConnected y) result) xs
        in if null ys
             then result
             else solve' (foldr delete xs ys) (ys ++ result)
OUTPUT:
*Main> objects example1
[[20,21,22],[3,5,6,1,2,3]]
*Main> objects example2
[[3,4,2,3,1,2]]
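For comparison, here is a rough Rust sketch of the same grouping, Rust being the language of the question at the top. It is my own illustration rather than a translation of the code above, and merge_sharing is a hypothetical name. It keeps a list of pairwise disjoint groups and, for each new list, absorbs every group that shares an element with it. It does quadratic work and makes no claim about efficiency; the order of the resulting groups can differ from the input order when a later list joins an earlier group.

```rust
/// Merges lists that (directly or transitively) share at least one element.
fn merge_sharing<T: PartialEq + Clone>(lists: &[Vec<T>]) -> Vec<Vec<T>> {
    // Invariant: the groups collected so far are pairwise disjoint.
    let mut groups: Vec<Vec<T>> = Vec::new();
    for list in lists {
        let mut merged: Vec<T> = Vec::new();
        let mut untouched: Vec<Vec<T>> = Vec::new();
        for group in groups {
            if group.iter().any(|x| list.contains(x)) {
                // This group is connected to `list`: absorb it.
                merged.extend(group);
            } else {
                untouched.push(group);
            }
        }
        merged.extend(list.iter().cloned());
        untouched.push(merged);
        groups = untouched;
    }
    groups
}
```

On the two examples from the question this keeps the elements in their original order inside each merged list.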
