Implement iterator that does not consume

I have this code, for learning purposes:
struct Fibonacci {
    curr: u32,
    next: u32,
}
impl Iterator for Fibonacci {
    type Item = u32;
    fn next(&mut self) -> Option<Self::Item> {
        let current = self.curr;
        self.curr = self.next;
        self.next = current + self.next;
        Some(current)
    }
}
impl Fibonacci {
    fn new() -> Self {
        Fibonacci { curr: 0, next: 1 }
    }
    fn current(&self) -> u32 {
        self.curr
    }
}
fn main() {
    println!("The first four terms of the Fibonacci sequence are: ");
    let fib = Fibonacci::new();
    for i in fib.take(4) {
        println!("> {}", i);
    }
    // println!("cur {}", fib.current()); // *fails*
}
Here fib.take(4) consumes the fib value, so the commented-out println! after the loop, which calls fib.current(), fails, quite naturally, with:
    let fib = Fibonacci::new();
        --- move occurs because `fib` has type `Fibonacci`, which does not implement the `Copy` trait
    for i in fib.take(4) {
                 ------- `fib` moved due to this method call
...
    println!("cur {}", fib.current());
                       ^^^^^^^^^^^^^ value borrowed here after move
note: this function takes ownership of the receiver `self`, which moves `fib`
How would I implement the code above such that the iterator does not consume its value, and I can use the fib variable after the loop?

You can use .by_ref(). Its documentation says:
Borrows an iterator, rather than consuming it.
This is useful to allow applying iterator adapters while still retaining ownership of the original iterator.
Applied to the code above (note that fib must now be declared mut, because by_ref borrows it mutably):
println!("The first four terms of the Fibonacci sequence are: ");
let mut fib = Fibonacci::new();
for i in fib.by_ref().take(4) {
    println!("> {}", i);
}
println!("cur {}", fib.current());
This prints:
The first four terms of the Fibonacci sequence are:
> 0
> 1
> 1
> 2
cur 3
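For what it's worth, by_ref is just a convenient way of writing &mut fib: a mutable reference to an iterator is itself an iterator. Below is a minimal sketch of that equivalence, with the struct repeated so that it stands alone. The helper name take_then_peek is made up for illustration.

```rust
struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Iterator for Fibonacci {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        let current = self.curr;
        self.curr = self.next;
        self.next += current;
        Some(current)
    }
}

// Hypothetical helper for illustration: takes the first `n` terms through a
// mutable borrow and reports where the iterator stands afterwards.
fn take_then_peek(n: usize) -> (Vec<u32>, u32) {
    let mut fib = Fibonacci { curr: 0, next: 1 };
    // `&mut fib` is itself an iterator, so `take` consumes the borrow, not `fib`.
    let firsts: Vec<u32> = (&mut fib).take(n).collect();
    // `fib` is still usable here because it was only borrowed.
    (firsts, fib.curr)
}
```

Calling take_then_peek(4) yields the same four terms as the loop above, and the iterator is left positioned at 3.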

Related

Cannot create a generic function that transmutes a slice of bytes into an integer because the size is not known at compile time

I'm trying to create a generic function that transmutes a slice of bytes into an integer.
fn i_from_slice<T>(slice: &[u8]) -> Option<T>
where
    T: Sized,
{
    match slice.len() {
        std::mem::size_of::<T>() => {
            let mut buf = [0; std::mem::size_of::<T>()];
            buf.copy_from_slice(slice);
            Some(unsafe { std::mem::transmute_copy(&buf) })
        }
        _ => None,
    }
}
Rust won't let me do that:
error[E0532]: expected tuple struct/variant, found function `std::mem::size_of`
 --> src/lib.rs:6:9
  |
6 |         std::mem::size_of::<T>() => {
  |         ^^^^^^^^^^^^^^^^^^^^^^ not a tuple struct/variant
error[E0277]: the size for values of type `T` cannot be known at compilation time
 --> src/lib.rs:7:31
  |
7 |             let mut buf = [0; std::mem::size_of::<T>()];
  |                               ^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
  |
  = help: the trait `std::marker::Sized` is not implemented for `T`
  = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
  = help: consider adding a `where T: std::marker::Sized` bound
  = note: required by `std::mem::size_of`
Is there a way that I can statically know the size of T?

If your T is an integer, you don't need any unsafe code, since there is from_ne_bytes.
If you absolutely want a generic function, you can add a trait:
use std::convert::TryInto;
trait FromBytes: Sized {
    fn from_ne_bytes_(bytes: &[u8]) -> Option<Self>;
}
impl FromBytes for i32 {
    fn from_ne_bytes_(bytes: &[u8]) -> Option<Self> {
        bytes.try_into().map(i32::from_ne_bytes).ok()
    }
}
// Etc. for the other numeric types.
fn main() {
    let i1: i32 = i_from_slice(&[1, 2, 3, 4]).unwrap();
    let i2 = i32::from_ne_bytes_(&[1, 2, 3, 4]).unwrap();
    assert_eq!(i1, i2);
}
// Generic version in the spirit of the original post, kept only to compare
// the result with my implementation. Reading arbitrary bytes as a `T` is
// still unsound for types with invalid bit patterns (e.g. `bool`).
fn i_from_slice<T>(slice: &[u8]) -> Option<T> {
    if slice.len() == std::mem::size_of::<T>() {
        // `transmute_copy(&slice[0])` would read past a single `u8`, so read
        // through the slice's data pointer, which may be unaligned for `T`.
        Some(unsafe { std::ptr::read_unaligned(slice.as_ptr() as *const T) })
    } else {
        None
    }
}
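The "Etc. for the other numeric types" part can get repetitive, so here is one way it might be done with a small macro. This is only a sketch: the macro name impl_from_bytes and the list of types are my own choices, not part of the answer above.

```rust
use std::convert::TryInto;

trait FromBytes: Sized {
    fn from_ne_bytes_(bytes: &[u8]) -> Option<Self>;
}

// Hypothetical macro: writes the same impl once per listed integer type.
macro_rules! impl_from_bytes {
    ($($t:ty),* $(,)?) => {
        $(
            impl FromBytes for $t {
                fn from_ne_bytes_(bytes: &[u8]) -> Option<Self> {
                    // `try_into` fails unless the slice length matches the
                    // length of the array that `from_ne_bytes` expects.
                    bytes.try_into().map(<$t>::from_ne_bytes).ok()
                }
            }
        )*
    };
}

impl_from_bytes!(u8, u16, u32, u64, i8, i16, i32, i64);
```

With this in place, u16::from_ne_bytes_(&[1, 2]) gives Some(..), while a slice of the wrong length gives None, and no unsafe is involved.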

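You don't need your intermediate buffer; you can read directly from the input slice. Moreover, as pointed out by @BenjaminLindley in the comments, you need to make sure that you read from the data the slice points to (starting at its first item) and not from the fat pointer that is the slice itself. Note that transmute_copy(&slice[0]) is not valid for any T larger than one byte, because transmute_copy must not produce a value larger than its source type; reading through the slice's data pointer avoids that:
fn i_from_slice<T>(slice: &[u8]) -> Option<T> {
    if slice.len() == std::mem::size_of::<T>() {
        // The bytes may not be aligned for `T`, hence `read_unaligned`.
        Some(unsafe { std::ptr::read_unaligned(slice.as_ptr() as *const T) })
    } else {
        None
    }
}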

Is there a way that i can know statically the size of T?
Yes, the size is known at compile time. But inside a generic function it depends on T, and on stable Rust an array length cannot depend on a generic type parameter. Instead of using a fixed-size array, you can use a vector, which is a contiguous growable array.
Also, Sized is an opt-out marker trait. All type parameters have an implicit Sized bound. You don't need to spell that fact out.
You need a match arm guard to use pattern matching the way you did, but it is more straightforward to use an if-else expression here.
All in all, this works:
fn i_from_slice<T>(slice: &[u8]) -> Option<T> {
    let n = std::mem::size_of::<T>();
    if slice.len() == n {
        let mut buf = vec![0; n];
        buf.copy_from_slice(slice);
        // Read from the vector's data, not from the `Vec` value itself;
        // the buffer of a `Vec<u8>` is not guaranteed to be aligned for `T`.
        Some(unsafe { std::ptr::read_unaligned(buf.as_ptr() as *const T) })
    } else {
        None
    }
}
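For completeness, this is roughly what the match arm guard mentioned above could look like. It is a sketch only, and the function name len_matches is made up for the example.

```rust
use std::mem::size_of;

// Hypothetical helper: reports whether `slice` is exactly as long as a `T`.
fn len_matches<T>(slice: &[u8]) -> bool {
    match slice.len() {
        // A pattern cannot be a function call, but a guard can be any
        // boolean expression, so the comparison moves into the guard.
        n if n == size_of::<T>() => true,
        _ => false,
    }
}
```

The guard form compiles, but as said above, a plain if-else comparison reads better when there is only one case to test.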

How to implement multi-valued iterator pattern in Rust?

I have an iterator-type-object that can return zero, one or more items each time it's called. I want to implement a standard Iter API, i.e. next returns Option<Self::Item>, so it can be consumed item by item.
In Clojure I would probably do this with mapcat ("map and concatenate").
My current solution (thanks to @Ryan) uses flat_map but still requires a lot of allocation:
// Desired input:
// A stateful object that implements an iterator which returns a number of results each time.
// The real code is a bit more complicated, this is the minimal example.
struct MyThing {
    counter: i32,
}
impl Iterator for MyThing {
    type Item = Vec<String>;
    fn next(&mut self) -> Option<Vec<String>> {
        self.counter += 1;
        if self.counter == 4 {
            self.counter = 1;
        }
        match self.counter {
            1 => Some(vec!["One".to_string()]),
            2 => Some(vec!["One".to_string(), "Two".to_string()]),
            3 => Some(vec![
                "One".to_string(),
                "Two".to_string(),
                "Three".to_string(),
            ]),
            _ => Some(vec![]),
        }
    }
}
fn main() {
    let things = MyThing { counter: 0 };
    // Missing piece, though the following line does the job:
    let flattened = things.flat_map(|x| x);
    // However this requires a heap allocation at each loop.
    // Desired output: I can iterate, item by item.
    for item in flattened {
        println!("{:?}", item);
    }
}
Given the innovative things I have seen, I wonder if there's a more idiomatic, less expensive way of accomplishing this pattern.

If you know how to generate the "inner" values programmatically, replace Vec<String> with a struct you define that implements Iterator<Item = String>. (Technically only IntoIterator is necessary, but Iterator is sufficient.)
struct Inner {
    index: usize,
    stop: usize,
}
impl Inner {
    fn new(n: usize) -> Self {
        Inner { index: 0, stop: n }
    }
}
impl Iterator for Inner {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        static WORDS: [&str; 3] = ["One", "Two", "Three"];
        let result = if self.index < self.stop {
            WORDS.get(self.index).map(|r| r.to_string())
        } else {
            None
        };
        self.index += 1;
        result
    }
}
Because Inner implements Iterator<Item = String>, it can be iterated over much like Vec<String>. Unlike a Vec, though, Inner does not have to allocate all of its items up front and then hand them out one by one; it can lazily create each String on demand.
The "outer" iterator is just a struct that implements Iterator<Item = Inner>, likewise constructing each Inner lazily:
struct Outer {
    counter: i32,
}
impl Iterator for Outer {
    type Item = Inner;
    fn next(&mut self) -> Option<Inner> {
        self.counter = 1 + self.counter % 3;
        Some(Inner::new(self.counter as usize))
    }
}
As you know, Iterator::flat_map flattens nested structure, so something like the following works:
let things = Outer { counter: 0 };
for item in things.flat_map(|x| x).take(100) {
    println!("{:?}", item);
}
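As a side note, flat_map(|x| x) with an identity closure can also be spelled flatten(). Here is a compact, self-contained sketch of the two iterators above using it. The helper first_words is invented for the example, and counter is a usize here only to avoid the cast.

```rust
struct Inner {
    index: usize,
    stop: usize,
}

impl Iterator for Inner {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        static WORDS: [&str; 3] = ["One", "Two", "Three"];
        if self.index >= self.stop {
            return None;
        }
        // Each `String` is created only when it is asked for.
        let word = WORDS.get(self.index).map(|w| w.to_string());
        self.index += 1;
        word
    }
}

struct Outer {
    counter: usize,
}

impl Iterator for Outer {
    type Item = Inner;
    fn next(&mut self) -> Option<Inner> {
        // Cycles through 1, 2, 3, 1, 2, 3, ...
        self.counter = 1 + self.counter % 3;
        Some(Inner { index: 0, stop: self.counter })
    }
}

// Hypothetical helper: the first `n` words of the flattened sequence.
fn first_words(n: usize) -> Vec<String> {
    // `flatten()` does the same job as `flat_map(|x| x)`.
    Outer { counter: 0 }.flatten().take(n).collect()
}
```

The only allocations left are the String values themselves; no intermediate Vec is built per step.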
In real-life code, Inner and Outer are probably pretty different from this example most of the time. For example, it's not necessarily possible to write Inner without doing the equivalent of allocating a Vec. So the precise shapes and semantics of these iterators depend on concrete information about the use case.
The above assumes that Inner is somehow useful, or easier to implement on its own. You could easily write a single struct that iterates over the sequence without needing to be flattened, but you have to also put the inner iterator state (the index field) into Outer:
struct Outer {
    index: usize,
    counter: i32,
}
impl Iterator for Outer {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        static WORDS: [&str; 3] = ["One", "Two", "Three"];
        let result = WORDS.get(self.index).map(|r| r.to_string());
        self.index += 1;
        if self.index >= self.counter as usize {
            self.counter = 1 + self.counter % 3;
            self.index = 0;
        }
        result
    }
}
fn main() {
    let things = Outer { counter: 1, index: 0 };
    for item in things.take(100) {
        println!("{:?}", item);
    }
}

How do I split an integer into individual digits?

I'm writing a function that requires the individual digits of a larger integer to perform operations on.
I've tried the following:
fn example(num: i32) {
    // I can safely unwrap because I know the chars of the string are going to be valid
    let digits = num.to_string().chars().map(|d| d.to_digit(10).unwrap());
    for digit in digits {
        println!("{}", digit)
    }
}
But the borrow checker says the string doesn't live long enough:
error[E0716]: temporary value dropped while borrowed
 --> src/lib.rs:3:18
  |
3 |     let digits = num.to_string().chars().map(|d| d.to_digit(10).unwrap());
  |                  ^^^^^^^^^^^^^^^                                         - temporary value is freed at the end of this statement
  |                  |
  |                  creates a temporary which is freed while still in use
4 |     for digit in digits {
  |                  ------ borrow later used here
  |
  = note: consider using a `let` binding to create a longer lived value
The following does work:
let temp = num.to_string();
let digits = temp.chars().map(|d| d.to_digit(10).unwrap());
But that looks even more contrived.
Is there a better, and possibly more natural way of doing this?

But the borrow checker says the string doesn't live long enough.
That's because it doesn't. You aren't consuming the iterator in that statement, so the type of digits is
std::iter::Map<std::str::Chars<'_>, <closure>>
That is, a yet-to-be-evaluated iterator that contains references to the allocated string (the unnamed lifetime '_ in Chars). However, since that string has no owner, it is dropped at the end of the statement, before the iterator is consumed.
So, yay for Rust, it prevented a use-after-free bug!
Consuming the iterator would "solve" the problem, as the references to the allocated string would not attempt to live longer than the allocated string; they all end at the end of the statement:
let digits: Vec<_> = num.to_string().chars().map(|d| d.to_digit(10).unwrap()).collect();
If you wanted to return an iterator, you can then convert the Vec back into an iterator:
fn digits(num: usize) -> impl Iterator<Item = u32> {
    num.to_string()
        .chars()
        .map(|d| d.to_digit(10).unwrap())
        .collect::<Vec<_>>()
        .into_iter()
}
As for an alternate solution, there's the math way, stolen from the C++ question to create a vector:
fn x(n: usize) -> Vec<usize> {
    fn x_inner(n: usize, xs: &mut Vec<usize>) {
        if n >= 10 {
            x_inner(n / 10, xs);
        }
        xs.push(n % 10);
    }
    let mut xs = Vec::new();
    x_inner(n, &mut xs);
    xs
}
fn main() {
    let num = 42;
    let digits: Vec<_> = num.to_string().chars().map(|d| d.to_digit(10).unwrap()).collect();
    println!("{:?}", digits);
    let digits = x(42);
    println!("{:?}", digits);
}
However, you might want to add all the special case logic for negative numbers, and testing wouldn't be a bad idea.
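As a rough idea of what the negative-number handling could look like, here is one possible sketch. The function name signed_digits and the choice to return the sign as a separate bool are my assumptions, not something from the question.

```rust
// Hypothetical helper: returns whether `num` is negative, plus its decimal
// digits from most significant to least significant.
fn signed_digits(num: i32) -> (bool, Vec<u32>) {
    let is_negative = num < 0;
    // `unsigned_abs` avoids the overflow that `abs` hits for `i32::MIN`.
    let mut rest = num.unsigned_abs();
    let mut digits = Vec::new();
    loop {
        digits.push(rest % 10);
        rest /= 10;
        if rest == 0 {
            break;
        }
    }
    // Digits were collected least significant first.
    digits.reverse();
    (is_negative, digits)
}
```

The loop always runs at least once, so 0 yields a single digit rather than an empty vector.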
You might also want a fancy-pants iterator version:
fn digits(mut num: usize) -> impl Iterator<Item = usize> {
    let mut divisor = 1;
    while num >= divisor * 10 {
        divisor *= 10;
    }
    std::iter::from_fn(move || {
        if divisor == 0 {
            None
        } else {
            let v = num / divisor;
            num %= divisor;
            divisor /= 10;
            Some(v)
        }
    })
}
Or the completely custom type:
struct Digits {
    n: usize,
    divisor: usize,
}
impl Digits {
    fn new(n: usize) -> Self {
        let mut divisor = 1;
        while n >= divisor * 10 {
            divisor *= 10;
        }
        Digits { n, divisor }
    }
}
impl Iterator for Digits {
    type Item = usize;
    fn next(&mut self) -> Option<Self::Item> {
        if self.divisor == 0 {
            None
        } else {
            let v = Some(self.n / self.divisor);
            self.n %= self.divisor;
            self.divisor /= 10;
            v
        }
    }
}
fn main() {
    let digits: Vec<_> = Digits::new(42).collect();
    println!("{:?}", digits);
}
See also:
What is the correct way to return an Iterator (or any other trait)?

How can I put a trait constraint on a type while implementing a trait?

I have an Iterator that produces Fibonacci numbers. I restricted the type to u32, but now I'm struggling to make it generic for any numeric type.
Working, non-generic code:
struct Fib {
    value: u32,
    next: u32,
}
impl Fib {
    fn new(a: u32, b: u32) -> Fib {
        Fib { value: a, next: b }
    }
}
impl Iterator for Fib {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        let value = self.value;
        let next = self.value + self.next;
        self.value = self.next;
        self.next = next;
        Some(value)
    }
}
//////////////////////////////////////////////////
fn main() {
    let fib = Fib::new(1, 2);
    let sum = fib
        .filter(|x| x % 2 == 0)
        .take_while(|&x| x <= 4000000)
        .fold(0, |sum, x| sum + x);
    println!("{}", sum);
}
The issue is that the implementation of Iterator requires a constraint to Num, but I don't know how to express this:
impl <T : Num> Iterator for Fib<T> { ... }
Produces:
use of undeclared trait name `Num`
And when I try either use std::num::{Num} or use num::traits::{Num}, I am told that the modules do not exist.

I don't think you want Fib to be generic over numeric types, but types that implement the + operator. Like so:
use std::ops::Add;
struct Fib<N>
where
    N: Add<Output = N> + Copy,
{
    value: N,
    next: N,
}
impl<N> Iterator for Fib<N>
where
    N: Add<Output = N> + Copy,
{
    type Item = N;
    fn next(&mut self) -> Option<N> {
        let next = self.value + self.next;
        self.value = self.next;
        self.next = next;
        Some(next)
    }
}
fn main() {
    // Seeding with -1 and 1 makes the first yielded sum 0.
    let fib_seq = Fib {
        value: -1,
        next: 1,
    };
    for thing in fib_seq.take(10) {
        println!("{}", thing);
    }
}
Add is the trait that allows you to use the + operator and produce Output. In this case N implements the Add<Output = N> trait which means N + N will produce something of type N.
That alone is not quite enough: when you do self.value + self.next you are moving value and next out of self, which causes an error unless N can be copied.
You can't get away with not moving the values, since the definition of add has this method signature:
fn add(self, rhs: Rhs) -> Self::Output;
Rhs in Add's case defaults to Self. So, in order to restrict N to types that can be copied with little overhead, I added the Copy trait as a bound.
OP mentions an interesting point: is it possible to alias traits? In short, no, at least not on stable Rust.
You could make a new trait:
trait SimpleAdd: Add<Output = Self> + Copy {}
But then you would have to implement that trait for all the types you wanted, i.e. i32 does not automatically implement SimpleAdd. You can do that in one go with a generic (blanket) impl if you want:
impl<N> SimpleAdd for N where N: Add<Output = N> + Copy {}
So the above two blocks will get you the same thing as a trait alias, but it seems like a hassle.
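To show how the two blocks fit together, here is a minimal sketch that uses SimpleAdd as the only bound on Fib. The helper first_terms is something I added for the example, and it fixes N to i64 just to have a concrete type.

```rust
use std::ops::Add;

// Stand-in for a trait alias: a new trait plus a blanket impl.
trait SimpleAdd: Add<Output = Self> + Copy {}

impl<N> SimpleAdd for N where N: Add<Output = N> + Copy {}

struct Fib<N: SimpleAdd> {
    value: N,
    next: N,
}

impl<N: SimpleAdd> Iterator for Fib<N> {
    type Item = N;
    fn next(&mut self) -> Option<N> {
        // `N + N` is allowed because `SimpleAdd` implies `Add<Output = N>`.
        let next = self.value + self.next;
        self.value = self.next;
        self.next = next;
        Some(next)
    }
}

// Hypothetical helper: the first six terms, using the -1/1 seed from above.
fn first_terms() -> Vec<i64> {
    Fib { value: -1i64, next: 1 }.take(6).collect()
}
```

Because of the blanket impl, any type that supports + and is Copy (i32, u64, f64, and so on) can be used as N without writing further impls.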

Iterator over elements around specific index in Vec<Vec<Object>>

I have a grid: Vec<Vec<Object>> and a pair of x/y indices. I want to find all the elements surrounding the one indexed.
Unfortunately, I can't simply loop over the elements because that ends up borrowing the Vec twice and the borrow checker screams at me:
let mut cells = Vec::with_capacity(8);
for cx in xstart..xend {
    for cy in ystart..yend {
        if cx != x || cy != y {
            cells.push(&mut squares[cy as usize][cx as usize]);
        }
    }
}
cells.into_iter()
My best attempt at changing this into an iterator chain has also failed spectacularly:
let xstart = if x == 0 { x } else { x - 1 };
let xlen = if x + 2 > squares[0].len() { x + 1 } else { 3 };
let ystart = if y == 0 { y } else { y - 1 };
let ylen = if y + 2 > squares.len() { y + 1 } else { 3 };
let xrel = x - xstart;
let yrel = y - ystart;
squares.iter().enumerate()
    .skip(ystart).take(ylen).flat_map(|(i, ref row)|
        row.iter().enumerate()
            .skip(xstart).take(xlen).filter(|&(j, &c)| i != yrel || j != xrel))
Does anyone know how I can do this?

Personally, I am not sure I would be comfortable working with an iterator when the relative positions of the elements can be important. Instead, I would seek to create a "view" of those elements.
The gist can be found here, but the idea is simple so here are the core structures.
#[derive(Debug)]
struct NeighbourhoodRow<'a, T>
where
    T: 'a,
{
    pub left: Option<&'a mut T>,
    pub center: Option<&'a mut T>,
    pub right: Option<&'a mut T>,
}
#[derive(Debug)]
struct Neighbourhood<'a, T>
where
    T: 'a,
{
    pub top: NeighbourhoodRow<'a, T>,
    pub center: NeighbourhoodRow<'a, T>,
    pub bottom: NeighbourhoodRow<'a, T>,
}
To build them, I use a healthy dose of split_at_mut:
fn take_centered_trio<'a, T>(
    row: &'a mut [T],
    x: usize,
) -> (Option<&'a mut T>, Option<&'a mut T>, Option<&'a mut T>) {
    fn extract<'a, T>(row: &'a mut [T], x: usize) -> (Option<&'a mut T>, &'a mut [T]) {
        if x + 1 > row.len() {
            (None, row)
        } else {
            let (h, t) = row.split_at_mut(x + 1);
            (Some(&mut h[x]), t)
        }
    }
    let (prev, row) = if x > 0 { extract(row, x - 1) } else { (None, row) };
    let (elem, row) = extract(row, 0);
    let (next, _) = extract(row, 0);
    (prev, elem, next)
}
and the rest is just some uninteresting constructors.
Of course, you can then build some kind of iterator over those.
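In case split_at_mut is unfamiliar, the core trick used above is that it hands back two non-overlapping mutable slices, so a reference taken from each can coexist. A stripped-down sketch follows; the helper pair_mut is hypothetical and not part of the gist.

```rust
// Hypothetical helper: mutable references to two distinct positions `i < j`.
fn pair_mut<T>(row: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i >= j || j >= row.len() {
        return None;
    }
    // `head` covers `..j` and `tail` covers `j..`, so the two borrows
    // cannot overlap and the borrow checker accepts both at once.
    let (head, tail) = row.split_at_mut(j);
    Some((&mut head[i], &mut tail[0]))
}
```

take_centered_trio applies the same idea repeatedly, peeling one element off the front of the remaining slice each time.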

In the end I made a custom iterator with the help of the guys in #rust.
I've typed my struct out to give you the actual code. As pointed out by the guys in #rust, you cannot return &mut safely from an iterator without building on another iterator that uses unsafe internally anyway. Given that the math here is simple enough to ensure it doesn't go wrong, unsafe was the way to go.
type FieldSquare = u8;
use std::iter::Iterator;
pub struct SurroundingSquaresIter<'a> {
    squares: &'a mut Vec<Vec<FieldSquare>>,
    center_x: usize,
    center_y: usize,
    current_x: usize,
    current_y: usize,
}
pub trait HasSurroundedSquares<'a> {
    fn surrounding_squares(&'a mut self, x: usize, y: usize) -> SurroundingSquaresIter<'a>;
}
impl<'a> HasSurroundedSquares<'a> for Vec<Vec<FieldSquare>> {
    fn surrounding_squares(&'a mut self, x: usize, y: usize) -> SurroundingSquaresIter<'a> {
        SurroundingSquaresIter {
            squares: self,
            center_x: x,
            center_y: y,
            current_x: if x == 0 { x } else { x - 1 },
            current_y: if y == 0 { y } else { y - 1 },
        }
    }
}
impl<'a> Iterator for SurroundingSquaresIter<'a> {
    type Item = &'a mut FieldSquare;
    fn next(&mut self) -> Option<&'a mut FieldSquare> {
        if self.current_y + 1 > self.squares.len() || self.current_y > self.center_y + 1 {
            return None;
        }
        let ret_x = self.current_x;
        let ret_y = self.current_y;
        if self.current_x < self.center_x + 1
            && self.current_x + 1 < self.squares[self.current_y].len()
        {
            self.current_x += 1;
        } else {
            self.current_x = if self.center_x == 0 { self.center_x } else { self.center_x - 1 };
            self.current_y += 1;
        }
        if ret_x == self.center_x && ret_y == self.center_y {
            return self.next();
        }
        Some(unsafe { &mut *(&mut self.squares[ret_y][ret_x] as *mut _) })
    }
}

You want to get mutable references to all surrounding elements, right? I don't think it is possible to do this directly. The problem is that Rust cannot statically prove that you want mutable references to different cells. If it ignored this, you could, for example, make a slight mistake in indexing and get two mutable references to the same data, which is something Rust guarantees to prevent. Hence it disallows this.
On the language level this is caused by the IndexMut trait. You can see how the lifetime of its only method's self parameter is tied to the lifetime of the result:
fn index_mut(&'a mut self, index: Idx) -> &'a mut Self::Output;
This means that if this method is called (implicitly through an indexing operation) then the whole object will be borrowed mutably until the resulting reference goes out of scope. This prevents calling &mut a[i] multiple times while the earlier references are still alive.
The simplest and safest way to fix this would be to refactor your code in a "double buffering" manner: you keep two instances of the field and copy data from one to the other on each step. Alternatively, you can create a temporary field on each step and replace the main one with it after all computations, but that is probably less efficient than swapping two fields.
Another way to solve this would be, naturally, using raw *mut pointers. This is unsafe and should only be used directly as a last resort. You can use unsafety, however, to implement a safe abstraction, something like
fn index_multiple_mut<'a, T>(input: &'a mut [Vec<T>], indices: &[(usize, usize)]) -> Vec<&'a mut T>
where you first check that all indices are different and then use unsafe with some pointer casts (with transmute, probably) to create the resulting vector.
A third possible way would be to use the split_at_mut() method in some clever way, but I'm not sure that it is possible, and if it is, it is likely not very convenient.
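For the specific case of a 3x3 neighbourhood there may be a simpler safe route than split_at_mut: iter_mut on a sub-slice already yields disjoint mutable references, so slicing the rows and then the columns gives the neighbours without any unsafe. A sketch under that assumption is below. The function name neighbours_mut is made up, and it assumes the grid is a slice of rows indexed as grid[y][x].

```rust
// Hypothetical helper: mutable references to the up-to-eight cells around (x, y).
fn neighbours_mut<T>(grid: &mut [Vec<T>], x: usize, y: usize) -> Vec<&mut T> {
    // Clamp the row window so that slicing cannot panic near the edges.
    let ystart = y.saturating_sub(1).min(grid.len());
    let yend = y.saturating_add(2).min(grid.len());
    grid[ystart..yend]
        .iter_mut()
        .enumerate()
        .flat_map(move |(dy, row)| {
            let cy = ystart + dy;
            // Clamp the column window per row, in case rows differ in length.
            let xstart = x.saturating_sub(1).min(row.len());
            let xend = x.saturating_add(2).min(row.len());
            row[xstart..xend]
                .iter_mut()
                .enumerate()
                // Skip the centre cell itself.
                .filter(move |(dx, _)| !(cy == y && xstart + *dx == x))
                .map(|(_, cell)| cell)
        })
        .collect()
}
```

Each cell is visited at most once by construction, which is what lets the borrow checker accept the whole Vec of mutable references. The trade-off is that the result is a flat list, so the relative position of each neighbour is lost.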
