Is there a zero-overhead consuming iterator?

Is there a zero-overhead consuming iterator? - rust

I often find myself writing code like:
myvec.iter().map(|x| some_operation(x)).count()
The invocation of count triggers the iterator chain to be consumed, but also produces as non-unit result which is undesired.
I am looking for something like
myvec.iter().map(|x| some_operation(x)).consume()
which should be equivalent to
for _ in myvec.iter().map(|x| some_operation(x)) {}

Iterator::for_each does what you want:
struct Int(i32);
impl Int {
fn print(&self) {
println!("{}", self.0)
}
}
fn main() {
[Int(1), Int(2), Int(3)].into_iter().for_each(Int::print);
}

No, Rust does not have this.
There were several discussions and even an RFC about having a for_each() operation on iterators which will execute a closure for each element of an iterator, consuming it, but nothing is there yet.
Consider using for loop instead:
for x in myvec.iter() {
some_operation(x);
}
In this particular case it does look better than iterator operations.

Related

Is there a way to reference an anonymous type as a struct member?

Consider the following Rust code:
use std::future::Future;
use std::pin::Pin;
fn main() {
let mut v: Vec<_> = Vec::new();
for _ in 1..10 {
v.push(wrap_future(Box::pin(async {})));
}
}
fn wrap_future<T>(a: Pin<Box<dyn Future<Output=T>>>) -> impl Future<Output=T> {
async {
println!("doing stuff before awaiting");
let result=a.await;
println!("doing stuff after awaiting");
result
}
}
As you can see, the futures I'm putting into the Vec don't need to be boxed, since they are all the same type and the compiler can infer what that type is.
I would like to create a struct that has this Vec<...> type as one of its members, so that I could add a line at the end of main():
let thing = MyStruct {myvec: v};
without any additional overhead (i.e. boxing).
Type inference and impl Trait syntax aren't allowed on struct members, and since the future type returned by an async block exists entirely within the compiler and is exclusive to that exact async block, there's no way to reference it by name. It seems to me that what I want to do is impossible. Is it? If so, will it become possible in a future version of Rust?
I am aware that it would be easy to sidestep this problem by simply boxing all the futures in the Vec as I did the argument to wrap_future() but I'd prefer not to do this if I can avoid it.
I am well aware that doing this would mean that there could be only one async block in my entire codebase whose result values could possibly be added to such a Vec, and thus that there could be only one function in my entire codebase that could create values that could possibly be pushed to it. I am okay with this limitation.

Nevermind, I'm stupid. I forgot that structs could have type parameters.
struct MyStruct<F> where F: Future<Output=()> {
myvec: Vec<F>,
}

Mutably iterate through an iterator using Itertools' tuple_windows

I'm attempting to store a series of entries inside a Vec. Later I need to reprocess through the Vec to fill in some information in each entry about the next entry. The minimal example would be something like this:
struct Entry {
curr: i32,
next: Option<i32>
}
struct History {
entries: Vec<Entry>
}
where I would like to fill in the next fields to the next entries' curr value. To achieve this, I want to make use of the tuple_windows function from Itertools on the mutable iterator. I expect I can write a function like this:
impl History {
fn fill_next_with_itertools(&mut self) {
for (a, b) in self.entries.iter_mut().tuple_windows() {
a.next = Some(b.curr);
}
}
}
(playground)
However, it refuse to compile because the iterator Item's type, &mut Entry, is not Clone, which is required by tuple_windows function. I understand there is a way to iterate through the list using the indices like this:
fn fill_next_with_index(&mut self) {
for i in 0..(self.entries.len()-1) {
self.entries[i].next = Some(self.entries[i+1].curr);
}
}
(playground)
But I feel the itertools' approach more natural and elegant. What's the best ways to achieve the same effect?

From the documentation:
tuple_window clones the iterator elements so that they can be part of successive windows, this makes it most suited for iterators of references and other values that are cheap to copy.
This means that if you were to implement it with &mut items, then you'd have multiple mutable references to the same thing which is undefined behaviour.
If you still need shared, mutable access you'd have to wrap it in Rc<RefCell<T>>, Arc<Mutex<T>> or something similar:
fn fill_next_with_itertools(&mut self) {
for (a, b) in self.entries.iter_mut().map(RefCell::new).map(Rc::new).tuple_windows() {
a.borrow_mut().next = Some(b.borrow().curr);
}
}

Can a function that takes a reference be passed as a closure argument that will provide owned values?

I am trying to simplify my closures, but I had a problem converting my closure to a reference to an associated function when the parameter is owned by the closure but the inner function call only expects a reference.
#![deny(clippy::pedantic)]
fn main() {
let borrowed_structs = vec![BorrowedStruct, BorrowedStruct];
//Selected into_iter specifically to reproduce the minimal scenario that closure gets value instead of reference
borrowed_structs
.into_iter()
.for_each(|consumed_struct: BorrowedStruct| MyStruct::my_method(&consumed_struct));
// I want to write it with static method reference like following line:
// for_each(MyStruct::my_method);
}
struct MyStruct;
struct BorrowedStruct;
impl MyStruct {
fn my_method(prm: &BorrowedStruct) {
prm.say_hello();
}
}
impl BorrowedStruct {
fn say_hello(&self) {
println!("hello");
}
}
Playground
Is it possible to simplify this code:
into_iter().for_each(|consumed_struct: BorrowedStruct| MyStruct::my_method(&consumed_struct));
To the following:
into_iter().for_each(MyStruct::my_method)
Note that into_iter here is only to reproduce to scenario that I own the value in my closure. I know that iter can be used in such scenario but it is not the real scenario that I am working on.

The answer to your general question is no. Types must match exactly when passing a function as a closure argument.
There are one-off workarounds, as shown in rodrigo's answer, but the general solution is to simply take the reference yourself, as you've done:
something_taking_a_closure(|owned_value| some_function_or_method(&owned_value))
I actually advocated for this case about two years ago as part of ergonomics revamp, but no one else seemed interested.
In your specific case, you can remove the type from the closure argument to make it more succinct:
.for_each(|consumed_struct| MyStruct::my_method(&consumed_struct))

I don't think there is a for_each_ref in trait Iterator yet. But you can write your own quite easily (playground):
trait MyIterator {
fn for_each_ref<F>(self, mut f: F)
where
Self: Iterator + Sized,
F: FnMut(&Self::Item),
{
self.for_each(|x| f(&x));
}
}
impl<I: Iterator> MyIterator for I {}
borrowed_structs
.into_iter()
.for_each_ref(MyStruct::my_method);
Another option, if you are able to change the prototype of the my_method function you can make it accept the value either by value or by reference with borrow:
impl MyStruct {
fn my_method(prm: impl Borrow<BorrowedStruct>) {
let prm = prm.borrow();
prm.say_hello();
}
}
And then your original code with .for_each(MyStruct::my_method) just works.
A third option is to use a generic wrapper function (playground):
fn bind_by_ref<T>(mut f: impl FnMut(&T)) -> impl FnMut(T) {
move |x| f(&x)
}
And then call the wrapped function with .for_each(bind_by_ref(MyStruct::my_method));.

Why do Rust's `Atomic*` types use non-mutable functions to mutate the value?

I notice that Rust's Atomic* structs have functions which modify the value, such as fetch_add. For instance, I can write this program:
use std::sync::atomic::{AtomicUsize, Ordering};
struct Tester {
counter: AtomicUsize
}
impl Tester {
fn run(&self) {
let counter = self.counter.fetch_add(1, Ordering::Relaxed);
println!("Hi there, this has been run {} times", counter);
}
}
fn main() {
let t = Tester { counter: AtomicUsize::new(0) };
t.run();
t.run();
}
This compiles and runs fine, but if I change the AtomicUsize to a normal integer, it will (correctly) fail to compile due to mutability concerns:
struct Tester {
counter: u64
}
impl Tester {
fn run(&self) {
self.counter = self.counter + 1;
println!("Hi there, this has been run {} times", self.counter);
}
}

It wouldn’t be very useful if it didn’t work this way. With &mut references, only one can exist at a time, and no & references at that time, so the whole question of atomicity of operation would be moot.
Another way of looking at it is &mut are unique references, and & aliasable references. For normal types, mutation can only safely occur if you have a unique reference, but atomic types are all about mutation (via replacement) without needing a unique reference.
The naming of & and &mut has been a fraught matter, with much fear, uncertainty and doubt in the community and documents like Focusing on Ownership explaining how things actually are. The language has ended up staying with & and &mut, but &mut is actually about uniqueness rather than mutability (it’s just that most of the time the two are equivalent).

Caught between a lifetime and an FFI place

I am caught between two different issues/bugs, and can't come up with a decent solution. Any help would be greatly appreciated
Context, FFI, and calling a lot of C functions, and wrapping C types in rust structs.
The first problem is ICE: this path should not cause illegal move.
This is forcing me to do all my struct-wrapping using & references as in:
pub struct CassResult<'a> {
result:&'a cql_ffi::CassResult
}
Instead of the simpler, and preferable:
pub struct CassResult {
result:cql_ffi::CassResult
}
Otherwise code like:
pub fn first_row(&self) -> Result<CassRow,CassError> {unsafe{
Ok(CassRow{row:*cql_ffi::cass_result_first_row(self.result)})
}}
Will result in:
error: internal compiler error: this path should not cause illegal move
Ok(CassRow{row:*cql_ffi::cass_result_first_row(self.result)})
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
So, I go ahead and wrap everything using lifetime managed references, and all is not-horrible until I try to implement an iterator. At which point I see no way around this problem.
method next has an incompatible type for trait: expected concrete lifetime, found bound lifetime parameter
So given those two conflicting issues, I am totally stuck and can't find any way to implement a proper rust iterator around a FFI iterator-like construct.
Edit: With Shep's suggestion, I get:
pub struct CassResult {
pub result:cql_ffi::CassResult
}
and
pub fn get_result(&mut future:future) -> Option<CassResult> {unsafe{
let result:&cql_ffi::CassResult = &*cql_ffi::cass_future_get_result(&mut future.future);
Some(CassResult{result:*result})
}}
but then get:
error: cannot move out of borrowed content
Some(CassResult{result:*result}
Is there any way to make that pattern work? It's repeated all over this FFI wrapping code.

Only a partial answer: use the "streaming iterator" trait and macro.
I have had a similar problem making Rust bindings around the C mysql API. The result is code like this, instead of native for syntax:
let query = format!("SELECT id_y, value FROM table_x WHERE id = {}", id_x);
let res = try!(db::run_query(&query));
streaming_for!( row, res.into_iter(), {
let id_y: usize = try!(row.convert::<usize>(0));
let value: f64 = try!(row.convert::<f64>(1));
});
Here res holds the result and frees memory on drop. The lifetime of row is tied to res:
/// Res has an attached lifetime to guard an internal pointer.
struct Res<'a>{ p: *mut c_void }
/// Wrapper created by into_iter()
struct ResMoveIter<'a>{ res: Res<'a> }
impl<'a> /*StreamingIterator<'a, Row<'a>> for*/ ResMoveIter<'a>{
/// Get the next row, or None if no more rows
pub fn next(&'a mut self) -> Option<Row<'a>>{
...
}
}
#[unsafe_destructor]
impl<'a> Drop for Res<'a>{
fn drop(&mut self){
...
}
}

To answer my own question. The only decent answer was a way around the original ICE, but as thepowersgang comments, the correct way to do this now is to use :std::ptr::read, so using that approach, no ICE, and hopefully progress.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Is there a zero-overhead consuming iterator? - rust

Iterator::for_each does what you want: struct Int(i32); impl Int { fn print(&self) { println!("{}", self.0) } } fn main() { [Int(1), Int(2), Int(3)].into_iter().for_each(Int::print); }

Related

Is there a way to reference an anonymous type as a struct member?

Mutably iterate through an iterator using Itertools' tuple_windows

Can a function that takes a reference be passed as a closure argument that will provide owned values?

Why do Rust's `Atomic*` types use non-mutable functions to mutate the value?

Caught between a lifetime and an FFI place

Categories

Resources