How to convert a closure without a yield to a generator?

I'm writing a library that uses generators to hold continuations. Sometimes I want to pass a closure with no suspension points, or no yields, but the compiler complains that the closure doesn't implement the Generator trait.
I want to compile the following code without adding a yield to the closure; how can I let the compiler treat the closure as a generator?
#![feature(generators, generator_trait)]
use std::ops::Generator;
fn library_func(mut g: Box<dyn Generator<Yield = (), Return = ()>>) {
let x = unsafe { g.resume() };
println!("{:?}", x);
}
fn main() {
// a closure without yield
let x = Box::new(|| {
// uncommenting this line makes it compile, but changes the behavior
// yield ();
});
library_func(x);
}
error[E0277]: the trait bound `[closure@src/main.rs:12:22: 15:6]: std::ops::Generator` is not satisfied
--> src/main.rs:17:18
|
17 | library_func(x);
| ^ the trait `std::ops::Generator` is not implemented for `[closure@src/main.rs:12:22: 15:6]`
|
= note: required for the cast to the object type `dyn std::ops::Generator<Yield=(), Return=()>`

A closure isn't a generator, so the compiler can't really treat it as one. It is unclear whether the generator you wish to implement is supposed to return or yield the return value of the function; assuming you want the former, you can use a yield statement after a return statement to create a generator that does not yield:
let x = Box::new(|| {
return;
yield;
});
If you need this frequently, you can also wrap this in a function:
fn into_generator<F, T>(f: F) -> impl Generator<Yield = (), Return = T>
where
F: FnOnce() -> T,
{
#[allow(unreachable_code)]
|| {
return f();
yield;
}
}

Pass an iterable to a function and iterate twice in rust

I have a function which looks like
fn do_stuff(values: HashSet<String>) {
// Count stuff
for s in values.iter() {
prepare(s);
}
// Process stuff
for s in values.iter() {
process(s);
}
}
This works fine. For a unit test, I want to pass a two value collection where the elements are passed in a known order. (Processing them in the other order won't test the case I am trying to test.) HashSet doesn't guarantee an order, so I would like to pass a Vec instead.
I would like to change the argument to Iterable, but it appears that only IntoIter exists. I tried
fn do_stuff<C>(values: C)
where C: IntoIterator<Item=String>
{
// Count stuff
for s in values {
prepare(s);
}
// Process stuff
for s in values {
process(s);
}
}
which fails because the first iteration consumes values. The compiler suggests borrowing values, but
fn do_stuff<C>(values: C)
where C: IntoIterator<Item=String>
{
// Count stuff
for s in &values {
prepare(s);
}
// Process stuff
for s in values {
process(s);
}
}
fails because
the trait Iterator is not implemented for &C
I could probably make something with clone work, but the actual set will be large and I would like to avoid copying it if possible.
Thinking about that, the signature probably should be do_stuff(values: &C), so if that makes the problem simpler, then that is an acceptable solution.
SO suggests Writing a generic function that takes an iterable container as parameter in Rust as a related question, but that is a lifetime problem. I am not having problems with lifetimes.
It looks like How to create an `Iterable` trait for references in Rust? may actually be the solution. But I'm having trouble getting it to compile.
My first attempt is
pub trait Iterable {
type Item;
type Iter: Iterator<Item = Self::Item>;
fn iterator(&self) -> Self::Iter;
}
impl Iterable for HashSet<String> {
type Item = String;
type Iter = HashSet<String>::Iterator;
fn iterator(&self) -> Self::Iter {
self.iter()
}
}
which fails with
error[E0223]: ambiguous associated type
--> src/file.rs:178:17
|
178 | type Iter = HashSet<String>::Iterator;
| ^^^^^^^^^^^^^^^^^^^^^^^^^ help: use fully-qualified syntax: `<HashSet<std::string::String> as Trait>::Iterator`
Following that suggestion:
impl Iterable for HashSet<String> {
type Item = String;
type Iter = <HashSet<std::string::String> as Trait>::Iterator;
fn iterator(&self) -> Self::Iter {
self.iter()
}
}
failed with
error[E0433]: failed to resolve: use of undeclared type `Trait`
--> src/file.rs:178:50
|
178 | type Iter = <HashSet<std::string::String> as Trait>::Iterator;
| ^^^^^ use of undeclared type `Trait`
The rust documents don't seem to include Trait as a known type. If I replace Trait with HashSet, it doesn't recognize Iterator or IntoIter as the final value in the expression.
Implementation of accepted answer
Attempting to implement @eggyal's answer, I was able to get this to compile:
use std::collections::HashSet;
fn do_stuff<I>(iterable: I)
where
I: IntoIterator + Copy,
I::Item: AsRef<str>,
{
// Count stuff
for s in iterable {
prepare(s.as_ref());
}
// Process stuff
for s in iterable {
process(s.as_ref());
}
}
fn prepare(s: &str) {
println!("prepare: {}", s)
}
fn process(s: &str) {
println!("process: {}", s)
}
#[cfg(test)]
mod test_cluster {
use super::*;
#[test]
fn doit() {
let vec: Vec<String> = vec!["a".to_string(), "b".to_string(), "c".to_string()];
let set = vec.iter().cloned().collect::<HashSet<_>>();
do_stuff(&vec);
do_stuff(&set);
}
}
which had this output
---- simple::test_cluster::doit stdout ----
prepare: a
prepare: b
prepare: c
process: a
process: b
process: c
prepare: c
prepare: b
prepare: a
process: c
process: b
process: a

IntoIterator is not only implemented by the collection types themselves, but in most cases (including Vec and HashSet) it is also implemented by their borrows (yielding an iterator of borrowed items). Moreover, immutable borrows are always Copy. So you can do:
fn do_stuff<I>(iterable: I)
where
I: IntoIterator + Copy,
I::Item: AsRef<str>,
{
// Count stuff
for s in iterable {
prepare(s);
}
// Process stuff
for s in iterable {
process(s);
}
}
And this would then be invoked by passing in a borrow of the relevant collection:
let vec = vec!["a", "b", "c"];
let set = vec.iter().cloned().collect::<HashSet<_>>();
do_stuff(&vec);
do_stuff(&set);
However, depending on your requirements (whether all items must first be prepared before any can be processed), it may be possible in this case to combine the preparation and processing into a single pass of the iterator.
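The IntoIterator + Copy approach above can be sketched as a self-contained program. The helper name visit_twice and the log vector are illustrative assumptions (they stand in for do_stuff, prepare and process so that the visiting order can be checked); the point is that a shared borrow of the collection is Copy, so it can be iterated twice without cloning any element.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for `do_stuff`: records each step instead of
// calling `prepare` / `process`, so the order can be inspected.
fn visit_twice<I>(iterable: I) -> Vec<String>
where
    I: IntoIterator + Copy,
    I::Item: AsRef<str>,
{
    let mut log = Vec::new();
    // First pass: `iterable` is Copy, so this loop consumes only a copy.
    for s in iterable {
        log.push(format!("prepare: {}", s.as_ref()));
    }
    // Second pass over the same borrowed collection.
    for s in iterable {
        log.push(format!("process: {}", s.as_ref()));
    }
    log
}

fn main() {
    let vec = vec!["a".to_string(), "b".to_string()];
    let set: HashSet<String> = vec.iter().cloned().collect();

    // A Vec keeps insertion order, which is what the unit test needs.
    let log = visit_twice(&vec);
    assert_eq!(log, vec!["prepare: a", "prepare: b", "process: a", "process: b"]);

    // A HashSet works too, but its iteration order is unspecified.
    assert_eq!(visit_twice(&set).len(), 4);

    // Neither collection was moved or cloned.
    assert_eq!(vec.len() + set.len(), 4);
}
```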

Iterators over containers can be cloned if you want to iterate the container twice, so accepting an IntoIterator + Clone should work for you. Example code:
fn do_stuff<I>(values: I)
where
I: IntoIterator + Clone,
{
// Count stuff
for s in values.clone() {
prepare(s);
}
// Process stuff
for s in values {
process(s);
}
}
You can now pass in e.g. either a hash set or a vector, and both of them can be iterated twice:
let vec = vec!["a", "b", "c"];
let set: HashSet<_> = vec.iter().cloned().collect();
do_stuff(vec);
do_stuff(set);
(Playground)
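Note that passing the container by value, as in do_stuff(vec), makes values.clone() copy the whole container. If that cost matters, the same IntoIterator + Clone bound also accepts a borrowing iterator such as vec.iter(), whose clone copies only the iterator state. The following is a minimal sketch of that variation; the function name count_twice is a hypothetical stand-in that counts elements instead of calling prepare and process.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for `do_stuff`: counts the items seen in each pass.
fn count_twice<I>(values: I) -> (usize, usize)
where
    I: IntoIterator + Clone,
{
    // First pass runs on a clone, so `values` is still available afterwards.
    let first = values.clone().into_iter().count();
    // Second pass consumes the original.
    let second = values.into_iter().count();
    (first, second)
}

fn main() {
    let vec = vec!["a", "b", "c"];
    let set: HashSet<&str> = vec.iter().cloned().collect();

    // `vec.iter()` and `set.iter()` are cheap to clone: no element is copied.
    assert_eq!(count_twice(vec.iter()), (3, 3));
    assert_eq!(count_twice(set.iter()), (3, 3));

    // The containers themselves were only borrowed.
    assert_eq!(vec.len(), 3);
    assert_eq!(set.len(), 3);
}
```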

Using ? inside closure

I've got this simple parsing function
use std::collections::BTreeMap;
fn parse_kv(data: &str) -> BTreeMap<String, String> {
data.split('&')
.map(|kv| kv.split('='))
.map(|mut kv| (kv.next().unwrap().into(), kv.next().unwrap().into()))
.collect()
}
#[test]
fn parse_kv_test() {
let result = parse_kv("test1=1&test2=2");
assert_eq!(result["test1"], "1");
assert_eq!(result["test2"], "2");
}
It works fine and all, but I want to have Option or Result return type like so:
fn parse_kv(data: &str) -> Option<BTreeMap<String, String>>
This implementation:
fn parse_kv(data: &str) -> Option<BTreeMap<String, String>> {
Some(data.split('&')
.map(|kv| kv.split('='))
.map(|mut kv| (kv.next()?.into(), kv.next()?.into()))
.collect())
}
Unfortunately gives the following error:
error[E0277]: the `?` operator can only be used in a function that returns `Result` or `Option` (or another type that implements `std::ops::Try`)
--> src/ecb_cut_paste.rs:23:24
|
23 | .map(|mut kv| (kv.next()?.into(), kv.next()?.into()))
| ^^^^^^^^^^ cannot use the `?` operator in a function that returns `(_, _)`
|
= help: the trait `std::ops::Try` is not implemented for `(_, _)`
= note: required by `std::ops::Try::from_error`
Is it somehow possible to use ? operator inside closure to return None from such function? If not, how would I need to handle idiomatically such case?

The issue here is that the closure itself is a function, so using ? will return from the closure instead of the outer function. This can still be used to implement the function the way you want, however:
use std::collections::BTreeMap;
fn parse_kv(data: &str) -> Option<BTreeMap<String, String>> {
data.split('&')
.map(|kv| kv.split('='))
.map(|mut kv| Some((kv.next()?.into(), kv.next()?.into())))
.collect()
}
#[test]
fn parse_kv_test() {
let result = parse_kv("test1=1&test2=2").unwrap();
assert_eq!(result["test1"], "1");
assert_eq!(result["test2"], "2");
let result2 = parse_kv("test1=1&test2");
assert_eq!(result2, None);
}
There are a couple points to note here: First, the question marks and Some(...) in the second map invocation mean you have an iterator of Option<(String, String)> - type inference figures this out for you.
The next point of note is that collect() can automatically convert an iterator of Option<T> into an Option<Collection<T>> (the same works for Result, through the corresponding FromIterator implementations). I added a test demonstrating that this works.
One other thing to be aware of is that using collect in this way still allows short-circuiting. Once the first None is yielded by the iterator, collect will immediately return with None, rather than continuing to process each element.
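The short-circuiting described above can be observed with a small counter. This is a minimal sketch, not part of the original answer: parse_all and the attempts counter are hypothetical names, and integer parsing stands in for the key/value splitting.

```rust
// Hypothetical helper: tries to parse every input and reports how many
// elements the closure was actually called on.
fn parse_all(inputs: &[&str]) -> (Option<Vec<i32>>, usize) {
    let mut attempts = 0;
    let parsed: Option<Vec<i32>> = inputs
        .iter()
        .map(|s| {
            attempts += 1;
            // `ok()` turns the Result into an Option, so a bad input is None.
            s.parse::<i32>().ok()
        })
        .collect();
    (parsed, attempts)
}

fn main() {
    // All inputs parse: every element is visited.
    assert_eq!(parse_all(&["1", "2"]), (Some(vec![1, 2]), 2));

    // The second input fails: collect stops there and "3" is never visited.
    assert_eq!(parse_all(&["1", "x", "3"]), (None, 2));
}
```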

Why can't I use the ? operator in my main function on a function that returns an Option?

Using this file:
use std::env;
fn main() {
println!("{}", env::args().nth(3)?);
}
I get this error:
error[E0277]: the `?` operator can only be used in a function that returns `Result` or `Option` (or another type that implements `std::ops::Try`)
--> src/main.rs:4:20
|
4 | println!("{}", env::args().nth(3)?);
| ^^^^^^^^^^^^^^^^^^^ cannot use the `?` operator in a function that returns `()`
|
= help: the trait `std::ops::Try` is not implemented for `()`
= note: required by `std::ops::Try::from_error`
However this is confusing because nth does return Option:
fn nth(&mut self, n: usize) -> Option<Self::Item>
Am I misunderstanding the documentation or is this a bug?

The return type of main must implement std::process::Termination (an unstable trait when this answer was written; it has been stable since Rust 1.61). If you look at the end of its documentation, you will see:
impl Termination for !
impl Termination for ()
impl Termination for ExitCode
impl<E: Debug> Termination for Result<!, E>
impl<E: Debug> Termination for Result<(), E>
If you want to return an Option from main, Termination would have to be implemented for it. You cannot do that yourself, because neither the trait nor Option is defined in your crate, so the best solution is to convert Option<T> to Result<T, E>:
use std::env;
fn main() -> Result<(), Box<dyn std::error::Error>> {
println!("{}", env::args().nth(3).ok_or("Missing argument")?);
Ok(())
}
See also:
Why do try!() and ? not compile when used in a function that doesn't return Option or Result?
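The Option-to-Result conversion can also be pulled into a helper so that it is testable without real command-line arguments. This is a minimal sketch under that assumption; nth_arg is a hypothetical name, and the argument list in main is hard-coded for illustration.

```rust
use std::error::Error;

// Hypothetical helper: returns the n-th item of an argument iterator,
// turning a missing item into an error via `ok_or` and `?`.
fn nth_arg<I>(mut args: I, n: usize) -> Result<String, Box<dyn Error>>
where
    I: Iterator<Item = String>,
{
    // `ok_or` maps None to Err(&str); `?` converts that into Box<dyn Error>.
    let arg = args.nth(n).ok_or("Missing argument")?;
    Ok(arg)
}

fn main() -> Result<(), Box<dyn Error>> {
    // Stand-in for `std::env::args()`.
    let args = vec!["prog".to_string(), "first".to_string()];
    println!("{}", nth_arg(args.into_iter(), 1)?);
    Ok(())
}
```

With std::env::args() in place of the hard-coded vector, a missing argument makes main return the Err, which the runtime prints with Debug formatting before exiting with a non-zero status.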

The ? operator will cause the function containing it to return None if the value the ? is applied to is None.
This means you can write
fn not_main() -> Option<()> {
println!("{}", std::env::args().nth(3)?);
Some(())
}
since nth returns an Option<Item> and not_main returns an Option<()>.
However, your main does not return an Option, hence ? can't work inside it.
How you work around this will depend on what you want to do in the case of a missing argument. The most brutal solution is to unwrap instead, which will cause your code to panic if the argument is missing.
fn main() {
println!("{}", std::env::args().nth(3).unwrap())
}
An alternative is to match and handle the missing case
fn main() {
match std::env::args().nth(3) {
Some(ref v) => println!("{}", v),
None => println!("Missing argument"),
}
}
Since Option supports Debug you could print the debug version - which will output None, or Some("arg3").
fn main() {
println!("{:?}", std::env::args().nth(3));
}

If you really want to use ? on an Option value in main, you probably need to implement your own Option.
In your case, nothing::Probably is a better Option.
Example (you need nightly toolchain to run it):
use nothing::{Nothing, Probably, Something};
fn get_args() -> Probably<Vec<String>> {
match std::env::args().skip(1).collect::<Vec<String>>() {
args @ _ if args.len() > 0 => Something(args),
_ => Nothing,
}
}
fn main() -> Probably<Vec<String>> {
let some_args = get_args();
println!("some_args = {some_args:?}");
let args = some_args?; // <- it returns here if some_args is Nothing
println!("args = {args:?}");
Something(args)
}
It works because Probably implements std::process::Termination, so you can return it from your main function. Additionally, it implements std::ops::Try, so you can use ? on it.

Return type for rusqlite MappedRows

I am trying to write a method that returns a rusqlite::MappedRows:
pub fn dump<F>(&self) -> MappedRows<F>
where F: FnMut(&Row) -> DateTime<UTC>
{
let mut stmt =
self.conn.prepare("SELECT created_at FROM work ORDER BY created_at ASC").unwrap();
let c: F = |row: &Row| {
let created_at: DateTime<UTC> = row.get(0);
created_at
};
stmt.query_map(&[], c).unwrap()
}
I am getting stuck on a compiler error:
error[E0308]: mismatched types
--> src/main.rs:70:20
|
70 | let c: F = |row: &Row| {
| ____________________^ starting here...
71 | | let created_at: DateTime<UTC> = row.get(0);
72 | | created_at
73 | | };
| |_________^ ...ending here: expected type parameter, found closure
|
= note: expected type `F`
= note: found type `[closure@src/main.rs:70:20: 73:10]`
What am I doing wrong here?
I tried passing the closure directly to query_map but I get the same compiler error.

I'll divide the answer in two parts, the first about how to fix the return type without considering borrow-checker, the second about why it doesn't work even if you fixed the return type.
§1.
Every closure has a unique, anonymous type, so c cannot be of any type F the caller provides. That means this line will never compile:
let c: F = |row: &Row| { ... } // no, wrong, always.
Instead, the type should be propagated out from the dump function, i.e. something like:
// ↓ no generics
pub fn dump(&self) -> MappedRows<“type of that c”> {
..
}
When this answer was written, stable Rust did not provide a way to name that type, but nightly did through the "impl Trait" feature (return-position impl Trait has since been stabilized, in Rust 1.26):
#![feature(conservative_impl_trait)]
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> MappedRows<impl FnMut(&Row) -> DateTime<UTC>> {
..
}
// note: wrong, see §2.
The impl F here means that, “we are going to return a MappedRows<T> type where T: F, but we are not going to specify what exactly is T; the caller should be ready to treat anything satisfying F as a candidate of T”.
As your closure does not capture any variables, you could in fact turn c into a function. We could name a function pointer type, without needing "impl Trait".
// ↓~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> MappedRows<fn(&Row) -> DateTime<UTC>> {
let mut stmt = self.conn.prepare("SELECT created_at FROM work ORDER BY created_at ASC").unwrap();
fn c(row: &Row) -> DateTime<UTC> {
row.get(0)
}
stmt.query_map(&[], c as fn(&Row) -> DateTime<UTC>).unwrap()
}
// note: wrong, see §2.
Anyway, if we do use "impl Trait", since MappedRows is used as an Iterator, it is more appropriate to just say so:
#![feature(conservative_impl_trait)]
pub fn dump<'c>(&'c self) -> impl Iterator<Item = Result<DateTime<UTC>>> + 'c {
..
}
// note: wrong, see §2.
(Without the 'c bound the compiler reports E0564; it seems lifetime elision doesn't work with impl Trait yet.)
If you are stuck on a stable Rust release without "impl Trait" support, you could instead wrap the iterator in a boxed trait object, at the cost of heap allocation and dynamic dispatch:
pub fn dump(&self) -> Box<dyn Iterator<Item = Result<DateTime<UTC>>>> {
...
Box::new(stmt.query_map(&[], c).unwrap())
}
// note: wrong, see §2.
§2.
The above fix works if you want to, say, just return an independent closure or iterator. But it does not work if you return rusqlite::MappedRows. The compiler will not allow the above to work due to lifetime issue:
error: `stmt` does not live long enough
--> 1.rs:23:9
|
23 | stmt.query_map(&[], c).unwrap()
| ^^^^ does not live long enough
24 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the body at 15:80...
--> 1.rs:15:81
|
15 | pub fn dump(conn: &Connection) -> MappedRows<impl FnMut(&Row) -> DateTime<UTC>> {
| ^
And this is correct. MappedRows<F> is actually MappedRows<'stmt, F>, this type is valid only when the original SQLite statement object (having 'stmt lifetime) outlives it — thus the compiler complains that stmt is dead when you return the function.
Indeed, if the statement is dropped before we iterate on those rows, we will get garbage results. Bad!
What we need to do is to make sure all rows are read before dropping the statement.
You could collect the rows into a vector, thus disassociating the result from the statement, at the cost of storing everything in memory:
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> Vec<Result<DateTime<UTC>>> {
..
let it = stmt.query_map(&[], c).unwrap();
it.collect()
}
Or invert the control, let dump accept a function, which dump will call while keeping stmt alive, at the cost of making the calling syntax weird:
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump<F>(&self, mut f: F) where F: FnMut(Result<DateTime<UTC>>) {
...
for res in stmt.query_map(&[], c).unwrap() {
f(res);
}
}
x.dump(|res| println!("{:?}", res));
Or split dump into two functions, and let the caller keep the statement alive, at the cost of exposing an intermediate construct to the user:
#![feature(conservative_impl_trait)]
pub fn create_dump_statement(&self) -> Statement {
self.conn.prepare("SELECT '2017-03-01 12:34:56'").unwrap()
}
pub fn dump<'s>(&self, stmt: &'s mut Statement) -> impl Iterator<Item = Result<DateTime<UTC>>> + 's {
stmt.query_map(&[], |row| row.get(0)).unwrap()
}
...
let mut stmt = x.create_dump_statement();
for res in x.dump(&mut stmt) {
println!("{:?}", res);
}
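The first two workarounds (collecting before the statement is dropped, and inverting control) do not depend on rusqlite itself, so they can be sketched with standard-library types only. Everything in this example is a hypothetical stand-in: Store plays the role of the connection wrapper, Stmt plays the role of the prepared statement that lives only inside dump, and multiplying by 10 plays the role of the row-mapping closure.

```rust
// Hypothetical stand-in for the type that owns the connection.
struct Store {
    rows: Vec<i64>,
}

// Hypothetical stand-in for a prepared statement: a local value that the
// row iterator borrows from and that is dropped when `dump_*` returns.
struct Stmt {
    rows: Vec<i64>,
}

impl Store {
    fn prepare(&self) -> Stmt {
        Stmt { rows: self.rows.clone() }
    }

    // Workaround 1: read every row while `stmt` is alive and return owned data.
    fn dump_collected(&self) -> Vec<i64> {
        let stmt = self.prepare();
        stmt.rows.iter().map(|r| r * 10).collect()
    }

    // Workaround 2: inversion of control. The caller's closure runs while
    // `stmt` is still alive, so nothing borrowed escapes the function.
    fn dump_each<F: FnMut(i64)>(&self, mut f: F) {
        let stmt = self.prepare();
        for r in stmt.rows.iter() {
            f(r * 10);
        }
    }
}

fn main() {
    let store = Store { rows: vec![1, 2, 3] };

    assert_eq!(store.dump_collected(), vec![10, 20, 30]);

    let mut seen = Vec::new();
    store.dump_each(|v| seen.push(v));
    assert_eq!(seen, vec![10, 20, 30]);
}
```

Returning stmt.rows.iter() directly from either method would be rejected for the same reason as in the rusqlite case: the iterator borrows from a local that is dropped at the end of the function.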

The issue here is that you are implicitly trying to return a closure, so to find explanations and examples you can search for that.
The use of the generic <F> means that the caller decides the concrete type of F and not the function dump.
What you would like to achieve instead requires the long-awaited impl Trait feature.

Is using `ref` in a function argument the same as automatically taking a reference?

Rust tutorials often advocate passing an argument by reference:
fn my_func(x: &Something)
This makes it necessary to explicitly take a reference of the value at the call site:
my_func(&my_value).
It is possible to use the ref keyword usually used in pattern matching:
fn my_func(ref x: Something)
I can call this by doing
my_func(my_value)
Memory-wise, does this work like I expect or does it copy my_value on the stack before calling my_func and then get a reference to the copy?

The value is copied, and the copy is then referenced.
fn f(ref mut x: i32) {
*x = 12;
}
fn main() {
let mut x = 42;
f(x);
println!("{}", x);
}
Output: 42
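For contrast, here is a minimal sketch that puts a ref mut parameter next to an explicit &mut parameter. The function names bump_copy and bump_in_place are illustrative assumptions; the first mutates a copy that is local to the callee, the second mutates the caller's variable.

```rust
// `ref mut x` binds a mutable reference to the callee's own copy of the argument.
fn bump_copy(ref mut x: i32) -> i32 {
    *x += 1;
    *x
}

// `x: &mut i32` receives a reference to the caller's variable.
fn bump_in_place(x: &mut i32) -> i32 {
    *x += 1;
    *x
}

fn main() {
    let mut a = 42;

    // The argument is copied, so `a` is unchanged after the call.
    let r = bump_copy(a);
    assert_eq!((a, r), (42, 43));

    // Here the caller's `a` is modified through the reference.
    let r = bump_in_place(&mut a);
    assert_eq!((a, r), (43, 43));
}
```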

Both functions declare x to be &Something. The difference is that the former takes a reference as the parameter, while the latter expects it to be a regular stack value. To illustrate:
#[derive(Debug, Clone, Copy)] // Copy lets the calls below reuse the same values
struct Something;
fn by_reference(x: &Something) {
println!("{:?}", x); // prints "Something" (Debug formats through the reference)
}
fn on_the_stack(ref x: Something) {
println!("{:?}", x); // prints "Something"; x has type &Something here
}
fn main() {
let value_on_the_stack: Something = Something;
let owned: Box<Something> = Box::new(Something);
let borrowed: &Something = &value_on_the_stack;
// Compiles:
on_the_stack(value_on_the_stack);
// Fail to compile:
// on_the_stack(owned);
// on_the_stack(borrowed);
// Dereferencing will do:
on_the_stack(*owned);
on_the_stack(*borrowed);
// Compiles:
by_reference(&owned); // &Box<Something> coerces to &Something
by_reference(borrowed);
// Fails to compile:
// by_reference(value_on_the_stack);
// Taking a reference will do:
by_reference(&value_on_the_stack);
}
Since on_the_stack takes a value, the argument is copied (or moved, if its type is not Copy), and the resulting value is then matched against the pattern in the formal parameter (ref x in your example). The match binds x to a reference to that value.

If you call a function like f(x) then x is always passed by value.
fn f(ref x: i32) {
// ...
}
is equivalent to
fn f(tmp: i32) {
let ref x = tmp;
// or,
let x = &tmp;
// ...
}
i.e. the referencing is completely restricted to the function call.

The difference between your two functions becomes much more pronounced if the value doesn't implement Copy. For example, Vec<T> doesn't implement Copy, because duplicating it is an expensive operation; instead, it implements Clone, which requires an explicit method call.
Assume two methods are defined as such
fn take_ref(ref v: Vec<String>) {}// Takes a reference, ish
fn take_addr(v: &Vec<String>) {}// Takes an explicit reference
take_ref takes its argument by value before referencing it. For Vec<T>, which is not Copy, that means the vector is moved into the function and consumed, so the following code causes a compiler error:
let v: Vec<String> = vec!["a".to_string()]; // any real value
take_ref(v);// Value is moved here
println!("{:?}", v);// Error, v was moved on the previous line
However, when the reference is explicit, as in take_addr, the Vec isn't moved but passed by reference. Therefore, this code does work as intended:
let v: Vec<String> = vec!["a".to_string()]; // any real value
take_addr(&v);
println!("{:?}", v);// Prints contents as you would expect
