Lifetime issue assigning reference from conditional - rust

I'm quite new to Rust and I'm having an issue with lifetimes. I believe I understand what is happening and why, but I can't work out how to solve it.
For simplicity I created this short "clone" of what I'm actually trying to do; the real code uses async-stripe. I've annotated the example code with links to the real code in case it is relevant.
Given the following structures:
// https://github.com/arlyon/async-stripe/blob/9f1a84144a23cc7b2124a1252ee15dc646ce0215/src/resources/generated/subscription.rs#L385
struct ObjectA<'a> {
field: i32,
object_b_id: Option<&'a str>,
}
// https://github.com/arlyon/async-stripe/blob/9f1a84144a23cc7b2124a1252ee15dc646ce0215/src/resources/generated/subscription.rs#L570
impl<'a> ObjectA<'a> {
fn new(field: i32) -> Self {
return Self {
field,
object_b_id: Default::default(),
};
}
}
// https://github.com/arlyon/async-stripe/blob/9f1a84144a23cc7b2124a1252ee15dc646ce0215/src/resources/generated/subscription.rs#L210
fn persist_obj_a(obj_a: ObjectA<'_>) {}
// ---
// https://github.com/arlyon/async-stripe/blob/9f1a84144a23cc7b2124a1252ee15dc646ce0215/src/resources/generated/payment_method.rs#L18
struct ObjectB {
id: ObjectBId,
}
// https://github.com/arlyon/async-stripe/blob/9f1a84144a23cc7b2124a1252ee15dc646ce0215/src/ids.rs#L518
struct ObjectBId {
value: String,
}
impl ObjectBId {
fn as_str(&self) -> &str {
return self.value.as_str();
}
}
// This is a wrapper around https://github.com/arlyon/async-stripe/blob/9f1a84144a23cc7b2124a1252ee15dc646ce0215/src/resources/generated/payment_method.rs#L128 that just returns the first one found (if any, hence the Option)
fn load_object_b() -> Option<ObjectB> {
return Some(ObjectB {
id: ObjectBId {
value: String::from("some_id"),
},
});
}
What I'm trying to do is load an ObjectB with load_object_b and use its ID in an ObjectA.
Ok, so on to my attempts.
First attempt
fn first_try(condition: bool) {
let mut obj_a = ObjectA::new(1);
if condition {
match load_object_b() {
Some(obj_b) => obj_a.object_b_id = Some(obj_b.id.as_str()),
None => (),
}
}
persist_obj_a(obj_a);
}
In here I get
obj_b.id does not live long enough
Which I can understand: as far as I can tell, obj_b only exists during the match arm and is dropped at the end of it.
Second attempt
fn second_try(condition: bool) {
let mut obj_a = ObjectA::new(1);
if condition {
let obj_b = load_object_b();
match obj_b {
Some(ref obj_b) => obj_a.object_b_id = Some(obj_b.id.as_str()),
None => (),
}
}
persist_obj_a(obj_a);
}
Here I get
obj_b.0 does not live long enough
Which I guess is still the same idea, just in a different place: from my understanding, obj_b now only lives within the scope of the if block.
Third and last attempt
I ended up "solving" it with:
fn third_try(condition: bool) {
let mut obj_a = ObjectA::new(1);
let obj_b = load_object_b();
let obj_b_id = match obj_b {
Some(ref obj_b) => Some(obj_b.id.as_str()),
None => None,
};
if condition {
obj_a.object_b_id = obj_b_id;
}
persist_obj_a(obj_a);
}
Here I moved obj_b so that it lives as long as obj_a, which solves the issue I was having.
My problem with this solution is that I'm wasting resources on the (possibly expensive) request in load_object_b even when the condition means I'm not going to use the result.
I'm not sure if I'm missing something very obvious or heading in the wrong direction overall, but I would appreciate some light on what I might be doing wrong.

This should work, I think:
fn third_try(condition: bool) {
let mut obj_a = ObjectA::new(1);
let obj_b = if condition { load_object_b() } else { None };
obj_a.object_b_id = obj_b.as_ref().map(|o| o.id.as_str());
persist_obj_a(obj_a);
}

Rust allows you to have conditionally initialized variables. You can declare obj_b outside of the if, but only initialize it inside the if. The compiler will ensure you can use it only where it is initialized.
fn second_try(condition: bool) {
let mut obj_a = ObjectA::new(1);
let obj_b;
if condition {
obj_b = load_object_b();
match obj_b {
Some(ref obj_b) => obj_a.object_b_id = Some(obj_b.id.as_str()),
None => (),
}
}
persist_obj_a(obj_a);
}
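Both answers rely on the same idea: the owner of the string must be declared in a scope that outlives the borrow stored in obj_a, while the call to load_object_b stays behind the condition. Below is a minimal self-contained sketch of that idea. The types are simplified stand-ins (ObjectB holds a plain String id), persist_obj_a returns a summary string only so the result can be inspected, and the name fourth_try is hypothetical; none of this is async-stripe's actual API.

```rust
struct ObjectB {
    id: String,
}

struct ObjectA<'a> {
    field: i32,
    object_b_id: Option<&'a str>,
}

// Stand-in for the (possibly expensive) request.
fn load_object_b() -> Option<ObjectB> {
    Some(ObjectB { id: String::from("some_id") })
}

// Stand-in for the real persist call: returns a summary instead of doing I/O.
fn persist_obj_a(obj_a: ObjectA<'_>) -> String {
    format!("{}:{}", obj_a.field, obj_a.object_b_id.unwrap_or("<none>"))
}

fn fourth_try(condition: bool) -> String {
    // The owner lives in the function scope, so a borrow of it can be stored
    // in obj_a. load_object_b is only called when `condition` is true.
    let obj_b: Option<ObjectB> = condition.then(load_object_b).flatten();
    let mut obj_a = ObjectA { field: 1, object_b_id: None };
    obj_a.object_b_id = obj_b.as_ref().map(|b| b.id.as_str());
    persist_obj_a(obj_a)
}

fn main() {
    println!("{}", fourth_try(true));
    println!("{}", fourth_try(false));
}
```

bool::then runs its closure only when the value is true, so this is equivalent to the if/else form shown in the first answer.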

Related

Rust: Implement AVL Tree and error: thread 'main' panicked at 'already borrowed: BorrowMutError'

I have the following tree structure:
use std::cell::RefCell;
use std::rc::Rc;
use std::cmp;
use std::cmp::Ordering;
type AVLTree<T> = Option<Rc<RefCell<TreeNode<T>>>>;
#[derive(Debug, PartialEq, Clone)]
struct TreeSet<T: Ord> {
root: AVLTree<T>,
}
impl<T: Ord> TreeSet<T> {
fn new() -> Self {
Self {
root: None
}
}
fn insert(&mut self, value: T) -> bool {
let current_tree = &mut self.root;
while let Some(current_node) = current_tree {
let node_key = &current_node.borrow().key;
match node_key.cmp(&value) {
Ordering::Less => { let current_tree = &mut current_node.borrow_mut().right; },
Ordering::Equal => {
return false;
}
Ordering::Greater => { let current_tree = &mut current_node.borrow_mut().left; },
}
}
*current_tree = Some(Rc::new(RefCell::new(TreeNode {
key: value,
left: None,
right: None,
parent: None
})));
true
}
}
#[derive(Clone, Debug, PartialEq)]
struct TreeNode<T: Ord> {
pub key: T,
pub parent: AVLTree<T>,
left: AVLTree<T>,
right: AVLTree<T>,
}
fn main() {
let mut new_avl_tree: TreeSet<u32> = TreeSet::new();
new_avl_tree.insert(3);
new_avl_tree.insert(5);
println!("Tree: {:#?}", &new_avl_tree);
}
Building with cargo build is fine, but when I run cargo run, I get the error below:
thread 'main' panicked at 'already borrowed: BorrowMutError', src\libcore\result.rs:1165:5
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace. error: process didn't
exit successfully: target\debug\avl-tree.exe (exit code: 101)
If I just call insert(3), it is fine and my tree gets printed correctly. However, if I call insert(5) after insert(3), I get that error.
How do I fix that?
Manually implementing data structures such as linked lists, trees, and graphs is not a task for novices, because of the language's memory safety rules. I suggest reading the Too Many Linked Lists tutorial, which discusses how to implement safe and unsafe linked lists in Rust the right way.
Also read about name shadowing.
Your error is that, inside the loop, you try to mutably borrow something that is already borrowed immutably.
let node_key = &current_node.borrow().key; // Borrow as immutable
match node_key.cmp(&value) {
Ordering::Less => { let current_tree = &mut current_node.borrow_mut().right; }, // Creates a new binding that is dropped immediately, and borrows as mutable.
I also recommend reading the Rust book to learn Rust.
First let us correct your algorithm. The following lines are incorrect:
let current_tree = &mut current_node.borrow_mut().right;
...
let current_tree = &mut current_node.borrow_mut().left;
Neither line assigns a new value to current_tree; each creates a new (unused) binding (@Inline refers to this as name shadowing). Remove the let and make current_tree mut.
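To make the difference between shadowing and assignment concrete, here is a small sketch that is independent of the tree code. The function names are made up for illustration.

```rust
/// `let` inside the block creates a second, inner `x`; the outer one is untouched.
fn shadowed() -> i32 {
    let x = 1;
    {
        let x = x + 1; // new binding, gone at the end of this block
        let _ = x;
    }
    x
}

/// Plain assignment updates the existing binding, so the change survives the block.
fn reassigned() -> i32 {
    let mut x = 1;
    {
        x += 1;
    }
    x
}

fn main() {
    // shadowed() is 1, reassigned() is 2
    println!("{} {}", shadowed(), reassigned());
}
```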
Now we get the compiler error temporary value dropped while borrowed. The compiler's message probably misled you: it suggests using let to extend the lifetime, which would be right if you used the result in the same scope, but no let can extend a lifetime beyond its scope.
The problem is that you cannot pass a reference out of the loop to a value owned by the loop body (such as current_node.borrow_mut().right). So it would be better to make current_tree an owned variable. Sadly, this means that many clever tricks in your code will not work any more.
Another problem in the code is the multiple-borrow problem (your original runtime panic is about this). You cannot hold a borrow() and a borrow_mut() on the same RefCell at the same time without a panic (enforcing this at runtime is the purpose of RefCell).
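The RefCell rule can be observed without panicking by using try_borrow_mut, which returns an error instead. This is a minimal sketch on a plain integer, with hypothetical function names, not the tree code itself.

```rust
use std::cell::RefCell;

/// Returns true if a mutable borrow is refused while a shared borrow is alive.
fn mutable_borrow_refused_while_shared() -> bool {
    let cell = RefCell::new(5);
    let shared = cell.borrow();
    // borrow_mut() would panic here with "already borrowed"; try_borrow_mut reports it.
    let refused = cell.try_borrow_mut().is_err();
    drop(shared);
    refused
}

/// Once the shared borrow is released, a mutable borrow succeeds.
fn increment_after_release() -> i32 {
    let cell = RefCell::new(5);
    {
        let shared = cell.borrow();
        assert_eq!(*shared, 5);
    } // shared borrow ends here
    *cell.borrow_mut() += 1;
    let value = *cell.borrow();
    value
}

fn main() {
    println!("{}", mutable_borrow_refused_while_shared());
    println!("{}", increment_after_release());
}
```

In the question's loop, node_key keeps the shared borrow alive across the whole match, which is why the borrow_mut() call in the match arms panics.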
So after finding the problems in your code, I got interested in how I would write the code. And now that it is written, I thought it would be fair to share it:
fn insert(&mut self, value: T) -> bool {
if let None = self.root {
self.root = TreeSet::root(value);
return true;
}
let mut current_tree = self.root.clone();
while let Some(current_node) = current_tree {
let mut borrowed_node = current_node.borrow_mut();
match borrowed_node.key.cmp(&value) {
Ordering::Less => {
if let Some(next_node) = &borrowed_node.right {
current_tree = Some(next_node.clone());
} else {
borrowed_node.right = current_node.child(value);
return true;
}
}
Ordering::Equal => {
return false;
}
Ordering::Greater => {
if let Some(next_node) = &borrowed_node.left {
current_tree = Some(next_node.clone());
} else {
borrowed_node.left = current_node.child(value);
return true;
}
}
};
}
true
}
//...
trait NewChild<T: Ord> {
fn child(&self, value: T) -> AVLTree<T>;
}
impl<T: Ord> NewChild<T> for Rc<RefCell<TreeNode<T>>> {
fn child(&self, value: T) -> AVLTree<T> {
Some(Rc::new(RefCell::new(TreeNode {
key: value,
left: None,
right: None,
parent: Some(self.clone()),
})))
}
}
The child method is provided by the NewChild trait above; a root(value: T) associated function on TreeSet still has to be written to make this compile.
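The answer does not show root, so the following is only a guess at what it might look like: an associated function that builds a parentless node. The type definitions are repeated (without the derives) so the sketch compiles on its own.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type AVLTree<T> = Option<Rc<RefCell<TreeNode<T>>>>;

struct TreeNode<T: Ord> {
    key: T,
    parent: AVLTree<T>,
    left: AVLTree<T>,
    right: AVLTree<T>,
}

struct TreeSet<T: Ord> {
    root: AVLTree<T>,
}

impl<T: Ord> TreeSet<T> {
    /// Hypothetical helper (not shown in the answer): a node with no parent
    /// and no children, suitable as the root of an empty tree.
    fn root(value: T) -> AVLTree<T> {
        Some(Rc::new(RefCell::new(TreeNode {
            key: value,
            parent: None,
            left: None,
            right: None,
        })))
    }
}

fn main() {
    let set = TreeSet { root: TreeSet::root(3u32) };
    let root = set.root.expect("root was just created");
    println!("root key = {}", root.borrow().key);
}
```

One thing to keep in mind with this design: storing parent as a strong Rc, as child does with Some(self.clone()), creates a reference cycle between parent and child, so the nodes are never freed. std::rc::Weak is the usual choice for parent links.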

Borrowing the mutable member used inside the loop

The problem I want to solve is:
Given a recursively nested data structure, e.g. a JSON tree, and a path pointing to a (possibly non-existent) element inside it, return a mutable reference to the element that is closest to the given path.
Example: if we have a JSON document of the form { a: { b: { c: "foo" } } } and a path a.b.d, we want a mutable pointer to the value stored under key "b".
This is the code snippet I've got so far:
use std::collections::HashMap;
enum Json {
Number(i64),
Bool(bool),
String(String),
Array(Vec<Json>),
Object(HashMap<String, Json>)
}
struct Pointer<'a, 'b> {
value: &'a mut Json,
path: Vec<&'b str>,
position: usize
}
/// Return a mutable pointer to JSON element having shared
/// the nearest common path with provided JSON.
fn nearest_mut<'a,'b>(obj: &'a mut Json, path: Vec<&'b str>) -> Pointer<'a,'b> {
let mut i = 0;
let mut current = obj;
for &key in path.iter() {
match current {
Json::Array(array) => {
match key.parse::<usize>() {
Ok(index) => {
match array.get_mut(index) {
Some(inner) => current = inner,
None => break,
}
},
_ => break,
}
} ,
Json::Object(map) => {
match map.get_mut(key) {
Some(inner) => current = inner,
None => break
}
},
_ => break,
};
i += 1;
}
Pointer { path, position: i, value: current }
}
The problem is that this doesn't pass Rust's borrow checker, as current is borrowed mutably twice: once inside the match statement and once at the end of the function, when constructing the Pointer.
I've tried different approaches, but haven't figured out how to achieve the goal (other than perhaps going down the unsafe path).
I completely misread your question and I owe you an apology.
You cannot do it in one pass - you're going to need to do a read-only pass to find the nearest path (or exact path), and then a read-write pass to actually extract the reference, or pass a mutator function in the form of a closure.
I've implemented the two-pass method for you. Do note that it is still pretty performant:
fn nearest_mut<'a, 'b>(obj: &'a mut Json, path: Vec<&'b str>) -> Pointer<'a, 'b> {
let valid_path = nearest_path(obj, path);
exact_mut(obj, valid_path).unwrap()
}
fn exact_mut<'a, 'b>(obj: &'a mut Json, path: Vec<&'b str>) -> Option<Pointer<'a, 'b>> {
let mut i = 0;
let mut target = obj;
for token in path.iter() {
i += 1;
// borrow checker gets confused about `target` being mutably borrowed too many times because of the loop
// this once-per-loop binding makes the scope clearer and circumvents the error
let target_once = target;
let target_opt = match *target_once {
Json::Object(ref mut map) => map.get_mut(*token),
Json::Array(ref mut list) => match token.parse::<usize>() {
Ok(t) => list.get_mut(t),
Err(_) => None,
},
_ => None,
};
if let Some(t) = target_opt {
target = t;
} else {
return None;
}
}
Some(Pointer {
path,
position: i,
value: target,
})
}
/// Return the longest prefix of `path` that exists in the provided JSON.
fn nearest_path<'a, 'b>(obj: &'a Json, path: Vec<&'b str>) -> Vec<&'b str> {
let mut target = obj;
let mut valid_paths = vec![];
for token in path.iter() {
// read-only lookup of the next element; stop at the first token that does not resolve
let target_opt = match *target {
Json::Object(ref map) => map.get(*token),
Json::Array(ref list) => match token.parse::<usize>() {
Ok(t) => list.get(t),
Err(_) => None,
},
_ => None,
};
if let Some(t) = target_opt {
target = t;
valid_paths.push(*token);
} else {
return valid_paths;
}
}
valid_paths
}
The principle is simple: I reused my earlier method to get the nearest valid path (or the exact path).
From there, I feed that straight into the function from my original answer, and since I am certain the path is valid (from the prior function call), I can safely unwrap() :-)
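A variation on the same two-step idea, not the answer's code, is to interleave the read-only check and the mutable step inside a single loop, so the path is walked once without building an intermediate Vec. The sketch below makes that assumption explicit: has_child is a hypothetical helper, and the function returns a plain (reference, depth) tuple instead of the Pointer struct to stay short.

```rust
use std::collections::HashMap;

enum Json {
    Number(i64),
    Bool(bool),
    String(String),
    Array(Vec<Json>),
    Object(HashMap<String, Json>),
}

/// Read-only check: does `node` have a child addressed by `key`?
fn has_child(node: &Json, key: &str) -> bool {
    match node {
        Json::Object(map) => map.contains_key(key),
        Json::Array(list) => key.parse::<usize>().map_or(false, |i| i < list.len()),
        _ => false,
    }
}

/// Walks `path` as far as it exists and returns the deepest node reached,
/// together with the number of path segments that were consumed.
fn nearest_mut<'a>(obj: &'a mut Json, path: &[&str]) -> (&'a mut Json, usize) {
    let mut current = obj;
    let mut depth = 0;
    for &key in path {
        if !has_child(current, key) {
            break;
        }
        // Move the reference out of `current` so the reborrow below is not
        // tied to the loop variable; `current` is reassigned unconditionally.
        let node = current;
        current = match node {
            Json::Object(map) => map.get_mut(key).expect("checked by has_child"),
            Json::Array(list) => {
                let index = key.parse::<usize>().expect("checked by has_child");
                list.get_mut(index).expect("checked by has_child")
            }
            _ => unreachable!("has_child returned true for a leaf"),
        };
        depth += 1;
    }
    (current, depth)
}

fn main() {
    let mut inner = HashMap::new();
    inner.insert("c".to_string(), Json::String("foo".to_string()));
    let mut doc = Json::Object(HashMap::from([("b".to_string(), Json::Object(inner))]));
    let (_node, depth) = nearest_mut(&mut doc, &["b", "d"]);
    println!("matched {} segment(s)", depth);
}
```

The cost is a second lookup per segment (contains_key followed by get_mut), which is the same trade-off the two-pass version makes.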

How to return when ref and ownership transfer both won't work

If I return this:
self.string_ref.unwrap().as_ref()
the compiler says:
error[E0515]: cannot return value referencing temporary value
returns a value referencing data owned by the current function
If I return this:
*self.string_ref.unwrap().as_ref()
the compiler says:
error[E0507]: cannot move out of borrowed content
This is driving me crazy.
Here is the code (playground):
use std::ptr::NonNull;
struct A {
string_ref: Option<NonNull<String>>,
}
struct Number {
num: i32
}
impl A {
fn hello() {
}
fn give_me_string(&self) -> String {
unsafe {
*self.string_ref.unwrap().as_ref()
}
}
}
fn main() {
let a = A {
string_ref: NonNull::new(&mut String::from("hello world") as *mut String)
};
let t = a.give_me_string();
println!("{}", t)
}
Stripping your example to the bare minimum:
struct A {
string_ref: Option<NonNull<String>>,
}
impl A {
fn give_me_string(&self) -> String {
unsafe {
*self.string_ref.unwrap().as_ref()
}
}
}
There are a few errors here:
The most obvious one is that you're trying to take ownership of self.string_ref, even though you've only borrowed self.
To solve this you'll want to use a match statement, which allows you to destructure self.string_ref and not consume it:
fn give_me_string(&self) -> String {
unsafe {
match self.string_ref {
Some(x) => x.as_ref(),
None => panic!("Had no `string_ref`!")
}
}
}
as_ref returns &T, so you can't return an owned String directly. Instead you need to either clone it and return the owned String, or return a reference to it:
//Option one: Clone contents
match self.string_ref {
Some(ref x) => x.as_ref().clone(),
_ => //...
}
//Option two: Return reference.
fn give_me_string(&self) -> &str {
unsafe {
match &self.string_ref {
Some(x) => x.as_ref() as _,
_ => //...
}
}
}
To address another problem mentioned in the comments, you have the following statement in your main function:
string_ref: NonNull::new(&mut String::from("hello world") as *mut String)
This causes undefined behavior (UB). String::from creates a temporary String that is not stored anywhere; a reference to it is immediately cast into a pointer. The temporary is dropped at the end of the statement, so string_ref is left dangling, and dereferencing it later is UB.
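A minimal sketch of a sound variant, assuming the caller keeps the String alive in a local variable (here named owner, a made-up name) for as long as A is used and does not touch it directly in the meantime:

```rust
use std::ptr::NonNull;

struct A {
    string_ref: Option<NonNull<String>>,
}

impl A {
    /// Clones the pointed-to String. This is only sound if the pointee is
    /// still alive and not being mutated elsewhere, which the caller must ensure.
    fn give_me_string(&self) -> String {
        match self.string_ref {
            Some(ptr) => unsafe { ptr.as_ref().clone() },
            None => panic!("Had no `string_ref`!"),
        }
    }
}

fn main() {
    // The String is stored in a variable that outlives `a`,
    // so the pointer does not dangle.
    let mut owner = String::from("hello world");
    let a = A {
        string_ref: Some(NonNull::from(&mut owner)),
    };
    println!("{}", a.give_me_string());
}
```

NonNull::from(&mut owner) cannot produce a null pointer, so there is no need for NonNull::new and its Option here.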
So I basically figured out what's going on, thanks to @Optimistic Peach:
fn give_me_string(&self) -> &String {
unsafe {
match self.string_ref {
Some(x) => &*(x.as_ptr() as *const _), // without ref
// Alternative, with ref: Some(ref x) => &*x.as_ptr(),
None => panic!("hello?")
}
}
}

What's an idiomatic way to delete a value from HashMap if it is empty?

The following code works, but it doesn't look nice as the definition of is_empty is too far away from the usage.
fn remove(&mut self, index: I, primary_key: &Rc<K>) {
let is_empty;
{
let ks = self.data.get_mut(&index).unwrap();
ks.remove(primary_key);
is_empty = ks.is_empty();
}
// I have to wrap `ks` in an inner scope so that we can borrow `data` mutably.
if is_empty {
self.data.remove(&index);
}
}
Is there a way to drop the variables used in the condition before entering the if branch, e.g.
if {ks.is_empty()} {
self.data.remove(&index);
}
Whenever you have a double look-up of a key, you need to think Entry API.
With the entry API, you get a handle to a key-value pair and can:
read the key,
read/modify the value,
remove the entry entirely (getting the key and value back).
It's extremely powerful.
In this case:
use std::collections::HashMap;
use std::collections::hash_map::Entry;
fn remove(hm: &mut HashMap<i32, String>, index: i32) {
if let Entry::Occupied(o) = hm.entry(index) {
if o.get().is_empty() {
o.remove_entry();
}
}
}
fn main() {
let mut hm = HashMap::new();
hm.insert(1, String::from(""));
remove(&mut hm, 1);
println!("{:?}", hm);
}
I did this in the end:
match self.data.entry(index) {
Occupied(mut occupied) => {
let is_empty = {
let ks = occupied.get_mut();
ks.remove(primary_key);
ks.is_empty()
};
if is_empty {
occupied.remove();
}
},
Vacant(_) => unreachable!()
}
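For reference, here is a self-contained sketch of the same Entry-based removal. The concrete types (u32 indices, a HashSet<String> per index) and the name remove_key are assumptions made only so the example compiles on its own; the original uses generic I and Rc<K>.

```rust
use std::collections::hash_map::Entry;
use std::collections::{HashMap, HashSet};

/// Removes `primary_key` from the set stored under `index`, and drops the
/// whole entry if the set becomes empty. A missing index is ignored.
fn remove_key(data: &mut HashMap<u32, HashSet<String>>, index: u32, primary_key: &str) {
    if let Entry::Occupied(mut occupied) = data.entry(index) {
        occupied.get_mut().remove(primary_key);
        if occupied.get().is_empty() {
            // Consumes the entry, so no second hash lookup is needed.
            occupied.remove();
        }
    }
}

fn main() {
    let mut data: HashMap<u32, HashSet<String>> = HashMap::new();
    data.entry(1).or_default().insert("a".to_string());
    remove_key(&mut data, 1, "a");
    println!("{:?}", data);
}
```

Unlike the match with Vacant(_) => unreachable!(), this version silently does nothing when the index is absent; which behavior is preferable depends on whether a missing index is a bug in the caller.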

How do I return an Iterator that's generated by a function that takes &'a mut self (when self is created locally)?

Update: The title of the post has been updated, and the answer has been moved out of the question. The short answer is you can't. Please see my answer to this question.
I'm following an Error Handling blog post here (github for it is here), and I tried to make some modifications to the code so that the search function returns an Iterator instead of a Vec. This has been insanely difficult, and I'm stuck.
I've gotten up to this point:
fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &str)
-> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>,
FnMut(Result<Row, csv::Error>)
-> Option<Result<PopulationCount, csv::Error>>>,
CliError> {
let mut found = vec![];
let input: Box<io::Read> = match *file_path {
None => Box::new(io::stdin()),
Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
};
let mut rdr = csv::Reader::from_reader(input);
let closure = |row: Result<Row, csv::Error>| -> Option<Result<PopulationCount, csv::Error>> {
let row = match row {
Ok(row) => row,
Err(err) => return Some(Err(From::from(err))),
};
match row.population {
None => None,
Some(count) => if row.city == city {
Some(Ok(PopulationCount {
city: row.city,
country: row.country,
count: count,
}))
} else {
None
}
}
};
let found = rdr.decode::<Row>().filter_map(closure);
if !found.all(|row| match row {
Ok(_) => true,
_ => false,
}) {
Err(CliError::NotFound)
} else {
Ok(found)
}
}
with the following error from the compiler:
src/main.rs:97:1: 133:2 error: the trait `core::marker::Sized` is not implemented for the type `core::ops::FnMut(core::result::Result<Row, csv::Error>) -> core::option::Option<core::result::Result<PopulationCount, csv::Error>>` [E0277]
src/main.rs:97 fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &str) -> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>, FnMut(Result<Row, csv::Error>) -> Option<Result<PopulationCount, csv::Error>>>, CliError> {
src/main.rs:98 let mut found = vec![];
src/main.rs:99 let input: Box<io::Read> = match *file_path {
src/main.rs:100 None => Box::new(io::stdin()),
src/main.rs:101 Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
src/main.rs:102 };
...
src/main.rs:97:1: 133:2 note: `core::ops::FnMut(core::result::Result<Row, csv::Error>) -> core::option::Option<core::result::Result<PopulationCount, csv::Error>>` does not have a constant size known at compile-time
src/main.rs:97 fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &str) -> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>, FnMut(Result<Row, csv::Error>) -> Option<Result<PopulationCount, csv::Error>>>, CliError> {
src/main.rs:98 let mut found = vec![];
src/main.rs:99 let input: Box<io::Read> = match *file_path {
src/main.rs:100 None => Box::new(io::stdin()),
src/main.rs:101 Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
src/main.rs:102 };
...
error: aborting due to previous error
I've also tried this function definition:
fn search<'a, P: AsRef<Path>, F>(file_path: &Option<P>, city: &str)
-> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>, F>,
CliError>
where F: FnMut(Result<Row, csv::Error>)
-> Option<Result<PopulationCount, csv::Error>> {
with these errors from the compiler:
src/main.rs:131:12: 131:17 error: mismatched types:
expected `core::iter::FilterMap<csv::reader::DecodedRecords<'_, Box<std::io::Read>, Row>, F>`,
found `core::iter::FilterMap<csv::reader::DecodedRecords<'_, Box<std::io::Read>, Row>, [closure src/main.rs:105:19: 122:6]>`
(expected type parameter,
found closure) [E0308]
src/main.rs:131 Ok(found)
I can't Box the closure because then it won't be accepted by filter_map.
I then tried this out:
fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &'a str)
-> Result<(Box<Iterator<Item=Result<PopulationCount, csv::Error>> + 'a>, csv::Reader<Box<io::Read>>), CliError> {
let input: Box<io::Read> = match *file_path {
None => box io::stdin(),
Some(ref file_path) => box try!(fs::File::open(file_path)),
};
let mut rdr = csv::Reader::from_reader(input);
let mut found = rdr.decode::<Row>().filter_map(move |row| {
let row = match row {
Ok(row) => row,
Err(err) => return Some(Err(err)),
};
match row.population {
None => None,
Some(count) if row.city == city => {
Some(Ok(PopulationCount {
city: row.city,
country: row.country,
count: count,
}))
},
_ => None,
}
});
if found.size_hint().0 == 0 {
Err(CliError::NotFound)
} else {
Ok((box found, rdr))
}
}
fn main() {
let args: Args = Docopt::new(USAGE)
.and_then(|d| d.decode())
.unwrap_or_else(|err| err.exit());
match search(&args.arg_data_path, &args.arg_city) {
Err(CliError::NotFound) if args.flag_quiet => process::exit(1),
Err(err) => fatal!("{}", err),
Ok((pops, rdr)) => for pop in pops {
match pop {
Err(err) => panic!(err),
Ok(pop) => println!("{}, {}: {} - {:?}", pop.city, pop.country, pop.count, rdr.byte_offset()),
}
}
}
}
Which gives me this error:
src/main.rs:107:21: 107:24 error: `rdr` does not live long enough
src/main.rs:107 let mut found = rdr.decode::<Row>().filter_map(move |row| {
^~~
src/main.rs:100:117: 130:2 note: reference must be valid for the lifetime 'a as defined on the block at 100:116...
src/main.rs:100 -> Result<(Box<Iterator<Item=Result<PopulationCount, csv::Error>> + 'a>, csv::Reader<Box<io::Read>>), CliError> {
src/main.rs:101 let input: Box<io::Read> = match *file_path {
src/main.rs:102 None => box io::stdin(),
src/main.rs:103 Some(ref file_path) => box try!(fs::File::open(file_path)),
src/main.rs:104 };
src/main.rs:105
...
src/main.rs:106:51: 130:2 note: ...but borrowed value is only valid for the block suffix following statement 1 at 106:50
src/main.rs:106 let mut rdr = csv::Reader::from_reader(input);
src/main.rs:107 let mut found = rdr.decode::<Row>().filter_map(move |row| {
src/main.rs:108 let row = match row {
src/main.rs:109 Ok(row) => row,
src/main.rs:110 Err(err) => return Some(Err(err)),
src/main.rs:111 };
...
error: aborting due to previous error
Have I designed something wrong, or am I taking the wrong approach? Am I missing something really simple and stupid? I'm not sure where to go from here.
Returning iterators is possible, but it comes with some restrictions.
To demonstrate it's possible, two examples, (A) with explicit iterator type and (B) using boxing (playpen link).
use std::iter::FilterMap;
fn is_even(elt: i32) -> Option<i32> {
if elt % 2 == 0 {
Some(elt)
} else { None }
}
/// (A)
pub fn evens<I: IntoIterator<Item=i32>>(iter: I)
-> FilterMap<I::IntoIter, fn(I::Item) -> Option<I::Item>>
{
iter.into_iter().filter_map(is_even)
}
/// (B)
pub fn cumulative_sums<'a, I>(iter: I) -> Box<Iterator<Item=i32> + 'a>
where I: IntoIterator<Item=i32>,
I::IntoIter: 'a,
{
Box::new(iter.into_iter().scan(0, |acc, x| {
*acc += x;
Some(*acc)
}))
}
fn main() {
// The output is:
// 0 is even, 10 is even,
// 1, 3, 6, 10,
for even in evens(vec![0, 3, 7, 10]) {
print!("{} is even, ", even);
}
println!("");
for cs in cumulative_sums(1..5) {
print!("{}, ", cs);
}
println!("");
}
You experienced a problem with (A) -- explicit type! Unboxed closures, that we get from regular lambda expressions with |a, b, c| .. syntax, have unique anonymous types. Functions require explicit return types, so that doesn't work here.
Some solutions for returning closures:
Use a function pointer fn() as in example (A). Often you don't need a closure environment anyway.
Box the closure. This is reasonable, even if the iterators don't support calling it at the moment. Not your fault.
Box the iterator
Return a custom iterator struct. Requires some boilerplate.
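Newer versions of Rust add one more option that this list predates: impl Trait in return position, which lets a function return an iterator built from a closure without naming its type. The sketch below rewrites the two examples that way; boxed trait objects are also spelled with the dyn keyword nowadays. This is an illustration in current syntax, not part of the original answer.

```rust
/// Returns an unnamed iterator type; the closure no longer has to be a `fn` pointer.
fn evens<I>(iter: I) -> impl Iterator<Item = i32>
where
    I: IntoIterator<Item = i32>,
{
    iter.into_iter().filter(|x| x % 2 == 0)
}

/// Boxed variant: same shape as example (B), written with `dyn`.
fn cumulative_sums<'a, I>(iter: I) -> Box<dyn Iterator<Item = i32> + 'a>
where
    I: IntoIterator<Item = i32>,
    I::IntoIter: 'a,
{
    Box::new(iter.into_iter().scan(0, |acc, x| {
        *acc += x;
        Some(*acc)
    }))
}

fn main() {
    let even: Vec<i32> = evens(vec![0, 3, 7, 10]).collect();
    let sums: Vec<i32> = cumulative_sums(1..5).collect();
    println!("{:?} {:?}", even, sums);
}
```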
You can see that in example (B) we have to be quite careful with lifetimes. It says that the return value is Box<Iterator<Item=i32> + 'a>, so what is this 'a? It is the minimum lifetime required of anything inside the box! We also put the 'a bound on I::IntoIter, which ensures we can put that inside the box.
If you just say Box<Iterator<Item=i32>> it will assume 'static.
We have to explicitly declare the lifetimes of the contents of our box. Just to be safe.
This is actually the fundamental problem with your function. You have this: DecodedRecords<'a, Box<Read>, Row>, F>
See that, an 'a! This type borrows something. The problem is it doesn't borrow it from the inputs. There are no 'a on the inputs.
You'll realize that it borrows from a value you create during the function, and that value's lifespan ends when the function returns. We cannot return DecodedRecords<'a> from the function, because it wants to borrow a local variable.
Where to go from here? My easiest answer would be to perform the same split that csv does: one part (struct or value) that owns the reader, and one part (struct or value) that is the iterator and borrows from the reader.
Maybe the csv crate has an owning decoder that takes ownership of the reader it is processing. In that case you can use that to dispel the borrowing trouble.
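The owner/borrower split can be sketched without the csv crate. In the example below, Reader is a hypothetical stand-in that owns some lines, and records plays the role of decode by borrowing from it; the syntax is current Rust rather than the version used in the question. The point is that the caller owns the reader, so the returned iterator borrows from an input instead of from a local variable.

```rust
/// Hypothetical stand-in for a reader that owns its data.
struct Reader {
    lines: Vec<String>,
}

impl Reader {
    /// Borrows from the reader, like `decode` taking `&'a mut self`.
    fn records<'a>(&'a mut self) -> impl Iterator<Item = &'a str> + 'a {
        self.lines.iter().map(|s| s.as_str())
    }
}

/// The iterator borrows from `reader` and `city`, both of which are inputs,
/// so the lifetime 'a now refers to something that outlives this function.
fn search<'a>(reader: &'a mut Reader, city: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    reader.records().filter(move |line| line.contains(city))
}

fn main() {
    // The caller owns the reader for as long as the iterator is in use.
    let mut reader = Reader {
        lines: vec!["Oslo,NO".into(), "Paris,FR".into()],
    };
    let city = String::from("Oslo");
    for hit in search(&mut reader, &city) {
        println!("{}", hit);
    }
}
```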
This answer is based on @bluss's answer + help from #rust on irc.mozilla.org
One issue that's not obvious from the code, and which was causing the final error displayed just above, has to do with the definition of csv::Reader::decode (see its source). It takes &'a mut self; the explanation of this problem is covered in this answer. This essentially causes the lifetime of the reader to be bound to the block it's called in. The way to fix this is to split the function in half (since I can't control the function definition, as recommended in the previous answer link). I needed a lifetime on the reader that was valid within the main function, so the reader could then be passed down into the search function. See the code below (it could definitely be cleaned up more):
fn population_count<'a, I>(iter: I, city: &'a str)
-> Box<Iterator<Item=Result<PopulationCount,csv::Error>> + 'a>
where I: IntoIterator<Item=Result<Row,csv::Error>>,
I::IntoIter: 'a,
{
Box::new(iter.into_iter().filter_map(move |row| {
let row = match row {
Ok(row) => row,
Err(err) => return Some(Err(err)),
};
match row.population {
None => None,
Some(count) if row.city == city => {
Some(Ok(PopulationCount {
city: row.city,
country: row.country,
count: count,
}))
},
_ => None,
}
}))
}
fn get_reader<P: AsRef<Path>>(file_path: &Option<P>)
-> Result<csv::Reader<Box<io::Read>>, CliError>
{
let input: Box<io::Read> = match *file_path {
None => Box::new(io::stdin()),
Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
};
Ok(csv::Reader::from_reader(input))
}
fn search<'a>(reader: &'a mut csv::Reader<Box<io::Read>>, city: &'a str)
-> Box<Iterator<Item=Result<PopulationCount, csv::Error>> + 'a>
{
population_count(reader.decode::<Row>(), city)
}
fn main() {
let args: Args = Docopt::new(USAGE)
.and_then(|d| d.decode())
.unwrap_or_else(|err| err.exit());
let reader = get_reader(&args.arg_data_path);
let mut reader = match reader {
Err(err) => fatal!("{}", err),
Ok(reader) => reader,
};
let populations = search(&mut reader, &args.arg_city);
let mut found = false;
for pop in populations {
found = true;
match pop {
Err(err) => fatal!("fatal !! {}", err),
Ok(pop) => println!("{}, {}: {}", pop.city, pop.country, pop.count),
}
}
if !(found || args.flag_quiet) {
fatal!("{}", CliError::NotFound);
}
}
I've learned a lot trying to get this to work, and have much more appreciation for the compiler errors. It's now clear that had this been C, the last error above could actually have caused segfaults, which would have been much harder to debug. I've also realized that converting from a pre-computed vec to an iterator requires more involved thinking about when the memory comes in and out of scope; I can't just change a few function calls and return types and call it a day.
