How to Idiomatically Test for Overflow when Shifting Left (<<) in Rust?

For most operators that might overflow, Rust provides a checked version. For example, to test if an addition overflows one could use checked_add:
match 255u8.checked_add(1) {
Some(_) => println!("no overflow"),
None => println!("overflow!"),
}
This prints "overflow!". There is also a checked_shl, but according to the documentation it only checks if the shift is larger than or equal to the number of bits in self. That means that while this:
match 255u8.checked_shl(8) {
Some(val) => println!("{}", val),
None => println!("overflow!"),
}
is caught and prints "overflow!", this:
match 255u8.checked_shl(7) {
Some(val) => println!("{}", val),
None => println!("overflow!"),
}
simply prints 128, clearly not catching the overflow.
What is the correct way to check for any kind of overflow when shifting left?

I'm not aware of any idiomatic way of doing this, but implementing your own trait would work.
The algorithm is basically to check that the number has at least as many leading zeros as the shift size:
trait LossCheckedShift {
fn loss_checked_shl(self, rhs: u32) -> Option<Self>
where Self: std::marker::Sized;
}
impl LossCheckedShift for u8 {
fn loss_checked_shl(self, rhs: u32) -> Option<Self> {
// Works on stable Rust. The shift is only evaluated once the check has
// passed: `then_some(self << rhs)` would compute `self << rhs` eagerly,
// which panics in debug builds when rhs >= 8.
if rhs <= self.leading_zeros() {
// checked_shl also rejects rhs == 8, reachable only when self == 0
self.checked_shl(rhs)
} else {
None
}
}
}
fn main() {
match 255u8.loss_checked_shl(7) {
Some(val) => println!("{}", val),
None => println!("overflow!"), // <--
}
match 127u8.loss_checked_shl(1) {
Some(val) => println!("{}", val), // <--
None => println!("overflow!"),
}
match 127u8.loss_checked_shl(2) {
Some(val) => println!("{}", val),
None => println!("overflow!"), // <--
}
}
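The same leading-zeros check carries over to the other unsigned integer types. The following is a minimal sketch, assuming a macro is an acceptable way to avoid repeating the impl; the trait name `LossCheckedShl` and the macro name are made up for illustration. It uses `checked_shl` for the shift itself, so an oversized shift amount yields `None` instead of panicking.

```rust
/// Hypothetical trait, mirroring the one in the answer above.
trait LossCheckedShl: Sized {
    fn loss_checked_shl(self, rhs: u32) -> Option<Self>;
}

// Generates one impl per listed unsigned type.
macro_rules! impl_loss_checked_shl {
    ($($t:ty),*) => {$(
        impl LossCheckedShl for $t {
            fn loss_checked_shl(self, rhs: u32) -> Option<Self> {
                if rhs <= self.leading_zeros() {
                    // No set bit is shifted out. checked_shl still rejects
                    // rhs >= bit width, which is only reachable for zero.
                    self.checked_shl(rhs)
                } else {
                    None
                }
            }
        }
    )*};
}

impl_loss_checked_shl!(u8, u16, u32, u64, u128, usize);
```

With this in scope, `255u8.loss_checked_shl(7)` is `None`, while `127u8.loss_checked_shl(1)` is `Some(254)`.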

You could do a complementary right-shift (right-shift by 8 - requested_number_of_bits) and check if 0 remains. If so, it means that no bits would be lost by left-shifting:
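fn safe_shl(n: u8, shift_for: u8) -> Option<u8> {
if shift_for == 0 {
return Some(n); // nothing is shifted out; also avoids `n >> 8`
}
if shift_for >= 8 || n >> (8 - shift_for) != 0 {
return None; // would lose some data
}
Some(n << shift_for)
}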
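One can also write a generic version that accepts any fixed-width integer type (applied to u8, it should compile down to equivalent code). It derives the bit width from size_of, so it does not carry over to heap-allocated bigints, where size_of does not reflect the number of bits in the value: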
use std::mem::size_of;
use std::ops::{Shl, Shr};
fn safe_shl<T>(n: T, shift_for: u32) -> Option<T>
where
T: Default + Eq,
for<'a> &'a T: Shl<u32, Output = T> + Shr<u32, Output = T>,
{
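let bits_in_t = size_of::<T>() as u32 * 8;
let zero = T::default();
if shift_for == 0 {
return Some(n); // nothing is shifted out; also avoids a full-width right shift
}
if shift_for >= bits_in_t || &n >> (bits_in_t - shift_for) != zero {
return None; // would lose some data
}
Some(&n << shift_for)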
}
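Another way to phrase the same test, offered here as an alternative sketch rather than something taken from the answers above, is a round trip: shift left, shift back right, and compare with the original. If any set bit fell off the top, the round trip cannot restore it. The function name `roundtrip_checked_shl` is hypothetical.

```rust
/// Returns the shifted value only if no set bit was lost.
fn roundtrip_checked_shl(n: u8, shift_for: u32) -> Option<u8> {
    // checked_shl rejects shift amounts >= 8, so the plain `>>` below is safe.
    let shifted = n.checked_shl(shift_for)?;
    if shifted >> shift_for == n {
        Some(shifted)
    } else {
        None
    }
}
```

For example, `roundtrip_checked_shl(255, 7)` is `None` because `128 >> 7` is `1`, not `255`, whereas `roundtrip_checked_shl(1, 7)` is `Some(128)`.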

Related

How to format to other number bases besides decimal, hexadecimal? [duplicate]

Currently I'm using the following code to return a number as a binary (base 2), octal (base 8), or hexadecimal (base 16) string.
fn convert(inp: u32, out: u32, numb: &String) -> Result<String, String> {
match isize::from_str_radix(numb, inp) {
Ok(a) => match out {
2 => Ok(format!("{:b}", a)),
8 => Ok(format!("{:o}", a)),
16 => Ok(format!("{:x}", a)),
10 => Ok(format!("{}", a)),
0 | 1 => Err(format!("No base lower than 2!")),
_ => Err(format!("printing in this base is not supported")),
},
Err(e) => Err(format!(
"Could not convert {} to a number in base {}.\n{:?}\n",
numb, inp, e
)),
}
}
Now I want to replace the inner match statement so I can return the number as an arbitrarily based string (e.g. base 3.) Are there any built-in functions to convert a number into any given radix, similar to JavaScript's Number.toString() method?
For now, you cannot do it using the standard library, but you can:
use my crate radix_fmt
or roll your own implementation:
fn format_radix(mut x: u32, radix: u32) -> String {
let mut result = vec![];
loop {
let m = x % radix;
x = x / radix;
// will panic if you use a bad radix (< 2 or > 36).
result.push(std::char::from_digit(m, radix).unwrap());
if x == 0 {
break;
}
}
result.into_iter().rev().collect()
}
fn main() {
assert_eq!(format_radix(1234, 10), "1234");
assert_eq!(format_radix(1000, 10), "1000");
assert_eq!(format_radix(0, 10), "0");
}
If you wanted to eke out a little more performance, you can create a struct and implement Display or Debug for it. This avoids allocating a String. For maximum over-engineering, you can also have a stack-allocated array instead of the Vec.
Here is Boiethios' answer with these changes applied:
struct Radix {
x: i32,
radix: u32,
}
impl Radix {
fn new(x: i32, radix: u32) -> Result<Self, &'static str> {
if radix < 2 || radix > 36 {
Err("Unsupported radix")
} else {
Ok(Self { x, radix })
}
}
}
use std::fmt;
impl fmt::Display for Radix {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
// Good for binary formatting of `u128`s
let mut result = ['\0'; 128];
let mut used = 0;
let negative = self.x < 0;
// `unsigned_abs` avoids the overflow that `x *= -1` hits for `i32::MIN`
let mut x = self.x.unsigned_abs();
loop {
let m = x % self.radix;
x /= self.radix;
result[used] = std::char::from_digit(m, self.radix).unwrap();
used += 1;
if x == 0 {
break;
}
}
if negative {
write!(f, "-")?;
}
for c in result[..used].iter().rev() {
write!(f, "{}", c)?;
}
Ok(())
}
}
fn main() {
// `Radix::new` returns a `Result`, so unwrap it before formatting
assert_eq!(Radix::new(1234, 10).unwrap().to_string(), "1234");
assert_eq!(Radix::new(1000, 10).unwrap().to_string(), "1000");
assert_eq!(Radix::new(0, 10).unwrap().to_string(), "0");
}
This could still be optimized by:
creating an ASCII array instead of a char array
not zero-initializing the array
Since these avenues require unsafe or an external crate like arraybuf, I have not included them. You can see sample code in internal implementation details of the standard library.
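The first of those two ideas can at least be approximated in safe code. The sketch below is an assumption-laden variant, not the answer's own code: it uses a zero-initialized `[u8; 32]` buffer (so the second optimization is not attempted), fills it from the back, and converts it with the checked `std::str::from_utf8`. The type name `RadixU32` is hypothetical, and it handles only unsigned values.

```rust
use std::fmt;

/// Hypothetical wrapper for this sketch; `radix` is expected to be in 2..=36.
struct RadixU32 {
    x: u32,
    radix: u32,
}

impl fmt::Display for RadixU32 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if !(2..=36).contains(&self.radix) {
            return Err(fmt::Error);
        }
        // A u32 needs at most 32 digits (base 2).
        let mut buf = [0u8; 32];
        let mut pos = buf.len();
        let mut x = self.x;
        loop {
            let digit = (x % self.radix) as u8;
            pos -= 1;
            // Digits 0-9 map to '0'-'9', digits 10-35 map to 'a'-'z'.
            buf[pos] = if digit < 10 { b'0' + digit } else { b'a' + (digit - 10) };
            x /= self.radix;
            if x == 0 {
                break;
            }
        }
        // Only ASCII bytes were written, so the checked conversion succeeds.
        let s = std::str::from_utf8(&buf[pos..]).map_err(|_| fmt::Error)?;
        f.write_str(s)
    }
}
```

Writing the digits back to front means the buffer can be emitted with a single `write_str` call instead of one `write!` per character.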
Here is an extended solution based on the first comment which does not bind the parameter x to be a u32:
fn format_radix(mut x: u128, radix: u32) -> String {
let mut result = vec![];
loop {
let m = x % radix as u128;
x = x / radix as u128;
// will panic if you use a bad radix (< 2 or > 36).
result.push(std::char::from_digit(m as u32, radix).unwrap());
if x == 0 {
break;
}
}
result.into_iter().rev().collect()
}
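A quick way to sanity-check any of these formatters is to parse the output back with the standard `from_str_radix` and compare it with the input. The sketch below repeats the `u128` version of `format_radix` so that it is self-contained; the helper name `roundtrips` is made up for illustration.

```rust
/// Same algorithm as the u128 version above, repeated for self-containment.
fn format_radix(mut x: u128, radix: u32) -> String {
    let mut result = vec![];
    loop {
        let m = x % radix as u128;
        x /= radix as u128;
        // Panics for radix > 36 (from_digit) or radix == 0 (division by zero).
        result.push(std::char::from_digit(m as u32, radix).unwrap());
        if x == 0 {
            break;
        }
    }
    result.into_iter().rev().collect()
}

/// Hypothetical helper: true if parsing the formatted string yields `x` again.
fn roundtrips(x: u128, radix: u32) -> bool {
    u128::from_str_radix(&format_radix(x, radix), radix) == Ok(x)
}
```

Checking a handful of values across radixes 2 through 36, including `0` and `u128::MAX`, covers the edge cases of the loop.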
This is faster than the other answer:
use std::char::from_digit;
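fn encode(mut n: u32, r: u32) -> Option<String> {
// `n % r` panics for r == 0 and `from_digit` panics for r > 36,
// so reject unsupported radixes up front
if !(2..=36).contains(&r) {
return None;
}
let mut s = String::new();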
loop {
if let Some(c) = from_digit(n % r, r) {
s.insert(0, c)
} else {
return None
}
n /= r;
if n == 0 {
break
}
}
Some(s)
}
Note I also tried these, but they were slower:
https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.push_front
https://doc.rust-lang.org/std/string/struct.String.html#method.push
https://doc.rust-lang.org/std/vec/struct.Vec.html#method.insert

Temporary value dropped while borrowed while pushing elements into a Vec

I'm trying to solve the RPN calculator exercise at exercism but stumbled upon this temporary value dropped while borrowed error that I can't seem to work out.
Here's my code:
#[derive(Debug)]
pub enum CalculatorInput {
Add,
Subtract,
Multiply,
Divide,
Value(i32),
}
pub fn evaluate(inputs: &[CalculatorInput]) -> Option<i32> {
let mut stack = Vec::new();
for input in inputs {
match input {
CalculatorInput::Value(value) => {
stack.push(value);
},
operator => {
if stack.len() < 2 {
return None;
}
let second = stack.pop().unwrap();
let first = stack.pop().unwrap();
let result = match operator {
CalculatorInput::Add => first + second,
CalculatorInput::Subtract => first - second,
CalculatorInput::Multiply => first * second,
CalculatorInput::Divide => first / second,
CalculatorInput::Value(_) => return None,
};
stack.push(&result.clone());
}
}
}
if stack.len() != 1 {
None
} else {
Some(*stack.pop().unwrap())
}
}
And the error I get:
error[E0716]: temporary value dropped while borrowed
--> src/lib.rs:32:29
|
32 | stack.push(&result.clone());
| ^^^^^^^^^^^^^^ - temporary value is freed at the end of this statement
| |
| creates a temporary which is freed while still in use
...
36 | if stack.len() != 1 {
| ----- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
If I understand correctly, the variable result is no longer live outside of the for loop (or indeed outside of the operator match branch). That's why I cloned it, but it still gives me the same error.
How can I make a copy of the result which is owned by the stack Vec (if that's what I should do)?
Just for reference, and in case anybody finds this useful, this is the final solution taking into account all the help received:
use crate::CalculatorInput::{Add,Subtract,Multiply,Divide,Value};
#[derive(Debug)]
pub enum CalculatorInput {
Add,
Subtract,
Multiply,
Divide,
Value(i32),
}
pub fn evaluate(inputs: &[CalculatorInput]) -> Option<i32> {
let mut stack: Vec<i32> = Vec::new();
for input in inputs {
match input {
Value(value) => {
stack.push(*value);
},
operator => {
if stack.len() < 2 {
return None;
}
let second: i32 = stack.pop().unwrap();
let first: i32 = stack.pop().unwrap();
let result: i32 = match operator {
Add => first + second,
Subtract => first - second,
Multiply => first * second,
Divide => first / second,
Value(_) => return None,
};
stack.push(result);
}
}
}
if stack.len() != 1 {
None
} else {
stack.pop()
}
}
No need to clone, because i32 implements the Copy trait.
The problem was that my Vec was receiving an &i32 instead of an i32, and thus Rust inferred it to be a Vec<&i32>.
The error is because Rust did not infer the type you expected.
In your code, value is inferred to be &i32 because input is a reference to an element of inputs. Since you then push value, the type of stack is inferred to be Vec<&i32>.
The best fix is to specify the type of stack explicitly:
let mut stack: Vec<i32> = Vec::new();
Because i32 implements the Copy trait, you never need to clone an i32 value; if you have a reference, just dereference it.
Fixed code:
#[derive(Debug)]
pub enum CalculatorInput {
Add,
Subtract,
Multiply,
Divide,
Value(i32),
}
pub fn evaluate(inputs: &[CalculatorInput]) -> Option<i32> {
let mut stack: Vec<i32> = Vec::new();
for input in inputs {
match input {
CalculatorInput::Value(value) => {
stack.push(*value);
}
operator => {
if stack.len() < 2 {
return None;
}
let second = stack.pop().unwrap();
let first = stack.pop().unwrap();
let result = match operator {
CalculatorInput::Add => first + second,
CalculatorInput::Subtract => first - second,
CalculatorInput::Multiply => first * second,
CalculatorInput::Divide => first / second,
CalculatorInput::Value(_) => return None,
};
stack.push(result);
}
}
}
if stack.len() != 1 {
None
} else {
Some(stack.pop().unwrap())
}
}
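Once the stack holds plain `i32` values, the `unwrap` calls can also be replaced with `?`. The following is a sketch under one added assumption that is not part of the exercise text quoted above: arithmetic overflow and division by zero should yield `None` rather than panic, which is what the `checked_*` methods provide.

```rust
#[derive(Debug)]
pub enum CalculatorInput {
    Add,
    Subtract,
    Multiply,
    Divide,
    Value(i32),
}

pub fn evaluate(inputs: &[CalculatorInput]) -> Option<i32> {
    let mut stack: Vec<i32> = Vec::new();
    for input in inputs {
        let result = match input {
            CalculatorInput::Value(value) => *value,
            operator => {
                // `?` returns None when the stack has fewer than two operands.
                let second = stack.pop()?;
                let first = stack.pop()?;
                match operator {
                    CalculatorInput::Add => first.checked_add(second)?,
                    CalculatorInput::Subtract => first.checked_sub(second)?,
                    CalculatorInput::Multiply => first.checked_mul(second)?,
                    CalculatorInput::Divide => first.checked_div(second)?,
                    // Already handled by the outer match arm.
                    CalculatorInput::Value(_) => unreachable!(),
                }
            }
        };
        stack.push(result);
    }
    // Exactly one value must remain.
    let top = stack.pop()?;
    if stack.is_empty() { Some(top) } else { None }
}
```

The final check pops first and then tests for emptiness, so an empty input and leftover operands both end in `None`.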
You get the same behavior with this simple example:
fn main() {
let mut stack = Vec::new();
let a = String::from("test");
stack.push(&a.clone());
//-------- ^
println!("{:?}", stack);
}
The fix is to push the clone itself instead of a borrow of it:
fn main() {
let mut stack = Vec::new();
let a = String::from("test");
stack.push(a.clone());
//-------- ^
println!("{:?}", stack);
}
The value should be pushed like this: stack.push(result.clone()); and the code changed as follows:
pub fn evaluate(inputs: &[CalculatorInput]) -> Option<i32> {
let mut stack: Vec<i32> = Vec::new();
//---------------- ^
for input in inputs {
match input {
CalculatorInput::Value(value) => {
stack.push(value.clone());
//----------------- ^
},
operator => {
if stack.len() < 2 {
return None;
}
let second = stack.pop().unwrap();
let first = stack.pop().unwrap();
let result = match operator {
CalculatorInput::Add => first + second,
CalculatorInput::Subtract => first - second,
CalculatorInput::Multiply => first * second,
CalculatorInput::Divide => first / second,
CalculatorInput::Value(_) => return None,
};
stack.push(result.clone());
//-^
}
}
}
if stack.len() != 1 {
None
} else {
Some(stack.pop().unwrap())
//------- ^
}
}

Rust String vs &str iterators

I'm trying to write a function which accepts a list of tokens. But I'm having problems making it general enough to handle two pretty similar calls:
let s = String::from("-abc -d --echo");
parse( s.split_ascii_whitespace() );
parse( std::env::args() );
String::split_ascii_whitespace() returns std::str::SplitAsciiWhitespace which implements Iterator<Item=&'a str>.
std::env::args() returns std::env::Args which implements Iterator<Item=String>.
Is there a way for me to write a function signature for parse that will accept both methods?
My solution right now requires duplicating function bodies:
fn main() {
let s = String::from("-abc -d --echo");
parse_args( s.split_ascii_whitespace() );
parse_env( std::env::args() );
}
fn parse_env<I: Iterator<Item=String>>(mut it: I) {
loop {
match it.next() {
None => return,
Some(s) => println!("{}",s),
}
}
}
fn parse_args<'a, I: Iterator<Item=&'a str>>(mut it: I) {
loop {
match it.next() {
None => return,
Some(s) => println!("{}",s),
}
}
}
If not possible, then some advice on how to use the traits so the functions can use the same name would be nice.
You can require the item type to be AsRef<str>, which will include both &str and String:
fn parse<I>(mut it: I)
where
I: Iterator,
I::Item: AsRef<str>,
{
loop {
match it.next() {
None => return,
Some(s) => println!("{}", s.as_ref()),
}
}
}
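The same bound works with `IntoIterator` and a `for` loop or iterator adapters, which tends to read more naturally than a manual `loop`/`match`. A minimal sketch, assuming the tokens should be collected rather than printed; the function name `collect_tokens` is hypothetical:

```rust
/// Accepts anything that iterates over string-like items (&str, String, ...).
fn collect_tokens<I>(it: I) -> Vec<String>
where
    I: IntoIterator,
    I::Item: AsRef<str>,
{
    it.into_iter()
        // `as_ref` gives a &str regardless of the concrete item type.
        .map(|s| s.as_ref().to_owned())
        .collect()
}
```

Both `collect_tokens(s.split_ascii_whitespace())` and `collect_tokens(std::env::args())` type-check against this signature, as does a plain `Vec<String>`.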
Depending on your use case, you could try:
fn main() {
let s = String::from("-abc -d --echo");
parse( s.split_ascii_whitespace() );
parse( std::env::args() );
}
fn parse<T: std::borrow::Borrow<str>, I: Iterator<Item=T>>(mut it: I) {
loop {
match it.next() {
None => return,
Some(s) => println!("{}",s.borrow()),
}
}
}
I used Borrow as a means to get to a &str, but your concrete use case may be served by other, possibly custom, traits.

How to avoid cloning a big integer in rust

I used the num::BigUint type to avoid integer overflows when calculating the factorial of a number.
However, I had to resort to using .clone() to pass rustc's borrow checker.
How can I refactor the factorial function to avoid cloning what could be large numbers many times?
use num::{BigUint, FromPrimitive, One};
fn main() {
for n in -2..33 {
let bign: Option<BigUint> = FromPrimitive::from_isize(n);
match bign {
Some(n) => println!("{}! = {}", n, factorial(n.clone())),
None => println!("Number must be non-negative: {}", n),
}
}
}
fn factorial(number: BigUint) -> BigUint {
if number < FromPrimitive::from_usize(2).unwrap() {
number
} else {
number.clone() * factorial(number - BigUint::one())
}
}
I tried to use a reference to BigUint in the function definition but got some errors saying that BigUint did not support references.
The first clone is easy to remove. You are trying to use n twice in the same expression, so don't use just one expression:
print!("{}! = ", n);
println!("{}", factorial(n));
is equivalent to println!("{}! = {}", n, factorial(n.clone())) but does not try to move n and use a reference to it at the same time.
The second clone can be removed by changing factorial not to be recursive:
fn factorial(mut number: BigUint) -> BigUint {
let mut result = BigUint::one();
let one = BigUint::one();
while number > one {
result *= &number;
number -= &one;
}
result
}
This might seem unidiomatic, however. There is a range function that you could use with for, but it uses clone internally, defeating the point.
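The clone-free accumulate-in-a-loop shape can be tried without any external crate. The sketch below deliberately swaps `BigUint` for `u128` with checked multiplication, which is an assumption made purely so the block is self-contained; it reports overflow as `None` instead of avoiding it, and the name `factorial_u128` is hypothetical.

```rust
/// Folds 1 * 2 * ... * n, returning None as soon as the product overflows u128.
fn factorial_u128(n: u32) -> Option<u128> {
    (1..=u128::from(n)).try_fold(1u128, |acc, i| acc.checked_mul(i))
}
```

The accumulator is moved into each step rather than cloned, which is the same property the iterative `BigUint` version relies on with `result *= &number`.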
I don't think taking a BigUint as the parameter makes sense for a factorial. A u32 should be enough:
use num::{BigUint, One};
fn main() {
for n in 0..42 {
println!("{}! = {}", n, factorial(n));
}
}
fn factorial_aux(accu: BigUint, i: u32) -> BigUint {
if i > 1 {
factorial_aux(accu * i, i - 1)
}
else {
accu
}
}
fn factorial(n: u32) -> BigUint {
factorial_aux(BigUint::one(), n)
}
Or if you really want to keep BigUint:
use num::{BigUint, FromPrimitive, One, Zero};
fn main() {
for i in (0..42).flat_map(|i| FromPrimitive::from_i32(i)) {
print!("{}! = ", i);
println!("{}", factorial(i));
}
}
fn factorial_aux(accu: BigUint, i: BigUint) -> BigUint {
if !i.is_one() {
factorial_aux(accu * &i, i - 1u32)
} else {
accu
}
}
fn factorial(n: BigUint) -> BigUint {
if !n.is_zero() {
factorial_aux(BigUint::one(), n)
} else {
BigUint::one()
}
}
Neither version does any cloning.
If you use ibig::UBig instead of BigUint, those clones will be free, because ibig is optimized not to allocate memory from the heap for numbers this small.

How do I return an Iterator that's generated by a function that takes &'a mut self (when self is created locally)?

Update: The title of the post has been updated, and the answer has been moved out of the question. The short answer is you can't. Please see my answer to this question.
I'm following an Error Handling blog post here (github for it is here), and I tried to make some modifications to the code so that the search function returns an Iterator instead of a Vec. This has been insanely difficult, and I'm stuck.
I've gotten up to this point:
fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &str)
-> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>,
FnMut(Result<Row, csv::Error>)
-> Option<Result<PopulationCount, csv::Error>>>,
CliError> {
let mut found = vec![];
let input: Box<io::Read> = match *file_path {
None => Box::new(io::stdin()),
Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
};
let mut rdr = csv::Reader::from_reader(input);
let closure = |row: Result<Row, csv::Error>| -> Option<Result<PopulationCount, csv::Error>> {
let row = match row {
Ok(row) => row,
Err(err) => return Some(Err(From::from(err))),
};
match row.population {
None => None,
Some(count) => if row.city == city {
Some(Ok(PopulationCount {
city: row.city,
country: row.country,
count: count,
}))
} else {
None
}
}
};
let found = rdr.decode::<Row>().filter_map(closure);
if !found.all(|row| match row {
Ok(_) => true,
_ => false,
}) {
Err(CliError::NotFound)
} else {
Ok(found)
}
}
with the following error from the compiler:
src/main.rs:97:1: 133:2 error: the trait `core::marker::Sized` is not implemented for the type `core::ops::FnMut(core::result::Result<Row, csv::Error>) -> core::option::Option<core::result::Result<PopulationCount, csv::Error>>` [E0277]
src/main.rs:97 fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &str) -> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>, FnMut(Result<Row, csv::Error>) -> Option<Result<PopulationCount, csv::Error>>>, CliError> {
src/main.rs:98 let mut found = vec![];
src/main.rs:99 let input: Box<io::Read> = match *file_path {
src/main.rs:100 None => Box::new(io::stdin()),
src/main.rs:101 Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
src/main.rs:102 };
...
src/main.rs:97:1: 133:2 note: `core::ops::FnMut(core::result::Result<Row, csv::Error>) -> core::option::Option<core::result::Result<PopulationCount, csv::Error>>` does not have a constant size known at compile-time
src/main.rs:97 fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &str) -> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>, FnMut(Result<Row, csv::Error>) -> Option<Result<PopulationCount, csv::Error>>>, CliError> {
src/main.rs:98 let mut found = vec![];
src/main.rs:99 let input: Box<io::Read> = match *file_path {
src/main.rs:100 None => Box::new(io::stdin()),
src/main.rs:101 Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
src/main.rs:102 };
...
error: aborting due to previous error
I've also tried this function definition:
fn search<'a, P: AsRef<Path>, F>(file_path: &Option<P>, city: &str)
-> Result<FilterMap<csv::reader::DecodedRecords<'a, Box<Read>, Row>, F>,
CliError>
where F: FnMut(Result<Row, csv::Error>)
-> Option<Result<PopulationCount, csv::Error>> {
with these errors from the compiler:
src/main.rs:131:12: 131:17 error: mismatched types:
expected `core::iter::FilterMap<csv::reader::DecodedRecords<'_, Box<std::io::Read>, Row>, F>`,
found `core::iter::FilterMap<csv::reader::DecodedRecords<'_, Box<std::io::Read>, Row>, [closure src/main.rs:105:19: 122:6]>`
(expected type parameter,
found closure) [E0308]
src/main.rs:131 Ok(found)
I can't Box the closure because then it won't be accepted by filter_map.
I then tried this out:
fn search<'a, P: AsRef<Path>>(file_path: &Option<P>, city: &'a str)
-> Result<(Box<Iterator<Item=Result<PopulationCount, csv::Error>> + 'a>, csv::Reader<Box<io::Read>>), CliError> {
let input: Box<io::Read> = match *file_path {
None => box io::stdin(),
Some(ref file_path) => box try!(fs::File::open(file_path)),
};
let mut rdr = csv::Reader::from_reader(input);
let mut found = rdr.decode::<Row>().filter_map(move |row| {
let row = match row {
Ok(row) => row,
Err(err) => return Some(Err(err)),
};
match row.population {
None => None,
Some(count) if row.city == city => {
Some(Ok(PopulationCount {
city: row.city,
country: row.country,
count: count,
}))
},
_ => None,
}
});
if found.size_hint().0 == 0 {
Err(CliError::NotFound)
} else {
Ok((box found, rdr))
}
}
fn main() {
let args: Args = Docopt::new(USAGE)
.and_then(|d| d.decode())
.unwrap_or_else(|err| err.exit());
match search(&args.arg_data_path, &args.arg_city) {
Err(CliError::NotFound) if args.flag_quiet => process::exit(1),
Err(err) => fatal!("{}", err),
Ok((pops, rdr)) => for pop in pops {
match pop {
Err(err) => panic!(err),
Ok(pop) => println!("{}, {}: {} - {:?}", pop.city, pop.country, pop.count, rdr.byte_offset()),
}
}
}
}
Which gives me this error:
src/main.rs:107:21: 107:24 error: `rdr` does not live long enough
src/main.rs:107 let mut found = rdr.decode::<Row>().filter_map(move |row| {
^~~
src/main.rs:100:117: 130:2 note: reference must be valid for the lifetime 'a as defined on the block at 100:116...
src/main.rs:100 -> Result<(Box<Iterator<Item=Result<PopulationCount, csv::Error>> + 'a>, csv::Reader<Box<io::Read>>), CliError> {
src/main.rs:101 let input: Box<io::Read> = match *file_path {
src/main.rs:102 None => box io::stdin(),
src/main.rs:103 Some(ref file_path) => box try!(fs::File::open(file_path)),
src/main.rs:104 };
src/main.rs:105
...
src/main.rs:106:51: 130:2 note: ...but borrowed value is only valid for the block suffix following statement 1 at 106:50
src/main.rs:106 let mut rdr = csv::Reader::from_reader(input);
src/main.rs:107 let mut found = rdr.decode::<Row>().filter_map(move |row| {
src/main.rs:108 let row = match row {
src/main.rs:109 Ok(row) => row,
src/main.rs:110 Err(err) => return Some(Err(err)),
src/main.rs:111 };
...
error: aborting due to previous error
Have I designed something wrong, or am I taking the wrong approach? Am I missing something really simple and stupid? I'm not sure where to go from here.
Returning iterators is possible, but it comes with some restrictions.
To demonstrate it's possible, here are two examples: (A) with an explicit iterator type and (B) using boxing.
use std::iter::FilterMap;
fn is_even(elt: i32) -> Option<i32> {
if elt % 2 == 0 {
Some(elt)
} else { None }
}
/// (A)
pub fn evens<I: IntoIterator<Item=i32>>(iter: I)
-> FilterMap<I::IntoIter, fn(I::Item) -> Option<I::Item>>
{
iter.into_iter().filter_map(is_even)
}
/// (B)
pub fn cumulative_sums<'a, I>(iter: I) -> Box<Iterator<Item=i32> + 'a>
where I: IntoIterator<Item=i32>,
I::IntoIter: 'a,
{
Box::new(iter.into_iter().scan(0, |acc, x| {
*acc += x;
Some(*acc)
}))
}
fn main() {
// The output is:
// 0 is even, 10 is even,
// 1, 3, 6, 10,
for even in evens(vec![0, 3, 7, 10]) {
print!("{} is even, ", even);
}
println!("");
for cs in cumulative_sums(1..5) {
print!("{}, ", cs);
}
println!("");
}
You experienced a problem with (A) -- explicit type! Unboxed closures, that we get from regular lambda expressions with |a, b, c| .. syntax, have unique anonymous types. Functions require explicit return types, so that doesn't work here.
Some solutions for returning closures:
Use a function pointer fn() as in example (A). Often you don't need a closure environment anyway.
Box the closure. This is reasonable, even if the iterators don't support calling it at the moment. Not your fault.
Box the iterator
Return a custom iterator struct. Requires some boilerplate.
You can see that in example (B) we have to be quite careful with lifetimes. It says that the return value is Box<Iterator<Item=i32> + 'a>, what is this 'a? This is the least lifetime required of anything inside the box! We also put the 'a bound on I::IntoIter -- this ensures we can put that inside the box.
If you just say Box<Iterator<Item=i32>> it will assume 'static.
We have to explicitly declare the lifetimes of the contents of our box. Just to be safe.
This is actually the fundamental problem with your function. You have this: DecodedRecords<'a, Box<Read>, Row>, F>
See that, an 'a! This type borrows something. The problem is it doesn't borrow it from the inputs. There are no 'a on the inputs.
You'll realize that it borrows from a value you create during the function, and that value's lifespan ends when the function returns. We cannot return DecodedRecords<'a> from the function, because it wants to borrow a local variable.
Where to go from here? My easiest answer would be to perform the same split that csv does: one part (struct or value) that owns the reader, and one part (struct or value) that is the iterator and borrows from the reader.
Maybe the csv crate has an owning decoder that takes ownership of the reader it is processing. In that case you can use that to dispel the borrowing trouble.
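Examples (A) and (B) above are written in the Rust of their time. In current Rust, the two approaches are usually spelled with `impl Trait` in return position and `dyn` for trait objects. The following sketch restates both under that assumption; it is not part of the original answer.

```rust
/// (A') `impl Trait` hides the concrete adapter type, so a closure works here.
pub fn evens<I: IntoIterator<Item = i32>>(iter: I) -> impl Iterator<Item = i32> {
    iter.into_iter().filter(|elt| elt % 2 == 0)
}

/// (B') The boxed trait object, with the explicit `dyn` keyword.
pub fn cumulative_sums<'a, I>(iter: I) -> Box<dyn Iterator<Item = i32> + 'a>
where
    I: IntoIterator<Item = i32>,
    I::IntoIter: 'a,
{
    Box::new(iter.into_iter().scan(0, |acc, x| {
        *acc += x;
        Some(*acc)
    }))
}
```

The lifetime reasoning is unchanged: the `'a` bound on the box still states how long its contents may be kept around, and neither form allows returning an iterator that borrows from a local variable.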
This answer is based on @bluss's answer + help from #rust on irc.mozilla.org
One issue that's not obvious from the code, and which was causing the final error displayed just above, has to do with the definition of csv::Reader::decode (see its source). It takes &'a mut self; the explanation of this problem is covered in this answer. This essentially causes the lifetime of the reader to be bound to the block it's called in. The way to fix this is to split the function in half (since I can't control the function definition, as recommended in the previous answer link). I needed a lifetime on the reader that was valid within the main function, so the reader could then be passed down into the search function. See the code below (it could definitely be cleaned up more):
fn population_count<'a, I>(iter: I, city: &'a str)
-> Box<Iterator<Item=Result<PopulationCount,csv::Error>> + 'a>
where I: IntoIterator<Item=Result<Row,csv::Error>>,
I::IntoIter: 'a,
{
Box::new(iter.into_iter().filter_map(move |row| {
let row = match row {
Ok(row) => row,
Err(err) => return Some(Err(err)),
};
match row.population {
None => None,
Some(count) if row.city == city => {
Some(Ok(PopulationCount {
city: row.city,
country: row.country,
count: count,
}))
},
_ => None,
}
}))
}
fn get_reader<P: AsRef<Path>>(file_path: &Option<P>)
-> Result<csv::Reader<Box<io::Read>>, CliError>
{
let input: Box<io::Read> = match *file_path {
None => Box::new(io::stdin()),
Some(ref file_path) => Box::new(try!(fs::File::open(file_path))),
};
Ok(csv::Reader::from_reader(input))
}
fn search<'a>(reader: &'a mut csv::Reader<Box<io::Read>>, city: &'a str)
-> Box<Iterator<Item=Result<PopulationCount, csv::Error>> + 'a>
{
population_count(reader.decode::<Row>(), city)
}
fn main() {
let args: Args = Docopt::new(USAGE)
.and_then(|d| d.decode())
.unwrap_or_else(|err| err.exit());
let reader = get_reader(&args.arg_data_path);
let mut reader = match reader {
Err(err) => fatal!("{}", err),
Ok(reader) => reader,
};
let populations = search(&mut reader, &args.arg_city);
let mut found = false;
for pop in populations {
found = true;
match pop {
Err(err) => fatal!("fatal !! {}", err),
Ok(pop) => println!("{}, {}: {}", pop.city, pop.country, pop.count),
}
}
if !(found || args.flag_quiet) {
fatal!("{}", CliError::NotFound);
}
}
I've learned a lot trying to get this to work, and have much more appreciation for the compiler errors. It's now clear that had this been C, the last error above could actually have caused segfaults, which would have been much harder to debug. I've also realized that converting from a pre-computed vec to an iterator requires more involved thinking about when the memory comes in and out of scope; I can't just change a few function calls and return types and call it a day.
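The owner/borrower split can be reproduced without the csv crate. The sketch below is a stand-in written for illustration only: `Reader`, `Row`, and their fields are hypothetical types, not the csv API. The caller owns the reader, and `search` only returns an iterator that borrows from it for `'a`.

```rust
/// Hypothetical record type for this sketch.
struct Row {
    city: String,
    population: Option<u64>,
}

/// Stand-in for a reader that owns its data source.
struct Reader {
    rows: std::vec::IntoIter<Row>,
}

impl Reader {
    fn new(rows: Vec<Row>) -> Self {
        Reader { rows: rows.into_iter() }
    }

    /// Like `decode`, this borrows the reader mutably for as long as the
    /// returned iterator is alive.
    fn decode<'a>(&'a mut self) -> impl Iterator<Item = Row> + 'a {
        self.rows.by_ref()
    }
}

/// The reader comes in from the caller, so the returned iterator can borrow it.
fn search<'a>(reader: &'a mut Reader, city: &'a str) -> impl Iterator<Item = u64> + 'a {
    reader
        .decode()
        .filter_map(move |row| if row.city == city { row.population } else { None })
}
```

Creating the `Reader` inside `search` and returning the iterator from there would be rejected for the same reason as in the question: the iterator would outlive the value it borrows from.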
