How to preserve unsized data created in a closure

How to preserve unsized data created in a closure - rust

I'm trying to create unsized data in a callback, and store the reference in a vector. Is this possible in any way?
fn main() {
let mut results = vec![];
let mut callback = |val: &str| results.push(val.clone());
// ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^ `val` escapes the closure body here
for s in ["a","b","c"] {
callback(s);
};
}
Okay, so in working this I found a solution, but I have questions about it.
use std::clone::Clone;
pub fn main() {
let mut results = vec![];
let mut callback = |val: &str| results.push(Clone::clone(&val.to_owned()));
for s in ["a","b","c"] {
callback(s);
};
}
Why does this work but neither val.clone() or &(val.clone()) work?
Why doesn't the complier complain about not knowing the size of results beforehand?
Does this work for any unsized type?
If I call callback(s) twice it works as expected. Why doesn't to_owned() consume the resource?
Is there anything else important here that I (or someone else at my level who might be reading this) am missing?

You've run into bunch misconceptions.
When you call clone on &str, you get another &str. It is useless, as long as any reference implements Copy.
You're trying to insert reference to str with non-static lifetime into a vector with static lifetime. Ask yourself: what would happen, if the object behind the reference will die before this vector? You'll get a vector of dangling references => undefined behavior.
This code will work:
fn main() {
let mut results = vec![]; // results has type `Vec<&'static str>`
let mut callback = |val: &'static str| results.push(val);
for s in ["a","b","c"] {
callback(s);
};
}
You don't need to use std::clone::Clone as long as it's already in std::prelude and accessible in every scope.
When you call to_owned on &str, you actually create a String which has 'static lifetime. So you can insert any String into a Vec. But it is not a reference to the original str anymore. Also you don't need to call clone here.
So this will work too:
pub fn main() {
let mut results = vec![]; // results has type `Vec<String>`
let mut callback = |val: &str| results.push(val.to_owned());
for s in ["a","b","c"] {
callback(s);
};
}

Related

"expected bound lifetime parameter, found concrete lifetime" on StreamExt .scan() method

I am using async-tungstenite to listen to a websocket, and async-std's StreamExt to operate on the resulting stream.
I want to use a HashMap to accumulate the latest Ticker values from the websocket. These Ticker values will be looked up later to be used in calculations. I'm using the symbol (String) value of the Ticker struct as the key for the HashMap. I'm using the .scan StreamExt method to perform the accumulation.
However, I get a compilation error related to lifetimes. Here's some stripped-down code:
let tickers = HashMap::new();
let mut stream = ws.
.scan(tickers, accumulate_tickers);
while let msg = stream.next().await {
println!("{:?}", msg)
}
...and the accumulate_tickers function:
fn accumulate_tickers(tps: &mut HashMap<String, Ticker>, bt: Ticker) -> Option<&HashMap<String, Ticker>> {
tps.insert((*bt.symbol).to_string(), bt);
Some(tps)
}
The compilation error I receive is as follows:
error[E0271]: type mismatch resolving `for<'r> <for<'s> fn(&'s mut std::collections::HashMap<std::string::String, ws_async::model::websocket::Ticker>, ws_async::model::websocket::Ticker) -> std::option::Option<&'s std::collections::HashMap<std::string::String, ws_async::model::websocket::Ticker>> {accumulate_tickers} as std::ops::FnOnce<(&'r mut std::collections::HashMap<std::string::String, ws_async::model::websocket::Ticker>, ws_async::model::websocket::Ticker)>>::Output == std::option::Option<_>`
--> examples/async_std-ws.rs:64:4
|
64 | .scan(tickers, accumulate_tickers);
| ^^^^ expected bound lifetime parameter, found concrete lifetime
I'm unaware of a way to provide a lifetime parameter to the scan method.
I wonder whether the issue may be related to the fact that I modify the HashMap and then try to return it (is it a move issue?). How could I resolve this, or at least narrow down the cause?

I was able to get this working. Working my way through the compiler errors, I ended up with this signature for the accumulate_tickers function:
fn accumulate_tickers<'a>(tps: &'a mut &'static HashMap<String, Ticker>, bt: Ticker) -> Option<&'static HashMap<String, Ticker>>
I do want the accumulator HashMap to have a static lifetime so that makes sense. tps: &'a mut &'static HashMap... does look a bit strange, but it works.
Then, this issue was was that tickers also had to have a static lifetime (it's the initial value for the accumulator. I tried declaring it as static outside the main but it wouldn't let me set it to the result of a function - HashMap::new().
I then turned to lazy_static which allows me to create a static value which does just that:
lazy_static! {
static ref tickers: HashMap<String, Ticker> = HashMap::new();
}
This gave me a HashMap accumulator that had a static lifetime. However, like normal static values declared in the root scope, it was immutable. To fix that, I read some hints from the lazy_static team and then found https://pastebin.com/YES8dsHH. This showed me how to make my static accumulator mutable by wrapping it in Arc<Mutex<_>>.
lazy_static! {
// from https://pastebin.com/YES8dsHH
static ref tickers: Arc<Mutex<HashMap<String, Ticker>>> = {
let mut ts = HashMap::new();
Arc::new(Mutex::new(ts))
};
}
This does mean that I have to retrieve the accumulator from the Mutex (and lock it) before reading or modifying it but, again, it works.
Putting it all together, the stripped-down code now looks like this:
#[macro_use]
extern crate lazy_static;
lazy_static! {
// from https://pastebin.com/YES8dsHH
static ref tickers: Arc<Mutex<HashMap<String, Ticker>>> = {
let mut ts = HashMap::new();
Arc::new(Mutex::new(ts))
};
}
// SNIP
// Inside main()
let mut ticks = ws
.scan(&tickers, accumulate_tickers);
while let Some(msg) = ticks.next().await {
println!("{:?}", msg.lock().unwrap());
}
// SNIP
fn accumulate_tickers<'a>(tps: &'a mut &'static tickers, bt: Ticker) -> Option<&'static tickers> {
tps.lock().unwrap().insert((*bt.symbol).to_string(), bt);
Some(tps)
}
I'd be happy to hear suggestions for ways in which this could be made simpler or more elegant.

Discerning lifetimes understanding the move keyword

I've been playing around with AudioUnit via Rust and the Rust library coreaudio-rs. Their example seems to work well:
extern crate coreaudio;
use coreaudio::audio_unit::{AudioUnit, IOType};
use coreaudio::audio_unit::render_callback::{self, data};
use std::f32::consts::PI;
struct Iter {
value: f32,
}
impl Iterator for Iter {
type Item = [f32; 2];
fn next(&mut self) -> Option<[f32; 2]> {
self.value += 440.0 / 44_100.0;
let amp = (self.value * PI * 2.0).sin() as f32 * 0.15;
Some([amp, amp])
}
}
fn main() {
run().unwrap()
}
fn run() -> Result<(), coreaudio::Error> {
// 440hz sine wave generator.
let mut samples = Iter { value: 0.0 };
//let buf: Vec<[f32; 2]> = vec![[0.0, 0.0]];
//let mut samples = buf.iter();
// Construct an Output audio unit that delivers audio to the default output device.
let mut audio_unit = try!(AudioUnit::new(IOType::DefaultOutput));
// Q: What is this type?
let callback = move |args| {
let Args { num_frames, mut data, .. } = args;
for i in 0..num_frames {
let sample = samples.next().unwrap();
for (channel_idx, channel) in data.channels_mut().enumerate() {
channel[i] = sample[channel_idx];
}
}
Ok(())
};
type Args = render_callback::Args<data::NonInterleaved<f32>>;
try!(audio_unit.set_render_callback(callback));
try!(audio_unit.start());
std::thread::sleep(std::time::Duration::from_millis(30000));
Ok(())
}
However, changing it up a little bit to load via a buffer doesn't work as well:
extern crate coreaudio;
use coreaudio::audio_unit::{AudioUnit, IOType};
use coreaudio::audio_unit::render_callback::{self, data};
fn main() {
run().unwrap()
}
fn run() -> Result<(), coreaudio::Error> {
let buf: Vec<[f32; 2]> = vec![[0.0, 0.0]];
let mut samples = buf.iter();
// Construct an Output audio unit that delivers audio to the default output device.
let mut audio_unit = try!(AudioUnit::new(IOType::DefaultOutput));
// Q: What is this type?
let callback = move |args| {
let Args { num_frames, mut data, .. } = args;
for i in 0..num_frames {
let sample = samples.next().unwrap();
for (channel_idx, channel) in data.channels_mut().enumerate() {
channel[i] = sample[channel_idx];
}
}
Ok(())
};
type Args = render_callback::Args<data::NonInterleaved<f32>>;
try!(audio_unit.set_render_callback(callback));
try!(audio_unit.start());
std::thread::sleep(std::time::Duration::from_millis(30000));
Ok(())
}
It says, correctly so, that buf only lives until the end of run and does not live long enough for the audio unit—which makes sense, because "borrowed value must be valid for the static lifetime...".
In any case, that doesn't bother me; I can modify the iterator to load and read from the buffer just fine. However, it does raise some questions:
Why does the Iter { value: 0.0 } have the 'static lifetime?
If it doesn't have the 'static lifetime, why does it say the borrowed value must be valid for the 'static lifetime?
If it does have the 'static lifetime, why? It seems like it would be on the heap and closed on by callback.
I understand that the move keyword allows moving inside the closure, which doesn't help me understand why it interacts with lifetimes. Why can't it move the buffer? Do I have to move both the buffer and the iterator into the closure? How would I do that?
Over all this, how do I figure out the expected lifetime without trying to be a compiler myself? It doesn't seem like guessing and compiling is always a straightforward method to resolving these issues.

Why does the Iter { value: 0.0 } have the 'static lifetime?
It doesn't; only references have lifetimes.
why does it say the borrowed value must be valid for the 'static lifetime
how do I figure out the expected lifetime without trying to be a compiler myself
Read the documentation; it tells you the restriction:
fn set_render_callback<F, D>(&mut self, f: F) -> Result<(), Error>
where
F: FnMut(Args<D>) -> Result<(), ()> + 'static, // <====
D: Data
This restriction means that any references inside of F must live at least as long as the 'static lifetime. Having no references is also acceptable.
All type and lifetime restrictions are expressed at the function boundary — this is a hard rule of Rust.
I understand that the move keyword allows moving inside the closure, which doesn't help me understand why it interacts with lifetimes.
The only thing that the move keyword does is force every variable directly used in the closure to be moved into the closure. Otherwise, the compiler tries to be conservative and move in references/mutable references/values based on the usage inside the closure.
Why can't it move the buffer?
The variable buf is never used inside the closure.
Do I have to move both the buffer and the iterator into the closure? How would I do that?
By creating the iterator inside the closure. Now buf is used inside the closure and will be moved:
let callback = move |args| {
let mut samples = buf.iter();
// ...
}
It doesn't seem like guessing and compiling is always a straightforward method to resolving these issues.
Sometimes it is, and sometimes you have to think about why you believe the code to be correct and why the compiler states it isn't and come to an understanding.

Mutating a variable in the body of a closure

I'm writing a bunch of assertions that all involve popping a value from a list.
Coming from a Scala background, I naturally did this:
let mut list = List::new();
let assert_pop = |expected| assert_eq!(list.pop(), expected);
So that I could just write assert_pop(None) or assert_pop(Some(3)) instead of having to write assert_eq!(list.pop(), None) or assert_eq!(list.pop(), Some(3)) every time.
Of course the borrow checker doesn't like this one bit because the closure essentially needs to borrow the value for an undisclosed amount of time, while the rest of my code goes around mutating, thus violating the rule of "no aliasing if you're mutating".
The question is: is there a way to get around this? Do I have to write a macro, or is there a funky memory-safe way I can get around this?
Note that I know that I can just define the closure like this:
let_assert_pop = |lst: &mut List, expected| assert_eq!(lst.pop(), expected);
But that would be less DRY, as I'd have to pass in a &mut list as the first argument at every call.

They key is to define the closure as mut, since it needs a mutable reference.
This works:
let mut v = vec![1, 2];
let mut assert_pop = |expected| assert_eq!(v.pop(), expected);
assert_pop(Some(2));
assert_pop(Some(1));
assert_pop(None);
Note that the the pop closure borrows mutably, so you if you want to use the list afterwards, you have to scope it:
let mut v = vec![1,2];
{
let mut assert_pop = |expected| assert_eq!(v.pop(), expected);
assert_pop(Some(2));
v.push(33); // ERROR: v is borrowed mutably...
}
v.push(33); // Works now, since pop is out of scope.

Instead of directly answering your question (which is well-enough answered already), I'll instead address your other points:
just write assert_pop(None) or assert_pop(Some(3))
a memory-safe way solution
don't pass in a &mut list
To solve all that, don't use a closure, just make a new type:
type List<T> = Vec<T>;
struct Thing<T>(List<T>);
impl<T> Thing<T> {
fn assert_pop(&mut self, expected: Option<T>)
where T: PartialEq + std::fmt::Debug,
{
assert_eq!(self.0.pop(), expected);
}
}
fn main() {
let list = List::new();
let mut list = Thing(list);
list.0.push(1);
list.assert_pop(Some(1));
list.assert_pop(None);
// Take it back if we need to
let _list = list.0;
}

Why does this variable definition imply static lifetime?

I'm trying to execute a function on chunks of a vector and then send the result back using the message passing library.
However, I get a strange error about the lifetime of the vector that isn't even participating in the thread operations:
src/lib.rs:153:27: 154:25 error: borrowed value does not live long enough
src/lib.rs:153 let extended_segments = (segment_size..max_val)
error: src/lib.rs:154 .collect::<Vec<_>>()borrowed value does not live long enough
note: reference must be valid for the static lifetime...:153
let extended_segments = (segment_size..max_val)
src/lib.rs:153:3: 155:27: 154 .collect::<Vec<_>>()
note: but borrowed value is only valid for the statement at 153:2:
reference must be valid for the static lifetime...
src/lib.rs:
let extended_segments = (segment_size..max_val)
consider using a `let` binding to increase its lifetime
I tried moving around the iterator and adding lifetimes to different places, but I couldn't get the checker to pass and still stay on type.
The offending code is below, based on the concurrency chapter in the Rust book. (Complete code is at github.)
use std::sync::mpsc;
use std::thread;
fn sieve_segment(a: &[usize], b: &[usize]) -> Vec<usize> {
vec![]
}
fn eratosthenes_sieve(val: usize) -> Vec<usize> {
vec![]
}
pub fn segmented_sieve_parallel(max_val: usize, mut segment_size: usize) -> Vec<usize> {
if max_val <= ((2 as i64).pow(16) as usize) {
// early return if the highest value is small enough (empirical)
return eratosthenes_sieve(max_val);
}
if segment_size > ((max_val as f64).sqrt() as usize) {
segment_size = (max_val as f64).sqrt() as usize;
println!("Segment size is larger than √{}. Reducing to {} to keep resource use down.",
max_val,
segment_size);
}
let small_primes = eratosthenes_sieve((max_val as f64).sqrt() as usize);
let mut big_primes = small_primes.clone();
let (tx, rx): (mpsc::Sender<Vec<usize>>, mpsc::Receiver<Vec<usize>>) = mpsc::channel();
let extended_segments = (segment_size..max_val)
.collect::<Vec<_>>()
.chunks(segment_size);
for this_segment in extended_segments.clone() {
let small_primes = small_primes.clone();
let tx = tx.clone();
thread::spawn(move || {
let sieved_segment = sieve_segment(&small_primes, this_segment);
tx.send(sieved_segment).unwrap();
});
}
for _ in 1..extended_segments.count() {
big_primes.extend(&rx.recv().unwrap());
}
big_primes
}
fn main() {}
How do I understand and avoid this error? I'm not sure how to make the lifetime of the thread closure static as in this question and still have the function be reusable (i.e., not main()). I'm not sure how to "consume all things that come into [the closure]" as mentioned in this question. And I'm not sure where to insert .map(|s| s.into()) to ensure that all references become moves, nor am I sure I want to.

When trying to reproduce a problem, I'd encourage you to create a MCVE by removing all irrelevant code. In this case, something like this seems to produce the same error:
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo = (segment_size..max_val)
.collect::<Vec<_>>()
.chunks(segment_size);
}
fn main() {}
Let's break that down:
Create an iterator between numbers.
Collect all of them into a Vec<usize>.
Return an iterator that contains references to the vector.
Since the vector isn't bound to any variable, it's dropped at the end of the statement. This would leave the iterator pointing to an invalid region of memory, so that's disallowed.
Check out the definition of slice::chunks:
fn chunks(&self, size: usize) -> Chunks<T>
pub struct Chunks<'a, T> where T: 'a {
// some fields omitted
}
The lifetime marker 'a lets you know that the iterator contains a reference to something. Lifetime elision has removed the 'a from the function, which looks like this, expanded:
fn chunks<'a>(&'a self, size: usize) -> Chunks<'a, T>
Check out this line of the error message:
help: consider using a let binding to increase its lifetime
You can follow that as such:
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo = (segment_size..max_val)
.collect::<Vec<_>>();
let bar = foo.chunks(segment_size);
}
fn main() {}
Although I'd write it as
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo: Vec<_> = (segment_size..max_val).collect();
let bar = foo.chunks(segment_size);
}
fn main() {}
Re-inserting this code back into your original problem won't solve the problem, but it will be much easier to understand. That's because you are attempting to pass a reference to thread::spawn, which may outlive the current thread. Thus, everything passed to thread::spawn must have the 'static lifetime. There are tons of questions that detail why that must be prevented and a litany of solutions, including scoped threads and cloning the vector.
Cloning the vector is the easiest, but potentially inefficient:
for this_segment in extended_segments.clone() {
let this_segment = this_segment.to_vec();
// ...
}

Why does HashMap have iter_mut() but HashSet doesn't?

What is the design rationale for supplying an iter_mut function for HashMap but not HashSet in Rust?
Would it be a faux pas to roll one's own (assuming that can even be done)?
Having one could alleviate situations that give rise to
previous borrow of X occurs here; the immutable borrow prevents
subsequent moves or mutable borrows of X until the borrow ends
Example
An extremely convoluted example (Gist) that does not show-case why the parameter passing is the way that it is. Has a short comment explaining the pain-point:
use std::collections::HashSet;
fn derp(v: i32, unprocessed: &mut HashSet<i32>) {
if unprocessed.contains(&v) {
// Pretend that v has been processed
unprocessed.remove(&v);
}
}
fn herp(v: i32) {
let mut unprocessed: HashSet<i32> = HashSet::new();
unprocessed.insert(v);
// I need to iterate over the unprocessed values
while let Some(u) = unprocessed.iter().next() {
// And them pass them mutably to another function
// as I will process the values inside derp and
// remove them from the set.
//
// This is an extremely convoluted example but
// I need for derp to be a separate function
// as I will employ recursion there, as it is
// much more succinct than an iterative version.
derp(*u, &mut unprocessed);
}
}
fn main() {
println!("Hello, world!");
herp(10);
}
The statement
while let Some(u) = unprocessed.iter().next() {
is an immutable borrow, hence
derp(*u, &mut unprocessed);
is impossible as unprocessed cannot be borrowed mutably. The immutable borrow does not end until the end of the while-loop.
I have tried to use this as reference and essentially ended up with trying to fool the borrow checker through various permutations of assignments, enclosing braces, but due to the coupling of the intended expressions the problem remains.

You have to think about what HashSet actually is. The IterMut that you get from HashMap::iter_mut() is only mutable on the value part: (&key, &mut val), ((&'a K, &'a mut V))
HashSet is basically a HashMap<T, ()>, so the actual values are the keys, and if you would modify the keys the hash of them would have to be updated or you get an invalid HashMap.

If your HashSet contains a Copy type, such as i32, you can work on a copy of the value to release the borrow on the HashSet early. To do this, you need to eliminate all borrows from the bindings in the while let expression. In your original code, u is of type &i32, and it keeps borrowing from unprocessed until the end of the loop. If we change the pattern to Some(&u), then u is of type i32, which doesn't borrow from anything, so we're free to use unprocessed as we like.
fn herp(v: i32) {
let mut unprocessed: HashSet<i32> = HashSet::new();
unprocessed.insert(v);
while let Some(&u) = unprocessed.iter().next() {
derp(u, &mut unprocessed);
}
}
If the type is not Copy or is too expensive to copy/clone, you can wrap them in Rc or Arc, and clone them as you iterate on them using cloned() (cloning an Rc or Arc doesn't clone the underlying value, it just clones the Rc pointer and increments the reference counter).
use std::collections::HashSet;
use std::rc::Rc;
fn derp(v: &i32, unprocessed: &mut HashSet<Rc<i32>>) {
if unprocessed.contains(v) {
unprocessed.remove(v);
}
}
fn herp(v: Rc<i32>) {
let mut unprocessed: HashSet<Rc<i32>> = HashSet::new();
unprocessed.insert(v);
while let Some(u) = unprocessed.iter().cloned().next() {
// If you don't use u afterwards,
// you could also pass if by value to derp.
derp(&u, &mut unprocessed);
}
}
fn main() {
println!("Hello, world!");
herp(Rc::new(10));
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to preserve unsized data created in a closure - rust

Related

"expected bound lifetime parameter, found concrete lifetime" on StreamExt .scan() method

Discerning lifetimes understanding the move keyword

Mutating a variable in the body of a closure

Why does this variable definition imply static lifetime?

Why does HashMap have iter_mut() but HashSet doesn't?

Categories

Resources