"Popping" a value from a HashSet - rust

I can't seem to find a way to pop a (random) value from a HashSet. Inspired by other code samples, I wrote the following:
my_set.iter().next().map(|i| my_set.take(i).unwrap())
That is: get an iterator over the set's values, take the first value (a reference) from it, and then call my_set.take() with that reference to get the value itself (not the reference) and remove it from the set.
This doesn't compile due to:
error[E0500]: closure requires unique access to `my_set` but it is already borrowed
|
32 | my_set.iter().next().map(|i| my_set.take(i).unwrap())
| ------ --- ^^^ ------ second borrow occurs due to use of `my_set` in closure
| | | |
| | | closure construction occurs here
| | first borrow later used by call
| borrow occurs here
I've tried many, many variations on this, but they all fail because an immutable borrow is still live when the set is borrowed as mutable (error E0502).
Can anyone recommend a way to rewrite the above?

If you're okay with cloning the item removed you can do:
let elem = set.iter().next().unwrap().clone();
set.remove(&elem);
Or with a guard against the set being empty:
if let Some(elem) = set.iter().next().cloned() {
set.remove(&elem);
}
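The guarded version can be wrapped in a small reusable helper. The following is only a sketch, assuming the element type implements Clone; the name pop_arbitrary is hypothetical and not part of the standard library. The clone ends the borrow held by the iterator, so the set can then be mutated, and HashSet::take hands back the stored value.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: removes and returns one arbitrary element,
/// or `None` if the set is empty. Costs one clone of the element.
fn pop_arbitrary<T: Clone + Eq + Hash>(set: &mut HashSet<T>) -> Option<T> {
    // `cloned()` yields an owned value, so the shared borrow taken by
    // `iter()` ends at the end of this statement.
    let key = set.iter().next().cloned()?;
    // With no outstanding borrow, the set can now be mutated.
    set.take(&key)
}
```

Which element comes out depends on the set's internal iteration order, so callers should not rely on any particular one.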

This feels hacky, but you can use HashSet::retain with an impure function:
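use std::mem;
let mut flag = false;
// The closure returns false only on its first call, so exactly one element is dropped.
my_set.retain(|_| mem::replace(&mut flag, true));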
The element removed is truly arbitrary: I ran the code on the playground a few times and obtained different results.
As interjay mentioned in a comment, this approach iterates over the entire set just to remove one value, so the cost is generally more significant than cloning one value. Therefore, only use this workaround when the element really cannot be cloned.
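As a self-contained illustration of the retain trick, here is a minimal sketch; remove_arbitrary is a hypothetical name. Note that this variant only removes an element: the removed value is dropped rather than returned, and every element of the set is still visited once.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::mem;

/// Hypothetical helper: removes one arbitrary element without requiring
/// `T: Clone`. The removed value is dropped, not returned.
fn remove_arbitrary<T: Eq + Hash>(set: &mut HashSet<T>) {
    let mut seen_first = false;
    // `mem::replace` returns the previous value of the flag: `false` for the
    // first element visited (so it is removed) and `true` for all the others.
    set.retain(|_| mem::replace(&mut seen_first, true));
}
```

On an empty set the closure is never called, so the call is a no-op.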

Related

Why the order of computing operands is direct and not reverse? (Swapping in a Vec without a local variable) [duplicate]

This question already has an answer here:
Why is indexing a mutable vector based on its len() considered simultaneous borrowing?
(1 answer)
All I want to do is swap the first and the last elements of a Vec.
So I wrote this:
use std::io::BufRead; // needed for `lines()` on the locked stdin handle

// getting a vector of integers from stdin:
let mut ranks = std::io::stdin()
    .lock()
    .lines()
    .next()
    .unwrap()
    .unwrap()
    .trim()
    .split_whitespace()
    .map(|s| s.parse::<usize>().unwrap())
    .collect::<Vec<usize>>();
// ...
// pushing something to ranks
// ...
ranks.swap(0, ranks.len() - 1); // <------ TRYING TO SWAP
And of course, it doesn't work, because
error[E0502]: cannot borrow `ranks` as immutable because it is also borrowed as mutable
--> src\main.rs:4:19
|
4 | ranks.swap(0, ranks.len() - 1);
| --------------^^^^^^^^^^^-----
| | | |
| | | immutable borrow occurs here
| | mutable borrow later used by call
| mutable borrow occurs here
|
help: try adding a local storing this argument...
--> src\main.rs:4:19
|
4 | ranks.swap(0, ranks.len() - 1);
| ^^^^^^^^^^^
help: ...and then using that local as the argument to this call
Following the advice the compiler gave me, I came up with the following:
let last = ranks.len() - 1;
ranks.swap(0, last);
...which looks terrible.
So the question is: why do I need to create a local variable to store the vector's length if I want to pass it to the method? Of course, it is because of the borrowing rules: ranks is borrowed as mutable when I call swap, so I can't use ranks.len() anymore.
But isn't it reasonable to suppose that the values of the parameters will be calculated before the method starts to do anything, and before it can somehow change the content of the vector or its length?
Basically, the order of evaluating the parameters of a function is direct (left to right) in Rust, which creates many obstacles to writing clean code for me. So I would like to ask: why was the order chosen to be direct? If it were reversed, things would be much easier to express. For instance, the above piece of code could be rewritten as:
ranks.swap(0, ranks.len() - 1);
...which (I think you would agree) is much more readable and cleaner.
Also, it is strange that code that is similar in terms of borrowing compiles successfully:
let mut vec = vec![1, 2, 3]; // creating vector
vec.push(*vec.last().unwrap()); // DOUBLE BORROWING HERE!!!
// ^ ^
// | |
// | ----- immutable
// ----- mutable
println!("{:?}", vec);
So what the hell is going on? Double standards, aren't they?
I would like to know the answer. Thank you for the explanation.
but isn't it reasonable to suppose that the values of parameters will be calculated before the method starts to do anything and before it somehow can change the content of vector?
...are you sure?
Evaluation order of operands
The following list of expressions all evaluate their operands the same way, as described after the list. Other expressions either don't take operands or evaluate them conditionally as described on their respective pages.
...
Call expression
Method call expression
...
The operands of these expressions are evaluated prior to applying the effects of the expression. Expressions taking multiple operands are evaluated left to right as written in the source code.
So, self is evaluated before ranks.len() - 1. And self is actually &mut self because of auto-ref. So you first take a mutable reference to the vector, then call len() on it, which requires borrowing it immutably while it is already borrowed mutably!
Now, this sometimes works (for example, if you call push() instead of swap()). But it works by magic (called two-phase borrows, which split borrowing into creating the reference and activating it), and this magic doesn't apply here. I think the reason is that swap() is defined for slices and not Vec, and thus we need to go through Deref (here DerefMut), but I'm not totally sure.
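To make the contrast concrete, here is a small sketch of two ways to write the swap that the borrow checker accepts, next to the push() call from the question. The function names are hypothetical. The slice-pattern variant is an alternative not mentioned above: it binds both ends mutably in one borrow, so no index arithmetic is needed at all.

```rust
/// Computes the index first, so the shared borrow taken by `len()` has
/// ended before `swap` borrows the vector mutably.
fn swap_ends_with_local<T>(ranks: &mut Vec<T>) {
    // `checked_sub` also avoids an underflow panic on an empty vector.
    if let Some(last) = ranks.len().checked_sub(1) {
        ranks.swap(0, last);
    }
}

/// Binds the first and last elements mutably through a single slice
/// pattern. Matches only when there are at least two elements.
fn swap_ends_with_pattern<T>(ranks: &mut [T]) {
    if let [first, .., last] = ranks {
        std::mem::swap(first, last);
    }
}

/// The `push` case from the question: `Vec::push` is called directly on
/// the vector, and this compiles thanks to two-phase borrows.
fn duplicate_last() -> Vec<i32> {
    let mut vec = vec![1, 2, 3];
    vec.push(*vec.last().unwrap());
    vec
}
```

Both swap variants are no-ops on an empty vector, whereas ranks.len() - 1 would panic on underflow in a debug build.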

Why storing return value avoids borrowing twice

use std::fs::File;
use std::io::prelude::*;
use std::io::BufReader;
fn main() {
    let f = File::open("test.txt").expect("Can't open");
    let mut b = BufReader::new(f);
    let v = b.fill_buf().unwrap();
    println!("v: {:?}", v);
    b.consume(v.len());
}
This will not compile; the error is:
error[E0499]: cannot borrow `b` as mutable more than once at a time
--> src/main.rs:10:5
|
7 | let v = b.fill_buf().unwrap();
| ------------ first mutable borrow occurs here
...
10 | b.consume(v.len());
| ^^^^^^^^^^-------^
| | |
| | first borrow later used here
| second mutable borrow occurs here
Changing the last lines to:
let len = v.len();
b.consume(len);
And all is OK.
I fail to understand why the first example borrows something twice, and why storing the length in a variable and passing that variable to b.consume() is OK. Could someone explain why the second variant works and the first does not?
When you run into this kind of issue, it can help to desugar things a bit. As you may know, method syntax is syntax sugar for an associated function call:
b.consume(v.len());
// desugars to
BufRead::consume(&mut b, <[_]>::len(v));
Argument expressions are generally evaluated left-to-right, so we try to get a new mutable reference to b before we release the reference we're holding in v. There are some cases where the compiler can recognize and automatically avoid this issue, but this does not seem to be one of those cases.
You may ask "Why does it say second mutable reference? v is an immutable reference!" Well, the lifetime of the reference in v is tied to the lifetime of the mutable reference when you call b.fill_buf(). So the compiler thinks that the mutable reference to b must remain valid until v is released.
The fix works because it flips the order of evaluation: v.len() is evaluated first, and since that is the last use of v, the first borrow has ended by the time b is mutably borrowed again for consume().
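The working variant can be sketched as a self-contained function. This is an illustration only: consume_first_chunk is a hypothetical name, and it takes any Read source (for example a byte slice) instead of the File from the question so that it can be run without a file on disk.

```rust
use std::io::{BufRead, BufReader, Read};

/// Hypothetical helper: fills the buffer once, consumes everything that
/// was read, and returns how many bytes that was.
fn consume_first_chunk<R: Read>(source: R) -> std::io::Result<usize> {
    let mut b = BufReader::new(source);
    // `v` borrows from `b`, and that borrow counts as a mutable borrow of `b`.
    let v = b.fill_buf()?;
    // Last use of `v`: after this line the first borrow is over.
    let len = v.len();
    // A fresh mutable borrow of `b` is now allowed.
    b.consume(len);
    Ok(len)
}
```

Calling it as consume_first_chunk(&b"hello"[..]) reads from an in-memory byte slice, which implements Read.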

Fold with string array

I tried some code like this:
fn main() {
    let a = vec!["May", "June"];
    let s = a.iter().fold("", |s2, s3|
        s2 + s3
    );
    println!("{}", s == "MayJune");
}
Result:
error[E0369]: cannot add `&&str` to `&str`
--> a.rs:4:10
|
4 | s2 + s3
| -- ^ -- &&str
| | |
| | `+` cannot be used to concatenate two `&str` strings
| &str
|
help: `to_owned()` can be used to create an owned `String` from a string
reference. String concatenation appends the string on the right to the string
on the left and may require reallocation. This requires ownership of the string
on the left
|
4 | s2.to_owned() + s3
| ^^^^^^^^^^^^^
Ok, fair enough. So I change my code to exactly that. But then I get this:
error[E0308]: mismatched types
--> a.rs:4:7
|
4 | s2.to_owned() + s3
| ^^^^^^^^^^^^^^^^^^
| |
| expected `&str`, found struct `std::string::String`
| help: consider borrowing here: `&(s2.to_owned() + s3)`
Ok, fair enough. So I change my code to exactly that. But then I get this:
error[E0515]: cannot return reference to temporary value
--> a.rs:4:7
|
4 | &(s2.to_owned() + s3)
| ^--------------------
| ||
| |temporary value created here
| returns a reference to data owned by the current function
Why is Rust giving a bogus suggestion, and is what I am trying to do possible? Note, I would prefer to avoid suggestions such as "just use join" or similar, as this question is intended to address a more generic problem. Rust version:
rustc 1.46.0 (04488afe3 2020-08-24)
is what I am trying to do possible?
Stargazeur provided a working version in their comment: the initial value / accumulator needs to be a String rather than an &str.
Why is Rust giving a bogus suggestion
Rustc doesn't have a global-enough vision so it is able to see the "detail" issue but it doesn't realise that it's really a local effect of a larger problem: fold's signature is
fn fold<B, F>(self, init: B, f: F) -> B
Because you're giving fold an &str, it must ultimately return an &str, which is only possible if F just returns something it gets from "the outside", not if it creates anything internally. Since you want to create something inside your callback, the value of init is the issue.
Rustc doesn't see the conflict at that level though, because as far as it's concerned that's a perfectly valid signature (e.g. you might be following a chain of things through a hashmap, or returning a constant string reference, for all it cares). The only real conflict it sees is between this:
F: FnMut(B, Self::Item) -> B
and the implementation of your function which doesn't actually work, so it tries to help you with that:
Rust doesn't allow adding two &str together because that would implicitly allocate a String, which is the sort of hidden cost the core team would rather not hide, so Add is only implemented between String and &str. That's the first issue you see, and since that's somewhat unusual (the average language just lets you concatenate string-ish stuff, or even not-at-all-strings, to strings), the rustc devs have added a help text noting that the LHS must be an owned String, which generally helps / works, but
then the addition returns a String, so now your function doesn't match the F signature anymore: since the init is an &str, that's the type of the accumulator, so you need to return an &str,
except that if you try to create a reference to the string you've just created, it was created inside the function, so once the function returns the string will be dead and the reference left dangling, which Rust cannot allow.
And that's how despite the best intentions, because the compiler's view is too local it guilelessly leads you down a completely useless path of frustration.
You may want to report this issue on the bug tracker (or see if it's already there). I don't know if the compiler diagnostics system would be able to grok this situation though.
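For completeness, here is a minimal sketch of the version with an owned accumulator, as described above. The function name concat is hypothetical. Because init is a String, B in fold's signature is String, so the closure may build and return a new owned value on each step.

```rust
/// Concatenates all parts with `fold`, using an owned `String` accumulator.
fn concat(parts: &[&str]) -> String {
    parts
        .iter()
        // `&s` destructures the `&&str` item into a plain `&str`, and
        // `String + &str` yields a `String`, matching the accumulator type.
        .fold(String::new(), |acc, &s| acc + s)
}
```

With the input from the question, concat(&["May", "June"]) evaluates to "MayJune".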

Understanding a lifetime issue

I'm hitting a lifetime error when compiling a change I made for Firecracker (on aarch64, but I doubt the issue is architecture-dependent):
error[E0716]: temporary value dropped while borrowed
--> src/vmm/src/device_manager/mmio.rs:174:24
|
174 | let int_evt = &serial
| ________________________^
175 | | .lock()
176 | | .expect("Poisoned legacy serial lock")
| |__________________________________________________^ creates a temporary which is freed while still in use
177 | .interrupt_evt();
| - temporary value is freed at the end of this statement
178 | vm.register_irqfd(int_evt, self.irq)
| ------- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
The original code (which compiles fine) is:
vm.register_irqfd(&serial
.lock()
.expect("Poisoned legacy serial lock")
.interrupt_evt(), self.irq)
.map_err(Error::RegisterIrqFd)?;
I don't understand the difference. The error message seems to state that expect() is returning a temporary and that I'm taking a const reference to it, in C++ this would extend the lifetime of the temporary, does it not in Rust? Either way, why does it work in the original code but not after I bind to an l-value (C++ parlance, I'm not sure if it is the same for Rust)?
I tried creating a SSCE here, but it worked as expected!
A simple, reproducible example of the problem (playground):
// create array inside mutex
let mutex = Mutex::new([ 0i32 ]);
// get reference to item inside array
let item: &i32 = mutex.lock().unwrap().get(0).unwrap();
// use reference for something
println!("item = {:?}", item);
mutex.lock().unwrap() returns a MutexGuard<'_, [i32; 1]>, which borrows the data inside the mutex. It also owns a lock on the data, which is released when the guard is dropped; this means that no one else may borrow the data at the same time.
When you call a method of the inner type on that guard (like .get in the above example, or .interrupt_evt in your code), it will borrow with the lifetime of the guard, since you can only access the data safely while the guard exists. But the guard isn't stored in any variable, so it only exists temporarily for that statement, and is dropped at the end of it. So you cannot get a reference to the data outside of the statement.
Solving this problem is simple: first store the guard in a variable, and then borrow from it. That will ensure that the guard lives longer than the references you get from it (playground):
// create array inside mutex
let mutex = Mutex::new([ 0i32 ]);
// get reference to item inside array
let guard = mutex.lock().unwrap();
let item: &i32 = guard.get(0).unwrap();
// use reference for something
println!("item = {:?}", item);
// guard is now destroyed at end of scope
// and mutex lock is released here
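This also explains why the original one-statement call compiled: a temporary lives until the end of the statement that creates it, so a reference derived from the temporary guard may be passed to a function called within that same statement. Below is a small sketch of that case; double and doubled_first are hypothetical names used only for illustration.

```rust
use std::sync::Mutex;

/// Stand-in for a function that only needs the reference during the call.
fn double(x: &i32) -> i32 {
    *x * 2
}

fn doubled_first(mutex: &Mutex<[i32; 1]>) -> i32 {
    // The temporary guard created by `lock().unwrap()` stays alive until
    // the end of this `let` statement, which covers the whole call to
    // `double`. Only an owned `i32` outlives the statement.
    let result = double(mutex.lock().unwrap().get(0).unwrap());
    // The guard has been dropped here, so the lock is already released.
    result
}
```

The error only appears once the reference itself, rather than a value computed from it, has to survive past the end of the statement.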

How to take the first element of a BTreeSet without clone [duplicate]

This question already has answers here:
Getting first member of a BTreeSet
(2 answers)
I need to implement a function similar to the unstable BTreeSet::pop_first().
It is not a duplicate of Getting first member of a BTreeSet, since I am asking for a way to get the first element of a BTreeSet without making a copy.
fn pop<O:Ord>(set: &mut std::collections::BTreeSet<O>) -> O
Based on a previous question (Getting first member of a BTreeSet), I get a compiler error about a mutable use after an immutable one:
pub fn pop<O: Ord>(set: &mut std::collections::BTreeSet<O>) -> O {
    let n = {
        let mut start_iter = set.iter();
        start_iter.next().unwrap()
    };
    set.take(n).unwrap()
}
error[E0502]: cannot borrow `*set` as mutable because it is also borrowed as immutable
--> src/main.rs:493:5
|
489 | let mut start_iter = set.iter();
| --- immutable borrow occurs here
...
493 | set.take(n).unwrap()
| ^^^^----^^^
| | |
| | immutable borrow later used by call
| mutable borrow occurs here
If I replace set.iter() with set.iter().cloned() the code works, but I would like to avoid making copies of the elements of the set because that is a costly operation.
How can I implement such a pop function?
rustc version 1.41.0
It is not possible to implement that function like you ask without access to the internals of BTreeSet: The elements of the iterator are borrowed from the set, so n is borrowed from the set as well. While you have borrowed something from the set, you cannot modify it.
You could:
Use clone()/cloned(), like you mentioned. If you were worried it would clone all elements in the iterator when calling pop_first(), that's not the case. It would clone only the one element since iterators are lazy.
Keep both a BinaryHeap and a HashSet of your elements. Before inserting an element into the BinaryHeap, check whether it is already in the HashSet.
Find/implement an extended BTreeSet from scratch that has this functionality.
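The first option can be sketched as follows, assuming the element type implements Clone (an extra bound compared with the signature in the question). Returning Option instead of unwrapping is a choice made for this sketch. Only the smallest element is cloned, once per call.

```rust
use std::collections::BTreeSet;

/// Removes and returns the smallest element, or `None` if the set is empty.
/// Sketch of the clone-based workaround: exactly one element is cloned.
pub fn pop<O: Ord + Clone>(set: &mut BTreeSet<O>) -> Option<O> {
    // The iterator is lazy: `next()` only touches the first element, and
    // `cloned()` ends the shared borrow of the set.
    let first = set.iter().next().cloned()?;
    // The set is no longer borrowed, so it can be mutated.
    set.take(&first)
}
```

On newer toolchains where BTreeSet::pop_first is available on stable (to my knowledge since Rust 1.66), calling it directly avoids the clone altogether.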
