Generic operation on slice of Cow<str> - string

I'm trying to implement the following code, which removes the prefix from a slice of Cow<str>'s.
fn remove_prefix(v: &mut [Cow<str>], prefix: &str) {
for t in v.iter_mut() {
match *t {
Borrowed(&s) => s = s.trim_left_matches(prefix),
Owned(s) => s = s.trim_left_matches(prefix).to_string(),
}
}
}
I have two questions:
I can't get this to compile - I've tried loads of combinations of &'s and *'s but to no avail.
Is there a better way to apply functions to a Cow<str> without having to match it to Borrowed and Owned every time. I mean it seems like I should just be able to do something like *t = t.trim_left_matches(prefix) and if t is a Borrowed(str) it leaves it as a str (since trim_left_matches allows that), and if it is an Owned(String) it leaves it as a String. Similarly for replace() it would realise it has to convert both to a String (since you can't use replace() on a str). Is something like that possible?

Question #1 strongly implies how you think pattern matching and/or pointers work in Rust doesn't quite line up with how they actually work. The following code compiles:
fn remove_prefix(v: &mut [Cow<str>], prefix: &str) {
use std::borrow::Cow::*;
for t in v.iter_mut() {
match *t {
Borrowed(ref mut s) => *s = s.trim_left_matches(prefix),
Owned(ref mut s) => *s = s.trim_left_matches(prefix).to_string(),
}
}
}
If your case, Borrowed(&s) is matched against Borrowed(&str), meaning that s is of type str. This is impossible: you absolutely cannot have a variable of a dynamically sized type. It's also counter-productive. Given that you want to modify s, binding to it by value won't help at all.
What you want is to modify the thing contained in the Borrowed variant. This means you want a mutable pointer to that storage location. Hence, Borrowed(ref mut s): this is not destructuring the value inside the Borrowed at all. Rather, it binds directly to the &str, meaning that s is of type &mut &str; a mutable pointer to a (pointer to a str). In other words: a mutable pointer to a string slice.
At that point, mutating the contents of the Borrowed is done by re-assigning the value through the mutable pointer: *s = ....
Finally, the exact same reasoning applies to the Owned case: you were trying to bind by-value, then mutate it, which cannot possibly do what you want. Instead, bind by mutable pointer to the storage location, then re-assign it.
As for question #2... not really. That would imply some kind of overloading, which Rust doesn't do (by deliberate choice). If you are doing this a lot, you could write an extension trait that adds methods of interest to Cow.

You can definitely do it.
fn remove_prefix(v: &mut [Cow<str>], prefix: &str) {
for t in v.iter_mut() {
match *t {
Cow::Borrowed(ref mut s) => *s = s.trim_left_matches(prefix),
Cow::Owned(ref mut s) => *s = s.trim_left_matches(prefix).to_string(),
}
}
}
ref mut s means “take a mutable reference to the value and call it s” in a pattern. Thus you have s of type &mut &str or &mut String. You must then use *s =  in order to change what that mutable reference is pointing to (thus, change the string inside the Cow).

Related

Why do I need the type annotation here?

In this leetcode invert binary tree problem, I'm trying to borrow a node wrapped in an Rc mutably. Here is the code.
use std::rc::Rc;
use std::cell::RefCell;
impl Solution {
pub fn invert_tree(root: Option<Rc<RefCell<TreeNode>>>) -> Option<Rc<RefCell<TreeNode>>> {
let mut stack: Vec<Option<Rc<RefCell<TreeNode>>>> = vec![root.clone()];
while stack.len() > 0 {
if let Some(node) = stack.pop().unwrap() {
let n: &mut TreeNode = &mut node.borrow_mut();
std::mem::swap(&mut n.left, &mut n.right);
stack.extend(vec![n.left.clone(), n.right.clone()]);
}
}
root
}
}
If I change the line let n: &mut TreeNode to just let n = &mut node.borrow_mut(), I get a compiler error on the next line, "cannot borrow *n as mutable more than once at a time"
It seems like the compiler infers n to be of type &mut RefMut<TreeNode>, but it all works out when I explicitly say it is &mut TreeNode. Any reason why?
A combination of borrow splitting and deref-coercion causes the seemingly identical code to behave differently.
The compiler infers n to be of type RefMut<TreeNode>, because that's what borrow_mut actually returns:
pub fn borrow_mut(&self) -> RefMut<'_, T>
RefMut is a funny little type that's designed to look like a &mut, but it's actually a separate thing. It implements Deref and DerefMut, so it will happily pretend to be a &mut TreeNode when needed. But Rust is still inserting calls to .deref() in there for you.
Now, why does one work and not the other? Without the type annotation, after deref insertion, you get
let n = &mut node.borrow_mut();
std::mem::swap(&mut n.deref_mut().left, &mut n.deref_mut().right);
So we're trying to call deref_mut (which takes a &mut self) twice in the same line on the same variable. That's not allowed by Rust's borrow rules, so it fails.
(Note that the &mut on the first line simply borrows an owned value for no reason. Temporary lifetime extension lets us get away with this, even though you don't need the &mut at all in this case)
Now, on the other hand, if you do put in the type annotation, then Rust sees that borrow_mut returns a RefMut<'_, TreeNode> but you asked for a &mut TreeNode, so it inserts the deref_mut on the first line. You get
let n: &mut TreeNode = &mut node.borrow_mut().deref_mut();
std::mem::swap(&mut n.left, &mut n.right);
Now the only deref_mut call is on the first line. Then, on the second line, we access n.left and n.right, both mutably, simultaneously. It looks like we're accessing n mutably twice at once, but Rust is actually smart enough to see that we're accessing two disjoint parts of n simultaneously, so it allows it. This is called borrow splitting. Rust will split borrows on different instance fields, but it's not smart enough to see the split across a deref_mut call (function calls could, in principle, do anything, so Rust's borrow checker refuses to try to do advanced reasoning about their return value).

Turning a RefMut into a &mut

use std::cell::RefCell;
use std::rc::Weak;
struct Elem {
attached_elem: Weak<RefCell<Elem>>,
value: i32,
}
impl Elem {
fn borrow_mut_attached_elem(&self) -> &mut Elem {
//what should this line be?
self.attached_elem.upgrade().unwrap().borrow_mut()
}
}
fn main(){}
I have tried some similar other lines but nothing has worked so far, even the experimental cell_leak feature for RefMut.
I don't mind changing the signature of the function I just want to reduce the overhead of getting a mutable reference to the attached_elem from an Elem.
what should this line be?
There is nothing you can put in that line to (safely) satisfy the function signature - and that's for good reason. While RefCell does allow obtaining &mut T from a RefCell<T> (that's why it exists), it must guarantee that only one mutable reference exist at a time. It does so by only providing a temporary reference whose lifetime is tied to the RefMut<T> wrapper. Once the wrapper is dropped, the value is marked as no longer borrowed, so the reference must not outlive it.
If Rust were to allow you to return a naked &mut Elem, you'd be able to use the reference after the RefCell has ceased being marked as borrowed. In that case, what's to stop you from calling borrow_mut_attached_elem() again, and obtain a second mutable reference to the same Elem?
So you'll definitely need to change the signature. If you just need to give outside code temporary access to &mut Elem, the easiest way is to accept a closure that will receive it. For example:
fn with_attached_elem<R>(&self, f: impl FnOnce(&mut Elem) -> R) -> R {
let rc = self.attached_elem.upgrade().unwrap();
let retval = f(&mut *rc.borrow_mut());
retval
}
You'd call it to do something with the element, e.g.:
elem.with_attached_elem(|e| e.value += 1);
with_attached_elem takes care to return the value returned by the closure, allowing you to collect data from &mut Elem and propagate it to the caller. For example, to pick up the value of the attached element you could use:
let value = elem.with_attached_elem(|e| e.value);

Immutable object changing to mutable depending on function signature

Checkout the Rust code below. It compiles
fn main() {
let vec0 = Vec::new();
let mut vec1 = fill_vec(vec0);
println!("{} has length {} content `{:?}`", "vec1", vec1.len(), vec1);
vec1.push(88);
println!("{} has length {} content `{:?}`", "vec1", vec1.len(), vec1);
}
fn fill_vec(mut vec: Vec<i32>) -> Vec<i32> {
vec.push(22);
vec.push(44);
vec.push(66);
vec
}
Here I am declaring vec0 as immutable but fill_vec takes in a mutable vector. Depending on the function signature it seems Rust is changing the nature of the argument being passed.
My question is, this obviously seems like a "shot yourself in the foot" instant. Why does Rust allow this? Or, is this actually safe and I am missing something?
There are different things at play here that can all explain why this behavior make sense:
First of, mut doesn't really mean "mutable". There are such things as interior mutability, Cells, Mutexes, etc., which allow you to modify state without needing a single mut. Rather, mut means that you can get mutually exclusive references.
Second, mutability is a property of a binding. vec0 in main and vec in fill_vec are different bindings, so they can have different mutability.
See also:
What does 'let x = x' do in Rust?
Finally ownership: fill_vec takes full ownership of its parameter, which effectively doesn't exist anymore in main. Why should the function not be allowed to do whatever it wants with its owned parameters? Had the function taken the parameter as a mutable reference, you would have needed to declare the original binding as mut:
fn main() {
let mut vec0 = Vec::new();
// ^^^ now _needs_ a mutable binding
fill_vec(&mut vec0);
// ^^^^ needs an explicit `&mut` reference
}
fn fill_vec(vec: &mut Vec<i32>) {
// ^^^^ borrows rather than take ownership
// …
}
You're making the argument vec of fill_vec mutable. You are still passing the vec by value.
If you wanted a mutable reference you would have vec: &mut Vec<i32>.

Difference between &mut and ref mut for trait objects

First of all, I'm not asking what's the difference between &mut and ref mut per se.
I'm asking because I thought:
let ref mut a = MyStruct
is the same as
let a = &mut MyStruct
Consider returning a trait object from a function. You can return a Box<Trait> or a &Trait. If you want to have mutable access to its methods, is it possible to return &mut Trait?
Given this example:
trait Hello {
fn hello(&mut self);
}
struct English;
struct Spanish;
impl Hello for English {
fn hello(&mut self) {
println!("Hello!");
}
}
impl Hello for Spanish {
fn hello(&mut self) {
println!("Hola!");
}
}
The method receives a mutable reference for demonstration purposes.
This won't compile:
fn make_hello<'a>() -> &'a mut Hello {
&mut English
}
nor this:
fn make_hello<'a>() -> &'a mut Hello {
let b = &mut English;
b
}
But this will compile and work:
fn make_hello<'a>() -> &'a mut Hello {
let ref mut b = English;
b
}
My theory
This example will work out of the box with immutable references (not necessary to assign it to a variable, just return &English) but not with mutable references. I think this is due to the rule that there can be only one mutable reference or as many immutable as you want.
In the case of immutable references, you are creating an object and borrowing it as a return expression; its reference won't die because it's being borrowed.
In the case of mutable references, if you try to create an object and borrow it mutably as a return expression you have two mutable references (the created object and its mutable reference). Since you cannot have two mutable references to the same object it won't perform the second, hence the variable won't live long enough. I think that when you write let mut ref b = English and return b you are moving the mutable reference because it was captured by a pattern.
All of the above is a poor attempt to explain to myself why it works, but I don't have the fundamentals to prove it.
Why does this happen?
I've also cross-posted this question to Reddit.
This is a bug. My original analysis below completely ignored the fact that it was returning a mutable reference. The bits about promotion only make sense in the context of immutable values.
This is allowable due to a nuance of the rules governing temporaries (emphasis mine):
When using an rvalue in most lvalue contexts, a temporary unnamed lvalue is created and used instead, if not promoted to 'static.
The reference continues:
Promotion of an rvalue expression to a 'static slot occurs when the expression could be written in a constant, borrowed, and dereferencing that borrow where the expression was the originally written, without changing the runtime behavior. That is, the promoted expression can be evaluated at compile-time and the resulting value does not contain interior mutability or destructors (these properties are determined based on the value where possible, e.g. &None always has the type &'static Option<_>, as it contains nothing disallowed).
Your third case can be rewritten as this to "prove" that the 'static promotion is occurring:
fn make_hello_3<'a>() -> &'a mut Hello {
let ref mut b = English;
let c: &'static mut Hello = b;
c
}
As for why ref mut allows this and &mut doesn't, my best guess is that the 'static promotion is on a best-effort basis and &mut just isn't caught by whatever checks are present. You could probably look for or file an issue describing the situation.

Returning a borrow from an owned resource

I am trying to write a function that maps an Arc<[T]> into an Iterable, for use with flat_map (that is, I want to call i.flat_map(my_iter) for some other i: Iterator<Item=Arc<[T]>>).
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T> {
let t: &'a [T] = &*n.clone();
t.into_iter()
}
The function above does not work because n.clone() produces an owned value of type Arc<[T]>, which I can dereference to [T] and then borrow to get &[T], but the lifetime of the borrow only lasts until the end of the function, while the 'a lifetime lasts until the client drops the returned iterator.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
Here's some sample code for the source iterator:
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
How do I implement the result of BaseIter(data).flat_map(my_iter) (which is of type Iterator<&T>) given that BaseIter is producing data, not just borrowing it? (The real thing is more complicated than this, it's not always the same result, but the ownership semantics are the same.)
You cannot do this. Remember, lifetimes in Rust are purely compile-time entities and are only used to validate that your code doesn't accidentally access dropped data. For example:
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T>
Here 'a does not "last until the client drops the returned iterator"; this reasoning is incorrect. From the point of view of slice::Iter its lifetime parameter means the lifetime of the slice it is pointing at; from the point of view of my_iter 'a is just a lifetime parameter which can be chosen arbitrarily by the caller. In other words, slice::Iter is always tied to some slice with some concrete lifetime, but the signature of my_iter states that it is able to return arbitrary lifetime. Do you see the contradiction?
As a side note, due to covariance of lifetimes you can return a slice of a static slice from such a function:
static DATA: &'static [u8] = &[1, 2, 3];
fn get_data<'a>() -> &'a [u8] {
DATA
}
The above definition compiles, but it only works because DATA is stored in static memory of your program and is always valid when your program is running; this is not so with Arc<[T]>.
Arc<[T]> implies shared ownership, that is, the data inside Arc<[T]> is jointly owned by all clones of the original Arc<[T]> value. Therefore, when the last clone of an Arc goes out of scope, the value it contains is dropped, and the respective memory is freed. Now, consider what would happen if my_iter() was allowed to compile:
let iter = {
let data: Arc<[i32]> = get_arc_slice();
my_iter(data.clone())
};
iter.map(|x| x+1).collect::<Vec<_>>();
Because in my_iter() 'a can be arbitrary and is not linked in any way to Arc<[T]> (and can not be, actually), nothing prevents this code from compilation - the user might as well choose 'static lifetime. However, here all clones of data will be dropped inside the block, and the array it contains inside will be freed. Using iter after the block is unsafe because it now provides access to the freed memory.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
So, as follows from the above, this is impossible. Only the owner of the data determines when this data should be destroyed, and borrowed references (whose existence is always implied by lifetime parameters) may only borrow the data for the time when it exists, but borrows cannot affect when and how the data is destroyed. In order for borrowed references to compile, they need to always borrow only the data which is valid through the whole time these references are active.
What you can do is to rethink your architecture. It is hard to say what exactly can be done without looking at the full code, but in the case of this particular example you can, for example, first collect the iterator into a vector and then iterate over the vector:
let items: Vec<_> = your_iter.collect();
items.iter().flat_map(my_iter)
Note that now my_iter() should indeed accept &Arc<[T]>, just as Francis Gagné has suggested; this way, the lifetimes of the output iterator will be tied to the lifetime of the input reference, and everything should work fine, because now it is guaranteed that Arcs are stored stably in the vector for their later perusal during the iteration.
There's no way to make this work by passing an Arc<[T]> by value. You need to start from a reference to an Arc<[T]> in order to construct a valid slice::Iter.
fn my_iter<'a, T>(n: &'a Arc<[T]>) -> slice::Iter<'a, T> {
n.into_iter()
}
Or, if we elide the lifetimes:
fn my_iter<T>(n: &Arc<[T]>) -> slice::Iter<T> {
n.into_iter()
}
You need to use another iterator as return type of the function my_iter. slice::Iter<'a, T> has an associated type Item = &'a T. You need an iterator with associated type Item = T. Something like vec::IntoIter<T>. You can implement such an iterator yourself:
use std::sync::Arc;
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
struct ArcIntoIter<T>(usize, Arc<[T]>);
impl<T:Clone> Iterator for ArcIntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
if self.0 < self.1.len(){
let i = self.0;
self.0+=1;
Some(self.1[i].clone())
}else{
None
}
}
}
fn my_iter<T>(n: Arc<[T]>) -> ArcIntoIter<T> {
ArcIntoIter(0, n)
}
fn main() {
let data = Arc::new(["A","B","C"]);
println!("{:?}", BaseIter(data).take(3).flat_map(my_iter).collect::<String>());
//output:"ABCABCABC"
}

Resources