How to build an Rc<str> or Rc<[T]>? - rust

I'd like to create an Rc<str> because I want to reduce the indirection from following the 2 pointers that accessing an Rc<String> requires. I need to use an Rc because I truly have shared ownership. I detail in another question more specific issues I have around my string type.
Rc has a ?Sized bound:
pub struct Rc<T: ?Sized> { /* fields omitted */ }
I've also heard that Rust 1.2 will come with proper support for storing unsized types in an Rc, but I'm unsure how this differs from 1.1.
Taking the str case as example, my naive attempt (also this for building from a String) fails with:
use std::rc::Rc;
fn main() {
let a: &str = "test";
let b: Rc<str> = Rc::new(*a);
println!("{}", b);
}
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> src/main.rs:5:22
|
5 | let b: Rc<str> = Rc::new(*a);
| ^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `<std::rc::Rc<T>>::new`
It's clear that in order to create an Rc<str>, I need to copy the whole string: RcBox would itself be an unsized type, storing the string itself alongside the weak and strong reference counts; the naive code above doesn't even make sense.
I've been told that one cannot instantiate such a type directly, but must instead instantiate an Rc<T> with a sized T and then coerce it to an unsized type. The example given is for storing a trait object: first create an Rc<ConcreteType> and then coerce it to Rc<Trait>. But this doesn't make sense either: neither this nor this works (and you can't coerce from &str or String to str anyway).

As of Rust 1.21.0 and as mandated by RFC 1845, creating an Rc<str> or Arc<str> is now possible:
use std::rc::Rc;
use std::sync::Arc;
fn main() {
let a: &str = "hello world";
let b: Rc<str> = Rc::from(a);
println!("{}", b);
// or equivalently:
let b: Rc<str> = a.into();
println!("{}", b);
// we can also do this for Arc,
let a: &str = "hello world";
let b: Arc<str> = Arc::from(a);
println!("{}", b);
}
See <Rc as From<&str>> and <Arc as From<&str>>.
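The same family of From impls also covers owned strings, vectors and slices, so no coercion step is needed for those sources. A minimal sketch, assuming Rust 1.21 or later; the helper names shared_str and shared_slice are made up for illustration:

```rust
use std::rc::Rc;

// Hypothetical helper: copies the String's bytes into a single Rc allocation.
fn shared_str(s: String) -> Rc<str> {
    Rc::from(s)
}

// Hypothetical helper: moves the Vec's elements into a single Rc allocation.
fn shared_slice(v: Vec<i32>) -> Rc<[i32]> {
    Rc::from(v)
}

fn main() {
    let s = shared_str(String::from("hello world"));
    let v = shared_slice(vec![1, 2, 3, 4]);

    // From<&[T]> clones the elements of a borrowed slice.
    let borrowed: &[i32] = &[5, 6];
    let w: Rc<[i32]> = Rc::from(borrowed);

    // Cloning the Rc only bumps the reference count; the data is shared.
    let s2 = Rc::clone(&s);
    assert_eq!(Rc::strong_count(&s), 2);

    println!("{} {:?} {:?}", s2, v, w);
}
```

Each conversion allocates once and copies or moves the data next to the reference counts, which is the single-indirection layout the question asks for.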

Creating an Rc<[T]> can be done via coercions and as-casts from fixed sized arrays, e.g. coercions can be done as follows:
use std::rc::Rc;
fn main() {
let x: Rc<[i32; 4]> = Rc::new([1, 2, 3, 4]);
let y: Rc<[i32]> = x;
println!("{:?}", y);
}
However, this doesn't work for strings, since they have no raw fixed-sized equivalent to create the first value. It is possible to do unsafely, e.g. by creating a UTF-8 encoded Rc<[u8]> and transmuting that to Rc<str>. Theoretically there could be a crate on crates.io for it, but I can't find one at the moment.
An alternative is owning_ref, which isn't quite std::rc::Rc itself, but should allow, for example, getting an RcRef<..., str> pointing into an Rc<String>. (This approach will work best if one uses RcRef uniformly in place of Rc, except for construction.)
extern crate owning_ref;
use owning_ref::RcRef;
use std::rc::Rc;
fn main() {
let some_string = "foo".to_owned();
let val: RcRef<String> = RcRef::new(Rc::new(some_string));
let borrowed: RcRef<String, str> = val.map(|s| &**s);
let erased: RcRef<owning_ref::Erased, str> = borrowed.erase_owner();
}
The erasing means that RcRef<..., str>s can come from multiple different sources, e.g. a RcRef<Erased, str> can come from a string literal too.
NB. at the time of writing, the erasure with RcRef requires a nightly compiler, and depending on owning_ref with the nightly feature:
[dependencies]
owning_ref = { version = "0.1", features = ["nightly"] }

Related

How to convert from `cell::Ref<'_, [u8]>` into `&[u8]` or the other way around?

I'm modifying a library that holds Items returned by the ChunksExact slice iterator. My iterator requires interior mutability and uses a RefCell. As a result, my iterator cannot return Items of type &[u8] but returns Ref<'_, [u8]> instead. The two types seem to be equivalent for practical use. For instance, this:
for i in slice.chunks_exact(2) {
println!("{:?}", i);
}
works just as well as this:
for i in my_iterator() {
println!("{:?}", i);
}
Full working example in the playground.
Close as they are, I cannot convert one item into the other:
= note: expected enum `Option<Ref<'_, [u8]>>`
found enum `Option<&[u8]>`
I saw the experimental cell::Ref::leak method, but it seems like what I'm trying to do should not require such a scary feature...
You can use Option::as_deref:
let a: Option<Ref<'_, [u8]>> = ...;
let b: Option<&[u8]> = a.as_deref();
Ref<'_, T> implements Deref<Target = T> which is why Ref<'_, [u8]> can be used like a &[u8] in most contexts. However, they are still different types, which is why you get the error on assigning one to the other in an option.
You cannot go the other way around and create a Ref<'_, T> from a &T since there is no RefCell for it to reference.
This will only work in a context where the original Option<Ref> is kept around. You cannot use this to convert your iterator from returning Ref<'_, [u8]>s to returning &[u8]s. This is because the lifetime of the &[u8] is bound to the Ref, not the original RefCell.
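A small self-contained sketch of the as_deref conversion. The function first_chunk and the RefCell<Vec<u8>> storage are assumptions made for illustration, not the asker's actual iterator:

```rust
use std::cell::{Ref, RefCell};

// Hypothetical stand-in for one step of the asker's iterator:
// hands out the first two bytes as a Ref-guarded slice.
fn first_chunk(cell: &RefCell<Vec<u8>>) -> Option<Ref<'_, [u8]>> {
    let data = cell.borrow();
    if data.len() < 2 {
        return None;
    }
    // Ref::map narrows the guard from the whole Vec to a sub-slice.
    Some(Ref::map(data, |v| &v[..2]))
}

fn main() {
    let cell = RefCell::new(vec![1u8, 2, 3, 4]);

    // `a` must stay alive for as long as `b` is used.
    let a: Option<Ref<'_, [u8]>> = first_chunk(&cell);
    let b: Option<&[u8]> = a.as_deref();

    assert_eq!(b, Some(&[1u8, 2][..]));
    println!("{:?}", b);
}
```

The borrow in b is tied to a, so dropping a while b is still in use is rejected by the borrow checker.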

How can I implement std::convert::From such that it does not consume its input?

I have managed to make the Rust type checker go into an infinite loop. A very similar program compiles with no trouble. Why does the program I want not compile?
To save your time and effort, I have made minimal versions of the two programs that isolate the problem. Of course, the minimal version is a pointless program. You'll have to use your imagination to see my motivation.
Success
Let me start with the version that works. The struct F<T> wraps a T. The type Target can be converted from an F<T> provided T can.
struct F<T>(T);
impl<T> From<F<T>> for Target where Target: From<T> {
fn from(a: F<T>) -> Target {
let b = Target::from(a.0);
f(&b)
}
}
Here's an example caller:
fn main() {
let x = Target;
let y = F(F(F(x)));
let z = Target::from(y);
println!("{:?}", z);
}
This runs and prints "Target".
Failure
The function f does not consume its argument. I would prefer it if the From conversion also did not consume its argument, because the type F<T> could be expensive or impossible to clone. I can write a custom trait FromRef that differs from std::convert::From by accepting an immutable borrow instead of an owned value:
trait FromRef<T> {
fn from_ref(a: &T) -> Self;
}
Of course, I ultimately want to use From<&'a T>, but by defining my own trait I can ask my question more clearly, without messing around with lifetime parameters. (The behaviour of the type-checker is the same using From<&'a T>).
Here's my implementation:
impl<T> FromRef<F<T>> for Target where Target: FromRef<T> {
fn from_ref(a: &F<T>) -> Target {
let b = Target::from_ref(&a.0);
f(&b)
}
}
This compiles. However, the main() function doesn't:
fn main() {
let x = Target;
let y = F(F(F(x)));
let z = Target::from_ref(y);
println!("{:?}", z);
}
It gives a huge error message beginning:
error[E0275]: overflow evaluating the requirement `_: std::marker::Sized`
--> <anon>:26:13
|
26 | let z = Target::from_ref(y);
| ^^^^^^^^^^^^^^^^
|
= note: consider adding a `#![recursion_limit="128"]` attribute to your crate
= note: required because of the requirements on the impl of `FromRef<F<_>>` for `Target`
= note: required because of the requirements on the impl of `FromRef<F<F<_>>>` for `Target`
= note: required because of the requirements on the impl of `FromRef<F<F<F<_>>>>` for `Target`
etc...
What am I doing wrong?
Update
I've randomly fixed it!
The problem was that I forgot to implement FromRef<Target> for Target.
So I would now like to know: what was the compiler thinking? I still can't relate the problem to the error message.
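For reference, a minimal sketch of the program with the missing base case added. The definitions of Target and f are not shown in the question, so the unit struct and the cloning f used here are assumptions:

```rust
// Assumed definition: the question never shows Target.
#[derive(Debug, Clone, PartialEq)]
struct Target;

struct F<T>(T);

trait FromRef<T> {
    fn from_ref(a: &T) -> Self;
}

// Assumed definition: any function that borrows a Target would do.
fn f(t: &Target) -> Target {
    t.clone()
}

// Base case: this is the impl that was missing. Without it the
// recursive impl below never bottoms out during trait resolution.
impl FromRef<Target> for Target {
    fn from_ref(a: &Target) -> Target {
        a.clone()
    }
}

// Recursive case, as in the question.
impl<T> FromRef<F<T>> for Target
where
    Target: FromRef<T>,
{
    fn from_ref(a: &F<T>) -> Target {
        let b = Target::from_ref(&a.0);
        f(&b)
    }
}

fn main() {
    let y = F(F(F(Target)));
    // Note the borrow: from_ref takes a reference.
    let z = Target::from_ref(&y);
    println!("{:?}", z);
}
```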
You can't avoid consuming the input in the standard From/Into traits.
They are defined to always consume the input. Their definition specifies both input and output as owned types, with unrelated lifetimes, so you can't even "cheat" by trying to consume a reference.
If you're returning a reference, you can implement AsRef<T> instead, or Deref<Target = T> if your type is a thin wrapper or smart pointer. You can also provide as_foo() methods.
If you're returning a new (owned) object, the convention is to provide to_foo() methods.
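A short sketch of both conventions. The Wrapper type and its to_upper method are hypothetical names chosen for illustration:

```rust
// Hypothetical wrapper type used only for this example.
struct Wrapper {
    name: String,
}

// Borrowing conversion: returns a reference into self, nothing is consumed.
impl AsRef<str> for Wrapper {
    fn as_ref(&self) -> &str {
        &self.name
    }
}

impl Wrapper {
    // to_* convention: borrows self and returns a new owned value.
    fn to_upper(&self) -> String {
        self.name.to_uppercase()
    }
}

fn main() {
    let w = Wrapper { name: String::from("foo") };

    let borrowed: &str = w.as_ref();
    let owned: String = w.to_upper();

    // `w` is still usable because neither call took ownership.
    println!("{} {} {}", borrowed, owned, w.name);
}
```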

How do you create a Box<dyn Trait>, or a boxed unsized value in general?

I have the following code
extern crate rand;
use rand::Rng;
pub struct Randomizer {
rand: Box<Rng>,
}
impl Randomizer {
fn new() -> Self {
let mut r = Box::new(rand::thread_rng()); // works
let mut cr = Randomizer { rand: r };
cr
}
fn with_rng(rng: &Rng) -> Self {
let mut r = Box::new(*rng); // doesn't work
let mut cr = Randomizer { rand: r };
cr
}
}
fn main() {}
It complains that
error[E0277]: the trait bound `rand::Rng: std::marker::Sized` is not satisfied
--> src/main.rs:16:21
|
16 | let mut r = Box::new(*rng);
| ^^^^^^^^ `rand::Rng` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `rand::Rng`
= note: required by `<std::boxed::Box<T>>::new`
I don't understand why it requires Sized on Rng when Box<T> doesn't impose this on T.
More about the Sized trait and bound: it's a rather special trait, which is implicitly added as a bound on every generic type parameter, which is why you don't see it listed in the prototype for Box::new:
fn new(x: T) -> Box<T>
Notice that it takes x by value (or move), so you need to know how big it is to even call the function.
In contrast, the Box type itself does not require Sized; it uses the (again special) trait bound ?Sized, which means "opt out of the default Sized bound":
pub struct Box<T>(_) where T: ?Sized;
If you look through, there is one way to create a Box with an unsized type:
impl<T> Box<T> where T: ?Sized
....
unsafe fn from_raw(raw: *mut T) -> Box<T>
so from unsafe code, you can create one from a raw pointer. From then on, all the normal things work.
The problem is actually quite simple: you have a trait object, and the only two things you know about this trait object are:
its list of available methods
the pointer to its data
When you request to move this object to a different memory location (here on the heap), you are missing one crucial piece of information: its size.
How are you going to know how much memory should be reserved? How many bits to move?
When an object is Sized, this information is known at compile-time, so the compiler "injects" it for you. In the case of a trait-object, however, this information is unknown (unfortunately), and therefore this is not possible.
It would be quite useful to make this information available and to have a polymorphic move/clone available, but this does not exist yet and I do not remember any proposal for it so far and I have no idea what the cost would be (in terms of maintenance, runtime penalty, ...).
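I also want to post the answer that one way to deal with this situation is to make with_rng generic and take the generator by value (a value cannot be moved out from behind a shared reference, so Box::new(*rng) would not compile even for a sized type):
fn with_rng<TRand: Rng + 'static>(rng: TRand) -> Self {
let r = Box::new(rng);
Randomizer { rand: r }
}
Rust's monomorphization will create the necessary implementation of with_rng, replacing TRand with a concrete sized type. Type parameters are Sized by default, so no extra bound is needed for that; the 'static bound is there because Box<Rng> means Box<Rng + 'static>.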
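In safe code, the usual route to a boxed unsized value is to box a sized value and let the compiler coerce the Box. A minimal sketch using a made-up Shape trait instead of rand, so that it has no external dependencies:

```rust
// Hypothetical trait and type, standing in for Rng and a concrete generator.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// S is sized, so Box::new works; the Box<S> is then coerced to Box<dyn Shape>.
fn boxed_shape<S: Shape + 'static>(shape: S) -> Box<dyn Shape> {
    Box::new(shape)
}

fn main() {
    let shape: Box<dyn Shape> = boxed_shape(Square(3.0));
    println!("{}", shape.area());

    // The same coercion turns a boxed array into a boxed slice.
    let slice: Box<[i32]> = Box::new([1, 2, 3]);
    println!("{:?}", slice);

    // Box<str> has a From<&str> impl that copies the string data.
    let text: Box<str> = Box::from("hello");
    println!("{}", text);
}
```

In each case the size is known at the point where the value is moved to the heap, and only the resulting pointer is unsized.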

What do the ampersand '&' and star '*' symbols mean in Rust?

Despite thoroughly reading the documentation, I'm rather confused about the meaning of the & and * symbol in Rust, and more generally about what is a Rust reference exactly.
In this example, it seems to be similar to a C++ reference (that is, an address that is automatically dereferenced when used):
fn main() {
let c: i32 = 5;
let rc = &c;
let next = rc + 1;
println!("{}", next); // 6
}
However, the following code works exactly the same:
fn main() {
let c: i32 = 5;
let rc = &c;
let next = *rc + 1;
println!("{}", next); // 6
}
Using * to dereference a reference wouldn't be correct in C++. So I'd like to understand why this is correct in Rust.
My understanding so far, is that, inserting * in front of a Rust reference dereferences it, but the * is implicitly inserted anyway so you don't need to add it (while in C++, it's implicitly inserted and if you insert it you get a compilation error).
However, something like this doesn't compile:
fn main() {
let mut c: i32 = 5;
let mut next: i32 = 0;
{
let rc = &mut c;
next = rc + 1;
}
println!("{}", next);
}
error[E0369]: binary operation `+` cannot be applied to type `&mut i32`
--> src/main.rs:6:16
|
6 | next = rc + 1;
| ^^^^^^
|
= note: this is a reference to a type that `+` can be applied to; you need to dereference this variable once for this operation to work
= note: an implementation of `std::ops::Add` might be missing for `&mut i32`
But this works:
fn main() {
let mut c: i32 = 5;
let mut next: i32 = 0;
{
let rc = &mut c;
next = *rc + 1;
}
println!("{}", next); // 6
}
It seems that implicit dereferencing (a la C++) is correct for immutable references, but not for mutable references. Why is this?
Using * to dereference a reference wouldn't be correct in C++. So I'd like to understand why this is correct in Rust.
A reference in C++ is not the same as a reference in Rust. Rust's references are much closer (in usage, not in semantics) to C++'s pointers. With respect to memory representation, Rust's references often are just a single pointer, while C++'s references are supposed to be alternative names of the same object (and thus have no memory representation).
The difference between C++ pointers and Rust references is that Rust's references are never NULL, never uninitialized and never dangling.
The Add trait is implemented (see the bottom of the doc page) for the following pairs and all other numeric primitives:
&i32 + i32
i32 + &i32
&i32 + &i32
This is just a convenience thing the std-lib developers implemented. The compiler can figure out that a &mut i32 can be used wherever a &i32 can be used, but that doesn't work (yet?) for generics, so the std-lib developers would need to also implement the Add traits for the following combinations (and those for all primitives):
&mut i32 + i32
i32 + &mut i32
&mut i32 + &mut i32
&mut i32 + &i32
&i32 + &mut i32
As you can see that can get quite out of hand. I'm sure that will go away in the future. Until then, note that it's rather rare to end up with a &mut i32 and trying to use it in a mathematical expression.
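To make the difference concrete, here is a small sketch of the three cases; the function names are made up for illustration:

```rust
// &i32 + i32 has an Add impl, so no explicit dereference is needed.
fn add_one_shared(r: &i32) -> i32 {
    r + 1
}

// There is no Add impl for &mut i32, so dereference to get an i32.
fn add_one_mut(r: &mut i32) -> i32 {
    *r + 1
}

// Alternatively, reborrow the &mut i32 as a &i32 and reuse the shared impl.
fn add_one_reborrow(r: &mut i32) -> i32 {
    &*r + 1
}

fn main() {
    let mut c: i32 = 5;
    println!("{}", add_one_shared(&c));
    println!("{}", add_one_mut(&mut c));
    println!("{}", add_one_reborrow(&mut c));
}
```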
This answer is for those looking for the basics (e.g. coming from Google).
From the Rust book's References and Borrowing:
fn main() {
let s1 = String::from("hello");
let len = calculate_length(&s1);
println!("The length of '{}' is {}.", s1, len);
}
fn calculate_length(s: &String) -> usize {
s.len()
}
These ampersands represent references, and they allow you to refer to some value without taking ownership of it [i.e. borrowing].
The opposite of referencing by using & is dereferencing, which is accomplished with the dereference operator, *.
And a basic example:
let x = 5;
let y = &x; //set y to a reference to x
assert_eq!(5, x);
assert_eq!(5, *y); // dereference y
If we tried to write assert_eq!(5, y); instead, we would get a compilation error: can't compare `{integer}` with `&{integer}`.
(You can read more in the Smart Pointers chapter.)
And from Method Syntax:
Rust has a feature called automatic referencing and dereferencing. Calling methods is one of the few places in Rust that has this behavior.
Here’s how it works: when you call a method with object.something(), Rust automatically adds in &, &mut, or * so object matches the signature of the method. In other words, the following are the same:
p1.distance(&p2);
(&p1).distance(&p2);
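A runnable version of that comparison, assuming a simple Point type with a distance method (the book's snippet does not show the definition here, so the one below is a guess at its shape):

```rust
// Assumed definition of Point for this example.
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    fn distance(&self, other: &Point) -> f64 {
        let dx = self.x - other.x;
        let dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }
}

fn main() {
    let p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };

    // The receiver is borrowed automatically to match &self.
    let a = p1.distance(&p2);
    let b = (&p1).distance(&p2);
    assert_eq!(a, b);

    // Automatic dereferencing also sees through a Box to reach the method.
    let boxed = Box::new(Point { x: 0.0, y: 0.0 });
    let c = boxed.distance(&p2);
    assert_eq!(a, c);

    println!("{}", a);
}
```

Only the method receiver gets this treatment; the argument still has to be written as &p2.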
From the docs for std::ops::Add:
impl<'a, 'b> Add<&'a i32> for &'b i32
impl<'a> Add<&'a i32> for i32
impl<'a> Add<i32> for &'a i32
impl Add<i32> for i32
It seems the binary + operator for numbers is implemented for combinations of shared (but not mutable) references of the operands and owned versions of the operands. It has nothing to do with automatic dereferencing.

Why use an immutable reference to i32

In the chapter Lifetimes of the Rust book, there's an example:
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // this is the same as `let _y = 5; let y = &_y;`
let f = Foo { x: y };
println!("{}", f.x);
}
Why do they use x: &'a i32?
I think if it is just x: i32 then they cannot demonstrate the lifetime usage. However, is there any other reason behind it? Is there any production code that uses immutable reference to a primitive type like i32?
In this particular case the reason is indeed to show the concept of lifetimes. In the general case, however, I see no reason to make an immutable reference to a primitive type (mutable references, of course, are another matter) except when it is done in generic code:
struct Holder<'a, T> {
r: &'a T
}
let x: i32 = 123;
let h: Holder<i32> = Holder { r: &x };
Here, if you have such a structure, you have no other choice but to use a reference to an i32. Naturally, this structure can also be used with other, non-primitive and non-Copy types.
As Shepmaster has mentioned in the comments, there is indeed a case where you have references to primitive types: by-reference iterators. Remember, by convention (which the standard library follows), the iter() method on a collection should return an iterator of references into the collection:
let v: Vec<i32> = vec![1, 2, 3, 4];
let i = v.iter(); // i is Iterator<Item=&i32>
Then almost all methods on the iterator which take a closure will accept closures whose argument is a reference:
i.map(|n| *n + 1) // n is of type &i32
Note that this is in fact a consequence of the more general case with generics. Vectors and slices may contain arbitrary types, including non-Copy ones, so they have to provide methods that allow their users to borrow their contents.
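A small sketch of working with such an iterator of &i32; the helper names are made up, and Iterator::copied assumes Rust 1.36 or later:

```rust
// iter() yields &i32, so the closure dereferences each item.
fn incremented(v: &[i32]) -> Vec<i32> {
    v.iter().map(|n| *n + 1).collect()
}

// copied() turns an iterator of &i32 into an iterator of i32.
fn total(v: &[i32]) -> i32 {
    v.iter().copied().sum()
}

fn main() {
    let v: Vec<i32> = vec![1, 2, 3, 4];
    println!("{:?}", incremented(&v));
    println!("{}", total(&v));
    // `v` was only borrowed, so it is still available here.
    println!("{:?}", v);
}
```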

Resources