Rust views over a shared refcounted array - rust

I am trying to implement multiple "slices" over a shared array of objects. My approach is, roughly, to define:
struct MySlice<T>{ shared_data: Rc<[T]> , beg: usize, len: usize }
I want to be able to implement both
impl From<Vec<T>> for MySlice<T> { /* ?? */ }
and fn get_mut_slice(&mut MySlice<T>) -> &mut [T] which returns the entire underlying slice.
With the above implementation I can get the former but not the latter. I tried modifying the type into
struct MySlice1<T>{ shared_data: Rc<RefCell<[T]>> , beg: usize, len: usize }
But I couldn't get it to work. Is there an idiomatic way to get both?

The first can be done by storing an Rc<RefCell<Vec<T>>>. I don't think this can be done with a slice, unfortunately.
The second cannot exist as-is. If you have shared ownership, and you want to get a mutable reference, you have to make sure nobody else will use the value as long as the mutable reference is active. This can only be done with a guard. You can change its signature to:
pub fn get_mut_slice(&mut MySlice<T>) -> RefMut<'_, [T]>
Then:
pub fn get_mut_slice(&mut self) -> RefMut<'_, [T]> {
RefMut::map(self.shared_data.borrow_mut(), |s| {
&mut s[self.beg..][..self.len]
})
}
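Putting both pieces together, a minimal sketch could look like the following. It assumes the Rc<RefCell<Vec<T>>> storage suggested above; the sub_view helper is hypothetical and only there to show two views sharing one buffer.

```rust
use std::cell::{RefCell, RefMut};
use std::rc::Rc;

pub struct MySlice<T> {
    shared_data: Rc<RefCell<Vec<T>>>,
    beg: usize,
    len: usize,
}

impl<T> From<Vec<T>> for MySlice<T> {
    fn from(v: Vec<T>) -> Self {
        // The initial view covers the whole vector.
        let len = v.len();
        MySlice { shared_data: Rc::new(RefCell::new(v)), beg: 0, len }
    }
}

impl<T> MySlice<T> {
    /// Hypothetical helper: a narrower view over the same shared storage.
    pub fn sub_view(&self, beg: usize, len: usize) -> Self {
        assert!(beg + len <= self.len, "sub-view out of bounds");
        MySlice {
            shared_data: Rc::clone(&self.shared_data),
            beg: self.beg + beg,
            len,
        }
    }

    /// Returns a guard; the borrow is released when the guard is dropped.
    /// Panics if another guard on the same storage is still alive.
    pub fn get_mut_slice(&mut self) -> RefMut<'_, [T]> {
        let (beg, len) = (self.beg, self.len);
        RefMut::map(self.shared_data.borrow_mut(), move |v| &mut v[beg..][..len])
    }
}
```

Because the exclusivity check happens at run time, holding two guards from different views at once panics rather than failing to compile.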

Related

Returning an array/slice of `u32` from a function

I have a reasonably simple function (let's call it intersection) that takes two parameters of type &[u32] and I'd like the return type to be &[u32]. This function takes in two slices (arrays?) and returns a new slice (array?) containing the elements that are in both slices.
pub fn intersection<'a>(left: &'a [u32], right: &'a [u32]) -> &'a [u32] {
let left_set: HashSet<u32> = left.iter().cloned().collect();
let right_set: HashSet<u32> = right.iter().cloned().collect();
// I can't figure out how to get a
// `&[u32]` output idiomatically
let result: &[u32] = left_set
.intersection(&right_set)
.into_iter()
.....
.....
result //<- this is a slice
}
I suppose I could do something like create a Vec<u32>, but then the borrow checker doesn't like me returning a slice of that Vec<u32>.
pub fn intersection<'a>(left: &'a [u32], right: &'a [u32]) -> &'a [u32] {
.....
.....
let mut result: Vec<u32> = left_set
.intersection(&right_set)
.into_iter()
.cloned()
.collect();
result.sort();
result.as_slice() //<-- ERROR cannot return reference to local variable
// `result` returns a reference to data owned by the current function
}
I'm probably missing a trick here. Any advice on how to do this idiomatically in Rust?
"This function takes in two arrays"
No, it takes two slices.
"I'm probably missing a trick here. Any advice on how to do this idiomatically in Rust?"
There is no trick and you can't. A slice is a form of borrow, by definition a slice refers to memory owned by some other collection (static memory, a vector, an array, ...).
This means like every other borrow it can't be returned if it borrows data from the local scope, that would result in a dangling pointer (as the actual owner will get destroyed when the scope ends).
The correct thing to do is to just return a Vec:
pub fn intersection(left: &[u32], right: &[u32]) -> Vec<u32> {
left.iter().collect::<HashSet<_>>().intersection(
&right.iter().collect()
).map(|&&v| v).collect()
}
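Note that iterating a HashSet yields elements in an unspecified order. If a sorted result is wanted, as in the question, one possible variant is sketched below; it is an illustration rather than the only way to write it.

```rust
use std::collections::HashSet;

/// Returns the values present in both slices, sorted and without duplicates.
pub fn intersection(left: &[u32], right: &[u32]) -> Vec<u32> {
    // Build a lookup set from one side only.
    let right_set: HashSet<u32> = right.iter().copied().collect();

    // Keep the elements of `left` that also occur in `right`.
    let mut result: Vec<u32> = left
        .iter()
        .copied()
        .filter(|v| right_set.contains(v))
        .collect();

    // Sort, then drop repeats that came from duplicates in `left`.
    result.sort_unstable();
    result.dedup();
    result
}
```

The caller owns the returned Vec<u32> and can borrow it as a &[u32] wherever a slice is needed.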
Or if it's very common for one of the slices to be a subset of the other and you're happy paying for the check (possibly because you can use something like a bitmap) you could return a Cow and in the subset case return the subset slice:
pub fn intersection<'a>(left: &'a [u32], right: &'a [u32]) -> Cow<'a, [u32]> {
if issubset(left, right) {
Cow::Borrowed(left)
} else if issubset(right, left) {
Cow::Borrowed(right)
} else {
Cow::Owned(
left.iter().collect::<HashSet<_>>().intersection(
&right.iter().collect()
).map(|&&v| v).collect()
)
}
}
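The snippet above leaves issubset undefined. A self-contained sketch follows, where is_subset is a hypothetical helper that treats both slices as sets; under that assumption a borrowed result may still contain duplicates that were present in the input.

```rust
use std::borrow::Cow;
use std::collections::HashSet;

/// Hypothetical helper: true if every element of `a` also occurs in `b`.
fn is_subset(a: &[u32], b: &[u32]) -> bool {
    let b_set: HashSet<u32> = b.iter().copied().collect();
    a.iter().all(|v| b_set.contains(v))
}

pub fn intersection<'a>(left: &'a [u32], right: &'a [u32]) -> Cow<'a, [u32]> {
    if is_subset(left, right) {
        // No allocation: the intersection is `left` itself.
        Cow::Borrowed(left)
    } else if is_subset(right, left) {
        Cow::Borrowed(right)
    } else {
        // General case: build and own a new vector.
        let right_set: HashSet<u32> = right.iter().copied().collect();
        let mut owned: Vec<u32> = left
            .iter()
            .copied()
            .filter(|v| right_set.contains(v))
            .collect();
        owned.sort_unstable();
        owned.dedup();
        Cow::Owned(owned)
    }
}
```

Cow<[u32]> dereferences to [u32], so callers can use the result like a slice in either case.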

How can I pass something string-like with the `Read` trait to an implementation of `From`?

I'm writing a tokenizer, and for convenience I wrote a Reader object, that returns words one at a time. When words is exhausted, it reads from the BufReader to populate the words. Accordingly, I figured that file and words should both live in the struct.
The problem I'm having is that I want to test it by passing in strings to be tokenized, rather than having to rely on files. That's why I tried to implement From on both a File and then &str and String. The latter two don't work (as highlighted below).
I tried to annotate Reader with a lifetime, that I then used in the implementation of From<&'a str>, but that didn't work. I ended up with a Reader<'a, T: Read>, but the compiler complained that nothing used the lifetime parameter.
An alternative implementation of From<&'static str> works fine, but that means any strings passed in have to exist for the static lifetime.
I also saw this question/answer, but it seems to be different since their Enum has a lifetime parameter.
I have two supplementary questions along with my overall question in the title:
I also saw FromStr, but haven't tried to use that yet - is it appropriate for this?
Are my code comments re variable ownership/lifetimes below correct?
My minimal example is here (with imports elided):
#[derive(Debug)]
struct Reader<T: Read> {
file: BufReader<T>,
words: Vec<String>,
}
impl From<File> for Reader<File> {
fn from(value: File) -> Self { // value moves into from
Reader::new(BufReader::new(value)) // value moves into BufReader(?)
}
}
// THE NEXT TWO DON'T WORK
impl From<&str> for Reader<&[u8]> {
fn from(value: &str) -> Self { // Compiler can't know how long the underlying data lives
Reader::new(BufReader::new(value.as_bytes())) // The data may not live as long as BufReader
}
}
impl From<String> for Reader<&[u8]> {
fn from(value: String) -> Self { // value moves into from
Reader::new(BufReader::new(value.as_bytes())) // value doesn't move into BufReader or Reader
} // value gets dropped
}
impl<T: Read> Reader<T> {
fn new(input: BufReader<T>) -> Self {
Self {
file: input,
words: vec![],
}
}
}
The &str one compiles with lifetime annotations:
impl<'a> From<&'a str> for Reader<&'a [u8]> {
fn from(value: &'a str) -> Self {
Reader::new(BufReader::new(value.as_bytes()))
}
}
As discussed in the comments, you need to only annotate the reference, not try to incorporate lifetime annotations into the Reader itself.
Note that the same approach doesn't work for String, because the signature of from moves the String into the function, and the function cannot return a reference to bytes owned by a local variable. You could implement it for &String, but then you might as well use &str.
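If an owned String really is needed, one option (not part of the answer above, so treat it as an assumption) is to let the reader own the bytes by wrapping the String in std::io::Cursor, which implements Read for any T: AsRef<[u8]>. The refill method below is a hypothetical stand-in for the tokenizer's word-loading logic.

```rust
use std::io::{self, BufRead, BufReader, Cursor, Read};

struct Reader<T: Read> {
    file: BufReader<T>,
    words: Vec<String>,
}

impl<T: Read> Reader<T> {
    fn new(input: BufReader<T>) -> Self {
        Self { file: input, words: vec![] }
    }

    /// Hypothetical helper: read one line and append its words.
    /// Returns the number of bytes read (0 at end of input).
    fn refill(&mut self) -> io::Result<usize> {
        let mut line = String::new();
        let n = self.file.read_line(&mut line)?;
        self.words.extend(line.split_whitespace().map(String::from));
        Ok(n)
    }
}

// Borrowed input: the reader cannot outlive the string it reads from.
impl<'a> From<&'a str> for Reader<&'a [u8]> {
    fn from(value: &'a str) -> Self {
        Reader::new(BufReader::new(value.as_bytes()))
    }
}

// Owned input: the String moves into the Cursor, so nothing is borrowed.
impl From<String> for Reader<Cursor<String>> {
    fn from(value: String) -> Self {
        Reader::new(BufReader::new(Cursor::new(value)))
    }
}
```

With this, tests can build a reader from either a string literal or a String without touching the file system.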

How can I copy a shared slice to have a boxed slice?

I have a container:
pub struct Foo<T> {
pub data: Box<[T]>,
}
I would like a method to initialize a new one from an existing slice:
impl<T> Foo<T> {
fn from_slice(slice: &[T]) -> Foo<T> {
Foo {
data: Box::new(/* something here */),
}
}
}
I'd like to create a Foo instance from any kind of slice, coming from a dynamic vector or a static string.
I suppose there is a reason why vec! is a macro, but is there a way to avoid writing one? I guess I could do slice.to_vec().into_boxed_slice(), but it doesn't seem right to create a Vec as a proxy to a clone...
I'm not using a Vec in my struct because the data isn't supposed to change in size during the lifetime of my container. It didn't feel right to use a Vec but I may be wrong.
If your slice contains Copy types, you can use From / Into to perform the construction:
pub struct Foo<T> {
pub data: Box<[T]>,
}
impl<T> Foo<T> {
fn from_slice(slice: &[T]) -> Foo<T>
where
T: Copy,
{
Foo { data: slice.into() }
}
}
If your data is Clone, then you can use to_vec + into_boxed_slice:
impl<T> Foo<T> {
fn from_slice(slice: &[T]) -> Foo<T>
where
T: Clone,
{
Foo { data: slice.to_vec().into_boxed_slice() }
}
}
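On recent toolchains the distinction may matter less: to my knowledge the standard library's From<&[T]> for Box<[T]> now only requires T: Clone, so a sketch along these lines should work for non-Copy element types as well. Check the documentation for the compiler version you target before relying on it.

```rust
pub struct Foo<T> {
    pub data: Box<[T]>,
}

impl<T: Clone> Foo<T> {
    /// Clones every element of `slice` into a freshly allocated boxed slice.
    pub fn from_slice(slice: &[T]) -> Foo<T> {
        // Assumes `impl<T: Clone> From<&[T]> for Box<[T]>` is available.
        Foo { data: Box::from(slice) }
    }
}
```

The same constructor accepts a slice borrowed from a Vec, an array, or the bytes of a string literal.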
"it doesn't seem right to create a Vec as a proxy to a clone"
You aren't cloning here. When you clone a type T, you get a type T back. You are starting with a &[T] and want to get a Box<[T]>, not a [T] (which you can't have).
Creating a boxed slice through a Vec means that you temporarily take up 3 machine-sized integers instead of 2; this is unlikely to be a performance problem compared to the amount of allocation performed.
I do agree with starblue's answer that keeping a Vec<T> is probably simpler for most cases, but I admit that there are times where it's useful to have a boxed slice.
See also:
Initialize boxed slice without clone or copy
What is the use of into_boxed_slice() methods?
Performance comparison of a Vec and a boxed slice
"I suppose there is a reason why vec! is a macro"
The implementation of vec! is public:
macro_rules! vec {
($elem:expr; $n:expr) => (
$crate::vec::from_elem($elem, $n)
);
($($x:expr),*) => (
<[_]>::into_vec(box [$($x),*])
);
($($x:expr,)*) => (vec![$($x),*])
}
It's really only a macro for syntax convenience (and because it uses the unstable box keyword); it takes the arguments, creates an array, boxes it, coerces it to a boxed slice, then converts it to a Vec.
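The same sequence of steps can be written out by hand on stable Rust, using Box::new in place of the box keyword. This is only an illustration of the steps described above, not the macro's actual expansion; the function name is hypothetical.

```rust
/// Hypothetical demo: build a Vec the way the list form of `vec!` is described.
pub fn vec_via_boxed_slice() -> Vec<i32> {
    // 1. Create an array and box it.
    let boxed_array: Box<[i32; 3]> = Box::new([1, 2, 3]);
    // 2. Coerce the boxed array to a boxed slice (unsized coercion).
    let boxed_slice: Box<[i32]> = boxed_array;
    // 3. Convert the boxed slice into a Vec, reusing the allocation.
    boxed_slice.into_vec()
}
```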

Can I implement a trait which adds information to an external type in Rust?

I just implemented a simple trait to keep the history of a struct property:
fn main() {
let mut weight = Weight::new(2);
weight.set(3);
weight.set(5);
println!("Current weight: {}. History: {:?}", weight.value, weight.history);
}
trait History<T: Copy> {
fn set(&mut self, value: T);
fn history(&self) -> &Vec<T>;
}
impl History<u32> for Weight {
fn set(&mut self, value: u32) {
self.history.push(self.value);
self.value = value;
}
fn history(&self) -> &Vec<u32> {
&self.history
}
}
pub struct Weight {
value: u32,
history: Vec<u32>,
}
impl Weight {
fn new(value: u32) -> Weight {
Weight {
value,
history: Vec::new(),
}
}
}
I don't expect this is possible, but could you add the History trait (or something equivalent) to something which doesn't already have a history property (like u32 or String), effectively tacking on some information about which values the variable has taken?
No. Traits cannot add data members to the existing structures. Actually, only a programmer can do that by modifying the definition of a structure. Wrapper structures or hash-tables are the ways to go.
No, traits can only contain behavior, not data. But you could make a struct.
If you could implement History for u32, you'd have to keep the entire history of every u32 object indefinitely, in case one day someone decided to call .history() on it. (Also, what would happen when you assign one u32 to another? Does its history come with it, or does the new value just get added to the list?)
Instead, you probably want to be able to mark specific u32 objects to keep a history. A wrapper struct, as red75prime's answer suggests, will work:
mod hist {
use std::mem;
pub struct History<T> {
value: T,
history: Vec<T>,
}
impl<T> History<T> {
pub fn new(value: T) -> Self {
History {
value,
history: Vec::new(),
}
}
pub fn set(&mut self, value: T) {
self.history.push(mem::replace(&mut self.value, value));
}
pub fn get(&self) -> T
where
T: Copy,
{
self.value
}
pub fn history(&self) -> &[T] {
&self.history
}
}
}
It's generic, so you can have a History<u32> or History<String> or whatever you want, but the get() method will only be implemented when the wrapped type is Copy.* Your Weight type could just be an alias for History<u32>.
Wrapping this code in a module is a necessary part of maintaining the abstraction: outside the module the fields are private, so you can't write weight.value and have to call weight.get() instead. If value were marked pub, you could assign directly to weight.value (bypassing set) and then history would be inaccurate.
As a side note, you almost never want &Vec<T> when you can use &[T], so I changed the signature of history(). Another thing you might consider is returning an iterator over the previous values (perhaps in reverse order) instead of a slice.
* A better way of getting the T out of a History<T> is to implement Deref and write *foo instead of foo.get().
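A minimal sketch of the Deref approach from the footnote, reusing the History<T> wrapper from above. Only Deref is implemented, not DerefMut, so the value still cannot be changed without going through set.

```rust
mod hist {
    use std::mem;
    use std::ops::Deref;

    pub struct History<T> {
        value: T,
        history: Vec<T>,
    }

    impl<T> History<T> {
        pub fn new(value: T) -> Self {
            History { value, history: Vec::new() }
        }

        /// Stores the new value and records the previous one.
        pub fn set(&mut self, value: T) {
            self.history.push(mem::replace(&mut self.value, value));
        }

        pub fn history(&self) -> &[T] {
            &self.history
        }
    }

    // Read-only access to the current value via `*h` or method auto-deref.
    impl<T> Deref for History<T> {
        type Target = T;

        fn deref(&self) -> &T {
            &self.value
        }
    }
}
```

Unlike get(), this works for non-Copy types too: with a History<String>, methods such as len() can be called directly on the wrapper.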

How can I model a bidirectional map without annoying the borrow checker?

From "Why can't I store a value and a reference to that value in the same struct?" I learned that I cannot store a value and a reference in the same struct.
The proposed solution is:
The easiest and most recommended solution is to not attempt to put these items in the same structure together. By doing this, your structure nesting will mimic the lifetimes of your code.
Place types that own data into a structure together and then provide methods that allow you to get references or objects containing references as needed.
However, I do not know how to apply this in my concrete case:
I want to build a bidirectional map, implemented by two internal HashMaps.
Clearly, one of them has to own the data. However, the other part is also essential to the bidirectional map, so I don't see how I could separate these two while still maintaining a bidirectional map interface.
struct BidiMap<'a, S: 'a, T: 'a> { ? }
fn put(&mut self, s: S, t: T) -> ()
fn get(&self, s: &S) -> T
fn get_reverse(&self, t: &T) -> S
In this case, the easiest solution is to do what a language with a garbage collector would do and share ownership of the values:
use std::collections::HashMap;
use std::rc::Rc;
use std::hash::Hash;
use std::ops::Deref;
struct BidiMap<A, B> {
left_to_right: HashMap<Rc<A>, Rc<B>>,
right_to_left: HashMap<Rc<B>, Rc<A>>,
}
impl<A, B> BidiMap<A, B>
where
A: Eq + Hash,
B: Eq + Hash,
{
fn new() -> Self {
BidiMap {
left_to_right: HashMap::new(),
right_to_left: HashMap::new(),
}
}
fn put(&mut self, a: A, b: B) {
let a = Rc::new(a);
let b = Rc::new(b);
self.left_to_right.insert(a.clone(), b.clone());
self.right_to_left.insert(b, a);
}
fn get(&self, a: &A) -> Option<&B> {
self.left_to_right.get(a).map(Deref::deref)
}
fn get_reverse(&self, b: &B) -> Option<&A> {
self.right_to_left.get(b).map(Deref::deref)
}
}
fn main() {
let mut map = BidiMap::new();
map.put(1, 2);
println!("{:?}", map.get(&1));
println!("{:?}", map.get_reverse(&2));
}
Of course, you'd want much more rigorous code, as put as written lets you break the bidirectional mapping (for example, by inserting a key that is already present on one side). This just shows you one way of solving the problem.
"Clearly, one of them has to own the data"
Clearly, that's not true ^_^. In this case, both maps share the ownership using Rc.
Benchmark this solution to know if it's efficient enough.
Doing anything more efficient would require much heavier thinking about ownership. For example, if the left_to_right map owned the data and you used a raw pointer in the other map, that pointer would become invalidated as soon as the first map reallocated.
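As one example of the extra rigor mentioned earlier, put can remove any existing pair that conflicts with the new one before inserting, so the two maps always mirror each other. This is a sketch of one possible policy (the newest pair wins); other policies, such as rejecting conflicting inserts, are equally valid.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::rc::Rc;

pub struct BidiMap<A, B> {
    left_to_right: HashMap<Rc<A>, Rc<B>>,
    right_to_left: HashMap<Rc<B>, Rc<A>>,
}

impl<A: Eq + Hash, B: Eq + Hash> BidiMap<A, B> {
    pub fn new() -> Self {
        BidiMap { left_to_right: HashMap::new(), right_to_left: HashMap::new() }
    }

    pub fn put(&mut self, a: A, b: B) {
        // Remove a stale pair that already uses `a` on the left side.
        if let Some(old_b) = self.left_to_right.remove(&a) {
            self.right_to_left.remove(&old_b);
        }
        // Remove a stale pair that already uses `b` on the right side.
        if let Some(old_a) = self.right_to_left.remove(&b) {
            self.left_to_right.remove(&old_a);
        }
        let a = Rc::new(a);
        let b = Rc::new(b);
        self.left_to_right.insert(Rc::clone(&a), Rc::clone(&b));
        self.right_to_left.insert(b, a);
    }

    pub fn get(&self, a: &A) -> Option<&B> {
        self.left_to_right.get(a).map(|b| &**b)
    }

    pub fn get_reverse(&self, b: &B) -> Option<&A> {
        self.right_to_left.get(b).map(|a| &**a)
    }
}
```

Each put now costs up to four extra hash lookups, which is usually a reasonable price for keeping both directions consistent.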
