There seem to be two ways to try to turn a vector into an array, either via a slice (fn a) or directly (fn b):
use std::array::TryFromSliceError;
use std::convert::TryInto;

type Input = Vec<u8>;
type Output = [u8; 1000];

// Rust 1.47
pub fn a(vec: Input) -> Result<Output, TryFromSliceError> {
    vec.as_slice().try_into()
}

// Rust 1.48
pub fn b(vec: Input) -> Result<Output, Input> {
    vec.try_into()
}
Practically speaking, what's the difference between these? Is it just the error type? The fact that the latter was added makes me wonder whether there's more to it than that.
They have slightly different behavior.
The slice to array implementation will copy the elements from the slice. It has to copy instead of move because the slice doesn't own the elements.
The Vec to array implementation will consume the Vec and move its contents to the new array. It can do this because it does own the elements.
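The difference shows up most clearly with element types that are not Copy. The sketch below is only illustrative (the function names slice_to_array and vec_to_array are made up for this example): the slice conversion requires T: Copy and leaves the source intact, while the Vec conversion also works for String and hands the Vec back as the error when the length does not match.

```rust
use std::array::TryFromSliceError;
use std::convert::TryInto;

// Slice -> array: elements are copied, so T must be Copy (u8 here).
// The caller keeps ownership of the original data.
pub fn slice_to_array(bytes: &[u8]) -> Result<[u8; 4], TryFromSliceError> {
    bytes.try_into()
}

// Vec -> array: the Vec is consumed and its elements are moved,
// so this also works for non-Copy types such as String.
// On a length mismatch the original Vec is returned in the Err variant.
pub fn vec_to_array(strings: Vec<String>) -> Result<[String; 2], Vec<String>> {
    strings.try_into()
}
```

Note that the error type of the Vec conversion gives the input back, so nothing is lost when the length is wrong.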
Related
How can you easily borrow a vector of vectors as a slice of slices?
fn use_slice_of_slices<T>(slice_of_slices: &[&[T]]) {
    // Do something...
}

fn main() {
    let vec_of_vec = vec![vec![0]; 10];
    use_slice_of_slices(&vec_of_vec);
}
I will get the following error:
error[E0308]: mismatched types
 --> src/main.rs:7:25
  |
7 |     use_slice_of_slices(&vec_of_vec);
  |                         ^^^^^^^^^^^ expected slice, found struct `std::vec::Vec`
  |
  = note: expected type `&[&[_]]`
             found type `&std::vec::Vec<std::vec::Vec<{integer}>>`
I could just as easily define use_slice_of_slices as
fn use_slice_of_slices<T>(slice_of_slices: &[Vec<T>]) {
    // Do something
}
and the outer vector would be borrowed as a slice and all would work. But what if, just for the sake of argument, I want to borrow it as a slice of slices?
Assuming automatic coercing from &Vec<Vec<T>> to &[&[T]] is not possible, then how can I define a function borrow_vec_of_vec as below?
fn borrow_vec_of_vec<'a, T: 'a>(vec_of_vec: Vec<Vec<T>>) -> &'a [&'a [T]] {
    // Borrow vec_of_vec...
}
To put it in another way, how could I implement Borrow<[&[T]]> for Vec<Vec<T>>?
You cannot.
By definition, a slice is a view on an existing collection of elements. It cannot conjure up new elements, or new views of existing elements, out of thin air.
This stems from the fact that Rust generic parameters are generally invariant. That is, while a &Vec<T> can be converted to a &[T] after a fashion, the T in those two expressions MUST match.
A possible work-around is to go generic yourself.
use std::fmt::Debug;

fn use_slice_of_slices<U, T>(slice_of_slices: &[U])
where
    U: AsRef<[T]>,
    T: Debug,
{
    for slice in slice_of_slices {
        println!("{:?}", slice.as_ref());
    }
}

fn main() {
    let vec_of_vec = vec![vec![0]; 10];
    use_slice_of_slices(&vec_of_vec);
}
Instead of imposing what the type of the element should be, you accept any type, but place a bound requiring that it can be viewed as a [T] through AsRef.
This has nearly the same effect, as the generic function can then only manipulate each element as a [T] slice. As a bonus, it works with multiple types (any type that implements AsRef<[T]>).
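To illustrate the "works with multiple types" point, here is a small sketch using the same AsRef bound. The function total_len is a hypothetical helper, not part of the original answer; it accepts a slice of vectors, a slice of arrays, or a slice of slices.

```rust
// Sums the lengths of all inner sequences, whatever their concrete type,
// as long as each one can be viewed as a [T] through AsRef.
pub fn total_len<U, T>(rows: &[U]) -> usize
where
    U: AsRef<[T]>,
{
    rows.iter()
        .map(|row| {
            // View the element as a plain slice before using it.
            let slice: &[T] = row.as_ref();
            slice.len()
        })
        .sum()
}
```

Calling it with &vec![vec![1, 2], vec![3]], with &[[1, 2], [3, 4]], or with an array of &[i32] slices should all compile against the same signature.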
A deref coercion from &Vec<T> to &[T] is cheap. A Vec<T> is represented by a struct essentially containing a pointer to the heap-allocated data, the capacity of the heap allocation and the current length of the vector. A slice &[T] is a fat pointer consisting of a pointer to the data and the length of the slice. The conversion from &Vec<T> to &[T] essentially just copies the pointer and the length from the Vec<T> struct into a new fat pointer.
If we want to convert from Vec<Vec<T>> to &[&[T]], we need to perform the above conversion for each of the inner vectors. This means we need to store an unknown number of fat pointers, and space for them has to be allocated somewhere. When converting a single vector, the compiler reserves space for the single resulting fat pointer on the stack. For an unknown, potentially large, number of fat pointers this is not possible, and the conversion is no longer cheap either. This is why the conversion isn't available implicitly, and you need to write explicit code for it.
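The sizes involved can be checked directly. This is a minimal sketch with made-up helper names; the three-word size of Vec<T> reflects the current standard library implementation rather than a documented layout guarantee.

```rust
use std::mem::size_of;

// Number of machine words in a slice reference: pointer + length.
pub fn slice_ref_words() -> usize {
    size_of::<&[u8]>() / size_of::<usize>()
}

// Number of machine words in a Vec: pointer + capacity + length
// (an implementation detail of the current standard library).
pub fn vec_words() -> usize {
    size_of::<Vec<u8>>() / size_of::<usize>()
}
```

Because the two representations differ in size, a &[Vec<T>] cannot simply be reinterpreted as a &[&[T]]; each element has to be converted and stored separately.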
So whenever you can, you should instead change your function signature as suggested in Matthieu's answer. If you don't control the function signature, your only choice is to write the explicit conversion code, allocating a new vector:
fn vecs_to_slices<T>(vecs: &[Vec<T>]) -> Vec<&[T]> {
    vecs.iter().map(Vec::as_slice).collect()
}
Applied to the functions in the original post, this can be used like this:
use_slice_of_slices(&vecs_to_slices(&vec_of_vec));
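Put together, the explicit conversion might look like the following sketch. Here use_slice_of_slices is given a made-up body (counting elements) so that the result is observable, and count_elements is a hypothetical caller.

```rust
// Stand-in for a function whose signature we do not control.
pub fn use_slice_of_slices<T>(slice_of_slices: &[&[T]]) -> usize {
    slice_of_slices.iter().map(|inner| inner.len()).sum()
}

// Allocates a new Vec holding one fat pointer per inner vector.
pub fn vecs_to_slices<T>(vecs: &[Vec<T>]) -> Vec<&[T]> {
    vecs.iter().map(Vec::as_slice).collect()
}

pub fn count_elements(vec_of_vec: &[Vec<i32>]) -> usize {
    // The temporary Vec<&[i32]> lives until the end of this statement,
    // which is long enough for the call that borrows it.
    use_slice_of_slices(&vecs_to_slices(vec_of_vec))
}
```

The extra allocation is proportional to the number of inner vectors, not to the number of elements.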
Generalized Question
How can I implement a general function pinned_array_of_default in stable Rust where [T; N] is too large to fit on the stack?
fn pinned_array_of_default<T: Default, const N: usize>() -> Pin<Box<[T; N]>> {
    unimplemented!()
}
Alternatively, T can implement Copy if that makes the process easier.
fn pinned_array_of_element<T: Copy, const N: usize>(x: T) -> Pin<Box<[T; N]>> {
    unimplemented!()
}
Keeping the solution in safe Rust would have been preferable, but it seems unlikely that it is possible.
Approaches
Initially I was hoping that by implementing Default I might be able to get Default to handle the initial allocation. However, it still creates the array on the stack, so this will not work for large values of N.
let boxed: Box<[T; N]> = Box::default();
let foo = Pin::new(boxed);
I suspect I need to use MaybeUninit to achieve this and there is a Box::new_uninit() function, but it is currently unstable and I would ideally like to keep this within stable Rust. I am also somewhat unsure whether transmuting Pin<Box<MaybeUninit<B>>> to Pin<Box<B>> could somehow have negative effects on the Pin.
Background
The purpose behind using a Pin<Box<[T; N]>> is to hold a block of pointers where N is some constant factor/multiple of the page size.
#[repr(C)]
#[derive(Copy, Clone)]
pub union Foo<R: ?Sized> {
    assigned: NonNull<R>,
    next_unused: Option<NonNull<Self>>,
}
Each pointer may or may not be in use at a given point in time. An in-use Foo points to R, and an unused/empty Foo has a pointer to either the next empty Foo in the block or None. A pointer to the first unused Foo in the block is stored separately. When a block is full, a new block is created and the pointer chain of unused positions continues through the next block.
The box needs to be pinned since it will contain self-referential pointers as well as outside structs holding pointers into assigned positions in each block.
I know that Foo is wildly unsafe by Rust standards, but the general question of creating a Pin<Box<[T; N]>> still stands.
A way to construct a large array on the heap and avoid creating it on the stack is to proxy through a Vec. You can construct the elements and use .into_boxed_slice() to get a Box<[T]>. You can then use .try_into() to convert it to a Box<[T; N]>. And then use .into() to convert it to a Pin<Box<[T; N]>>:
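use std::convert::TryInto;
use std::pin::Pin;

fn pinned_array_of_default<T: Default, const N: usize>() -> Pin<Box<[T; N]>> {
    // Build the elements on the heap, never materializing [T; N] on the stack.
    let mut vec = vec![];
    vec.resize_with(N, T::default);
    // The length is exactly N, so the conversion cannot fail.
    let boxed: Box<[T; N]> = match vec.into_boxed_slice().try_into() {
        Ok(boxed) => boxed,
        Err(_) => unreachable!(),
    };
    boxed.into()
}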
You can optionally make this look more straightforward if you add T: Clone so that you can do vec![T::default(); N] and/or add T: Debug so you can use .unwrap() or .expect().
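The same approach covers the T: Copy variant from the question. The following is a sketch along those lines, not a benchmarked implementation; it relies on vec![x; N] building the elements directly in the heap allocation.

```rust
use std::convert::TryInto;
use std::pin::Pin;

pub fn pinned_array_of_element<T: Copy, const N: usize>(x: T) -> Pin<Box<[T; N]>> {
    // vec![x; N] fills a heap buffer with N copies of x.
    let boxed: Box<[T; N]> = match vec![x; N].into_boxed_slice().try_into() {
        Ok(boxed) => boxed,
        // The boxed slice has exactly N elements, so this cannot happen.
        Err(_) => unreachable!(),
    };
    // Box<T> converts into Pin<Box<T>> without moving the heap data.
    boxed.into()
}
```

For example, pinned_array_of_element::<u64, 100_000>(7) yields a pinned, heap-allocated array of 100,000 sevens.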
See also:
Creating a fixed-size array on heap in Rust
As I understand it, the idiomatic way to apply a function to each element of a structure in Rust is to implement IntoIterator and FromIterator and use map and collect. Like this:
enum F<A> {
    // fields omitted
}

impl<A> IntoIterator for F<A> {
    // implementation omitted
}

impl<A> FromIterator<A> for F<A> {
    // implementation omitted
}

fn mapF<A, B, G>(x: F<A>, f: G) -> F<B>
where
    G: Fn(A) -> B,
{
    x.into_iter().map(f).collect()
}
However it doesn't seem possible to implement FromIterator for a tree, because there are multiple ways to organize a sequence of values into a tree. Is there some way around this?
the idiomatic way to apply a function to each element of a structure in Rust, is to implement IntoIterator and FromIterator
This is not quite true. The idiomatic way is to provide an iterator, but you don't have to implement these traits.
Take for example &str: there isn't a canonical way to iterate over a string. You could iterate over its bytes or its characters, therefore it doesn't implement IntoIterator but has two methods, bytes and chars, each returning a different type of iterator.
A tree would be similar: there isn't a single way to iterate a tree, so it could have a depth_first_search method returning a DepthFirstSearch iterator and a breadth_first_search method returning a BreadthFirstSearch iterator.
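As a rough sketch of that idea, assume a hypothetical binary tree whose values live only in the leaves. The type Tree and the iterator types below are invented for this example and are not taken from the question.

```rust
use std::collections::VecDeque;

// Hypothetical tree type: values are stored in the leaves only.
pub enum Tree<A> {
    Leaf(A),
    Node(Box<Tree<A>>, Box<Tree<A>>),
}

// Visits leaves depth-first, left to right, using an explicit stack.
pub struct DepthFirstSearch<'a, A> {
    stack: Vec<&'a Tree<A>>,
}

impl<'a, A> Iterator for DepthFirstSearch<'a, A> {
    type Item = &'a A;

    fn next(&mut self) -> Option<Self::Item> {
        while let Some(node) = self.stack.pop() {
            match node {
                Tree::Leaf(value) => return Some(value),
                Tree::Node(left, right) => {
                    // Push right first so that left is popped first.
                    self.stack.push(&**right);
                    self.stack.push(&**left);
                }
            }
        }
        None
    }
}

// Visits leaves level by level, using a queue.
pub struct BreadthFirstSearch<'a, A> {
    queue: VecDeque<&'a Tree<A>>,
}

impl<'a, A> Iterator for BreadthFirstSearch<'a, A> {
    type Item = &'a A;

    fn next(&mut self) -> Option<Self::Item> {
        while let Some(node) = self.queue.pop_front() {
            match node {
                Tree::Leaf(value) => return Some(value),
                Tree::Node(left, right) => {
                    self.queue.push_back(&**left);
                    self.queue.push_back(&**right);
                }
            }
        }
        None
    }
}

impl<A> Tree<A> {
    pub fn depth_first_search(&self) -> DepthFirstSearch<'_, A> {
        DepthFirstSearch { stack: vec![self] }
    }

    pub fn breadth_first_search(&self) -> BreadthFirstSearch<'_, A> {
        BreadthFirstSearch { queue: VecDeque::from(vec![self]) }
    }
}
```

The two methods yield the same values in different orders, which is exactly why a single IntoIterator implementation would be an arbitrary choice.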
Similarly, a String can be constructed from an iterator of &str or an iterator of char, so String implements both FromIterator<&str> and FromIterator<char>, but it does not implement FromIterator<u8> because random bytes are unlikely to form a valid UTF-8 string.
That is, there isn't always a one-to-one relation between a collection, and its iterator.
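A short sketch of both directions with the standard string types (the helper function names are made up for illustration):

```rust
// Two different ways to iterate over the same &str.
pub fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.bytes().count(), s.chars().count())
}

// Two different item types that can be collected into a String.
pub fn string_from_chars(chars: &[char]) -> String {
    chars.iter().copied().collect()
}

pub fn string_from_strs(parts: &[&str]) -> String {
    parts.iter().copied().collect()
}
```

For a string containing non-ASCII characters, such as "héllo", the two counts differ because é takes two bytes in UTF-8.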
and use […] collect
This is (mostly) incorrect. Collecting is not a good way to consume an iterator, unless you actually want to use the collected result afterwards. If you only want to execute the effect of an iterator, use for or the for_each method.
You could include information about tree structure into the iterator, something like
impl<A> F<A> {
    pub fn path_iter(self) -> impl Iterator<Item = (TreePath, A)> { ... }
    // rest of impl
}

impl<A> FromIterator<(TreePath, A)> for F<A> {
    // implementation omitted
}

fn mapF<A, B, G>(x: F<A>, f: G) -> F<B>
where
    G: Fn(A) -> B,
{
    x.path_iter().map(|pair| (pair.0, f(pair.1))).collect()
}
Here TreePath is a type specific to your tree. It is probably better for it to represent not the path itself but how to move to the next node.
I originally suggested implementing IntoIterator with Item = (TreePath, A) but on further thought the default iterator should still have Item = A.
I have an Iterator<Item = &(T, U)> over a slice &[(T, U)]. I'd like to unzip this iterator into its components (i.e. obtain (Vec<&T>, Vec<&U>)).
Rust provides unzip functionality through the .unzip() method on Iterator:
points.iter().unzip()
Unfortunately, this doesn't work as-is because .unzip() expects the type of the iterator's item to be a tuple; mine is a reference to a tuple.
To fix this, I tried to write a function which converts between a reference to a tuple and a tuple of references:
fn distribute_ref<'a, T, U>(x: &'a (T, U)) -> (&'a T, &'a U) {
    (&x.0, &x.1)
}
I can then map over the resulting iterator to get something .unzip() can handle:
points.iter().map(distribute_ref).unzip()
This works now, but this feels a bit strange. In particular, distribute_ref seems like a fairly simple operation that would be provided by the Rust standard library. I'm guessing it either is and I can't find it, or I'm not approaching this the right way.
Is there a better way to do this?
Is there a better way to do this?
"Better" is a bit subjective. You can make your function shorter:
fn distribute_ref<T, U>(x: &(T, U)) -> (&T, &U) {
    (&x.0, &x.1)
}
Lifetime elision lets you omit the lifetime annotations in this case.
You can use a closure to do the same thing:
points.iter().map(|&(ref a, ref b)| (a, b)).unzip()
Depending on the task, it can be sufficient to clone the data. This is especially true in this case, as a reference to a u8 takes 4 or 8 times more space than the u8 itself.
points.iter().cloned().unzip()
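For reference, here are both variants written as complete functions. This is a sketch with made-up names (unzip_refs, unzip_cloned); the closure pattern (a, b) relies on default binding modes, which bind a and b as references when matching against a reference to a tuple.

```rust
// Borrowing variant: yields vectors of references into the original slice.
pub fn unzip_refs<T, U>(points: &[(T, U)]) -> (Vec<&T>, Vec<&U>) {
    points.iter().map(|(a, b)| (a, b)).unzip()
}

// Cloning variant: yields owned values, at the cost of cloning each tuple.
pub fn unzip_cloned<T: Clone, U: Clone>(points: &[(T, U)]) -> (Vec<T>, Vec<U>) {
    points.iter().cloned().unzip()
}
```

The borrowing version keeps the original slice borrowed for as long as the result is alive, while the cloning version has no such tie.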
I'm trying to initialize a boxed slice of None values, such that the underlying type T does not need to implement Clone or Copy. Here are a few ideal solutions:
fn by_vec<T>() -> Box<[Option<T>]> {
    vec![None; 5].into_boxed_slice()
}

fn by_arr<T>() -> Box<[Option<T>]> {
    Box::new([None; 5])
}
Unfortunately, the by_vec implementation requires T: Clone and the by_arr implementation requires T: Copy. I've experimented with a few more approaches:
fn by_vec2<T>() -> Box<[Option<T>]> {
    let v = &mut Vec::with_capacity(5);
    for i in 0..v.len() {
        v[i] = None;
    }
    v.into_boxed_slice() // Doesn't work: cannot move out of borrowed content
}

fn by_iter<T>() -> Box<[Option<T>]> {
    (0..5).map(|_| None).collect::<Vec<Option<T>>>().into_boxed_slice()
}
by_vec2 doesn't get past the compiler (I'm not sure I understand why), but by_iter does. I'm concerned about the performance of collect -- will it need to resize the vector it is collecting into as it iterates, or can it allocate the correct sized vector to begin with?
Maybe I'm going about this all wrong -- I'm very new to Rust, so any tips would be appreciated!
Let's start with by_vec2. You are taking a &mut reference to a Vec. You shouldn't do that; work directly with the Vec and make the v binding mutable.
Then you are iterating over the length of a Vec with a capacity of 5 and a length of 0. That means your loop never gets executed. What you wanted was to iterate over 0..v.capacity().
Since your v is still of length 0, accessing v[i] in the loop will panic at runtime. What you actually want is v.push(None). This would normally cause reallocations, but in your case you already allocated with Vec::with_capacity, so pushing 5 times will not allocate.
This time around we did not take a reference to the Vec so into_boxed_slice will actually work.
fn by_vec2<T>() -> Box<[Option<T>]> {
    let mut v = Vec::with_capacity(5);
    for _ in 0..v.capacity() {
        v.push(None);
    }
    v.into_boxed_slice()
}
Your by_iter function actually only allocates once. The Range iterator created by 0..5 knows that it is exactly 5 elements long. So collect will in fact check that length and allocate only once.
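The length information that collect relies on is exposed through size_hint, which can be inspected directly. The sketch below uses made-up names (NoClone, none_slice, none_slice_repeat_with, hint) and shows two equivalent ways to build the slice for a type that is neither Clone nor Copy.

```rust
// Hypothetical element type that implements neither Clone nor Copy.
pub struct NoClone;

// Same approach as by_iter, with the length as a parameter.
pub fn none_slice<T>(n: usize) -> Box<[Option<T>]> {
    (0..n)
        .map(|_| None)
        .collect::<Vec<Option<T>>>()
        .into_boxed_slice()
}

// Alternative spelling: repeat a closure and take n results.
pub fn none_slice_repeat_with<T>(n: usize) -> Box<[Option<T>]> {
    std::iter::repeat_with(|| None)
        .take(n)
        .collect::<Vec<Option<T>>>()
        .into_boxed_slice()
}

// The bounds reported by the mapped range: (lower, Some(upper)).
pub fn hint(n: usize) -> (usize, Option<usize>) {
    (0..n).map(|_| None::<NoClone>).size_hint()
}
```

When the lower and upper bounds agree, as they do for a mapped range, the vector can be allocated at its final size up front.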