Performing a reverse split_off operation in Rust efficiently - rust

My use case involves two mutable vectors a and b and a usize parameter x. I want to make the following change:
take the elements b[0..x] and append them to the end of a (changing capacity of a as required)
transform b into b[x..], without changing the original capacity of b
Currently I do the following:
while a.len() < x && !b.is_empty() {
a.push(b.pop_front().unwrap());
// here `b` is a VecDeque but I am happy to use a Vec if pop_front was not required
}
This seems like a slow operation, since it checks two conditions and calls unwrap on every iteration. It would be great if there was a rev_split_off operation such that:
let mut front = b.rev_split_off(x);
a.append(&mut front);
Here rev_split_off returns a newly allocated vector for the slice b[0..x] and transforms b into the remaining slice with unchanged capacity.
Question: How to perform my use case efficiently, with or without using such a thing as rev_split_off?

Well, I think you will have to implement the rev_split_off yourself (even though I would probably call it split_off_back but it's the same).
Here is how I would implement it:
/// Moves the first `i` elements of `vec` to the end of `buffer`.
fn split_off_back<T>(vec: &mut Vec<T>, i: usize, buffer: &mut Vec<T>) {
// We have to make sure vec has enough elements.
// You could make the function unsafe and ask the caller to ensure
// this condition.
assert!(vec.len() >= i);
// Reserve enough memory in the target buffer
buffer.reserve(i);
// Now we know `buffer.capacity() >= buffer.len() + i`.
unsafe {
// SAFETY:
// * `vec` and `buffer` are two distinct vectors (they come from mutable references)
// so their allocations cannot overlap.
// * `vec` is valid for reads because we have an exclusive reference to it and we
// checked the value of `i`.
// * `buffer` is valid for writes because we ensured we had enough memory to store
// `i` additional elements.
std::ptr::copy_nonoverlapping(vec.as_ptr(), buffer.as_mut_ptr().add(buffer.len()), i);
// The first `i` elements have now been moved out of `vec`;
// we are not allowed to use them again through `vec`.
// We just extended `buffer`, so we need to update its length.
// SAFETY:
// * The new length does not exceed the capacity (ensured by `Vec::reserve`).
// * The vector is initialized up to this new length (we moved the values in).
buffer.set_len(buffer.len() + i);
// Now we need to update the first vector. The values from index `i` to its end
// need to be moved to the beginning of the vector.
// SAFETY:
// * We have an exclusive reference to the vector. It is valid for both reads and writes.
// * Source and destination may overlap, which `ptr::copy` allows.
// * Exactly `vec.len() - i` elements remain after index `i`, so that is the count to copy.
std::ptr::copy(vec.as_ptr().add(i), vec.as_mut_ptr(), vec.len() - i);
// And update the length of `vec`.
// SAFETY: This subtraction cannot underflow because we previously checked that `vec.len() >= i`.
vec.set_len(vec.len() - i);
}
}
Note that I put buffer in the parameters of the function to avoid allocating a vector. If you want the same semantics as split_off, you can just do the following.
fn split_off_back<T>(vec: &mut Vec<T>, i: usize) -> Vec<T> {
assert!(vec.len() >= i);
let mut buffer = Vec::with_capacity(i);
unsafe { /* same thing */ }
buffer
}
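For reference, here is a minimal sketch of what the returning variant could look like once the elided `unsafe` block is spelled out. The name `split_off_front` is made up for this illustration and is not a standard library API. The second copy moves the `vec.len() - i` elements that remain after the split point, and the test data uses `String` only to show that non-`Copy` values are moved rather than duplicated.

```rust
/// Removes the first `i` elements of `vec` and returns them in a new vector.
/// The capacity of `vec` is left untouched.
/// (`split_off_front` is a hypothetical name, not a std API.)
fn split_off_front<T>(vec: &mut Vec<T>, i: usize) -> Vec<T> {
    assert!(i <= vec.len(), "split index out of bounds");
    let remaining = vec.len() - i;
    let mut front = Vec::with_capacity(i);
    unsafe {
        let src = vec.as_mut_ptr();
        // SAFETY: `front` is a separate allocation with room for `i` elements,
        // and the first `i` elements of `vec` are initialized.
        std::ptr::copy_nonoverlapping(src, front.as_mut_ptr(), i);
        front.set_len(i);
        // SAFETY: the `remaining` elements after index `i` are initialized;
        // the two ranges may overlap, which `ptr::copy` permits.
        std::ptr::copy(src.add(i), src, remaining);
        vec.set_len(remaining);
    }
    front
}
```

Calling `a.append(&mut split_off_front(&mut b, x))` then gives the behaviour asked for in the question, at the cost of one temporary allocation.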

This is very simple using drain and extend.
a.extend(b.drain(..x));
If your values are Copy, then you can get optimal speed using as_slice. IIUC, using extend_from_slice should be optional due to specialization, but aids clarity.
a.extend_from_slice(b.drain(..x).as_slice());
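As a small sketch of how this could be packaged, the helper below wraps the `drain` and `extend` call. The name `move_front` and the clamping of `x` are assumptions added for illustration; the clamping mirrors the `!b.is_empty()` guard in the question's loop. `Vec::drain` keeps the allocation of `b`, so its capacity is unchanged.

```rust
/// Moves the first `x` elements of `b` to the end of `a` using only safe code.
/// (`move_front` is a hypothetical helper name.)
fn move_front<T>(a: &mut Vec<T>, b: &mut Vec<T>, x: usize) {
    // Clamp so that a too-large `x` moves everything instead of panicking.
    let x = x.min(b.len());
    // `drain` removes the range from `b` and shifts the tail down when the
    // iterator is dropped; `extend` grows `a` as required.
    a.extend(b.drain(..x));
}
```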

I have added some benchmarks to show that the unsafe version of this code is significantly faster.
#![feature(test)]
extern crate test;
#[cfg(test)]
mod tests {
extern crate test;
use test::{black_box, Bencher};
/// Moves the first `i` elements of `vec` to the end of `buffer`.
fn split_off_back<T>(vec: &mut Vec<T>, i: usize, buffer: &mut Vec<T>) {
assert!(vec.len() >= i);
buffer.reserve(i);
unsafe {
std::ptr::copy_nonoverlapping(vec.as_ptr(), buffer.as_mut_ptr().add(buffer.len()), i);
buffer.set_len(buffer.len() + i);
// Shift the remaining `vec.len() - i` elements to the front.
std::ptr::copy(vec.as_ptr().add(i), vec.as_mut_ptr(), vec.len() - i);
vec.set_len(vec.len() - i);
}
}
fn split_off_back_two<T>(vec: &mut Vec<T>, i: usize, buffer: &mut Vec<T>) {
buffer.extend(vec.drain(..i));
}
const VEC_SIZE: usize = 100000;
const SPLIT_POINT: usize = 20000;
#[bench]
fn run_v1(b: &mut Bencher) {
b.iter(|| {
let mut a = black_box(vec![0; VEC_SIZE]);
let mut b = black_box(vec![0; VEC_SIZE]);
split_off_back(&mut a, SPLIT_POINT, &mut b);
});
}
#[bench]
fn run_v2(b: &mut Bencher) {
b.iter(|| {
let mut a = black_box(vec![0; VEC_SIZE]);
let mut b = black_box(vec![0; VEC_SIZE]);
split_off_back_two(&mut a, SPLIT_POINT, &mut b);
});
}
}
This is the output of cargo bench on my machine:
running 2 tests
test tests::run_v1 ... bench: 98,863 ns/iter (+/- 2,058)
test tests::run_v2 ... bench: 230,665 ns/iter (+/- 6,093)
test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out; finished in 0.48s

Related

Is it sound to use raw pointers to an allocated vector to allow multiple threads to write to nonoverlapping parts?

I have multiple threads doing a computation and want to collect the results into a pre-allocated vector. To turn off the borrow checker, I wrote the function:
fn set_unsync(vec: &Vec<usize>, idx: usize, val: usize) {
let first_elem = vec.as_ptr() as *mut usize;
unsafe { *first_elem.add(idx) = val }
}
With that we can fill a vector concurrently (e.g. using Rayon):
let vec = vec![0; 10];
(0..10).into_par_iter().for_each(|i| set_unsync(&vec, i, i));
It compiles, it works, and even Clippy likes it, but is it sound? After reading about things that appear to be sound but actually are Undefined Behavior, I'm unsure. For example, the documentation of the as_ptr method says:
The caller must also ensure that the memory the pointer
(non-transitively) points to is never written to (except inside an
UnsafeCell) using this pointer or any pointer derived from it.
Strictly speaking, the solution violates this. However, it feels sound to me. If it is not, how can we let multiple threads write to nonoverlapping parts of the same vector without using locks?
Assuming this is your minimal reproducible example:
use rayon::prelude::*;
fn set_unsync(vec: &Vec<usize>, idx: usize, val: usize) {
let first_elem = vec.as_ptr() as *mut usize;
unsafe { *first_elem.add(idx) = val }
}
fn main() {
let input = vec![2, 3, 9];
let output = vec![0; 100];
input.par_iter().for_each(|&i| {
for j in i * 10..(i + 1) * 10 {
set_unsync(&output, j, i);
}
});
dbg!(output);
}
If you are asking whether this code works and always will work, then I'd answer yes.
BUT: it violates many rules on how safe and unsafe code should interact with each other.
If you write a function that is not marked unsafe, you indicate that this method can be abused by users in any way possible without causing undefined behaviour (note that "users" here is not just other people, this also means your own code in safe sections). If you cannot guarantee this, you should mark it unsafe, requiring the caller of the function to mark the invocation as unsafe as well, because the caller then again has to make sure he is using your function correctly. And every point in your code that required a programmer to manually prove that it is free of undefined behaviour must require an unsafe as well. If it's possible to have sections that require a human to prove this, but do not require an unsafe, there is something unsound in your code.
In your case, the set_unsync function is not marked unsafe, but the following code causes undefined behaviour:
fn set_unsync(vec: &Vec<usize>, idx: usize, val: usize) {
let first_elem = vec.as_ptr() as *mut usize;
unsafe { *first_elem.add(idx) = val }
}
fn main() {
let output = vec![0; 5];
set_unsync(&output, 100000000000000, 42);
dbg!(output);
}
Note that at no point in your main did you need an unsafe, and yet a segfault is happening here.
Now if you say "but set_unsync is not pub, so no one else can call it. And I, in my par_iter, have ensured that I am using it correctly" - then this is the best indicator that you should mark set_unsync as unsafe. The act of "having to ensure to use the function correctly" is more or less the definition of an unsafe function. unsafe doesn't mean it will break horribly, it just means that the caller has to manually make sure that he is using it correctly, because the compiler can't. It's unsafe from the compiler's point of view.
Here is an example of how your code could be rewritten in a more sound way.
I don't claim that it is 100% sound, because I haven't thought about it enough.
But I hope this demonstrates how to cleanly interface between safe and unsafe code:
use rayon::prelude::*;
// mark as unsafe, as it's possible to provide parameters that
// cause undefined behaviour
unsafe fn set_unsync(vec: &[usize], idx: usize, val: usize) {
let first_elem = vec.as_ptr() as *mut usize;
// No need to use unsafe{} here, as the entire function is already unsafe
*first_elem.add(idx) = val
}
// Does not need to be marked `unsafe` as no combination of parameters
// could cause undefined behaviour.
// Also note that output is marked `&mut`, which is also crucial.
// Mutating data behind a non-mutable reference is also considered undefined
// behaviour.
fn do_something(input: &[usize], output: &mut [usize]) {
input.par_iter().for_each(|&i| {
// This assert is crucial for soundness, otherwise an incorrect value
// in `input` could cause an out-of-bounds access on `output`
assert!((i + 1) * 10 <= output.len());
for j in i * 10..(i + 1) * 10 {
unsafe {
// This is the critical point where we interface
// from safe to unsafe code.
// This call requires the programmer to manually verify that
// `set_unsync` never gets called with dangerous parameters.
set_unsync(&output, j, i);
}
}
});
}
fn main() {
// note that we now have to declare output `mut`, as it should be
let input = vec![2, 3, 9];
let mut output = vec![0; 100];
do_something(&input, &mut output);
dbg!(output);
}
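For the underlying goal of letting several threads write to non-overlapping parts of one vector without locks, a fully safe route is to split the slice into disjoint `&mut` sub-slices first. The sketch below uses only the standard library (`chunks_mut` plus `std::thread::scope`, which needs Rust 1.63 or later) instead of Rayon; the function name `fill_blocks` and the one-thread-per-block layout are assumptions made for illustration, not part of the original code.

```rust
use std::thread;

/// Fills `output` in blocks of `chunk_len` elements, one scoped thread per
/// block. Every element of a block is set to that block's index.
/// (`fill_blocks` is a hypothetical name used for this sketch.)
fn fill_blocks(output: &mut [usize], chunk_len: usize) {
    assert!(chunk_len > 0, "chunk_len must be non-zero");
    thread::scope(|s| {
        // `chunks_mut` hands out non-overlapping `&mut` sub-slices, so the
        // borrow checker can verify that no two threads share an element.
        for (block, chunk) in output.chunks_mut(chunk_len).enumerate() {
            s.spawn(move || {
                for slot in chunk.iter_mut() {
                    *slot = block;
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

With Rayon, `par_chunks_mut` plays the same role as `chunks_mut` here, so no raw pointers or `unsafe` blocks are needed when the write pattern can be expressed as disjoint chunks.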

How to allocate buffer for C library call

The question is not new and there are two approaches, as far as I can tell:
Use Vec<T>, as suggested here
Manage the heap memory yourself, using std::alloc::alloc, as shown here
My question is whether these are indeed the two (good) alternatives.
Just to make this perfectly clear: Both approaches work. The question is whether there is another, maybe preferred way. The example below is introduced to identify where use of Vec is not good, and where other approaches therefore may be better.
Let's state the problem: Suppose there's a C library that requires some buffer to write into. This could be a compression library, for example. It is easiest to have Rust allocate the heap memory and manage it instead of allocating in C/C++ with malloc/new and then somehow passing ownership to Rust.
Let's go with the compression example. If the library allows incremental (streaming) compression, then I would need a buffer that keeps track of some offset.
Following approach 1 (that is: "abuse" Vec<T>) I would wrap Vec and use len and capacity for my purposes:
/// `Buffer` is basically a Vec
pub struct Buffer<T>(Vec<T>);
impl<T> Buffer<T> {
/// Create new buffer of length `len`
pub fn new(len: usize) -> Self {
Buffer(Vec::with_capacity(len))
}
/// Return length of `Buffer`
pub fn len(&self) -> usize {
return self.0.len()
}
/// Return total allocated size of `Buffer`
pub fn capacity(&self) -> usize {
return self.0.capacity()
}
/// Return remaining length of `Buffer`
pub fn remaining(&self) -> usize {
return self.0.capacity() - self.len()
}
/// Increment the offset
pub fn increment(&mut self, by:usize) {
unsafe { self.0.set_len(self.0.len()+by); }
}
/// Returns an unsafe mutable pointer to the buffer
pub fn as_mut_ptr(&mut self) -> *mut T {
unsafe { self.0.as_mut_ptr().add(self.0.len()) }
}
/// Returns ref to `Vec<T>` inside `Buffer`
pub fn as_vec(&self) -> &Vec<T> {
&self.0
}
}
The only interesting functions are increment and as_mut_ptr.
Buffer would be used like this
fn main() {
// allocate buffer for compressed data
let mut buf: Buffer<u8> = Buffer::new(1024);
loop {
// perform C function call
let compressed_len: usize = compress(some_input, buf.as_mut_ptr(), buf.remaining());
// increment
buf.increment(compressed_len);
}
// get Vec inside buf
let compressed_data = buf.as_vec();
}
Buffer<T> as shown here is clearly dangerous, for example if any reference type is used. Even T=bool may result in undefined behaviour. But the problems with uninitialised instance of T can be avoided by introducing a trait that limits the possible types T.
Also, if alignment matters, then Buffer<T> is not a good idea.
But otherwise, is such a Buffer<T> really the best way to do this?
There doesn't seem to be an out-of-the box solution. The bytes crate comes close, it offers a "container for storing and operating on contiguous slices of memory", but the interface is not flexible enough.
You absolutely can use a Vec's spare capacity as a buffer to write into manually. That is why .set_len() is available. However, compress() must know that the given pointer is pointing to uninitialized memory and thus is not allowed to read from it (unless written to first), and you must guarantee that the returned length is the number of bytes initialized. I think these rules are roughly the same between Rust and C or C++ in this regard.
Writing this in Rust would look like this:
pub struct Buffer<T>(Vec<T>);
impl<T> Buffer<T> {
pub fn new(len: usize) -> Self {
Buffer(Vec::with_capacity(len))
}
/// SAFETY: `by` must be less than or equal to `space_len()` and the bytes at
/// `space_ptr_mut()` to `space_ptr_mut() + by` must be initialized
pub unsafe fn increment(&mut self, by: usize) {
self.0.set_len(self.0.len() + by);
}
pub fn space_len(&self) -> usize {
self.0.capacity() - self.0.len()
}
pub fn space_ptr_mut(&mut self) -> *mut T {
unsafe { self.0.as_mut_ptr().add(self.0.len()) }
}
pub fn as_vec(&self) -> &Vec<T> {
&self.0
}
}
unsafe fn compress(_input: i32, ptr: *mut u8, len: usize) -> usize {
// right now just writes 5 bytes if there's space for them
let written = usize::min(5, len);
for i in 0..written {
ptr.add(i).write(0);
}
written
}
fn main() {
let mut buf: Buffer<u8> = Buffer::new(1024);
let some_input = 5i32;
unsafe {
let compressed_len: usize = compress(some_input, buf.space_ptr_mut(), buf.space_len());
buf.increment(compressed_len);
}
let compressed_data = buf.as_vec();
println!("{:?}", compressed_data);
}
You can see it on the playground. If you run it through Miri, you'll see it picks up no undefined behavior, but if you over-advertise how much you've written (say return written + 10) then it does produce an error that reading uninitialized memory was detected.
One of the reasons there isn't an out-of-the-box type for this is because Vec is that type:
fn main() {
let mut buf: Vec<u8> = Vec::with_capacity(1024);
let some_input = 5i32;
let spare_capacity = buf.spare_capacity_mut();
unsafe {
let compressed_len: usize = compress(
some_input,
spare_capacity.as_mut_ptr().cast(),
spare_capacity.len(),
);
buf.set_len(buf.len() + compressed_len);
}
println!("{:?}", buf);
}
Your Buffer type doesn't really add any convenience or safety and a third-party crate can't do so because it relies on the correctness of compress().
Is such a Buffer really the best way to do this?
Yes, this is pretty much the lowest-cost way to provide a buffer for writing. Looking at the generated release assembly, it is just one call to allocate and that's it. You can get tricky by using a special allocator, or simply pre-allocate and reuse allocations if you're doing this many times (but be sure to measure, since the built-in allocator will do this anyway, just more generally).
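To illustrate the streaming case and the reuse of one allocation, here is a minimal sketch built on `Vec::spare_capacity_mut`. The function `fake_compress` is a hypothetical pure-Rust stand-in for the C routine (it simply copies bytes), and `compress_all` is likewise an invented name; a real FFI call would receive `as_mut_ptr()` and `len()` of the spare slice instead.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the C routine: writes as many bytes of `input`
/// as fit into `out` and returns how many bytes were written.
fn fake_compress(input: &[u8], out: &mut [MaybeUninit<u8>]) -> usize {
    let n = input.len().min(out.len());
    for (slot, &byte) in out[..n].iter_mut().zip(&input[..n]) {
        slot.write(byte);
    }
    n
}

/// Feeds all of `input` through `fake_compress`, appending to `buf`.
/// (`compress_all` is a hypothetical name used for this sketch.)
fn compress_all(mut input: &[u8], buf: &mut Vec<u8>) {
    while !input.is_empty() {
        // Grow only when the spare capacity is exhausted.
        if buf.len() == buf.capacity() {
            buf.reserve(input.len());
        }
        let written = fake_compress(input, buf.spare_capacity_mut());
        // SAFETY: `fake_compress` initialized exactly `written` bytes at the
        // start of the spare capacity, and `written` never exceeds its length.
        unsafe { buf.set_len(buf.len() + written) };
        input = &input[written..];
    }
}
```

Calling `buf.clear()` between runs resets the length but keeps the capacity, so the same allocation can be reused for the next stream.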

How to accept str.chars() or str.bytes() in a function and iterate twice?

Is there any way to pass somestring.chars() or somestring.bytes() to a function and allow that function to reconstruct the iterator?
An example is below. The goal is for the function to be able to iterate through coll multiple times, reconstructing it as needed using into_iter(). It works correctly for vectors and arrays, but I have not been able to get it working for the string iterator methods.
use std::fmt::Display;
// Lifetime needed to indicate the iterator objects
// don't disappear
fn test_single<'a, I, T>(collection: &'a I)
where
&'a I: IntoIterator<Item = T>,
T: Display,
{
let count = collection.into_iter().count();
println!("Len: {}", count);
for x in collection.into_iter() {
println!("Item: {}", x);
}
}
fn main() {
// Works
test_single(&[1, 2, 3, 4]);
test_single(&vec!['a', 'b', 'c', 'd']);
let s = "abcd";
// Desired usage; does not work
// test_single(&s.chars());
// test_single(&s.bytes());
}
The general error is that Iterator is not implemented for &Chars<'_>. This doesn't make sense to me, because Chars definitely implements both IntoIterator and Iterator.
Is there a solution that allows for the desired usage of test_single(&s.chars())?
Link to the playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=230ee86cd109384a1c62c362aed9d47f
(IntoIterator is preferred over Iterator for my application, since I also need to specify that IntoIterator::IntoIter is a DoubleEndedIterator.)
This can work but not the way you have it written.
You can't iterate a shared reference because Iterator::next() takes &mut self. IntoIterator::into_iter() could be made to work with e.g. &Chars, but that's not necessary because Chars and Bytes both implement Clone, which creates a copy of the iterator (but doesn't duplicate the underlying data).
So you just need to adjust your bounds and accept the iterator by value, cloning it when you will need another iterator later:
fn test_single<I, T>(collection: I)
where
I: Clone + IntoIterator<Item = T>,
T: Display,
{
let count = collection.clone().into_iter().count();
println!("Len: {}", count);
for x in collection.into_iter() {
println!("Item {}", x);
}
}
Now you can call test_single(s.chars()), for example.
Side note: You can express the type I purely with impl, which might be more readable:
fn test_single(collection: impl Clone + IntoIterator<Item=impl Display>) {
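Since the question also mentions needing `IntoIterator::IntoIter` to be a `DoubleEndedIterator`, the sketch below shows how that extra bound could be combined with the `Clone` approach. The function name `count_then_reverse` and its return type are assumptions chosen so the result is easy to check; they are not from the original code.

```rust
use std::fmt::Display;

/// Counts the items, then renders them in reverse order, comma-separated.
/// (`count_then_reverse` is a hypothetical name used for this sketch.)
fn count_then_reverse<I>(collection: I) -> (usize, String)
where
    I: Clone + IntoIterator,
    I::Item: Display,
    I::IntoIter: DoubleEndedIterator,
{
    // First pass: consume a clone so the original is still available.
    let count = collection.clone().into_iter().count();
    // Second pass: consume the original, walking it from the back.
    let reversed = collection
        .into_iter()
        .rev()
        .map(|item| item.to_string())
        .collect::<Vec<_>>()
        .join(",");
    (count, reversed)
}
```

This accepts `s.chars()` and `s.bytes()` by value, and also `&[1, 2, 3]` or `&vec`, because shared references are `Clone` and implement `IntoIterator`.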

Returning iterator from weak references for mapping and modifying values

I'm trying quite complex stuff with Rust where I need the following attributes, and am fighting the compiler.
Object which itself lives from start to finish of application, however, where internal maps/vectors could be modified during application lifetime
Multiple references to object that can read internal maps/vectors of an object
All single threaded
Multiple nested iterators which are map/modified in lazy manner to perform fast and complex calculations (see example below)
A small example, which already causes problems:
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::Weak;
pub struct Holder {
array_ref: Weak<RefCell<Vec<isize>>>,
}
impl Holder {
pub fn new(array_ref: Weak<RefCell<Vec<isize>>>) -> Self {
Self { array_ref }
}
fn get_iterator(&self) -> impl Iterator<Item = f64> + '_ {
self.array_ref
.upgrade()
.unwrap()
.borrow()
.iter()
.map(|value| *value as f64 * 2.0)
}
}
get_iterator is just one of the implementations of a trait, but even this example already does not work.
The reason for Weak/Rc is to make sure that multiple places can point to the object (from point (1)) while another place can modify its internals (Vec<isize>).
What is the best way to approach this situation, given that end goal is performance critical?
EDIT:
Someone suggested using https://doc.rust-lang.org/std/cell/struct.Ref.html#method.map
Unfortunately I still can't get it to work. I'm not sure whether I should also change the return type, or whether the closure is wrong here:
fn get_iterator(&self) -> impl Iterator<Item=f64> + '_ {
let x = self.array_ref.upgrade().unwrap().borrow();
let map1 = Ref::map(x, |x| &x.iter());
let map2 = Ref::map(map1, |iter| &iter.map(|y| *y as f64 * 2.0));
map2
}
IDEA says it has the wrong return type:
the trait `Iterator` is not implemented for `Ref<'_, Map<std::slice::Iter<'_, isize>, [closure#src/bin/main.rs:30:46: 30:65]>>`
This won't work because self.array_ref.upgrade() creates a local temporary Arc value, but the Ref only borrows from it. Obviously, you can't return a value that borrows from a local.
To make this work you need a second structure to own the Arc, which can implement Iterator in this case since the produced items aren't references:
pub struct HolderIterator(Arc<RefCell<Vec<isize>>>, usize);
impl Iterator for HolderIterator {
type Item = f64;
fn next(&mut self) -> Option<f64> {
let r = self.0.borrow().get(self.1)
.map(|&y| y as f64 * 2.0);
if r.is_some() {
self.1 += 1;
}
r
}
}
// ...
impl Holder {
// ...
fn get_iterator<'a>(&'a self) -> Option<impl Iterator<Item=f64>> {
self.array_ref.upgrade().map(|rc| HolderIterator(rc, 0))
}
}
Alternatively, if you want the iterator to also weakly-reference the value contained within, you can have it hold a Weak instead and upgrade on each next() call. There are performance implications, but this also makes it easier to have get_iterator() be able to return an iterator directly instead of an Option, and the iterator written so that a failed upgrade means the sequence has ended:
pub struct HolderIterator(Weak<RefCell<Vec<isize>>>, usize);
impl Iterator for HolderIterator {
type Item = f64;
fn next(&mut self) -> Option<f64> {
let r = self.0.upgrade()?
.borrow()
.get(self.1)
.map(|&y| y as f64 * 2.0);
if r.is_some() {
self.1 += 1;
}
r
}
}
// ...
impl Holder {
// ...
fn get_iterator<'a>(&'a self) -> impl Iterator<Item=f64> {
HolderIterator(Weak::clone(&self.array_ref), 0)
}
}
This will make it so that you always get an iterator, but it's empty if the Weak is dead. The Weak can also die during iteration, at which point the sequence will abruptly end.
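Put together, the weak-reference variant might look like the following self-contained sketch. It assumes the single-threaded `std::rc::Rc` and `std::rc::Weak` (the question states everything is single threaded) rather than the `Arc` used above, and it uses named fields instead of a tuple struct purely for readability.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Holder {
    array_ref: Weak<RefCell<Vec<isize>>>,
}

/// Iterator that re-checks the weak reference on every step.
pub struct HolderIterator {
    source: Weak<RefCell<Vec<isize>>>,
    index: usize,
}

impl Iterator for HolderIterator {
    type Item = f64;

    fn next(&mut self) -> Option<f64> {
        // A dead `Weak` ends the sequence.
        let rc = self.source.upgrade()?;
        // The `Ref` guard is dropped at the end of this statement, so the
        // vector is only borrowed while a single item is read.
        let item = rc.borrow().get(self.index).map(|&y| y as f64 * 2.0);
        if item.is_some() {
            self.index += 1;
        }
        item
    }
}

impl Holder {
    pub fn new(array_ref: Weak<RefCell<Vec<isize>>>) -> Self {
        Self { array_ref }
    }

    pub fn get_iterator(&self) -> HolderIterator {
        HolderIterator {
            source: Weak::clone(&self.array_ref),
            index: 0,
        }
    }
}
```

Because the borrow is released between calls to `next()`, the owner can push to the vector in the middle of an iteration, and the iterator will pick up the new elements.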

Mutating and non-mutating method chains

I have two functions, which I'm hoping to use in method chains. They both do basically the same thing, except that one of them overwrites itself and another returns a clone. I'm coming from Ruby, and I'm used to just calling self.dup.mutable_method in the destructive method.
I believe I have a solution worked out in Rust, but I'm not sure whether it has an extra allocation going on in there somewhere, and I'm afraid that it'll consume itself. This is audio DSP code, so I want to make sure that there are no allocations in the mutable method. (I'm three days into Rust, so mea culpa for the non-generalized trait impls.)
impl Filter for DVec<f64> {
fn preemphasis_mut<'a>(&'a mut self, freq: f64, sample_rate: f64) -> &'a mut DVec<f64> {
let filter = (-2.0 * PI * freq / sample_rate).exp();
for i in (1..self.len()).rev() {
self[i] -= self[i-1] * filter;
};
self
}
fn preemphasis(&self, freq: f64, sample_rate: f64) -> DVec<f64> {
let mut new = self.clone();
new.preemphasis_mut(freq, sample_rate);
new
}
}
// Ideal code:
let mut sample: DVec<f64> = method_that_loads_sample();
let copy_of_sample = sample.preemphasis(75.0, 44100.0); // this mutates and copies, with one allocation
sample.preemphasis_mut(75.0, 44100.0); // this mutates in-place, with no allocations
copy_of_sample.preemphasis_mut(75.0, 44100.0)
.preemphasis_mut(150.0, 44100.0); // this mutates twice in a row, with no allocations
I have not seen any libraries follow a pattern similar to Ruby's foo and foo! method pairs when it comes to self mutation. I believe this is mostly because Rust places mutability front and center, so it's much more difficult to "accidentally" mutate something. To that end, I would probably drop one of your methods and allow the user to decide when something should be mutated:
use std::f64::consts::PI;
trait Filter {
fn preemphasis<'a>(&'a mut self, freq: f64, sample_rate: f64) -> &'a mut Self;
}
impl Filter for Vec<f64> {
fn preemphasis<'a>(&'a mut self, freq: f64, sample_rate: f64) -> &'a mut Self {
let filter = (-2.0 * PI * freq / sample_rate).exp();
for i in (1..self.len()).rev() {
self[i] -= self[i-1] * filter;
};
self
}
}
fn main() {
let mut sample = vec![1.0, 2.0];
// this copies then mutates, with one allocation
let mut copy_of_sample = sample.clone();
copy_of_sample
.preemphasis(75.0, 44100.0);
// this mutates in-place, with no allocations
sample
.preemphasis(75.0, 44100.0);
// this mutates twice in a row, with no allocations
copy_of_sample
.preemphasis(75.0, 44100.0)
.preemphasis(150.0, 44100.0);
}
I think a key thing here is that the caller of the code can easily see when something will be mutated (because of the &mut reference to self). The caller also gets to determine when and where the clone happens.

Resources