How can concatenated &[u8] slices implement the Read trait without additional copying?

How can concatenated &[u8] slices implement the Read trait without additional copying? - rust

The Read trait is implemented for &[u8]. How can I get a Read trait over several concatenated u8 slices without actually doing any concatenation first?
If I concatenate first, there will be two copies -- multiple arrays into a single array followed by copying from single array to destination via the Read trait. I would like to avoid the first copying.
I want a Read trait over &[&[u8]] that treats multiple slices as a single continuous slice.
fn foo<R: std::io::Read + Send>(data: R) {
// ...
}
let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];
foo(c); // <- this won't compile because `c` is not a slice of bytes.

You could use the multi_reader crate, which can concatenate any number of values that implement Read:
let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];
foo(multi_reader::MultiReader::new(c.iter().copied()));
If you don't want to depend on an external crate, you can wrap the slices in a struct of your own and implement Read for it:
struct MultiRead<'a> {
sources: &'a [&'a [u8]],
pos_in_current: usize,
}
impl<'a> MultiRead<'a> {
fn new(sources: &'a [&'a [u8]]) -> MultiRead<'a> {
MultiRead {
sources,
pos_in_current: 0,
}
}
}
impl Read for MultiRead<'_> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let current = loop {
if self.sources.is_empty() {
return Ok(0); // EOF
}
let current = self.sources[0];
if self.pos_in_current < current.len() {
break current;
}
self.pos_in_current = 0;
self.sources = &self.sources[1..];
};
let read_size = buf.len().min(current.len() - self.pos_in_current);
buf[..read_size].copy_from_slice(&current[self.pos_in_current..][..read_size]);
self.pos_in_current += read_size;
Ok(read_size)
}
}
Playground

Create a wrapper type around the slices and implement Read for it. Compared to user4815162342's answer, I delegate down to the implementation of Read for slices:
use std::{io::Read, mem};
struct Wrapper<'a, 'b>(&'a mut [&'b [u8]]);
impl<'a, 'b> Read for Wrapper<'a, 'b> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let slices = mem::take(&mut self.0);
match slices {
[head, ..] => {
let n_bytes = head.read(buf)?;
if head.is_empty() {
// Advance the child slice
self.0 = &mut slices[1..];
} else {
// More to read, put back all the child slices
self.0 = slices;
}
Ok(n_bytes)
}
_ => Ok(0),
}
}
}
fn main() {
let parts: &mut [&[u8]] = &mut [b"hello ", b"world"];
let mut w = Wrapper(parts);
let mut buf = Vec::new();
w.read_to_end(&mut buf).unwrap();
assert_eq!(b"hello world", &*buf);
}
A more efficient implementation would implement further methods from Read, such as read_to_end or read_vectored.
See also:
How do I implement a trait I don't own for a type I don't own?

Related

How to traverse and consume a vector in given order? [duplicate]

For example, I have a Vec<String> and an array storing indexes.
let src = vec!["a".to_string(), "b".to_string(), "c".to_string()];
let idx_arr = [2_usize, 0, 1];
The indexes stored in idx_arr comes from the range 0..src.len(), without repetition or omission.
I want to move the elements in src to another container in the given order, until the vector is completely consumed. For example,
let iter = into_iter_in_order(src, &idx_arr);
for s in iter {
// s: String
}
// or
consume_vec_in_order(src, &idx_arr, |s| {
// s: String
});
If the type of src can be changed to Vec<Option<String>>, things will be much easier, just use src[i].take(). However, it cannot.
Edit:
"Another container" refers to any container, such as a queue or hash set. Reordering in place is not the answer to the problem. It introduces the extra time cost of O(n). The ideal method should be 0-cost.

Not sure if my algorithm satisfies your requirements but here I have an algorithm that can consume the provided vector in-order without initializing a new temporary vector, which is more efficient for a memory.
fn main() {
let src = &mut vec!["a".to_string(), "b".to_string(), "c".to_string(), "d".to_string()];
let idx_arr = [2_usize, 3, 1, 0];
consume_vector_in_order(src, idx_arr.to_vec());
println!("{:?}", src); // d , c , a , b
}
// In-place consume vector in order
fn consume_vector_in_order<T>(v: &mut Vec<T>, inds: Vec<usize>) -> &mut Vec<T>
where
T: Default,
{
let mut i: usize = 0;
let mut temp_inds = inds.to_vec();
while i < inds.to_vec().len() {
let s_index = temp_inds[i];
if s_index != i {
let new_index = temp_inds[s_index];
temp_inds.swap(s_index, new_index);
v.swap(s_index, new_index);
} else {
i += 1;
}
}
v
}

You can use the technique found in How to sort a Vec by indices? (using my answer in particular) since that can reorder the data in-place from the indices, and then its just simple iteration:
fn consume_vec_in_order<T>(mut vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
sort_by_indices(&mut vec, order.to_owned());
for elem in vec {
cb(elem);
}
}
Full example available on the playground.

Edit:
An ideal method, but needs to access unstable features and functions not exposed by the standard library.
use std::alloc::{Allocator, RawVec};
use std::marker::PhantomData;
use std::mem::{self, ManuallyDrop};
use std::ptr::{self, NonNull};
#[inline]
unsafe fn into_iter_in_order<'a, T, A: Allocator>(
vec: Vec<T, A>,
order: &'a [usize],
) -> IntoIter<'a, T, A> {
unsafe {
let mut vec = ManuallyDrop::new(vec);
let cap = vec.capacity();
let alloc = ManuallyDrop::new(ptr::read(vec.allocator()));
let ptr = order.as_ptr();
let end = ptr.add(order.len());
IntoIter {
buf: NonNull::new_unchecked(vec.as_mut_ptr()),
_marker_1: PhantomData,
cap,
alloc,
ptr,
end,
_marker_2: PhantomData,
}
}
}
struct IntoIter<'a, T, A: Allocator> {
buf: NonNull<T>,
_marker_1: PhantomData<T>,
cap: usize,
alloc: ManuallyDrop<A>,
ptr: *const usize,
end: *const usize,
_marker_2: PhantomData<&'a usize>,
}
impl<T, A: Allocator> Iterator for IntoIter<T, A> {
type Item = T;
#[inline]
fn next(&mut self) -> Option<T> {
if self.ptr == self.end {
None
} else {
let idx = unsafe { *self.ptr };
self.ptr = unsafe { self.ptr.add(1) };
if T::IS_ZST {
Some(unsafe { mem::zeroed() })
} else {
Some(unsafe { ptr::read(self.buf.as_ptr().add(idx)) })
}
}
}
}
impl<#[may_dangle] T, A: Allocator> Drop for IntoIter<T, A> {
fn drop(&mut self) {
struct DropGuard<'a, T, A: Allocator>(&'a mut IntoIter<T, A>);
impl<T, A: Allocator> Drop for DropGuard<'_, T, A> {
fn drop(&mut self) {
unsafe {
// `IntoIter::alloc` is not used anymore after this and will be dropped by RawVec
let alloc = ManuallyDrop::take(&mut self.0.alloc);
// RawVec handles deallocation
let _ = RawVec::from_raw_parts_in(self.0.buf.as_ptr(), self.0.cap, alloc);
}
}
}
let guard = DropGuard(self);
// destroy the remaining elements
unsafe {
while self.ptr != self.end {
let idx = *self.ptr;
self.ptr = self.ptr.add(1);
let p = if T::IS_ZST {
self.buf.as_ptr().wrapping_byte_add(idx)
} else {
self.buf.as_ptr().add(idx)
};
ptr::drop_in_place(p);
}
}
// now `guard` will be dropped and do the rest
}
}
Example:
let src = vec![
"0".to_string(),
"1".to_string(),
"2".to_string(),
"3".to_string(),
"4".to_string(),
];
let mut dst = vec![];
let iter = unsafe { into_iter_in_order(src, &[2, 1, 3, 0, 4]) };
for s in iter {
dst.push(s);
}
assert_eq!(dst, vec!["2", "1", "3", "0", "4"]);
My previous answer:
use std::mem;
use std::ptr;
pub unsafe fn consume_vec_in_order<T>(vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
// Check whether `order` contains all numbers in 0..len without repetition
// or omission.
if cfg!(debug_assertions) {
use std::collections::HashSet;
let n = order.len();
if n != vec.len() {
panic!("The length of `order` is not equal to that of `vec`.");
}
let mut set = HashSet::<usize>::new();
for &idx in order {
if idx >= n {
panic!("`{idx}` in the `order` is out of range (0..{n}).");
} else if set.contains(&idx) {
panic!("`order` contains the repeated element `{idx}`");
} else {
set.insert(idx);
}
}
}
unsafe {
for &idx in order {
let s = ptr::read(vec.get_unchecked(idx));
cb(s);
}
vec.set_len(0);
}
}
Example:
let src = vec![
"0".to_string(),
"1".to_string(),
"2".to_string(),
"3".to_string(),
"4".to_string(),
];
let mut dst = vec![];
consume_vec_in_order(
src,
&[2, 1, 3, 0, 4],
|elem| dst.push(elem),
);
assert_eq!(dst, vec!["2", "1", "3", "0", "4"]);

Rust same mutable reference in multiple vectors

I am trying to store an object into 2 different reference vectors, and if I modify the object from first vector ref, it should be visible from the second vector.
I still haven't understood well borrowing so that's what I'm trying to acheive :
use std::fmt::Display;
use std::fmt::Formatter;
use std::fmt::Result;
use std::vec::Vec;
struct A {
x: u32,
}
impl Display for A {
fn fmt(&self, f: &mut Formatter) -> Result {
write!(f, "{}", self.x)
}
}
fn main() {
let mut a = A{x: 1};
let mut v1: Vec<&mut A> = Vec::new();
let mut v2: Vec<&mut A> = Vec::new();
v1.push(&mut a);
v2.push(&mut a);
let f: &mut A = v1[0];
f.x = 3;
for v in &v1 {
println!("Value: {}", v);
}
for v in &v2 {
println!("Value: {}", v);
}
}
Of course this doesn't compile. What should I do to be able to store same object in different collection like objects ? (I don't see any concurency issue or borrowing here).

Mutable borrow into two parts with cleanup

I have some object that I want to split into two parts via a mutable borrow, then combine those back together into the original object when the split references go out of scope.
The simplified example below is for a Count struct that holds a single i32, which we want to split into two &mut i32s, who are both incorporated back into the original Count when the two mutable references go out of scope.
The approach I am taking below is to use an intermediate object CountSplit which holds a mutable reference to the original Count object and has the Drop trait implemented to do the re-combination logic.
This approach feels kludgy. In particular, this is awkward:
let mut ms = c.make_split();
let (x, y) = ms.split();
Doing this in one line like let (x, y) = c.make_split().split(); is not allowed because the intermediate object must have a longer lifetime. Ideally I would be able to do something like let (x, y) = c.magic_split(); and avoid exposing the intermediate object altogether.
Is there a way to do this which doesn't require doing two let's every time, or some other way to tackle this pattern that would be more idiomatic?
#[derive(Debug)]
struct Count {
val: i32,
}
trait MakeSplit<'a> {
type S: Split<'a>;
fn make_split(&'a mut self) -> Self::S;
}
impl<'a> MakeSplit<'a> for Count {
type S = CountSplit<'a>;
fn make_split(&mut self) -> CountSplit {
CountSplit {
top: self,
second: 0,
}
}
}
struct CountSplit<'a> {
top: &'a mut Count,
second: i32,
}
trait Split<'a> {
fn split(&'a mut self) -> (&'a mut i32, &'a mut i32);
}
impl<'a, 'b> Split<'a> for CountSplit<'b> {
fn split(&mut self) -> (&mut i32, &mut i32) {
(&mut self.top.val, &mut self.second)
}
}
impl<'a> Drop for CountSplit<'a> {
fn drop(&mut self) {
println!("custom drop occurs here");
self.top.val += self.second;
}
}
fn main() {
let mut c = Count { val: 2 };
println!("{:?}", c); // Count { val: 2 }
{
let mut ms = c.make_split();
let (x, y) = ms.split();
println!("split: {} {}", x, y); // split: 2 0
// each of these lines correctly gives a compile-time error
// c.make_split(); // can't borrow c as mutable
// println!("{:?}", c); // or immutable
// ms.split(); // also can't borrow ms
*x += 100;
*y += 5000;
println!("split: {} {}", x, y); // split: 102 5000
} // custom drop occurs here
println!("{:?}", c); // Count { val: 5102 }
}
playground:

I don't think a reference to a temporary value like yours can be made to work in today's Rust.
If it's any help, if you specifically want to call a function with two &mut i32 parameters like you mentioned in the comments, e.g.
fn foo(a: &mut i32, b: &mut i32) {
*a += 1;
*b += 2;
println!("split: {} {}", a, b);
}
you can already do that with the same number of lines as you'd have if your chaining worked.
With the chaining, you'd call
let (x, y) = c.make_split().split();
foo(x, y);
And if you just leave out the conversion to a tuple, it looks like this:
let mut ms = c.make_split();
foo(&mut ms.top.val, &mut ms.second);
You can make it a little prettier by e.g. storing the mutable reference to val directly in CountSplit as first, so that it becomes foo(&mut ms.first, &mut ms.second);. If you want it to feel even more like a tuple, I think you can use DerefMut to be able to write foo(&mut ms.0, &mut ms.1);.
Alternatively, you can of course formulate this as a function taking a function
impl Count {
fn as_split<F: FnMut(&mut i32, &mut i32)>(&mut self, mut f: F) {
let mut second = 0;
f(&mut self.val, &mut second);
self.val += second;
}
}
and then just call
c.as_split(foo);

Efficiently insert or replace multiple elements in the middle or at the beginning of a Vec?

Is there any straightforward way to insert or replace multiple elements from &[T] and/or Vec<T> in the middle or at the beginning of a Vec in linear time?
I could only find std::vec::Vec::insert, but that's only for inserting a single element in O(n) time, so I obviously cannot call that in a loop.
I could do a split_off at that index, extend the new elements into the left half of the split, and then extend the second half into the first, but is there a better way?

As of Rust 1.21.0, Vec::splice is available and allows inserting at any point, including fully prepending:
let mut vec = vec![1, 5];
let slice = &[2, 3, 4];
vec.splice(1..1, slice.iter().cloned());
println!("{:?}", vec); // [1, 2, 3, 4, 5]
The docs state:
Note 4: This is optimal if:
The tail (elements in the vector after range) is empty
or replace_with yields fewer elements than range’s length
or the lower bound of its size_hint() is exact.
In this case, the lower bound of the slice's iterator should be exact, so it should perform one memory move.
splice is a bit more powerful in that it allows you to remove a range of values (the first argument), insert new values (the second argument), and optionally get the old values (the result of the call).
Replacing a set of items
let mut vec = vec![0, 1, 5];
let slice = &[2, 3, 4];
vec.splice(..2, slice.iter().cloned());
println!("{:?}", vec); // [2, 3, 4, 5]
Getting the previous values
let mut vec = vec![0, 1, 2, 3, 4];
let slice = &[9, 8, 7];
let old: Vec<_> = vec.splice(3.., slice.iter().cloned()).collect();
println!("{:?}", vec); // [0, 1, 2, 9, 8, 7]
println!("{:?}", old); // [3, 4]

Okay, there is no appropriate method in Vec interface (as I can see). But we can always implement the same thing ourselves.
memmove
When T is Copy, probably the most obvious way is to move the memory, like this:
fn push_all_at<T>(v: &mut Vec<T>, offset: usize, s: &[T]) where T: Copy {
match (v.len(), s.len()) {
(_, 0) => (),
(current_len, _) => {
v.reserve_exact(s.len());
unsafe {
v.set_len(current_len + s.len());
let to_move = current_len - offset;
let src = v.as_mut_ptr().offset(offset as isize);
if to_move > 0 {
let dst = src.offset(s.len() as isize);
std::ptr::copy_memory(dst, src, to_move);
}
std::ptr::copy_nonoverlapping_memory(src, s.as_ptr(), s.len());
}
},
}
}
shuffle
If T is not copy, but it implements Clone, we can append given slice to the end of the Vec, and move it to the required position using swaps in linear time:
fn push_all_at<T>(v: &mut Vec<T>, mut offset: usize, s: &[T]) where T: Clone + Default {
match (v.len(), s.len()) {
(_, 0) => (),
(0, _) => { v.push_all(s); },
(_, _) => {
assert!(offset <= v.len());
let pad = s.len() - ((v.len() - offset) % s.len());
v.extend(repeat(Default::default()).take(pad));
v.push_all(s);
let total = v.len();
while total - offset >= s.len() {
for i in 0 .. s.len() { v.swap(offset + i, total - s.len() + i); }
offset += s.len();
}
v.truncate(total - pad);
},
}
}
iterators concat
Maybe the best choice will be to not modify Vec at all. For example, if you are going to access the result via iterator, we can just build iterators chain from our chunks:
let v: &[usize] = &[0, 1, 2];
let s: &[usize] = &[3, 4, 5, 6];
let offset = 2;
let chain = v.iter().take(offset).chain(s.iter()).chain(v.iter().skip(offset));
let result: Vec<_> = chain.collect();
println!("Result: {:?}", result);

I was trying to prepend to a vector in rust and found this closed question that was linked here, (despite this question being both prepend and insert AND efficiency. I think my answer would be better as an answer for that other, more precises question because I can't attest to the efficiency), but the following code helped me prepend, (and the opposite.) [I'm sure that the other two answers are more efficient, but the way that I learn, I like having answers that can be cut-n-pasted with examples that demonstrate an application of the answer.]
pub trait Unshift<T> { fn unshift(&mut self, s: &[T]) -> (); }
pub trait UnshiftVec<T> { fn unshift_vec(&mut self, s: Vec<T>) -> (); }
pub trait UnshiftMemoryHog<T> { fn unshift_memory_hog(&mut self, s: Vec<T>) -> (); }
pub trait Shift<T> { fn shift(&mut self) -> (); }
pub trait ShiftN<T> { fn shift_n(&mut self, s: usize) -> (); }
impl<T: std::clone::Clone> ShiftN<T> for Vec<T> {
fn shift_n(&mut self, s: usize) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..s);
}
}
impl<T: std::clone::Clone> Shift<T> for Vec<T> {
fn shift(&mut self) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..1);
}
}
impl<T: std::clone::Clone> Unshift<T> for Vec<T> {
fn unshift(&mut self, s: &[T]) -> ()
// where
// T: std::clone::Clone,
{
self.splice(0..0, s.to_vec());
}
}
impl<T: std::clone::Clone> UnshiftVec<T> for Vec<T> {
fn unshift_vec(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
self.splice(0..0, s);
}
}
impl<T: std::clone::Clone> UnshiftMemoryHog<T> for Vec<T> {
fn unshift_memory_hog(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
let mut tmp: Vec<_> = s.to_owned();
//let mut tmp: Vec<_> = s.clone(); // this also works for some data types
/*
let local_s: Vec<_> = self.clone(); // explicit clone()
tmp.extend(local_s); // to vec is possible
*/
tmp.extend(self.clone());
*self = tmp;
//*self = (*tmp).to_vec(); // Just because it compiles, doesn't make it right.
}
}
// this works for: v = unshift(v, &vec![8]);
// (If you don't want to impl Unshift for Vec<T>)
#[allow(dead_code)]
fn unshift_fn<T>(v: Vec<T>, s: &[T]) -> Vec<T>
where
T: Clone,
{
// create a mutable vec and fill it
// with a clone of the array that we want
// at the start of the vec.
let mut tmp: Vec<_> = s.to_owned();
// then we add the existing vector to the end
// of the temporary vector.
tmp.extend(v);
// return the tmp vec that is identitcal
// to unshift-ing the original vec.
tmp
}
/*
N.B. It is sometimes (often?) more memory efficient to reverse
the vector and use push/pop, rather than splice/drain;
Especially if you create your vectors in "stack order" to begin with.
*/
fn main() {
let mut v: Vec<usize> = vec![1, 2, 3];
println!("Before push:\t {:?}", v);
v.push(0);
println!("After push:\t {:?}", v);
v.pop();
println!("popped:\t\t {:?}", v);
v.drain(0..1);
println!("drain(0..1)\t {:?}", v);
/*
// We could use a function
let c = v.clone();
v = unshift_fn(c, &vec![0]);
*/
v.splice(0..0, vec![0]);
println!("splice(0..0, vec![0]) {:?}", v);
v.shift_n(1);
println!("shift\t\t {:?}", v);
v.unshift_memory_hog(vec![8, 16, 31, 1]);
println!("MEMORY guzzler unshift {:?}", v);
//v.drain(0..3);
v.drain(0..=2);
println!("back to the start: {:?}", v);
v.unshift_vec(vec![0]);
println!("zerothed with unshift: {:?}", v);
let mut w = vec![4, 5, 6];
/*
let prepend_this = &[1, 2, 3];
w.unshift_vec(prepend_this.to_vec());
*/
w.unshift(&[1, 2, 3]);
assert_eq!(&w, &[1, 2, 3, 4, 5, 6]);
println!("{:?} == {:?}", &w, &[1, 2, 3, 4, 5, 6]);
}

How can I swap items in a vector, slice, or array in Rust?

My code looks like this:
fn swap<T>(mut collection: Vec<T>, a: usize, b: usize) {
let temp = collection[a];
collection[a] = collection[b];
collection[b] = temp;
}
Rust is pretty sure I'm not allowed to "move out of dereference" or "move out of indexed content", whatever that is. How do I convince Rust that this is possible?

There is a swap method defined for &mut [T]. Since a Vec<T> can be mutably dereferenced as a &mut [T], this method can be called directly:
fn main() {
let mut numbers = vec![1, 2, 3];
println!("before = {:?}", numbers);
numbers.swap(0, 2);
println!("after = {:?}", numbers);
}
To implement this yourself, you have to write some unsafe code. Vec::swap is implemented like this:
fn swap(&mut self, a: usize, b: usize) {
unsafe {
// Can't take two mutable loans from one vector, so instead just cast
// them to their raw pointers to do the swap
let pa: *mut T = &mut self[a];
let pb: *mut T = &mut self[b];
ptr::swap(pa, pb);
}
}
It takes two raw pointers from the vector and uses ptr::swap to swap them safely.
There is also a mem::swap(&mut T, &mut T) when you need to swap two distinct variables. That cannot be used here because Rust won't allow taking two mutable borrows from the same vector.

As mentioned by #gsingh2011, the accepted answer is no longer good approach. With current Rust this code works fine:
fn main() {
let mut numbers = vec![1, 2, 3];
println!("before = {:?}", numbers);
numbers.swap(0, 2);
println!("after = {:?}", numbers);
}
try it here

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How can concatenated &[u8] slices implement the Read trait without additional copying? - rust

Related

How to traverse and consume a vector in given order? [duplicate]

Rust same mutable reference in multiple vectors

Mutable borrow into two parts with cleanup

Efficiently insert or replace multiple elements in the middle or at the beginning of a Vec?

How can I swap items in a vector, slice, or array in Rust?

Categories

Resources