How can slices be split using another slice as a delimiter?

Does the standard library provide a way to split a slice [T] using another slice of the same type as a delimiter? The library's documentation lists methods that operate on single-element delimiters rather than slices.
For example: A slice of 5 u64 integers [1u64, 4u64, 0u64, 0u64, 8u64] split using [0u64, 0u64] as a delimiter would result in two slices [1u64, 4u64] and [8u64].

Does the standard library provide a way to split a slice [T] using another slice of the same type as a delimiter?
As of Rust 1.9, no, but you can implement it:
fn main() {
let a = [1, 4, 7, 0, 0, 8, 10, 0, 0];
let b = [0, 0];
let mut iter = split_subsequence(&a, &b);
assert_eq!(&[1, 4, 7], iter.next().unwrap());
assert_eq!(&[8, 10], iter.next().unwrap());
assert!(iter.next().unwrap().is_empty());
assert_eq!(None, iter.next());
}
pub struct SplitSubsequence<'a, 'b, T: 'a + 'b> {
slice: &'a [T],
needle: &'b [T],
ended: bool,
}
impl<'a, 'b, T: 'a + 'b + PartialEq> Iterator for SplitSubsequence<'a, 'b, T> {
type Item = &'a [T];
fn next(&mut self) -> Option<Self::Item> {
if self.ended {
None
} else if self.slice.is_empty() {
self.ended = true;
Some(self.slice)
} else if let Some(p) = self.slice
.windows(self.needle.len())
.position(|w| w == self.needle) {
let item = &self.slice[..p];
self.slice = &self.slice[p + self.needle.len()..];
Some(item)
} else {
// No delimiter left: yield the remainder and stop.
self.ended = true;
Some(self.slice)
}
}
}
fn split_subsequence<'a, 'b, T>(slice: &'a [T], needle: &'b [T]) -> SplitSubsequence<'a, 'b, T>
where T: 'a + 'b + PartialEq
{
SplitSubsequence {
slice: slice,
needle: needle,
ended: false,
}
}
Note that this implementation uses a naive algorithm for finding an equal subsequence.
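The same naive search can also be written as a plain loop that collects the pieces eagerly. This is only a minimal sketch: the name `split_on` is hypothetical (it is not a standard library function), and treating an empty needle as "no delimiter" is an assumption made here. The iterator above would instead panic in `windows(0)` when given an empty needle and a non-empty slice.

```rust
/// Hypothetical helper (not in std): splits `haystack` on every
/// non-overlapping occurrence of `needle`, scanning left to right.
fn split_on<'a, T: PartialEq>(haystack: &'a [T], needle: &[T]) -> Vec<&'a [T]> {
    // Assumption: an empty needle means "no delimiter at all".
    if needle.is_empty() {
        return vec![haystack];
    }
    let mut parts = Vec::new();
    let mut start = 0; // start of the piece currently being built
    let mut i = 0; // candidate position for the next delimiter
    while i + needle.len() <= haystack.len() {
        if &haystack[i..i + needle.len()] == needle {
            parts.push(&haystack[start..i]);
            i += needle.len();
            start = i;
        } else {
            i += 1;
        }
    }
    // Whatever follows the last delimiter (possibly empty) is the final piece.
    parts.push(&haystack[start..]);
    parts
}

fn main() {
    let parts = split_on(&[1u64, 4, 0, 0, 8], &[0, 0]);
    assert_eq!(parts, vec![&[1u64, 4][..], &[8][..]]);
}
```

Like the iterator version, a trailing delimiter produces a trailing empty slice.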

Related

How to return an iterator for a tuple of slices that iterates the first slice then the second slice?

I have a function that splits a slice into three parts, a leading and trailing slice, and a reference to the middle element.
/// The leading and trailing parts of a slice.
struct LeadingTrailing<'a, T>(&'a mut [T], &'a mut [T]);
/// Divides one mutable slice into three parts, a leading and trailing slice,
/// and a reference to the middle element.
pub fn split_at_rest_mut<T>(x: &mut [T], index: usize) -> (&mut T, LeadingTrailing<T>) {
debug_assert!(index < x.len());
let (leading, trailing) = x.split_at_mut(index);
let (val, trailing) = trailing.split_first_mut().unwrap();
(val, LeadingTrailing(leading, trailing))
}
I would like to implement Iterator for LeadingTrailing<'a, T> so that it first iterates over the first slice, and then over the second. i.e., it will behave like:
let mut foo = [0,1,2,3,4,5];
let (item, lt) = split_at_rest_mut(&mut foo, 2);
for num in lt.0 {
...
}
for num in lt.1 {
...
}
I have tried converting to a Chain:
struct LeadingTrailing<'a, T>(&'a mut [T], &'a mut [T]);
impl <'a, T> LeadingTrailing<'a, T> {
fn to_chain(&mut self) -> std::iter::Chain<&'a mut [T], &'a mut [T]> {
self.0.iter_mut().chain(self.1.iter_mut())
}
}
But I get the error:
89 | self.0.iter_mut().chain(self.1.iter_mut())
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `&mut [T]`, found struct `std::slice::IterMut`
I have also tried creating a custom Iterator
/// The leading and trailing parts of a slice.
struct LeadingTrailing<'a, T>(&'a mut [T], &'a mut [T]);
struct LTOthersIterator<'a, T> {
data: LeadingTrailing<'a, T>,
index: usize,
}
/// Iterates over the first slice, then the second slice.
impl<'a, T> Iterator for LTOthersIterator<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
let leading_len = self.data.0.len();
let trailing_len = self.data.1.len();
let total_len = leading_len + trailing_len;
match self.index {
0..=leading_len => {
self.index += 1;
self.data.0.get(self.index - 1)
}
leading_len..=total_len => {
self.index += 1;
self.data.1.get(self.index - leading_len - 1)
}
}
}
}
But I get the error:
error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
--> src\main.rs:104:29
|
104 | self.data.0.get(self.index - 1)
^^^
What is the correct way to do this?

You can either let the compiler do the work:
impl <'a, T> LeadingTrailing<'a, T> {
fn to_chain(&mut self) -> impl Iterator<Item = &mut T> {
self.0.iter_mut().chain(self.1.iter_mut())
}
}
Or prescribe the correct type. Chain takes the iterators, not the things they were created from:
impl <'a, T> LeadingTrailing<'a, T> {
fn to_chain(&'a mut self) -> std::iter::Chain<std::slice::IterMut<'a, T>, std::slice::IterMut<'a, T>> {
self.0.iter_mut().chain(self.1.iter_mut())
}
}

The return type of to_chain() is incorrect.
For simplicity, just use impl Iterator.
/// The leading and trailing parts of a slice.
#[derive(Debug)]
pub struct LeadingTrailing<'a, T>(&'a mut [T], &'a mut [T]);
/// Divides one mutable slice into three parts, a leading and trailing slice,
/// and a reference to the middle element.
pub fn split_at_rest_mut<T>(x: &mut [T], index: usize) -> (&mut T, LeadingTrailing<T>) {
debug_assert!(index < x.len());
let (leading, trailing) = x.split_at_mut(index);
let (val, trailing) = trailing.split_first_mut().unwrap();
(val, LeadingTrailing(leading, trailing))
}
impl<T> LeadingTrailing<'_, T> {
fn to_chain(&mut self) -> impl Iterator<Item = &mut T> {
self.0.iter_mut().chain(self.1.iter_mut())
}
}
fn main() {
let mut arr = [0, 1, 2, 3, 4, 5, 6, 7, 8];
let (x, mut leadtrail) = split_at_rest_mut(&mut arr, 5);
println!("x: {}", x);
println!("leadtrail: {:?}", leadtrail);
for el in leadtrail.to_chain() {
*el *= 2;
}
println!("leadtrail: {:?}", leadtrail);
}
x: 5
leadtrail: LeadingTrailing([0, 1, 2, 3, 4], [6, 7, 8])
leadtrail: LeadingTrailing([0, 2, 4, 6, 8], [12, 14, 16])
The fully written out version would be:
impl<T> LeadingTrailing<'_, T> {
fn to_chain(&mut self) -> std::iter::Chain<std::slice::IterMut<T>, std::slice::IterMut<T>> {
self.0.iter_mut().chain(self.1.iter_mut())
}
}
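If the goal is to write `for num in lt` directly, one option is to implement IntoIterator instead of Iterator. The sketch below assumes it is acceptable to consume the LeadingTrailing by value. That is a design choice made here for illustration, not something the answers above prescribe.

```rust
use std::iter::Chain;
use std::slice::IterMut;

/// The leading and trailing parts of a slice.
pub struct LeadingTrailing<'a, T>(&'a mut [T], &'a mut [T]);

impl<'a, T> IntoIterator for LeadingTrailing<'a, T> {
    type Item = &'a mut T;
    type IntoIter = Chain<IterMut<'a, T>, IterMut<'a, T>>;

    // Consumes the pair and yields the leading elements, then the trailing ones.
    fn into_iter(self) -> Self::IntoIter {
        let LeadingTrailing(leading, trailing) = self;
        leading.iter_mut().chain(trailing.iter_mut())
    }
}

/// Same splitting function as in the question.
pub fn split_at_rest_mut<T>(x: &mut [T], index: usize) -> (&mut T, LeadingTrailing<'_, T>) {
    debug_assert!(index < x.len());
    let (leading, trailing) = x.split_at_mut(index);
    let (val, trailing) = trailing.split_first_mut().unwrap();
    (val, LeadingTrailing(leading, trailing))
}

fn main() {
    let mut arr = [0, 1, 2, 3, 4, 5];
    let (mid, lt) = split_at_rest_mut(&mut arr, 2);
    for num in lt {
        *num *= 10;
    }
    *mid = 99;
    assert_eq!(arr, [0, 10, 99, 30, 40, 50]);
}
```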

A nice version for Vec<Vec<T>>.get?

Is there a convenient way to get a value from a Vec<Vec<T>>? For a normal 1D Vec I can call vec.get(), but if vec is a Vec<Vec<T>>, get returns an Option<&Vec<T>>, not a value of type T. Is there a nice way to 'get' a value from a 2D matrix (Vec<Vec<T>>)?

Option::and_then lets you chain optional return values (o.and_then(f) is equivalent to o.map(f).flatten()):
vec.get(i).and_then(|v| v.get(j))
It also easily extends to higher dimensions:
vec
.get(i)
.and_then(|v| v.get(j))
.and_then(|v| v.get(k))
.and_then(|v| v.get(l))
// and so on
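Inside a function that itself returns an Option, the `?` operator gives another way to spell the same chain. A small sketch follows; the helper name `get2` is made up for illustration.

```rust
/// Hypothetical helper: returns `None` if either index is out of bounds.
fn get2<T>(grid: &[Vec<T>], i: usize, j: usize) -> Option<&T> {
    // `?` returns early with `None` when row `i` does not exist.
    grid.get(i)?.get(j)
}

fn main() {
    let grid = vec![vec![1, 2, 3], vec![4, 5, 6]];
    assert_eq!(get2(&grid, 1, 2), Some(&6));
    assert_eq!(get2(&grid, 2, 0), None); // no such row
    assert_eq!(get2(&grid, 0, 3), None); // no such column
}
```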

@Aplet123 is of course right; that's the way to go, and that answer should be marked correct.
But in case you wonder how to make this prettier, you could wrap it in a custom trait for Vec<Vec<T>>:
trait Get2D {
type Val;
fn get2d(&self, i: usize, j: usize) -> Option<&Self::Val>;
fn get2d_mut(&mut self, i: usize, j: usize) -> Option<&mut Self::Val>;
}
impl<T> Get2D for Vec<Vec<T>> {
type Val = T;
fn get2d(&self, i: usize, j: usize) -> Option<&T> {
self.get(i).and_then(|e| e.get(j))
}
fn get2d_mut(&mut self, i: usize, j: usize) -> Option<&mut T> {
self.get_mut(i).and_then(|e| e.get_mut(j))
}
}
fn main() {
let mut data: Vec<Vec<i32>> = vec![vec![1, 2, 3], vec![4, 5, 6]];
println!("{}", data.get2d(1, 1).unwrap());
*data.get2d_mut(0, 1).unwrap() = 42;
println!("{:?}", data);
}
5
[[1, 42, 3], [4, 5, 6]]

How can concatenated &[u8] slices implement the Read trait without additional copying?

The Read trait is implemented for &[u8]. How can I get a Read trait over several concatenated u8 slices without actually doing any concatenation first?
If I concatenate first, there will be two copies -- multiple arrays into a single array followed by copying from single array to destination via the Read trait. I would like to avoid the first copying.
I want a Read trait over &[&[u8]] that treats multiple slices as a single continuous slice.
fn foo<R: std::io::Read + Send>(data: R) {
// ...
}
let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];
foo(c); // <- this won't compile because `c` is not a slice of bytes.

You could use the multi_reader crate, which can concatenate any number of values that implement Read:
let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];
foo(multi_reader::MultiReader::new(c.iter().copied()));
If you don't want to depend on an external crate, you can wrap the slices in a struct of your own and implement Read for it:
use std::io::Read;
struct MultiRead<'a> {
sources: &'a [&'a [u8]],
pos_in_current: usize,
}
impl<'a> MultiRead<'a> {
fn new(sources: &'a [&'a [u8]]) -> MultiRead<'a> {
MultiRead {
sources,
pos_in_current: 0,
}
}
}
impl Read for MultiRead<'_> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let current = loop {
if self.sources.is_empty() {
return Ok(0); // EOF
}
let current = self.sources[0];
if self.pos_in_current < current.len() {
break current;
}
self.pos_in_current = 0;
self.sources = &self.sources[1..];
};
let read_size = buf.len().min(current.len() - self.pos_in_current);
buf[..read_size].copy_from_slice(&current[self.pos_in_current..][..read_size]);
self.pos_in_current += read_size;
Ok(read_size)
}
}

Create a wrapper type around the slices and implement Read for it. Compared to user4815162342's answer, I delegate down to the implementation of Read for slices:
use std::{io::Read, mem};
struct Wrapper<'a, 'b>(&'a mut [&'b [u8]]);
impl<'a, 'b> Read for Wrapper<'a, 'b> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let slices = mem::take(&mut self.0);
match slices {
[head, ..] => {
let n_bytes = head.read(buf)?;
if head.is_empty() {
// Advance the child slice
self.0 = &mut slices[1..];
} else {
// More to read, put back all the child slices
self.0 = slices;
}
Ok(n_bytes)
}
_ => Ok(0),
}
}
}
fn main() {
let parts: &mut [&[u8]] = &mut [b"hello ", b"world"];
let mut w = Wrapper(parts);
let mut buf = Vec::new();
w.read_to_end(&mut buf).unwrap();
assert_eq!(b"hello world", &*buf);
}
A more efficient implementation would implement further methods from Read, such as read_to_end or read_vectored.
See also:
How do I implement a trait I don't own for a type I don't own?
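For a small, fixed number of slices, the standard library's `Read::chain` adapter may already be enough, since `&[u8]` implements Read. This is a minimal sketch; `read_all` is a hypothetical stand-in for the question's `foo`.

```rust
use std::io::Read;

/// Hypothetical consumer with the same bounds as `foo` in the question.
fn read_all<R: Read + Send>(mut data: R) -> Vec<u8> {
    let mut out = Vec::new();
    data.read_to_end(&mut out)
        .expect("reading from in-memory slices should not fail");
    out
}

fn main() {
    let a: &[u8] = &[1, 2, 3, 4, 5];
    let b: &[u8] = &[1, 2];
    // `chain` reads `a` to the end and then continues with `b`;
    // the two slices are never concatenated into one buffer first.
    let out = read_all(a.chain(b));
    assert_eq!(out, [1, 2, 3, 4, 5, 1, 2]);
}
```

More readers can be appended with further `.chain(...)` calls, but the nesting is fixed at compile time, so a slice of slices of unknown length still needs a wrapper type like the ones above.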

How to generate iterator with sliding window pairs?

I'd like to create an iterator that for this input:
[1, 2, 3, 4]
Will contain the following:
(1, 2)
(2, 3)
(3, 4)
Peekable seems ideal for this, but I'm new to Rust, so this naïve version doesn't work:
fn main() {
let i = ['a', 'b', 'c']
.iter()
.peekable();
let j = i.map(|x| (x, i.peek()));
println!("{:?}", j);
println!("Hello World!");
}
What am I doing wrong?

You can use the windows method on slices, and then map the resulting sub-slices into tuples:
fn main() {
let i = [1, 2, 3, 4]
.windows(2)
.map(|pair| (pair[0], pair[1]));
println!("{:?}", i.collect::<Vec<_>>());
}
If you want a solution that works for all iterators (and not just slices) and are willing to use a 3rd-party library you can use the tuple_windows method from itertools.
use itertools::{Itertools, TupleWindows}; // 0.10.0
fn main() {
let i: TupleWindows<_, (i32, i32)> = vec![1, 2, 3, 4]
.into_iter()
.tuple_windows();
println!("{:?}", i.collect::<Vec<_>>());
}
If you're not willing to use a 3rd-party library it's still simple enough that you can implement it yourself! Here's an example generic implementation that works for any Iterator<Item = T> where T: Clone:
use std::collections::BTreeSet;
struct PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
iterator: I,
last_item: Option<T>,
}
impl<I, T> PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
fn new(iterator: I) -> Self {
PairIter {
iterator,
last_item: None,
}
}
}
impl<I, T> Iterator for PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
type Item = (T, T);
fn next(&mut self) -> Option<Self::Item> {
if self.last_item.is_none() {
self.last_item = self.iterator.next();
}
if self.last_item.is_none() {
return None;
}
let curr_item = self.iterator.next();
if curr_item.is_none() {
return None;
}
let temp_item = curr_item.clone();
let result = (self.last_item.take().unwrap(), curr_item.unwrap());
self.last_item = temp_item;
Some(result)
}
}
fn example<T: Clone>(iterator: impl Iterator<Item = T>) -> impl Iterator<Item = (T, T)> {
PairIter::new(iterator)
}
fn main() {
let mut set = BTreeSet::new();
set.insert(String::from("a"));
set.insert(String::from("b"));
set.insert(String::from("c"));
set.insert(String::from("d"));
dbg!(example(set.into_iter()).collect::<Vec<_>>());
}

You can use tuple_windows() from the itertools crate as a drop-in replacement:
use itertools::Itertools;
fn main() {
let data = vec![1, 2, 3, 4];
for (a, b) in data.iter().tuple_windows() {
println!("({}, {})", a, b);
}
}
(1, 2)
(2, 3)
(3, 4)
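For slices, another std-only option is to zip the sequence with itself shifted by one element. This is a sketch, and the helper name `pairs` is made up for illustration.

```rust
/// Hypothetical helper: yields each element together with its successor.
fn pairs<'a, T>(items: &'a [T]) -> impl Iterator<Item = (&'a T, &'a T)> {
    // `zip` stops as soon as the shorter (shifted) iterator runs out,
    // so the last element never appears as the first half of a pair.
    items.iter().zip(items.iter().skip(1))
}

fn main() {
    let data = [1, 2, 3, 4];
    let result: Vec<(i32, i32)> = pairs(&data).map(|(a, b)| (*a, *b)).collect();
    assert_eq!(result, vec![(1, 2), (2, 3), (3, 4)]);
}
```

Inputs with fewer than two elements simply produce no pairs.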

Efficiently insert or replace multiple elements in the middle or at the beginning of a Vec?

Is there any straightforward way to insert or replace multiple elements from &[T] and/or Vec<T> in the middle or at the beginning of a Vec in linear time?
I could only find std::vec::Vec::insert, but that's only for inserting a single element in O(n) time, so I obviously cannot call that in a loop.
I could do a split_off at that index, extend the new elements into the left half of the split, and then extend the second half into the first, but is there a better way?

As of Rust 1.21.0, Vec::splice is available and allows inserting at any point, including fully prepending:
let mut vec = vec![1, 5];
let slice = &[2, 3, 4];
vec.splice(1..1, slice.iter().cloned());
println!("{:?}", vec); // [1, 2, 3, 4, 5]
The docs state:
Note 4: This is optimal if:
The tail (elements in the vector after range) is empty
or replace_with yields fewer elements than range’s length
or the lower bound of its size_hint() is exact.
In this case, the lower bound of the slice's iterator should be exact, so it should perform one memory move.
splice is a bit more powerful in that it allows you to remove a range of values (the first argument), insert new values (the second argument), and optionally get the old values (the result of the call).
Replacing a set of items
let mut vec = vec![0, 1, 5];
let slice = &[2, 3, 4];
vec.splice(..2, slice.iter().cloned());
println!("{:?}", vec); // [2, 3, 4, 5]
Getting the previous values
let mut vec = vec![0, 1, 2, 3, 4];
let slice = &[9, 8, 7];
let old: Vec<_> = vec.splice(3.., slice.iter().cloned()).collect();
println!("{:?}", vec); // [0, 1, 2, 9, 8, 7]
println!("{:?}", old); // [3, 4]
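Prepending is the special case of an empty range at index 0. The wrapper below is a minimal sketch; `insert_all` is a hypothetical name, not a Vec method.

```rust
/// Hypothetical helper: inserts clones of `items` at `index`.
/// Panics if `index > v.len()`, as `splice` does.
fn insert_all<T: Clone>(v: &mut Vec<T>, index: usize, items: &[T]) {
    // An empty range removes nothing, so this is a pure insertion.
    // Dropping the returned `Splice` completes the operation.
    v.splice(index..index, items.iter().cloned());
}

fn main() {
    let mut v = vec![4, 5];
    insert_all(&mut v, 0, &[1, 2, 3]); // prepend
    assert_eq!(v, [1, 2, 3, 4, 5]);

    let end = v.len();
    insert_all(&mut v, end, &[6]); // append
    assert_eq!(v, [1, 2, 3, 4, 5, 6]);
}
```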

Okay, there is no appropriate method in the Vec interface (as far as I can see). But we can always implement the same thing ourselves.
memmove
When T is Copy, probably the most obvious way is to move the memory, like this:
fn push_all_at<T>(v: &mut Vec<T>, offset: usize, s: &[T]) where T: Copy {
    assert!(offset <= v.len());
    match (v.len(), s.len()) {
        (_, 0) => (),
        (current_len, _) => {
            v.reserve_exact(s.len());
            unsafe {
                let to_move = current_len - offset;
                let src = v.as_mut_ptr().add(offset);
                if to_move > 0 {
                    // Shift the tail right to open a gap; the regions may overlap.
                    let dst = src.add(s.len());
                    std::ptr::copy(src, dst, to_move);
                }
                // Fill the gap; `s` cannot alias `v`, which is mutably borrowed.
                std::ptr::copy_nonoverlapping(s.as_ptr(), src, s.len());
                // Only now are all `current_len + s.len()` elements initialized.
                v.set_len(current_len + s.len());
            }
        },
    }
}
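A safe alternative with the same linear cost is to append the new elements and then rotate the tail into place. This sketch assumes `T: Clone` is acceptable; `insert_slice_at` is a hypothetical name.

```rust
/// Hypothetical helper: inserts clones of `s` at `offset` without `unsafe`.
fn insert_slice_at<T: Clone>(v: &mut Vec<T>, offset: usize, s: &[T]) {
    assert!(offset <= v.len(), "offset out of bounds");
    // Step 1: append, so the vector is [head, tail, s].
    v.extend_from_slice(s);
    // Step 2: rotate [tail, s] right by s.len(), giving [head, s, tail].
    v[offset..].rotate_right(s.len());
}

fn main() {
    let mut v = vec![1, 5];
    insert_slice_at(&mut v, 1, &[2, 3, 4]);
    assert_eq!(v, [1, 2, 3, 4, 5]);
}
```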
shuffle
If T is not Copy but implements Clone (and Default, which is needed for the padding), we can append the given slice to the end of the Vec and move it to the required position using swaps in linear time:
fn push_all_at<T>(v: &mut Vec<T>, mut offset: usize, s: &[T]) where T: Clone + Default {
    match (v.len(), s.len()) {
        (_, 0) => (),
        (0, _) => { v.extend_from_slice(s); },
        (_, _) => {
            assert!(offset <= v.len());
            // Pad so that the tail plus padding is a whole number of `s.len()` blocks.
            let pad = s.len() - ((v.len() - offset) % s.len());
            v.extend(std::iter::repeat(T::default()).take(pad));
            v.extend_from_slice(s);
            let total = v.len();
            // Repeatedly swap the block at `offset` with the final block,
            // advancing one block each time.
            while total - offset >= s.len() {
                for i in 0 .. s.len() { v.swap(offset + i, total - s.len() + i); }
                offset += s.len();
            }
            // The padding has ended up at the back; drop it.
            v.truncate(total - pad);
        },
    }
}
iterators concat
Maybe the best choice is not to modify the Vec at all. For example, if you are going to access the result via an iterator, you can just build an iterator chain from the chunks:
let v: &[usize] = &[0, 1, 2];
let s: &[usize] = &[3, 4, 5, 6];
let offset = 2;
let chain = v.iter().take(offset).chain(s.iter()).chain(v.iter().skip(offset));
let result: Vec<_> = chain.collect();
println!("Result: {:?}", result);

I was trying to prepend to a vector in Rust and found a closed question that links here, even though this question covers prepending, inserting, and efficiency all at once. My answer would fit that other, more precise question better, because I can't attest to the efficiency. Still, the following code helped me prepend (and do the opposite). I'm sure the other two answers are more efficient, but I learn best from answers that can be cut and pasted, with examples that demonstrate how the answer is applied.
pub trait Unshift<T> { fn unshift(&mut self, s: &[T]) -> (); }
pub trait UnshiftVec<T> { fn unshift_vec(&mut self, s: Vec<T>) -> (); }
pub trait UnshiftMemoryHog<T> { fn unshift_memory_hog(&mut self, s: Vec<T>) -> (); }
pub trait Shift<T> { fn shift(&mut self) -> (); }
pub trait ShiftN<T> { fn shift_n(&mut self, s: usize) -> (); }
impl<T: std::clone::Clone> ShiftN<T> for Vec<T> {
fn shift_n(&mut self, s: usize) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..s);
}
}
impl<T: std::clone::Clone> Shift<T> for Vec<T> {
fn shift(&mut self) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..1);
}
}
impl<T: std::clone::Clone> Unshift<T> for Vec<T> {
fn unshift(&mut self, s: &[T]) -> ()
// where
// T: std::clone::Clone,
{
self.splice(0..0, s.to_vec());
}
}
impl<T: std::clone::Clone> UnshiftVec<T> for Vec<T> {
fn unshift_vec(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
self.splice(0..0, s);
}
}
impl<T: std::clone::Clone> UnshiftMemoryHog<T> for Vec<T> {
fn unshift_memory_hog(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
let mut tmp: Vec<_> = s.to_owned();
//let mut tmp: Vec<_> = s.clone(); // this also works for some data types
/*
let local_s: Vec<_> = self.clone(); // explicit clone()
tmp.extend(local_s); // to vec is possible
*/
tmp.extend(self.clone());
*self = tmp;
//*self = (*tmp).to_vec(); // Just because it compiles, doesn't make it right.
}
}
// this works for: v = unshift_fn(v, &[8]);
// (If you don't want to impl Unshift for Vec<T>)
#[allow(dead_code)]
fn unshift_fn<T>(v: Vec<T>, s: &[T]) -> Vec<T>
where
T: Clone,
{
// create a mutable vec and fill it
// with a clone of the array that we want
// at the start of the vec.
let mut tmp: Vec<_> = s.to_owned();
// then we add the existing vector to the end
// of the temporary vector.
tmp.extend(v);
// return the tmp vec that is identical
// to unshift-ing the original vec.
tmp
}
/*
N.B. It is sometimes (often?) more memory efficient to reverse
the vector and use push/pop, rather than splice/drain;
Especially if you create your vectors in "stack order" to begin with.
*/
fn main() {
let mut v: Vec<usize> = vec![1, 2, 3];
println!("Before push:\t {:?}", v);
v.push(0);
println!("After push:\t {:?}", v);
v.pop();
println!("popped:\t\t {:?}", v);
v.drain(0..1);
println!("drain(0..1)\t {:?}", v);
/*
// We could use a function
let c = v.clone();
v = unshift_fn(c, &vec![0]);
*/
v.splice(0..0, vec![0]);
println!("splice(0..0, vec![0]) {:?}", v);
v.shift_n(1);
println!("shift\t\t {:?}", v);
v.unshift_memory_hog(vec![8, 16, 31, 1]);
println!("MEMORY guzzler unshift {:?}", v);
//v.drain(0..3);
v.drain(0..=2);
println!("back to the start: {:?}", v);
v.unshift_vec(vec![0]);
println!("zerothed with unshift: {:?}", v);
let mut w = vec![4, 5, 6];
/*
let prepend_this = &[1, 2, 3];
w.unshift_vec(prepend_this.to_vec());
*/
w.unshift(&[1, 2, 3]);
assert_eq!(&w, &[1, 2, 3, 4, 5, 6]);
println!("{:?} == {:?}", &w, &[1, 2, 3, 4, 5, 6]);
}
