How can I join strings from VecDeque in Rust? - rust

This is what I'm trying to do, but it doesn't work:
use std::collections::VecDeque;
let mut x: VecDeque<String> = VecDeque::new();
x.push_back("bye".to_string());
x.push_front("hello".to_string());
x.join(" "); // doesn't compile!
I'm expecting hello bye as a result.
What is the right way?

The simplest option is probably to convert the VecDeque .into() a Vec, though as the documentation says:
This never needs to re-allocate, but does need to do O(n) data movement if the circular buffer doesn’t happen to be at the beginning of the allocation.
so ymmv on the efficiency front.
Alternatively, add a dependency on itertools which has a number of nice utilities including a join method (and also for your case a helper join function so you don't even need to convert the collection to an iterator).
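For reference, here is a minimal sketch of the conversion route using only the standard library. The function names and the sample() helper are made up for illustration. The second function uses VecDeque::make_contiguous, which rearranges the buffer in place and hands back a slice, so the deque itself is not consumed.

```rust
use std::collections::VecDeque;

/// Joins by first converting the deque into a Vec
/// (no reallocation, but possibly O(n) data movement).
fn join_via_vec(deque: VecDeque<String>, sep: &str) -> String {
    Vec::from(deque).join(sep)
}

/// Joins without consuming the deque: make_contiguous() rearranges the
/// ring buffer in place and returns it as one mutable slice.
fn join_in_place(deque: &mut VecDeque<String>, sep: &str) -> String {
    deque.make_contiguous().join(sep)
}

/// Hypothetical helper that rebuilds the deque from the question.
fn sample() -> VecDeque<String> {
    let mut x = VecDeque::new();
    x.push_back("bye".to_string());
    x.push_front("hello".to_string());
    x
}
```

Calling join_via_vec(sample(), " ") should give the hello bye the question expects.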

Related

What is the most idiomatic way to write iterators that map an uncertain number of input items to other objects in Rust?

I'm trying to implement a Lexer. Since lexers emit tokens, I suppose that we can perceive a Lexer as a special iterator that maps certain chunks of chars to Tokens. I therefore expect Lexer to store an Iterator<Item=char> and manipulate that iterator instead of a &str to enable maximum flexibility.
struct Lexer<T: Iterator<Item=char>> {
source: T
}
Yet I find it hard to manipulate the iterator, since almost all iterator adaptors take ownership, and with generics I cannot change the type of T at runtime, unless I switch to Box.
self.source.take_while(|x| x.is_whitespace())
A possible workaround is to require that the iterator implement Clone, use a clone every time I want to transform it, remember how many characters I have seen, and call next that many times. I believe that it is too clumsy.
I wonder if there is an idiomatic way to write iterators that map an uncertain number of input items (in this case, chars) into another object (in this case, Tokens)?
The most elegant way I can come up with so far is to use while let and similar constructs, which are not very fluent-style. I inspected the implementation of GroupBy in itertools and found that they use the while let approach too.
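To make the while let approach concrete, here is one possible sketch that wraps the source in std::iter::Peekable, so a character that ends a token is not lost. The Token variants and the take_while_peek helper are invented for this example and are not from any library.

```rust
use std::iter::Peekable;

// Hypothetical token type, just for illustration.
#[derive(Debug, PartialEq)]
enum Token {
    Number(u64),
    Ident(String),
    Symbol(char),
}

struct Lexer<T: Iterator<Item = char>> {
    source: Peekable<T>,
}

impl<T: Iterator<Item = char>> Lexer<T> {
    fn new(source: T) -> Self {
        Lexer { source: source.peekable() }
    }

    /// Consumes characters while `pred` holds; the first character that
    /// fails the test stays in the iterator.
    fn take_while_peek(&mut self, pred: impl Fn(char) -> bool) -> String {
        let mut out = String::new();
        while let Some(&c) = self.source.peek() {
            if !pred(c) {
                break;
            }
            out.push(c);
            self.source.next();
        }
        out
    }
}

impl<T: Iterator<Item = char>> Iterator for Lexer<T> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        // Skip leading whitespace, then look at the next char without consuming it.
        self.take_while_peek(char::is_whitespace);
        let c = *self.source.peek()?;
        if c.is_ascii_digit() {
            let digits = self.take_while_peek(|c| c.is_ascii_digit());
            // Literals that overflow u64 fall back to 0 in this sketch.
            Some(Token::Number(digits.parse().unwrap_or(0)))
        } else if c.is_alphabetic() {
            Some(Token::Ident(self.take_while_peek(char::is_alphanumeric)))
        } else {
            self.source.next();
            Some(Token::Symbol(c))
        }
    }
}
```

Since Lexer is itself an Iterator, the caller can still use the fluent style on the token stream, e.g. Lexer::new("x1 = 42".chars()).collect::<Vec<_>>().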

Rust: can I have a fixed size slice by borrowing the whole fixed size array in a smaller scope in a simple way

I saw the workarounds and they were kind of long. Am I missing a feature of Rust or a simple solution? (Important: not a workaround.) I feel like I should be able to do this with maybe a simple macro, but the arrayref crate's implementations aren't what I am looking for. Is this a feature that needs to be added to Rust, or is creating a fixed-size slice from a fixed-size array in a smaller scope a bad idea?
Basically what I want to do is this;
fn f(arr:[u8;4]){
arr[0];
}
fn basic(){
let mut arr:[u8;12] = [0;12];
// can't I borrow the whole array but pass a fixed-size slice of it?
f(&mut arr[8..12]); // But the length is known at compile time?
f(&mut arr[8..12] as &[u8;4]); // Why can't I do these things?
}
What I want can be achieved by the code below (from other SO threads):
use arrayref::mut_array_refs;
fn foo(){
let mut buf:[u8;12] = [0;12];
// splits buf into a &mut [u8;8] and a &mut [u8;4]
let (_, fixed_slice) = mut_array_refs![
&mut buf,
8,
4
];
write_u32_into(fixed_slice,0);
}
fn write_u32_into(fixed_slice:&mut [u8;4],num:u32){
// won't have to check if fixed_slice.len() == 4 and won't panic
}
But I looked into the crate, and even though this never panics, there are many unsafe blocks and many lines of code. It is a workaround for Rust itself. I wanted something like this in the first place to get rid of the overhead of checking the size and the possible runtime panic.
Also, "this is a small overhead, it doesn't matter" isn't a valid answer: technically I should be able to guarantee this at compile time, and even if the overhead is small, that doesn't mean Rust doesn't need this kind of feature or that I shouldn't look for an ideal way.
Note: Can this be solved with lifetimes?
Edit: Suppose we had a different syntax for fixed-size slices, such as arr[12;;16], and borrowing through it borrowed the whole arr. I think many functions (for example write_u32) could then be implemented in a more "rusty" way.
Use a let binding with a slice pattern. The slice_patterns feature was stabilized in Rust 1.42.
let v = [1, 2, 3]; // inferred [i32; 3]
let [_, ref subarray @ ..] = v; // subarray is &[i32; 2]
let a = v[0]; // 1
let b = subarray[1]; // 3
Here is a section from the Rust reference about slice patterns.
Why it doesn't work
What you want is not available in stable or nightly Rust because several const-related features are not stabilized yet, namely const generics and const traits. Traits are involved because arr[8..12] is a call to core::ops::Index<Range<usize>>, which returns a reference to a slice, in your case [u8]. That type is unsized and is not the same type as [u8; 4], even though the compiler could in principle work out the length here. Rust is built around safety and is sometimes overprotective to ensure it.
What can you do then?
You have a few routes you can take to solve this issue, I'll stay in a no_std environment for all this as that seems to be where you're working and will avoid extra crates.
Change the function signature
The current function signature takes the four u8s as an owned value. If you only need 4 values, you can instead take them as separate parameters to the function. This option breaks down when you need larger arrays, but at that point it would be better to take the array by reference or use the method below.
The most common way, and the best way in my opinion, is to take the array as a reference to a slice (&[u8] or &mut [u8]). This is not the same as taking a pointer to the value in C: slices in Rust also carry their length, so you can safely iterate through them without worrying about buffer overruns or whether you have read all the data. This does require changing the algorithms below to account for variable-sized input, but most of the time there is an equally good alternative.
The safe way
Slices can be converted to arrays using TryInto, but this comes at the cost of a runtime size check, which you seem to want to avoid. It is an option though, and the performance impact may be minimal.
Example:
fn f(arr: [u8;4]){
arr[0];
}
fn basic(){
let mut arr:[u8;12] = [0;12];
f(arr[8..12].try_into().unwrap());
}
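A variation on the same idea, in case it helps: the TryFrom conversion also works on references, so you can get a &mut [u8; 4] that points into the original buffer instead of copying the bytes out. This is only a sketch; write_tail is a made-up name, and write_u32_into is borrowed from the question with a little-endian layout assumed.

```rust
use core::convert::TryFrom;

/// Takes a fixed-size window, so the body needs no length check.
/// (Little-endian byte order is an assumption of this sketch.)
fn write_u32_into(window: &mut [u8; 4], num: u32) {
    *window = num.to_le_bytes();
}

/// Writes `num` into bytes 8..12 of `buf`.
fn write_tail(buf: &mut [u8; 12], num: u32) {
    // The length is checked once, here. With constant bounds the optimizer
    // can often remove the check, but that is not guaranteed.
    let window = <&mut [u8; 4]>::try_from(&mut buf[8..12])
        .expect("range 8..12 is exactly 4 bytes long");
    write_u32_into(window, num);
}
```

The single check sits at the boundary, and everything that takes &mut [u8; 4] after that point is free of length checks and panics.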
The unsafe way
If you're willing to leave the land of safety, there are quite a few things you can do to force the compiler to treat the data the way you want, but they can be abused. It's usually better to use Rust idioms than to force other methods in, but this is a valid option.
fn basic(){
let mut arr:[u8;12] = [0;12];
f(unsafe {*(arr[8..12].as_ptr() as *const [u8; 4])});
}
TL;DR
I recommend changing your types to use slices rather than arrays. If that's not feasible, I'd suggest avoiding unsafe code; the performance won't be as bad as you think.

Why are len() and is_empty() not defined in a trait?

Most patterns in Rust are captured by traits (Iterator, From, Borrow, etc.).
How come a pattern as pervasive as len/is_empty has no associated trait in the standard library? Would that cause problems which I do not foresee? Was it deemed useless? Or is it only that nobody thought of it (which seems unlikely)?
Was it deemed useless?
I would guess that's the reason.
What could you do with the knowledge that something is empty or has length 15? Pretty much nothing, unless you also have a way to access the elements of the collection for example. The trait that unifies collections is Iterator. In particular an iterator can tell you how many elements its underlying collection has, but it also does a lot more.
Also note that should you need an Empty trait, you can create one and implement it for all standard collections, unlike interfaces in most languages. This is the power of traits. This also means that the standard library doesn't need to provide small utility traits for every single use case, they can be provided by libraries!
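A rough sketch of what such a user-defined trait could look like. HasLen and describe are hypothetical names; nothing like this exists in the standard library.

```rust
use std::collections::VecDeque;

/// Hypothetical trait; the standard library has no equivalent.
trait HasLen {
    fn len(&self) -> usize;

    /// Default implementation, so implementors only need to provide len().
    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}

// Each impl forwards to the collection's inherent len() method.
impl<T> HasLen for Vec<T> {
    fn len(&self) -> usize {
        Vec::len(self)
    }
}

impl<T> HasLen for VecDeque<T> {
    fn len(&self) -> usize {
        VecDeque::len(self)
    }
}

impl HasLen for String {
    fn len(&self) -> usize {
        String::len(self)
    }
}

/// Works with any type that implements the trait.
fn describe(c: &impl HasLen) -> String {
    format!("len = {}, empty = {}", c.len(), c.is_empty())
}
```

Because the trait is defined in your own crate, you are allowed to implement it for standard library types like these.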
Just adding a late but perhaps useful answer here. Depending on what exactly you need, using the slice type might be a good option, rather than specifying a trait. Slices have len(), is_empty(), and other useful methods (full docs here). Consider the following:
use core::fmt::Display;
fn printme<T: Display>(x: &[T]) {
println!("length: {}, empty: {}", x.len(), x.is_empty());
for c in x {
print!("{}, ", c);
}
println!("\nDone!");
}
fn main() {
let s = "This is some string";
// Vector
let vv: Vec<char> = s.chars().collect();
printme(&vv);
// Array
let x = [1, 2, 3, 4];
printme(&x);
// Empty
let y:Vec<u8> = Vec::new();
printme(&y);
}
printme can accept either a vector or an array. Most other things that it accepts will need some massaging.
I think maybe the reason for there being no Length trait is that most functions will either a) work through an iterator without needing to know its length (with Iterator), or b) require len because they do some sort of random element access, in which case a slice would be the best bet. In the first case, knowing length may be helpful to pre-allocate memory of some size, but size_hint takes care of this when used for anything like Vec::with_capacity, or ExactSizeIterator for anything that needs specific allocations. Most other cases would probably need to be collected to a vector at some point within the function, which has its len.
Playground link to my example here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9a034c2e8b75775449afa110c05858e7
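To illustrate the pre-allocation point, here is a small sketch of both variants: one bounded on ExactSizeIterator, one that only uses the lower bound from size_hint(). The function names are made up for this example.

```rust
/// Pre-allocates from the exact length that ExactSizeIterator guarantees.
fn doubled<I>(items: I) -> Vec<u32>
where
    I: ExactSizeIterator<Item = u32>,
{
    let mut out = Vec::with_capacity(items.len());
    for item in items {
        out.push(item * 2);
    }
    out
}

/// Accepts any iterator and pre-allocates from the lower bound of size_hint().
fn doubled_hint(items: impl Iterator<Item = u32>) -> Vec<u32> {
    let (lower, _upper) = items.size_hint();
    let mut out = Vec::with_capacity(lower);
    out.extend(items.map(|x| x * 2));
    out
}
```

The second version also accepts iterators such as a filter(), whose length is not known up front; the hint is only a lower bound there, and the Vec simply grows if needed.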

How do I collect the values of a HashMap into a vector?

I can not find a way to collect the values of a HashMap into a Vec in the documentation. I have score_table: HashMap<Id, Score> and I want to get all the Scores into all_scores: Vec<Score>.
I was tempted to use the values method (all_scores = score_table.values()), but it does not work since values is not a Vec.
I know that Values implements the ExactSizeIterator trait, but I do not know how to collect all values of an iterator into a vector without manually writing a for loop and pushing the values in the vector one after one.
I also tried to use std::iter::FromIterator; but ended with something like:
all_scores = Vec::from_iter(score_table.values());
expected type `std::vec::Vec<Score>`
found type `std::vec::Vec<&Score>`
Thanks to Hash map macro refuses to type-check, failing with a misleading (and seemingly buggy) error message?, I changed it to:
all_scores = Vec::from_iter(score_table.values().cloned());
and it does not produce errors to cargo check.
Is this a good way to do it?
The method Iterator::collect is designed for this specific task. You're right that you need .cloned() if you want a vector of actual values instead of references (for types that implement Copy, like primitives, .copied() works too), so the code looks like this:
all_scores = score_table.values().cloned().collect();
Internally, collect() just uses FromIterator, but it also infers the type of the output. Sometimes there isn't enough information to infer the type, so you may need to explicitly specify the type you want, like so:
all_scores = score_table.values().cloned().collect::<Vec<Score>>();
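Put together as a complete function, this could look roughly like the sketch below. Id and Score here are stand-ins for the question's types, which were not shown.

```rust
use std::collections::HashMap;

// Stand-ins for the question's types; the real definitions were not shown.
type Id = u32;

#[derive(Debug, Clone, PartialEq)]
struct Score(u32);

/// Clones every value out of the map; the map stays usable afterwards.
/// The order of the result is unspecified, as with any HashMap iteration.
fn all_scores(score_table: &HashMap<Id, Score>) -> Vec<Score> {
    score_table.values().cloned().collect()
}
```

The return type is enough for collect() to infer Vec<Score> here, so no turbofish is needed.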
If you don't need score_table anymore, you can transfer the ownership of Score values to all_scores by:
let all_scores: Vec<Score> = score_table.into_iter()
.map(|(_id, score)| score)
.collect();
This approach will be faster and consume less memory than the clone approach by @apetranzilla. It also supports any struct, not only structs that implement Clone.
There are three useful methods on HashMaps, which all return iterators:
values() borrows the collection and returns references (&T).
values_mut() gives mutable references &mut T which is useful to modify elements of the collection without destroying score_table.
into_values() gives you the elements directly: T! The iterator takes ownership of all the elements. This means that score_table no longer owns them, so you can't use score_table anymore!
In your example, you call values() to get &T references, then convert them to owned values T via cloned().
Instead, if we have an iterator of owned values, then we can convert it to a Vec using Iterator::collect():
let all_scores: Vec<Score> = score_table.into_values().collect();
Sometimes, you may need to specify the collecting type:
let all_scores = score_table.into_values().collect::<Vec<Score>>();
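A short sketch contrasting values_mut() and into_values(). Plain integers stand in for the question's Id and Score types, and the function names are made up.

```rust
use std::collections::HashMap;

/// Doubles every score in place through values_mut(); the map is kept.
fn double_all(score_table: &mut HashMap<u32, u64>) {
    for score in score_table.values_mut() {
        *score *= 2;
    }
}

/// Consumes the map and moves the values out without cloning them.
fn take_scores(score_table: HashMap<u32, u64>) -> Vec<u64> {
    score_table.into_values().collect()
}
```

After take_scores(table) the compiler rejects any further use of table, which is the ownership transfer described above.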

When should I use direct access into a Rust Vec instead of the get method?

Rust supports two methods for accessing the elements of a vector:
let mut v = vec![1, 2, 3];
let first_element = &v[0];
let second_element = v.get(1);
The get() method returns an Option type, which seems like a useful safety feature. The C-like syntax &v[0] seems shorter to type, but gives up the safety benefits, since invalid reads cause a run-time error rather than producing an indication that the read was out of bounds.
It's not clear to me when I would want to use the direct access approach, because it seems like the only advantage is that it's quicker to type (I save 3 characters). Is there some other advantage (perhaps a speedup?) that I'm not seeing? I guess I would save the conditional of a match expression, but that doesn't seem like it offers much benefit compared to the costs.
Neither of them is quicker because they both do bounds checks. In fact, your question is quite generic because there are other pairs of methods where one of them panics while the other returns an option, such as String::reserve vs String::try_reserve.
If you are sure that you are in bounds, use the brackets version. This is only a syntactic shortcut for get().unwrap().
If you are unsure of this, use the get() method and do your check.
If you critically need maximum speed and you cannot use an iterator and you have determined through benchmarks that the indexing is the bottleneck and you are sure to be in bounds, you can use the get_unchecked() method. Be careful about this because it is unsafe: it is always better to not have any unsafe block in your code.
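The three options side by side, as a minimal sketch with made-up function names:

```rust
/// Returns the element at `idx`, or `fallback` if it is out of bounds.
/// Never panics.
fn get_or(v: &[i32], idx: usize, fallback: i32) -> i32 {
    match v.get(idx) {
        Some(&x) => x,
        None => fallback,
    }
}

/// Panics if `v` is empty; fine when the caller guarantees it is not.
fn first(v: &[i32]) -> i32 {
    v[0]
}

/// Sums the first `n` elements without a bounds check on each access.
fn sum_first_n(v: &[i32], n: usize) -> i32 {
    assert!(n <= v.len()); // one check up front instead of one per element
    let mut total = 0;
    for i in 0..n {
        // SAFETY: i < n and n <= v.len() by the assertion above.
        total += unsafe { *v.get_unchecked(i) };
    }
    total
}
```

In practice v[..n].iter().sum::<i32>() does the same job as the last function without any unsafe, which is why get_unchecked should stay a last resort.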
Just a little bit of advice: if you are concerned about your program's performance, avoid those methods and prefer iterators as much as you can. For example, the second loop below is typically faster than the first, because the first performs a bounds check on each of its one million accesses unless the optimizer can remove them:
let v: Vec<_> = (0..1_000_000).collect();
for idx in 0..1_000_000 {
// do something with v[idx]
}
for num in &v {
// do something with num
}
