How should I call Vec::with_capacity with an i32? - rust

I have a function which allocates a vector on the stack. This code doesn't work:
fn my_func(n: i32) {
let mut v = Vec::with_capacity(n);
}
The compiler says n needs to be a usize. I suppose that makes sense from a type safety point of view, but I need to use n in other calculations where an i32 is called for. What's the proper way to handle this?

Cast to usize.
let n: i32 = 4;
let v = Vec::<i16>::with_capacity(n as usize);
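Keep in mind that `as` does not check the value: a negative i32 wraps to a very large usize, and with_capacity would then typically panic with a capacity overflow. A minimal sketch of a checked alternative, assuming a hypothetical helper named make_buffer and that a negative n should be treated as "no buffer":

```rust
// In the 2021 edition TryFrom is already in the prelude; the import is
// only needed on older editions.
use std::convert::TryFrom;

// Hypothetical helper: builds an empty Vec with room for `n` items,
// rejecting negative counts instead of letting `as` wrap them.
fn make_buffer(n: i32) -> Option<Vec<i16>> {
    // usize::try_from fails for negative values (and for values that do not fit).
    let cap = usize::try_from(n).ok()?;
    Some(Vec::with_capacity(cap))
}

fn main() {
    let v = make_buffer(4).expect("non-negative capacity");
    assert!(v.capacity() >= 4);
    assert!(make_buffer(-1).is_none());
    // For comparison, `as` turns -1 into usize::MAX.
    assert_eq!(-1i32 as usize, usize::MAX);
}
```

If n is known to be non-negative at the call site, the plain `n as usize` cast is enough.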

Related

Datatype for indexing vector

I just started to learn Rust. I understand that Rust's for loop indices and vector indices must be of type usize, hence I have written the following code. The computation of j within the for loop requires i to be type u32, so I convert it. Now, I have to change the type of i and j again to get the vector items.
I would like to avoid this constant back-and-forth conversion, is there an alternate way to do this in Rust? Thank you for your help.
fn compute(dots: Vec, N: u32) -> f32 {
let mut j: u32;
let mut value: f32 = 0.0;
for i in 0..N {
j = (i as u32 + 1) % N;
value += dots[i as usize].a * dots[j as usize].b;
value -= dots[i as usize].b * dots[j as usize].a;
}
return value
}
Either change the function signature to use N: usize, or, if you can't do that, just let M = N as usize and loop over 0..M (the loop variable will then have type usize).
Be aware that in real code, you need to be sure that usize is at least as wide as u32 if you opt for the conversion. If you cannot assure that, use try_into instead of as to convert.
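The advice above can be sketched as follows. The Dot struct and its f32 fields are assumptions (the question does not show the element type), and the slice parameter is a choice made for this sketch rather than the asker's signature:

```rust
use std::convert::TryInto;

// Hypothetical element type; the question does not show the real one.
struct Dot {
    a: f32,
    b: f32,
}

fn compute(dots: &[Dot], n: u32) -> f32 {
    // Convert once, up front; try_into fails if the u32 does not fit in usize.
    let m: usize = n.try_into().expect("n does not fit in usize");
    let mut value = 0.0;
    // i and j are usize from here on, so no casts are needed when indexing.
    for i in 0..m {
        let j = (i + 1) % m;
        value += dots[i].a * dots[j].b;
        value -= dots[i].b * dots[j].a;
    }
    value
}

fn main() {
    let dots = vec![
        Dot { a: 0.0, b: 0.0 },
        Dot { a: 1.0, b: 0.0 },
        Dot { a: 0.0, b: 1.0 },
    ];
    println!("{}", compute(&dots, 3));
}
```

The only conversion left is the single one at the top of the function; changing the parameter to n: usize would remove even that.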

Using an iterator as an argument to a function multiple times from one vector

I'm trying to write some Rust code to decode GPS data from an SDR receiver. I'm reading samples in from a file and converting the binary data to a series of complex numbers, which is a time-consuming process. However, there are times when I want to stream samples in without keeping them in memory (e.g. one very large file processed only one way or samples directly from the receiver) and other times when I want to keep the whole data set in memory (e.g. one small file processed in multiple different ways) to avoid repeating the work of parsing the binary file.
Therefore, I want to write functions or structs with iterators to be as general as possible, but I know they aren't sized, so I need to put them in a Box. I would have expected something like this to work.
This is the simplest example I could come up with to demonstrate the same basic problem.
fn sum_squares_plus(iter: Box<Iterator<Item = usize>>, x: usize) -> usize {
let mut ans: usize = 0;
for i in iter {
ans += i * i;
}
ans + x
}
fn main() {
// Pretend this is an expensive operation that I don't want to repeat five times
let small_data: Vec<usize> = (0..10).collect();
for x in 0..5 {
// Want to iterate over immutable references to the elements of small_data
let iterbox: Box<Iterator<Item = usize>> = Box::new(small_data.iter());
println!("{}: {}", x, sum_squares_plus(iterbox, x));
}
// 0..100 is more than 0..10 and I'm only using it once,
// so I want to 'stream' it instead of storing it all in memory
let x = 55;
println!("{}: {}", x, sum_squares_plus(Box::new(0..100), x));
}
I've tried several different variants of this, but none seem to work. In this particular case, I'm getting
error[E0271]: type mismatch resolving `<std::slice::Iter<'_, usize> as std::iter::Iterator>::Item == usize`
--> src/main.rs:15:52
|
15 | let iterbox: Box<Iterator<Item = usize>> = Box::new(small_data.iter());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found usize
|
= note: expected type `&usize`
found type `usize`
= note: required for the cast to the object type `dyn std::iter::Iterator<Item = usize>`
I'm not worried about concurrency and I'd be happy to just get it working sequentially on a single thread, but a concurrent solution would be a nice bonus.
The current error you're running into is here:
let iterbox:Box<Iterator<Item = usize>> = Box::new(small_data.iter());
You're declaring that you want an iterator that returns usize items, but small_data.iter() is an iterator that returns references to usize items (&usize). That's why you get the error "expected reference, found usize". usize is a small, cloneable type, so you can simply use the .cloned() iterator adapter to get an iterator that actually returns usize.
let iterbox: Box<Iterator<Item = usize>> = Box::new(small_data.iter().cloned());
Once you're past that hurdle, the next problem is that the iterator returned over small_data contains a reference to the small_data. Since sum_squares_plus is defined to accept a Box<Iterator<Item = usize>>, it's implied in that signature that the Iterator trait object within the box has a 'static lifetime. The iterator you're providing does not because it borrows small_data. To fix that you need to adjust the sum_squares_plus definition to
fn sum_squares_plus<'a>(iter: Box<Iterator<Item = usize> + 'a>, x: usize) -> usize
Note the 'a lifetime annotations. The code should then compile, but unless there are constraints beyond what is shown here, a more idiomatic and efficient approach is to avoid trait objects and their associated allocations. The code below should work using static dispatch, without any trait objects.
fn sum_squares_plus<I: Iterator<Item = usize>>(iter: I, x: usize) -> usize {
let mut ans: usize = 0;
for i in iter {
ans += i * i;
}
ans + x
}
fn main() {
// Pretend this is an expensive operation that I don't want to repeat five times
let small_data: Vec<usize> = (0..10).collect();
for x in 0..5 {
println!("{}: {}", x, sum_squares_plus(small_data.iter().cloned(), x));
}
// 0..100 is more than 0..10 and I'm only using it once,
// so I want to 'stream' it instead of storing it all in memory
let x = 55;
println!("{}: {}", x, sum_squares_plus(0..100, x));
}
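If a trait object is still wanted (for example to store the iterator in a struct field), the boxed signature with the explicit lifetime described earlier can be sketched like this. It uses the `dyn` keyword, which current Rust editions require for trait objects; apart from that and the `'_` in the local annotation, the names follow the question:

```rust
// Trait-object version; `'a` lets the boxed iterator borrow from data
// that is not 'static.
fn sum_squares_plus<'a>(iter: Box<dyn Iterator<Item = usize> + 'a>, x: usize) -> usize {
    let mut ans: usize = 0;
    for i in iter {
        ans += i * i;
    }
    ans + x
}

fn main() {
    let small_data: Vec<usize> = (0..10).collect();
    for x in 0..5 {
        // The boxed iterator borrows small_data, so it is not 'static.
        let iterbox: Box<dyn Iterator<Item = usize> + '_> =
            Box::new(small_data.iter().cloned());
        println!("{}: {}", x, sum_squares_plus(iterbox, x));
    }
    // A range owns its state, so it can be boxed and streamed directly.
    let x = 55;
    println!("{}: {}", x, sum_squares_plus(Box::new(0..100usize), x));
}
```

Each call here allocates one Box, which is the cost the static-dispatch version avoids.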

How to translate "x-y" to vec![x, x+1, … y-1, y]?

This solution seems rather inelegant:
fn parse_range(&self, string_value: &str) -> Vec<u8> {
let values: Vec<u8> = string_value
.splitn(2, "-")
.map(|part| part.parse().ok().unwrap())
.collect();
{ values[0]..(values[1] + 1) }.collect()
}
Since splitn(2, "-") returns exactly two results for any valid string_value, it would be better to assign the tuple directly to two variables first and last rather than a seemingly arbitrary-length Vec. I can't seem to do this with a tuple.
There are two instances of collect(), and I wonder if it can be reduced to one (or even zero).
Trivial implementation
fn parse_range(string_value: &str) -> Vec<u8> {
let pos = string_value.find('-').expect("No valid string");
let (first, second) = string_value.split_at(pos);
let first: u8 = first.parse().expect("Not a number");
let second: u8 = second[1..].parse().expect("Not a number");
{ first..second + 1 }.collect()
}
I would recommend returning a Result<Vec<u8>, Error> instead of panicking with expect/unwrap.
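A minimal sketch of that Result-returning variant. The RangeError enum is a hypothetical error type invented for this example (the answer does not prescribe one), and the inclusive range syntax assumes Rust 1.26 or later:

```rust
use std::num::ParseIntError;

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum RangeError {
    MissingDash,
    BadNumber(ParseIntError),
}

fn parse_range(string_value: &str) -> Result<Vec<u8>, RangeError> {
    let pos = string_value.find('-').ok_or(RangeError::MissingDash)?;
    let (first, second) = string_value.split_at(pos);
    let first: u8 = first.parse().map_err(RangeError::BadNumber)?;
    // Skip the '-' that split_at leaves at the start of the second half.
    let second: u8 = second[1..].parse().map_err(RangeError::BadNumber)?;
    // An inclusive range avoids the overflow that `second + 1` hits at 255.
    Ok((first..=second).collect())
}

fn main() {
    println!("{:?}", parse_range("3-7"));
    println!("{:?}", parse_range("37"));
    println!("{:?}", parse_range("3-x"));
}
```

The caller can now decide whether a malformed string is fatal, instead of the function panicking on its behalf.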
Nightly implementation
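My next thought was about the second collect. Here is a code example that needs no collect at all. It required nightly when it was written; impl Trait in return position and the ..= syntax have both been stable since Rust 1.26, so on a current compiler the feature attribute below should be dropped.
#![feature(conservative_impl_trait, inclusive_range_syntax)]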
fn parse_range(string_value: &str) -> impl Iterator<Item = u8> {
let pos = string_value.find('-').expect("No valid string");
let (first, second) = string_value.split_at(pos);
let first: u8 = first.parse().expect("Not a number");
let second: u8 = second[1..].parse().expect("Not a number");
first..=second
}
fn main() {
println!("{:?}", parse_range("3-7").collect::<Vec<u8>>());
}
Instead of calling collect the first time, just advance the iterator:
let mut values = string_value
.splitn(2, "-")
.map(|part| part.parse().unwrap());
let start = values.next().unwrap();
let end = values.next().unwrap();
Do not call .ok().unwrap() — that converts the Result with useful error information to an Option, which has no information. Just call unwrap directly on the Result.
As already mentioned, if you want to return a Vec, you'll want to call collect to create it. If you want to return an iterator, you can. It's not bad even in stable Rust:
fn parse_range(string_value: &str) -> std::ops::Range<u8> {
let mut values = string_value
.splitn(2, "-")
.map(|part| part.parse().unwrap());
let start = values.next().unwrap();
let end = values.next().unwrap();
start..end + 1
}
fn main() {
assert!(parse_range("1-5").eq(1..6));
}
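When this answer was written, inclusive ranges were not yet stable, so you had to use + 1 or switch to nightly. Since Rust 1.26 you can write start..=end on stable, with the return type changed to std::ops::RangeInclusive<u8>.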
Since splitn(2, "-") returns exactly two results for any valid string_value, it would be better to assign the tuple directly to two variables first and last rather than a seemingly arbitrary-length Vec. I can't seem to do this with a tuple.
This is not possible with Rust's type system. You are asking for dependent types, a way for runtime values to interact with the type system. You'd want splitn to return a (&str, &str) for a value of 2 and a (&str, &str, &str) for a value of 3. That gets even more complicated when the argument is a variable, especially when it's set at run time.
The closest workaround would be to have a runtime check that there are no more values:
assert!(values.next().is_none());
Such a check doesn't feel valuable to me.
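The tuple cannot come from splitn, but for the fixed two-part case the standard library has str::split_once, which returns an Option holding both halves. It was stabilized in Rust 1.52, after this answer was written. A minimal sketch, where parse_bounds is a hypothetical helper name:

```rust
// Hypothetical helper: returns None when there is no '-' or when either
// side is not a valid u8.
fn parse_bounds(string_value: &str) -> Option<(u8, u8)> {
    // split_once splits at the first '-' and yields both halves as a tuple,
    // so they can be destructured directly into two variables.
    let (first, last) = string_value.split_once('-')?;
    Some((first.parse().ok()?, last.parse().ok()?))
}

fn main() {
    println!("{:?}", parse_bounds("1-5"));
    println!("{:?}", parse_bounds("15"));
}
```

Because the split happens only at the first dash, an input such as "1-5-9" leaves "5-9" as the second half, which then fails to parse instead of being silently truncated.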
See also:
What is the correct way to return an Iterator (or any other trait)?
How do I include the end value in a range?

How to allow function to work with integers or floats?

I found a function to compute a mean and have been playing with it. The code snippet below runs, but if the data inside the input changes from a float to an int an error occurs. How do I get this to work with floats and integers?
use std::borrow::Borrow;
fn mean(arr: &mut [f64]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for num in arr {
i += 1.0;
mean += (num.borrow() - mean) / i;
}
mean
}
fn main() {
let val = mean(&mut vec![4.0, 5.0, 3.0, 2.0]);
println!("The mean is {}", val);
}
The code in the question doesn't compile because f64 does not have a borrow() method. Also, the slice it accepts doesn't need to be mutable since we are not changing it. Here is a modified version that compiles and works:
fn mean(arr: &[f64]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for &num in arr {
i += 1.0;
mean += (num - mean) / i;
}
mean
}
We specify &num when looping over arr, so that the type of num is f64 rather than a reference to f64. This snippet would work either way, but omitting the & would break the generic version.
For the same function to accept floats and integers alike, its parameter needs to be generic. Ideally we'd like it to accept any type that can be converted into f64, including f32 or user-defined types that define such a conversion. Something like this:
fn mean<T>(arr: &[T]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for &num in arr {
i += 1.0;
mean += (num as f64 - mean) / i;
}
mean
}
This doesn't compile because x as f64 is not defined for x of an arbitrary type. Instead, we need a trait bound on T that defines a way to convert T values to f64. This is exactly the purpose of the Into trait; every type T that implements Into<U> defines an into(self) -> U method. Specifying T: Into<f64> as the trait bound gives us the into() method that returns an f64.
We also need to request T to be Copy, to prevent reading the value from the array to "consume" the value, i.e. attempt moving it out of the array. Since primitive numbers such as integers implement Copy, this is ok for us. Working code then looks like this:
fn mean<T: Into<f64> + Copy>(arr: &[T]) -> f64 {
let mut i = 0.0;
let mut mean = 0.0;
for &num in arr {
i += 1.0;
mean += (num.into() - mean) / i;
}
mean
}
fn main() {
let val1 = mean(&vec![4.0, 5.0, 3.0, 2.0]);
let val2 = mean(&vec![4, 5, 3, 2]);
println!("The means are {} and {}", val1, val2);
}
Note that this will only work for types that define lossless conversion to f64. Thus it will work for u32, i32 (as in the above example) and smaller integer types, but it won't accept for example a vector of i64 or u64, which cannot be losslessly converted to f64.
Also note that this problem lends nicely to functional programming idioms such as enumerate() and fold(). Although outside the scope of this already longish answer, writing out such an implementation is an exercise hard to resist.
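One possible shape for that exercise, as a sketch rather than the answer's own code. It keeps the same running-mean update and the same Into<f64> + Copy bound, and it returns 0.0 for an empty slice, which is a choice made here:

```rust
fn mean<T: Into<f64> + Copy>(arr: &[T]) -> f64 {
    arr.iter()
        .enumerate()
        .fold(0.0, |mean: f64, (i, &num)| {
            let x: f64 = num.into();
            // i counts from 0, so i + 1 is the number of items seen so far.
            mean + (x - mean) / (i + 1) as f64
        })
}

fn main() {
    let val1 = mean(&[4.0f64, 5.0, 3.0, 2.0]);
    let val2 = mean(&[4i32, 5, 3, 2]);
    println!("The means are {} and {}", val1, val2);
}
```

The accumulator passed through fold plays the role of the mutable mean variable, and enumerate replaces the manually incremented counter.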

What do the ampersand '&' and star '*' symbols mean in Rust?

Despite thoroughly reading the documentation, I'm rather confused about the meaning of the & and * symbol in Rust, and more generally about what is a Rust reference exactly.
In this example, it seems to be similar to a C++ reference (that is, an address that is automatically dereferenced when used):
fn main() {
let c: i32 = 5;
let rc = &c;
let next = rc + 1;
println!("{}", next); // 6
}
However, the following code works exactly the same:
fn main() {
let c: i32 = 5;
let rc = &c;
let next = *rc + 1;
println!("{}", next); // 6
}
Using * to dereference a reference wouldn't be correct in C++. So I'd like to understand why this is correct in Rust.
My understanding so far, is that, inserting * in front of a Rust reference dereferences it, but the * is implicitly inserted anyway so you don't need to add it (while in C++, it's implicitly inserted and if you insert it you get a compilation error).
However, something like this doesn't compile:
fn main() {
let mut c: i32 = 5;
let mut next: i32 = 0;
{
let rc = &mut c;
next = rc + 1;
}
println!("{}", next);
}
error[E0369]: binary operation `+` cannot be applied to type `&mut i32`
--> src/main.rs:6:16
|
6 | next = rc + 1;
| ^^^^^^
|
= note: this is a reference to a type that `+` can be applied to; you need to dereference this variable once for this operation to work
= note: an implementation of `std::ops::Add` might be missing for `&mut i32`
But this works:
fn main() {
let mut c: i32 = 5;
let mut next: i32 = 0;
{
let rc = &mut c;
next = *rc + 1;
}
println!("{}", next); // 6
}
It seems that implicit dereferencing (a la C++) is correct for immutable references, but not for mutable references. Why is this?
Using * to dereference a reference wouldn't be correct in C++. So I'd like to understand why this is correct in Rust.
A reference in C++ is not the same as a reference in Rust. Rust's references are much closer (in usage, not in semantics) to C++'s pointers. With respect to memory representation, Rust's references often are just a single pointer, while C++'s references are supposed to be alternative names of the same object (and thus have no memory representation).
The difference between C++ pointers and Rust references is that Rust's references are never NULL, never uninitialized and never dangling.
The Add trait is implemented (see the bottom of the doc page) for the following pairs and all other numeric primitives:
&i32 + i32
i32 + &i32
&i32 + &i32
This is just a convenience thing the std-lib developers implemented. The compiler can figure out that a &mut i32 can be used wherever a &i32 can be used, but that doesn't work (yet?) for generics, so the std-lib developers would need to also implement the Add traits for the following combinations (and those for all primitives):
&mut i32 + i32
i32 + &mut i32
&mut i32 + &mut i32
&mut i32 + &i32
&i32 + &mut i32
As you can see that can get quite out of hand. I'm sure that will go away in the future. Until then, note that it's rather rare to end up with a &mut i32 and trying to use it in a mathematical expression.
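In the meantime there are two simple ways to do arithmetic through a &mut i32, sketched below with a hypothetical helper named plus_one_both_ways: dereference explicitly, or reborrow as a shared reference so that the existing &i32 + i32 implementation applies.

```rust
// Hypothetical helper showing two ways to add to the value behind a &mut i32.
fn plus_one_both_ways(rc: &mut i32) -> (i32, i32) {
    // Explicit dereference: reads the i32 behind the mutable reference.
    let a = *rc + 1;
    // Reborrow as a shared reference, for which `&i32 + i32` is implemented.
    let b = &*rc + 1;
    (a, b)
}

fn main() {
    let mut c: i32 = 5;
    let (a, b) = plus_one_both_ways(&mut c);
    println!("{} {}", a, b);
}
```

Both expressions produce the same value; the explicit `*rc` form is the one usually seen in practice.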
This answer is for those looking for the basics (e.g. coming from Google).
From the Rust book's References and Borrowing:
fn main() {
let s1 = String::from("hello");
let len = calculate_length(&s1);
println!("The length of '{}' is {}.", s1, len);
}
fn calculate_length(s: &String) -> usize {
s.len()
}
These ampersands represent references, and they allow you to refer to some value without taking ownership of it [i.e. borrowing].
The opposite of referencing by using & is dereferencing, which is accomplished with the dereference operator, *.
And a basic example:
let x = 5;
let y = &x; //set y to a reference to x
assert_eq!(5, x);
assert_eq!(5, *y); // dereference y
If we tried to write assert_eq!(5, y); instead, we would get the compilation error: can't compare `{integer}` with `&{integer}`.
(You can read more in the Smart Pointers chapter.)
And from Method Syntax:
Rust has a feature called automatic referencing and dereferencing. Calling methods is one of the few places in Rust that has this behavior.
Here’s how it works: when you call a method with object.something(), Rust automatically adds in &, &mut, or * so object matches the signature of the method. In other words, the following are the same:
p1.distance(&p2);
(&p1).distance(&p2);
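To make the quoted passage concrete, here is a small self-contained sketch. The Point struct and its distance method are assumptions modeled on the book's example, not code taken from it:

```rust
// Hypothetical Point type, modeled on the book's example.
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Takes `&self`, so the receiver must end up as a `&Point`.
    fn distance(&self, other: &Point) -> f64 {
        ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt()
    }
}

fn main() {
    let p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };
    let rr = &&p1;

    // Auto-referencing: the compiler inserts the `&` on the receiver.
    let d1 = p1.distance(&p2);
    // The same call with the reference written out.
    let d2 = (&p1).distance(&p2);
    // Auto-dereferencing: a `&&Point` receiver is dereferenced as needed.
    let d3 = rr.distance(&p2);

    println!("{} {} {}", d1, d2, d3);
}
```

Note that the adjustment applies only to the receiver: the argument still has to be written as &p2.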
From the docs for std::ops::Add:
impl<'a, 'b> Add<&'a i32> for &'b i32
impl<'a> Add<&'a i32> for i32
impl<'a> Add<i32> for &'a i32
impl Add<i32> for i32
It seems the binary + operator for numbers is implemented for combinations of shared (but not mutable) references of the operands and owned versions of the operands. It has nothing to do with automatic dereferencing.
