Returning reference from a function in Rust - rust

The following program is meant to return a reference to the largest value in a given slice. I used generics for this, but it does not work.
fn largest<T : PartialOrd>(vec : &[T]) -> &T{
let mut biggest = vec[0];
for &item in vec{
if item > biggest{
biggest = item
}
}
&biggest
}
I know I am returning a reference to a local variable, so it won't compile. The other solution is to use the Copy trait, like this:
fn largest<T : PartialOrd + Copy>(vec : &[T]) -> T{}
Is there any way to return the reference and avoid using the Copy trait?

This is probably what you want:
fn largest<T: PartialOrd>(vec: &[T]) -> &T {
    let mut biggest = &vec[0];
    for item in vec {
        if item > biggest {
            biggest = item;
        }
    }
    biggest
}
biggest is a mutable binding of type &T (declared with let mut), so the reference can be rebound later at the line biggest = item.
By doing it this way, at no point in the code will you be making copies of T, and so will be returning a reference to one of the elements of the slice.
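To check that no Copy bound is needed, here is a minimal sketch that can be called on a slice of String, which is not Copy. The function largest is the one from the answer; largest_via_fold is a hypothetical name for an alternative that writes the same comparison with iterators. Both panic on an empty slice because of &vec[0].

```rust
// Same function as in the answer: `biggest` is a `&T` that gets rebound, never copied.
fn largest<T: PartialOrd>(vec: &[T]) -> &T {
    let mut biggest = &vec[0]; // panics if the slice is empty
    for item in vec {
        if item > biggest {
            biggest = item;
        }
    }
    biggest
}

// Hypothetical alternative (not from the answer): the same comparison written with `fold`.
fn largest_via_fold<T: PartialOrd>(vec: &[T]) -> &T {
    vec.iter()
        .fold(&vec[0], |best, item| if item > best { item } else { best })
}
```

Because the return value borrows from the input slice, the slice must outlive the returned reference; the compiler enforces this through lifetime elision on the signature.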

Related

`fold` values into a HashMap

After reading this article Learning Programming Concepts by Jumping in at the Deep End I can't seem to understand how exactly fold() is working in this context. Mainly how fold() knows to grab the word variable from split().
Here's the example:
use std::collections::HashMap;
fn count_words(text: &str) -> HashMap<&str, usize> {
text.split(' ').fold(
HashMap::new(),
|mut map, word| { *map.entry(word).or_insert(0) += 1; map }
)
}
Rust docs say:
fold() takes two arguments: an initial value, and a closure with two arguments: an ‘accumulator’, and an element. The closure returns the value that the accumulator should have for the next iteration.
Iterator - fold
So I get that mut map is the accumulator, and I get that split() returns an iterator and therefore fold() is iterating over those values, but how does fold know to grab that value? It's being implicitly passed, but I can't seem to wrap my head around this. How is that being mapped to the word variable?
Not sure if I have the right mental model for this...
Thanks!
but how does fold know to grab that value?
fold() is a method on the iterator. That means that it has access to self which is the actual iterator, so it can call self.next() to get the next item (in this case the word, since self is of type Split, so its next() does get the next word). You could imagine fold() being implemented with the following pseudocode:
fn fold<B, F>(mut self, init: B, mut f: F) -> B
where
    Self: Sized,
    F: FnMut(B, Self::Item) -> B,
{
    let mut accum = init;
    while let Some(x) = self.next() {
        accum = f(accum, x);
    }
    accum
}
Ok, the above is not pseudocode, it's the actual implementation.
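To make the hand-off from the iterator to the closure visible, here is a sketch that moves the same loop into a free function. The name my_fold is hypothetical; it takes the iterator as an explicit parameter instead of self, so it is easier to see that each word comes from calling next() on the Split iterator.

```rust
use std::collections::HashMap;

// Hypothetical free function mirroring `Iterator::fold`: the iterator is an
// explicit parameter here instead of `self`.
fn my_fold<I, B, F>(mut iter: I, init: B, mut f: F) -> B
where
    I: Iterator,
    F: FnMut(B, I::Item) -> B,
{
    let mut accum = init;
    // Each call to `next()` yields one item; for `split(' ')` that is one word.
    while let Some(x) = iter.next() {
        accum = f(accum, x);
    }
    accum
}

fn count_words(text: &str) -> HashMap<&str, usize> {
    my_fold(
        text.split(' '),
        HashMap::<&str, usize>::new(),
        |mut map, word| {
            *map.entry(word).or_insert(0) += 1;
            map // hand the updated accumulator back for the next iteration
        },
    )
}
```

The closure never asks for a word itself; the loop pulls one from the iterator and passes it in as the second argument.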

What does this Rust Closure argument syntax mean?

I modified code found on the internet to create a function that obtains the statistical mode of any Hashable type that implements Eq, but I do not understand some of the syntax. Here is the function:
use std::hash::Hash;
use std::collections::HashMap;
pub fn mode<'a, I, T>(items: I) -> &'a T
where I: IntoIterator<Item = &'a T>,
T: Hash + Clone + Eq, {
let mut occurrences: HashMap<&T, usize> = HashMap::new();
for value in items.into_iter() {
*occurrences.entry(value).or_insert(0) += 1;
}
occurrences
.into_iter()
.max_by_key(|&(_, count)| count)
.map(|(val, _)| val)
.expect("Cannot compute the mode of zero items")
}
(I think requiring Clone may be overkill.)
The syntax I do not understand is in the closure passed to max_by_key:
|&(_, count)| count
What is the &(_, count) doing? I gather the underscore means I can ignore that parameter. Is this some sort of destructuring of a tuple in a parameter list? Does this make count take the reference of the tuple's second item?
.max_by_key(|&(_, count)| count) is equivalent to .max_by_key(f) where f is this:
fn f<T>(t: &(T, usize)) -> usize {
(*t).1
}
f() could also be written using pattern matching, like this:
fn f2<T>(&(_, count): &(T, usize)) -> usize {
count
}
And f2() is much closer to the first closure you're asking about.
The second closure is essentially the same, except there is no reference slightly complicating matters.
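The two forms can be compared side by side in a small sketch. The functions f and f2 are the ones from this answer; most_common is a hypothetical helper added only to show the closure in context.

```rust
use std::collections::HashMap;

// Explicit dereference: take the tuple by reference and read its second field.
fn f<T>(t: &(T, usize)) -> usize {
    (*t).1
}

// Pattern in the parameter list: `&(_, count)` matches a reference to a tuple,
// ignores the first field and copies the second (`usize` is `Copy`).
fn f2<T>(&(_, count): &(T, usize)) -> usize {
    count
}

// Hypothetical helper (not from the question) that uses the closure form.
fn most_common(counts: HashMap<&str, usize>) -> Option<&str> {
    counts
        .into_iter()
        .max_by_key(|&(_, count)| count) // key is the count, copied out of the tuple
        .map(|(val, _)| val) // the item is passed by value here, so no `&` in the pattern
}
```

So count is not a reference to the tuple's second item; it is a copy of it, which works because usize is Copy.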

Why does std::vec::Vec implement two kinds of the Extend trait?

The struct std::vec::Vec implements two kinds of Extend, as specified here – impl<'a, T> Extend<&'a T> for Vec<T> and impl<T> Extend<T> for Vec<T>. The documentation states that the first kind is an "Extend implementation that copies elements out of references before pushing them onto the Vec". I'm rather new to Rust, and I'm not sure if I'm understanding it correctly.
I would guess that the first kind is used with the equivalent of C++ normal iterators, and the second kind is used with the equivalent of C++ move iterators.
I'm trying to write a function that accepts any data structure that will allow inserting i32s to the back, so I take a parameter that implements both kinds of Extend, but I can't figure out how to specify the generic parameters to get it to work:
fn main() {
let mut vec = std::vec::Vec::<i32>::new();
add_stuff(&mut vec);
}
fn add_stuff<'a, Rec: std::iter::Extend<i32> + std::iter::Extend<&'a i32>>(receiver: &mut Rec) {
let x = 1 + 4;
receiver.extend(&[x]);
}
The compiler complains that &[x] "creates a temporary which is freed while still in use" which makes sense because 'a comes from outside the function add_stuff. But of course what I want is for receiver.extend(&[x]) to copy the element out of the temporary array slice and add it to the end of the container, so the temporary array will no longer be used after receiver.extend returns. What is the proper way to express what I want?
From the outside of add_stuff, Rec must be able to be extended with a reference whose lifetime is chosen inside add_stuff. Thus, you could require that Rec can be extended with references of any lifetime, using higher-ranked trait bounds:
fn main() {
let mut vec = std::vec::Vec::<i32>::new();
add_stuff(&mut vec);
}
fn add_stuff<Rec>(receiver: &mut Rec)
where
for<'a> Rec: std::iter::Extend<&'a i32>
{
let x = 1 + 4;
receiver.extend(&[x]);
}
Moreover, as you see, the trait bounds were overly tight. One of them should be enough if you use receiver consistently within add_stuff.
That said, I would simply require Extend<i32> and make sure that add_stuff does the right thing internally (if possible):
fn add_stuff<Rec>(receiver: &mut Rec)
where
Rec: std::iter::Extend<i32>
{
let x = 1 + 4;
receiver.extend(std::iter::once(x));
}
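As a sketch of why the Extend<i32> bound alone is usually enough, the same function can be called with several standard collections. The variant add_from_slice is hypothetical; it assumes the values already sit in a slice and uses copied() to turn an iterator of &i32 into an iterator of i32.

```rust
// Same function as above: only `Extend<i32>` is required.
fn add_stuff<Rec>(receiver: &mut Rec)
where
    Rec: Extend<i32>,
{
    let x = 1 + 4;
    // `once(x)` yields `x` by value, so no borrow has to outlive this call.
    receiver.extend(std::iter::once(x));
}

// Hypothetical variant: copy the values out of a borrowed slice before extending.
fn add_from_slice<Rec>(receiver: &mut Rec, values: &[i32])
where
    Rec: Extend<i32>,
{
    receiver.extend(values.iter().copied());
}
```

Vec<i32>, VecDeque<i32> and BTreeSet<i32> all implement Extend<i32>, so each can be passed as the receiver.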

How to translate "x-y" to vec![x, x+1, … y-1, y]?

This solution seems rather inelegant:
fn parse_range(&self, string_value: &str) -> Vec<u8> {
let values: Vec<u8> = string_value
.splitn(2, "-")
.map(|part| part.parse().ok().unwrap())
.collect();
{ values[0]..(values[1] + 1) }.collect()
}
Since splitn(2, "-") returns exactly two results for any valid string_value, it would be better to assign the tuple directly to two variables first and last rather than a seemingly arbitrary-length Vec. I can't seem to do this with a tuple.
There are two instances of collect(), and I wonder if it can be reduced to one (or even zero).
Trivial implementation
fn parse_range(string_value: &str) -> Vec<u8> {
let pos = string_value.find(|c| c == '-').expect("No valid string");
let (first, second) = string_value.split_at(pos);
let first: u8 = first.parse().expect("Not a number");
let second: u8 = second[1..].parse().expect("Not a number");
{ first..second + 1 }.collect()
}
I would recommend returning a Result<Vec<u8>, Error> instead of panicking with expect/unwrap.
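A minimal sketch of that recommendation follows. It assumes a recent stable toolchain (str::split_once needs Rust 1.52 or later), and the error type RangeError is hypothetical, since the answer does not say which Error type to use.

```rust
use std::num::ParseIntError;

// Hypothetical error type for this sketch; the answer only names `Error`.
#[derive(Debug, PartialEq)]
enum RangeError {
    MissingDash,
    NotANumber(ParseIntError),
}

fn parse_range(string_value: &str) -> Result<Vec<u8>, RangeError> {
    // `split_once` returns the two halves as a tuple, without the separator.
    let (first, second) = string_value
        .split_once('-')
        .ok_or(RangeError::MissingDash)?;
    let first: u8 = first.parse().map_err(RangeError::NotANumber)?;
    let second: u8 = second.parse().map_err(RangeError::NotANumber)?;
    // An inclusive range avoids the overflow that `second + 1` would hit at 255.
    Ok((first..=second).collect())
}
```

Callers can then decide how to react to bad input instead of the function panicking; for example, parse_range("37") returns Err(RangeError::MissingDash).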
Nightly implementation
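My next thought was about the second collect. Here is a code example that needs no collect at all. It required nightly when it was written; both features have been stable since Rust 1.26, so on current toolchains the feature attribute can be dropped.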
#![feature(conservative_impl_trait, inclusive_range_syntax)]
fn parse_range(string_value: &str) -> impl Iterator<Item = u8> {
let pos = string_value.find(|c| c == '-').expect("No valid string");
let (first, second) = string_value.split_at(pos);
let first: u8 = first.parse().expect("Not a number");
let second: u8 = second[1..].parse().expect("Not a number");
first..=second
}
fn main() {
println!("{:?}", parse_range("3-7").collect::<Vec<u8>>());
}
Instead of calling collect the first time, just advance the iterator:
let mut values = string_value
.splitn(2, "-")
.map(|part| part.parse().unwrap());
let start = values.next().unwrap();
let end = values.next().unwrap();
Do not call .ok().unwrap(): that converts the Result, which carries useful error information, into an Option, which carries none. Just call unwrap directly on the Result.
As already mentioned, if you want to return a Vec, you'll want to call collect to create it. If you want to return an iterator, you can. It's not bad even in stable Rust:
fn parse_range(string_value: &str) -> std::ops::Range<u8> {
let mut values = string_value
.splitn(2, "-")
.map(|part| part.parse().unwrap());
let start = values.next().unwrap();
let end = values.next().unwrap();
start..end + 1
}
fn main() {
assert!(parse_range("1-5").eq(1..6));
}
When this answer was written, inclusive ranges were not yet stable, so you had to keep the + 1 or switch to nightly. Since Rust 1.26 the ..= syntax is stable, and start..=end can be used instead.
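On toolchains from Rust 1.26 onward, the same function can be sketched with a stable inclusive range. This is an adaptation of the code above, not part of the original answer; the only changes are the return type and the final expression.

```rust
use std::ops::RangeInclusive;

// Assumes Rust 1.26 or later, where `..=` is available on stable.
fn parse_range(string_value: &str) -> RangeInclusive<u8> {
    let mut values = string_value
        .splitn(2, '-')
        .map(|part| part.parse::<u8>().unwrap());
    let start = values.next().unwrap();
    let end = values.next().unwrap();
    start..=end // no `+ 1`, so an end value of 255 does not overflow
}
```

Like the version above, this still panics on malformed input because of the unwrap calls.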
Since splitn(2, "-") returns exactly two results for any valid string_value, it would be better to assign the tuple directly to two variables first and last rather than a seemingly arbitrary-length Vec. I can't seem to do this with a tuple.
This is not possible with Rust's type system. You are asking for dependent types, a way for runtime values to interact with the type system. You'd want splitn to return a (&str, &str) for a value of 2 and a (&str, &str, &str) for a value of 3. That gets even more complicated when the argument is a variable, especially when it's set at run time.
The closest workaround would be to have a runtime check that there are no more values:
assert!(values.next().is_none());
Such a check doesn't feel valuable to me.
See also:
What is the correct way to return an Iterator (or any other trait)?
How do I include the end value in a range?

Why is this not a dangling reference?

I am following the second edition of the TRPL book and am a little confused by one of the tasks. At the end of section 10.2 (Traits) I am asked to reimplement the largest function using the Clone trait. (Note that at this point I have not learned anything about lifetimes yet.) I implemented the following:
fn largest<T: PartialOrd + Clone>(list: &[T]) -> &T {
    let l = list.clone();
    let mut largest = &l[0];
    for item in l {
        if item > largest {
            largest = item;
        }
    }
    largest
}
This returns a reference to an item of the cloned list. And, lo and behold, it compiles. Why is this not a dangling reference (as described in section 4.2)?
As far as I understand it, largest contains a reference to an item of a (cloned) copy of list, but should l not go out of scope and thus invalidate the reference after largest has finished?
Because l does not have the type you think it does:
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let l: &[T] = list.clone();
    let mut largest = &l[0];
    for item in l {
        if item > largest {
            largest = item;
        }
    }
    largest
}
l is a reference too: calling clone() on a &[T] just copies the reference, so it returns the same slice with the same lifetime.
Therefore it's perfectly fine to take references into the slice, and your return value borrows the original slice.
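The claim can be checked with a small sketch that compares pointers. The helpers clone_is_same_slice and to_vec_is_same_slice are hypothetical names introduced for illustration; to_vec() is shown as the call that really clones the elements.

```rust
// Same shape as the answer: `clone()` on a `&[T]` copies the reference only.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let l: &[T] = list.clone();
    let mut largest = &l[0];
    for item in l {
        if item > largest {
            largest = item;
        }
    }
    largest
}

// Hypothetical helper: true if cloning the slice reference gives back the very same memory.
fn clone_is_same_slice<T>(list: &[T]) -> bool {
    let l: &[T] = list.clone();
    std::ptr::eq(l, list)
}

// For contrast, `to_vec()` clones the elements into a new, owned `Vec<T>`.
// A reference into that Vec could not be returned from a function that owns it.
fn to_vec_is_same_slice<T: Clone>(list: &[T]) -> bool {
    let owned: Vec<T> = list.to_vec();
    std::ptr::eq(owned.as_slice(), list)
}
```

If the function had used list.to_vec() and returned a reference into the new Vec, the borrow checker would reject it, which is the dangling-reference case the question expected.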
