Access a vector element with a string - rust

The Rust book has the following code:
#[test]
fn one_result() {
let query = "duct";
let contents = "\
Rust:
safe, fast, productive.
Pick three.";
assert_eq!(vec!["safe, fast, productive."], search(query, contents));
}
and the function for searching is:
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut results = Vec::new();
for line in contents.lines() {
if line.contains(query) {
results.push(line);
}
}
results
}
How does assert_eq! access the vector element with a string? I cannot find any description of such functionality.

assert_eq! does not access vector elements by string. It compares equality (==) of the two vectors.
assert_eq! is also just syntactic sugar for checking equality, otherwise panicking.
In other words, this is the same as your assert:
if vec!["safe, fast, productive."] != search(query, contents) {
panic!()
}
Keep reading the book to find out about traits, notably the Eq and PartialEq traits, which are responsible for testing equality in Rust.
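To make the comparison concrete, here is a minimal sketch that reuses the `search` function from the question and compares whole vectors with `==` and `!=`. The sample strings in `main` are only illustrative.

```rust
// The `search` function from the question, unchanged in behavior.
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();
    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }
    results
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\nPick three.";
    let found = search("duct", contents);
    let expected = vec!["safe, fast, productive."];

    // `==` on two Vec<&str> values compares the vectors as a whole:
    // same length, and each pair of elements equal.
    assert!(expected == found);

    // Different contents or a different length make the vectors unequal.
    assert!(vec!["Pick three."] != found);
    assert!(Vec::<&str>::new() != found);

    println!("{:?}", found);
}
```

Nothing here indexes into the vector; the string literal is simply the single element of the expected vector.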

I think I found the answer. The code
assert_eq!(vec!["safe, fast, productive."], search(query, contents));
creates a vector with elements of type &str (string slices, not String) with only one entry that contains the text “safe, fast, productive.” and then compares this vector with the vector returned from the search function. So it does not try to access vector elements with strings, but compares two vectors that contain only one element each.

Related

How to apply a series of iterators to data?

I'm tinkering with Rust by building some basic genetics functionality, e.g. read a file with a DNA sequence, transcribe it to RNA, translate it to an amino acid sequence, etc.
I'd like each of these transformations to accept and return iterators. That way I can string them together (like dna.transcribe().translate()...) and only collect when necessary, so the compiler can optimize the entire chain of transformations. I'm a data scientist coming from Scala/Spark, so this pattern makes a lot of sense, but I'm not sure how to implement it in Rust.
I've read this article about returning iterators but the final solution seems to be to use trait objects (with possibly large performance impact), or to hand roll iterators with associated structs (which allows me to return an iterator, yes, but I don't see how it would allow me to write a transformation that also accepts an iterator).
Any general architectural advice here?
(FYI, my code so far is available here, but I feel like I'm not using Rust idiomatically because a. still can't quite get it to compile b. this pattern of lazily chaining operations has led to unexpectedly complex and messy code that only works on Rust nightly.)
Iterator adaptors are meant to do operations which can't easily be expressed otherwise. Your two examples, .translate(), and .transcribe(), given your explanation of them, could be simplified to the following:
dna
    .map(|x| x.transcribe())
    .map(|x| x.translate())
// or
dna
    .map(|x| x.transcribe().translate())
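If the goal is a function that both accepts and returns an iterator, one possible sketch on stable Rust uses `impl Iterator` in argument and return position. The `transcribe` and `complement` stages below are hypothetical stand-ins that operate on plain `char` bases; they are not the asker's actual types, and `complement` merely plays the role of a second stage.

```rust
// Hypothetical stage 1: DNA -> RNA on the coding strand (T becomes U).
fn transcribe(dna: impl Iterator<Item = char>) -> impl Iterator<Item = char> {
    dna.map(|base| if base == 'T' { 'U' } else { base })
}

// Hypothetical stage 2: complement each RNA base.
fn complement(rna: impl Iterator<Item = char>) -> impl Iterator<Item = char> {
    rna.map(|base| match base {
        'A' => 'U',
        'U' => 'A',
        'C' => 'G',
        'G' => 'C',
        other => other, // leave unknown symbols untouched
    })
}

fn main() {
    // Nothing is computed until `collect` drives the whole chain.
    let rna: String = complement(transcribe("GATTACA".chars())).collect();
    println!("{}", rna);
}
```

Each stage is statically dispatched, so no trait objects are involved.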
However, if you are intent on designing your own iterator, the following should work:
struct Transcriber<I: Iterator<Item = Dna>> {
    inner: I,
}
impl<I: Iterator<Item = Dna>> Iterator for Transcriber<I> {
    type Item = TranscribedDna;
    fn next(&mut self) -> Option<Self::Item> {
        // Pull from the wrapped iterator; calling self.next() here would recurse forever
        self.inner.next().map(|x| x.transcribe())
    }
}
// Extension trait to add the `.transcribe` method to existing iterators
// (`Sized` is required because `transcribe` takes `self` by value)
trait TranscribeIteratorExt: Iterator<Item = Dna> + Sized {
    fn transcribe(self) -> Transcriber<Self>;
}
impl<I: Iterator<Item = Dna>> TranscribeIteratorExt for I {
    fn transcribe(self) -> Transcriber<Self> {
        Transcriber { inner: self }
    }
}
Then you can use
dna
.transcribe() // yields TranscribedDna
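For a version that can be compiled as-is, here is a minimal sketch of the same extension-trait pattern. The `Dna` and `Rna` enums and the T-to-U mapping are assumptions made for illustration (the answer leaves `Dna` and `TranscribedDna` undefined), and the trait name `TranscribeExt` is likewise hypothetical.

```rust
// Hypothetical base types; the original answer does not define them.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Dna { A, C, G, T }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Rna { A, C, G, U }

impl Dna {
    // Simplified per-base transcription: T becomes U, the rest carry over.
    fn transcribe(self) -> Rna {
        match self {
            Dna::A => Rna::A,
            Dna::C => Rna::C,
            Dna::G => Rna::G,
            Dna::T => Rna::U,
        }
    }
}

// Adaptor that wraps any iterator of Dna and yields Rna lazily.
struct Transcriber<I> {
    inner: I,
}

impl<I: Iterator<Item = Dna>> Iterator for Transcriber<I> {
    type Item = Rna;
    fn next(&mut self) -> Option<Rna> {
        // Delegate to the wrapped iterator, then convert the element.
        self.inner.next().map(Dna::transcribe)
    }
}

// Extension trait: gives every Iterator<Item = Dna> a `.transcribe()` method.
trait TranscribeExt: Iterator<Item = Dna> + Sized {
    fn transcribe(self) -> Transcriber<Self> {
        Transcriber { inner: self }
    }
}

impl<I: Iterator<Item = Dna>> TranscribeExt for I {}

fn main() {
    let dna = vec![Dna::G, Dna::A, Dna::T, Dna::C];
    let rna: Vec<Rna> = dna.into_iter().transcribe().collect();
    println!("{:?}", rna);
}
```

A further adaptor for translation could be added the same way, with a trait bounded on `Iterator<Item = Rna>`, so the calls chain as in the question.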

Can I transform the strings in a [&'static str; N] into a &[&str] without creating multiple temporary values? [duplicate]

Per Steve Klabnik's writeup in the pre-Rust 1.0 documentation on the difference between String and &str, in Rust you should use &str unless you really need to have ownership over a String. Similarly, it's recommended to use references to slices (&[]) instead of Vecs unless you really need ownership over the Vec.
I have a Vec<String> and want to write a function that uses this sequence of strings without needing ownership of the Vec or the String instances. Should that function take &[&str]? If so, what's the best way to turn a reference to the Vec<String> into a &[&str]? Or is this coercion overkill?
You can create a function that accepts both &[String] and &[&str] using the AsRef trait:
fn test<T: AsRef<str>>(inp: &[T]) {
for x in inp { print!("{} ", x.as_ref()) }
println!("");
}
fn main() {
let vref = vec!["Hello", "world!"];
let vown = vec!["May the Force".to_owned(), "be with you.".to_owned()];
test(&vref);
test(&vown);
}
This is actually impossible without either memory allocation or a per-element call (see note 1).
Going from String to &str is not just viewing the bits in a different light; String and &str have a different memory layout, and thus going from one to the other requires creating a new object. The same applies to Vec and &[].
Therefore, whilst you can go from Vec<T> to &[T], and thus from Vec<String> to &[String], you cannot directly go from Vec<String> to &[&str]. Your choices are:
either accept &[String], or
allocate a new Vec<&str> referencing the first Vec, and convert that into a &[&str]
As an example of the allocation:
fn usage(_: &[&str]) {}
fn main() {
let owned = vec![String::new()];
let half_owned: Vec<_> = owned.iter().map(String::as_str).collect();
usage(&half_owned);
}
1. Using generics and the AsRef<str> bound, as shown in @aSpex's answer, you get a slightly more verbose function declaration with the flexibility you were asking for, but you do have to call .as_ref() on every element.
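The layout difference between String and &str can be observed with `std::mem::size_of`. This is a rough sketch: the exact size of `String` is an implementation detail of the standard library rather than a language guarantee, and the helper names `handle_sizes` and `takes_slice` are made up for the example.

```rust
use std::mem::size_of;

// Sizes of the handles themselves, not of the text they point to.
fn handle_sizes() -> (usize, usize) {
    (size_of::<String>(), size_of::<&str>())
}

// A function that borrows the strings without owning them.
fn takes_slice(items: &[String]) -> usize {
    items.len()
}

fn main() {
    let (owned, borrowed) = handle_sizes();
    // &str is a pointer plus a length; String additionally tracks capacity.
    println!("String: {} bytes, &str: {} bytes", owned, borrowed);

    let v = vec![String::from("a"), String::from("b")];
    // &Vec<String> coerces to &[String] with no allocation or copying.
    println!("{}", takes_slice(&v));
}
```

Because the two handles differ in size, a buffer of String values cannot simply be reinterpreted as a buffer of &str values, which is why the conversion needs a new Vec.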

Getting Enumerate to work as ExactSizeIterator in Rust

I want to use Rust's Enumerate to get both a character and its index in the slice from each iteration:
fn main() {
for (j, val) in "dummy string".chars().enumerate().rev() {
// ...
}
}
When I compile with cargo run I get:
error: the trait `core::iter::ExactSizeIterator` is not implemented for the type `core::str::Chars<'_>` [E0277]
for (j, val) in "dummy string".chars().enumerate().rev() {
^~~
help: see the detailed explanation for E0277
error: the trait `core::iter::ExactSizeIterator` is not implemented for the type `core::str::Chars<'_>` [E0277]
for (j, val) in "dummy string".chars().enumerate().rev() {
// ...
}
I can understand why this would fail: the rev method needs an ExactSizeIterator since it needs to know the last element in the slice and its index from the beginning. Is it possible to get an ExactSizeIterator in this case, or does the length of the iterator need to be baked in at compile time? If it is possible, is it just a matter of specifying the iterator with something like as ExactSizeIterator or something like that?
The docs for ExactSizeIterator state:
An iterator that knows its exact length.
Many Iterators don't know how many times they will iterate, but some do. If an iterator knows how many times it can iterate, providing access to that information can be useful. For example, if you want to iterate backwards, a good start is to know where the end is.
But that's not the actual trait required by rev!
fn rev(self) -> Rev<Self>
where Self: DoubleEndedIterator
The ExactSizeIterator requirement comes from Enumerate's implementation of DoubleEndedIterator:
impl<I> DoubleEndedIterator for Enumerate<I>
where I: ExactSizeIterator + DoubleEndedIterator
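For contrast, an iterator that does satisfy both bounds can be enumerated and then reversed. The sketch below uses `str::bytes`, which is only a suitable substitute when working with raw bytes (for example, ASCII-only text) is acceptable. The helper name `indexed_bytes_rev` is hypothetical.

```rust
// Bytes implements both ExactSizeIterator and DoubleEndedIterator,
// so Enumerate<Bytes> can be reversed.
fn indexed_bytes_rev(s: &str) -> Vec<(usize, u8)> {
    s.bytes().enumerate().rev().collect()
}

fn main() {
    for (j, byte) in "dummy string".bytes().enumerate().rev() {
        // Casting to char is only meaningful for ASCII bytes.
        println!("{}, {}", j, byte as char);
    }
    println!("{:?}", indexed_bytes_rev("ab"));
}
```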
Is it possible to get an ExactSizeIterator in this case, or does the length of the iterator need to be baked in at compile time?
The Chars iterator needs to support both ExactSizeIterator and DoubleEndedIterator, but it only natively supports DoubleEndedIterator.
In order to implement ExactSizeIterator for Chars, you'd need to be able to look at an arbitrary string and know (in a small enough time) how many characters it is made of. This is not generally possible with the UTF-8 encoding, the only encoding of Rust strings.
The length of the iterator is never a compile-time constant.
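A small sketch of why the character count is not available up front: the byte length is stored with the string, while the number of characters has to be found by walking the UTF-8 data. The helper name `byte_and_char_len` is made up for the example.

```rust
// Returns (length in bytes, number of chars).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // len() is a stored value; chars().count() has to decode the whole string.
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per character.
    println!("{:?}", byte_and_char_len("hello"));
    // U+00E9 takes two bytes in UTF-8, so the two numbers differ.
    println!("{:?}", byte_and_char_len("h\u{e9}llo"));

    // Chars can only offer bounds on its remaining length, not an exact value.
    let (lower, upper) = "h\u{e9}llo".chars().size_hint();
    println!("at least {}, at most {:?}", lower, upper);
}
```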
is it just a matter of specifying the iterator with something like as ExactSizeIterator
You cannot make a type into something it is not.
If you really need this, you could collect it all into a big Vec:
fn main() {
let chars: Vec<_> = "dummy string".chars().collect();
for (j, val) in chars.into_iter().enumerate().rev() {
println!("{}, {}", j, val)
}
}
It's also possible you actually want the characters in reverse order with the count in increasing direction:
fn main() {
for (j, val) in "dummy string".chars().rev().enumerate() {
println!("{}, {}", j, val)
}
}
But you said this:
a character and its index in the slice
Since strings are UTF-8, it's possible you mean you want the number of bytes into the slice. That can be found with the char_indices iterator:
fn main() {
for (j, val) in "dummy string".char_indices().rev() {
println!("{}, {}", j, val)
}
}
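With non-ASCII text the difference between a byte offset and a character count becomes visible. A minimal sketch, with a hypothetical helper `char_offsets_rev` and an arbitrary sample string:

```rust
// Collects (byte offset, char) pairs from the back of the string.
fn char_offsets_rev(s: &str) -> Vec<(usize, char)> {
    s.char_indices().rev().collect()
}

fn main() {
    // U+00F1 occupies two bytes, so the char after it starts at byte 3.
    for (offset, ch) in char_offsets_rev("a\u{f1}b") {
        println!("{}, {}", offset, ch);
    }
}
```

The offsets are valid positions for slicing the original string, which a plain character count would not be.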

How do I compare a vector against a reversed version of itself?

Why won't this compile?
fn isPalindrome<T>(v: Vec<T>) -> bool {
return v.reverse() == v;
}
I get
error[E0308]: mismatched types
--> src/main.rs:2:25
|
2 | return v.reverse() == v;
| ^ expected (), found struct `std::vec::Vec`
|
= note: expected type `()`
found type `std::vec::Vec<T>`
Since you only need to look at the front half and back half, you can use the DoubleEndedIterator trait (methods .next() and .next_back()) to look at pairs of front and back elements this way:
/// Determine if an iterable equals itself reversed
fn is_palindrome<I>(iterable: I) -> bool
where
I: IntoIterator,
I::Item: PartialEq,
I::IntoIter: DoubleEndedIterator,
{
let mut iter = iterable.into_iter();
while let (Some(front), Some(back)) = (iter.next(), iter.next_back()) {
if front != back {
return false;
}
}
true
}
This version is a bit more general, since it supports any iterable that is double ended, for example slice and chars iterators.
It only examines each element once, and it automatically skips the remaining middle element if the iterator was of odd length.
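As a usage sketch, the same function can be called with a chars iterator, an owned vector, or a reference to an array. The inputs are arbitrary examples.

```rust
/// Determine if an iterable equals itself reversed
fn is_palindrome<I>(iterable: I) -> bool
where
    I: IntoIterator,
    I::Item: PartialEq,
    I::IntoIter: DoubleEndedIterator,
{
    let mut iter = iterable.into_iter();
    // Compare elements pairwise from both ends until they meet.
    while let (Some(front), Some(back)) = (iter.next(), iter.next_back()) {
        if front != back {
            return false;
        }
    }
    true
}

fn main() {
    // Chars is double ended, so strings work without collecting.
    println!("{}", is_palindrome("racecar".chars()));
    println!("{}", is_palindrome("rust".chars()));
    // Owned vector, even length.
    println!("{}", is_palindrome(vec![1, 2, 2, 1]));
    // Borrowed array, odd length: the middle element is skipped.
    println!("{}", is_palindrome(&[1, 2, 1]));
}
```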
Read up on the documentation for the function you are using:
Reverse the order of elements in a slice, in place.
Or check the function signature:
fn reverse(&mut self)
The return value of the method is the unit type, an empty tuple (). You can't compare that against a vector.
Stylistically, Rust uses 4 space indents, snake_case identifiers for functions and variables, and has an implicit return at the end of blocks. You should adjust to these conventions in a new language.
Additionally, you should take a &[T] instead of a Vec<T> if you are not adding items to the vector.
To solve your problem, we will use iterators to compare the slice. You can get forward and backward iterators of a slice, which require very little space compared to reversing the entire array. Iterator::eq allows you to do the comparison succinctly.
You also need to state that the T is comparable against itself, which requires Eq or PartialEq.
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq,
{
v.iter().eq(v.iter().rev())
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
If you wanted to do the less space-efficient version, you have to allocate a new vector yourself:
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq + Clone,
{
let mut reverse = v.to_vec();
reverse.reverse();
reverse == v
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
Note that we are now also required to Clone the items in the vector, so we add that trait bound to the function.

Finding word in sentence

In the following example:
fn main() {
let str_vec: ~[&str] = "lorem lpsum".split(' ').collect();
if (str_vec.contains("lorem")) {
println!("found it!");
}
}
It will not compile, and says:
error: mismatched types: expected &&'static str
but found 'static str (expected &-ptr but found &'static str)
What's the proper way to find the word in sentence?
The contains() method on vectors (specifically, on all vectors satisfying the std::vec::ImmutableEqVector trait, which is for all vectors containing types that can be compared for equality), has the following signature,
fn contains(&self, x: &T) -> bool
where T is the type of the element in the array. In your code, str_vec holds elements of type &str, so you need to pass in a &&str -- that is, a borrowed pointer to a &str.
Since the type of "lorem" is &'static str, you might attempt first to just write
str_vec.contains(&"lorem")
In the current version of Rust, that doesn't work. Rust is in the middle of a language change referred to as dynamically-sized types (DST). One of the side effects is that the meaning of the expressions &"string" and &[element1, element2], where & appears before a string or array literal, will be changing (T is the type of the array elements element1 and element2):
Old behavior (still current as of Rust 0.9): The expressions &"string" and &[element1, element2] are coerced to slices &str and &[T], respectively. Slices refer to unknown-length ranges of the underlying string or array.
New behavior: The expressions &"string" and &[element1, element2] are interpreted as & &'static str and &[T, ..2], making their interpretation consistent with the rest of Rust.
Under either of these regimes, the most idiomatic way to obtain a slice of a statically-sized string or array is to use the .as_slice() method. Once you have a slice, just borrow a pointer to that to get the &&str type that .contains() requires. The final code is below (the if condition doesn't need to be surrounded by parentheses in Rust, and rustc will warn if you do have unnecessary parentheses):
fn main() {
let str_vec: ~[&str] = "lorem lpsum".split(' ').collect();
if str_vec.contains(&"lorem".as_slice()) {
println!("found it!");
}
}
Compile and run to get:
found it!
Edit: Recently, a change has landed to start warning on ~[T], which is being deprecated in favor of the Vec<T> type, which is also an owned vector but doesn't have special syntax. (For now, you need to import the type from the std::vec_ng library, but I believe the module std::vec_ng will go away eventually by replacing the current std::vec.) Once this change is made, it seems that you can't borrow a reference to "lorem".as_slice() because rustc considers the lifetime too short -- I think this is a bug too. On the current master, my code above should be:
use std::vec_ng::Vec; // Import will not be needed in the future
fn main() {
let str_vec: Vec<&str> = "lorem lpsum".split(' ').collect();
let slice = &"lorem".as_slice();
if str_vec.contains(slice) {
println!("found it!");
}
}
let sentence = "Lorem ipsum dolor sit amet";
if sentence.words().any(|x| x == "ipsum") {
println!("Found it!");
}
You could also do something with .position() or .count() instead of .any(). See Iterator trait.
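The snippets above target a pre-1.0 compiler. On a current stable toolchain (an assumption about the reader's setup), a rough equivalent could look like the sketch below, where `split_whitespace` plays the role that `.words()` plays above and `Vec<&str>` replaces `~[&str]`. The helper names `contains_word` and `word_position` are hypothetical.

```rust
// True if any whitespace-separated word equals `word`.
fn contains_word(sentence: &str, word: &str) -> bool {
    sentence.split_whitespace().any(|w| w == word)
}

// Zero-based index of the first matching word, if any.
fn word_position(sentence: &str, word: &str) -> Option<usize> {
    sentence.split_whitespace().position(|w| w == word)
}

fn main() {
    let sentence = "Lorem ipsum dolor sit amet";
    if contains_word(sentence, "ipsum") {
        println!("Found it!");
    }
    println!("{:?}", word_position(sentence, "dolor"));

    // The Vec-based approach from the question: contains() takes a reference
    // to an element, so a &str element is passed as &&str.
    let words: Vec<&str> = sentence.split(' ').collect();
    println!("{}", words.contains(&"Lorem"));
}
```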
