Finding word in sentence - rust

In the following example:
fn main() {
let str_vec: ~[&str] = "lorem lpsum".split(' ').collect();
if (str_vec.contains("lorem")) {
println!("found it!");
}
}
It will not compile, and says:
error: mismatched types: expected &&'static str
but found 'static str (expected &-ptr but found &'static str)
What's the proper way to find the word in sentence?

The contains() method on vectors (specifically, on all vectors satisfying the std::vec::ImmutableEqVector trait, which is for all vectors containing types that can be compared for equality), has the following signature,
fn contains(&self, x: &T) -> bool
where T is the type of the element in the array. In your code, str_vec holds elements of type &str, so you need to pass in a &&str -- that is, a borrowed pointer to a &str.
Since the type of "lorem" is &'static str, you might attempt first to just write
str_vec.contains(&"lorem")`
In the current version of Rust, that doesn't work. Rust is in the middle of a language change referred to as dynamically-sized types (DST). One of the side effects is that the meaning of the expressions &"string" and &[element1, element2], where & appears before a string or array literal, will be changing (T is the type of the array elements element1 and element2):
Old behavior (still current as of Rust 0.9): The expressions &"string" and &[element1, element2] are coerced to slices &str and &[T], respectively. Slices refer to unknown-length ranges of the underlying string or array.
New behavior: The expressions &"string" and &[element1, element2] are interpreted as & &'static str and &[T, ..2], making their interpretation consistent with the rest of Rust.
Under either of these regimes, the most idiomatic way to obtain a slice of a statically-sized string or array is to use the .as_slice() method. Once you have a slice, just borrow a pointer to that to get the &&str type that .contains() requires. The final code is below (the if condition doesn't need to be surrounded by parentheses in Rust, and rustc will warn if you do have unnecessary parentheses):
fn main() {
let str_vec: ~[&str] = "lorem lpsum".split(' ').collect();
if str_vec.contains(&"lorem".as_slice()) {
println!("found it!");
}
}
Compile and run to get:
found it!
Edit: Recently, a change has landed to start warning on ~[T], which is being deprecated in favor of the Vec<T> type, which is also an owned vector but doesn't have special syntax. (For now, you need to import the type from the std::vec_ng library, but I believe the module std::vec_ng will go away eventually by replacing the current std::vec.) Once this change is made, it seems that you can't borrow a reference to "lorem".as_slice() because rustc considers the lifetime too short -- I think this is a bug too. On the current master, my code above should be:
use std::vec_ng::Vec; // Import will not be needed in the future
fn main() {
let str_vec: Vec<&str> = "lorem lpsum".split(' ').collect();
let slice = &"lorem".as_slice();
if str_vec.contains(slice) {
println!("found it!");
}
}
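For readers on a current toolchain, here is a minimal sketch of the same check, assuming a post-1.0 Rust where ~[T] has been replaced by Vec<T> and the as_slice() workaround is no longer needed. The helper name has_word is hypothetical and only used for illustration.

```rust
// Hypothetical helper: reports whether `word` is one of the
// space-separated tokens of `sentence`.
fn has_word(sentence: &str, word: &str) -> bool {
    let words: Vec<&str> = sentence.split(' ').collect();
    // `contains` takes `&T`; here `T` is `&str`, so we pass a `&&str`.
    words.contains(&word)
}

fn main() {
    if has_word("lorem ipsum", "lorem") {
        println!("found it!");
    }
}
```

Borrowing the &str directly is enough here, because contains() only needs a reference to something of the element type.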

let sentence = "Lorem ipsum dolor sit amet";
if sentence.words().any(|x| x == "ipsum") {
println!("Found it!");
}
You could also do something with .position() or .count() instead of .any(). See Iterator trait.
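As far as I know, words() did not survive into stable Rust and split_whitespace() is its replacement, so a sketch of the same idea on a current compiler might look like this. The helper word_index is a made-up name showing the .position() variant mentioned above.

```rust
// Hypothetical helper: index of the first whitespace-separated word
// equal to `word`, or None if there is no such word.
fn word_index(sentence: &str, word: &str) -> Option<usize> {
    sentence.split_whitespace().position(|w| w == word)
}

fn main() {
    let sentence = "Lorem ipsum dolor sit amet";

    // Same check as above, with `split_whitespace` instead of `words`.
    if sentence.split_whitespace().any(|w| w == "ipsum") {
        println!("Found it!");
    }

    if let Some(i) = word_index(sentence, "dolor") {
        println!("\"dolor\" is at word index {}", i);
    }
}
```

The comparison is case-sensitive, so searching for "lorem" in this sentence finds nothing.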

Related

Why can't I use println with a str?

Prologue: I'm at my first day on Rust here.
This is my demo code:
fn main() {
println!("Hello, world!");
println!(Move::X.to_string());
}
enum Move {
Empty,
X,
O,
}
impl Move {
fn to_string(&self) -> &'static str {
match self {
Move::Empty => "Empty",
Move::X => "X",
Move::O => "O"
}
}
}
This is not compiling because of these errors
I would appreciate a fix, but mainly I need an explanation.
I tried
println!(String::from(Move::X.to_string()));
but the error is identical.
Because println! is a macro whose first argument must be a string literal. That literal is processed at compile time, so it can never be a reference to runtime data.
You can use the newly added formatting string:
let x = Move::X.to_string();
println!("{x}");
or the usual formatting, as the error message suggests:
println!("{}", Move::X.to_string())
First of all, String::from(Move::X.to_string()) is a String, not a &str or a str. So even if println! did accept a &str, building a String would not solve the issue (when calling a function that requires a &str, Rust can deref-coerce a &String to a &str, but println! is not a function).
Second, the println! macro always and only wants a string literal as its first "argument". That's because it must be able to know at compile time what is the formatting required.
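A different route, not taken in the answers above, is to implement std::fmt::Display for the enum so that the value itself can go into the format arguments. This is only a sketch: it replaces the question's hand-written to_string with the one that the standard library derives from Display.

```rust
use std::fmt;

enum Move {
    Empty,
    X,
    O,
}

impl fmt::Display for Move {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Move::Empty => "Empty",
            Move::X => "X",
            Move::O => "O",
        };
        f.write_str(name)
    }
}

fn main() {
    // The format string is still a literal; the value is an argument.
    println!("{}", Move::X);

    // Inline captured identifiers work for plain variables.
    let m = Move::O;
    println!("{m}");

    // `to_string` now comes for free from the `Display` impl.
    let s: String = Move::Empty.to_string();
    println!("{}", s);
}
```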

Proper signature for a function accepting an iterator of strings

I'm confused about the proper type to use for an iterator yielding string slices.
fn print_strings<'a>(seq: impl IntoIterator<Item = &'a str>) {
for s in seq {
println!("- {}", s);
}
}
fn main() {
let arr: [&str; 3] = ["a", "b", "c"];
let vec: Vec<&str> = vec!["a", "b", "c"];
let it: std::str::Split<'_, char> = "a b c".split(' ');
print_strings(&arr);
print_strings(&vec);
print_strings(it);
}
Using <Item = &'a str>, the arr and vec calls don't compile. If, instead, I use <Item = &'a &'a str>, they work, but the it call doesn't compile.
Of course, I can make the Item type generic too, and do
fn print_strings<'a, I: std::fmt::Display>(seq: impl IntoIterator<Item = I>)
but it's getting silly. Surely there must be a single canonical "iterator of string values" type?
The error you are seeing is expected because seq is &Vec<&str> and &Vec<T> implements IntoIterator with Item=&T, so with your code, you end up with Item=&&str where you are expecting it to be Item=&str in all cases.
The correct way to do this is to widen the Item type so that it can handle both &str and &&str. You can do this by using more generics, e.g.
fn print_strings(seq: impl IntoIterator<Item = impl AsRef<str>>) {
for s in seq {
let s = s.as_ref();
println!("- {}", s);
}
}
This requires the Item to be something that you can retrieve a &str from, and then in your loop .as_ref() will return the &str you are looking for.
This also has the added bonus that your code will also work with Vec<String> and any other type that implements AsRef<str>.
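To make that concrete, here is a small sketch that exercises the AsRef<str> signature with each kind of caller. The function format_strings is a hypothetical variant that returns the lines instead of printing them, so the result is easy to inspect.

```rust
// Hypothetical variant of print_strings that returns the formatted
// lines rather than printing them.
fn format_strings(seq: impl IntoIterator<Item = impl AsRef<str>>) -> Vec<String> {
    seq.into_iter()
        .map(|s| format!("- {}", s.as_ref()))
        .collect()
}

fn main() {
    let arr: [&str; 3] = ["a", "b", "c"];
    let vec: Vec<&str> = vec!["a", "b", "c"];
    let owned: Vec<String> = vec!["a".to_owned(), "b".to_owned()];

    // Borrowed containers of &str: Item = &&str.
    println!("{:?}", format_strings(&arr));
    println!("{:?}", format_strings(&vec));

    // A Split iterator: Item = &str.
    println!("{:?}", format_strings("a b c".split(' ')));

    // Owned strings: Item = &String, then Item = String.
    println!("{:?}", format_strings(&owned));
    println!("{:?}", format_strings(owned));
}
```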
TL;DR The signature you use is fine, it's the callers that are providing iterators with wrong Item - but can be easily fixed.
As explained in the other answer, print_strings() doesn't accept &arr and &vec because the IntoIterator implementations for &[T; n] and &Vec<T> yield references to T. This is because &Vec, itself a reference, is not allowed to consume the Vec in order to move T values out of it. What it can do is hand out references to T items sitting inside the Vec, i.e. items of type &T. In the case of your callers that don't compile, the containers contain &str, so their iterators hand out &&str.
Other than making print_strings() more generic, another way to fix the issue is to call it correctly to begin with. For example, these all compile:
print_strings(arr.iter().map(|sref| *sref));
print_strings(vec.iter().copied());
print_strings(it);
iter() is the method provided by slices (and therefore available on arrays and Vec) that iterates over references to elements, just like IntoIterator of &Vec. We call it explicitly to be able to call map() to convert &&str to &str the obvious way - by using the * operator to dereference the &&str. The copied() iterator adapter is another way of expressing the same, possibly a bit less cryptic than map(|x| *x). (There is also cloned(), equivalent to map(|x| x.clone()).)
It's also possible to call print_strings() if you have a container with String values:
let v = vec!["foo".to_owned(), "bar".to_owned()];
print_strings(v.iter().map(|s| s.as_str()));

Unknown size at compile time when trying to print string contents in Rust

I have a couple of pieces of code; one errors out and the other doesn't, and I don't understand why.
The one that errors out when compiling:
fn main() {
let s1 = String::from("hello");
println!("{}", *s1);
}
This throws: doesn't have a size known at compile-time, on the line println!("{}", *s1);
The one that works:
fn main() {
let s1 = String::from("hello");
print_string(&s1);
}
fn print_string(s1: &String) {
println!("{}", *s1);
}
Why is this happening? Aren't both correct ways to access the string contents and printing them?
In the first snippet you’re dereferencing a String. This yields a str, which is a dynamically sized type (sometimes called an unsized type in older texts). DSTs are somewhat difficult to use directly.
In the second snippet you’re dereferencing a &String, which yields a regular String, which is a normal sized type.
In both cases the dereference is completely useless, why are you even using one?
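A short sketch of the types involved may help. The function show is a hypothetical stand-in for print_string that returns the text so it can be checked.

```rust
// Same shape as `print_string` in the question: `*s1` here is a
// `String` place, which has a known size.
fn show(s1: &String) -> String {
    format!("{}", *s1)
}

fn main() {
    let s1 = String::from("hello");

    // `*s1` has type `str`, which is unsized. Re-borrowing it gives a
    // `&str`, which is sized and can be used anywhere.
    let slice: &str = &*s1;
    println!("{}", slice);

    // Going through a `&String` first, as the second snippet does.
    println!("{}", show(&s1));

    // No dereference is needed at all: formatting borrows its arguments.
    println!("{}", s1);
}
```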

Generic operation on slice of Cow<str>

I'm trying to implement the following code, which removes the prefix from a slice of Cow<str>'s.
fn remove_prefix(v: &mut [Cow<str>], prefix: &str) {
for t in v.iter_mut() {
match *t {
Borrowed(&s) => s = s.trim_left_matches(prefix),
Owned(s) => s = s.trim_left_matches(prefix).to_string(),
}
}
}
I have two questions:
I can't get this to compile - I've tried loads of combinations of &'s and *'s but to no avail.
Is there a better way to apply functions to a Cow<str> without having to match it to Borrowed and Owned every time. I mean it seems like I should just be able to do something like *t = t.trim_left_matches(prefix) and if t is a Borrowed(str) it leaves it as a str (since trim_left_matches allows that), and if it is an Owned(String) it leaves it as a String. Similarly for replace() it would realise it has to convert both to a String (since you can't use replace() on a str). Is something like that possible?
Question #1 strongly implies how you think pattern matching and/or pointers work in Rust doesn't quite line up with how they actually work. The following code compiles:
fn remove_prefix(v: &mut [Cow<str>], prefix: &str) {
use std::borrow::Cow::*;
for t in v.iter_mut() {
match *t {
Borrowed(ref mut s) => *s = s.trim_left_matches(prefix),
Owned(ref mut s) => *s = s.trim_left_matches(prefix).to_string(),
}
}
}
In your case, Borrowed(&s) is matched against Borrowed(&str), meaning that s is of type str. This is impossible: you absolutely cannot have a variable of a dynamically sized type. It's also counter-productive. Given that you want to modify s, binding to it by value won't help at all.
What you want is to modify the thing contained in the Borrowed variant. This means you want a mutable pointer to that storage location. Hence, Borrowed(ref mut s): this is not destructuring the value inside the Borrowed at all. Rather, it binds directly to the &str, meaning that s is of type &mut &str; a mutable pointer to a (pointer to a str). In other words: a mutable pointer to a string slice.
At that point, mutating the contents of the Borrowed is done by re-assigning the value through the mutable pointer: *s = ....
Finally, the exact same reasoning applies to the Owned case: you were trying to bind by-value, then mutate it, which cannot possibly do what you want. Instead, bind by mutable pointer to the storage location, then re-assign it.
As for question #2... not really. That would imply some kind of overloading, which Rust doesn't do (by deliberate choice). If you are doing this a lot, you could write an extension trait that adds methods of interest to Cow.
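As a rough idea of what such an extension trait could look like, here is a sketch. The trait name CowStrExt and the method trim_prefix_in_place are made up for this example, and it uses trim_start_matches, the newer name for trim_left_matches.

```rust
use std::borrow::Cow;

// Hypothetical extension trait; not part of the standard library.
trait CowStrExt {
    /// Strips every leading repetition of `prefix`, keeping the Cow
    /// in whichever variant it already is.
    fn trim_prefix_in_place(&mut self, prefix: &str);
}

impl<'a> CowStrExt for Cow<'a, str> {
    fn trim_prefix_in_place(&mut self, prefix: &str) {
        match self {
            // `s` is `&mut &'a str`: re-point the slice at the trimmed part.
            Cow::Borrowed(s) => *s = s.trim_start_matches(prefix),
            // `s` is `&mut String`: drop the leading bytes in place
            // instead of allocating a new String.
            Cow::Owned(s) => {
                let cut = s.len() - s.trim_start_matches(prefix).len();
                s.replace_range(..cut, "");
            }
        }
    }
}

fn main() {
    let mut v: Vec<Cow<str>> = vec![
        Cow::Borrowed("--abc"),
        Cow::Owned(String::from("--def")),
    ];
    for t in v.iter_mut() {
        t.trim_prefix_in_place("-");
    }
    println!("{:?}", v);
}
```

The match on Borrowed and Owned is still there, but it is written once inside the trait rather than at every call site.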
You can definitely do it.
fn remove_prefix(v: &mut [Cow<str>], prefix: &str) {
for t in v.iter_mut() {
match *t {
Cow::Borrowed(ref mut s) => *s = s.trim_left_matches(prefix),
Cow::Owned(ref mut s) => *s = s.trim_left_matches(prefix).to_string(),
}
}
}
ref mut s means “take a mutable reference to the value and call it s” in a pattern. Thus you have s of type &mut &str or &mut String. You must then use *s = ... in order to change what that mutable reference is pointing to (thus, change the string inside the Cow).

How to convert a String into a &'static str

How do I convert a String into a &str? More specifically, I would like to convert it into a str with the static lifetime (&'static str).
Updated for Rust 1.0
You cannot obtain &'static str from a String because Strings may not live for the entire life of your program, and that's what the 'static lifetime means. You can only get a slice parameterized by the String's own lifetime from it.
To go from a String to a slice &'a str you can use slicing syntax:
let s: String = "abcdefg".to_owned();
let s_slice: &str = &s[..]; // take a full slice of the string
Alternatively, you can use the fact that String implements Deref<Target=str> and perform an explicit reborrowing:
let s_slice: &str = &*s; // s : String
// *s : str (via Deref<Target=str>)
// &*s: &str
There is even another way which allows for even more concise syntax but it can only be used if the compiler is able to determine the desired target type (e.g. in function arguments or explicitly typed variable bindings). It is called deref coercion and it allows using just & operator, and the compiler will automatically insert an appropriate amount of *s based on the context:
let s_slice: &str = &s; // okay
fn take_name(name: &str) { ... }
take_name(&s); // okay as well
let not_correct = &s; // this will give &String, not &str,
// because the compiler does not know
// that you want a &str
Note that this pattern is not unique for String/&str - you can use it with every pair of types which are connected through Deref, for example, with CString/CStr and OsString/OsStr from std::ffi module or PathBuf/Path from std::path module.
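Here is a small sketch of that, showing the same coercion for String/&str and PathBuf/Path. The function names file_name_len and shout are hypothetical and exist only for this example.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper taking the borrowed form of a path.
fn file_name_len(path: &Path) -> usize {
    path.file_name().map_or(0, |name| name.len())
}

// Hypothetical helper taking the borrowed form of a string.
fn shout(name: &str) -> String {
    name.to_uppercase()
}

fn main() {
    let owned_path: PathBuf = PathBuf::from("/tmp/notes.txt");
    let owned_name: String = String::from("ferris");

    // The parameter types tell the compiler what to coerce to:
    // &PathBuf becomes &Path, &String becomes &str.
    println!("{}", file_name_len(&owned_path));
    println!("{}", shout(&owned_name));

    // Without a target type there is no coercion: this is a &String.
    let still_string = &owned_name;
    // With an explicit annotation the coercion happens.
    let as_slice: &str = still_string;
    println!("{} {}", still_string, as_slice);
}
```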
You can do it, but it involves leaking the memory of the String. This is not something you should do lightly. By leaking the memory of the String, we guarantee that the memory will never be freed (thus the leak). Therefore, any references to the inner object can be interpreted as having the 'static lifetime.
fn string_to_static_str(s: String) -> &'static str {
Box::leak(s.into_boxed_str())
}
fn main() {
let mut s = String::new();
std::io::stdin().read_line(&mut s).unwrap();
let s: &'static str = string_to_static_str(s);
}
As of Rust version 1.26, it is possible to convert a String to &'static str without using unsafe code:
fn string_to_static_str(s: String) -> &'static str {
Box::leak(s.into_boxed_str())
}
This converts the String instance into a boxed str and immediately leaks it. This frees all excess capacity the string may currently occupy.
Note that there are almost always solutions that are preferable over leaking objects, e.g. using the crossbeam crate if you want to share state between threads.
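As one illustration of such an alternative (my own sketch, using only the standard library rather than crossbeam), an Arc<str> lets a spawned thread share the text without leaking it. The function name uppercase_on_worker is hypothetical.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: shares the text with a worker thread through
// reference counting, so the memory is freed once both handles drop.
fn uppercase_on_worker(s: String) -> (Arc<str>, String) {
    let shared: Arc<str> = Arc::from(s);
    let for_worker = Arc::clone(&shared);

    // The closure owns its own handle, which satisfies the 'static
    // bound of `thread::spawn` without a &'static str.
    let handle = thread::spawn(move || for_worker.to_uppercase());
    let upper = handle.join().expect("worker thread panicked");

    (shared, upper)
}

fn main() {
    let (original, upper) = uppercase_on_worker(String::from("hello"));
    println!("{} -> {}", original, upper);
}
```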
TL;DR: you can get a &'static str from a String which itself has a 'static lifetime.
Although the other answers are correct and most useful, there's a (not so useful) edge case, where you can indeed convert a String to a &'static str:
The lifetime of a reference must always be shorter than or equal to the lifetime of the referenced object, i.e. the referenced object has to live at least as long as the reference. Since 'static means the entire lifetime of a program, a longer lifetime does not exist. But an equal lifetime will be sufficient. So if a String has a lifetime of 'static, you can get a &'static str reference from it.
Creating a static of type String has theoretically become possible with Rust 1.31 when the const fn feature was released. Unfortunately, the only const function returning a String is String::new() currently, and it's still behind a feature gate (so Rust nightly is required for now).
So the following code does the desired conversion (using nightly) ...and actually has no practical use except for completeness of showing that it is possible in this edge case.
#![feature(const_string_new)]
static MY_STRING: String = String::new();
fn do_something(_: &'static str) {
// ...
}
fn main() {
do_something(&MY_STRING);
}
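The same idea can be sketched on stable Rust with a lazily initialised static, assuming a toolchain recent enough to have std::sync::OnceLock (1.70 or later, to my knowledge). The names GREETING and greeting are hypothetical. This is not what the answer above does, but it also produces a String that lives in a static, and the string does not have to be empty.

```rust
use std::sync::OnceLock;

// The String lives inside a static, so references into it can be 'static.
static GREETING: OnceLock<String> = OnceLock::new();

// Hypothetical accessor: builds the String on first use, then hands
// out a &'static str pointing into it.
fn greeting() -> &'static str {
    GREETING
        .get_or_init(|| format!("hello, {}", "world"))
        .as_str()
}

fn do_something(s: &'static str) -> usize {
    s.len()
}

fn main() {
    println!("{}", do_something(greeting()));
}
```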

Resources