take the code from rustbook for example
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() {
x
} else {
y
}
}
we have the above function, now suppose we have main:
// can't compile , i understand this part, just post here for reference
fn main1(){
let string1 = String::from("abcd");
let result;
{
let string2 = String::from("xyz"); // we will change this to &str in next example
result = longest(string1.as_str(), string2.as_str());
^^^^^^ doesn't compile , string2 doesn't live long enough
}
println!("The longest string is {}", result);
}
but if we slight change it to the code below, changing the string2 to be a string slice, this code actually compiles , and i dont quite know what's going on, does the "xyz" still count as valid memory?
//This compiles. but why? shouldn't ```longest()``` return a smaller lifetime and refuses to compile?
fn main2(){
let string1 = String::from("abcd");
let result;
{
let string2 = "xyz"; // <------ changed
result = longest(string1.as_str(), string2);
}
println!("The longest string is {}", result);
}
The lifetime of a string literal is 'static (the complete type is &'static str), which means it will live until the end of the program. That's because strings are put as data into a special section of the compiled binary, so as long as the program is running, that data is also accessible.
"xyz" is &'static str, it is valid for the entire duration of the program.
string2 will always be dropped, but in first example string2 owns your data, in second it’s just a reference to ’static data, so the data won’t be invalidated after string2’s drop.
Related
I try to write a code sample since an hour, who add some spaces in a String. So the main borrow a String to a function who add spaces, and then ,I wanna get the string back in the main function.
fn main() {
...
let mut line = String::new();
line = make_spaces(5, &mut line);
}
fn make_spaces(number: u8, string: &mut String) -> &mut String {
for _ in 0..number {
string.push(' ');
}
string
}
But the borrow checker give me the following error :
error[E0308]: mismatched types
--> src/main.rs:14:12
|
14 | line = make_spaces(left_spaces, &mut line);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| expected struct `String`, found `&mut String`
| help: try using a conversion method: `make_spaces(left_spaces, &mut line).to_string()`
I'm new with rust, and I know I don't understand borrowing. But I searched the net, and I don't understand it any better.
As I understand, I give the line ownership to make_spaces function, and then I (try to) give the ownership of string back to the main function.
Could you tell me where I'm wrong what is the solution of this problem please ? I know it's a common issue, but I don't see anything simple about that on stackoverflow.
There are two ways to do what you're trying, but you're mixing the two:
Take a mutable parameter as a reference
fn main() {
let mut line = String::new();
make_spaces(5, &mut line);
}
fn make_spaces(number: u8, string: &mut String) {
for _ in 0..number {
string.push(' ');
}
}
In that case, the caller gives you a reference to a struct so that you can modify it in place. There is no need to return anything.
Move the parameter into the function, then return the modified value
fn main() {
let mut line = String::new();
line = make_spaces(5, line);
}
fn make_spaces(number: u8, mut string: String) -> String {
for _ in 0..number {
string.push(' ');
}
string
}
In that case, the string is moved into the make_spaces function, so you need to move it back to the caller by returning it.
You don't have to return anything: you received a reference to a struct the caller of the function already has.
Just do
fn main() {
let mut line = String::new();
make_spaces(5, &mut line);
// use line here
}
fn make_spaces(number: u8, string: &mut String) {
for _ in 0..number {
string.push(' ');
}
}
I'm learning Rust from the Book and I was tackling the exercises at the end of chapter 8, but I'm hitting a wall with the one about converting words into Pig Latin. I wanted to see specifically if I could pass a &mut String to a function that takes a &mut str (to also accept slices) and modify the referenced string inside it so the changes are reflected back outside without the need of a return, like in C with a char **.
I'm not quite sure if I'm just messing up the syntax or if it's more complicated than it sounds due to Rust's strict rules, which I have yet to fully grasp. For the lifetime errors inside to_pig_latin() I remember reading something that explained how to properly handle the situation but right now I can't find it, so if you could also point it out for me it would be very appreciated.
Also what do you think of the way I handled the chars and indexing inside strings?
use std::io::{self, Write};
fn main() {
let v = vec![
String::from("kaka"),
String::from("Apple"),
String::from("everett"),
String::from("Robin"),
];
for s in &v {
// cannot borrow `s` as mutable, as it is not declared as mutable
// cannot borrow data in a `&` reference as mutable
to_pig_latin(&mut s);
}
for (i, s) in v.iter().enumerate() {
print!("{}", s);
if i < v.len() - 1 {
print!(", ");
}
}
io::stdout().flush().unwrap();
}
fn to_pig_latin(mut s: &mut str) {
let first = s.chars().nth(0).unwrap();
let mut pig;
if "aeiouAEIOU".contains(first) {
pig = format!("{}-{}", s, "hay");
s = &mut pig[..]; // `pig` does not live long enough
} else {
let mut word = String::new();
for (i, c) in s.char_indices() {
if i != 0 {
word.push(c);
}
}
pig = format!("{}-{}{}", word, first.to_lowercase(), "ay");
s = &mut pig[..]; // `pig` does not live long enough
}
}
Edit: here's the fixed code with the suggestions from below.
fn main() {
// added mut
let mut v = vec![
String::from("kaka"),
String::from("Apple"),
String::from("everett"),
String::from("Robin"),
];
// added mut
for mut s in &mut v {
to_pig_latin(&mut s);
}
for (i, s) in v.iter().enumerate() {
print!("{}", s);
if i < v.len() - 1 {
print!(", ");
}
}
println!();
}
// converted into &mut String
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay");
} else {
// added code to make the new first letter uppercase
let second = s.chars().nth(1).unwrap();
*s = format!(
"{}{}-{}ay",
second.to_uppercase(),
// the slice starts at the third char of the string, as if &s[2..]
&s[first.len_utf8() * 2..],
first.to_lowercase()
);
}
}
I'm not quite sure if I'm just messing up the syntax or if it's more complicated than it sounds due to Rust's strict rules, which I have yet to fully grasp. For the lifetime errors inside to_pig_latin() I remember reading something that explained how to properly handle the situation but right now I can't find it, so if you could also point it out for me it would be very appreciated.
What you're trying to do can't work: with a mutable reference you can update the referee in-place, but this is extremely limited here:
a &mut str can't change length or anything of that matter
a &mut str is still just a reference, the memory has to live somewhere, here you're creating new Strings inside your function then trying to use these as the new backing buffers for the reference, which as the compiler tells you doesn't work: the String will be deallocated at the end of the function
What you could do is take an &mut String, that lets you modify the owned string itself in-place, which is much more flexible. And, in fact, corresponds exactly to your request: an &mut str corresponds to a char*, it's a pointer to a place in memory.
A String is also a pointer, so an &mut String is a double-pointer to a zone in memory.
So something like this:
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
*s = format!("{}-{}", s, "hay");
} else {
let mut word = String::new();
for (i, c) in s.char_indices() {
if i != 0 {
word.push(c);
}
}
*s = format!("{}-{}{}", word, first.to_lowercase(), "ay");
}
}
You can also likely avoid some of the complete string allocations by using somewhat finer methods e.g.
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay")
} else {
s.replace_range(first.len_utf8().., "");
write!(s, "-{}ay", first.to_lowercase()).unwrap();
}
}
although the replace_range + write! is not very readable and not super likely to be much of a gain, so that might as well be a format!, something along the lines of:
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay")
} else {
*s = format!("{}-{}ay", &s[first.len_utf8()..], first.to_lowercase());
}
}
I want a function that can take two arguments (string, number of letters to crop off front) and return the same string except with the letters before character x gone.
If I write
let mut example = "stringofletters";
CropLetters(example, 3);
println!("{}", example);
then the output should be:
ingofletters
Is there any way I can do this?
In many uses it would make sense to simply return a slice of the input, avoiding any copy. Converting #Shepmaster's solution to use immutable slices:
fn crop_letters(s: &str, pos: usize) -> &str {
match s.char_indices().skip(pos).next() {
Some((pos, _)) => &s[pos..],
None => "",
}
}
fn main() {
let example = "stringofletters"; // works with a String if you take a reference
let cropped = crop_letters(example, 3);
println!("{}", cropped);
}
Advantages over the mutating version are:
No copy is needed. You can call cropped.to_string() if you want a newly allocated result; but you don't have to.
It works with static string slices as well as mutable String etc.
The disadvantage is that if you really do have a mutable string you want to modify, it would be slightly less efficient as you'd need to allocate a new String.
Issues with your original code:
Functions use snake_case, types and traits use CamelCase.
"foo" is a string literal of type &str. These may not be changed. You will need something that has been heap-allocated, such as a String.
The call crop_letters(stringofletters, 3) would transfer ownership of stringofletters to the method, which means you wouldn't be able to use the variable anymore. You must pass in a mutable reference (&mut).
Rust strings are not ASCII, they are UTF-8. You need to figure out how many bytes each character requires. char_indices is a good tool here.
You need to handle the case of when the string is shorter than 3 characters.
Once you have the byte position of the new beginning of the string, you can use drain to move a chunk of bytes out of the string. We just drop these bytes and let the String move over the remaining bytes.
fn crop_letters(s: &mut String, pos: usize) {
match s.char_indices().nth(pos) {
Some((pos, _)) => {
s.drain(..pos);
}
None => {
s.clear();
}
}
}
fn main() {
let mut example = String::from("stringofletters");
crop_letters(&mut example, 3);
assert_eq!("ingofletters", example);
}
See Chris Emerson's answer if you don't actually need to modify the original String.
I found this answer which I don't consider really idiomatic:
fn crop_with_allocation(string: &str, len: usize) -> String {
string.chars().skip(len).collect()
}
fn crop_without_allocation(string: &str, len: usize) -> &str {
// optional length check
if string.len() < len {
return &"";
}
&string[len..]
}
fn main() {
let example = "stringofletters"; // works with a String if you take a reference
let cropped = crop_with_allocation(example, 3);
println!("{}", cropped);
let cropped = crop_without_allocation(example, 3);
println!("{}", cropped);
}
my version
fn crop_str(s: &str, n: usize) -> &str {
let mut it = s.chars();
for _ in 0..n {
it.next();
}
it.as_str()
}
#[test]
fn test_crop_str() {
assert_eq!(crop_str("123", 1), "23");
assert_eq!(crop_str("ЖФ1", 1), "Ф1");
assert_eq!(crop_str("ЖФ1", 2), "1");
}
Suppose I'm trying to do a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:
fn main() {
let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();
for t in v.iter_mut() {
if (t.contains("$world")) {
*t = &t.replace("$world", "Earth");
}
}
println!("{:?}", &v);
}
But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?
Rust has exactly what you want in form of a Cow (Clone On Write) type.
use std::borrow::Cow;
fn main() {
let mut v: Vec<_> = "Hello there $world!".split_whitespace()
.map(|s| Cow::Borrowed(s))
.collect();
for t in v.iter_mut() {
if t.contains("$world") {
*t.to_mut() = t.replace("$world", "Earth");
}
}
println!("{:?}", &v);
}
as #sellibitze correctly notes, the to_mut() creates a new String which causes a heap allocation to store the previous borrowed value. If you are sure you only have borrowed strings, then you can use
*t = Cow::Owned(t.replace("$world", "Earth"));
In case the Vec contains Cow::Owned elements, this would still throw away the allocation. You can prevent that using the following very fragile and unsafe code (It does direct byte-based manipulation of UTF-8 strings and relies of the fact that the replacement happens to be exactly the same number of bytes.) inside your for loop.
let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
let p = pos + last_pos; // find always starts at last_pos
last_pos = pos + 5;
unsafe {
let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
s.remove(p); // remove $ sign
for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
*sc = c;
}
}
}
Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mappings require careful consideration inside the unsafe code.
std::borrow::Cow, specifically used as Cow<'a, str>, where 'a is the lifetime of the string being parsed.
use std::borrow::Cow;
fn main() {
let mut v: Vec<Cow<'static, str>> = vec![];
v.push("oh hai".into());
v.push(format!("there, {}.", "Mark").into());
println!("{:?}", v);
}
Produces:
["oh hai", "there, Mark."]
I am trying to edit a string in place by passing it to mutate(), see below.
Simplified example:
fn mutate(string: &mut &str) -> &str {
string[0] = 'a'; // mutate string
string
}
fn do_something(string: &str) {
println!("{}", string);
}
fn main() {
let string = "Hello, world!";
loop {
string = mutate(&mut string);
do_something(string);
}
}
But I get the following compilation error:
main.rs:1:33: 1:37 error: missing lifetime specifier [E0106]
main.rs:1 fn mutate(string: &mut &str) -> &str {
^~~~
main.rs:1:33: 1:37 help: this function's return type contains a borrowed value, but the signature does not say which one of `string`'s 2 elided lifetimes it is borrowed from
main.rs:1 fn mutate(string: &mut &str) -> &str {
^~~~
Why do I get this error and how can I achieve what I want?
You can't change a string slice at all. &mut &str is not an appropriate type anyway, because it literally is a mutable pointer to an immutable slice. And all string slices are immutable.
In Rust strings are valid UTF-8 sequences, and UTF-8 is a variable-width encoding. Consequently, in general changing a character may change the length of the string in bytes. This can't be done with slices (because they always have fixed length) and it may cause reallocation for owned strings. Moreover, in 99% of cases changing a character inside a string is not what you really want.
In order to do what you want with unicode code points you need to do something like this:
fn replace_char_at(s: &str, idx: uint, c: char) -> String {
let mut r = String::with_capacity(s.len());
for (i, d) in s.char_indices() {
r.push(if i == idx { c } else { d });
}
r
}
However, this has O(n) efficiency because it has to iterate through the original slice, and it also won't work correctly with complex characters - it may replace a letter but leave an accent or vice versa.
More correct way for text processing is to iterate through grapheme clusters, it will take diacritics and other similar things correctly (mostly):
fn replace_grapheme_at(s: &str, idx: uint, c: &str) -> String {
let mut r = String::with_capacity(s.len());
for (i, g) in s.grapheme_indices(true) {
r.push_str(if i == idx { c } else { g });
}
r
}
There is also some support for pure ASCII strings in std::ascii module, but it is likely to be reformed soon. Anyway, that's how it could be used:
fn replace_ascii_char_at(s: String, idx: uint, c: char) -> String {
let mut ascii_s = s.into_ascii();
ascii_s[idx] = c.to_ascii();
String::from_utf8(ascii_s.into_bytes()).unwrap()
}
It will panic if either s contains non-ASCII characters or c is not an ASCII character.