Why use immutable Vector or String in Rust

I'm learning Rust and have learned that, to make an expandable string or array, the String and Vec structs are used. To modify a String or Vec, the corresponding variable needs to be declared with mut.
let mut myString = String::from("Hello");
let mut myArray = vec![1, 2, 3];
My question is, why would you use or declare an immutable String or Vec like
let myString = String::from("Hello");
let myArray = vec![1, 2, 3];
instead of a true array or str literal like
let myString = "Hello";
let myArray = [1, 2, 3];
What would be the point of that? Does it have any benefits? And what other use cases might there be for immutable Strings and Vecs?
Edit:
Either I am completely missing something obvious or my question isn't fully understood. I get why you want to use a mutable String or Vec over the str literal or an array, since the latter are immutable. But why would one use an immutable String or Vec over the str literal or the array (which are also immutable)?

You might then do something with that string, for example use it to populate a struct:
struct MyStruct {
field: String,
}
...
let s = String::from("hello");
let mut mystruct = MyStruct { field: s };
You could also, for example, return it from a function.
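As a minimal sketch of both points (the function name make_greeting and its contents are made up for illustration): a String that is built at run time cannot be a literal, and it has to be owned in order to be returned or moved into a struct, yet it never needs mut.

```rust
struct MyStruct {
    field: String,
}

// Hypothetical helper: builds text at run time, so the result cannot be a
// string literal and has to be returned as an owned String.
fn make_greeting(name: &str) -> String {
    format!("hello, {}", name)
}

fn main() {
    // Never modified after creation, so no `mut` is needed.
    let s = make_greeting("world");
    // Ownership of the String moves into the struct.
    let my_struct = MyStruct { field: s };
    println!("{}", my_struct.field);
}
```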

One use case could be the following:
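use std::ffi::CStr;
use std::os::raw::c_char;
fn main() {
    // "HELLO" followed by the terminating NUL byte of a C string.
    let text: [u8; 6] = [0x48, 0x45, 0x4c, 0x4c, 0x4f, 0x00];
    // SAFETY: `text` is NUL-terminated and outlives the borrowed CStr.
    let converted = unsafe { CStr::from_ptr(text.as_ptr() as *const c_char) }.to_string_lossy().to_string();
    println!("{}", converted);
}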
This is a bit contrived, but imagine you want to convert a null-terminated C string (which might come from the network) from a raw pointer into a Rust string, without knowing whether it contains invalid UTF-8. to_string_lossy() returns a Cow<str>. If the bytes are valid UTF-8, the Cow simply borrows the original bytes. Otherwise it allocates a new String in which each invalid sequence is replaced by the Unicode replacement character (U+FFFD), and owns that instead.
This is quite nice because (presumably) most of the time you get away without copying the original C string, and a copy is made only when the input is invalid. But if you don't care about that and don't want to work with a Cow, it makes sense to convert it to a String, which doesn't need to be mutable afterwards. A &str is not an option here, because one cannot exist if the original text contains invalid UTF-8.
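The borrowed and owned cases can be observed directly. The following is only a sketch: it uses the safe CStr::from_bytes_with_nul constructor instead of a raw pointer, and the helper names is_borrowed and lossy_owned are made up for illustration.

```rust
use std::borrow::Cow;
use std::ffi::CStr;

// Hypothetical helper: true if to_string_lossy() could borrow the input bytes.
// Panics if `bytes` is not a single NUL-terminated C string.
fn is_borrowed(bytes: &[u8]) -> bool {
    let c_str = CStr::from_bytes_with_nul(bytes).unwrap();
    matches!(c_str.to_string_lossy(), Cow::Borrowed(_))
}

// Hypothetical helper: always ends up with an owned (immutable) String.
fn lossy_owned(bytes: &[u8]) -> String {
    let c_str = CStr::from_bytes_with_nul(bytes).unwrap();
    c_str.to_string_lossy().into_owned()
}

fn main() {
    // Valid UTF-8: nothing needs to be copied.
    assert!(is_borrowed(b"HELLO\0"));
    // 0xFF can never appear in valid UTF-8, so a new String is allocated.
    assert!(!is_borrowed(b"HE\xFFLO\0"));

    let converted = lossy_owned(b"HE\xFFLO\0");
    assert_eq!(converted, "HE\u{FFFD}LO");
    println!("{}", converted);
}
```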

Related

Why can I declare a str variable as mutable if Rust str is immutable?

According to: https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/book/first-edition/strings.html
rust str is immutable, and cannot be used when mutability is required.
However, the following program compiles and works
fn main() {
let mut mystr = "foo";
mystr = "bar";
{
mystr = "baz";
}
println!("{:?}", mystr);
}
Can someone explain the mutability of str in Rust?
I expect let mut mystr = "foo"; to result in compilation error since str in Rust is immutable. But it compiles.
You did not change the string itself. &str is basically (*const u8, usize): a pointer to the buffer and a length. When you mutate a variable of type &str, you're just replacing one pointer with another, not mutating the original buffer. Immutability of a string literal means that the buffer is actually linked into your binary (and, as I remember, is placed in .rodata), so you cannot change its contents. To actually mutate a string, use a heap-allocated one: String.
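A small sketch of this, under the assumption that a &str is represented as two machine words (true of current compilers, though not something to rely on in unsafe code). The helper name str_ref_words is made up for illustration.

```rust
use std::mem::size_of;

// Hypothetical helper: how many machine words a `&str` occupies
// (one for the pointer, one for the length).
fn str_ref_words() -> usize {
    size_of::<&str>() / size_of::<usize>()
}

fn main() {
    assert_eq!(str_ref_words(), 2);

    let mut s: &str = "foo";
    let before = s.as_ptr();
    // Replaces the pointer and length; the bytes of "foo" are untouched.
    s = "bar";
    assert_ne!(before, s.as_ptr());

    // Changing string contents requires an owned, mutable buffer.
    let mut owned = String::from("foo");
    owned.push_str("bar");
    assert_eq!(owned, "foobar");
    println!("{} {}", s, owned);
}
```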

How to change a String into a Vec and can also modify Vec's value in Rust?

I want to change a String into a vector of bytes and also modify its values. I have looked this up and found How do I convert a string into a vector of bytes in rust?
but this can only get a reference and I cannot modify the vector. I want a to be 0, b to be 1 and so on, so after changing it into bytes I also need to subtract 97. Here is my attempt:
fn main() {
let s: String = "abcdefg".to_string();
let mut vec = (s.as_bytes()).clone();
println!("{:?}", vec);
for i in 0..vec.len() {
vec[i] -= 97;
}
println!("{:?}", vec);
}
but the compiler says
error[E0594]: cannot assign to `vec[_]`, which is behind a `&` reference
Can anyone help me to fix this?
You could get a Vec<u8> out of the String with the into_bytes method. An even better way, though, may be to iterate over the String's bytes with the bytes method, do the maths on the fly, and then collect the result:
fn main() {
let s = "abcdefg";
let vec: Vec<u8> = s.bytes().map(|b| b - b'a').collect();
println!("{:?}", vec); // [0, 1, 2, 3, 4, 5, 6]
}
But as @SvenMarnach correctly points out, this won't re-use s's buffer but will allocate a new one. So, unless you need s again, the into_bytes method will be more efficient.
Strings in Rust are encoded in UTF-8. The (safe) interface of the String type enforces that the underlying buffer is always valid UTF-8, so it can't allow direct arbitrary byte modifications. However, you can convert a String into a Vec<u8> using the into_bytes() method. You can then modify the vector, and potentially convert it back to a string using String::from_utf8() if desired. The last step will verify that the buffer is still valid UTF-8, and will fail if it isn't.
Instead of modifying the bytes of the string, you could also consider modifying the characters, which are potentially encoded by multiple bytes in the UTF-8 encoding. You can iterate over the characters of the string using the chars() method, convert each character to whatever you want, and then collect into a new string, or alternatively into a vector of integers, depending on your needs.
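Both approaches might look roughly like this. It is a sketch that assumes lowercase ASCII input (anything below 'a' would underflow), and the function names into_indices and char_indices are made up for illustration.

```rust
// Consumes the String and reuses its buffer as a Vec<u8>.
// Assumes lowercase ASCII input.
fn into_indices(s: String) -> Vec<u8> {
    let mut bytes = s.into_bytes();
    for b in bytes.iter_mut() {
        *b -= b'a';
    }
    bytes
}

// Character-based alternative: maps each char and collects into integers.
// Also assumes every character is at or above 'a'.
fn char_indices(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32 - 'a' as u32).collect()
}

fn main() {
    let indices = into_indices(String::from("abcdefg"));
    println!("{:?}", indices);

    // Going back to a String re-validates the bytes as UTF-8.
    let restored: Vec<u8> = indices.iter().map(|i| i + b'a').collect();
    assert_eq!(String::from_utf8(restored).unwrap(), "abcdefg");
    // A lone 0xFF byte is not valid UTF-8, so this conversion fails.
    assert!(String::from_utf8(vec![0xff]).is_err());

    println!("{:?}", char_indices("abc"));
}
```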
To understand what's going on, check the type of the vec variable. If you don't have an IDE/editor that can display the type to you, you can do this:
let mut vec: () = (s.as_bytes()).clone();
The resulting error message is self-explanatory:
3 | let mut vec: () = (s.as_bytes()).clone();
| -- ^^^^^^^^^^^^^^^^^^^^^^ expected `()`, found `&[u8]`
So, what's happening is that the .clone() simply cloned the reference returned by as_bytes(), and didn't create a Vec<u8> from the &[u8]. In general, you can use .to_owned() in this kind of case, but in this specific case, using .into_bytes() on the String is best.
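For the case where the original String is still needed afterwards, a sketch using .to_owned() could look like this (the function name to_indices is made up, and lowercase ASCII input is assumed):

```rust
// Copies the bytes of `s` into a new Vec<u8> and maps 'a' -> 0, 'b' -> 1, ...
// Assumes lowercase ASCII input; anything below b'a' would underflow.
fn to_indices(s: &str) -> Vec<u8> {
    // to_owned() on a &[u8] yields a Vec<u8>, unlike clone() on the reference.
    let mut vec: Vec<u8> = s.as_bytes().to_owned();
    for b in vec.iter_mut() {
        *b -= b'a';
    }
    vec
}

fn main() {
    let s = String::from("abcdefg");
    let vec = to_indices(&s);
    println!("{:?}", vec);
    // `s` is still usable because only the copy was modified.
    println!("{}", s);
}
```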

Rust: If &str is supposed to be a pointer to hardcoded binary then why can I make it mutable and change it during runtime?

I am currently trying to understand the difference between &str, str, and String in Rust. I am very new to the programming language and have been banging my head on this for a while. I get the idea that String has a length and pointer that are stored on the stack and that the pointer points to some data on the heap, which contains the string data. I also get that it is stored on the heap because we don't know how much memory its string data will take up at runtime and therefore it can't be stored on the stack. For str, I understand that it is a hardcoded value in the binary, which means that it must be immutable and that the only way we can get to it is with a reference: &str. If a &str must be immutable, then why doesn't the following code result in a compiler error? Please help. I have been searching the internet and this website for hours now and I can't find an answer.
let mut s: &str = "foo";
s = "foobar";
The str itself is immutable, but the &str is a reference to a string. When you do s = "foobar" you're making s point to a different string. Here's an example that should hopefully illustrate this:
fn main() {
let mut s: &str = "foo";
let p = s;
s = "foobar";
println!("{:?}", p);
println!("{:?}", s);
}
The rust book on pointers might also be a helpful resource to understanding pointers.
In this example it's not the str that is mutable, it's the &.
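s stores a reference to some str; s = "foobar" stores a different reference, to a different str, in the same location s.
Note the difference from a binding of type &mut str (for example let s: &mut str = owned.as_mut_str(), where owned is a hypothetical mutable String), which would allow mutating the string slice in place even though s itself is not marked as mutable. A string literal cannot be bound this way, because literals have type &'static str.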
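A rough sketch of the two kinds of mutability side by side. The function name shout is made up; the &mut str is borrowed from a String, since a literal cannot provide one.

```rust
// Hypothetical helper: uppercases ASCII letters in place. The length never
// changes, so a `&mut str` is enough and nothing is allocated.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

fn main() {
    let mut owned = String::from("foobar");
    {
        // `s` itself is not `mut`, but the data behind it can be changed.
        let s: &mut str = owned.as_mut_str();
        shout(s);
    }
    assert_eq!(owned, "FOOBAR");

    // The opposite case: a `mut` binding to immutable data.
    let mut r: &str = "foo";
    println!("{}", r);
    r = "foobar"; // rebinding only, no string data is modified
    println!("{} {}", owned, r);
}
```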

How can I append a char or &str to a String without first converting it to String?

I am attempting to write a lexer for fun, however something keeps bothering me.
let mut chars: Vec<char> = Vec::new();
let mut contents = String::new();
let mut tokens: Vec<&String> = Vec::new();
let mut append = String::new();
//--snip--
for _char in chars {
append += &_char.to_string();
append = append.trim().to_string();
if append.contains("print") {
println!("print found at: \n{}", append);
append = "".to_string();
}
}
Any time I want to do something as simple as append a &str to a String I have to convert it using .to_string, String::from(), .to_owned, etc.
Is there something I am doing wrong, so that I don't have to constantly do this, or is this the primary way of appending?
If you're trying to do something with a type, check the documentation. From the documentation for String:
push: "Appends the given char to the end of this String."
push_str: "Appends a given string slice onto the end of this String."
It's important to understand the differences between String and &str, and why different methods accept and return each of them.
A &str or &mut str are usually preferred in function arguments and return types. That's because they are just pointers to data so nothing needs to be copied or moved when they are passed around.
A String is returned when a function needs to do some new allocation, while &str and &mut str are slices into an existing String. Even though &mut str is mutable, you can't mutate it in a way that increases its length because that would require additional allocation.
The trim function is able to return a &str slice because that doesn't involve mutating the original string - a trimmed string is just a substring, which a slice perfectly describes. But sometimes that isn't possible; for example, a function that pads a string with an extra character would have to return a String because it would be allocating new memory.
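The difference shows up in the signatures. A minimal sketch, with trimmed and padded as made-up function names:

```rust
// Trimming only narrows the view, so a borrowed slice of the input is enough.
fn trimmed(s: &str) -> &str {
    s.trim()
}

// Padding produces text that did not exist before, so a new String is allocated.
fn padded(s: &str, pad: char) -> String {
    let mut out = String::with_capacity(s.len() + pad.len_utf8());
    out.push_str(s);
    out.push(pad);
    out
}

fn main() {
    let original = String::from("  print  ");
    let t = trimmed(&original);
    assert_eq!(t, "print");
    assert_eq!(padded(t, '!'), "print!");
    println!("{:?} {:?}", t, padded(t, '!'));
}
```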
You can reduce the number of type conversions in your code by choosing different methods:
for c in chars {
append.push(c); // append += &_char.to_string();
append = append.trim().to_string();
if append.contains("print") {
println!("print found at: \n{}", append);
append.clear(); // append = "".to_string();
}
}
There isn't anything like a trim_in_place method for String, so the way you have done it is probably the only way.
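For reference, here is a self-contained version of that loop using push and clear. The wrapper function count_prints and its return value are assumptions added so the snippet can run on its own; the loop body follows the code above.

```rust
// Hypothetical wrapper around the loop: counts how often "print" is found.
fn count_prints(input: &str) -> usize {
    let mut append = String::new();
    let mut found = 0;
    for c in input.chars() {
        append.push(c); // takes a char directly, no conversion needed
        append = append.trim().to_string();
        if append.contains("print") {
            println!("print found at: \n{}", append);
            found += 1;
            append.clear(); // keeps the allocation, unlike "".to_string()
        }
    }
    found
}

fn main() {
    let found = count_prints("print x print");
    println!("found {} occurrence(s)", found);
}
```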

How to convert a String into a &'static str

How do I convert a String into a &str? More specifically, I would like to convert it into a str with the static lifetime (&'static str).
Updated for Rust 1.0
You cannot obtain a &'static str from a String because a String may not live for the entire life of your program, and that's what the 'static lifetime means. You can only get a slice parameterized by the String's own lifetime from it.
To go from a String to a slice &'a str you can use slicing syntax:
let s: String = "abcdefg".to_owned();
let s_slice: &str = &s[..]; // take a full slice of the string
Alternatively, you can use the fact that String implements Deref<Target=str> and perform an explicit reborrowing:
let s_slice: &str = &*s; // s   : String
                         // *s  : str (via Deref<Target=str>)
                         // &*s : &str
There is even another way which allows for even more concise syntax but it can only be used if the compiler is able to determine the desired target type (e.g. in function arguments or explicitly typed variable bindings). It is called deref coercion and it allows using just & operator, and the compiler will automatically insert an appropriate amount of *s based on the context:
let s_slice: &str = &s; // okay
fn take_name(name: &str) { ... }
take_name(&s); // okay as well
let not_correct = &s; // this will give &String, not &str,
                      // because the compiler does not know
                      // that you want a &str
Note that this pattern is not unique for String/&str - you can use it with every pair of types which are connected through Deref, for example, with CString/CStr and OsString/OsStr from std::ffi module or PathBuf/Path from std::path module.
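Put together as a runnable sketch (name_len and has_extension are made-up functions, used only to give the coercion a target type):

```rust
use std::path::{Path, PathBuf};

// Hypothetical functions that only need borrowed views of their arguments.
fn name_len(name: &str) -> usize {
    name.len()
}

fn has_extension(path: &Path) -> bool {
    path.extension().is_some()
}

fn main() {
    let s: String = "abcdefg".to_owned();

    let a: &str = &s[..]; // full slice
    let b: &str = &*s;    // explicit reborrow through Deref
    let c: &str = &s;     // deref coercion, driven by the type annotation
    assert_eq!(a, b);
    assert_eq!(b, c);

    // Deref coercion in a function argument: &String -> &str.
    assert_eq!(name_len(&s), 7);

    // The same pattern with PathBuf/Path: &PathBuf -> &Path.
    let path = PathBuf::from("notes.txt");
    assert!(has_extension(&path));
}
```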
You can do it, but it involves leaking the memory of the String. This is not something you should do lightly. By leaking the memory of the String, we guarantee that the memory will never be freed (thus the leak). Therefore, any references to the inner object can be interpreted as having the 'static lifetime.
fn string_to_static_str(s: String) -> &'static str {
Box::leak(s.into_boxed_str())
}
fn main() {
let mut s = String::new();
std::io::stdin().read_line(&mut s).unwrap();
let s: &'static str = string_to_static_str(s);
}
As of Rust version 1.26, it is possible to convert a String to &'static str without using unsafe code:
fn string_to_static_str(s: String) -> &'static str {
Box::leak(s.into_boxed_str())
}
This converts the String instance into a boxed str and immediately leaks it. This frees all excess capacity the string may currently occupy.
Note that there are almost always solutions that are preferable over leaking objects, e.g. using the crossbeam crate if you want to share state between threads.
TL;DR: you can get a &'static str from a String which itself has a 'static lifetime.
Although the other answers are correct and most useful, there's a (not so useful) edge case, where you can indeed convert a String to a &'static str:
The lifetime of a reference must always be shorter than or equal to the lifetime of the referenced object, i.e. the referenced object has to live at least as long as the reference. Since 'static means the entire lifetime of a program, a longer lifetime does not exist. But an equal lifetime is sufficient. So if a String has a lifetime of 'static, you can get a &'static str reference from it.
Creating a static of type String theoretically became possible with Rust 1.31, when the const fn feature was released. At the time this answer was written, the only const function returning a String was String::new(), and it was still behind a feature gate (so Rust nightly was required); it has since been stabilized as a const fn in Rust 1.39.
So the following code does the desired conversion (using nightly at the time) ... and actually has no practical use beyond showing, for completeness, that it is possible in this edge case.
#![feature(const_string_new)]
static MY_STRING: String = String::new();
fn do_something(_: &'static str) {
// ...
}
fn main() {
do_something(&MY_STRING);
}
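On a recent stable toolchain the feature attribute should no longer be needed, assuming String::new() is a stable const fn there (it has been since Rust 1.39, to the best of my knowledge). A sketch of the same edge case without the feature gate, with as_static_str as a made-up helper:

```rust
// Assumption: String::new() can be called in a `static` initializer on stable Rust.
static MY_STRING: String = String::new();

// Hypothetical helper: borrows the static String as a &'static str.
fn as_static_str() -> &'static str {
    &MY_STRING
}

fn do_something(s: &'static str) -> usize {
    s.len()
}

fn main() {
    // The static is always empty, which is why this has no practical use.
    assert_eq!(do_something(as_static_str()), 0);
    assert_eq!(do_something(&MY_STRING), 0);
}
```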
