I am following the Rust book to learn Rust and currently learning about ownership. Here it mentions,
We’ve already seen string literals, where a string value is hardcoded
into our program. String literals are convenient, but they aren’t
suitable for every situation in which we may want to use text. One
reason is that they’re immutable.
The code below runs without any problem, Here I changed the value of a, if immutability can be changed whats the problem stated there?
fn main() {
let mut a = "Hello";
println!("{}", a);
a = " World";
println!("{}",a);
}
The executable binary produced by the rust compiler contains the string literals "Hello" and "World" in the read-only data section rodata.
$ cargo build --release
$ readelf -x .rodata target/release/demo | grep Hello
0x0003c000 48656c6c 6f000000 0a576f72 6c640000 Hello....World..
Because these literals are placed in an immutable section, the operating system prohibits modifying them.
fn main() {
let mut a = "Hello";
println!("{}", a);
unsafe { (a.as_ptr() as *mut u8).write(42) };
println!("{}", a);
}
$ cargo run
Hello
Segmentation fault
However, the variable a is of type &str, so a pointer to a string slice, and it lives on the stack. Therefore, it is totally valid for a to first point to the address of "Hello", and then to the address of "World".
Related
I am currently trying to understand the difference between &str, str, and String in Rust. I am very new to the programming language and have been banging my head on this for a while. I get the idea that String has a length and pointer that is stored on the stack and that the pointer points to some data on the heap, which contains the string data. I also get that it is stored on the heap because we don't know how much memory its string data will take up at runtime and therefore it can't be stored on the stack. For str, I understand that it is a hardcoded value in binary, which means that it must be immutable and that the only way we can get to it is with a reference: &str. If an &str must be immutable. Then why doesn't the following code result in a compiler error? Please help. I have been searching the internet and this website for hours now and I can't find an answer.
let mut s: &str = "foo";
s = "foobar";
The str itself is immutable, but the &str is a reference to a string. When you do s = "foobar" you're making s point to a different string. Here's an example that should hopefully illustrate this... rust playground.
fn main() {
let mut s: &str = "foo";
let p = s;
s = "foobar";
println!("{:?}", p);
println!("{:?}", s);
}
The rust book on pointers might also be a helpful resource to understanding pointers.
In this example it's not the str that is mutable, it's the &.
s is storing a reference to some str, s = "foobar" stores a different reference to a different str at the same locaton s.
Note the difference from let s: &mut str = "foobar", which would allow for mutating the string slice even though s is not marked as mutable.
I have a couple of pieces of code, once errors out and the other doesn't, and I don't understand why.
The one that errors out when compiling:
fn main() {
let s1 = String::from("hello");
println!("{}", *s1);
}
This throws: doesn't have a size known at compile-time, on the line println!("{}", *s1);
The one that works:
fn main() {
let s1 = String::from("hello");
print_string(&s1);
}
fn print_string(s1: &String) {
println!("{}", *s1);
}
Why is this happening? Aren't both correct ways to access the string contents and printing them?
In the first snippet you’re dereferencing a String. This yields an str which is a dynamically sized type (sometimes called unsized types in older texts). DSTs are somewhat difficult to use directly
In the second snippet you’re dereferencing a &String, which yields a regular String, which is a normal sized type.
In both cases the dereference is completely useless, why are you even using one?
I saw in the Rust book that you can define two different variables with the same name:
let hello = "Hello";
let hello = "Goodbye";
println!("My variable hello contains: {}", hello);
This prints out:
My variable hello contains: Goodbye
What happens with the first hello? Does it get freed up? How could I access it?
I know it would be bad to name two variables the same, but if this happens by accident because I declare it 100 lines below it could be a real pain.
Rust does not have a garbage collector.
Does Rust free up the memory of overwritten variables?
Yes, otherwise it'd be a memory leak, which would be a pretty terrible design decision. The memory is freed when the variable is reassigned:
struct Noisy;
impl Drop for Noisy {
fn drop(&mut self) {
eprintln!("Dropped")
}
}
fn main() {
eprintln!("0");
let mut thing = Noisy;
eprintln!("1");
thing = Noisy;
eprintln!("2");
}
0
1
Dropped
2
Dropped
what happens with the first hello
It is shadowed.
Nothing "special" happens to the data referenced by the variable, other than the fact that you can no longer access it. It is still dropped when the variable goes out of scope:
struct Noisy;
impl Drop for Noisy {
fn drop(&mut self) {
eprintln!("Dropped")
}
}
fn main() {
eprintln!("0");
let thing = Noisy;
eprintln!("1");
let thing = Noisy;
eprintln!("2");
}
0
1
2
Dropped
Dropped
See also:
Is the resource of a shadowed variable binding freed immediately?
I know it would be bad to name two variables the same
It's not "bad", it's a design decision. I would say that using shadowing like this is a bad idea:
let x = "Anna";
println!("User's name is {}", x);
let x = 42;
println!("The tax rate is {}", x);
Using shadowing like this is reasonable to me:
let name = String::from(" Vivian ");
let name = name.trim();
println!("User's name is {}", name);
See also:
Why do I need rebinding/shadowing when I can have mutable variable binding?
but if this happens by accident because I declare it 100 lines below it could be a real pain.
Don't have functions that are so big that you "accidentally" do something. That's applicable in any programming language.
Is there a way of cleaning memory manually?
You can call drop:
eprintln!("0");
let thing = Noisy;
drop(thing);
eprintln!("1");
let thing = Noisy;
eprintln!("2");
0
Dropped
1
2
Dropped
However, as oli_obk - ker points out, the stack memory taken by the variable will not be freed until the function exits, only the resources taken by the variable.
All discussions of drop require showing its (very complicated) implementation:
fn drop<T>(_: T) {}
What if I declare the variable in a global scope outside of the other functions?
Global variables are never freed, if you can even create them to start with.
There is a difference between shadowing and reassigning (overwriting) a variable when it comes to drop order.
All local variables are normally dropped when they go out of scope, in reverse order of declaration (see The Rust Programming Language's chapter on Drop). This includes shadowed variables. It's easy to check this by wrapping the value in a simple wrapper struct that prints something when it (the wrapper) is dropped (just before the value itself is dropped):
use std::fmt::Debug;
struct NoisyDrop<T: Debug>(T);
impl<T: Debug> Drop for NoisyDrop<T> {
fn drop(&mut self) {
println!("dropping {:?}", self.0);
}
}
fn main() {
let hello = NoisyDrop("Hello");
let hello = NoisyDrop("Goodbye");
println!("My variable hello contains: {}", hello.0);
}
prints the following (playground):
My variable hello contains: Goodbye
dropping "Goodbye"
dropping "Hello"
That's because a new let binding in a scope does not overwrite the previous binding, so it's just as if you had written
let hello1 = NoisyDrop("Hello");
let hello2 = NoisyDrop("Goodbye");
println!("My variable hello contains: {}", hello2.0);
Notice that this behavior is different from the following, superficially very similar, code (playground):
fn main() {
let mut hello = NoisyDrop("Hello");
hello = NoisyDrop("Goodbye");
println!("My variable hello contains: {}", hello.0);
}
which not only drops them in the opposite order, but drops the first value before printing the message! That's because when you assign to a variable (instead of shadowing it with a new one), the original value gets dropped first, before the new value is moved in.
I began by saying that local variables are "normally" dropped when they go out of scope. Because you can move values into and out of variables, the analysis of figuring out when variables need to be dropped can sometimes not be done until runtime. In such cases, the compiler actually inserts code to track "liveness" and drop those values when necessary, so you can't accidentally cause leaks by overwriting a value. (However, it's still possible to safely leak memory by calling mem::forget, or by creating an Rc-cycle with internal mutability.)
See also
What's the semantic of assignment in Rust?
There are a few things to note here:
In the program you gave, when compiling it, the "Hello" string does not appear in the binary. This might be a compiler optimization because the first value is not used.
fn main(){
let hello = "Hello xxxxxxxxxxxxxxxx"; // Added for searching more easily.
let hello = "Goodbye";
println!("My variable hello contains: {}", hello);
}
Then test:
$ rustc ./stackoverflow.rs
$ cat stackoverflow | grep "xxx"
# No results
$ cat stackoverflow | grep "Goodbye"
Binary file (standard input) matches
$ cat stackoverflow | grep "My variable hello contains"
Binary file (standard input) matches
Note that if you print the first value, the string does appear in the binary though, so this proves that this is a compiler optimization to not store unused values.
Another thing to consider is that both values assigned to hello (i.e. "Hello" and "Goodbye") have a &str type. This is a pointer to a string stored statically in the binary after compiling. An example of a dynamically generated string would be when you generate a hash from some data, like MD5 or SHA algorithms (the resulting string does not exist statically in the binary).
fn main(){
// Added the type to make it more clear.
let hello: &str = "Hello";
let hello: &str = "Goodbye";
// This is wrong (does not compile):
// let hello: String = "Goodbye";
println!("My variable hello contains: {}", hello);
}
This means that the variable is simply pointing to a location in the static memory. No memory gets allocated during runtime, nor gets freed. Even if the optimization mentioned above didn't exist (i.e. omit unused strings), only the memory address location pointed by hello would change, but the memory is still used by static strings.
The story would be different for a String type, and for that refer to the other answers.
There are several questions that seem to be about the same problem I'm having. For example see here and here. Basically I'm trying to build a String in a local function, but then return it as a &str. Slicing isn't working because the lifetime is too short. I can't use str directly in the function because I need to build it dynamically. However, I'd also prefer not to return a String since the nature of the object this is going into is static once it's built. Is there a way to have my cake and eat it too?
Here's a minimal non-compiling reproduction:
fn return_str<'a>() -> &'a str {
let mut string = "".to_string();
for i in 0..10 {
string.push_str("ACTG");
}
&string[..]
}
No, you cannot do it. There are at least two explanations why it is so.
First, remember that references are borrowed, i.e. they point to some data but do not own it, it is owned by someone else. In this particular case the string, a slice to which you want to return, is owned by the function because it is stored in a local variable.
When the function exits, all its local variables are destroyed; this involves calling destructors, and the destructor of String frees the memory used by the string. However, you want to return a borrowed reference pointing to the data allocated for that string. It means that the returned reference immediately becomes dangling - it points to invalid memory!
Rust was created, among everything else, to prevent such problems. Therefore, in Rust it is impossible to return a reference pointing into local variables of the function, which is possible in languages like C.
There is also another explanation, slightly more formal. Let's look at your function signature:
fn return_str<'a>() -> &'a str
Remember that lifetime and generic parameters are, well, parameters: they are set by the caller of the function. For example, some other function may call it like this:
let s: &'static str = return_str();
This requires 'a to be 'static, but it is of course impossible - your function does not return a reference to a static memory, it returns a reference with a strictly lesser lifetime. Thus such function definition is unsound and is prohibited by the compiler.
Anyway, in such situations you need to return a value of an owned type, in this particular case it will be an owned String:
fn return_str() -> String {
let mut string = String::new();
for _ in 0..10 {
string.push_str("ACTG");
}
string
}
In certain cases, you are passed a string slice and may conditionally want to create a new string. In these cases, you can return a Cow. This allows for the reference when possible and an owned String otherwise:
use std::borrow::Cow;
fn return_str<'a>(name: &'a str) -> Cow<'a, str> {
if name.is_empty() {
let name = "ACTG".repeat(10);
name.into()
} else {
name.into()
}
}
You can choose to leak memory to convert a String to a &'static str:
fn return_str() -> &'static str {
let string = "ACTG".repeat(10);
Box::leak(string.into_boxed_str())
}
This is a really bad idea in many cases as the memory usage will grow forever every time this function is called.
If you wanted to return the same string every call, see also:
How to create a static string at compile time
The problem is that you are trying to create a reference to a string that will disappear when the function returns.
A simple solution in this case is to pass in the empty string to the function. This will explicitly ensure that the referred string will still exist in the scope where the function returns:
fn return_str(s: &mut String) -> &str {
for _ in 0..10 {
s.push_str("ACTG");
}
&s[..]
}
fn main() {
let mut s = String::new();
let s = return_str(&mut s);
assert_eq!("ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG", s);
}
Code in Rust Playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2499ded42d3ee92d6023161fe82e9b5f
This is an old question but a very common one. There are many answers but none of them addresses the glaring misconception people have about the strings and string slices, which stems from not knowing their true nature.
But lets start with the obvious question before addressing the implied one: Can we return a reference to a local variable?
What we are asking to achieve is the textbook definition of a dangling pointer. Local variables will be dropped when the function completes its execution. In other words they will be pop off the execution stack and any reference to the local variables then on will be pointing to some garbage data.
Best course of action is either returning the string or its clone. No need to obsess over the speed.
However, I believe the essence of the question is if there is a way to convert a String into an str? The answer is no and this is where the misconception lies:
You can not turn a String into an str by borrowing it. Because a String is heap allocated. If you take a reference to it, you still be using heap allocated data but through a reference. str, on the other hand, is stored directly in the data section of the executable file and it is static. When you take a reference to a string, you will get matching type signature for common string manipulations, not an actual &str.
You can check out this post for detailed explanation:
What are the differences between Rust's `String` and `str`?
Now, there may be a workaround for this particular use case if you absolutely use static text:
Since you use combinations of four bases A, C, G, T, in groups of four, you can create a list of all possible outcomes as &str and use them through some data structure. You will jump some hoops but certainly doable.
if it is possible to create the resulting STRING in a static way at compile time, this would be a solution without memory leaking
#[macro_use]
extern crate lazy_static;
fn return_str<'a>() -> &'a str {
lazy_static! {
static ref STRING: String = {
"ACTG".repeat(10)
};
}
&STRING
}
Yes you can - the method replace_range provides a work around -
let a = "0123456789";
//println!("{}",a[3..5]); fails - doesn't have a size known at compile-time
let mut b = String::from(a);
b.replace_range(5..,"");
b.replace_range(0..2,"");
println!("{}",b); //succeeds
It took blood sweat and tears to achieve this!
There are several questions that seem to be about the same problem I'm having. For example see here and here. Basically I'm trying to build a String in a local function, but then return it as a &str. Slicing isn't working because the lifetime is too short. I can't use str directly in the function because I need to build it dynamically. However, I'd also prefer not to return a String since the nature of the object this is going into is static once it's built. Is there a way to have my cake and eat it too?
Here's a minimal non-compiling reproduction:
fn return_str<'a>() -> &'a str {
let mut string = "".to_string();
for i in 0..10 {
string.push_str("ACTG");
}
&string[..]
}
No, you cannot do it. There are at least two explanations why it is so.
First, remember that references are borrowed, i.e. they point to some data but do not own it, it is owned by someone else. In this particular case the string, a slice to which you want to return, is owned by the function because it is stored in a local variable.
When the function exits, all its local variables are destroyed; this involves calling destructors, and the destructor of String frees the memory used by the string. However, you want to return a borrowed reference pointing to the data allocated for that string. It means that the returned reference immediately becomes dangling - it points to invalid memory!
Rust was created, among everything else, to prevent such problems. Therefore, in Rust it is impossible to return a reference pointing into local variables of the function, which is possible in languages like C.
There is also another explanation, slightly more formal. Let's look at your function signature:
fn return_str<'a>() -> &'a str
Remember that lifetime and generic parameters are, well, parameters: they are set by the caller of the function. For example, some other function may call it like this:
let s: &'static str = return_str();
This requires 'a to be 'static, but it is of course impossible - your function does not return a reference to a static memory, it returns a reference with a strictly lesser lifetime. Thus such function definition is unsound and is prohibited by the compiler.
Anyway, in such situations you need to return a value of an owned type, in this particular case it will be an owned String:
fn return_str() -> String {
let mut string = String::new();
for _ in 0..10 {
string.push_str("ACTG");
}
string
}
In certain cases, you are passed a string slice and may conditionally want to create a new string. In these cases, you can return a Cow. This allows for the reference when possible and an owned String otherwise:
use std::borrow::Cow;
fn return_str<'a>(name: &'a str) -> Cow<'a, str> {
if name.is_empty() {
let name = "ACTG".repeat(10);
name.into()
} else {
name.into()
}
}
You can choose to leak memory to convert a String to a &'static str:
fn return_str() -> &'static str {
let string = "ACTG".repeat(10);
Box::leak(string.into_boxed_str())
}
This is a really bad idea in many cases as the memory usage will grow forever every time this function is called.
If you wanted to return the same string every call, see also:
How to create a static string at compile time
The problem is that you are trying to create a reference to a string that will disappear when the function returns.
A simple solution in this case is to pass in the empty string to the function. This will explicitly ensure that the referred string will still exist in the scope where the function returns:
fn return_str(s: &mut String) -> &str {
for _ in 0..10 {
s.push_str("ACTG");
}
&s[..]
}
fn main() {
let mut s = String::new();
let s = return_str(&mut s);
assert_eq!("ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG", s);
}
Code in Rust Playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2499ded42d3ee92d6023161fe82e9b5f
This is an old question but a very common one. There are many answers but none of them addresses the glaring misconception people have about the strings and string slices, which stems from not knowing their true nature.
But lets start with the obvious question before addressing the implied one: Can we return a reference to a local variable?
What we are asking to achieve is the textbook definition of a dangling pointer. Local variables will be dropped when the function completes its execution. In other words they will be pop off the execution stack and any reference to the local variables then on will be pointing to some garbage data.
Best course of action is either returning the string or its clone. No need to obsess over the speed.
However, I believe the essence of the question is if there is a way to convert a String into an str? The answer is no and this is where the misconception lies:
You can not turn a String into an str by borrowing it. Because a String is heap allocated. If you take a reference to it, you still be using heap allocated data but through a reference. str, on the other hand, is stored directly in the data section of the executable file and it is static. When you take a reference to a string, you will get matching type signature for common string manipulations, not an actual &str.
You can check out this post for detailed explanation:
What are the differences between Rust's `String` and `str`?
Now, there may be a workaround for this particular use case if you absolutely use static text:
Since you use combinations of four bases A, C, G, T, in groups of four, you can create a list of all possible outcomes as &str and use them through some data structure. You will jump some hoops but certainly doable.
if it is possible to create the resulting STRING in a static way at compile time, this would be a solution without memory leaking
#[macro_use]
extern crate lazy_static;
fn return_str<'a>() -> &'a str {
lazy_static! {
static ref STRING: String = {
"ACTG".repeat(10)
};
}
&STRING
}
Yes you can - the method replace_range provides a work around -
let a = "0123456789";
//println!("{}",a[3..5]); fails - doesn't have a size known at compile-time
let mut b = String::from(a);
b.replace_range(5..,"");
b.replace_range(0..2,"");
println!("{}",b); //succeeds
It took blood sweat and tears to achieve this!