Confused about ownership in situations involving lines and map - rust

fn problem() -> Vec<&'static str> {
let my_string = String::from("First Line\nSecond Line");
my_string.lines().collect()
}
This fails with the compilation error:
|
7 | my_string.lines().collect()
| ---------^^^^^^^^^^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `my_string` is borrowed here
I understand what this error means - it's to stop you returning a reference to a value which has gone out of scope. Having looked at the type signatures of the functions involved, it appears that the problem is with the lines method, which borrows the string it's called on. But why does this matter? I'm iterating over the lines of the string in order to get a vector of the parts, and what I'm returning is this "new" vector, not anything that would (illegally) directly reference my_string.
(I'm aware I could fix this particular example very easily by just using the string literal rather than converting to an "owned" string with String::from. This is a toy example to reproduce the problem - in my "real" code the string variable is read from a file, so I obviously can't use a literal.)
What's even more mysterious to me is that the following variation on the function, which to me ought to suffer from the same problem, works fine:
fn this_is_ok() -> Vec<i32> {
let my_string = String::from("1\n2\n3\n4");
my_string.lines().map(|n| n.parse().unwrap()).collect()
}
The reason can't be map doing some magic, because this also fails:
fn also_fails() -> Vec<&'static str> {
let my_string = String::from("First Line\nSecond Line");
my_string.lines().map(|s| s).collect()
}
I've been playing about for quite a while, trying various different functions inside the map - and some pass and some fail, and I've honestly no idea what the difference is. And all this is making me realise that I have very little handle on how Rust's ownership/borrowing rules work in non-trivial cases, even though I thought I at least understood the basics. So if someone could give me a relatively clear and comprehensive guide to what is going on in all these examples, and how it might be possible to fix those which fail, in some straightforward way, I would be extremely grateful!

The key is in the type of the value yielded by lines: &str. In order to avoid unnecessary clones, lines returns references to slices of the string it's called on, and when you collect it to a Vec, that Vec's elements are simply references to slices of your string. So when your function exits and the string is dropped, the references inside the Vec would be left dangling, which is why the compiler rejects the code. Remember, &str is a borrowed string, and String is an owned string.
The parsing version works because each &str is parsed into an i32, which is a new, owned value; once that is done, nothing in the result refers to the original string.
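As a rough sketch of that difference (the function name `parsed` and the intermediate variables are only illustrative), the borrowed slices can exist inside the function as long as only the owned values leave it:

```rust
// Illustrative only: `parsed` is a hypothetical name, not from the question.
fn parsed() -> Vec<i32> {
    let my_string = String::from("1\n2\n3\n4");
    // These `&str` values borrow from `my_string`; that is fine while it is alive.
    let borrowed: Vec<&str> = my_string.lines().collect();
    // `parse` builds a new `i32` from each slice, so `owned` borrows nothing.
    let owned: Vec<i32> = borrowed.iter().map(|s| s.parse().unwrap()).collect();
    // `borrowed` and `my_string` are dropped on return; `owned` is independent of both.
    owned
}

fn main() {
    println!("{:?}", parsed());
}
```

The return type `Vec<i32>` contains no reference, so the borrow checker has nothing to tie back to `my_string`.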
To fix your problem, simply use str::to_owned to convert each element into a String:
fn problem() -> Vec<String> {
let my_string = String::from("First Line\nSecond Line");
my_string.lines().map(|v| v.to_owned()).collect()
}
It should be noted that to_string also works, and that to_owned is actually part of the ToOwned trait, so it is useful for other borrowed types as well.
For references to sized values (str is unsized so this doesn't apply), such as an Iterator<Item = &i32>, you can simply use Iterator::cloned to clone every element so they are no longer references.
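A minimal sketch of the `Iterator::cloned` case, assuming a hypothetical function `evens` that builds its result from a local vector:

```rust
// Hypothetical helper: returns owned `i32` values taken from a local Vec.
fn evens() -> Vec<i32> {
    let numbers = vec![1, 2, 3, 4];
    // `iter()` yields `&i32`; `cloned()` turns each item into an owned `i32`,
    // so the collected Vec no longer borrows from `numbers`.
    numbers.iter().filter(|n| **n % 2 == 0).cloned().collect()
}

fn main() {
    println!("{:?}", evens());
}
```

Without `cloned()`, the function would have to return `Vec<&i32>`, and it would fail for the same reason as the `lines` example.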
An alternative solution would be to take the string as a borrowed argument (&str), so that the caller owns it and it, and therefore references into it, can outlive the function call:
fn problem(my_string: &str) -> Vec<&str> {
my_string.lines().collect()
}
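A possible way to call this version, as a sketch: the `contents` variable below stands in for data read from a file and is an assumption, not part of the original question.

```rust
fn problem(my_string: &str) -> Vec<&str> {
    my_string.lines().collect()
}

fn main() {
    // Stand-in for file contents; the caller owns the String.
    let contents = String::from("First Line\nSecond Line");
    // The returned slices borrow from `contents`, which outlives the call.
    let lines = problem(&contents);
    println!("{:?}", lines);
}
```

The returned Vec can be used for as long as `contents` is alive and not modified.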

The problem here is that this line:
let my_string = String::from("First Line\nSecond Line");
copies the string data to a buffer allocated on the heap (so no longer 'static). Then lines returns references to that heap-allocated buffer.
Note that &str also implements a lines method, so you don't need to copy the string data to the heap, you can use your string directly:
fn problem() -> Vec<&'static str> {
let my_string = "First Line\nSecond Line";
my_string.lines().collect()
}
which avoids all unnecessary allocations and copying.

Related

Why does a &str not coerce to a &String when using Vec::contains?

A friend asked me to explain the following quirk in Rust. I was unable to, hence this question:
fn main() {
let l: Vec<String> = Vec::new();
//let ret = l.contains(&String::from(func())); // works
let ret = l.contains(func()); // does not work
println!("ret: {}", ret);
}
fn func() -> & 'static str {
"hello"
}
The compiler will complain like this:
error[E0308]: mismatched types
--> src/main.rs:4:26
|
4 | let ret = l.contains(func()); // does not work
| ^^^^^^ expected struct `std::string::String`, found str
|
= note: expected type `&std::string::String`
found type `&'static str`
In other words, &str does not coerce with &String.
At first I thought it was to do with 'static, however that is a red herring.
The commented line fixes the example at the cost of an extra allocation.
My questions:
Why doesn't &str coerce with &String?
Is there a way to make the call to contains work without the extra allocation?
Your first question should already be answered by @Marko.
Your second question is easy to answer as well; just use Iterator::any with a closure:
let ret = l.iter().any(|x| x == func());
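Wrapped into a complete program, this might look like the following sketch; the helper name `contains_str` is made up for illustration.

```rust
fn func() -> &'static str {
    "hello"
}

// Hypothetical helper: checks a slice of Strings for a &str without allocating.
fn contains_str(haystack: &[String], needle: &str) -> bool {
    // `s` is `&String` and `needle` is `&str`; String implements PartialEq<str>,
    // so the comparison works without building a temporary String.
    haystack.iter().any(|s| s == needle)
}

fn main() {
    let l: Vec<String> = vec![String::from("hello"), String::from("world")];
    let ret = contains_str(&l, func());
    println!("ret: {}", ret);
}
```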
Edit:
Not the "real" answer anymore, but I'll leave this here for people who might be interested in a solution.
It seems that the Rust developers intend to adjust the signature of contains to allow the example posted above to work.
In some sense, this is a known bug in contains. It sounds like the fix won't allow those types to coerce, but will allow the above example to work.
std::string::String is a growable, heap-allocated data structure, whereas a string slice (str) is an immutable, fixed-length string somewhere in memory. A string slice is used as a borrowed type, via &str. Think of it as a view into string data that resides somewhere in memory. That is why it does not make sense for &str to coerce to &String, while the other way around makes perfect sense: you have a heap-allocated String somewhere in memory and you want a view (a string slice) into it.
To answer your second question: there is no way to make the code work in its current form. You either need to change to a vector of string slices (that way, there will be no extra allocation) or use something other than the contains method.
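A sketch of the vector-of-slices variant, assuming the data can be stored as `&str` in the first place; the helper `has` is hypothetical:

```rust
fn func() -> &'static str {
    "hello"
}

// Hypothetical helper: with `Vec<&str>`, `contains` expects a `&&str`.
fn has(l: &[&str], needle: &str) -> bool {
    l.contains(&needle)
}

fn main() {
    let l: Vec<&str> = vec!["hello", "world"];
    let ret = has(&l, func());
    println!("ret: {}", ret);
}
```

Here the element type is `&str`, so the argument to `contains` is a reference to a `&str` and no String is allocated.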

Cannot push string to vec<&str> [duplicate]

There are several questions that seem to be about the same problem I'm having. For example see here and here. Basically I'm trying to build a String in a local function, but then return it as a &str. Slicing isn't working because the lifetime is too short. I can't use str directly in the function because I need to build it dynamically. However, I'd also prefer not to return a String since the nature of the object this is going into is static once it's built. Is there a way to have my cake and eat it too?
Here's a minimal non-compiling reproduction:
fn return_str<'a>() -> &'a str {
let mut string = "".to_string();
for i in 0..10 {
string.push_str("ACTG");
}
&string[..]
}
No, you cannot do it. There are at least two explanations why it is so.
First, remember that references are borrowed, i.e. they point to some data but do not own it, it is owned by someone else. In this particular case the string, a slice to which you want to return, is owned by the function because it is stored in a local variable.
When the function exits, all its local variables are destroyed; this involves calling destructors, and the destructor of String frees the memory used by the string. However, you want to return a borrowed reference pointing to the data allocated for that string. It means that the returned reference immediately becomes dangling - it points to invalid memory!
Rust was created, among other things, to prevent such problems. Therefore, in Rust it is impossible to return a reference pointing into local variables of the function, which is possible in languages like C.
There is also another explanation, slightly more formal. Let's look at your function signature:
fn return_str<'a>() -> &'a str
Remember that lifetime and generic parameters are, well, parameters: they are set by the caller of the function. For example, some other function may call it like this:
let s: &'static str = return_str();
This requires 'a to be 'static, but that is of course impossible: your function does not return a reference to static memory, it returns a reference with a strictly shorter lifetime. Thus such a function definition is unsound and is rejected by the compiler.
Anyway, in such situations you need to return a value of an owned type, in this particular case it will be an owned String:
fn return_str() -> String {
let mut string = String::new();
for _ in 0..10 {
string.push_str("ACTG");
}
string
}
In certain cases, you are passed a string slice and may conditionally want to create a new string. In these cases, you can return a Cow. This allows for the reference when possible and an owned String otherwise:
use std::borrow::Cow;
fn return_str<'a>(name: &'a str) -> Cow<'a, str> {
if name.is_empty() {
let name = "ACTG".repeat(10);
name.into()
} else {
name.into()
}
}
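A brief usage sketch for the Cow version follows. It spells the two branches out with `Cow::Owned` and `Cow::Borrowed` instead of `into()`, which should be equivalent; the input `"GATTACA"` is just an example value.

```rust
use std::borrow::Cow;

fn return_str<'a>(name: &'a str) -> Cow<'a, str> {
    if name.is_empty() {
        // A new String has to be built, so the Cow owns it.
        Cow::Owned("ACTG".repeat(10))
    } else {
        // The input can be reused as-is, so the Cow only borrows it.
        Cow::Borrowed(name)
    }
}

fn main() {
    let borrowed = return_str("GATTACA");
    let owned = return_str("");
    // Cow<str> dereferences to str, so both values can be used like a &str.
    println!("{} ({} bytes)", borrowed, borrowed.len());
    println!("{} ({} bytes)", owned, owned.len());
}
```

The caller pays for an allocation only in the branch that actually needs a new string.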
You can choose to leak memory to convert a String to a &'static str:
fn return_str() -> &'static str {
let string = "ACTG".repeat(10);
Box::leak(string.into_boxed_str())
}
This is a really bad idea in many cases, because the leaked memory is never reclaimed: memory usage grows every time this function is called.
If you wanted to return the same string every call, see also:
How to create a static string at compile time
The problem is that you are trying to create a reference to a string that will disappear when the function returns.
A simple solution in this case is to pass in the empty string to the function. This will explicitly ensure that the referred string will still exist in the scope where the function returns:
fn return_str(s: &mut String) -> &str {
for _ in 0..10 {
s.push_str("ACTG");
}
&s[..]
}
fn main() {
let mut s = String::new();
let s = return_str(&mut s);
assert_eq!("ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG", s);
}
Code in Rust Playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2499ded42d3ee92d6023161fe82e9b5f
This is an old question but a very common one. There are many answers but none of them addresses the glaring misconception people have about the strings and string slices, which stems from not knowing their true nature.
But let's start with the obvious question before addressing the implied one: Can we return a reference to a local variable?
What we are asking for is the textbook definition of a dangling pointer. Local variables are dropped when the function completes its execution. In other words, they are popped off the execution stack, and any reference to them from then on would point to garbage data.
Best course of action is either returning the string or its clone. No need to obsess over the speed.
However, I believe the essence of the question is whether there is a way to convert a String into a &str that outlives the String. The answer is no, and this is where the misconception lies:
Borrowing a String does give you a &str (through Deref or as_str), but that slice points into the String's heap-allocated buffer, so it is only valid while the String is alive. A string literal, on the other hand, is stored directly in the data section of the executable file and is therefore 'static. A borrowed String and a string literal share the same &str type, not the same lifetime.
You can check out this post for detailed explanation:
What are the differences between Rust's `String` and `str`?
Now, there may be a workaround for this particular use case if you absolutely use static text:
Since you use combinations of four bases A, C, G, T, in groups of four, you can create a list of all possible outcomes as &str and use them through some data structure. You will have to jump through some hoops, but it is certainly doable.
If it is possible to create the resulting string once and keep it in a static, this would be a solution without leaking memory:
#[macro_use]
extern crate lazy_static;
fn return_str<'a>() -> &'a str {
lazy_static! {
static ref STRING: String = {
"ACTG".repeat(10)
};
}
&STRING
}
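On newer toolchains the same idea can probably be expressed without an external crate. The sketch below assumes Rust 1.70 or later, where `std::sync::OnceLock` is available in the standard library; it is an alternative to the lazy_static version, not what the original answer used.

```rust
use std::sync::OnceLock;

fn return_str() -> &'static str {
    // The String is built on the first call and then lives for the rest of the program.
    static STRING: OnceLock<String> = OnceLock::new();
    STRING.get_or_init(|| "ACTG".repeat(10)).as_str()
}

fn main() {
    let s = return_str();
    println!("{} ({} bytes)", s, s.len());
}
```

Every call returns a slice of the same buffer, so nothing is leaked beyond the single allocation.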
Yes you can - the method replace_range provides a work around -
let a = "0123456789";
//println!("{}",a[3..5]); fails - doesn't have a size known at compile-time
let mut b = String::from(a);
b.replace_range(5..,"");
b.replace_range(0..2,"");
println!("{}",b); //succeeds
It took blood sweat and tears to achieve this!
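For comparison, the commented-out line fails only because `a[3..5]` has the unsized type `str`; borrowing the slice compiles. A sketch that produces the same text as the two `replace_range` calls above (the helper name `middle` is made up):

```rust
// Hypothetical helper: borrows bytes 2..5 of the input as a `&str`.
fn middle(a: &str) -> &str {
    // `&a[2..5]` is a `&str`, which has a known size.
    // Indexing by byte range panics if the range is out of bounds
    // or does not fall on UTF-8 character boundaries.
    &a[2..5]
}

fn main() {
    let a = "0123456789";
    println!("{}", middle(a)); // prints "234"
}
```

This avoids both the copy into a new String and the two in-place edits, at the cost of borrowing from `a`.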

How do I transform &str to ~str in Rust?

This is for the current 0.6 Rust trunk by the way, not sure the exact commit.
Let's say I want to for each over some strings, and my closure takes a borrowed string pointer argument (&str). I want my closure to add its argument to an owned vector of owned strings ~[~str] to be returned. My understanding of Rust is weak, but I think that strings are a special case where you can't dereference them with * right? How do I get my strings from &str into the vector's push method which takes a ~str?
Here's some code that doesn't compile
fn read_all_lines() -> ~[~str] {
let mut result = ~[];
let reader = io::stdin();
let util = @reader as @io::ReaderUtil;
for util.each_line |line| {
result.push(line);
}
result
}
It doesn't compile because it's inferring result's type to be [&str] since that's what I'm pushing onto it. Not to mention its lifetime will be wrong since I'm adding a shorter-lived variable to it.
I realize I could use ReaderUtil's read_line() method which returns a ~str. But this is just an example.
So, how do I get an owned string from a borrowed string? Or am I totally misunderstanding.
You should call the StrSlice trait's method, to_owned, as in:
fn read_all_lines() -> ~[~str] {
let mut result = ~[];
let reader = io::stdin();
let util = @reader as @io::ReaderUtil;
for util.each_line |line| {
result.push(line.to_owned());
}
result
}
StrSlice trait docs are here:
http://static.rust-lang.org/doc/core/str.html#trait-strslice
You can't.
For one, it doesn't work semantically: a ~str promises that only one thing owns it at a time. But a &str is borrowed, so what happens to the place you borrowed from? It has no way of knowing that you're trying to steal away its only reference, and it would be pretty rude to trash the caller's data out from under it besides.
For another, it doesn't work logically: ~-pointers and @-pointers are allocated in completely different heaps, and a & doesn't know which heap, so it can't be converted to ~ and still guarantee that the underlying data lives in the right place.
So you can either use read_line or make a copy, which I'm... not quite sure how to do :)
I do wonder why the API is like this, when & is the most restricted of the pointers. ~ should work just as well here; it's not like the iterated strings already exist somewhere else and need to be borrowed.
At first I thought it was possible to use copy line to create an owning pointer from the borrowed pointer to the string, but this apparently copies the borrowed pointer itself.
So I found str::from_slice(s: &str) -> ~str. This is probably what you need.
