This works:
fn user_add<'x>(data: &'x Input, db: &'x mut Database<'x>) -> HandlerOutput {
//let input: UserAddIn = json::decode(&data.post).unwrap();
//let username = input.username.as_bytes();
//let password = input.password.as_bytes();
db.put(b"Hi", b"hello");
//db.delete(username);
Ok("Hi".to_string())
}
This does not work:
fn user_add<'x>(data: &'x Input, db: &'x mut Database<'x>) -> HandlerOutput {
//let input: UserAddIn = json::decode(&data.post).unwrap();
//let username = input.username.as_bytes();
//let password = input.password.as_bytes();
let my_str = "hi".to_string();
let username = my_str.as_bytes();
db.put(username, b"hello");
//db.delete(username);
Ok("Hi".to_string())
}
Compiler output:
src/handlers.rs:85:17: 85:23 error: `my_str` does not live long enough
src/handlers.rs:85 let username = my_str.as_bytes();
^~~~~~
src/handlers.rs:80:77: 89:2 note: reference must be valid for the lifetime 'x as defined on the block at 80:76...
src/handlers.rs:80 fn user_add<'x>(data: &'x Input, db: &'x mut Database<'x>) -> HandlerOutput {
src/handlers.rs:81 //let input: UserAddIn = json::decode(&data.post).unwrap();
src/handlers.rs:82 //let username = input.username.as_bytes();
src/handlers.rs:83 //let password = input.password.as_bytes();
src/handlers.rs:84 let my_str = "hi".to_string();
src/handlers.rs:85 let username = my_str.as_bytes();
...
src/handlers.rs:84:32: 89:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 84:31
src/handlers.rs:84 let my_str = "hi".to_string();
src/handlers.rs:85 let username = my_str.as_bytes();
src/handlers.rs:86 db.put(username, b"hello");
src/handlers.rs:87 //db.delete(username);
src/handlers.rs:88 Ok("Hi".to_string())
src/handlers.rs:89 }
I've seen several questions about lifetimes in Rust and I think the book is not that clear about them. I still use lifetimes by trial and error. This specific case has confused me because I've made several attempts fighting against the compiler and this is just the last error I got. If you have some Rust skills please consider editing the part about lifetimes in the book.
In the first case, b"Hi" is a byte string literal. It has type &'static [u8; 2], which coerces to &'static [u8]: a slice of u8 that is valid for the whole program. The function put needs some lifetime 'x, and since 'static outlives every other lifetime, Rust is happy to use it.
In the second case
let my_str = "hi".to_string();
let username = my_str.as_bytes();
username is a reference to the inner buffer of my_str and cannot outlive it. The compiler complains because the first argument of put should have a lifetime 'x which is broader than that of my_str (local to user_add). Rust won't allow you to do that because db would point to dangling data at the end of the function call:
user_add(input, &mut db);
// `my_str` was local to `user_add` and doesn't exist anymore
// if Rust had allowed you to put it in `db`, `db` would now contain some invalid data here
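A minimal sketch of the same rule, assuming a hypothetical Store type that, like Database<'x>, keeps the borrowed slices it is given (Store, put and key_count are illustrative names, not the asker's API). Both a 'static literal and a buffer declared before the store are accepted, because each outlives the store:

```rust
// Hypothetical stand-in for a database that stores borrowed keys.
struct Store<'a> {
    keys: Vec<&'a [u8]>,
}

impl<'a> Store<'a> {
    // Every key must stay valid for 'a, i.e. for as long as the store uses it.
    fn put(&mut self, key: &'a [u8]) {
        self.keys.push(key);
    }
}

fn key_count() -> usize {
    // Declared before `store`, so it outlives every borrow held by the store.
    let owned = "hi".to_string();
    let mut store = Store { keys: Vec::new() };
    store.put(b"Hi"); // a 'static literal is valid for any 'a
    store.put(owned.as_bytes()); // fine: `owned` outlives `store`
    store.keys.len()
}

fn main() {
    println!("{}", key_count());
}
```

The failing case in the question is the reverse situation: the store (db) is created by the caller, so it outlives anything created inside user_add.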
Thanks to @mcarton for answering why the error happens. In this answer I hope it becomes clear how to solve it too.
The compiler's code generation is perfect but the error message is
just terribly confusing to me.
The problem was in another library that I made, that happens to be a
database. The database struct contains an entry that holds slices.
The lifetime of the slices was set as:
struct Entry<'a> {
key: &'a [u8],
value: &'a [u8],
}
pub struct Database<'a> {
file: File,
entries: Vec<Entry<'a>>,
}
It means that the data the slices point to needs to live at least as long as the
database struct. The username variable goes out of scope while the database holding a reference to it still lives. So the database could only hold data that outlives it, like static variables, which makes the database useless.
The library compiled okay. But the error showed elsewhere.
The solution was to exchange the slices for vectors, because
vectors own their data instead of pointing to someone else's. The data passed in no longer has to outlive the database.
struct Entry {
key: Vec<u8>,
value: Vec<u8>,
}
pub struct Database {
file: File,
entries: Vec<Entry>,
}
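With owned entries, a put method can copy the caller's slices, so temporary buffers are fine. This is only a sketch under assumptions: the put and get signatures are guesses based on the db.put(key, value) calls in the question, and the file field is left out so the example stays self-contained.

```rust
struct Entry {
    key: Vec<u8>,
    value: Vec<u8>,
}

pub struct Database {
    entries: Vec<Entry>,
}

impl Database {
    pub fn new() -> Database {
        Database { entries: Vec::new() }
    }

    // Copies both slices, so the caller's buffers may be dropped right after the call.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.push(Entry {
            key: key.to_vec(),
            value: value.to_vec(),
        });
    }

    // Returns the value of the first entry stored under `key`, if any.
    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.entries
            .iter()
            .find(|e| e.key == key)
            .map(|e| e.value.as_slice())
    }
}

fn user_add(db: &mut Database) {
    let my_str = "hi".to_string();
    db.put(my_str.as_bytes(), b"hello");
} // `my_str` is dropped here; `db` holds its own copy of the bytes

fn main() {
    let mut db = Database::new();
    user_add(&mut db);
    assert_eq!(db.get(b"hi"), Some(&b"hello"[..]));
}
```

The cost is one allocation and copy per key and value, which is usually acceptable for a store that has to keep the data anyway.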
So I am trying to test if I understand lifetimes, and wanted to create a scenario that would fail at compile time. The code I came up with is below:
#[test]
fn lifetime() {
struct Identity<'a> {
first_name: &'a str
}
let name: Identity;
{
let first: &str = "hello";
name = Identity {
first_name: first
};
}
println!("{}", name.first_name);
}
The reasoning is that an instance of Identity should live only as long as what first_name references.
Then in the code I create let first: &str = "hello" in a smaller scope, assign it to name, and then, after first should have gone out of scope, I attempt to print name.first_name. I was expecting this not to compile, but it compiles fine.
What am I missing in my understanding of how lifetimes work and why did this compile?
#Edit
updating the code to have this instead made the compilation fail:
let string = String::from("hello");
let first: &str = string.as_str();
I'm still curious to know why the original code worked.
Because you move first (the reference itself, not the data it points to) into name. first references 'static data (a special lifetime that lasts for the entirety of the program), a string literal in this case, which can never go out of scope.
To make your test fail to compile, try referencing data that will go out of scope:
#[test]
fn lifetime() {
struct Identity<'a> {
first_name: &'a str
}
let name: Identity;
{
let first: String = String::from("hello");
name = Identity {
first_name: &first,
};
// `first` will go out of scope here and any references to it
// (like `&first` within `name`) will become invalid.
}
// `first` has been dropped so you can't reference it anymore here:
println!("{}", name.first_name);
}
Variable first is a reference to a string literal with static lifetime. Therefore, let first: &str = "hello"; has an implicit 'static lifetime and is equivalent to:
let first: &'static str = "hello";
That allows the variable name to have any lifetime up to 'static. However, its lifetime is determined by the outer scope, which is shorter but long enough for the println! statement.
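One way to see that the literal really is 'static is to return the struct from a function. In this sketch, make_identity is a made-up helper; it compiles even though first is a local variable, because only the reference is local while the string data lives in the program binary:

```rust
struct Identity<'a> {
    first_name: &'a str,
}

// The literal is 'static, so the returned struct can carry the
// 'static lifetime even though `first` is a local variable.
fn make_identity() -> Identity<'static> {
    let first: &'static str = "hello";
    Identity { first_name: first }
}

fn main() {
    let name = make_identity();
    println!("{}", name.first_name);
}
```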
I have problems with understanding the behavior and availability of structs with multiple lifetime parameters. Consider the following:
struct My<'a,'b> {
first: &'a String,
second: &'b String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = My{
first: &first,
second: &second
}
}
println!("{}", my.first)
}
The error message says that
|
13 | second: &second
| ^^^^^^^ borrowed value does not live long enough
14 | }
15 | }
| - `second` dropped here while still borrowed
16 | println!("{}", my.first)
| -------- borrow later used here
First, I do not access the .second field of the struct, so I do not see the problem.
Second, the struct has two lifetime parameters. I assume that the compiler tracks the fields of the struct separately.
For example the following compiles fine:
struct Own {
first: String,
second: String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = Own{
first: first,
second: second
}
}
std::mem::drop(my.second);
println!("{}", my.first)
}
This means that even though .second of the struct is dropped, the whole struct is not invalidated. I can still access the fields that were not dropped.
Why doesn't the same work for structs with references?
The struct has two independent lifetime parameters. Just as the two type parameters of a struct are independent of each other, I would expect these two lifetimes to be independent as well. But the error message suggests that lifetimes are not independent: the resulting struct seems to have not two lifetime parameters but only one, the shorter of the two.
If the validity of a struct containing two references is limited to the lifetime of the reference with the shorter lifetime, then my question is: what is the difference between
struct My1<'a,'b>{
f: &'a X,
s: &'b Y,
}
and
struct My2<'a>{
f: &'a X,
s: &'a Y
}
I would expect that structs with multiple lifetime parameters to behave similar to functions with multiple lifetime parameters. Consider these two functions
fn fun_single<'a>(x:&'a str, y: &'a str) -> &'a str {
if x.len() <= y.len() {&x[0..1]} else {&y[0..1]}
}
fn fun_double<'a,'b>(x: &'a str, y:&'b str) -> &'a str {
&x[0..1]
}
fn main() {
let first = "first".to_string();
let second = "second".to_string();
let ref_first = &first;
let ref_second = &second;
let result_ref = fun_single(ref_first, ref_second);
std::mem::drop(second);
println!("{result_ref}")
}
In this version we get the result from a function with a single lifetime parameter. The compiler treats the two function parameters as related, so it picks the shorter lifetime for the reference we return from the function. So this version does not compile.
But if we just replace the line
let result_ref = fun_single(ref_first, ref_second);
with
let result_ref = fun_double(ref_first, ref_second);
the compiler sees that the two lifetimes are independent, so even when you drop second, result_ref is still valid: the lifetime of the returned reference is not the shorter of the two but independent of the second parameter, and it compiles.
I would expect that structs with multiple lifetimes and functions with multiple lifetimes to behave similarly. But they don't.
What am I missing here?
I assume that compiler tracks the fields of struct separately.
I think that's the core of your confusion. The compiler does track each lifetime separately, but only statically at compile time, not during runtime. It follows from this that Rust generally cannot allow structs to be partially valid.
So, while you do specify two lifetime parameters, the compiler figures that the struct can only be valid as long as both of them are alive: that is, only for as long as the shorter-lived one lives.
But then how does the second example work? It relies on a special feature of the compiler called partial moves. Whenever you move out of a struct, it allows you to move disjoint fields separately.
It is essentially syntactic sugar for the following:
struct Own {
first: String,
second: String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = Own{
first: first,
second: second
}
}
let Own{
first: my_first,
second: my_second,
} = my;
std::mem::drop(my_second);
println!("{}", my_first);
}
Note that this too is a static feature, so the following will not compile (even though it would work when run):
struct Own {
first: String,
second: String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = Own{
first: first,
second: second
}
}
if false {
std::mem::drop(my.first);
}
println!("{}", my.first)
}
The struct may not be moved as a whole once it has been partially moved, so not even this allows you to have partially valid structs.
A local variable may be partially initialized, such as in your second example. Rust can track this for local variables and give you an error if you attempt to access the uninitialized parts.
However, in your first example the variable isn't actually partially initialized, it's fully initialized (you give it both the first and second field). Then, when second goes out of scope, my is still fully initialized, but its second field is now invalid (but initialized). Thus the compiler doesn't even let the variable exist past the point where second is dropped, to avoid an invalid reference.
Rust could in principle track this, since you have two lifetimes, by assigning the second one a special 'never lifetime that signals the reference is always invalid, but it currently doesn't.
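As for the difference between My1<'a, 'b> and My2<'a>: separate parameters start to matter once a reference leaves the struct. A sketch, assuming a hypothetical helper take_first that consumes the struct and hands back only the longer-lived reference:

```rust
struct My<'a, 'b> {
    first: &'a String,
    second: &'b String,
}

// Consumes the struct and returns only the reference tied to 'a.
// With a single shared lifetime the result would be tied to the shorter borrow.
fn take_first<'a, 'b>(my: My<'a, 'b>) -> &'a String {
    my.first
}

fn longest_lived() -> String {
    let first = "first".to_string();
    let kept;
    {
        let second = "second".to_string();
        let my = My { first: &first, second: &second };
        println!("{}", my.second); // both fields are usable while `second` lives
        kept = take_first(my);
    } // `second` and the borrow of it end here
    kept.clone() // still valid: it borrows only from `first`
}

fn main() {
    println!("{}", longest_lived());
}
```

If both fields shared one parameter, as in My2<'a>, the same take_first would return a reference limited to the scope of second, and using kept after the inner block would be rejected.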
In module db.rs, while reading a value from the database, I got the error "temporary value dropped while borrowed",
with the hint "consider using a let binding to create a longer lived value".
use bytes::Bytes;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
#[derive(Clone, Debug)]
pub struct Db {
// pub entries: Arc<bool>,
pub entries: Arc<Mutex<HashMap<String, Bytes>>>,
}
impl Db {
pub fn new() -> Db {
Db {
entries: Arc::new(Mutex::new(HashMap::new())),
}
}
/// Reads data from the database
pub fn read(&mut self, arr: &[String]) -> Result<Bytes, &'static str> {
let key = &arr[1];
let query_result = self.entries.lock().unwrap().get(key);// Error in this Line.
if let Some(value) = query_result {
return Ok(Bytes::from("hello"));
} else {
return Err("no such key found");
}
}
}
But when I modify the code to get the value on the next line, there is no error.
let query_result = self.entries.lock().unwrap();
let result = query_result.get(key);
Can anyone help me understand what's going on under the hood?
We can see why Rust thinks this is an error by checking how Mutex::lock works. If successful, it doesn't return a reference directly, it returns a MutexGuard struct that can deref into the type it wraps, a HashMap in your case.
The signature of Deref::deref, for an implementation with Target = T, is (with the elided lifetimes added):
fn deref<'a>(&'a self) -> &'a T
This means that the MutexGuard can only give us a reference to the HashMap inside for as long as it is itself alive (the lifetime 'a). But because you never store it anywhere, instead dereferencing it directly, it is a temporary that gets dropped at the end of the statement, right after the call to get. But you keep the result of get around, which can only live for as long as the reference to the HashMap passed into it, which in turn only lives as long as the MutexGuard, which is dropped at the end of that statement.
If you store the MutexGuard, on the other hand, like
let guard = self.entries.lock().unwrap();
let query_result = guard.get(key);
it only gets dropped at the end of the scope, so any references it gave out are also valid until the end of the scope.
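Putting that together, here is a sketch of a read method that binds the guard and clones the value out before the guard is dropped. It makes two assumptions to stay std-only and testable: Vec<u8> stands in for bytes::Bytes, and write is a hypothetical helper that is not in the question.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Clone, Debug)]
pub struct Db {
    entries: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl Db {
    pub fn new() -> Db {
        Db {
            entries: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    // Hypothetical helper so the example has something to read back.
    pub fn write(&self, key: &str, value: &[u8]) {
        self.entries
            .lock()
            .unwrap()
            .insert(key.to_string(), value.to_vec());
    }

    /// Reads data from the database.
    pub fn read(&self, key: &str) -> Result<Vec<u8>, &'static str> {
        // The guard is bound to a variable, so it lives until the end of the function.
        let guard = self.entries.lock().unwrap();
        // `cloned()` turns the borrowed value into an owned one before the guard goes away.
        guard.get(key).cloned().ok_or("no such key found")
    }
}

fn main() {
    let db = Db::new();
    db.write("name", b"ferris");
    println!("{:?}", db.read("name"));
    println!("{:?}", db.read("missing"));
}
```

Returning an owned value also means the lock is released as soon as read returns, instead of being held for as long as a caller keeps a reference.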
I'm writing a library that should read from something implementing the BufRead trait; a network data stream, standard input, etc. The first function is supposed to read a data unit from that reader and return a populated struct filled mostly with &'a str values parsed from a frame from the wire.
Here is a minimal version:
mod mymod {
use std::io::prelude::*;
use std::io;
pub fn parse_frame<'a, T>(mut reader: T)
where
T: BufRead,
{
for line in reader.by_ref().lines() {
let line = line.expect("reading header line");
if line.len() == 0 {
// got empty line; done with header
break;
}
// split line
let splitted = line.splitn(2, ':');
let line_parts: Vec<&'a str> = splitted.collect();
println!("{} has value {}", line_parts[0], line_parts[1]);
}
// more reads down here, therefore the reader.by_ref() above
// (otherwise: use of moved value).
}
}
use std::io;
fn main() {
let stdin = io::stdin();
let locked = stdin.lock();
mymod::parse_frame(locked);
}
An error shows up which I cannot fix after trying different solutions:
error: `line` does not live long enough
--> src/main.rs:16:28
|
16 | let splitted = line.splitn(2, ':');
| ^^^^ does not live long enough
...
20 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the body at 8:4...
--> src/main.rs:8:5
|
8 | / {
9 | | for line in reader.by_ref().lines() {
10 | | let line = line.expect("reading header line");
11 | | if line.len() == 0 {
... |
22 | | // (otherwise: use of moved value).
23 | | }
| |_____^
The lifetime 'a is defined on a struct and implementation of a data keeper structure because the &str requires an explicit lifetime. These code parts were removed as part of the minimal example.
BufRead has a lines() method that returns an iterator over Result<String, io::Error>. I handle errors using expect or match and thus unpack the Result so that the program now has the bare String. This will then be done multiple times to populate a data structure.
Many answers say that the unwrapped result needs to be bound to a variable, otherwise it gets lost because it is a temporary value. But I already saved the unpacked Result value in the variable line and I still get the error.
How do I fix this error? I could not get it working after hours of trying.
Does it make sense to do all these lifetime declarations just for &str in a data keeper struct? This will be mostly a read-only data structure, at most replacing whole field values. String could also be used, but I have found articles saying that String has lower performance than &str, and this frame parser function will be called many times and is performance-critical.
Similar questions exist on Stack Overflow, but none quite answers the situation here.
For completeness and better understanding, following is an excerpt from complete source code as to why lifetime question came up:
Data structure declaration:
// tuple
pub struct Header<'a>(pub &'a str, pub &'a str);
pub struct Frame<'a> {
pub frameType: String,
pub bodyType: &'a str,
pub port: &'a str,
pub headers: Vec<Header<'a>>,
pub body: Vec<u8>,
}
impl<'a> Frame<'a> {
pub fn marshal(&'a self) {
//TODO
println!("marshal!");
}
}
Complete function definition:
pub fn parse_frame<'a, T>(mut reader: T) -> Result<Frame<'a>, io::Error> where T: BufRead {
Your problem can be reduced to this:
fn foo<'a>() {
let thing = String::from("a b");
let parts: Vec<&'a str> = thing.split(" ").collect();
}
You create a String inside your function, then declare that references to that string are guaranteed to live for the lifetime 'a. Unfortunately, the lifetime 'a isn't under your control — the caller of the function gets to pick what the lifetime is. That's how generic parameters work!
What would happen if the caller of the function specified the 'static lifetime? How would it be possible for your code, which allocates a value at runtime, to guarantee that the value lives longer than even the main function? It's not possible, which is why the compiler has reported an error.
Once you've gained a bit more experience, the function signature fn foo<'a>() will jump out at you like a red alert — there's a generic parameter that isn't used. That's most likely going to mean bad news.
return a populated struct filled mostly with &'a str
You cannot possibly do this with the current organization of your code. References have to point to something. You are not providing anywhere for the pointed-at values to live. You cannot return an allocated String as a string slice.
Before you jump to it, no you cannot store a value and a reference to that value in the same struct.
Instead, you need to split the code that creates the String and that which parses a &str and returns more &str references. That's how all the existing zero-copy parsers work. You could look at those for inspiration.
String has lower performance than &str
No, it really doesn't. Creating lots of extraneous Strings is a bad idea, sure, just like allocating too much is a bad idea in any language.
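A sketch of that split, under assumptions: read_header_block and parse_headers are made-up names, and a header block is taken to be "Name: value" lines ending at an empty line. The first function owns the text; the second only borrows from it, so its &'a str values are tied to a buffer the caller keeps alive.

```rust
use std::io::{self, BufRead};

pub struct Header<'a>(pub &'a str, pub &'a str);

// Step 1: read the header block into an owned String that the caller keeps.
pub fn read_header_block<T: BufRead>(reader: &mut T) -> io::Result<String> {
    let mut block = String::new();
    loop {
        let mut line = String::new();
        let read = reader.read_line(&mut line)?;
        // Stop at end of input or at the empty line that ends the header.
        if read == 0 || line.trim_end().is_empty() {
            break;
        }
        block.push_str(&line);
    }
    Ok(block)
}

// Step 2: parse without copying; every &str points into `block`.
pub fn parse_headers<'a>(block: &'a str) -> Vec<Header<'a>> {
    block
        .lines()
        .filter_map(|line| {
            let mut parts = line.splitn(2, ':');
            // Lines without a ':' are skipped.
            Some(Header(parts.next()?.trim(), parts.next()?.trim()))
        })
        .collect()
}

fn main() {
    // A byte slice implements BufRead, which is handy for trying this out.
    let mut input = "Host: example\nPort: 80\n\nbody".as_bytes();
    let block = read_header_block(&mut input).unwrap();
    for Header(name, value) in parse_headers(&block) {
        println!("{} has value {}", name, value);
    }
}
```

The lifetime 'a is now used in the signature of parse_headers, so it is chosen by whoever owns block, which is exactly what was missing in the original parse_frame.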
Maybe the following program gives some clues to others who are also having their first problems with lifetimes:
fn main() {
// using String and &str slice
let my_str: String = "fire".to_owned();
let returned_str: MyStruct = my_func_str(&my_str);
println!("Received return value: {ret}", ret = returned_str.version);
// using Vec<u8> and &[u8] slice
let my_vec: Vec<u8> = "fire".to_owned().into_bytes();
let returned_u8: MyStruct2 = my_func_vec(&my_vec);
println!("Received return value: {ret:?}", ret = returned_u8.version);
}
// using String -> str
fn my_func_str<'a>(some_str: &'a str) -> MyStruct<'a> {
MyStruct {
version: &some_str[0..2],
}
}
struct MyStruct<'a> {
version: &'a str,
}
// using Vec<u8> -> & [u8]
fn my_func_vec<'a>(some_vec: &'a Vec<u8>) -> MyStruct2<'a> {
MyStruct2 {
version: &some_vec[0..2],
}
}
struct MyStruct2<'a> {
version: &'a [u8],
}
I'm trying to execute a function on chunks of a vector and then send the result back using the message passing library.
However, I get a strange error about the lifetime of the vector that isn't even participating in the thread operations:
src/lib.rs:153:27: 154:25 error: borrowed value does not live long enough
src/lib.rs:153 let extended_segments = (segment_size..max_val)
src/lib.rs:154 .collect::<Vec<_>>()
note: reference must be valid for the static lifetime...
src/lib.rs:153:3: 155:27 note: but borrowed value is only valid for the statement at 153:2
src/lib.rs:153 let extended_segments = (segment_size..max_val)
help: consider using a `let` binding to increase its lifetime
I tried moving around the iterator and adding lifetimes to different places, but I couldn't get the checker to pass and still stay on type.
The offending code is below, based on the concurrency chapter in the Rust book. (Complete code is at github.)
use std::sync::mpsc;
use std::thread;
fn sieve_segment(a: &[usize], b: &[usize]) -> Vec<usize> {
vec![]
}
fn eratosthenes_sieve(val: usize) -> Vec<usize> {
vec![]
}
pub fn segmented_sieve_parallel(max_val: usize, mut segment_size: usize) -> Vec<usize> {
if max_val <= ((2 as i64).pow(16) as usize) {
// early return if the highest value is small enough (empirical)
return eratosthenes_sieve(max_val);
}
if segment_size > ((max_val as f64).sqrt() as usize) {
segment_size = (max_val as f64).sqrt() as usize;
println!("Segment size is larger than √{}. Reducing to {} to keep resource use down.",
max_val,
segment_size);
}
let small_primes = eratosthenes_sieve((max_val as f64).sqrt() as usize);
let mut big_primes = small_primes.clone();
let (tx, rx): (mpsc::Sender<Vec<usize>>, mpsc::Receiver<Vec<usize>>) = mpsc::channel();
let extended_segments = (segment_size..max_val)
.collect::<Vec<_>>()
.chunks(segment_size);
for this_segment in extended_segments.clone() {
let small_primes = small_primes.clone();
let tx = tx.clone();
thread::spawn(move || {
let sieved_segment = sieve_segment(&small_primes, this_segment);
tx.send(sieved_segment).unwrap();
});
}
for _ in 1..extended_segments.count() {
big_primes.extend(&rx.recv().unwrap());
}
big_primes
}
fn main() {}
How do I understand and avoid this error? I'm not sure how to make the lifetime of the thread closure static as in this question and still have the function be reusable (i.e., not main()). I'm not sure how to "consume all things that come into [the closure]" as mentioned in this question. And I'm not sure where to insert .map(|s| s.into()) to ensure that all references become moves, nor am I sure I want to.
When trying to reproduce a problem, I'd encourage you to create an MCVE by removing all irrelevant code. In this case, something like this seems to produce the same error:
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo = (segment_size..max_val)
.collect::<Vec<_>>()
.chunks(segment_size);
}
fn main() {}
Let's break that down:
Create an iterator between numbers.
Collect all of them into a Vec<usize>.
Return an iterator that contains references to the vector.
Since the vector isn't bound to any variable, it's dropped at the end of the statement. This would leave the iterator pointing to an invalid region of memory, so that's disallowed.
Check out the definition of slice::chunks:
fn chunks(&self, size: usize) -> Chunks<T>
pub struct Chunks<'a, T> where T: 'a {
// some fields omitted
}
The lifetime marker 'a lets you know that the iterator contains a reference to something. Lifetime elision has removed the 'a from the function, which looks like this, expanded:
fn chunks<'a>(&'a self, size: usize) -> Chunks<'a, T>
Check out this line of the error message:
help: consider using a let binding to increase its lifetime
You can follow that as such:
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo = (segment_size..max_val)
.collect::<Vec<_>>();
let bar = foo.chunks(segment_size);
}
fn main() {}
Although I'd write it as
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo: Vec<_> = (segment_size..max_val).collect();
let bar = foo.chunks(segment_size);
}
fn main() {}
Re-inserting this code back into your original problem won't solve the problem, but it will be much easier to understand. That's because you are attempting to pass a reference to thread::spawn, which may outlive the current thread. Thus, everything passed to thread::spawn must have the 'static lifetime. There are tons of questions that detail why that must be prevented and a litany of solutions, including scoped threads and cloning the vector.
Cloning the vector is the easiest, but potentially inefficient:
for this_segment in extended_segments.clone() {
let this_segment = this_segment.to_vec();
// ...
}
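For the scoped-thread route, here is a sketch using std::thread::scope (in the standard library since Rust 1.63). The names process_segment and process_in_parallel are hypothetical, and the segment function is a stand-in that just keeps even numbers, not the real sieve. Scoped threads are joined before scope returns, so they may borrow the chunks directly and no 'static bound is needed:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the real sieve: keeps the even numbers of a segment.
fn process_segment(segment: &[usize]) -> Vec<usize> {
    segment.iter().copied().filter(|n| n % 2 == 0).collect()
}

// `segment_size` must be non-zero, because `chunks` panics on a size of 0.
fn process_in_parallel(max_val: usize, segment_size: usize) -> Vec<usize> {
    // Bound to a variable, so it outlives the chunks borrowed from it.
    let numbers: Vec<usize> = (0..max_val).collect();
    let (tx, rx) = mpsc::channel();

    thread::scope(|s| {
        for segment in numbers.chunks(segment_size) {
            let tx = tx.clone();
            // The closure borrows `segment`; that is allowed because the
            // thread is joined before `scope` returns.
            s.spawn(move || {
                tx.send(process_segment(segment)).unwrap();
            });
        }
    });

    // Drop the last sender so that `rx.iter()` ends once the queue is drained.
    drop(tx);
    let mut result: Vec<usize> = rx.iter().flatten().collect();
    result.sort_unstable(); // threads finish in arbitrary order
    result
}

fn main() {
    println!("{:?}", process_in_parallel(10, 3));
}
```

Compared with cloning each chunk, this avoids the extra allocations, at the price of blocking the calling function until every worker has finished.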