I'm writing a program to extract information from log files (which are in text format). The overall flow is:
1. Read the file line-by-line into a String
2. Create a ParsedLine structure which borrows several string slices from that line (some using Cow)
3. Use the ParsedLine to write a CSV record.
This has been going very well so far, but I have run into a problem I do not understand; I think it has to do with lifetimes or data-flow analysis. The problem arises from a small refactor I am trying to make.
I have this function which works:
fn process_line(columns: &[Column], line: String, writer: &mut Writer<File>) {
let parsed_line = ParsedLine::new(&line);
if parsed_line.is_err() {
let data = vec![""];
writer.write_record(&data).expect("Writing a CSV record should always succeed.");
return;
}
let parsed_line = parsed_line.unwrap();
// let data = output::make_output_record(&parsed_line, columns);
// The below code works. But if I try to pull it out into a separate function
// Rust will not compile it.
let mut data = Vec::new();
for column in columns {
match column.name.as_str() {
config::LOG_DATE => data.push(parsed_line.log_date),
config::LOG_LEVEL => data.push(parsed_line.log_level),
config::MESSAGE => data.push(&parsed_line.message),
_ => {
let ci_comparer = UniCase::new(column.name.as_str());
match parsed_line.kvps.get(&ci_comparer) {
Some(val) => {
let x = val.as_ref();
data.push(x);
},
None => data.push(""),
}
},
}
}
writer.write_record(&data).expect("Writing a CSV record should always succeed.");
}
But I want to pull out the bit of code that constructs data into a separate function so that I can test it more easily. Here's the function:
pub fn make_output_record<'p, 't, 'c>(parsed_line: &'p ParsedLine<'t>, columns: &'c [Column]) -> Vec<&'t str> {
let mut data = Vec::new();
for column in columns {
match column.name.as_str() {
config::LOG_DATE => data.push(parsed_line.log_date),
config::LOG_LEVEL => data.push(parsed_line.log_level),
config::MESSAGE => data.push(&parsed_line.message),
_ => {
let ci_comparer = UniCase::new(column.name.as_str());
match parsed_line.kvps.get(&ci_comparer) {
// This is the problem here. To make it explicit:
// val is a "&'t Cow<'t, str>" and x is "&'t str"
Some(val) => {
let x = val.as_ref();
data.push(x);
},
None => data.push(""),
}
},
}
}
data
}
And the error I get and do not understand is:
error[E0623]: lifetime mismatch
--> src/main.rs:201:5
|
177 | pub fn make_output_record<'p, 't, 'c>(parsed_line: &'p ParsedLine<'t>, columns: &'c [Column]) -> Vec<&'t str> {
| ------------ ------------
| |
| this parameter and the return type are declared with different lifetimes...
...
201 | data
| ^^^^ ...but data from `columns` is returned here
The compiler thinks that the returned vector contains information from Columns, but Columns is actually only used to get the name of the column, which is then used to lookup a value in the kvps HashMap (UniCase is used to make the lookup case-insensitive). If a value is found, we add the &str to data.
So I don't understand why the compiler thinks that something from Columns ends up in data, because to my mind Columns is just a bit of metadata used to drive the final contents of data, but does not in and of itself appear in data. Once the kvps lookup is done and we have the value Columns might as well not exist.
I've tried various ways of fixing this (including adding explicit lifetimes to everything, removing some lifetimes, and adding various outlives lifetime specifications) but no combination appears to be able to tell the compiler that Columns is not used in data.
For reference, here is the definition of ParsedLine:
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ParsedLine<'t> {
pub line: &'t str,
pub log_date: &'t str,
pub log_level: &'t str,
pub message: Cow<'t, str>,
pub kvps: HashMap<UniCase<&'t str>, Cow<'t, str>>
}
Note that I am resisting getting rid of the Cows: I assume this would fix the problem, but the number of String allocations would probably rise by a factor of 20 and I'd like to avoid that. The current program is impressively fast!
I suspect the problem is actually with that UniCase<&'t str> and that I need to give the key its own lifetime. I'm not sure how, though.
So my questions are:
1. Why can't I easily move this code into a new function?
2. How do I fix it?
I appreciate this is a rather long question. It may be easier to fiddle with the code locally. It is on GitHub and the error should be reproducible with:
git clone https://github.com/PhilipDaniels/log-file-processor
git checkout 80158b3
cargo build
When process_line calls make_output_record, the compiler infers the lifetime parameters of make_output_record at the call site. With a single lifetime parameter the signature becomes:
pub fn make_output_record<'p>(parsed_line: &'p ParsedLine, columns: &'p [Column]) -> Vec<&'p str> {
Here 'p is inferred as a lifetime during which both parsed_line and columns are alive in process_line's scope. It is the common lifetime shared by both arguments and the return value. Your version did not work because 'p, 't and 'c are unrelated to each other, so the arguments and the return value had no lifetime in common.
I simplified your code here; this is the working version. You can get your error back if you add the other lifetime parameters back to make_output_record.
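To make the single-lifetime signature concrete, here is a minimal, self-contained sketch. It is not the code from the repository: Key is a hypothetical newtype standing in for UniCase<&str> (so the lookup here is case-sensitive), ParsedLine and Column are cut down to the fields needed, and the "LogDate" column name is an assumption. As far as I can tell, columns gets pulled into the lifetime relationship because HashMap::get needs a lookup key of the same type as the stored key, lifetime included, and that key is built from column.name.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Hypothetical stand-in for `UniCase<&str>`: a newtype key around a borrowed str.
// Unlike UniCase, it compares case-sensitively.
#[derive(Debug, PartialEq, Eq, Hash)]
pub struct Key<'a>(pub &'a str);

// Reduced version of the ParsedLine from the question.
pub struct ParsedLine<'t> {
    pub log_date: &'t str,
    pub kvps: HashMap<Key<'t>, Cow<'t, str>>,
}

pub struct Column {
    pub name: String,
}

// One lifetime ties both arguments to the slices that are returned.
pub fn make_output_record<'p>(
    parsed_line: &'p ParsedLine<'_>,
    columns: &'p [Column],
) -> Vec<&'p str> {
    let mut data = Vec::new();
    for column in columns {
        match column.name.as_str() {
            "LogDate" => data.push(parsed_line.log_date),
            // The lookup key borrows from `columns`; `get` requires it to have
            // the same type as the keys stored in the map.
            name => match parsed_line.kvps.get(&Key(name)) {
                Some(val) => data.push(val.as_ref()),
                None => data.push(""),
            },
        }
    }
    data
}
```

Because shared references are covariant, a caller whose line, parsed line and columns have different real lifetimes can still call this function; the compiler picks a 'p short enough to fit all of them.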
I want to make an iterator that owns another iterator and gives items based on that other iterator. Originally, the inner iterator is formed from the result of a database query, it gives the raw data rows as arrived from the DB. The outer iterator takes items of the inner iterator and puts them into a struct that is meaningful within my program. Because different software versions store the same data in different database table structures, I have a parser trait that takes a row and creates a structure. My outer iterator takes two parameters for creation: the iterator for the DB rows and an object which implements how to parse the data.
But I run into a lifetime error whose cause I don't really see, and following the compiler's hints only leads me in circles. I literally follow the compiler's advice and get back to the same problem. I tried to mimic the code and bring it to a minimal form that reproduces the same compiler errors I'm getting. I'm not entirely sure if it could be minimized further, but I also wanted it to resemble my real code.
Here is the sample:
struct Storeroom<'a> {
storeroom_id: i64,
version: &'a str
}
trait StoreroomParser {
fn parse(&self, row: Row) -> Result<Storeroom, Error>;
}
struct StoreroomParserX;
impl StoreroomParser for StoreroomParserX {
fn parse(&self, row: Row) -> Result<Storeroom, Error> {
Ok(Storeroom { storeroom_id: row.dummy, version: "0.0.0"})
}
}
struct StoreroomIterator {
rows: Box<dyn Iterator<Item = Row>>,
parser: Box<dyn StoreroomParser>
}
impl StoreroomIterator {
fn new() -> Result<Self, Error> {
let mut rows: Vec<Row> = vec![];
rows.push(Row { dummy: 4});
rows.push(Row { dummy: 6});
rows.push(Row { dummy: 8});
let rows = Box::new(rows.into_iter());
let parser = Box::new(StoreroomParserX {});
Ok(Self {rows, parser})
}
}
impl Iterator for StoreroomIterator {
type Item<'a> = Result<Storeroom<'a>, Error>;
fn next(&mut self) -> Option<Self::Item> {
if let Some(nextrow) = self.rows.next() {
Some(self.parser.parse(nextrow))
}
else {
None
}
}
}
During my first attempt, the compiler suggested to add a lifetime annotation to the Item type declaration, because it uses a struct that requires a lifetime. But this resulted in the following error:
error[E0658]: generic associated types are unstable
--> src/main.rs:59:5
|
59 | type Item<'a> = Result<Storeroom<'a>, Error>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: see issue #44265 <https://github.com/rust-lang/rust/issues/44265> for more information
error[E0195]: lifetime parameters or bounds on type `Item` do not match the trait declaration
--> src/main.rs:59:14
|
59 | type Item<'a> = Result<Storeroom<'a>, Error>;
| ^^^^ lifetimes do not match type in trait
Here's the sample code on Playground.
When I tried to mitigate this by moving the lifetime annotation to the impl block instead, I provoked the following error I can't progress from:
error: lifetime may not live long enough
--> src/main.rs:61:13
|
56 | impl<'a> Iterator for StoreroomIterator<'a> {
| -- lifetime `'a` defined here
...
59 | fn next(&mut self) -> Option<Self::Item> {
| - let's call the lifetime of this reference `'1`
60 | if let Some(nextrow) = self.rows.next() {
61 | Some(self.parser.parse(nextrow))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ associated function was supposed to return data with lifetime `'a` but it is returning data with lifetime `'1`
I've been stuck on this problem for about a week now. Do you have any ideas how to resolve these errors? I'm also thinking that I should probably just use map() on the rows with whatever closure it takes to properly convert the data, but at this point it would definitely feel like a compromise.
So you say you force version to be 'static with Box::leak(). If so, you can just remove the lifetime parameter entirely:
struct Storeroom {
storeroom_id: i64,
version: &'static str
}
You also mention that the compiler "forces" you to Box<dyn> rows and parser. You can avoid that by making StoreroomIterator generic over two types for the two members. Only change needed is to take rows in the constructor:
struct StoreroomIterator<R: Iterator<Item = Row>, P: StoreroomParser> {
rows: R,
parser: P
}
impl<R: Iterator<Item = Row>> StoreroomIterator<R, StoreroomParserX> {
fn new(rows: R) -> Result<Self, Error> {
let parser = StoreroomParserX {};
Ok(Self { rows, parser })
}
}
It may be possible to get everything to work with lifetimes as well, but from your incomplete example, it's hard to say exactly. You may want to store a String containing the version in Storeroom and then add a version() method to generate a Version on demand, rather than generating them all up front. But it's hard to say without knowing what this all is for. You may just want to switch to a different library for handling version comparisons.
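Putting the two suggestions together, a complete version might look roughly like the sketch below. Row and Error are hypothetical placeholders for the types in the question, and version is a plain string literal rather than a leaked value. With a 'static field in Storeroom, the Iterator impl needs no lifetime parameter on Item at all.

```rust
// Hypothetical placeholder types standing in for the ones in the question.
#[derive(Debug)]
pub struct Error;

pub struct Row {
    pub dummy: i64,
}

#[derive(Debug, PartialEq)]
pub struct Storeroom {
    pub storeroom_id: i64,
    pub version: &'static str,
}

pub trait StoreroomParser {
    fn parse(&self, row: Row) -> Result<Storeroom, Error>;
}

pub struct StoreroomParserX;

impl StoreroomParser for StoreroomParserX {
    fn parse(&self, row: Row) -> Result<Storeroom, Error> {
        Ok(Storeroom { storeroom_id: row.dummy, version: "0.0.0" })
    }
}

// Generic over the row source and the parser, so no Box<dyn ...> is needed.
pub struct StoreroomIterator<R: Iterator<Item = Row>, P: StoreroomParser> {
    rows: R,
    parser: P,
}

impl<R: Iterator<Item = Row>> StoreroomIterator<R, StoreroomParserX> {
    pub fn new(rows: R) -> Result<Self, Error> {
        Ok(Self { rows, parser: StoreroomParserX })
    }
}

impl<R: Iterator<Item = Row>, P: StoreroomParser> Iterator for StoreroomIterator<R, P> {
    // `Storeroom` has no lifetime parameter, so `Item` needs none either.
    type Item = Result<Storeroom, Error>;

    fn next(&mut self) -> Option<Self::Item> {
        // Stop when the underlying rows are exhausted, otherwise parse the row.
        let row = self.rows.next()?;
        Some(self.parser.parse(row))
    }
}
```

Usage is the same as for any iterator: build it from rows.into_iter() and consume it with a for loop or with adapters such as map and collect.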
The following is a snippet of a more complicated code, the idea is loading a SQL table and setting a hashmap with one of the table struct fields as the key and keeping the structure as the value (implementation details are not important since the code works fine if I clone the String, however, the Strings in the DB can be arbitrarily long and cloning can be expensive).
The following code will fail with
error[E0382]: use of partially moved value: `foo`
--> src/main.rs:24:35
|
24 | foo_hashmap.insert(foo.a, foo);
| ----- ^^^ value used here after partial move
| |
| value partially moved here
|
= note: partial move occurs because `foo.a` has type `String`, which does not implement the `Copy` trait
For more information about this error, try `rustc --explain E0382`.
use std::collections::HashMap;
struct Foo {
a: String,
b: String,
}
fn main() {
let foo_1 = Foo {
a: "bar".to_string(),
b: "bar".to_string(),
};
let foo_2 = Foo {
a: "bar".to_string(),
b: "bar".to_string(),
};
let foo_vec = vec![foo_1, foo_2];
let mut foo_hashmap = HashMap::new();
foo_vec.into_iter().for_each(|foo| {
foo_hashmap.insert(foo.a, foo); // foo.a.clone() will make this compile
});
}
The struct Foo cannot implement Copy since its fields are Strings. I tried wrapping foo.a with Rc::new(RefCell::new()), but then ran into the Hash trait not being implemented for RefCell<String>. So currently I'm not sure whether to use something else for the struct fields (would Cow work?) or to handle this logic within the for_each loop.
There are at least two problems here: First, the resulting HashMap<K, V> would be a self-referential struct, as the K borrows V; there are many questions and answers on SO about the pitfalls of this. Second, even if you could construct such a HashMap, you'd easily break the guarantees provided by HashMap, which allows you to modify V while assuming that K always stays constant: There is no way to get a &mut K for a HashMap, but you can get a &mut V; if K is actually a &V, one could easily modify K through V (by way of mutating Foo.a) and break the map.
One possibility is to change Foo.a from a String to a Rc<str>, which you can clone with minimal runtime cost in order to put the value both in the K and into V. As Rc<str> is Borrow<str>, you can still look up values in the map by means of &str. This still has the - theoretical - downside that you can break the map by getting a &mut Foo from the map and std::mem::swap the a, which makes it impossible to look up the correct value from its keys; but you'd have to do that deliberately.
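A minimal sketch of the Rc<str> variant, assuming only the a field needs to be shared between key and value; the helper name index_by_a is made up for illustration.

```rust
use std::collections::HashMap;
use std::rc::Rc;

#[derive(Debug)]
pub struct Foo {
    pub a: Rc<str>,
    pub b: String,
}

// Hypothetical helper: builds a map keyed by `a` without copying the text.
pub fn index_by_a(foos: Vec<Foo>) -> HashMap<Rc<str>, Foo> {
    let mut map = HashMap::new();
    for foo in foos {
        // Cloning an `Rc<str>` only increments a reference count.
        map.insert(Rc::clone(&foo.a), foo);
    }
    map
}
```

Because Rc<str> implements Borrow<str>, a lookup such as map.get("foo") takes a plain &str and does not need to allocate an Rc.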
Another option is to actually use a HashSet instead of a HashMap, and use a newtype for Foo which behaves like a Foo.a. You'd have to implement PartialEq, Eq, Hash (and Borrow<str> for good measure) like this:
use std::collections::HashSet;
#[derive(Debug)]
struct Foo {
a: String,
b: String,
}
/// A newtype for `Foo` which behaves like a `str`
#[derive(Debug)]
struct FooEntry(Foo);
/// `FooEntry` compares to other `FooEntry` only via `.a`
impl PartialEq<FooEntry> for FooEntry {
fn eq(&self, other: &FooEntry) -> bool {
self.0.a == other.0.a
}
}
impl Eq for FooEntry {}
/// It also hashes the same way as a `Foo.a`
impl std::hash::Hash for FooEntry {
fn hash<H>(&self, hasher: &mut H)
where
H: std::hash::Hasher,
{
self.0.a.hash(hasher);
}
}
/// Due to the above, we can implement `Borrow`, so now we can look up
/// a `FooEntry` in the Set using &str
impl std::borrow::Borrow<str> for FooEntry {
fn borrow(&self) -> &str {
&self.0.a
}
}
fn main() {
let foo_1 = Foo {
a: "foo".to_string(),
b: "bar".to_string(),
};
let foo_2 = Foo {
a: "foobar".to_string(),
b: "barfoo".to_string(),
};
let foo_vec = vec![foo_1, foo_2];
let mut foo_hashmap = HashSet::new();
foo_vec.into_iter().for_each(|foo| {
foo_hashmap.insert(FooEntry(foo));
});
// Look up `Foo` using &str as keys...
println!("{:?}", foo_hashmap.get("foo").unwrap().0);
println!("{:?}", foo_hashmap.get("foobar").unwrap().0);
}
Notice that HashSet provides no way to get a &mut FooEntry due to the reasons described above. You'd have to use RefCell (and read what the docs of HashSet have to say about this).
The third option is to simply clone() the foo.a as you described. Given the above, this is probably the most simple solution. If using an Rc<str> doesn't bother you for other reasons, this would be my choice.
Sidenote: If you don't need to modify a and/or b, a Box<str> instead of String is smaller by one machine word.
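The size difference can be checked directly. This is a small sketch; the exact layout of String is an implementation detail of the standard library, so treat the numbers as what current toolchains produce rather than a guarantee. The function name is made up for illustration.

```rust
use std::mem::size_of;

/// Returns the sizes of `String` and `Box<str>`, measured in machine words.
pub fn sizes_in_words() -> (usize, usize) {
    let word = size_of::<usize>();
    // `String` stores pointer, capacity and length; `Box<str>` only pointer and length.
    (size_of::<String>() / word, size_of::<Box<str>>() / word)
}
```

An existing String can be converted with into_boxed_str(), which may shrink the allocation to fit the text.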
I want to develop a library for boolean formulas with Rust and I'm pretty new to Rust.
The idea is to have immutable formulas which are created and cached by a (obviously mutable) formula factory. So a user would first create a formula factory and then use it to create formulas which are returned as references.
The problem is that the compiler basically does not let me create more than one formula, because this would mean that there is more than one mutable borrow of the formula factory object.
let mut f = FormulaFactory::new();
let a = f.variable("a");
let b = f.variable("b"); // error: cannot borrow `f` as mutable more than once at a time
let ab = f.and(a, b);
I understand this violation of rules, but on the other hand I think that in this case everything would be ok (at least in a single-threaded setting). Is there a simple way to get around this problem or do I rather have to think about a different, more rust-compatible approach?
Some more information: 'static lifetime is not an option in the targeted scenario. The user might want to create multiple formula factories and especially drop them if the formulas are no longer needed.
Just for reference a minimal example (strongly simplified – obviously a Formula will also have a formula type, in this example there are only variables and conjunctions):
#![feature(hash_set_entry)]
use std::collections::HashSet;
#[derive(PartialEq, Eq, Hash)]
pub struct Formula<'a> {
variable: Option<&'a str>,
operands: Vec<&'a Formula<'a>>,
}
pub struct FormulaFactory<'a> {
variables: HashSet<Formula<'a>>,
conjunctions: HashSet<Formula<'a>>,
}
impl<'a> FormulaFactory<'a> {
pub fn new() -> FormulaFactory<'a> {
FormulaFactory {
variables: HashSet::new(),
conjunctions: HashSet::new(),
}
}
pub fn variable(&mut self, name: &'a str) -> &Formula<'a> {
(&mut self.variables).get_or_insert(Formula{variable: Some(name), operands: vec![]})
}
pub fn and(&mut self, op1: &'a Formula<'a>, op2: &'a Formula<'a>) -> &Formula<'a> {
(&mut self.conjunctions).get_or_insert(Formula{variable: None, operands: vec![op1, op2]})
}
}
fn main() {
let mut f = FormulaFactory::new();
let a = f.variable("a");
let b = f.variable("b"); // error: cannot borrow `f` as mutable more than once at a time
let ab = f.and(a, b);
println!("{}", ab.operands[0].variable.unwrap())
}
The variable a is a reference to data contained in the f object. As long as you hold on to a, you cannot modify f, because it is already aliased by a. I think the best way to approach this would be to have a Vec of Formula in the FormulaFactory struct called formulas (for the sake of simplicity) and give out only FormulaIndex objects, each of which is just a usize representing the index of the Formula within the formulas field. This is the same approach that petgraph takes, where the nodes field in a Graph contains a Vec of Node but the Graph API only gives out NodeIndex objects.
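A rough sketch of the index-based design follows. The enum shape of Formula, the cache map used for deduplication and the intern and get helpers are assumptions made for illustration; the question's own Formula uses an Option and a Vec of references instead.

```rust
use std::collections::HashMap;

/// Handle given out to users instead of a reference into the factory.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct FormulaIndex(usize);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Formula {
    Variable(String),
    And(Vec<FormulaIndex>),
}

#[derive(Default)]
pub struct FormulaFactory {
    formulas: Vec<Formula>,
    // Maps each formula to its index so equal formulas are created only once.
    cache: HashMap<Formula, FormulaIndex>,
}

impl FormulaFactory {
    pub fn new() -> Self {
        Self::default()
    }

    // Returns the existing index for `formula`, or stores it and returns a new one.
    fn intern(&mut self, formula: Formula) -> FormulaIndex {
        if let Some(&index) = self.cache.get(&formula) {
            return index;
        }
        let index = FormulaIndex(self.formulas.len());
        self.formulas.push(formula.clone());
        self.cache.insert(formula, index);
        index
    }

    pub fn variable(&mut self, name: &str) -> FormulaIndex {
        self.intern(Formula::Variable(name.to_string()))
    }

    pub fn and(&mut self, op1: FormulaIndex, op2: FormulaIndex) -> FormulaIndex {
        self.intern(Formula::And(vec![op1, op2]))
    }

    /// Looks up the formula behind an index; the borrow lasts only for this call.
    pub fn get(&self, index: FormulaIndex) -> &Formula {
        &self.formulas[index.0]
    }
}
```

Since FormulaIndex is Copy and borrows nothing, calling variable twice and then and compiles without borrow conflicts. The cost is that an index is only meaningful for the factory that produced it, and this sketch stores every formula twice (once in the Vec, once as a map key).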
I've very recently started studying Rust, and while working on a test program, I wrote this method:
pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
let mut m: Vec<Page>;
let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
Some(p) => p,
None => {
m = self.index.get_pages(start_state, &self.file)?;
&mut m
}
};
// omitted code that mutates pages
// ...
Ok(true)
}
It does work as expected, but I'm not convinced about the m variable. If I remove it, the code looks more elegant:
pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
Some(p) => p,
None => &mut self.index.get_pages(start_state, &self.file)?
};
// omitted code that mutates pages
// ...
Ok(true)
}
but I get:
error[E0716]: temporary value dropped while borrowed
--> src\module1\mod.rs:28:29
|
26 | let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
| _____________________________________-
27 | | Some(p) => p,
28 | | None => &mut self.index.get_pages(start_state, &self.file)?
| | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
| | | |
| | | temporary value is freed at the end of this statement
| | creates a temporary which is freed while still in use
29 | | };
| |_________- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
I fully understand the error, which directed me to the working snippet, but I'm wondering if there's a more elegant and/or idiomatic way of writing this code. I am declaring m at the beginning of the function, only to prevent a temporary variable from being freed too early. Is there a way of telling the compiler that the lifetime of the return value of self.index.get_pages should be the whole add_transition function?
Further details:
Page is a relatively big struct, so I'd rather not implement the Copy trait for it, nor would I want to clone it.
page_cache is of type HashMap<u32, Vec<Page>>
self.index.get_pages is relatively slow and I'm using page_cache to cache results
The return type of self.index.get_pages is Result<Vec<Page>, std::io::Error>
This is normal; your 'cleaner' code basically comes down to doing something like the following:
let y = {
let x = 42;
&x
};
Here it should be obvious that you cannot return a reference to x, because x is dropped at the end of the block. Those rules don't change when working with temporary values: self.index.get_pages(start_state, &self.file)? creates a temporary value that is dropped at the end of the enclosing statement (line 29 in the error output), so you can't keep a reference to it.
The workaround via m now moves that temporary into the m binding one block up which will live long enough for pages to work with it.
Now for alternatives, I guess page_cache is a HashMap? Then you could alternatively do something like let pages = self.page_cache.entry(start_state).or_insert_with(||self.index.get_pages(...))?;. The only problem with that approach is that get_pages returns a Result while the current cache stores Vec<Page> (the Ok branch only). You could adapt the cache to actually store Result instead, which I think is semantically also better since you want to cache the results of that function call, so why not do that for Err? But if you have a good reason to not cache Err, the approach you have should work just fine.
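If Err values should not be cached, the entry API can still be used by matching on the entry instead of calling or_insert_with. This is a sketch under assumptions: Page and load_pages are hypothetical stand-ins for the real Page type and for self.index.get_pages, and, unlike the original method, a freshly loaded Vec is stored in the cache.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::io;

// Hypothetical placeholder for the real `Page` struct.
#[derive(Debug, Clone, PartialEq)]
pub struct Page(pub u32);

// Hypothetical stand-in for `self.index.get_pages`; fails for state 0.
pub fn load_pages(state: u32) -> io::Result<Vec<Page>> {
    if state == 0 {
        return Err(io::Error::new(io::ErrorKind::NotFound, "no pages for state 0"));
    }
    Ok(vec![Page(state)])
}

/// Returns the cached pages for `state`, loading and caching them on a miss.
pub fn pages_for(cache: &mut HashMap<u32, Vec<Page>>, state: u32) -> io::Result<&mut Vec<Page>> {
    match cache.entry(state) {
        Entry::Occupied(entry) => Ok(entry.into_mut()),
        // `?` returns early on failure, so nothing is inserted for an error.
        Entry::Vacant(entry) => Ok(entry.insert(load_pages(state)?)),
    }
}
```

Both arms produce a reference that borrows from the cache itself, so no extra local variable is needed to keep a temporary alive.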
Yours is probably the most efficient way, but in theory the extra variable is not necessary, and the code can be made more elegant.
Another way of doing it is to use a trait object: have the variable be of the type Box<dyn AsMut<Vec<Page>>>. This means the variable can hold any type that implements the trait AsMut<Vec<Page>>. Two types that do so are &mut Vec<Page> and Vec<Page>, so the variable can hold either of them, but the contents can only be reached via as_mut().
So the following code works as an illustration:
struct Foo {
inner : Option<Vec<i32>>,
}
impl Foo {
fn new () -> Self {
Foo { inner : None }
}
fn init (&mut self) {
self.inner = Some(Vec::new())
}
fn get_mut_ref (&mut self) -> Option<&mut Vec<i32>> {
self.inner.as_mut()
}
}
fn main () {
let mut foo : Foo = Foo::new();
let mut m : Box<dyn AsMut<Vec<i32>>> = match foo.get_mut_ref() {
Some(r) => Box::new(r),
None => Box::new(vec![1,2,3]),
};
m.as_mut().as_mut().push(4);
}
The key here is the type Box<dyn AsMut<Vec<i32>>>; it is a box that can hold any type, so long as that type implements AsMut<Vec<i32>>. Because the value is boxed, we also need .as_mut().as_mut() to get the actual &mut Vec<i32> out of it.
Because different types can have different sizes, a trait object cannot be stored directly in a local variable and must live behind some pointer. A Box is the usual choice, and here it is necessary: a plain reference, which does not own its pointee, would run into the same problem you started with.
One might argue that this code is more elegant, but yours is certainly more efficient and does not require further heap allocation.
I'm writing a library that should read from something implementing the BufRead trait; a network data stream, standard input, etc. The first function is supposed to read a data unit from that reader and return a populated struct filled mostly with &'a str values parsed from a frame from the wire.
Here is a minimal version:
mod mymod {
use std::io::prelude::*;
use std::io;
pub fn parse_frame<'a, T>(mut reader: T)
where
T: BufRead,
{
for line in reader.by_ref().lines() {
let line = line.expect("reading header line");
if line.len() == 0 {
// got empty line; done with header
break;
}
// split line
let splitted = line.splitn(2, ':');
let line_parts: Vec<&'a str> = splitted.collect();
println!("{} has value {}", line_parts[0], line_parts[1]);
}
// more reads down here, therefore the reader.by_ref() above
// (otherwise: use of moved value).
}
}
use std::io;
fn main() {
let stdin = io::stdin();
let locked = stdin.lock();
mymod::parse_frame(locked);
}
An error shows up which I cannot fix after trying different solutions:
error: `line` does not live long enough
--> src/main.rs:16:28
|
16 | let splitted = line.splitn(2, ':');
| ^^^^ does not live long enough
...
20 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the body at 8:4...
--> src/main.rs:8:5
|
8 | / {
9 | | for line in reader.by_ref().lines() {
10 | | let line = line.expect("reading header line");
11 | | if line.len() == 0 {
... |
22 | | // (otherwise: use of moved value).
23 | | }
| |_____^
The lifetime 'a is defined on a struct and implementation of a data keeper structure because the &str requires an explicit lifetime. These code parts were removed as part of the minimal example.
BufRead has a lines() method whose iterator yields Result<String, io::Error> items. I handle errors using expect or match and thus unpack the Result so that the program now has the bare String. This will then be done multiple times to populate a data structure.
Many answers say that the unwrap result needs to be bound to a variable otherwise it gets lost because it is a temporary value. But I already saved the unpacked Result value in the variable line and I still get the error.
How can I fix this error? I could not get it working after hours of trying.
Does it make sense to do all these lifetime declarations just for &str in a data keeper struct? This will be mostly a read-only data structure, at most replacing whole field values. String could also be used, but I have found articles saying that String has lower performance than &str, and this frame parser function will be called many times and is performance-critical.
Similar questions exist on Stack Overflow, but none quite answers the situation here.
For completeness and better understanding, following is an excerpt from complete source code as to why lifetime question came up:
Data structure declaration:
// tuple
pub struct Header<'a>(pub &'a str, pub &'a str);
pub struct Frame<'a> {
pub frameType: String,
pub bodyType: &'a str,
pub port: &'a str,
pub headers: Vec<Header<'a>>,
pub body: Vec<u8>,
}
impl<'a> Frame<'a> {
pub fn marshal(&'a self) {
//TODO
println!("marshal!");
}
}
Complete function definition:
pub fn parse_frame<'a, T>(mut reader: T) -> Result<Frame<'a>, io::Error> where T: BufRead {
Your problem can be reduced to this:
fn foo<'a>() {
let thing = String::from("a b");
let parts: Vec<&'a str> = thing.split(" ").collect();
}
You create a String inside your function, then declare that references to that string are guaranteed to live for the lifetime 'a. Unfortunately, the lifetime 'a isn't under your control — the caller of the function gets to pick what the lifetime is. That's how generic parameters work!
What would happen if the caller of the function specified the 'static lifetime? How would it be possible for your code, which allocates a value at runtime, to guarantee that the value lives longer than even the main function? It's not possible, which is why the compiler has reported an error.
Once you've gained a bit more experience, the function signature fn foo<'a>() will jump out at you like a red alert — there's a generic parameter that isn't used. That's most likely going to mean bad news.
"return a populated struct filled mostly with &'a str"
You cannot possibly do this with the current organization of your code. References have to point to something. You are not providing anywhere for the pointed-at values to live. You cannot return an allocated String as a string slice.
Before you jump to it, no you cannot store a value and a reference to that value in the same struct.
Instead, you need to split the code that creates the String and that which parses a &str and returns more &str references. That's how all the existing zero-copy parsers work. You could look at those for inspiration.
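A minimal sketch of that split, assuming the caller has already read the header block into a String that it owns. Header mirrors the tuple struct from the question, while parse_header and parse_headers are hypothetical names and lines without a ':' are simply skipped.

```rust
/// Borrowed name/value pair, as in the question's `Header` tuple struct.
pub struct Header<'a>(pub &'a str, pub &'a str);

/// Parses one "name:value" line, borrowing both parts from `line`.
/// Returns `None` if the line contains no ':' separator.
pub fn parse_header(line: &str) -> Option<Header<'_>> {
    let (name, value) = line.split_once(':')?;
    Some(Header(name, value))
}

/// Parses header lines up to the first empty line.
/// The returned headers borrow from `text`, which the caller must keep alive.
pub fn parse_headers(text: &str) -> Vec<Header<'_>> {
    text.lines()
        .take_while(|line| !line.is_empty())
        .filter_map(parse_header)
        .collect()
}
```

The lifetime is now tied to an input parameter, so it always refers to a buffer the caller owns. The reading code would fill a String from the BufRead first and only then call parse_headers(&buffer).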
"String has lower performance than &str"
No, it really doesn't. Creating lots of extraneous Strings is a bad idea, sure, just like allocating too much is a bad idea in any language.
Maybe the following program gives some clues to others who are also having their first problems with lifetimes:
fn main() {
// using String and &str slice
let my_str: String = "fire".to_owned();
let returned_str: MyStruct = my_func_str(&my_str);
println!("Received return value: {ret}", ret = returned_str.version);
// using Vec<u8> and &[u8] slice
let my_vec: Vec<u8> = "fire".to_owned().into_bytes();
let returned_u8: MyStruct2 = my_func_vec(&my_vec);
println!("Received return value: {ret:?}", ret = returned_u8.version);
}
// using String -> str
fn my_func_str<'a>(some_str: &'a str) -> MyStruct<'a> {
MyStruct {
version: &some_str[0..2],
}
}
struct MyStruct<'a> {
version: &'a str,
}
// using Vec<u8> -> &[u8]
fn my_func_vec<'a>(some_vec: &'a Vec<u8>) -> MyStruct2<'a> {
MyStruct2 {
version: &some_vec[0..2],
}
}
struct MyStruct2<'a> {
version: &'a [u8],
}