I'm implementing a bytecode VM and am struggling with referencing data stored in a parsed representation of the bytecode. As is the nature of (most) bytecode, it and thus its parsed representation remain unmodified once initialized. A separate Vm contains the mutable parts (stack etc.) along with that module. I made an MCVE with additional explanatory comments to illustrate the problem; it's at the bottom and on the playground. The parsed bytecode may look like this:
Module { struct_types: {"Bar": StructType::Named(["a", "b"])} }
The strings "Bar", "a", "b" are references into the bytecode and have lifetime 'b, so we also have lifetimes in the types Module<'b> and StructType<'b>.
After creating this, I will want to create struct instances, think let bar = Bar { a: (), b: () };. At least currently, each struct instance needs to hold a reference to its type, so that type might look like this:
pub struct Struct<'b> {
struct_type: &'b bytecode::StructType<'b>,
fields: Vec<Value<'b>>,
}
The values of a struct's fields may be constants whose value is stored in the bytecode, so the Value enum has a lifetime 'b as well, and that works. The problem is that I have a &'b bytecode::StructType<'b> in the first field: how do I get a reference that lives long enough? I think the reference would actually be valid long enough.
The part of the code that I suspect to be the critical one is here:
pub fn struct_type(&self, _name: &str) -> Option<&'b StructType<'b>> {
// self.struct_types.get(name)
todo!("fix lifetime problems")
}
With the commented-out code, I can't get a 'b reference because the borrow of self.struct_types doesn't live long enough; to fix that I'd need to take &'b self, which would spread virally through the code. Also, most of the time I need to borrow the Vm mutably, which doesn't work if all those exclusive self references have to live that long.
Introducing a separate lifetime 'm so that I could return a &'m StructType<'b> is something I could try as well, but that sounds just as viral and in addition introduces a separate lifetime I need to keep track of; being able to replace 'b with 'm (or at least only having one in each place) would be a bit nicer.
Finally, this feels like something that pinning could help with, but I don't understand that topic well enough to make an educated guess on how that could be approached.
MCVE
#![allow(dead_code)]
mod bytecode {
use std::collections::BTreeMap;
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, _name: &str) -> Option<&'b StructType<'b>> {
// self.struct_types.get(name)
todo!("fix lifetime problems")
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([
(bar, bar_struct),
]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'b> {
Unit,
Struct(Struct<'b>),
}
#[derive(Debug, Clone)]
pub struct Struct<'b> {
struct_type: &'b bytecode::StructType<'b>,
fields: Vec<Value<'b>>,
}
impl<'b> Struct<'b> {
pub fn new(struct_type: &'b bytecode::StructType<'b>, fields: Vec<Value<'b>>) -> Self {
Struct { struct_type, fields }
}
}
#[derive(Debug, Clone)]
pub struct Vm<'b> {
module: bytecode::Module<'b>,
}
impl<'b> Vm<'b> {
pub fn new(module: bytecode::Module<'b>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &'b str) -> Value<'b> {
let struct_type: &'b StructType<'b> = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
&'b bytecode::StructType<'b> is a classic anti-pattern in Rust; it strongly indicates incorrectly annotated lifetimes. It rarely makes sense for an object to depend on some lifetime and for a borrow of that object to have the very same lifetime. That is very rare to happen on purpose.
So I suspect you need two lifetimes, which I will call 'm and 'b:
'b: the lifetime of the bytecode string, everything that references it will use &'b str.
'm: the lifetime of the Module object. Everything that references it or its contained StructType will use this lifetime.
If split into two lifetimes and adjusted correctly, it simply works:
#![allow(dead_code)]
mod bytecode {
use std::{collections::BTreeMap, iter::FromIterator};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
self.struct_types.get(name)
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([(bar, bar_struct)]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'b, 'm> {
Unit,
Struct(Struct<'b, 'm>),
}
#[derive(Debug, Clone)]
pub struct Struct<'b, 'm> {
struct_type: &'m bytecode::StructType<'b>,
fields: Vec<Value<'b, 'm>>,
}
impl<'b, 'm> Struct<'b, 'm> {
pub fn new(struct_type: &'m bytecode::StructType<'b>, fields: Vec<Value<'b, 'm>>) -> Self {
Struct {
struct_type,
fields,
}
}
}
#[derive(Debug, Clone)]
pub struct Vm<'b> {
module: bytecode::Module<'b>,
}
impl<'b> Vm<'b> {
pub fn new(module: bytecode::Module<'b>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &str) -> Value<'b, '_> {
let struct_type: &StructType<'b> = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
Module { struct_types: {"Bar": Named(["a", "b"])} }
Struct(Struct { struct_type: Named(["a", "b"]), fields: [Unit, Unit] })
It can further be simplified, however, due to the fact that 'm is connected to 'b, and therefore everything that depends on 'm automatically also has access to 'b objects, because 'b is guaranteed to outlive 'm.
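To make the "'b is guaranteed to outlive 'm" argument a bit more concrete, here is a minimal sketch. The StructType below is a cut-down, hypothetical stand-in for the real one, and shrink is a name I made up for illustration. It shows that a &'m StructType<'b> can be handed out as a &'m StructType<'m> without any conversion:

```rust
// Hypothetical, cut-down stand-in for `bytecode::StructType<'b>`.
#[derive(Debug)]
struct StructType<'b> {
    fields: Vec<&'b str>,
}

// The borrow lasts for 'm, the data it points into lasts for 'b.
// The `'b: 'm` bound is already implied by `&'m StructType<'b>`;
// it is spelled out here only for readability.
fn shrink<'m, 'b: 'm>(ty: &'m StructType<'b>) -> &'m StructType<'m> {
    // No cast is needed: the compiler shortens 'b to 'm on its own.
    ty
}

fn main() {
    let bytecode = String::from("a,b");
    let ty = StructType {
        fields: bytecode.split(',').collect(),
    };
    let merged = shrink(&ty);
    assert_eq!(merged.fields, ["a", "b"]);
    println!("{:?}", merged);
}
```

This relies on StructType<'b> being covariant in 'b, which holds here because it only stores shared &'b str references.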
Therefore, let's introduce 'a, which we will now use inside the vm mod to reference anything from the bytecode mod. This also allows lifetime elision to happen at a couple of points, simplifying the code even further:
#![allow(dead_code)]
mod bytecode {
use std::{collections::BTreeMap, iter::FromIterator};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
self.struct_types.get(name)
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([(bar, bar_struct)]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'a> {
Unit,
Struct(Struct<'a>),
}
#[derive(Debug, Clone)]
pub struct Struct<'a> {
struct_type: &'a bytecode::StructType<'a>,
fields: Vec<Value<'a>>,
}
impl<'a> Struct<'a> {
pub fn new(struct_type: &'a bytecode::StructType, fields: Vec<Value<'a>>) -> Self {
Struct {
struct_type,
fields,
}
}
}
#[derive(Debug, Clone)]
pub struct Vm<'a> {
module: bytecode::Module<'a>,
}
impl<'a> Vm<'a> {
pub fn new(module: bytecode::Module<'a>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &str) -> Value {
let struct_type: &StructType = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
Module { struct_types: {"Bar": Named(["a", "b"])} }
Struct(Struct { struct_type: Named(["a", "b"]), fields: [Unit, Unit] })
Fun fact: This is now one of the rare cases where we legitimately have to use &'a bytecode::StructType<'a>, so take my opening statement with a grain of salt, and you were kind of right all along :)
The crazy thing is if we then rename 'a to 'b to be consistent with your original code, we get almost your code with only some minor differences:
#![allow(dead_code)]
mod bytecode {
use std::{collections::BTreeMap, iter::FromIterator};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
self.struct_types.get(name)
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([(bar, bar_struct)]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'b> {
Unit,
Struct(Struct<'b>),
}
#[derive(Debug, Clone)]
pub struct Struct<'b> {
struct_type: &'b bytecode::StructType<'b>,
fields: Vec<Value<'b>>,
}
impl<'b> Struct<'b> {
pub fn new(struct_type: &'b bytecode::StructType, fields: Vec<Value<'b>>) -> Self {
Struct {
struct_type,
fields,
}
}
}
#[derive(Debug, Clone)]
pub struct Vm<'b> {
module: bytecode::Module<'b>,
}
impl<'b> Vm<'b> {
pub fn new(module: bytecode::Module<'b>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &str) -> Value {
let struct_type: &StructType = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
Module { struct_types: {"Bar": Named(["a", "b"])} }
Struct(Struct { struct_type: Named(["a", "b"]), fields: [Unit, Unit] })
So the actual fix for your original code is as follows:
4c4
< use std::collections::BTreeMap;
---
> use std::{collections::BTreeMap, iter::FromIterator};
36,38c36,37
< pub fn struct_type(&self, _name: &str) -> Option<&'b StructType<'b>> {
< // self.struct_types.get(name)
< todo!("fix lifetime problems")
---
> pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
> self.struct_types.get(name)
73c72
< pub fn new(struct_type: &'b bytecode::StructType<'b>, fields: Vec<Value<'b>>) -> Self {
---
> pub fn new(struct_type: &'b bytecode::StructType, fields: Vec<Value<'b>>) -> Self {
91,92c90,91
< pub fn create_struct(&mut self, type_name: &'b str) -> Value<'b> {
< let struct_type: &'b StructType<'b> = self.module.struct_type(type_name).unwrap();
---
> pub fn create_struct(&mut self, type_name: &str) -> Value {
> let struct_type: &StructType = self.module.struct_type(type_name).unwrap();
I hope deriving them step by step made it somewhat clear why those changes are necessary.
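One thing to be aware of: with the final create_struct signature, the elided lifetime of the returned Value comes from &mut self, so the Vm stays mutably borrowed for as long as that value is alive. If that turns out to be too restrictive, one possible variation is to let the Vm borrow the module instead of owning it. The following is only a sketch under that assumption; the two-lifetime Vm<'m, 'b>, the stack field and the push method are hypothetical and not part of your code:

```rust
use std::collections::BTreeMap;

#[derive(Debug)]
pub enum StructType<'b> {
    Empty,
    Named(Vec<&'b str>),
}

impl StructType<'_> {
    pub fn field_count(&self) -> usize {
        match self {
            Self::Empty => 0,
            Self::Named(fields) => fields.len(),
        }
    }
}

#[derive(Debug)]
pub struct Module<'b> {
    struct_types: BTreeMap<&'b str, StructType<'b>>,
}

impl<'b> Module<'b> {
    pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
        self.struct_types.get(name)
    }
}

#[derive(Debug, Clone)]
pub enum Value<'m, 'b> {
    Unit,
    Struct { struct_type: &'m StructType<'b>, fields: Vec<Value<'m, 'b>> },
}

// Hypothetical layout: the Vm borrows the module instead of owning it.
pub struct Vm<'m, 'b> {
    module: &'m Module<'b>,
    stack: Vec<Value<'m, 'b>>,
}

impl<'m, 'b> Vm<'m, 'b> {
    pub fn new(module: &'m Module<'b>) -> Self {
        Self { module, stack: Vec::new() }
    }

    pub fn create_struct(&mut self, type_name: &str) -> Option<Value<'m, 'b>> {
        // Copy the shared reference out first, so the result is tied to 'm
        // and not to the short `&mut self` borrow.
        let module: &'m Module<'b> = self.module;
        let struct_type = module.struct_type(type_name)?;
        let fields = vec![Value::Unit; struct_type.field_count()];
        Some(Value::Struct { struct_type, fields })
    }

    pub fn push(&mut self, value: Value<'m, 'b>) {
        self.stack.push(value);
    }
}

fn main() {
    let bytecode = "struct Bar { a, b }";
    let mut struct_types = BTreeMap::new();
    struct_types.insert(
        &bytecode[7..10],
        StructType::Named(vec![&bytecode[13..14], &bytecode[16..17]]),
    );
    let module = Module { struct_types };

    let mut vm = Vm::new(&module);
    let first = vm.create_struct("Bar").unwrap();
    // `first` is still alive here, yet the Vm can be borrowed mutably again.
    let second = vm.create_struct("Bar").unwrap();
    vm.push(first);
    vm.push(second);
    assert_eq!(vm.stack.len(), 2);
}
```

The price is the second lifetime parameter you wanted to avoid, so whether this trade is worth it depends on how often values need to outlive a single call into the Vm.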
I wonder whether there's a way to preserve the original String using serde_json? Consider this example:
#[derive(Debug, Serialize, Deserialize)]
struct User {
#[serde(skip)]
pub raw: String,
pub id: u64,
pub login: String,
}
{
"id": 123,
"login": "johndoe"
}
My structure would end up containing such values:
User {
raw: String::from(r#"{"id": 123,"login": "johndoe"}"#),
id: 123,
login: String::from("johndoe")
}
Currently, I'm doing that by deserializing into a Value, then deserializing that Value into the User structure and assigning the Value to the raw field, but that doesn't seem right. Is there a better way to do this?
This solution uses the RawValue type from serde_json to first get the original input string. Then a new Deserializer is created from that String to deserialize the User type.
This solution can work with readers by using Box<serde_json::value::RawValue> as an intermediary type, and it can also work with structs that borrow from the input by using &'de serde_json::value::RawValue as the intermediary. You can test this in the solution by (un-)commenting the borrow field.
use std::marker::PhantomData;
#[derive(Debug, serde::Serialize, serde::Deserialize)]
#[serde(remote = "Self")]
struct User<'a> {
#[serde(skip)]
pub raw: String,
pub id: u64,
pub login: String,
// Test for borrowing input data
// pub borrow: &'a str,
#[serde(skip)]
pub ignored: PhantomData<&'a ()>,
}
impl serde::Serialize for User<'_> {
fn serialize<S: serde::Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
Self::serialize(self, serializer)
}
}
impl<'a, 'de> serde::Deserialize<'de> for User<'a>
where
'de: 'a,
{
fn deserialize<D: serde::Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
use serde::de::Error;
// Deserializing a `&'a RawValue` would also work here
// but then you lose support for deserializing from readers
let raw: Box<serde_json::value::RawValue> = Box::deserialize(deserializer)?;
// Use this line instead if you have a struct which borrows from the input
// let raw = <&'de serde_json::value::RawValue>::deserialize(deserializer)?;
let mut raw_value_deserializer = serde_json::Deserializer::from_str(raw.get());
let mut user =
User::deserialize(&mut raw_value_deserializer).map_err(|err| D::Error::custom(err))?;
user.raw = raw.get().to_string();
Ok(user)
}
}
fn main() {
// Test serialization
let u = User {
raw: String::new(),
id: 456,
login: "USERNAME".to_string(),
// Test for borrowing input data
// borrow: "foobar",
ignored: PhantomData,
};
let json = serde_json::to_string(&u).unwrap();
println!("{}", json);
// Test deserialization
let u2: User = serde_json::from_str(&json).unwrap();
println!("{:#?}", u2);
}
Test on the Playground.
When using an Arc<Mutex<>> wrapper for a struct, is it possible to deref the wrapper into one of the inner struct's fields? In this case I want to deref my PersonWrapper struct into a String from the "name" field of the inner Person struct.
use std::sync::{Mutex, Arc};
pub struct PersonWrapper{
inner: Arc<Mutex<Person>>
}
impl PersonWrapper{
pub fn new(name: String, age: i32,)->Self{
PersonWrapper{
inner: Arc::new(Mutex::new(Person::new(name, age)))
}
}
}
pub struct Person{
name: String,
age: i32,
}
impl Person{
pub fn new(name: String, age: i32)->Self{
Person{
name,
age,
}
}
}
impl std::ops::Deref for PersonWrapper{
type Target = String;
fn deref(&self) -> &Self::Target {
&self.inner.into_inner().unwrap().name
// &self.inner.lock().unwrap().node_id.clone()
}
}
fn main(){
let wrapped_person = PersonWrapper::new(String::from("Zondo"), 45);
assert_eq!(*wrapped_person, String::from("Zondo"));
}
gives errors:
error[E0515]: cannot return reference to temporary value
&self.inner.into_inner().unwrap().name
^--------------------------------^^^^^
||
|temporary value created here
returns a reference to data owned by the current function
error[E0507]: cannot move out of an `Arc`
&self.inner.into_inner().unwrap().name
^^^^^^^^^^ move occurs because value has type `Mutex<Person>`, which does not implement the `Copy` trait
Playground
If you need to provide convenient access to a part of the data behind a Mutex, combining the locking and "projecting" (narrowing to a field), you can use the Mutex from the parking_lot crate. Among many other optimizations and improvements, it offers MutexGuard::map, which turns a guard for the whole value into a MappedMutexGuard for one of its parts. For example:
use std::sync::Arc;
use parking_lot::{Mutex, MutexGuard, MappedMutexGuard};
pub struct PersonWrapper {
inner: Arc<Mutex<Person>>,
}
impl PersonWrapper {
pub fn new(name: String, age: i32) -> Self {
PersonWrapper {
inner: Arc::new(Mutex::new(Person::new(name, age))),
}
}
pub fn name(&self) -> MappedMutexGuard<String> {
MutexGuard::map(self.inner.lock(), |p| &mut p.name)
}
}
Now you can use name() as if it were a locked mutex that only held the name part of the Person:
let wrapped_person = PersonWrapper::new("Zondo".to_owned(), 45);
assert_eq!(&*wrapped_person.name(), "Zondo");
*wrapped_person.name() = "Mondo".to_owned();
assert_eq!(&*wrapped_person.name(), "Mondo");
Playground
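As for Deref itself: deref has to return a plain reference, and a reference into the mutex's contents cannot outlive the lock guard that was created inside the function, which is what the E0515 error is telling you. If adding a dependency is not an option, a similar effect can be sketched with std::sync::Mutex by passing a closure that runs while the lock is held. Note that with_name is a hypothetical helper name I chose for this sketch, not an existing API:

```rust
use std::sync::{Arc, Mutex};

pub struct Person {
    name: String,
    age: i32,
}

pub struct PersonWrapper {
    inner: Arc<Mutex<Person>>,
}

impl PersonWrapper {
    pub fn new(name: String, age: i32) -> Self {
        PersonWrapper {
            inner: Arc::new(Mutex::new(Person { name, age })),
        }
    }

    // Hypothetical helper: runs `f` on the name while the lock is held.
    // The reference never escapes the closure, so it cannot outlive the guard.
    pub fn with_name<R>(&self, f: impl FnOnce(&mut String) -> R) -> R {
        // `unwrap` panics if another thread panicked while holding the lock.
        let mut guard = self.inner.lock().unwrap();
        f(&mut guard.name)
    }
}

fn main() {
    let wrapped_person = PersonWrapper::new("Zondo".to_owned(), 45);
    assert_eq!(wrapped_person.with_name(|name| name.clone()), "Zondo");

    wrapped_person.with_name(|name| *name = "Mondo".to_owned());
    assert_eq!(wrapped_person.with_name(|name| name.clone()), "Mondo");
}
```

Compared to the mapped guard, this keeps the critical section visibly limited to the closure body, at the cost of a slightly less natural call site.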
I'm trying to do the equivalent of Ruby's Enumerable.collect() in Rust.
I have an Option<Vec<Attachment>> and I want to create an Option<Vec<String>> from it, with String::new() for elements whose guid is None.
#[derive(Debug)]
pub struct Attachment {
pub guid: Option<String>,
}
fn main() {
let ov: Option<Vec<Attachment>> =
Some(vec![Attachment { guid: Some("rere34r34r34r34".to_string()) },
Attachment { guid: Some("5345345534rtyr5345".to_string()) }]);
let foo: Option<Vec<String>> = match ov {
Some(x) => {
x.iter()
.map(|&attachment| attachment.guid.unwrap_or(String::new()))
.collect()
}
None => None,
};
}
The error in the compiler is clear:
error[E0277]: the trait bound `std::option::Option<std::vec::Vec<std::string::String>>: std::iter::FromIterator<std::string::String>` is not satisfied
--> src/main.rs:15:18
|
15 | .collect()
| ^^^^^^^ the trait `std::iter::FromIterator<std::string::String>` is not implemented for `std::option::Option<std::vec::Vec<std::string::String>>`
|
= note: a collection of type `std::option::Option<std::vec::Vec<std::string::String>>` cannot be built from an iterator over elements of type `std::string::String`
If I remember what I've read from the documentation so far, I cannot implement traits for types that I don't own.
How can I do this using iter().map(...).collect() or maybe another way?
You should read and memorize all of the methods on Option (and Result). These are used so pervasively in Rust that knowing what is present will help you immensely.
For example, your match statement is Option::map.
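To illustrate that equivalence, here is a small sketch on plain numbers (the two function names are made up for this example). Note that the hand-written match has to wrap the collected vector in Some; your code omitted that wrapper, which is why collect() was asked to build an Option<Vec<String>> directly from Strings:

```rust
// Hand-written match that only transforms the `Some` case.
fn double_with_match(ov: &Option<Vec<u32>>) -> Option<Vec<u32>> {
    match ov {
        Some(v) => Some(v.iter().map(|n| n * 2).collect()),
        None => None,
    }
}

// The same thing spelled with `Option::map`.
fn double_with_map(ov: &Option<Vec<u32>>) -> Option<Vec<u32>> {
    ov.as_ref().map(|v| v.iter().map(|n| n * 2).collect())
}

fn main() {
    let some = Some(vec![1, 2, 3]);
    assert_eq!(double_with_match(&some), double_with_map(&some));
    assert_eq!(double_with_map(&some), Some(vec![2, 4, 6]));
    assert_eq!(double_with_match(&None), double_with_map(&None));
}
```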
Since you never said you couldn't transfer ownership of the Strings, I'd just do that. This will avoid any extra allocation:
let foo: Option<Vec<_>> =
ov.map(|i| i.into_iter().map(|a| a.guid.unwrap_or_else(String::new)).collect());
Note we don't have to specify the type inside the Vec; it can be inferred.
You can of course introduce functions to make it cleaner:
impl Attachment {
fn into_guid(self) -> String {
self.guid.unwrap_or_else(String::new)
}
}
// ...
let foo: Option<Vec<_>> = ov.map(|i| i.into_iter().map(Attachment::into_guid).collect());
If you don't want to give up ownership of the String, you can do the same concept but with a string slice:
impl Attachment {
fn guid(&self) -> &str {
self.guid.as_ref().map_or("", String::as_str)
}
}
// ...
let foo: Option<Vec<_>> = ov.as_ref().map(|i| i.iter().map(|a| a.guid().to_owned()).collect());
Here, we have to use Option::as_ref to avoid moving the guid out of the Attachment, then convert to a &str with String::as_str, providing a default value. We likewise don't take ownership of the Option of ov, and thus need to iterate over references, and ultimately allocate new Strings with ToOwned.
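Put together, the two variants can be checked side by side, including an element whose guid is None. This is a self-contained sketch; borrowed_guids and owned_guids are names invented for it, and I swapped in Option::as_deref and unwrap_or_default as shorter spellings of the accessors above:

```rust
#[derive(Debug)]
pub struct Attachment {
    pub guid: Option<String>,
}

impl Attachment {
    /// Borrowing accessor: yields "" when there is no guid.
    fn guid(&self) -> &str {
        self.guid.as_deref().unwrap_or("")
    }

    /// Consuming accessor: moves the String out instead of cloning it.
    fn into_guid(self) -> String {
        self.guid.unwrap_or_default()
    }
}

/// Keeps `ov` intact and allocates fresh Strings.
fn borrowed_guids(ov: &Option<Vec<Attachment>>) -> Option<Vec<String>> {
    ov.as_ref()
        .map(|v| v.iter().map(|a| a.guid().to_owned()).collect())
}

/// Consumes `ov` and reuses the existing Strings.
fn owned_guids(ov: Option<Vec<Attachment>>) -> Option<Vec<String>> {
    ov.map(|v| v.into_iter().map(Attachment::into_guid).collect())
}

fn main() {
    let ov = Some(vec![
        Attachment { guid: Some("rere34r34r34r34".to_string()) },
        Attachment { guid: None },
    ]);
    let expected = Some(vec!["rere34r34r34r34".to_string(), String::new()]);

    assert_eq!(borrowed_guids(&ov), expected);
    assert_eq!(owned_guids(ov), expected);
    assert_eq!(borrowed_guids(&None), None);
}
```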
Here is a solution that works:
#[derive(Debug)]
pub struct Attachment {
pub guid: Option<String>,
}
fn main() {
let ov: Option<Vec<Attachment>> =
Some(vec![Attachment { guid: Some("rere34r34r34r34".to_string()) },
Attachment { guid: Some("5345345534rtyr5345".to_string()) }]);
let foo: Option<Vec<_>> = ov.map(|x|
x.iter().map(|a| a.guid.as_ref().unwrap_or(&String::new()).clone()).collect());
println!("{:?}", foo);
}
One of the issues with the above code is stopping the guid being moved out of the Attachment and into the vector. My example calls clone to move cloned instances into the vector.
This works, but I think it looks nicer wrapped in a trait impl for Option<T>. Perhaps this is a better ... option ...:
trait CloneOr<T, U>
where U: Into<T>,
T: Clone
{
fn clone_or(&self, other: U) -> T;
}
impl<T, U> CloneOr<T, U> for Option<T>
where U: Into<T>,
T: Clone
{
fn clone_or(&self, other: U) -> T {
self.as_ref().unwrap_or(&other.into()).clone()
}
}
#[derive(Debug)]
pub struct Attachment {
pub guid: Option<String>,
}
fn main() {
let ov: Option<Vec<Attachment>> =
Some(vec![Attachment { guid: Some("rere34r34r34r34".to_string()) },
Attachment { guid: Some("5345345534rtyr5345".to_string()) },
Attachment { guid: None }]);
let foo: Option<Vec<_>> =
ov.map(|x| x.iter().map(|a| a.guid.clone_or("")).collect());
println!("{:?}", foo);
}
Essentially the unwrapping and cloning is hidden behind a trait implementation that attaches to Option<T>.
Here it is running on the playground.
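If a custom trait feels like too much machinery, the same clone-or-default behaviour can probably be had from standard library methods alone. The sketch below assumes an empty String is the desired fallback, and the guids function is a name made up for this example:

```rust
#[derive(Debug)]
pub struct Attachment {
    pub guid: Option<String>,
}

// Clones each guid, falling back to `String::default()` (the empty string)
// when an attachment has none.
fn guids(attachments: &[Attachment]) -> Vec<String> {
    attachments
        .iter()
        .map(|a| a.guid.clone().unwrap_or_default())
        .collect()
}

fn main() {
    let ov: Option<Vec<Attachment>> = Some(vec![
        Attachment { guid: Some("rere34r34r34r34".to_string()) },
        Attachment { guid: None },
    ]);

    // `as_deref` turns `&Option<Vec<Attachment>>` into `Option<&[Attachment]>`,
    // so `ov` is left untouched.
    let foo: Option<Vec<String>> = ov.as_deref().map(guids);

    assert_eq!(
        foo,
        Some(vec!["rere34r34r34r34".to_string(), String::new()])
    );
    println!("{:?}", foo);
}
```

Cloning the Option first and then unwrapping only allocates when a guid is actually present, since String::default() does not allocate.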